Checksum computation in x86_64
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
3
down vote
favorite
This is an implementation of the TCP/IP checksum computation as described in https://tools.ietf.org/html/rfc1071.
The function is writen in DynASM, an assembler preprocessor that is part of LuaJIT. I didn't use much of the fancy stuff of DynASM so pretty much it looks like native X86_64 I think.
-- Prologue.
| push rbp
| mov rbp, rsp
-- Accumulative sum (16-bit).
| xor r9, r9 -- Clear out r9. Stores accumulated sum.
| mov rcx, 0 -- Set index to 0.
| 1:
| cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
| jg >2 -- Go to branch '2' if equals.
| mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
| mov al, [rdi + rcx+1]
| add r9, rax -- Sum and accumulate into r9.
| add rcx, 2 -- Increase index by 2.
| jmp <1 -- Go to beginning of loop.
| 2:
| mov rax, r9 -- Save accumulated sum in rax.
-- Sum carry until done.
| 4:
| shr r9, 16 -- Right-shift r9 16-bit to obtain value of carry.
| cmp r9, 0 -- Check whether there's carry.
| jz >5 -- In case not, go to '4'.
| and rax, 0xffff -- Clear out higher part of rax.
| add rax, r9 -- Sum carry to rax.
| mov r9, rax -- Assign rax to r9 to check if there's carry again.
| jmp <4 -- Go to beggining of loop.
-- One-complement.
| 5:
| not rax -- One-complement of rax.
| and rax, 0xffff -- Clear out higher part of rax.
-- Epilogue.
| mov rsp, rbp
| pop rbp
-- Return.
| ret
I verified against a Lua implementation the result is correct. I was wondering if I could write the code in a different way to use less instructions. For instance, the RFC says it's possible to sum the octets as a dword. I tried but I didn't manage to get a correct checksum.
assembly tcp checksum
add a comment |Â
up vote
3
down vote
favorite
This is an implementation of the TCP/IP checksum computation as described in https://tools.ietf.org/html/rfc1071.
The function is writen in DynASM, an assembler preprocessor that is part of LuaJIT. I didn't use much of the fancy stuff of DynASM so pretty much it looks like native X86_64 I think.
-- Prologue.
| push rbp
| mov rbp, rsp
-- Accumulative sum (16-bit).
| xor r9, r9 -- Clear out r9. Stores accumulated sum.
| mov rcx, 0 -- Set index to 0.
| 1:
| cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
| jg >2 -- Go to branch '2' if equals.
| mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
| mov al, [rdi + rcx+1]
| add r9, rax -- Sum and accumulate into r9.
| add rcx, 2 -- Increase index by 2.
| jmp <1 -- Go to beginning of loop.
| 2:
| mov rax, r9 -- Save accumulated sum in rax.
-- Sum carry until done.
| 4:
| shr r9, 16 -- Right-shift r9 16-bit to obtain value of carry.
| cmp r9, 0 -- Check whether there's carry.
| jz >5 -- In case not, go to '4'.
| and rax, 0xffff -- Clear out higher part of rax.
| add rax, r9 -- Sum carry to rax.
| mov r9, rax -- Assign rax to r9 to check if there's carry again.
| jmp <4 -- Go to beggining of loop.
-- One-complement.
| 5:
| not rax -- One-complement of rax.
| and rax, 0xffff -- Clear out higher part of rax.
-- Epilogue.
| mov rsp, rbp
| pop rbp
-- Return.
| ret
I verified against a Lua implementation the result is correct. I was wondering if I could write the code in a different way to use less instructions. For instance, the RFC says it's possible to sum the octets as a dword. I tried but I didn't manage to get a correct checksum.
assembly tcp checksum
1
It looks like you don't handle odd length messages correctly. Can you verify?
â vnp
Jan 18 at 0:28
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
â Heslacher
Jan 18 at 8:34
@Hestacher Sorry, I didn't know. I appreciate you pointed that out. On the other hand, the guidelines say that is OK to post improved code as an answer ("If you want to show everyone how you improved your code, but don't want to ask another question, then post an answer to your own question. Selfie answers are acceptable on Stack Exchange sites, and even encouraged"). Thus, I posted the update as an answer.
â Diego Pino
Jan 18 at 9:02
add a comment |Â
up vote
3
down vote
favorite
up vote
3
down vote
favorite
This is an implementation of the TCP/IP checksum computation as described in https://tools.ietf.org/html/rfc1071.
The function is writen in DynASM, an assembler preprocessor that is part of LuaJIT. I didn't use much of the fancy stuff of DynASM so pretty much it looks like native X86_64 I think.
-- Prologue.
| push rbp
| mov rbp, rsp
-- Accumulative sum (16-bit).
| xor r9, r9 -- Clear out r9. Stores accumulated sum.
| mov rcx, 0 -- Set index to 0.
| 1:
| cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
| jg >2 -- Go to branch '2' if equals.
| mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
| mov al, [rdi + rcx+1]
| add r9, rax -- Sum and accumulate into r9.
| add rcx, 2 -- Increase index by 2.
| jmp <1 -- Go to beginning of loop.
| 2:
| mov rax, r9 -- Save accumulated sum in rax.
-- Sum carry until done.
| 4:
| shr r9, 16 -- Right-shift r9 16-bit to obtain value of carry.
| cmp r9, 0 -- Check whether there's carry.
| jz >5 -- In case not, go to '4'.
| and rax, 0xffff -- Clear out higher part of rax.
| add rax, r9 -- Sum carry to rax.
| mov r9, rax -- Assign rax to r9 to check if there's carry again.
| jmp <4 -- Go to beggining of loop.
-- One-complement.
| 5:
| not rax -- One-complement of rax.
| and rax, 0xffff -- Clear out higher part of rax.
-- Epilogue.
| mov rsp, rbp
| pop rbp
-- Return.
| ret
I verified against a Lua implementation the result is correct. I was wondering if I could write the code in a different way to use less instructions. For instance, the RFC says it's possible to sum the octets as a dword. I tried but I didn't manage to get a correct checksum.
assembly tcp checksum
This is an implementation of the TCP/IP checksum computation as described in https://tools.ietf.org/html/rfc1071.
The function is writen in DynASM, an assembler preprocessor that is part of LuaJIT. I didn't use much of the fancy stuff of DynASM so pretty much it looks like native X86_64 I think.
-- Prologue.
| push rbp
| mov rbp, rsp
-- Accumulative sum (16-bit).
| xor r9, r9 -- Clear out r9. Stores accumulated sum.
| mov rcx, 0 -- Set index to 0.
| 1:
| cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
| jg >2 -- Go to branch '2' if equals.
| mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
| mov al, [rdi + rcx+1]
| add r9, rax -- Sum and accumulate into r9.
| add rcx, 2 -- Increase index by 2.
| jmp <1 -- Go to beginning of loop.
| 2:
| mov rax, r9 -- Save accumulated sum in rax.
-- Sum carry until done.
| 4:
| shr r9, 16 -- Right-shift r9 16-bit to obtain value of carry.
| cmp r9, 0 -- Check whether there's carry.
| jz >5 -- In case not, go to '4'.
| and rax, 0xffff -- Clear out higher part of rax.
| add rax, r9 -- Sum carry to rax.
| mov r9, rax -- Assign rax to r9 to check if there's carry again.
| jmp <4 -- Go to beggining of loop.
-- One-complement.
| 5:
| not rax -- One-complement of rax.
| and rax, 0xffff -- Clear out higher part of rax.
-- Epilogue.
| mov rsp, rbp
| pop rbp
-- Return.
| ret
I verified against a Lua implementation the result is correct. I was wondering if I could write the code in a different way to use less instructions. For instance, the RFC says it's possible to sum the octets as a dword. I tried but I didn't manage to get a correct checksum.
assembly tcp checksum
edited Jan 18 at 8:34
Heslacher
43.9k359152
43.9k359152
asked Jan 18 at 0:14
Diego Pino
29329
29329
1
It looks like you don't handle odd length messages correctly. Can you verify?
â vnp
Jan 18 at 0:28
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
â Heslacher
Jan 18 at 8:34
@Hestacher Sorry, I didn't know. I appreciate you pointed that out. On the other hand, the guidelines say that is OK to post improved code as an answer ("If you want to show everyone how you improved your code, but don't want to ask another question, then post an answer to your own question. Selfie answers are acceptable on Stack Exchange sites, and even encouraged"). Thus, I posted the update as an answer.
â Diego Pino
Jan 18 at 9:02
add a comment |Â
1
It looks like you don't handle odd length messages correctly. Can you verify?
â vnp
Jan 18 at 0:28
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
â Heslacher
Jan 18 at 8:34
@Hestacher Sorry, I didn't know. I appreciate you pointed that out. On the other hand, the guidelines say that is OK to post improved code as an answer ("If you want to show everyone how you improved your code, but don't want to ask another question, then post an answer to your own question. Selfie answers are acceptable on Stack Exchange sites, and even encouraged"). Thus, I posted the update as an answer.
â Diego Pino
Jan 18 at 9:02
1
1
It looks like you don't handle odd length messages correctly. Can you verify?
â vnp
Jan 18 at 0:28
It looks like you don't handle odd length messages correctly. Can you verify?
â vnp
Jan 18 at 0:28
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
â Heslacher
Jan 18 at 8:34
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
â Heslacher
Jan 18 at 8:34
@Hestacher Sorry, I didn't know. I appreciate you pointed that out. On the other hand, the guidelines say that is OK to post improved code as an answer ("If you want to show everyone how you improved your code, but don't want to ask another question, then post an answer to your own question. Selfie answers are acceptable on Stack Exchange sites, and even encouraged"). Thus, I posted the update as an answer.
â Diego Pino
Jan 18 at 9:02
@Hestacher Sorry, I didn't know. I appreciate you pointed that out. On the other hand, the guidelines say that is OK to post improved code as an answer ("If you want to show everyone how you improved your code, but don't want to ask another question, then post an answer to your own question. Selfie answers are acceptable on Stack Exchange sites, and even encouraged"). Thus, I posted the update as an answer.
â Diego Pino
Jan 18 at 9:02
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
2
down vote
accepted
Your code, as shown, has a bug in that it will read past the end of the buffer you're computing the checksum for (by either 1 or 2 bytes, depending on if the length is odd or even). As that is easy enough to fix, I'll assume you make the appropriate changes to the code and move on to answering your question.
If you read over Section 2, paragraphs (B), of the RFC, you'll see that the addition can be done in either byte order. So rather than using two instructions to byte-swap the two bytes you're reading, just do one (mov ax,[rdi+rcx]
). This will result in the final checksum being byte-swapped, so you'll have to do the swap (xchg al,ah
) to get the same value to return from this function. If you're not doing anything with the checksum other than storing it into the TCP/IP packet, don't swap the return value, and don't swap the bytes when you write them out.
You can remove the cmp r9,0
instruction in the 4
block, as the shr
instruction will set the z
flag if the new register value is 0.
Now, to expand this method to utilize 32 or 64 bit registers, you can replace the (newly added) mov ax
instruction with mov eax
or mov rax
(with appropriate changes to the value of ecx
). This will need adjustments at the end to shift, add, and handle the carry to collapse the multiple bytes pairs into one, and the carry handling with the 64 bit register will also need to be different (inserting an adc r9,0
after the add r9,rax
instruction). Also, the handling for packets with sizes that are not multiples of 4 (or 8) will need to be added.
Details of this technique are discussed in the RFC.
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
add a comment |Â
up vote
0
down vote
Here is a diff with the changes suggested by @1201ProgramAlarm. The most important update is not overreading the buffer and handling the last byte in case the size of the array is an odd number. Then there are other suggestions implemented such as reading one 16-bit word in one read and not swapping the bytes. I agree it makes more sense the function returns the checksum in network-byte order and only convert it to host-byte order if it's needed to print out.
6c6
< | mov rcx, 0 -- Set index to 0.
---
> | mov rcx, rsi -- Init index to second-parameter (size).
8,11c8,12
< | cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
< | jg >2 -- Go to branch '2' if equals.
< | mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
< | mov al, [rdi + rcx+1]
---
> | cmp rcx, 1 -- If index is less or equal to 1.
> | jle >2 -- Go to branch '2'.
> | mov r8, rsi -- Set size to r8.
> | sub r8, rcx -- Decrease r8 by index.
> | mov ax, [rdi + r8] -- Fetch 16-bit word from data + r8.
13c14
< | add rcx, 2 -- Increase index by 2.
---
> | sub rcx, 2 -- Decrease index by 2.
14a16
> -- Handle last byte of data.
15a18,24
> | cmp rcx, 1 -- If rcx was equals to 1.
> | jne >3
> | mov rcx, rsi -- Set rcx to rsi-1.
> | sub rcx, 1
> | mov ax, [rdi + rcx] -- Read last byte of data.
> | add r9, rax -- And add it to r9.
> | 3:
20d28
< | cmp r9, 0 -- Check whether there's carry.
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
2
down vote
accepted
Your code, as shown, has a bug in that it will read past the end of the buffer you're computing the checksum for (by either 1 or 2 bytes, depending on if the length is odd or even). As that is easy enough to fix, I'll assume you make the appropriate changes to the code and move on to answering your question.
If you read over Section 2, paragraphs (B), of the RFC, you'll see that the addition can be done in either byte order. So rather than using two instructions to byte-swap the two bytes you're reading, just do one (mov ax,[rdi+rcx]
). This will result in the final checksum being byte-swapped, so you'll have to do the swap (xchg al,ah
) to get the same value to return from this function. If you're not doing anything with the checksum other than storing it into the TCP/IP packet, don't swap the return value, and don't swap the bytes when you write them out.
You can remove the cmp r9,0
instruction in the 4
block, as the shr
instruction will set the z
flag if the new register value is 0.
Now, to expand this method to utilize 32 or 64 bit registers, you can replace the (newly added) mov ax
instruction with mov eax
or mov rax
(with appropriate changes to the value of ecx
). This will need adjustments at the end to shift, add, and handle the carry to collapse the multiple bytes pairs into one, and the carry handling with the 64 bit register will also need to be different (inserting an adc r9,0
after the add r9,rax
instruction). Also, the handling for packets with sizes that are not multiples of 4 (or 8) will need to be added.
Details of this technique are discussed in the RFC.
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
add a comment |Â
up vote
2
down vote
accepted
Your code, as shown, has a bug in that it will read past the end of the buffer you're computing the checksum for (by either 1 or 2 bytes, depending on if the length is odd or even). As that is easy enough to fix, I'll assume you make the appropriate changes to the code and move on to answering your question.
If you read over Section 2, paragraphs (B), of the RFC, you'll see that the addition can be done in either byte order. So rather than using two instructions to byte-swap the two bytes you're reading, just do one (mov ax,[rdi+rcx]
). This will result in the final checksum being byte-swapped, so you'll have to do the swap (xchg al,ah
) to get the same value to return from this function. If you're not doing anything with the checksum other than storing it into the TCP/IP packet, don't swap the return value, and don't swap the bytes when you write them out.
You can remove the cmp r9,0
instruction in the 4
block, as the shr
instruction will set the z
flag if the new register value is 0.
Now, to expand this method to utilize 32 or 64 bit registers, you can replace the (newly added) mov ax
instruction with mov eax
or mov rax
(with appropriate changes to the value of ecx
). This will need adjustments at the end to shift, add, and handle the carry to collapse the multiple bytes pairs into one, and the carry handling with the 64 bit register will also need to be different (inserting an adc r9,0
after the add r9,rax
instruction). Also, the handling for packets with sizes that are not multiples of 4 (or 8) will need to be added.
Details of this technique are discussed in the RFC.
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
add a comment |Â
up vote
2
down vote
accepted
up vote
2
down vote
accepted
Your code, as shown, has a bug in that it will read past the end of the buffer you're computing the checksum for (by either 1 or 2 bytes, depending on if the length is odd or even). As that is easy enough to fix, I'll assume you make the appropriate changes to the code and move on to answering your question.
If you read over Section 2, paragraphs (B), of the RFC, you'll see that the addition can be done in either byte order. So rather than using two instructions to byte-swap the two bytes you're reading, just do one (mov ax,[rdi+rcx]
). This will result in the final checksum being byte-swapped, so you'll have to do the swap (xchg al,ah
) to get the same value to return from this function. If you're not doing anything with the checksum other than storing it into the TCP/IP packet, don't swap the return value, and don't swap the bytes when you write them out.
You can remove the cmp r9,0
instruction in the 4
block, as the shr
instruction will set the z
flag if the new register value is 0.
Now, to expand this method to utilize 32 or 64 bit registers, you can replace the (newly added) mov ax
instruction with mov eax
or mov rax
(with appropriate changes to the value of ecx
). This will need adjustments at the end to shift, add, and handle the carry to collapse the multiple bytes pairs into one, and the carry handling with the 64 bit register will also need to be different (inserting an adc r9,0
after the add r9,rax
instruction). Also, the handling for packets with sizes that are not multiples of 4 (or 8) will need to be added.
Details of this technique are discussed in the RFC.
Your code, as shown, has a bug in that it will read past the end of the buffer you're computing the checksum for (by either 1 or 2 bytes, depending on if the length is odd or even). As that is easy enough to fix, I'll assume you make the appropriate changes to the code and move on to answering your question.
If you read over Section 2, paragraphs (B), of the RFC, you'll see that the addition can be done in either byte order. So rather than using two instructions to byte-swap the two bytes you're reading, just do one (mov ax,[rdi+rcx]
). This will result in the final checksum being byte-swapped, so you'll have to do the swap (xchg al,ah
) to get the same value to return from this function. If you're not doing anything with the checksum other than storing it into the TCP/IP packet, don't swap the return value, and don't swap the bytes when you write them out.
You can remove the cmp r9,0
instruction in the 4
block, as the shr
instruction will set the z
flag if the new register value is 0.
Now, to expand this method to utilize 32 or 64 bit registers, you can replace the (newly added) mov ax
instruction with mov eax
or mov rax
(with appropriate changes to the value of ecx
). This will need adjustments at the end to shift, add, and handle the carry to collapse the multiple bytes pairs into one, and the carry handling with the 64 bit register will also need to be different (inserting an adc r9,0
after the add r9,rax
instruction). Also, the handling for packets with sizes that are not multiples of 4 (or 8) will need to be added.
Details of this technique are discussed in the RFC.
answered Jan 18 at 1:28
1201ProgramAlarm
2,5852618
2,5852618
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
add a comment |Â
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
Thanks a lot for the suggestions. I updated the question with a diff of the changes implemented.
â Diego Pino
Jan 18 at 8:25
add a comment |Â
up vote
0
down vote
Here is a diff with the changes suggested by @1201ProgramAlarm. The most important update is not overreading the buffer and handling the last byte in case the size of the array is an odd number. Then there are other suggestions implemented such as reading one 16-bit word in one read and not swapping the bytes. I agree it makes more sense the function returns the checksum in network-byte order and only convert it to host-byte order if it's needed to print out.
6c6
< | mov rcx, 0 -- Set index to 0.
---
> | mov rcx, rsi -- Init index to second-parameter (size).
8,11c8,12
< | cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
< | jg >2 -- Go to branch '2' if equals.
< | mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
< | mov al, [rdi + rcx+1]
---
> | cmp rcx, 1 -- If index is less or equal to 1.
> | jle >2 -- Go to branch '2'.
> | mov r8, rsi -- Set size to r8.
> | sub r8, rcx -- Decrease r8 by index.
> | mov ax, [rdi + r8] -- Fetch 16-bit word from data + r8.
13c14
< | add rcx, 2 -- Increase index by 2.
---
> | sub rcx, 2 -- Decrease index by 2.
14a16
> -- Handle last byte of data.
15a18,24
> | cmp rcx, 1 -- If rcx was equals to 1.
> | jne >3
> | mov rcx, rsi -- Set rcx to rsi-1.
> | sub rcx, 1
> | mov ax, [rdi + rcx] -- Read last byte of data.
> | add r9, rax -- And add it to r9.
> | 3:
20d28
< | cmp r9, 0 -- Check whether there's carry.
add a comment |Â
up vote
0
down vote
Here is a diff with the changes suggested by @1201ProgramAlarm. The most important update is not overreading the buffer and handling the last byte in case the size of the array is an odd number. Then there are other suggestions implemented such as reading one 16-bit word in one read and not swapping the bytes. I agree it makes more sense the function returns the checksum in network-byte order and only convert it to host-byte order if it's needed to print out.
6c6
< | mov rcx, 0 -- Set index to 0.
---
> | mov rcx, rsi -- Init index to second-parameter (size).
8,11c8,12
< | cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
< | jg >2 -- Go to branch '2' if equals.
< | mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
< | mov al, [rdi + rcx+1]
---
> | cmp rcx, 1 -- If index is less or equal to 1.
> | jle >2 -- Go to branch '2'.
> | mov r8, rsi -- Set size to r8.
> | sub r8, rcx -- Decrease r8 by index.
> | mov ax, [rdi + r8] -- Fetch 16-bit word from data + r8.
13c14
< | add rcx, 2 -- Increase index by 2.
---
> | sub rcx, 2 -- Decrease index by 2.
14a16
> -- Handle last byte of data.
15a18,24
> | cmp rcx, 1 -- If rcx was equals to 1.
> | jne >3
> | mov rcx, rsi -- Set rcx to rsi-1.
> | sub rcx, 1
> | mov ax, [rdi + rcx] -- Read last byte of data.
> | add r9, rax -- And add it to r9.
> | 3:
20d28
< | cmp r9, 0 -- Check whether there's carry.
add a comment |Â
up vote
0
down vote
up vote
0
down vote
Here is a diff with the changes suggested by @1201ProgramAlarm. The most important update is not overreading the buffer and handling the last byte in case the size of the array is an odd number. Then there are other suggestions implemented such as reading one 16-bit word in one read and not swapping the bytes. I agree it makes more sense the function returns the checksum in network-byte order and only convert it to host-byte order if it's needed to print out.
6c6
< | mov rcx, 0 -- Set index to 0.
---
> | mov rcx, rsi -- Init index to second-parameter (size).
8,11c8,12
< | cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
< | jg >2 -- Go to branch '2' if equals.
< | mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
< | mov al, [rdi + rcx+1]
---
> | cmp rcx, 1 -- If index is less or equal to 1.
> | jle >2 -- Go to branch '2'.
> | mov r8, rsi -- Set size to r8.
> | sub r8, rcx -- Decrease r8 by index.
> | mov ax, [rdi + r8] -- Fetch 16-bit word from data + r8.
13c14
< | add rcx, 2 -- Increase index by 2.
---
> | sub rcx, 2 -- Decrease index by 2.
14a16
> -- Handle last byte of data.
15a18,24
> | cmp rcx, 1 -- If rcx was equals to 1.
> | jne >3
> | mov rcx, rsi -- Set rcx to rsi-1.
> | sub rcx, 1
> | mov ax, [rdi + rcx] -- Read last byte of data.
> | add r9, rax -- And add it to r9.
> | 3:
20d28
< | cmp r9, 0 -- Check whether there's carry.
Here is a diff with the changes suggested by @1201ProgramAlarm. The most important update is not overreading the buffer and handling the last byte in case the size of the array is an odd number. Then there are other suggestions implemented such as reading one 16-bit word in one read and not swapping the bytes. I agree it makes more sense the function returns the checksum in network-byte order and only convert it to host-byte order if it's needed to print out.
6c6
< | mov rcx, 0 -- Set index to 0.
---
> | mov rcx, rsi -- Init index to second-parameter (size).
8,11c8,12
< | cmp rcx, rsi -- Rsi stores size arg. Compare index to size.
< | jg >2 -- Go to branch '2' if equals.
< | mov ah, [rdi + rcx] -- Read 16-bit of data and swapping byte order.
< | mov al, [rdi + rcx+1]
---
> | cmp rcx, 1 -- If index is less or equal to 1.
> | jle >2 -- Go to branch '2'.
> | mov r8, rsi -- Set size to r8.
> | sub r8, rcx -- Decrease r8 by index.
> | mov ax, [rdi + r8] -- Fetch 16-bit word from data + r8.
13c14
< | add rcx, 2 -- Increase index by 2.
---
> | sub rcx, 2 -- Decrease index by 2.
14a16
> -- Handle last byte of data.
15a18,24
> | cmp rcx, 1 -- If rcx was equals to 1.
> | jne >3
> | mov rcx, rsi -- Set rcx to rsi-1.
> | sub rcx, 1
> | mov ax, [rdi + rcx] -- Read last byte of data.
> | add r9, rax -- And add it to r9.
> | 3:
20d28
< | cmp r9, 0 -- Check whether there's carry.
answered Jan 18 at 9:02
Diego Pino
29329
29329
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f185354%2fchecksum-computation-in-x86-64%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
1
It looks like you don't handle odd length messages correctly. Can you verify?
â vnp
Jan 18 at 0:28
Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
â Heslacher
Jan 18 at 8:34
@Hestacher Sorry, I didn't know. I appreciate you pointed that out. On the other hand, the guidelines say that is OK to post improved code as an answer ("If you want to show everyone how you improved your code, but don't want to ask another question, then post an answer to your own question. Selfie answers are acceptable on Stack Exchange sites, and even encouraged"). Thus, I posted the update as an answer.
â Diego Pino
Jan 18 at 9:02