string length in x64 assembly (fasm)
Please critique this very, very basic routine which returns the length of a given char buffer or "string."
strlen: ; NOTE: RDI IS THE DEFAULT SRC FOR SCASB
push rdi
push rcx
xor rcx, rcx
mov rcx, -1
xor al, al
cld
repne scasb
neg rcx
sub rcx, 1
mov rax, rcx
pop rcx
pop rdi
ret
strings assembly x86
edited Jul 31 at 17:20 by 200_success
asked Jul 31 at 14:51 by the_endian
It would be helpful to indicate whether you're coding to the Sys V ABI or the Microsoft ABI for AMD64.
- Jonathon Reinhart, Jul 31 at 17:41
Don't use xor al, al. In general, avoid partial register updates like that.
- phuclv, Aug 1 at 2:08
1 Answer
Saving rcx is usually not necessary: it is not callee-saved in the common calling conventions. On Linux (and similar System V platforms) rdi does not need to be saved either. I guess you are saving it because the Win64 calling convention, where rdi is callee-saved, does not pass an argument in rdi. You can save both anyway if you want, which can be useful with a custom calling convention. Note, though, that saving an even number of registers leaves the stack not 16-byte aligned: on entry rsp is 8 bytes off a 16-byte boundary because of the return address, and two 8-byte pushes keep it there. You will probably get away with that here, but if you call a function that uses XMM registers, for example, it may save them at locations it assumes are aligned (and there are some other cases where misalignment causes trouble).
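The alignment argument above is plain modular arithmetic, and a small C model can make it concrete. This is only a sketch: `rsp_mod16_after_pushes` is a hypothetical helper, and it assumes the usual situation where rsp is 16-byte aligned just before the call, so rsp % 16 == 8 on entry.

```c
#include <assert.h>

/* Hypothetical model (not part of the reviewed routine): returns rsp % 16
   after `pushes` 8-byte pushes, assuming rsp % 16 == 8 on function entry
   because `call` has just pushed the 8-byte return address. */
unsigned rsp_mod16_after_pushes(unsigned pushes)
{
    const unsigned entry_mod16 = 8;
    /* Each push subtracts 8 from rsp; all arithmetic is done modulo 16. */
    return (entry_mod16 + 16u - (8u * pushes) % 16u) % 16u;
}

/* Quick self-check of the model. */
void check_alignment_model(void)
{
    assert(rsp_mod16_after_pushes(0) == 8); /* entry: misaligned by 8 */
    assert(rsp_mod16_after_pushes(1) == 0); /* odd count: aligned     */
    assert(rsp_mod16_after_pushes(2) == 8); /* even count: misaligned */
}
```

Under this model, an odd number of pushes (or an extra sub rsp, 8) would be what restores 16-byte alignment before any nested call.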
xor rcx, rcx
mov rcx, -1
The xor is not useful: rcx does not need to be zeroed before being overwritten, and a mov into a 64-bit (or 32-bit) register already has no dependency on the previous value. By the way, when you do want to zero a 64-bit register, you can use a 32-bit xor, since writing to the low 32 bits of a register zeroes the upper half of the 64-bit register. There is not really an immediate performance difference, but the 32-bit version often lets you save the REX prefix, unless of course one of the "numbered registers" (r8 to r15) is an operand.
Because -x - 1 = ~x + 1 - 1 = ~x (using the definition of two's complement, -x = ~x + 1) and you don't use the flags set by the sub,
neg rcx
sub rcx, 1
mov rax, rcx
is equivalent to:
not rcx
mov rax, rcx
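The identity can be spot-checked with unsigned 64-bit arithmetic in C, which wraps the same way the register does. This is a minimal sketch; the function names are made up for illustration and only mirror the two instruction sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors `neg rcx ; sub rcx, 1` using wrapping unsigned arithmetic. */
uint64_t neg_then_sub1(uint64_t x)
{
    return (0 - x) - 1;
}

/* Mirrors `not rcx`. */
uint64_t bitwise_not(uint64_t x)
{
    return ~x;
}

/* Compares both forms on a few edge-case values. */
void check_identity(void)
{
    const uint64_t samples[] = {
        0, 1, 2, 0x7fffffffffffffffULL, 0x8000000000000000ULL, UINT64_MAX
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(neg_then_sub1(samples[i]) == bitwise_not(samples[i]));
}
```

The two forms differ only in the flags they leave behind, which is why the equivalence depends on the flags not being used afterwards.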
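So, all combined, this function could be simplified slightly to the following (assuming saving rdi and rcx is useful). Note that this version also drops the cld, which relies on the direction flag being clear on entry, as the System V and Win64 calling conventions require: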
strlen:
push rdi
push rcx
mov rcx, -1
xor eax, eax
repne scasb
not rcx
mov rax, rcx
pop rcx
pop rdi
ret
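To see what value the routine leaves in rax, the counting behaviour of repne scasb can be modelled in C. This is a simplified sketch, not the author's code: `model_routine` is a hypothetical name, and the model assumes al = 0, a clear direction flag, and a NUL-terminated input.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the routine above: rcx starts at -1, every scasb
   iteration decrements it (including the one that finds the terminator),
   and the result is the bitwise complement of the final count. */
uint64_t model_routine(const char *s)
{
    uint64_t rcx = UINT64_MAX;                  /* mov rcx, -1 */
    const unsigned char *rdi = (const unsigned char *)s;

    while (rcx != 0) {
        unsigned char byte = *rdi++;            /* scasb: compare al with [rdi], advance rdi */
        rcx--;                                  /* count drops on every iteration */
        if (byte == 0)                          /* repne stops once ZF is set */
            break;
    }
    return ~rcx;                                /* not rcx ; mov rax, rcx */
}

/* Quick self-check of the model. */
void check_model(void)
{
    assert(model_routine("") == 1);
    assert(model_routine("abc") == 4);
}
```

Under this model the result counts the terminating zero byte as well, so "abc" gives 4. If the conventional strlen result is what is wanted, one further decrement of the result would be needed; that is an observation about the model, not something the original posts state.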
answered Jul 31 at 17:11 by harold
edited Jul 31 at 17:18
How do you feel about xor ecx, ecx ; dec rcx (5 bytes) instead of mov rcx, -1 (7 bytes)? Or even lea rcx, -1[rax] (4 bytes)? But more importantly: comments. When it comes to asm, I'm a big fan of lots of comments. In particular, if registers are being saved for custom calling reasons (or whatever), you'd certainly want some comments saying so.
- David Wohlferd, Jul 31 at 23:25
xor r32, r32 should be used even for the high-numbered registers, since xor r64, r64 is not recognized as a zeroing idiom on KNL. @DavidWohlferd see "Set all bits in CPU register to 1 efficiently".
- phuclv, Aug 1 at 2:05
@phuclv That link seems to like my lea rcx, -1[rax] solution, since we already have a zeroed register we can use (rax).
- David Wohlferd, Aug 1 at 2:53
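The byte counts quoted in the comments can be checked against the usual x86-64 encodings of those instructions. The sketch below lists them as byte arrays in C; the array names are made up for illustration, and an assembler is free to pick an equivalent encoding (for example 33 C9 instead of 31 C9 for the xor), which does not change the sizes.

```c
#include <assert.h>

/* xor ecx, ecx (31 C9) followed by dec rcx (48 FF C9) */
static const unsigned char xor_ecx_dec_rcx[] = { 0x31, 0xC9, 0x48, 0xFF, 0xC9 };

/* mov rcx, -1 with a sign-extended 32-bit immediate (48 C7 C1 imm32) */
static const unsigned char mov_rcx_minus1[] = { 0x48, 0xC7, 0xC1, 0xFF, 0xFF, 0xFF, 0xFF };

/* lea rcx, [rax-1] with an 8-bit displacement (48 8D 48 FF) */
static const unsigned char lea_rcx_rax_minus1[] = { 0x48, 0x8D, 0x48, 0xFF };

/* Confirms the sizes mentioned in the comment thread. */
void check_encoding_sizes(void)
{
    assert(sizeof xor_ecx_dec_rcx == 5);
    assert(sizeof mov_rcx_minus1 == 7);
    assert(sizeof lea_rcx_rax_minus1 == 4);
}
```

The lea form is only valid when rax is already zero at that point, so in the routine it would have to come after the xor eax, eax.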