
I'm not super familiar with ARM / ARM64 assembly and was confused as to how x0 was incremented. Was going to ask here, but decided to not be lazy and just look it up.

  const float f = *data++;


  ldr s1, [x0], #4
Turns out this instruction loads from the address in x0 and then increments x0 by 4, all in one go. It looks like you can use negative values too, so you could iterate over something in reverse.
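For anyone who wants to poke at this in a compiler explorer, here is a minimal C sketch of the two loop shapes. The function names are made up for illustration, and which addressing mode the compiler actually picks is up to the compiler and its flags; the comments only describe the form that could match.

```c
#include <assert.h>
#include <stddef.h>

/* Forward walk: each `*data++` is a load followed by a pointer bump,
   the shape that can map to a post-indexed load such as
   `ldr s1, [x0], #4` on AArch64. */
float sum_forward(const float *data, size_t n) {
    assert(data != NULL || n == 0);
    float total = 0.0f;
    while (n--) {
        total += *data++;
    }
    return total;
}

/* Reverse walk: start one past the end and step back before each load.
   `*--p` keeps the pointer inside the array; on AArch64 this shape can
   map to a load with a negative offset and writeback. */
float sum_reverse(const float *data, size_t n) {
    assert(data != NULL || n == 0);
    float total = 0.0f;
    const float *p = data + n;
    while (n--) {
        total += *--p;
    }
    return total;
}
```

Both functions visit every element exactly once; only the direction of the pointer update differs.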

Kind of cool, I don't think x86_64 has a single instruction that can load and increment in one go.




lods and stos do a load/store plus an increment of `rsi` or `rdi` respectively. There's also movs, which copies between two memory addresses and increments both pointers.

Usually seen in conjunction with rep, which repeats the above instruction `rcx` times.

A simple memset of 10 bytes:

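    mov rcx, 10      ; repeat count
    mov rdi, dest    ; destination pointer
    mov rax, 0       ; stosb stores al, so this is the fill byte
    rep stosb        ; store al at [rdi] and advance rdi, rcx times (assumes DF = 0)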
You can use the w, d or q suffixes to advance by 2, 4 or 8 bytes.
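If the register dance is hard to picture, here is a rough C model of what `rep stos` does with the direction flag clear. This is only a sketch: `rep_stos` and its parameters are hypothetical names standing in for the registers, not a real API, and the real instruction works on machine state rather than arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of `rep stos{b,w,d,q}` with DF = 0.
   rdi   : destination pointer
   rax   : value whose low `width` bytes get stored
   rcx   : repeat count
   width : 1, 2, 4 or 8 for the b, w, d or q suffix
   Returns the advanced destination pointer. */
unsigned char *rep_stos(unsigned char *rdi, uint64_t rax,
                        size_t rcx, size_t width) {
    assert(width == 1 || width == 2 || width == 4 || width == 8);
    while (rcx != 0) {
        /* Store the low bytes of rax, least significant byte first,
           which is the byte order x86 uses in memory. */
        for (size_t i = 0; i < width; i++) {
            rdi[i] = (unsigned char)(rax >> (8 * i));
        }
        rdi += width;  /* the automatic increment */
        rcx--;
    }
    return rdi;
}
```

Calling `rep_stos(dest, 0, 10, 1)` mirrors the 10-byte memset above; passing a width of 4 mirrors `rep stosd`.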


Oh cool, and it looks like it can also decrement by 1/2/4/8.

> After the byte, word, or doubleword is transferred from the memory location into the AL, AX, or EAX register, the (E)SI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E)SI register is incremented; if the DF flag is 1, the ESI register is decremented.) The (E)SI register is incremented or decremented by 1 for byte operations, by 2 for word operations, or by 4 for doubleword operations.

https://www.felixcloutier.com/x86/lods:lodsb:lodsw:lodsd:lod...
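The DF behaviour in that quote can be modelled in a few lines of C. Again just a sketch under assumptions: `lods_model` is a made-up name, `si` is an offset into a buffer standing in for the source register, `df` stands in for the direction flag, and the loaded value is returned instead of being merged into a register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of `lods{b,w,d,q}`.
   mem   : the memory being read
   si    : offset into mem, updated after the load
   df    : 0 means increment after the load, nonzero means decrement
   width : 1, 2, 4 or 8 for the b, w, d or q suffix
   Returns the loaded value, assembled least significant byte first. */
uint64_t lods_model(const unsigned char *mem, size_t *si,
                    int df, size_t width) {
    assert(width == 1 || width == 2 || width == 4 || width == 8);
    uint64_t value = 0;
    for (size_t i = 0; i < width; i++) {
        value |= (uint64_t)mem[*si + i] << (8 * i);
    }
    if (df) {
        *si -= width;  /* DF = 1: step backwards */
    } else {
        *si += width;  /* DF = 0: step forwards */
    }
    return value;
}
```

The load always happens at the current offset; the flag only decides which way the offset moves afterwards, which is why the same instruction can walk a buffer in either direction.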



