
Duff's device, as emitted by GCC[0], is a bit on the verbose side but still quite neat, in particular the single-instruction computed goto, which indexes a look-up table of 8 quad-words whose values are filled in by the linker.

Note the '.section .rodata' directive, which actually places the quad-word pointers, seemingly interleaved with code, in a read-only data section.

Note also that the dec/test/jle instructions implementing the while loop are interleaved with the next-to-last copy operation in the listing (.L5), ahead of the last of the eight copies (.L3), which then jumps back to .L11.

  duff:
  .LFB0:
          .cfi_startproc
          lea     eax, [rdi+7]
          mov     r8d, 8
          mov     rcx, rdx
          cdq
          idiv    r8d
          mov     r9d, eax
          mov     eax, edi
          cdq
          idiv    r8d
          cmp     edx, 7
          ja      .L2
          mov     edx, edx
          jmp     [QWORD PTR .L4[0+rdx*8]]
          .section        .rodata
          .align 8
          .align 4
  .L4:
          .quad   .L3
          .quad   .L5
          .quad   .L6
          .quad   .L7
          .quad   .L8
          .quad   .L9
          .quad   .L10
          .quad   .L11
          .text
  .L11:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
  .L10:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
  .L9:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
  .L8:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
  .L7:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
  .L6:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
  .L5:
          mov     al, BYTE PTR [rsi]
          dec     r9d
          inc     rsi
          test    r9d, r9d
          mov     BYTE PTR [rcx], al
          jle     .L2
  .L3:
          mov     al, BYTE PTR [rsi]
          inc     rsi
          mov     BYTE PTR [rcx], al
          jmp     .L11
  .L2:
          ret
          .cfi_endproc
___

edit: formatting, sigh

[0] GCC 7.3.0, 64-bit; gcc -S -Os -masm=intel




> Note the '.section .rodata' directive which places ..., seemingly interleaved with code, in a read-only data section.

It only specifies which section that part of the "code" goes into; the linker pools everything into the binary image and fills in the addresses for the labels at link time, as directed by the linker script[0].

[0]: https://sourceware.org/binutils/docs/ld/Simple-Example.html
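For what it's worth, the .L4 table is just an array of code addresses, and roughly the same shape can be written by hand. A minimal sketch, assuming the GNU C "labels as values" extension (`&&label` and `goto *ptr`, supported by GCC and Clang but not standard C); the function name `steps_from` and the labels are made up for illustration, and which section the table ends up in depends on compiler flags:

```c
/* Hypothetical illustration, not taken from the listing above.
 * Relies on the GNU C "labels as values" extension. */
int steps_from(unsigned entry)
{
    /* Array of code addresses, analogous to the .quad entries at .L4. */
    static void *const table[] = { &&e0, &&e1, &&e2, &&e3 };
    int steps = 0;

    /* Bounds check, playing the role of the cmp/ja pair before the jmp. */
    if (entry > 3)
        return 0;

    /* Indirect jump through the table: the "computed goto". */
    goto *table[entry];

e3: steps++;   /* each label falls through to the next one */
e2: steps++;
e1: steps++;
e0: steps++;
    return steps;
}
```

Calling `steps_from(3)` enters at `e3` and falls through all four increments, while `steps_from(0)` enters at `e0` and executes only the last one.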


I think Duff's device is more interesting in C :)

https://en.wikipedia.org/wiki/Duff%27s_device
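For reference, a sketch of the classic form in C. The signature is an assumption, guessed from the register use in the listing above (count in edi, source in rsi, destination in rdx); the destination is never incremented, as in the original memory-mapped-register use case, and is marked `volatile` here so the repeated stores are not optimized away. The `count <= 0` guard is an addition of this sketch; the listing has no such check.

```c
/* Sketch of Duff's device; signature and guard are assumptions. */
void duff(int count, const char *from, volatile char *to)
{
    int n;

    if (count <= 0)          /* added guard: the classic form assumes count > 0 */
        return;

    n = (count + 7) / 8;     /* number of passes through the unrolled loop */

    switch (count % 8) {     /* jump into the middle of the loop body */
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
            } while (--n > 0);
    }
}
```

The `switch` picks the entry point for the first, partial pass, and every later pass runs all eight copies, so `count` bytes are written in total.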



