
i += 7;

Wouldn't this cause a sizable performance hit due to being misaligned most of the time?




Author here. This was true several generations ago (Core 2, for instance); on current CPUs the performance penalty is negligible.


What about ARM?


Sorry, have no idea.


No: many (most?) modern SIMD instructions don’t require alignment. From the Intel Intrinsics Guide (can’t figure out how to link directly to it, sorry) on _mm_loadu_si128:

> Load 128-bits of integer data from memory into dst. mem_addr does not need to be aligned on any particular boundary.
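For anyone who wants to see what that looks like with the `i += 7;` stride from the top of the thread, here is a minimal sketch, not the author's actual code. The helper names `load16_unaligned` and `count_misaligned_loads` are made up for illustration, and the `memcpy` branch is only an assumed fallback so the snippet still builds on targets without SSE2. It shows that the loads are correct at any offset; it does not measure speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_storeu_si128 */
#endif

/* Hypothetical helper: copy the 16 bytes starting at src + offset into out.
   The address may have any alignment. */
static void load16_unaligned(const uint8_t *src, size_t offset, uint8_t out[16]) {
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)(src + offset));
    _mm_storeu_si128((__m128i *)out, v);
#else
    memcpy(out, src + offset, 16);  /* assumed fallback for non-SSE2 builds */
#endif
}

/* Hypothetical helper: step through buf 7 bytes at a time, loading 16 bytes
   at each step, and return how many of those loads started at an address
   that is not a multiple of 16. */
static size_t count_misaligned_loads(const uint8_t *buf, size_t len) {
    size_t misaligned = 0;
    uint8_t tmp[16];
    for (size_t i = 0; i + 16 <= len; i += 7) {
        load16_unaligned(buf, i, tmp);
        /* The unaligned load must return exactly the bytes at buf + i. */
        assert(memcmp(tmp, buf + i, 16) == 0);
        if (((uintptr_t)(buf + i) & 15u) != 0) {
            misaligned++;
        }
    }
    return misaligned;
}
```

With a 16-byte-aligned buffer, nearly every step of a 7-byte stride starts at a misaligned address, which is why the question upthread comes up at all.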


Doesn't need, but is there a performance difference? I seem to remember there is no difference between _mm_load_si128 and _mm_loadu_si128 on modern CPUs, but I'm not sure.


loadu is not the best example because its sole purpose is loading unaligned data. There is a separate load for aligned data (_mm_load_si128).
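To make the distinction concrete, a small sketch under stated assumptions: the function name `aligned_and_unaligned_loads_agree` is hypothetical, and the `memcpy` branch is an assumed fallback for non-SSE2 targets. `_mm_load_si128` requires a 16-byte-aligned address, while `_mm_loadu_si128` accepts any address; on an aligned address both return the same bytes. This says nothing about relative speed, which would need a benchmark on the CPU in question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2: _mm_load_si128, _mm_loadu_si128 */
#endif

/* Hypothetical helper: load the same 16-byte-aligned buffer with both the
   aligned and the unaligned intrinsic and report whether the results match.
   Returns 1 on a match, 0 otherwise. */
static int aligned_and_unaligned_loads_agree(void) {
    _Alignas(16) uint8_t src[16];
    uint8_t a[16];
    uint8_t u[16];

    for (int i = 0; i < 16; i++) {
        src[i] = (uint8_t)(i * 3);
    }

    /* _mm_load_si128 is only valid on a 16-byte-aligned address. */
    assert(((uintptr_t)src & 15u) == 0);

#if defined(__SSE2__)
    __m128i va = _mm_load_si128((const __m128i *)src);   /* aligned load   */
    __m128i vu = _mm_loadu_si128((const __m128i *)src);  /* unaligned load */
    _mm_storeu_si128((__m128i *)a, va);
    _mm_storeu_si128((__m128i *)u, vu);
#else
    memcpy(a, src, 16);  /* assumed fallback for non-SSE2 builds */
    memcpy(u, src, 16);
#endif

    return memcmp(a, u, 16) == 0 && memcmp(a, src, 16) == 0;
}
```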



