No: many (most?) modern SIMD instructions don’t require alignment. From the Intel Intrinsics Guide entry for _mm_loadu_si128 (sorry, I can’t figure out how to link to it directly):
> Load 128-bits of integer data from memory into dst. mem_addr does not need to be aligned on any particular boundary.
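For what it's worth, here is a minimal sketch of what that looks like in practice, assuming an x86 target with SSE2. The helper name `sum16_unaligned` and its parameters are made up for illustration; only the intrinsics themselves come from the guide. It loads 16 bytes from an arbitrary byte offset and adds them up:

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum the 16 bytes starting at buf + offset.
   offset can be any value, so the address is usually not 16-byte aligned. */
static uint32_t sum16_unaligned(const uint8_t *buf, size_t len, size_t offset)
{
    assert(offset + 16 <= len); /* stay inside the buffer */

    /* Unaligned load: no alignment requirement on the address. */
    __m128i v = _mm_loadu_si128((const __m128i *)(buf + offset));

    /* Sum of absolute differences against zero gives two partial sums,
       one per 64-bit half (bytes 0-7 and bytes 8-15). */
    __m128i sad = _mm_sad_epu8(v, _mm_setzero_si128());

    uint32_t lo = (uint32_t)_mm_cvtsi128_si32(sad);
    uint32_t hi = (uint32_t)_mm_cvtsi128_si32(_mm_srli_si128(sad, 8));
    return lo + hi;
}
```

Swapping in _mm_load_si128 here would only be valid when buf + offset happens to be 16-byte aligned, so the unaligned variant is the one to use for arbitrary offsets.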
It doesn't need to be aligned, but is there a performance difference? I seem to remember that there is no difference between _mm_load_si128 and _mm_loadu_si128 on modern CPUs, but I'm not sure.
Wouldn't this cause a sizable performance hit due to being misaligned most of the time?