It's rare to find a meaningful algorithm where memory is the one and only bottleneck. memcpy is about the only example, and even that can benefit from SIMD tuning on many systems.
Especially since high-performance software is glad to have even a 1% boost in speed.