Hacker News new | past | comments | ask | show | jobs | submit login

Obviously, bandwidth is not the same as latency.

GPU card makers make the trade-off between bandwidth and latency in favor of bandwidth. When you're doing mostly branch-free processing in large chunks, that's the trade-off to make. All you need is a straightforward prefetcher and you don't need to worry about latency.

That's not true for general-purpose CPUs, which perform lots of branches that need to be predicted (so we can predict what to fetch). The data processed on CPUs also tends to be different (structures vs. vectors), with lots of pointer chasing (vtables, linked lists, hash tables, trees). That requires lower latencies, since the access patterns are a lot more random.
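A toy Python sketch of the difference (interpreter overhead masks much of the effect, which is far larger in compiled code, but the structure of the problem is the same): sequential access has predictable addresses a prefetcher can stream, while pointer chasing makes each load depend on the previous one.

```python
import random
import time

N = 500_000

# Sequential scan: addresses are predictable, so hardware prefetchers
# can stream the data ahead of the loads.
arr = list(range(N))

t0 = time.perf_counter()
total_seq = sum(arr)
t_seq = time.perf_counter() - t0

# Pointer chasing: build one big random cycle through the indices.
# The next index is data-dependent, so each access must wait for the
# previous one to complete -- latency, not bandwidth, dominates.
order = list(range(N))
random.shuffle(order)
nxt = [0] * N
for i in range(N):
    nxt[order[i]] = order[(i + 1) % N]

t0 = time.perf_counter()
i = 0
total_chase = 0
for _ in range(N):
    total_chase += i
    i = nxt[i]
t_chase = time.perf_counter() - t0

# Chasing the cycle for N steps visits every index exactly once,
# so both loops compute the same sum.
print(total_seq == total_chase)
```

The timings themselves are machine- and interpreter-dependent; the point is that the second loop cannot be prefetched, because the address of the next load is only known after the current one finishes.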

The stated goal of HBM is taming the power consumption (and thus also heat) of GPU systems while keeping the same (or higher) bandwidth. The name HBM stands for high bandwidth memory.

And while HBM has a lower clock frequency than GDDR5 (roughly 1/4), it has a much wider bus: 1024 bits vs. 32 bits per GDDR5 channel. Per transfer, it can move 32x the data. 32x / 4 = 8, so the transfer rate is about 8 times higher. The recent Radeon cards with HBM have memory transfer speeds of 256 GB/s vs. 48 GB/s for GDDR5.
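As a sanity check on that arithmetic, using only the figures above (the 1/4 clock ratio is approximate):

```python
# Effective-bandwidth ratio from bus width and clock:
hbm_bus_bits = 1024     # HBM bus width
gddr5_bus_bits = 32     # one GDDR5 channel
clock_ratio = 1 / 4     # HBM clock relative to GDDR5 (approximate)

width_ratio = hbm_bus_bits / gddr5_bus_bits   # 32x wider bus
transfer_ratio = width_ratio * clock_ratio    # 32 / 4 = 8x the transfer rate
print(transfer_ratio)  # 8.0
```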

Again, HBM trades latency for bandwidth. It offsets some of the latency increase from the 1/4 clock by putting the HBM stacks on the same package as the GPU (on a silicon interposer) instead of out on the board.

I think you're conflating a few different arguments. GPU workloads are not latency-sensitive, so in GPU land transfer speed (bandwidth) is speed.




I am not sure you are familiar with modern GPU architecture. Both AMD's and Nvidia's GPUs have no problem with branches. They don't do prediction and prefetch because it's pretty pointless on a single-issue architecture. I believe the ISA docs are available to the general public; you could easily familiarize yourself with them. I am also quite familiar with latency and bandwidth, so the concept of negating one with the other sounds very amateurish to me. If you could do that, then everyone would have switched to high-bandwidth memory and negated all the latency :) Speed is still speed and bandwidth is still bandwidth.
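The claim that GPUs handle branches without prediction can be illustrated with a simplified model of SIMT execution: on a divergent branch, the hardware runs both sides and uses a per-lane mask to keep the right result, so there is no speculation and nothing to roll back. (This is a toy model; real warp scheduling and the EXEC-mask machinery are more involved.)

```python
# Toy SIMT model: one "warp" of 8 lanes hits a divergent if/else.
# Both sides are executed for all lanes; a per-lane mask selects
# which result is kept. Divergence costs time, not mispredictions.

def warp_execute(values):
    mask = [v % 2 == 0 for v in values]       # branch condition, per lane
    then_side = [v // 2 for v in values]      # "then" path, computed for all lanes
    else_side = [v * 3 + 1 for v in values]   # "else" path, computed for all lanes
    # Predicated select: each lane keeps the result its mask bit chose.
    return [t if m else e for m, t, e in zip(mask, then_side, else_side)]

print(warp_execute([1, 2, 3, 4, 5, 6, 7, 8]))
# [4, 1, 10, 2, 16, 3, 22, 4]
```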




