GPU memory is all about bandwidth, not latency. DDR5 runs at 4-8 GT/s over a 64-bit bus per DIMM, so it tops out around 64 GB/s per channel: about 128 GB/s with dual-channel memory, and about 512 GB/s with 8 memory channels on server chips. The GDDR6X in the 4090 runs at 21 GT/s per pin, more than twice as fast, and its 384-bit memory bus is 6x as wide as a single DIMM's, so you get nearly an order of magnitude more throughput than a dual-channel desktop: about 1 TB/s on a consumer product. Datacenter GPUs with HBM2e (e.g. the 80 GB A100) double that to about 2 TB/s.
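The arithmetic above is just transfer rate times bus width. Here is a rough sketch of it in Python. The helper name `peak_bandwidth_gbs` is made up for illustration, and the per-pin rates and bus widths are assumed round figures (21 GT/s on a 384-bit bus for the 4090, roughly 3.2 GT/s on a 5120-bit bus for the 80 GB A100), so treat the results as theoretical peaks rather than measured numbers.

```python
def peak_bandwidth_gbs(transfer_rate_gts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s.

    transfers per second (in billions) times bytes moved per transfer.
    """
    return transfer_rate_gts * bus_width_bits / 8


# (transfer rate in GT/s, total bus width in bits); all values are assumed
# round figures for illustration, not vendor-verified specifications.
configs = {
    "DDR5-8000, 1 channel": (8.0, 64),
    "DDR5-8000, 2 channels": (8.0, 2 * 64),
    "DDR5-8000, 8 channels": (8.0, 8 * 64),
    "RTX 4090 (GDDR6X)": (21.0, 384),
    "A100 80GB (HBM2e)": (3.2, 5120),
}

for name, (rate, width) in configs.items():
    print(f"{name}: {peak_bandwidth_gbs(rate, width):.0f} GB/s")
```

Real-world throughput lands below these peaks, but the ratios between the rows are the point: HBM gets there with a very wide bus at a low per-pin rate, while GDDR gets there with a narrower bus at a much higher per-pin rate.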