
I don't quite understand how he gets a 10,000x speedup from a 100x transistor count decrease. Does die area increase with the square of transistor count?



What he's doing is representing numbers by their logarithms, at limited precision. A floating-point multiplier/divider then turns into a plain adder/subtractor, which is much smaller and faster. Squaring and square roots turn into one-bit shifts. Addition and subtraction are the hard case in a log representation, and they have some clever method for doing those efficiently. And since they can fit all this in a small area with short critical paths, they can clock it very, very fast and put a lot of these units on a chip.
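To make the log trick concrete, here is a rough Python sketch of a logarithmic number system. Everything in it is an assumption for illustration: the 8 fractional bits, the function names, and the way addition is done are all made up here. In particular, `lns_add` just computes the correction term log2(1 + 2^d) with floating-point math as a stand-in; it is not the "clever method" mentioned above, and real hardware would likely use a table or an approximation instead. Sign and zero handling are left out.

```python
import math

FRAC_BITS = 8            # assumed precision: fractional bits of the stored log2
SCALE = 1 << FRAC_BITS   # fixed-point scale factor

def encode(x: float) -> int:
    """Store a positive real as a fixed-point log2 (sign/zero handling omitted)."""
    if x <= 0:
        raise ValueError("this sketch only handles positive values")
    return round(math.log2(x) * SCALE)

def decode(e: int) -> float:
    """Convert a fixed-point log2 back to an ordinary float."""
    return 2.0 ** (e / SCALE)

def lns_mul(a: int, b: int) -> int:
    return a + b          # log(x*y) = log x + log y: one integer add

def lns_div(a: int, b: int) -> int:
    return a - b          # log(x/y) = log x - log y: one integer subtract

def lns_square(a: int) -> int:
    return a << 1         # log(x^2) = 2 log x: shift left by one bit

def lns_sqrt(a: int) -> int:
    return a >> 1         # log(sqrt x) = (log x)/2: arithmetic shift right

def lns_add(a: int, b: int) -> int:
    """log(x+y) = hi + log2(1 + 2^(lo-hi)); the correction term is the costly part.
    Computed here with float math purely as a stand-in for a hardware table."""
    hi, lo = max(a, b), min(a, b)
    d = (lo - hi) / SCALE                 # d <= 0
    return hi + round(math.log2(1.0 + 2.0 ** d) * SCALE)

if __name__ == "__main__":
    x, y = encode(8.0), encode(2.0)
    print(decode(lns_mul(x, y)))   # product of 8 and 2
    print(decode(lns_div(x, y)))   # quotient of 8 and 2
    print(decode(lns_sqrt(encode(16.0))))
    print(decode(lns_add(encode(3.0), encode(5.0))))  # approximate sum
```

Note that multiply, divide, square and square root are each a single integer operation, while add needs a compare, a subtract, a function evaluation and another add. The results are only as exact as the fixed-point log allows, which is the limited-precision trade-off.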


It would be rather interesting programming a machine where division was faster than addition!


Well, he says it's ~100x faster than a GPU, and GPUs are ~100x faster than CPUs (in the applications for which they are suited), so the 10,000x figure is the speedup from CPUs.



