Hacker News new | past | comments | ask | show | jobs | submit login

It'd be really nice to have good memory bandwidth usage metrics collected from a wide range of devices while doing inference.

For example, how close does it get to the peak, and what's the median bandwidth during inference? And is that bandwidth, rather than some other clever optimization elsewhere, actually providing the Mac's performance?

Personally, I don't develop HPC stuff on a laptop - I am much more interested in what a modern PC with Intel or AMD and nvidia can do, when maxxed out. But it's certainly interesting to see that some of Apple's arch decisions have worked out well for local LLMs.




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: