> The best engineering minds have been focused on scaling transformer pre- and post-training for the last three years because they had good reason to believe it would work, and it has up until now.
Or because the people running those companies, who have convinced investors it will work, can afford to pay said engineers life-changing amounts of money.
The improvements in transformer implementation (e.g. FlashAttention) have saved gobs of money on training and inference, likely far more than the salaries of those researchers.
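To make that concrete, here is a minimal sketch (assuming PyTorch 2.x, whose `scaled_dot_product_attention` can dispatch to a FlashAttention-style fused kernel on GPU): the fused path produces the same output as naive attention but never materializes the full attention matrix, which is where the memory and cost savings come from.

```python
import math
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) — toy sizes for illustration
q = torch.randn(1, 8, 1024, 64)
k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)

# Naive attention: materializes the full 1024 x 1024 score matrix per head.
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
naive = scores.softmax(dim=-1) @ v

# Fused attention: same result, but a FlashAttention-style kernel tiles the
# computation so the score matrix never exists at full size in memory.
fused = F.scaled_dot_product_attention(q, k, v)

print(torch.allclose(naive, fused, atol=1e-5))  # True, up to numerical noise
```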