Beyond Self-Attention: How a Small Language Model Predicts the Next Token (shyam.blog)
3 points by bilsbie on Feb 13, 2024
Beyond self-attention: How a small language model predicts the next token (shyam.blog)
474 points by tplrbv on Feb 4, 2024 | 85 comments
