
You're right, I was wrong to say "most challenging" as there have been harder ones coming out recently. I think the correct statement would be "most challenging long-standing benchmark" as I don't believe any other test designed in 2019 has resisted progress for so long. FrontierMath is only a month old. And of course the real key feature of ARC is that it is easy for humans. FrontierMath is (intentionally) not.



They should put some famous, unsolved problems in the next edition so ML researchers do some actually useful work while they're "gaming" the benchmarks :)


I'm certain that the big labs will be gunning for the Millennium Prize problems.



