
Honest question for you: are these puzzles actually a good way to test the models?

The answers are certainly in the training set, likely many times over.

I’d be curious to see performance on Bracket City, which was featured here on HN yesterday.



