It shows objectively that one model got better at this specific kind of weird puzzle, which doesn't translate to anything else because it's just a pointless pattern-matching puzzle that can be trained for, like anything else. In fact they specifically trained for it; they say so upfront.
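
For a flavor of the format: an ARC task gives you a few input/output grid pairs and asks for the transformation. Here's a made-up toy example with a trivial rule-search solver (a sketch only; not an actual ARC task, and nothing like what the labs actually train):

    import numpy as np

    # Toy ARC-style task (invented for illustration): the hidden rule is
    # "flip the grid left-right"; a solver must infer it from examples.
    def hidden_rule(grid):
        return np.fliplr(grid)

    train_pairs = [(g, hidden_rule(g)) for g in (
        np.array([[1, 0], [0, 2]]),
        np.array([[3, 3, 0], [0, 1, 1]]),
    )]

    # Brute-force search over candidate transformations: "pattern matching
    # that can be trained for" in its most literal form.
    candidates = {
        "identity": lambda g: g,
        "flip_lr": np.fliplr,
        "flip_ud": np.flipud,
        "transpose": lambda g: g.T,
    }
    for name, f in candidates.items():
        fits = all(np.array_equal(f(x), y) for x, y in train_pairs)
        print(name, "fits" if fits else "doesn't fit")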

It's like the modern equivalent of saying "oh, when AI solves chess it'll be as smart as a person, so it's a good benchmark", and we all know how that nonsense went.




Hmm, you could be right, but you could also be very wrong. Jury's still out, so the next few years will be interesting.

Regarding the value of "pointless pattern matching" in particular, I would refer you to Douglas Hofstadter's discussion of Bongard problems starting on page 652 of _Gödel, Escher, Bach_. Money quote: "I believe that the skill of solving Bongard [pattern recognition] problems lies very close to the core of 'pure' intelligence, if there is such a thing."


Well, I certainly agree with at least that second part, the doubt about whether there is such a thing ;)

The problem with pattern matching over sequences is that it's exactly what transformers, as an architecture, are explicitly designed to be good at via self-attention. Translation is mostly matching patterns to their equivalents in another language, and continuing a piece of text is following a pattern that already exists inside it. That's the main reason it's so hard to draw a line between what an LLM actually understands and what it just wings through pattern memorization, and why everything about them is so controversial.
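
To make that concrete, single-head scaled dot-product self-attention is only a few lines of numpy; each output position is literally a weighted mix of whichever other positions it matched (shapes and weights below are made up for illustration):

    import numpy as np

    def self_attention(x, wq, wk, wv):
        # x: (seq_len, d_model) embeddings; wq/wk/wv: (d_model, d_head) projections
        q, k, v = x @ wq, x @ wk, x @ wv
        scores = q @ k.T / np.sqrt(k.shape[-1])         # pairwise position similarity
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
        return weights @ v                              # mix of matched positions

    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 16))                        # 5 tokens, 16-dim embeddings
    wq, wk, wv = (rng.normal(size=(16, 8)) for _ in range(3))
    print(self_attention(x, wq, wk, wv).shape)          # -> (5, 8)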

Honestly, I was really surprised that all models have done so poorly on ARC so far, since it's something they ought to be superhuman at from the get-go. It's probably more that the tasks are visual in concept than anything else.
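
(For what it's worth, the usual workaround is to serialize each grid as text, which destroys the 2D adjacency that makes the patterns pop for humans: vertically adjacent cells end up far apart in the token stream. A minimal sketch, assuming a simple row-per-line encoding:)

    def grid_to_text(grid):
        # Flatten a 2D grid into a 1D text stream, one row per line; cells
        # that are vertical neighbors become distant tokens.
        return "\n".join(" ".join(str(cell) for cell in row) for row in grid)

    print(grid_to_text([[1, 0], [0, 2]]))
    # 1 0
    # 0 2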



