
> One example is enough to disprove the "not capable of" nonsense. There are other examples too.

Gotcha, fair enough. Throw enough chess data in during training and I'm sure they'd be pretty good at chess.

I don't really understand what you're trying to say in your next paragraph. LLMs surely have plenty of training data to be familiar with the rules of chess. They also purportedly have the reasoning skills to use their familiarity to connect the dots and actually play. It's trivially true that this issue can be plastered over by shoving lots of chess game training data into them, but the success of that route is not a positive reflection on their reasoning abilities.




Gradient descent is a dumb optimizer. LLM training is not at all like a human reading a book; it's more like evolution tuning adaptations over centuries. You wouldn't expect either process to be aware of anything it is converging towards. So having lots of books that talk about chess in the training data will predictably just return a model that knows how to talk about chess really well. I'm not surprised they may know how to talk about the rules yet play the game poorly.
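
To make the "dumb optimizer" point concrete, here is a minimal sketch (plain Python, toy least-squares loss, nothing LLM-specific assumed): the update rule only ever sees the local slope of the loss, and nothing in the loop represents what the fitted parameters will eventually be used for.

```python
# Minimal sketch: gradient descent on a toy least-squares loss.
# The update uses only the local gradient; the loop has no notion
# of what the fitted model will later be asked to do.

def loss(w, data):
    # mean squared error of a 1-parameter linear model y = w * x
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    # derivative of the loss with respect to w
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]  # roughly y = 2x
w = 0.0
lr = 0.05
for step in range(200):
    w -= lr * grad(w, data)  # follow the local slope, nothing more

print(w, loss(w, data))  # w ends up near 2.0
```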

And that post had a follow-up. Post-training messing things up could well be the issue, given the impact that even a few more examples and/or some regurgitation made: https://dynomight.net/more-chess/


The whole premise on which the immense valuations of these AI companies are based is that they are learning general reasoning skills from their training on language. That is, that simply training on text will eventually give the AI the ability to generate language that reasons at a more or less human level in more or less any ___domain of knowledge.

This whole premise crashes and burns if you need task-specific training, like explicit chess training. That is because there are far too many tasks that humans need to be competent at in order to be useful in society. Even worse, the vast majority of those tasks are very hard to source training data for, unlike chess.

So, if we accept that LLMs can't learn chess unless they explicitly include chess games in the training set, then we have to accept that they can't learn, say, to sell business software unless they include business software pitches in the training set, and there are going to be FAR fewer of those than chess games.


>The whole premise on which the immense valuations of these AI companies are based is that they are learning general reasoning skills from their training on language.

And they do, just not always in the ways we expect.

>This whole premise crashes and burns if you need task-specific training, like explicit chess training.

Everyone needs task-specific training. Any human good enough at chess (or anything else) to make it a profession needs it. So I have no idea why people would expect any less for a machine.

>then we have to accept that they can't learn, say, to sell business software unless they include business software pitches in the training set, and there are going to be FAR fewer of those than chess games.

Yeah, so? How many business pitches they need in the training set has no correlation with chess. I don't see any reason to believe that what's already present isn't enough. There's enough chess data on the internet to teach them chess too; it's just a matter of how much OpenAI cares about it.


Chess is a very simple game, and basic general reasoning skills are more than enough to learn how to play it. It's not advanced mathematics or complicated human interaction - it's a game with 30 or so fixed rules. And chess manuals contain numerous examples of actual games; they aren't pure text talking about the game.

So the fact that LLMs can't learn this simple game, despite their training sets probably including every book ever written on it, tells us something about their general reasoning skills.
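
For what it's worth, here is a rough sketch of how that kind of claim is usually tested (in the spirit of the linked dynomight posts): give the model a game prefix and check whether the move it proposes is even legal. `query_model` is a hypothetical stand-in for whatever LLM API is being tested; the board handling uses the real python-chess library.

```python
# Sketch of a "can it actually play?" check: replay a game prefix,
# ask the model for the next move, and verify the move is legal.
import chess

def query_model(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: call your LLM of choice here

def model_plays_legal_move(moves_so_far: list[str]) -> bool:
    board = chess.Board()
    for san in moves_so_far:
        board.push_san(san)  # replay the prefix in standard algebraic notation
    prompt = "Continue this chess game. Moves so far: " + " ".join(moves_so_far)
    reply = query_model(prompt).strip()
    try:
        board.push_san(reply)  # raises ValueError on illegal/unparseable moves
        return True
    except ValueError:
        return False

# e.g. model_plays_legal_move(["e4", "e5", "Nf3"]) -> did it produce a legal reply?
```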


As in: they do not have general reasoning skills.



