> These LLMs are polymaths that can spit out content at a super human rate.
Do you mean in theory or currently? Because currently, LLMs make simple errors (e.g. [1]) and are more capable of spitting out, well, nonsense. I think it's safe to say we're a long way from LLMs producing anything creatively good.
I'll put it this way: you won't be getting The Godfather from LLMs anytime soon, but you can probably get an industrial film with generic music that tells you how to safely handle solvents.
Computers are generally good at doing math, but LLMs generally aren't [2], and that really demonstrates the weakness of this statistical approach. ChatGPT (as one example) doesn't understand what numbers are or how to multiply them. It relies on having seen similar answers to derive a likely one, so it often gets the first and last digits of the answer correct but not the middle. You can't keep scaling the input data until it has seen every possible math question; that's just not practical.
Now multiplying two large numbers is a solvable problem. Counting Rs in strawberry is a solvable problem. But statistical LLMs are going to have a massive long tail of these problems. It's really going to take the next generational change to make progress.
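To make the contrast concrete, both of those "solvable problems" are one-liners in ordinary code. A quick Python sketch, with arbitrary example numbers:

    # Both problems are trivial for conventional code.
    a, b = 748_315, 962_047          # arbitrary large-ish numbers
    print(a * b)                     # exact product; Python ints have arbitrary precision
    print("strawberry".count("r"))   # 3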
Both the "count the Rs in strawberry" and the "multiply two large numbers" things have been solved for over a year now by the tool usage pattern: give an LLM the ability to delegate to a code execution environment for things it's inherently bad at and train it how to identify when to use that option.
I think the point is that playing whack-a-mole is an effective practical strategy to shore up individual weaknesses (or even classes of weaknesses), but that doesn’t get you to general reasoning unless you think intelligence evolved this way. Given the adaptability of intelligence across the animal kingdom to novel environments never seen before, I don’t think that can be anything other than a short-term strategy for AGI.
I think we’re in agreement. It’s going to take a next-generation architecture to address flaws like the strawberry example, where the LLM often can’t even correct its mistake when it’s pointed out.
I still think transformers and LLMs will likely remain as some component within that next-gen architecture, rather than something completely alien.
[1]: https://www.inc.com/kit-eaton/how-many-rs-in-strawberry-this...
[2]: https://www.reachcapital.com/2024/07/16/why-llms-are-bad-at-...