
I’m starting to think this is an unsolvable problem with LLMs. The very act of “reasoning” requires one to know that they don’t know something.

LLMs are giant word Plinko machines. A million monkeys on a million typewriters.

LLMs are not interns. LLMs are assumption machines.

None of the million monkeys or the collective million monkeys are “reasoning” or are capable of knowing.

LLMs are a neat parlor trick and are super powerful, but are not on the path to AGI.

LLMs will change the world, but only in the way that the printing press changed the world. They’re not interns, they’re just tools.




I think LLMs are definitely on the path to AGI in the same way that the ball bearing was on the path to the internal combustion engine. I think it's quite likely that LLMs will perform important functions within the system of an eventual AGI.


We're learning valuable lessons from all modern large-scale (post-AlexNet) NN architectures, transformers included, and NNs (though perhaps trained differently) seem a viable approach to implementing AGI, so we're making progress ... but maybe LLMs will end up being more inspiration than part of the final solution.

OTOH, maybe pre-trained LLMs could be used as a hardcoded "reptilian brain" that provides a future AGI with base capabilities (vs being sold as a newborn that needs 20 years of parenting to be useful), which the real learning architecture can then override.
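To make that concrete, here's a rough sketch of one way such an override could be wired up; nothing here comes from the comment itself, and all the module names (OverridableBase, override, gate) are invented for illustration. The idea is just: freeze the pre-trained prior, and let a small trainable module learn both a replacement behaviour and a gate deciding when to override it.

    # Sketch only: a frozen pre-trained "reptilian" prior that a small trainable
    # module can learn to override. All names are hypothetical; the base module
    # is assumed to map hidden_dim -> hidden_dim so the shapes line up.
    import torch
    import torch.nn as nn

    class OverridableBase(nn.Module):
        def __init__(self, base: nn.Module, hidden_dim: int):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad_(False)                        # freeze the pre-trained prior
            self.override = nn.Linear(hidden_dim, hidden_dim)  # learns new behaviour
            self.gate = nn.Linear(hidden_dim, 1)               # learns when to override

        def forward(self, x):
            with torch.no_grad():
                prior = self.base(x)                 # hardcoded base capability
            g = torch.sigmoid(self.gate(x))          # 0 = trust the prior, 1 = override it
            return (1 - g) * prior + g * self.override(x)

    # e.g. model = OverridableBase(base=pretrained_trunk, hidden_dim=768)
    # (pretrained_trunk is a placeholder for whatever frozen model you'd plug in)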


I would think they'd be more likely to form the language centre of a composite AGI brain. If you read through the known functions of the various areas involved in language[0], they seem to map quite well onto the capabilities of transformer-based LLMs, especially the multi-modal ones.

[0] https://en.wikipedia.org/wiki/Language_center


It's not obvious that an LLM - a pre-trained/frozen chunk of predictive statistics - would be amenable to being used as an integral part of an AGI that would necessarily be using a different incremental learning algorithm.

Would the transformer architecture be compatible with the needs of an incremental learning system? It's missing the top down feedback paths (finessed by SGD training) needed to implement prediction-failure driven learning that feature so heavily in our own brain.

This is why I could more readily see a potential role for a pre-trained LLM as a separate primitive subsystem to be overridden, or maybe (more likely) we'll just pre-expose an AGI brain to 20 years of sped-up life experience and not try to import an LLM to be any part of it!


It's entirely possible to have an AGI language model that is periodically retrained as slang, vernacular, and semantic embeddings shift in meaning. I have little doubt that something very much like an LLM (a machine that turns high-dimensional intent into words) will form an AGI's 'language center' at some point.


Yes, an LLM can be periodically retrained, which is what is being done today, but a human level AGI needs to be able to learn continuously.

If we're trying something new and make a mistake, then we need to seamlessly learn from the mistake and continue - explore the problem and learn from successes and failures. It wouldn't be much use if your "AGI" intern stopped at its first mistake and said "I'll be back in 6 months after I've been retrained not to make THAT mistake".
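For what it's worth, the usual workaround today is to fake that loop in context: feed the failure back into the next attempt and retry. Roughly like this (llm() and run_tests() are hypothetical stand-ins, not any particular product's API):

    # In-context "learning" from a mistake: the failure is only remembered for
    # this session, and nothing is written back to the weights -- which is
    # exactly the gap being described above.
    def solve_with_feedback(task, llm, run_tests, max_attempts=3):
        notes = []                               # session-local memory of failures
        attempt = ""
        for _ in range(max_attempts):
            prompt = task + "".join(f"\nPrevious attempt failed: {n}" for n in notes)
            attempt = llm(prompt)
            ok, error = run_tests(attempt)
            if ok:
                return attempt                   # success, but nothing was learned
            notes.append(error)                  # remember the mistake for now only
        return attempt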


I don't think there's a single way that we learn things; there's too much variety in how, when, and why things are committed to memory, and still more variety in what actually updates our thinking process or world model. We forget the overwhelming majority of sense perceptions immediately, and even when we are intentionally trying to learn something we can fail to recall it a few seconds after we see it. Even when we succeed at short-term recall, the thing we have "learnt" may be gone the next day, or we may only recall it correctly a small fraction of the time. Contrary to that, some things are immediately and permanently ingrained in our minds, whether because they are extremely impactful in some way or sometimes for no apparent reason at all.

It's too deep a topic to go into fully, but all this is to say it isn't so simple as to claim that continued pretraining of an LLM is completely dissimilar to how humans learn. In fact, the question-and-answer style of fine-tuning that is so widely used to add new knowledge or steer a model to respond in a certain way is extremely similar to how humans learn: quizzing or testing with immediate feedback, repeated across many samples that vary their wording while still pertaining to the same information, is one of the best ways for people to memorize things.
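For anyone who hasn't seen it, that question-and-answer fine-tuning looks roughly like the following; this is a minimal sketch, using gpt2 as a stand-in model and made-up training pairs, with the loss computed only on the answer tokens.

    # Rough sketch of Q&A-style fine-tuning: show the model many rephrased
    # question/answer pairs about the same fact, and mask out the question
    # tokens so the loss only covers the answer. Data and model are placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

    pairs = [
        ("Q: What year did the library open?\nA:", " 2023"),
        ("Q: When was the library opened?\nA:", " 2023"),   # varied wording, same fact
    ]

    for question, answer in pairs:
        q_ids = tok(question, return_tensors="pt").input_ids
        a_ids = tok(answer, return_tensors="pt").input_ids
        input_ids = torch.cat([q_ids, a_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : q_ids.shape[1]] = -100       # ignore loss on the question tokens
        loss = model(input_ids=input_ids, labels=labels).loss
        loss.backward()
        optim.step()
        optim.zero_grad()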


This may be accurate. I wonder if there's enough energy in the world for this endeavour.


Of course!

1. We've barely scratched the surface of this solution space; the focus only recently started shifting from improving model capabilities to improving training costs. People are looking at more efficient architectures, and lots of money is starting to flow in that direction, so it's a safe bet things will get significantly more efficient.

2. Training is expensive, inference is cheap, copying is free. While inference costs add up with use, they're still less than the cost of humans doing the equivalent work, so of all the things AI will impact, I wouldn't worry about energy use specifically.


Humans don't require immense amounts of energy to function. The reason LLMs do is that we are essentially using brute force as the methodology for making them smarter, for lack of a better understanding of how this works. But this then gives us a lot of material to study to figure that part out for future iterations of the concept.


Are you so sure about that? How much energy went into training the self-assembling chemical model that is the human brain? I would venture to say literally astronomical amounts.

You have to compare apples to apples. It took literally the sum total of billions of years of sunlight energy to create humans.

Exploring solution spaces to find intelligence is expensive, no matter how you do it.


Humans normally need about 30 years of training before they’re competent.


LLMs mostly know what they know. Of course, that doesn't mean they're going to tell you.

https://news.ycombinator.com/item?id=41504226


It probably depends on your problem space. In creative writing, I wonder if it's even perceptible when the LLM is creating content at the boundaries of its knowledge base. But for programming or other falsifiable (and rapidly changing) disciplines it is noticeable and a problem.

Maybe some evaluation of the sample size would be helpful? If the LLM has fewer than X samples of an input word or phrase, it could include a cautionary note in its output, or even respond with some variant of “I don’t know”.
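No current model exposes how many training samples a word or phrase had, so any implementation would need a proxy. One crude stand-in is the average log-probability of the generated tokens, with an arbitrary threshold below which a cautionary note is attached; the sketch below uses gpt2 and a made-up threshold, purely for illustration.

    # Crude proxy for the "sample size" idea above: flag answers whose tokens
    # the model itself assigned low probability. Threshold is hypothetical.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    def answer_with_caution(prompt, threshold=-3.0):
        ids = tok(prompt, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=30, do_sample=False,
                             output_scores=True, return_dict_in_generate=True)
        new_tokens = out.sequences[0, ids.shape[1]:]
        # Average log-prob of the tokens the model actually chose.
        logps = [torch.log_softmax(s, dim=-1)[0, t]
                 for s, t in zip(out.scores, new_tokens)]
        mean_logp = torch.stack(logps).mean().item()
        text = tok.decode(new_tokens, skip_special_tokens=True)
        if mean_logp < threshold:
            return text + "\n[Note: low confidence; this may be outside what I know.]"
        return text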


In creative writing the problem becomes things like word choice and implications that have unexpected deviations from its expectations.

It can get really obvious when it's repeatedly using clichés. Both in repeated phrases and in trying to give every story the same ending.


> I wonder if its even perceptible if the LLM is creating content at the boundaries of its knowledge base

The problem space in creative writing is well beyond the problem space for programming or other "falsifiable disciplines".


> It probably depends on your problem space

Makes me wonder whether medical doctors could ever blame the LLM, rather than other factors, for killing their patients.



