In my experience LLM generated answers are more comparable to an ad-hoc answer b...

TeMPOraL · 2024-12-23T08:40:28 1734943228

More like "have already skimmed half of the entire Internet in the past", but yeah. That's exactly the mental model IMO one should have with LLMs.

Of course don't forget that "writing up an answer off the top of their head based on that, filling in any missing material by just making something up" is what everyone does all the time, and in particular it's what experts do in their areas of expertise. How often those snap answers and hasty extrapolations turn out correct is, literally, how you measure understanding.

EDIT:

There's some deep irony here, because with LLMs being "all system 1, no system 2", we're trying to give them the same crutches we use on the road to understanding, but have them move in the opposite direction. Take "chain of thought" - saying "let's think step by step" and then explicitly going through your reasoning is not understanding - it's the direct opposite of it. Think of a student who solves a math problem step by step - they're not demonstrating understanding or mastery of the subject. On the contrary, they're just demonstrating they can emulate understanding by more mechanistic, procedural means.

jacobolus · 2024-12-23T08:48:36 1734943716

Okay, but if you read written work by an expert (e.g. a book published by a reputable academic press or an article in a peer-reviewed journal), you get a result whose details have all been checked and can be relied on to some extent. By looking up the citation graph you can track down their sources, cross-check claims against other scholars', look up survey sources putting the work in context, think critically about each author's biases, etc., and it's possible to come to some kind of careful analysis of the work's credibility and assess the truth value of claims made. By doing careful search and study it's possible to get to some sense of the scholarly consensus about a topic and some idea of the level of controversy about various details or interpretations.

If instead you are reading the expert's blog post or hastily composed email or chatting with them on an airplane you get a different level of polish and care, but again you can use context to evaluate the source and claims made. Often the result is still "oh yeah this seems pretty insightful" but sometimes "wow, this person shouldn't be speculating outside of their area of expertise because they have no clue about this".

With LLM output, the appropriate assessment (at least in any that I have tried, which is far from exhaustive) is basically always "this is vaguely topical bullshit; you shouldn't trust this at all".

twometwo · 2024-12-23T10:08:52 1734948532

I am just curious about this. You said the word "never", and I think your claim can be tested. Perhaps you could post a list of five obscure questions for an LLM to answer, and then someone could put them to a good LLM for you, with you or an expert in that field assessing the value of the answers.

Edited: I just submitted an Ask HN post about this.

jstummbillig · 2024-12-23T13:11:18 1734959478

> I've never gotten an answer from an LLM to a tricky or obscure question about a subject I already know anything about that seemed remotely competent.

Certainly not my experience with the current SOTA. Without being more specific, it's hard to discuss. Feel free to name something that can be looked at.