The diagram you're replying to agrees with this.



Does it? The way I'm reading it, the first step is the LLM turning imprecise human thinking into specific and exact commands.


That's true, but that is the input section of the diagram, not the output section where [specific and exact output] is labeled, so I believe the confusion I was responding to was legitimate.

To your point, which I think is separate but related, that IS a case where LLMs are good at producing specific and exact commands. The models + the right prompt are pretty reliable at tool calling by themselves, because you give them a list of specific and exact things they can do. And they can be fully specific and exact at inference time with constrained output (although you may still wish they had called a different tool).
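For illustration, here's a minimal sketch of what "a list of specific and exact things they can do" looks like in an OpenAI-style tool-calling request (the tool name and parameters are made up for the example):

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    }]

The model is prompted with exactly these definitions, so the target output space is fully enumerated up front.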


The tool may not even exist. LLMs are really terrible at admitting where the limits of their training are. They will imagine a tool into being. They will also claim that knowledge is within their realm when it isn't.


At inference time you can constrain output to a strict JSON schema that only includes valid tools.
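Concretely, the schema can enumerate the valid tools by name, e.g. (a sketch; the tool names here are hypothetical):

    schema = {
        "type": "object",
        "properties": {
            "tool": {"enum": ["get_weather", "search_docs"]},  # only valid tools
            "arguments": {"type": "object"},
        },
        "required": ["tool", "arguments"],
        "additionalProperties": False,
    }

With the enum, a tool name outside the list isn't just discouraged; it's unrepresentable in any schema-conforming output.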


That would only be possible if you could prevent hallucinations from ever occurring, which you can't. Even if you supply a strict schema, the model will sometimes act outside of it and infer the existence of "something similar".


That's not true. You say the model will sometimes act outside of the schema, but models don't act at all. They don't hallucinate by themselves, and they don't produce text at all; they do all of this in conjunction with your inference engine.

The model's output is a probability for every token. Constrained output is a feature of the inference engine: with a strict schema, the engine can discard every token that doesn't adhere to the schema and select the top token among those that do.
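A toy version of that selection step, assuming the engine already knows which token ids keep the output schema-valid at the current position (real engines derive this from a grammar or schema-driven state machine, and typically sample rather than always taking the argmax):

    import numpy as np

    def constrained_next_token(logits, allowed_token_ids):
        # Mask out every token the schema disallows, then pick the
        # most probable token among the ones that remain.
        mask = np.full_like(logits, -np.inf)
        mask[allowed_token_ids] = 0.0
        return int(np.argmax(logits + mask))

The model still scores every token; the engine just never emits one that would break the schema.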


Just because the answer adheres to the schema does not mean that it’s correct.


Yes, we've been discussing "specific and exact" output. As I said, you might wish it had called a different tool; nothing in this discussion addresses that.



