In my experience, if you confuse an LLM by deviating from the "expected", then all the shims of logic seem to disappear and it goes into hallucination mode.
Tbf that was exactly my point. An adult might use 'inference' and 'reasoning' to ask clarification, or go with an internal logic of their choosing.
ChatGPT here went with Python's lexicographical string ordering for some reason, and then proceeded to make false statements from false observations, while also defying its own internal logic.
"six" > "ten" is true because "six" comes after "ten" alphabetically.
No.
"ten" > "seven" is false because "ten" comes before "seven" alphabetically.
No.
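For what it's worth, Python compares strings character by character by code point, so both of its claims are backwards:

    >>> "six" > "ten"    # 's' sorts before 't', so this is False
    False
    >>> "ten" > "seven"  # 't' sorts after 's', so this is True
    True
    >>> sorted(["six", "seven", "ten"])
    ['seven', 'six', 'ten']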
From what I understand of LLMs (which - I admit - is not very much), logical reasoning isn't an inherent property of LLMs the way information retrieval is. I'm sure this problem can be solved at some point, but a good solution would likely require developing many more kinds of inference and logic engines than exist today.