No, you either need to know that it's a trick question or to sense from context that it might be. The entire point of a question like that is to subvert your understanding.
You're tricking the model because it has seen this specific trick question a million times and shortcuts to its memorized solution. Ask it literally any other question, as subtle as you want, and the model will pick up on the intent, as long as you don't try to mislead it.
I mean, I don't even get how anyone thinks this means anything. I can trick people who have never heard of the trick with the 7 wives and 7 bags and so on. That doesn't mean they didn't understand; they simply did what any human does and made predictions based on similar questions.
> I can trick people who have never heard of the trick with the 7 wives and 7 bags and so on. That doesn't mean they didn't understand
They could fail because they didn't understand the language, didn't have a good enough memory to hold all the steps, or couldn't reason through it. We could pose more questions to probe which explanation is more plausible.
The trick with the 7 wives and 7 bags and so on is that no long reasoning is required. You just have to notice the one part of the question that invalidates the rest, and not shortcut to doing arithmetic because it looks like an arithmetic problem. There are dozens of trick questions like this, and they don't test understanding; they exploit your tendency to predict intent.
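For concreteness, here is the fast-path trap in miniature, assuming the riddle meant here is the classic "As I was going to St Ives" rhyme (the "bags" above are sacks in the usual wording):

```python
# A minimal sketch of the fast-path trap, assuming the classic
# "As I was going to St Ives" riddle: the narrator MET a man with
# 7 wives, each wife had 7 sacks, each sack 7 cats, each cat 7 kits.
# "Kits, cats, sacks, wives, how many were going to St Ives?"

# Pattern-matched arithmetic: it looks like a sum-of-powers problem.
naive_answer = sum(7**k for k in range(1, 5))  # 7 + 49 + 343 + 2401 = 2800

# Noticing "I met" invalidates all of that arithmetic: the whole
# party was headed the other way, so only the narrator counts.
trick_answer = 1

print(naive_answer, trick_answer)  # 2800 1
```

All the computation is in one line; the entire difficulty is in not running it.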
But sure, we could ask more questions, and that's what we should do. If we do that with LLMs, we quickly see that once we leave the basin of the memorized answer by rephrasing the problem, the model solves it. We would also see that we can ask the model billions of other questions, and it understands us just fine.
Some people solve trick questions easily, simply because they are slow thinkers who pay attention to every question, even non-trick questions, and don't fast-path the answer based on its similarity to a past question.
Interestingly, people who give bad fast-path answers often call these people stupid.
It does mean something. It means that the model still leans more toward memorization than toward independently evaluating a question apart from the body of knowledge it has amassed.
No, that's not a conclusion we can draw, because there is little more to do with this specific trick question than memorize its answer. That's why it's a trick question: it goes against expectations, and therefore against the generalized intuitions you have about the ___domain.
We can see that it doesn't memorize much at all by simply asking other questions that do require subtle understanding and generalization.
You could ask the model to walk you through an imaginary environment, describing your actions. Or you could simply talk to it and quickly notice that any longer conversation becomes vanishingly unlikely to appear anywhere in the training data.
To be able to answer a trick question, it’s first necessary to understand the question.