> I don't ask someone how many r's there are in strawberry by spelling out strawberry, I just say the word.
No, I would actually be pretty confident you don’t ask people that question… at all. When was the last time you asked a human that question?
I can’t remember ever having anyone in real life ask me how many r’s are in strawberry. A lot of humans would probably refuse to answer such an off-the-wall and useless question, thus “failing” the test entirely.
A useless benchmark is useless.
In real life, people overwhelmingly do not need LLMs to count occurrences of a certain letter in a word.