I came away much less impressed than you did. The "step by step analysis" consists mostly of it considering, ruling out, and reconsidering an obviously invalid move. The code that it "tries to write" first zooms and pans around the image for no reason, as it has already identified the layout of the pieces in the initial analysis. It then tries to import a library it has not yet installed in the sandbox (in addition to importing `chess.polyglot` for no discernible reason) before giving up on that thread entirely. It then manages to write a one-liner that contains an IndentationError before spending more time/tokens reestablishing the board layout. It does all of this before finally delegating the question to a search engine.

If you just paste the image into a search engine (without needing to include the text prompt) the first result contains the solution. We live in a world where Sam Altman claims that the use of words like "please" and "thank you" in prompts has cost OpenAI "tens of millions of dollars"[0]. In this case, OpenAI's "most powerful reasoning model"[1] spends 7m 51s spinning its wheels and churning through expensive output tokens before ultimately giving up and searching the internet. This strikes me as incredibly wasteful. It feels like the LLM equivalent of "punch[ing] through the table". The most impressive thing to me here is that OpenAI is getting people to pay for all this nonsense.

[0] https://www.usatoday.com/story/tech/2025/04/22/please-thank-...

[1] https://platform.openai.com/docs/models/compare