I don't know why you would expect it to see a gorilla without an image to look at. Humans can't.



Without an image? No, not at all. It's supposed to make its own image, and it did. But it didn't properly analyze the image it made.


That's a feature that would need to be implemented. There's no reason to think it would automatically look at the image of the plot it generated, but feeding the image back to it is no different from it having viewed the image automatically.
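
For concreteness, here's a minimal sketch of that loop in Python, assuming the OpenAI client and a vision-capable model (the model name, file names, and prompt are illustrative placeholders, not details from this thread): render the plot off-screen, then send the saved image back for analysis.

  # Sketch of the workflow described above: plot the data with matplotlib,
  # then feed the rendered image back to a vision-capable model.
  # Assumes the OpenAI Python client; paths and model name are illustrative.
  import base64

  import matplotlib
  matplotlib.use("Agg")  # render off-screen; no display needed
  import matplotlib.pyplot as plt
  import pandas as pd
  from openai import OpenAI

  # Step 1: the "write code to graph it" part, drawn to a screen nobody sees.
  df = pd.read_csv("data.csv")  # hypothetical two-column dataset
  plt.scatter(df["x"], df["y"], s=2)
  plt.savefig("plot.png")

  # Step 2: "turn the monitor around" by sending the rendered pixels back.
  with open("plot.png", "rb") as f:
      image_b64 = base64.b64encode(f.read()).decode("utf-8")

  client = OpenAI()  # reads OPENAI_API_KEY from the environment
  response = client.chat.completions.create(
      model="gpt-4o",  # any vision-capable model
      messages=[{
          "role": "user",
          "content": [
              {"type": "text",
               "text": "Describe any pattern or shape in this scatter plot."},
              {"type": "image_url",
               "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
          ],
      }],
  )
  print(response.choices[0].message.content)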


The point of telling it to explore the data is so I don't have to think of every angle myself. Apparently humans can get an understanding from visuals that LLMs can't match, even without gimmicks.


The LLM is able to see the gorilla when shown the image, the same way you would show a human an image.

Imagine if you gave someone the raw data and told them to write code to graph the output onto a screen they couldn't see. They would not be able to tell you it's a gorilla until you turned the monitor around and showed them.

Humans are still better at seeing the image, sure (for now), but the LLM is a tool with certain features and abilities. You can't make up a scenario that misuses the tool and then pretend that it doesn't work - especially when it seems you want to use it without applying your own brain power to the process.

And to be clear, I'm open to criticism of LLMs and exploration of their limitations - but I'm tired of hearing complaints that amount to PEBKAC.


When I tell a human to analyze the data, I sure don't expect them to interpret it as "write code to graph it to a screen you can't see". You found the problem but glossed right over it.

> misusing the tool and then pretend that it doesn't work

It was told to analyze, and it did a bad job of analyzing. Even if an LLM expert expects this already, it's worth pointing out to everyone else. It's not PEBKAC.


If a user misunderstands the purpose and value of a tool, this is PEBKAC.


The tool doesn't succeed at a reasonable task that humans can do. That's not PEBKAC, and warning people about it is a good thing.

This type of analysis is not outside the purpose of the tool. You're making excuses at this point. Do you really think it would be wrong to add that capability in the future?

It's a technical limitation, one that is far from obvious.


I think you think the LLM is a magic box with an intelligent being inside that can magically do whatever you want, somehow. It is software. It has capabilities and limitations. Learn them and use it appropriately. Or don't; you don't have to use it. But don't expect it to just do whatever you think it should do.


> It has capabilities and limitations. Learn them, and use it appropriately.

THAT IS THE POINT OF THE ARTICLE

If you can't figure that out, don't insult me.



