LLM responses are random. One's failure is other's success. When evaluating we a... | Hacker News

smusamashah 3 months ago | on: Are LLMs able to notice the “gorilla in the data”?

LLM responses are random. One person's failure is another's success. When evaluating, we should all do reruns and see how many times it fails or succeeds.

Without the number of reruns, a single result is as good as random.
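
A rough sketch of what counting reruns could look like, in Python. Everything here is an assumption for illustration: `fake_llm_notices_gorilla` is a hypothetical stand-in for "send the prompt and check whether the answer mentions the gorilla", the 30% success chance is made up, and the Wilson score interval is just one common way to put error bars on a success rate, not something the article used.

```python
import math
import random


def success_rate(run_once, n_reruns):
    """Call run_once() n_reruns times and return (successes, rate)."""
    successes = sum(1 for _ in range(n_reruns) if run_once())
    return successes, successes / n_reruns


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical stand-in for one LLM call plus a pass/fail check.
# The 0.3 success probability is invented for this sketch.
rng = random.Random(0)


def fake_llm_notices_gorilla():
    return rng.random() < 0.3


n = 20
successes, rate = success_rate(fake_llm_notices_gorilla, n)
low, high = wilson_interval(successes, n)
print(f"{successes}/{n} succeeded, rate={rate:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

With one run you only learn "it worked" or "it didn't". With 20 reruns you get a rate and a sense of how wide the uncertainty still is.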

dartos 3 months ago

Okay?

OC was saying that the article said that Claude recognized the “artistic” lines of the image from just the scatter plot data.

That isn’t what happened.

The author added a png of the plot to the conversation.

Idk why I need to explain that twice.
