100% of the time when I post a critique, someone replies with this. I tell them I've used literally every LLM under the sun, quite a bit, trying to find any use I can think of, and then it's immediately crickets.
RT-2 is a vision-language model fine-tuned to take the current vision input and produce actuator positions as output. Google uses a bunch of TPUs to produce a full response at a cycle rate of 3 Hz, and the VLM has learned the kinematics of the robot and knows how to pick up objects according to given instructions.
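To make the loop concrete, here is a minimal sketch of that kind of perceive-act cycle. Every name in it (`robot`, `vlm_policy`, `decode_action`) is hypothetical; RT-2's actual serving stack is not public, so treat this as an illustration of the idea, not Google's implementation.

```python
import time

CYCLE_HZ = 3  # the ~3 Hz cycle rate mentioned above

def decode_action(tokens):
    """Hypothetical: map the VLM's discretized action tokens
    back to continuous actuator positions."""
    return [t / 255.0 for t in tokens]  # placeholder de-tokenization

def control_loop(robot, vlm_policy, instruction):
    """Closed-loop control: image + instruction in, actuator positions out.
    `robot` and `vlm_policy` are stand-ins for the hardware and model APIs."""
    period = 1.0 / CYCLE_HZ
    while not robot.task_done():
        start = time.monotonic()
        image = robot.camera.read()  # current vision input
        # The fine-tuned VLM conditions on the image plus the natural-language
        # instruction and emits action tokens instead of text tokens.
        tokens = vlm_policy.generate(image=image, text=instruction)
        robot.apply(decode_action(tokens))  # drive the actuators
        # Sleep off the remainder of the cycle to hold the rate.
        time.sleep(max(0.0, period - (time.monotonic() - start)))
```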
Given the current rate of progress, we will have robots that can learn simple manual labor from human demonstrations (e.g., YouTube as a dataset; no, I do not mean bimanual teleoperation) by the end of the decade.
Usually when I encounter sentiment like this, it is because they have only used 3.5 (evidently not the case here) or because their prompting is terrible or misguided.
When I show a lot of people GPT-4 or Claude, some percentage of them jump right to "What year did Nixon get elected?" or "How tall is Barack Obama?" and then kind of shrug with a "Yeah, Siri could do that ten years ago" take.
Beyond that you have people who prompt things like "Make a stock market program that has tabs for stocks, and shows prices" or "How do you make web cookies". Prompts that even a human would struggle greatly with.
For the record, I use GPT-4 and Claude, and both have dramatically boosted my output at work. They are powerful tools; you just have to get used to massaging good output from them.
That is not the reality today. If you want good results from an LLM, then you do need to speak LLM. Just because they appear to speak English doesn't mean they act like a human would.
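To make "speaking LLM" concrete, here is a minimal sketch using the OpenAI Python client; the model name and both prompts are illustrative, not a recommendation. The vague prompt is the kind people type and then shrug at; the second spells out context, constraints, and an output format.

```python
# Same task, two prompts: one vague, one specific.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The kind of prompt that produces shrug-worthy output:
vague = "Make a stock market program that has tabs for stocks, and shows prices"

# The same request with context, constraints, and an output format:
specific = (
    "Write a Python script using tkinter with a ttk.Notebook. "
    "Each tab is one ticker from the list ['AAPL', 'MSFT']. "
    "Each tab shows the latest price from a get_price(ticker) function "
    "that I will supply; stub it out to return a fixed float for now. "
    "Return only the code, no explanation."
)

for prompt in (vague, specific):
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp.choices[0].message.content)
```

The second prompt is longer, but every added clause removes a decision the model would otherwise have to guess at.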
People don’t even know how to use traditional web search properly.
Here’s a real scenario: a Citrix virtual desktop crashed because a recent critical security fix forced an upgrade of a shared DLL. The crash produced a really specific set of errors in a stack trace. I watched with my own two eyes as an IT professional typed the following phrase into Google: “Why did my PC crash?”
Then he sat there and started reading through each result… including blog posts by random kids complaining about Windows XP.
I wish I could say this kind of thing is an isolated incident.
I mean, you need to speak German to talk to a German. It’s not really much different for LLMs: just because the language they speak has roots in English doesn’t mean it actually is English.
And even if it were, there are plenty of people who are completely unintelligible in English too…
You see no difference between non-RLHF'd GPT-3 from early 2022 and GPT-4 in 2024? There's very broad consensus that there is a huge difference, which is why I wanted to clarify and make sure you were comparing the right things.
What kinds of usage are you testing? For general knowledge it hallucinates far less often, and for reasoning, coding, and modifying its past code based on English instructions, it is way, way better than GPT-3 in my experience.
It's fine: you don't have a use for it, so you don't care. I personally don't spend any effort getting to know things that I don't care about and have no use for; but I also don't tell people who use tools I don't need for their job or hobby how useless those tools are and how distorted or wrong their experience using them is.
Usually people who post such claims haven’t used anything beyond GPT-3. That’s why you get questions.
Also, the difference is so big and so plainly visible that I guess people don’t even know how to answer someone who says they don’t see it. That’s why you get crickets.