These comments are getting ridiculous. I remember when this test was first discussed here on HN and everyone agreed that it clearly proves current AI models are not "intelligent" (whatever that means). And people tried to talk me down when I theorised that this test would get nuked soon, like all the ones before it. It's time people woke up and realised that the old age of AI is over. This new kind is here to stay and it will take over the world. And you'd better believe it'll be sooner rather than later, so start preparing.
Not really. Francois (co-creator of the ARC Prize) has this to say:
The v1 version of the benchmark is starting to saturate. There were already signs of this in the Kaggle competition this year: an ensemble of all submissions would score 81%
Early indications are that ARC-AGI-v2 will represent a complete reset of the state-of-the-art, and it will remain extremely difficult for o3. Meanwhile, a smart human or a small panel of average humans would still be able to score >95% ... This shows that it's still feasible to create unsaturated, interesting benchmarks that are easy for humans, yet impossible for AI, without involving specialist knowledge. We will have AGI when creating such evals becomes outright impossible.
For me, the main open question is where the scaling bottlenecks for the techniques behind o3 are going to be. If human-annotated CoT data is a major bottleneck, for instance, capabilities would start to plateau quickly like they did for LLMs (until the next architecture). If the only bottleneck is test-time search, we will see continued scaling in the future.
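To make the 81% ensemble figure above concrete: Francois doesn't spell out how the ensemble was scored, so here is only a minimal sketch, assuming a simple plurality vote over each task's predicted output grids and keeping a couple of top candidates (ARC scoring allows a small number of attempts per task). The function name and the example outputs are purely illustrative.

    # Hypothetical sketch: ensemble ARC submissions by plurality vote.
    # Each solver predicts an output grid for a task; we keep the two most
    # popular exact-match grids (illustrative; ARC allows a small number
    # of attempts per task).
    from collections import Counter

    Grid = tuple  # a grid serialized as a tuple of row-tuples, so it's hashable

    def ensemble_predictions(predictions: list[Grid]) -> list[Grid]:
        """Return up to two grids, ordered by how many solvers agreed on them."""
        counts = Counter(predictions)
        return [grid for grid, _ in counts.most_common(2)]

    # Made-up solver outputs for one task:
    solver_outputs = [
        ((1, 0), (0, 1)),  # solver A
        ((1, 0), (0, 1)),  # solver B agrees with A
        ((0, 0), (0, 0)),  # solver C disagrees
    ]
    print(ensemble_predictions(solver_outputs))  # [((1, 0), (0, 1)), ((0, 0), (0, 0))]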
> It's time people woke up and realised that the old age of AI is over. This new kind is here to stay and it will take over the world. And you'd better believe it'll be sooner rather than later, so start preparing.
I was just thinking about how 3D game engines were perceived in the 90s. Every six months some new engine came out, blew people's minds, was declared photorealistic, and was forgotten a year later. The best of those engines kept improving and are still here, and kinda did change the world in their own way.
Software development seemed rapid and exciting until about Halo or Half Life 2, then it was shallow but shiny press releases for 15 years, and only became so again when OpenAI's InstructGPT was demonstrated.
While I'm really impressed with current AI, and value the best models greatly, and agree that they will change (and have already changed) the world… I can't help but think of the Next Generation front cover, February 1997 when considering how much further we may be from what we want: https://www.giantbomb.com/pc/3045-94/forums/unreal-yes-this-...
> Software development seemed rapid and exciting until about Halo or Half Life 2, then it was shallow but shiny press releases for 15 years
The transition seems to map well to the point where engines got sophisticated enough that highly dedicated high-schoolers couldn't keep up. Until then, people would routinely make hobby game engines (for games they'd then never finish) that were MVPs of what the game industry had a year or three earlier. I.e. close enough to compete on visuals with the top photorealistic games of a given year - but more importantly, this was a time when you could do cool nerdy shit to impress your friends and community.
Then Unreal and Unity came out with a business model that killed the motivation to write your own engine from scratch (except for purely educational purposes). We got more games and more progress, but the excitement was gone.
Maybe it's just a spurious correlation, but it seems to track with:
> and only became so again when OpenAI's InstructGPT was demonstrated.
Which is, again - if you exclude training SOTA models, still out of reach for all but a few entities on the planet - a time when anyone can do something cool that doesn't have a better market alternative yet, and any dedicated high-schooler can do truly impressive and useful work, outpacing commercial and academic efforts on pure motivation and focus alone (it's easier when you're not being distracted by bullshit incentives like user growth, making VCs happy, or churning out publications and farming citations).
It's, once again, a time of dreams, where anyone with some technical interest and a bit of free time can make the future happen in front of their eyes.
The timescale you are describing for 3D graphics is 4 years: from the 1997 cover you posted to the release of Halo, which you say is when excitement plateaued because the tech got advanced enough.
An almost infinitesimally small amount of time in the history of human development, and you are mocking the magazine for being excited about the advancement because it was... 4 years early?
No, the timescale is "the 90s", and the specific example is from 1997, chosen because of how badly it aged. Nobody looks at the original single-player Unreal graphics today and thinks "this is amazing!", but we all did at the time — Reflections! Dynamic lighting! It was amazing for the era — but it was also a long way from photorealism. ChatGPT is amazing… but how far is it from Brent Spiner's Data?
The era was people getting wowed from Wolfenstein (1992) to "about Halo or Half Life 2" (2001 or 2004).
And I'm not saying the flattening of excitement was for any specific reason, just that this was roughly when it stopped getting exciting — it might have been because the engines were good enough for 3D art styles beyond "as realistic as we can make it", but for all I know it was the War On Terror which changed the tone of press releases and how much the news in general cared. Or perhaps it was a culture shift which came with more people getting online and less media being printed on glossy paper and sold in newsagents.
I'm still holding on to my hypothesis that the excitement was sustained in large part because this progress was something a regular person could partake in. Most didn't, but they likely knew some kid who was. And some of those kids ran the gaming magazines.
This was a time when, for 3D graphics, barriers to entry got low (math got figured out, hardware was good enough, knowledge spread), but the commercial market didn't yet capture everything. Hell, a bulk of those excited kids I remember trying to build a better Unreal Tournament after school instead of doing homework (and almost succeeding!) went on to create and staff the next generation of commercial gamedev.
(Which is maybe why this period lasted about as long as it takes for a schoolkid to grow up, graduate, and spend a few years in the workforce doing the stuff they were so excited about.)
I was one of those kids, my focus was Marathon 2 even before I saw Unreal. I managed to figure out enough maths from scratch to end up with the basics of ray casting, but not enough at the time to realise the tricks needed to make that real time on a 75 MHz CPU… and then we all got OpenGL and I went through university where they explained the algorithms.
The weird thing about the phenomenon you mention is that only after the field of software engineering plateaued 15 years ago, as you said, did this insane demand for engineers arise, with correspondingly insane salaries.
My guess: It’s a very lengthy, complex, and error-prone process to “digitize” human civilization (government, commerce, leisure, military, etc). The tech existed, we just didn’t know how to use it.
We still barely know how to use computers effectively, and they have already transformed the world. For better or worse.
I agree, it's like watching a meadow ablaze and dismissing it because it's not a 'real forest fire' yet. No it's not 'real AGI' yet, but *this is how we get there* and the pace is relentless, incredible and wholly overwhelming.
I've been blessed with grandchildren recently, a little boy that's 2 1/2 and, just this past Saturday, a granddaughter. Major events notwithstanding, the world will largely resemble today when they are teenagers, but the future is going to look very very very different. I can't even imagine what the capability and pervasiveness of it all will be like in ten years, when they are still just kids. As someone invested in their future, I'm interested in all of the educational opportunities (technical, philosophical and self-awareness), but I am obviously concerned about the potential for pernicious side effects.
Your comment reminds me of this quote from a book published in the 80s:
> There is a related “Theorem” about progress in AI: once some mental function is programmed, people soon cease to consider it as an essential ingredient of “real thinking”. The ineluctable core of intelligence is always in that next thing which hasn’t yet been programmed. This “Theorem” was first proposed to me by Larry Tesler, so I call it Tesler’s Theorem: “AI is whatever hasn’t been done yet.”
I've always disliked this argument. A person can do something well without devising a general solution to the thing. Devising a general solution to the thing is a step we're taking all the time with all sorts of things, but it doesn't invalidate the cool fact about intelligence: whatever it is that lets us do the thing well without the general solution is hard to pin down and hard to reproduce.
All that's invalidated each time is the idea that a general solution to that task requires a general solution to all tasks, or that a general solution to that task requires our special sauce. It's the idea that something able to do that task will also be able to do XYZ.
And yet people keep coming up with a new task to point to, saying, 'this is the one! There's no way something could solve this one without also being able to do XYZ!'
If AI takes over white collar work that's still half of the world's labor needs untouched. There are some promising early demos of robotics plus AI. I also saw some promising demos of robotics 10 and 20 years ago that didn't reach mass adoption. I'd like to believe that by the time I reach old age the robots will be fully qualified replacements for plumbers and home health aides. Nothing I've seen so far makes me think that's especially likely.
I'd love more progress on tasks in the physical world, though. There are only a few paths for countries to deal with a growing ratio of old retired people to young workers:
1) Prioritize the young people at the expense of the old by e.g. cutting old age benefits (not especially likely since older voters have greater numbers and higher participation rates in elections)
2) Prioritize the old people at the expense of the young by raising the demands placed on young people (either directly as labor, e.g. nurses and aides, or indirectly through higher taxation)
3) Rapidly increase the population of young people through high fertility or immigration (the historically favored path, but eventually turns back into case 1 or 2 with an even larger numerical burden of older people)
4) Increase the health span of older people, so that they are more capable of independent self-care (a good idea, but difficult to achieve at scale, since most effective approaches require behavioral changes)
5) Decouple goods and services from labor, so that old people with diminished capabilities can get everything they need without forcing young people to labor for them
> If AI takes over white collar work that's still half of the world's labor needs untouched.
I am continually baffled that people here throw this argument out and can't imagine the second-order effects. If white collar work is automated by AGI, an unimaginable amount of R&D to solve robotics will happen in a flash. The top AI labs, the people smart enough to make this technology, are all focusing on automating AGI researchers, and from there everything else follows, obviously.
+1, the second and third order effects aren't trivial.
We're already seeing escape velocity in world modeling (see Google Veo2 and the latest Genesis LLM-based physics modeling framework).
The hardware for humanoid robots is 95% of the way there; the gap is control logic and intelligence, which is rapidly being closed.
Combine a Veo2-style world model, Genesis-style control planning, and o3-style reasoning, and you're pretty much there for blue-collar work automation.
We're only a few turns (<12 months) away from an existence proof of a humanoid robot that can watch a YouTube video and then replicate the task in a novel environment. It may take longer than that to productionize.
It's really hard to think and project forward on an exponential. We've been on an exponential technology curve since the discovery of fire (at least). The 2nd order has kicked up over the last few years.
It's not rational to look back at robotics from 2000-2022 and project that pace forward. There's more happening every month than in decades past.
I hope that you're both right. In 2004-2007 I saw self driving vehicles make lightning progress from the weak showing of the 2004 DARPA Grand Challenge to the impressive 2005 Grand Challenge winners and the even more impressive performance in the 2007 Urban Challenge. At the time I thought that full self driving vehicles would have a major commercial impact within 5 years. I expected truck and taxi drivers to be obsolete jobs in 10 years. 17 years after the Urban Challenge there are still millions of truck driver jobs in America and only Waymo seems to have a credible alternative to taxi drivers (even then, only in a small number of cities).
The real issue is people constantly making up new goalposts to keep their outdated world view somewhat aligned with what we are seeing. But these two things are drifting apart faster and faster. Even I got surprised by how quickly the ARC benchmark was blown out of the water, and I'm pretty bullish on AI.
The ARC maintainers have explicitly said that passing the test was necessary but not sufficient, so I don't know where you're getting goalpost-moving from. (I personally don't like the test; it is more about "intuition" or in-built priors, not reasoning.)
You are telling a bunch of high-earning individuals ($150k+) that they may be dramatically less valuable in the near future. Of course the goalposts will keep being pushed back and the acknowledgements will never come.
This is far too broad to summarise here. You can read up on Sutskever or Bostrom or, hell, even Stephen Hawking's ideas (going in order from really deep to general topics). We need to discuss everything - from education to jobs and taxes, all the way to the principles of politics, our economy and even the military. If we fail at this as a society, we will at the very least create a world where the people who own capital today massively benefit and become rich beyond imagination (despite having contributed nothing to it), while the majority of the population will be unemployable and forever left behind. And the worst case probably falls somewhere between the end of human civilisation and the end of our species.
One way you can tell this isn't realistic is that it's the plot of Atlas Shrugged. If your economic intuitions produce that book it means they are wrong.
> while the majority of the population will be unemployable and forever left behind
Productivity improvements increase employment. A superhuman AI is a productivity improvement.
Sometimes: the productivity improvements from the combustion engine didn't increase the employment of horses; they displaced them.
But even when productivity improvements do increase employment, it's not always to our advantage: the productivity improvements from Eli Whitney's cotton gin included huge economic growth and subsequent technological improvements… and also "led to increased demands for slave labor in the American South, reversing the economic decline that had occurred in the region during the late 18th century": https://en.wikipedia.org/wiki/Cotton_gin
A superhuman AI that's only superhuman in specific domains? We've been seeing plenty of those, "computer" used to be a profession, and society can re-train but it still hurts the specific individuals who have to be unemployed (or start again as juniors) for the duration of that training.
A superhuman AI that's superhuman in every ___domain, but close enough to us in resource requirements that comparative advantage is still important and we can still do stuff, relegates us to whatever the AI is least good at.
A superhuman AI that's superhuman in every ___domain… as soon as someone invents mining, processing, and factory equipment that works on the moon or asteroids, that AI can control that equipment to make more of that equipment, and demand is quickly — O(log(n)) — saturated. I'm moderately confident that in this situation, the comparative advantage argument no longer works.
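To make the O(log(n)) aside concrete, here's a toy back-of-the-envelope sketch, assuming (purely for illustration) equipment that doubles its own capacity each production cycle until it covers total demand; the numbers and the function name are made up.

    # Toy illustration: self-replicating capacity that doubles each cycle
    # saturates any fixed demand in roughly log2(demand) doubling cycles.
    import math

    def cycles_to_saturate(demand_units: float, initial_units: float = 1.0) -> int:
        """Doubling cycles needed before capacity >= demand."""
        if demand_units <= initial_units:
            return 0
        return math.ceil(math.log2(demand_units / initial_units))

    for demand in (1_000, 1_000_000, 1_000_000_000_000):
        print(f"{demand:>16,} units -> {cycles_to_saturate(demand)} doubling cycles")
    # 1,000 -> 10; 1,000,000 -> 20; 1,000,000,000,000 -> 40

Even a trillion-unit demand is only about 40 doublings away, which is the intuition behind saying comparative advantage stops doing much work once the replication loop closes.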
No, Atlas Shrugged explicitly assumes that the wealthy beneficiaries are also the ones doing the innovation and the labor. Human or superhuman AI, if not self-directed but more like a tool, may massively benefit whoever happens to be lucky enough to be directing it when it arises. This does not imply that the lucky individual benefits on the basis of their competence.
The idea that productivity improvements increase employment is just fundamentally based on a different paradigm. There is absolutely no reason to think that when a machine exists that can do most things a human can do, as well if not better, for less or equal cost, this will somehow increase human employment. In this scenario, using humans at any stage of the pipeline would be deeply inefficient and a stupid business decision.
It's gone from "well the output is incoherent" to "well it's just spitting out stuff it's already seen online" to "WELL...uhh IT CAN'T CREATE NEW/NOVEL KNOWLEDGE" in the space of 3-4 years.
I'm a little torn. ARC is really hard, and Francois is extremely smart and thoughtful about what intelligence means (the original "On the Measure of Intelligence" heavily influenced my ideas on how to think about AI).
On the other hand, there is a long, long history of AI achieving X but not being what we would casually refer to as "generally intelligent," then people deciding X isn't really intelligence; only when AI achieves Y will it be intelligence. Then AI achieves Y and...