We can be certain of this by looking at 1) the structure of these engines, 2) the kinds of errors they make, and 3) their learning methods.
The engines are basically indexes of common associations, maps of frequency of occurrence. Regurgitating a bunch of stuff that has a high correlation to your input is NOT intelligence; it is the result of having an insanely large map. This can often produce impressive and useful results, but it is not intelligence or wielding concepts.
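To make concrete what I mean by a "map of frequency of occurrence", here is a deliberately crude sketch in Python (a made-up toy corpus and function names; real LLMs are far more elaborate than this, but association-by-frequency is the spirit of the claim):

```python
# Toy illustration of "regurgitating high-frequency associations".
# This is NOT how a real transformer works; it is a hypothetical,
# deliberately crude frequency map over a made-up corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Frequency map: for each word, count which words follow it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def continue_text(word, n=5):
    """'Generate' by always emitting the most frequent continuation."""
    out = [word]
    for _ in range(n):
        if word not in following:
            break
        word = following[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(continue_text("the"))  # -> "the cat sat on the cat"
```

Scaled up to an insanely large map, this kind of lookup can look fluent without anything resembling a concept behind it - which is the point I'm making above.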
For errors, the image generators provide some of the best illustrations. They produce the images most associated with the inputs. One error illustrates this very well: asked to produce an image of a woman sitting on a sailboat, the generator returns a bikini-clad woman who looks great until you notice that her face and torso are facing mostly towards the camera, yet her buttocks also face the camera and her legs point away from us. No intelligent person or concept-wielding "AI" would produce such an error - it would know the relationships between head, torso, buttocks, and legs. These don't. Another telling type of error occurs when asked to produce an image of Person X on a new background, when the training set had only a handful of images of Person X. It cannot do it - it returns essentially one of the full training images, with no new background. There is obviously zero concept of what a person is, or of what the boundaries of a human shape would be. They can only produce these results with hundreds of thousands of images, so what is built up is the set of things that match or don't match the label (e.g., "astronaut" or "Barack Obama"), so that the actual images are statistically separated from the thousands of backgrounds.
Which brings us to how they learn. Intelligent beings from worms to humans learn and abstract on incredibly small data sets. By the time a child can use a crayon, having seen only hundreds of humans, s/he can separate out what is a human from the background (might not make a good drawing yet, but knows the difference). Show a child a single new thing, and s/he will separate it from the background immediately. In contrast, these LLMs and GANs require input of nearly the entire corpus of human knowledge, and even then only some of the time output something resembling the right thing.
It is entirely different from intelligence (which is not to say it isn't often useful). But the more I learn about how they work and are built, the less I'm worried about this entire generation of machines. It is no more cause for worry than the observation 25 years ago that Google could do the work of 10,000 librarian person-hours in 0.83 seconds. Great stuff, and it changes the value of some types of work, but it is not an existential threat.
I agree that we can conclude that AlphaGo, GPT, and Stable Diffusion are far from an AGI in program-design-space, just like we could conclude that an airship, an airplane, and a rocket are all far apart from each other in aircraft-design-space.
But I don't think this offers certainty that AGI won't be developed for a long time (temporal distance), nor that a large number of fundamental breakthroughs or new hardware are needed, rather than just one or two key software-architecture insights.
With the eager investment and frantic pace of research competition, it seems like there will only be increasing pressure to explore AI-design-space for the near future, which means that even radically different and improved designs might be discovered in a short time.
That, right there, is the key - radically different and improved; i.e., not an extension of the current stuff.
I fully agree that the impressive stunts of AlphaGo/GPT/SD, etc. bring enthusiasm, investment, and activity to the field, which will shorten any search.
The catch for me is that these technologies, as impressive as they are, 1) are not themselves a direct step towards AGI (beyond generating enthusiasm/investment), and 2) tell us nothing about how much further we will need to search.
That radical improvement may be right under our nose, or a millennium away.
This reminds me of Hero's aeolipile, a steam engine invented over 2000 years ago. It could be said that we almost got the industrial revolution right then. Yet it took another 1800+ years of other breakthroughs before anyone got back around to it. Plus, Hero's engine was using exactly the correct principles, whereas these AG/GPT/SD are clearly NOT onto the correct principles.
So, how much will this enthusiasm, investment, and activity speed the search? If it's just an order of magnitude, we're still 180 years away. If it's three orders of magnitude, it'll be late next year, and if it's five, it'll be here next weekend.
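For reference, the back-of-the-envelope arithmetic behind those figures, taking the ~1800-year aeolipile-to-industrial-revolution gap as the baseline:

```python
# Quick sketch of the timescale arithmetic above (baseline: ~1800 years
# from Hero's aeolipile to the industrial revolution).
baseline_years = 1800
for exponent in (1, 3, 5):
    years = baseline_years / 10**exponent
    print(f"10^{exponent}x speedup -> {years:g} years (~{years * 365:.1f} days)")
# 10^1 -> 180 years; 10^3 -> 1.8 years ("late next year"); 10^5 -> ~6.6 days ("next weekend")
```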
So, I guess, in short: we've both read Bostrom's book and agree that the runaway AGI scenario is a serious concern, and that these aren't any form of AGI, but that they might, as a secondary effect of the enthusiasm they generate and their genuine (albeit flaky) usefulness, accelerate the runaway AGI scenario?
EDIT: considering your "airship/airplane/rocket distances in aircraft-design-space" analogy, it seems we don't even know whether what we've got with AG/GPT/SD is an airship when we need a rocket, or an airplane when we actually need a warp drive.
So, we know we're accelerating the search in the problem/design space. But how can we answer the question of how big a space we'll need to search, and how big our investment is relative to that search volume?
Well, what we do have in our heads is a human brain, which I believe is not more powerful than a Turing machine, and which is a working proof-of-concept created by a random, greedy, trial-and-error incremental process in a not-astronomical number of generations out of a population of less than one million primates. That tells me that we're probably not a warp-drive distance away from finding a working software implementation of its critical elements. And each time a software problem goes from "unsolvable by a computer, yet trivial for the human brain" to "trivial for both", it seems to me that we lose more than just another CAPTCHA. We lose grounds for believing that anything the brain does is fundamentally all that difficult for computers to do, once we stop being confused about how to do it.
This has happened very frequently over my lifespan, and even more rapidly in the past 12 months, so it no longer feels surprising when it happens. I think we've basically distilled the core elements of planning, intuition, perception, imagination, and language; we're clearly not there yet with reasoning, reflection, creativity, or abstraction, but I don't see why another 10 or 20 years of frantic effort won't get us there. GPT, SD, and Segment Anything are not even extensions or scaled-up versions of AlphaGo, so there are clearly multiple seams being mined here, and very little hesitation to explore more widely while cross-pollinating ideas, techniques, and tooling.
Interesting approach, especially to the questions raised.
>>not more powerful than a Turing machine
In many ways less powerful, but also has some orthogonal capabilities?
>>working proof-of-concept
For sure!
>>probably not a warp-drive distance away from finding a working software implementation of its critical elements
>>I don't see why another 10 or 20 years of frantic effort won't get us there
Agree. My sense is that an AGI is on a similar time and frantic-effort scale, although I don't get there by quite the same reasoning. I think it is more than an airplane-to-rocket jump, but closer than warp-drive tech. It also depends on whether we're talking about a general-ish tech or a runaway AGI singularity.
>>created by a random greedy trial-and-error incremental process in a not-astronomical number of generations out of a population of less than one million primates.
True, although setting the baseline at primates is setting it very high. Even lower mammals and birds (avian dinosaur descendants) have significant abstraction and reasoning capabilities. The "mere" bird's-nest problem, of making a new thing out of whatever materials happen to be available, is very nontrivial.
So, we first need to create that level of ability to abstract. This would include having the "AI" "understand" physical constructs such as objects, hiding (occlusion), the relationships between feet, knees, hips, torso, and head (and that in humans, the feet and knees point in the same direction as the face...), the physical interactions between objects... probably the entire set of inferences now embedded in CYC, and more. THEN, we need to abstract again to get from the primate level to the runaway symbolic, tool-wielding processing of humans and beyond.
It seems that the first problem set will be the more difficult one. Looking again at biological evolution, how much longer did it take for biology to develop the ability to abstract 3D shapes and relations (the first hunting predators?)? It was a heck of a lot more time and iterations than the million primates over a few million generations. So, this might be similar.
>>to explore more widely while cross-pollinating ideas, techniques, and tooling.
Yup, key there.
Another key is being more biomimetic, both in the simulation of neuron functioning and in deeply integrating sensor suites into the computing system. The idea that we are just brains in jars seems an abstraction (distraction?) too far. I have a hard time seeing how our brains are anything more than a big node in our whole nervous, and indeed biological, system, and the input from the entire body is essential to growing the brain. I expect we might find something similar about AI.
OTOH, in airplanes, our methods of propulsion and control are quite different from the biological solutions of birds (although the lift principles are the same), and we're still integrating a lot of bird "tech" into flying. Wheels vs. legs might be a better example, although the hottest thing now is legged robotics, since legs don't need roads... It seems that we are similarly developing clunky, limited, and very-artificial intelligence systems before we get to building the flexible systems seen in biology...