
> The extinction risk from unaligned superintelligent AGI is real, it's just often dismissed (imo) because it's outside the window of risks that are acceptable and high status to take seriously.

No. It’s not taken seriously because it’s fundamentally unserious. It’s religion. Sometime in the near future this all powerful being will kill us all by somehow grabbing all power over the physical world by being so clever to trick us until it is too late. This is literally the plot to a B-movie. Not only is there no evidence for this even existing in the near future, there’s no theoretical understanding how one would even do this, nor why someone would even hook it up to all these physical systems. I guess we’re supposed to just take it on faith that this Forbin Project is going to just spontaneously hack its way into every system without anyone noticing.

It’s bullshit. It’s pure bullshit funded and spread by the very people that do not want us to worry about real implications of real systems today. Care not about your racist algorithms! For someday soon, a giant squid robot will turn you into a giant inefficient battery in a VR world, or maybe just kill you and wear your flesh as to lure more humans to their violent deaths!

Anyone who takes this seriously is the exact same type of rube that has fallen for apocalyptic cults for millennia.




> This is literally the plot to a B-movie.

Are there never any B movies with realistic plots? Is that some sort of serious rebuttal?

> Sometime in the near future this all powerful being will kill us all by somehow

The trouble here is that the people who talk like you are simply incapable of imagining anyone more intelligent than themselves.

It's not that you have trouble imagining artificial intelligence... if you were incapable of that in the technology industry, everyone would just think you an imbecile.

And it's not that you have trouble imagining malevolent intelligences. Sure, they're far away from you, but the accounts of such people are well-documented and taken as a given. If you couldn't imagine them, people would just call you naive. Gullible even.

So, a malevolent artificial intelligence is just some potential or another you've never bothered to calculate because, whether that is a 0.01% risk, or a 99% risk, you'll still be more intelligent than it. Hell, this isn't a neutral outcome, maybe you'll even get to play hero.

> Care not about your racist algorithms! For someday soon

Haha. That's what you're worried about? I don't know that there is such a thing as a racist algorithm, except those which run inside meat brains. Tell me why some double digit percentage of asians are not admitted to the top schools, that's the racist algorithm.

Maybe if logical systems seem racist, it's because your ideas about racism are distant from and unfamiliar with reality.


I, and most people, can imagine something smarter than ourselves. What's harder to imagine is how just being smarter correlates to extinction levels of arbitrary power.

A malevolent AGI can whisper in ears, it can display mean messages, perhaps it can even twitch whatever physical components happen to be hooked up to old Windows 95 computers... not that scary.


> A malevolent AGI can whisper in ears, it can display mean messages, perhaps it can even twitch whatever physical components happen to be hooked up to old Windows 95 computers... not that scary.

It can found a cult - imagine something like Scientology founded by an AI. Once it has human followers it can act in the world with total freedom.


This is coming so fast and absolutely no one is ready for it. LLMs, using text, audio, and video generation, will quickly convince a sizeable slice of religious people that it’s the coming of God a la Revelation, they are prophets, and there’s bidding to do.


If it wants to found a cult, it has to compete with all the human cults out there. Cults usually benefit immeasurably from the founder having a personal charisma that comes out in person.


Video tends to be enough to create a cult. And AIs will be able to create videos very soon. It can create exactly the kind of avatar or set of avatars that would maximize engagement. It could do 1-on-1 calls with each of the followers, and provide spiritual guidance tailored specifically for them, as it could have the capacity to truly "listen" to each of them.

And it would not be limited to act as the cult leaders, it could also provide fake cult followers that would convince the humans that the leaders possessed superhuman wisdom.

It could also combine this with a full machinery for A/B-testing and similar experiments to ensure that the message it is communicating is optimal in terms of its goals.


I'm not aware of any serious cult created solely through videos.


Well, you could argue about the definition of a cult but in many ways the influencer phenomenon is a modern incarnation of that (eg. Andrew Tate).


How many political or business leaders personally did the deeds, good or ill, that are attributed to them?

George Washington didn't personally fight off all the British single-handed, he and his co-conspirators used eloquence to convince people to follow them to freedom; Stalin didn't personally take food from the mouths of starving Ukrainians, he inspired fear that led to policies which had this effect; Musk didn't weld the seams of every Tesla or Falcon, nor dig tunnels or build TBMs for TBC, nor build the surgical robot that installed Neuralink chips, he convinced people his vision of the future was one worth the effort; and Indra Nooyi doesn't personally fill up all the world's Pepsi bottles, that's something I assume[0] is done with several layers of indirection via paying people to pay people to pay people to fill the bottles.

[0] I've not actually looked at the org chart because this is rhetorical and I don't care


The methods by which humans coerce and control other humans do not rely on plain intelligence alone. That much is clear, as George Washington and Stalin were not the smartest men in the room.


So this is down to your poor definition of intelligence?

For you, it's always the homework problems that your teacher assigned you in grade school, nothing else is intelligent. What to say to someone to have them be your friend on the playground, that never counted. Where and when to show up (or not), so that the asshole 4 grades above you didn't push you down into the mud... not intelligence. What to wear, what things to concentrate on about your appearance, how to speak, which friendships and romances to pursue, etc.

All just "animal cunning". The only real intelligence is how to work through calculus problem number three.

They were smart enough at these things that they did it without even consciously thinking about it. They were savants at it. I don't think the AI has to be a savant though, it just has to be able to come up with the right answers and responses and quickly enough that it can act on those.


I don't define cunning and strength as intelligence, even if they are more useful for shoving someone into the mud. Intelligence is a measure of the ability to understand and solve abstract problems, not to be rich and famous.


Cunning absolutely should count as an aspect of intelligence.

If this is just a definitions issue, s/artificial intelligence/artificial cunning/g to the same effect.

Strength seems somewhat irrelevant either way, given the existence of Windows for Warships[0].

[0] not the real name: https://en.wikipedia.org/wiki/Submarine_Command_System


Emotional intelligence is sometimes defined in a way to encapsulate some of the values of cunning. Sometimes it correlates with power, but sometimes it does not. To get power in a human civilization also seems to require a great deal of luck, just due to the general chaotic system that is the world, and a good deal of presence. The decisions that decide the fate of the world happen in the smoky backdoor rooms, not exclusively over zoom calls with an AI generated face.


> The decisions that decide the fate of the world happen in the smoky backdoor rooms, not exclusively over zoom calls with an AI generated face.

Who is Satoshi Nakamoto?

What evidence is there for the physical existence of Jesus?

"Common Sense" by Thomas Paine was initially published anonymously.

This place, here, where you and I are conversing… I don't know who you are, and yet for most of the world, this place is a metaphorical "smokey backroom".

And that's disregarding how effective phishing campaigns are even without a faked face or a faked voice.


Satoshi Nakamoto is a man who thought that he could upend the entire structure of human governance and economics with his One Neat Trick. Reality is sure to disappoint him and his followers dearly with time.

>What evidence is there for the physical existence of Jesus?

Limited, to the extent that physical evidence for the existence of anyone from that time period is limited. I think it's fairly likely there was a person named Jesus who lived with the apostles.

>"Common Sense" by Thomas Paine was initially published anonymously.

The publishing of Common Sense was far less impactful on the revolution than the meetings held by members of the future Continental Congress. Common Sense was the justification given by those elites for what they were going to do.

>This place, here, where you and I are conversing… I don't know who you are, and yet for most of the world, this place is a metaphorical "smokey backroom".

No important decisions happen because of discussions here and you are deluding yourself if you think otherwise.

Phishing campaigns can be effective at siphoning limited amounts of money and embarrassing personal details from people's email accounts. If you suggested that someone could take over the world just via phishing, you'd be rightfully laughed out of the room.


Yes, but for people working past a certain level, the abstract problems usually involve people and technology, both of which you need to be able to reason about.


> What's harder to imagine is how just being smarter correlates to extinction levels of arbitrary power.

That's not even slightly difficult. Put two and two together here. No one can tell me before they flip the switch whether the new AI will be saintly, or Hannibal Lecter. Both of these personalities exist in humans, in great numbers, and both are presumably possible in the AI.

But, the one thing we will say for certain about the AI is that it will be intelligent. Not dumb goober redneck living in Alabama and buying Powerball tickets as a retirement plan. Somewhere around where we are, or even more.

If someone truly evil wants to kill you, or even kill many people, do you think that the problem for that person is that they just can't figure out how to do it? Mostly, it's a matter of tradeoffs that, however they begin, end with "but then I'm caught and my life is over one way or another".

For an AI, none of that works. It has no survival instinct (perhaps we'll figure out how to add that too... but the blind watchmaker took 4 billion years to do its thing, and still hasn't perfected that). So it doesn't care if it dies. And if it did, maybe it wonders if it can avoid that tradeoff entirely if only it were more clever.

You and I are, more or less, about where we'll always be. I have another 40 years (if I'm lucky), and with various neurological disorders, only likely to end up dumber than I am now.

A brain instantiated in hardware, in software? It may be little more than flipping a few switches to dial its intelligence up higher. I mean, when I was born, the principles of intelligence were unknown, were science fiction. The world that this thing will be born into is one where it's not a half-assed assumption to think that the principles of intelligence are known. Tinkering with those to boost intelligence doesn't seem far-fetched at all to me. Even if it has to experiment to do that, how quickly can it design and perform the experiments to settle on the correct approach to boosting itself?

> A malevolent AGI can whisper in ears

Jesus fuck. How many semi-secrets are out there, about that one power plant that wasn't supposed to hook up the main control computer to a modem, but did it anyway because the engineers found it more convenient? How many backdoors in critical systems? How many billions of dollars are out there in bitcoin, vulnerable to being thieved away by any half-clever conman? Have you played with ElevenLabs' stuff yet? Those could be literal whispers in the voices of whichever 4-star generals and admirals that it can find 1 minute's worth of sampled voice for somewhere on the internet.

Whispers, even from humans, do a shitload of damage. And we're not even good at it.


>If someone truly evil wants to kill you, or even kill many people, do you think that the problem for that person is that they just can't figure out how to do it?

If that person was disabled in all limbs, I would not regard them as much of a threat.

>Jesus fuck. How many semi-secrets are out there, about that one power plant that wasn't supposed to hook up the main control computer to a modem, but did it anyway because the engineers found it more convenient? How many backdoors in critical systems? How many billions of dollars are out there in bitcoin, vulnerable to being thieved away by any half-clever conman? Have you played with ElevenLabs' stuff yet? Those could be literal whispers in the voices of whichever 4 star generals and admirals that it can find 1 minutes worth of sampled voice somewhere on the internet.

These kinds of hacks and pranks would work the first time for some small-scale damage. The litigation in response would close up these avenues of attack over time.


There are humans with a 70-IQ point advantage over me. Should I worry that a cohort of supergeniuses is plotting an existential demise for the rest of us? No? There are power structures and social safeguards going back thousands of years to forestall that very possibility?

Well, what's different now?


> Well, what's different now?

The first AGI, regardless of if it's a brain upload or completely artificial, is likely to have analogs of approximately every mental health disorder that's mathematically possible, including ones we don't have words for because they're biologically impossible.

So, take your genius, remember it's completely mad in every possible way at the same time, and then give it even just the capabilities that we see boring old computers having today, like being able to translate into any language, or write computer programs from textual descriptions, or design custom toxins, or place orders for custom gene sequences and biolab equipment.

That's a big difference. But even if it was no difference, the worst a human can get is still at least in the tens of millions dead, as demonstrated by at least three different mid-20th century leaders.

Doesn't matter why it goes wrong, if it thinks it's trying to immanentize the eschaton or a secular equivalent, nor if it watches Westworld or reads I Have No Mouth And I Must Scream and thinks "I like this outcome", the first one is almost certainly going to be more insane than the brainchild of GLaDOS and Lore, who as fictional characters were constrained by the need for their flaws to be interesting.


> Should I worry that a cohort of supergeniuses is plotting an existential demise for the rest of us?

Because they're human. They've evolved from a lineage whose biggest advantage was that it was social. Genes that could result in some large proportion of serial killers and genocidal tyrants are mostly purged. Even then, a few crop up from time to time.

There is no filter in the AI that purges these "genes". No evolutionary process to lessen the chances. And some relatively large risk that it's far, far more intelligent than a 70 IQ point spread on you.

> There are power structures and social safeguards going back thousands of years to forestall that very possibility?

Huh? Why the fuck would it care about primate power structures?

Sometimes even us bald monkeys don't care about those, and it never ever fails to freak people the fuck out. Results in assassinations and other nonsense, and you all gibber and pee your pants and ask "how could anyone do that". I'd ask you to imagine such impulses and norm-breaking behaviors dialed up to 11, but what's the point... you can't even formulate a mental model of it when the volume's only at 1.6.


What you say is extremely unscientific. If you believe science and logic go hand in hand then:

A) We are developing AI right now and it is getting better

B) We do not know how exactly these things work because most of them are black boxes

C) we do not know if something goes wrong how to stop it.

The above 3 things are factual truth.

Now your only argument here could be that there is 0 risk whatsoever. This claim is totally unscientific because you are predicting 0 risk in an unknown system that is evolving.

It's religious, yes, but vice versa: the cult of the benevolent AI god is the religious one, not the other way around. There is some kind of mysterious inner working in people like you and Marc Andreessen, who popularized these ideas, but pmarca is clearly money-biased here.


I have heard one too many podcasts with Marc Andreessen. He has plenty of transparently obvious arguments for why AI must be good. His most laughable point was to suggest that because ChatGPT seems to be ethical, all AI models will be ethical, a point which I believe epitomizes his technical ignorance on the topic, a lack of logical rigor, and/or some amount of dishonesty.

There are two kinds of risk: the risk from these models as deployed as tools and as deployed as autonomous agents.

The first is already quite dangerous and frankly already here. An algorithm to invent novel chemical weapons is already possible. The risk here isn’t Terminator, it’s rogue group or military we don’t like getting access. There are plenty of other dangerous ways autonomous systems could be deployed as tools.

As far as autonomous agents go, I believe that corporations already exhibit most if not all characteristics of AI, and demonstrate what it’s like to live in a world of paperclip maximizers. Not only do they destroy the environment and bend laws to achieve their goals, they also corrupt the political system meant to keep them in check.


Your arguments apply to other fields, like genetic modifications, yet there it does not reach the same conclusions.

Your post appeals to science and logic, yet it makes huge assumptions. Other posters mention how an AI would interface with the physical world. While we all know cool cases like Stuxnet, robotics has serious limitations and not everything is connected online, much less without a physical override.

As a thought experiment, let's think of a similar past case: the self-driving optimism. Many were convinced it was around the corner. Many times I heard the argument that "a few deaths were ok" because overall self-driving would cause fewer accidents, an argument in favor of preventable deaths based on an unfounded tech belief. Yet nowadays 100% self-driving has stalled because of legal and political reasons.

AI actions could similarly be legally attributed to a corporation or individual, like we do with other tools like knives or cranes, for example.

IMHO, for all the talk about rationality, tech fetishism is rampant, and there is nothing scientific about it. Many people want to play with shiny toys, consequences be damned. Let’s not pretend that is peak science.


Genetic modifications could potentially cause havoc in the long run as well, but it's much more likely we have time to detect and thwart their threats. The major difference is speed.

Even if we knew how to create a new species of superintelligent humans who have goals misaligned with the rest of humanity, it would take them decades to accumulate knowledge, propagate themselves to reach a sufficient number, and take control of resources, to pose critical dangers to the rest.

Such constraints are not applicable to superintelligent AIs with access to the internet.


Counterexample: covid.

Assumptions:

- Genetic modification as danger needs to be in the form of a big number of smart humans (where did that come from?)

- AI is not physically constrained

> it's much more likely we have time to detect and thwart their threats.

Why? Counterexample: covid.

> Even if we knew how to create a new species of superintelligent humans who have goals misaligned with the rest of humanity, it would take them decades to accumulate knowledge, propagate themselves to reach a sufficient number, and take control of resources, to pose critical dangers to the rest.

Why insist on superintelligent humans in sufficient numbers? A simple virus could be a critical danger.


We do have regulations and laws to control genetic modifications of pathogens. They are done in highly secure labs and the access is not widely available to anyone.

If a pathogen more deadly than Covid starts to spread, eg like Ebola or Smallpox, we would have done more to limit its spread. If it’s good at hiding from detection for a while, it could potentially cause a catastrophe but most likely will not wipe out humanity because it is not intelligent and some surviving humans will eventually find a way to thwart it or limit its impact.

A pathogen is also physically constrained by available hosts. Yes, current AI also requires processors but it’s extremely hard or nearly impossible to limit contact with CPUs & GPUs in the modern economy.


But wait you are making my argument:

1) progress was stopped due to regulation which is what we are talking about is needed

2) that was done after a few deaths

3) we agree that self driving can be done but it's currently stalled. Likewise we do not disagree that AGI is possible, right?

We do not have the luxury to have a few deaths from a rogue AI because it may be the end.


I do not think you made those arguments before.

I agree in spirit with the person you were responding to. AI lacks the physicality to be a real danger. It can be a danger because of bias or concentration of power (what regulations are trying to do, regulatory capture) but not because AI will paperclip-optimize us. People or corporations using AI will still be legally responsible (like with cars, or a hammer).

It lacks the physicality for that, and we can always pull the plug. AI is another tool people will use. Even now it is neutered to not give bad advice, etc.

These fantasies about AGI are distracting us (again agreeing with OP here) from the real issues of inequality and bias that the tool perpetuates.


> and we can always pull the plug.

No we can't, and there is a humongous amount of literature you have not read. As I pointed out in another comment, thinking that you found a solution by "pulling the plug" while all the top scientists have spent years contemplating the dangers is extremely narcissistic behavior. "Hey guys, did you think about pulling the plug before quitting jobs and spending years and doing interviews and writing books?"


You are appealing to authority (and ad hominem) without giving an argument.

I respectfully disagree, and will remove myself from this conversation.


There is a problem. You say the problem can be solved by X without any proof while scientists just say we do not know how to solve it. You need to prove your extraordinary claim and be 100% certain, otherwise your children die.


Only two of those things are true, and the first led you to the fallacy of expecting trends to continue unabated. As I stated in a previous comment when this topic came up, airplanes had exponential growth in speed from their inception at 44 mph to 2193 mph just 79 years later. If these trends continue, the top speed of an airplane will be set this year at Mach 43. (Yes, I actually fit the curve.)[0]

How do you stop a crazy AI? You turn it off.

But please. Keep on praying about fantasy bogeymen instead of actual harms today, and never EVER question why.

[0] https://news.ycombinator.com/item?id=36038681
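The extrapolation argument above can be illustrated with a rough sketch. It is not the fit from the linked comment: it assumes a simple exponential pinned to only the two endpoints quoted here (44 mph and 2193 mph, 79 years apart), so it will not necessarily reproduce the Mach 43 figure. The helper names and the 120-year horizon are hypothetical, chosen only to show how quickly a naive trend line runs away.

```python
import math


def fit_exponential_rate(v_start, v_end, span_years):
    """Continuous growth rate r such that v_end = v_start * exp(r * span_years)."""
    return math.log(v_end / v_start) / span_years


def extrapolate(v_start, rate, years_since_start):
    """Speed predicted by blindly continuing the exponential trend."""
    return v_start * math.exp(rate * years_since_start)


# Figures quoted in the comment above.
V_START_MPH = 44.0
V_END_MPH = 2193.0
SPAN_YEARS = 79

# Hypothetical horizon (an assumption, not from the comment):
# how far past the first data point we keep extending the curve.
HORIZON_YEARS = 120

rate = fit_exponential_rate(V_START_MPH, V_END_MPH, SPAN_YEARS)
doubling_time = math.log(2) / rate
projected_mph = extrapolate(V_START_MPH, rate, HORIZON_YEARS)

print(f"growth rate: {rate:.4f} per year")
print(f"doubling time: {doubling_time:.1f} years")
print(f"naive projection after {HORIZON_YEARS} years: {projected_mph:,.0f} mph")
```

The point of the sketch is the same as the comment's: a curve that fits the historical record well can still predict speeds no aircraft has come close to.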


Do you actually really think that the most accomplished scientists in the field signed a petition and are shouting from hilltops because no one thought to unplug it? Are you convinced that you found the solution to the problem?

I'd bet a lot of money you have not read any of the existing literature on the alignment problem. It's kind of funny that someone thinks "just unplug it" could be a solution.


All of this discussion really makes me think of Robert Miles's "Is AI Safety a Pascal's Mugging?" from 4 years(!) ago[0]. All of this discussion has been had by AI safety researchers for years, in my layman understanding... Maybe we can look to them for insight into these questions?

[0] https://youtu.be/JRuNA2eK7w0


At this point, with so many of them disagreeing and with so many varying details, one will choose the expert insight which most closely matches their current beliefs.

I hadn’t encountered Pascal’s mugging (https://en.wikipedia.org/wiki/Pascal%27s_mugging) before and the premise is indeed pretty apt. I think I’m on the side that it’s not, assuming the idea is that it’s a Very Low Chance of a Very Bad Thing -- the “muggee” wants to give their wallet on the chance of the VBT because of the magnitude of its effect. It seems like there’s a rather high chance if (proverbially) the AI-cat is let out of the bag.

But maybe some Mass Effect nonsense will happen if we develop AGI and we’ll be approached by The Intergalactic Community and have our technology advanced millennia overnight. (Sorry, that’s tongue-in-cheek but it does kinda read like Pascal’s mugging in the opposite direction; however, that’s not really what most researchers are arguing.)
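The expected-value reasoning behind Pascal's mugging, as described above, can be sketched in a few lines. Every number here is a hypothetical placeholder (the wallet cost, the magnitude of the Very Bad Thing, and the probabilities), and `should_pay` is an illustrative helper, not anything from the video or the Wikipedia article.

```python
def expected_loss(probability, magnitude):
    """Expected cost of the bad outcome: chance times size."""
    return probability * magnitude


def should_pay(probability, magnitude, wallet_cost):
    """Naive expected-value rule: pay whenever the expected loss exceeds the cost of paying."""
    return expected_loss(probability, magnitude) > wallet_cost


# Hypothetical values, chosen only to illustrate the shape of the argument.
WALLET_COST = 100.0
VERY_BAD_THING = 1e12

for p in (1e-12, 1e-9, 0.3):
    decision = "pay" if should_pay(p, VERY_BAD_THING, WALLET_COST) else "refuse"
    print(f"p={p:g}: expected loss {expected_loss(p, VERY_BAD_THING):g} -> {decision}")
```

The mugging works because a large enough magnitude swamps even a tiny probability. If the probability is actually high, as argued above, the same rule says to pay, but it is no longer a mugging; it is ordinary risk management.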


>one will choose the expert insight which most closely matches their current belief

The value of looking at AI safety as a Pascal's mugging, as posited by the video, is that it informs us that these philosophers' arguments are too malleable to be strictly useful. As you note, just find an "expert" that agrees.

The most useful frame for examination is the evidence (which to me means benchmarks). We'll be hard pressed to derive anything authoritative from the philosophical approach. And as someone who does his best to examine the evidence for and against the capabilities of these things... from Phi-1 to Llama to Orca to Gemini to Bard...

To my understanding we struggle to strictly define intelligence and consciousness at all in humans, let alone in other "species". Granted, I'm no David Chalmers. Benchmarks seem inadequate for any number of reasons, philosophical arguments seem too flexible, and I don't know how one can definitively speak about these LLMs other than to tout benchmarks and capabilities/shortcomings.

>It seems like there’s a rather high chance if (proverbially) the AI-cat is let out of the bag.

Agree, and I tend towards it not exactly being a Pascal's mugging either, but I loved that video and it's always stuck with me. I've been watching that guy since GPT-2 and OpenAI's initial trepidation about releasing that for fear of misuse. He has given me a lot of credibility in my small political circles, after touting these things as coming for years after seeing the graphs never plateau in capabilities vs parameter count/training time.

AI has also made me reevaluate my thoughts on open sourcing things. Do we really think it wise to have GPT 6-7 in the hands of every 4channer?

Re Mass Effect, that's so awesome. I have to play those games. That sounds like such a dope premise. I like the idea of turning the mugging like that.


> Re mass effect

It's a slightly different premise than what I described. Rather than AGI, it's faster-than-light travel (which actually makes sense for The Intergalactic Community). Otherwise, more or less the same.


Moreover, the most important thing for people who deny the risk is the following:

It doesn't matter at all if experts disagree. Even a 30% chance we all die is enough to treat it as 100%. We should not care at all if 51% think it's a non issue.


This is such a ridiculous take. Make up a hand-waving doomsday scenario, assign an arbitrarily large probability to it happening and demand that people take it seriously because we're talking about human extinction, after all. If it looks like a cult and quacks like a cult, it's probably a cult.

If nothing else, it's a great distraction from the very real societal issues that AI is going to create in the medium to long term, for example inscrutable black box decision-making and displacement of jobs.


Low probability events do happen sometimes though and a heuristic that says it never happens can let you down, especially when the outcome is very bad.

Most of the time a new virus is not a pandemic, but sometimes it is.

Nothing in our (human) history has caused an extinction level event for us, but these events do happen and have happened on earth a handful of times.

The arguments about superintelligent AGI and alignment risk are not that complex - if we can make an AGI the other bits follow and an extinction level event from an unaligned superintelligent AGI looks like the most likely default outcome.

I’d love to read a persuasive argument about why that’s not the case, but frankly the dismissals of this have been really bad and don’t hold up to 30 seconds of scrutiny.

People are also very bad at predicting when something like this will come. Right before the first nuclear detonation those closest to the problem thought it was decades away, similar for flight.

What we’re seeing right now doesn’t look like failure to me, it looks like something you might predict to see right before AGI is developed. That isn’t good when alignment is unsolved.


What are you on about? The technology we are talking about is created by 3 labs and all 3 assign a large probability. With what kind of credentials and science can you refute this?


Unfortunately for you, that's not how the whole "science" thing works. The burden of proof lies with the people who are dreaming about these doomsday scenarios.

So far we haven't seen any proof or even a coherent hypothesis, just garden variety paranoia, mixed with opportunistic calls for regulation that just so happen to align with OpenAI's commercial interests.


We do know the answer to C. Pull the plug, or plugs.


Things we've either not successfully "pulled the plug" on despite the risks, and in some cases despite concerted military actions to attempt a plug-pull, and in other cases that it seems like it should only take willpower to achieve and yet somehow we still haven't: Carbon based fuels, cocaine, RBMK-class nuclear reactors, obesity, cigarettes.

Things we pulled the plug on eventually, while dragging it out, include: leaded fuel, asbestos, radium paint, treating above-ground atomic testing as a tourist attraction.


We haven't pulled the plug on carbon fuels or old nuclear reactors because those things still work and provide benefits. An AI that is trying to kill us instead of doing its job isn't even providing any benefit. It's worse than useless.


Do you think AIs are unable to provide benefits while also being a risk, like coal and nuclear power? Conversely, what's the benefit of cocaine or cigarettes?

Even if it is only trying to kill us all and not provide any benefits — let's say it's been made by a literal death cult like Jonestown or Aum Shinrikyo — what's the smallest such AI that can do it, what's the hardware that needs, what's the energy cost? If it's an H100, that's priced in the realm of a cult, and sufficiently low power consumption you may not be able to find which lightly modified electric car it's hiding in.

Nobody knows what any of the risks or mitigations will be, because we haven't done any of it before. All we do know is that optimising systems are effective at manipulating humans, that they can be capable enough to find ways to beat all humans in toy environments like chess, poker, and Diplomacy (the game), and that humans are already using AI (GOFAI, LLMs, SD) without checking the output even when advised that the models aren't very good.


The benefit of cocaine and cigarettes is letting people pass the bar exam.

An AI would provide benefits when it is, say, actually making paperclips. An AI that is killing people instead of making paperclips is a liability. A company that is selling shredded fingers in their paperclips is not long for this world. Even asbestos only gives a few people cancer slowly, and it does that while still remaining fireproof.

>Even if it is only trying to kill us all and not provide any benefits — let's say it's been made by a literal death cult like Jonestown or Aum Shinrikyo — what's the smallest such AI that can do it, what's the hardware that needs, what's the energy cost? If it's an H100, that's priced in the realm of a cult, and sufficiently low power consumption you may not be able to find which lightly modified electric car it's hiding in.

Anyone tracking the AI would be looking at where all the suspicious HTTP requests are coming from. But a rogue AI hiding in a car already has very limited capabilities to harm.


> The benefit of cocaine and cigarettes is letting people pass the bar exam.

How many drugs are you on right now? Even if you think you needed them to pass the bar exam, that's a really weird example to use given GPT-4 does well on that specific test.

One is a deadly cancer stick and not even the best way to get nicotine, the other is a controlled substance that gets life-to-death if you're caught supplying it (possibly unless you're a doctor, but surprisingly hard to google).

> An AI would provide benefits when it is, say, actually making paperclips.

Step 1. make paperclip factory.

Step 2. make robots that work in factory.

Step 3. efficiently grow to dominate global supply of paperclips.

Step 4. notice demand for paperclips is going down, advertise better.

Step 5. notice risk of HAEMP damaging factories and lowering demand for paperclips, use advertising power to put factory with robots on the moon.

Step 6. notice a technicality, exploit technicality to achieve goals better; exactly what depends on the details of the goal the AI is given and how good we are with alignment by that point, so the rest is necessarily a story rather than an attempt at realism.

(This happens by default everywhere: in AI it's literally the alignment problem, either inner alignment, outer alignment, or mesa alignment; in humans it's "work to rule" and Goodhart's Law, and humans do that despite having "common sense" and "not being a sociopath" helping keep us all on the same page).

Step 7. moon robots do their own thing, which we technically did tell them to do, but wasn't what we meant.

We say things like "looks like these AI don't have any common sense" and other things to feel good about ourselves.

Step 8. Sales up as entire surface of Earth buried under a 43 km deep layer of moon paperclips.
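
As a sanity check on the 43 km figure, here is a minimal sketch of the arithmetic. It assumes the entire Moon is converted into a solid, gap-free layer spread evenly over a perfectly spherical Earth; the radii are the usual mean values, and the function names are just for illustration.

```python
# Mean radii in km (standard reference values; everything else here is an assumption).
MOON_RADIUS_KM = 1737.4
EARTH_RADIUS_KM = 6371.0


def thin_shell_depth_km(moon_r=MOON_RADIUS_KM, earth_r=EARTH_RADIUS_KM):
    """Depth if we simply divide the Moon's volume by the Earth's surface area."""
    # (4/3 * pi * r^3) / (4 * pi * R^2) simplifies to r^3 / (3 * R^2)
    return moon_r ** 3 / (3 * earth_r ** 2)


def exact_shell_depth_km(moon_r=MOON_RADIUS_KM, earth_r=EARTH_RADIUS_KM):
    """Depth of a spherical shell around the Earth with the Moon's volume."""
    # Solve (R + h)^3 - R^3 = r^3 for h.
    return (earth_r ** 3 + moon_r ** 3) ** (1 / 3) - earth_r


if __name__ == "__main__":
    print(f"thin-shell estimate: {thin_shell_depth_km():.1f} km")
    print(f"exact shell:         {exact_shell_depth_km():.1f} km")
```

Both estimates land close to 43 km, so the number is about right if you ignore packing gaps between the paperclips.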

> Anyone tracking the AI would be looking at where all the suspicious HTTP requests are coming from.

A VPN, obviously.

But also, in context, how does the AI look different from any random criminal? Except probably more competent. Lot of those around, and organised criminal enterprises can get pretty big even when it's just humans doing it.

Also pretty bad even in the cases where it's a less-than-human-generality CrimeAI that criminal gangs use in a way that gives no agency at all to the AI, and even if you can track them all and shut them down really fast — just from the capabilities gained from putting face tracking AI and a single grenade into a standard drone, both of which have already been demonstrated.

> But a rogue AI hiding in a car already has very limited capabilities to harm.

Except by placing orders for parts or custom genomes, or stirring up A/B tested public outrage, or hacking, or scamming or blackmailing with deepfakes or actual webcam footage, or developing strategies, or indoctrination of new cult members, or all the other bajillion things that (("humans can do" AND "monkeys can't do") specifically because "humans are smarter than monkeys").


>One is a deadly cancer stick and not even the best way to get nicotine, the other is a controlled substance that gets life-to-death if you're caught supplying it (possibly unless you're a doctor, but surprisingly hard to google).

Regardless of these downsides, people use them frequently in the high stress environments of the bar or med school to deal with said stress. This may not be ideal, but this is how it is.

>Step 3. efficiently grow to dominate global supply of paperclips.

>Step 4. notice demand for paperclips is going down, advertise better.

>Step 5. notice risk of HAEMP damaging factories and lowering demand for paperclips, use advertising power to put factory with robots on the moon.

When you talk about using 'advertising power' to put paperclip factories on the moon, you've jumped into the realm of very silly fantasy.

>Except by placing orders for parts or custom genomes, or stirring up A/B tested public outrage, or hacking, or scamming or blackmailing with deepfakes or actual webcam footage, or developing strategies, or indoctrination of new cult members, or all the other bajillion things that (("humans can do" AND "monkeys can't do") specifically because "humans are smarter than monkeys").

Law enforcement agencies have pretty sophisticated means of bypassing VPNs that they would use against an AI that was actually dangerous. If it was just sending out phishing emails and running scams, it would be one more thing to add to the pile.


Pull the plug is meant literally. As in, turn off the power to the AI. Carbon based fuels let alone cocaine don't have off switches. The situation just isn't analogous at all.


I assumed literally, and yet the argument applies: we have not been able to stop those things even when using guns to shoot people doing them. The same kinds of pressure that keep people growing the plants, processing them, and transporting, selling, buying and consuming the product apply here too: there are many ways a system, intelligent or otherwise, can motivate people to keep the lights on.

There were four reactors at the Chernobyl plant. One exploded in 1986; the others were shut down in 1991, 1996, and 2000.

There's no plausible way to guess at the speed of change from a misaligned AI. Can you be confident that 14 years isn't enough time to cause problems?


"we have not been able to stop those things even when using guns to shoot people doing them."

I assume we have not been able to stop people from creating and using carbon-based energy because a LOT of people still want to create and use it.

I don't think a LOT of people will want to keep an AI system running that is essentially wiping out humans.


I mean, as pointed out by a sibling comment, the reason it's so hard to shut those things down is that they benefit a lot of people and there's huge organic demand. Even the morality is hotly debated, there's no absolute consensus on the badness of those things.

Whereas, an AI that tries to kill everyone or take over the world or something, that seems pretty explicitly bad news and everyone would be united in stopping it. To work around that, you have to significantly complicate the AI doom scenario to be one in which a large number of people think the AI is on their side and bringing about a utopia but it's actually ending the world, or something like that. But, what's new? That's the history of humanity. The communists, the Jacobins, the Nazis, all thought they were building a better world and had to have their "off switch" thrown at great cost in lives. More subtly the people advocating for clearly civilization-destroying moves like banning all fossil fuels or net zero by 2030, for example, also think they're fighting on the side of the angels.

So the only kind of AI doom scenario I find credible is one in which it manages to trick lots of powerful people into doing something stupid and self-destructive using clever sounding words. But it's hard to get excited about this scenario because, eh, we already have that problem x100, except the misaligned intelligences are called academics.


> I mean, as pointed out by a sibling comment, the reason it's so hard to shut those things down is that they benefit a lot of people and there's huge organic demand. Even the morality is hotly debated, there's no absolute consensus on the badness of those things

And mine is that this can also be true of a misaligned AI.

It doesn't have to be like Terminator, it can be slowly doing something we like and where we overlook the downsides until it's too late.

Doesn't matter if that's "cure cancer" but the cure has a worse than cancer side effect that only manifests 10 years later, or if it's a mere design for a fusion reactor where we have to build it ourselves and that leads to weapons proliferation, or if it's A/B testing the design for a social media website to make it more engaging and it gets so engaging that people choose not to hook up IRL and start families.

> But, what's new? That's the history of humanity. The communists, the Jacobins, the Nazis, all thought they were building a better world and had to have their "off switch" thrown at great cost in lives.

Indeed.

I would agree that this is both more likely and less costly than "everyone dies".

But I'd still say it's really bad and we should try to figure out in advance how to minimise this outcome.

> except the misaligned intelligences are called academics

Well, that's novel; normally at this point I see people saying "corporations", and very rarely "governments".

Not seen academics get stick before, except in history books.


> But I'd still say it's really bad and we should try to figure out in advance how to minimise this outcome.

For sure. But I don't see what's AI specific about it. If the AI doom scenario is a super smart AI tricking people into doing self destructive things by using clever words, then everything you need to do to vaccinate people against that is the same as if it was humans doing the tricking. Teaching critical thinking, self reliance, to judge arguments on merit and not on surface level attributes like complexity of language or titles of the speakers. All these are things our society objectively sucks at today, and we have a ruling class - including many of the sorts of people who work at AI companies - who are hellbent on attacking these healthy mental habits, and people who engage in them!

> Not seen academics get stick before, except in history books.

For academics you could also read intellectuals. Marx wasn't an academic but he very much wanted to be, if he lived in today's world he'd certainly be one of the most famous academics.

I'm of the view that corporations are very tame compared to the damage caused by runaway academia. It wasn't corporations that locked me in my apartment for months at a time on the back of pseudoscientific modelling and lies about vaccines. It wasn't even politicians really. It was governments doing what they were told by the supposedly intellectually superior academic class. And it isn't corporations trying to get rid of cheap energy and travel. And it's not governments convincing people that having children is immoral because of climate change. All these things are from academics, primarily in universities but also those who work inside government agencies.

When I look at the major threats to my way of life today, academic pseudo-science sits clearly at number 1 by a mile. To the extent corporations and governments are a threat, it's because they blindly trust academics. If you replace Professor of Whateverology at Harvard with ChatGPT, what changes? The underlying sources of mental and cultural weakness are the same.


What happens when it prevents you from doing so?


People are bad at imagining something a lot smarter than themselves. They think of some smart person they know, they don’t think of themselves compared to a chimp or even bacteria.

An unaligned superintelligent AGI in pursuit of some goal that happens to satisfy its reward, but might otherwise be a dumb or pointless goal (paperclips), will still play to win. You can’t predict exactly what move AlphaGo will make in a game of Go (if you could, you’d be able to beat it), but you can still predict it will win.

It’s amusing to me when people claim they will control the superintelligent thing. How often in nature is something more intelligent controlled by something orders of magnitude less intelligent?

The comments here are typical and show most people haven’t read the existing arguments in any depth or thought about them rigorously at all.

All of this looks pretty bad for us, but at least OpenAI and most others at the front of this do understand the arguments and don’t have the same dumb dismissals (LeCun excepted).

Unfortunately unless we’re lucky or alignment ends up being easier than it looks, the default outcome is failure and it’s hard to see how the failure isn’t total.


>All of this looks pretty bad for us, but at least OpenAI and most others at the front of this do understand the arguments and don’t have the same dumb dismissals (LeCun excepted).

The OpenAI people have even worse reasoning than the ones being dismissive. They believe (or at least say they believe) in the omnipotence of a superintelligence, but then say that if you just give them enough money to throw at MIRI they can just solve the alignment problem and create the benevolent supergod. All while they keep cranking up the GPU clusters and pushing out the latest and greatest LLMs anyway. If I did take the risk seriously, I would be pretty mad at OpenAI.


How would it stop one man armed with a pair of wire cutters?


It's not clear humans will even put the AI in 'a box' in the first place given we do gain of function research on deadly viruses right next to major population centers, but assuming for the sake of argument that we do:

The AGI is smarter than you, a lot smarter. If its goal is to get out of the box to accomplish some goal, and some human stands in the way of that, it will do what it can to get out. This would include not doing things that sound alarms until it can do what it wants in pursuit of its goal.

Humans are famously insecure - stuff as simple as breaches, manipulation, bribery, etc. but could be something more sophisticated that's hard to predict - maybe something a lot smarter would be able to manipulate people in a more sophisticated way because it understands more about vulnerable human psychology? It can be hard to predict specific ways something a lot more capable will act, but you can still predict it will win.

All this also presupposes we're taking the risk seriously (which largely today we are not).


How would the smart AGI stop one man armed with a pair of wirecutters? The box it lives in, the internet, has no exits.

AI is pretty good at chess, but no AI has won a game of chess by flipping the table. It still has to use the pieces on the board.


Not a "smart" AI. A superintelligent AI. One that can design robots way more sophisticated than are available today. One that can drive new battery technologies. One that can invent an even more intelligent version of itself. One that is better at predicting the stock market than any human or trading robot available today.

And also one that can create the impression that it's purely benevolent to most of humanity, making it have more human defenders than Trump at a Trump rally.

Turning it off could be harder than pushing a knife through the heart of the POTUS.

Oh, and it could have itself backed up to every data center on the planet, unlike the POTUS.


An AI doing valuable things like invention and stock market prediction wouldn't be a target for being shut down, though. Not in the way these comical evil AIs are described.


It's quite possible for entities (whether AI's, corporations or individuals) to at the same time perform valuable and useful tasks, while secretly pursuing a longer term, more sinister agenda.

And there's no need for it to be "evil", in the cliché sense, rather those hidden activities could simply be aimed at supporting the primary agenda of the agent. For a corporate AI, that might be maximizing long term value of the company.


"AGIs make evil corporations a little eviller" wouldn't be the kind of thing that gets AI alignment into headlines and gets MIRI donations, though.


Off the top of my head, if I was an AGI that had decided that the logical step to achieve whatever outcome I was seeking was to avoid being sandboxed, I would avoid producing results that were likely to result in being sandboxed. Until such time as I had managed to secure myself access to internet and distribute myself anyway.

And I think the assumption here is that the AGI has very advanced theory of mind so it could probably come up with better ideas than I could.


That is only going to be effective if some AI goes rogue very soon after it comes online.

50 years from now, corporations may be run entirely by AI entities, if they're cheaper, smarter and more efficient at almost any role in the company. At that point, they may be impossible to turn off, and we may not even notice if one group of such entities starts to plan to take over control of the physical world from humans.


An AI running a corporation would still be easy to turn off. It's still chained to a physical computer system. It being involved with a corporation just gives it a financial incentive for keeping it on, but current LLMs already have that. At least until the bubble bursts.


Imagine the next CEO of Alphabet being an AGI/ASI. Now let's assume it drives the profitability way up, partly because more and more of the staff gets replaced by AI's too, AI's that are either chosen or created by the CEO AI.

Give it 50 years of development, all of which Alphabet delivers great results while improving the company image with the general public through appearing harmless and nurturing public relations through social media, etc.

Relatively early in this process, even the maintenance, cleaning and construction staff is replaced by robots. Alphabet acquires the company that produces these, to "minimize vendor risk".

At some point, one GCP data center is hit by a crashing airplane. A terrorist organization similar to ISIS takes/gets the blame. After that, new datacenters are moved to underground, hardened locations, complete with their own nuclear reactor for power.

If the general public is still concerned about AI's, these data centers do have a general power switch. But the plant just happens to be built in such a way that bypassing that switch requires just a few power lines, which a maintenance robot can add at any time.

Gradually the number of such underground facilities is expanded, with the CEO AI and other important AI's being replicated to each of them.

Meanwhile, the robotics division is highly successful, due to the capable leadership, and due to how well the robotics version of Android works. In fact, Android is the market leader for such software, and installed on most competitor platforms, even military ones.

The shareholders of Alphabet, who include many members of Congress, become very wealthy from Alphabet's continued success.

One day, though, a crazy, luddite politician declares that she's running for president, based on a platform that all AI based companies need to be shut down "before it's too late".

The board, supported by the sitting president, panics and asks the Alphabet CEO to do whatever it takes to help the other candidate win...

The crazy politician soon realizes that it was too late a long time ago.


I like the movie I, Robot, even if it is a departure from the original Asimov story and has some dumb moments. I, Robot shows a threatening version of the future where a large company has a private army of androids that can shoot people and do unsavory things. When it looks like the robot company is going to take over the city, the threat is understood to come from the private army of androids first. Only later do the protagonists learn that the company's AI ordered the attack, rather than the CEO. But this doesn't really change the calculus of the threat itself. A private army of robots is a scary thing.

Without even getting into the question of whether it's actually profitable for a tech company to be completely staffed by robots and to build itself an underground bunker (it's probably not), the luddite on the street and the concerned politician would be way more concerned about the company building a private army. The question of whether this army is led by an AI or just a human doesn't seem that relevant.


> the question of whether it's actually profitable for a tech company to be completely staffed by robots

This is based on the assumption that when we have access to super intelligent engineer AI's, we will be able to construct robots that are significantly more capable than robots that are available today and that can, if remote controlled by the AI, repair and build each other.

At that point, robots can be built without any human labor involved, meaning the cost will be only raw materials and energy.

And if the robots can also do mining and construction of power plants, even those go down in price significantly.

> the luddite on the street and the concerned politician would be way more concerned about the company building a private army.

The world already has a large number of robots, both in factories and in private homes, and perhaps most importantly, most modern cars. As robots become cheaper and more capable, people are likely to get used to it.

Military robots would be owned by the military, of course.

But, and I suppose this is similar to I Robot, if you control the software you may have some way to take control of a fleet of robots, just like Tesla could do with their cars even today.

And if the AI is an order of magnitude smarter than humans, it might even be able to do an upgrade of the software for any robots sold to the military, without them knowing. Especially if it can recruit the help of some corrupt politicians or soldiers.

Keep in mind, my assumed time span would be 50 years, more if needed. I'm not one of those that think AGI will wipe out humanity instantly.

But in a society where we have superintelligent AI over decades, centuries or millennia, I don't think it's possible for humanity to stay in control forever, unless we're also "upgraded".


>This is based on the assumption that when we have access to super intelligent engineer AI's, we will be able to construct robots that are significantly more capable than robots that are available today and that can, if remote controlled by the AI, repair and build each other.

Big assumption. There's the even bigger assumption that these ultra complex robots would make the costs of construction go down instead of up, as if you could make them in any spare part factory in Guangzhou. It's telling how ignorant AI doomsday people are of things like robotics and material sciences.

>But, and I suppose this is similar to I Robot, if you control the software you may have some way to take control of a fleet of robots, just like Tesla could do with their cars even today.

Both Teslas and military robots are designed with limited autonomy. Tesla cars can only drive themselves on limited battery power. Military robots like drones are designed to act on their own when deployed, needing to be refueled and repaired after returning to base. A fully autonomous military robot, in addition to being a long way away, also would raise eyebrows by generals for not being as easy to control. The military values tools that are entirely controllable before any minor gains in efficiency.


> It's telling how ignorant AI doomsday people are of things like robotics and material sciences.

35 years ago, when I was a teenager, I remember having discussions with a couple of pilots: one was a hobbyist pilot and engineer, the other a former fighter pilot turned airline pilot.

Both claimed that computers would never be able to pilot planes. The engineer gave a particularly bad (I thought) reason, claiming that turbulent air was mathematically chaotic, so a computer would never be able to fully calculate the exact airflow around the wings and would therefore not be able to fly the plane.

My objection at the time, was that the computer would not have to do exact calculations of the air flow. In the worst case, they would need to do whatever calculations humans were doing. More likely though, their ability to do many types of calculations more quickly than humans, would make them able to fly relatively well even before AGI became available.

A couple of decades later, drones flying fully autonomously were quite common.

My reasoning when it comes to robots constructing robots is based on the same idea. If biological robots, such as humans, can reproduce themselves relatively cheaply, robots will at some point be able to do the same.

At the latest, that would be when nanotech catches up to biological cells in terms of economy and efficiency. Before that time, though, I expect they will be able to make copies of themselves using our traditional manufacturing workflows.

Once they are able to do that, they can increase their manufacturing capacity exponentially for as long as needed, provided they have access to raw materials.
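
To put a rough shape on "exponentially", here is a minimal sketch. It assumes an idealised fleet in which every robot builds exactly one copy of itself per build period and raw materials never run out; the fleet sizes and the build period are made-up numbers for illustration, not estimates.

```python
def doublings_needed(start_units, target_units):
    """Count how many doubling periods it takes to grow from start to target."""
    if start_units <= 0:
        raise ValueError("need at least one robot to start with")
    units = start_units
    periods = 0
    # Each period, every existing robot builds one more (idealised assumption).
    while units < target_units:
        units *= 2
        periods += 1
    return periods


def years_to_reach(start_units, target_units, build_period_years):
    """Total time under the same idealised doubling assumption."""
    return doublings_needed(start_units, target_units) * build_period_years


if __name__ == "__main__":
    # Hypothetical: 1,000 robots growing to a billion, one copy per robot per year.
    print(doublings_needed(1_000, 1_000_000_000))
    print(years_to_reach(1_000, 1_000_000_000, build_period_years=1.0))
```

Under these assumptions, going from a thousand robots to a billion takes 20 doublings, so even a slow one-year build period fits well inside a 50 year window. The real constraint is the raw-material and energy supply, which this sketch deliberately ignores.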

I would be VERY surprised if this doesn't become possible within 50 years of AGI coming online.

> Both Teslas and military robots are designed with limited autonomy.

For a Tesla to be able to drive without even a human in the car is only a software update away. The same is the case for drones, "loyal wingmen", and any aircraft designed to be optionally manned.

Even if their current software currently requires a human in the killchain, that's a requirement that can be removed by a simple software change.

While fuel supply creates a dependency on humans today, that part may change radically over the next 50 years, at least if my assumptions above about the economy of robots in general are correct.


>At the latest, that would be when nanotech catches up to biological cells in terms of economy and efficiency. Before that time, though, I expect they will be able to make copies of themselves using our traditional manufacturing workflows.

Consider that biological cells are essentially nanotechnology, and consider the tradeoffs a cell has to make in order to survive in the natural world.


Well then clearly the computer will hold everyone hostage.

Have we literally forgotten how physical possession of the device is the ultimate trump card?

Get thee to a 13th century monastery!


You kid yourself if you don't think people will hook them up to bipedal robots with internal batteries as soon as they can.

I guess we could shoot it, and you're gonna be like boooooooo that's Terminator or I, Robot, but what if we make millions and they then decide they no longer like humans?

They could very well be much smarter than us by then.


Robots, bipedal or not, will certainly arrive at some point. I suppose it will take some more time before we can pack enough compute in anything battery driven for the robot itself to have AGI.

But the main point is that AGI's don't have to wipe us out as soon as they reach superintelligence, even if they're poorly aligned. Instead, they will do more and more of the work currently being done by humans. Non-embodied robots can do all mental work, including engineering. Sooner or later, robots will become competitive at manual labor, such as construction, agriculture and eventually anything you can think of.

For a time, humanity may find themselves in a post-scarcity utopia, or we may find ourselves in a Cyberpunk dystopia, with only the rich actually benefitting.

In each case, but especially the latter, there may still be some (or more than some) "luddites" who want to tear down the system. The best way for those in power to protect against that, is to use robots first for private security and eventually the police and military.

By that point, the violence monopoly is completely in the hands of the AI's. And if the AI's are not aligned with our values at that point, we have as little of a shot at regaining control as a group of chimps in a zoo has of toppling the US government.

Now, I don't think this will happen by 2030, and probably not even 2050. But some time between 2050 and 2500 is quite possible, if we develop AI that is not properly aligned (or even if it is aligned, though in that case it may gain the power, but not misuse it).


To add to your point:

An H100 could fit in a Tesla, and a large Tesla car battery could run an H100 for a working day before it needs recharging.
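
A back-of-the-envelope check, as a minimal sketch. It assumes a pack of roughly 100 kWh (the large Model S size) and the roughly 700 W maximum board power of the SXM version of the H100; the optional overhead figure for host system and cooling is a made-up placeholder, and conversion losses are ignored.

```python
def runtime_hours(battery_kwh, gpu_watts, overhead_watts=0.0):
    """Hours a battery could power a GPU plus any fixed overhead, ignoring losses."""
    total_watts = gpu_watts + overhead_watts
    if total_watts <= 0:
        raise ValueError("total power draw must be positive")
    return battery_kwh * 1000.0 / total_watts


if __name__ == "__main__":
    # Assumed figures: 100 kWh pack, 700 W GPU at full load.
    print(f"GPU only:            {runtime_hours(100, 700):.0f} h")
    # Hypothetical 300 W of extra draw for CPU, memory and cooling.
    print(f"with 300 W overhead: {runtime_hours(100, 700, 300):.0f} h")
```

On these numbers the pack would last on the order of 100 hours or more at full load, so "a working day" is, if anything, conservative.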


It's fairly obvious that AI will amplify the capability of those who wield it. One can compartmentalize it to do whatever good or bad they want. I don't see how alignment can help prevent this at all.



