Because we can get AI-related technologies to do things living creatures can’t, like provably forget things. And when it benefits us, we should.
Personal opinion, but I think AGI is a good heuristic to build against but in the end we’ll pivot away. Sort of like how birds were a good heuristic for human flight, but modern planes don’t flap their wings and greatly exceed bird capabilities in many ways.
Attribution for every prediction, and deletion, seem like prime examples of capabilities that would break away from the AI/AGI analogy toward something more economically and politically compelling/competitive.
Can you point to any behaviour in human beings you'd unlearn if they'd also forget the consequences?
We spend billions trying to predict human behaviour and yet we are surprised every day; "AGI" will be no simpler. We just have to hope the dataset was aligned so the consequences are understood, and find a way to contain models that don't.
You seem to be focusing a lot on remembering or forgetting consequences. Yes, ensuring models know enough about the world to cause only the consequences they intend is a good way to keep them from creating random harm. This is probably a good thing.
However, there are many other reasons why you might want a neural network to provably forget something. The main one has to do with structuring an AGI's power. The simple story of AGI is something like "make it super powerful, general, and value-aligned, and humanity will prosper." The reality is more nuanced. Sometimes you want a model to be selectively not powerful as part of managing value misalignment in practice.
To pick a trivial example, you might want a model to enter your password in some app one time, but not remember the password long term. You might want it to use and then provably forget your password so that it can't use your password in the future without your consent.
This isn't something that's reliably doable with humans. If you give them your password, they have it; you can't get it back. This is the point at which we'll have the option to pursue the imitation of living creatures blindly, or to turn away from blind adherence to the AI/AGI story. Just like the point at which we decided whether flying planes should dogmatically flap their wings, or whether we should pursue the more economically and politically competitive thing. Planes don't flap their wings, and AI/AGI will be able to provably forget things. And that's actually the better path.
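To make the password example concrete, here's a minimal systems-level sketch of "use once, then forget." All the names in it are hypothetical, and it only covers the easy half of the problem: keeping the secret out of anything that gets persisted or trained on later. Provably removing something already absorbed into a model's weights is the harder machine-unlearning problem, which this sketch does not attempt.

```python
# Hypothetical sketch: an agent uses a secret for one action, then drops it,
# and only a redacted transcript is ever eligible for long-term storage.

SECRET_PLACEHOLDER = "[REDACTED_SECRET]"

class EphemeralSecret:
    """Holds a secret only for the duration of a single action."""
    def __init__(self, value: str):
        self._value = value

    def use(self, action):
        """Run `action` with the raw value, then drop our reference to it."""
        try:
            return action(self._value)
        finally:
            self._value = None  # nothing left to leak into memory or logs

def redact(text: str, secret: str) -> str:
    """Strip the secret before anything is logged or stored long term."""
    return text.replace(secret, SECRET_PLACEHOLDER)

if __name__ == "__main__":
    password = "hunter2"  # supplied by the user for exactly one task
    transcript = f"agent: logging in with {password}"
    safe_transcript = redact(transcript, password)  # what gets persisted

    secret = EphemeralSecret(password)
    secret.use(lambda p: print(f"submitting {len(p)}-char password to the login form"))

    print(safe_transcript)  # -> "agent: logging in with [REDACTED_SECRET]"
```

The point of the sketch is only that "forgetting" can be designed in at the system boundary; doing it inside the weights is where the real research problem lives.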
The feeling of extreme euphoria and its connection to highly addictive drugs like heroin might be a use case. Though I'm not sure how well something like that would work in practice.
Is that possible to do without also forgetting why it’s dangerous? That seems like it would fuel a pattern of addiction where the person gets addicted, forgets why, then gets addicted again because we wiped their knowledge of the consequences the first time around.
Then again, I suppose if the addiction was in response to a particular stimulus (death of a family member, getting fired, etc) and that stimulus doesn’t happen again, maybe it would make a difference?
It does have a tinge of “those who don’t recall the past are doomed to repeat it”.
After a certain point I think someone can learn enough to re-derive almost everything from first principles, so I suspect it would only work temporarily.
There's a movie about this idea called "Eternal Sunshine of the Spotless Mind".
I find it hard to believe that you can surgically censor one chunk of information and cleanly cut it off from the rest, especially if it's general physical principles.
I also don't have a nice topological map of how all the world's information is connected at the moment, so I can't back up my opinions.
Though I'm still rooting for the RDF/OWL and Semantic Web folks, they might figure it out.
It sounds like the only answer for AI is the same as the only answer for humans.
Wisdom. Arriving at actions and reactions based on a better understanding of the interconnectedness and interdependency of everything and everyone (knowing more, not less, and not selectively or in bowdlerized form).
And most humans don't even have it. Most humans are not interested, don't believe it, and certainly don't act as though "what's good for you is what's good for me, what harms you harms me." Every day a tech podcaster or youtuber says this or that privacy loss or security risk "doesn't affect you or me." They all affect you and me. When a government or company gives itself, and then abuses, power over a single person anywhere, that is a hit to you and me even though we aren't that person, because that person is somebody, and you and I are somebody.
Most humans ridicule anyone who talks like that and don't let them near any levers of power at any scale. They might be OK with it in inconsequential conversational contexts like a dinner party or this forum, but not in any decision-making context. To them, anyone talking like that is an idiot, disconnected from reality, who might drive the bus off the bridge because the peace fairies told them to.
If an AI were better than most humans and had wisdom, and gave answers that conflicted with selfishness, most humans would just decide they don't like the answers and instructions coming from the AI and just destroy it, or at least ignore it, pretty much as they do today with humans who say things they don't like.
Perhaps one difference is that an AI could actually be both wise and well-intentioned rather than a charlatan harnessing a mass of gullible followers, and it could live longer than a human, so its results could become proven out over time. Some humans do get recognized eventually, but by then it doesn't do the rest of us any good because they can no longer lead; they're too old or dead. Then again, maybe that's actually required. Maybe the AI can't prove itself because you can never say of the AI, "What does he get out of it by now? He lived his entire life saying the same thing; if he was just trying to scam everyone for money or power, what good would it do him now? He must have been sincere the whole time."
But probably even the actual good AI won't do much good, again for the same reason as with actually good humans: it's just not what most people want. Whatever individuals say their values are, by the numbers only the selfish organisations win. Even when a selfish organisation goes too far and destroys itself, everyone else keeps doing the same thing.
A few things to exclude from training might include (a rough filtering sketch follows below):
- articles with mistakes such as incorrect product names, facts, dates, references
- fraudulent and non-repeatable research findings - see John Ioannidis among others
- outdated and incorrect scientific concepts like phlogiston and Lamarckian evolution
- junk content such as 4chan comment-section content
- flat earther "science" and other such nonsense
- debatable stuff like: do we want material that attributes human behavior to astrological signs or not? And when should a response make reference to such?
- prank stuff like script kiddies prompting 2+2=5 until an AI system "remembers" this
- intentional poisoning of a training set with disinformation
- suicidal and homicidal suggestions and ideation
- etc.
Even if we go with the notion that AGI is coming, there is no reason its training should include the worst in us.
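For what it's worth, here is a rough sketch of the kind of pre-training filter that list gestures at. Everything in it (field names, sources, patterns) is hypothetical and far cruder than a real pipeline, which would use trained classifiers, deduplication, and human review rather than a handful of keyword rules.

```python
# Toy exclusion filter over a corpus of documents, loosely mirroring the list above.
import re
from typing import Iterable, Iterator

BLOCKED_SOURCES = {"4chan"}                       # junk/low-quality sources
BLOCKED_PATTERNS = [
    re.compile(r"\bphlogiston\b", re.I),          # outdated scientific concepts
    re.compile(r"\bflat\s+earth\b", re.I),        # pseudoscience
    re.compile(r"\b2\s*\+\s*2\s*=\s*5\b"),        # obvious prank / poisoning bait
]

def keep(doc: dict) -> bool:
    """Return True if a document passes the (toy) exclusion rules."""
    if doc.get("source") in BLOCKED_SOURCES:
        return False
    if doc.get("retracted"):                      # flagged fraudulent/non-repeatable research
        return False
    text = doc.get("text", "")
    return not any(p.search(text) for p in BLOCKED_PATTERNS)

def filter_corpus(docs: Iterable[dict]) -> Iterator[dict]:
    return (d for d in docs if keep(d))

# Example:
corpus = [
    {"source": "journal", "text": "Combustion releases phlogiston."},
    {"source": "textbook", "text": "Combustion is rapid oxidation."},
]
print([d["text"] for d in filter_corpus(corpus)])  # only the second survives
```

The point is only that exclusion criteria like these can be made mechanical and auditable; the genuinely debatable cases (astrology, satire, context-dependent material) still need humans in the loop.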