
I’m not trivializing risks. I’m characterizing output. These systems aren’t theoretical anymore. They’re used by hundreds of millions of people daily in one form or another.

What are these teams accomplishing? Give me a concrete example of a harm prevented. "The pen is mightier than the sword" is an aphorism, not a concrete example.




oh, these teams are useless bureaucratic busybodies that only mask the real issue: ai is explosively powerful, and nobody has the slightest clue how to steward that power and avoid the pain ai will unfortunately unleash.


Sounds more like a religion than a well-defined business objective.


not entirely sure what you're referring to, but here's a possibly flawed and possibly unrelated analogy: while our nervous systems depend on low-intensity electric fields to function, subjecting them to artificial fields orders of magnitude more intense is well documented to cause intense pain and, as the intensity increases, eventually death by electrocution. i submit that, sadly, we are going to observe the same phenomenon with intelligence as the parameter.


> Give me a concrete example of a harm prevented

One can only do this by inventing a machine to observe the other Everett branches where people didn't do safety work.

Without that magic machine, the closest one can get to what you're asking for is to see OpenAI's logs of which completions, for which prompts, they're blocking; if they do this with content from the live model and not just the original red-team effort leading up to launch, then it's lost in the noise of all the other search results.


This is veering extremely close to tiger-repelling rock territory.


There's certainly a risk of that, but I think the second paragraph is enough to push it away from that problem in this specific instance.


The second paragraph is veering extremely close to shaman-worship territory -- shamans have privileged access to the otherworld that we mere mortals lack.


Again, I agree there's certainly a risk of that, but OpenAI did show at least some examples from their pre-release red-teaming of GPT-4.

What OpenAI showed definitely doesn't convince everyone (see some recent replies to my other comments for an example), though as I find the examples sufficiently convincing, I am unfortunately unable to see things from the POV of those who don't, and therefore can't imagine what would change the minds of doubters.


Production LLMs have been modified to avoid showing kids how to make dangerous chemicals using household chemicals. That's a specific hazard being mitigated.
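For a rough sense of what that kind of mitigation looks like in code, here's a minimal sketch assuming the official openai Python SDK: it screens a prompt with the hosted moderation endpoint before passing it to a chat model (gpt-4o-mini is just an example model name). This is an illustrative pattern, not OpenAI's actual production pipeline, which also relies on trained-in refusals and server-side filters we have no visibility into.

    # Minimal sketch, assuming the official `openai` Python SDK is installed
    # and OPENAI_API_KEY is set. Illustrative only; not the real safety stack.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def answer_if_safe(prompt: str) -> str:
        # Screen the prompt with the hosted moderation endpoint first.
        mod = client.moderations.create(input=prompt)
        if mod.results[0].flagged:
            return "Sorry, I can't help with that."
        # Only prompts that pass the screen are forwarded to the chat model.
        chat = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return chat.choices[0].message.content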



