
I’m not trivializing risks. I’m characterizing output. These systems aren’t theoretical anymore. They’re used by hundreds of millions of people daily in one form or another.

What are these teams accomplishing? Give me a concrete example of a harm prevented. "The pen is mightier than the sword" is an aphorism, not a concrete example.




oh, these teams are useless bureaucratic busybodies that only mask the real issue: ai is explosively powerful, and nobody has the slightest clue how to steward that power and avoid the pain ai will unfortunately unleash.


Sounds more like a religion than a well-defined business objective.


not entirely sure what you're referring to, but here's a possibly flawed and possibly unrelated analogy: while our nervous systems depend on low-intensity electric fields to function, subjecting them to artificial fields orders of magnitude more intense is well documented to cause intense pain and, as the intensity increases, eventually death by electrocution. i submit that, sadly, we are going to observe the same phenomenon with intelligence as the parameter.


> Give me a concrete example of a harm prevented

One can only do this by inventing a machine to observe the other Everett branches where people didn't do safety work.

Without that magic machine, the closest one can get to what you're asking for is to see OpenAI's logs of which completions, for which prompts, they're blocking; if they do this with content from the live model and not just the original red-team effort leading up to launch, then it's lost in the noise of all the other search results.


This is veering extremely close to tiger-repelling rock territory.


There's certainly a risk of that, but I think the second paragraph is enough to push it away from that problem in this specific instance.


The second paragraph is veering extremely close to shaman-worship territory -- shamans have privileged access to the otherworld that we mere mortals lack.


Again, I agree there's certainly a risk of that, but OpenAI did show at least some examples from their pre-release red-teaming of GPT-4.

What OpenAI showed definitely doesn't convince everyone (see some recent replies to my other comments for an example), though as I find the examples sufficiently convincing, I am unfortunately unable to see things from the POV of those who don't, and therefore can't imagine what would change the minds of doubters.


Production LLMs have been modified to avoid showing kids how to make dangerous chemicals using household chemicals. That's a specific hazard being mitigated.
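For a rough sense of what that kind of mitigation looks like in code, here's a minimal sketch assuming the official openai Python SDK: it screens a prompt with the hosted moderation endpoint before passing it to a chat model (gpt-4o-mini is just an example model name). This is an illustrative pattern, not OpenAI's actual production pipeline, which also relies on trained-in refusals and server-side filters we have no visibility into.

    # Minimal sketch, assuming the official `openai` Python SDK is installed
    # and OPENAI_API_KEY is set. Illustrative only; not the real safety stack.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def answer_if_safe(prompt: str) -> str:
        # Screen the prompt with the hosted moderation endpoint first.
        mod = client.moderations.create(input=prompt)
        if mod.results[0].flagged:
            return "Sorry, I can't help with that."
        # Only prompts that pass the screen are forwarded to the chat model.
        chat = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return chat.choices[0].message.content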



