
If 1 in 1,000 generations randomly produces memorized CSAM that slipped into the training set, then yeah, it's pretty damn unsafe to use. Producing memorized images has precedent[0].

Is it unlikely? Sure, but worth validating.

[0] https://arxiv.org/abs/2301.13188
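
For a rough sense of what "validating" could look like, here is a minimal sketch in the spirit of [0]: sample a lot of generations and flag near-duplicates of known training images with perceptual hashing. The model name, prompt, hash threshold, and reference images are placeholder assumptions, not the paper's actual pipeline.

    import torch
    from diffusers import StableDiffusionPipeline
    from PIL import Image
    import imagehash

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Perceptual hashes of training images to test against (assumed precomputed elsewhere).
    reference_hashes = {imagehash.phash(Image.open(p)) for p in ["train_0001.png"]}

    matches = 0
    for _ in range(1000):
        img = pipe("a portrait photo", num_inference_steps=30).images[0]
        h = imagehash.phash(img)
        # Hamming distance <= 8 is a loose near-duplicate heuristic, not a standard threshold.
        if any(h - ref <= 8 for ref in reference_hashes):
            matches += 1

    print(f"near-duplicate generations: {matches} / 1000")

A real study would need far more than 1,000 samples to say anything about a 1-in-1,000 rate, but the shape of the check is the same.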




Do you have an example? I've never heard of anyone accidentally generating CSAM with any model. "1 in 1,000" is an obviously bogus probability; there must have been billions of images generated using hundreds of different models.

Besides, and this is a serious question, what's the harm of a model accidentally generating CSAM? If you weren't intending to generate such images, you would just discard the output; no harm done.

Nobody is forcing you to use a model that might accidentally offend you with its output. You can try "aligning" it, but you'll just end up with Google Gemini-style "Sorry, I can't generate pictures of white people".


Earlier datasets used by SD were likely contaminated with CSAM[0]. It was unlikely to have been significant enough to result in memorized images, but checking the safety of models increases that confidence.

And yeah I think we should care, for a lot of reasons, but a big one is just trying to stay well within the law.

[0] https://www.404media.co/laion-datasets-removed-stanford-csam...


SD has always filtered out enough NSFW material that this probably never made it into the training set.


Then you apparently know almost nothing about the SD 1.5 ecosystem. I've fine-tuned multiple models myself, and it's nearly impossible to get rid of the child bias in anime-derived models (which covers 90% of character-focused models), including NSFW ones. It took me about 30 attempts to get somewhere reasonable, and it's still noticeable.


If we're being honest, anime and anything "anime-derived" are uncomfortably close to CSAM as source material before you even get SD involved, so I'm not surprised.

What I had in mind were regular general purpose models which I've played around with quite extensively.


Why not run the safety check on the training data?


They try to, but it is difficult to comb through billions of images, and at least some of SD's earlier datasets were later found to have been contaminated with CSAM[0].

[0] https://www.404media.co/laion-datasets-removed-stanford-csam...
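
As a rough sketch of why even the simplest scan doesn't scale well (the hash-list filename and shard path below are hypothetical, and real pipelines match perceptual hashes like PhotoDNA rather than exact digests):

    import hashlib
    from pathlib import Path

    # "known_bad_hashes.txt" stands in for a hex-digest list from a hash clearinghouse.
    known_bad = set(Path("known_bad_hashes.txt").read_text().split())

    def scan_shard(shard_dir: str) -> list[str]:
        flagged = []
        for img_path in Path(shard_dir).glob("*.jpg"):
            digest = hashlib.sha256(img_path.read_bytes()).hexdigest()
            if digest in known_bad:
                flagged.append(str(img_path))
        return flagged

    # At billions of images this becomes a large distributed job, and exact hashes
    # miss re-encoded or resized copies, which is why perceptual hashing is used in practice.
    print(scan_shard("laion_shard_00000"))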


Okay, by "safety checks" you meant the already unlawful things like CSAM, but not politically-overloaded beliefs like "diversity"? The latter is what the comment[1] you were replying to was referring to (viz. "considering the recent Gemini debacle"[2]).

[1] https://news.ycombinator.com/item?id=39466991

[2] https://news.ycombinator.com/item?id=39456577


Right, by "rather have this [nothing]" I meant Stable Diffusion doing some basic safety checking, not Google's obviously flawed ideas of safety. I should have made that clear.

I posed the worst-case scenario of generating actual CSAM in response to your question, "What particular image that you think a random human will ask the AI to generate, which then leads to concrete harm in the real world?"
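
As a rough illustration of the output-side checking I mean: the diffusers release of Stable Diffusion already ships a CLIP-based safety checker that flags and blanks outputs it classifies as NSFW. The model name and prompt here are just placeholders.

    import torch
    from diffusers import StableDiffusionPipeline

    # safety_checker (a CLIP-based NSFW classifier) is loaded and enabled by default.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    result = pipe("a photo of a cat")
    print(result.nsfw_content_detected)  # e.g. [False]; flagged images come back blacked out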


Could you elaborate on the concrete real-world harm?



