
If 1 in 1,000 generations randomly produces memorized CSAM that slipped into the training set, then yeah, it's pretty damn unsafe to use. Producing memorized images has precedent[0].

Is it unlikely? Sure, but worth validating.

[0] https://arxiv.org/abs/2301.13188
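
For a rough sense of what "validating" could look like, here is a minimal sketch in the spirit of [0]: sample a lot of generations and flag near-duplicates of known training images with perceptual hashing. The model name, prompt, hash threshold, and reference images are placeholder assumptions, not the paper's actual pipeline.

    import torch
    from diffusers import StableDiffusionPipeline
    from PIL import Image
    import imagehash

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # Perceptual hashes of training images to test against (assumed precomputed elsewhere).
    reference_hashes = {imagehash.phash(Image.open(p)) for p in ["train_0001.png"]}

    matches = 0
    for _ in range(1000):
        img = pipe("a portrait photo", num_inference_steps=30).images[0]
        h = imagehash.phash(img)
        # Hamming distance <= 8 is a loose near-duplicate heuristic, not a standard threshold.
        if any(h - ref <= 8 for ref in reference_hashes):
            matches += 1

    print(f"near-duplicate generations: {matches} / 1000")

A real study would need far more than 1,000 samples to say anything about a 1-in-1,000 rate, but the shape of the check is the same.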




Do you have an example? I've never heard of anyone accidentally generating CSAM with any model. "1 in 1,000" is an obviously bogus probability; there must have been billions of images generated using hundreds of different models.

Besides, and this is a serious question, what's the harm of a model accidentally generating CSAM? If you weren't intending to generate such images, you would just discard the output; no harm done.

Nobody is forcing you to use a model that might accidentally offend you with its output. You can try "aligning" it, but you'll just end up with Google Gemini-style "Sorry, I can't generate pictures of white people".


Earlier datasets used by SD were likely contaminated with CSAM[0]. It was unlikely to have been significant enough to result in memorized images, but checking the safety of models increases that confidence.

And yeah I think we should care, for a lot of reasons, but a big one is just trying to stay well within the law.

[0] https://www.404media.co/laion-datasets-removed-stanford-csam...


SD has always filtered out enough NSFW material that this probably never made it into the training set.


Then you apparently know almost nothing about the SD 1.5 ecosystem. I've fine-tuned multiple models myself, and it's nearly impossible to get rid of the child bias in anime-derived models (which covers 90% of character-focused models), including NSFW ones. It took me about 30 attempts to get somewhere reasonable, and it's still noticeable.


If we're being honest, anime and anything "anime-derived" are uncomfortably close to CSAM as source material before you even get SD involved, so I'm not surprised.

What I had in mind were regular general purpose models which I've played around with quite extensively.


Why not run the safety check on the training data?


They try to, but it is difficult to comb through billions of images, and at least some of SD's earlier datasets were later found to have been contaminated with CSAM[0].

[0] https://www.404media.co/laion-datasets-removed-stanford-csam...
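
As a rough sketch of why even the simplest scan doesn't scale well (the hash-list filename and shard path below are hypothetical, and real pipelines match perceptual hashes like PhotoDNA rather than exact digests):

    import hashlib
    from pathlib import Path

    # "known_bad_hashes.txt" stands in for a hex-digest list from a hash clearinghouse.
    known_bad = set(Path("known_bad_hashes.txt").read_text().split())

    def scan_shard(shard_dir: str) -> list[str]:
        flagged = []
        for img_path in Path(shard_dir).glob("*.jpg"):
            digest = hashlib.sha256(img_path.read_bytes()).hexdigest()
            if digest in known_bad:
                flagged.append(str(img_path))
        return flagged

    # At billions of images this becomes a large distributed job, and exact hashes
    # miss re-encoded or resized copies, which is why perceptual hashing is used in practice.
    print(scan_shard("laion_shard_00000"))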


Okay, by "safety checks" you meant the already unlawful things like CSAM, but not politically-overloaded beliefs like "diversity"? The latter is what the comment[1] you were replying to was referring to (viz. "considering the recent Gemini debacle"[2]).

[1] https://news.ycombinator.com/item?id=39466991

[2] https://news.ycombinator.com/item?id=39456577


Right, by "rather have this [nothing]" I meant Stable Diffusion doing some basic safety checking, not Google's obviously flawed ideas of safety. I should have made that clear.

I posed the worst-case scenario of generating actual CSAM in response to your question, "What particular image that you think a random human will ask the AI to generate, which then leads to concrete harm in the real world?"
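
As a rough illustration of the output-side checking I mean: the diffusers release of Stable Diffusion already ships a CLIP-based safety checker that flags and blanks outputs it classifies as NSFW. The model name and prompt here are just placeholders.

    import torch
    from diffusers import StableDiffusionPipeline

    # safety_checker (a CLIP-based NSFW classifier) is loaded and enabled by default.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    result = pipe("a photo of a cat")
    print(result.nsfw_content_detected)  # e.g. [False]; flagged images come back blacked out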


Could you elaborate on the concrete real-world harm?



