Earlier datasets used by SD were likely contaminated with CSAM[0]. The contamination was unlikely to be significant enough to result in memorized images, but checking the models' safety increases that confidence.
And yeah I think we should care, for a lot of reasons, but a big one is just trying to stay well within the law.
[0] https://www.404media.co/laion-datasets-removed-stanford-csam...