
I think of SD3 as a further evolution of SD1.5/2/XL and Stable Cascade as a branching path. It is unclear which will be better in the long term, so why not cover both directions if they have the resources to do so?



I suspect Stable Cascade may incorporate a DiT at some point. The UNet is easily swapped out. SC’s main innovation is training a semantic compressor model and a VQGAN that translates the diffusion model's latent output back to image space, rather than relying on a VAE.

It’s a really smart architecture, and I think it is fertile ground for stacking on new things like DiT.
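
To make the "easily swapped out" point concrete, here is a toy sketch in plain Python. It is not Stable Cascade's actual code or API: every name (`unet_denoiser`, `dit_denoiser`, `toy_decoder`, `generate`) is hypothetical, and the "models" are stand-in arithmetic. The only thing it illustrates is that if the denoiser and the latent-to-image decoder are separate components with a fixed interface, the denoiser backbone can change without touching the decoder.

```python
from typing import Callable, List

Latent = List[float]
Denoiser = Callable[[Latent, int], Latent]
Decoder = Callable[[Latent], List[float]]


def unet_denoiser(latent: Latent, step: int) -> Latent:
    # Hypothetical stand-in for a UNet backbone: halves every value.
    return [x * 0.5 for x in latent]


def dit_denoiser(latent: Latent, step: int) -> Latent:
    # Hypothetical stand-in for a DiT backbone: same interface, different rule.
    return [x * 0.25 for x in latent]


def toy_decoder(latent: Latent) -> List[float]:
    # Hypothetical stand-in for the latent-to-image decoder:
    # repeats each latent value 4 times to mimic upsampling to "pixels".
    return [x for x in latent for _ in range(4)]


def generate(denoiser: Denoiser, decoder: Decoder,
             latent: Latent, steps: int = 2) -> List[float]:
    # Run the denoiser for a fixed number of steps, then decode once.
    for step in range(steps):
        latent = denoiser(latent, step)
    return decoder(latent)


if __name__ == "__main__":
    start = [4.0, 8.0]
    # Same decoder, different backbone: only the first argument changes.
    print(generate(unet_denoiser, toy_decoder, start))
    print(generate(dit_denoiser, toy_decoder, start))
```

In a real system the two backbones would also need compatible conditioning inputs and latent shapes, and a swapped backbone would need training, so this only shows the interface idea.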



