Serious question, I'd genuinely like to know - why?
You didn't license the images when training Stable Diffusion, and yet you did license the audio for Stable Audio? In both cases the training should either be fair use and legal without any licensing, or be infringing and need licensing. Why is audio different from images? Am I missing something here?