It's a slightly better model for TTS, with extra training focused on reading the script exactly as written.
e.g. the audio-preview model, when instructed to speak "What is the capital of Italy", would often say "Rome" instead. This model should be much better in that regard.
No plans to have localized voice models, but we do want to expand the menu of voices with ones that are best at different accents.
Great to hear, thanks. My favorite was "I would like you to repeat the following in an Australian accent: Hi there, welcome to Sydney.", which more often than not swapped "Hi there" for "G'day"!