
It's a slightly better model for TTS, with extra training focused on reading the script exactly as written.

E.g., the audio-preview model, when instructed to speak "What is the capital of Italy", would often say "Rome" instead. This model should be much better in that regard.

No plans to have localized voice models, but we do want to expand the menu of voices with ones that are best at different accents.




Great to hear, thanks. My favorite was "I would like you to repeat the following in an Australian accent: Hi there, welcome to Sydney." which more often than not swapped "Hi there" for "G'day"!



