It's a slightly better model for TTS. With extra training focusing on reading th... | Hacker News


jeffharris 51 days ago | on: OpenAI Audio Models

It's a slightly better model for TTS, with extra training focused on reading the script exactly as written.

E.g. the audio-preview model, when instructed to speak "What is the capital of Italy", would often say "Rome" instead. This model should be much better in that regard.

No plans to have localized voice models, but we do want to expand the menu of voices with voices that are best at different accents.

robbomacrae 51 days ago

Great to hear, thanks. My favorite was "I would like you to repeat the following in an Australian accent: Hi there, welcome to Sydney." which, more often than not, swapped "Hi there" for "G'day"!
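
For anyone who wants to catch this kind of drift automatically, here is a rough sketch of a script-fidelity check. It assumes you already have a text transcript of the generated audio from some speech-to-text step that is not shown here; `normalize` and `script_diff` are hypothetical helper names, not part of any OpenAI API. It only compares words, so it would flag "Rome" or "G'day" but says nothing about accent or tone.

```python
import difflib
import re


def normalize(text):
    # Lowercase and keep only word tokens (letters, digits, apostrophes),
    # so punctuation and casing differences are not reported as mismatches.
    return re.findall(r"[a-z0-9']+", text.lower())


def script_diff(script, transcript):
    """Return (op, expected_words, spoken_words) for each span that differs.

    `transcript` is assumed to be the text recovered from the generated
    audio by a separate transcription step.
    """
    expected = normalize(script)
    spoken = normalize(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=spoken, autojunk=False)
    return [
        (op, expected[i1:i2], spoken[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


if __name__ == "__main__":
    print(script_diff("Hi there, welcome to Sydney.", "G'day, welcome to Sydney."))
    # [('replace', ['hi', 'there'], ["g'day"])]
```

An empty list means the spoken words match the script after normalization. Transcription errors will show up as false mismatches, so treat any hit as something to listen to rather than a verdict.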
