Hacker News new | past | comments | ask | show | jobs | submit login

Is this right? The current best TTS from OpenAI uses gpt-4o-audio-preview which is $2.50 input text, $80 output audio, the new gpt-4o-mini-tts is $0.60 input text, $12 output audio. An average 5x price reduction.

Going the other way, transcribe with gpt-4o-audio-preview price was $40 input audio, $10 output text, the new gpt-4o-transcribe is $6 input audio and $10 output text. Like a 7x reduction on the input price.

TTS/Transcribe with gpt-4o-audio-preview was a hack where you had to prompt with 'listen/speak this sentence:' and it often got it wrong. These new dedicated models are exactly what we needed.

I'm currently using the Google TTS API which is really good, fast and cheap. They charges $16 per million characters which is exactly the same as OpenAI's $0.015 per minute estimate.

Unfortunately it's not really worth switching over if the costs are exactly the same. Transcription on the other hand is 1.6¢/minute with Google and 0.6¢/minute with OpenAI now, that might be worth switching over for.




you can compare TTS pricing here: https://artificialanalysis.ai/text-to-speech

Previous offering from OpenAI was $15 for TTS and $30 for TTS HD so not 5x reduction. This one is slighly cheaper but definitely more capable (if you need control vibe)


That's a really cool page thanks. Does it have stats for other languages?

In my experience the OpenAI TTS APIs were really bad, messing up all the time in foreign languages. Practically unusable for my use case. You'd have to use the gpt-4o-audio-preview to get anything close to passable, but it was expensive. Which is why I'm using Google TTS which is very fast, high quality, and provides first class support for almost every language.

I look forward to comparing it with this model, the price being the same is unfortunate as there's less incentive to switch. The transcribe price is cheaper than Google it looks like so that's worth considering.


Interesting for me Open TTS for Polish was better than Google TTS (but they have few options) - which one did you used? WaveNet?

Sadly haven't seen quality evaluation for TTS for foreign languages


Depends on what's available for the language, but yea Wavenet and Neural2. With OpenAI TTS I'd often get weird bugs where the first API call comes back all garbled, but the second API call comes back fine. Wasting money. On top of that more expensive and higher latency. I'm interested to try out this new one.




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: