What ElevenLabs and OpenAI call “speech to speech” are completely different.
ElevenLabs’ takes as input audio of speech and maps it to a new speech audio that sounds like a different speaker said it, but with the exact same intonation.
OpenAI’s is an end-to-end multimodal conversational model that listens to a user speaking and responds in audio.
ElevenLabs’ takes as input audio of speech and maps it to a new speech audio that sounds like a different speaker said it, but with the exact same intonation.
OpenAI’s is an end-to-end multimodal conversational model that listens to a user speaking and responds in audio.