Minimax Audio Voice Clone API Usage Guide
Overview
Minimax Audio Voice Clone lets you clone a voice from an audio sample and use it to generate custom speech. You provide URLs to your audio files, and the backend handles downloading, processing, and voice cloning. The cloned voice can then speak any text you provide.
Key Features:
- Synchronous Operation: Results are returned in the same request, typically within 5-15 seconds
- URL-Based Input: Provide audio URLs, backend handles all processing
- Style Control: Optional prompt audio to define speaking style, tone, and emotion
- Audio Enhancement: Built-in noise reduction and volume normalization
- High-Quality Audio: Supports MP3, M4A, and WAV formats
Authentication
All API requests require authentication using an API key. Include your API key in the Authorization header.
Submit Voice Clone Request
Endpoint
Request Format
Request Parameters
| Parameter | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
| model | string | Yes | Model identifier | - | "minimax-audio-voice-clone-speech-2.6-hd" |
| payload.text | string | Yes | Text to be synthesized using the cloned voice | - | Non-empty string |
| payload.source_audio | string | Yes | URL of the source audio file for voice cloning. The backend downloads it automatically. | - | Valid HTTP/HTTPS URL. Supported formats: mp3, m4a, wav |
| payload.voice_id | string | No | Identifier to assign to the cloned voice. Must be 8-256 characters long, start with an English letter, and not duplicate an existing voice_id. | Auto-generated (request ID) | Alphanumeric string, underscores allowed |
| payload.prompt_audio | string | No | URL of the prompt audio file, which defines the speaking style and emotion. The audio must be shorter than 8 seconds and must be used together with prompt_text. | - | Valid HTTP/HTTPS URL. Supported formats: mp3, m4a, wav, flac |
| payload.prompt_text | string | No | Description of the prompt audio (e.g., “This voice sounds natural and pleasant”) | - | Required if prompt_audio is provided |
| payload.need_noise_reduction | boolean | No | Apply noise reduction to the generated audio | false | true or false |
| payload.need_volumn_normalization | boolean | No | Apply volume normalization to the generated audio | false | true or false |
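The constraints in the table above can be checked on the client before a request is sent. The following Python sketch builds a request body and validates it. It assumes the dotted parameter names (payload.text and so on) correspond to a nested "payload" JSON object, which this guide does not show explicitly. The function name build_voice_clone_request and the helper _check_url are hypothetical, introduced only for illustration.

```python
import re
from urllib.parse import urlparse

MODEL = "minimax-audio-voice-clone-speech-2.6-hd"

# voice_id rule from the table: 8-256 characters, starts with an English
# letter, then letters, digits, or underscores.
VOICE_ID_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{7,255}")


def _check_url(name, value):
    """Raise ValueError unless value looks like an HTTP/HTTPS URL."""
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"{name} must be a valid HTTP/HTTPS URL")


def build_voice_clone_request(text, source_audio, voice_id=None,
                              prompt_audio=None, prompt_text=None,
                              need_noise_reduction=False,
                              need_volumn_normalization=False):
    """Return a request body dict (assumed shape: nested "payload" object)."""
    if not isinstance(text, str) or not text:
        raise ValueError("payload.text must be a non-empty string")
    _check_url("payload.source_audio", source_audio)

    payload = {
        "text": text,
        "source_audio": source_audio,
        "need_noise_reduction": bool(need_noise_reduction),
        # Parameter name spelled as in the table above.
        "need_volumn_normalization": bool(need_volumn_normalization),
    }

    if voice_id is not None:
        if not VOICE_ID_RE.fullmatch(voice_id):
            raise ValueError("payload.voice_id must be 8-256 characters, start "
                             "with a letter, and use only letters, digits, "
                             "and underscores")
        payload["voice_id"] = voice_id

    if prompt_audio is not None:
        # prompt_audio must always be accompanied by prompt_text.
        if not prompt_text:
            raise ValueError("payload.prompt_text is required when "
                             "payload.prompt_audio is provided")
        _check_url("payload.prompt_audio", prompt_audio)
        payload["prompt_audio"] = prompt_audio

    if prompt_text:
        payload["prompt_text"] = prompt_text

    return {"model": MODEL, "payload": payload}
```

The returned dict can be serialized with json.dumps and sent as the request body. Uniqueness of voice_id and the 8-second limit on prompt audio cannot be verified locally and are left to the server.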
Response
Voice Clone is synchronous and returns the result immediately (typically within 5-15 seconds).
Check Request Status
Endpoint
Example
Response
Request Status Values
Voice Clone is synchronous, so the response will immediately return one of these statuses:
| Status | Description |
|---|---|
| success | Voice cloning completed successfully |
| failed | Voice cloning failed (see error message) |
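Because only two statuses are possible, client code can branch on them directly. The Python sketch below shows one way to do that. The response body is not reproduced in this guide, so the field names "status" and "error" are assumptions, and the function name handle_voice_clone_response is hypothetical.

```python
def handle_voice_clone_response(response):
    """Return the response on success; raise on failure or unknown status.

    Assumes the parsed JSON response is a dict with a "status" field and,
    on failure, an "error" field. Both field names are assumptions.
    """
    status = response.get("status")
    if status == "success":
        return response
    if status == "failed":
        message = response.get("error", "no error message provided")
        raise RuntimeError(f"Voice cloning failed: {message}")
    # Any other value is outside the two documented statuses.
    raise ValueError(f"Unexpected status: {status!r}")
```

Raising on an unrecognized status, rather than treating it as success, keeps a changed or malformed response from being passed silently to later steps.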