Audio models turn text into speech, clone or style voices, generate music, or edit audio. Capabilities and latency profiles differ by provider and tier.

Technical topics

  • Text-to-speech (TTS) — Natural speech from text; variants tuned for quality vs speed.
  • Voice cloning — Reference audio to match timbre or style where supported.
  • Real-time / low-latency — Models optimized for interactive or live use cases.
  • Languages & prosody — Multilingual support, emotion, and pacing depend on the specific model.
  • Music generation — Lyrics- or prompt-driven music where available.
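
The quality vs speed trade-off in the topics above can be sketched as a small selection helper. This is only an illustration: the model IDs are copied from the model list on this page, but the helper name `pick_variant` is hypothetical, and the reading of `-hd` as the quality-tuned variant and `-turbo` as the speed-tuned variant is an assumption inferred from the names, not something this page states.

```python
# Model IDs copied from the model list on this page (TTS families only).
TTS_MODEL_IDS = [
    "minimax-tts-speech-01-hd",
    "minimax-tts-speech-01-turbo",
    "minimax-tts-speech-02-hd",
    "minimax-tts-speech-02-turbo",
    "minimax-tts-speech-2.6-hd",
    "minimax-tts-speech-2.6-turbo",
]

# Assumption: "-hd" marks the quality-tuned variant and "-turbo" the
# speed-tuned one. This is inferred from the naming pattern only.
SUFFIX_BY_PREFERENCE = {"quality": "-hd", "speed": "-turbo"}


def pick_variant(family: str, prefer: str = "quality") -> str:
    """Return the listed model ID for `family` that matches `prefer`.

    `pick_variant` is a hypothetical helper, not part of any SDK.
    """
    if prefer not in SUFFIX_BY_PREFERENCE:
        raise ValueError(f"prefer must be one of {sorted(SUFFIX_BY_PREFERENCE)}")
    candidate = family + SUFFIX_BY_PREFERENCE[prefer]
    if candidate not in TTS_MODEL_IDS:
        raise ValueError(f"no {prefer} variant listed for {family!r}")
    return candidate


# Interactive use case: prefer the speed-tuned variant.
print(pick_variant("minimax-tts-speech-2.6", prefer="speed"))
```

Keeping the choice in one helper means an application can switch between variants per request (for example, speed for live playback and quality for offline rendering) without scattering model IDs through the code.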

Model API & platform docs

For serving modes (serverless vs dedicated), billing, rate limits, task polling, and unified API patterns, see the API Reference section.
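
As a rough illustration of the task polling pattern mentioned above, the sketch below polls a task until it reaches a terminal state, backing off between attempts. Everything specific here is an assumption: `fetch_status` stands in for whatever call returns the task state, and the `status` field with the values `queued`, `running`, `succeeded`, and `failed` is hypothetical. The real endpoint and field names are in the API Reference.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal states; check the API Reference for the real values.
TERMINAL_STATES = ("succeeded", "failed")


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    max_interval_s: float = 10.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `fetch_status` until the task is terminal or the timeout expires.

    The wait between attempts doubles each time, capped at `max_interval_s`.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
        interval_s = min(interval_s * 2, max_interval_s)


# Usage with a fake status source instead of a real HTTP call.
fake_responses = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "succeeded"},
])
result = poll_until_done(lambda: next(fake_responses), sleep=lambda s: None)
print(result["status"])
```

Passing the status call and the sleep function as parameters keeps the loop independent of any HTTP client and makes it easy to test without waiting.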

Model list

| Model | Model ID | Organization |
| --- | --- | --- |
| minimax-audio-voice-clone-speech-2.6-hd | minimax-audio-voice-clone-speech-2.6-hd | minimax |
| minimax-audio-voice-clone-speech-2.6-turbo | minimax-audio-voice-clone-speech-2.6-turbo | minimax |
| minimax-music-2.5 | minimax-music-2.5 | minimax |
| minimax-tts-speech-01-hd | minimax-tts-speech-01-hd | minimax |
| minimax-tts-speech-01-turbo | minimax-tts-speech-01-turbo | minimax |
| minimax-tts-speech-02-hd | minimax-tts-speech-02-hd | minimax |
| minimax-tts-speech-02-turbo | minimax-tts-speech-02-turbo | minimax |
| minimax-tts-speech-2.5-hd-preview | minimax-tts-speech-2.5-hd-preview | minimax |
| minimax-tts-speech-2.5-turbo-preview | minimax-tts-speech-2.5-turbo-preview | minimax |
| minimax-tts-speech-2.6-hd | minimax-tts-speech-2.6-hd | minimax |
| minimax-tts-speech-2.6-turbo | minimax-tts-speech-2.6-turbo | minimax |
| Realtime-tts-1.5-max | inworld-tts-1.5-max | inworld |
| Realtime-tts-1.5-mini | inworld-tts-1.5-mini | inworld |
| Realtime-tts-2 | inworld-tts-2 | inworld |
| Step-Audio-EditX | Step-Audio-EditX | stepfun-ai |
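
For some entries the display name differs from the model ID (for example, Realtime-tts-2 has the ID inworld-tts-2), so a small lookup can help avoid passing a display name where an ID is expected. The sketch below copies its data from the model list on this page. The `Model` type and the helper names `resolve_model_id` and `ids_by_organization` are hypothetical and not part of any SDK.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str          # display name from the "Model" column
    model_id: str      # value from the "Model ID" column
    organization: str  # value from the "Organization" column


# Minimax entries use the same string for the display name and the ID.
_MINIMAX_IDS = [
    "minimax-audio-voice-clone-speech-2.6-hd",
    "minimax-audio-voice-clone-speech-2.6-turbo",
    "minimax-music-2.5",
    "minimax-tts-speech-01-hd",
    "minimax-tts-speech-01-turbo",
    "minimax-tts-speech-02-hd",
    "minimax-tts-speech-02-turbo",
    "minimax-tts-speech-2.5-hd-preview",
    "minimax-tts-speech-2.5-turbo-preview",
    "minimax-tts-speech-2.6-hd",
    "minimax-tts-speech-2.6-turbo",
]

CATALOG: List[Model] = [Model(i, i, "minimax") for i in _MINIMAX_IDS] + [
    Model("Realtime-tts-1.5-max", "inworld-tts-1.5-max", "inworld"),
    Model("Realtime-tts-1.5-mini", "inworld-tts-1.5-mini", "inworld"),
    Model("Realtime-tts-2", "inworld-tts-2", "inworld"),
    Model("Step-Audio-EditX", "Step-Audio-EditX", "stepfun-ai"),
]


def resolve_model_id(name_or_id: str) -> str:
    """Map a display name (or an ID) from the list to its model ID."""
    for model in CATALOG:
        if name_or_id in (model.name, model.model_id):
            return model.model_id
    raise KeyError(f"{name_or_id!r} is not in the model list")


def ids_by_organization(organization: str) -> List[str]:
    """Return the model IDs listed for one organization."""
    return [m.model_id for m in CATALOG if m.organization == organization]


print(resolve_model_id("Realtime-tts-2"))
print(ids_by_organization("inworld"))
```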