Model ID
inworld-tts-2

Inworld Realtime TTS 2 API Usage Guide

Overview

Inworld Realtime TTS 2 is Inworld’s next-generation text-to-speech model. It delivers higher-quality audio at lower latency than TTS 1.5 and supports the same 65 voices across 16 languages, with improved naturalness and expressiveness. Responses can also include phoneme-level timing and viseme symbols for lip-sync animation.

Key Features:

  • 65 Voices across 16 languages
  • Higher Quality: Improved naturalness and expressiveness over TTS 1.5
  • Lower Latency: Optimized for real-time applications
  • Word & Character Timestamps: Optional alignment metadata for captions and highlights
  • Multiple Formats: MP3, WAV, OGG_OPUS, FLAC, ALAW, MULAW
  • Text Normalization: Automatic expansion of numbers, dates, abbreviations
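
The format names in the list above correspond to the `audio_encoding` field used in the request example later in this guide. A minimal sketch of a client-side check follows, assuming the API accepts exactly the six names listed here, spelled as shown. The helper name `check_audio_encoding` is hypothetical and not part of any SDK.

```python
# Encodings listed under Key Features; ASSUMED to be the exact values
# that the `audio_encoding` request field accepts.
SUPPORTED_ENCODINGS = ("MP3", "WAV", "OGG_OPUS", "FLAC", "ALAW", "MULAW")


def check_audio_encoding(name: str) -> str:
    """Return the normalized encoding name, or raise ValueError if unlisted."""
    normalized = name.strip().upper()
    if normalized not in SUPPORTED_ENCODINGS:
        raise ValueError(
            f"unsupported audio_encoding {name!r}; "
            f"expected one of {', '.join(SUPPORTED_ENCODINGS)}"
        )
    return normalized
```

Rejecting an unlisted encoding before the request is sent keeps typos from costing a round trip to the service.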

Available Voices (65 total)

English (25 voices)

| Voice | Description | Tags |
|---|---|---|
| Alex | Energetic mid-range male | friendly, expressive |
| Ashley | Warm, natural female | warm, mellow |
| Blake | Rich, intimate male | intimate, romantic |
| Carter | Radio announcer-style male | intense, motivational |
| Clive | British male, calm | calm, friendly, british |
| Craig | Older British male, refined | posh, raspy, british |
| Deborah | Gentle, elegant female | gentle, elegant |
| Dennis | Smooth, calm male | outgoing, upbeat |
| Dominus | Robotic, deep male | robotic, monotone |
| Edward | Fast-talking, emphatic male | emphatic, shouty |
| Elizabeth | Professional female | informative, calm |
| Hades | Commanding, gruff male | commanding, gruff |
| Hana | Bright, expressive female | bright, playful |
| Julia | Quirky, high-pitched female | childish, quirky |
| Luna | Calm, relaxing female | calm, relaxing |
| Mark | Energetic male, rapid-fire | articulate, engaging |
| Olivia | British female, upbeat | cute, upbeat, british |
| Pixie | Childlike female | cartoonish, high-pitched |
| Priya | Female, Indian accent | friendly, gentle |
| Ronald | British male, deep voice | confident, expressive, british |
| Sarah | Young adult female | upbeat, excited |
| Shaun | Friendly, dynamic male | calm, casual |
| Theodore | Gravelly male, elderly | elderly, wise |
| Timothy | Lively American male | hyped, upbeat |
| Wendy | British female, posh | pleasant, casual, british |

Chinese 中文 (4 voices)

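| Voice | Description | Tags |
|---|---|---|
| Yichen | Young male | clear, friendly |
| Xiaoyin | Young female, gentle | polite, kind |
| Xinyi | Female, neutral | professional, warm |
| Jing | Energetic female | soft, clear |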

Japanese 日本語 (2 voices)

| Voice | Description | Tags |
|---|---|---|
| Asuka | Young female | energetic, clear |
| Satoshi | Male, expressive | nervous, curious |

Korean 한국어 (4 voices)

| Voice | Description | Tags |
|---|---|---|
| Hyunwoo | Young male | polite, warm |
| Minji | Young female | light, bright |
| Seojun | Mature male | deep, authoritative |
| Yoona | Female, soft | sad, clear |

Other Languages

  • French (4): Alain, Hélène, Mathieu, Étienne
  • German (2): Johanna, Josef
  • Spanish (4): Diego, Lupita, Miguel, Rafael
  • Italian (2): Gianni, Orietta
  • Portuguese-BR (2): Heitor, Maitê
  • Russian (4): Svetlana, Elena, Dmitry, Nikolai
  • Dutch (4): Erik, Katrien, Lennart, Lore
  • Polish (2): Szymon, Wojciech
  • Hindi (2): Riya, Manoj
  • Hebrew (2): Yael, Oren
  • Arabic (2): Nour, Omar

Authentication

All API requests require Bearer authentication:
Authorization: Bearer YOUR_API_KEY

Submit TTS Request

Endpoint

POST /api/v1/ie/requestqueue/apikey/requests

Request Format

curl -X POST "https://console.gmicloud.ai/api/v1/ie/requestqueue/apikey/requests" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "inworld-tts-2",
    "payload": {
      "text": "Hello, world! What a wonderful day!",
      "voice_id": "Dennis",
      "audio_encoding": "MP3",
      "sample_rate_hertz": 48000,
      "speaking_rate": 1.0,
      "temperature": 1.1,
      "timestamp_type": "WORD"
    }
  }'
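
The same request can be assembled in Python using only the standard library. This is a sketch rather than an official client: `build_tts_request` and `synthesize` are hypothetical helper names, the default field values simply mirror the curl example above, and nothing here validates the fields against the service.

```python
import json
import urllib.request

API_URL = "https://console.gmicloud.ai/api/v1/ie/requestqueue/apikey/requests"


def build_tts_request(api_key: str, text: str, voice_id: str = "Dennis",
                      **payload_overrides) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "text": text,
        "voice_id": voice_id,
        "audio_encoding": "MP3",
        "sample_rate_hertz": 48000,
        "speaking_rate": 1.0,
        "temperature": 1.1,
        "timestamp_type": "WORD",
    }
    # Caller-supplied fields replace the defaults, e.g. audio_encoding="WAV".
    payload.update(payload_overrides)
    body = json.dumps({"model": "inworld-tts-2", "payload": payload})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key: str, text: str, **kwargs) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_tts_request(api_key, text, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.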

Response

Inworld TTS 2 is synchronous: the result is returned directly in the response to the POST request.
{
    "request_id": "abc123-def456",
    "model": "inworld-tts-2",
    "status": "success",
    "outcome": {
        "audio_url": "https://storage.googleapis.com/...",
        "media": [{"type": "audio", "url": "https://..."}],
        "usage": {
            "processed_characters": 35,
            "model_id": "inworld-tts-2"
        },
        "timestamp_info": {
            "wordAlignment": {
                "words": ["Hello,", "world!"],
                "wordStartTimeSeconds": [0, 0.37],
                "wordEndTimeSeconds": [0.37, 0.83],
                "phoneticDetails": [...]
            }
        }
    }
}
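
When `timestamp_type` is `WORD`, the parallel arrays in `wordAlignment` can be zipped into per-word cues for captions or highlighting. The sketch below assumes the response has the shape shown above and that the three arrays are index-aligned. `word_cues` is a hypothetical helper name.

```python
def word_cues(response: dict) -> list[tuple[str, float, float]]:
    """Return (word, start_seconds, end_seconds) tuples from a TTS response.

    Returns an empty list when no word alignment is present.
    """
    outcome = response.get("outcome") or {}
    info = outcome.get("timestamp_info") or {}
    alignment = info.get("wordAlignment")
    if not alignment:
        return []
    words = alignment["words"]
    starts = alignment["wordStartTimeSeconds"]
    ends = alignment["wordEndTimeSeconds"]
    # ASSUMPTION: the arrays are index-aligned; fail loudly if they are not.
    if not (len(words) == len(starts) == len(ends)):
        raise ValueError("word alignment arrays have different lengths")
    return [(w, float(s), float(e)) for w, s, e in zip(words, starts, ends)]


# Trimmed copy of the sample response above.
sample = {
    "status": "success",
    "outcome": {
        "timestamp_info": {
            "wordAlignment": {
                "words": ["Hello,", "world!"],
                "wordStartTimeSeconds": [0, 0.37],
                "wordEndTimeSeconds": [0.37, 0.83],
            }
        }
    },
}

for word, start, end in word_cues(sample):
    print(f"{start:5.2f}-{end:5.2f}  {word}")
```

Returning plain tuples keeps the helper independent of any particular caption format. The same cues can feed a subtitle writer or a UI highlighter.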