Quick Start - GMI Cloud Documentation

Create an account and get your API key

Sign in to the GMI Cloud Console. Go to Settings → API Keys and create a new key. Copy it — you’ll use it in the next step.
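
One way to keep the copied key out of source files is to read it from an environment variable. The sketch below assumes a variable named GMI_API_KEY; that name and the load_api_key helper are illustrative choices, not something GMI prescribes.

```python
import os


def load_api_key(env_var: str = "GMI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise a clear error.

    GMI_API_KEY is a hypothetical variable name chosen for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your GMI API key."
        )
    return key


# Usage (after exporting the variable in your shell):
#   api_key = load_api_key()
```

The returned string can then be passed as the api_key argument when you create the client in the next step.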

Make your first inference call

GMI’s inference API is OpenAI-compatible, so the official OpenAI Python SDK (the openai package) works once you point it at GMI’s base URL and pass your API key:

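from openai import OpenAI

# Point the OpenAI client at GMI's endpoint instead of the default OpenAI one.
client = OpenAI(
    base_url="https://api.gmi-serving.com/v1",
    api_key="YOUR_GMI_API_KEY",  # replace with the key from the previous step
)

# Send a single-turn chat request to a hosted model.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "Hello, what can you do?"}],
)

# Print the text of the first returned completion.
print(response.choices[0].message.content)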
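
To make "OpenAI-compatible" concrete, here is a rough sketch of the HTTP request the client call above corresponds to, built with only the standard library. It assumes the usual OpenAI conventions carry over unchanged: a POST to /chat/completions under the base URL, a Bearer token in the Authorization header, and a choices list in the JSON response. The helper names build_chat_request and send_chat_request are made up for this example.

```python
import json
import urllib.request

BASE_URL = "https://api.gmi-serving.com/v1"  # same base URL as the client example


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the first completion's text.

    Assumes the response follows the OpenAI chat completion schema.
    """
    with urllib.request.urlopen(request, timeout=60) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


# Usage (performs a real network call, so it needs a valid key):
#   req = build_chat_request("YOUR_GMI_API_KEY",
#                            "meta-llama/Llama-3.3-70B-Instruct",
#                            "Hello, what can you do?")
#   print(send_chat_request(req))
```

Splitting construction from sending lets you inspect the URL, headers, and body without spending a request, which can help when debugging authentication or model-name errors.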

Explore what's next

Pick where to go based on your use case:

Browse the model catalog — text, image, video, and audio models
Set up a dedicated endpoint — reserve capacity for production
Provision GPU compute — managed clusters and bare-metal for training

Go deeper

Browse models

Text, image, video, and audio models available on GMI.

Dedicated endpoints

Reserve capacity for production workloads.

GPU Compute

Managed clusters and bare-metal for training and fine-tuning.

GMI Studio

Build multi-step AI pipelines visually.

API Reference

Full REST API docs for all GMI services.

Agent Frameworks

Plug GMI into Hermes, Dify, and OpenClaw.
