Welcome to GMI Cloud - GMI Cloud Documentation

GMI’s inference API is OpenAI-compatible — swap your endpoint and API key to get started. Run serverless inference, scale on dedicated GPU clusters, build visual AI workflows, or publish agents on the marketplace.

Products

Inference

Serverless and Dedicated endpoints for chat, vision, image, video, and audio models. OpenAI-compatible APIs.

GPU Compute

Managed Kubernetes clusters, container instances, and bare-metal servers on H200 and B200 GPUs.

GMI Studio

Visual workflow builder. Connect models with nodes to make multi-step pipelines for media and text.

GMI AgentBox

Marketplace of ready-to-use AI agents. Browse, use, or publish your own.

Inference integrations

Connect GMI inference to your dev tools and agent frameworks.

Coding Tools

Use GMI models inside Claude Code, Codex, and Cursor.

Agent Frameworks

Plug GMI into Hermes, Dify, and OpenClaw.

Guides & reference

Documentation and API specs across all GMI products.

Guides

Task-focused walkthroughs across products: agents, coding tools, model quickstarts, and migration.

API Reference

REST APIs for IAM, Compute, IDC, and Inference services. Full request and response schemas.

Model catalog

Text

LLMs for chat, code, and reasoning.

Image

Generation, editing, and batch image workflows.

Video

Text-to-video, image-to-video, and editing.

Audio

TTS, voice cloning, and music generation.

Quick links

Console

Manage everything in one place.

Pricing

Current rates for inference and compute.

Quickstart

Get an API key, make your first call, and explore in minutes.

Contact Sales

Enterprise pricing or early access.

Quick Start

​Products