Model ID
MiniMaxAI/MiniMax-M2.5
API Usage
MiniMax-M2.5 supports two API formats:
- OpenAI-compatible: /v1/chat/completions endpoint
- Anthropic-compatible: /v1/messages endpoint (with extended thinking support)
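The two formats differ in the endpoint path and in how the API key is sent. The following is a minimal sketch of that difference, using the base URL, model ID, and header names from the examples on this page; `build_request` is a hypothetical helper, not part of any SDK.

```python
# Base URL and model ID as used in the examples on this page.
BASE_URL = "https://api.gmi-serving.com"
MODEL_ID = "MiniMaxAI/MiniMax-M2.5"


def build_request(api_format, api_key, prompt, max_tokens=1024):
    """Return (url, headers, payload) for the chosen API format.

    Hypothetical helper for illustration only.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    headers = {"Content-Type": "application/json"}
    if api_format == "openai":
        # OpenAI-compatible: bearer token in the Authorization header
        url = f"{BASE_URL}/v1/chat/completions"
        headers["Authorization"] = f"Bearer {api_key}"
    elif api_format == "anthropic":
        # Anthropic-compatible: key in the x-api-key header
        url = f"{BASE_URL}/v1/messages"
        headers["x-api-key"] = api_key
    else:
        raise ValueError(f"unknown api_format: {api_format!r}")
    return url, headers, payload
```

The returned triple can be passed to `requests.post(url, headers=headers, json=payload)`.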
API Examples
OpenAI-Compatible Endpoint
Generate a model response using the chat completions endpoint.
Shell
curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "model": "MiniMaxAI/MiniMax-M2.5",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant with strong reasoning and coding ability."},
      {"role": "user", "content": "Explain how the Mixture-of-Experts architecture improves inference efficiency in large language models."}
    ],
    "max_tokens": 1024
  }'
Python
import json

import requests

url = "https://api.gmi-serving.com/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
}
payload = {
    "model": "MiniMaxAI/MiniMax-M2.5",
    "messages": [
        {"role": "system", "content": "You are a capable AI coding assistant"},
        {"role": "user", "content": "Refactor this multi-file Python module to make it async-ready"}
    ],
    "max_tokens": 1024
}

response = requests.post(url, headers=headers, json=payload)
print(json.dumps(response.json(), indent=2))
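The example above prints the whole JSON body. To pull out only the assistant's reply, a sketch like the following may help. It assumes the response follows the standard OpenAI chat completions shape (`choices[0].message.content`), which "OpenAI-compatible" suggests but this page does not show. `extract_reply` is a hypothetical helper and the sample body is illustrative, not real model output.

```python
def extract_reply(body):
    """Return the assistant text from a chat completions response body.

    Assumes the OpenAI chat completions schema; hypothetical helper.
    """
    choices = body.get("choices") or []
    if not choices:
        # Error responses typically carry no "choices" list
        raise ValueError(f"no choices in response: {body}")
    return choices[0]["message"]["content"]


# Illustrative body in the assumed shape (not real model output)
sample_body = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hello!"}}
    ]
}
print(extract_reply(sample_body))
```

With the request above, this would be called as `extract_reply(response.json())`, ideally after `response.raise_for_status()` so that HTTP errors surface before parsing.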
Anthropic-Compatible Endpoint (with Extended Thinking)
The /v1/messages endpoint exposes the model’s reasoning process as “thinking” content blocks, returned alongside the regular “text” blocks. You can also call it through the Anthropic SDK by configuring the base URL.
Using Anthropic SDK (Recommended)
Shell
# Configure environment variables
export ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
export ANTHROPIC_API_KEY=${YOUR_API_KEY}
Python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="MiniMax-M2.5",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi, how are you?"
                }
            ]
        }
    ]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking:\n{block.thinking}\n")
    elif block.type == "text":
        print(f"Text:\n{block.text}\n")
Shell (Direct API)
curl --request POST \
  --url https://api.gmi-serving.com/v1/messages \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: *************' \
  --data '{
    "model": "MiniMaxAI/MiniMax-M2.5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Solve this step by step: What is 23 * 47?"}
    ]
  }'
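When calling /v1/messages directly rather than through the SDK, the response arrives as plain JSON. The sketch below separates reasoning from the final answer, assuming the body carries a `content` list of blocks typed `thinking` and `text`, as in the Anthropic Messages format used by the SDK example on this page. `split_blocks` is a hypothetical helper and the sample body is illustrative, not real model output.

```python
def split_blocks(body):
    """Return (thinking, text) joined from a /v1/messages response body.

    Assumes Anthropic-style content blocks; hypothetical helper.
    """
    thinking_parts, text_parts = [], []
    for block in body.get("content", []):
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
    return "".join(thinking_parts), "".join(text_parts)


# Illustrative body in the assumed shape (not real model output)
sample_body = {
    "content": [
        {"type": "thinking", "thinking": "23 * 47 = 23 * 40 + 23 * 7 = 920 + 161"},
        {"type": "text", "text": "23 * 47 = 1081"},
    ]
}
thinking, answer = split_blocks(sample_body)
print(answer)
```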
Shell (with Streaming)
curl --request POST \
  --url https://api.gmi-serving.com/v1/messages \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: *************' \
  --data '{
    "model": "MiniMaxAI/MiniMax-M2.5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Tell me a three sentence bedtime story about a unicorn."}
    ],
    "stream": true
  }'
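With "stream": true, the response is delivered incrementally rather than as one JSON body. The sketch below shows one way to read such a stream without the SDK. It assumes the endpoint emits server-sent events in the Anthropic streaming format, that is, `data:` lines holding JSON events of type `content_block_delta`; this page does not show the wire format, so treat that as an assumption. `iter_deltas` is a hypothetical helper and the sample lines are illustrative.

```python
import json


def iter_deltas(lines):
    """Yield (kind, fragment) pairs from the lines of a streamed response.

    Assumes Anthropic-style server-sent events; hypothetical helper.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        data = line[len("data:"):].strip()
        try:
            event = json.loads(data)
        except json.JSONDecodeError:
            continue  # ignore non-JSON payloads
        if not isinstance(event, dict) or event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "thinking_delta":
            yield "thinking", delta.get("thinking", "")
        elif delta.get("type") == "text_delta":
            yield "text", delta.get("text", "")


# Illustrative lines in the assumed format (not real model output)
sample_lines = [
    b"event: content_block_delta",
    b'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "thinking_delta", "thinking": "Plan. "}}',
    b"",
    'data: {"type": "content_block_delta", "index": 1, "delta": {"type": "text_delta", "text": "Once upon a time"}}',
    'data: {"type": "message_stop"}',
]
deltas = list(iter_deltas(sample_lines))
print(deltas)
```

Against the live endpoint, the lines would come from `requests.post(url, headers=headers, json=payload, stream=True).iter_lines()`, which yields byte strings.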
Python (with Streaming)
import anthropic

client = anthropic.Anthropic()

stream = client.messages.create(
    model="MiniMax-M2.5",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[
        {"role": "user", "content": [{"type": "text", "text": "Hi, how are you?"}]}
    ],
    stream=True,
)

for chunk in stream:
    if chunk.type == "content_block_delta":
        if hasattr(chunk, "delta") and chunk.delta:
            if chunk.delta.type == "thinking_delta":
                print(chunk.delta.thinking, end="", flush=True)
            elif chunk.delta.type == "text_delta":
                print(chunk.delta.text, end="", flush=True)
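The loop above prints fragments as they arrive. If the complete reasoning and answer are needed afterwards, the same event types can be accumulated instead. The following is a minimal sketch; `collect_stream` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that mimic only the `type` and `delta` attributes read by the loop above, not real SDK events.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Return (thinking, text) accumulated from streamed chunks.

    Hypothetical helper; reads the same attributes as the loop above.
    """
    thinking_parts, text_parts = [], []
    for chunk in stream:
        if chunk.type != "content_block_delta":
            continue
        delta = getattr(chunk, "delta", None)
        if delta is None:
            continue
        if delta.type == "thinking_delta":
            thinking_parts.append(delta.thinking)
        elif delta.type == "text_delta":
            text_parts.append(delta.text)
    return "".join(thinking_parts), "".join(text_parts)


# Stand-in chunks for illustration (not real SDK objects or model output)
fake_stream = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="thinking_delta", thinking="A greeting. "),
    ),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="text_delta", text="I'm well, "),
    ),
    SimpleNamespace(
        type="content_block_delta",
        delta=SimpleNamespace(type="text_delta", text="thanks!"),
    ),
]
thinking, text = collect_stream(fake_stream)
print(text)
```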