Model ID
zai-org/GLM-4.5-FP8
API Usage
You can access GLM-4.5-FP8 through the same RESTful Chat Completions API used by other GLM models.
The following examples demonstrate text generation and function-calling usage.
API Examples
Generate a Chat Completion
Use the chat completion endpoint to generate responses from GLM-4.5-FP8.
Shell
curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "model": "zai-org/GLM-4.5-FP8",
    "messages": [
      {"role": "system", "content": "You are a knowledgeable AI assistant."},
      {"role": "user", "content": "Explain the concept of quantum entanglement in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 800
  }'
Function Calling
Declare the functions the model may call in the tools array of the request.
curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "temperature": 0,
    "max_tokens": 100,
    "model": "zai-org/GLM-4.5-FP8",
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_stock_price",
          "description": "Retrieve the current stock price for a given company.",
          "parameters": {
            "type": "object",
            "properties": {
              "symbol": {
                "type": "string",
                "description": "Ticker symbol of the company, e.g., AAPL or TSLA."
              }
            },
            "required": ["symbol"]
          }
        }
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "What is the current price of Apple stock?"
      }
    ]
  }'
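When the model decides to use get_stock_price, the reply usually carries a structured call rather than plain text. The sketch below shows one way to act on such a reply. It assumes the response follows the OpenAI-compatible Chat Completions layout (choices[0].message.tool_calls, with function.arguments as a JSON-encoded string); that layout, the sample_response dictionary, and the local get_stock_price stub are illustrative assumptions, not output captured from the service.

```python
import json


def get_stock_price(symbol: str) -> dict:
    """Hypothetical local implementation of the tool declared in the request."""
    # Placeholder result for illustration; a real version would query a market-data source.
    return {"symbol": symbol, "price": None}


# Map tool names from the request to local callables.
TOOLS = {"get_stock_price": get_stock_price}


def run_tool_calls(response_body: dict) -> list:
    """Run every tool call found in the reply and return follow-up "tool" messages."""
    message = response_body["choices"][0]["message"]
    results = []
    for call in message.get("tool_calls") or []:
        name = call["function"]["name"]
        # Arguments are assumed to arrive as a JSON-encoded string.
        args = json.loads(call["function"]["arguments"])
        output = TOOLS[name](**args)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results


# Assumed response shape, written by hand for this example (not real API output).
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_stock_price",
                    "arguments": json.dumps({"symbol": "AAPL"}),
                },
            }],
        }
    }]
}

tool_messages = run_tool_calls(sample_response)
print(tool_messages)
```

Under the same assumption, appending the assistant message and these tool messages to messages and sending a second request would let the model produce a final natural-language answer.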
Python Usage
The same chat completion request can be sent from Python with the requests library.
import json

import requests

url = "https://api.gmi-serving.com/v1/chat/completions"

# Replace the placeholder with your API key.
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer *************"
}

payload = {
    "model": "zai-org/GLM-4.5-FP8",
    "messages": [
        {"role": "system", "content": "You are a knowledgeable AI assistant."},
        {"role": "user", "content": "Explain the concept of quantum entanglement in simple terms."}
    ],
    "temperature": 0.7,
    "max_tokens": 800
}

# Send the request and fail early on HTTP errors (4xx/5xx).
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()

# Pretty-print the full JSON response body.
print(json.dumps(response.json(), indent=2))
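In practice you may want only the assistant's text and may prefer not to hard-code the key. The following is a minimal sketch of that variation. It swaps requests for the standard-library urllib.request so it runs without extra packages. The environment variable name GMI_API_KEY, the helper names, and the choices[0].message.content response path (the OpenAI-compatible convention) are assumptions that this page does not confirm.

```python
import json
import os
import urllib.request

URL = "https://api.gmi-serving.com/v1/chat/completions"
MODEL = "zai-org/GLM-4.5-FP8"


def build_payload(prompt: str) -> dict:
    """Build the same request body as the example above for a given user prompt."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a knowledgeable AI assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.7,
        "max_tokens": 800,
    }


def extract_reply(body: dict) -> str:
    """Return the assistant text, assuming an OpenAI-compatible response layout."""
    return body["choices"][0]["message"]["content"]


def ask(prompt: str) -> str:
    """Send one chat completion request and return the assistant's reply text."""
    api_key = os.environ["GMI_API_KEY"]  # hypothetical variable name
    request = urllib.request.Request(
        URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return extract_reply(body)
```

With the key exported in the environment, ask("Explain the concept of quantum entanglement in simple terms.") would return the reply as a string, provided the response matches the assumed layout.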