Model ID
deepseek-ai/DeepSeek-V4-Flash
The model includes hybrid attention for efficient long-context processing and supports configurable reasoning modes. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important.
API Usage
You can call the deepseek-ai/DeepSeek-V4-Flash model from any language or tool that can send HTTPS requests. The examples below use curl and Python.
API Examples
Generate a model response using the chat endpoint of DeepSeek-V4-Flash.
Shell
curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "model": "deepseek-ai/DeepSeek-V4-Flash",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant"},
      {"role": "user", "content": "List 3 countries and their capitals."}
    ],
    "temperature": 0,
    "max_tokens": 500
  }'
# Example: function calling
curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "temperature": 0,
    "max_tokens": 100,
    "model": "deepseek-ai/DeepSeek-V4-Flash",
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "query_weather",
          "description": "Get the weather for a city. The user should supply a city first.",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {
                "type": "string",
                "description": "The city, e.g. Beijing"
              }
            },
            "required": [
              "city"
            ]
          }
        }
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "How is the weather in Qingdao today?"
      }
    ]
  }'
Python
import json

import requests

url = "https://api.gmi-serving.com/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer *************"
}
payload = {
    "model": "deepseek-ai/DeepSeek-V4-Flash",
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant"},
        {"role": "user", "content": "List 3 countries and their capitals."}
    ],
    "temperature": 0,
    "max_tokens": 500
}

# Send the chat completion request
response = requests.post(url, headers=headers, json=payload)
# Raise an exception on HTTP 4xx/5xx responses instead of printing an error body
response.raise_for_status()
# Pretty-print the JSON response
print(json.dumps(response.json(), indent=2))
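The function-calling request from the Shell section can also be sent from Python. The following is a minimal sketch, not a definitive implementation. It assumes the response follows the common OpenAI-style chat completions shape, with tool calls under choices[0].message.tool_calls and JSON-encoded arguments; this page does not confirm that shape, so verify it against an actual response. The helper names build_tool_request, extract_tool_calls, and send_chat_request are hypothetical and exist only for illustration.

```python
import json

API_URL = "https://api.gmi-serving.com/v1/chat/completions"
MODEL_ID = "deepseek-ai/DeepSeek-V4-Flash"


def build_tool_request(user_message):
    """Build the same function-calling payload as the Shell example."""
    return {
        "model": MODEL_ID,
        "temperature": 0,
        "max_tokens": 100,
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "query_weather",
                    "description": "Get the weather for a city. "
                                   "The user should supply a city first.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "city": {
                                "type": "string",
                                "description": "The city, e.g. Beijing",
                            }
                        },
                        "required": ["city"],
                    },
                },
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }


def extract_tool_calls(response_body):
    """Return a list of (function_name, arguments_dict) pairs.

    ASSUMPTION: OpenAI-style response shape, where each tool call carries
    a "function" object whose "arguments" field is a JSON-encoded string.
    """
    choices = response_body.get("choices") or [{}]
    message = choices[0].get("message") or {}
    calls = []
    for call in message.get("tool_calls") or []:
        function = call.get("function") or {}
        arguments = json.loads(function.get("arguments") or "{}")
        calls.append((function.get("name"), arguments))
    return calls


def send_chat_request(payload, api_key, timeout=30):
    """POST the payload to the chat endpoint and return the parsed JSON."""
    # Imported here so the helpers above work without requests installed.
    import requests

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    response = requests.post(API_URL, headers=headers, json=payload, timeout=timeout)
    response.raise_for_status()
    return response.json()
```

To try it, call send_chat_request(build_tool_request("How is the weather in Qingdao today?"), api_key) with your own key, then pass the result to extract_tool_calls. If the model decides to call query_weather, the returned list should contain that function name with the parsed city argument; an empty list means the model answered directly.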