Model ID
deepseek-ai/DeepSeek-V4-Flash
The model includes hybrid attention for efficient long-context processing and supports configurable reasoning modes. It is well suited for applications such as coding assistants, chat systems, and agent workflows where responsiveness and cost efficiency are important.

API Usage

You can call deepseek-ai/DeepSeek-V4-Flash from any language or tool that can send a JSON POST request. The examples below use curl and Python.

API Examples

Generate a model response using the chat endpoint of DeepSeek-V4-Flash.

Shell

curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "model": "deepseek-ai/DeepSeek-V4-Flash",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant"},
      {"role": "user", "content": "List 3 countries and their capitals."}
    ],
    "temperature": 0,
    "max_tokens": 500
  }'
# Example: function calling
curl --request POST \
  --url https://api.gmi-serving.com/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer *************' \
  --data '{
    "temperature": 0,
    "max_tokens": 100,
    "model": "deepseek-ai/DeepSeek-V4-Flash",
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "query_weather",
                "description": "Get the weather for a city. The user must supply a city first.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "The city, e.g. Beijing"
                        }
                    },
                    "required": [
                        "city"
                    ]
                }
            }
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": "How is the weather in Qingdao today?"
        }
    ]
}'

Python

import requests
import json

url = "https://api.gmi-serving.com/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer *************"
}

payload = {
    "model": "deepseek-ai/DeepSeek-V4-Flash",
    "messages": [
        {"role": "system", "content": "You are a helpful AI assistant"},
        {"role": "user", "content": "List 3 countries and their capitals."}
    ],
    "temperature": 0,
    "max_tokens": 500
}

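# Set a timeout and raise on HTTP errors so a failed request is not printed as a result.
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(json.dumps(response.json(), indent=2))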
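
The function-calling request shown in the Shell section can also be sent from Python. The sketch below reuses the same `query_weather` tool definition and adds a small parser for the reply. The helper names (`build_tool_payload`, `extract_tool_calls`, `ask_with_tools`) are illustrative, not part of any SDK. The response shape (`choices[0].message.tool_calls`, with `arguments` as a JSON string) is an assumption based on the OpenAI-style endpoint path, so check it against a real response before relying on it.

```python
import json

API_URL = "https://api.gmi-serving.com/v1/chat/completions"
MODEL_ID = "deepseek-ai/DeepSeek-V4-Flash"

# Same tool definition as in the curl function-calling example.
QUERY_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "query_weather",
        "description": "Get the weather for a city. The user must supply a city first.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "The city, e.g. Beijing"}
            },
            "required": ["city"],
        },
    },
}


def build_tool_payload(question):
    """Build the request body for a function-calling request."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
        "tools": [QUERY_WEATHER_TOOL],
        "temperature": 0,
        "max_tokens": 100,
    }


def extract_tool_calls(response_json):
    """Return a list of (name, arguments) pairs requested by the model.

    Assumption: the response follows the OpenAI-style layout
    choices[0].message.tool_calls[*].function.{name, arguments},
    where arguments is a JSON-encoded string.
    """
    choices = response_json.get("choices") or []
    if not choices:
        return []
    message = choices[0].get("message") or {}
    calls = []
    for call in message.get("tool_calls") or []:
        function = call.get("function") or {}
        raw_args = function.get("arguments") or "{}"
        calls.append((function.get("name"), json.loads(raw_args)))
    return calls


def ask_with_tools(question, api_key):
    """Send the request and return any tool calls found in the reply."""
    # Imported here so the helpers above can be used without the dependency.
    import requests

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    response = requests.post(
        API_URL, headers=headers, json=build_tool_payload(question), timeout=60
    )
    response.raise_for_status()
    return extract_tool_calls(response.json())
```

Calling `ask_with_tools("How is the weather in Qingdao today?", api_key)` with your own key returns the tool calls the model requested, or an empty list if it answered in plain text. Your application is then responsible for running `query_weather` itself; the model only proposes the call.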