To maintain system stability and equitable access, our API enforces rate limits, which control how frequently an organization can send requests within a given time window.

How do these rate limits work?

API rate limits are defined in two ways:
  • TPM (Tokens per Minute) for large language models (LLMs)
  • RPH (Requests per Hour) for video models
These limits are enforced at the organization level.
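
To stay under a TPM quota, a client can track its own token spend before sending requests. The sketch below is one possible client-side approach, not the API's actual enforcement logic: it assumes a sliding 60-second window, and the class name `TokenBudget` and its methods are hypothetical. Because limits apply to the whole organization, a single process running this only sees its own share of the traffic.

```python
import time
from collections import deque


class TokenBudget:
    """Client-side estimate of tokens spent in the last `window` seconds.

    Hypothetical helper: the server's real windowing scheme is not
    documented here, so a sliding window is an assumption.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit      # e.g. the model's TPM quota for your tier
        self.window = window    # seconds; 60 for a per-minute limit
        self.clock = clock      # injectable so the class can be tested
        self._events = deque()  # (timestamp, tokens) pairs, oldest first
        self._used = 0

    def _expire(self, now):
        # Drop spends that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def remaining(self):
        """Tokens still available in the current window."""
        self._expire(self.clock())
        return self.limit - self._used

    def try_spend(self, tokens):
        """Record the spend and return True only if it fits the budget."""
        now = self.clock()
        self._expire(now)
        if self._used + tokens > self.limit:
            return False
        self._events.append((now, tokens))
        self._used += tokens
        return True
```

A caller would check `try_spend(estimated_tokens)` before each request and delay or queue the request when it returns `False`.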

Usage tiers

Rate limits vary by usage tier, with each tier offering different quotas for each model. By default, organizations are assigned to Tier 1. To request a higher rate limit, please contact support@gmicloud.ai for an upgrade.

Rate limit table

| Model Name | Tier 1 TPM | Tier 2 TPM | Tier 3 TPM | Tier 4 TPM | Tier 5 TPM |
|---|---|---|---|---|---|
| deepseek-ai/DeepSeek-R1 | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Zero | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-R2 | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-V3 | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-V3-Base | 100,000 | 450,000 | 800,000 | 2,000,000 | 30,000,000 |
| deepseek-ai/DeepSeek-V3-0324 | 100,000 | 450,000 | 800,000 | 2,000,000 | 150,000,000 |
| deepseek-ai/DeepSeek-V3.1 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| deepseek-ai/DeepSeek-V3.1-Terminus | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| deepseek-ai/DeepSeek-Prover-V2-671B | 100,000 | 450,000 | 800,000 | 2,000,000 | 150,000,000 |
| deepseek-ai/DeepSeek-R1-0528 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| meta-llama/Llama-3.1-8B | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| meta-llama/Llama-3.3-70B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/QwQ-32B | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/QwQ-32B-Preview | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/QwQ-32B-GGUF | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/QwQ-32B-AWQ | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-0.5B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-1.5B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-1M | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-AWQ | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GGUF | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-VL-7B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-Coder-7B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-Math-7B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-14B | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct-AWQ | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GGUF | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-Coder-14B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-32B | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-32B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen2.5-Coder-32B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-235B-A22B-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-32B-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-30B-A3B | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-Next-80B-A3B-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-Next-80B-A3B-Thinking | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| Qwen/Qwen3-235B-A22B-Thinking-2507-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| moonshotai/Kimi-K2-Instruct | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| moonshotai/Kimi-K2-Instruct-0905 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| openai/gpt-oss-120b | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| zai-org/GLM-4.5-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| zai-org/GLM-4.5-Air-FP8 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
| zai-org/GLM-4.6 | 100,000 | 2,000,000 | 4,000,000 | 10,000,000 | 150,000,000 |
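
To get a feel for what these quotas mean in practice, the following sketch converts a TPM limit into an approximate request rate. It is only an illustration: the quotas are copied from three rows of the table above, the function name `max_requests_per_minute` is hypothetical, and it assumes that a request's cost is a fixed token count covering both prompt and completion, which this page does not specify.

```python
# TPM quotas by tier (index 0 = Tier 1), copied from three rows of the table.
TPM_LIMITS = {
    "deepseek-ai/DeepSeek-R1": (100_000, 450_000, 800_000, 2_000_000, 30_000_000),
    "deepseek-ai/DeepSeek-V3-0324": (100_000, 450_000, 800_000, 2_000_000, 150_000_000),
    "zai-org/GLM-4.6": (100_000, 2_000_000, 4_000_000, 10_000_000, 150_000_000),
}


def max_requests_per_minute(model, tier, tokens_per_request):
    """Rough upper bound on requests per minute for a given average request size.

    Assumption: every request costs exactly `tokens_per_request` tokens.
    """
    if not 1 <= tier <= 5:
        raise ValueError("tier must be between 1 and 5")
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return TPM_LIMITS[model][tier - 1] // tokens_per_request


# Tier 1 with requests averaging 2,500 tokens: 100,000 // 2,500 = 40 per minute.
print(max_requests_per_minute("deepseek-ai/DeepSeek-R1", 1, 2_500))
```

Under the same assumption, moving `zai-org/GLM-4.6` from Tier 1 to Tier 2 raises the bound from 40 to 800 requests per minute.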