Rate limits

To maintain system stability and equitable access, our API enforces rate limits, which cap how frequently an organization can send requests within a given time window.
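Clients typically handle a rate-limited request by waiting and retrying. The sketch below shows one common pattern, exponential backoff with jitter. This page does not say how the API signals an exceeded limit, so the HTTP 429 status code is an assumption, and `send` is a hypothetical callable standing in for whatever client code issues the request.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning (status, body).
    The 429 status is an assumed convention, not confirmed by this page.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Exponential backoff with full jitter, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay * rng())
    return status, body
```

The jitter spreads retries out so that several workers in the same organization do not all retry at the same instant.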

How do these rate limits work?

API rate limits are measured in one of two units, depending on the model type:

  • TPM (tokens per minute) for language models (LLMs)
  • RPH (requests per hour) for video models

These limits are enforced at the organization level.
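To stay under a TPM limit, a client can track its own token usage before sending each request. The following is a minimal sketch of a sliding-window tracker. The `TokenBudget` class and its methods are hypothetical, and the sliding 60-second window is an assumption, since this page does not describe how the server measures the minute. Because limits apply to the whole organization, a tracker like this only approximates the real budget when several processes share one account.

```python
import time
from collections import deque


class TokenBudget:
    """Client-side sliding-window tracker for a tokens-per-minute limit."""

    def __init__(self, tpm_limit, window_seconds=60.0, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.window_seconds = window_seconds
        self.clock = clock
        self._events = deque()  # (timestamp, tokens) pairs inside the window
        self._used = 0

    def _expire(self, now):
        # Drop usage records that have fallen out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def try_consume(self, tokens):
        """Record `tokens` and return True if they fit in the current window."""
        now = self.clock()
        self._expire(now)
        if self._used + tokens > self.tpm_limit:
            return False
        self._events.append((now, tokens))
        self._used += tokens
        return True

    def seconds_until_available(self, tokens):
        """Estimate how long to wait before `tokens` would fit."""
        if tokens > self.tpm_limit:
            raise ValueError("request exceeds the per-minute limit on its own")
        now = self.clock()
        self._expire(now)
        needed = self._used + tokens - self.tpm_limit
        if needed <= 0:
            return 0.0
        freed = 0
        # Walk the oldest records until enough tokens would be released.
        for timestamp, used in self._events:
            freed += used
            if freed >= needed:
                return max(0.0, timestamp + self.window_seconds - now)
        return self.window_seconds
```

A caller would check `try_consume(estimated_tokens)` before each request and sleep for `seconds_until_available(estimated_tokens)` when it returns False.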

Usage tiers

Rate limits vary by usage tier, and each tier sets a different quota for each model. By default, organizations are assigned to Tier 1. To request a higher tier, contact [email protected].

Rate limit table

| Model Name | Tier Name | TPM |
| --- | --- | --- |
| deepseek-ai/DeepSeek-R1 | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1 | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1 | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1 | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1 | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-14B | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-32B | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-V3 | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-V3 | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-V3 | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-V3 | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-V3 | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-V3-Base | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-V3-Base | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-V3-Base | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-V3-Base | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-V3-Base | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R2 | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R2 | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R2 | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R2 | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R2 | Tier 5 | 30,000,000 |
| deepseek-ai/DeepSeek-R1-Zero | Tier 1 | 30,000 |
| deepseek-ai/DeepSeek-R1-Zero | Tier 2 | 450,000 |
| deepseek-ai/DeepSeek-R1-Zero | Tier 3 | 800,000 |
| deepseek-ai/DeepSeek-R1-Zero | Tier 4 | 2,000,000 |
| deepseek-ai/DeepSeek-R1-Zero | Tier 5 | 30,000,000 |
| meta-llama/Llama-3.1-8B | Tier 1 | 200,000 |
| meta-llama/Llama-3.1-8B | Tier 2 | 2,000,000 |
| meta-llama/Llama-3.1-8B | Tier 3 | 4,000,000 |
| meta-llama/Llama-3.1-8B | Tier 4 | 10,000,000 |
| meta-llama/Llama-3.1-8B | Tier 5 | 150,000,000 |
| meta-llama/Llama-3.1-8B-Instruct | Tier 1 | 200,000 |
| meta-llama/Llama-3.1-8B-Instruct | Tier 2 | 2,000,000 |
| meta-llama/Llama-3.1-8B-Instruct | Tier 3 | 4,000,000 |
| meta-llama/Llama-3.1-8B-Instruct | Tier 4 | 10,000,000 |
| meta-llama/Llama-3.1-8B-Instruct | Tier 5 | 150,000,000 |
| meta-llama/Llama-3.3-70B-Instruct | Tier 1 | 200,000 |
| meta-llama/Llama-3.3-70B-Instruct | Tier 2 | 2,000,000 |
| meta-llama/Llama-3.3-70B-Instruct | Tier 3 | 4,000,000 |
| meta-llama/Llama-3.3-70B-Instruct | Tier 4 | 10,000,000 |
| meta-llama/Llama-3.3-70B-Instruct | Tier 5 | 150,000,000 |
| Qwen/QwQ-32B | Tier 1 | 200,000 |
| Qwen/QwQ-32B | Tier 2 | 2,000,000 |
| Qwen/QwQ-32B | Tier 3 | 4,000,000 |
| Qwen/QwQ-32B | Tier 4 | 10,000,000 |
| Qwen/QwQ-32B | Tier 5 | 150,000,000 |
| Qwen/QwQ-32B-Preview | Tier 1 | 200,000 |
| Qwen/QwQ-32B-Preview | Tier 2 | 2,000,000 |
| Qwen/QwQ-32B-Preview | Tier 3 | 4,000,000 |
| Qwen/QwQ-32B-Preview | Tier 4 | 10,000,000 |
| Qwen/QwQ-32B-Preview | Tier 5 | 150,000,000 |
| Qwen/QwQ-32B-GGUF | Tier 1 | 200,000 |
| Qwen/QwQ-32B-GGUF | Tier 2 | 2,000,000 |
| Qwen/QwQ-32B-GGUF | Tier 3 | 4,000,000 |
| Qwen/QwQ-32B-GGUF | Tier 4 | 10,000,000 |
| Qwen/QwQ-32B-GGUF | Tier 5 | 150,000,000 |
| Qwen/QwQ-32B-AWQ | Tier 1 | 200,000 |
| Qwen/QwQ-32B-AWQ | Tier 2 | 2,000,000 |
| Qwen/QwQ-32B-AWQ | Tier 3 | 4,000,000 |
| Qwen/QwQ-32B-AWQ | Tier 4 | 10,000,000 |
| Qwen/QwQ-32B-AWQ | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-0.5B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-0.5B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-0.5B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-0.5B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-0.5B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-1.5B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-1.5B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-1.5B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-1.5B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-1.5B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-7B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-7B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-7B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-7B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-1M | Tier 1 | 200,000 |
| Qwen/Qwen2.5-7B-Instruct-1M | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-7B-Instruct-1M | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-7B-Instruct-1M | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-7B-Instruct-1M | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-AWQ | Tier 1 | 200,000 |
| Qwen/Qwen2.5-7B-Instruct-AWQ | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-7B-Instruct-AWQ | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-7B-Instruct-AWQ | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-7B-Instruct-AWQ | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GGUF | Tier 1 | 200,000 |
| Qwen/Qwen2.5-7B-Instruct-GGUF | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GGUF | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GGUF | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GGUF | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 | Tier 1 | 200,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int4 | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 | Tier 1 | 200,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-VL-7B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-VL-7B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-VL-7B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-VL-7B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-VL-7B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-Coder-7B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-Coder-7B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-Coder-7B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-Coder-7B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-Coder-7B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-Math-7B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-Math-7B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-Math-7B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-Math-7B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-Math-7B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-14B | Tier 1 | 200,000 |
| Qwen/Qwen2.5-14B | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-14B | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-14B | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-14B | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-14B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-14B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-14B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-14B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct-AWQ | Tier 1 | 200,000 |
| Qwen/Qwen2.5-14B-Instruct-AWQ | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-14B-Instruct-AWQ | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-14B-Instruct-AWQ | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-14B-Instruct-AWQ | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GGUF | Tier 1 | 200,000 |
| Qwen/Qwen2.5-14B-Instruct-GGUF | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GGUF | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GGUF | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GGUF | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 | Tier 1 | 200,000 |
| Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-14B-Instruct-GPTQ-Int4 | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-Coder-14B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-Coder-14B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-Coder-14B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-Coder-14B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-Coder-14B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-32B | Tier 1 | 200,000 |
| Qwen/Qwen2.5-32B | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-32B | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-32B | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-32B | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-32B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-32B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-32B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-32B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-32B-Instruct | Tier 5 | 150,000,000 |
| Qwen/Qwen2.5-Coder-32B-Instruct | Tier 1 | 200,000 |
| Qwen/Qwen2.5-Coder-32B-Instruct | Tier 2 | 2,000,000 |
| Qwen/Qwen2.5-Coder-32B-Instruct | Tier 3 | 4,000,000 |
| Qwen/Qwen2.5-Coder-32B-Instruct | Tier 4 | 10,000,000 |
| Qwen/Qwen2.5-Coder-32B-Instruct | Tier 5 | 150,000,000 |
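
For capacity planning, the table can be turned into a small lookup. The sketch below copies the limits for two representative models from the table above; the function names are hypothetical, and it assumes a fixed token count per request. This page does not state whether input tokens, output tokens, or both count toward TPM, so the estimates are only rough.

```python
# TPM limits copied from the table above for two representative models.
TPM_LIMITS = {
    "deepseek-ai/DeepSeek-R1": {
        1: 30_000, 2: 450_000, 3: 800_000, 4: 2_000_000, 5: 30_000_000,
    },
    "Qwen/Qwen2.5-7B-Instruct": {
        1: 200_000, 2: 2_000_000, 3: 4_000_000, 4: 10_000_000, 5: 150_000_000,
    },
}


def tpm_limit(model, tier=1):
    """Return the TPM limit for a model and tier (Tier 1 is the default tier)."""
    return TPM_LIMITS[model][tier]


def max_requests_per_minute(model, tier, tokens_per_request):
    """Estimate how many fixed-size requests fit within one minute's budget."""
    return tpm_limit(model, tier) // tokens_per_request


def minimum_tier(model, required_tpm):
    """Return the lowest tier whose limit covers required_tpm, or None."""
    for tier in sorted(TPM_LIMITS[model]):
        if TPM_LIMITS[model][tier] >= required_tpm:
            return tier
    return None
```

For example, at Tier 1 a workload of 1,500-token requests to `deepseek-ai/DeepSeek-R1` fits at most 20 requests per minute, since 30,000 / 1,500 = 20.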