Gemini 2.5 Flash Image API Usage Guide
Overview
Gemini 2.5 Flash Image is optimized for image understanding and generation, offering a balance of price and performance. It uses the speed and cost-effectiveness of Gemini 2.5 Flash to provide fast and efficient image generation and editing capabilities. Supports text-to-image, image editing, and multi-turn conversations. Each generated image consumes 1290 tokens. Reference: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash-imageAuthentication
All API requests require authentication using an API key. Include your API key in the Authorization header:Submit Image Generation Request
Base URL
Endpoint
Request Format
Request Parameters
| Parameter | Type | Required | Description | Default | Constraints |
|---|---|---|---|---|---|
prompt | string | Yes | Text description of the desired image. | - | Required |
image | string/array | No | Optional reference image URLs (up to 3). Supported formats: PNG, JPEG, WebP, HEIC, HEIF. Max 7MB inline or 30MB from GCS. | - | Max 3 images |
aspect_ratio | string | No | Aspect ratio of the generated image. | ”1:1” | Options: “1:1”, “3:2”, “2:3”, “3:4”, “4:3”, “4:5”, “5:4”, “9:16”, “16:9”, “21:9” |
Response
Check Request Status
Endpoint
Example
Response
Request Status Values
| Status | Description |
|---|---|
queued | Request is waiting to be processed |
processing | Image generation is in progress |
success | Image generation completed successfully |
failed | Image generation failed |
cancelled | Request was cancelled |
List Your Requests
Endpoint
Example
Get Model Information
Endpoint
Example
List Available Models
Endpoint
Example
Response
Multi-turn Conversation (Iterative Image Editing)
This model supports multi-turn conversations for iterative image editing. After generating an image, you can continue refining it by providing additional instructions.How It Works
- First Turn: Send a regular request with
prompt(and optionalimage) - Response: The response includes
next_turn_contents- a pre-formatted conversation history - Next Turn: Copy
next_turn_contentsto your payload’scontentsfield, then add your new instruction to the last user turn - Repeat: Continue iterating until satisfied
First Turn Request
First Turn Response (with next_turn_contents)
Second Turn Request (Using next_turn_contents)
Copynext_turn_contents to contents, then fill in the last user turn with your new instruction:
Multi-turn Tips
- Text-only edits: Just fill in the
textfield in the last user turn - Add reference image: Include a
fileDatawithfileUripointing to a GCS or HTTP URL - Empty fields are ignored: Empty
textorfileUriare automatically filtered out - Conversation history: Each response includes updated
next_turn_contentsfor the next iteration
Model Specifications
| Specification | Value |
|---|---|
| Model ID | gemini-2.5-flash-image |
| Max input tokens | 32,768 |
| Max output tokens | 32,768 |
| Max input images | 3 |
| Max output images per prompt | 10 |
| Tokens per output image | 1,290 |
| Supported image types | PNG, JPEG, WebP, HEIC, HEIF |
| Max file size (inline) | 7 MB |
| Max file size (GCS) | 30 MB |
| Temperature | 0.0–2.0 (default 1.0) |
| topP | 0.0–1.0 (default 0.95) |
| topK | 64 (fixed) |
Tips for Better Results
- Prompt Clarity: Use detailed, specific prompts for precise results.
- Multi-turn Iteration: For complex edits, use multi-turn mode to refine the image step by step.