Gemini Omni Flash API Usage Guide
Overview
Gemini Omni Flash generates 720p video from a text prompt, optionally guided by reference images and/or an input video. It produces 24 FPS video at durations of 3-10 seconds (1-second increments) in 16:9 or 9:16. All generated videos are marked with invisible SynthID watermarking and C2PA content credentials.
Authentication
Include your API key in the Authorization header.
Submit a request
Endpoint
Example
Request parameters
Parameter names match our Veo video models, so a single integration works across both.

| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Text description of the video to generate. |
| reference_image | string/array | No | Up to 5 reference image URLs (image-to-video). |
| video | string | No | One input video URL, up to 10s (video-to-video). |
| durationSeconds | integer | No | Video length in seconds, 3-10 (default 5). |
| aspectRatio | enum | No | 16:9 (default) or 9:16. |
| resolution | enum | No | 720p (only supported value in this preview). |
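
The parameter table can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not an official client: the helper name `build_request` and the assumption that the parameters travel as one flat JSON object are illustrative; only the parameter names, ranges, and defaults come from the table.

```python
from typing import Optional, Sequence, Union


def build_request(
    prompt: str,
    reference_images: Optional[Union[str, Sequence[str]]] = None,
    video: Optional[str] = None,
    duration_seconds: int = 5,
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
) -> dict:
    """Assemble a request body from the documented parameters.

    Assumption: the body is a flat JSON object keyed by parameter name.
    """
    if not prompt:
        raise ValueError("prompt is required")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(duration_seconds, bool) or not isinstance(duration_seconds, int):
        raise ValueError("durationSeconds must be an integer")
    if not 3 <= duration_seconds <= 10:
        raise ValueError("durationSeconds must be between 3 and 10")
    if aspect_ratio not in ("16:9", "9:16"):
        raise ValueError("aspectRatio must be 16:9 or 9:16")
    if resolution != "720p":
        raise ValueError("resolution must be 720p in this preview")

    body = {
        "prompt": prompt,
        "durationSeconds": duration_seconds,
        "aspectRatio": aspect_ratio,
        "resolution": resolution,
    }
    if reference_images:
        # reference_image accepts a single URL or an array of URLs.
        if isinstance(reference_images, str):
            images = [reference_images]
        else:
            images = list(reference_images)
        if len(images) > 5:
            raise ValueError("at most 5 reference images are supported")
        body["reference_image"] = images
    if video is not None:
        body["video"] = video
    return body
```

Calling `build_request("a red kite over dunes", duration_seconds=8, aspect_ratio="9:16")` returns a dict ready to be serialized with `json.dumps`; out-of-range values raise `ValueError` locally rather than costing a round trip.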
Check status
In the status response, outcome.media_urls[0].url holds the generated video (720p MP4), and thumbnail_image_url is returned alongside it.
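
Reading the result out of a parsed status response might look like the sketch below. The helper name `extract_video` is hypothetical, and the code assumes that `thumbnail_image_url` sits in the same `media_urls` entry as `url`; the guide only says it appears "alongside", so adjust the lookup if the real response differs.

```python
from typing import Optional, Tuple


def extract_video(status: dict) -> Tuple[str, Optional[str]]:
    """Return (video_url, thumbnail_url) from a parsed status response.

    Assumption: thumbnail_image_url lives in the same media_urls entry
    as url. Raises ValueError if no media is present yet.
    """
    media = (status.get("outcome") or {}).get("media_urls") or []
    if not media:
        raise ValueError("status response contains no media yet")
    first = media[0]
    return first["url"], first.get("thumbnail_image_url")
```

The `ValueError` branch covers responses that carry no `outcome` yet, which lets a polling loop catch it and retry.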
Pricing
- Video output (720p): $0.10 per second of generated video. The rate is the same with or without audio and is independent of aspect ratio.
- Input references (text/image/video) are billed at $1.50 per 1M tokens (Gemini tokenization).
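
The two rates above combine into a simple per-request estimate. This is a rough sketch using the listed prices; the function name `estimate_cost` is illustrative, and the input token count must come from elsewhere, since this guide does not describe how to compute Gemini tokenization.

```python
from decimal import Decimal

# Rates taken from the pricing list above.
VIDEO_RATE_PER_SECOND = Decimal("0.10")
INPUT_RATE_PER_MILLION_TOKENS = Decimal("1.50")


def estimate_cost(duration_seconds: int, input_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one generation request."""
    video_cost = VIDEO_RATE_PER_SECOND * duration_seconds
    input_cost = INPUT_RATE_PER_MILLION_TOKENS * input_tokens / Decimal(1_000_000)
    return video_cost + input_cost
```

For example, an 8-second video with 20,000 input tokens comes to $0.80 + $0.03 = $0.83. `Decimal` is used so that currency amounts are not subject to binary floating-point rounding.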
Capabilities & limits (public preview)
- Duration: 3-10s (1s increments); Resolution: 720p; Aspect ratio: 16:9 or 9:16
- Up to 5 reference images; input video up to 10s
- Not yet supported: audio references, last-frame guidance, scene extension