Kling V3 Omni & Create Element Integration Guide
This guide covers how to use the Kling V3 Omni video generation model and the Kling Create Element model through the request queue API.
Overview
| Model | Model ID | Purpose |
|---|---|---|
| Kling V3 Omni | kling-v3-omni | Unified video generation: text-to-video, image-to-video, video editing, multi-shot storyboards, element-driven generation |
| Kling Create Element | kling-create-element | Create reusable character/object elements from images or videos for consistent appearances across generations |
Create elements with kling-create-element, then reference them in kling-v3-omni video generation using element_id and the <<<element_1>>> prompt syntax.
Kling V3 Omni (kling-v3-omni)
Capabilities
- Text-to-Video: Generate videos purely from text prompts
- Image-to-Video: Use start frame and/or end frame images to guide generation
- Video Editing: Edit an existing video (refer_type: "base")
- Feature Reference: Use a video as a style/motion reference (refer_type: "feature")
- Element-Driven: Reference custom elements for consistent characters/objects
- Multi-Shot Storyboards: Create multi-shot videos with per-shot prompts and durations
- Native Audio: Optional simultaneous audio generation (sound: "on")
Parameters
Core Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Conditional | - | Text prompt (max 2500 chars). Supports <<<element_1>>>, <<<image_1>>>, <<<video_1>>> references. Required when multi_shot is false. |
| mode | enum | No | "pro" | "std" (720P, cost-effective) or "pro" (1080P, higher quality) |
| duration | string | No | "5" | Video length in seconds: "3" through "15". Ignored when using video editing (refer_type: "base"). |
| aspect_ratio | enum | No | - | "16:9", "9:16", or "1:1". Required when not using first-frame reference or video editing. |
| sound | enum | No | "off" | "on" or "off". Must be "off" when video_list is provided. |
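
The rules in the table above can be checked on the client before a request is submitted. The following is a minimal sketch, not part of the API: the function name check_core_params and the idea of passing the parameters as a plain dict are assumptions made for illustration, while the limits themselves come from the table. The conditional requirement on aspect_ratio is not checked here.

```python
from typing import Any

MAX_PROMPT_CHARS = 2500
MODES = {"std", "pro"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
DURATIONS = {str(n) for n in range(3, 16)}  # "3" through "15", as strings


def check_core_params(params: dict[str, Any]) -> list[str]:
    """Return the documented core-parameter rules that `params` violates.

    Hypothetical helper: an empty list means no problem was found.
    """
    problems: list[str] = []
    prompt = params.get("prompt")

    # prompt is only mandatory outside multi-shot mode.
    if not params.get("multi_shot", False) and not prompt:
        problems.append("prompt is required when multi_shot is false")
    if isinstance(prompt, str) and len(prompt) > MAX_PROMPT_CHARS:
        problems.append("prompt exceeds 2500 characters")

    # Missing values fall back to the documented defaults.
    if params.get("mode", "pro") not in MODES:
        problems.append('mode must be "std" or "pro"')
    if params.get("duration", "5") not in DURATIONS:
        problems.append('duration must be a string from "3" to "15"')

    aspect_ratio = params.get("aspect_ratio")
    if aspect_ratio is not None and aspect_ratio not in ASPECT_RATIOS:
        problems.append('aspect_ratio must be "16:9", "9:16", or "1:1"')

    sound = params.get("sound", "off")
    if sound not in {"on", "off"}:
        problems.append('sound must be "on" or "off"')
    elif sound == "on" and params.get("video_list"):
        problems.append('sound must be "off" when video_list is provided')

    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.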
Image & Video Input Parameters (JSON type)
image_list (json, optional)
Reference image list. Images can serve as start/end frames or as element/scene/style references. Types: "first_frame", "end_frame". End frame requires a first frame. Formats: .jpg/.jpeg/.png, max 10MB, min 300px, aspect ratio 1:2.5–2.5:1.
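
The image constraints above lend themselves to a local pre-check. This sketch assumes that "10MB" means 10 × 1024 × 1024 bytes, that the 300px minimum applies to both sides, and that the aspect ratio is width:height; none of these readings is confirmed by the text above, and check_image is a hypothetical helper name.

```python
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MIN_SIDE_PX = 300                   # assumption: applies to width and height
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png")


def check_image(filename: str, width: int, height: int, size_bytes: int) -> list[str]:
    """Return the documented image constraints that this image violates."""
    problems: list[str] = []

    if not filename.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append("format must be .jpg, .jpeg, or .png")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append("file is larger than 10MB")
    if width <= 0 or height <= 0:
        problems.append("width and height must be positive")
        return problems
    if min(width, height) < MIN_SIDE_PX:
        problems.append("each side must be at least 300px")

    # 1:2.5 <= width:height <= 2.5:1, written with integers to avoid
    # floating-point rounding at the boundaries.
    if width * 5 < height * 2 or width * 2 > height * 5:
        problems.append("aspect ratio must be between 1:2.5 and 2.5:1")

    return problems
```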
video_list (json, optional)
Reference video. refer_type: "base" (video editing) or "feature" (style/motion reference). keep_original_sound: "yes" or "no". When video_list is provided, sound must be "off". Formats: .mp4/.mov, 3–10s, 720–2160px, 24–60fps, max 200MB.
element_list (json, optional)
Reference elements. Supports three formats that can be mixed in the same list:
Mode 1. Existing elements (by ID): reference a previously created element by its element_id.
Mode 2. Inline image elements: provide frontal_image and refer_images directly. The system auto-creates the element via the Create Element API before generating the video.
Mode 3. Inline video elements: provide refer_videos directly. The system auto-creates the element via the Create Element API.
Inline elements also accept the optional fields element_name, element_description, tag_list, and element_voice_id. If element_name or element_description is omitted, it is auto-generated.
Combined count of images + elements must not exceed 7 (or 4 when video_list is present).
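
To illustrate how the three formats can be told apart in a mixed list, here is a small sketch. It assumes each element_list entry is a JSON object keyed by the field names mentioned above; the URLs, the ID value, and the list shape of refer_videos are placeholders, and classify_element is a hypothetical helper.

```python
from typing import Any


def classify_element(entry: dict[str, Any]) -> str:
    """Name the element_list format an entry uses, based on its keys."""
    if "element_id" in entry:
        return "existing"
    if "frontal_image" in entry and "refer_images" in entry:
        return "inline_image"
    if "refer_videos" in entry:
        return "inline_video"
    raise ValueError("entry matches none of the three element formats")


# A mixed list: one pre-created element and two inline ones.
# All values are placeholders, not real IDs or assets.
element_list = [
    {"element_id": "example-element-id"},
    {
        "frontal_image": "https://example.com/front.png",
        "refer_images": ["https://example.com/side.png"],
        "element_name": "Hero",
    },
    {"refer_videos": ["https://example.com/character.mp4"]},
]

kinds = [classify_element(entry) for entry in element_list]
```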
Multi-Shot Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| multi_shot | boolean | No | true to enable multi-shot mode. When true, prompt is ignored; use multi_prompt instead. |
| shot_type | enum | No | "customize". Required when multi_shot is true. |
multi_prompt (json, optional)
Per-shot storyboard info. Up to 6 shots, each prompt max 512 chars. Shot durations must sum to total duration. Required when multi_shot is true and shot_type is "customize".
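
The storyboard rules above (at most 6 shots, 512 characters per prompt, durations summing to the total) can be sketched as a check. The per-shot keys "prompt" and "duration" are assumptions for illustration, since the exact shape of a multi_prompt entry is not given here, and check_multi_shot is a hypothetical helper.

```python
from typing import Any

MAX_SHOTS = 6
MAX_SHOT_PROMPT_CHARS = 512


def check_multi_shot(shots: list[dict[str, Any]], total_duration: str) -> list[str]:
    """Return the documented multi-shot rules that a storyboard violates.

    Assumes each shot is a dict with hypothetical "prompt" and "duration" keys.
    """
    problems: list[str] = []

    if not 1 <= len(shots) <= MAX_SHOTS:
        problems.append("a storyboard needs between 1 and 6 shots")

    for number, shot in enumerate(shots, start=1):
        if len(shot.get("prompt", "")) > MAX_SHOT_PROMPT_CHARS:
            problems.append(f"shot {number}: prompt exceeds 512 characters")

    # Durations are strings in the API, so convert before summing.
    shot_total = sum(int(shot.get("duration", "0")) for shot in shots)
    if shot_total != int(total_duration):
        problems.append(
            f"shot durations sum to {shot_total}, expected {total_duration}"
        )

    return problems
```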
Other Parameters
watermark_info (json, optional)
Whether to generate watermarked results simultaneously.
Limits on Combined References
| Scenario | Max images + elements |
|---|---|
| No reference video | 7 |
| With reference video | 4 (video elements not supported) |
| With first/last frame video | 3 elements |
| More than 2 images | End frame not supported |
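
As a rough illustration, the first, second, and fourth rows of this table can be expressed as a single check. The third row is left out because its exact trigger is not spelled out here. The function name and its arguments are hypothetical, and element_count is assumed to include video elements.

```python
def check_reference_limits(
    image_count: int,
    element_count: int,
    has_video: bool,
    video_element_count: int = 0,
    has_end_frame: bool = False,
) -> list[str]:
    """Return the combined-reference limits that a request would exceed."""
    problems: list[str] = []

    # 7 images + elements without a reference video, 4 with one.
    limit = 4 if has_video else 7
    if image_count + element_count > limit:
        problems.append(f"images + elements must not exceed {limit}")

    if has_video and video_element_count > 0:
        problems.append("video elements are not supported with a reference video")

    if image_count > 2 and has_end_frame:
        problems.append("end frame is not supported with more than 2 images")

    return problems
```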
Submit Request Examples
All examples use POST /requests with the following structure:
1. Text-to-Video (Basic)
Generate a video from a text prompt.
2. Text-to-Video with Native Audio
Generate a video with synchronized audio.
3. Image-to-Video (First Frame Only)
Use a single image as the starting frame.
4. Image-to-Video (First Frame + End Frame)
Guide generation with both a start and end image for smooth transitions.
5. Video Editing (refer_type: "base")
Edit an existing video. The output duration matches the input video. Note: sound must be "off".
6. Feature Reference Video
Use a video as a style/motion reference (not for editing).
7. Element-Driven Generation
Reference a custom element created via kling-create-element. Use the <<<element_N>>> syntax in the prompt, where N corresponds to the order in element_list.
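
Because the placeholder number depends on list order, it is easy for a prompt and its element_list to drift apart. The sketch below builds placeholders and reports any that point past the end of the list. It assumes N is 1-based, which matches the <<<element_1>>> form used in this guide; both function names are hypothetical.

```python
import re

ELEMENT_PLACEHOLDER = re.compile(r"<<<element_(\d+)>>>")


def element_placeholder(position: int) -> str:
    """Build the prompt placeholder for the element at a 1-based position."""
    return f"<<<element_{position}>>>"


def unresolved_element_refs(prompt: str, element_list: list) -> list[int]:
    """Return placeholder numbers in `prompt` with no matching list entry."""
    used = {int(number) for number in ELEMENT_PLACEHOLDER.findall(prompt)}
    return sorted(n for n in used if not 1 <= n <= len(element_list))


# Placeholder IDs, for illustration only.
elements = [{"element_id": "first-id"}, {"element_id": "second-id"}]
prompt = (
    f"{element_placeholder(1)} walks through a market "
    f"while {element_placeholder(2)} follows."
)
missing = unresolved_element_refs(prompt, elements)
```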
8. Multiple Elements
Reference multiple elements in a single generation.
9. Element + First Frame Image
Combine a custom element with a starting frame image.
10. Element + Feature Reference Video
Combine a custom element with a video style reference. Note: combined images + elements must not exceed 4 when a video is present.
11. Multi-Shot Storyboard
Create a multi-shot video with custom per-shot prompts. The sum of shot durations must equal the total duration.
12. Multi-Shot with Elements
Combine multi-shot storyboarding with custom elements.
13. Inline Image Element (Auto-Created)
Instead of creating elements separately, provide image references directly in element_list. The system auto-creates the element before generating the video.
14. Inline Video Element (Auto-Created)
Provide a video reference directly to auto-create a video character element.
15. Mixed Existing and Inline Elements
Combine pre-created elements (by ID) with inline elements in the same request.
16. Watermark Enabled
Kling Create Element (kling-create-element)
Overview
Create reusable custom elements (characters, animals, items, costumes, scenes, effects) that can be referenced in Kling V3 Omni video generation. Elements maintain a consistent appearance across multiple generations. There are two ways to create an element:
| Method | reference_type | Input | Notes |
|---|---|---|---|
| Multi-Image Element | "image_refer" | 1 frontal image + 1–3 additional reference images | Broader availability across models |
| Video Character Element | "video_refer" | 1 reference video (3–8s) | Audio with human voice triggers voice customization |
Parameters
Core Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| element_name | string | Yes | Element name, max 20 characters. |
| element_description | string | Yes | Element description, max 100 characters. |
| reference_type | enum | Yes | "image_refer" (multi-image) or "video_refer" (video character). |
Image Reference Parameters (when reference_type is "image_refer")
| Parameter | Type | Required | Description |
|---|---|---|---|
| frontal_image | image | Yes | Frontal reference image URL. Formats: .jpg/.jpeg/.png, max 10MB, min 300px, aspect ratio 1:2.5–2.5:1. |
| refer_images | image[] | Yes | 1–3 additional reference images from different angles. Same format constraints as frontal_image. |
Video Reference Parameters (when reference_type is "video_refer")
| Parameter | Type | Required | Description |
|---|---|---|---|
| refer_video | video | Yes | Reference video URL. Formats: .mp4/.mov, 1080P, 3–8 seconds, 16:9 or 9:16, max 200MB. Audio with human voice triggers voice customization. |
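
The required fields differ by reference_type, which a short check can make explicit. This is a sketch under the assumption that the parameters are passed as a plain dict keyed by the names in the tables above; check_create_element is a hypothetical helper, and it does not inspect the media files themselves.

```python
from typing import Any

MAX_NAME_CHARS = 20
MAX_DESCRIPTION_CHARS = 100


def check_create_element(params: dict[str, Any]) -> list[str]:
    """Return the documented Create Element rules that `params` violates."""
    problems: list[str] = []

    name = params.get("element_name") or ""
    if not name or len(name) > MAX_NAME_CHARS:
        problems.append("element_name is required, max 20 characters")

    description = params.get("element_description") or ""
    if not description or len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("element_description is required, max 100 characters")

    reference_type = params.get("reference_type")
    if reference_type == "image_refer":
        if not params.get("frontal_image"):
            problems.append("frontal_image is required for image_refer")
        refer_images = params.get("refer_images") or []
        if not 1 <= len(refer_images) <= 3:
            problems.append("refer_images must contain 1 to 3 images")
    elif reference_type == "video_refer":
        if not params.get("refer_video"):
            problems.append("refer_video is required for video_refer")
    else:
        problems.append('reference_type must be "image_refer" or "video_refer"')

    return problems
```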
Optional Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| element_voice_id | string | No | Bind an existing voice from the voice library. Only for video-customized elements. |
tag_list (json, optional)
Tags for the element. One element can have multiple tags.
Available Tags
| Tag ID | Name |
|---|---|
| o_101 | Hottest |
| o_102 | Character |
| o_103 | Animal |
| o_104 | Item |
| o_105 | Costume |
| o_106 | Scene |
| o_107 | Effect |
| o_108 | Others |
Submit Request Examples
All examples use POST /requests:
1. Image Reference: Character Element
Create a character element from a frontal photo and side/detail reference images.
2. Image Reference: Animal Element
3. Image Reference: Item/Costume Element
4. Image Reference: Scene Element
5. Video Reference: Character with Voice Customization
Create a character element from video. If the video contains human speech, voice customization is automatically triggered.
6. Video Reference: Character with Existing Voice Binding
Bind a voice from the voice library to an element created from video.
7. Multiple Tags
End-to-End Workflow: Create Element then Generate Video
Step 1: Create a character element
Step 2: Wait for element creation to complete
Poll GET /requests/abc-123 until status is "success". The outcome will contain the created element with its element_id.
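
A polling loop for this step might look like the sketch below. To keep it independent of any HTTP client, the GET call is passed in as a function; fetch_status, the "pending" status in the usage lines, and the top-level "status" and "outcome" keys are assumptions about the response shape, not confirmed field names. Failure statuses are not covered in this guide, so the loop only distinguishes success from running out of attempts.

```python
import time
from typing import Any, Callable


def poll_until_success(
    fetch_status: Callable[[str], dict[str, Any]],
    request_id: str,
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> dict[str, Any]:
    """Call fetch_status(request_id) until it reports status "success".

    fetch_status is expected to perform GET /requests/{request_id} and
    return the decoded JSON body (hypothetical shape, see lead-in).
    """
    for attempt in range(max_attempts):
        body = fetch_status(request_id)
        if body.get("status") == "success":
            return body
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"request {request_id} did not succeed in {max_attempts} attempts")


# Usage with a stand-in for the real HTTP call.
fake_responses = iter([
    {"status": "pending"},
    {"status": "success", "outcome": {"element_id": "example-element-id"}},
])
result = poll_until_success(lambda rid: next(fake_responses), "abc-123", interval_s=0)
element_id = result["outcome"]["element_id"]
```

The same function can be reused for the video generation request in Step 4.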
Step 3: Use the element in V3 Omni video generation
Step 4: Retrieve the generated video
Poll GET /requests/{request_id} until status is "success". The outcome will contain: