Kling Text-to-Video

Model: Kling Text-to-Video (GMI API)
Description: Generates a video from a text prompt using Kling models (Standard/Pro, multiple versions). Supports duration control, CFG scale, and negative prompts.

Inputs

Required

  • prompt (STRING)
    Text prompt describing the video content. Max 2500 characters.
  • negative_prompt (STRING)
    Text describing what should be avoided in the generated video. Max 2500 characters.
  • cfg_scale (FLOAT)
    Controls how strongly the model follows the prompt.
    Options: 0.0, 0.5, 1.0
  • aspect_ratio (STRING)
    Defines output video shape. Options: 16:9, 9:16, 1:1
  • duration (STRING)
    Video length in seconds. Options: 5, 10
  • model (STRING)
    Kling model variant used for generation.

Hidden

  • unique_id (STRING)
    Internal execution tracking ID for async requests.

Outputs

  • VIDEO (VIDEO)
    Generated video tensor output.
  • VIDEO_URL (STRING)
    Public URL where generated video is hosted.
  • FILE_PATH (STRING)
    Local saved file path of generated video.
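
The documented limits for this node can be checked before a workflow is queued. What follows is a minimal sketch, not part of the GMI node pack: `validate_kling_t2v` and the constant names are hypothetical, and only the character limit and option lists are taken from the inputs above. The `model` input is not checked because its options are not listed here.

```python
# Hypothetical pre-flight check; only the limits below come from the input list above.
KLING_T2V_OPTIONS = {
    "cfg_scale": (0.0, 0.5, 1.0),
    "aspect_ratio": ("16:9", "9:16", "1:1"),
    "duration": ("5", "10"),  # duration is documented as a STRING input
}
MAX_PROMPT_CHARS = 2500


def validate_kling_t2v(inputs: dict) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    for field in ("prompt", "negative_prompt"):
        if len(inputs.get(field, "")) > MAX_PROMPT_CHARS:
            problems.append(f"{field} exceeds {MAX_PROMPT_CHARS} characters")
    if not inputs.get("prompt"):
        problems.append("prompt is empty")
    for field, allowed in KLING_T2V_OPTIONS.items():
        if inputs.get(field) not in allowed:
            problems.append(f"{field} must be one of {allowed}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid field in a single pass.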

Kling Image2Video

Model: Kling Image-to-Video (GMI API)
Description: Generates a video from a reference image and text prompt.

Inputs

Required

  • prompt (STRING)
    Text prompt describing motion and scene. Max 2500 characters.
  • negative_prompt (STRING)
    Defines unwanted elements in output video. Max 2500 characters.
  • cfg_scale (FLOAT)
    Controls prompt adherence strength.
  • duration (STRING)
    Video length in seconds. Options: 5, 10
  • model (STRING)
    Kling model version used for generation.

Optional

  • start_frame_image (IMAGE)
    Image tensor used as the starting frame of the video.
  • start_frame (STRING)
    URL of reference image used as starting frame.

Hidden

  • unique_id (STRING)
    Internal request tracking identifier.

Outputs

  • VIDEO (VIDEO)
    Generated video tensor output.
  • VIDEO_URL (STRING)
    Public URL to generated video.
  • FILE_PATH (STRING)
    Local file path where video is saved.

Kling 2_6 Motion Control

Model: Kling 2.6 Motion Control
Description: Transfers motion from a reference video to a character image.

Inputs

  • prompt (STRING)
    Optional text guiding style, lighting, or environment. Motion is derived from reference video.
  • video_url (STRING)
    Reference video used for motion extraction. Must contain human subject motion.
  • image (IMAGE)
    Required reference image used for character appearance.
  • image_url (STRING)
    URL version of reference image.
  • character_orientation (STRING)
    Controls motion alignment mode. Options: video, image
  • mode (STRING)
    Quality mode selection. Options: std, pro
  • keep_original_sound (STRING)
    Determines whether original audio is preserved. Options: yes, no

Outputs

  • VIDEO (VIDEO)
    Generated motion-transfer video.
  • VIDEO_URL (STRING)
    Public URL of generated video.
  • FILE_PATH (STRING)
    Local storage path of output video.

Kling Reference To Video

Model: Kling Reference-to-Video
Description: Generates video using reference video and optional image conditioning.

Inputs

  • prompt (STRING)
    Optional guidance prompt for style or scene control.
  • video_url (STRING)
    Reference motion video input.
  • aspect_ratio (STRING)
    Output video shape. Options: 16:9, 9:16, 1:1
  • image (IMAGE)
    Optional reference image(s) for conditioning.
  • image_url (STRING)
    URL version of reference images.
  • duration (STRING)
    Output video duration in seconds.
  • mode (STRING)
    Quality mode. Options: std, pro
  • keep_original_sound (STRING)
    Preserves audio from reference video. Options: yes, no

Outputs

  • VIDEO (VIDEO)
    Generated video tensor.
  • VIDEO_URL (STRING)
    Hosted video URL.
  • FILE_PATH (STRING)
    Local saved file path.

Kling V3 Image To Video

Model: Kling V3 Image-to-Video
Description: Generates video from image and prompt with optional tail frame support.

Inputs

  • prompt (STRING)
    Main motion and scene description.
  • image (IMAGE)
    Required input image used as starting frame.
  • image_url (STRING)
    URL version of input image.
  • negative_prompt (STRING)
    Elements to exclude from generated video.
  • image_tail (IMAGE)
    Optional end-frame image for motion guidance.
  • image_tail_url (STRING)
    URL version of end-frame image.
  • duration (STRING)
    Video length in seconds. Range: 3–15.
  • sound (STRING)
    Controls audio generation. Options: on, off

Outputs

  • VIDEO (VIDEO)
    Generated video output.
  • VIDEO_URL (STRING)
    Public URL of generated video.
  • FILE_PATH (STRING)
    Local saved video path.

Kling V3 Text To Video

Model: Kling V3 Text-to-Video
Description: Generates video purely from text prompt.

Inputs

  • prompt (STRING)
    Primary text prompt describing video content.
  • negative_prompt (STRING)
    Elements to exclude from video.
  • duration (STRING)
    Video length in seconds. Range: 3–15.
  • aspect_ratio (STRING)
    Output video shape. Options: 16:9, 9:16, 1:1
  • sound (STRING)
    Audio generation toggle. Options: on, off

Outputs

  • VIDEO (VIDEO)
    Generated video tensor output.
  • VIDEO_URL (STRING)
    Public video URL.
  • FILE_PATH (STRING)
    Local saved file path.

Kling Edit Video

Model: Kling O1 Edit Video
Description: Edits existing video using text instructions and optional reference images.

Inputs

  • prompt (STRING)
    Instruction describing how video should be edited.
  • video_url (STRING)
    Input video to be edited.
  • image (IMAGE)
    Optional reference images (up to 4).
  • image_url (STRING)
    URL version of reference images.
  • aspect_ratio (STRING)
    Output video format.
  • duration (STRING)
    Output video duration in seconds.
  • mode (STRING)
    Quality mode. Options: std, pro

Outputs

  • VIDEO (VIDEO)
    Edited video output tensor.
  • VIDEO_URL (STRING)
    Public URL of edited video.
  • FILE_PATH (STRING)
    Local saved file path.

Kling 3 Motion Control

Model: kling-3-motion-control
Description: Transfers motion from a reference video to a character image.

Inputs

  • video_url (STRING)
    Reference video used for motion extraction (3–30s).
  • image (IMAGE)
    Required character reference image used as the motion target.
  • prompt (STRING)
    Optional text prompt to guide style, lighting, or environment of the result.
  • image_url (STRING)
    URL version of the reference image.
  • character_orientation (STRING)
    Controls alignment between character and motion source. Options: video, image.
  • mode (STRING)
    Quality mode selection affecting output fidelity. Options: std, pro.
  • keep_original_sound (STRING)
    Determines whether original audio is preserved. Options: yes, no.

Outputs

  • video (VIDEO)
    Generated motion-transfer video.
  • video_url (STRING)
    Publicly accessible URL of the generated video.
  • file_path (STRING)
    Local filesystem path where the video is saved.

LLM Node

Model: configurable (e.g. moonshotai/Kimi-K2.5)
Description: Calls large language models with optional multimodal inputs (text, image, video).

Inputs

Required

  • model (STRING)
    Identifier of the LLM model used for generation.
  • prompt (STRING)
    User input text or instruction sent to the model.

Optional

  • temperature (FLOAT, default: 1.0)
    Controls randomness of output generation (0 = deterministic, 1 = creative).
  • max_tokens (INT, default: 5120)
    Maximum number of tokens the model is allowed to generate.
  • system_prompt (STRING, default: “You are a helpful AI assistant. Provide direct, concise answers without showing your thinking process.”)
    Defines model behavior and response style.
  • image_url (STRING)
    Optional image input for vision-capable models.
  • video_url (STRING)
    Optional video input for multimodal understanding.

Outputs

  • content (STRING)
    Generated response text from the language model.
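
The required inputs and documented defaults can be combined as in the sketch below. `build_llm_inputs` is a hypothetical helper, not the node's own code; the default values are copied from the Optional inputs above, and dropping unset optional inputs is an assumption about what is convenient for the caller.

```python
# Defaults copied from the Optional inputs above; the helper itself is hypothetical.
LLM_NODE_DEFAULTS = {
    "temperature": 1.0,
    "max_tokens": 5120,
    "system_prompt": (
        "You are a helpful AI assistant. Provide direct, concise answers "
        "without showing your thinking process."
    ),
}


def build_llm_inputs(model: str, prompt: str, **overrides) -> dict:
    """Merge required inputs, documented defaults, and caller overrides."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    inputs = {"model": model, "prompt": prompt, **LLM_NODE_DEFAULTS}
    # Skip None so unset optional inputs (image_url, video_url) are omitted.
    inputs.update({k: v for k, v in overrides.items() if v is not None})
    return inputs
```

For example, `build_llm_inputs("moonshotai/Kimi-K2.5", "Summarize this clip.", temperature=0.2)` keeps the default `max_tokens` and `system_prompt` while lowering the temperature.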

Ltx2 Fast Image To Video

Model: ltx-2-fast-image-to-video
Description: Converts a static image into a motion video with optional audio generation.

Inputs

Required

  • prompt (STRING)
    Text description defining motion, style, and animation behavior.
  • duration (INT, default: 6)
    Length of the generated video in seconds.
  • resolution (STRING, default: 1920x1080)
    Output video resolution. Options: 1920x1080, 2560x1440, 3840x2160.
  • image (IMAGE)
    Input image used as the base frame for animation.

Optional

  • image_url (STRING)
    URL version of the input image.
  • fps (INT, default: 25)
    Frame rate of generated video. Options: 25, 50.
  • generate_audio (BOOLEAN, default: True)
    Enables or disables AI-generated audio.
  • camera_motion (STRING)
    Defines camera movement style applied to video generation.

Outputs

  • video (VIDEO)
    Generated animated video tensor.
  • video_url (STRING)
    Public URL of the generated video.
  • file_path (STRING)
    Local path where the video is stored.

Ltx2 Pro ImageToVideo

Model: ltx-2-pro-image-to-video
Description: High-fidelity image-to-video generation with improved visual quality and stability.

Inputs

  • Same as Ltx2 Fast Image To Video (GMILtx2FastImageToVideoNode).
    (All inputs have the same meaning; generation uses a higher-quality backend.)

Outputs

  • video (VIDEO)
    High-quality generated video.
  • video_url (STRING)
    Public URL for the generated video.
  • file_path (STRING)
    Local saved video path.

Ltx2 Fast TextToVideo

Model: ltx-2-fast-text-to-video
Description: Generates video directly from a text prompt.

Inputs

Required

  • prompt (STRING)
    Text description of the desired video.

Optional

  • duration (INT, default: 6)
    Length of video in seconds.
  • resolution (STRING, default: 1920x1080)
    Output resolution of the video.
  • fps (INT, default: 25)
    Frame rate of generated video.
  • generate_audio (BOOLEAN, default: True)
    Enables AI-generated audio track.
  • camera_motion (STRING)
    Defines camera movement behavior in generated video.

Outputs

  • video (VIDEO)
    Generated video output.
  • video_url (STRING)
    Public URL of video.
  • file_path (STRING)
    Local storage path.

Ltx2 Pro TextToVideo

Model: ltx-2-pro-text-to-video
Description: Premium-quality text-to-video generation with enhanced realism and detail.

Inputs

  • Same as Ltx2 Fast TextToVideo (GMILtx2FastTextToVideoNode).
    (Identical parameters; generation uses the higher-quality Pro model.)

Outputs

  • video (VIDEO)
    Generated video output.
  • video_url (STRING)
    Public URL of the generated video.
  • file_path (STRING)
    Local saved file path.

Ltx2 Pro Retake

Model: ltx-2-pro-retake
Description: Edits a specific segment of an existing video (audio, video, or both).

Inputs

Required

  • video_url (STRING)
    Source video to be edited.
  • start_time (FLOAT, default: 0)
    Start time of edit segment in seconds.
  • duration (INT, default: 5)
    Length of segment to modify.

Optional

  • prompt (STRING)
    Instruction describing how the segment should be changed.
  • mode (STRING, default: replace_audio_and_video)
    Edit operation mode. Options: replace_audio_and_video, replace_audio, replace_video.

Outputs

  • video (VIDEO)
    Edited video output.
  • video_url (STRING)
    Public URL of edited video.
  • file_path (STRING)
    Local saved path.
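
The edited segment is defined by `start_time` and `duration`, so its end point is simply their sum. The sketch below illustrates that arithmetic together with a check of the documented `mode` options. `retake_window` is a hypothetical helper, and the rejection of negative start times and non-positive durations is an assumption, not a documented rule.

```python
RETAKE_MODES = ("replace_audio_and_video", "replace_audio", "replace_video")


def retake_window(start_time: float = 0.0, duration: int = 5,
                  mode: str = "replace_audio_and_video") -> tuple[float, float]:
    """Return the (start, end) of the edited segment in seconds."""
    if mode not in RETAKE_MODES:
        raise ValueError(f"mode must be one of {RETAKE_MODES}")
    # Assumed sanity checks; the source does not state these bounds.
    if start_time < 0 or duration <= 0:
        raise ValueError("start_time must be >= 0 and duration must be positive")
    return (float(start_time), float(start_time) + duration)
```

With the defaults, the segment covers the first five seconds of the source video; `retake_window(2.5, 5)` covers 2.5 s to 7.5 s.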

Ltx2 Pro Audio-to-Video

Model: ltx-2-pro-audio-to-video
Description: Generates video driven by audio input, optionally guided by image or prompt.

Inputs

Required

  • audio / audio_url (AUDIO or STRING)
    Audio input file (2–20 seconds).

Optional

  • prompt (STRING)
    Text guidance for scene generation.
  • image / image_url (IMAGE or STRING)
    Optional first-frame visual reference.
  • resolution (STRING, default: 1920x1080)
    Output resolution.
  • guidance_scale (FLOAT, default: 5)
    Strength of prompt adherence.

Outputs

  • video (VIDEO)
    Generated video.
  • video_url (STRING)
    Public video URL.
  • file_path (STRING)
    Local saved file path.

Luma Image-to-Video

Model: Luma-Ray2
Description: Generates video from text prompts with optional image conditioning, frame control, and configurable output settings.

Inputs

Required

  • prompt (STRING)
    Text prompt describing video content.

Optional

  • model (STRING, default: Luma-Ray2)
    Model variant used for generation.
  • negative_prompt (STRING, default: "")
    Specifies elements to exclude from output.
  • duration (STRING, default: 5)
    Video length. Options: 5, 9.
  • aspect_ratio (STRING, default: 16:9)
    Output format ratio.
  • resolution (STRING, default: 1080p)
    Output resolution.
  • loop (BOOLEAN, default: False)
    Enables seamless looping.
  • frame0_image_url (STRING)
    First-frame conditioning image.
  • frame1_image_url (STRING)
    Last-frame conditioning image.
  • seed (INT, default: 0)
    Random seed (0 = random).

Outputs

  • video (VIDEO)
    Generated video tensor.
  • video_url (STRING)
    Public URL of generated video.
  • file_path (STRING)
    Local saved video path.

Minimax Hailuo Video

Model: Minimax-Hailuo
Category: Model Library/General Video Models
Description: Generates video using Minimax-Hailuo model (text-to-video or image-to-video).

Inputs

  • prompt_text (STRING, multiline)
    Text prompt describing video content.
  • model (COMBO, default: Minimax-Hailuo-2.3)
    Model variant selection.
  • duration (COMBO, default: 6)
    Video duration in seconds.
  • image (IMAGE, optional)
    Image input for conditioning animation.
  • first_frame_image (STRING, optional)
    URL-based image input.
  • seed (INT, default: 0)
    Random seed for reproducibility.
  • resolution (COMBO, default: 768P)
    Output resolution.
  • prompt_optimizer (BOOLEAN, default: True)
    Enhances prompt understanding and expansion.
  • fast_pretreatment (BOOLEAN, default: False)
    Enables faster preprocessing pipeline.

Outputs

  • VIDEO (VIDEO)
    Generated video output.
  • VIDEO_URL (STRING)
    Public URL of generated video.
  • FILE_PATH (STRING)
    Local saved file path.

Minimax Text To Video

Model: Minimax-Hailuo
Category: Model Library/General Video Models
Description: Generates video from text-only prompts using Minimax-Hailuo.

Inputs

  • prompt_text (STRING)
    Text prompt describing desired video.
  • model (COMBO, default: Minimax-Hailuo-2.3)
    Model variant selection.
  • duration (COMBO, default: 6)
    Video duration.
  • seed (INT, default: 0)
    Random seed.
  • resolution (COMBO, default: 768P)
    Output resolution.
  • prompt_optimizer (BOOLEAN, default: True)
    Improves prompt interpretation.
  • fast_pretreatment (BOOLEAN, default: False)
    Enables faster preprocessing.

Outputs

  • VIDEO (VIDEO)
    Generated video output.
  • VIDEO_URL (STRING)
    Public URL of generated video.
  • FILE_PATH (STRING)
    Local saved file path.

Minimax Image To Video

Model: Minimax-Hailuo
Category: Model Library/General Video Models
Description: Generates video from image + text prompt using Minimax-Hailuo.

Inputs

  • prompt_text (STRING)
    Text describing motion and scene transformation.
  • model (COMBO, default: Minimax-Hailuo-2.3)
    Model variant selection.
  • duration (COMBO, default: 6)
    Video duration.
  • image (IMAGE)
    Input image used for animation.
  • first_frame_image (STRING)
    URL version of input image.
  • seed (INT, default: 0)
    Random seed.
  • resolution (COMBO, default: 768P)
    Output resolution.
  • prompt_optimizer (BOOLEAN, default: True)
    Enhances prompt processing.
  • fast_pretreatment (BOOLEAN, default: False)
    Faster preprocessing mode.

Outputs

  • VIDEO (VIDEO)
    Generated video output.
  • VIDEO_URL (STRING)
    Public URL of generated video.
  • FILE_PATH (STRING)
    Local saved file path.

Pixverse v5_5 t2v

Model: pixverse-v5.5-t2v
Description: Generates a video from a text prompt using Pixverse v5.5.

Inputs

Required

  • prompt (STRING)
    Text description of the video content to generate.

Optional

  • aspect_ratio (STRING, default: 16:9)
    Sets output video shape ratio.
  • duration (STRING, default: 5)
    Length of the video in seconds (5, 8, 10).
  • quality (STRING, default: 540p)
    Output resolution of the video.
  • negative_prompt (STRING)
    Describes elements to avoid in generation.
  • generate_audio_switch (BOOLEAN, default: False)
    Enables AI-generated audio.
  • generate_multi_clip_switch (BOOLEAN, default: False)
    Enables multi-shot / cinematic transitions.
  • thinking_type (STRING, default: auto)
    Controls prompt optimization behavior.
  • seed (INT, default: 0)
    Random seed for reproducibility (0 = random).

Outputs

  • VIDEO (VIDEO)
    Generated video as a binary/video object used in ComfyUI pipelines.
  • VIDEO_URL (STRING)
    Public URL where the generated video is hosted.
  • FILE_PATH (STRING)
    Local filesystem path where the video is saved.

Pixverse v5_5 i2v

Model: pixverse-v5.5-i2v
Description: Generates a video from a single image and optional prompt.

Inputs

Required (one image source required)

  • image (IMAGE)
    Input image from ComfyUI. Used as the primary reference frame.
OR
  • image_url (STRING)
    URL to the reference image (used if IMAGE input is not provided).
  • prompt (STRING)
    Text description guiding motion, style, and animation.

Optional

  • aspect_ratio (STRING, default: 16:9)
    Output video aspect ratio.
  • duration (STRING, default: 5)
    Video length in seconds.
  • quality (STRING, default: 540p)
    Output resolution.
  • negative_prompt (STRING)
    Elements to exclude from generation.
  • generate_audio_switch (BOOLEAN, default: False)
    Enables audio generation.
  • generate_multi_clip_switch (BOOLEAN, default: False)
    Enables cinematic transitions.
  • thinking_type (STRING, default: auto)
    Controls prompt reasoning optimization.
  • seed (INT, default: 0)
    Random seed for reproducibility.

Outputs

  • VIDEO (VIDEO)
    Generated animated video.
  • VIDEO_URL (STRING)
    Public hosted URL of the video.
  • FILE_PATH (STRING)
    Local file path of saved output.
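
The "IMAGE or image_url" rule above (the URL is used only when no IMAGE input is provided) can be expressed as a small selection function. This is a sketch under that reading of the inputs; `pick_image_source` is a hypothetical name and is not part of the node.

```python
from typing import Any, Optional


def pick_image_source(image: Optional[Any] = None, image_url: str = "") -> tuple[str, Any]:
    """Return ("image", tensor) or ("image_url", url), preferring the IMAGE input."""
    if image is not None:
        return ("image", image)
    if image_url.strip():
        return ("image_url", image_url.strip())
    raise ValueError("provide either image or image_url")
```

Raising when both sources are missing surfaces the problem before any request is made, which matches the "one image source required" note on this node.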

Pixverse v5_5 Transition

Model: pixverse-v5.5-transition
Description: Creates a video transition between two images.

Inputs

Required (both frames required)

  • first_frame_image (IMAGE) or first_frame_image_url (STRING)
    Starting frame image for transition.
  • last_frame_image (IMAGE) or last_frame_image_url (STRING)
    Ending frame image for transition.
  • prompt (STRING)
    Text describing how the transition should behave.

Optional

  • duration (STRING, default: 5)
    Video duration in seconds.
  • quality (STRING, default: 540p)
    Output resolution.
  • negative_prompt (STRING)
    Elements to avoid in transition.
  • generate_audio_switch (BOOLEAN, default: False)
    Enables audio generation.
  • seed (INT, default: 0)
    Random seed for reproducibility.

Outputs

  • VIDEO (VIDEO)
    Transition video output.
  • VIDEO_URL (STRING)
    Public URL of generated video.
  • FILE_PATH (STRING)
    Local saved file path.

Pixverse v5_6 t2v

Model: pixverse-v5.6-t2v
Description: Generates a video from a text prompt using Pixverse v5.6.

Inputs

Required

  • prompt (STRING)
    Text prompt describing full video content (max 2048 characters).

Optional

  • aspect_ratio (STRING, default: 16:9)
    Output video aspect ratio.
  • duration (STRING, default: 5)
    Video length in seconds (note: 10s not supported at 1080p).
  • quality (STRING, default: 540p)
    Output resolution.
  • negative_prompt (STRING)
    Content to exclude from generation.
  • generate_audio_switch (BOOLEAN, default: False)
    Enables audio generation.
  • style (STRING, default: none)
    Visual style preset (none, anime, 3d_animation, clay, comic, cyberpunk).
  • thinking_type (STRING, default: auto)
    Controls prompt reasoning optimization.
  • seed (INT, default: 0)
    Random seed.

Outputs

  • VIDEO (VIDEO)
    Generated video file object.
  • VIDEO_URL (STRING)
    Hosted video URL.
  • FILE_PATH (STRING)
    Local saved video path.
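
Three constraints stated above lend themselves to a quick check: the 2048-character prompt limit, the note that 10 s is not supported at 1080p, and the list of style presets. The following is a minimal sketch with a hypothetical function name. It assumes `duration` and `quality` are passed as the strings "10" and "1080p", which the source implies but does not spell out.

```python
PIXVERSE_V56_STYLES = ("none", "anime", "3d_animation", "clay", "comic", "cyberpunk")


def check_pixverse_v56_t2v(prompt: str, duration: str = "5",
                           quality: str = "540p", style: str = "none") -> list[str]:
    """Return a list of problems; an empty list means no documented rule is violated."""
    problems = []
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 2048:
        problems.append("prompt exceeds 2048 characters")
    # Assumed string spellings for the 10-second / 1080p combination.
    if duration == "10" and quality == "1080p":
        problems.append("10s duration is not supported at 1080p")
    if style not in PIXVERSE_V56_STYLES:
        problems.append(f"style must be one of {PIXVERSE_V56_STYLES}")
    return problems
```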

Pixverse v5_6 i2v

Model: pixverse-v5.6-i2v
Description: Generates a video from a single image using Pixverse v5.6.

Inputs

Required (one image source required)

  • image (IMAGE)
    ComfyUI image input used as the main reference frame.
OR
  • image_url (STRING)
    URL of the reference image.
  • prompt (STRING)
    Text prompt guiding animation and scene behavior.

Optional

  • aspect_ratio (STRING, default: 16:9)
    Output aspect ratio.
  • duration (STRING, default: 5)
    Video duration in seconds.
  • quality (STRING, default: 540p)
    Output resolution.
  • negative_prompt (STRING)
    Elements to exclude.
  • generate_audio_switch (BOOLEAN, default: False)
    Enables audio generation.
  • style (STRING, default: none)
    Visual style preset.
  • thinking_type (STRING, default: auto)
    Controls prompt reasoning behavior.
  • seed (INT, default: 0)
    Random seed.

Outputs

  • VIDEO (VIDEO)
    Generated animated video.
  • VIDEO_URL (STRING)
    Public URL to video.
  • FILE_PATH (STRING)
    Local saved file path.

Pixverse v5_6 Transition

Model: pixverse-v5.6-transition
Description: Generates a transition video between two images using Pixverse v5.6.

Inputs

Required (both frames required)

  • first_frame_image (IMAGE) or first_frame_image_url (STRING)
    Starting frame image.
  • last_frame_image (IMAGE) or last_frame_image_url (STRING)
    Ending frame image.
  • prompt (STRING)
    Text describing transition behavior.

Optional

  • duration (STRING, default: 5)
    Video duration.
  • quality (STRING, default: 540p)
    Output resolution.
  • negative_prompt (STRING)
    Elements to avoid.
  • generate_audio_switch (BOOLEAN, default: False)
    Enables audio generation.
  • seed (INT, default: 0)
    Random seed.

Outputs

  • VIDEO (VIDEO)
    Generated transition video.
  • VIDEO_URL (STRING)
    Public video URL.
  • FILE_PATH (STRING)
    Local file path.

Reve Create

Model: reve-create-20250915
Description:
Generates an image from a text prompt using the Reve text-to-image model.

Inputs

  • prompt (required)
    • Type: STRING
    • Description: Text prompt describing the image to generate.
  • aspect_ratio (optional)
    • Type: ENUM
    • Options: 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 1:1
    • Default: 3:2
    • Description: Controls the aspect ratio of the generated image.

Outputs

  • image
    • Type: IMAGE
    • Description: Generated image tensor output for ComfyUI workflows.
  • image_url
    • Type: STRING
    • Description: Public URL of the generated image.
  • file_name
    • Type: STRING
    • Description: Local saved filename of the generated image.

Reve Edit

Model: reve-edit-20250915
Description:
Edits an image using a reference image and a text prompt.

Inputs

  • prompt (required)
    • Type: STRING
    • Description: Text prompt describing how the image should be edited.
  • image (optional)
    • Type: IMAGE
    • Description: Reference image from ComfyUI used as input.
  • image_url (optional)
    • Type: STRING
    • Description: URL of a reference image (used if IMAGE input is not provided).
  • aspect_ratio (optional)
    • Type: ENUM
    • Options: "", 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 1:1
    • Description: Controls output image aspect ratio.

Outputs

  • image
    • Type: IMAGE
    • Description: Edited image tensor output.
  • image_url
    • Type: STRING
    • Description: Public URL of the edited image.
  • file_name
    • Type: STRING
    • Description: Local saved filename.

Reve Edit Fast

Model: reve-edit-fast-20251030
Description:
Fast version of image editing using a reference image and text prompt.

Inputs

  • prompt (required)
    • Type: STRING
    • Description: Text prompt guiding the image edit.
  • image (optional)
    • Type: IMAGE
    • Description: Reference image from ComfyUI.
  • image_url (optional)
    • Type: STRING
    • Description: URL of reference image if IMAGE input is not used.
  • aspect_ratio (optional)
    • Type: ENUM
    • Options: "", 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 1:1
    • Description: Controls output image aspect ratio.

Outputs

  • image
    • Type: IMAGE
    • Description: Fast-generated edited image tensor.
  • image_url
    • Type: STRING
    • Description: Public URL of the generated image.
  • file_name
    • Type: STRING
    • Description: Local saved filename.

Reve Remix

Model: reve-remix-20250915
Description:
Generates an image from 1–6 reference images combined with a text prompt.

Inputs

  • prompt (required)
    • Type: STRING
    • Description: Text prompt guiding remix generation.
  • image (optional)
    • Type: IMAGE
    • Description: 1–6 reference images from ComfyUI.
  • image_url (optional)
    • Type: STRING
    • Description: 1–6 reference image URLs (comma-separated).
  • aspect_ratio (optional)
    • Type: ENUM
    • Options: "", 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 1:1
    • Description: Controls output image aspect ratio.

Outputs

  • image
    • Type: IMAGE
    • Description: Generated remixed image tensor.
  • image_url
    • Type: STRING
    • Description: Public URL of generated image(s).
  • file_name
    • Type: STRING
    • Description: Local saved filename.
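
Because `image_url` accepts 1–6 comma-separated URLs, it can help to split and count them before running the node. The sketch below shows one way to do that; `parse_reference_urls` is a hypothetical helper, and ignoring empty entries (for example from a trailing comma) is an assumption.

```python
def parse_reference_urls(image_url: str, max_refs: int = 6) -> list[str]:
    """Split a comma-separated string into a cleaned list of 1..max_refs URLs."""
    # Empty fragments are dropped so a trailing comma does not count as a URL.
    urls = [part.strip() for part in image_url.split(",") if part.strip()]
    if not 1 <= len(urls) <= max_refs:
        raise ValueError(f"expected 1 to {max_refs} reference URLs, got {len(urls)}")
    return urls
```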

Reve Remix Fast

Model: reve-remix-fast-20251030
Description:
Fast version of multi-image remix generation using 1–6 reference images.

Inputs

  • prompt (required)
    • Type: STRING
    • Description: Text prompt guiding remix generation.
  • image (optional)
    • Type: IMAGE
    • Description: 1–6 reference images from ComfyUI.
  • image_url (optional)
    • Type: STRING
    • Description: 1–6 reference image URLs.
  • aspect_ratio (optional)
    • Type: ENUM
    • Options: "", 16:9, 9:16, 3:2, 2:3, 4:3, 3:4, 1:1
    • Description: Controls output image aspect ratio.

Outputs

  • image
    • Type: IMAGE
    • Description: Fast-generated remixed image tensor.
  • image_url
    • Type: STRING
    • Description: Public URL of generated image(s).
  • file_name
    • Type: STRING
    • Description: Local saved filename.

SkyReels Text-to-Video

Model: skyreels-v4-text-to-video
Description: Generates cinematic video from a text prompt using SkyReels V4.

Inputs

  • prompt (required)
    Type: STRING
    Description: Text prompt describing the video content.
  • duration (optional)
    Type: INT
    Default: 5
    Range: 3–15
    Description: Duration of the generated video in seconds.
  • aspect_ratio (optional)
    Type: ENUM
    Options: 16:9, 4:3, 1:1, 9:16, 3:4
    Default: 16:9
    Description: Controls the aspect ratio of the output video.
  • sound (optional)
    Type: BOOLEAN
    Default: False
    Description: Enables audio generation for the video.
  • mode (optional)
    Type: ENUM
    Options: std, fast, pro
    Default: std
    Description: Controls generation quality vs speed.

Outputs

  • VIDEO (Type: VIDEO) — Generated cinematic video.
  • VIDEO_URL (Type: STRING) — Public URL of the generated video.
  • FILE_PATH (Type: STRING) — Local saved file path of the video.
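
The defaults, range, and option lists above map naturally onto a small data class. This is an illustrative sketch only: `SkyReelsT2VInputs` is a hypothetical type, not a class from the node pack, and it simply mirrors the documented values.

```python
from dataclasses import dataclass, asdict


@dataclass
class SkyReelsT2VInputs:
    """Mirrors the documented inputs and defaults; validation runs on creation."""
    prompt: str
    duration: int = 5
    aspect_ratio: str = "16:9"
    sound: bool = False
    mode: str = "std"

    def __post_init__(self):
        if not self.prompt:
            raise ValueError("prompt is required")
        if not 3 <= self.duration <= 15:
            raise ValueError("duration must be between 3 and 15 seconds")
        if self.aspect_ratio not in ("16:9", "4:3", "1:1", "9:16", "3:4"):
            raise ValueError("unsupported aspect_ratio")
        if self.mode not in ("std", "fast", "pro"):
            raise ValueError("unsupported mode")
```

Calling `asdict(SkyReelsT2VInputs("a lighthouse in fog", duration=8, mode="pro"))` yields a plain dictionary with the remaining fields filled from the documented defaults.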

SkyReels Image-to-Video

Model: skyreels-v4-image-to-video
Description: Animates a single image into a cinematic video using SkyReels V4.

Inputs

  • prompt (required)
    Type: STRING
    Description: Text prompt guiding the animation.
  • image (optional)
    Type: IMAGE
    Description: Input image from ComfyUI (exactly one image required if used).
  • image_url (optional)
    Type: STRING
    Description: URL of input image (used if IMAGE input is not provided).
  • duration (optional)
    Type: INT
    Default: 5
    Range: 3–15
    Description: Duration of generated video in seconds.
  • sound (optional)
    Type: BOOLEAN
    Default: False
    Description: Enables audio generation.
  • mode (optional)
    Type: ENUM
    Options: std, fast, pro
    Default: std
    Description: Controls generation quality vs speed.

Outputs

  • VIDEO (Type: VIDEO) — Generated animated video.
  • VIDEO_URL (Type: STRING) — Public URL of the generated video.
  • FILE_PATH (Type: STRING) — Local saved file path of the video.

Sora 2

Description: Generates video from text prompts using the Sora 2 model with optional reference image support.

Inputs

  • prompt (required)
    Type: STRING
    Description: Text prompt for video generation.
  • input_reference_image (optional)
    Type: IMAGE
    Description: Optional single reference image input.
  • input_reference (optional)
    Type: STRING
    Description: URL version of reference image (0–1 images supported).
  • model (optional)
    Type: STRING
    Default: sora-2
    Description: Model identifier for generation.
  • seconds (optional)
    Type: ENUM
    Default: 4
    Options: 4, 8, 12
    Description: Video duration.
  • size (optional)
    Type: ENUM
    Default: 1280x720
    Options: 1280x720, 720x1280
    Description: Output resolution.

Outputs

  • VIDEO (Type: VIDEO) — Generated video output.
  • VIDEO_URL (Type: STRING) — Public video URL.
  • FILE_PATH (Type: STRING) — Local saved file path.

Sora 2 Pro

Description: Generates high-quality video using Sora 2 Pro with reference image support.

Inputs

  • prompt (required)
    Type: STRING
    Description: Text prompt for video generation.
  • input_reference_image (optional)
    Type: IMAGE
    Description: Optional single reference image input.
  • input_reference (optional)
    Type: STRING
    Description: URL version of reference image.
  • model (optional)
    Type: STRING
    Default: sora-2-pro
    Description: Model identifier.
  • seconds (optional)
    Type: ENUM
    Default: 4
    Options: 4, 8, 12
    Description: Video duration.
  • size (optional)
    Type: ENUM
    Default: 1792x1024
    Options: 1792x1024, 1024x1792, 1280x720, 720x1280
    Description: Output resolution.

Outputs

  • VIDEO (Type: VIDEO) — Generated video output.
  • VIDEO_URL (Type: STRING) — Public video URL.
  • FILE_PATH (Type: STRING) — Local saved file path.
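
Sora 2 and Sora 2 Pro accept different `size` options, which is easy to overlook when switching models. The lookup below is a sketch built from the two option lists above; the names `SORA_SIZES`, `sora_size_allowed`, and `sora_default_size` are hypothetical.

```python
# Size options per model, in documented order; the first entry is the documented default.
SORA_SIZES = {
    "sora-2": ("1280x720", "720x1280"),
    "sora-2-pro": ("1792x1024", "1024x1792", "1280x720", "720x1280"),
}


def sora_size_allowed(model: str, size: str) -> bool:
    """True when the size is listed for the given model identifier."""
    return size in SORA_SIZES.get(model, ())


def sora_default_size(model: str) -> str:
    """Return the documented default size for a known model."""
    return SORA_SIZES[model][0]
```

For instance, 1792x1024 is accepted for `sora-2-pro` but is not listed for `sora-2`.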

Veo3 Video Generation

Description: Generates videos using Google’s Veo 3 models through the GMI gateway, supporting text-to-video and image-conditioned generation.

Inputs

  • prompt (required) — Type: STRING — Text description of the video.
  • aspect_ratio (optional) — Type: ENUM — Default: 16:9 — Options: 16:9, 9:16
  • negative_prompt (optional) — Type: STRING — What should be avoided.
  • duration_seconds (optional) — Type: INT — Default: 8 — Max 8 seconds.
  • person_generation (optional) — Type: ENUM — Default: ALLOW — Options: ALLOW, BLOCK
  • seed (optional) — Type: INT — Default: 0
  • image (optional) — Type: IMAGE/STRING — Reference image input.
  • lastFrame (optional) — Type: IMAGE/STRING — Ending frame image.
  • reference_image (optional) — Type: IMAGE/STRING — Reference image for Veo 3.1.
  • model (optional) — Type: ENUM — Default: Veo3 — Selects among the available Veo model variants.

Outputs

  • VIDEO (Type: VIDEO) — Generated video output.
  • VIDEO_URL (Type: STRING) — Public URL.
  • FILE_PATH (Type: STRING) — Local saved file path.

Vidu Q2 Pro I2V

Model: vidu-q2-pro-i2v
Description: Generates a video from a single reference image using VIDU Q2 Pro I2V model.

Inputs

  • prompt (required) — STRING — Text prompt (max 2000 chars).
  • image (optional) — IMAGE — One input image required if used.
  • image_url (optional) — STRING — URL of input image.
  • duration (optional) — INT — Default: 5 — Range: 1–10
  • seed (optional) — INT — Random seed.

Outputs

  • VIDEO — Generated video output.
  • VIDEO_URL — Public video URL.
  • FILE_PATH — Local saved file path.

Vidu Q2 Pro R2V

Model: vidu-q2-pro-r2v
Description: Generates video from multiple reference images and/or videos using VIDU Q2 Pro R2V model.

Inputs

  • prompt (required) — STRING — Text prompt (max 2000 chars).
  • images (optional) — IMAGE (batch) — Up to 7 images.
  • image_urls (optional) — STRING — Comma-separated URLs.
  • video_urls (optional) — STRING — 1 video (8s) or 2 videos (5s each).
  • duration (optional) — INT — Default: 5 — Range: 1–8
  • seed (optional) — INT — Random seed.

Outputs

  • VIDEO — Generated video output.
  • VIDEO_URL — Public video URL.
  • FILE_PATH — Local saved file path.

Vidu Q2 T2V

Model: vidu-q2-t2v
Description: Generates a video from text using VIDU Q2 T2V model.

Inputs

  • prompt (required) — STRING — Text prompt (max 2000 chars).
  • duration (optional) — INT — Default: 5 — Range: 1–10
  • seed (optional) — INT — Random seed.

Outputs

  • VIDEO — Generated video output.
  • VIDEO_URL — Public video URL.
  • FILE_PATH — Local saved file path.

Vidu Q3 Pro I2V

Model: vidu-q3-pro-i2v
Description: Generates video from a single reference image using VIDU Q3 Pro I2V. Supports optional audio.

Inputs

  • prompt (required) — STRING — Text prompt (max 2000 chars).
  • image / image_url (required) — Exactly one input image, supplied either as an image tensor or as a URL.
  • duration (optional) — INT — Default: 5 — Range: 1–16
  • audio (optional) — BOOLEAN — Default: False
  • seed (optional) — INT — Random seed.

Outputs

  • VIDEO — Generated video output.
  • VIDEO_URL — Public video URL.
  • FILE_PATH — Local saved file path.

Vidu Q3 Pro T2V

Model: vidu-q3-pro-t2v
Description: Generates a video from text using VIDU Q3 Pro T2V model. Supports optional audio.

Inputs

  • prompt (required) — STRING — Text prompt (max 2000 chars).
  • duration (optional) — INT — Default: 5 — Range: 1–16
  • audio (optional) — BOOLEAN — Default: False
  • seed (optional) — INT — Random seed.

Outputs

  • VIDEO — Generated video output.
  • VIDEO_URL — Public video URL.
  • FILE_PATH — Local saved file path.
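
The Vidu nodes documented above share a default duration of 5 seconds but differ in their allowed ranges. The following sketch gathers those ranges into one lookup table; `check_vidu_duration` is a hypothetical helper, and treating the range endpoints as inclusive is an assumption.

```python
# Duration ranges in seconds, copied from the Vidu node descriptions.
VIDU_DURATION_RANGES = {
    "vidu-q2-pro-i2v": (1, 10),
    "vidu-q2-pro-r2v": (1, 8),
    "vidu-q2-t2v": (1, 10),
    "vidu-q3-pro-i2v": (1, 16),
    "vidu-q3-pro-t2v": (1, 16),
}
DEFAULT_DURATION = 5


def check_vidu_duration(model, duration=DEFAULT_DURATION):
    """Return the duration if it is valid for the model, else raise ValueError."""
    if model not in VIDU_DURATION_RANGES:
        raise ValueError(f"unknown Vidu model: {model}")
    low, high = VIDU_DURATION_RANGES[model]
    if not low <= duration <= high:
        raise ValueError(f"{model} supports durations of {low} to {high} seconds")
    return duration
```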

Wan Animate Video

Model: Wan2.2-Animate-14B
Description: Generates a video from a reference image and a template video.

Inputs

  • refer_path (required)
    Type: STRING
    Reference image URL for video generation.
  • video_path (required)
    Type: STRING
    Template video URL for video generation.
  • resolution (optional)
    Type: ENUM
    Options: 480p, 720p
    Default: 480p
    Resolution of the output video.
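
Both required inputs are URLs, so a quick check before submission can catch malformed values early. This is a minimal sketch using only the Python standard library; `build_wan_animate_inputs` is a hypothetical helper, and accepting only `http` and `https` schemes is an assumption.

```python
from urllib.parse import urlparse

RESOLUTIONS = ("480p", "720p")


def is_http_url(value):
    """Return True when value parses as an http(s) URL with a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_wan_animate_inputs(refer_path, video_path, resolution="480p"):
    """Validate the two required URLs and the resolution option."""
    for name, value in (("refer_path", refer_path), ("video_path", video_path)):
        if not is_http_url(value):
            raise ValueError(f"{name} must be an http(s) URL")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    return {"refer_path": refer_path, "video_path": video_path,
            "resolution": resolution}
```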

Outputs

  • VIDEO (VIDEO) — Generated video output
  • VIDEO_URL (STRING) — Public URL of generated video
  • FILE_PATH (STRING) — Local saved file path

Wan 2.5 Image-to-Video

Model: wan2.5-i2v-preview
Description: Generates a video from an image using the WAN 2.5 model.

Inputs

  • image (optional)
    Type: IMAGE
    Input image (ComfyUI tensor). Takes precedence over img_url.
  • img_url (optional)
    Type: STRING
    Image URL used if IMAGE is not provided.
  • prompt (optional)
    Type: STRING
    Text prompt for video generation.
  • negative_prompt (optional)
    Type: STRING
    Negative prompt for video generation.
  • resolution (optional)
    Type: ENUM
    Options: 480P, 720P, 1080P
    Default: 480P
  • duration (optional)
    Type: ENUM
    Options: 5, 10
    Default: 5
  • prompt_extend (optional)
    Type: BOOLEAN
    Default: True
  • watermark (optional)
    Type: BOOLEAN
    Default: False
  • audio (optional)
    Type: BOOLEAN
    Default: False
  • audio_url (optional)
    Type: STRING
  • seed (optional)
    Type: INT
    Range: 0–2147483647
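
The precedence rule between `image` and `img_url` can be expressed as a short function. In this sketch, `resolve_image_source` is a hypothetical helper; raising an error when neither input is given is an assumption, since both parameters are marked optional.

```python
def resolve_image_source(image=None, img_url=None):
    """Pick the image input, with the tensor taking precedence over the URL."""
    if image is not None:
        return ("tensor", image)
    if img_url:
        return ("url", img_url)
    # Assumption: an image-to-video request needs at least one image source.
    raise ValueError("provide either image or img_url")
```

When both inputs are supplied, the URL is ignored and the tensor is returned.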

Outputs

  • VIDEO (VIDEO) — Generated video output
  • VIDEO_URL (STRING) — Public URL
  • FILE_PATH (STRING) — Local saved file path

Wan 2.6 Text-to-Video

Model: wan2.6-t2v
Description: Generates a video from text using the WAN 2.6 model.

Inputs

  • prompt (required)
    Type: STRING
    Text prompt for video generation.
  • negative_prompt (optional)
    Type: STRING
  • audio_url (optional)
    Type: STRING
  • resolution (optional)
    Type: ENUM
    Options: 720P, 1080P
    Default: 1080P
  • duration (optional)
    Type: ENUM
    Options: 5, 10, 15
    Default: 5
  • prompt_extend (optional)
    Type: BOOLEAN
    Default: True
  • watermark (optional)
    Type: BOOLEAN
    Default: False
  • audio (optional)
    Type: BOOLEAN
    Default: False
  • seed (optional)
    Type: INT
    Range: 0–2147483647
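
One way to keep the documented defaults and options together is a small dataclass. This is an illustrative sketch only: `Wan26T2VInputs` and `to_payload` are hypothetical names, and representing the `duration` ENUM as an integer is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional

MAX_SEED = 2147483647


@dataclass
class Wan26T2VInputs:
    prompt: str
    negative_prompt: Optional[str] = None
    audio_url: Optional[str] = None
    resolution: str = "1080P"
    duration: int = 5
    prompt_extend: bool = True
    watermark: bool = False
    audio: bool = False
    seed: Optional[int] = None

    def __post_init__(self):
        # Validate against the options listed for this node.
        if not self.prompt.strip():
            raise ValueError("prompt is required")
        if self.resolution not in ("720P", "1080P"):
            raise ValueError("resolution must be 720P or 1080P")
        if self.duration not in (5, 10, 15):
            raise ValueError("duration must be 5, 10, or 15")
        if self.seed is not None and not 0 <= self.seed <= MAX_SEED:
            raise ValueError(f"seed must be between 0 and {MAX_SEED}")

    def to_payload(self):
        """Return a dict of inputs, omitting optional fields left unset."""
        return {key: value for key, value in asdict(self).items()
                if value is not None}
```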

Outputs

  • VIDEO (VIDEO) — Generated video output
  • VIDEO_URL (STRING) — Public URL
  • FILE_PATH (STRING) — Local saved file path

Wan 2.6 Image-to-Video

Model: wan2.6-i2v
Description: Generates a video from an image using the WAN 2.6 model.

Inputs

  • image (optional)
    Type: IMAGE
    Reference image (takes precedence over img_url).
  • img_url (optional)
    Type: STRING
  • prompt (optional)
    Type: STRING
  • negative_prompt (optional)
    Type: STRING
  • audio_url (optional)
    Type: STRING
  • resolution (optional)
    Type: ENUM
    Options: 720P, 1080P
    Default: 720P
  • duration (optional)
    Type: ENUM
    Options: 5, 10, 15
    Default: 5
  • prompt_extend (optional)
    Type: BOOLEAN
    Default: True
  • watermark (optional)
    Type: BOOLEAN
    Default: False
  • audio (optional)
    Type: BOOLEAN
    Default: False
  • seed (optional)
    Type: INT
    Range: 0–2147483647

Outputs

  • VIDEO (VIDEO) — Generated video output
  • VIDEO_URL (STRING) — Public URL
  • FILE_PATH (STRING) — Local saved file path

Wan 2.6 Reference-to-Video

Model: wan2.6-r2v
Description: Generates a video from reference video URLs. Multiple characters are supported.

Inputs

  • video_urls (required)
    Type: STRING
    Comma-separated reference video URLs (1–3).
  • prompt (optional)
    Type: STRING (max 1500 chars)
  • negative_prompt (optional)
    Type: STRING (max 500 chars)
  • size (optional)
    Type: ENUM
    Options: multiple resolutions
    Default: 1920*1080
  • duration (optional)
    Type: ENUM
    Options: 5, 10
    Default: 5
  • shot_type (optional)
    Type: ENUM
    Options: single, multi
    Default: single
  • watermark (optional)
    Type: BOOLEAN
    Default: False
  • seed (optional)
    Type: INT
    Range: 0–2147483647
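
The count and length limits above can be checked before a request is sent. The sketch below assumes the character limits are measured with Python's `len`; `check_wan26_r2v` is a hypothetical helper.

```python
MAX_PROMPT_CHARS = 1500
MAX_NEGATIVE_PROMPT_CHARS = 500


def check_wan26_r2v(video_urls, prompt="", negative_prompt=""):
    """Validate the documented limits and return the parsed list of video URLs."""
    urls = [part.strip() for part in video_urls.split(",") if part.strip()]
    if not 1 <= len(urls) <= 3:
        raise ValueError("video_urls must contain 1 to 3 URLs")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds 1500 characters")
    if len(negative_prompt) > MAX_NEGATIVE_PROMPT_CHARS:
        raise ValueError("negative_prompt exceeds 500 characters")
    return urls
```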

Outputs

  • VIDEO (VIDEO) — Generated video output
  • VIDEO_URL (STRING) — Public URL
  • FILE_PATH (STRING) — Local saved file path

WAN 2.7 Text-to-Video

Description

Generates a video from a text prompt using the WAN 2.7 T2V model, supporting flexible duration (2–15 seconds), aspect ratio selection, optional audio input, prompt enhancement, and watermark control.

Inputs

  • prompt: Required text description of the video content (max 1500 characters).
  • negative_prompt: Optional text describing what to avoid (max 500 characters).
  • audio_url: Optional external audio file (WAV/MP3, 3–30s, ≤15MB).
  • resolution: Output resolution tier, either 720P or 1080P (default: 1080P).
  • ratio: Aspect ratio of output video (16:9, 9:16, 1:1, 4:3, 3:4).
  • duration: Video length in seconds, between 2 and 15 (default: 5).
  • prompt_extend: Enables automatic prompt rewriting/enhancement.
  • watermark: Adds “AI Generated” watermark if enabled.
  • seed: Random seed for reproducibility.
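
The `audio_url` constraints (WAV/MP3, 3 to 30 seconds, at most 15MB) lend themselves to a pre-flight check. This sketch assumes the caller already knows the file's duration and size; `audio_problems` is a hypothetical helper, detecting the format from the URL's file extension is an assumption, and so is reading 15MB as 15 × 1024 × 1024 bytes.

```python
import os
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".wav", ".mp3"}
MIN_SECONDS, MAX_SECONDS = 3, 30
MAX_BYTES = 15 * 1024 * 1024  # assumption: "15MB" interpreted as 15 MiB


def audio_problems(audio_url, duration_s, size_bytes):
    """Return a list of constraint violations; an empty list means it passes."""
    problems = []
    extension = os.path.splitext(urlparse(audio_url).path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("format must be WAV or MP3")
    if not MIN_SECONDS <= duration_s <= MAX_SECONDS:
        problems.append("duration must be 3-30 seconds")
    if size_bytes > MAX_BYTES:
        problems.append("file must be at most 15MB")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every violation at once.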

Outputs

  • VIDEO: Generated video object.
  • VIDEO_URL: Hosted URL for the generated video.
  • FILE_PATH: Local saved file path.

Behavior Notes

This node validates prompt input, submits a WAN 2.7 generation request, and polls until completion. It supports optional audio conditioning and aspect-ratio-aware generation. The final video is downloaded, saved locally, and returned with a preview UI.
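
The validate, submit, and poll sequence described above can be sketched as a generic loop. Everything specific here is an assumption made for illustration: `submit` and `get_status` are hypothetical callables supplied by the caller, and the `state` values `succeeded` and `failed` and the `video_url` key are placeholders rather than the GMI API's actual response format.

```python
import time


def run_generation(submit, get_status, payload, poll_interval=2.0,
                   timeout=600.0, sleep=time.sleep):
    """Validate the prompt, submit the request, and poll until it finishes."""
    if not payload.get("prompt", "").strip():
        raise ValueError("prompt is required")
    request_id = submit(payload)  # hypothetical: returns a request identifier
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(request_id)  # hypothetical: returns a status dict
        if status["state"] == "succeeded":
            return status["video_url"]
        if status["state"] == "failed":
            raise RuntimeError(f"generation failed for request {request_id}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request {request_id} did not finish in time")
        sleep(poll_interval)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.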

WAN 2.7 Image-to-Video

Description

Generates a video from an input image using WAN 2.7 I2V. Supports optional first/last frame conditioning, optional driving audio, flexible duration control, and prompt-based motion guidance.

Inputs

  • first_frame_image / first_frame_image_url: Optional starting frame (image tensor or URL).
  • last_frame_image / last_frame_image_url: Optional ending frame (image tensor or URL).
  • first_clip: Optional reference video for motion guidance.
  • driving_audio: Optional audio input to guide motion dynamics.
  • prompt: Optional text prompt (max 1500 characters).
  • negative_prompt: Optional constraints (max 500 characters).
  • resolution: Output resolution (720P or 1080P).
  • duration: Video length in seconds (2–15).
  • prompt_extend: Enables prompt enhancement.
  • watermark: Adds watermark overlay.
  • seed: Random seed for reproducibility.

Outputs

  • VIDEO: Generated video.
  • VIDEO_URL: Hosted result URL.
  • FILE_PATH: Local saved path.

Behavior Notes

This node resolves image inputs from either tensors or URLs, builds a WAN 2.7 I2V request, and generates motion between frames. It supports multimodal conditioning including audio-driven motion and optional temporal guidance via clips.

WAN 2.7 Reference-to-Video

Description

Generates video using multiple reference images and/or reference videos with WAN 2.7. Supports first-frame conditioning, multi-source visual guidance, and flexible motion synthesis across up to 5 total reference assets.

Inputs

  • first_frame_image / first_frame_url: Optional starting frame.
  • reference_images: Optional batch of image tensors.
  • reference_image_urls: Optional comma-separated image URLs.
  • reference_video_urls: Optional comma-separated reference video URLs. Reference images and videos combined may total at most 5 assets.
  • prompt: Text prompt (max 1500 characters).
  • negative_prompt: Optional constraints (max 500 characters).
  • resolution: Output resolution (720P or 1080P).
  • ratio: Aspect ratio control (16:9, 9:16, 1:1, 4:3, 3:4).
  • duration: Video length (2–15 seconds).
  • prompt_extend: Enables prompt rewriting.
  • watermark: Adds watermark overlay.
  • seed: Random seed for reproducibility.

Outputs

  • VIDEO: Generated video.
  • VIDEO_URL: Hosted result.
  • FILE_PATH: Local file path.

Behavior Notes

This node merges multiple visual references (images + videos) into a unified generation context. It validates total reference count (max 5) and ensures at least one valid reference asset exists before generating.
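
The reference-count rule described above (at least one asset, at most 5 in total) can be sketched as follows. `collect_references` is a hypothetical helper, and leaving the first frame out of the count is an assumption, since it is not stated whether the first frame counts toward the limit.

```python
MAX_REFERENCES = 5


def split_urls(value):
    """Split a comma-separated string into trimmed, non-empty URLs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def collect_references(reference_images=(), reference_image_urls="",
                       reference_video_urls=""):
    """Merge image and video references and enforce the combined limit."""
    image_urls = split_urls(reference_image_urls)
    video_urls = split_urls(reference_video_urls)
    total = len(reference_images) + len(image_urls) + len(video_urls)
    if total == 0:
        raise ValueError("at least one reference asset is required")
    if total > MAX_REFERENCES:
        raise ValueError(f"at most {MAX_REFERENCES} reference assets, got {total}")
    return {"images": list(reference_images), "image_urls": image_urls,
            "video_urls": video_urls}
```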

Happy Horse 1.0 Text-to-Video — GMIHHT2VNode

Description

Generates video from a text prompt using the Happy Horse 1.0 model with a focus on high visual fidelity, simple configuration, and short-form generation (3–15 seconds).

Inputs

  • prompt: Required text description of the video content.
  • resolution: Output resolution (720P or 1080P).
  • duration: Video length in seconds (3–15).
  • watermark: Adds “AI Generated” watermark in bottom-right corner.

Outputs

  • VIDEO: Generated video.
  • VIDEO_URL: Hosted video URL.
  • FILE_PATH: Local saved file path.

Behavior Notes

This is a lightweight T2V node optimized for straightforward generation. It enforces a minimum duration of 3 seconds and applies the watermark by default unless it is disabled.

Happy Horse 1.0 Image-to-Video — GMIHHI2VNode

Description

Generates video from an image using the Happy Horse 1.0 model. The input image defines the initial frame, and motion is generated based on the prompt.

Inputs

  • prompt: Required text describing motion and style.
  • first_frame_image / first_frame_image_url: Required starting image input.
  • resolution: Output resolution (720P or 1080P).
  • duration: Video length (3–15 seconds).
  • watermark: Adds watermark overlay.

Outputs

  • VIDEO: Generated video.
  • VIDEO_URL: Hosted URL.
  • FILE_PATH: Local file path.

Behavior Notes

This node requires a valid first frame (image tensor or URL). It uses the image as a motion anchor and generates temporally consistent motion guided by the prompt.
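
The first-frame requirement and the 3 to 15 second duration range can be checked together before submission. In this sketch, `build_hh_i2v_inputs` is a hypothetical helper; no defaults are filled in because none are documented for this node, and passing through both the tensor and the URL when both are given is an assumption.

```python
RESOLUTIONS = ("720P", "1080P")
MIN_DURATION, MAX_DURATION = 3, 15


def build_hh_i2v_inputs(prompt, resolution, duration, watermark,
                        first_frame_image=None, first_frame_image_url=None):
    """Validate Happy Horse 1.0 I2V inputs and return them as a dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if first_frame_image is None and not first_frame_image_url:
        raise ValueError("a first frame (image tensor or URL) is required")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")
    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError("duration must be between 3 and 15 seconds")
    inputs = {"prompt": prompt, "resolution": resolution,
              "duration": duration, "watermark": watermark}
    # Pass through whichever first-frame source the caller supplied.
    if first_frame_image is not None:
        inputs["first_frame_image"] = first_frame_image
    if first_frame_image_url:
        inputs["first_frame_image_url"] = first_frame_image_url
    return inputs
```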