HunyuanVideo 1.5
Tencent · Diffusion Transformer · v1.5 · Verified
Tags: Best Value · prototyping · self-hosted
$0.02/sec
starting from, on WaveSpeed
Resolution
1080p (480p native, via super-resolution)
Duration
5–10s
Providers
2
API Pricing
Why HunyuanVideo 1.5?
Strengths
- Runs on consumer GPUs with just 14GB VRAM — the most hardware-accessible open-source video model
- Open source with full weights, training code, and LoRA fine-tuning pipeline on GitHub and Hugging Face
- SSTA attention mechanism delivers nearly 2x inference speedup over standard FlashAttention
- Extremely low-cost on WaveSpeed at $0.02/sec (480p) — the cheapest API option for video generation
- Strong motion coherence and prompt adherence despite compact 8.3B parameter size
Limitations
- Native output is only 480p — requires super-resolution for higher resolution output
- Short maximum duration (~5 seconds) compared to models offering 10-15 second clips
- No native audio generation, lip-sync, or dialogue capabilities
- Lower overall quality ranking (ELO 1,014) compared to premium models
- Limited aspect ratio support — only 16:9 and 9:16, no square (1:1) option
Prompt Guide
1. Keep prompts concise and focused on your core idea — less is more with HunyuanVideo. Overloading the prompt with excessive detail prevents the model from producing coherent results.
2. Use natural language descriptions rather than technical jargon — write prompts like 'sunset over ocean waves' rather than complex comma-separated keyword lists.
3. Include specific sensory details to add texture — 'sunlight glistening on wet pavement after rain' creates richer output than generic scene descriptions.
4. Stick to one or two style cues when defining tone and aesthetics — conflicting style terms (e.g., 'photorealistic anime noir') confuse the model.
5. For in-video text generation, enclose the desired text in quotation marks within your prompt — HunyuanVideo 1.5 can render clear text within video frames.
✓ Do this
- Structure prompts with four components: Subject (main focus), Setting (environment), Action (movement and change), and Style (camera, lighting, mood)
- Enable prompt expansion for automatic enhancement — it refines raw prompts for better semantic understanding and output quality
- Iterate and refine — review the first output and adjust your description for clarity or additional detail rather than starting from scratch
- Use the lightweight nature for rapid prototyping at 480p, then upscale final selections with the built-in super-resolution module
- Leverage the open-source model for LoRA fine-tuning on consumer GPUs (14GB VRAM minimum) for custom characters or brand styles
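The four-part prompt structure recommended above (Subject, Setting, Action, Style) can be sketched as a small template helper. This is only an illustration of composing a prompt string in that order, not an official tool; the function name and phrasing are my own:

```python
def compose_prompt(subject: str, setting: str, action: str, style: str) -> str:
    """Join the four recommended components into one natural-language prompt.

    Order follows the guide: Subject (main focus), Setting (environment),
    Action (movement and change), Style (camera, lighting, mood).
    """
    return f"{subject} {setting}. {action}. {style}."

prompt = compose_prompt(
    subject="A golden retriever runs through",
    setting="a field of wildflowers on a sunny afternoon",
    action="The dog leaps joyfully, ears flapping in the wind",
    style="Shallow depth of field, warm golden light, slow motion feel",
)
print(prompt)
```

Keeping each component to a single clause also respects the guide's "less is more" advice: the result reads as natural language rather than a keyword list.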
✗ Avoid this
- Native output resolution is 480p — higher resolutions require the super-resolution upscaling module
- Maximum duration is approximately 5 seconds (121 frames) on FAL.ai — shorter than many competitors
- No native audio generation — output is video-only
- No camera control panel, motion brush, or multi-shot generation
- Complex multi-element prompts with many simultaneous actions may produce inconsistent results
Example Prompts
“A golden retriever runs through a field of wildflowers on a sunny afternoon. The dog leaps joyfully, ears flapping in the wind. Shallow depth of field, warm golden light, slow motion feel.”
“Close-up of rain droplets falling onto a still pond, creating expanding circular ripples. Each drop catches a glint of overcast sky. Macro lens perspective, muted cool tones, meditative pace.”
“A calligrapher writes the character 'Dream' with a brush on rice paper. Ink flows smoothly from the brush tip, each stroke deliberate and confident. Top-down camera angle, warm desk lamp lighting, shallow depth of field on the brush.”
Based on the official prompt guide.
FAQ
How much does HunyuanVideo 1.5 cost?
From $0.02/sec on WaveSpeed. A 5-second video ≈ $0.10.
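The arithmetic above is just per-second billing; a minimal sketch, assuming the $0.02/sec figure is WaveSpeed's 480p starting rate quoted on this page (other resolutions may bill differently):

```python
def estimate_cost(duration_sec: float, rate_per_sec: float = 0.02) -> float:
    """Estimate generation cost in USD at a per-second billing rate.

    0.02 USD/sec is WaveSpeed's advertised starting rate for 480p;
    treat any other tier as an assumption and check current pricing.
    """
    return round(duration_sec * rate_per_sec, 4)

print(estimate_cost(5))  # 5-second clip at the base rate -> 0.1
```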
Where can I use HunyuanVideo 1.5?
Via API on FAL.ai and WaveSpeed.
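A call through FAL's Python client (`pip install fal-client`) might look like the sketch below. The endpoint id `"fal-ai/hunyuan-video-1.5"`, the parameter names, and the response shape are assumptions for illustration — confirm them against the provider's model page before use:

```python
import os

def build_request(prompt: str, num_frames: int = 121) -> dict:
    """Assemble a request payload; 121 frames is roughly the ~5-second
    maximum noted for FAL.ai. Parameter names are assumed, not official."""
    return {"prompt": prompt, "num_frames": num_frames}

payload = build_request(
    "A golden retriever runs through a field of wildflowers, "
    "warm golden light, slow motion feel."
)

if os.environ.get("FAL_KEY"):  # only call the API when credentials are set
    import fal_client
    result = fal_client.subscribe(
        "fal-ai/hunyuan-video-1.5",  # assumed endpoint id
        arguments=payload,
    )
    print(result["video"]["url"])  # assumed response shape
```

Guarding the network call behind the `FAL_KEY` check keeps the snippet runnable offline while showing where the real request would go.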
How do I get good results with HunyuanVideo 1.5?
Keep prompts concise and focused on your core idea — less is more with HunyuanVideo. Overloading the prompt with excessive detail prevents the model from producing coherent results. See the prompt guide above.