PixVerse V6
Tags: social-media, product-ads, transitions
PixVerse · Diffusion Transformer · v6.0 · Verified
Starting from $0.03/sec on FAL.ai
Resolution: 1080p
Duration: 1–15s
Providers: 3
API Pricing
Why PixVerse V6?
Strengths
- Widest mode variety — text-to-video, image-to-video, transition, extend, and lip-sync all available via API
- Granular resolution pricing from $0.025/sec (360p) to $0.115/sec (1080p+audio) — flexible for budget and quality needs
- 20+ cinematic lens controls (focal length, aperture, DoF, chromatic aberration, vignetting) for precise camera work
- Available on 3 providers (FAL.ai, WaveSpeed, PixVerse Platform) with CLI and MCP server support for developer workflows
- Physics simulation engine for realistic fluid, material, and light interactions
Limitations
- Lower AA Arena ranking (#16 T2V, ELO 1,209) compared to top-tier models like HappyHorse or Seedance 2.0
- Not open-source or self-deployable — closed weights, API-only access
- Lip-sync requires a separate API endpoint rather than integrated single-pass generation
- No video-to-video editing capability — cannot restyle or edit existing footage
- Extreme camera movements in single shots can produce visual artifacts
Prompt Guide
1. Describe observable elements, not emotions: 'wide tracking shot through pine trees with morning side light, a fox walking left to right' outperforms 'a magical energetic forest scene.'
2. Focus on one primary action per clip. Competing movements reduce quality, so keep it simple for cleaner, more consistent results.
3. Set resolution to 1080p before entering text, so the configuration matches your target quality before generating.
4. Use specific camera language. V6's 20+ lens controls respond to terms like focal length, aperture, depth of field, and chromatic aberration in prompts.
5. Use multi-shot mode with consistent descriptions: keep character and environment descriptions identical across shots to maintain visual continuity.
6. Match aspect ratio to distribution channel: 9:16 for mobile/TikTok, 16:9 for widescreen/YouTube, 1:1 for Instagram feed.
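The channel-to-aspect-ratio rule in the last tip can be kept in a small lookup so every job uses the same setting. The sketch below is illustrative only: `aspect_ratio_for` and `dimensions` are hypothetical helper names, and reading "1080p" as "shorter side is 1080 pixels" is an assumption, not something the providers' documentation is known to confirm.

```python
# Channel -> aspect ratio, copied from the guide above.
ASPECT_RATIOS = {
    "tiktok": "9:16",
    "mobile": "9:16",
    "youtube": "16:9",
    "widescreen": "16:9",
    "instagram_feed": "1:1",
}


def aspect_ratio_for(channel: str) -> str:
    """Return the aspect ratio listed for a channel (hypothetical helper)."""
    key = channel.strip().lower().replace(" ", "_")
    if key not in ASPECT_RATIOS:
        raise ValueError(f"No aspect ratio listed for channel: {channel!r}")
    return ASPECT_RATIOS[key]


def dimensions(aspect_ratio: str, short_side: int = 1080) -> tuple[int, int]:
    """Compute (width, height) so the shorter side equals short_side.

    Assumption: "1080p" refers to the shorter side of the frame.
    """
    w, h = (int(part) for part in aspect_ratio.split(":"))
    scale = short_side / min(w, h)
    return round(w * scale), round(h * scale)


if __name__ == "__main__":
    ratio = aspect_ratio_for("TikTok")
    print(ratio, dimensions(ratio))
```

Whether a given provider accepts an aspect-ratio string, explicit dimensions, or both is not stated on this page, so check the provider's request schema before wiring this in.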
✓ Do this
- Structure prompts with: Subject + Action + Environment + Details + Motion + Camera + Style
- For transition mode, provide two distinct images and describe the desired movement between them — V6 handles camera angle transitions natively
- For extend mode, guide the extension toward the opening frame for seamless loops, or stretch clips to fit different ad slot durations
- Use the physics engine for fluid and material shots — viscosity, surface tension, and light interaction are rendered with improved accuracy in V6
- For lip-sync, provide clear audio with minimal background noise — the model matches mouth movements to speech with high precision
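The Subject + Action + Environment + Details + Motion + Camera + Style structure above can be enforced with a small template. This is a minimal sketch, not an official tool: `PromptParts` and `build_prompt` are hypothetical names, and joining the parts with commas is one reasonable choice among several.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptParts:
    # Field order mirrors the structure recommended in the guide.
    subject: str
    action: str
    environment: str = ""
    details: str = ""
    motion: str = ""
    camera: str = ""
    style: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty parts in guide order into a single sentence."""
    chunks = [getattr(parts, f.name).strip(" .,") for f in fields(parts)]
    chunks = [chunk for chunk in chunks if chunk]
    if not chunks:
        raise ValueError("at least one prompt part is required")
    text = ", ".join(chunks)
    return text[0].upper() + text[1:] + "."


if __name__ == "__main__":
    print(build_prompt(PromptParts(
        subject="a fox",
        action="walking steadily from left to right",
        environment="pine forest with morning side light",
        details="leaves rustling on the forest floor",
        camera="wide tracking shot, shallow depth of field, f/2.8 aperture",
        style="warm color grading",
    )))
```

Keeping the parts as named fields makes it easy to hold the subject and environment text identical across shots while varying only the camera field.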
✗ Avoid this
- Avoid extreme camera motion in a single shot, which can produce artifacts. Keep camera movement moderate for best results.
- Avoid relying on text rendering inside scenes, which is not reliably supported.
- Avoid depending on physically exact results. Physics simulation is improved but still approximate, and complex fluid interactions may not be physically accurate.
- Avoid planning around self-hosting. The model is closed source, with no model weights available.
- Avoid assuming lip-sync happens in the generation call. It is a separate endpoint and requires a two-step workflow.
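Because lip-sync is a separate step from video generation, a pipeline has to chain two calls. The sketch below shows only that ordering. It deliberately takes the two provider calls as injected functions, since the actual endpoint names and request shapes are not given on this page; `generate_video` and `apply_lip_sync` are hypothetical placeholders you would implement against your provider's client.

```python
from typing import Callable


def generate_with_lip_sync(
    prompt: str,
    audio_url: str,
    generate_video: Callable[[str], str],
    apply_lip_sync: Callable[[str, str], str],
) -> str:
    """Run the two-step workflow and return the final video reference.

    generate_video(prompt) -> video reference          (hypothetical)
    apply_lip_sync(video, audio) -> synced reference   (hypothetical)
    """
    # Step 1: generate the silent or base clip from the text prompt.
    video_url = generate_video(prompt)
    if not video_url:
        raise RuntimeError("video generation returned no result")
    # Step 2: send the clip and the speech audio to the lip-sync step.
    return apply_lip_sync(video_url, audio_url)


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without any network access.
    final = generate_with_lip_sync(
        "A woman in a business suit turns to camera and speaks.",
        "speech.wav",
        generate_video=lambda prompt: "clip-001",
        apply_lip_sync=lambda video, audio: f"{video}+{audio}",
    )
    print(final)
```

Injecting the calls also makes the workflow easy to test with stubs before spending credits on real generations.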
Example Prompts
“Wide tracking shot through pine trees with morning side light, a fox walking steadily from left to right, leaves rustling on the forest floor. Shallow depth of field, f/2.8 aperture, warm color grading.”
“Close-up of honey slowly dripping from a wooden dipper onto a stack of pancakes. Camera holds steady, macro lens, golden morning light. The viscous flow catches light as it pools. Sound of sizzling butter.”
“A woman in a business suit walks through a modern glass office, turns to camera and speaks. Professional lighting, 16:9, 1080p. [Multi-shot: Shot 1 — tracking following from behind. Shot 2 — medium close-up facing camera.]”
Based on the official prompt guide.
FAQ
How much does PixVerse V6 cost?
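From about $0.03/sec on FAL.ai. A 5-second video costs about $0.13, which matches the $0.025/sec 360p rate listed under Strengths (5 × $0.025 = $0.125); at a flat $0.03/sec the same clip would cost $0.15.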
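A rough cost estimate follows directly from the per-second rates. The sketch below uses only the two rates quoted on this page (360p and 1080p with audio) and assumes simple per-second billing with no minimum charge, which may not match any provider's actual invoicing. `estimate_cost` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP

# Only the two rates quoted on this page; intermediate tiers are not listed here.
RATES_PER_SECOND = {
    "360p": Decimal("0.025"),
    "1080p+audio": Decimal("0.115"),
}


def estimate_cost(seconds: int, tier: str = "360p") -> Decimal:
    """Estimate clip cost in USD, rounded to cents.

    Assumption: cost = rate * duration, with no minimum or extra fees.
    """
    if not 1 <= seconds <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    if tier not in RATES_PER_SECOND:
        raise ValueError(f"no rate listed for tier: {tier!r}")
    raw = RATES_PER_SECOND[tier] * seconds
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(estimate_cost(5))                  # 5 s at the 360p rate
    print(estimate_cost(5, "1080p+audio"))   # 5 s at the 1080p+audio rate
```

`Decimal` is used instead of floats so that values such as 0.125 round to cents predictably.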
Where can I use PixVerse V6?
Via API on FAL.ai, WaveSpeed, and the PixVerse Platform.
How do I get good results with PixVerse V6?
Describe observable elements, not emotions: 'wide tracking shot through pine trees with morning side light, a fox walking left to right' outperforms 'a magical energetic forest scene.' See the Prompt Guide above.