Text-to-Video and Image-to-Video With Synchronized Audio
Veo 3.1 Lite supports both input modes in a single workflow. Start from a text prompt when you need to build a scene from scratch. Start from an image when you have a visual reference or a product photo to animate. Both modes benefit from Google's audio synthesis, which generates ambient sound and environmental audio aligned with the visual action.
Built for High-Volume Production
Cost has always been the limiting factor in AI video at scale. The Google Veo 3.1 Lite video generator addresses this directly. At roughly $0.05 per second for 720p output, it brings the per-clip cost down to a level that makes daily batch production financially viable for small teams and solo creators.
| Input Grid | Prompt | Output Video |
|---|---|---|
| ![]() | make a video with the image | ![]() |
Flexible Resolution and Aspect Ratio for Every Platform
| Prompt | Output Video |
|---|---|
| 3D Pixar cartoon, Fruit Love Island, anthropomorphic fruits laugh hysterically at funny phone gossip in tropical villa living room, exaggerated comedic expressions, bright lighting, 16:9, 15s video, smooth animation | ![]() |
Veo 3.1 Lite Costs Less Than Half of Veo 3.1 Fast
The cost difference between Veo 3.1 Lite and Veo 3.1 Fast is not marginal: Lite costs about one third as much, a saving of roughly 67%. Veo 3.1 Fast runs around $0.15 per second of video. Veo 3.1 Lite brings that down to approximately $0.05 per second for 720p output. For an 8-second clip, that is the difference between $1.20 and $0.40 per generation.
Across 100 clips per month, a realistic volume for a content team or an app with active users, that $0.80 gap per clip adds up to about $80 in monthly savings, or close to $1,000 over a year. The generation speed stays the same: an 8-second clip generates in under a minute at both tiers. For anyone building a Veo 3.1 video generator workflow at volume, Lite is the practical tier to start with.
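To sanity-check a budget before committing to a tier, the arithmetic above can be sketched in a few lines of Python. The prices are the approximate per-second figures quoted in this article, not an official price list, and the tier keys and function names are hypothetical labels chosen for illustration.

```python
# Approximate prices quoted in this article, in US cents per second of video.
# These are assumptions for budgeting, not an official price list.
PRICE_CENTS_PER_SECOND = {
    "veo-3.1-fast": 15,
    "veo-3.1-lite": 5,
}


def clip_cost(tier: str, seconds: int) -> float:
    """Return the estimated cost in dollars of one clip at the given tier."""
    return PRICE_CENTS_PER_SECOND[tier] * seconds / 100


def monthly_savings(clips_per_month: int, seconds: int = 8) -> float:
    """Return the estimated dollars saved per month by using Lite instead of Fast."""
    gap_cents = (
        PRICE_CENTS_PER_SECOND["veo-3.1-fast"] - PRICE_CENTS_PER_SECOND["veo-3.1-lite"]
    ) * seconds
    return gap_cents * clips_per_month / 100


if __name__ == "__main__":
    print(clip_cost("veo-3.1-fast", 8))   # cost of one 8-second Fast clip
    print(clip_cost("veo-3.1-lite", 8))   # cost of one 8-second Lite clip
    print(monthly_savings(100))           # savings across 100 clips per month
```

Working in whole cents keeps the sums exact. Swap in your own clip length and monthly volume to see where the gap becomes meaningful for your workload.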
| | Veo 3.1 | Veo 3.1 Fast | Veo 3.1 Lite |
|---|---|---|---|
| Cost per second | ~$0.40 | ~$0.15 | ~$0.05 |
| Generation speed | Standard | Fast | Fast |
| Max resolution | 4K | 1080p | 1080p |
| Audio | Native | Native | Native |
| Image-to-video | ✓ | ✓ | ✓ |
| Video extension | ✓ | ✓ | — |
| Max duration | 8s | 8s | 8s |
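
If you route jobs programmatically, the comparison table can be encoded as data and queried for the cheapest tier that meets a job's requirements. The following is a minimal sketch under stated assumptions: the `Tier` class and `cheapest_tier` helper are hypothetical names, the prices are the approximate figures from the table, and 4K is treated as 2160 vertical pixels. The Lite price is quoted for 720p output, so the 1080p cost may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cents_per_second: int   # approximate price from the comparison table
    max_height: int         # vertical pixels; 4K assumed to mean 2160
    video_extension: bool   # whether the tier supports video extension


# Values transcribed from the comparison table; treat them as estimates.
TIERS = [
    Tier("Veo 3.1", 40, 2160, True),
    Tier("Veo 3.1 Fast", 15, 1080, True),
    Tier("Veo 3.1 Lite", 5, 1080, False),
]


def cheapest_tier(min_height: int, needs_extension: bool = False) -> Tier:
    """Return the lowest-priced tier that satisfies the requirements."""
    candidates = [
        t for t in TIERS
        if t.max_height >= min_height
        and (t.video_extension or not needs_extension)
    ]
    if not candidates:
        raise ValueError("no tier satisfies the requested resolution and features")
    return min(candidates, key=lambda t: t.cents_per_second)


if __name__ == "__main__":
    print(cheapest_tier(720).name)                         # plain 720p job
    print(cheapest_tier(1080, needs_extension=True).name)  # needs video extension
    print(cheapest_tier(2160).name)                        # needs 4K output
```

Keeping the table in one list means a price or feature change only needs to be updated in a single place.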