Unified Text-to-Video and Image-to-Video in One Model
Text to Video
| Prompt | Output Video |
|---|---|
| A quiet temple garden near Kyoto: moss-coated stones, delicate maples, and a wooden bridge over a koi pond at dusk. Incense curls into the evening air as a single monk sweeps the gravel path, each stroke methodical and calm. Lantern light glimmers on carved statues that watch in silence, while distant wind chimes issue gentle metallic notes. In the fading light, the stillness feels centuries old. | (output video) |
Image to Video
| Input Image | Prompt | Output Video |
|---|---|---|
| (input image) | A bowling ball enters from screen right and hits the model made out of marbles, causing it to collapse. | (output video) |
Architecture Built for Multimodal Consistency
| Input Image | Prompt | Output Video |
|---|---|---|
| (input image) | make a video with the image | (output video) |
Joint Audio-Video Generation in One Pass
| Prompt | Output Video |
|---|---|
| Rain patters on the tent. Water drips off the edges. Quiet afternoon. Audio: steady rain on fabric, droplets running off, gentle breeze. | (output video) |
More HappyHorse 1.0 Prompt Examples
The following prompts are designed to test what the HappyHorse 1.0 AI video generator does well: subject motion, scene coherence, and camera control. Copy any prompt directly into Dzine to generate your own version.
Prompt 1 - Cinematic product close-up
A black ceramic coffee mug sits on a rain-wet wooden table. Steam rises slowly from the rim. Camera begins with a tight close-up on the surface texture, then pulls back to reveal a gray morning window behind. Overcast natural light. No music. Ambient rain sound.
Expected result: Stable object rendering, natural steam motion, smooth pull-back from the surface texture to the window in the background.
Prompt 2 - Character motion in an outdoor environment
A young woman in a yellow raincoat walks across a stone bridge over a fast-moving river. Camera tracks alongside her at shoulder height. Autumn leaves fall from both sides of the frame. Wind sound and footstep audio. 16:9 aspect ratio, cinematic color grade.
Expected result: Character consistency across frames, natural gait physics, coherent background parallax.
Prompt 3 - Abstract motion for social content
Ink drops fall into still water in extreme close-up. Each drop creates expanding circular ripples in slow motion. Black ink on white water, high contrast. No audio. 9:16 portrait format for vertical feed.
Expected result: Physically plausible fluid motion, clean high-contrast rendering, no frame artifacts.
Prompt 4 - Image-to-video product animation
Upload: product photo of a glass perfume bottle
The bottle sits on a white marble surface. A soft light sweeps across it from left to right, catching the glass facets. Subtle lens flare on the highlight. Camera stays locked. Ambient room tone only.
Expected result: Subject identity preserved from reference image, lighting motion coherent, no shape drift.
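The four prompts above share a loose pattern: a scene description, then camera direction, lighting, audio, and output format. A minimal sketch of that pattern as a reusable template follows. The `VideoPrompt` class and its field names are hypothetical helpers for organizing your own prompts; they are not part of Dzine or HappyHorse 1.0, which simply take the final text.

```python
from dataclasses import dataclass


@dataclass
class VideoPrompt:
    """Hypothetical container for the parts of a text prompt (not a Dzine API)."""
    scene: str               # subject and action
    camera: str = ""         # camera movement or framing
    lighting: str = ""       # light and color notes
    audio: str = ""          # sound direction, or "No audio"
    video_format: str = ""   # aspect ratio or orientation

    def render(self) -> str:
        """Join the non-empty parts into one prompt, one sentence per part."""
        parts = [self.scene, self.camera, self.lighting, self.audio, self.video_format]
        sentences = []
        for part in parts:
            text = part.strip()
            if not text:
                continue  # skip parts the prompt does not specify
            if not text.endswith((".", "!", "?")):
                text += "."
            sentences.append(text)
        return " ".join(sentences)


# Prompt 3 rebuilt from its parts
ink_prompt = VideoPrompt(
    scene=(
        "Ink drops fall into still water in extreme close-up. "
        "Each drop creates expanding circular ripples in slow motion"
    ),
    lighting="Black ink on white water, high contrast",
    audio="No audio",
    video_format="9:16 portrait format for vertical feed",
)

if __name__ == "__main__":
    print(ink_prompt.render())
```

Keeping the parts separate makes it easy to vary one element, such as the camera move or the aspect ratio, while holding the rest of the prompt fixed.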
HappyHorse 1.0 vs Seedance 2.0: Benchmark Comparison
| Feature | HappyHorse 1.0 | Seedance 2.0 |
|---|---|---|
| T2V Elo (no audio) | 1333 - #1 | 1273 - #2 |
| I2V Elo (no audio) | 1392 - #1 | 1355 - #2 |
| T2V Elo (with audio) | 1205 - #2 | 1219 - #1 |
| I2V Elo (with audio) | 1161 - #2 | 1162 - #1 |
| Architecture | Single 40-layer Transformer, shared parameters | Multimodal diffusion transformer |
| Native audio languages | 6 (claimed) | Primarily Chinese and English |
| Open source | Claimed, not yet accessible | No |
| Team identity | Pseudonymous / unconfirmed | ByteDance |
| Available on Dzine | ✓ | ✓ |
Elo scores are sourced from the Artificial Analysis Video Arena as of early April 2026. Scores change as votes accumulate.
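To put the Elo gaps in the table in perspective, the sketch below converts each rating pair into an expected head-to-head preference rate. It assumes the textbook Elo formula (logistic curve, base 10, scale 400); the arena's exact rating method may differ, so treat the percentages as rough guides. The function and variable names are illustrative.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Textbook Elo expected score of A against B (base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the comparison table: (HappyHorse 1.0, Seedance 2.0)
ARENA_ELO = {
    "T2V (no audio)": (1333, 1273),
    "I2V (no audio)": (1392, 1355),
    "T2V (with audio)": (1205, 1219),
    "I2V (with audio)": (1161, 1162),
}


def summarize(elo: dict) -> dict:
    """Map each category to HappyHorse's expected preference rate vs Seedance."""
    return {category: expected_win_rate(a, b) for category, (a, b) in elo.items()}


if __name__ == "__main__":
    for category, rate in summarize(ARENA_ELO).items():
        print(f"{category}: HappyHorse 1.0 expected to be preferred {rate:.1%} of the time")
```

Under this assumption, the 60-point lead in text-to-video without audio corresponds to an expected preference rate of about 58.5%, while the 1-point gap in image-to-video with audio is effectively a tie.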