Happy Horse 1.1: The Complete Guide

Alibaba's Happy Horse 1.1: three video modes, up to 9 reference images, synced audio, and cheaper 1080p pricing than 1.0.

Happy Horse 1.1 is Alibaba's upgrade to its Happy Horse video model, and it's a bigger jump than a point release usually suggests. This guide covers what Happy Horse 1.1 actually is, how it differs from Happy Horse 1.0, its features, pricing, and how to use it via API, with real examples throughout.

What is Happy Horse 1.1?

Happy Horse 1.1 is Alibaba's video generation model, built to run in one of three modes depending on what you feed it: text-to-video with no input images, image-to-video with one, or reference-to-video with two to nine. That third mode is the headline addition over Happy Horse 1.0, letting you anchor characters, products, and scenes across a generation instead of animating a single frame.
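The image-count rule is simple enough to mirror on the client, which can help catch a bad request before it is submitted. This is a minimal sketch: the mode names come from this guide, while the function name `generation_mode` is hypothetical and the actual selection happens on the API side.

```python
def generation_mode(image_count: int) -> str:
    """Mirror the documented rule: 0 images, 1 image, or 2 to 9 images."""
    if image_count == 0:
        return "text-to-video"
    if image_count == 1:
        return "image-to-video"
    if 2 <= image_count <= 9:
        return "reference-to-video"
    # Anything else falls outside the documented 0-9 range.
    raise ValueError(f"expected 0 to 9 images, got {image_count}")


for count in (0, 1, 4):
    print(count, "->", generation_mode(count))
```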

Happy Horse 1.1 vs Happy Horse 1.0: what actually changed

Here's the comparison, pulled directly from Apiframe's own model pages for both versions.

| | Happy Horse 1.0 | Happy Horse 1.1 |
| --- | --- | --- |
| Generation modes | Text-to-video, image-to-video | Text-to-video, image-to-video, reference-to-video |
| Reference images | One (as a first frame) | Up to 9, tagged in-prompt |
| Motion and consistency | Baseline | Improved motion expressiveness and temporal consistency |
| Audio | Synced, built in | Synced, built in |
| Resolution and duration | 720p/1080p, 3-15s | 720p/1080p, 3-15s |
| 1080p pricing | 48 credits/second | 31 credits/second |

That last row is worth sitting with. Happy Horse 1.1 does more (a whole extra generation mode, more reference inputs, better motion) and costs less per second at 1080p than the model it replaces. That's not the usual pattern for a version bump, and it's a genuinely useful detail if you're deciding whether to move existing 1.0 workflows over.
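To put a number on the 1080p price difference, here is a quick back-of-the-envelope calculation. It uses only the two 1080p rates quoted in the comparison (48 and 31 credits per second); the constant and function names are illustrative.

```python
# 1080p rates from the comparison, in credits per second.
RATE_1_0 = 48
RATE_1_1 = 31


def savings_1080p(seconds: int) -> tuple[int, float]:
    """Return (credits saved, fraction saved) for a 1080p clip moved from 1.0 to 1.1."""
    old_cost = RATE_1_0 * seconds
    new_cost = RATE_1_1 * seconds
    saved = old_cost - new_cost
    return saved, saved / old_cost


saved, fraction = savings_1080p(10)
print(f"10s at 1080p: {saved} credits saved ({fraction:.0%} cheaper)")
# prints: 10s at 1080p: 170 credits saved (35% cheaper)
```

Because both rates are per second, the percentage is the same for any clip length; only the absolute credit saving scales with duration.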

Key features

Three modes, one model. The mode is chosen automatically by how many images you pass in: zero runs text-to-video, one runs image-to-video, two to nine run reference-to-video. No separate endpoints to manage.

Reference up to 9 images. Pass multiple images and refer to them directly in your prompt as [Image 1], [Image 2], and so on. This is useful for keeping a character, a product, and a setting consistent across a single generation instead of describing them all in text.

Smoother motion and synced audio. Alibaba improved motion expressiveness and temporal consistency over 1.0, and every clip ships with audio synchronized to the action, so no separate audio pass is required.

Flexible output. Five aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4), two resolutions (720p, 1080p), and durations from 3 to 15 seconds, all billed per second.
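If you build requests programmatically, it can be worth checking output options against the values listed above before submitting. The sketch below uses only the aspect ratios, resolutions, and duration range from this guide. The function name is hypothetical, and whether the API accepts fractional durations is not stated here, so this check accepts any number in range.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720p", "1080p"}
MIN_SECONDS, MAX_SECONDS = 3, 15


def validate_output(aspect_ratio: str, resolution: str, seconds: float) -> list[str]:
    """Return a list of human-readable problems; an empty list means the options look fine."""
    problems = []
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        problems.append(f"duration must be {MIN_SECONDS} to {MAX_SECONDS} seconds, got {seconds}")
    return problems


print(validate_output("16:9", "1080p", 10))  # prints: []
print(validate_output("21:9", "1080p", 20))
```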

Specs at a glance

| Spec | Value |
| --- | --- |
| Provider | Alibaba |
| Aspect ratios | 16:9, 9:16, 1:1, 4:3, 3:4 |
| Resolutions | 720p, 1080p |
| Durations | 3 to 15 seconds |
| Image input | Supported (0, 1, or 2-9 images) |
| Audio | Supported, synced automatically |
| Avg. completion time | ~180 seconds |

Happy Horse 1.1 pricing

Pricing is per second of output: 24 credits/second at 720p and 31 credits/second at 1080p. A 10-second clip at 1080p works out to 310 credits. For current, confirmed rates, check the Happy Horse 1.1 API page directly.
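For budgeting, the per-second rates translate into a one-line estimate. This sketch hard-codes the two rates quoted above, so treat its output as an estimate rather than a quote. It also assumes whole-second durations, which this guide does not confirm, and `clip_cost` is an illustrative name.

```python
# Credits per second of output, as quoted in this guide.
CREDITS_PER_SECOND = {"720p": 24, "1080p": 31}


def clip_cost(resolution: str, seconds: int) -> int:
    """Estimate the credit cost of one clip at the given resolution and length."""
    if resolution not in CREDITS_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution}")
    if not 3 <= seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return CREDITS_PER_SECOND[resolution] * seconds


print(clip_cost("1080p", 10))  # prints: 310
print(clip_cost("720p", 10))   # prints: 240
```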

How to use Happy Horse 1.1 (via API)

Using Happy Horse 1.1 through an API comes down to three steps: get a key, submit a request, get the result.

```bash
curl -X POST https://api.apiframe.ai/v2/videos/generate \
  -H "X-API-Key: afk_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a cinematic sunrise over a futuristic cityscape, smooth camera push-in",
    "model": "happyhorse-1.1",
    "happyhorse11Params": {
      "images": [
        "https://example.com/input.jpg"
      ],
      "resolution": "1080p",
      "aspect_ratio": "16:9",
      "seed": 1
    }
  }'
```

That returns a jobId right away, since generation is asynchronous. With completion averaging roughly 180 seconds, a webhook is a better fit for production than tight polling.
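If you do poll, for instance during local testing, a relaxed interval with a hard timeout is enough. This guide does not cover the status endpoint or its status values, so the sketch below takes a caller-supplied `fetch_status` function. The `"completed"` and `"failed"` strings are placeholders, not confirmed API values.

```python
import time
from typing import Callable

# Placeholder terminal statuses; replace with the values the API actually returns.
TERMINAL_STATUSES = ("completed", "failed")


def wait_for_job(
    fetch_status: Callable[[], str],
    interval: float = 15.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() every `interval` seconds until it returns a terminal status."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"job still '{status}' after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

With a 15-second interval, a job that finishes near the 180-second average costs about a dozen status checks. The `sleep` parameter is injectable so the loop can be tested without real waiting.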

The images array is what controls the mode. Leave it empty for text-to-video, pass one URL for image-to-video, or pass two to nine URLs for reference-to-video, then reference each one in your prompt as [Image 1], [Image 2], and so on. The full Happy Horse 1.1 API docs cover every parameter.
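For reference, here is roughly what the same request looks like built from Python's standard library. The URL, headers, and field names are copied from the curl example. The helper name `build_request` is hypothetical, and the request is only constructed here, not sent, because the response shape beyond jobId is not covered in this guide.

```python
import json
import urllib.request

API_URL = "https://api.apiframe.ai/v2/videos/generate"


def build_request(api_key, prompt, images, resolution="1080p", aspect_ratio="16:9"):
    """Build (but do not send) a POST request; the length of `images` decides the mode."""
    body = {
        "prompt": prompt,
        "model": "happyhorse-1.1",
        "happyhorse11Params": {
            "images": list(images),  # [] = text-to-video, 1 URL = image-to-video, 2-9 = reference
            "resolution": resolution,
            "aspect_ratio": aspect_ratio,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_request(
    "afk_your_api_key_here",
    "A herd of horses thundering across desert dunes, dust trailing behind, golden hour.",
    images=[],  # empty list, so this would run as text-to-video
)
print(req.get_method(), req.full_url)
# To submit: urllib.request.urlopen(req), then read the JSON response for the jobId.
```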

Happy Horse 1.1 examples and prompts

A few real prompts that show the range of what Happy Horse 1.1 can do:

Text-to-video: "A herd of horses thundering across desert dunes, dust trailing behind, golden hour."

Reference-to-video, two images: "[Image 1] walks through the gate of [Image 2] at dusk, cinematic tracking shot."

Reference-to-video, product: "A product hero shot of [Image 1] rotating slowly on a reflective surface, studio lighting."

You can copy these exact prompts on the Happy Horse 1.1 model page.
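When adapting reference-to-video prompts like the ones above, an easy mistake is tagging an image you did not pass. The sketch below scans a prompt for [Image N] tags and reports any that have no matching image. It assumes tags are numbered from 1 in the order of the images array, which this guide implies but does not state outright, and `check_image_tags` is a hypothetical helper.

```python
import re

# Matches tags such as [Image 1] or [Image 9].
TAG = re.compile(r"\[Image (\d+)\]")


def check_image_tags(prompt: str, image_count: int) -> list[int]:
    """Return the tag numbers used in the prompt that exceed the number of images passed."""
    referenced = {int(n) for n in TAG.findall(prompt)}
    return sorted(n for n in referenced if not 1 <= n <= image_count)


prompt = "[Image 1] walks through the gate of [Image 2] at dusk, cinematic tracking shot."
print(check_image_tags(prompt, image_count=2))  # prints: []
print(check_image_tags(prompt, image_count=1))  # prints: [2]
```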

How Happy Horse 1.1 compares to other video models

Happy Horse 1.1's strongest differentiator is reference count: up to nine images in one generation, more than most video models on the market offer. If you're weighing it against ByteDance's line, the Seedance 2.0 guide covers a model built around similar multi-reference consistency at a higher resolution ceiling. If you want to compare it to Alibaba's other current video model, the Wan 2.7 guide breaks down a model with its own multi-reference and audio features. That is useful context, since both come out of Alibaba but target different tradeoffs.

FAQ

How is the generation mode chosen?

By the number of images you pass in. Zero images runs text-to-video, one runs image-to-video, two to nine run reference-to-video.

How do reference images work?

Pass up to nine images and refer to them in your prompt as [Image 1], [Image 2], and so on. Happy Horse 1.1 keeps those subjects and scenes consistent in the output.

What does Happy Horse 1.1 cost?

24 credits per second at 720p, 31 credits per second at 1080p. A 10-second 1080p clip is 310 credits. Check the model page for current pricing.

Is Happy Horse 1.1 better than 1.0?

It adds a whole generation mode (reference-to-video, up to 9 images), improves motion and temporal consistency, and costs less per second at 1080p (31 credits vs 48). For most use cases, yes.

Where can you access it?

Through Apiframe with a single API key, alongside every other model on the platform.

Ready to try it? Get an API key and start with free credits, or head to the Happy Horse 1.1 API page for the full docs and live pricing.
