Varya 14B Distills Wan 2.2 for $0.005/Sec Video Generation
Avataar AI has launched Varya, a 14-billion-parameter open-weight video model distilled from Wan 2.2 that cuts generation costs to $0.005 per second.
Avataar AI has released Varya, a 14-billion-parameter open-weight video generation model optimized for Indian cultural contexts and low-cost AI inference. By applying heavy distillation techniques to Alibaba’s Wan 2.2 architecture, the Bengaluru-based startup has reduced the cost of video generation to $0.005 per second. The release targets population-scale adoption across micro, small, and medium enterprises (MSMEs), rural education platforms, and government citizen services.
Distillation and Inference Benchmarks
Varya gets its speed from step-reduction distillation. The baseline Wan 2.2 model needs 50 generation steps to produce a standard video sequence; Varya needs only 4.
Fewer steps mean far less compute time on the same hardware. Rendering a 5-second, 720p video on an NVIDIA H200 GPU takes Varya 45 seconds. The base Wan 2.2 model needs approximately 1,230 seconds for the identical workload, which makes Varya roughly 27 times faster.
| Metric | Wan 2.2 | Varya |
|---|---|---|
| Generation Steps | 50 | 4 |
| Inference Time (5s, 720p) | 1,230s | 45s |
| Hardware | NVIDIA H200 | NVIDIA H200 |
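The ratios implied by the table can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable and function names are illustrative and do not come from Avataar AI.

```python
# Figures quoted in the article for a 5-second, 720p clip on one NVIDIA H200.
CLIP_SECONDS = 5
BASELINE = {"steps": 50, "wall_seconds": 1230}  # Wan 2.2
DISTILLED = {"steps": 4, "wall_seconds": 45}    # Varya


def ratio(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    return before / after


step_reduction = ratio(BASELINE["steps"], DISTILLED["steps"])
speedup = ratio(BASELINE["wall_seconds"], DISTILLED["wall_seconds"])
# GPU-seconds spent per second of finished video.
gpu_seconds_per_video_second = DISTILLED["wall_seconds"] / CLIP_SECONDS

print(f"Step reduction: {step_reduction:.1f}x")
print(f"Wall-clock speedup: {speedup:.1f}x")
print(f"GPU-seconds per video second: {gpu_seconds_per_video_second:.1f}")
```

The wall-clock speedup (about 27x) is larger than the step reduction (12.5x), so the step count alone does not account for the whole gain. The article does not say where the remainder comes from.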
Cultural Alignment
Global foundation models lean heavily on Western visual concepts and aesthetic assumptions. Avataar AI trained Varya on a curated dataset built for the Indian subcontinent so that users do not need complex prompting workarounds. The model recognizes and renders local festivals, regional architectural styles, national cuisine, and traditional attire such as sarees and turbans.
Pricing Economics
The primary constraint on generative video adoption is the unit cost per output. Avataar AI explicitly prices Varya at ₹0.48 per second, which converts to approximately $0.005.
This pricing changes the production math for developers building high-volume video applications. Proprietary global platforms such as Google’s Veo, Luma, Kling, and Runway typically charge $0.10 or more per second of generated video. Varya is up to 27 times cheaper than these established alternatives while maintaining domain-specific accuracy for regional content.
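To see what the per-second prices mean at volume, here is a rough sketch of monthly spend. The prices are the ones quoted above; the workload (10,000 five-second clips per day over a 30-day month) and all names in the code are hypothetical, chosen only for illustration.

```python
# Prices quoted in the article.
VARYA_USD_PER_SECOND = 0.005
VARYA_INR_PER_SECOND = 0.48
PROPRIETARY_USD_PER_SECOND = 0.10  # the article's "$0.10 or more" floor


def monthly_cost_usd(clips_per_day: int, seconds_per_clip: float,
                     usd_per_second: float, days: int = 30) -> float:
    """Total spend for a steady daily workload over `days` days."""
    total_video_seconds = clips_per_day * seconds_per_clip * days
    return total_video_seconds * usd_per_second


# Hypothetical workload: 10,000 five-second clips per day.
varya_monthly = monthly_cost_usd(10_000, 5, VARYA_USD_PER_SECOND)
proprietary_monthly = monthly_cost_usd(10_000, 5, PROPRIETARY_USD_PER_SECOND)

price_ratio = PROPRIETARY_USD_PER_SECOND / VARYA_USD_PER_SECOND
implied_inr_per_usd = VARYA_INR_PER_SECOND / VARYA_USD_PER_SECOND

print(f"Varya: ${varya_monthly:,.0f} per month")
print(f"Proprietary floor: ${proprietary_monthly:,.0f} per month")
print(f"Price ratio at the floor: {price_ratio:.0f}x")
print(f"Implied exchange rate: {implied_inr_per_usd:.0f} INR per USD")
```

Under these assumptions the workload costs about $7,500 per month on Varya and about $150,000 at the $0.10 floor, a 20x gap. A 27x gap would correspond to a competitor charging around $0.135 per second. The implied exchange rate of 96 INR per USD reflects the rounding in the quoted dollar price.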
IndiaAI Mission and Open Weights
The Indian government heavily subsidized Varya’s development through the $1.2 billion IndiaAI Mission. Avataar AI secured access to national AI compute infrastructure as one of 12 selected projects under the federal initiative, accelerating the distillation and training timeline.
The model will be distributed as an open-weight download on the government’s AI Kosh portal. This distribution method allows developers to self-host the model or modify the weights for highly specific enterprise use cases. Avataar AI is also actively building integrations to surface Varya within existing creative workflows, including Higgsfield and Adobe Firefly. A trial version supporting text-to-video and image-to-video generation pipelines is currently live on the company’s website.
If you are building media applications for the Indian market, downloading the open weights from AI Kosh allows you to establish a predictable, fixed-cost inference pipeline. Test the prompt alignment using the web trial before provisioning dedicated H200 clusters for your production deployment.
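Before provisioning hardware, it can help to estimate when self-hosting beats the quoted $0.005 per second. The sketch below assumes the 45-second H200 benchmark carries over to your own deployment, with one clip rendered at a time and no batching. The GPU hourly rate and the utilization figure are hypothetical inputs, not numbers from Avataar AI or any cloud provider.

```python
# From the article: a 5-second clip takes 45 seconds on one H200.
GPU_SECONDS_PER_VIDEO_SECOND = 45 / 5
HOSTED_USD_PER_VIDEO_SECOND = 0.005  # Varya's quoted price


def self_hosted_usd_per_video_second(gpu_usd_per_hour: float,
                                     utilization: float = 1.0) -> float:
    """GPU cost per second of finished video.

    `utilization` is the fraction of paid GPU time spent rendering;
    idle time raises the effective cost.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    gpu_usd_per_second = gpu_usd_per_hour / 3600
    return gpu_usd_per_second * GPU_SECONDS_PER_VIDEO_SECOND / utilization


def break_even_gpu_usd_per_hour(utilization: float = 1.0) -> float:
    """Hourly GPU rate at which self-hosting matches the hosted price."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return (HOSTED_USD_PER_VIDEO_SECOND * 3600
            / GPU_SECONDS_PER_VIDEO_SECOND * utilization)


# Hypothetical rental rate of $3.00 per H200-hour at 60% utilization.
example_cost = self_hosted_usd_per_video_second(3.00, utilization=0.6)

print(f"Break-even at full utilization: "
      f"${break_even_gpu_usd_per_hour():.2f} per GPU-hour")
print(f"Example self-hosted cost: ${example_cost:.4f} per video second")
```

At full utilization the break-even works out to about $2.00 per GPU-hour, so self-hosting only undercuts the quoted price if you can rent or amortize an H200 below that rate and keep it busy. Engineering and storage costs are not included.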