ChatGPT Images 2.0 Adds Multilingual Text and Thinking Mode
OpenAI released ChatGPT Images 2.0 with the gpt-image-2 model, adding agentic web search, 2K resolution, and non-Latin script rendering capabilities.
OpenAI released ChatGPT Images 2.0 on April 21, 2026, replacing the previous gpt-image-1.5 model. The update introduces the gpt-image-2 model, featuring native web search, multilingual text rendering, and a new agentic reasoning mode that drafts image layouts before rendering. The release coincided with OpenAI’s permanent discontinuation of its Sora video model, marking a strategic shift toward high-utility static visuals and reasoning-based visual workflows.
Technical Architecture and API Pricing
The core of the update is a “Thinking” mode available to Plus, Pro, and Business subscribers. The model can search the web for real-time context and generate up to eight coherent images from a single prompt. It supports wide aspect ratios suitable for infographics and cinematic portraits, with resolutions up to 2K. Higher resolutions are currently available in beta via the API.
| Feature | Details |
|---|---|
| Core Model | gpt-image-2 |
| Output Limit | Up to 8 Images |
| Max Resolution | 2K (Higher in Beta) |
| Input Cost | $8.00 per 1M tokens |
| Output Cost | $30.00 per 1M tokens |
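The rates in the table translate into a per-request estimate with simple arithmetic. The sketch below assumes billing is linear in token counts at the quoted rates; the function name and the example token counts are hypothetical and chosen only for illustration, since the article does not state how many tokens a typical image consumes.

```python
# Rates taken from the table above (USD per 1M tokens).
INPUT_USD_PER_MTOK = 8.00
OUTPUT_USD_PER_MTOK = 30.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    return (
        input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# Hypothetical token counts, for illustration only.
example_cost = request_cost_usd(input_tokens=1_500, output_tokens=6_000)
print(f"Estimated cost: ${example_cost:.4f}")
```

At these assumed counts the output tokens account for most of the estimate, which is why the output rate is the figure to watch when budgeting.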
Standard prompt engineering principles apply when writing requests for these layout capabilities, because the model reasons in text about object placement before it renders pixels.
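One way to apply that principle is to state the layout explicitly in the prompt text. The helper below is a minimal sketch that only assembles a string; its name, parameters, and wording are hypothetical and do not reflect any official prompt format or API for gpt-image-2.

```python
def build_layout_prompt(
    title: str, sections: list[str], language: str, aspect: str
) -> str:
    """Assemble a structured text prompt that spells out the layout.

    Hypothetical helper: the phrasing is illustrative, not a documented format.
    """
    lines = [
        f"Create an infographic titled '{title}'.",
        f"Render all text in {language}.",
        f"Use a {aspect} aspect ratio.",
        "Arrange the sections top to bottom in this order:",
    ]
    # Number the sections so their order is unambiguous.
    lines += [f"{i}. {name}" for i, name in enumerate(sections, start=1)]
    return "\n".join(lines)


prompt = build_layout_prompt(
    title="Monsoon Safety",
    sections=["Before the storm", "During the storm", "After the storm"],
    language="Hindi",
    aspect="9:16",
)
print(prompt)
```

Keeping the title, language, aspect ratio, and section order on separate lines makes each constraint easy to change and review independently.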
Multilingual Capabilities and Regional Adoption
The gpt-image-2 model natively renders non-Latin scripts, including Hindi, Bengali, Chinese, Japanese, and Korean. This multilingual text capability drove immediate adoption in India, where the application recorded 5 million downloads during the launch week compared to 2 million in the U.S. Sensor Tower data showed a 3.4% week-over-week increase in daily active users in India.
OpenAI supported this regional growth by piloting a “ChatGPT Go” subscription tier priced at $8 per month, positioned between the free and premium plans. Users in the region primarily use the tool for high-fidelity avatars, cinematic portraits, and fantasy-themed imagery.
Global reception remains measured. Similarweb data showed only a 1.6% increase in global web traffic following the release. The modest Western response aligns with established competition from Google’s Nano Banana 2, which launched in February 2026 with similar dense text rendering capabilities.
If you integrate image generation into production applications, weigh the output rate of $30 per 1M tokens against the reduced need for multi-shot prompting. Because the model plans layouts and verifies text structure before rendering, it can reduce the total number of API calls needed to reach a usable infographic or localized asset.
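That trade-off can be framed as a break-even calculation. The sketch below compares one larger "planned" call against several smaller attempts, using only the quoted output rate and ignoring input tokens for simplicity. The token counts and the number of attempts are hypothetical, not measured figures.

```python
OUTPUT_USD_PER_MTOK = 30.00  # output rate quoted in the article


def output_cost_usd(tokens_per_call: int, calls: int) -> float:
    """Output-token cost of making `calls` requests of equal size."""
    return tokens_per_call * calls * OUTPUT_USD_PER_MTOK / 1_000_000


def break_even_calls(planned_tokens: int, retry_tokens: int) -> float:
    """Number of smaller calls whose total cost equals one planned call."""
    return planned_tokens / retry_tokens


# Hypothetical workloads: one planned call vs. three unplanned attempts.
planned = output_cost_usd(tokens_per_call=9_000, calls=1)
multi_shot = output_cost_usd(tokens_per_call=6_000, calls=3)
print(f"planned: ${planned:.2f}, multi-shot: ${multi_shot:.2f}")
print(f"break-even at {break_even_calls(9_000, 6_000):.1f} attempts")
```

Under these assumptions the planned call is cheaper whenever the unplanned workflow needs two or more attempts. Real token counts from your own usage logs should replace the placeholder values before drawing conclusions.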