ChatGPT Images 2.0 Adds Multilingual Text and Thinking Mode

OpenAI released ChatGPT Images 2.0 with the gpt-image-2 model, adding agentic web search, 2K resolution, and non-Latin script rendering capabilities.

OpenAI released ChatGPT Images 2.0 on April 21, 2026, replacing the previous gpt-image-1.5 model. The update introduces the gpt-image-2 model, featuring native web search, multilingual text rendering, and a new agentic reasoning mode that drafts image layouts before rendering. The release coincided with OpenAI’s permanent discontinuation of its Sora video model, marking a strategic shift toward high-utility static visuals and reasoning-based visual workflows.

Technical Architecture and API Pricing

The core of the update is a “Thinking” mode available to Plus, Pro, and Business subscribers. The model can search the web for real-time context and generate up to eight coherent images from a single prompt. It supports wide aspect ratios suitable for infographics and cinematic portraits, with resolutions up to 2K. Higher resolutions are currently available in beta via the API.

| Feature | Details |
| --- | --- |
| Core Model | gpt-image-2 |
| Output Limit | Up to 8 images |
| Max Resolution | 2K (higher in beta) |
| Input Cost | $8.00 per 1M tokens |
| Output Cost | $30.00 per 1M tokens |
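To make the per-token prices concrete, here is a minimal sketch of a cost estimate for a single request. The two price constants come from the figures above; the function name and the token counts in the example are hypothetical, since the article does not say how many tokens a typical image consumes.

```python
# Prices quoted above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 8.00
OUTPUT_USD_PER_MTOK = 30.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Token counts below are made up for illustration only.
    print(f"${estimate_cost(500, 4_000):.4f}")  # prints $0.1240
```

Output tokens dominate the total in this example, so any estimate is most sensitive to how many output tokens each image actually uses.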

When formatting requests for these new layout capabilities, standard prompt engineering principles apply, as the model uses text-based reasoning to determine object placement before rendering pixels.
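One way to apply that advice is to state the layout explicitly in the prompt text. The sketch below is an assumption about what a well-structured prompt might look like, not a format documented by OpenAI; `Region` and `build_layout_prompt` are hypothetical names, and the code only assembles a string without calling any API.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One area of the layout (hypothetical structure, for illustration)."""
    position: str   # where the element goes, e.g. "top banner"
    content: str    # what should be drawn there
    text: str = ""  # literal text to render in that area, if any


def build_layout_prompt(subject: str, aspect: str, regions: list[Region]) -> str:
    """Assemble a plain-text prompt that spells out placement region by region."""
    lines = [f"Create {subject}.", f"Aspect ratio: {aspect}.", "Layout:"]
    for i, region in enumerate(regions, start=1):
        line = f"{i}. {region.position}: {region.content}."
        if region.text:
            # Quote literal text so it is clearly separated from instructions.
            line += f' Render this text exactly: "{region.text}"'
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_layout_prompt(
        "an infographic about monsoon rainfall",
        "16:9",
        [
            Region("top banner", "title in large type", "मानसून 2026"),
            Region("left column", "bar chart of monthly rainfall"),
        ],
    )
    print(prompt)
```

Keeping literal text in quotes, separate from the description of each region, is a general prompt-writing habit rather than a requirement of the model.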

Multilingual Capabilities and Regional Adoption

The gpt-image-2 model natively renders non-Latin scripts, including Hindi, Bengali, Chinese, Japanese, and Korean. This multilingual text capability drove immediate adoption in India, where the application recorded 5 million downloads during the launch week compared to 2 million in the U.S. Sensor Tower data showed a 3.4% week-over-week increase in daily active users in India.

OpenAI supported this regional growth by piloting a “ChatGPT Go” subscription tier priced at $8 per month, designed to bridge the gap between the free and premium tiers. Users in the region primarily use the tool for high-fidelity avatars, cinematic portraits, and fantasy-themed imagery.

Global reception remains measured. Similarweb data showed only a 1.6% increase in global web traffic following the release. The modest Western response aligns with established competition from Google’s Nano Banana 2, which launched in February 2026 with similar dense text rendering capabilities.

If you integrate image generation into production applications, weigh the $30 per million output token cost against the reduced need for multi-shot prompting, that is, repeated attempts to get a usable result. Because the model plans layouts and verifies text structure before rendering, it can reduce the total number of API calls required to produce a usable infographic or localized asset.
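That trade-off can be framed as cost per usable asset rather than cost per call. The sketch below is a minimal illustration: only the $30 output price comes from the article, while the baseline price, token counts, and attempts per asset are hypothetical placeholders that you would replace with measurements from your own workload.

```python
def cost_per_usable_asset(
    output_tokens_per_call: int,
    usd_per_mtok: float,
    calls_per_asset: float,
) -> float:
    """Average USD spent on output tokens to obtain one usable image."""
    cost_per_call = output_tokens_per_call / 1_000_000 * usd_per_mtok
    return cost_per_call * calls_per_asset


if __name__ == "__main__":
    # gpt-image-2 at the quoted $30 per 1M output tokens; the token count
    # and the 1.5 attempts per asset are assumptions.
    planned = cost_per_usable_asset(6_000, 30.00, calls_per_asset=1.5)

    # A hypothetical cheaper baseline that needs more retries. Its price,
    # token count, and retry rate are invented for illustration.
    baseline = cost_per_usable_asset(4_000, 20.00, calls_per_asset=4)

    print(f"planned:  ${planned:.2f} per usable asset")
    print(f"baseline: ${baseline:.2f} per usable asset")
```

With these made-up inputs the higher per-token price is offset by fewer retries, but the result flips if the retry rates are closer together, so the attempts-per-asset figure is the number worth measuring.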
