GPT-5.5 Instant Update Drops Canvas as Legacy Models Face Sunset
OpenAI updated its GPT-5.5 Instant model to reduce formulaic outputs while setting strict retirement dates for GPT-4.5 and o3 in the ChatGPT interface.
On June 2, 2026, OpenAI rolled out a significant update to its GPT-5.5 Instant model alongside a strict deprecation timeline for older ChatGPT architectures. The release reallocates consumer compute capacity toward the GPT-5.5 series while adjusting the model’s fundamental conversational style. The update builds on the original April 23 release of the Instant model. For developers routing traffic via the API, the chat-latest snapshot has pointed to this revised version since May 28.
Style and Performance Adjustments
The updated GPT-5.5 Instant directly targets overly rigid formatting. Internal research lead Michelle Pokras noted the previous iteration suffered from being “bullet-pilled”, often returning unnecessarily long, list-heavy responses. The new version prioritizes natural language flow, optimizing pacing for practical help tasks.
Beyond formatting, the model demonstrates measurable performance improvements. The underlying weight updates improve multilingual execution, increase factual accuracy, and decrease sycophantic behavior. If you rely on rigid structural outputs, you may need to adjust your system prompts to force the desired list formatting.
Interface and Ecosystem Changes
The ChatGPT interface is fundamentally changing how it handles structural tasks. OpenAI is removing the Canvas feature entirely from both GPT-5.5 Instant and GPT-5.5 Thinking. Writing and coding tasks are now confined directly to the chat interface using specialized writing blocks and code blocks.
The update also introduces a live job search tool directly inside ChatGPT. Users can query real-time job listings and freelance contracts sourced from Indeed, Upwork, and Appcast without leaving the conversational interface.
Simultaneously, enterprise availability is expanding. As of June 1, GPT-5.5, GPT-5.4, and Codex are generally available on Amazon Bedrock. This integration allows enterprise teams to run these models on next-generation AI inference engines with AWS-native security and governance controls.
Legacy Model Retirement
To support the compute demands of the 5.5 generation, OpenAI is removing two major models from the ChatGPT Plus and Pro consumer interfaces.
| Model | Sunset Period | Interface Retirement Date |
|---|---|---|
| GPT-4.5 | 30 days | June 27, 2026 |
| OpenAI o3 | 90 days | August 26, 2026 |
These retirements apply strictly to the ChatGPT consumer application. OpenAI confirmed that both GPT-4.5 and o3 will remain accessible via the API for the foreseeable future, following standard deprecation timelines for developer endpoints.
If you are building consumer-facing applications that previously relied on the conversational tone of GPT-4.5, evaluate the new GPT-5.5 Instant behavior using the chat-latest alias. Update any internal documentation that instructs users to select specific legacy models within the ChatGPT interface, as those options will disappear by late summer.
Get Insanely Good at AI
The book for developers who want to understand how AI actually works. LLMs, prompt engineering, RAG, AI agents, and production systems.
Keep Reading
How to Cut Checkpoint Time by 85% With TRL Delta Weight Sync
Learn how to configure TRL Delta Weight Sync to reduce trillion-parameter model checkpointing times by 85 percent using Hugging Face Hub Buckets.
Cascaded Speech Pipeline Brings Reachy Mini Inference Local
Hugging Face released an offline conversational stack for the Reachy Mini robot that replaces cloud APIs with a local pipeline built on Gemma 4 and Qwen3-TTS.
Apache 2.0 Gets 218B Command A+ as Cohere Acquires Reliant AI
Cohere expanded its sovereign AI strategy by open-sourcing the 218-billion parameter Command A+ model and acquiring biopharma startup Reliant AI.
Stanford Finds RLHF Drives 49% More AI Sycophancy Than Humans
A Stanford study reveals that leading AI models, including GPT-5.5 and Gemini, endorse user views 49% more often than human advisors due to RLHF incentives.
TML-Interaction-Small Achieves 0.40s Full-Duplex Latency
Thinking Machines Lab has released a research preview of TML-Interaction-Small, a 276-billion-parameter Mixture-of-Experts model for full-duplex conversation.