OpenAI Releases GPT-5.5 and a Unified Desktop Agent
OpenAI released its GPT-5.5 frontier model alongside a new unified desktop application that merges ChatGPT, Codex, and Atlas for agentic workflows.
On April 23, 2026, OpenAI released GPT-5.5, a new frontier model optimized for autonomous multi-step tasks. The rollout includes a unified desktop application that merges ChatGPT, Codex, and Atlas into a single workspace. This architectural shift enables the model to maintain context across web browsing, writing code, and automating general software operations.
Model Variants and Pricing
OpenAI introduced two primary model tiers alongside a specialized processing mode called GPT-5.5 Thinking. The standard GPT-5.5 model is available immediately to ChatGPT Plus, Pro, Business, and Enterprise users. The higher-performance GPT-5.5 Pro uses parallel test-time compute for complex data science and legal research work, and is restricted to the Pro ($200 per month), Business, and Enterprise tiers.
API access is scheduled for a future update. The general API will support a 1 million token context window, with the following per-token pricing:
| Model | Input Price (per 1M tokens) | Output Price (per 1M tokens) |
|---|---|---|
| GPT-5.5 | $5.00 | $30.00 |
| GPT-5.5 Pro | $30.00 | $180.00 |
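To make the table concrete, here is a minimal sketch that estimates the cost of a single request from its token counts. The prices come from the table above; the function name `estimate_cost_usd` and the dictionary keys `gpt-5.5` and `gpt-5.5-pro` are illustrative labels, not confirmed API model identifiers, and the sketch assumes flat per-token billing with no caching or batch discounts.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing table above.
# The dictionary keys are illustrative labels, not confirmed API model IDs.
PRICES_PER_MILLION = {
    "gpt-5.5": {"input": Decimal("5.00"), "output": Decimal("30.00")},
    "gpt-5.5-pro": {"input": Decimal("30.00"), "output": Decimal("180.00")},
}

MILLION = Decimal(1_000_000)


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming flat per-token billing."""
    prices = PRICES_PER_MILLION[model]
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / MILLION


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 10k output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost_usd(model, input_tokens=200_000, output_tokens=10_000)
        print(f"{model}: ${cost:.2f}")
```

For this hypothetical workload the standard model comes to $1.30 and the Pro model to $7.80, which shows how quickly the sixfold price difference adds up on long prompts.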
Unified Desktop Application
The desktop client brings together OpenAI’s previously standalone tools. By combining ChatGPT with the Codex coding agent and the Atlas browser agent, it lets developers execute cross-application workflows without manually moving data between windows. Within the Codex environment, the model operates with a smaller 400,000 token context window.
Benchmark Results
OpenAI reports that GPT-5.5 achieved state-of-the-art results on 14 technical benchmarks. The model consistently outperformed Anthropic’s Claude Opus 4.7 and Google Gemini 3.1 Pro in agentic evaluations.
| Benchmark | GPT-5.5 Score | Claude Opus 4.7 Score |
|---|---|---|
| Terminal-Bench 2.0 | 82.7% | 69.4% |
| OSWorld-Verified | 78.7% | 78.0% |
| FrontierMath Tier 4 | 39.6% (Pro variant) | 22.9% |
The standard model also scored 73.1% on Expert-SWE, an internal coding benchmark measuring tasks that average 20 hours of human work. On GDPval, which tests economic tasks across 44 fields, the model set a new record at 84.9%.
Infrastructure and Training
Internally codenamed Spud, the model was trained and is served on NVIDIA GB200 and GB300 NVL72 infrastructure. The architecture is more token-efficient than GPT-5.4, delivering equivalent AI coding workflows at half the compute cost of competing models. OpenAI used GPT-5.5 during development to write new load-balancing heuristics for its own serving infrastructure, resulting in a 20 percent increase in token generation speed.
Security testing evaluated biological and cyber threats. Under OpenAI’s Preparedness Framework, the cybersecurity capability rating is classified as High.
If you build systems that rely on cross-application context, the integration of Codex and Atlas within a single desktop environment changes how you structure multi-step automation. You should evaluate the GPT-5.5 API pricing structure against your current retrieval setups to determine if the 1 million token context window offsets the higher output token costs.
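One rough way to frame that evaluation is to compare the per-request cost of sending a large document set directly in the prompt against sending only retrieved excerpts. The sketch below uses the standard GPT-5.5 prices and the 1 million token limit reported in this article; the function names and the example token counts are hypothetical, and it ignores the cost of running the retrieval pipeline itself as well as any quality differences between the two approaches.

```python
# Standard GPT-5.5 prices in USD per 1M tokens, as reported in this article.
INPUT_USD_PER_M = 5.00
OUTPUT_USD_PER_M = 30.00
# Context limit of the general API, as reported in this article.
API_CONTEXT_LIMIT = 1_000_000


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request, assuming flat per-token billing."""
    # Simplification: only the prompt is checked against the context limit.
    if input_tokens > API_CONTEXT_LIMIT:
        raise ValueError("prompt exceeds the 1M-token context window")
    return (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def compare(full_context_tokens: int, retrieved_tokens: int, output_tokens: int) -> dict:
    """Compare a long-context prompt against a retrieval-trimmed prompt."""
    long_context = request_cost(full_context_tokens, output_tokens)
    retrieval = request_cost(retrieved_tokens, output_tokens)
    return {
        "long_context": long_context,
        "retrieval": retrieval,
        "extra_per_request": long_context - retrieval,
    }


if __name__ == "__main__":
    # Hypothetical workload: an 800k-token corpus versus 20k tokens of retrieved excerpts.
    result = compare(full_context_tokens=800_000, retrieved_tokens=20_000, output_tokens=2_000)
    for name, usd in result.items():
        print(f"{name}: ${usd:.2f}")
```

With these made-up numbers the long-context request costs about $4.06 against $0.16 for the retrieval-trimmed one, and because the output length is the same in both cases the entire gap comes from input tokens. Plugging in your own token counts shows whether dropping the retrieval layer is worth that difference.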