Amazon Bedrock Gains GPT-5.5 and Codex in $50B OpenAI Deal
Following the end of Microsoft's exclusive distribution rights, Amazon Web Services has introduced OpenAI's GPT-5.5 and Codex models to the Bedrock platform.
The formal end of Microsoft’s exclusive cloud distribution rights for OpenAI products has shifted the enterprise landscape, culminating in a massive expansion of the AWS and OpenAI partnership that brings GPT-5.5 to Amazon Bedrock. Backed by a revised $50 billion investment commitment from Amazon, the integration allows AWS customers to run OpenAI’s frontier models natively within their existing cloud perimeters for the first time. The total value of commercial ties between the two companies is projected to exceed $138 billion over eight years.
Core Models in Limited Preview
The initial rollout introduces three distinct OpenAI offerings to the AWS ecosystem, all currently in limited preview. Developers can access the models through the standard Bedrock InvokeModel API using identifiers such as `modelId: 'openai.gpt-5-4'`.
- GPT-5.4: Available immediately for preview testing.
- GPT-5.5: OpenAI’s latest frontier model, released on April 23, 2026, is scheduled to arrive on Bedrock in the coming weeks.
- Codex: The specialized coding model is now integrated directly into AWS development workflows.
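The InvokeModel flow described above can be sketched with boto3. The `openai.gpt-5-4` identifier comes from the preview announcement; the request body schema and the response field name are assumptions modeled on Bedrock's existing native-model pattern, not a confirmed contract for the OpenAI models.

```python
import json

# 'openai.gpt-5-4' is the identifier cited in the preview; the body schema
# and the "output" response field below are assumptions, not documented API.
MODEL_ID = "openai.gpt-5-4"

def build_invoke_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble keyword arguments for a bedrock-runtime InvokeModel call."""
    return {
        "modelId": MODEL_ID,
        "contentType": "application/json",
        "accept": "application/json",
        "body": json.dumps({"prompt": prompt, "max_tokens": max_tokens}),
    }

def invoke(prompt: str) -> str:
    import boto3  # requires AWS credentials with bedrock:InvokeModel
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(**build_invoke_request(prompt))
    return json.loads(response["body"].read())["output"]  # field name assumed
```

The request-building step is kept separate from the network call so the payload shape can be unit-tested without live AWS credentials.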
The performance gap between the two frontier models is significant for long-context retrieval tasks. On the MRCR v2 benchmark, which evaluates retrieval across a 1 million token context window, GPT-5.5 more than doubles its predecessor's score.
| Model | MRCR v2 Score (1M Tokens) | Codex Token Reduction |
|---|---|---|
| GPT-5.4 | 36.6% | Baseline |
| GPT-5.5 | 74.0% | 40% fewer tokens per task |
For teams evaluating models for coding applications, the 40 percent token reduction in GPT-5.5 fundamentally changes the cost structure of high-volume automated code generation.
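A back-of-the-envelope calculation shows the cost impact of the reported 40 percent reduction. The per-token price and tokens-per-task figures below are illustrative assumptions; Bedrock pricing for these models is not stated in the announcement.

```python
# Assumed figures for illustration only: price and baseline usage are not
# from the announcement; only the 40% token reduction is.
PRICE_PER_1K_TOKENS = 0.01          # assumed USD per 1,000 tokens
TASKS = 1_000_000                   # monthly automated code-generation tasks
BASELINE_TOKENS_PER_TASK = 2_000    # assumed GPT-5.4 consumption

def monthly_cost(tokens_per_task: float) -> float:
    """Total monthly spend for TASKS invocations at a given token rate."""
    return TASKS * tokens_per_task / 1_000 * PRICE_PER_1K_TOKENS

gpt_5_4_cost = monthly_cost(BASELINE_TOKENS_PER_TASK)
gpt_5_5_cost = monthly_cost(BASELINE_TOKENS_PER_TASK * 0.60)  # 40% fewer tokens

print(f"GPT-5.4: ${gpt_5_4_cost:,.0f}  GPT-5.5: ${gpt_5_5_cost:,.0f}")
```

Under these assumptions, a workload that cost $20,000 a month on GPT-5.4 drops to $12,000 on GPT-5.5; at high volume, the token reduction dominates any per-token price difference.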
Managed Agents and Developer Tooling
Beyond foundational models, the partnership introduces Amazon Bedrock Managed Agents, powered by OpenAI. This service runs on a co-developed Stateful Runtime Environment (SRE). The SRE combines OpenAI’s underlying models with a proprietary agentic execution harness. This framework allows the models to maintain context, execute multi-step workflows, and take autonomous actions across deployed AWS services. Implementing complex multi-agent coordination patterns is simplified when the orchestration layer has native permissions to interact with S3, Lambda, and DynamoDB.
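The S3, Lambda, and DynamoDB permissions such an agent execution role would need could look like the following IAM policy sketch. The actions and ARNs are illustrative placeholders, not the managed service's actual requirements.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AgentDataPlaneAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "lambda:InvokeFunction",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Query"
      ],
      "Resource": [
        "arn:aws:s3:::example-agent-bucket/*",
        "arn:aws:lambda:us-east-1:123456789012:function:example-agent-tool",
        "arn:aws:dynamodb:us-east-1:123456789012:table/example-agent-state"
      ]
    }
  ]
}
```

Scoping the role to named resources, rather than wildcards, keeps an autonomous agent's blast radius bounded even when it can take multi-step actions.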
Codex access also extends beyond the standard API: AWS developers can use a new Codex CLI, a standalone desktop application, and a Visual Studio Code extension. Codex currently serves more than 4 million weekly active users across the software development lifecycle.
Infrastructure and Network Topologies
Running OpenAI model inference on AWS changes the security and networking topology for enterprise teams. All OpenAI model inference on Bedrock runs strictly within the AWS security boundary, and the environment enforces zero operator access: no human at AWS or OpenAI can view customer inference payloads or outputs.
Network routing utilizes AWS PrivateLink. This keeps traffic strictly on the Amazon network backbone. Eliminating the cross-cloud hop required to hit Azure or OpenAI’s direct APIs removes the 40 to 70 milliseconds of network latency typically associated with external service calls. To support these workloads at scale, OpenAI has committed to utilizing 2 gigawatts of AWS Trainium3 and Trainium4 compute capacity.
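Keeping inference on PrivateLink means provisioning an interface endpoint for the Bedrock runtime in your VPC. A minimal boto3 sketch, with VPC, subnet, and security-group IDs left as caller-supplied placeholders:

```python
def bedrock_service_name(region: str) -> str:
    """PrivateLink service name for the Bedrock runtime in a given region."""
    return f"com.amazonaws.{region}.bedrock-runtime"

def create_bedrock_endpoint(vpc_id: str, subnet_ids: list, sg_ids: list,
                            region: str = "us-east-1") -> dict:
    """Provision an interface endpoint so inference traffic stays on the
    Amazon backbone. All IDs are supplied by the caller; none are real here."""
    import boto3  # requires ec2:CreateVpcEndpoint permissions
    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_vpc_endpoint(
        VpcId=vpc_id,
        VpcEndpointType="Interface",
        ServiceName=bedrock_service_name(region),
        SubnetIds=subnet_ids,
        SecurityGroupIds=sg_ids,
        PrivateDnsEnabled=True,  # resolve the public API name to the endpoint
    )
```

With private DNS enabled, existing SDK code needs no changes; calls to the Bedrock runtime hostname resolve to the in-VPC endpoint automatically.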
For platform engineering teams, this release unifies vendor management. You can now route inference requests to Claude Opus 4.7, Mistral, Meta, and GPT-5.5 through a single AWS billing and IAM governance framework. Consolidate your security groups and IAM roles to manage access to all frontier models from one central control plane.
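That single control plane can be sketched with Bedrock's vendor-agnostic Converse API, which uses one request shape regardless of model provider. Only the OpenAI identifier appears in the announcement; the other model IDs below are illustrative placeholders.

```python
# One IAM role, one billing account, one request shape for every vendor.
# Only 'openai.gpt-5-4' is cited in the preview; the rest are placeholders.
MODELS = [
    "openai.gpt-5-4",
    "anthropic.claude-opus-4-7",   # placeholder identifier
    "meta.llama4-maverick",        # placeholder identifier
]

def converse_request(model_id: str, prompt: str) -> dict:
    """Build a vendor-agnostic Converse call for any Bedrock model."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 512},
    }

def ask_all(prompt: str) -> dict:
    import boto3  # one client, one credential chain, every vendor
    client = boto3.client("bedrock-runtime")
    answers = {}
    for model_id in MODELS:
        resp = client.converse(**converse_request(model_id, prompt))
        answers[model_id] = resp["output"]["message"]["content"][0]["text"]
    return answers
```

Because every model flows through the same `bedrock:InvokeModel` permission and billing meter, swapping vendors becomes a one-line change rather than a new procurement and security review.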