Arm Launches First In-House AGI CPU
Arm unveiled its first production silicon, a 136-core data center CPU for agentic AI workloads, with Meta as lead partner.
Arm has launched the Arm AGI CPU, its first production silicon product and its first Arm-designed data center CPU. The March 24 launch matters because Arm is moving from selling IP and subsystems into selling its own server processor for AI infrastructure, aimed directly at the control-plane and orchestration work around large accelerator clusters.
Product specifications
Arm positions the Arm AGI CPU as a CPU for agentic AI data centers, where the bottleneck is no longer only model execution on accelerators. CPU capacity also has to cover accelerator management, scheduling, memory and storage coordination, networking support, and service hosting around inference systems.
The chip is built on Arm Neoverse V3 and scales up to 136 cores per CPU. Arm quotes 6 GB/s of memory bandwidth per core, sub-100 ns latency, and a 300 W TDP.
One detail stands out for operators who care about predictable latency under sustained load. Arm says the processor uses a dedicated core per program thread, with the pitch centered on deterministic behavior rather than peak burst throughput.
| Spec | Arm AGI CPU |
|---|---|
| Core architecture | Arm Neoverse V3 |
| Max cores | 136 |
| Memory bandwidth | 6 GB/s per core |
| Latency | Sub-100 ns |
| TDP | 300 W |
| Process node | TSMC 3nm |
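Taken at face value, the quoted figures imply a few per-socket numbers that Arm does not state directly. The sketch below derives them in Python. It assumes the 6 GB/s per-core figure holds with all 136 cores active and that "a dedicated core per program thread" means one software thread pinned per core. The names `aggregate_bandwidth_gbps` and `cpus_for_threads` are hypothetical helpers for illustration, not part of any Arm tooling.

```python
import math

# Figures quoted in the launch materials.
MAX_CORES_PER_CPU = 136
BANDWIDTH_PER_CORE_GBPS = 6   # GB/s per core
TDP_WATTS = 300


def aggregate_bandwidth_gbps(cores: int = MAX_CORES_PER_CPU) -> int:
    """Derived per-socket bandwidth, ASSUMING the per-core figure scales linearly."""
    return cores * BANDWIDTH_PER_CORE_GBPS


def cpus_for_threads(threads: int) -> int:
    """CPUs needed if each thread gets its own core (assumed one thread per core)."""
    return math.ceil(threads / MAX_CORES_PER_CPU)


full_socket_bw = aggregate_bandwidth_gbps()        # 816 GB/s, derived, not quoted by Arm
tdp_per_core = TDP_WATTS / MAX_CORES_PER_CPU       # about 2.21 W of TDP per core
cpus_needed = cpus_for_threads(1000)               # 8 CPUs for 1,000 pinned threads

print(f"Derived aggregate bandwidth: {full_socket_bw} GB/s per CPU")
print(f"TDP per core: {tdp_per_core:.2f} W")
print(f"CPUs for 1,000 pinned threads: {cpus_needed}")
```

These are back-of-envelope values for capacity planning, not measured results; real bandwidth per core will depend on memory configuration and workload.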
Rack density and deployment model
Arm is selling more than a chip. It is shipping a deployment template for high-density AI infrastructure.
The companion 1OU Dual Node Reference Server targets OCP DC-MHS and 21-inch ORv3 racks. The reference design includes two Arm AGI CPUs per chassis, 12 DDR5 64 GB 8000 MT/s DIMMs, OCP NIC 3.0 support, and an OCP DC-SCM 2.1 management module with an Aspeed AST2600 BMC.
For air-cooled deployments, Arm says a rack can hold 30 blades, two chips per blade, for 8,160 total cores in a 36 kW rack. For liquid-cooled systems, Arm quotes more than 45,000 cores per rack in a 200 kW Supermicro configuration.
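The air-cooled figure can be reproduced from the numbers in this article. The short Python sketch below does that and derives two values the launch materials do not quote: rack power per core, and the share of the rack budget taken by CPU TDP. Because the liquid-cooled figure is "more than 45,000" cores, its watts-per-core value is only an upper bound. Variable names are illustrative.

```python
# Figures quoted in the article.
CORES_PER_CPU = 136
CPUS_PER_BLADE = 2
BLADES_PER_RACK = 30
CPU_TDP_WATTS = 300
AIR_RACK_KW = 36
LIQUID_RACK_KW = 200
LIQUID_MIN_CORES = 45_000  # quoted as "more than 45,000"

# Air-cooled rack: blades x chips per blade x cores per chip.
air_cpus = BLADES_PER_RACK * CPUS_PER_BLADE          # 60 CPUs
air_cores = air_cpus * CORES_PER_CPU                 # 8160 cores

# Derived values (not quoted by Arm).
air_w_per_core = AIR_RACK_KW * 1000 / air_cores      # about 4.41 W of rack power per core
cpu_tdp_kw = air_cpus * CPU_TDP_WATTS / 1000         # 18.0 kW of CPU TDP in the rack
cpu_share = cpu_tdp_kw / AIR_RACK_KW                 # 0.5 of the 36 kW budget

# Upper bound only: more cores in the same 200 kW means fewer watts per core.
liquid_w_per_core_max = LIQUID_RACK_KW * 1000 / LIQUID_MIN_CORES  # about 4.44 W

print(f"Air-cooled: {air_cores} cores, {air_w_per_core:.2f} W/core")
print(f"CPU TDP: {cpu_tdp_kw} kW ({cpu_share:.0%} of rack power)")
print(f"Liquid-cooled: at most {liquid_w_per_core_max:.2f} W/core")
```

On these numbers, the two configurations land at roughly the same rack power per core, so the liquid-cooled design appears to scale density by raising the rack power envelope rather than by changing per-core power. The remaining air-cooled rack power beyond CPU TDP would cover memory, NICs, fans, and conversion losses, though the article gives no breakdown.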
Those numbers matter if you are designing inference backends with heavy orchestration layers, especially for multi-agent systems and long-running service workflows. Rack-level CPU density becomes part of your inference architecture, not just a procurement detail.
Meta’s role in the launch
Meta is the lead partner and co-developer for the Arm AGI CPU. Arm and Meta are framing the product as infrastructure that works alongside Meta’s MTIA silicon, not as a replacement for accelerators.
Meta also says it plans to release its board and rack designs for this CPU through the Open Compute Project later this year. That gives the launch more weight than a one-off custom part. It points to a broader hardware pattern other hyperscalers and infrastructure teams can adopt.
If you build agent platforms, this is the hardware expression of the same shift visible in software. Agents increase coordination overhead. Tool execution, scheduling, state handling, and service composition all consume CPU resources around the model itself, which is why work on agent memory and stateful agents increasingly has infrastructure implications.
Performance and economics
Arm’s headline claim is more than 2x performance per rack versus the latest x86 systems. The launch materials do not name a specific x86 comparison SKU, so the figure is best read as a directional rack-efficiency argument rather than a reproducible benchmark result.
The more aggressive claim is financial. Arm estimates up to $10 billion in CAPEX savings per gigawatt of AI data center capacity. That figure depends on deployment assumptions, but it shows where Arm wants the conversation to land: AI infrastructure cost is shifting toward total system density, power, and orchestration efficiency.
| Deployment metric | Arm claim |
|---|---|
| Performance per rack vs x86 | More than 2x |
| Air-cooled density | 8,160 cores per rack |
| Liquid-cooled density | 45,000+ cores per rack |
| CAPEX savings | Up to $10B per GW |
Ecosystem and availability
The Arm AGI CPU is available to order now. Commercial systems are available from ASRock Rack, Lenovo, and Supermicro, with broader availability expected in the second half of 2026.
Arm also named Cerebras, Cloudflare, F5, OpenAI, Positron, Rebellions, SAP, and SK Telecom as partners planning to use the CPU for accelerator management, control-plane processing, and AI application hosting. This partner list reinforces the product’s actual role. Arm is targeting the infrastructure around inference clusters, not the tensor compute layer itself.
Strategically, this puts Arm in a different relationship with its ecosystem. The company still sells IP, Neoverse CSS, and platform designs, but it now also sells a processor that can sit directly in the same server decisions as parts built by its customers and licensees. That is the most important shift in the launch, more than the branding around AGI.
If you operate AI infrastructure, treat this as a signal to model CPU demand separately from accelerator demand in your 2026 planning. The systems that win on AI inference will increasingly be the ones that provision orchestration, memory coordination, and control-plane capacity as first-class resources, not leftovers after GPU selection.