Arm Launches First In-House AGI CPU
Arm unveiled its first production silicon, a 136-core data center CPU for agentic AI workloads, with Meta as lead partner.
Arm has launched the Arm AGI CPU, its first production silicon product and its first Arm-designed data center CPU. The March 24 launch matters because Arm is moving from selling IP and subsystems to selling its own server processor for AI infrastructure, aimed directly at the control-plane and orchestration work around large accelerator clusters.
Product specifications
Arm positions the Arm AGI CPU as a CPU for agentic AI data centers, where the bottleneck is no longer only model execution on accelerators. CPU capacity also has to cover accelerator management, scheduling, memory and storage coordination, networking support, and service hosting around inference systems.
The chip is built on Arm Neoverse V3 and scales up to 136 cores per CPU. Arm quotes 6 GB/s of memory bandwidth per core, sub-100 ns latency, and a 300 W TDP.
One detail stands out for operators who care about predictable latency under sustained load. Arm says the processor uses a dedicated core per program thread, with the pitch centered on deterministic behavior rather than peak burst throughput.
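Arm has not published the programming model behind that core-per-thread pitch, but on Linux the closest established pattern is explicit CPU affinity: pin a latency-sensitive thread to one core so the scheduler never migrates it. A minimal, illustrative sketch (not Arm's documented API):

```python
import os

def pin_to_core(core_id: int) -> None:
    """Restrict the calling thread to a single CPU core (Linux only)."""
    os.sched_setaffinity(0, {core_id})

# Pick a core this process is currently allowed to run on and pin to it.
available = sorted(os.sched_getaffinity(0))
pin_to_core(available[0])

# The affinity mask is now a single-core set, so the kernel will not
# migrate this thread, removing one source of latency jitter.
print(sorted(os.sched_getaffinity(0)))
```

Whether Arm enforces this in hardware, firmware, or the OS scheduler is not specified in the launch materials; the point is that determinism comes from eliminating core migration and contention, not from raw clock speed.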
| Spec | Arm AGI CPU |
|---|---|
| Core architecture | Arm Neoverse V3 |
| Max cores | 136 |
| Memory bandwidth | 6 GB/s per core |
| Latency | Sub-100 ns |
| TDP | 300 W |
| Process node | TSMC 3nm |
Rack density and deployment model
Arm is selling more than a chip. It is shipping a deployment template for high-density AI infrastructure.
The companion 1OU Dual Node Reference Server targets OCP DC-MHS and 21-inch ORv3 racks. The reference design pairs two Arm AGI CPUs per chassis with twelve 64 GB DDR5-8000 DIMMs, OCP NIC 3.0 support, and an OCP DC-SCM 2.1 management module built around an ASPEED AST2600 BMC.
For air-cooled deployments, Arm says a rack can hold 30 blades, two chips per blade, for 8,160 total cores in a 36 kW rack. For liquid-cooled systems, Arm quotes more than 45,000 cores per rack in a 200 kW Supermicro configuration.
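The arithmetic behind those density figures checks out, and it yields a derived metric Arm does not quote directly: rack-level power per core, which lands near 4.4 W in both cooling configurations. A back-of-envelope check (per-core power is our derivation, not an Arm figure):

```python
# Numbers from Arm's launch materials; per-core power is derived.
CORES_PER_CPU = 136

# Air-cooled: 30 blades x 2 CPUs per blade in a 36 kW rack.
air_cores = 30 * 2 * CORES_PER_CPU
print(air_cores)                         # 8160, matching Arm's figure
print(round(36_000 / air_cores, 1))      # ~4.4 W per core at rack level

# Liquid-cooled: Arm quotes 45,000+ cores in a 200 kW rack.
print(round(200_000 / 45_000, 1))        # ~4.4 W per core
```

The similar watts-per-core figure suggests the liquid-cooled number is the same blade design packed denser, not a different power envelope per chip.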
Those numbers matter if you are designing inference backends with heavy orchestration layers, especially for multi-agent systems and long-running service workflows. Rack-level CPU density becomes part of your inference architecture, not just a procurement detail.
Meta’s role in the launch
Meta is the lead partner and co-developer for the Arm AGI CPU. Arm and Meta are framing the product as infrastructure that works alongside Meta’s MTIA silicon, not as a replacement for accelerators.
Meta also says it plans to release its board and rack designs for this CPU through the Open Compute Project later this year. That gives the launch more weight than a one-off custom part. It points to a broader hardware pattern other hyperscalers and infrastructure teams can adopt.
If you build agent platforms, this is the hardware expression of the same shift visible in software. Agents increase coordination overhead. Tool execution, scheduling, state handling, and service composition all consume CPU resources around the model itself, which is why work on agent memory and stateful agents increasingly has infrastructure implications.
Performance and economics
Arm’s headline claim is more than 2x performance per rack versus the latest x86 systems. Arm does not publish a named x86 comparison SKU in the launch materials, so treat the claim as a directional rack-efficiency argument rather than a benchmark result.
The more aggressive claim is financial. Arm estimates up to $10 billion in CAPEX savings per gigawatt of AI data center capacity. That figure depends on deployment assumptions, but it shows where Arm wants the conversation to land: AI infrastructure cost is shifting toward total system density, power, and orchestration efficiency.
| Deployment metric | Arm claim |
|---|---|
| Performance per rack vs x86 | More than 2x |
| Air-cooled density | 8,160 cores per rack |
| Liquid-cooled density | 45,000+ cores per rack |
| CAPEX savings | Up to $10B per GW |
Ecosystem and availability
The Arm AGI CPU is orderable now. Initial commercial systems come from ASRockRack, Lenovo, and Supermicro, with broader availability expected in the second half of 2026.
Arm also named Cerebras, Cloudflare, F5, OpenAI, Positron, Rebellions, SAP, and SK Telecom as partners planning to use the CPU for accelerator management, control-plane processing, and AI application hosting. This partner list reinforces the product’s actual role. Arm is targeting the infrastructure around inference clusters, not the tensor compute layer itself.
Strategically, this puts Arm in a different relationship with its ecosystem. The company still sells IP, Neoverse CSS, and platform designs, but it now also sells a processor that can sit directly in the same server decisions as parts built by its customers and licensees. That is the most important shift in the launch, more than the branding around AGI.
If you operate AI infrastructure, treat this as a signal to model CPU demand separately from accelerator demand in your 2026 planning. The systems that win on AI inference will increasingly be the ones that provision orchestration, memory coordination, and control-plane capacity as first-class resources, not leftovers after GPU selection.