$300M SN50 Chip Order Validates SambaNova's ASIC-Native Cloud
General Compute has launched an inference neocloud with a $300 million order of air-cooled SambaNova SN50 chips rated at up to 700 tokens per second.
General Compute announced a $300 million infrastructure order of SambaNova SN50 chips to build an “inference neocloud” optimized for autonomous agents. Detailed in the May 28 launch announcement, the deployment targets the high-speed, repetitive model calls required by agentic systems. For developers deploying agentic workflows, the shift toward Application-Specific Integrated Circuit (ASIC) providers alters the baseline math for cost and inference latency.
Hardware Performance Benchmarks
SambaNova designed the SN50 to generate between 600 and 700 tokens per second for large language model inference, compared with roughly 250 tokens per second typical of traditional GPUs on similar workloads. General Compute demonstrated this speed using the open-source MiniMax 2.7 architecture, recording the fastest independently benchmarked speeds for the model family to date.
| Metric | Traditional GPUs | SambaNova SN50 |
|---|---|---|
| Inference Speed | ~250 tokens/sec | 600 to 700 tokens/sec |
| Cooling Requirement | Liquid cooling often required | Air-cooled |
| Relative TCO (vendor claim) | Baseline | 3x lower |
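To put the throughput row of the table in concrete terms, the short Python sketch below computes the implied speedup and the time to decode a fixed-length response. The 1,000-token response length and the function names are illustrative assumptions, not figures from the announcement, and the calculation ignores time-to-first-token and network overhead.

```python
# Throughput figures taken from the table above (tokens per second).
GPU_TPS = 250.0
SN50_TPS_LOW = 600.0
SN50_TPS_HIGH = 700.0


def speedup(accel_tps: float, baseline_tps: float) -> float:
    """Ratio of accelerator throughput to baseline throughput."""
    return accel_tps / baseline_tps


def generation_seconds(tokens: int, tps: float) -> float:
    """Seconds to decode `tokens` at a steady rate, ignoring prefill and network time."""
    return tokens / tps


if __name__ == "__main__":
    response_tokens = 1000  # assumed response length, for illustration only
    low = speedup(SN50_TPS_LOW, GPU_TPS)
    high = speedup(SN50_TPS_HIGH, GPU_TPS)
    print(f"Implied speedup: {low:.1f}x to {high:.1f}x")
    for label, tps in [("GPU", GPU_TPS), ("SN50 low", SN50_TPS_LOW), ("SN50 high", SN50_TPS_HIGH)]:
        print(f"{label}: {generation_seconds(response_tokens, tps):.2f} s")
```

Under these assumptions, a 1,000-token response takes 4 seconds at 250 tokens/sec and roughly 1.4 to 1.7 seconds at 600 to 700 tokens/sec, a speedup of about 2.4x to 2.8x.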
The SN50 chips avoid the complex liquid cooling systems that high-performance GPUs often require. They are entirely air-cooled and consume less energy per token generated. This thermal profile allows General Compute to deploy hardware in standard data centers and repurposed cryptocurrency mining facilities without expensive infrastructure overhauls. SambaNova markets the resulting deployment architecture as delivering a 3x lower total cost of ownership while running 5x faster than competitor chips. The 5x figure is a vendor claim; the throughput numbers cited above work out to roughly 2.4x to 2.8x over the ~250 tokens/sec GPU baseline.
The Inference Neocloud Architecture
General Compute positions its platform as an “agent-native” environment. Instead of requiring human engineers to provision and manage servers, the neocloud allows AI agents to self-onboard and provision their own infrastructure via API. This setup is specifically tailored for AI inference tasks where specialized chips like those from SambaNova, Groq, and Cerebras handle high-volume text generation more efficiently than general-purpose compute hardware.
Corporate Backing and Strategic Supply
The massive chip order coincides with a $15 million seed funding round for General Compute at a $60 million post-money valuation. FUSE VC led the round, alongside Carya Venture Partners and Village Global Ventures.
For SambaNova, the deal follows a $350 million Series E financing round in February 2026 led by Vista Equity Partners and Cambium Capital. Intel Capital contributed up to $150 million to that round, securing an 8.2% stake in SambaNova. Intel CEO Lip-Bu Tan serves as Executive Chairman of SambaNova, giving the chipmaker direct access to Intel’s Xeon-based infrastructure and global supply chain to scale its SambaCloud product. The Federal Trade Commission granted early termination for the review of the Intel and SambaNova partnership on May 1, 2026.
If you build infrastructure for multi-step AI agents, evaluate your provider’s underlying silicon. Workloads that depend on hundreds of sequential model calls per minute often hit severe latency bottlenecks on traditional GPU clusters, making ASIC-native environments a practical requirement for production deployments.
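The latency argument can be made concrete with a back-of-the-envelope sketch. The Python below estimates wall-clock time for a chain of dependent model calls, where each call must finish before the next one starts. The step count, tokens per step, and per-call overhead are hypothetical values chosen for illustration; only the 250 and 600 to 700 tokens/sec rates come from the figures quoted earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    steps: int             # sequential model calls (hypothetical)
    tokens_per_step: int   # output tokens generated per call (hypothetical)
    overhead_s: float      # fixed per-call cost: network, prefill, tools (hypothetical)

    def wall_clock_seconds(self, tokens_per_second: float) -> float:
        """Total time when every call waits for the previous one to finish."""
        per_call = self.overhead_s + self.tokens_per_step / tokens_per_second
        return self.steps * per_call


if __name__ == "__main__":
    run = AgentRun(steps=100, tokens_per_step=300, overhead_s=0.2)
    rates = [("GPU (~250 tok/s)", 250.0), ("SN50 (600 tok/s)", 600.0), ("SN50 (700 tok/s)", 700.0)]
    for label, tps in rates:
        print(f"{label}: {run.wall_clock_seconds(tps):.0f} s")
```

With these assumed values, the chain takes about 140 seconds at 250 tokens/sec versus about 63 to 70 seconds at 600 to 700 tokens/sec. Because the fixed per-call overhead does not shrink with faster decoding, the end-to-end gain (roughly 2x here) is smaller than the raw throughput ratio, which is worth measuring on your own workload before switching providers.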