ScaleOps Raises $130M to Automate AI Infrastructure
ScaleOps secures $130 million in Series C funding to scale its autonomous Kubernetes platform and optimize GPU resources for the AI era.
ScaleOps secured a $130 million Series C funding round to expand its autonomous Kubernetes and AI infrastructure platform. The investment brings the company’s valuation past $800 million. For teams running generative AI models or multi-agent systems, static resource allocation often fails to keep up with unpredictable traffic spikes. ScaleOps targets this bottleneck by replacing manual configuration with real-time tuning of CPU, memory, and GPU resources.
Autonomous Cluster Optimization
Traditional cluster management relies on static thresholds and manual rightsizing. ScaleOps deploys context-aware automation that adjusts compute resources on the fly. This prevents GPU underutilization during idle periods and avoids throttling when traffic spikes hit deployed endpoints.
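To make the contrast with static thresholds concrete, here is a minimal sketch of usage-driven rightsizing. It is not ScaleOps' algorithm, which the article does not describe. The function name `recommend_request`, the nearest-rank percentile, and the 15% headroom are all assumptions chosen for illustration.

```python
def recommend_request(samples_millicores, percentile=95, headroom_pct=15):
    """Recommend a CPU request (in millicores) from observed usage samples.

    Hypothetical helper: takes a high percentile of recent usage and adds
    headroom, instead of relying on a hand-picked static value.
    """
    if not samples_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_millicores)
    # Nearest-rank percentile using integer ceiling division.
    rank = max(1, -(-percentile * len(ordered) // 100))
    base = ordered[rank - 1]
    # Add headroom, rounding up to a whole millicore.
    return -(-base * (100 + headroom_pct) // 100)


usage = [120, 135, 150, 140, 400, 130, 125, 145, 138, 142]
print(recommend_request(usage))                 # covers the spike at 400m
print(recommend_request(usage, percentile=90))  # ignores the single spike
```

Re-running a calculation like this on a rolling window is what lets a request follow the workload rather than a value set once at deploy time.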
The software extends beyond basic pod rightsizing. It automates disruption budgets for clusters using Karpenter. By dynamically calculating minimum availability requirements, the platform allows node consolidation without violating uptime constraints. The system also optimizes replica counts by analyzing historical data alongside immediate traffic forecasts. If you build infrastructure for AI agents, this predictive scaling handles the bursty, asynchronous nature of complex inference workloads.
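The two ideas in this paragraph, sizing replicas from history plus a forecast and deriving a minimum-availability figure, can be sketched as follows. This is an illustration under stated assumptions, not the vendor's implementation: it assumes the budgets in question are standard Kubernetes `PodDisruptionBudget` objects (which Karpenter respects when consolidating nodes), and the function names, the per-replica throughput figure, and the `app` label are hypothetical.

```python
import math


def forecast_replicas(history_rps, forecast_rps, rps_per_replica, min_replicas=1):
    """Pick a replica count that covers both the recent peak and the forecast."""
    expected_peak = max(max(history_rps, default=0), forecast_rps)
    return max(min_replicas, math.ceil(expected_peak / rps_per_replica))


def min_available(replicas, required_rps, rps_per_replica):
    """Smallest number of pods that must stay up to serve required_rps."""
    needed = math.ceil(required_rps / rps_per_replica)
    return min(replicas, max(1, needed))


def build_pdb(name, app_label, min_avail):
    """Return a PodDisruptionBudget manifest (policy/v1) as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            "minAvailable": min_avail,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


history = [300, 450, 520, 480]  # requests per second, recent samples
replicas = forecast_replicas(history, forecast_rps=900, rps_per_replica=100)
floor = min_available(replicas, required_rps=480, rps_per_replica=100)
pdb = build_pdb("inference-pdb", "inference", floor)
print(replicas, floor, pdb["spec"]["minAvailable"])
```

In this example the gap between the replica count and `minAvailable` is the number of pods that can be evicted at once, which is the slack node consolidation has to work with.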
Platform Compatibility and Benchmarks
The ScaleOps platform operates as a self-hosted solution. It supports cloud deployments, on-premises data centers, and air-gapped environments. You can provision the system directly through the AWS, Azure, and Google Cloud Marketplaces.
For regulated industries, the platform is FIPS-compatible and suitable for FedRAMP environments. ScaleOps claims its automation can reduce costs by up to 80%, which would significantly change the unit economics of hosting AI applications.
| Metric | ScaleOps Detail |
|---|---|
| Claimed Cost Reduction | Up to 80% |
| Supported Environments | Cloud, On-Premises, Air-Gapped |
| Cloud Marketplaces | AWS, Azure, Google Cloud |
| Security / Compliance | FIPS-compatible, FedRAMP environments |
Funding and Expansion Roadmap
Insight Partners led the $130 million round. Lightspeed Venture Partners, NFX, Glilot Capital Partners, and Picture Capital also participated. The deal includes tens of millions in secondary transactions for employee equity and brings total funding to date past $210 million.
The customer base already includes enterprises such as Adobe, Salesforce, DocuSign, Armis, Coupa, and Wiz. CEO Yodar Shafrir, formerly an engineer at Run:ai (since acquired by Nvidia), founded the company in 2022. The new capital will fund platform expansion beyond core compute and GPUs into broader layers of cloud orchestration. ScaleOps also plans to triple its 120-person workforce across North America, Europe, and Israel by the end of 2026.
Review your current Kubernetes deployment manifests and autoscaling rules. If your AI workloads regularly hit GPU limits or leave expensive instances idle during off-peak hours, consider moving from static thresholds to predictive resource allocation.
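One rough way to start that review is to classify each workload from its GPU utilization history. The sketch below assumes you already export utilization percentages from your own monitoring. The function name and the thresholds (below 10% counts as idle, above 90% as saturated, flagged when at least a quarter of samples qualify) are arbitrary starting points, not recommendations from ScaleOps.

```python
def classify_gpu_usage(utilization_pct, idle_below=10, saturated_above=90, share=0.25):
    """Flag a workload as idle-heavy and/or saturated from GPU utilization samples."""
    if not utilization_pct:
        raise ValueError("need at least one sample")
    n = len(utilization_pct)
    idle_share = sum(1 for u in utilization_pct if u < idle_below) / n
    hot_share = sum(1 for u in utilization_pct if u > saturated_above) / n
    findings = []
    if idle_share >= share:
        findings.append("idle")
    if hot_share >= share:
        findings.append("saturated")
    return findings


bursty = [2, 3, 95, 97, 40, 5, 1, 98]
steady = [50, 60, 55, 45]
print(classify_gpu_usage(bursty))  # both idle and saturated periods
print(classify_gpu_usage(steady))  # no findings
```

A workload that shows both flags is the bursty pattern the article describes: a static allocation is too large most of the time and too small at peak.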