Arcee's Trinity-Large-Thinking model defies big tech in US
The Trinity-Large-Thinking model offers a low-cost, open-source alternative for OpenClaw users following Anthropic's recent subscription policy changes.
U.S. startup Arcee AI has released Trinity-Large-Thinking, a 400-billion-parameter open-source model designed specifically for complex agent workflows. The release coincides with Anthropic blocking approximately 135,000 users of the OpenClaw framework from its flat-rate Claude subscription tier. For developers who rely on autonomous tool calling, the change immediately alters the economics of heavily automated systems.
Architecture and Training
Arcee built the model on a sparse Mixture-of-Experts (MoE) architecture to balance frontier-scale capacity with inference efficiency. The system contains 399 billion total parameters but activates only 13 billion per token, routing each token to 4 of its 256 experts. The 262,144-token context window can be expanded to 512,000 tokens in specific configurations.
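The 4-of-256 routing described above can be illustrated with a minimal top-k gating sketch. This is generic MoE routing, not Arcee's actual implementation; the shapes and router are toy values chosen for the example.

```python
import numpy as np

def route_tokens(hidden, router_weights, k=4):
    """Pick the top-k experts per token and softmax-normalize their gates.

    hidden: (tokens, d_model); router_weights: (d_model, n_experts).
    Returns (indices, gates), each of shape (tokens, k).
    """
    logits = hidden @ router_weights                       # (tokens, n_experts)
    top_idx = np.argsort(logits, axis=-1)[:, -k:]          # top-k expert ids
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    gates = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)             # gates sum to 1
    return top_idx, gates

rng = np.random.default_rng(0)
hidden = rng.standard_normal((8, 64))       # 8 tokens, toy hidden width
router = rng.standard_normal((64, 256))     # 256 experts, as in Trinity
idx, gates = route_tokens(hidden, router, k=4)
print(idx.shape, gates.shape)               # only 4 of 256 experts fire per token
```

Because only the selected experts' weights are touched per token, active-parameter count (13B here) rather than total parameter count (399B) drives per-token compute.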
Training required 33 days on 2,048 NVIDIA B300 Blackwell GPUs with a $20 million budget. The team used the Muon optimizer alongside a novel load-balancing strategy called SMEBU (Soft-clamped Momentum Expert Bias Updates), a combination that prevents expert collapse during training. The final weights are distributed under the Apache 2.0 license. If you deploy MoE architectures locally, this permissive license allows for fully air-gapped commercial hosting.
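Arcee has not published SMEBU's exact algorithm, but the name suggests a bias-based load balancer in the spirit of auxiliary-loss-free routing: per-expert routing biases are nudged toward balanced load using a momentum term and a soft clamp on the step size. The sketch below is a plausible reading of that idea, with every detail (learning rate, tanh clamp, update rule) assumed for illustration.

```python
import numpy as np

def smebu_style_update(bias, momentum, expert_load, target_load,
                       lr=0.01, beta=0.9, clamp=1.0):
    """One hypothetical SMEBU-style step.

    expert_load: fraction of tokens each expert received this batch.
    Overloaded experts get a negative bias push (fewer tokens next batch),
    underloaded experts a positive one. tanh soft-clamps the step so no
    single update can swing a bias by more than lr * clamp.
    """
    error = target_load - expert_load               # positive if underloaded
    momentum = beta * momentum + (1 - beta) * error
    bias = bias + lr * clamp * np.tanh(momentum / clamp)
    return bias, momentum

n_experts = 256
bias = np.zeros(n_experts)
mom = np.zeros(n_experts)
load = np.full(n_experts, 1 / n_experts)
load[0] = 0.05                                      # expert 0 is running hot
load /= load.sum()
bias, mom = smebu_style_update(bias, mom, load, 1 / n_experts)
print(bias[0])                                      # negative: hot expert demoted
```

The appeal of bias-based balancing over an auxiliary loss is that it steers routing without adding a competing gradient signal to the language-modeling objective, which is one way expert collapse is avoided.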
Agent Benchmarks and Performance
Trinity-Large-Thinking targets multi-turn tool calling and long-horizon planning rather than general chat. Its benchmark profile reflects this optimization. It closely trails top-tier closed models across major evaluations.
| Benchmark | Trinity-Large-Thinking | Claude Opus 4.6 |
|---|---|---|
| PinchBench | 91.9 | 93.3 |
| IFBench | 52.3 | 53.1 |
| SWE-bench Verified | 63.2 | 75.6 |
The model also scored 96.3 on AIME25, matching the top-tier Chinese model Kimi-K2.5. The gap on SWE-bench Verified indicates that while the model handles general tool orchestration well, it lags Anthropic's flagship on complex software engineering tasks. If you are evaluating autonomous systems, account for this gap in raw coding capability.
The Anthropic Policy Shift and Migration
The rapid adoption curve for the new model is tightly coupled to policy changes at Anthropic. On April 3, 2026, Anthropic restricted OpenClaw traffic from its $200 per month flat-rate subscription tier. Users running continuous agent loops were forced onto standard API billing. Typical heavy users reported projected costs exceeding $600 per month under the new structure.
Following OpenClaw version 2026.4.7, which added native support for Arcee, Trinity-Large-Thinking became the most used open model in the U.S. on OpenRouter. The migration is driven almost entirely by economics. Arcee serves the model at $0.90 per million output tokens, a 96% reduction compared to Claude Opus 4.6 at $25 per million output tokens. If your application relies on complex multi-agent orchestration, this pricing fundamentally changes your allowable token budget per task.
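The arithmetic behind that claim is straightforward; this snippet works only from the prices and the $600 projected bill quoted in this article, and looks at output tokens alone (input-token pricing is not cited here).

```python
claude_out = 25.00   # USD per million output tokens (Claude Opus 4.6, per article)
trinity_out = 0.90   # USD per million output tokens (Arcee-served Trinity)

reduction = 1 - trinity_out / claude_out
print(f"price reduction: {reduction:.1%}")          # 96.4%, matching the ~96% quoted

# What a fixed monthly spend buys at each rate
budget = 600.0       # the projected heavy-user API bill cited above
print(budget / claude_out, "M output tokens/month on Claude")    # 24.0
print(budget / trinity_out, "M output tokens/month on Trinity")  # ~666.7
```

Roughly a 28x larger output-token budget for the same spend is what makes always-on agent loops, which burn tokens continuously, viable again.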
Operating heavy autonomous loops on closed APIs exposes your infrastructure to sudden policy shifts and unpredictable billing spikes. Transitioning your agent tooling to Trinity-Large-Thinking secures a stable cost baseline without relying on foreign-built alternatives. Test your specific workflow against the model’s SWE-bench limitations before migrating production systems that rely heavily on zero-shot code generation.
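For such a test, OpenRouter exposes an OpenAI-compatible chat-completions endpoint, so an agent workload can be pointed at the model by swapping the base URL and model slug. The sketch below only builds the request payload; the model slug and the `read_file` tool are hypothetical placeholders, so check OpenRouter's catalog and your own tool schema before using it.

```python
import json

# Hypothetical slug; confirm the real identifier in OpenRouter's model catalog.
MODEL = "arcee-ai/trinity-large-thinking"
BASE_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt, tools=None):
    """Build an OpenAI-compatible chat payload for an agent turn."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    if tools:
        payload["tools"] = tools   # standard OpenAI tool-calling schema
    return payload

# Example tool definition for a file-reading agent step (illustrative only).
tools = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file from the agent's workspace",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]
print(json.dumps(build_request("List open TODOs in src/", tools), indent=2))
```

Because the payload shape is unchanged from the Claude-via-API setup, an A/B run against your SWE-bench-like tasks is mostly a matter of routing the same tool-calling traffic to both endpoints and comparing pass rates and cost.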