NVIDIA Launches Nemotron Coalition at GTC 2026
NVIDIA launched the Nemotron Coalition and expanded its open AI model lineup at GTC 2026, with the first coalition-built model slated to anchor the Nemotron 4 family.
NVIDIA launched the Nemotron Coalition at GTC 2026, bringing eight AI labs together in a shared effort to build open frontier foundation models on DGX Cloud. For developers, the key detail is concrete: the first coalition model is a base model co-developed by NVIDIA and Mistral AI, and it will underpin the upcoming Nemotron 4 family.
Coalition Scope
The Nemotron Coalition starts with Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab. NVIDIA’s structure is straightforward. Members contribute data, evaluations, research, domain expertise, and compute collaboration, with training run on NVIDIA DGX Cloud.
This matters because NVIDIA is moving beyond releasing open weights by itself. It is building an organized post-training and evaluation pipeline around those models, with contributors that each represent a real deployment surface: coding, agents, multilingual AI, multimodal systems, and search.
If you build agents, the member list is the signal. LangChain is contributing tool use, long-horizon reasoning, agent harnesses, and observability. Cursor is contributing real-world performance requirements and eval datasets. Perplexity brings a production search and orchestration environment. Those are the ingredients you need to make open models useful in practice, especially for evaluating agents and improving LLM observability.
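Contributions like agent harnesses and eval datasets feed into evaluation loops that are simple to sketch. The harness below is a minimal, hypothetical example: the agent stub, tool names, and task data are all illustrative stand-ins, not anything shipped by coalition members.

```python
# Minimal agent-eval harness sketch: check whether an agent's tool call
# matches the expected call for each task. All task data is illustrative.
from dataclasses import dataclass


@dataclass
class EvalCase:
    prompt: str
    expected_tool: str
    expected_args: dict


def fake_agent(prompt: str) -> tuple[str, dict]:
    """Stand-in for a real agent; returns (tool_name, args)."""
    if "weather" in prompt:
        return "get_weather", {"city": "Paris"}
    return "search", {"query": prompt}


def run_eval(cases: list[EvalCase]) -> float:
    """Fraction of cases where the agent picked the right tool and args."""
    passed = 0
    for case in cases:
        tool, args = fake_agent(case.prompt)
        if tool == case.expected_tool and args == case.expected_args:
            passed += 1
    return passed / len(cases)


cases = [
    EvalCase("What is the weather in Paris?", "get_weather", {"city": "Paris"}),
    EvalCase("latest Nemotron release notes", "search",
             {"query": "latest Nemotron release notes"}),
]
print(run_eval(cases))  # 1.0
```

Real harnesses add trace logging around each step, which is where the observability contributions come in: the score alone tells you a task failed, while the trace tells you which tool call went wrong.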
Nemotron 4 Path
NVIDIA tied the coalition directly to its next model generation. The first coalition-built model will support the Nemotron 4 family, with Mistral AI and NVIDIA taking the lead on the base model.
That is the strategic shift. Nemotron 3 is the current open-model line. Nemotron 4 is being positioned as a coalition-built successor, with shared post-training and downstream specialization as part of the design from the start.
For teams that prefer open models over API-only closed systems, this is a stronger promise than a one-off model drop. You are looking at an ecosystem play aimed at sustained iteration, not a single release.
Nemotron 3 Super Sets the Technical Baseline
Five days before GTC, NVIDIA released Nemotron 3 Super, which gives the clearest current baseline for where the coalition is starting.
Nemotron 3 Super has 120 billion total parameters, 12 billion active parameters, a 1 million-token context window, and was trained on 25 trillion tokens. NVIDIA says it delivers up to 5x higher throughput than the previous Nemotron Super model. The open release includes base, post-trained, and quantized checkpoints in NVFP4, FP8, and BF16.
The architecture also matters. NVIDIA describes it as a Mixture-of-Experts hybrid Mamba-Transformer model and says it is the first Nemotron model to use LatentMoE.
Nemotron 3 Super comparison
| Model | Total Params | Active Params | Context Window | Reported Throughput Result |
|---|---|---|---|---|
| Nemotron 3 Super | 120B | 12B | 1M tokens | Up to 2.2x vs GPT-OSS-120B, 7.5x vs Qwen3.5-122B at 8k input / 64k output on B200 |
| GPT-OSS-120B | 120B-class | Not disclosed here | Not disclosed here | Slower than Nemotron 3 Super in NVIDIA’s tested setting |
| Qwen3.5-122B | 122B | Not disclosed here | Not disclosed here | Slower than Nemotron 3 Super in NVIDIA’s tested setting |
For long-context agent systems, a 1M token window changes your context management options, but it does not remove them. You still need disciplined context engineering, retrieval boundaries, and memory policies. Large windows help most when your application already knows what context deserves to stay in the prompt.
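A context-management policy of the kind described above can be sketched as a simple budgeting function. This is an illustrative toy: real systems count tokens with a tokenizer rather than whitespace splits, and the priority order here (retrieved chunks first, then most recent history) is one possible policy, not anything specified by Nemotron.

```python
# Context-budget sketch: even with a 1M-token window, a policy decides
# what stays in the prompt. Token counts are rough whitespace word
# counts, purely for illustration.

def budget_context(system: str, history: list[str], retrieved: list[str],
                   max_tokens: int) -> list[str]:
    def toks(s: str) -> int:
        return len(s.split())  # crude stand-in for a real tokenizer

    kept = [system]           # the system prompt always stays
    used = toks(system)
    # One possible priority: retrieved chunks first, then newest history.
    for chunk in retrieved + list(reversed(history)):
        if used + toks(chunk) > max_tokens:
            continue          # over budget: drop this chunk
        kept.append(chunk)
        used += toks(chunk)
    return kept


kept = budget_context(
    system="You are a helpful agent.",
    history=["first turn", "second turn"],
    retrieved=["doc one text", "doc two text"],
    max_tokens=12,
)
# With this tiny budget, both retrieved chunks fit but the history is dropped.
```

The point of the sketch is the shape, not the numbers: a large window raises `max_tokens`, but the selection policy still exists and still determines what the model actually sees.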
Broader Open Model Push at GTC
NVIDIA paired the coalition launch with a wider expansion of its open model families in its March 16 GTC announcements. The new lineup includes Nemotron 3 Ultra, Nemotron 3 Omni, Nemotron 3 VoiceChat, Isaac GR00T N1.7, and Alpamayo 1.5, with Cosmos 3 slated as an upcoming release.
The Nemotron additions are the most relevant for application developers:
| Model | Focus | NVIDIA detail |
|---|---|---|
| Nemotron 3 Ultra | Frontier-level reasoning | 5x throughput efficiency using NVFP4 on Blackwell |
| Nemotron 3 Omni | Multimodal understanding | Audio, vision, and language integration |
| Nemotron 3 VoiceChat | Speech-native interaction | ASR, LLM processing, and TTS in one system |
NVIDIA also added Nemotron safety models and an agentic retrieval pipeline for multimodal trust and relevance. If you are building retrieval-heavy systems, that work connects directly to production concerns around ranking, grounding, and RAG architecture.
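The ranking-and-relevance concern in a retrieval pipeline boils down to scoring documents against a query. The toy scorer below uses term overlap as a stand-in for the learned rerankers a production agentic retrieval pipeline would actually use; all document text is made up.

```python
# Toy relevance ranking for a retrieval pipeline: score documents by
# query-term overlap. A stand-in for learned rerankers, for illustration.

def rank(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q & set(d.lower().split())),  # shared terms
        reverse=True,
    )
    return scored[:top_k]


docs = [
    "nemotron has a long context window",
    "unrelated doc about cooking",
    "context window sizes compared",
]
top = rank("nemotron context window", docs, top_k=2)
# The document sharing the most query terms ranks first.
```

Grounding then means constraining the model to answer from the top-ranked documents, which is the part the safety and trust models are aimed at.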
Licensing and Deployment
Nemotron models are available under the NVIDIA Open Model License, with commercial use, modification, and distribution permitted. NVIDIA says developers can access Nemotron 3 Super through build.nvidia.com, Hugging Face, OpenRouter, and Perplexity, with deployment paths across cloud providers and as an NVIDIA NIM microservice.
This combination matters more than the coalition branding. Open weights are useful. Open weights plus cloud distribution, NIM packaging, quantized checkpoints, and partner-supported inference are what make adoption plausible in real systems. If you are comparing open deployment options against hosted APIs or deciding whether to run models locally, operational packaging often decides the outcome more than benchmark deltas.
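Because several of these distribution paths expose OpenAI-compatible chat endpoints, moving an app onto a Nemotron model is largely a matter of pointing at a different base URL and model id. The sketch below only builds the request rather than sending it; the base URL and model id shown are assumptions to verify against the provider's docs, not confirmed identifiers.

```python
# Sketch of an OpenAI-compatible chat request to a hosted Nemotron
# endpoint. The base URL and model id below are ASSUMPTIONS for
# illustration; check the provider's documentation for real values.
import json


def build_chat_request(base_url: str, model: str, prompt: str) -> tuple[str, bytes]:
    """Return the endpoint URL and JSON body for a chat completion call."""
    url = f"{base_url}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }).encode()
    return url, body


url, body = build_chat_request(
    "https://integrate.api.nvidia.com/v1",  # assumed base URL
    "nvidia/nemotron-3-super",              # hypothetical model id
    "Summarize the Nemotron Coalition in one sentence.",
)
```

Keeping the base URL and model id as configuration is what makes it cheap to A/B an open deployment against a hosted API before committing either way.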
The practical move is to treat the coalition as NVIDIA’s roadmap for open frontier models, not as a standalone announcement. If your stack depends on open weights, long-context agent workflows, or customizable post-training, Nemotron 3 Super is the model to evaluate now, and Nemotron 4 is the line to watch for your next upgrade cycle.