NVIDIA Launches Nemotron Coalition at GTC 2026
NVIDIA launched the Nemotron Coalition and expanded its open AI model lineup at GTC 2026, with the first coalition-built model slated to anchor the upcoming Nemotron 4 family.
NVIDIA launched the Nemotron Coalition at GTC 2026, bringing eight AI labs together in a shared effort to build open frontier foundation models on DGX Cloud. For developers, the key detail is concrete: the first coalition model is a base model co-developed by NVIDIA and Mistral AI, and it will underpin the upcoming Nemotron 4 family.
Coalition Scope
The Nemotron Coalition starts with Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam, and Thinking Machines Lab. NVIDIA’s structure is straightforward. Members contribute data, evaluations, research, domain expertise, and compute collaboration, with training run on NVIDIA DGX Cloud.
This matters because NVIDIA is moving beyond releasing open weights by itself. It is building an organized post-training and evaluation pipeline around those models, with contributors that each represent a real deployment surface: coding, agents, multilingual AI, multimodal systems, and search.
If you build agents, the member list is the signal. LangChain is contributing tool use, long-horizon reasoning, agent harnesses, and observability. Cursor is contributing real-world performance requirements and eval datasets. Perplexity brings a production search and orchestration environment. Those are the ingredients you need to make open models useful in practice, especially for evaluating agents and improving LLM observability.
Nemotron 4 Path
NVIDIA tied the coalition directly to its next model generation. The first coalition-built model will support the Nemotron 4 family, with Mistral AI and NVIDIA taking the lead on the base model.
That is the strategic shift. Nemotron 3 is the current open-model line. Nemotron 4 is being positioned as a coalition-built successor, with shared post-training and downstream specialization as part of the design from the start.
For teams that prefer open models over API-only closed systems, this is a stronger promise than a one-off model drop. You are looking at an ecosystem play aimed at sustained iteration, not a single release.
Nemotron 3 Super Sets the Technical Baseline
Five days before GTC, NVIDIA released Nemotron 3 Super, which gives the clearest current baseline for where the coalition is starting.
Nemotron 3 Super has 120 billion total parameters, 12 billion active parameters, a 1 million-token context window, and was trained on 25 trillion tokens. NVIDIA says it delivers up to 5x higher throughput than the previous Nemotron Super model. The open release includes base, post-trained, and quantized checkpoints in NVFP4, FP8, and BF16.
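The quantized checkpoints map directly to serving memory. A back-of-envelope sketch, counting weights only (no KV cache, activations, or quantization metadata) and using the standard byte widths for each format, with NVFP4 taken as a 4-bit format:

```python
# Rough weight-memory estimate for a 120B-parameter checkpoint at the
# precisions NVIDIA ships (BF16, FP8, NVFP4). Byte widths are the standard
# sizes for each format; NVFP4 is 4-bit, so ~0.5 bytes per parameter.
TOTAL_PARAMS = 120e9

BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "NVFP4": 0.5}

def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9

for fmt, width in BYTES_PER_PARAM.items():
    print(f"{fmt}: ~{weight_memory_gb(TOTAL_PARAMS, width):.0f} GB")
# BF16: ~240 GB, FP8: ~120 GB, NVFP4: ~60 GB
```

Only about 12B of those parameters are active per token, so compute cost is far lower than the weight footprint suggests, but all experts still need to be resident for serving.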
The architecture also matters. NVIDIA describes it as a Mixture-of-Experts hybrid Mamba-Transformer model and says it is the first Nemotron model to use LatentMoE.
Nemotron 3 Super comparison
| Model | Total Params | Active Params | Context Window | Reported Throughput Result |
|---|---|---|---|---|
| Nemotron 3 Super | 120B | 12B | 1M tokens | Up to 2.2x vs GPT-OSS-120B, 7.5x vs Qwen3.5-122B at 8k input / 64k output on B200 |
| GPT-OSS-120B | 120B-class | Not disclosed here | Not disclosed here | Slower than Nemotron 3 Super in NVIDIA’s tested setting |
| Qwen3.5-122B | 122B | Not disclosed here | Not disclosed here | Slower than Nemotron 3 Super in NVIDIA’s tested setting |
For long-context agent systems, a 1M token window changes your context management options, but it does not remove them. You still need disciplined context engineering, retrieval boundaries, and memory policies. Large windows help most when your application already knows what context deserves to stay in the prompt.
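One way to see what "disciplined context engineering" means in practice is a token-budget policy: pin the material that must stay, then fill the remainder with retrieved snippets in score order. A minimal sketch; the function names and the 4-characters-per-token heuristic are illustrative assumptions, not any Nemotron API:

```python
# Minimal context-budget policy for a long-context agent: always keep the
# system prompt and recent turns, then admit retrieved snippets (best score
# first) until the token budget runs out.

def est_tokens(text: str) -> int:
    """Crude token estimate (~4 characters per token)."""
    return max(1, len(text) // 4)

def build_context(system: str, recent_turns: list[str],
                  retrieved: list[tuple[float, str]], budget: int) -> list[str]:
    """Return the context pieces that fit the budget, highest value first."""
    pieces = [system] + recent_turns                      # always-keep material
    used = sum(est_tokens(p) for p in pieces)
    for _, snippet in sorted(retrieved, reverse=True):    # best score first
        cost = est_tokens(snippet)
        if used + cost > budget:
            break
        pieces.append(snippet)
        used += cost
    return pieces
```

A 1M-token window mostly changes where you set `budget`; the admission policy itself is still your job.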
Broader Open Model Push at GTC
NVIDIA paired the coalition launch with a wider expansion of its open model families in its March 16 GTC announcements. The new lineup includes Nemotron 3 Ultra, Nemotron 3 Omni, Nemotron 3 VoiceChat, Isaac GR00T N1.7, Alpamayo 1.5, and Cosmos 3 as an upcoming release.
The Nemotron additions are the most relevant for application developers:
| Model | Focus | NVIDIA detail |
|---|---|---|
| Nemotron 3 Ultra | Frontier-level reasoning | 5x throughput efficiency using NVFP4 on Blackwell |
| Nemotron 3 Omni | Multimodal understanding | Audio, vision, and language integration |
| Nemotron 3 VoiceChat | Speech-native interaction | ASR, LLM processing, and TTS in one system |
NVIDIA also added Nemotron safety models and an agentic retrieval pipeline for multimodal trust and relevance. If you are building retrieval-heavy systems, that work connects directly to production concerns around ranking, grounding, and RAG architecture.
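Those ranking and grounding concerns show up even in the simplest retrieval pipeline. A toy sketch, with keyword-overlap scoring standing in for a real embedding retriever and every name illustrative:

```python
# Toy retrieve-then-ground pipeline: score documents against the query,
# keep the top-k, and build a prompt that instructs the model to answer
# only from the retrieved sources.

def score(query: str, doc: str) -> int:
    """Keyword-overlap score: a stand-in for a real embedding retriever."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by score and keep the top k."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def grounded_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Build a prompt whose answer space is limited to retrieved text."""
    context = "\n".join(f"[{i}] {d}" for i, d in enumerate(retrieve(query, docs, k)))
    return f"Answer using only the sources below.\n{context}\n\nQ: {query}"
```

A production system replaces the scorer with embeddings and adds reranking and safety filtering, but the grounding contract stays the same shape.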
Licensing and Deployment
Nemotron models are available under the NVIDIA Open Model License, with commercial use, modification, and distribution permitted. NVIDIA says developers can access Nemotron 3 Super through build.nvidia.com, Hugging Face, OpenRouter, and Perplexity, with deployment paths across cloud providers and as an NVIDIA NIM microservice.
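Hosted endpoints for models distributed this way typically expose an OpenAI-compatible chat API. A hedged sketch that only constructs the request body; the endpoint URL and model id are assumptions, so check the provider's documentation before using them:

```python
# Build an OpenAI-style chat-completions request for a hosted Nemotron
# endpoint. URL and model id are illustrative assumptions; the actual POST
# is left commented so the sketch stays self-contained.
import json

API_URL = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumed
MODEL_ID = "nvidia/nemotron-3-super"                              # assumed

def chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Return an OpenAI-compatible chat-completions payload."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = chat_request("Summarize the Nemotron Coalition in one sentence.")
print(json.dumps(payload, indent=2))
# To send it (requires an API key from the provider):
# import urllib.request
# req = urllib.request.Request(API_URL, data=json.dumps(payload).encode(),
#     headers={"Authorization": "Bearer <KEY>",
#              "Content-Type": "application/json"})
```

The same payload shape works against any OpenAI-compatible host, which is part of why multi-provider distribution matters for adoption.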
This combination matters more than the coalition branding. Open weights are useful. Open weights plus cloud distribution, NIM packaging, quantized checkpoints, and partner-supported inference are what make adoption plausible in real systems. If you are comparing open deployment options against hosted APIs or deciding whether to run models locally, operational packaging often decides the outcome more than benchmark deltas.
The practical move is to treat the coalition as NVIDIA’s roadmap for open frontier models, not as a standalone announcement. If your stack depends on open weights, long-context agent workflows, or customizable post-training, Nemotron 3 Super is the model to evaluate now, and Nemotron 4 is the line to watch for your next upgrade cycle.