NVIDIA Launches Nemotron 3 Super Reasoning Model
- NVIDIA releases Nemotron 3 Super, a 120B-parameter hybrid Mamba-Transformer reasoning model.
- The new architecture delivers 11% higher throughput per GPU than existing open-weights competitors.
- The model scores high on openness, with fully disclosed training data and methodology.
NVIDIA has unveiled Nemotron 3 Super, a 120B-parameter model that prioritizes both high-level reasoning and inference efficiency. Its hybrid Mamba-Transformer architecture pairs the expressive attention of standard Transformers with the speed of Mamba's more efficient state-space layers. This "Super" variant sits in the middle of the Nemotron 3 lineup, bridging the gap between smaller edge models and massive data-center-scale systems.
The technical standout of this release is the integration of Mixture of Experts (MoE), a design where only a fraction of the model’s total "brain" (12.7B out of 120.6B parameters) is active at any given moment. This allows the model to maintain deep knowledge without the massive computational cost typically associated with large-scale systems. In performance testing, it demonstrated significantly higher throughput—the speed at which it processes information—than comparable open-weights models, making it a highly attractive option for developers focused on cost-effective deployment.
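The sparse-activation idea behind MoE can be sketched in a few lines of Python. This is a toy illustration, not NVIDIA's implementation: the expert count, `TOP_K`, hidden size, and router are all made up for demonstration; the point is that only a top-k subset of expert weight matrices is ever touched per token.

```python
import numpy as np

# Toy Mixture-of-Experts routing sketch (illustrative only; sizes and
# router are hypothetical, not Nemotron's actual configuration).
rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total experts (the model's full "brain")
TOP_K = 2         # experts actually activated per token
D_MODEL = 16      # toy hidden dimension

# Each expert is a simple linear layer here.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def moe_forward(x):
    """Route one token vector through its top-k experts only."""
    logits = x @ router                        # router scores each expert
    top = np.argsort(logits)[-TOP_K:]          # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                   # softmax over the chosen experts
    # Only TOP_K of NUM_EXPERTS weight matrices are multiplied at all;
    # the remaining experts' parameters stay idle for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D_MODEL)
out = moe_forward(token)
active_fraction = TOP_K / NUM_EXPERTS
print(f"active experts per token: {TOP_K}/{NUM_EXPERTS} "
      f"({active_fraction:.0%} of expert parameters)")
```

The same principle, scaled up, is how a 120.6B-parameter model can run with only 12.7B parameters active per token: compute cost tracks the active subset, while total knowledge capacity tracks the full parameter count.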
Beyond raw speed, NVIDIA is doubling down on transparency by releasing not just the model weights, but also the training data and detailed methodology. This goes further than a typical "open weights" release, allowing researchers to look under the hood and understand exactly how the model was built. With a context window of one million tokens, the model can process entire libraries of documents or complex codebases in a single pass, positioning it as a top-tier tool for agentic workflows and real-world industrial applications.