NVIDIA Nemotron 3 Nano Launches on Amazon Bedrock
- •NVIDIA Nemotron 3 Nano 30B model now available as serverless endpoint on Amazon Bedrock.
- •New hybrid architecture combines Transformer and Mamba layers with Mixture-of-Experts for high-efficiency reasoning.
- •Model achieves top scores on SWE-bench and AIME 2025 benchmarks for coding and mathematics.
NVIDIA has expanded its presence on Amazon Bedrock by introducing Nemotron 3 Nano, a sophisticated 30-billion parameter model designed for high-performance enterprise applications. Unlike traditional dense models, this version utilizes a Mixture-of-Experts (MoE) architecture, where only 3 billion parameters are active at any given time. This design allows the model to "think fast" by maintaining high accuracy while significantly reducing the computational power required for each response.
The technical backbone of Nemotron 3 Nano is particularly innovative, featuring a hybrid design that merges Transformer and Mamba architectures. While Transformers excel at structured reasoning and complex planning, the Mamba component handles long-range information with minimal memory overhead. This synergy, paired with a massive 256,000-token context window—equivalent to several hundred pages of text—makes it an ideal candidate for complex software development and financial data analysis.
By deploying through Amazon Bedrock, developers can access these capabilities via a fully managed, serverless environment. This eliminates the need for manual infrastructure management, allowing teams to focus on building features rather than server maintenance. The model also integrates seamlessly with AWS safety tools and retrieval systems, enabling businesses to create secure, data-driven assistants that can ground their answers in private corporate documents.