AWS to Deploy One Million NVIDIA GPUs for Agentic AI
- AWS will deploy over one million NVIDIA Blackwell and Rubin GPUs across global cloud regions.
- New EC2 instances featuring RTX PRO 4500 GPUs target high-performance conversational AI and rendering.
- Amazon Bedrock integrates NVIDIA Nemotron 3 Super and native reinforcement fine-tuning for specialized domains.
The collaboration between AWS and NVIDIA has reached a massive scale, shifting focus from experimental AI pilots to robust production environments. Central to this expansion is the deployment of over one million next-generation Blackwell and Rubin GPUs across AWS regions starting in 2026. This infrastructure surge is specifically designed to support the rise of agentic AI—systems that do not just predict text but can reason, plan, and execute multi-step workflows autonomously across complex business environments.
To optimize these workloads, AWS is introducing the NVIDIA Inference Xfer Library (NIXL). This tool facilitates "disaggregated inference," a technique where different parts of an AI model's processing are split across multiple chips or servers. By streamlining how data moves between these components, NIXL minimizes the communication delays that often slow down large models, ensuring that high-speed responses are possible even as model sizes continue to grow.
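The prefill/decode split behind disaggregated inference can be illustrated with a toy sketch. This is not NIXL's actual API; the worker functions, the tiny projection matrix, and the mean-based "attention" are all hypothetical stand-ins that only show the shape of the technique: one worker processes the full prompt and hands off a KV cache, and a second worker generates tokens from that cache.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # toy hidden size (illustrative, not a real model dimension)

# Hypothetical model state: a single projection standing in for attention layers.
W = rng.standard_normal((D, D))

def prefill_worker(prompt_tokens: np.ndarray) -> np.ndarray:
    """Process the whole prompt at once and return a KV cache.

    In disaggregated serving, this compute-bound phase can run on one
    set of GPUs optimized for throughput."""
    return prompt_tokens @ W  # one "KV" entry per prompt token

def decode_worker(kv_cache: np.ndarray, steps: int) -> list:
    """Generate tokens one at a time, reusing the transferred cache.

    This memory-bandwidth-bound phase can run on different hardware;
    the library's job is to move the cache between them quickly."""
    outputs = []
    state = kv_cache
    for _ in range(steps):
        # "Attend" over the cache (here: just a mean) to produce the next token.
        next_tok = state.mean(axis=0)
        outputs.append(next_tok)
        state = np.vstack([state, next_tok @ W])  # append new entry to the cache
    return outputs

prompt = rng.standard_normal((3, D))
cache = prefill_worker(prompt)        # phase 1: prefill on worker A
generated = decode_worker(cache, 2)   # phase 2: decode on worker B, after transfer
print(len(generated))  # 2
```

In a real deployment, the cache handoff between the two workers crosses chips or servers, which is exactly the data movement NIXL is designed to streamline.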
On the software front, Amazon Bedrock is expanding its library to include NVIDIA’s Nemotron 3 Super. This model uses a "Mixture of Experts" (MoE) architecture, which works like a team of specialists: for any given task, only the most relevant experts are activated while the rest stay idle. Furthermore, developers will soon have access to Reinforcement Fine-Tuning (RFT), allowing them to shape how a model reasons and responds based on specific feedback, which is crucial for high-stakes industries like legal and healthcare.
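The routing idea at the heart of MoE can be sketched in a few lines. This is a minimal toy, not Nemotron 3's actual architecture: the expert count, dimensions, and router are all invented for illustration. A small router scores the experts for each token, the top-k are run, and their outputs are blended by softmax weights while the remaining experts' parameters stay untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, NUM_EXPERTS, TOP_K = 8, 4, 2  # toy sizes, far smaller than a real model

# Each "expert" is a small feed-forward layer; the router decides
# which experts see a given token.
expert_weights = rng.standard_normal((NUM_EXPERTS, HIDDEN, HIDDEN))
router_weights = rng.standard_normal((HIDDEN, NUM_EXPERTS))

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts and blend their outputs."""
    logits = token @ router_weights            # router score per expert
    top = np.argsort(logits)[-TOP_K:]          # indices of the best-scoring experts
    gates = np.exp(logits[top])
    gates = gates / gates.sum()                # softmax over the chosen experts
    # Only the selected experts execute, so most parameters stay idle per token.
    return sum(g * (token @ expert_weights[e]) for g, e in zip(gates, top))

token = rng.standard_normal(HIDDEN)
out = moe_forward(token)
print(out.shape)  # (8,)
```

This sparsity is why MoE models can grow total parameter count without a proportional increase in per-token compute.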