News Summary:  

  • The Rubin platform integrates hardware and software design to reduce inference token costs by up to 10x and decrease GPU requirements for MoE model training by 4x compared with the NVIDIA Blackwell platform.  
  • NVIDIA Spectrum-X Ethernet Photonics switch systems provide five times greater power efficiency and uptime.  
  • The new NVIDIA Inference Context Memory Storage Platform, powered by the NVIDIA BlueField-4 storage processor, accelerates agentic AI reasoning.  
  • Microsoft’s next-generation Fairwater AI super factories featuring NVIDIA Vera Rubin NVL72 rack-scale systems will scale to hundreds of thousands of NVIDIA Vera Rubin superchips.  
  • CoreWeave is among the first to offer NVIDIA Rubin managed through CoreWeave Mission Control to ensure flexibility and performance.  
  • NVIDIA has expanded collaboration with Red Hat to deliver a complete AI stack optimized for the Rubin platform, including Red Hat Enterprise Linux, Red Hat OpenShift, and Red Hat AI.  

NVIDIA launched the NVIDIA Rubin Platform, which includes six new chips designed to deliver a high-performance AI supercomputer. Rubin sets a new standard for building, deploying, and securing large-scale AI systems at lower cost, accelerating mainstream AI adoption.  

The Rubin platform applies extreme co-design across six chips:  

  1. NVIDIA Vera CPU  
  2. NVIDIA Rubin GPU  
  3. NVIDIA NVLink 6 switch  
  4. NVIDIA ConnectX-9 SuperNIC  
  5. NVIDIA BlueField-4 DPU  
  6. NVIDIA Spectrum-6 Ethernet switch  

to reduce training time and inference token costs.  

“Rubin arrives at exactly the right moment as AI computing demand for both training and inference is going through the roof,” said Jensen Huang, founder and CEO of NVIDIA. “With our annual cadence of delivering a new generation of AI supercomputers and extreme co-design across six new chips, Rubin takes a giant step toward the next frontier of AI.”  

Named for Vera Florence Cooper Rubin, the pioneering American astronomer whose discoveries transformed our understanding of the universe, the Rubin platform features the NVIDIA Vera Rubin NVL72 rack-scale solution and the NVIDIA HGX Rubin NVL8 system.  

The Rubin Platform introduces five innovations:  

  1. The latest NVIDIA NVLink interconnect technology  
  2. Transformer Engine  
  3. Confidential Computing  
  4. RAS engine  
  5. NVIDIA Vera CPU  

These innovations accelerate agentic AI, advanced reasoning, and large-scale MoE model inference at up to 10x lower cost per token than the NVIDIA Blackwell platform. Rubin also trains MoE models with 4x fewer GPUs, further accelerating AI adoption.  
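The efficiency claim above rests on how mixture-of-experts (MoE) models work: a router activates only a few experts per token, so most parameters sit idle on any given forward pass. The sketch below is a generic toy illustration of top-k expert routing, not NVIDIA's implementation; all names (`moe_forward`, `gate_w`) are invented for this example.

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    """Route each token to its top-k experts and mix their outputs.

    x:        (tokens, dim) input activations
    experts:  list of (dim, dim) weight matrices, one per expert
    gate_w:   (dim, n_experts) router weights
    """
    logits = x @ gate_w                              # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # indices of the top-k experts
    out = np.zeros_like(x)
    active = 0                                       # count of expert invocations
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]                      # softmax over the selected experts only
        w = np.exp(sel - sel.max()); w /= w.sum()
        for k, e in enumerate(top[t]):
            out[t] += w[k] * (x[t] @ experts[e])
            active += 1
    return out, active

rng = np.random.default_rng(0)
dim, n_experts, tokens = 16, 8, 4
experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
gate_w = rng.normal(size=(dim, n_experts))
x = rng.normal(size=(tokens, dim))
y, active = moe_forward(x, experts, gate_w, top_k=2)
# Only 2 of 8 experts run per token: 4x fewer expert FLOPs than a dense mix
print(active, tokens * n_experts)
```

With 2 of 8 experts active per token, expert compute drops 4x versus running all experts densely, which is the kind of sparsity that hardware and networking co-design aims to exploit.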

Broad Ecosystem Support 

Leading AI labs, cloud service providers, computer manufacturers, and startups expected to adopt Rubin include: Amazon Web Services (AWS), Anthropic, Black Forest Labs, Cisco, Cohere, CoreWeave, Cursor, Dell Technologies, Google, Harvey, HPE, Lambda, Lenovo, Meta, Microsoft, Mistral AI, Nebius, Nscale, OpenAI, OpenEvidence, Oracle Cloud Infrastructure (OCI), Perplexity, Runway, Supermicro, Thinking Machines Lab, and xAI.  

“Intelligence scales with compute,” said Sam Altman, CEO of OpenAI. “When we add more compute, models get more capable, solve harder problems, and make a bigger impact on people. The NVIDIA Rubin platform helps us keep scaling this progress so advanced intelligence benefits everyone.”  

“The efficiency gains in the NVIDIA Rubin platform represent the kind of infrastructure progress that enables longer memory, better reasoning, and more reliable outputs,” said Dario Amodei, co-founder and CEO of Anthropic. “Our collaboration with NVIDIA helps power our safety research and our frontier models.”  

“NVIDIA’s Rubin platform promises to deliver the next leap in performance and efficiency required to deploy the most advanced models to billions of people,” said Mark Zuckerberg, founder and CEO of Meta.  

“NVIDIA Rubin will be a rocket engine for AI,” said Elon Musk, founder and CEO of xAI. “If you want to train and deploy frontier models at scale, this is the infrastructure you use. Rubin will remind the world that NVIDIA is the gold standard.”  

“We are building the world’s most powerful AI super factories to serve any workload anywhere with maximum performance and efficiency,” said Satya Nadella, chairman and CEO of Microsoft. “With the addition of NVIDIA Vera CPUs and Rubin GPUs, we will empower developers and organizations to create, reason, and scale in entirely new ways.”  

Engineered to Scale Intelligence 

Agentic AI reasoning models and advanced video generation workloads are driving up computational requirements. Multi-step problem-solving requires models to process, reason, and act over extended token sequences. The Rubin platform meets these requirements with five key technologies.  

  • Sixth-generation NVIDIA NVLink: Provides high-speed GPU-to-GPU communication for large MoE models. Each GPU delivers 3.6 TB/s of bandwidth, and the Vera Rubin NVL72 rack offers 260 TB/s of aggregate network bandwidth. In-network compute accelerates collective operations, while new features improve serviceability and resiliency. The NVLink 6 switch supports efficient AI training and inference at scale.  
  • NVIDIA Vera CPU: Designed for agentic reasoning, NVIDIA Vera is a power-efficient CPU for large-scale AI operations. It features 88 custom Olympus cores, full Armv9.2 compatibility, and high-speed NVLink-C2C connectivity. Vera provides strong performance, bandwidth, and capability for contemporary data center workloads.  
  • NVIDIA Rubin GPU: A third-generation Transformer Engine with hardware-accelerated adaptive compression enables the Rubin GPU to deliver 50 petaflops of NVFP4 compute for AI inference.  
  • Third-generation NVIDIA Confidential Computing: Vera Rubin NVL72 is the first rack-scale platform to offer confidential computing across CPU, GPU, and NVLink domains, protecting both large proprietary models and training and inference workloads.  
  • Second-generation RAS engine: Spanning GPUs, CPUs, and NVLink, the Rubin platform includes real-time health checks, fault tolerance, and preemptive maintenance to boost system performance. A modular, cable-free tray design enables up to 18x faster assembly and servicing than Blackwell.  
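The NVFP4 figure above refers to a 4-bit floating-point format. The article doesn't detail NVFP4's internals, so the sketch below is a generic illustration of 4-bit E2M1 block quantization with a simple per-block max scale (the actual NVFP4 scaling scheme may differ): values are scaled per block and snapped to the small grid of magnitudes a 4-bit float can represent, which is what makes very cheap low-precision math possible.

```python
import numpy as np

# Representable magnitudes of a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit)
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_block(x, block=8):
    """Quantize x with one scale per block, snapping magnitudes to the FP4 grid."""
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / FP4_GRID[-1]  # block max maps to 6.0
    scale[scale == 0] = 1.0
    scaled = x / scale
    # snap each |value| to the nearest grid point, keep the sign
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    q = np.sign(scaled) * FP4_GRID[idx]
    return (q * scale).ravel(), scale.ravel()

rng = np.random.default_rng(1)
x = rng.normal(size=32).astype(np.float32)
xq, scales = quantize_fp4_block(x, block=8)
err = np.abs(x - xq).max()  # bounded by block_scale * half the widest grid gap
```

Each 8-value block stores thirty-two 4-bit codes plus one scale, versus 32-bit floats, roughly an 8x memory and bandwidth reduction at the price of the quantization error measured above.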

AI Native Storage And Secure Software-Defined Infrastructure 

The NVIDIA Rubin platform introduces the NVIDIA Inference Context Memory Storage Platform, an AI-native storage solution created to scale inference context to gigascale.  

Powered by NVIDIA BlueField-4, the platform enables efficient sharing and reuse of key-value cache data across AI infrastructure. This improves responsiveness and throughput and supports predictable, power-efficient scaling of agentic AI.  
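The idea of sharing key-value cache data can be sketched in miniature: sessions that begin with the same prompt prefix can reuse the attention keys/values computed for that prefix instead of recomputing them. This toy store is purely illustrative; the class and function names (`KVCacheStore`, `fake_compute_kv`) are invented and do not reflect NVIDIA's actual interfaces.

```python
import hashlib

class KVCacheStore:
    """Toy content-addressed store for per-prefix KV cache blobs."""
    def __init__(self):
        self._store = {}   # prefix hash -> cached "KV" payload
        self.hits = 0
        self.misses = 0

    def _key(self, tokens):
        return hashlib.sha256(" ".join(map(str, tokens)).encode()).hexdigest()

    def get_or_compute(self, prefix_tokens, compute_kv):
        k = self._key(prefix_tokens)
        if k in self._store:
            self.hits += 1          # reuse: no recomputation over the prefix
        else:
            self.misses += 1
            self._store[k] = compute_kv(prefix_tokens)
        return self._store[k]

def fake_compute_kv(tokens):
    # Stand-in for running the model's attention layers over the prefix.
    return [t * 2 for t in tokens]

store = KVCacheStore()
system_prompt = [101, 7, 42, 9]
for _ in range(3):                  # three sessions share the same system prompt
    kv = store.get_or_compute(system_prompt, fake_compute_kv)
print(store.misses, store.hits)     # prefix computed once, then reused
```

In a real deployment the cached payloads are large attention tensors and the store spans GPUs, hosts, and storage tiers, which is why dedicated hardware for moving and securing this data matters.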

As AI factories adopt bare-metal and multi-tenant deployment models, preserving robust infrastructure control and isolation is essential.  

BlueField-4 introduces the Advanced Secure Trusted Resource Architecture (ASTRA). This system-level trust framework provides a single secure control point for provisioning, isolating, and operating large-scale AI environments without compromising performance.  

As AI applications advance toward multi-turn agentic reasoning, organizations must manage and share significantly larger volumes of inference context across users, sessions, and services.  

Different Forms for Different Workloads 

NVIDIA Vera Rubin NVL72 is a unified, secure system that integrates 72 NVIDIA Rubin GPUs, 36 NVIDIA Vera CPUs, NVIDIA NVLink 6, NVIDIA ConnectX-9 SuperNICs, and NVIDIA BlueField-4 DPUs.  

NVIDIA will also offer the HGX Rubin NVL8 platform, a server board that connects eight Rubin GPUs via NVLink to support x86-based generative AI platforms. HGX Rubin NVL8 accelerates training, inference, and scientific computing for AI and high-performance computing workloads.  

NVIDIA DGX SuperPOD provides a reference architecture for large-scale deployment of Rubin-based systems, integrating DGX Vera Rubin NVL72 or DGX Rubin NVL8 systems with NVIDIA BlueField-4 DPUs, ConnectX-9 SuperNICs, InfiniBand networking, and NVIDIA Mission Control software.  

Next Generation Ethernet Networking 

Advanced Ethernet networking and storage are critical to sustaining data center performance, efficiency, and cost-effectiveness in AI infrastructure.  

NVIDIA Spectrum-6 Ethernet is the next generation of AI networking Ethernet, designed to scale Rubin-based AI factories with greater efficiency and resilience. It features 200G SerDes communication circuitry, co-packaged optics, and AI-optimized fabrics.  

Based on the Spectrum-6 architecture, Spectrum-X Ethernet Photonics co-packaged optical switch systems provide 10x greater reliability, 5x longer uptime, and 5x better power efficiency for AI applications, maximizing performance per watt compared with traditional approaches. Spectrum-XGS Ethernet technology, part of the Spectrum-X platform, allows facilities separated by hundreds of kilometers or more to operate as a single AI environment.  

Together, these innovations define the next generation of the NVIDIA Spectrum-X Ethernet platform, engineered through close co-design with Rubin to support massive-scale AI factories and multi-GPU environments.  

Rubin Readiness 

NVIDIA Rubin is now in full production. Rubin-based products will be available from partners in the second half of 2026.  

AWS, Google Cloud, Microsoft, and OCI will be among the first cloud providers to deploy Vera-Rubin-based instances in 2026, along with NVIDIA Cloud Partners, CoreWeave, Lambda, Nebius, and Nscale.  

Microsoft will deploy NVIDIA Vera Rubin NVL72 rack-scale systems in its next-generation AI datacenters, including future Fairwater AI super factory sites.  

The Rubin platform will provide the foundation for Microsoft’s next-generation cloud AI capabilities by delivering high efficiency and performance for training and inference workloads. Microsoft Azure will offer an optimized platform to help customers accelerate innovation across enterprise, research, and consumer applications.  

CoreWeave will integrate NVIDIA Rubin-based systems into its cloud platform starting in the second half of 2026. Its platform accommodates multiple architectures, allowing customers to adopt Rubin for training, inference, and agentic workloads.  

CoreWeave and NVIDIA will support AI innovators in harnessing Rubin’s advances in reasoning and MoE models. CoreWeave will continue to provide the performance, reliability, and scale needed for production AI throughout the life cycle with CoreWeave Mission Control.  

Cisco, Dell, HPE, Lenovo, and Supermicro are also expected to deliver a range of servers based on Rubin products.  

AI labs such as Anthropic, Black Forest Labs, Cohere, Cursor, Harvey, Meta, Mistral AI, OpenAI, OpenEvidence, Perplexity, Runway, Thinking Machines Lab, and xAI plan to use the NVIDIA Rubin platform to train larger models and provide long-context, multimodal services with lower latency and lower cost than previous GPU generations.  

Infrastructure, software, and storage partners, including AIC, Canonical, Cloudian, DDN, Dell, HPE, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, Supermicro, SUSE, VAST Data, and WEKA, are collaborating with NVIDIA to design next-generation Rubin infrastructure platforms. The Rubin platform represents NVIDIA’s third-generation rack-scale architecture and includes more than 80 NVIDIA MGX ecosystem partners.  

Red Hat today announced an expanded collaboration with NVIDIA to deliver a complete AI stack optimized for the NVIDIA Rubin platform, powered by Red Hat’s hybrid cloud portfolio, including Red Hat Enterprise Linux, Red Hat OpenShift, and Red Hat AI. These solutions are used by the vast majority of Fortune Global 500 companies. 

Source: https://nvidianews.nvidia.com/news/rubin-platform-ai-supercomputer 

