A Need for Faster Scale-Up Interconnects

Unlocking the full potential of exascale computing and trillion-parameter AI models hinges on swift, seamless communication between every GPU in a server cluster. The fifth generation of NVIDIA® NVLink® is a scale-up interconnect that unleashes accelerated performance for trillion- and multi-trillion-parameter AI models.

NVLink Performance

NVLink in the NVIDIA H100 increases inter-GPU communication bandwidth 1.5X over the previous generation, so researchers can run larger, more sophisticated applications to solve more complex problems.

Raise GPU Throughput With NVLink Communications

Fully Connect GPUs With NVIDIA NVLink and NVLink Switch

NVLink is a 1.8TB/s bidirectional, direct GPU-to-GPU interconnect that scales multi-GPU input and output (IO) within a server. The NVIDIA NVLink Switch chips connect multiple NVLinks to provide all-to-all GPU communication at full NVLink speed within a single rack and between racks. 

To enable high-speed, collective operations, each NVLink Switch has engines for NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ for in-network reductions and multicast acceleration.   
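SHARP moves the reduction step out of the GPUs and into the switch itself: rather than every GPU exchanging partial sums with every peer, the switch aggregates each GPU's contribution once and multicasts the single result. A toy Python sketch of that pattern (the GPU count and payloads are illustrative; this is not the real NCCL or SHARP API):

```python
# Toy model of an in-network allreduce: the "switch" sums each
# participant's contribution once, then multicasts one result,
# instead of GPUs exchanging partial sums pairwise.

def switch_allreduce(contributions):
    """Reduce all vectors at the switch, then broadcast one copy per GPU."""
    reduced = [sum(vals) for vals in zip(*contributions)]   # in-network sum
    return [list(reduced) for _ in contributions]           # multicast

# Four illustrative "GPUs", each holding a gradient shard.
gpu_buffers = [[1, 2], [3, 4], [5, 6], [7, 8]]
results = switch_allreduce(gpu_buffers)
# Every GPU ends up with the same reduced vector: [16, 20]
```

The point of the pattern is that the reduction traffic crosses each GPU's links once, instead of scaling with the number of peers.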

Train Multi-Trillion Parameter Models With NVLink Switch System

With NVLink Switch, NVLink connections can be extended across nodes to create a seamless, high-bandwidth, multi-node GPU cluster, effectively forming a data-center-sized GPU. NVIDIA NVLink Switch enables 130TB/s of GPU bandwidth in one NVL72 for large model parallelism. Multi-server clusters with NVLink scale GPU communications in balance with the increased computing, so NVL72 can support 9X the GPU count of a single eight-GPU system.
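The headline figures follow from simple arithmetic, assuming the 1.8TB/s per-GPU NVLink bandwidth stated above (a back-of-the-envelope check, not an official derivation):

```python
# Back-of-the-envelope check of the NVL72 figures quoted above.
per_gpu_bw_tbs = 1.8     # fifth-generation NVLink, TB/s per GPU
gpus_in_nvl72 = 72

aggregate_tbs = gpus_in_nvl72 * per_gpu_bw_tbs   # 129.6 TB/s, quoted as 130TB/s
scale_vs_8gpu = gpus_in_nvl72 / 8                # 9X the GPUs of an eight-GPU server
```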

NVIDIA NVLink Switch

The NVIDIA NVLink Switch features 144 NVLink ports with a non-blocking switching capacity of 14.4TB/s. The rack switch is designed to provide high bandwidth and low latency in NVIDIA GB200 NVL72 systems, supporting external fifth-generation NVLink connectivity.
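These figures are self-consistent: dividing the switching capacity by the port count gives the per-port rate, which matches the per-GPU bandwidth spread over the 18 links each GPU provides in the fifth generation (an arithmetic sanity check, assuming the link count from the specifications below):

```python
# Arithmetic sanity check of the NVLink Switch port figures.
ports = 144
capacity_gbs = 14_400          # 14.4 TB/s expressed in GB/s
per_port_gbs = capacity_gbs / ports          # 100 GB/s per NVLink port

per_gpu_gbs = 1_800            # fifth-generation NVLink bandwidth per GPU
links_per_gpu = 18
per_link_gbs = per_gpu_gbs / links_per_gpu   # also 100 GB/s per link
```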

Scaling From Enterprise to Exascale

Full Connection for Unparalleled Performance

The NVLink Switch is the first rack-level switch chip capable of supporting up to 576 fully connected GPUs in a non-blocking compute fabric. It interconnects every GPU pair at an incredible 1,800GB/s and supports full all-to-all communication, allowing the 72 GPUs in GB200 NVL72 to be used as a single high-performance accelerator with up to 1.4 exaFLOPS of AI compute power.

The Most Powerful AI and HPC Platform

NVLink and NVLink Switch are essential building blocks of the complete NVIDIA data center solution, which incorporates hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA AI Enterprise software suite and the NVIDIA NGC™ catalog. As the most powerful end-to-end AI and HPC platform, it allows researchers to deliver real-world results and deploy solutions into production, driving unprecedented acceleration at every scale.

Specifications


NVLink

                                  Second Generation   Third Generation   Fourth Generation   Fifth Generation
NVLink bandwidth per GPU          300GB/s             600GB/s            900GB/s             1,800GB/s
Maximum number of links per GPU   6                   12                 18                  18
Supported NVIDIA architectures    NVIDIA Volta™       NVIDIA Ampere      NVIDIA Hopper™      NVIDIA Blackwell
NVLink Switch

                                                      First Generation   Second Generation   Third Generation   NVLink Switch
GPUs with direct connection within an NVLink domain   Up to 8            Up to 8             Up to 8            Up to 576
NVSwitch GPU-to-GPU bandwidth                         300GB/s            600GB/s             900GB/s            1,800GB/s
Total aggregate bandwidth                             2.4TB/s            4.8TB/s             7.2TB/s            1PB/s
Supported NVIDIA architectures                        NVIDIA Volta™      NVIDIA Ampere       NVIDIA Hopper™     NVIDIA Blackwell

Preliminary specifications; subject to change.
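The aggregate-bandwidth row can be cross-checked against the per-GPU rates and domain sizes in the table, treating 1PB/s as a rounded 576 × 1.8TB/s (a consistency check, not an official derivation):

```python
# Cross-check: total aggregate bandwidth = GPUs in domain x per-GPU bandwidth.
generations = {
    # name: (GPUs per NVLink domain, per-GPU bandwidth in GB/s)
    "first":         (8,   300),
    "second":        (8,   600),
    "third":         (8,   900),
    "nvlink_switch": (576, 1_800),
}
totals_tbs = {
    name: gpus * per_gpu_gbs / 1000
    for name, (gpus, per_gpu_gbs) in generations.items()
}
# 2.4, 4.8, and 7.2 TB/s match the table; 1036.8 TB/s rounds to the quoted 1PB/s.
```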

Take a Deep Dive into the NVIDIA Blackwell Architecture