Unlocking the full potential of exascale computing and trillion-parameter AI models hinges on swift, seamless communication between every GPU within a server cluster. The fifth generation of NVIDIA® NVLink® is a scale-up interconnect that unleashes accelerated performance for trillion- and multi-trillion-parameter AI models.
Fifth-generation NVLink vastly improves scalability for larger multi-GPU systems. A single NVIDIA Blackwell Tensor Core GPU supports up to 18 NVLink 100 gigabyte-per-second (GB/s) connections for a total bandwidth of 1.8 terabytes per second (TB/s)—2X more bandwidth than the previous generation and over 14X the bandwidth of PCIe Gen5. Server platforms like the GB200 NVL72 take advantage of this technology to deliver greater scalability for today’s most complex large models.
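As a quick sanity check on those figures, the short C++ sketch below multiplies them out. The PCIe Gen5 x16 number (~128GB/s bidirectional) is an assumption drawn from the published PCIe specification, not a figure from this page.

```cpp
// Back-of-the-envelope check of the NVLink bandwidth claims above.
#include <cstdio>

int main() {
    const double link_bw_gbs   = 100.0;  // one fifth-generation NVLink, bidirectional
    const int    links_per_gpu = 18;     // per Blackwell Tensor Core GPU
    const double total_gbs     = link_bw_gbs * links_per_gpu;

    const double prev_gen_gbs  = 900.0;  // fourth-generation NVLink (Hopper)
    const double pcie5_gbs     = 128.0;  // assumption: PCIe Gen5 x16, both directions

    printf("Per-GPU NVLink bandwidth: %.0f GB/s (1.8 TB/s)\n", total_gbs);
    printf("vs. previous generation:  %.1fX\n", total_gbs / prev_gen_gbs);  // 2.0X
    printf("vs. PCIe Gen5 x16:        %.1fX\n", total_gbs / pcie5_gbs);     // ~14.1X
    return 0;
}
```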
Fourth-generation NVLink in the NVIDIA H100 increases inter-GPU communication bandwidth 1.5X over the previous generation, so researchers can run larger, more sophisticated applications to solve more complex problems.
NVLink is a 1.8TB/s bidirectional, direct GPU-to-GPU interconnect that scales multi-GPU input and output (IO) within a server. The NVIDIA NVLink Switch chips connect multiple NVLinks to provide all-to-all GPU communication at full NVLink speed within a single rack and between racks.
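From an application's perspective, this direct GPU-to-GPU path is usually reached through CUDA peer-to-peer (P2P) access; when two GPUs are joined by NVLink, peer copies travel over it rather than PCIe. The sketch below shows the minimal call sequence, with error handling omitted for brevity.

```cpp
// Minimal CUDA peer-to-peer sketch: copy a buffer from GPU 1 to GPU 0
// directly, without staging through host memory.
#include <cuda_runtime.h>

int main() {
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, /*device*/0, /*peerDevice*/1);

    if (canAccess) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // map GPU 1's memory into GPU 0's address space

        float *src, *dst;
        const size_t bytes = 1 << 20;
        cudaMalloc(&dst, bytes);           // allocated on GPU 0
        cudaSetDevice(1);
        cudaMalloc(&src, bytes);           // allocated on GPU 1

        // Direct GPU-to-GPU copy; travels over NVLink when the GPUs are linked.
        cudaMemcpyPeer(dst, /*dstDevice*/0, src, /*srcDevice*/1, bytes);

        cudaFree(src);
        cudaSetDevice(0);
        cudaFree(dst);
    }
    return 0;
}
```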
To enable high-speed, collective operations, each NVLink Switch has engines for NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ for in-network reductions and multicast acceleration.
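Applications typically reach SHARP through a collective library rather than programming the switch directly. For instance, NCCL's allreduce can offload the summation to the NVLink Switch (NCCL's NVLS algorithm) on fabrics that support it; the single-process, two-GPU sketch below shows the call pattern, with error checking omitted.

```cpp
// Hedged sketch of an NCCL all-reduce; on supported fabrics, recent NCCL
// versions can perform the reduction in the NVLink Switch via SHARP (NVLS).
#include <nccl.h>
#include <cuda_runtime.h>

int main() {
    const int nDev = 2;
    int devs[nDev] = {0, 1};
    ncclComm_t comms[nDev];
    ncclCommInitAll(comms, nDev, devs);  // one communicator per local GPU

    const size_t count = 1 << 20;
    float* buf[nDev];
    cudaStream_t streams[nDev];
    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(devs[i]);
        cudaMalloc(&buf[i], count * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    // In-place sum across both GPUs, issued as one grouped operation.
    ncclGroupStart();
    for (int i = 0; i < nDev; ++i)
        ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum, comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(devs[i]);
        cudaStreamSynchronize(streams[i]);
        cudaFree(buf[i]);
        ncclCommDestroy(comms[i]);
    }
    return 0;
}
```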
With NVLink Switch, NVLink connections can be extended across nodes to create a seamless, high-bandwidth, multi-node GPU cluster—effectively forming a data center-sized GPU. NVIDIA NVLink Switch enables 130TB/s of GPU bandwidth in one NVL72 for large model parallelism. Multi-server clusters with NVLink scale GPU communications in balance with the increased computing, so the NVL72 can support 9X the GPU count of a single eight-GPU system.
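Both NVL72 figures fall out of simple multiplication, spelled out in the sketch below.

```cpp
// Where the NVL72 numbers above come from.
#include <cstdio>

int main() {
    const int    gpus        = 72;   // GPUs in a GB200 NVL72
    const double per_gpu_tbs = 1.8;  // fifth-generation NVLink bandwidth per GPU

    printf("Aggregate NVLink bandwidth: %.1f TB/s\n", gpus * per_gpu_tbs);  // ~129.6, quoted as 130
    printf("GPU count vs. 8-GPU server: %dX\n", gpus / 8);                  // 9X
    return 0;
}
```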
The NVIDIA NVLink Switch features 144 NVLink ports with a non-blocking switching capacity of 14.4TB/s. The rack switch is designed to provide high bandwidth and low latency in NVIDIA GB200 NVL72 systems supporting external fifth-generation NVLink connectivity.
The NVLink Switch is the first rack-level switch chip capable of supporting up to 576 fully connected GPUs in a non-blocking compute fabric, interconnecting every GPU pair at an incredible 1,800GB/s with full all-to-all communication. The 72 GPUs in GB200 NVL72 can be used as a single high-performance accelerator with up to 1.4 exaFLOPS of AI compute power.
NVLink and NVLink Switch are essential building blocks of the complete NVIDIA data center solution that incorporates hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA AI Enterprise software suite and the NVIDIA NGC™ catalog. The most powerful end-to-end AI and HPC platform, it allows researchers to deliver real-world results and deploy solutions into production, driving unprecedented acceleration at every scale.
| | Second Generation | Third Generation | Fourth Generation | Fifth Generation |
| --- | --- | --- | --- | --- |
| NVLink bandwidth per GPU | 300GB/s | 600GB/s | 900GB/s | 1,800GB/s |
| Maximum number of links per GPU | 6 | 12 | 18 | 18 |
| Supported NVIDIA architectures | NVIDIA Volta™ architecture | NVIDIA Ampere architecture | NVIDIA Hopper™ architecture | NVIDIA Blackwell architecture |
| | First Generation | Second Generation | Third Generation | NVLink Switch |
| --- | --- | --- | --- | --- |
| Number of GPUs with direct connection within an NVLink domain | Up to 8 | Up to 8 | Up to 8 | Up to 576 |
| NVSwitch GPU-to-GPU bandwidth | 300GB/s | 600GB/s | 900GB/s | 1,800GB/s |
| Total aggregate bandwidth | 2.4TB/s | 4.8TB/s | 7.2TB/s | 1PB/s |
| Supported NVIDIA architectures | NVIDIA Volta™ architecture | NVIDIA Ampere architecture | NVIDIA Hopper™ architecture | NVIDIA Blackwell architecture |
Preliminary specifications; may be subject to change.