Unlocking the full potential of exascale computing and trillion-parameter AI models hinges on swift, seamless communication between every GPU within a server cluster. The fifth generation of NVIDIA® NVLink® is a scale-up interconnect that unleashes accelerated performance for trillion- and multi-trillion-parameter AI models.
Fifth-generation NVLink vastly improves scalability for larger multi-GPU systems. A single NVIDIA Blackwell Tensor Core GPU supports up to 18 NVLink 100 gigabyte-per-second (GB/s) connections for a total bandwidth of 1.8 terabytes per second (TB/s)—2X more bandwidth than the previous generation and over 14X the bandwidth of PCIe Gen5. Server platforms like the GB200 NVL72 take advantage of this technology to deliver greater scalability for today’s most complex large models.
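As a quick sanity check on those figures, the short C++ sketch below multiplies them out. The PCIe Gen5 x16 number (~128GB/s bidirectional) is an assumption drawn from the published PCIe specification, not a figure from this page.

```cpp
// Back-of-the-envelope check of the NVLink bandwidth claims above.
#include <cstdio>

int main() {
    const double link_bw_gbs   = 100.0;  // one fifth-generation NVLink, bidirectional
    const int    links_per_gpu = 18;     // per Blackwell Tensor Core GPU
    const double total_gbs     = link_bw_gbs * links_per_gpu;

    const double prev_gen_gbs  = 900.0;  // fourth-generation NVLink (Hopper)
    const double pcie5_gbs     = 128.0;  // assumption: PCIe Gen5 x16, both directions

    printf("Per-GPU NVLink bandwidth: %.0f GB/s (1.8 TB/s)\n", total_gbs);
    printf("vs. previous generation:  %.1fX\n", total_gbs / prev_gen_gbs);  // 2.0X
    printf("vs. PCIe Gen5 x16:        %.1fX\n", total_gbs / pcie5_gbs);     // ~14.1X
    return 0;
}
```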
Fourth-generation NVLink in the NVIDIA H100 increases inter-GPU communication bandwidth 1.5X over the previous generation, so researchers can run larger, more sophisticated applications to solve more complex problems.
NVLink is a 1.8TB/s bidirectional, direct GPU-to-GPU interconnect that scales multi-GPU input and output (IO) within a server. The NVIDIA NVLink Switch chips connect multiple NVLinks to provide all-to-all GPU communication at full NVLink speed within a single rack and between racks.
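From an application's perspective, this direct GPU-to-GPU path is usually reached through CUDA peer-to-peer (P2P) access; when two GPUs are joined by NVLink, peer copies travel over it rather than PCIe. The sketch below shows the minimal call sequence, with error handling omitted for brevity.

```cpp
// Minimal CUDA peer-to-peer sketch: copy a buffer from GPU 1 to GPU 0
// directly, without staging through host memory.
#include <cuda_runtime.h>

int main() {
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, /*device*/0, /*peerDevice*/1);

    if (canAccess) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // map GPU 1's memory into GPU 0's address space

        float *src, *dst;
        const size_t bytes = 1 << 20;
        cudaMalloc(&dst, bytes);           // allocated on GPU 0
        cudaSetDevice(1);
        cudaMalloc(&src, bytes);           // allocated on GPU 1

        // Direct GPU-to-GPU copy; travels over NVLink when the GPUs are linked.
        cudaMemcpyPeer(dst, /*dstDevice*/0, src, /*srcDevice*/1, bytes);

        cudaFree(src);
        cudaSetDevice(0);
        cudaFree(dst);
    }
    return 0;
}
```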
To enable high-speed, collective operations, each NVLink Switch has engines for NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ for in-network reductions and multicast acceleration.
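Applications typically reach SHARP through a collective library rather than programming the switch directly. For instance, NCCL's allreduce can offload the summation to the NVLink Switch (NCCL's NVLS algorithm) on fabrics that support it; the single-process, two-GPU sketch below shows the call pattern, with error checking omitted.

```cpp
// Hedged sketch of an NCCL all-reduce; on supported fabrics, recent NCCL
// versions can perform the reduction in the NVLink Switch via SHARP (NVLS).
#include <nccl.h>
#include <cuda_runtime.h>

int main() {
    const int nDev = 2;
    int devs[nDev] = {0, 1};
    ncclComm_t comms[nDev];
    ncclCommInitAll(comms, nDev, devs);  // one communicator per local GPU

    const size_t count = 1 << 20;
    float* buf[nDev];
    cudaStream_t streams[nDev];
    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(devs[i]);
        cudaMalloc(&buf[i], count * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    // In-place sum across both GPUs, issued as one grouped operation.
    ncclGroupStart();
    for (int i = 0; i < nDev; ++i)
        ncclAllReduce(buf[i], buf[i], count, ncclFloat, ncclSum, comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < nDev; ++i) {
        cudaSetDevice(devs[i]);
        cudaStreamSynchronize(streams[i]);
        cudaFree(buf[i]);
        ncclCommDestroy(comms[i]);
    }
    return 0;
}
```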
With NVLink Switch, NVLink connections can be extended across nodes to create a seamless, high-bandwidth, multi-node GPU cluster—effectively forming a data center-sized GPU. NVIDIA NVLink Switch enables 130TB/s of GPU bandwidth in one NVL72 for large model parallelism. Multi-server clusters with NVLink scale GPU communications in balance with the increased computing, so the NVL72 can support 9X the GPU count of a single eight-GPU system.
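Both NVL72 figures fall out of simple multiplication, spelled out in the sketch below.

```cpp
// Where the NVL72 numbers above come from.
#include <cstdio>

int main() {
    const int    gpus        = 72;   // GPUs in a GB200 NVL72
    const double per_gpu_tbs = 1.8;  // fifth-generation NVLink bandwidth per GPU

    printf("Aggregate NVLink bandwidth: %.1f TB/s\n", gpus * per_gpu_tbs);  // ~129.6, quoted as 130
    printf("GPU count vs. 8-GPU server: %dX\n", gpus / 8);                  // 9X
    return 0;
}
```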
The NVIDIA NVLink Switch features 144 NVLink ports with a non-blocking switching capacity of 14.4TB/s. The rack switch is designed to provide high bandwidth and low latency in NVIDIA GB200 NVL72 systems supporting external fifth-generation NVLink connectivity.
The NVLink Switch is the first rack-level switch chip capable of supporting up to 576 fully connected GPUs in a non-blocking compute fabric, interconnecting every GPU pair at an incredible 1,800GB/s with full all-to-all communication. The 72 GPUs in GB200 NVL72 can be used as a single high-performance accelerator with up to 1.4 exaFLOPS of AI compute power.
NVLink and NVLink Switch are essential building blocks of the complete NVIDIA data center solution that incorporates hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA AI Enterprise software suite and the NVIDIA NGC™ catalog. The most powerful end-to-end AI and HPC platform, it allows researchers to deliver real-world results and deploy solutions into production, driving unprecedented acceleration at every scale.
| | Second Generation | Third Generation | Fourth Generation | Fifth Generation |
| --- | --- | --- | --- | --- |
| NVLink bandwidth per GPU | 300GB/s | 600GB/s | 900GB/s | 1,800GB/s |
| Maximum number of links per GPU | 6 | 12 | 18 | 18 |
| Supported NVIDIA architectures | NVIDIA Volta™ architecture | NVIDIA Ampere architecture | NVIDIA Hopper™ architecture | NVIDIA Blackwell architecture |
| | First Generation | Second Generation | Third Generation | NVLink Switch |
| --- | --- | --- | --- | --- |
| Number of GPUs with direct connection within an NVLink domain | Up to 8 | Up to 8 | Up to 8 | Up to 576 |
| NVSwitch GPU-to-GPU bandwidth | 300GB/s | 600GB/s | 900GB/s | 1,800GB/s |
| Total aggregate bandwidth | 2.4TB/s | 4.8TB/s | 7.2TB/s | 1PB/s |
| Supported NVIDIA architectures | NVIDIA Volta™ architecture | NVIDIA Ampere architecture | NVIDIA Hopper™ architecture | NVIDIA Blackwell architecture |
Preliminary specifications; may be subject to change.