NVIDIA GB200 NVL2
Bringing the new era of computing to every data center.
Unparalleled Single-Server Performance
The NVIDIA GB200 NVL2 platform brings the new era of computing to every data center, delivering unparalleled performance for mainstream large language model (LLM) inference, vector database search, and data processing through 2 Blackwell GPUs and 2 Grace CPUs. Its scale-out, single-node NVIDIA MGX™ architecture enables a wide variety of system designs and networking options, so accelerated computing integrates seamlessly into existing data center infrastructure.
Highlights
Turbocharging Accelerated Computing
Llama 3 LLM inference: token-to-token latency (TTL) = 50 milliseconds (ms) real time, first token latency (FTL) = 2 s, input sequence length = 2,048, output sequence length = 128. 8x NVIDIA HGX™ H100 air-cooled vs. GB200 NVL2 air-cooled single node, per-GPU performance comparison.
Vector database search: performance within a RAG pipeline using memory shared by the NVIDIA Grace CPU and Blackwell GPU. 1x x86, 1x H100 GPU, and 1x GPU from a GB200 NVL2 node.
Data processing: a database join and aggregation workload with Snappy/Deflate compression, derived from the TPC-H Q4 query. Custom query implementations for x86, a single H100 GPU, and a single GPU from a GB200 NVL2 node: GB200 vs. Intel Xeon 8480+.
Projected performance subject to change.
Real-Time Mainstream LLM Inference
Vector Database Search
Data Processing
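The latency settings quoted in the LLM inference footnote imply a simple per-request time budget. The sketch below works that arithmetic out, assuming the response is streamed with one FTL for the first token and one TTL for every later token. The function names are illustrative only and do not come from any NVIDIA tool.

```python
# Latency budget implied by the LLM inference benchmark settings.
FTL_S = 2.0          # first token latency from the footnote, in seconds
TTL_S = 0.050        # token-to-token latency from the footnote, in seconds
OUTPUT_TOKENS = 128  # output sequence length from the footnote


def response_time_s(ftl_s: float, ttl_s: float, output_tokens: int) -> float:
    """Time to stream a full response: first token, then one TTL per later token."""
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    return ftl_s + (output_tokens - 1) * ttl_s


def decode_rate_tokens_per_s(ttl_s: float) -> float:
    """Steady-state per-user generation rate once the first token has arrived."""
    return 1.0 / ttl_s


if __name__ == "__main__":
    total = response_time_s(FTL_S, TTL_S, OUTPUT_TOKENS)
    print(f"full response: {total:.2f} s")
    print(f"decode rate:   {decode_rate_tokens_per_s(TTL_S):.0f} tokens/s per user")
```

Under these settings a 128-token answer takes a little over eight seconds to stream in full, at about 20 tokens per second per user after the first token.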
Features
Technological Breakthroughs
Blackwell Architecture
The NVIDIA Blackwell architecture delivers groundbreaking advancements in accelerated computing, powering a new era of computing with unparalleled performance, efficiency, and scale.
NVIDIA Grace CPU
The NVIDIA Grace CPU is a breakthrough processor designed for modern data centers running AI, cloud, and high-performance computing (HPC) applications. It provides outstanding performance and memory bandwidth with 2X the energy efficiency of today’s leading server processors.
NVLink-C2C
NVIDIA NVLink-C2C coherently interconnects each Grace CPU and Blackwell GPU at 900GB/s. The GB200 NVL2 uses both NVLink-C2C and the fifth-generation NVLink to deliver a 1.4 TB coherent memory model for accelerated AI.

Key-Value (KV) Caching
Key-value (KV) caching improves LLM response speeds by storing conversation context and history. GB200 NVL2 optimizes KV caching through its fully coherent Grace CPU and Blackwell GPU memory connected by NVLink-C2C, which is 7X faster than PCIe, enabling LLMs to predict words faster than x86-based GPU implementations.
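To give a feel for why link bandwidth matters for KV caching, the sketch below estimates the size of one sequence's KV cache and the idealized time to move it across a link. The model shape (32 layers, 8 KV heads, head dimension 128, FP16 values) is an assumption roughly in line with an 8B-parameter model using grouped-query attention; it is not stated in this document. The 900 GB/s figure and the 7X ratio are the ones quoted above, and the sequence length reuses the 2,048-token input plus 128-token output from the benchmark settings. Real transfers also pay latency and protocol overhead that this ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer shape; an assumption, not taken from the source."""
    layers: int
    kv_heads: int
    head_dim: int
    bytes_per_value: int  # 2 for FP16/BF16


def kv_cache_bytes(shape: ModelShape, tokens: int) -> int:
    """Bytes needed to hold keys and values for `tokens` positions of one sequence."""
    # Factor of 2: one key vector and one value vector per layer and KV head.
    per_token = 2 * shape.layers * shape.kv_heads * shape.head_dim * shape.bytes_per_value
    return per_token * tokens


def transfer_ms(num_bytes: int, gb_per_s: float) -> float:
    """Idealized transfer time in milliseconds (1 GB = 1e9 bytes, no overhead)."""
    return num_bytes / (gb_per_s * 1e9) * 1e3


ASSUMED_SHAPE = ModelShape(layers=32, kv_heads=8, head_dim=128, bytes_per_value=2)
TOKENS = 2048 + 128               # input + output length from the benchmark settings
C2C_GB_S = 900.0                  # NVLink-C2C bandwidth quoted in this document
SLOWER_LINK_GB_S = C2C_GB_S / 7   # a link 7X slower, per the "7X faster" claim

if __name__ == "__main__":
    size = kv_cache_bytes(ASSUMED_SHAPE, TOKENS)
    print(f"KV cache per sequence: {size / 2**20:.0f} MiB")
    print(f"over NVLink-C2C:       {transfer_ms(size, C2C_GB_S):.2f} ms")
    print(f"over a 7X slower link: {transfer_ms(size, SLOWER_LINK_GB_S):.2f} ms")
```

With many concurrent conversations the per-sequence cost multiplies, which is where a larger coherent memory pool and a faster CPU-GPU link would be expected to help.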
Fifth-Generation NVIDIA NVLink
Unlocking the full potential of exascale computing and trillion-parameter AI models requires swift, seamless communication between every GPU in a server cluster. Fifth-generation NVLink is a scale-up interconnect that unleashes accelerated performance for trillion- and multi-trillion-parameter AI models.
NVIDIA Networking
The data center’s network plays a crucial role in driving AI advancements and performance, serving as the backbone for distributed AI model training and generative AI performance. NVIDIA Quantum-X800 InfiniBand, NVIDIA Spectrum™-X800 Ethernet, and NVIDIA BlueField®-3 DPUs enable efficient scalability across hundreds and thousands of Blackwell GPUs for optimal application performance.
Specifications
GB200 NVL2¹ Specs
| Configuration | 2x Grace CPUs, 2x Blackwell GPUs |
| FP4 Tensor Core² | 40 PFLOPS |
| FP8/FP6 Tensor Core² | 20 PFLOPS |
| INT8 Tensor Core² | 20 POPS |
| FP16/BF16 Tensor Core² | 10 PFLOPS |
| TF32 Tensor Core² | 5 PFLOPS |
| FP32 | 180 TFLOPS |
| FP64/FP64 Tensor Core | 90 TFLOPS |
| GPU Memory / Bandwidth | Up to 384GB / 16TB/s |
| CPU Core Count | 144 Arm® Neoverse V2 cores |
| LPDDR5X Memory / Bandwidth | Up to 960GB / Up to 1,024GB/s |
| Interconnect | NVLink: 1.8TB/s; NVLink-C2C: 2x 900GB/s; PCIe Gen6: 2x 256GB/s |
| Server Options | Various NVIDIA GB200 NVL2 configuration options using NVIDIA MGX |
¹ Preliminary specifications. May be subject to change.
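The table lists totals for the whole two-GPU, two-CPU node. As a rough aid for comparing against per-device figures, the sketch below divides those totals evenly across devices and sums the two memory pools. The even split is an assumption, and the inputs are the preliminary, "up to" values from the table.

```python
# Node-level figures copied from the specification table (preliminary values).
GPUS = 2
CPUS = 2
FP4_PFLOPS = 40.0
GPU_MEMORY_GB = 384        # "up to"
GPU_BANDWIDTH_TB_S = 16.0
CPU_MEMORY_GB = 960        # "up to"
CPU_CORES = 144


def per_unit(total: float, count: int) -> float:
    """Even split of a node-level total across identical devices (an assumption)."""
    return total / count


def combined_memory_gb(gpu_gb: int, cpu_gb: int) -> int:
    """Sum of GPU memory and CPU LPDDR5X capacity for the node."""
    return gpu_gb + cpu_gb


if __name__ == "__main__":
    print(f"FP4 per GPU:           {per_unit(FP4_PFLOPS, GPUS):.0f} PFLOPS")
    print(f"GPU memory per GPU:    {per_unit(GPU_MEMORY_GB, GPUS):.0f} GB")
    print(f"GPU bandwidth per GPU: {per_unit(GPU_BANDWIDTH_TB_S, GPUS):.0f} TB/s")
    print(f"Cores per Grace CPU:   {per_unit(CPU_CORES, CPUS):.0f}")
    print(f"Combined memory:       {combined_memory_gb(GPU_MEMORY_GB, CPU_MEMORY_GB)} GB")
```

The combined capacity comes to 1,344 GB, in the neighborhood of the 1.4 TB coherent memory figure quoted in the NVLink-C2C section; the exact rounding behind that headline number is not stated here.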

NVIDIA GB200 NVL72
The NVIDIA GB200 NVL72 connects 36 GB200 Superchips in a liquid-cooled, rack-scale design with a 72-GPU NVLink domain that acts as a single, massive GPU.
