Description
The NVIDIA DGX GB200 NVL72 is a rack-scale AI supercomputer designed to power the next era of generative AI, accelerated computing, and large-scale data analytics. Built on the NVIDIA Blackwell architecture, it combines 72 Blackwell GPUs and 36 Grace CPUs into a single liquid-cooled, NVLink-connected system that behaves as one massive GPU, delivering exceptional performance, memory bandwidth, and energy efficiency.
This system is purpose-built for real-time inference of trillion-parameter large language models (LLMs), massive-scale training, and high-throughput data processing. With innovations like the second-generation Transformer Engine, FP4 and FP8 precision, and NVLink-C2C interconnects, the DGX GB200 NVL72 redefines what’s possible in AI infrastructure.
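At the framework level, Blackwell's FP8 support is typically driven through NVIDIA's open-source Transformer Engine library. The snippet below is a minimal, illustrative sketch of FP8 mixed-precision training, not DGX-specific code; the layer sizes and scaling-recipe parameters are arbitrary placeholders.

```python
# Minimal FP8 training sketch using NVIDIA's open-source Transformer Engine
# library (https://github.com/NVIDIA/TransformerEngine). Layer sizes and the
# scaling recipe are illustrative, not DGX GB200 NVL72 requirements.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Hybrid FP8 recipe: E4M3 for forward activations/weights, E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(
    fp8_format=recipe.Format.HYBRID,
    amax_history_len=16,
    amax_compute_algo="max",
)

layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda", requires_grad=True)

# Matmuls inside this context run through the FP8 Tensor Cores.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)
y.sum().backward()
```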
Key Capabilities
Hardware Architecture
- GPU Configuration: 72× NVIDIA Blackwell GPUs
- CPU Configuration: 36× NVIDIA Grace CPUs (2,592 Arm Neoverse V2 cores)
- Memory:
  - Up to 13.5TB HBM3e (GPU)
  - Up to 17TB LPDDR5X with ECC (CPU)
  - Up to 30.5TB total fast-access memory
- GPU Interconnect:
  - 5th Gen NVLink: 1.8TB/s GPU-to-GPU bandwidth
  - NVLink-C2C: 900GB/s bidirectional bandwidth between Grace and Blackwell
  - NVLink domain bandwidth: 130TB/s aggregate (cross-checked in the sketch below)
- Cooling: Liquid-cooled rack for high-density, energy-efficient operation
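As a quick sanity check, the headline totals above follow directly from the per-unit figures. This small Python sketch reproduces the arithmetic, using only values stated in the list (plus the 72 Neoverse V2 cores per Grace CPU implied by the 2,592-core total):

```python
# Cross-check of the rack-level totals from the per-unit figures listed above.
GPUS = 72
CPUS = 36
CORES_PER_GRACE = 72          # Arm Neoverse V2 cores per Grace CPU
NVLINK_PER_GPU_TBS = 1.8      # 5th Gen NVLink bandwidth per GPU, TB/s
HBM3E_TB = 13.5               # total GPU HBM3e
LPDDR5X_TB = 17.0             # total CPU LPDDR5X with ECC

print(CPUS * CORES_PER_GRACE)        # 2592 Arm cores
print(HBM3E_TB + LPDDR5X_TB)         # 30.5 TB fast-access memory
print(GPUS * NVLINK_PER_GPU_TBS)     # 129.6 TB/s, i.e. the ~130 TB/s NVLink domain
```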
Performance Highlights
| Metric | Value |
|---|---|
| Real-Time Inference (FP4) | 1,440 petaFLOPS |
| Training Performance (FP8) | 720 petaFLOPS |
| Aggregate Memory Bandwidth | Up to 576TB/s |
| NVLink Domain Bandwidth | 130TB/s |
| Energy Efficiency | 25× vs. air-cooled H100 at the same power |
| Database Query Speedup | 18× vs. CPU-based systems |
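Dividing the rack totals by the 72 GPUs gives rough per-GPU figures for intuition. These are simple divisions of the table values above; NVIDIA's official per-GPU specifications may be quoted under different assumptions (for example, sparsity):

```python
# Back-of-envelope per-GPU figures derived from the rack totals above.
GPUS = 72
print(1440 / GPUS)   # 20.0 petaFLOPS FP4 per GPU
print(720 / GPUS)    # 10.0 petaFLOPS FP8 per GPU
print(576 / GPUS)    # 8.0 TB/s HBM bandwidth per GPU
```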
AI Software Stack
The DGX GB200 NVL72 is optimized for NVIDIA’s full AI software ecosystem, enabling seamless development, deployment, and scaling of generative AI workloads:
- NVIDIA AI Enterprise – End-to-end software platform for model development, deployment, and monitoring
- NVIDIA NIM™ Microservices – Fast, secure, and scalable model inference (see the example after this list)
- NVIDIA Magnum IO™ – High-performance data movement and IO stack
- TensorRT™-LLM & NeMo Framework – Accelerated inference and training for LLMs and MoE models
- Confidential Computing – Hardware-based security for sensitive data and models
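NIM microservices expose an OpenAI-compatible HTTP API, so a deployed model can be queried with a few lines of Python. The endpoint URL and model name below are illustrative placeholders for whatever NIM container you deploy:

```python
# Querying a deployed NIM microservice via its OpenAI-compatible API.
# Endpoint and model name are placeholders, not fixed values.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Summarize NVLink in one sentence."}],
        "max_tokens": 128,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```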
Use Case Scenarios
- Real-time inference of trillion-parameter LLMs
- Training massive mixture-of-experts (MoE) models
- Enterprise-scale generative AI deployment
- High-throughput database analytics and decompression
- Sustainable AI infrastructure with reduced carbon footprint
Sustainability & Efficiency
The DGX GB200 NVL72 is engineered for sustainability. Its liquid-cooled design reduces water usage and floor space while delivering 25× the performance of air-cooled H100 systems at the same power envelope, letting enterprises scale AI workloads while keeping energy use and environmental impact in check.
Technical Summary
| Component | Specification |
|---|---|
| GPUs | 72× NVIDIA Blackwell |
| CPUs | 36× NVIDIA Grace |
| Total Fast Memory | Up to 30.5TB |
| FP4 Tensor Core Performance | 1,440 petaFLOPS |
| FP8 Tensor Core Performance | 720 petaFLOPS |
| NVLink Domain Bandwidth | 130TB/s |
| Cooling | Liquid-cooled rack |
| Security | Confidential Computing with TEE-I/O |