Description
The NVIDIA DGX GB200 NVL72 is a rack-scale AI supercomputer designed to power the next era of generative AI, accelerated computing, and large-scale data analytics. Built on the NVIDIA Blackwell architecture, it combines 72 Blackwell GPUs and 36 Grace CPUs into a single liquid-cooled, NVLink-connected system that behaves as one massive GPU, delivering exceptional performance, memory bandwidth, and energy efficiency.
This system is purpose-built for real-time inference of trillion-parameter large language models (LLMs), massive-scale training, and high-throughput data processing. With innovations like the second-generation Transformer Engine, FP4 and FP8 precision, and NVLink-C2C interconnects, the DGX GB200 NVL72 redefines what’s possible in AI infrastructure.
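At the framework level, Blackwell's FP8 support is typically driven through NVIDIA's open-source Transformer Engine library. The snippet below is a minimal, illustrative sketch of FP8 mixed-precision training, not DGX-specific code; the layer sizes and scaling-recipe parameters are arbitrary placeholders.

```python
# Minimal FP8 training sketch using NVIDIA's open-source Transformer Engine
# library (https://github.com/NVIDIA/TransformerEngine). Layer sizes and the
# scaling recipe are illustrative, not DGX GB200 NVL72 requirements.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Hybrid FP8 recipe: E4M3 for forward activations/weights, E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(
    fp8_format=recipe.Format.HYBRID,
    amax_history_len=16,
    amax_compute_algo="max",
)

layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda", requires_grad=True)

# Matmuls inside this context run through the FP8 Tensor Cores.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)
y.sum().backward()
```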
Key Capabilities
Hardware Architecture
- GPU Configuration: 72× NVIDIA Blackwell GPUs
- CPU Configuration: 36× NVIDIA Grace CPUs (2,592 Arm Neoverse V2 cores)
- Memory:
  - Up to 13.5TB HBM3e (GPU)
  - Up to 17TB LPDDR5X with ECC (CPU)
  - Up to 30.5TB total fast-access memory
- GPU Interconnect:
  - 5th Gen NVLink: 1.8TB/s GPU-to-GPU bandwidth
  - NVLink-C2C: 900GB/s bidirectional bandwidth between Grace and Blackwell
  - NVLink domain bandwidth: 130TB/s aggregate (cross-checked in the sketch below)
- Cooling: Liquid-cooled rack for high-density, energy-efficient operation
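As a quick sanity check, the headline totals above follow directly from the per-unit figures. This small Python sketch reproduces the arithmetic, using only values stated in the list (plus the 72 Neoverse V2 cores per Grace CPU implied by the 2,592-core total):

```python
# Cross-check of the rack-level totals from the per-unit figures listed above.
GPUS = 72
CPUS = 36
CORES_PER_GRACE = 72          # Arm Neoverse V2 cores per Grace CPU
NVLINK_PER_GPU_TBS = 1.8      # 5th Gen NVLink bandwidth per GPU, TB/s
HBM3E_TB = 13.5               # total GPU HBM3e
LPDDR5X_TB = 17.0             # total CPU LPDDR5X with ECC

print(CPUS * CORES_PER_GRACE)        # 2592 Arm cores
print(HBM3E_TB + LPDDR5X_TB)         # 30.5 TB fast-access memory
print(GPUS * NVLINK_PER_GPU_TBS)     # 129.6 TB/s, i.e. the ~130 TB/s NVLink domain
```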
Performance Highlights
| Metric | Value |
|---|---|
| Real-Time Inference (FP4) | 1,440 petaFLOPS |
| Training Performance (FP8) | 720 petaFLOPS |
| Aggregate Memory Bandwidth | Up to 576TB/s |
| NVLink Domain Bandwidth | 130TB/s |
| Energy Efficiency | 25× vs. air-cooled H100 at the same power |
| Database Query Speedup | 18× vs. CPU-based systems |
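Dividing the rack totals by the 72 GPUs gives rough per-GPU figures for intuition. These are simple divisions of the table values above; NVIDIA's official per-GPU specifications may be quoted under different assumptions (for example, sparsity):

```python
# Back-of-envelope per-GPU figures derived from the rack totals above.
GPUS = 72
print(1440 / GPUS)   # 20.0 petaFLOPS FP4 per GPU
print(720 / GPUS)    # 10.0 petaFLOPS FP8 per GPU
print(576 / GPUS)    # 8.0 TB/s HBM bandwidth per GPU
```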
AI Software Stack
The DGX GB200 NVL72 is optimized for NVIDIA’s full AI software ecosystem, enabling seamless development, deployment, and scaling of generative AI workloads:
- NVIDIA AI Enterprise – End-to-end software platform for model development, deployment, and monitoring
- NVIDIA NIM™ Microservices – Fast, secure, and scalable model inference (see the example after this list)
- NVIDIA Magnum IO™ – High-performance data movement and IO stack
- TensorRT™-LLM & NeMo Framework – Accelerated inference and training for LLMs and MoE models
- Confidential Computing – Hardware-based security for sensitive data and models
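NIM microservices expose an OpenAI-compatible HTTP API, so a deployed model can be queried with a few lines of Python. The endpoint URL and model name below are illustrative placeholders for whatever NIM container you deploy:

```python
# Querying a deployed NIM microservice via its OpenAI-compatible API.
# Endpoint and model name are placeholders, not fixed values.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Summarize NVLink in one sentence."}],
        "max_tokens": 128,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```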
Use Case Scenarios
- Real-time inference of trillion-parameter LLMs
- Training massive mixture-of-experts (MoE) models
- Enterprise-scale generative AI deployment
- High-throughput database analytics and decompression
- Sustainable AI infrastructure with reduced carbon footprint
Sustainability & Efficiency
The DGX GB200 NVL72 is engineered for sustainability. Its liquid-cooled design reduces water usage and floor space while delivering 25× the performance of air-cooled H100 systems at the same power envelope, letting enterprises scale AI workloads while keeping energy use and environmental impact in check.
Technical Summary
| Component | Specification |
|---|---|
| GPUs | 72× NVIDIA Blackwell |
| CPUs | 36× NVIDIA Grace |
| Total Fast Memory | Up to 30.5TB |
| FP4 Tensor Core Performance | 1,440 petaFLOPS |
| FP8 Tensor Core Performance | 720 petaFLOPS |
| NVLink Domain Bandwidth | 130TB/s |
| Cooling | Liquid-cooled rack |
| Security | Confidential Computing with TEE-I/O |