Brand: NVIDIA
NVIDIA DGX GB200 NVL72 (AI Supercomputer with 72 Blackwell GPUs and 36 Grace CPUs)
Warranty:
1 year, with effortless warranty claims and global coverage
Description
The NVIDIA DGX GB200 NVL72 is a revolutionary rack-scale AI supercomputer designed to power the next era of generative AI, accelerated computing, and large-scale data analytics. Built on the groundbreaking NVIDIA Blackwell architecture, it combines 72 Blackwell GPUs and 36 Grace CPUs into a single, liquid-cooled NVLink-connected system that acts as one massive GPU—delivering unprecedented performance, memory bandwidth, and energy efficiency.
This system is purpose-built for real-time inference of trillion-parameter large language models (LLMs), massive-scale training, and high-throughput data processing. With innovations like the second-generation Transformer Engine, FP4 and FP8 precision, and NVLink-C2C interconnects, the DGX GB200 NVL72 redefines what’s possible in AI infrastructure.
Key Capabilities
Hardware Architecture
- GPU Configuration: 72× NVIDIA Blackwell GPUs
- CPU Configuration: 36× NVIDIA Grace CPUs (2,592 Arm Neoverse V2 cores)
- Memory:
  - Up to 13.5TB HBM3e (GPU-attached)
  - Up to 17TB LPDDR5X with ECC (CPU-attached)
  - Up to 30.5TB total fast-access memory
- GPU Interconnect (see the quick arithmetic check after this list):
  - NVLink domain bandwidth: 130TB/s aggregate across the 72-GPU NVLink domain
  - NVLink-C2C: 900GB/s bidirectional bandwidth between Grace and Blackwell
  - 5th Gen NVLink: 1.8TB/s GPU-to-GPU bandwidth
- Cooling: Liquid-cooled rack for high-density, energy-efficient operation
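As a quick sanity check, the headline rack-scale numbers above follow directly from the per-part specs. The short Python sketch below reproduces the core count, the total fast memory, and the aggregate NVLink figure; the 72 cores per Grace CPU and 1.8TB/s per GPU are the per-unit values already listed, and the ~130TB/s domain bandwidth is the rounded aggregate.

```python
# Sanity-check the rack-scale figures from the per-part specs listed above.

GRACE_CPUS = 36
CORES_PER_GRACE = 72           # Arm Neoverse V2 cores per Grace CPU
BLACKWELL_GPUS = 72
NVLINK_PER_GPU_TB_S = 1.8      # 5th Gen NVLink bandwidth per GPU, in TB/s
HBM3E_TB = 13.5                # GPU-attached memory
LPDDR5X_TB = 17.0              # CPU-attached memory

print(GRACE_CPUS * CORES_PER_GRACE)              # 2592 CPU cores
print(HBM3E_TB + LPDDR5X_TB)                     # 30.5 TB fast-access memory
print(BLACKWELL_GPUS * NVLINK_PER_GPU_TB_S)      # 129.6 TB/s, rounded to 130 TB/s
```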
Performance Highlights
| Metric | Value |
|---|---|
| Real-Time Inference (FP4) | 1,440 petaFLOPS |
| Training Performance (FP8) | 720 petaFLOPS |
| Memory Bandwidth | Up to 576TB/s |
| NVLink Bandwidth | 130TB/s |
| Energy Efficiency | Up to 25× vs. air-cooled H100 systems (LLM inference) |
| Database Query Speedup | Up to 18× vs. CPU-based systems |
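Two of these figures are related by simple arithmetic: the FP4 number is exactly double the FP8 number, since halving the precision doubles Tensor Core throughput. The sketch below divides the rack totals by the 72 GPUs to get illustrative per-GPU figures (a derivation for intuition, not an official per-GPU spec).

```python
# Illustrative per-GPU breakdown of the rack-scale performance figures above.
RACK_FP4_PFLOPS = 1440
RACK_FP8_PFLOPS = 720
GPUS = 72

print(RACK_FP4_PFLOPS / GPUS)                 # 20.0 PFLOPS FP4 per Blackwell GPU
print(RACK_FP8_PFLOPS / GPUS)                 # 10.0 PFLOPS FP8 per Blackwell GPU
print(RACK_FP4_PFLOPS / RACK_FP8_PFLOPS)      # 2.0: halving precision doubles throughput
```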
AI Software Stack
The DGX GB200 NVL72 is optimized for NVIDIA’s full AI software ecosystem, enabling seamless development, deployment, and scaling of generative AI workloads:
- NVIDIA AI Enterprise – End-to-end software platform for model development, deployment, and monitoring
- NVIDIA NIM™ Microservices – Fast, secure, and scalable model inference
- NVIDIA Magnum IO™ – High-performance data movement and IO stack
- TensorRT™-LLM & NeMo Framework – Accelerated inference and training for LLMs and MoE models
- Confidential Computing – Hardware-based security for sensitive data and models
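As an illustration of how models served from this stack are typically consumed, the sketch below calls an NVIDIA NIM microservice through its OpenAI-compatible chat completions endpoint. The host, port, and model name are placeholder assumptions for a locally deployed NIM container, not values specific to this product.

```python
import requests

# Placeholder endpoint and model for a locally deployed NIM container (assumptions).
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta/llama-3.1-8b-instruct"  # substitute the model your NIM deployment serves

payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Summarize the GB200 NVL72 in one sentence."}
    ],
    "max_tokens": 128,
}

response = requests.post(NIM_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```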
Use Case Scenarios
- Real-time inference of trillion-parameter LLMs
- Training massive mixture-of-experts (MoE) models
- Enterprise-scale generative AI deployment
- High-throughput database analytics and decompression
- Sustainable AI infrastructure with reduced carbon footprint
Sustainability & Efficiency
The DGX GB200 NVL72 is engineered for sustainability. Its liquid-cooled design reduces water usage and floor space while delivering 25× the performance of air-cooled H100 systems at the same power envelope, letting enterprises scale AI workloads while keeping energy use and environmental impact in check.
Technical Summary
| Component | Specification |
|---|---|
| GPUs | 72× Blackwell GPUs |
| CPUs | 36× Grace CPUs |
| Total Fast Memory | Up to 30.5TB |
| FP4 Tensor Core Perf. | 1,440 PFLOPS |
| FP8 Tensor Core Perf. | 720 PFLOPS |
| NVLink Bandwidth | 130TB/s |
| Cooling | Liquid-cooled rack |
| Security | Confidential Computing with TEE-I/O |
