NVIDIA A100 PCIe 40GB GPU
- Launched on June 22nd, 2020
- GA100 graphics processor
- 6912 cores
- 432 TMUs
- 160 ROPs
- 40 GB memory size
- HBM2e memory type
- 5120-bit bus width
USD $14,399.00 (RRP)
Start configuring your GP-GPU Server now!
Accelerating the Most Important Work of Our Time
The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration at every scale for AI, data analytics, and high-performance computing (HPC) to tackle the world’s toughest computing challenges. As the engine of the NVIDIA data center platform, A100 can efficiently scale to thousands of GPUs or, with NVIDIA Multi-Instance GPU (MIG) technology, be partitioned into seven GPU instances to accelerate workloads of all sizes. And third-generation Tensor Cores accelerate every precision for diverse workloads, speeding time to insight and time to market.
The A100 PCIe is a professional compute card by NVIDIA, launched on June 22nd, 2020. Built on the 7 nm process and based on the GA100 graphics processor, the card does not support DirectX 11 or DirectX 12, so it is not intended for gaming. The GA100 is a large chip with a die area of 826 mm² and 54.2 billion transistors. It features 6912 shading units, 432 texture mapping units, and 160 ROPs. Also included are 432 Tensor Cores, which accelerate machine learning applications. NVIDIA has paired 40 GB of HBM2e memory with the A100 PCIe, connected over a 5120-bit memory interface. The GPU operates at a base frequency of 765 MHz, boosting up to 1410 MHz, while the memory runs at 1215 MHz.
Being a dual-slot card, the NVIDIA A100 PCIe draws power from an 8-pin EPS power connector, with power draw rated at 250 W maximum. This device has no display connectivity, as it is not designed to have monitors connected to it. A100 PCIe is connected to the rest of the system using a PCI-Express 4.0 x16 interface. The card measures 267 mm in length, and features a dual-slot cooling solution.
The Most Powerful End-to-End AI and HPC Data Center Platform
A100 is part of the complete NVIDIA data center solution that incorporates building blocks across hardware, networking, software, libraries, and optimized AI models and applications from NGC™. Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to deliver real-world results and deploy solutions into production at scale.
Deep Learning Training
AI models are exploding in complexity as they take on next-level challenges such as accurate conversational AI and deep recommender systems. Training them requires massive compute power and scalability.
NVIDIA A100’s third-generation Tensor Cores with Tensor Float 32 (TF32) precision provide up to 20X higher performance over the prior generation with zero code changes, and an additional 2X boost with automatic mixed precision and FP16. When combined with third-generation NVIDIA® NVLink®, NVIDIA NVSwitch™, PCIe Gen4, NVIDIA Mellanox InfiniBand, and the NVIDIA Magnum IO™ software SDK, it’s possible to scale to thousands of A100 GPUs. This means that large AI models like BERT can be trained in just 37 minutes on a cluster of 1,024 A100s, offering unprecedented performance and scalability.
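TF32 keeps FP32’s 8-bit exponent range but carries only 10 explicit mantissa bits, which is why existing FP32 code runs on it unchanged. A minimal illustrative sketch of the precision reduction (plain truncation is used here for clarity; the hardware actually rounds, and `to_tf32` is a hypothetical helper name, not an NVIDIA API):

```python
import struct

def to_tf32(x: float) -> float:
    """Truncate an FP32 value to TF32 precision (10 mantissa bits).

    Illustrative only: real Tensor Cores round to nearest rather than truncate.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bits &= ~0x1FFF  # clear the low 13 of FP32's 23 mantissa bits
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    return y

print(to_tf32(1.5))  # exactly representable, so unchanged: 1.5
print(to_tf32(0.1))  # slightly off 0.1 after dropping 13 mantissa bits
```

Values whose mantissa fits in 10 bits pass through exactly; everything else loses only the low-order precision, which is typically tolerable for deep learning workloads.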
NVIDIA’s training leadership was demonstrated in MLPerf 0.6, the first industry-wide benchmark for AI training.
Deep Learning Inference
A100 introduces groundbreaking new features to optimize inference workloads. It brings unprecedented versatility by accelerating a full range of precisions, from FP32 to FP16 to INT8 and all the way down to INT4. Multi-Instance GPU (MIG) technology allows multiple networks to operate simultaneously on a single A100 GPU for optimal utilization of compute resources. And structural sparsity support delivers up to 2X more performance on top of A100’s other inference performance gains.
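The structural sparsity mentioned above follows a 2:4 pattern: in every group of four weights, two are zeroed, letting the sparse Tensor Cores skip half the multiply-accumulates. A minimal sketch of the pruning step (magnitude-based selection, as commonly used; `prune_2_4` is an illustrative helper, not a library function):

```python
def prune_2_4(weights):
    """Zero the 2 smallest-magnitude values in each group of 4 (2:4 sparsity)."""
    pruned = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # indices of the two largest-magnitude entries survive
        keep = sorted(range(len(group)), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return pruned

print(prune_2_4([0.9, -0.1, 0.05, -0.7]))  # → [0.9, 0.0, 0.0, -0.7]
```

Because the pattern is fixed (two of every four), the hardware can index the surviving weights compactly, which is what makes the up-to-2X inference gain possible without irregular memory access.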
NVIDIA already delivers market-leading inference performance, as demonstrated in an across-the-board sweep of MLPerf Inference 0.5, the first industry-wide benchmark for inference. A100 brings 20X more performance to further extend that leadership.
High-Performance Computing
To unlock next-generation discoveries, scientists look to simulations to better understand complex molecules for drug discovery, physics for potential new sources of energy, and atmospheric data to better predict and prepare for extreme weather patterns.
A100 introduces double-precision Tensor Cores, the biggest milestone since the introduction of double-precision computing in GPUs for HPC. This enables researchers to reduce a 10-hour, double-precision simulation running on NVIDIA V100 Tensor Core GPUs to just four hours on A100. HPC applications can also leverage TF32 precision in A100’s Tensor Cores to achieve up to 10X higher throughput for single-precision dense matrix multiply operations.
High-Performance Data Analytics
Customers need to be able to analyze, visualize, and turn massive datasets into insights. But scale-out solutions often become bogged down as these datasets are scattered across multiple servers. Accelerated servers with A100 deliver the needed compute power—along with 1.6 terabytes per second (TB/sec) of memory bandwidth and scalability with third-generation NVLink and NVSwitch—to tackle these massive workloads. Combined with NVIDIA Mellanox InfiniBand, the Magnum IO SDK, and RAPIDS suite of open source software libraries, including the RAPIDS Accelerator for Apache Spark for GPU-accelerated data analytics, the NVIDIA data center platform is uniquely able to accelerate these huge workloads at unprecedented levels of performance and efficiency.
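Enabling the RAPIDS Accelerator for Apache Spark mentioned above is largely a matter of Spark configuration. A hedged sketch of a typical job submission (a CLI fragment; the jar version placeholder, resource amounts, and application name are illustrative, not prescriptive):

```shell
spark-submit \
  --jars rapids-4-spark_2.12-<version>.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=0.25 \
  your_app.py
```

The plugin rewrites supported SQL and DataFrame operations to run on the GPU, falling back to the CPU for operations it does not accelerate.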
Multi-Instance GPU (MIG)
A100 with MIG maximizes the utilization of GPU-accelerated infrastructure like never before. MIG allows an A100 GPU to be partitioned into as many as seven independent instances, giving multiple users access to GPU acceleration for their applications and development projects. MIG works with Kubernetes, containers, and hypervisor-based server virtualization with NVIDIA Virtual Compute Server (vCS). MIG lets infrastructure managers offer a right-sized GPU with guaranteed quality of service (QoS) for every job, optimizing utilization and extending the reach of accelerated computing resources to every user.
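In practice, MIG partitions are managed with `nvidia-smi`. A sketch of carving two small instances out of one A100 (an administrative CLI fragment; profile IDs and required permissions depend on the driver version):

```shell
# Enable MIG mode on GPU 0 (may require a GPU reset to take effect)
sudo nvidia-smi -i 0 -mig 1

# Create two GPU instances plus matching compute instances (-C).
# Profile 19 corresponds to 1g.5gb on the A100 40GB.
sudo nvidia-smi mig -i 0 -cgi 19,19 -C

# List the resulting GPU instances
nvidia-smi mig -lgi
```

Each instance then appears as a separate device to CUDA applications and container runtimes, which is what enables the guaranteed per-job QoS described above.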
| | NVIDIA A100 for HGX | NVIDIA A100 for PCIe |
| --- | --- | --- |
| Peak FP64 | 9.7 TF | 9.7 TF |
| Peak FP64 Tensor Core | 19.5 TF | 19.5 TF |
| Peak FP32 | 19.5 TF | 19.5 TF |
| Peak TF32 Tensor Core | 156 TF / 312 TF* | 156 TF / 312 TF* |
| Peak BFLOAT16 Tensor Core | 312 TF / 624 TF* | 312 TF / 624 TF* |
| Peak FP16 Tensor Core | 312 TF / 624 TF* | 312 TF / 624 TF* |
| Peak INT8 Tensor Core | 624 TOPS / 1,248 TOPS* | 624 TOPS / 1,248 TOPS* |
| Peak INT4 Tensor Core | 1,248 TOPS / 2,496 TOPS* | 1,248 TOPS / 2,496 TOPS* |
| GPU Memory | 40 GB | 40 GB |
| GPU Memory Bandwidth | 1,555 GB/s | 1,555 GB/s |
| Interconnect | NVIDIA NVLink 600 GB/s**; PCIe Gen4 64 GB/s | NVIDIA NVLink 600 GB/s**; PCIe Gen4 64 GB/s |
| Multi-Instance GPU | Various instance sizes with up to 7 MIGs @ 5 GB | Various instance sizes with up to 7 MIGs @ 5 GB |
| Form Factor | 4/8 SXM on NVIDIA HGX™ A100 | PCIe |
| Max TDP Power | 400 W | 250 W |
| Delivered Performance of Top Apps | 100% | 90% |

\* With sparsity.
\*\* SXM GPUs via HGX A100 server boards; PCIe GPUs via NVLink Bridge for up to two GPUs.