NVIDIA A40 PCIe 48GB GPU
- Launched on October 5th, 2020
- GA102 graphics processor
- 10,752 cores
- 336 TMUs
- 112 ROPs
- 48 GB memory size
- GDDR6 memory type
- 384-bit bus width
USD $8,047.00 (RRP pricing)
Start configuring your GP-GPU Server now!
The World’s Most Powerful Data Center GPU for Visual Computing
The NVIDIA A40 GPU is an evolutionary leap in performance and multi-workload capabilities from the data center, combining best-in-class professional graphics with powerful compute and AI acceleration to meet today’s design, creative, and scientific challenges. Driving the next generation of virtual workstations and server-based workloads, NVIDIA A40 brings state-of-the-art features for ray-traced rendering, simulation, virtual production, and more to professionals anytime, anywhere.
POWERED BY THE NVIDIA AMPERE ARCHITECTURE
The A40 PCIe is a professional graphics card from NVIDIA, launched on October 5th, 2020. Built on the 8 nm process and based on the GA102 graphics processor, the card supports DirectX 12 Ultimate. The GA102 is a large chip with a die area of 628 mm² and 28.3 billion transistors. It features 10,752 shading units, 336 texture mapping units, and 112 ROPs. Also included are 336 Tensor Cores, which help speed up machine learning applications, and 84 ray-tracing acceleration cores. NVIDIA has paired 48 GB of GDDR6 memory with the A40 PCIe, connected through a 384-bit memory interface. The GPU operates at a base frequency of 1305 MHz and can boost up to 1740 MHz; the memory runs at 1812 MHz (14.5 Gbps effective).
The NVIDIA A40 PCIe is a dual-slot card that draws power from a single 8-pin EPS power connector, with power draw rated at 300 W maximum. Display outputs include 3x DisplayPort 1.4a. The card connects to the rest of the system through a PCI-Express 4.0 x16 interface. It measures 267 mm in length and 112 mm in width, and features a dual-slot cooling solution.
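The headline throughput figures follow from the clocks and widths quoted above. The short Python sketch below reproduces the arithmetic; it assumes one fused multiply-add (counted as 2 floating-point operations) per core per clock, which is not stated in the text, and the function names are illustrative only.

```python
# Figures quoted in the text above.
MEMORY_DATA_RATE_GBPS = 14.5   # effective data rate per pin, in Gbit/s
MEMORY_BUS_WIDTH_BITS = 384    # memory interface width
SHADING_UNITS = 10752          # shading units (CUDA cores)
BOOST_CLOCK_MHZ = 1740         # boost clock

# Assumption (not from the text): each core retires one fused multiply-add
# per clock, counted as 2 floating-point operations.
FLOPS_PER_CORE_PER_CLOCK = 2


def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: per-pin rate times bus width, bits to bytes."""
    return data_rate_gbps * bus_width_bits / 8


def peak_fp32_tflops(cores: int, clock_mhz: float, flops_per_clock: int) -> float:
    """Theoretical peak FP32 throughput in TFLOPS at the given clock."""
    return cores * clock_mhz * 1e6 * flops_per_clock / 1e12


bandwidth = memory_bandwidth_gb_s(MEMORY_DATA_RATE_GBPS, MEMORY_BUS_WIDTH_BITS)
tflops = peak_fp32_tflops(SHADING_UNITS, BOOST_CLOCK_MHZ, FLOPS_PER_CORE_PER_CLOCK)

print(f"Memory bandwidth: {bandwidth:.0f} GB/s")   # 696 GB/s
print(f"Peak FP32: {tflops:.1f} TFLOPS")           # 37.4 TFLOPS
```

The 696 GB/s result matches the memory bandwidth listed in the specification table. Both numbers are theoretical peaks; sustained throughput in real workloads will be lower.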
NVIDIA Ampere Architecture CUDA® Cores
Double-speed processing for single-precision floating point (FP32) operations and improved power efficiency provide significant performance improvements for graphics and simulation workflows, such as complex 3D computer-aided design (CAD) and computer-aided engineering (CAE).
Second-Generation RT Cores
With up to 2X the throughput over the previous generation and the ability to concurrently run ray tracing with either shading or denoising capabilities, second-generation RT Cores deliver massive speedups for workloads like photorealistic rendering of movie content, architectural design evaluations, and virtual prototyping of product designs. This technology also speeds up the rendering of ray-traced motion blur for faster results with greater visual accuracy.
Third-Generation Tensor Cores
New Tensor Float 32 (TF32) precision provides up to 5X the training throughput over the previous generation to accelerate AI and data science model training without requiring any code changes. Hardware support for structural sparsity doubles the throughput for inferencing. Tensor Cores also bring AI to graphics with capabilities like DLSS, AI denoising, and enhanced editing for select applications.
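To give a feel for what TF32 precision means numerically, the sketch below emulates it in plain Python. It relies on the TF32 layout as published by NVIDIA (the 8-bit exponent of FP32 with a 10-bit mantissa), which is not spelled out in the text above. The helper `to_tf32_truncated` is a hypothetical name, and it simply truncates the low mantissa bits; actual Tensor Core rounding behaviour may differ.

```python
import struct


def to_tf32_truncated(x: float) -> float:
    """Approximate TF32 by keeping only the top 10 mantissa bits of an FP32 value.

    Assumption: simple truncation, used here for illustration only.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign (1 bit), exponent (8 bits), and the top 10 of 23 mantissa bits.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 2**-10 is the smallest step above 1.0 that survives; 2**-11 is dropped.
print(to_tf32_truncated(1.0 + 2**-10))
print(to_tf32_truncated(1.0 + 2**-11))
```

Because the exponent range is unchanged from FP32, existing FP32 data can be fed to Tensor Cores without rescaling, which is why no code changes are required; the trade-off is roughly three decimal digits of mantissa precision in the multiply inputs.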
48GB of GPU Memory
Ultra-fast GDDR6 memory, scalable up to 96GB with NVLink, gives data scientists, engineers, and creative professionals the large memory necessary to work with massive datasets and workloads like data science and simulation.
Third-Generation NVIDIA NVLink®
Connect two A40 GPUs together to scale from 48GB of GPU memory to 96GB. Increased GPU-to-GPU interconnect bandwidth provides a single scalable memory to accelerate graphics and compute workloads and tackle larger datasets. A new, more compact NVLink connector enables functionality in a wider range of servers.
Virtualization-Ready
Next-generation improvements with NVIDIA virtual GPU (vGPU) software allow for larger, more powerful virtual workstation instances for remote users, enabling high-end remote design, AI, and compute workloads.
PCI Express Gen 4
PCI Express Gen 4 doubles the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI, data science, and 3D design. Faster PCIe performance also accelerates GPU direct memory access (DMA) transfers, providing faster I/O communication of video data between the GPU and GPUDirect® for Video-enabled devices, which makes for a powerful live-broadcast solution. The A40 is backward compatible with PCI Express Gen 3 for deployment flexibility.
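The doubling can be checked from the per-lane signalling rates. The sketch below uses figures from the PCI Express specifications rather than from this page (8 GT/s for Gen 3, 16 GT/s for Gen 4, both with 128b/130b encoding); the function name is illustrative only.

```python
# Per-lane transfer rates in GT/s (from the PCIe specifications, not this page).
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}

# Gen 3 and Gen 4 both use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_s(generation: int, lanes: int = 16) -> float:
    """Usable bandwidth per direction in GB/s for a PCIe link."""
    return TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY * lanes / 8


gen3 = pcie_bandwidth_gb_s(3)
gen4 = pcie_bandwidth_gb_s(4)

print(f"Gen 3 x16: {gen3:.2f} GB/s per direction")
print(f"Gen 4 x16: {gen4:.2f} GB/s per direction")
print(f"Ratio: {gen4 / gen3:.1f}x")
```

This gives roughly 15.8 GB/s and 31.5 GB/s per direction for an x16 link. The 64 GB/s PCIe Gen4 figure in the specification table appears to be the nominal bidirectional number (2 × 32 GB/s) before encoding overhead.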
Data Center Efficiency and Security
Featuring a dual-slot, power-efficient design, the NVIDIA A40 is up to 2X as power efficient as the previous generation and is compatible with a wide range of servers from worldwide OEMs. The NVIDIA A40 also includes a CEC 1712 chip that enables secure and measured boot with hardware root of trust, ensuring that firmware has not been tampered with or corrupted.
|GPU Memory||48 GB GDDR6 with error-correcting code (ECC)|
|GPU Memory Bandwidth||696 GB/s|
|Interconnect||NVIDIA NVLink: 112.5 GB/s (bidirectional); PCIe Gen4: 64 GB/s|
|NVLink||2-way low profile (2-slot)|
|Display Ports||3x DisplayPort 1.4*|
|Max Power Consumption||300 W|
|Form Factor||4.4" (H) x 10.5" (L) Dual Slot|
|vGPU Software Support||NVIDIA Virtual PC, NVIDIA Virtual Applications, NVIDIA RTX Virtual Workstation, NVIDIA Virtual Compute Server, NVIDIA AI Enterprise|
|vGPU Profiles Supported||See the Virtual GPU Licensing Guide|
|NVENC / NVDEC||1x / 2x (includes AV1 decode)|
|Secure and Measured Boot with Hardware Root of Trust||Yes (optional)|
|NEBS Ready||Level 3|
|Power Connector||8-pin CPU|