
NVIDIA Launches A2 Tensor Core GPU, An Entry-Level Design Powered By Ampere GA107 GPU & 16 GB GDDR6 Memory

Written by Jeff Lampkin

NVIDIA has further expanded its professional data center lineup of Ampere GPUs with the A2 Tensor Core GPU accelerator. The new accelerator is the most entry-level design we have seen from NVIDIA yet and boasts some decent specs for its entry-level market designation.

NVIDIA A2 Tensor Core GPU Is An Entry-Level Data Center Design Powered By Ampere GA107

The NVIDIA A2 Tensor Core GPU is designed specifically for inferencing and replaces the Turing-powered T4 Tensor Core GPU. In terms of specifications, the card features a variant of the Ampere GA107 GPU SKU which offers 1280 CUDA cores and 40 Tensor cores. These cores run at a clock frequency of 1.77 GHz and are based on the Samsung 8nm process node. Only the higher-end GA100 GPU SKUs are based on the TSMC 7nm process node.


The memory design features a 16 GB GDDR6 capacity that runs across a 128-bit wide bus interface, clocking in at 12.5 Gbps effective for a total bandwidth of 200 GB/s. The GPU is configured to operate at a TDP between 40 and 60 Watts. Owing to its entry-level design, the card also comes in a small, half-height and half-length form factor and is passively cooled. Due to its lower TDP, it does not require any external power connectors either. The card also features a PCIe Gen 4.0 x8 interface instead of the standard x16 link.
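The quoted bandwidth follows directly from the bus width and the effective data rate. A minimal back-of-the-envelope sketch in Python (the formula is the standard theoretical-peak calculation; the constants are simply the specs above):

```python
# Peak memory bandwidth = (bus width in bytes) x (effective data rate per pin).
BUS_WIDTH_BITS = 128    # A2's GDDR6 memory interface
DATA_RATE_GTPS = 12.5   # effective transfer rate per pin, in GT/s

bandwidth_gb_s = (BUS_WIDTH_BITS / 8) * DATA_RATE_GTPS
print(f"Peak bandwidth: {bandwidth_gb_s:.0f} GB/s")  # -> 200 GB/s
```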

The NVIDIA A2 Tensor Core GPU provides entry-level inference with low power, a small footprint, and high performance for NVIDIA AI at the edge. Featuring a low-profile PCIe Gen4 card and a low 40-60W configurable thermal design power (TDP) capability, the A2 brings versatile inference acceleration to any server for deployment at scale.

via NVIDIA

Performance-wise, the compute numbers are rated at 4.5 TFLOPs (FP32), 0.14 TFLOPs (FP64), 36 TOPs (INT8), 18 TFLOPs (FP16 Tensor), and 9 TFLOPs (TF32 Tensor). Compared against an NVIDIA T4 in IVA (intelligent video analytics) workloads, the A2 offers up to a 30% improvement while consuming much less power. The NVIDIA A2 Tensor Core GPU is available right now, though no specific details have been shared regarding the card's pricing.
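Those ratings are internally consistent: the FP32 figure matches the usual theoretical-peak formula of one FMA (two FLOPs) per CUDA core per clock, and the rated Tensor throughputs work out to clean 2x/4x/8x multiples of it. A quick sketch, assuming that standard approximation:

```python
CUDA_CORES = 1280       # A2's GA107 variant
BOOST_CLOCK_GHZ = 1.77

# Peak FP32: each CUDA core retires one FMA (2 FLOPs) per cycle.
fp32_tflops = 2 * CUDA_CORES * BOOST_CLOCK_GHZ / 1e3
print(f"FP32 peak:   {fp32_tflops:.2f} TFLOPs")      # ~4.53, matching the rated 4.5

# The rated Tensor figures are simple multiples of the FP32 rate.
print(f"TF32 Tensor: {fp32_tflops * 2:.1f} TFLOPs")  # ~9
print(f"FP16 Tensor: {fp32_tflops * 4:.1f} TFLOPs")  # ~18
print(f"INT8 Tensor: {fp32_tflops * 8:.1f} TOPs")    # ~36
```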

NVIDIA Ampere Professional GPU Lineup

| GPU Name | A100 | A40 | A30 | A16 | A10 | A2 |
|---|---|---|---|---|---|---|
| Process Node | TSMC 7nm | Samsung 8nm | TSMC 7nm | Samsung 8nm | Samsung 8nm | Samsung 8nm |
| GPU SKU | GA100-884 | GA102-895 | GA100-890 | 4x GA107 | GA102-890 | GA107 |
| GPU Transistors | 54.2B | 28.3B | 54.2B | TBA | 28.3B | TBA |
| CUDA Cores | 6912 | 10752 | 3584 | 2560 x4 | 9216 | 1280 |
| Tensor Cores | 432 | 336 | 224 | 80 x4 | 288 | 40 |
| Boost Clock | 1.41 GHz | 1.74 GHz | 1.44 GHz | 1.69 GHz | 1.69 GHz | 1.77 GHz |
| FP32 Compute | 19.49 TFLOPs | 37.42 TFLOPs | 10.32 TFLOPs | 8.678 TFLOPs x4 | 31.24 TFLOPs | 4.5 TFLOPs |
| FP64 Compute | 9.74 TFLOPs | 1.16 TFLOPs | 5.16 TFLOPs | 0.27 TFLOPs x4 | 0.97 TFLOPs | 0.14 TFLOPs |
| FP16 Compute | 77.97 TFLOPs | 37.42 TFLOPs | 10.32 TFLOPs | 8.67 TFLOPs x4 | 31.24 TFLOPs | 4.5 TFLOPs |
| INT8 Tensor Compute | 624 TOPs | 598.6 TOPs | 330 TOPs | TBA | 500 TOPs | 36 TOPs |
| TF32 Tensor Compute | 156 TFLOPs | 149.6 TFLOPs | 82 TFLOPs | TBA | 125 TFLOPs | 9 TFLOPs |
| Interconnect | NVLink 3 (12 Links) | PCIe 4.0 x16 | PCIe 4.0 x16 + NVLink 3 (4 Links) | PCIe 4.0 x16 | PCIe 4.0 x16 | PCIe 4.0 x8 |
| Memory Capacity | 40 GB HBM2e | 48 GB GDDR6 | 24 GB HBM2e | 16 GB x4 GDDR6 | 24 GB GDDR6 | 16 GB GDDR6 |
| Memory Bus | 5120-bit | 384-bit | 3072-bit | 128-bit x4 | 384-bit | 128-bit |
| Memory Clock | 1215 MHz | 1812 MHz | 1215 MHz | 1812 MHz | 1563 MHz | 1563 MHz |
| Bandwidth | 1.55 TB/s | 695.8 GB/s | 933.1 GB/s | 231.9 GB/s x4 | 600.2 GB/s | 200 GB/s |
| TDP | 400W | 300W | 165W | 250W | 150W | 60W |
| Form Factor | SXM4 | PCIe Dual Slot, Full Length | PCIe Dual Slot, Full Length | PCIe Dual Slot, Full Length | PCIe Single Slot, FLHH | PCIe Single Slot, HHHL |
