NVIDIA A30
Tensor Core GPU
Versatile compute acceleration for mainstream enterprise servers.
AI Inference and Mainstream Compute for Every Enterprise
Bring accelerated performance to every enterprise workload with NVIDIA A30 Tensor Core GPUs. With NVIDIA Ampere architecture Tensor Cores and Multi-Instance GPU (MIG), it delivers speedups securely across diverse workloads, including AI inference at scale and high-performance computing (HPC) applications. By combining fast memory bandwidth and low-power consumption in a PCIe form factor—optimal for mainstream servers—A30 enables an elastic data center and delivers maximum value for enterprises.
The Data Center Solution for Modern IT
The NVIDIA Ampere architecture is part of the unified NVIDIA EGX™ platform, incorporating building blocks across hardware, networking, software, libraries, and optimized AI models and applications from the NVIDIA NGC™ catalog. Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to rapidly deliver real-world results and deploy solutions into production at scale.
Deep Learning Training
AI Training: Up to 3X higher throughput than V100 and 6X higher than T4
BERT Large Pre-Training (Normalized)
BERT-Large Pre-Training (9/10 epochs) Phase 1 and (1/10 epochs) Phase 2, Sequence Length for Phase 1 = 128 and Phase 2 = 512, Dataset = real, NGC™ container = 21.03, 8x GPU: T4 (FP32, BS=8, 2) | V100 PCIe 16GB (FP32, BS=8, 2) | A30 (TF32, BS=8, 2) | A100 PCIe 40GB (TF32, BS=54, 8) | Batch sizes indicated are for Phase 1 and Phase 2 respectively
Training AI models for next-level challenges such as conversational AI requires massive compute power and scalability.
NVIDIA A30 Tensor Cores with Tensor Float (TF32) provide up to 10X higher performance over the NVIDIA T4 with zero code changes and an additional 2X boost with automatic mixed precision and FP16, delivering a combined 20X throughput increase. When combined with NVIDIA® NVLink®, PCIe Gen4, NVIDIA networking, and the NVIDIA Magnum IO™ SDK, it’s possible to scale to thousands of GPUs.
Tensor Cores and MIG enable A30 to be used for workloads dynamically throughout the day. It can be used for production inference at peak demand, and part of the GPU can be repurposed to rapidly re-train those very same models during off-peak hours.
NVIDIA set multiple performance records in MLPerf, the industry-wide benchmark for AI training.
Deep Learning Inference
A30 leverages groundbreaking features to optimize inference workloads. It accelerates a full range of precisions, from FP64 to TF32 and INT4. Supporting up to four MIGs per GPU, A30 lets multiple networks operate simultaneously in secure hardware partitions with guaranteed quality of service (QoS). And structural sparsity support delivers up to 2X more performance on top of A30’s other inference performance gains.
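The structural sparsity mentioned above can be illustrated with a small sketch. It assumes the 2:4 fine-grained pattern that NVIDIA describes for Ampere Tensor Cores (two non-zero values in every group of four); the datasheet itself does not spell out the pattern. The function name `prune_2_of_4` and the sample weights are hypothetical. The sketch shows only the pruning pattern, not the hardware path or any library API.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Illustrative only: assumes a 2:4 structured-sparsity pattern.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


# Hypothetical weight row with eight values (two groups of four).
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
sparse_row = prune_2_of_4(row)
print(sparse_row)
```

Half of the values in every group become zero, which is what allows hardware to skip them and is the basis for the "up to 2X" figure. Real workflows typically fine-tune the model after pruning to recover accuracy.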
NVIDIA’s market-leading AI performance was demonstrated in MLPerf Inference. Combined with NVIDIA Triton™ Inference Server, which easily deploys AI at scale, A30 brings this groundbreaking performance to every enterprise.
AI Inference: Up to 3X higher throughput than V100 at real-time conversational AI
BERT Large Inference (Normalized)
Throughput for <10ms Latency
NVIDIA® TensorRT®, Precision = INT8, Sequence Length = 384, NGC Container 20.12, Latency <10ms, Dataset = Synthetic, 1x GPU: A100 PCIe 40GB (BS=8) | A30 (BS=4) | V100 SXM2 16GB (BS=1) | T4 (BS=1)
AI Inference—Over 3X higher throughput than T4 at real-time image classification
RN50 v1.5 Inference (Normalized)
Throughput for <7ms Latency
TensorRT, NGC Container 20.12, Latency <7ms, Dataset=Synthetic, 1x GPU: T4 (BS=31, INT8) | V100 (BS=43, Mixed precision) | A30 (BS=96, INT8) | A100 (BS=174, INT8)
High-Performance Computing
HPC—Up to 1.1X higher throughput than V100 and 8X higher than T4
LAMMPS (Normalized)
Dataset: ReaxFF/C, FP64 | 4x GPU: T4, V100 PCIe 16GB, A30
To unlock next-generation discoveries, scientists use simulations to better understand the world around us.
NVIDIA A30 features FP64 NVIDIA Ampere architecture Tensor Cores that deliver the biggest leap in HPC performance since the introduction of GPUs. Combined with 24 gigabytes (GB) of GPU memory with a bandwidth of 933 gigabytes per second (GB/s), researchers can rapidly solve double-precision calculations. HPC applications can also leverage TF32 to achieve higher throughput for single-precision, dense matrix-multiply operations.
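To put the FP64 and bandwidth figures in perspective, here is a back-of-the-envelope sketch using only the peak numbers quoted in this datasheet (10.3 teraFLOPS FP64 Tensor Core, 933GB/s, 24GB). It is a simple roofline-style estimate, not a benchmark; the function names are hypothetical and real applications will not reach peak rates.

```python
# Peak figures taken from this datasheet; achieved rates will be lower.
PEAK_FP64_TENSOR_TFLOPS = 10.3
MEMORY_BANDWIDTH_GB_PER_S = 933.0
MEMORY_GB = 24.0


def break_even_intensity(peak_tflops, bandwidth_gb_per_s):
    """FLOPs per byte at which the compute and memory limits meet."""
    return (peak_tflops * 1e12) / (bandwidth_gb_per_s * 1e9)


def full_memory_sweep_ms(memory_gb, bandwidth_gb_per_s):
    """Milliseconds needed to read all of GPU memory once at peak bandwidth."""
    return memory_gb / bandwidth_gb_per_s * 1e3


intensity = break_even_intensity(PEAK_FP64_TENSOR_TFLOPS, MEMORY_BANDWIDTH_GB_PER_S)
sweep_ms = full_memory_sweep_ms(MEMORY_GB, MEMORY_BANDWIDTH_GB_PER_S)
print(f"break-even intensity: {intensity:.2f} FLOP/byte")
print(f"full memory sweep:    {sweep_ms:.2f} ms")
```

Under these assumptions, a double-precision kernel needs roughly 11 FLOPs per byte moved before compute, rather than memory bandwidth, becomes the limit, and one pass over the full 24GB takes about 26 ms.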
The combination of FP64 Tensor Cores and MIG empowers research institutions to securely partition the GPU to allow multiple researchers access to compute resources with guaranteed QoS and maximum GPU utilization. Enterprises deploying AI can use A30’s inference capabilities during peak demand periods and then repurpose the same compute servers for HPC and AI training workloads during off-peak periods.
High-Performance Data Analytics
Data scientists need to be able to analyze, visualize, and turn massive datasets into insights. But scale-out solutions are often bogged down by datasets scattered across multiple servers.
Accelerated servers with A30 provide the needed compute power, along with large HBM2 memory, 933GB/s of memory bandwidth, and scalability with NVLink, to tackle these workloads. Combined with NVIDIA InfiniBand, NVIDIA Magnum IO, and the RAPIDS™ suite of open-source libraries, including the RAPIDS Accelerator for Apache Spark, the NVIDIA data center platform accelerates these huge workloads at unprecedented levels of performance and efficiency.
Enterprise-Ready Utilization
A30 with MIG maximizes the utilization of GPU-accelerated infrastructure. With MIG, an A30 GPU can be partitioned into as many as four independent instances, giving multiple users access to GPU acceleration.
MIG works with Kubernetes, containers, and hypervisor-based server virtualization. MIG lets infrastructure managers offer a right-sized GPU with guaranteed QoS for every job, extending the reach of accelerated computing resources to every user.
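The "right-sized GPU" idea can be sketched as a small lookup over the three MIG layouts listed in the specifications (4 x 6GB, 2 x 12GB, 1 x 24GB). This is an illustrative planning helper only: `MIG_LAYOUTS` and `smallest_fitting_layout` are hypothetical names, it does not call any NVIDIA tool or API, and it ignores the fact that usable memory per instance may be slightly below the nominal size.

```python
# (instances per GPU, nominal memory per instance in GB), from the spec table.
# Ordered from most instances to fewest.
MIG_LAYOUTS = [(4, 6), (2, 12), (1, 24)]


def smallest_fitting_layout(job_memory_gb):
    """Return the layout with the most instances whose slice still fits the job."""
    for count, slice_gb in MIG_LAYOUTS:
        if job_memory_gb <= slice_gb:
            return count, slice_gb
    raise ValueError("job does not fit on a single A30 (24GB)")


# Hypothetical jobs with different memory footprints.
for need_gb in (5, 10, 20):
    count, slice_gb = smallest_fitting_layout(need_gb)
    print(f"{need_gb}GB job -> {slice_gb}GB instance ({count} per GPU)")
```

Picking the smallest slice that fits leaves the remaining instances free for other users, which is the utilization benefit this section describes.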
NVIDIA AI Enterprise
NVIDIA AI Enterprise, an end-to-end cloud-native suite of AI and data analytics software, is certified to run on A30 in hypervisor-based virtual infrastructure with VMware vSphere. This enables management and scaling of AI workloads in a hybrid cloud environment.
Mainstream NVIDIA-Certified Systems
NVIDIA-Certified Systems™ with NVIDIA A30 bring together compute acceleration and high-speed, secure NVIDIA networking into enterprise data center servers, built and sold by NVIDIA’s OEM partners. This program enables customers to identify, acquire, and deploy systems for traditional and diverse modern AI applications from the NVIDIA NGC catalog on a single high-performance, cost-effective, and scalable infrastructure.
A30 Tensor Core GPU Specifications
| Specification | Value | With sparsity |
| --- | --- | --- |
| FP64 | 5.2 teraFLOPS | |
| FP64 Tensor Core | 10.3 teraFLOPS | |
| FP32 | 10.3 teraFLOPS | |
| TF32 Tensor Core | 82 teraFLOPS | 165 teraFLOPS* |
| BFLOAT16 Tensor Core | 165 teraFLOPS | 330 teraFLOPS* |
| FP16 Tensor Core | 165 teraFLOPS | 330 teraFLOPS* |
| INT8 Tensor Core | 330 TOPS | 661 TOPS* |
| INT4 Tensor Core | 661 TOPS | 1321 TOPS* |
| Media engines | 1 optical flow accelerator (OFA); 1 JPEG decoder (NVJPEG); 4 video decoders (NVDEC) | |
| GPU memory | 24GB HBM2 | |
| GPU memory bandwidth | 933GB/s | |
| Interconnect | PCIe Gen4: 64GB/s; Third-gen NVLink: 200GB/s** | |
| Form factor | Dual-slot, full-height, full-length (FHFL) | |
| Max thermal design power (TDP) | 165W | |
| Multi-Instance GPU (MIG) | 4 GPU instances @ 6GB each; 2 GPU instances @ 12GB each; 1 GPU instance @ 24GB | |
| Virtual GPU (vGPU) software support | NVIDIA AI Enterprise; NVIDIA Virtual Compute Server | |
* With sparsity
** NVLink Bridge for up to two GPUs
