NVIDIA Multi-Instance GPU
Seven independent instances in a single GPU.
Multi-Instance GPU (MIG) expands the performance and value of NVIDIA Blackwell and Hopper™ generation GPUs. MIG can partition a GPU into as many as seven instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. This gives administrators the ability to support every workload, from the smallest to the largest, with guaranteed quality of service (QoS), extending the reach of accelerated computing resources to every user.
Benefits Overview
Expand GPU Access
With MIG, a single GPU can be partitioned into as many as seven independent instances, so one GPU can serve up to 7X more users and jobs. MIG gives researchers and developers more resources and flexibility than ever before.
Optimize GPU Utilization
MIG provides the flexibility to choose many different instance sizes, which allows provisioning of the right-sized GPU instance for each workload, ultimately optimizing utilization and maximizing data center investment.
Run Simultaneous Workloads
MIG enables inference, training, and high-performance computing (HPC) workloads to run at the same time on a single GPU with deterministic latency and throughput. Unlike with time slicing, workloads run in parallel on dedicated hardware rather than taking turns, delivering higher performance.
How the Technology Works
Without MIG, different jobs running on the same GPU, such as different AI inference requests, compete for the same resources. A job that consumes excessive memory bandwidth starves the others, causing several jobs to miss their latency targets. With MIG, jobs run simultaneously on different instances, each with dedicated compute, memory, and memory bandwidth, resulting in predictable performance with QoS and maximum GPU utilization.
Provision and Configure Instances as Needed
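As a sketch of the provisioning workflow, the commands below use the `nvidia-smi mig` interface. Supported profiles and their IDs vary by GPU, so list them first; the profile ID used here is illustrative.

```shell
# Enable MIG mode on GPU 0 (requires admin privileges; may require a GPU reset)
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles this GPU supports, with their profile IDs
sudo nvidia-smi mig -lgip

# Create two GPU instances from a chosen profile, plus a compute instance
# on each (-C). Profile ID 19 is illustrative; use an ID from -lgip output.
sudo nvidia-smi mig -cgi 19,19 -C

# Verify the resulting MIG devices and their UUIDs
nvidia-smi -L
```

Instances can be created and destroyed on the fly as demand changes, without rebooting the host.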
Run Workloads in Parallel, Securely
MIG in Blackwell GPUs
Blackwell and Hopper GPUs support MIG with multi-tenant, multi-user configurations in virtualized environments across up to seven GPU instances, securely isolating each instance with confidential computing at the hardware and hypervisor level. Dedicated video decoders for each MIG instance deliver secure, high-throughput intelligent video analytics (IVA) on shared infrastructure. With concurrent MIG profiling, administrators can monitor each instance and allocate right-sized GPU acceleration for multiple users.
Researchers with smaller workloads can use MIG to securely isolate a portion of a GPU rather than renting a full cloud instance, assured that their data is secure at rest, in transit, and in use. This gives cloud service providers the flexibility to price and address smaller customer opportunities.
Built for IT and DevOps
MIG enables fine-grained GPU provisioning by IT and DevOps teams. Each MIG instance behaves like a standalone GPU to applications, so there’s no change to the CUDA® platform. MIG can be used in all major enterprise computing environments.
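Because each MIG instance enumerates as its own device, an unmodified CUDA application can be pointed at one instance by its UUID. A minimal sketch (the UUIDs and application name shown are placeholders):

```shell
# MIG devices enumerate alongside the parent GPU, each with its own UUID
nvidia-smi -L
# GPU 0: NVIDIA H100 (UUID: GPU-xxxxxxxx-...)
#   MIG 1g.10gb Device 0: (UUID: MIG-xxxxxxxx-...)

# Run an unmodified CUDA application on one instance by selecting its UUID
CUDA_VISIBLE_DEVICES=MIG-xxxxxxxx-... ./my_cuda_app
```

The application sees a single GPU with the instance's memory and compute; no code changes are needed.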
Deploy from Data Center to Edge
Use MIG on premises, in the cloud, and at the edge.
Leverage Containers
Run containerized applications on MIG instances.
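For example, the NVIDIA Container Toolkit can expose a single MIG instance to a container by UUID. This is a sketch; the MIG UUID is a placeholder (find yours with `nvidia-smi -L`), and the container image is illustrative.

```shell
# Expose one MIG instance (not the whole GPU) to a container.
# The container's nvidia-smi -L will list only that instance.
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  nvcr.io/nvidia/pytorch:24.01-py3 nvidia-smi -L
```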
Support Kubernetes
Schedule Kubernetes pods on MIG instances.
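With the NVIDIA device plugin for Kubernetes, MIG slices surface as schedulable resources. A sketch of a pod spec, assuming the plugin is deployed with a MIG strategy that exposes per-profile resource names (the image tag is illustrative):

```yaml
# Pod requesting one 1g.10gb MIG slice via the NVIDIA device plugin
apiVersion: v1
kind: Pod
metadata:
  name: mig-example
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvcr.io/nvidia/cuda:12.3.1-base-ubuntu22.04
    command: ["nvidia-smi", "-L"]
    resources:
      limits:
        nvidia.com/mig-1g.10gb: 1   # one MIG slice, not a full GPU
```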
Virtualize Applications
Run applications on MIG instances inside a virtual machine.
MIG Specifications
|  | GB200/B200/B100 | H100 |
|---|---|---|
| Confidential computing | Yes | Yes |
| Instance types | Up to 7x 23GB, up to 4x 45GB, up to 2x 95GB, up to 1x 192GB | 7x 10GB, 4x 20GB, 2x 40GB, 1x 80GB |
| GPU profiling and monitoring | Concurrent on all instances | Concurrent on all instances |
| Secure tenants | 7x | 7x |
| Media decoders | Dedicated NVJPEG and NVDEC per instance | Dedicated NVJPEG and NVDEC per instance |
Preliminary specifications; subject to change.