Name: GPU Virtualization with MIG: Multi-Tenant Isolation for AI Inference Workloads
Start: 2026-01-31T18:00:00
End: 2026-01-31T18:00:00
Location: H.2213

Abstract

Serving AI models to multiple tenants from a single GPU sounds challenging until you partition the GPU correctly.

This talk is a deep technical exploration of running AI inference workloads on modern GPUs using Multi-Instance GPU (MIG) isolation.

We'll explore:

The multi-tenant problem: MIG vs other GPU slicing methods.
MIG fundamentals: key concepts, how it works, and support.
Managing MIG instances: creation, configuration, monitoring, and deletion.
Identifying the right approach for your workload.
Common issues and failures.

Whether you're building a multi-tenant inference platform, optimizing GPU utilization for your team, or exploring how to serve AI models cost-effectively, this talk provides practical configurations for your AI workloads.
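
As a rough illustration of the instance lifecycle listed above (creation, inspection, deletion), the sketch below assembles the `nvidia-smi` commands typically used for each step. It is not taken from the talk. The helper names `mig_lifecycle_commands` and `run` are hypothetical, and the default profile name `1g.5gb` is an assumption that matches an A100 40GB; other GPUs expose different profiles, which `nvidia-smi mig -lgip` lists. By default the commands are only printed, not executed.

```python
import shlex
import subprocess


def mig_lifecycle_commands(gpu_index=0, profiles=("1g.5gb", "1g.5gb")):
    """Return nvidia-smi commands for one create/inspect/delete MIG cycle.

    The profile names are assumptions (A100 40GB style); check the output
    of `nvidia-smi mig -lgip` on the target GPU before using them.
    """
    gpu = str(gpu_index)
    return [
        # Enable MIG mode on the GPU (may require a GPU reset to take effect).
        ["nvidia-smi", "-i", gpu, "-mig", "1"],
        # List the GPU instance profiles this GPU offers.
        ["nvidia-smi", "mig", "-i", gpu, "-lgip"],
        # Create GPU instances; -C also creates a compute instance in each.
        ["nvidia-smi", "mig", "-i", gpu, "-cgi", ",".join(profiles), "-C"],
        # Inspect what was created.
        ["nvidia-smi", "mig", "-i", gpu, "-lgi"],
        ["nvidia-smi", "-L"],
        # Tear down: compute instances first, then GPU instances, then MIG mode.
        ["nvidia-smi", "mig", "-i", gpu, "-dci"],
        ["nvidia-smi", "mig", "-i", gpu, "-dgi"],
        ["nvidia-smi", "-i", gpu, "-mig", "0"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: prints the commands without touching any GPU.
    run(mig_lifecycle_commands(gpu_index=0))
```

Running the commands for real generally needs root privileges and a MIG-capable GPU, and enabling or disabling MIG mode can fail while other processes are using the device.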
