Beyond nvidia-smi: Tools for Real GPU Performance Metrics

H.1301 (Cornil) | Day 2 | 09:50 - 10:30 | Speakers: YASH PANCHAL

Abstract

Relying only on nvidia-smi is like measuring highway usage by checking if any car is present, not how many lanes are full.
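
The analogy can be made concrete with a toy model. The utilization figure reported by nvidia-smi is the percentage of the sample period during which at least one kernel was executing, so it says nothing about how many streaming multiprocessors (SMs) were actually busy. The sketch below is not a measurement tool: the SM count, the sample values, and the function names are all hypothetical, chosen only to show how the two numbers can diverge.

```python
# Toy model: each sample records how many SMs were busy at that instant.
# All numbers are hypothetical and chosen only for illustration.
TOTAL_SMS = 108  # assumed number of SMs on the GPU


def utilization(samples):
    """Fraction of samples in which at least one SM was busy
    (the 'is any car on the highway?' view)."""
    busy = sum(1 for s in samples if s > 0)
    return busy / len(samples)


def sm_activity(samples, total_sms=TOTAL_SMS):
    """Average fraction of SMs busy across all samples
    (the 'how many lanes are full?' view)."""
    return sum(samples) / (len(samples) * total_sms)


# A small kernel that occupies only 4 SMs runs for the whole window.
samples = [4] * 10

print(f"utilization: {utilization(samples):.0%}")   # 100%
print(f"SM activity: {sm_activity(samples):.1%}")   # 3.7%
```

In this made-up case the GPU looks fully utilized while fewer than 4% of its SMs are doing work, which is the gap the talk is about.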

This talk covers the metrics nvidia-smi doesn't show and introduces open-source tools that expose actual GPU efficiency.

We'll cover:

  1. Why GPU utilization is not the same as GPU efficiency.
  2. A deep dive into the key metrics: SM, Tensor Core, and memory metrics explained.
  3. A practical GPU profiling and monitoring setup.
  4. Identifying bottlenecks in inference workloads.

Attendees will leave understanding how to identify underutilized GPUs and find real optimization opportunities in inference workloads.
