Closing the Loop: A Self-Learning Compiler for AI Accelerators

UD2.120 (Chavanne) | Day 1 | 13:30 - 13:50 | Speakers: Ramon Wirsch

Abstract

AI workloads increasingly target heterogeneous accelerators, from GPUs and TPUs to novel RISC-V architectures, each with different scheduling, memory, and concurrency constraints. Traditional compilers rely on static heuristics that do not generalize across devices and custom neural network layers.

In this talk, we present the Daisytuner Optimizing Compiler Collection (DOCC), a self-learning compiler that closes the optimization loop by continuously collecting performance data from real executions and feeding it back into the compilation pipeline. The system represents code regions as stateful dataflow multigraphs (SDFGs), an open-source intermediate representation that enables symbolic dataflow analysis. Performance profiles, consisting of hardware counters and execution times, are ingested into an online embedding database that the compiler queries to derive and apply new optimizations.
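
To make the feedback loop more concrete, here is a minimal sketch of the general idea of a profile store that a compiler could query. It is not DOCC's actual interface: the names `ProfileRecord`, `ProfileStore`, `ingest`, and `suggest_schedule`, the schedule labels, and the two-dimensional embeddings are all hypothetical, and a simple nearest-neighbour search stands in for whatever the real embedding database does.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class ProfileRecord:
    """One measured execution of a code region (hypothetical layout)."""
    embedding: Tuple[float, ...]  # vector describing the code region
    schedule: str                 # label of the optimization that was applied
    runtime_ms: float             # measured execution time


class ProfileStore:
    """Toy in-memory stand-in for an online embedding database."""

    def __init__(self) -> None:
        self._records: List[ProfileRecord] = []

    def ingest(self, embedding: Sequence[float], schedule: str,
               runtime_ms: float) -> None:
        # Record a new measurement coming back from a real execution.
        self._records.append(
            ProfileRecord(tuple(embedding), schedule, runtime_ms))

    def suggest_schedule(self, embedding: Sequence[float],
                         k: int = 3) -> Optional[str]:
        # Look up the k most similar regions seen so far (Euclidean
        # distance) and return the schedule of the fastest one.
        if not self._records:
            return None
        nearest = sorted(
            self._records,
            key=lambda r: math.dist(r.embedding, embedding))[:k]
        return min(nearest, key=lambda r: r.runtime_ms).schedule


# Assumed example data: three earlier measurements, then one query.
store = ProfileStore()
store.ingest((1.0, 0.0), "tile_32x32", 4.2)
store.ingest((0.9, 0.1), "vectorize", 3.1)
store.ingest((0.0, 1.0), "gpu_offload", 1.0)
suggestion = store.suggest_schedule((1.0, 0.05), k=2)
```

In this sketch the query region is closest to the two CPU-style records, so the faster of those two schedules is suggested even though a dissimilar region has a lower absolute runtime. Each new measurement passed to `ingest` can change later suggestions, which is the sense in which the loop is closed.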

We describe the generation of SDFGs from ONNX and PyTorch via IREE, the passes for mapping the IR to backends, and the benchmarking infrastructure running on our super-heterogeneous cluster. We conclude by showing how this feedback pipeline allows the compiler to evolve its optimization strategies automatically, improving schedules without human intervention.
