tract - an efficient rust neural network inference engine
UD2.120 (Chavanne) | Day 1 | 11:25 - 11:45 | Speakers: Julien Balian, Mathieu Poumeyrol
Abstract
Deploying neural networks in production environments presents unique challenges: models must run efficiently across diverse hardware, from powerful servers to resource-constrained embedded devices, while maintaining predictable performance without heavy runtime dependencies.
This talk introduces tract, Sonos's open-source neural network inference toolkit started in 2018 and written in Rust. We'll explore how tract bridges the gap between training frameworks and production deployment by offering a no-nonsense, self-contained inference solution used today to deploy deep learning on millions of devices at Sonos.
tract's distinguishing strengths are embedded graph optimization, automated streaming management, and symbolic abstraction for dynamic dimensions, along with support for multiple open exchange standards including ONNX, NNEF, and TensorFlow Lite.
tract also has a companion project, torch-to-nnef, which strives to export PyTorch models to an NNEF variant optimized for tract with maximum compatibility. It enables some unique features such as quantization, better Fourier transform support, and easier extensibility; this will also be discussed briefly during the presentation.
Speakers
Julien Balian: Machine learning practitioner with 12 years of experience spanning NLP, speech recognition, voice identification, wake-word detection, large language models, and quantization, from training to production. I work across industry and healthcare applications, with a focus on privacy-first, on-device inference solutions. I specialize in translating research papers into production-ready implementations, conducting applied research and experimentation to solve real-world problems. I share insights and technical deep-dives at dreamermind.dev.
Mathieu Poumeyrol (aka kali) is a software engineer based in Paris, currently working at Sonos. Mathieu is the author of tract, a “tiny, no-nonsense” neural-network inference toolkit written in Rust that can load models in multiple formats, optimize them, and run them on-device. Along the way, Mathieu has worked on developer tooling and embedded-friendly ML and voice topics, with published work in areas such as keyword spotting.
