Running tinygrad and ggml on microcontroller NPUs
UD2.120 (Chavanne) | Day 1 | 15:10 - 15:15 | Speakers: Roman Shaposhnik
Abstract
Running various forms of inference on microcontroller NPUs is not new. Systems that use machine learning to analyze sensor data or do light CV on microcontroller-grade hardware (under 1 watt, under a few dozen MB of RAM and flash, and under 10 USD bill of materials) are being massively deployed, even if they stay in the long shadow of flashier LLMs and GenAI. That area, however, has historically been the domain of specialized machine learning frameworks such as emlearn, LiteRT (the artist formerly known as TensorFlow Lite) and a few others.
The question I will try to answer in this talk is the following: are there any benefits to using more well-established, but still pretty tightly optimized, frameworks such as ggml and tinygrad for these types of deployments? I will share my experience adapting these frameworks to targets such as Google Coral NPU and AI Foundry Erbium, and the interesting challenges that presented.
