FOSDEM 2025
wllama: bringing llama.cpp to the web
UB2.252A (Lameere) | Day 2 | 16:20 - 16:40 | Speakers: Xuan-Son Nguyen
Abstract
As one of the main contributors to the llama.cpp project, I’ve explored ways to bring its capabilities to the web through WebAssembly, creating a frontend solution for on-device inference without the need for servers or external APIs. This talk shares my journey in implementing wllama, a lightweight TypeScript/JavaScript library designed to push llama.cpp’s limits in a web context. I’ll discuss my motivations, the implementation details, the challenges faced, and the future roadmap, offering insights into the technical and creative decisions behind the project.
Speakers
Xuan-Son Nguyen
