wllama: bringing llama.cpp to the web

UB2.252A (Lameere) | Day 2 | 16:20 - 16:40 | Speakers: Xuan-Son Nguyen

Abstract

As one of the main contributors to the llama.cpp project, I’ve explored ways to bring its capabilities to the web through WebAssembly, creating a frontend solution for on-device inference that needs no servers or external APIs. This talk shares my journey in implementing wllama, a lightweight TypeScript/JavaScript library designed to push llama.cpp’s limits in a web context. I’ll discuss my motivations, the implementation details, the challenges I faced, and the future roadmap, offering insights into the technical and creative decisions behind the project.

Speakers

Xuan-Son Nguyen

Notice: The placeholder video image is licensed under CC BY-SA 4.0. The original image can be found here. Changes made to the image: cropped to a new ratio, so part of the image was cut off.