Name: Building a new GGML backend: How, Challenges and Opportunities with Novel Accelerators
Start: 2025-02-01T15:00:00
End: 2025-02-01T15:00:00
Location: UB2.252A (Lameere)

Abstract

llama.cpp/GGML is a popular piece of software for running (mostly) large language models. It supports common consumer and enterprise hardware such as NVIDIA, AMD and Intel GPUs. But what if you want to onboard a new accelerator, say a new architecture that promises to reduce power consumption severalfold? This talk shares the experience and lessons learned from building a (work-in-progress) GGML backend for Tenstorrent's Grayskull and Wormhole AI processors, and what it is like to work with a brand-new software stack.

Source code: https://github.com/marty1885/llama.cpp/tree/metalium-support/
Documentation: https://github.com/marty1885/llama.cpp/blob/metalium-support/docs/backend/Metalium.md
