Vulkan API for Machine Learning? Competing with CUDA and ROCm in llama.cpp

UD2.120 (Chavanne) | Day 1 | 14:45 - 15:05 | Speakers: Ruben Ortlam

Abstract

Most Machine Learning tools use CUDA for hardware acceleration and, as a result, are only compatible with Nvidia GPUs. AMD has made a lot of progress with ROCm, which lets such code be recompiled for AMD hardware with minimal changes, but why not use an open and broadly compatible API instead? That is where Vulkan comes in: it was designed for game development, but it also supports compute-only applications and has good driver support across many hardware vendors.

As a follow-up to last year's talk about my work on the llama.cpp/GGML Vulkan backend, this talk will discuss lessons learnt from the optimizations and new features we have added since then, how viable Vulkan is for Machine Learning, and what it is still missing.

https://github.com/ggml-org/llama.cpp
https://github.com/ggml-org/ggml
