So you want to do RDMA programming? RTRS: An easy to use, reliable high speed transport library over RDMA
H.1302 (Depage) | Day 1 | 14:30 - 14:50 | Speakers: Haris Iqbal, Jinpu Wang
Abstract
Description
- RDMA programming is complex compared to something like sockets.
- RDMA is the industry standard for data centers and high-performance computing (HPC) environments.
- RTRS is a reliable, high-speed transport library that provides a simple interface for performing RDMA. It is a stable and proven transport library, running on more than 5000 servers across our data centers.
- RTRS establishes a stateful session which provides features like multipath, heartbeats, reusability, etc.
- It creates an optimal number of connections based on the number of CPUs, and uses IRQ pinning for data transfers.
- It allows users to send and receive data in the form of scatter-gather (sg) lists.
- RTRS is multipath capable (with different policies to choose from) and provides I/O fail-over and load-balancing functionality.
- RTRS pre-allocates and pre-maps DMA buffers on the server side to speed up data paths.
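The multipath behaviour described above (a policy picks a path for each I/O, and a failed path triggers fail-over to another) can be illustrated with a small userspace model. This is only a conceptual sketch, not the RTRS kernel API: the `Path` and `Session` classes, the `submit` method, and the policy names `"round-robin"` and `"min-inflight"` are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Hypothetical stand-in for one network path of a session."""
    name: str
    connected: bool = True
    inflight: int = 0  # number of I/Os currently outstanding on this path


class Session:
    """Toy model of a multipath session (not the real RTRS interface)."""

    def __init__(self, paths, policy="round-robin"):
        if policy not in ("round-robin", "min-inflight"):
            raise ValueError(f"unknown policy: {policy}")
        self.paths = list(paths)
        self.policy = policy
        self._next = 0  # round-robin cursor

    def _pick(self):
        live = [p for p in self.paths if p.connected]
        if not live:
            raise ConnectionError("no connected path left")
        if self.policy == "min-inflight":
            # Load-balance by choosing the least busy connected path.
            return min(live, key=lambda p: p.inflight)
        # Round-robin: advance the cursor, skipping disconnected paths.
        while True:
            path = self.paths[self._next % len(self.paths)]
            self._next += 1
            if path.connected:
                return path

    def submit(self, send):
        """Run send(path); on an OSError mark the path down and fail over."""
        while True:
            path = self._pick()
            path.inflight += 1
            try:
                send(path)
                return path.name
            except OSError:
                path.connected = False  # fail over to another path
            finally:
                path.inflight -= 1
```

In this model, a caller passes a `send` callable that performs the transfer on the chosen path. If it raises `OSError`, the session retries on the remaining paths and only gives up when none is connected.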
Benefit to the ecosystem
- An easy-to-use, reliable, and stable RDMA transport library on which to build any kind of module. RTRS will provide an entry point for newcomers to RDMA.
- The pre-mapping capability is useful for high-performance workloads such as ML and AI training.
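The benefit of pre-allocating buffers can be pictured with a fixed pool that is set up once and then reused for every transfer, so no allocation happens on the data path. The sketch below is a userspace analogy only: it performs no DMA mapping, and `BufferPool` and its methods are hypothetical names, not part of RTRS.

```python
class BufferPool:
    """Fixed set of buffers allocated up front and recycled per transfer."""

    def __init__(self, count, size):
        # All allocation cost is paid once, at setup time.
        self._bufs = [bytearray(size) for _ in range(count)]
        self._free = list(range(count))  # indices of unused buffers

    def acquire(self):
        """Return the index of a free buffer, or None if the pool is empty."""
        if not self._free:
            return None
        return self._free.pop()

    def view(self, idx):
        """Zero-copy view of a buffer for reading or writing payload."""
        return memoryview(self._bufs[idx])

    def release(self, idx):
        """Hand a buffer back to the pool for reuse."""
        if idx in self._free:
            raise ValueError(f"buffer {idx} is already free")
        self._free.append(idx)
```

A caller acquires an index, fills or reads the buffer through `view`, and releases it when the transfer completes. When `acquire` returns `None`, the caller has to wait or apply back-pressure instead of allocating more memory.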
Link to the module
https://elixir.bootlin.com/linux/v6.17.7/source/drivers/infiniband/ulp/rtrs
Speakers
Haris Iqbal is a Senior Manager and Linux Kernel Developer at IONOS, specializing in cloud storage systems, kernel architecture, and RDMA-based transport protocols. With over nine years of experience in system software development, he is a maintainer of the RNBD and RTRS modules in the upstream Linux kernel. Haris is passionate about building resilient, high-performance storage systems and holds multiple patents in storage technologies.
Jinpu (Jack) Wang is a Linux systems engineer based in Berlin, currently working as a Senior Staff Software Engineer at IONOS Cloud. He specializes in Linux kernel development, virtualization, and performance optimization, and is an active contributor to the upstream Linux kernel and related open-source projects. His work focuses on building reliable, scalable cloud compute infrastructure through close collaboration with the open-source community.
