So you want to do RDMA programming? RTRS: An easy to use, reliable high speed transport library over RDMA

H.1302 (Depage) | Day 1 | 14:30 - 14:50 | Speakers: Haris, Jinpu Wang

Notes

Abstract

Description

  • RDMA programming is considerably more complex than socket programming.
  • RDMA is the de facto standard for low-latency, high-throughput networking in data centers and high-performance computing (HPC) environments.
  • RTRS is a reliable, high-speed transport library that provides a simple interface for RDMA. It is stable and proven, running on more than 5000 servers across our data centers.
  • RTRS establishes a stateful session which provides features like multipath, heartbeats, reusability, etc.
  • It creates an optimal number of connections based on the number of CPUs, and uses IRQ pinning for data transfers.
  • It allows users to send and receive data in the form of scatter-gather (sg) lists.
  • RTRS is multipath capable (with different policies to choose from) and provides I/O fail-over and load-balancing functionality.
  • RTRS pre-allocates and pre-maps DMA buffers on the server side to speed up data paths.
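
To make the multipath and fail-over idea above more concrete, here is a minimal user-space sketch in Python. It is a toy model only, not the RTRS API: the `Path` and `Session` classes, the `submit` and `complete` methods, and the two policy names used here ("round-robin" and "min-inflight") are assumptions chosen for illustration. It shows how a session might pick a path per request according to a policy and skip paths that have gone down.

```python
from dataclasses import dataclass


@dataclass
class Path:
    """Hypothetical stand-in for one network path of a session."""
    name: str
    connected: bool = True
    inflight: int = 0  # requests submitted but not yet completed


class Session:
    """Toy model of a multipath session (illustrative, not the RTRS API)."""

    POLICIES = ("round-robin", "min-inflight")  # assumed policy names

    def __init__(self, paths, policy="round-robin"):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.paths = list(paths)
        self.policy = policy
        self._cursor = 0  # round-robin position over self.paths

    def pick_path(self):
        """Return a connected path according to the policy."""
        usable = [p for p in self.paths if p.connected]
        if not usable:
            raise ConnectionError("no connected path left")
        if self.policy == "min-inflight":
            # Prefer the path with the fewest outstanding requests.
            return min(usable, key=lambda p: p.inflight)
        # Round-robin: walk the path list, skipping disconnected paths.
        # The loop terminates because at least one path is connected.
        while True:
            path = self.paths[self._cursor % len(self.paths)]
            self._cursor += 1
            if path.connected:
                return path

    def submit(self):
        """Account for one new request on the chosen path."""
        path = self.pick_path()
        path.inflight += 1
        return path

    def complete(self, path):
        """Account for one finished request on the given path."""
        path.inflight -= 1


if __name__ == "__main__":
    a, b = Path("a"), Path("b")
    session = Session([a, b], policy="round-robin")
    print([session.submit().name for _ in range(4)])
    b.connected = False  # simulate a path failure
    print([session.submit().name for _ in range(2)])
```

In this sketch, fail-over is simply a consequence of path selection ignoring disconnected paths: once `b` is marked down, every new request lands on `a`. A real transport would also have to resubmit requests that were in flight on the failed path, which this model leaves out.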

Benefit to the ecosystem

  • An easy-to-use, reliable and stable RDMA transport library on which any kind of module can be built. RTRS provides an entry point for newcomers to RDMA.
  • The pre-mapping capability is useful for high-performance workloads such as ML and AI training.

Link to the module

https://elixir.bootlin.com/linux/v6.17.7/source/drivers/infiniband/ulp/rtrs
