Apache Arrow tensor arrays: an approach for storing tensor data

UB5.132 | Day 1 | 14:30 - 14:35 | Speakers: Rok Mihevc, Alenka Frim

Abstract

This talk introduces Apache Arrow's tensor arrays as a tool for representing an array of tensors in memory, as well as for storing and transporting them. We'll introduce the tensor array memory layout specification and its implementation in Arrow C++ and Python, and showcase how it can help interoperate with the PyData and database ecosystems.

We'll present the fixed and variable shape tensor array specifications and their implementations, and show how they can be used to interoperate with the Arrow-aware ecosystem, including DLPack, NumPy, and others. Further, we'll discuss the design decisions we made to make the two tensor arrays as generic and universal as possible.

Speakers

Rok Mihevc
Alenka Frim
