Apache Arrow tensor arrays: an approach for storing tensor data

Day 1 | 14:30 | 00:05 | UB5.132 | Rok Mihevc, Alenka




This talk introduces Apache Arrow's tensor arrays as a way to represent, store, and transport arrays of tensors. We'll cover the tensor array memory layout specification and its implementations in Arrow C++ and Python, and show how they help interoperate with the PyData and database ecosystems.

We'll present the fixed shape and variable shape tensor array specifications and their implementations, and show how they can be used to interoperate with Arrow-aware ecosystems such as DLPack, NumPy, and others. We'll also discuss the design decisions we made to keep the two tensor arrays as generic and universal as possible.
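
To make the difference between the two layouts concrete, here is a rough pure-Python sketch. It is not the Arrow API: the helper names `fixed_shape_get` and `variable_shape_get` are hypothetical, and the layouts are simplified assumptions (row-major values, one shape shared by all rows in the fixed case, one shape stored per row in the variable case).

```python
from math import prod


def fixed_shape_get(values, shape, i):
    """Return the flat, row-major values of the i-th tensor.

    Assumption: every tensor has the same shape, so all tensors sit
    back to back in one flat buffer and are located by arithmetic alone.
    """
    size = prod(shape)
    return values[i * size:(i + 1) * size]


def variable_shape_get(data, shapes, i):
    """Return (flat values, shape) of the i-th tensor.

    Assumption: each row stores its own flat data and its own shape,
    so tensors in the same array may differ in size.
    """
    flat = data[i]
    if len(flat) != prod(shapes[i]):
        raise ValueError("data length does not match the stored shape")
    return flat, shapes[i]


# Fixed shape: three 2x2 tensors stored in one flat buffer of 12 values.
fixed_values = list(range(12))
fixed_shape = (2, 2)

# Variable shape: a 2x3 tensor followed by a 1x2 tensor.
var_data = [[1, 2, 3, 4, 5, 6], [7, 8]]
var_shapes = [(2, 3), (1, 2)]

second = fixed_shape_get(fixed_values, fixed_shape, 1)
flat, shape = variable_shape_get(var_data, var_shapes, 0)
```

In the fixed shape case the shape lives once alongside the array, which keeps lookups cheap; the variable shape case trades that for the flexibility of per-row shapes.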