What can PyArrow do for you - Array interchange, storage, compute and transport

UD2.218A | Day 2 | 11:00 - 11:30 | Speakers: Rok Mihevc, Alenka Frim

Abstract

PyArrow is a powerful tool for Python developers seeking high-performance data processing and interchange. This talk will provide a pragmatic overview of some of PyArrow's capabilities, demonstrating data interchange, storage, manipulation and transport using a single Python library.

We'll explore four key capabilities:

Array Interchange: Seamless data exchange between NumPy, pandas, and other libraries using zero-copy conversion
Storage: Efficient serialization and file format support (Parquet, ORC, Feather) with advanced compression
Compute: High-performance in-memory computation and data transformation capabilities
Transport: Leveraging Arrow Flight RPC for distributed data movement and processing

Speakers

Rok Mihevc
Alenka Frim