
Apache Arrow, Hostage Negotiator: Revisiting the case for Client Protocol Redesign

UB2.252A (Lameere) | Day 1 | 17:00 - 17:20 | Speakers: Matthew Topol



Abstract

In 2017, Mark Raasveldt and Hannes Mühleisen (who went on to create DuckDB) presented a VLDB paper entitled “Don’t Hold My Data Hostage – A Case For Client Protocol Redesign.” Their paper proposed the use of columnar serialization to achieve order-of-magnitude improvements in query result transfer performance. Eight years later, this talk revisits Raasveldt and Mühleisen’s argument and describes the central role that the Apache Arrow project has played in realizing this vision through the dissemination of Arrow IPC, Arrow Flight, Arrow Flight SQL, Arrow over HTTP, and ADBC across numerous open source and commercial query systems. The talk concludes with a call to action to introduce Arrow-based transport to the systems that continue to “hold data hostage.”

