Zero Downtime vs. Zero Data Loss: Insights from Large-Scale S3 Migrations

Day 1 | 10:30 | 00:30 | K.3.401 | Sirisha Guduru, Artem Torubarov


Building on previous work presented at FOSDEM, this talk provides updates on the Chorus project's advancements in large-scale S3 object storage migrations. We introduce key enhancements, including the Chorus Agent, which enables proxy-free live migrations through bucket notifications, along with Swift API support and scheduled migrations. Drawing on real-world multi-petabyte migration experience, we analyze two migration strategies through their disaster recovery metrics. The first achieves zero downtime with an optimal Recovery Time Objective (RTO) but a variable Recovery Point Objective (RPO) during outages, while the second guarantees zero RPO at the cost of planned downtime. We present practical insights and recommendations for implementing large-scale S3 migrations while minimizing operational risks and disruptions.