Rewriting pyc files for fun and reproducibility

UD2.218A | Day 2 | 16:30 - 17:00 | Speakers: Zbigniew Jędrzejewski-Szmek

Abstract

Python writes bytecode files (*.pyc) to speed up module imports. It has long been known that these files are not reproducible: on different architectures, the bytecode for exactly the same sources ends up slightly different. Fedora is working on making all package builds reproducible, and with 8500 Python source packages, we quickly found out that differences in bytecode give us grief. One source of the differences (reference flag numbering) was already well known. But after cleaning that up, we found at least two other ways in which bytecode is irreproducible: one related to reference use (also solved), and one related to object order (still unsolved). In this talk we'll describe the problem and report how Fedora rewrites bytecode files in package builds to make them smaller and reproducible.
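
For readers who want to look inside a *.pyc file themselves, the following is a minimal sketch and not the tooling Fedora uses. It assumes CPython 3.7 or later, where a pyc file is a 16-byte header (as described in PEP 552) followed by a marshalled code object. The helper names `compile_to_pyc_bytes` and `split_pyc` are hypothetical and exist only for this illustration. The sketch shows where the marshalled payload lives; it does not reproduce the cross-architecture differences described above.

```python
import hashlib
import importlib.util
import marshal
import py_compile
import tempfile
import types
from pathlib import Path

# Assumption: CPython >= 3.7 layout (PEP 552):
# magic (4 bytes) + flags (4 bytes) + 8 bytes of mtime/size or source hash.
HEADER_SIZE = 16


def compile_to_pyc_bytes(source_text: str) -> bytes:
    """Compile source text to a hash-based pyc and return the file's bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "example.py"
        pyc = Path(tmp) / "example.pyc"
        src.write_text(source_text)
        py_compile.compile(
            str(src),
            cfile=str(pyc),
            dfile="example.py",  # fixed name, so the temp path is not embedded
            doraise=True,
            # Hash-based invalidation keeps the source mtime out of the header.
            invalidation_mode=py_compile.PycInvalidationMode.UNCHECKED_HASH,
        )
        return pyc.read_bytes()


def split_pyc(data: bytes):
    """Split pyc bytes into (header, code object) after checking the magic."""
    header, payload = data[:HEADER_SIZE], data[HEADER_SIZE:]
    if header[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc was written by a different Python version")
    code = marshal.loads(payload)
    if not isinstance(code, types.CodeType):
        raise ValueError("payload is not a marshalled code object")
    return header, code


if __name__ == "__main__":
    data = compile_to_pyc_bytes("def greet(name):\n    return 'hello ' + name\n")
    header, code = split_pyc(data)
    print("header bytes:  ", header.hex())
    print("payload sha256:", hashlib.sha256(data[HEADER_SIZE:]).hexdigest())
    print("constants:     ", code.co_consts)
```

Comparing the payload digest of the same module built on two machines is one simple way to check whether the bytecode came out identical; the header can differ for unrelated reasons (for example, timestamp-based invalidation), which is why the sketch hashes the payload separately.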

Speakers

Zbigniew Jędrzejewski-Szmek