Bug report
description
With gzip.compress() and mtime=0 on CPython 3.8 through 3.10, the OS byte, i.e. the 10th byte in the GZIP header, is set to 255 ("unknown"; see also e.g. #83302):
Line 599 in dc0adb4
However, in CPython 3.11 and 3.12, the OS byte is set to a "known" value, e.g. 3 ("Unix") on Ubuntu.
This is not mentioned in the changelog for Python 3.11.
This may lead to problems in the context of reproducible builds. In our case, hash checking fails after decompressing and re-compressing a gzipped archive.
how to reproduce
Here's an example where byte 10 is \xff in Python 3.10 and \x03 in Python 3.11:
~ $ python
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
>>> import gzip
>>> gzip.compress(b'', mtime=0)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x02\xff\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00'
~ $ pyenv shell 3.11
~ $ python
Python 3.11.6 (main, Nov 23 2023, 17:30:16) [GCC 11.4.0] on linux
>>> import gzip
>>> gzip.compress(b'', mtime=0)
b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x02\x03\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00'
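To compare interpreters without eyeballing the raw bytes, the OS field can be read directly from the header. This is only a minimal sketch: the helper name gzip_os_byte is made up for illustration, and it assumes the fixed 10-byte header layout from RFC 1952, where the OS field sits at 0-based offset 9.

```python
import gzip

# 0-based index of the OS field, i.e. the 10th byte of the fixed gzip header.
OS_BYTE_OFFSET = 9


def gzip_os_byte(blob: bytes) -> int:
    """Return the OS field from the fixed 10-byte header of a gzip stream.

    Hypothetical helper for illustration; not part of the gzip module.
    """
    if len(blob) < 10 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return blob[OS_BYTE_OFFSET]


# The printed value depends on the interpreter (and possibly the platform):
# the report above observed 255 on 3.10 and 3 on 3.11 under Linux.
print(gzip_os_byte(gzip.compress(b"", mtime=0)))
```

Running the same script under each installed Python version gives a quick one-number comparison.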
cause
I guess this is caused by Python 3.11 delegating the gzip.compress() call to zlib if mtime=0, as mentioned in the docs:
Changed in version 3.11: Speed is improved by compressing all data at once instead of in a streamed fashion. Calls with mtime set to 0 are delegated to zlib.compress() for better speed.
and source:
Lines 609 to 612 in 89ddea4
Apparently zlib does set the OS byte.
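For anyone who needs the same OS byte on every version in the meantime, one possible workaround is to overwrite it after compressing. This is only a sketch, and compress_with_unknown_os is a hypothetical name. It relies on two points from RFC 1952: the OS field is at offset 9 of the fixed header, and the trailing CRC32 covers the uncompressed data rather than the header, so patching the byte does not invalidate the stream (gzip.compress() does not set the optional header CRC flag).

```python
import gzip


def compress_with_unknown_os(data: bytes, compresslevel: int = 9) -> bytes:
    """Compress with mtime=0 and force the OS header byte to 255 ("unknown").

    Hypothetical helper for illustration; not part of the gzip module.
    """
    blob = bytearray(gzip.compress(data, compresslevel=compresslevel, mtime=0))
    blob[9] = 0xFF  # OS field: 10th byte of the fixed gzip header
    return bytes(blob)


payload = b"reproducible builds"
archive = compress_with_unknown_os(payload)
# The patched stream still decompresses to the original payload.
print(archive[9], gzip.decompress(archive) == payload)
```

This only normalizes the OS byte. The compressed payload itself could still differ if the underlying zlib builds differ, so it is not a general guarantee of byte-identical output.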
CPython versions tested on:
3.8, 3.9, 3.10, 3.11, 3.12
Operating systems tested on:
Linux, macOS, Windows
Linked PRs
- gh-112346: Bugfix: Remove faster codepath from gzip.compress as it introduces behavioral inconsistencies #114116
- gh-112346: Document the OS byte in gzip.compress output change in 3.11 #120480
- gh-112346: Always set OS byte to 255, simpler gzip.compress function. #120486
- [3.13] gh-112346: Always set OS byte to 255, simpler gzip.compress function. (GH-120486) #120563
- [3.13] gh-112346: Document the OS byte in gzip.compress output change in 3.11 (GH-120480) #120612
- [3.12] gh-112346: Document the OS byte in gzip.compress output change in 3.11 (GH-120480) #120613
- [3.11] gh-112346: Document the OS byte in gzip.compress output change in 3.11 (GH-120480) #120614