https://www.bamsoftware.com/hacks/zipbomb/ (writeup)
https://www.bamsoftware.com/hacks/denhac-zipbomb/ (this talk)
$ git clone https://www.bamsoftware.com/git/zipbomb.git
$ cd zipbomb
$ make
$ sha256sum zbsm.zip zblg.zip zbxl.zip
fb4ff972d21189beec11e05109c4354d0cd6d3b629263d6c950cf8cc3f78bd99  zbsm.zip
f1dc920869794df3e258f42f9b99157104cd3f8c14394c1b9d043d6fcda14c0a  zblg.zip
eafd8f574ea7fd0f345eaa19eae8d0d78d5323c8154592c850a2d78a86817744  zbxl.zip
$ wc --bytes zbsm.zip zblg.zip zbxl.zip
   42374 zbsm.zip
 9893525 zblg.zip
45876952 zbxl.zip
$ unzip -l zblg.zip | tail -n 1
281395456244934                     65534 files
$ ./ratio zblg.zip
zblg.zip    281395456244934 / 9893525    28442385.9286689    +74.54 dB
$ time unzip -p zblg.zip | dd bs=32M of=/dev/null status=progress
...
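The ratio and decibel figures can be checked by hand. The sketch below is not the repository's `ratio` tool; it assumes the decibel value is 10·log10 of the ratio, which agrees with the output shown above. The function name `compression_ratio` is hypothetical.

```python
import math

def compression_ratio(uncompressed: int, compressed: int) -> tuple[float, float]:
    """Return (ratio, ratio in decibels), assuming dB = 10 * log10(ratio)."""
    ratio = uncompressed / compressed
    return ratio, 10 * math.log10(ratio)

# Sizes for zblg.zip, taken from the unzip -l and wc output above.
ratio, db = compression_ratio(281395456244934, 9893525)
print(f"{ratio:.7f} {db:+.2f} dB")
```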
The only universally supported compression algorithm is DEFLATE (RFC 1951).
But DEFLATE's maximum compression ratio is 1032.
$ dd if=/dev/zero bs=1000000 count=1000 | gzip > test.gz
$ gzip -l test.gz
         compressed        uncompressed  ratio uncompressed_name
             970501          1000000000  99.9% test
$ echo '1000000000 / 970501' | bc -l
1030.39564101428025318881
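The 1032 limit follows from the format: the longest DEFLATE match is 258 bytes, and each match costs at least 2 bits (one for the length code, one for the distance code), so 258 × 8 / 2 = 1032. The same experiment can be repeated without gzip's header and trailer. This is a minimal sketch using Python's `zlib` module; the helper name `deflate_ratio` and the input size are arbitrary choices, and it assumes the size is a multiple of 1,000,000.

```python
import zlib

def deflate_ratio(n_bytes: int) -> float:
    """Compress n_bytes zero bytes as raw DEFLATE and return the ratio.

    Assumes n_bytes is a multiple of 1,000,000.
    """
    # wbits=-15 selects raw DEFLATE, with no zlib or gzip header and trailer.
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    chunk = bytes(1_000_000)
    compressed_size = 0
    for _ in range(n_bytes // len(chunk)):
        compressed_size += len(comp.compress(chunk))
    compressed_size += len(comp.flush())
    return n_bytes / compressed_size

ratio = deflate_ratio(50_000_000)
print(f"{ratio:.2f}")
```

The result should approach 1032 from below but never reach it, because every block also carries a small amount of header overhead.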
42.zip tries to work around the DEFLATE limitation by recursively nesting zip files inside other zip files.
Goal: high compression ratio without using recursion.
A zip file consists of a central directory, which is like a table of contents that points backwards to individual files.
Each file consists of a local file header and compressed file data.
The headers in the central directory and in the files contain (redundant) metadata such as the filename.
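The backward-pointing structure can be inspected with Python's `zipfile` module. The sketch below builds a small archive in memory (the file names and contents are made up for illustration) and prints, for each central directory entry, the offset of the local file header it points to.

```python
import io
import zipfile

# Build a small two-file archive in memory (names and contents are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", b"hello " * 100)
    zf.writestr("world.txt", b"world " * 100)

# Reading starts from the central directory at the end of the file; each entry
# records the offset of its local file header earlier in the file.
with zipfile.ZipFile(buf) as zf:
    entries = [(i.filename, i.header_offset, i.compress_size, i.file_size)
               for i in zf.infolist()]

raw = buf.getvalue()
for name, offset, csize, usize in entries:
    # Every local file header begins with the signature "PK\x03\x04".
    signature = raw[offset:offset + 4]
    print(f"{name}: local header at byte {offset} {signature!r}, "
          f"{csize} -> {usize} bytes")
```

Nothing in the format itself forces two entries to have different offsets, which is the property the rest of the talk relies on.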
The zip file format specification is called APPNOTE.TXT. For a specification, it's not very precise. If you read it with a security mindset, you will quickly think of many troubling questions.
Compress one kernel of ratio 1032:1, refer to it many times.
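Unfortunately this doesn't quite work: every central directory entry points at the same local file header, so for all but one file the filename in the central directory does not match the filename in the local file header.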
(See overlap.zip, generated by the source code.)
We need separate local file headers, but we cannot just put them end to end, because the decompressor is expecting a structured DEFLATE stream, not another local file header.
We need a way to protect or quote the local file headers to prevent them from being interpreted as DEFLATE data.
Solution: add a prefix that wraps the local file header in a non-compressed literal block, thus making it a valid part of the DEFLATE stream.
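The quoting step can be sketched with Python's `zlib` module. A stored block (BTYPE=00) is a 5-byte prefix followed by the literal bytes, as described in RFC 1951. The helper name `quote` and the placeholder header contents are hypothetical, and the kernel here is whatever block type zlib chooses for a short input, not necessarily the BTYPE=10 kernel the real generator emits.

```python
import struct
import zlib

def quote(data: bytes) -> bytes:
    """Wrap data in a non-final stored (BTYPE=00) DEFLATE block."""
    assert len(data) <= 0xFFFF
    # Header byte 0x00: BFINAL=0, BTYPE=00, then padding to a byte boundary.
    # LEN and NLEN (one's complement of LEN) follow as 16-bit little-endian.
    return b"\x00" + struct.pack("<HH", len(data), len(data) ^ 0xFFFF) + data

# Stand-in for a 31-byte local file header (30 fixed bytes + 1-byte filename).
fake_header = b"PK\x03\x04" + bytes(26) + b"E"

# A raw DEFLATE stream (wbits=-15) ending in a final block: the kernel.
comp = zlib.compressobj(9, zlib.DEFLATED, -15)
kernel = comp.compress(b"a" * 1000) + comp.flush()

# A decompressor that starts at the quote sees one valid DEFLATE stream:
# the header bytes as literal output, followed by the kernel's output.
stream = quote(fake_header) + kernel
output = zlib.decompress(stream, wbits=-15)
print(len(quote(fake_header)), output[:4], len(output))
```

A decompressor that starts at `kernel` instead sees only the run of `a` bytes, so the same compressed data serves both files.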
$ ./zipbomb --alphabet=ABCDE --num-files=5 --compressed-size=50 > test.zip
$ unzip -l test.zip
Archive:  test.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
    36245  1982-10-08 13:37   A
    36214  1982-10-08 13:37   B
    36183  1982-10-08 13:37   C
    36152  1982-10-08 13:37   D
    36121  1982-10-08 13:37   E
---------                     -------
   180915                     5 files
File | Blocks in the DEFLATE stream, starting from this file's data |
---|---|
A | 4 × [BFINAL=0 BTYPE=00, 31 bytes "PK\x03\x04\x14\x00..."], then [BFINAL=1 BTYPE=10, compressed kernel, 36121 × 'a'] |
B | 3 × [BFINAL=0 BTYPE=00, 31 bytes "PK\x03\x04\x14\x00..."], then [BFINAL=1 BTYPE=10, compressed kernel, 36121 × 'a'] |
C | 2 × [BFINAL=0 BTYPE=00, 31 bytes "PK\x03\x04\x14\x00..."], then [BFINAL=1 BTYPE=10, compressed kernel, 36121 × 'a'] |
D | 1 × [BFINAL=0 BTYPE=00, 31 bytes "PK\x03\x04\x14\x00..."], then [BFINAL=1 BTYPE=10, compressed kernel, 36121 × 'a'] |
E | [BFINAL=1 BTYPE=10, compressed kernel, 36121 × 'a'] |
Local file headers are treated as both code and data: both as part of the zip structure (code) and as part of a DEFLATE stream (file data).
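The lengths reported by `unzip -l` for the five-file example follow directly from this layout: each file decompresses to the quoted headers of every later file plus the kernel's output. A short arithmetic check, using only the figures from the listing (the variable names are illustrative):

```python
KERNEL_SIZE = 36121   # bytes of 'a' produced by the shared kernel
HEADER_SIZE = 31      # 30-byte local file header plus a 1-byte filename

names = "ABCDE"
# File i decompresses the headers of all later files, then the kernel.
sizes = {name: KERNEL_SIZE + HEADER_SIZE * (len(names) - 1 - i)
         for i, name in enumerate(names)}
total = sum(sizes.values())

for name, size in sizes.items():
    print(name, size)
print("total", total)
```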
$ unzip test.zip
Archive:  test.zip
  inflating: A
  inflating: B
  inflating: C
  inflating: D
  inflating: E
$ xxd E | head -n 5
00000000: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000010: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000020: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000030: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000040: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
$ xxd D | head -n 5
00000000: 504b 0304 1400 0000 0800 a06c 4805 a1b7  PK.........lH...
00000010: f363 3200 0000 198d 0000 0100 0000 4561  .c2...........Ea
00000020: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000030: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000040: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
$ xxd C | head -n 5
00000000: 504b 0304 1400 0000 0800 a06c 4805 29b0  PK.........lH.).
00000010: 790b 5600 0000 388d 0000 0100 0000 4450  y.V...8.......DP
00000020: 4b03 0414 0000 0008 00a0 6c48 05a1 b7f3  K.........lH....
00000030: 6332 0000 0019 8d00 0001 0000 0045 6161  c2...........Eaa
00000040: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
In order of increasing technicality.
Info-ZIP UnZip 6.0 mishandles the overlapping of files inside a ZIP container, leading to denial of service (resource consumption), aka a "better zip bomb" issue.
IMO this is not really a security problem with UnZip.
Debian merged a patch; SUSE decided not to. The Debian patch caused unanticipated problems with certain zip-like files:
VirusTotal for:
Selected web server referers:
PDF is structurally similar to zip. Didier Stevens wrote about stacking compression filters to create a PDF bomb.