Last year I wrote a small benchmark suite to compare the various zlib optimization forks that were floating around. There are a couple of reasons to update those results. First, major optimizations have been added to the Cloudflare fork. Second, there's now a new entrant, zlib-ng, which merges in the changes from both the Intel and Cloudflare versions but also drops support for old architectures and cleans up the code in general.
I'll write a bit less commentary this time, so that the results will be easier to update in the future without a new post. The big change compared to the 2014-08 results is that the Cloudflare version is now significantly faster, particularly at high compression levels, though there are smaller improvements at all compression levels. Except at compression level 1, it now seems to be the preferable version for pure speed.
Zlib-ng showed a massive slowdown in decompression speed compared to all other versions until it was compiled with --zlib-compat (only relevant for minigzip, not necessary for general use of the library). It is also much slower at compression level 1 than the Intel version, despite apparently using the new quick deflate strategy. At other levels it closely shadows the Intel results.
Versions used:
baseline | 50893291621658f355bc5b4d450a8d06a563053d |
cloudflare | a80420c63532c25220a54ea0980667c02303460a |
intel | e176b3c23ace88d5ded5b8f8371bbab6d7b02ba8 |
zlib-ng | 4b1728a261e32e08bc5403f391ba65bfe5f4ba57 |
Flags used:
All: | CFLAGS='-msse4.2 -mpclmul -O3' |
zlib-ng: | --zlib-compat |
Decompression
| baseline | cloudflare | intel | zlib-ng |
decompress executable (50 iterations) |
Execution time | 1.32s±0.00 (100%) | 1.10s±0.00 (83%) | 1.30s±0.01 (98%) | 1.31s±0.01 (99%) |
decompress html (50 iterations) |
Execution time | 0.76s±0.00 (100%) | 0.65s±0.00 (85%) | 0.75s±0.00 (98%) | 0.76s±0.00 (100%) |
decompress jpeg (50 iterations) |
Execution time | 0.20s±0.00 (100%) | 0.12s±0.00 (60%) | 0.20s±0.01 (101%) | 0.20s±0.00 (100%) |
decompress pngpixels (50 iterations) |
Execution time | 0.87s±0.00 (100%) | 0.65s±0.00 (75%) | 0.85s±0.00 (98%) | 0.86s±0.00 (99%) |
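To make the "Execution time" rows easier to read, here is a minimal sketch of how a mean, a spread, and a percentage of baseline could be produced. It is not the benchmark suite itself: it assumes the ± figure is a standard deviation across runs (the post doesn't say), it times Python's bundled zlib rather than any of the forks, and the helper names `time_runs` and `relative_percent` and the payload are made up for illustration.

```python
import statistics
import time
import zlib


def time_runs(func, iterations):
    """Return (mean, stdev) of wall-clock seconds over `iterations` calls."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def relative_percent(seconds, baseline_seconds):
    """Express a timing as a rounded percentage of the baseline timing."""
    return round(100 * seconds / baseline_seconds)


# Hypothetical payload; the real suite uses executable, html, jpeg and
# pngpixels inputs.
payload = zlib.compress(b"hello zlib " * 100_000, 6)

mean, stdev = time_runs(lambda: zlib.decompress(payload), iterations=50)
print(f"Execution time | {mean:.4f}s±{stdev:.4f}")

# Reproducing one percentage from the table: cloudflare vs baseline on jpeg.
print(relative_percent(0.12, 0.20))  # 60
```

Percentages below 100% mean the fork finished faster than baseline zlib. Since the table's percentages were presumably computed from unrounded timings, recomputing them from the rounded values shown can be off by a point.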
Compression level 1
| baseline | cloudflare | intel | zlib-ng |
compress executable -1 (10 iterations) |
Compression ratio | 0.37 | 0.37 | 0.46 | 0.46 |
Execution time | 0.75s±0.01 (100%) | 0.52s±0.01 (69%) | 0.29s±0.00 (38%) | 0.46s±0.01 (61%) |
compress html -1 (10 iterations) |
Compression ratio | 0.39 | 0.37 | 0.54 | 0.54 |
Execution time | 0.38s±0.00 (100%) | 0.27s±0.00 (71%) | 0.19s±0.00 (49%) | 0.28s±0.00 (73%) |
compress jpeg -1 (10 iterations) |
Compression ratio | 1.00 | 1.00 | 1.05 | 1.05 |
Execution time | 0.65s±0.01 (100%) | 0.53s±0.01 (81%) | 0.24s±0.00 (36%) | 0.40s±0.00 (61%) |
compress pngpixels -1 (10 iterations) |
Compression ratio | 0.17 | 0.17 | 0.23 | 0.23 |
Execution time | 0.44s±0.01 (100%) | 0.27s±0.01 (60%) | 0.18s±0.00 (40%) | 0.26s±0.00 (57%) |
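The "Compression ratio" rows look like compressed size divided by original size, so lower is better and 1.00 means no gain. That reading is an assumption based on the jpeg numbers, not something the post states. A minimal sketch of that calculation with Python's bundled zlib follows; the inputs are hypothetical stand-ins, not the real test files.

```python
import os
import zlib


def compression_ratio(data, level):
    """Compressed size divided by original size; lower is better."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical stand-ins for the real test files.
repetitive = b"<tr><td>cell</td></tr>\n" * 10_000  # compresses well, like html
random_bytes = os.urandom(200_000)                 # high entropy, like jpeg

for level in (1, 3, 5, 9):
    print(level,
          f"{compression_ratio(repetitive, level):.2f}",
          f"{compression_ratio(random_bytes, level):.2f}")
```

High-entropy input ends up slightly larger than the original because of container and block overhead, which is how a ratio can exceed 1.00. Under this reading, the intel and zlib-ng level 1 ratios (0.46 against 0.37 on the executable) mean they are giving up compression for speed at that level.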
Compression level 3
| baseline | cloudflare | intel | zlib-ng |
compress executable -3 (10 iterations) |
Compression ratio | 0.35 | 0.36 | 0.36 | 0.36 |
Execution time | 1.10s±0.02 (100%) | 0.62s±0.01 (56%) | 0.73s±0.02 (66%) | 0.69s±0.01 (63%) |
compress html -3 (10 iterations) |
Compression ratio | 0.36 | 0.35 | 0.35 | 0.35 |
Execution time | 0.61s±0.00 (100%) | 0.37s±0.00 (59%) | 0.43s±0.00 (69%) | 0.41s±0.00 (66%) |
compress jpeg -3 (10 iterations) |
Compression ratio | 1.00 | 1.00 | 1.00 | 1.00 |
Execution time | 0.62s±0.00 (100%) | 0.51s±0.00 (82%) | 0.55s±0.00 (88%) | 0.56s±0.00 (90%) |
compress pngpixels -3 (10 iterations) |
Compression ratio | 0.15 | 0.15 | 0.16 | 0.16 |
Execution time | 0.85s±0.01 (100%) | 0.44s±0.00 (51%) | 0.46s±0.00 (54%) | 0.44s±0.01 (51%) |
Compression level 5
| baseline | cloudflare | intel | zlib-ng |
compress executable -5 (10 iterations) |
Compression ratio | 0.33 | 0.34 | 0.34 | 0.34 |
Execution time | 1.61s±0.00 (100%) | 0.93s±0.01 (57%) | 0.93s±0.00 (57%) | 0.91s±0.01 (56%) |
compress html -5 (10 iterations) |
Compression ratio | 0.34 | 0.33 | 0.33 | 0.33 |
Execution time | 0.99s±0.01 (100%) | 0.57s±0.00 (57%) | 0.53s±0.00 (53%) | 0.52s±0.01 (52%) |
compress jpeg -5 (10 iterations) |
Compression ratio | 1.00 | 1.00 | 1.00 | 1.00 |
Execution time | 0.64s±0.00 (100%) | 0.53s±0.00 (83%) | 0.74s±0.01 (116%) | 0.74s±0.00 (115%) |
compress pngpixels -5 (10 iterations) |
Compression ratio | 0.14 | 0.14 | 0.14 | 0.14 |
Execution time | 1.23s±0.01 (100%) | 0.61s±0.01 (49%) | 0.61s±0.00 (49%) | 0.59s±0.00 (47%) |
Compression level 9
| baseline | cloudflare | intel | zlib-ng |
compress executable -9 (10 iterations) |
Compression ratio | 0.33 | 0.33 | 0.33 | 0.33 |
Execution time | 9.55s±0.01 (100%) | 4.07s±0.01 (42%) | 7.53s±0.01 (78%) | 7.34s±0.01 (76%) |
compress html -9 (10 iterations) |
Compression ratio | 0.33 | 0.33 | 0.33 | 0.33 |
Execution time | 2.81s±0.01 (100%) | 1.64s±0.00 (58%) | 2.54s±0.01 (90%) | 2.48s±0.02 (88%) |
compress jpeg -9 (10 iterations) |
Compression ratio | 1.00 | 1.00 | 1.00 | 1.00 |
Execution time | 0.64s±0.00 (100%) | 0.53s±0.00 (82%) | 0.58s±0.01 (90%) | 0.59s±0.00 (93%) |
compress pngpixels -9 (10 iterations) |
Compression ratio | 0.12 | 0.12 | 0.12 | 0.12 |
Execution time | 26.58s±0.05 (100%) | 14.24s±0.02 (53%) | 21.43s±0.02 (80%) | 19.40s±0.03 (72%) |