The Numbers First
Level 13 in libdeflate saves 86,990 bytes across the Silesia corpus — a 0.134% improvement over level 12 — at the cost of 1,641,401 ms of extra CPU time. That's a 5,636.7% slowdown. The biggest relative gain is on the nci file: 0.296% smaller. The smallest is sao: 0.004%.
Why Bother?
DEFLATE is everywhere. HTTP content encoding, ZIP archives, PNG images, software distribution, backup tools, embedded systems. The decoder is fixed — you can't change the format. But the encoder has freedom: it chooses match lengths, block boundaries, and Huffman tables. Better choices mean smaller output, and smaller output means faster downloads, less storage, and lower bandwidth costs.
libdeflate is already one of the strongest practical DEFLATE encoders. Level 12 is the baseline. Level 13, contributed as a pull request, pushes the search space to extremes.
What Level 13 Actually Does
Level 13 keeps libdeflate's near-optimal parser but makes every decision more expensive:
- Full 32 KiB window search. It scans the entire DEFLATE sliding window for matches, not just a subset.
- Up to 15 optimisation passes. The parser iterates to refine choices.
- Static Huffman optimisation. Blocks up to 50,000 input bytes get static Huffman table tuning.
- Delayed block size commitment. For text-like data (no NULL byte, ≤97 distinct bytes in a 64 KiB sample), the soft block size limit rises from 300,000 to 1,000,000 bytes. The idea: stable byte distribution lets one Huffman table cover more data.
- Broader minimum cost search. The encoder can choose the cheapest offset for each match length, estimate initial Huffman costs from literal and match length statistics, estimate offset slot frequencies from cached matches, and compare measured dynamic Huffman cost against static Huffman and literal-only encodings.
- Delayed block splitting. Up to nine split candidates are stored with parser state. The compressor scores the full block and a bounded shortest path over candidate intervals. A single split can win on cost; a multi-split path must beat the full block by at least 512 bits. The selected parse is cached for final flushing.
Bounded Slowness
Unlike turtledeflate or broader file optimisers like ECT, level 13 caps its search passes, split candidates, and block sizes. It cannot enter an unbounded optimisation loop. The worst-case runtime is predictable.
The Zero-Regression Policy
Development followed a strict rule: every change must decrease at least one file in the Silesia corpus without increasing any other. Many approaches were tried and rejected. The final configuration is a conservative collection of heuristics that only improve compression.
Silesia Results in Detail
| File | Level 12 Size / Time | Level 13 Size / Time | Size Diff | Time Diff |
|---|---|---|---|---|
| dickens | 3,688,552 / 1,289 ms | 3,684,671 / 83,512 ms | -3,881 (-0.105%) | +82,223 (+6,378.8%) |
| mozilla | 18,267,490 / 4,959 ms | 18,235,120 / 110,754 ms | -32,370 (-0.177%) | +105,795 (+2,133.4%) |
| mr | 3,448,571 / 1,627 ms | 3,443,723 / 16,260 ms | -4,848 (-0.141%) | +14,633 (+899.4%) |
| nci | 2,766,224 / 7,935 ms | 2,758,044 / 673,648 ms | -8,180 (-0.296%) | +665,713 (+8,389.6%) |
| ooffice | 2,998,130 / 424 ms | 2,995,604 / 8,676 ms | -2,526 (-0.084%) | +8,252 (+1,946.2%) |
| osdb | 3,642,347 / 798 ms | 3,641,341 / 4,942 ms | -1,006 (-0.028%) | +4,144 (+519.3%) |
| reymont | 1,702,796 / 1,005 ms | 1,699,494 / 81,839 ms | -3,302 (-0.194%) | +80,834 (+8,043.2%) |
| samba | 5,135,662 / 2,889 ms | 5,122,812 / 141,227 ms | -12,850 (-0.250%) | +138,338 (+4,788.4%) |
| sao | 5,255,575 / 333 ms | 5,255,358 / 1,687 ms | -217 (-0.004%) | +1,354 (+406.6%) |
| webster | 11,565,754 / 6,452 ms | 11,555,293 / 475,196 ms | -10,461 (-0.090%) | +468,744 (+7,265.6%) |
| x-ray | 5,754,248 / 305 ms | 5,748,141 / 3,276 ms | -6,107 (-0.106%) | +2,971 (+974.1%) |
| xml | 633,760 / 1,104 ms | 632,518 / 69,504 ms | -1,242 (-0.196%) | +68,400 (+6,195.7%) |
| Total | 64,859,109 / 29,120 ms | 64,772,119 / 1,670,521 ms | -86,990 (-0.134%) | +1,641,401 (+5,636.7%) |
When to Use Level 13
This level is for data that is compressed once and distributed many times. Think: software packages, firmware images, reference datasets, archived logs. The compression cost is amortised over millions of decompressions. For one-off transfers or frequently recompressed data, level 12 remains the practical choice.
Getting Level 13
The code is in libdeflate's repository. Compile with make and use libdeflate-gzip with -13 or the library API with LIBDEFLATE_LEVEL_13. Expect compilation times to be unaffected — the overhead is only at runtime.
What's Next
Level 13 is a dead end for further algorithmic gains within DEFLATE. The next steps are format-level changes: Brotli, Zstandard, or LZMA. But as long as DEFLATE is the universal fallback, squeezing every byte counts. If you maintain a package distribution pipeline, benchmark level 13 on your assets. The savings may justify a build-time increase.
