Don't compress the data. Compress the generator.
A lossless, format-aware compressor with one load-bearing promise: never worse than LZMA -9 on any input. On schema-regular data — CSV, structured JSON, event streams — the format-aware codecs win. On everything else, pzip falls back to LZMA. Single C++ binary. Apache-2.0. Free forever.
curl -L -o pzip.deb https://github.com/phuctruong/pzip/releases/download/v1.0.0/pzip-1.0.0-linux-x86_64.deb
sudo dpkg -i pzip.deb
pzip --help
How much smaller, by format
Measured numbers from our reproducible benchmark suite (tools/bench.sh). The win depends heavily on how regular your data is. The Never-Worse contract means you can try pzip on your data without risk: when no codec beats LZMA, the output IS LZMA.
| Format | Codec | When pzip helps |
|---|---|---|
| CSV (regular columns) | pzcsv | up to ~2.3× over LZMA |
| JSON (schema-stable) | pzjson | a few % over LZMA |
| JSONL (uniform rows) | pzjsonl | strong on long event streams |
| Source code | pzsource | scenario-dependent |
| Server logs | pzlogs | scenario-dependent |
| Tensor checkpoints | pzsafetensors | scenario-dependent |
| Everything else | LZMA fallback | matches LZMA — never worse |
Honest framing: the “up to” numbers come from highly regular benchmark inputs. Real-world data with high entropy or an unstable schema sits closer to LZMA-equivalent. Codec depth on JSON / source / log formats is tracked on the v1.0.x roadmap.
The contract — three things pzip promises
Byte-exact round-trip
decompress(compress(X)) == X for every X, always. Verified by sha256 in the container header — the decompressor refuses bytes that don't match.
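The round-trip promise above can be illustrated with a minimal Python sketch. It is not pzip's container format: the layout (a 32-byte SHA-256 digest of the original followed by an LZMA payload) and the names `pack` and `unpack` are assumptions made for illustration, and Python's standard `lzma` module stands in for the real codecs.

```python
import hashlib
import lzma

DIGEST_LEN = 32  # size of a SHA-256 digest in bytes


def pack(data: bytes) -> bytes:
    """Hypothetical container: SHA-256 of the original, then the LZMA payload."""
    return hashlib.sha256(data).digest() + lzma.compress(data, preset=9)


def unpack(blob: bytes) -> bytes:
    """Decompress, then refuse the result unless it matches the stored digest."""
    if len(blob) < DIGEST_LEN:
        raise ValueError("container too short")
    expected, payload = blob[:DIGEST_LEN], blob[DIGEST_LEN:]
    data = lzma.decompress(payload)
    if hashlib.sha256(data).digest() != expected:
        raise ValueError("sha256 mismatch: refusing output")
    return data
```

In this sketch the digest is checked after decoding, so a payload that decodes cleanly but to the wrong bytes is still rejected.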
Never worse than LZMA -9
If no specialised codec beats LZMA on the input, the output IS LZMA. You pay nothing for trying pzip.
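The selection rule behind this promise can be sketched in a few lines of Python. This is an illustration of the idea, not pzip's implementation: `never_worse`, the `Codec` pair, and the codec registry are hypothetical, and Python's `lzma` module at preset 9 stands in for the LZMA -9 baseline. The sketch returns the codec name separately and ignores any header bytes a real container would need to record that choice.

```python
import lzma
from typing import Callable, Dict, Tuple

# A codec is a hypothetical (encode, decode) pair of byte-to-byte functions.
Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]


def lzma9(data: bytes) -> bytes:
    """Baseline: LZMA at the highest standard preset."""
    return lzma.compress(data, preset=9)


def never_worse(data: bytes, codecs: Dict[str, Codec]) -> Tuple[str, bytes]:
    """Return (codec name, payload), falling back to LZMA unless a codec wins."""
    best_name, best = "lzma", lzma9(data)
    for name, (encode, decode) in codecs.items():
        try:
            candidate = encode(data)
            ok = decode(candidate) == data  # only lossless candidates qualify
        except Exception:
            continue  # a codec that cannot handle this input is skipped
        if ok and len(candidate) < len(best):
            best_name, best = name, candidate
    return best_name, best
```

Because the baseline is computed first and a candidate replaces it only when it is strictly smaller and round-trips exactly, the returned payload is never larger than the LZMA output for the same input.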
No phone-home
Pure local CLI under Apache-2.0. No telemetry, no analytics, no key. The source is the entire product surface.
Where pzip fits
pzip is the OSS compression engine at the bottom of the Solace AI Worker Platform. You can use pzip free forever under Apache-2.0. If your organisation needs an SLA, indemnification, custom codecs for proprietary data formats, or on-prem licensing, the Solace Compression Enterprise tier ships those.
The code license and the commercial offering are independent. The OSS release is not a trial.