LZ4 v1.10.0 introduces major updates, integrating 600+ commits that significantly enhance its capabilities. This version brings multithreading support to the forefront, harnessing modern multi-core processors to accelerate both compression and decompression. It's a good upgrade for users looking to optimize performance in high-throughput environments.
Multithreading support
The most visible upgrade in this version is likely multithreading support. While LZ4 has historically been recognized for its high-speed compression, the demand for even faster throughput has grown, particularly with the advent of NVMe storage technologies that allow for multi-GB/s throughput.
Multithreading is particularly beneficial for High Compression modes, which now perform dramatically faster. The following benchmark table showcases the performance improvements:
source | cpu | os | level | v1.9.4 | v1.10.0 | Improvement |
---|---|---|---|---|---|---|
silesia.tar | 7840HS | Win11 | 12 | 13.4 sec | 1.8 sec | x7.4 |
silesia.tar | M1 Pro | macos | 12 | 16.6 sec | 2.55 sec | x6.5 |
silesia.tar | i7-9700k | linux | 12 | 16.2 sec | 3.05 sec | x5.4 |
enwik9 | 7840HS | Win11 | 9 | 20.8 sec | 2.6 sec | x8.0 |
enwik9 | M1 Pro | macos | 9 | 22.1 sec | 2.95 sec | x7.4 |
enwik9 | i7-9700k | linux | 9 | 22.9 sec | 4.05 sec | x5.7 |
Multithreading is less critical for decompression, as modern NVMe drives can still be saturated with a single decompression thread. Nonetheless, the new version enhances performance by overlapping I/O operations with decompression.
Tested on an x64 linux platform, decompressing a 5 GB text file locally takes 5 seconds with v1.9.4; this is reduced to 3 seconds in v1.10.0, corresponding to a > +60% performance improvement.
Official support for dictionary compression (and decompression)
Starting from v1.10.0, dictionary compression, previously tagged as "experimental", now receives full support. This upgrade assures stability and ongoing support for the feature, enabling developers to reliably use this functionality in their applications.
The new symbols supported by `liblz4` are:
- `LZ4_loadDictSlow()`: minor variant of `LZ4_loadDict()`, which consumes more initialization time to better reference the dictionary, resulting in slightly improved compression ratios.
- `LZ4_attach_dictionary()`: use in-place a LZ4 state initialized with a dictionary, to perform dictionary compression (LZ4 Block format) without the initialization costs. Very useful for small data, where dictionary initialization can become a bottleneck. The dictionary state can be used by multiple threads concurrently.
- `LZ4_attach_HC_dictionary()`: same as `LZ4_attach_dictionary()`, but for LZ4HC dictionary compression.
- `LZ4F_compressBegin_usingDict()`: initiate streaming compression to the LZ4Frame format, using a Dictionary.
- `LZ4F_decompress_usingDict()`: decompress a LZ4Frame requiring a Dictionary.
- `LZ4F_createCDict()`: create a materialized dictionary, ready to start compression without initialization cost. Can be shared across multiple threads.
- `LZ4F_compressFrame_usingCDict()`: one-shot compression to the LZ4Frame format, using materialized `CDict`.
- `LZ4F_compressBegin_usingCDict()`: initiate streaming compression to the LZ4Frame format, using materialized `CDict`.
New compression level 2
The new "Level 2" compression effectively fills the substantial gap between the standard "Fast Level 1" and the more intensive "High Compression Level 3." It provides a balanced option, optimizing performance and compression as evidenced in the benchmark results below (i7-9700k, linux):
file | level 1 | level 2 | level 3 |
---|---|---|---|
silesia.tar (speed) | 685 MB/s | 315 MB/s | 110 MB/s |
silesia.tar (ratio) | x2.101 | x2.378 | x2.606 |
Level 2 is ideal for applications requiring better compression than `lz4` level 1, without the speed trade-offs associated with HC level 3.
Miscellaneous
- The CLI now supports the environment variables `LZ4_CLEVEL` and `LZ4_NBWORKERS`, offering flexible control over its behavior in scenarios where direct commands are impractical, or when customized local defaults are necessary.
- The licensing for the CLI and test programs has been clarified to `GPL-2.0-or-later` to distinguish it from `GPL-2.0-only`, enhancing transparency. The `liblz4` library maintains its BSD 2-Clause license.
- Various less common platforms have been validated (`loongArch`, `risc-v`, `m68k`, `mips` and `sparc`), and are now continuously tested in CI, to ensure portability.
- Visual Studio solutions are now generated from a `cmake` recipe, in an effort to reduce manual maintenance of multiple Solutions.
One-liner updates
- cli : multithreading compression support: improves speed by X times threads allocated
- cli : overlap decompression with i/o, improving speed by >+60%
- cli : support environment variables `LZ4_CLEVEL` and `LZ4_NBWORKERS`
- cli : license of CLI more clearly labelled `GPL-2.0-or-later`
- cli : fix: refuse to compress directories
- cli : fix dictionary compression benchmark on multiple files
- cli : change: no more implicit `stdout` (except when input is `stdin`)
- lib : new level 2, offering mid-way performance (speed and compression)
- lib : Improved lz4frame compression speed for small data (up to +160% at 1KB)
- lib : Slightly faster (+5%) HC compression speed (levels 3-9), by @JunHe77
- lib : dictionary compression support now in stable status
- lib : lz4frame states can be safely reset and reused after a processing error (described by @QrczakMK)
- lib : `lz4file` API improvements, by @vsolontsov-volant and @t-mat
- lib : new experimental symbol `LZ4_compress_destSize_extState()`
- build: cmake minimum version raised to 3.5
- build: cmake improvements, by @foxeng, @Ohjurot, @LocalSpook, @teo-tsirpanis, @ur4t and @t-mat
- build: meson scripts are now hosted in the `build/` directory, by @eli-schwartz
- build: meson improvements, by @tristan957
- build: Visual Studio solutions generated by `cmake` via scripts
- port : support for loongArch, risc-v, m68k, mips and sparc architectures
- port : improved Visual Studio compatibility, by @t-mat
- port : freestanding support improvements, by @t-mat
Automated change log
- Cancel in-progress CI if a new commit workflow supplants it by @tristan957 in #1142
- allocation optimization for lz4frame compression by @Cyan4973 in #1158
- fixed a few remaining ubsan warnings in lz4hc by @Cyan4973 in #1160
- simplify getPosition by @Cyan4973 in #1161
- build: Support BUILD_SHARED=no by in #1162
- fix benchmark mode using Dictionary by @Cyan4973 in #1168
- remove usages of `base` pointer by @Cyan4973 in #1163
- fix rare ub by @Cyan4973 in #1169
- LZ4 HC match finder and parsers use direct offset values by @Cyan4973 in #1173
- very minor refactor of lz4.c by @Cyan4973 in #1174
- Update Meson build to 1.9.4 by @tristan957 in #1139
- Fixed const-ness of src data pointer in lz4file and install lz4file.h by @vsolontsov-volant in #1192
- Add copying lz4file.h to make install by @vsolontsov-volant in #1191
- Change the version of lib[x]gcc for clang-(11|12) -mx32 by @t-mat in #1197
- Remove PATH=$(PATH) prefix from all shell script invocation in tests/Makefile by @t-mat in #1196
- Add environment check for freestanding test : resolves #1186 by @t-mat in #1187
- Declare read_long_length_no_check() static by @x4m in #1188
- uncompressed-blocks: Allow uncompressed blocks for all modes by @alexmohr in #1178
- fixed usan32 tests by @Cyan4973 in #1175
- Meson updates by @tristan957 in #1184
- Fix typo found by codespell by @DimitriPapadopoulos in #1204
- Clean up generation of internal static library by @tristan957 in #1206
- build: move meson files from contrib, to go alongside other build systems by @eli-schwartz in #1207
- Improve LZ4F_decompress() docs by @embg in #1199
- Add 64-bit detection for LoongArch by @zhaixiaojuan in #1209
- refuse to compress directories by @Cyan4973 in #1212
- Fix #1232 : lz4 command line utility sub-project for Visual Studio 2022 missing by @t-mat in #1233
- Add security policy by @pnacht in #1238
- fix #1246 by @Cyan4973 in #1247
- fix: missing LZ4F_freeDecompressionContext by @t-mat in #1251
- CI: updates (Add gcc-13 and clang-15. Fix msvc2022-x86-release) by @t-mat in #1245
- Don't clobber default WINDRES in MinGW environments by @uckelman-sf in #1242
- Reduce usage of variable cpy on decompression by @Nicoshev in #1226
- lib/Makefile: Support building on legacy OS X by @sevan in #1220
- Set CMake minimum requirement to 3.5 by @haampie in #1228
- Adding XXH_NAMESPACE to CMake builds by @Ohjurot in #1258
- Apply pyupgrade suggestion to Python test scripts by @DimitriPapadopoulos in #1257
- Remove redundant error check by @Nicoshev in #1224
- Add packing support for msc by @Nicoshev in #1225
- Don't conflate the shared library name with the shared library filename by @uckelman-sf in #1244
- Move GNUInstallDirs include before it is referenced first by @laszlo-dobcsanyi in #1260
- lz4hc: increase count back search step by @JunHe77 in #1263
- Update code documentation about stableDst by @Cyan4973 in #1267
- Fix #1227 by @Cyan4973 in #1268
- fix x32 CI tests by @Cyan4973 in #1278
- fix examples by @Cyan4973 in #1277
- fix: issue #1269 by @t-mat in #1281
- cmake static lib test (ci.yml) by @t-mat in #1286
- Ignore Visual Studio Code files in `.gitignore` by @LocalSpook in #1288
- Make CMake version number parsing more robust by @LocalSpook in #1289
- Make Makefile version number parsing more robust by @LocalSpook in #1290
- Make Meson version number parsing more robust by @LocalSpook in #1291
- Introduce the `.clang-format` rule file by @LocalSpook in #1287
- Use `-Wpedantic` instead of `-pedantic` for consistency with other `-W*` options by @LocalSpook in #1294
- Add null pointer check before `FREEMEM()` by @LocalSpook in #1297
- added new qemu targets for CI (MIPS, M68K, RISC-V) by @Cyan4973 in #1299
- Enable basic support for riscv64 by @Hamlin-Li in #1298
- lz4: remove unnecessary check of ip by @JunHe77 in #1301
- decomp: refine read_variable_length codegen layout by @JunHe77 in #1312
- updated code documentation by @Cyan4973 in #1314
- Add LZ4_compress_fast_extState_destSize() API by @tristan957 in #1308
- Minor refactor by @Cyan4973 in #1322
- fix minor conversion warnings by @Cyan4973 in #1325
- ensure make install target doesn't create files by @Cyan4973 in #1326
- minor: lz4file API provides more accurate error codes by @Cyan4973 in #1327
- Make hashes identical between LE and BE platforms by @ltrk2 in #1253
- Makefile refactor by @Cyan4973 in #1328
- fix 1308 by @Cyan4973 in #1333
- Add multi-threading compression by @Cyan4973 in #1336
- Compile-time constants by @Cyan4973 in #1343
- change INSTALL_DIR into MAKE_DIR by @Cyan4973 in #1350
- added a lorem ipsum generator by @Cyan4973 in #1356
- Fix Python 3.6 string interpolation by @likema in #1353
- Datagen uses lorem ipsum generator by default by @Cyan4973 in #1357
- Fix seed 571 test by @Cyan4973 in #1361
- add sparc compilation test by @Cyan4973 in #1360
- LZ4 Level 2 by @Cyan4973 in #1363
- updated lorem ipsum generator by @Cyan4973 in #1369
- Added preprocessor checks for Clang on Windows by @Razakhel in #1370
- Add unified CMake target if building only a shared or static library, by @teo-tsirpanis in #1372
- Introduce Async I/O for decompression by @Cyan4973 in #1376
- fix #1374 by @Cyan4973 in #1380
- Suppress VS2022 warnings by @jonrumsey in #1383
- benchmark results are displayed to stdout by @Cyan4973 in #1390
- Prefer OR over ADD for splicing numbers from byte-addressed memory by @AtariDreams in #1404
- len should be unsigned by @AtariDreams in #1403
- Define mlen = MINMATCH at the start of the loop by @AtariDreams in #1405
- Update function comment by @Nicoshev in #1401
- CMake: Separate symlinks creation and installation by @ur4t in #1395
- [cmake] minor refactor of the symlink installation paragraph by @Cyan4973 in #1406
- [cmake] Always create lz4 target. by @teo-tsirpanis in #1413
- add status update when decompressing legacy frames by @Cyan4973 in #1426
- Async-IO for LZ4F decompression by @Cyan4973 in #1428
- Support Multithreading for Windows by @Cyan4973 in #1429
- Control nb threads via environment variable `LZ4_NBWORKERS` by @Cyan4973 in #1430
- fix cpuload measurements on Windows by @Cyan4973 in #1432
- minor threadpool refactor by @Cyan4973 in #1433
- automatically enable multithreading by default on Windows by @Cyan4973 in #1434
- add support for environment variable `LZ4_CLEVEL` by @Cyan4973 in #1435
- Linked Blocks compression (`-BD`) can employ multiple threads by @Cyan4973 in #1436
- update gpl license text to 2.0-or-later by @Cyan4973 in #1438
- removed implicit `stdout` by @Cyan4973 in #1442
- Generate Visual Studio solutions from `cmake` by @Cyan4973 in #1440
- Ensure `--list` output is sent to stdout by @Cyan4973 in #1446
- exit on invalid frame by @Cyan4973 in #1448
- promote dictionary API to stable by @Cyan4973 in #1443
New Contributors
- @foxeng made their first contribution in #1162
- @vsolontsov-volant made their first contribution in #1192
- @x4m made their first contribution in #1188
- @embg made their first contribution in #1199
- @zhaixiaojuan made their first contribution in #1209
- @IgorWiecz made their first contribution in #1235
- @pnacht made their first contribution in #1238
- @uckelman-sf made their first contribution in #1242
- @Nicoshev made their first contribution in #1226
- @sevan made their first contribution in #1220
- @haampie made their first contribution in #1228
- @Ohjurot made their first contribution in #1258
- @laszlo-dobcsanyi made their first contribution in #1260
- @JunHe77 made their first contribution in #1263
- @LocalSpook made their first contribution in #1288
- @Hamlin-Li made their first contribution in #1298
- @ltrk2 made their first contribution in #1253
- @likema made their first contribution in #1353
- @Razakhel made their first contribution in #1370
- @teo-tsirpanis made their first contribution in #1372
- @deining made their first contribution in #1375
- @RoboSchmied made their first contribution in #1384
- @AtariDreams made their first contribution in #1404
- @ur4t made their first contribution in #1395
Full Changelog: v1.9.4...v1.10.0
edit: the `lz4-1.10.0.tar.gz` artifact has been updated because the initial version was embedding some macos-specific stuff.
edit 2: the windows binary packages have been updated to fix a bug affecting time measurement when compressing extremely large files.