A zlib NIF library for Erlang, optimised for streaming.
ezlib can be used with different zlib forks:
- baseline - will statically link with the original zlib
- cloudflare - will statically link with the cloudflare zlib fork
- intel - will statically link with the intel zlib fork
- zlibng - will statically link with the zlibng fork
By default the baseline is used. To change the zlib fork, set the ZLIB_FORK environment variable. For example, to use zlibng:
ZLIB_FORK=zlibng rebar compile
According to the zlib FAQ, thread safety can be achieved by meeting several conditions, the most important being that any given zlib stream is operated on from only one thread at a time. In Erlang this means that you must operate on an ezlib session from the same process that created it. If you access the same stream from multiple processes, you will get an error when calling process/2.
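One way to respect the single-process rule is to let a dedicated process create and own the session, and have every other process send it the data. The following is a minimal sketch under that assumption, not part of ezlib: the module name ezlib_owner and its functions are hypothetical, and only ezlib:new/1 and ezlib:process/2 come from the library. The mode (for example ?Z_DEFLATE) is passed in by the caller.

```erlang
-module(ezlib_owner).
-export([start/1, process/2, stop/1]).

%% Hypothetical helper: spawns a process that creates and exclusively
%% owns one ezlib session. Mode is whatever ezlib:new/1 accepts.
start(Mode) ->
    Caller = self(),
    {Pid, MRef} = spawn_monitor(fun() -> init(Caller, Mode) end),
    receive
        {Pid, started} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Pid};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% The owner died before the session was ready.
            {error, Reason}
    end.

init(Caller, Mode) ->
    %% The session is created inside the owner, so every later
    %% ezlib:process/2 call happens in the creating process.
    {ok, Ref} = ezlib:new(Mode),
    Caller ! {self(), started},
    loop(Ref).

loop(Ref) ->
    receive
        {process, From, Tag, Data} ->
            From ! {Tag, ezlib:process(Ref, Data)},
            loop(Ref);
        stop ->
            ok
    end.

%% Can be called from any process; the work is done by the owner.
process(Pid, Data) ->
    Tag = make_ref(),
    Pid ! {process, self(), Tag, Data},
    receive
        {Tag, Result} -> Result
    after 5000 ->
        {error, timeout}
    end.

stop(Pid) ->
    Pid ! stop,
    ok.
```

Other processes then call ezlib_owner:process(Pid, Data) instead of touching the session reference directly. A gen_server would be the more conventional choice in a production system; a plain process is used here only to keep the sketch short.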
You can access the changelog from here
StringBin = <<"this is a string compressed with zlib nif library">>,
{ok, DeflateRef} = ezlib:new(?Z_DEFLATE),
{ok, InflateRef} = ezlib:new(?Z_INFLATE),
CompressedBin = ezlib:process(DeflateRef, StringBin),
DecompressedBin = ezlib:process(InflateRef, CompressedBin),
DecompressedBin = StringBin
ezlib:new accepts a second parameter where you can specify the following options:
- compression_level: Compression level from 0 to 9, default 6. 0 means no compression, 9 maximum compression.
- window_bits: The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15.
- memory_level: Specifies how much memory should be allocated for the internal compression state. Values from 1 to 9, default 8.
- compression_strategy: The compression strategy that should be used. One of the following values (Z_DEFAULT_STRATEGY is the default):
-define(Z_FILTERED, 1).
-define(Z_HUFFMAN_ONLY, 2).
-define(Z_RLE, 3).
-define(Z_FIXED, 4).
-define(Z_DEFAULT_STRATEGY, 0).
Example:
Options = [
{compression_level, 6},
{window_bits, 15},
{memory_level, 8}
],
{ok, DeflateRef} = ezlib:new(?Z_DEFLATE, Options)
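The documented ranges can be checked before the options are handed to ezlib:new/2. This is a small sketch of such a check; the module ezlib_opts and the function validate/1 are hypothetical and not part of ezlib, and the ranges are taken from the option descriptions above (strategy values 0 to 4 correspond to the Z_* defines).

```erlang
-module(ezlib_opts).
-export([validate/1]).

%% Hypothetical helper: returns ok when every option is known and in
%% its documented range, otherwise lists the offending options.
validate(Options) when is_list(Options) ->
    case [Opt || Opt <- Options, not valid(Opt)] of
        [] -> ok;
        Bad -> {error, {invalid_options, Bad}}
    end.

%% Ranges as documented: level 0..9, window bits 8..15,
%% memory level 1..9, strategy 0..4.
valid({compression_level, L}) ->
    is_integer(L) andalso L >= 0 andalso L =< 9;
valid({window_bits, W}) ->
    is_integer(W) andalso W >= 8 andalso W =< 15;
valid({memory_level, M}) ->
    is_integer(M) andalso M >= 1 andalso M =< 9;
valid({compression_strategy, S}) ->
    is_integer(S) andalso S >= 0 andalso S =< 4;
valid(_) ->
    false.
```

For the option list in the example, ezlib_opts:validate(Options) returns ok, while a list containing {window_bits, 16} returns {error, {invalid_options, [{window_bits, 16}]}}.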
The zlib memory footprint can be calculated as:
- deflate memory usage in bytes = (1 << (window_bits + 2)) + (1 << (memory_level + 9))
- inflate memory usage in bytes = (1 << window_bits) + 1440*2*sizeof(int)
The default values for window_bits and memory_level are 15 and 8, so with these values deflate requires 256 KB and inflate about 44 KB.
In addition to this memory, ezlib allocates a 1 KB buffer to avoid repeated memory reallocation. This buffer resizes automatically and cannot grow beyond 8 KB.
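The two formulas translate directly into Erlang, which makes it easy to estimate the footprint for other settings. This is an illustrative sketch: the module zlib_mem is hypothetical, and sizeof(int) is assumed to be 4 bytes unless passed explicitly. The ezlib buffer of 1 KB to 8 KB is not included.

```erlang
-module(zlib_mem).
-export([deflate_bytes/2, inflate_bytes/1, inflate_bytes/2]).

%% deflate: (1 << (window_bits + 2)) + (1 << (memory_level + 9))
deflate_bytes(WindowBits, MemoryLevel) ->
    (1 bsl (WindowBits + 2)) + (1 bsl (MemoryLevel + 9)).

%% inflate: (1 << window_bits) + 1440 * 2 * sizeof(int)
%% Assumption: sizeof(int) is 4 bytes on the target platform.
inflate_bytes(WindowBits) ->
    inflate_bytes(WindowBits, 4).

inflate_bytes(WindowBits, SizeofInt) ->
    (1 bsl WindowBits) + 1440 * 2 * SizeofInt.
```

With the defaults, zlib_mem:deflate_bytes(15, 8) gives 262144 bytes (256 KB) and zlib_mem:inflate_bytes(15) gives 44288 bytes, which is the figure of about 44 KB quoted above. With window_bits 10 and memory_level 1, deflate needs only 5120 bytes.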
To see statistics on the compression ratio, use ezlib:metrics/1. The output looks like:
{ok,[{bytes_in,42227900},
{bytes_out,29096830},
{compression_ratio,31.09572107540276}]}
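The sample figures are consistent with compression_ratio being the percentage of input bytes saved, that is (1 - bytes_out / bytes_in) * 100. This reading is inferred from the numbers above rather than taken from the ezlib source, and the module name below is hypothetical.

```erlang
-module(ratio_check).
-export([compression_ratio/2]).

%% Percentage of input bytes saved by compression.
%% Assumption: this matches how ezlib derives compression_ratio.
compression_ratio(BytesIn, BytesOut) when BytesIn > 0 ->
    (1 - BytesOut / BytesIn) * 100.
```

Calling ratio_check:compression_ratio(42227900, 29096830) gives roughly 31.0957, in line with the sample output.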
The benchmark takes the library to test (ezlib or erlang), a file path, the concurrency level, how many times to deflate the file line by line, and the compression level, window size and memory level:
benchmark:run(ezlib,"file path here", 1, 1, 6, 10, 1).
or benchmark:run(erlang,"file path here", 1, 1, 6, 10, 1).
Benchmark results for compressing a text file line by line 20 times on a MacBook Pro:
benchmark:run(ezlib, FilePath, ConcurrencyLevel, 20, 6, 10, 1).
Zlib settings:
- Compression level: 6
- Window size: 10
- Memory level: 1
Results: (for different concurrency levels)
zlib library | C1 (MB/s) | C5 (MB/s) | C10 (MB/s) | Compression Ratio (%) |
---|---|---|---|---|
erlang 19.2 zlib | 22.22 | 87.90 | 91.46 | N/A |
ezlib baseline | 25.33 | 105.37 | 113.68 | 73.10 |
ezlib cloudflare | 28.98 | 122.36 | 133.12 | 72.61 |
ezlib intel | 29.43 | 120.98 | 131.13 | 75.89 |
ezlib zlibng | 31.24 | 129.36 | 139.50 | 75.86 |
Other useful resources on the optimisations made by Intel, Cloudflare and zlibng in their forks, as well as other benchmarks, can be found here and here.