zstd (stdlib v7.0)

Zstandard compression interface.

This module provides an API for the Zstandard library (www.zstd.net). It is used to compress and decompress data and offers the same compression ratio as zlib but at a lower CPU cost.

Example:

> Data = ~"my data to be compressed".
> Compressed = zstd:compress(Data).
> zstd:decompress(Compressed).
[~"my data to be compressed"]

If you are compressing or decompressing possibly large amounts of data, it is also possible to do streamed compression/decompression.

Example:

> Compress = fun F(Ctx, D) ->
                     case file:read(D, 5) of
                         {ok, Data} ->
                             {continue, C} = zstd:stream(Ctx, Data),
                             [C|F(Ctx, D)];
                         eof ->
                             {done, C} = zstd:finish(Ctx, ""),
                             C
                     end
             end.
> {ok, Ctx} = zstd:context(compress).
> {ok, D} = file:open(File,[read,binary]).
> Compressed = iolist_to_binary(Compress(Ctx, D)).
<<40,181,47,253,0,88,89,0,0,108,111,114,101,109,32,105,112,115,117,109>>
> zstd:decompress(Compressed).
[~"lorem ipsum"]

All functions in this module can raise an error, where Reason describes what went wrong.

Typical Reasons:

  • badarg - Bad argument.
  • zstd_error - An error generated by the Zstandard library.
  • not_on_controlling_process - The context was used by a process that did not create it.
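
The reasons above can be trapped with an ordinary try expression. The following is a minimal sketch, assuming the errors are raised with class error (as the catch output in the set_parameter/3 example suggests). SafeDecompress is a hypothetical helper, not part of zstd.

```erlang
%% SafeDecompress is a hypothetical helper, not part of the zstd module.
%% Assumption: failures are raised with class `error`.
SafeDecompress = fun(Data) ->
    try zstd:decompress(Data) of
        Out -> {ok, iolist_to_binary(Out)}
    catch
        error:Reason -> {error, Reason}
    end
end.

%% A valid frame round-trips.
{ok, <<"abc">>} = SafeDecompress(zstd:compress("abc")).

%% Input that is not a Zstandard frame is expected to yield {error, Reason}.
SafeDecompress(<<"not a zstd frame">>).
```

Catching the generic error:Reason keeps the sketch independent of the exact shape of each reason.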

Summary

Types

  • compress_parameters() - Compression parameters.
  • compressionLevel() - The compression level.
  • context() - A compression or decompression context that can be used for streaming compression or decompression.
  • decompress_parameters() - Decompression parameters.
  • dict() - A compression or decompression dictionary.
  • strategy() - The compression strategy.

Functions

  • close(Ctx) - Close a context/0, releasing all referenced resources. After a context/0 is closed it is no longer possible to use it.
  • compress(Data, CtxOrOptions) - Compress Data using the given compress_parameters/0 or the context/0.
  • context(Mode, Options) - Create a compression or decompression context.
  • decompress(Data, CtxOrOptions) - Decompress Data using the given decompress_parameters/0 or the context/0.
  • dict/3 - Create a compression or decompression dictionary.
  • finish(Ctx, Data) - Finish compressing/decompressing data.
  • get_dict_id(DictOrFrame) - Get the dictionary ID of a dictionary or a frame.
  • get_frame_header(Frame) - Get header of a Zstandard compressed frame.
  • get_parameter(Ctx, Key) - Get a parameter from a context/0.
  • reset(Ctx) - Reset a context while streaming data, returning it to its original state but keeping all parameters set.
  • set_parameter(Ctx, Key, Value) - Set a parameter on a context/0.
  • stream(Ctx, Data) - Compress or decompress a stream of data. The last chunk of data should be passed to finish/2 to complete the compression/decompression.

Types

compress_parameters()

(not exported) (since OTP 28.0)
-type compress_parameters() ::
          #{dictionary => binary() | dict(),
            pledgedSrcSize => non_neg_integer(),
            compressionLevel => compressionLevel(),
            windowLog => non_neg_integer(),
            hashLog => non_neg_integer(),
            chainLog => non_neg_integer(),
            searchLog => non_neg_integer(),
            minMatch => non_neg_integer(),
            targetLength => non_neg_integer(),
            strategy => strategy(),
            targetCBlockSize => non_neg_integer(),
            enableLongDistanceMatching => boolean(),
            ldmHashLog => non_neg_integer(),
            ldmMinMatch => non_neg_integer(),
            ldmBucketSizeLog => non_neg_integer(),
            ldmHashRateLog => non_neg_integer(),
            contentSizeFlag => boolean(),
            checksumFlag => boolean(),
            dictIDFlag => boolean()}.

Compression parameters.

Zstandard has many parameters that can be tuned. Setting some parameters will fail when set to an incorrect value, while others will be silently adjusted to the closest valid value.

  • dictionary - Sets the compression dictionary for the context. The dictionary can be either a binary() representing the dictionary or a compression dict/0. When a dict/0 is attached to a context it will be kept alive until either the context is closed or it is replaced by another dictionary.

    To reset the context to not use any dictionary use the empty dictionary, that is <<>>.

  • pledgedSrcSize - When using stream/2 to do streaming compression, the size of the uncompressed data is not known when the header of the Zstandard frame is emitted. Setting this parameter on the context lets the compressor know the expected size of the data to compress. If the size is not correct when finish/2 is called, an exception is raised. compress/1 and compress/2 set this value automatically.

  • compressionLevel - Sets the compressionLevel/0.

  • windowLog | hashLog | chainLog | searchLog | minMatch | targetLength | targetCBlockSize - Set the corresponding parameter. See the Zstandard documentation for more details.

  • strategy - Sets the compression strategy/0.

  • enableLongDistanceMatching - Whether to enable Long Distance Matching or not. LDM is useful when compressing large datasets and is enabled by using a high compressionLevel.

  • ldmHashLog | ldmMinMatch | ldmBucketSizeLog | ldmHashRateLog - Set the corresponding LDM parameter. See the Zstandard documentation for more details.

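  • contentSizeFlag - Whether to write the size of the uncompressed content into the frame header when it is known.

  • checksumFlag - Whether to append a checksum of the content to the end of the frame.

  • dictIDFlag - Whether to write the dictionary ID into the frame header when a dictionary is used.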

The Zstandard documentation contains more details about each parameter.
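
As an illustration of how these parameters are passed, the sketch below compresses with an options map and reads the frame header back. The parameter values are arbitrary choices for the example, and the expectation that get_frame_header/1 reports checksumFlag as true is an assumption based on the header map shown for that function.

```erlang
%% Illustrative values only; tune them for real data.
Data = lists:duplicate(1000, <<"hello world ">>).
Compressed = zstd:compress(Data, #{compressionLevel => 19,
                                   checksumFlag => true}).

%% The frame header is expected to record that a checksum is present.
{ok, #{checksumFlag := true}} = zstd:get_frame_header(Compressed).

%% The data round-trips regardless of the parameters used to compress it.
true = iolist_to_binary(Data) =:= iolist_to_binary(zstd:decompress(Compressed)).
```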

compressionLevel()

(not exported) (since OTP 28.0)
-type compressionLevel() :: -22..22.

The compression level.

Higher values give a better compression ratio at the cost of speed. Negative values trade compression ratio for speed.

0 is a special value which represents the default compression level.
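
To see the trade-off on your own data, a sketch like the following compares compressed sizes across a few levels. The input and the chosen levels are assumptions made for the example; actual sizes depend on the data, so no output is shown.

```erlang
%% Repetitive sample input, chosen only for illustration.
Data = iolist_to_binary(lists:duplicate(1000, "The quick brown fox. ")).

%% Pair each level with the size of the frame it produces.
Sizes = [{Level, iolist_size(zstd:compress(Data, #{compressionLevel => Level}))}
         || Level <- [-5, 0, 3, 19]].
```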

context()

(since OTP 28.0)
-opaque context()

A compression or decompression context that can be used for streaming compression or decompression.

Only the process that created the context can use it.
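
Because of this restriction, a worker process should create its own context rather than receive one from its parent. A minimal sketch, assuming nothing beyond the functions documented here:

```erlang
Parent = self().
Ref = make_ref().

%% The worker creates, uses and closes its own context.
spawn_link(fun() ->
    {ok, Ctx} = zstd:context(compress),
    {done, Out} = zstd:finish(Ctx, <<"abc">>),
    ok = zstd:close(Ctx),
    Parent ! {Ref, iolist_to_binary(Out)}
end).

%% Only the resulting binary crosses the process boundary.
Compressed = receive {Ref, Bin} -> Bin after 5000 -> timeout end.
zstd:decompress(Compressed).
```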

decompress_parameters()

(not exported) (since OTP 28.0)
-type decompress_parameters() :: #{dictionary => binary() | dict(), windowLogMax => non_neg_integer()}.

Decompression parameters.

Zstandard has many parameters that can be tuned. Setting some parameters will fail when set to an incorrect value, while others will be silently adjusted to the closest valid value.

  • dictionary - Sets the decompression dictionary for the context. The dictionary can be either a binary() representing the dictionary or a decompression dict/0. When a dict/0 is attached to a context it will be kept alive until either the context is closed or it is replaced by another dictionary.

    To reset the context to not use any dictionary use the empty dictionary, that is <<>>.

  • windowLogMax - Set the corresponding parameter. See the Zstandard documentation for more details.

dict()

(since OTP 28.0)
-opaque dict()

A compression or decompression dictionary.

strategy()

(not exported) (since OTP 28.0)
-type strategy() ::
          default | fast | dfast | greedy | lazy | lazy2 | btlazy2 | btopt | btultra | btultra2.

The compression strategy.

The strategies are listed in order of increasing compression ratio and decreasing speed: fast is the fastest strategy but gives the worst compression ratio, while btultra2 is the slowest but gives the best compression ratio.

default is a special strategy representing the current default strategy.

See the Zstandard documentation for details on each strategy.

Functions

close(Ctx)

(since OTP 28.0)
-spec close(Ctx :: context()) -> ok.

Close a context/0, releasing all referenced resources. After a context/0 is closed it is no longer possible to use it.

A context/0 is automatically closed when it is garbage collected, so the only reason to call this function is to release the resources attached to the context before the next garbage collection.
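
When prompt release matters, one option is to close the context in an after clause. WithContext below is a hypothetical helper written for this sketch, not part of zstd.

```erlang
%% Hypothetical helper: run Fun with a fresh context and always close it.
WithContext = fun(Mode, Fun) ->
    {ok, Ctx} = zstd:context(Mode),
    try Fun(Ctx) after zstd:close(Ctx) end
end.

Compressed = WithContext(compress, fun(Ctx) ->
    iolist_to_binary(zstd:compress(<<"abc">>, Ctx))
end).
```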

compress(Data)

(since OTP 28.0)
-spec compress(iodata()) -> iodata().

Equivalent to compress(Data, #{}).

compress(Data, CtxOrOptions)

(since OTP 28.0)
-spec compress(Data :: iodata(), Options :: compress_parameters()) -> iodata();
              (Data :: iodata(), Ctx :: context()) -> iodata().

Compress Data using the given compress_parameters/0 or the context/0.

Example:

> zstd:compress("abc").
> zstd:compress("abc", #{ compressionLevel => 20 }).

context(Mode)

(since OTP 28.0)
-spec context(compress | decompress) -> {ok, context()}.

Equivalent to context(Mode, #{}).

context(Mode, Options)

(since OTP 28.0)
-spec context(compress, Options :: compress_parameters()) -> {ok, context()};
             (decompress, Options :: decompress_parameters()) -> {ok, context()}.

Create a compression or decompression context.

A context can be used to do streaming compression/decompression and allows re-using parameters for multiple compressions/decompressions.
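
As a sketch of parameter re-use, the same pair of contexts can serve several independent messages. The compression level and the messages are arbitrary values picked for the example.

```erlang
%% Parameters are set once, when the context is created.
{ok, CCtx} = zstd:context(compress, #{compressionLevel => 10}).
Messages = [<<"first">>, <<"second">>, <<"third">>].

%% Each call produces a separate frame using the context's parameters.
Frames = [iolist_to_binary(zstd:compress(M, CCtx)) || M <- Messages].

{ok, DCtx} = zstd:context(decompress).
Messages = [iolist_to_binary(zstd:decompress(F, DCtx)) || F <- Frames].
```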

decompress(Data)

(since OTP 28.0)
-spec decompress(iodata()) -> iodata().

Equivalent to decompress(Data, #{}).

decompress(Data, CtxOrOptions)

(since OTP 28.0)
-spec decompress(Data :: iodata(), Options :: decompress_parameters()) -> iodata();
                (Data :: iodata(), Ctx :: context()) -> iodata().

Decompress Data using the given decompress_parameters/0 or the context/0.

Example:

> Compressed = zstd:compress("abc").
> zstd:decompress(Compressed).
[~"abc"]

dict(Mode, Dict)

(since OTP 28.0)
-spec dict(Mode :: compress | decompress, Dict :: binary()) -> {ok, dict()}.

Equivalent to dict(Mode, Dict, #{}).

dict/3

(since OTP 28.0)
-spec dict(compress, Dict :: binary(), #{compressionLevel => compressionLevel()}) -> {ok, dict()};
          (decompress, Dict :: binary(), #{}) -> {ok, dict()}.

Create a compression or decompression dictionary.

A compression dictionary can be passed as the dictionary parameter in compress_parameters/0 to use it for compression. Dictionaries allow good compression ratios even for small amounts of data.

A decompression dictionary can be passed as the dictionary parameter in decompress_parameters/0 to use it for decompression. The same dictionary has to be used for compression and decompression. To verify that the same dictionary is used you can call get_dict_id/1 on the dictionary and on the compressed data, or just try to decompress, as decompression will raise an exception if an incorrect dictionary is given.

The compressionLevel set on a dictionary will override the compressionLevel set in the context/0.

Example:

> {ok, CDict} = zstd:dict(compress, Dict).
> Data = lists:duplicate(100, 1).
[1, 1, 1 | _]
> iolist_size(zstd:compress(Data)).
17
> iolist_size(zstd:compress(Data, #{ dictionary => CDict, dictIDFlag => false })).
16

As loading a dictionary can be a heavy operation, it is possible to create a single dict/0 and provide it to multiple contexts.

There is no API exposed in zstd to create the contents of a dictionary. Use the zstd command line tool instead.
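
The sharing and verification described above can be sketched as follows. DictRoundTrip is a hypothetical helper, and "my.dict" in the usage comment is a hypothetical path to a dictionary file produced by the zstd command line tool.

```erlang
%% Hypothetical helper: DictBin is the content of a dictionary file.
DictRoundTrip = fun(DictBin, Data) ->
    {ok, CDict} = zstd:dict(compress, DictBin),
    {ok, DDict} = zstd:dict(decompress, DictBin),
    %% One dict/0 is loaded once and shared by several contexts.
    {ok, C1} = zstd:context(compress, #{dictionary => CDict}),
    {ok, C2} = zstd:context(compress, #{dictionary => CDict}),
    Frames = [iolist_to_binary(zstd:compress(Data, C)) || C <- [C1, C2]],
    %% Each frame is expected to carry the same ID as the dictionary.
    Id = zstd:get_dict_id(CDict),
    true = lists:all(fun(F) -> zstd:get_dict_id(F) =:= Id end, Frames),
    [iolist_to_binary(zstd:decompress(F, #{dictionary => DDict})) || F <- Frames]
end.

%% Usage, assuming a dictionary file exists at the hypothetical path:
%% {ok, DictBin} = file:read_file("my.dict"),
%% DictRoundTrip(DictBin, <<"small message">>).
```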

finish(Ctx, Data)

(since OTP 28.0)
-spec finish(Ctx :: context(), Data :: iodata()) -> Result when Result :: {done, erlang:iovec()}.

Finish compressing/decompressing data.

This flushes all output buffers and resets the context/0 so that it can be used for compressing/decompressing again.

Example:

> {ok, DCtx} = zstd:context(decompress).
> {continue, D1} = zstd:stream(DCtx, <<40,181,47,253,32>>).
> {done, D2} = zstd:finish(DCtx, <<2,17,0,0,97,98>>).
> iolist_to_binary([D1,D2]).
<<"ab">>

get_dict_id(DictOrFrame)

(since OTP 28.0)
-spec get_dict_id(DictOrFrame :: dict() | binary()) -> non_neg_integer().

Get the dictionary ID of a dictionary or a frame.

The dictionary ID 0 represents no dictionary.

Example:

> {ok, CDict} = zstd:dict(compress, Dict).
> zstd:get_dict_id(CDict).
1850243626
> zstd:get_dict_id(zstd:compress("abc")).
0

get_frame_header(Frame)

(since OTP 28.0)
-spec get_frame_header(Frame :: iodata()) ->
                          {ok,
                           #{blockSizeMax => non_neg_integer(),
                             checksumFlag => boolean(),
                             dictID => non_neg_integer(),
                             frameContentSize => non_neg_integer(),
                             frameType => 'ZSTD_frame' | 'ZSTD_skippableFrame',
                             headerSize => non_neg_integer(),
                             windowSize => non_neg_integer()}} |
                          {error, unicode:chardata()}.

Get header of a Zstandard compressed frame.

A compressed Zstandard stream can consist of multiple frames. This function will read metadata from the first frame. This information can be useful when debugging corrupted Zstandard streams.

Example:

> Compressed = zstd:compress(~"abc").
> zstd:get_frame_header(Compressed).
{ok,#{frameContentSize => 3,windowSize => 3,blockSizeMax => 3,
      frameType => 'ZSTD_frame',headerSize => 6,
      dictID => 0, checksumFlag => false}}

get_parameter(Ctx, Key)

(since OTP 28.0)
-spec get_parameter(Ctx :: context(), Key :: term()) -> Value :: term().

Get a parameter from a context/0.

See compress_parameters/0 and decompress_parameters/0 for details on which parameters are available and what each parameter does.

Note that it is not possible to get the dictionary and pledgedSrcSize parameters using this API. Instead you can use get_dict_id/1 on the context/0 to get the id of the dictionary used. There is no way to get the pledgedSrcSize.

Returns the value of the parameter on success, raises an error on failure.

Example:

> {ok, CCtx} = zstd:context(compress).
{ok, _}
> zstd:get_parameter(CCtx, compressionLevel).
3
> zstd:set_parameter(CCtx, compressionLevel, 15).
ok
> zstd:get_parameter(CCtx, compressionLevel).
15

reset(Ctx)

(since OTP 28.0)
-spec reset(Ctx :: context()) -> ok.

Reset a context while streaming data, returning it to its original state but keeping all parameters set.

By resetting the state, the context can be re-used for other operations even if it is in the middle of a (de)compression stream.

Example:

> {ok, CCtx} = zstd:context(compress).
> zstd:stream(CCtx, "a").
{continue, _}
> zstd:reset(CCtx).
ok
> {done, Compressed} = zstd:finish(CCtx, "b").
> zstd:decompress(Compressed).
[~"b"]

set_parameter(Ctx, Key, Value)

(since OTP 28.0)
-spec set_parameter(Ctx :: context(), Key :: term(), Value :: term()) -> ok.

Set a parameter on a context/0.

See compress_parameters/0 and decompress_parameters/0 for details on which parameters are available and what each parameter does.

Returns ok on success, raises an error on failure.

Example:

> {ok, CCtx} = zstd:context(compress).
{ok, _}
> ok = zstd:set_parameter(CCtx, compressionLevel, 15).
ok
> zstd:stream(CCtx, "abc").
{continue, _}
> catch zstd:set_parameter(CCtx, dictionary, "abc").
{'EXIT', {{zstd_error, <<"Operation not authorized at current processing stage">>}, _}}

stream(Ctx, Data)

(since OTP 28.0)
-spec stream(Ctx :: context(), Data :: iodata()) -> Result
                when
                    Result ::
                        {continue, Remainder :: erlang:iovec(), Output :: binary()} |
                        {continue, Output :: binary()}.

Compress or decompress a stream of data. The last chunk of data should be passed to finish/2 to complete the compression/decompression.

Example:

> {ok, CCtx} = zstd:context(compress).
> {continue, C1} = zstd:stream(CCtx, ~"a").
> {done, C2} = zstd:finish(CCtx, ~"b").
> Compressed = iolist_to_binary([C1, C2]).
<<40,181,47,253,0,88,17,0,0,97,98>>
> zstd:decompress(Compressed).
[<<"ab">>]
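
For more than two chunks the same pattern generalizes: every chunk except the last goes through stream/2 and the last one goes to finish/2. StreamAll is a hypothetical helper for this sketch. Like the example above, it assumes stream/2 returns the two-element {continue, Output} form and does not handle the variant that carries a Remainder.

```erlang
%% Hypothetical helper: Chunks must be a non-empty list of iodata.
StreamAll = fun(Ctx, Chunks) ->
    {Init, [Last]} = lists:split(length(Chunks) - 1, Chunks),
    Outs = lists:map(fun(Chunk) ->
                         {continue, Out} = zstd:stream(Ctx, Chunk),
                         Out
                     end, Init),
    %% The final chunk completes the frame and resets the context.
    {done, Tail} = zstd:finish(Ctx, Last),
    iolist_to_binary([Outs, Tail])
end.

{ok, CCtx} = zstd:context(compress).
Compressed = StreamAll(CCtx, [<<"lorem ">>, <<"ipsum ">>, <<"dolor">>]).

%% The same helper drives a decompression context over a split frame.
{ok, DCtx} = zstd:context(decompress).
<<Head:5/binary, Rest/binary>> = Compressed.
<<"lorem ipsum dolor">> = StreamAll(DCtx, [Head, Rest]).
```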