sext library - new encoding
Ulf Wiger
ulf.wiger@REDACTED
Mon Feb 21 20:01:56 CET 2011
I added a variant of base32 encoding to the sext sortable serialization library.
http://github.com/esl/sext
The reason was to have an encoding that can be used in file names without great difficulty.
Example:
Eshell V5.8.1 (abort with ^G)
1> sext:encode_sb32(dict:new()).
<<"200000091IP5KR3N8040K0000000K00000G0K00000G0K0000080K00002G0K00001G100000081200H008G04802401200H008G04802401200H008G"...>>
2> sext:decode_sb32(v(1)).
{dict,0,16,16,8,80,48,
      {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
      {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}}
Obviously, the sorting properties are preserved. To achieve this, I had to change the alphabet a bit, so it is not *actually* base32.
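
To illustrate why the alphabet matters, here is a minimal, self-contained sketch. It is not sext's actual code, and the alphabet sext uses is not shown here. The module name sb32_sketch and both alphabets are assumptions chosen for the demonstration: one is the standard RFC 4648 base32 alphabet, the other is the RFC 4648 "base32hex" alphabet, whose symbols are in ascending ASCII order. The sketch zero-pads the last 5-bit group and emits no pad symbols.

```erlang
-module(sb32_sketch).
-export([encode/2, preserves_order/2, demo/0]).

%% Standard base32: letters (values 0..25) come before digits (26..31),
%% but in ASCII the digits sort before the letters.
-define(STD, <<"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567">>).
%% base32hex: symbols are in ascending ASCII order (stand-in alphabet).
-define(HEX, <<"0123456789ABCDEFGHIJKLMNOPQRSTUV">>).

%% Encode a binary 5 bits at a time, zero-padding the final group.
encode(Alphabet, Bin) when is_binary(Bin) ->
    Pad = (5 - bit_size(Bin) rem 5) rem 5,
    Padded = <<Bin/bitstring, 0:Pad>>,
    << <<(binary:at(Alphabet, V))>> || <<V:5>> <= Padded >>.

%% True if sorting the encoded keys gives the same order as
%% sorting the raw keys.
preserves_order(Alphabet, Keys) ->
    Sorted = lists:sort(Keys),
    lists:sort([encode(Alphabet, K) || K <- Keys])
        =:= [encode(Alphabet, K) || K <- Sorted].

demo() ->
    Keys = [<<208>>, <<"a">>, <<200>>, <<"ab">>, <<>>],
    {preserves_order(?HEX, Keys), preserves_order(?STD, Keys)}.
```

sb32_sketch:demo() returns {true,false}: with the standard alphabet, <<200>> encodes to "ZA" and <<208>> to "2A", so the encoded order is the reverse of the byte order.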
There is some blowup:
With X = dict:new():
  term_to_binary(X):    60 bytes
  sext:encode(X):      121 bytes
  sext:encode_sb32(X): 200 bytes
OTOH, if used for encoding "key"-style terms, sizes should still be manageable.
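
A rough way to check this for a given key is sketched below. It assumes sext is on the code path and uses only the calls shown in this message (sext:encode/1 and sext:encode_sb32/1). The module name sext_size_demo and the helper key_path/2 are hypothetical, made up for illustration.

```erlang
-module(sext_size_demo).
-export([sizes/1, key_path/2]).

%% Byte sizes of one term under the three encodings compared above.
sizes(Term) ->
    [{term_to_binary,   byte_size(term_to_binary(Term))},
     {sext_encode,      byte_size(sext:encode(Term))},
     {sext_encode_sb32, byte_size(sext:encode_sb32(Term))}].

%% Hypothetical helper: build a path whose last component is the
%% sb32-encoded key. Nothing is written to disk here.
key_path(Dir, Key) ->
    filename:join(Dir, binary_to_list(sext:encode_sb32(Key))).
```

For example, sext_size_demo:sizes({user, 42}) lists the three sizes for a small tuple key. Many file systems cap a single file name at around 255 bytes, so key_path/2 is only sensible for keys whose sb32 form stays below that.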
BR,
Ulf W
PS Why not base64 instead? Because I ran into some trouble selecting good edge and pad symbols while still being file system friendly. I don't see why it couldn't be added later, if it's deemed important and I have more time. :)
Ulf Wiger, CTO, Erlang Solutions, Ltd.
http://erlang-solutions.com