[erlang-questions] sortable serialization format

Ulf Wiger <>
Sun Oct 25 23:26:12 CET 2009


Tony Finch wrote:
> 
> How about using "consistent overhead byte stuffing" to encode binaries
> more compactly? http://www.stuartcheshire.org/papers/COBSforToN.pdf
> The Erlang reference manual doesn't say how binaries are ordered.

Interesting, but wouldn't that break the sort order?

Binaries and bit strings are sorted as left-aligned bit arrays.
You are right, it isn't documented. It took me a while to wrap
my brain around it, even though it's perfectly logical:

1> lists:sort([<<2#111:3>>, <<2#1111111:7>>,
                <<2#110111111:9>>, <<1:1>>]).
[<<1:1>>,<<223,1:1>>,<<7:3>>,<<127:7>>]

Looking back, I don't remember how I thought I would work...

...oh, but my email archive has better memory than I:

5> lists:sort([<<1:1>>,<<1:2>>,<<1:3>>,<<1:4>>,
                <<1:5>>,<<1:6>>,<<1:7>>,<<1:8>>,
                <<2:2>>,<<2:3>>,<<2:4>>,<<1>>,
                <<16>>, <<128>>, <<255>>]).
[<<1>>,
  <<1>>,
  <<1:7>>,
  <<1:6>>,
  <<1:5>>,
  <<1:4>>,
  <<16>>,
  <<1:3>>,
  <<2:4>>,
  <<1:2>>,
  <<2:3>>,
  <<1:1>>,
  <<2:2>>,
  <<128>>,
  <<"ÿ">>]

A certain professor used the academic term "Gobsmacking!" to
describe this, but the fallacy is that you fall into thinking
that the sort order above should resemble the order of the
integer values that we have encoded with different lengths.
It makes sense only if you view them as bit streams:

6> lists:sort([{sext:pp(B),B} || B <-
       [<<1:1>>,<<1:2>>,<<1:3>>,<<1:4>>,
        <<1:5>>,<<1:6>>,<<1:7>>,<<1:8>>,
        <<2:2>>,<<2:3>>,<<2:4>>,<<1>>,
        <<16>>, <<128>>, <<255>>]]).
[{"00000001",<<1>>},
  {"00000001",<<1>>},
  {"0000001",<<1:7>>},
  {"000001",<<1:6>>},
  {"00001",<<1:5>>},
  {"0001",<<1:4>>},
  {"00010000",<<16>>},
  {"001",<<1:3>>},
  {"0010",<<2:4>>},
  {"01",<<1:2>>},
  {"010",<<2:3>>},
  {"1",<<1:1>>},
  {"10",<<2:2>>},
  {"10000000",<<128>>},
  {"11111111",<<"ÿ">>}]


BR,
Ulf W
-- 
Ulf Wiger
CTO, Erlang Training & Consulting Ltd
http://www.erlang-consulting.com



More information about the erlang-questions mailing list