[erlang-questions] Needed: Great big ordered int set.

Richard Carlsson carlsson.richard@REDACTED
Tue Jul 9 12:02:08 CEST 2013


On 2013-07-09 11:32 , Jesper Louis Andersen wrote:
> Unless I am totally wrong here about what you want to get, why don't you
> use a digital representation? A 64 bit integer is a stream of bits. You
> can then share the common parts of the bit pattern in a tree-like
> structure. This in turn means that compression is "automatic" since the
> common bit patterns acts like the compression pattern. Add something
> like "everything below this node is present" and you can compress common
> parts of the tree.
>
> In general, the solution is RADIX-trees, or something along that
> pattern. They have, incidentally, also been used for quick
> Integer-key-based maps and so on.

Assuming a 64-bit Erlang VM, there's not much point in using a radix 
tree, since the smallest unit of representation will use at least 64 
bits anyway, even if it's just for storing the last few unique bits of 
each entry. (On a 32-bit Erlang VM, a 2- or 3- level radix tree would be 
good though.) So you'll end up with something proportional to N*Log(N) 
words (times the overhead for the data structure), unless you start 
packing the radix trees into binaries. That could work, but would 
probably be quite slow and rather messy and error prone.

    /Richard




More information about the erlang-questions mailing list