[erlang-questions] Needed: Great big ordered int set.

Heinz Nikolaus Gies heinz@REDACTED
Tue Jul 9 13:06:35 CEST 2013


If you know that data often appear in hunks you could so something like:

[{start,end} | lone]

so worst case you literally get a straight list, best case you can get a good compression by getting a lot of bunches.

Just a thought ;) not sure if it'd work for you - I've thought of that to represent a list of IPs where I would likely get a good compression out of that.

Cheers,
Heinz


On Jul 9, 2013, at 12:37, Knut Nesheim <knutin@REDACTED> wrote:

> 
> 
> 
> 
> On 09.07.2013, at 12:02, Richard Carlsson <carlsson.richard@REDACTED> wrote:
> 
>> On 2013-07-09 11:32 , Jesper Louis Andersen wrote:
>>> Unless I am totally wrong here about what you want to get, why don't you
>>> use a digital representation? A 64 bit integer is a stream of bits. You
>>> can then share the common parts of the bit pattern in a tree-like
>>> structure. This in turn means that compression is "automatic" since the
>>> common bit patterns acts like the compression pattern. Add something
>>> like "everything below this node is present" and you can compress common
>>> parts of the tree.
>>> 
>>> In general, the solution is RADIX-trees, or something along that
>>> pattern. They have, incidentally, also been used for quick
>>> Integer-key-based maps and so on.
>> 
>> Assuming a 64-bit Erlang VM, there's not much point in using a radix tree, since the smallest unit of representation will use at least 64 bits anyway
> 
> Yes. Being a bit naive, I implemented such a tree structure in Erlang. The memory overhead is way too big and the performance quite low. I even tried the approach of using a binary for storage, available at https://github.com/knutin/btrie
> 
> Judy arrays and especially the C implementation from HP has all the tricks you can think of. As Richard points out, if the word size is 64 bits, just the overhead of the pointers in such a structure will trump the size of the data.
> 
> Regards
> Knut
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions




More information about the erlang-questions mailing list