[erlang-questions] A bug? - bloom filters - and loadable BIFs
tsuraan
tsuraan@REDACTED
Mon May 11 18:06:16 CEST 2009
> something like the above. But I would add one more twist. Use an array
> not of binaries but of integers. I have once implemented a data type for
> representing sets with bitmaps. (There I used ets instead of array, but
> that is orthogonal.) First I used binaries but later I switched to
> integers, exploiting the fact that erlang gives us arbitrary size
> integers. It has some advantages:
I have a bitset implementation at
http://github.com/tsuraan/bitset/tree/master that uses tuples of
smallints (ints with only the lower 27 bits in use). It's growable,
and seems fast and correct. If anybody wants to take a look at it,
feel free to do so.
The included bstest.erl shows various useful things that the library
is capable of, and also generates random data using the bitset and
gb_sets to ensure that the two give the same results. For speed, the
test is somewhat stupid because it builds the set starting with bit 0
and going to bit 1,000,000, which is the most optimistic possible use
case, but it can do that in under a second on my machine. For actual
use cases it does perform very well, especially w.r.t boolean
operations like unions and intersections.
More information about the erlang-questions
mailing list