[erlang-questions] Packets deduplication

Felix Gallo felixgallo@REDACTED
Thu Feb 18 19:56:22 CET 2016


You'd use the Bloom filter first to determine whether it's possible that
you've seen the packet before, and then, if you get a hit, just do a lookup
in some other structure.  Depending on your hash functions and the sparsity,
it could be significantly faster.  Consider, for example, the sequence

1,2,3,4,2^1024,2^1024+1,2^1024+2...

Just about anything except a Bloom filter is going to have
possibly-interesting behavior when the 2^1024 shows up.  The Bloom filter
may give you some false positives and will generally be slower, but if
you're trying to maintain a packet train for, say, remote surgery, you
might accept the extra space and the slightly higher average latency in
exchange for less jitter.
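
To make the "filter first, then look it up" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration, not something from this thread: the names BloomFilter and Deduplicator, the filter size, the number of hashes, the choice of BLAKE2 with double hashing, and a plain set standing in for "some other structure".

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: answers "definitely not seen" or "maybe seen"."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        # Sizes are arbitrary illustration values, not tuned for any workload.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Hash the decimal text of the integer, then derive num_hashes bit
        # positions from one digest via double hashing.
        digest = hashlib.blake2b(str(key).encode(), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class Deduplicator:
    """Bloom filter in front of an exact structure (here, a plain set)."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.exact = set()       # stand-in for "some other structure"
        self.exact_lookups = 0   # counts how often the filter said "maybe"

    def is_duplicate(self, packet_id):
        # Only a "maybe" from the filter costs a lookup in the exact set;
        # a false positive is caught there and treated as a new packet.
        if self.bloom.might_contain(packet_id):
            self.exact_lookups += 1
            if packet_id in self.exact:
                return True
        self.bloom.add(packet_id)
        self.exact.add(packet_id)
        return False


dedup = Deduplicator()
for packet_id in [1, 2, 3, 4, 2**1024, 2**1024 + 1, 2**1024 + 2, 3]:
    print(dedup.is_duplicate(packet_id))
```

The exact set stays authoritative, so a false positive from the filter only costs one extra lookup and never a wrong answer. The sketch does not handle deleting old entries, which is the limitation raised in the quoted message below.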

On Thu, Feb 18, 2016 at 10:45 AM, Jesper Louis Andersen <
jesper.louis.andersen@REDACTED> wrote:

>
> On Thu, Feb 18, 2016 at 7:28 PM, <dmkolesnikov@REDACTED> wrote:
>
>> The right data structure is either a Bloom filter or a scalable Bloom filter.
>
>
> You also need to delete elements, which a cuckoo filter can do.
>
> The problem with Bloom filters in this situation is the probability of
> error, as far as I can see. They either say NOPE or MAYBE??? The other
> problem is that it probably won't beat integer operations on 60-bit words,
> since it requires a hash.
>
>
> --
> J.
>

