[erlang-questions] IP search optimization

Adam Cammack adam@REDACTED
Fri Nov 20 21:18:04 CET 2015


On Fri, Nov 20, 2015 at 06:01:50PM +0100, Sid Muller wrote:
> > Looks more like a task for radix trees...
> 
> Hmm, good point although with ets if my owner process gets killed I
> inherit the ets and no data is lost. Not sure how that cold be done
> with a radix tree.

The ETS ordered_set table type is nearly as good for this purpose.

> > The best you can achieve with ets is:
> > on insert: make sure that inserted networks are normalized (do not have
> > any bits set after prefix length).
> > on lookup:
> > 1. start with masklen of 32
> > 2. normalize address to current masklen
> > 3. lookup for address/masklen record, return if found
> > 4. if current masklen is zero - return 'not_found'
> > 5. decrease masklen and repeat from step two. 
> 
> There are a few faster methods:
> 1. Have a pre-scanned list of /CIDR [21, 20...], there are usually
> about 10. But this still traverses the table 10 times in worst case.
> 
> 2. Traverse the table with ets:lookup(ets:next(), this has the
> advantage over #1 as it's much faster in most cases.
> 
> 3. ets:select(MatchSpec), this is probably the fastest solution
> because it should traverse the table only once. But right now I'm
> having a bit of a problem with my select statement... see other email
> [bitstring select]

These are all slower. ets:match/{2,3}, ets:select/{2,3}, and repeatedly
calling ets:next/2 should be avoided where possible on large tables
because of their immense running time.

CIDR blocks can be expressed as a tuple of two integers: the 32-bit
padded mask and the number of significant bits. We can take advantage of
Erlang's term ordering[1] and use the ordered_set table type with the
block as the key {Mask, Length}. In an ordered_set table, ets:prev(Tab,
{IP, 32}) will find the block that might contain that address[2]. Simple
bit arithmetic will tell you if the address is in the block or not. This
is about an order of magnitude faster on my box than doing ets:lookup/2
for increasingly less precise masks.

Your issue with matching the bit strings in a match spec can be resolved
by storing the bit strings as above and then using the bit arithmetic
guard expressions (select/1 in the below gist), although I do not
recommend this.

When in doubt (and especially when certain), measure!

https://gist.github.com/acammack/f631cf5ca978775f10f7

For 100 lookups on a 1,000,000 element table, Erlang/OTP 18.1:

ets:prev/2 (ordered): 9.40000e-5
ets:select/3 (ordered): 1.45619e+1
ets:lookup/2 (ordered): 1.72300e-3
ets:lookup/2 (unordered): 7.94000e-4

[1] Tuples of the same length are ordered by their elements, starting
    with the first.
[2] If individual addresses are stored in the same table, you will need
    to do an extra ets:lookup/2 for the address. If blocks can overlap,
    you will need to call ets:prev/2 until you get a block that does not
    match the address.

Happy coding!

--
Adam Cammack



More information about the erlang-questions mailing list