[erlang-questions] IP search optimization

Sid Muller sid5@REDACTED
Sat Nov 21 20:55:58 CET 2015


Adam,

that ets:lookup() via ets:prev() is actually brilliant!!!

Thank you!

> Sent: Friday, November 20, 2015 at 12:18 PM
> From: "Adam Cammack" <adam@REDACTED>
> To: "Sid Muller" <sid5@REDACTED>
> Cc: erlang-questions <erlang-questions@REDACTED>
> Subject: Re: [erlang-questions] IP search optimization
>
> On Fri, Nov 20, 2015 at 06:01:50PM +0100, Sid Muller wrote:
> > > Looks more like a task for radix trees...
> > 
> > Hmm, good point although with ets if my owner process gets killed I
> > inherit the ets and no data is lost. Not sure how that cold be done
> > with a radix tree.
> 
> The ETS ordered_set table type is nearly as good for this purpose.
> 
> > > The best you can achieve with ets is:
> > > on insert: make sure that inserted networks are normalized (do not have
> > > any bits set after prefix length).
> > > on lookup:
> > > 1. start with masklen of 32
> > > 2. normalize address to current masklen
> > > 3. lookup for address/masklen record, return if found
> > > 4. if current masklen is zero - return 'not_found'
> > > 5. decrease masklen and repeat from step two. 
> > 
> > There are a few faster methods:
> > 1. Have a pre-scanned list of /CIDR [21, 20...], there are usually
> > about 10. But this still traverses the table 10 times in worst case.
> > 
> > 2. Traverse the table with ets:lookup(ets:next(), this has the
> > advantage over #1 as it's much faster in most cases.
> > 
> > 3. ets:select(MatchSpec), this is probably the fastest solution
> > because it should traverse the table only once. But right now I'm
> > having a bit of a problem with my select statement... see other email
> > [bitstring select]
> 
> These are all slower. ets:match/{2,3}, ets:select/{2,3}, and repeatedly
> calling ets:next/2 should be avoided where possible on large tables
> because of their immense running time.
> 
> CIDR blocks can be expressed as a tuple of two integers: the 32-bit
> padded mask and the number of significant bits. We can take advantage of
> Erlang's term ordering[1] and use the ordered_set table type with the
> block as the key {Mask, Length}. In an ordered_set table, ets:prev(Tab,
> {IP, 32}) will find the block that might contain that address[2]. Simple
> bit arithmetic will tell you if the address is in the block or not. This
> is about an order of magnitude faster on my box than doing ets:lookup/2
> for increasingly less precise masks.
> 
> Your issue with matching the bit strings in a match spec can be resolved
> by storing the bit strings as above and then using the bit arithmetic
> guard expressions (select/1 in the below gist), although I do not
> recommend this.
> 
> When in doubt (and especially when certain), measure!
> 
> https://gist.github.com/acammack/f631cf5ca978775f10f7
> 
> For 100 lookups on a 1,000,000 element table, Erlang/OTP 18.1:
> 
> ets:prev/2 (ordered): 9.40000e-5
> ets:select/3 (ordered): 1.45619e+1
> ets:lookup/2 (ordered): 1.72300e-3
> ets:lookup/2 (unordered): 7.94000e-4
> 
> [1] Tuples of the same length are ordered by their elements, starting
>     with the first.
> [2] If individual addresses are stored in the same table, you will need
>     to do an extra ets:lookup/2 for the address. If blocks can overlap,
>     you will need to call ets:prev/2 until you get a block that does not
>     match the address.
> 
> Happy coding!
> 
> --
> Adam Cammack
> 



More information about the erlang-questions mailing list