[erlang-questions] where it's the best way to store a very big term object shared between processes

Anders Nygren anders.nygren@REDACTED
Fri Oct 23 14:20:50 CEST 2015


Since it is number analysis you want, I think I should mention
https://github.com/nygge/number_analysis
Originally written by Klacke. It builds a trie in an ETS table.
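
For illustration only, here is a minimal sketch of longest-prefix lookup on
top of ETS. It is not taken from number_analysis and it does not build a trie:
it assumes one {Prefix, Rate} row per breakout with binary keys, and the module
name prefix_ets and its functions are hypothetical. The point is that each
ets:lookup copies only one small tuple instead of one very big term.

```erlang
-module(prefix_ets).
-export([new/1, longest_prefix/2]).

%% Hypothetical helper: build an ETS set holding one {Prefix, Rate} row
%% per breakout. Prefixes are assumed to be binaries.
new(Breakouts) ->
    Tab = ets:new(breakouts, [set, protected, {read_concurrency, true}]),
    true = ets:insert(Tab, Breakouts),
    Tab.

%% Try the whole number first, then drop one trailing byte at a time
%% until a stored prefix matches or nothing is left.
longest_prefix(Tab, Number) when is_binary(Number) ->
    longest_prefix(Tab, Number, byte_size(Number)).

longest_prefix(_Tab, _Number, 0) ->
    not_found;
longest_prefix(Tab, Number, Len) ->
    Prefix = binary:part(Number, 0, Len),
    case ets:lookup(Tab, Prefix) of
        [{Prefix, Rate}] -> {ok, Prefix, Rate};
        []               -> longest_prefix(Tab, Number, Len - 1)
    end.
```

With the rows {<<"a">>, 1}, {<<"aa">>, 2}, {<<"aaa">>, 3}, {<<"b">>, 4} and
{<<"bb">>, 5} (the rates are made up), prefix_ets:longest_prefix(Tab,
<<"aaawhatever">>) returns {ok, <<"aaa">>, 3}. A lookup costs at most one ETS
probe per byte of the number, so whether it beats a trie has to be measured.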

/Anders

On Thu, Oct 22, 2015 at 3:54 PM, Caragea Silviu <silviu.cpp@REDACTED>
wrote:

> Hello,
>
> @Michael I'm using btrie only because of btrie:find_prefix_longest.
>
> Basically this is the main functionality I need. As I already posted, if
> you have a btrie with the elements ["aa", "a", "b", "bb", "aaa"] and you
> call btrie:find_prefix_longest("aaawhatever"), it will return the value
> associated with the key "aaa".
>
> I need this for a large table of calling breakouts (prefixes and a rate per
> prefix), around 50k breakouts. I call
> btrie:find_prefix_longest(<<"phonenumber">>) and it returns the prefix
> and the rate I need to bill for that destination. The lookup itself seems
> OK: it takes 1-2.5 ms, and 95% of that time is spent in ets:lookup. As
> somebody already pointed out, this is because ets copies the stored term on
> every lookup. I will switch to gen_server state and benchmark again.
>
> Thanks everyone for the suggestions!
>
> On Thu, Oct 22, 2015 at 11:38 PM, Michael Truog <mjtruog@REDACTED> wrote:
>
>> On 10/22/2015 01:29 AM, Caragea Silviu wrote:
>>
>> Hello.
>>
>> In one of my projects I need to use a radix tree. I found a very nice
>> library:
>> https://github.com/okeuday/trie
>>
>> Lookup performance is great, but I have one problem.
>>
>> Basically my tree has around 100 000 elements, so building it is an
>> extremely expensive operation. For this reason I'm building it once, and all
>> processes that need to do lookups share the btrie object (created using
>> btrie:new/1).
>>
>> Here I see several options:
>>
>> 1. Use a gen_server and store the btrie object in its state or process
>> dictionary. I haven't tried this.
>> 2. Use an ETS table and store the trie object in a public table that all
>> processes can read and write.
>>
>> It is easier to scale and is more natural in Erlang if you pursue #1
>> (using the state, not the process dictionary).  The #2 path (including
>> mochiglobal) is typical in imperative programming (mutating global state).
>> With #1 you can manage the reliability of individual processes for
>> fault-tolerance concerns and you would probably start with a single locally
>> registered process name.  Then if there is too much contention for the
>> single process that has the btrie, you would switch to using a process
>> group, to share the load with replicated data.
>>
>> The btrie usage is probably slower than using the newer maps data
>> structure.  The trie repo was mainly created for string keys, not binary
>> keys, due to the memory access details in Erlang (i.e., it is easier to
>> have more efficient lookups with string keys, when using process heap data,
>> which includes being more efficient than maps in some cases).
>>
>> You could also store the key/value lookup as a single large binary that
>> you reference (in multiple processes, since large binaries are reference
>> counted) with something like https://github.com/knutin/bisect which may
>> work too.
>>
>> Best Regards,
>> Michael
>>
>>
>> Doing some benchmarks, I see that looking up the longest prefix
>> (btrie:find_prefix_longest) among around 100K elements takes around 2-5
>> ms, and 95% of the time is spent in ets:lookup.
>>
>> I think so much time is spent there because the term stored there is also
>> very big.
>>
>> Any other suggestions?
>>
>> Silviu
>>
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions
>
>