[erlang-questions] dealing with large dicts

Paul Fisher pfisher@REDACTED
Thu Sep 4 20:43:35 CEST 2008


Jacob Perkins wrote:
> Hey all, I've been having an issue with erlang crashing due to an 
> out-of-memory error. I'm pretty sure the reason is that I'm looking up, 
> updating, then writing back a fairly large dict in a mnesia transaction, 
> and doing that many times for many different dicts. Each dict may 
> contain thousands of 30 character string keys along with float values. 
> My rough calculations say that each dict may be on the order of a couple 
> megabytes, maybe approaching 10M given the indexing overhead of the dict 
> structure. And since all the value copying happens within a transaction, 
> there's a lot of memory being used. Here's the basic transaction 
> function I'm doing:
> 
> T = fun(Term, Key) ->
>          [Dict1] = mnesia:read(Table, Term),
>          Dict2 = dict:store(Key, Val, Dict1),
>          ok = write_val(Table, Term, Dict2)
>      end.
> 
> Any thoughts, alternative approaches?

This is not always the right trade-off because of the extra CPU overhead, 
but you can reduce some of the memory overhead and copying by storing the 
dict as a compressed binary, with something like this:

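%% Sketch only: Table, Val and write_val/3 come from your code, and CDict1
%% stands for the binary held in the record that mnesia:read/2 returns.
T = fun(Term, Key) ->
         [CDict1] = mnesia:read(Table, Term),
         Dict1 = binary_to_term(CDict1),               % unpack stored binary
         Dict2 = dict:store(Key, Val, Dict1),
         CDict2 = term_to_binary(Dict2, [compressed]), % repack, compressed
         ok = write_val(Table, Term, CDict2)
     end.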
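
For reference, here is a fuller sketch of the same idea as a self-contained module. The module name cdict_store, the record #cdict{} and the functions init/0, store/3 and fetch/1 are all made up for illustration, and the table is a plain RAM-only table, so treat this as one possible shape under those assumptions rather than a drop-in for your code.

```erlang
-module(cdict_store).
-export([init/0, store/3, fetch/1]).

%% Hypothetical record: one row per Term, holding the dict as a binary.
-record(cdict, {term, bin}).

%% Start mnesia and create a RAM-only table (assumption: no disc schema).
init() ->
    ok = mnesia:start(),
    case mnesia:create_table(cdict,
                             [{attributes, record_info(fields, cdict)}]) of
        {atomic, ok} -> ok;
        {aborted, {already_exists, cdict}} -> ok
    end.

%% Read the compressed dict, update one key, and write it back,
%% all inside a single transaction.
store(Term, Key, Val) ->
    T = fun() ->
                Dict1 = case mnesia:read(cdict, Term, write) of
                            [#cdict{bin = Bin}] -> binary_to_term(Bin);
                            [] -> dict:new()
                        end,
                Dict2 = dict:store(Key, Val, Dict1),
                CDict2 = term_to_binary(Dict2, [compressed]),
                mnesia:write(#cdict{term = Term, bin = CDict2})
        end,
    mnesia:transaction(T).

%% Return the dict stored under Term, or an empty dict if there is none.
fetch(Term) ->
    T = fun() -> mnesia:read(cdict, Term) end,
    case mnesia:transaction(T) of
        {atomic, [#cdict{bin = Bin}]} -> binary_to_term(Bin);
        {atomic, []} -> dict:new()
    end.
```

Only the binary crosses the table boundary here; the dict itself exists as a term only inside the calling process, between binary_to_term/1 and term_to_binary/2.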


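Storing the compressed binary keeps the size down for each copy made 
during the transaction. Binaries larger than 64 bytes are also reference 
counted, so the data should be passed by reference rather than copied when 
it moves between your process and the mnesia/ets/dets processes on 
retrieve/store.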
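
If you want a rough feel for the saving before changing anything, a small module along these lines can compare the two representations. The module and function names are invented for this example, the sample keys are just zero-padded numbers standing in for your 30 character strings, and erts_debug:flat_size/1 is a debugging helper rather than a documented API, so the figures are only indicative.

```erlang
-module(dict_size_demo).
-export([sample_dict/1, pack/1, unpack/1, sizes/1]).

%% Build a dict with N 30-character string keys and float values.
sample_dict(N) ->
    lists:foldl(
      fun(I, D) ->
              %% "~30..0B" pads the integer with zeros to 30 characters.
              Key = lists:flatten(io_lib:format("~30..0B", [I])),
              dict:store(Key, I * 1.0, D)
      end,
      dict:new(),
      lists:seq(1, N)).

%% Serialize a dict to a compressed binary, and back again.
pack(Dict) ->
    term_to_binary(Dict, [compressed]).

unpack(Bin) when is_binary(Bin) ->
    binary_to_term(Bin).

%% Return {TermBytes, PackedBytes}: the approximate heap size of the
%% dict term versus the size of its compressed binary form.
sizes(Dict) ->
    WordSize = erlang:system_info(wordsize),
    TermBytes = erts_debug:flat_size(Dict) * WordSize,
    PackedBytes = byte_size(pack(Dict)),
    {TermBytes, PackedBytes}.
```

Calling dict_size_demo:sizes(dict_size_demo:sample_dict(5000)) in the shell gives both numbers for a dict of roughly the shape you describe. Real keys will compress less well than these very regular sample keys, so measure with your own data before deciding.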


--
paul


