[erlang-questions] gen_server with a dict vs mnesia table vs ets

Jayson Vantuyl kagato@REDACTED
Thu Jan 28 23:36:36 CET 2010


Use ETS, managed by a gen_server.  The gen_server will serialize all operations, so there will be no concurrency against the ETS table.
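
As a minimal sketch of that arrangement (the module name msg_store, the table name msgs, and the store/3 and take_all/0 functions are made up for illustration, not taken from anyone's code), a gen_server that owns a private table might look like this:

```erlang
-module(msg_store).
-behaviour(gen_server).

-export([start_link/0, store/3, take_all/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Create or overwrite the record for {UserId, MsgType}.
store(UserId, MsgType, Body) ->
    gen_server:call(?MODULE, {store, {UserId, MsgType}, Body}).

%% Return every record and empty the table in one server call.
take_all() ->
    gen_server:call(?MODULE, take_all).

init([]) ->
    %% private: only the owning process can read or write the table.
    {ok, ets:new(msgs, [set, private])}.

handle_call({store, Key, Body}, _From, Tab) ->
    true = ets:insert(Tab, {Key, Body}),
    {reply, ok, Tab};
handle_call(take_all, _From, Tab) ->
    %% The server handles one request at a time, so no store can
    %% slip in between reading the table and clearing it.
    Records = ets:tab2list(Tab),
    true = ets:delete_all_objects(Tab),
    {reply, Records, Tab}.

handle_cast(_Msg, Tab) -> {noreply, Tab}.
handle_info(_Info, Tab) -> {noreply, Tab}.
terminate(_Reason, _Tab) -> ok.
code_change(_OldVsn, Tab, _Extra) -> {ok, Tab}.
```

The once-a-minute collector would call msg_store:take_all() and the writers msg_store:store/3. Because the table is private, the atomicity comes entirely from the server's mailbox ordering.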

If you need to scale this further, you can run one table and gen_server per message type, or split even further by a hash of the user_id.  Once the data is split, if the atomic update has to be global across those processes, you can either collect the updates as batches in another process, or use a gen_fsm to temporarily lock all of the servers, flush them, and then unlock them.
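
A rough sketch of the hash split, assuming each shard is a gen_server registered under a name like msg_store_0, msg_store_1, and so on (the module name, the naming scheme, and the shard count are all hypothetical):

```erlang
-module(msg_shard).
-export([shard_name/2]).

%% Map a user id onto one of NumShards registered server names.
%% erlang:phash2/2 returns an integer in the range 0..NumShards-1.
shard_name(UserId, NumShards) when is_integer(NumShards), NumShards > 0 ->
    Index = erlang:phash2(UserId, NumShards),
    %% Only NumShards distinct atoms are ever created here.
    list_to_atom("msg_store_" ++ integer_to_list(Index)).
```

A writer would then send its request to msg_shard:shard_name(UserId, 4) instead of a single server. All records for one user land on the same shard, but a global "take everything" has to visit every shard, which is where the batching or locking mentioned above comes in.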

While ETS does cause some extra copying, for what it is good at it can be blazingly fast.  Just test it, as variations between systems make it nearly impossible to say which is better without actual testing.  eprof and fprof are your friends.
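
For a first rough number before reaching for the profilers, something like the following can be enough. Note that it uses timer:tc/1, which is a much coarser tool than eprof or fprof, and that the module name, key shape, and payload are invented for the example:

```erlang
-module(store_bench).
-export([run/1]).

%% Time N inserts into an ETS table and into a dict.
%% Both results are wall-clock microseconds as reported by timer:tc/1.
run(N) ->
    Keys = lists:seq(1, N),
    {EtsUs, _} = timer:tc(fun() ->
        Tab = ets:new(bench, [set, private]),
        lists:foreach(
          fun(K) -> ets:insert(Tab, {{K, chat}, <<"body">>}) end, Keys),
        ets:delete(Tab)
    end),
    {DictUs, _} = timer:tc(fun() ->
        lists:foldl(
          fun(K, D) -> dict:store({K, chat}, <<"body">>, D) end,
          dict:new(), Keys)
    end),
    [{ets_us, EtsUs}, {dict_us, DictUs}].
```

Run it several times with realistic record counts and bodies (for example store_bench:run(1000)), since a single run says very little.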

Or you could just use Mnesia, as you can use its transactions to get your atomicity.
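
A sketch of the Mnesia route, assuming a RAM-only table named msg on the local node (the module, record, and function names are hypothetical):

```erlang
-module(msg_mnesia).
-export([setup/0, store/3, take_all/0]).

-record(msg, {key, body}).

%% Start Mnesia and create a RAM-only table: no disk, no replication.
setup() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(msg,
        [{ram_copies, [node()]},
         {attributes, record_info(fields, msg)}]),
    ok.

%% Create or overwrite the record for {UserId, MsgType}.
store(UserId, MsgType, Body) ->
    {atomic, ok} = mnesia:transaction(fun() ->
        mnesia:write(#msg{key = {UserId, MsgType}, body = Body})
    end),
    ok.

%% Read every record and delete it, all inside one transaction.
take_all() ->
    {atomic, Records} = mnesia:transaction(fun() ->
        %% Take the table lock up front so writers wait for us.
        mnesia:write_lock_table(msg),
        Recs = mnesia:select(msg, [{'_', [], ['$_']}]),
        lists:foreach(fun(R) -> mnesia:delete_object(R) end, Recs),
        Recs
    end),
    [{Key, Body} || #msg{key = Key, body = Body} <- Records].
```

Here the writers call store/3 directly from their own processes, and the transaction in take_all/0 is what keeps the read-and-clear step atomic.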

On Jan 28, 2010, at 1:22 PM, Pablo Platt wrote:

> @Robert
> 
> My use case is simple:
> - a list of key/value records ({user_id, msg_type}, msg_body)
> - several processes need to create/update records.
> - one process needs to get all the records and clear the list in an 'atomic' operation once per minute.
> - number of records per minute expected to be <1K at start.
> - No need for replication/distribution. The list will be only in memory.
> 
> 
> 
> ________________________________
> From: Robert Virding <rvirding@REDACTED>
> To: Pablo Platt <pablo.platt@REDACTED>
> Cc: Max Lapshin <max.lapshin@REDACTED>; erlang-questions@REDACTED
> Sent: Thu, January 28, 2010 5:44:13 PM
> Subject: Re: [erlang-questions] gen_server with a dict vs mnesia table vs ets
> 
> It really depends very much on your app which is better:
> 
> - An ETS table will generally allow you to hold more data.
> - An ETS table is external to processes so there is no cost in process GC.
> - BUT there is still an ETS data GC cost every time you add or remove data.
> - Since ETS data is not in the process there are copying costs every time you
> access the table. This can make some operations very expensive, but
> match_object and select can help a lot.
> - A dict allows easy roll back to a previous state if you keep the old reference.
> - ETS and dicts provide slightly different interfaces.
> 
> You could use a public ETS table, but this would not allow for more
> complex atomic transactions and is not accessible over distribution.
> 
> It really does depend on what you are doing. The best is to test it
> with realistic data amounts and operations. As an alternative to dicts
> there are gb_trees which are also in the process memory but have
> different properties compared to dicts.
> 
> Robert
> 
> 2010/1/28 Pablo Platt <pablo.platt@REDACTED>:
>> So I'll use a gen_server that controls the ETS table with private access.
>> Thanks
>> 
>> 
>> 
>> 
>> ________________________________
>> From: Max Lapshin <max.lapshin@REDACTED>
>> To: Pablo Platt <pablo.platt@REDACTED>
>> Cc: erlang-questions@REDACTED
>> Sent: Thu, January 28, 2010 3:29:48 PM
>> Subject: Re: [erlang-questions] gen_server with a dict vs mnesia table vs ets
>> 
>> On Thu, Jan 28, 2010 at 4:28 PM, Pablo Platt <pablo.platt@REDACTED> wrote:
>>> Is the fact that ETS doesn't take part in garbage collection a good or bad
>>> feature in my case?
>> 
>> Good, of course: you control when objects are cleaned up yourself,
>> so there will be no GC penalty on each loop.
>> 
>> ________________________________________________________________
>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>> erlang-questions (at) erlang.org
>> 
>> 
>> 
> 

-- 
Jayson Vantuyl
kagato@REDACTED


