[erlang-questions] What is the best type as key of mnesia table?

Laszlo Lovei
Wed Jan 9 16:56:21 CET 2013


Q0: there is no single best key type, and you are on the right track to 
find the one best suited to your scenario: do a lot of experiments 
and measurements :) Personally I would advise against using atoms when you 
have a lot of dynamically generated keys. Atoms are not designed for that: 
they are never garbage collected, and the atom table has a fixed size limit.
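
To illustrate the point about dynamically generated keys, here is a minimal 
sketch. The module name key_gen and both function names are made up for this 
example; erlang:system_info(atom_count) needs OTP 20 or later and 
integer_to_binary/1 needs R16B or later.

```erlang
-module(key_gen).
-export([binary_key/1, atom_growth/1]).

%% Build a key such as <<"k1000001">> without touching the atom table.
binary_key(N) when is_integer(N) ->
    <<"k", (integer_to_binary(N))/binary>>.

%% Create N atoms dynamically and return how many entries the atom table
%% gained. Atoms are never garbage collected, so the new entries stay
%% until the node stops. (Hypothetical helper, for illustration only.)
atom_growth(N) when is_integer(N), N >= 0 ->
    Before = erlang:system_info(atom_count),
    lists:foreach(
      fun(I) -> list_to_atom("key_gen_dyn_" ++ integer_to_list(I)) end,
      lists:seq(1, N)),
    erlang:system_info(atom_count) - Before.
```

On a fresh node, key_gen:atom_growth(1000) should report roughly 1000 new 
atoms, and a second call with the same argument should report close to none, 
because the atoms from the first call are still there. Binary keys are 
ordinary heap data and are reclaimed like any other term.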

Q1: the numbers you quote are the sizes of the in-memory representation. An 
atom takes 1 word in memory because its name is stored only once, in a global 
atom table. When you write the data to disc, all the characters have to be 
stored, so the difference between these types will be small.
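
The gap between in-memory and serialized size can be checked with a small 
sketch like the one below. The module name key_sizes is made up for this 
example, and it assumes that what ends up on disc is close to the external 
term format produced by term_to_binary/1; I have not verified that against 
the mnesia file formats.

```erlang
-module(key_sizes).
-export([sizes/1, report/0]).

%% Return {HeapWords, ExternalBytes} for a term: the heap words it occupies
%% in memory and its size in bytes when serialized with term_to_binary/1.
sizes(Term) ->
    {erts_debug:flat_size(Term), byte_size(term_to_binary(Term))}.

%% Print both sizes for one sample key of each type from the thread.
report() ->
    Keys = [{list, "k1000001"}, {binary, <<"k1000001">>}, {atom, k1000001}],
    lists:foreach(
      fun({Type, Key}) ->
              {Words, Bytes} = sizes(Key),
              io:format("~-6s heap words: ~p, external bytes: ~p~n",
                        [Type, Words, Bytes])
      end,
      Keys).
```

The atom needs no heap words at all, but serialized it carries all 8 
characters like the other two. In the external format an 8-character string 
takes 12 bytes and an 8-byte binary takes 14, because the binary has a 
4-byte length field instead of a 2-byte one. Under the assumption above, 
2 extra bytes times 100k keys is 200000 bytes, which is the difference 
between binary.DCD and list.DCD in the quoted numbers. How an atom is 
encoded depends on the OTP release.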


L.

On Wednesday, January 9, 2013 8:27:53 AM UTC+1, nanun wrote:
>
> I have been using a _list_ as the key of my table.
> Now I need something faster and lighter, so I ran some tests.
>
> Before testing, I expected _atom_ to be the fastest and to give the 
> smallest mnesia files.
> I also thought that _binary_ would be the best candidate, because _atom_ 
> has the garbage collection issue.
>
> But the test results confused me.
>
> Q0. What is the best type to use as a key?
>
> == test results ==
>
> There are three candidates; list, binary and atom.
>
> 100k keys;
>    list keys are   "k1000001"   ~   "k1100000",
>  binary keys are <<"k1000001">> ~ <<"k1100000">>,
>    atom keys are    k1000001    ~    k1100000.
>
> I tested both mnesia storage types: disc_copies and disc_only_copies.
>
> Q1. The mnesia file sizes are strange. What is wrong?
> - disc_copies
>   size (bytes)  file
>        3100089  atom.DCD
>        3300089  binary.DCD
>        3100089  list.DCD
> - disc_only_copies
>   size (bytes)  file
>        5391992  list_o.DAT
>        6489808  binary_o.DAT
>        6130276  atom_o.DAT
>
>   * Every key has the same value.
>   * I checked the sizes after q().
>
>
>   According to http://erlang.org/doc/efficiency_guide/advanced.html
>     list : 1 word + 2 words * 8 (length of the list) = 17 words
>     binary : 3..6 + 8 (length of the binary) = 11 ~ 14 words
>     atom : 1 word
>     so I thought the size order would be list > binary >> atom.
>
> Q2. Why is atom not the fastest? It is sometimes the slowest.
>   atom  read    :  61.095
>   binary  read  :  65.729
>   list  read    :  84.162
>
>   atom  write   : 858.18
>   binary  write : 913.092
>   list  write   : 861.357
>
>   atom_o  read    : 1657.427
>   binary_o  read  : 1453.002
>   list_o  read    : 1482.138
>
>   atom_o  write   : 1765.436
>   binary_o  write : 2261.9
>   list_o  write   : 2289.084
>
>   * I read and wrote every table (100k keys each).
>   * I tried to keep the time taken to generate a key out of the measurement.
>   * All times are in milliseconds.
>
> * I think that the best type is _integer_, but I want to use one of these 
> three.
>
> thanks.
> /nanun
>