[erlang-questions] Efficient Insertions in Mnesia tables

Olivier BOUDEVILLE olivier.boudeville@REDACTED
Fri Nov 5 13:35:38 CET 2010


Hello Dan,

Thanks for your answer! I was thinking that Mnesia, even with 
disc_only_copies, could maintain an in-RAM buffer to batch even dirty 
writes (a bit like delayed_write for files). Maybe I should do the 
buffering myself, though I am not sure it would help. So the RAM-related 
options I specified indeed had no effect.
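
For what it is worth, doing the buffering by hand could look roughly like 
the sketch below. Everything in it is hypothetical (the write_buffer 
module, its function names, the batch-size parameter), and whether 
grouping the writes in one async_dirty activity actually helps a 
disc_only_copies table is an assumption that would have to be measured.

```erlang
-module(write_buffer).
-export([new/1, add/2, flush/1]).

%% Hypothetical helper: accumulates records in RAM and writes them to
%% Mnesia in batches. The buffer is a plain term {Max, Count, Records}.

%% Creates an empty buffer that flushes once it holds MaxSize records.
new(MaxSize) when is_integer(MaxSize), MaxSize > 0 ->
    {MaxSize, 0, []}.

%% Adds one record; flushes automatically when the buffer becomes full.
add(Record, {Max, Count, Acc}) when Count + 1 >= Max ->
    flush({Max, Count + 1, [Record | Acc]});
add(Record, {Max, Count, Acc}) ->
    {Max, Count + 1, [Record | Acc]}.

%% Writes all pending records, in insertion order, within a single
%% async_dirty activity, and returns an empty buffer.
%% Requires Mnesia to be started and the target table to exist.
flush({Max, _Count, Acc}) ->
    Batch = lists:reverse(Acc),
    ok = mnesia:activity(async_dirty,
                         fun() -> lists:foreach(fun mnesia:write/1, Batch) end),
    {Max, 0, []}.
```

Pending records are lost if the node crashes before a flush, and the 
caller has to call flush/1 once more at the end to write the remainder.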

Actually, wouldn't it be nice to have an intermediate step between 
disc_only_copies and disc_copies, i.e. basically batched writes to files 
with partial RAM caching of the pending writes? This would be useful in 
all the cases where the whole table cannot fit in RAM, and would trade a 
bit of robustness for a performance gain. 

As for locks: vmstat showed that beam.smp spent most of its time waiting 
(not idle), whereas the number of ongoing I/O operations stayed very low, 
so I assumed that the time was mostly spent on some synchronisation 
happening within the VMs. As I do not think our application is 
responsible (it is basically stalled by the synchronous writes we have to 
perform), I suppose that these locks are related to Mnesia.

As for the default index: unless I am mistaken, it is built on the 
primary key, and in our case there cannot be any collision (the key is 
basically an increasing time-step). I suppose we cannot temporarily 
disable the index during this phase and rebuild it once it is over?

Thanks,
Best regards,

Olivier.
---------------------------
Olivier Boudeville

EDF R&D : 1, avenue du Général de Gaulle, 92140 Clamart, France
Département SINETICS, groupe ASICS (I2A), bureau B-226
Office : +33 1 47 65 59 58 / Mobile : +33 6 16 83 37 22 / Fax : +33 1 47 
65 27 13



dangud@REDACTED 
05/11/2010 11:52

To: olivier.boudeville@REDACTED
Cc: erlang-questions@REDACTED
Subject: Re: [erlang-questions] Efficient Insertions in Mnesia tables






On Fri, Nov 5, 2010 at 11:15 AM, Olivier BOUDEVILLE
<olivier.boudeville@REDACTED> wrote:
> Hi,
>
> We are trying to write (with mnesia:dirty_write) records (e.g. 60 000 of
> them) into a disc_only_copies Mnesia table (type: set, not fragmented,
> not replicated), and we observe that the insertion time increases as the
> table fills up. This is not really a surprise, but it is something we
> need to avoid. What we would like is to have constant (and preferably
> low) insertion times, like we had when writing directly to a file.
>
> We tried to get as close as possible with the following settings and use:
>
>     % We want tables to be dumped less frequently from memory to disc,
>     % in order to buffer writes (default value is 4):
>     ok = application:set_env( mnesia, dc_dump_limit, 1 ),

You are using disc_only_copies, so there is no memory to dump here;
this setting does nothing for you.

>
>     % Greatly increases (default value is 100) the maximum number of
>     % writes to the transaction log before a new dump is performed:
>     ok = application:set_env( mnesia, dump_log_write_threshold, 50000 ),
>
> Over time we see the CPU load decrease steadily, the computer seems to
> spend most of its time fighting for locks.
>

Locks? You are using dirty operations, i.e. no locks are taken. Or are
you talking about mutexes in the emulator?

> We happen to be in a pretty favorable situation (only writes, no
> concurrent access to a given table). We chose disc_only_copies as there
> might be a large number of such tables and if they filled over time they
> could exhaust the RAM.
>
> Is there anything we missed that would allow us (roughly) constant
> insertion times with Mnesia?
>

Mnesia should mostly behave as dets does when you use it directly. Does it?
I.e. switch the mnesia calls to direct dets calls and see if it does.
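
One way to run that comparison is sketched below: it inserts records with
increasing integer keys straight into a dets table and times each batch of
10000, so that any growth of the insertion time shows up in the returned
list. The module name, the table name bench_tab and the payload are
hypothetical placeholders, not taken from the application discussed here.

```erlang
-module(dets_insert_bench).
-export([run/2]).

%% Inserts N records {Key, Payload} with keys 1..N into the dets file File
%% and returns [{FirstKeyOfBatch, Microseconds}], one entry per batch.
run(File, N) when is_integer(N), N >= 1 ->
    {ok, Tab} = dets:open_file(bench_tab, [{file, File}, {type, set}]),
    try
        [time_batch(Tab, From, min(From + 9999, N))
         || From <- lists:seq(1, N, 10000)]
    after
        ok = dets:close(Tab)
    end.

%% Times the insertion of keys From..To, one dets:insert/2 call per record.
time_batch(Tab, From, To) ->
    Insert = fun(Key) -> ok = dets:insert(Tab, {Key, <<"payload">>}) end,
    {Micros, ok} = timer:tc(fun() ->
                                lists:foreach(Insert, lists:seq(From, To))
                            end),
    {From, Micros}.
```

If the per-batch times stay flat here but grow with mnesia:dirty_write on
the same data, the overhead is more likely on the Mnesia side than in dets.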

You are not doing something stupid, like having an index on the table
where all test entries have the same value in the indexed field?
(That would cause O(n) insertion time, and it has happened more than
once when someone was measuring performance.)

/Dan

> Thanks in advance for any hint,
> Best regards,
>
> Olivier.
> ---------------------------
> Olivier Boudeville
>
> EDF R&D : 1, avenue du Général de Gaulle, 92140 Clamart, France
> Département SINETICS, groupe ASICS (I2A), bureau B-226
> Office : +33 1 47 65 59 58 / Mobile : +33 6 16 83 37 22 / Fax : +33 1 47
> 65 27 13
>
>
>
> This message and any attachments (the 'Message') are intended solely for
> the addressees. The information contained in this Message is confidential.
> Any use of information contained in this Message not in accord with its
> purpose, any dissemination or disclosure, either whole or partial, is
> prohibited except with formal approval.
>
> If you are not the addressee, you may not copy, forward, disclose or use
> any part of it. If you have received this message in error, please delete
> it and all copies from your system and notify the sender immediately by
> return message.
>
> E-mail communication cannot be guaranteed to be timely, secure, error-free
> or virus-free.
>



