[erlang-questions] Combining ets and mnesia operations for extreme performance
Paulo Sérgio Almeida
psa@REDACTED
Fri Sep 12 18:04:55 CEST 2008
Hi Ulf,
Thanks for the prompt reply. See below.
Ulf Wiger (TN/EAB) wrote:
>
> Perhaps you'd want to consider using ram_copies and dumping
> to disk using mnesia:dump_tables(Tabs)?
There is not much information about this operation. Does it provide
atomicity? What happens if there is a crash in the middle of it? I have
not made it clear, but I need to persist the updates on several tables
as an atomic operation.
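To make the requirement concrete, here is a minimal sketch (the table names, record shapes, and function name are invented for illustration) of the property I need: writes to several tables committed as one atomic unit.

```erlang
%% Hypothetical example: table_a and table_b are mnesia tables
%% whose records are {Tab, Key, Value} tuples.
update_both(Key, V1, V2) ->
    F = fun() ->
            mnesia:write({table_a, Key, V1}),
            mnesia:write({table_b, Key, V2})
        end,
    %% Either both writes are persisted or neither is,
    %% even across a crash and recovery.
    {atomic, ok} = mnesia:transaction(F).
```

This is exactly the guarantee that dumping raw ets tables to disk, one file per table, does not give.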
But even if that were the case, there would still be a performance
problem, because the whole ets table would have to be dumped each time.
Using mnesia I could enjoy the transactional guarantees while having
incremental writes.
For example, a "transaction" might update only, say, 1/100 of each
table; with mnesia, each DCL would grow slowly and the full ets table
would be dumped much less frequently (as I wished, according to
dc_dump_limit), say, every 50 transactions.
> I assume from your description that the table isn't replicated?
Correct, it is not replicated. The setup is weird enough even in the
non-replicated case.
>> Basically, it has been working for me. But as I am doing something I
>> shouldn't (updating the ets tables directly), I ask what could go
>> wrong. I thought a bit and could only see one potential problem: that
>> mnesia dumps the ets table to the DCD file when I already started a
>> subsequent aggregation and have already done some ets operations
>> myself.
>>
>> Therefore I ask: when exactly does mnesia try to see whether to dump
>> a table or not (according to dc_dump_limit)? Can it be after a
>> mnesia:transaction finished? How long after?
>
> You could mess with the dump limit, but if you were to use
> mnesia for some other tasks in your application, this might
> come back and bite you.
Yes, this is not good. It suggests that it could be useful to have a
finer-grained, per-table dc_dump_limit, and not only a global one.
> Log dump is a background job, and it's scheduled as soon
> as the time or write threshold is exceeded. If it's a write
> threshold, the number of committed writes is what triggers it.
These thresholds govern dumping the global log to the DCL files. I
would imagine that the test for dumping the ets tables to the DCD files
happens immediately after the dump to the DCL. Is that the case?
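For reference, a sketch of the knobs being discussed, set as application parameters on the command line (the values are purely illustrative, not recommendations):

```shell
# dump_log_write_threshold / dump_log_time_threshold control when the
# global transaction log is dumped to the per-table DCL files;
# dc_dump_limit controls when a DCL is judged large enough that the
# whole table is dumped to its DCD file instead.
erl -mnesia dir '"/tmp/mnesia.sketch"' \
    -mnesia dump_log_write_threshold 1000 \
    -mnesia dump_log_time_threshold 180000 \
    -mnesia dc_dump_limit 4
```

Note these are all node-global settings, which is precisely the limitation mentioned above.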
> If you have no particular need for some mnesia functions, it
> would seem as if just using an ets table and calling ets:tab2file/2
> would seem to be sufficient for what you've described.
This would not give me atomicity of updates across several tables.
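To spell out why: with ets:tab2file/2 each table is written to its own file independently, so a crash between two dumps leaves the files mutually inconsistent. A sketch (table handles and filenames are invented for illustration):

```erlang
%% Each call persists one table in isolation; there is no
%% multi-table atomicity.
dump_tables(TabA, TabB) ->
    ok = ets:tab2file(TabA, "table_a.ets"),
    %% <-- a crash here leaves table_a.ets reflecting the new
    %%     state while table_b.ets still holds the old one
    ok = ets:tab2file(TabB, "table_b.ets").
```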
Regards,
Paulo