[erlang-questions] Combining ets and mnesia operations for extreme performance

Paulo Sérgio Almeida psa@REDACTED
Fri Sep 12 18:04:55 CEST 2008


Hi Ulf,

Thanks for the prompt reply. See below.

Ulf Wiger (TN/EAB) wrote:
> 
> Perhaps you'd want to consider using ram_copies and dumping
> to disk using mnesia:dump_tables(Tabs)?

There is not much information about this operation. Does it provide 
atomicity? What happens if there is a crash in the middle of it? I may 
not have made this clear before, but I need to persist the updates to 
several tables as a single atomic operation.

But even if that were the case, there would still be a performance 
problem, because the whole ets table would have to be dumped. Using 
mnesia I can enjoy the transactional guarantees while having incremental 
writes. For example, a "transaction" might update only, say, 1/100 of 
each table; with mnesia, each DCL grows slowly and the full ets table is 
dumped much less frequently (as I wish, according to dc_dump_limit), 
say, every 50 transactions.
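
(As far as I can tell from the mnesia source, the check is roughly 
"dump the table when filesize(DCL) > filesize(table) / dc_dump_limit", 
and the parameter is read from the application environment, although it 
is not documented. Assuming that holds, something like

    %% Assumption: dc_dump_limit is honoured when set in the mnesia
    %% application environment before mnesia starts. A lower value
    %% tolerates a larger DCL, i.e. less frequent full-table dumps
    %% to the DCD, at the cost of disk space and startup time.
    application:set_env(mnesia, dc_dump_limit, 1),
    mnesia:start().

would make the full-table dumps rarer.)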

> I assume from your description that the table isn't replicated?

Correct, it is not replicated. It is weird enough even in the 
non-replicated case.

>> Basically, it has been working for me. But as I am doing something I
>>  shouldn't (updating the ets tables directly), I ask what could go
>> wrong. I thought a bit and could only see one potential problem: that
>> mnesia dumps the ets table to the DCD file when I already started a
>> subsequent aggregation and have already done some ets operations
>> myself.
>>
>> Therefore I ask: when exactly does mnesia try to see whether to dump
>> a table or not (according to dc_dump_limit)? Can it be after a 
>> mnesia:transaction finished? How long after?
> 
> You could mess with the dump limit, but if you were to use
> mnesia for some other tasks in your application, this might
> come back and bite you.

Yes, this is not good. It suggests it would be interesting to have a 
finer-grained, per-table dc_dump_limit, rather than only a global one.

> Log dump is a background job, and it's scheduled as soon
> as the time or write threshold is exceeded. If it's a write
> threshold, the number of committed writes is what triggers it.

These thresholds govern dumping the global log to the DCL files. I 
imagine the test for dumping the ets tables to the DCD files happens 
immediately after the dump to the DCL. Is that so?
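
In the meantime, I can at least tune the documented thresholds for the 
global log dump before starting mnesia, e.g. (the values here are 
arbitrary examples):

    %% Raise the write threshold (number of committed writes) and
    %% the time threshold (milliseconds) so the global log is
    %% dumped to the DCL files less often.
    application:set_env(mnesia, dump_log_write_threshold, 10000),
    application:set_env(mnesia, dump_log_time_threshold, 300000),
    mnesia:start().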

> If you have no particular need for some mnesia functions, it
> would seem as if just using an ets table and calling ets:tab2file/2
> would seem to be sufficient for what you've described.

This would not give me atomicity of updates across several tables ...
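
The problem is that ets:tab2file/2 persists one table at a time, so a 
crash between two calls leaves the files out of sync (t1 and t2 are 
again hypothetical names):

    ok = ets:tab2file(t1, "t1.dump"),
    %% a crash here leaves t1.dump newer than t2.dump --
    %% exactly the inconsistency I cannot accept
    ok = ets:tab2file(t2, "t2.dump").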

Regards,
Paulo
