[erlang-questions] How to speed up mnesia startup?

Ulf Wiger ulf@REDACTED
Fri Apr 13 07:53:32 CEST 2012



On 13 Apr 2012, at 00:31, Richard O'Keefe wrote:

> 
> On 12/04/2012, at 9:26 PM, Ulf Wiger wrote:
> 
>> 
>> Hmm, the first thing that stands out is the inordinate number of tables.
>> 
>> Out of curiosity, I tried figuring out what is considered a reasonable number of tables for other database management systems:
>> 
>> - Oracle: over 10,000 tables considered insane
>>  (http://asktom.oracle.com/pls/asktom/f?p=100:11:::::P11_QUESTION_ID:26039880025641)
> 
> That page has no claim that *Oracle* cannot handle it.
> The arguments given are
> - that it seems extremely unlikely that people could "design,
>   implement, or maintain" a system with 10,000 distinct tables
>   "personally".
…

Fair enough. My point was mainly that 60k tables is well beyond what most database management systems are optimized for. The question here is obviously what _mnesia_ is good at, but I just wanted to illustrate that if you want to build a database with 60k tables, you shouldn't _assume_ that any given DBMS will handle it well.

As far as mnesia is concerned, there are a number of commonly used functions that are reasonable as long as the number of tables is somewhere in the hundreds.

Take, for example, mnesia:wait_for_tables/2. Let's assume that our application actually needs those 60k tables to be loaded before it starts doing things.

wait_for_tables/2 is basically implemented like this (lots of code snipped):

wait_for_tables_init(From, Tabs) ->
    process_flag(trap_exit, true),
    ...
    cast({sync_tabs, Tabs, self()}),
            rec_tabs(Tabs, Tabs, From, Init)
    end.

rec_tabs([Tab | Tabs], AllTabs, From, Init) ->
    receive
        {?SERVER_NAME, {tab_synced, Tab}} ->
            rec_tabs(Tabs, AllTabs, From, Init);
        ...
    end;
rec_tabs([], _, _, Init) ->
    unlink(Init),
    ok.

To begin with, if the objective were to support tens of thousands of tables, it would be better to provide functions to efficiently query the 'table of tables'. Currently, there is no table of tables, only mnesia:system_info(tables), which returns them all as a list (including 'schema', which you often must delete from the list, since you can't, for example, call wait_for_tables on the schema).
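
A minimal sketch of that workaround, assuming mnesia is already started. The module and function names (wait_tabs_sketch, user_tables/0, wait_all/1) are made up for illustration; only mnesia:system_info/1 and mnesia:wait_for_tables/2 are real mnesia calls.

```erlang
-module(wait_tabs_sketch).
-export([user_tables/0, wait_all/1]).

%% All tables known to this node, minus the 'schema' table.
%% system_info(tables) returns one flat list, so with tens of
%% thousands of tables this builds (and copies) a very long list.
user_tables() ->
    lists:delete(schema, mnesia:system_info(tables)).

%% Wait for every non-schema table. Timeout is in milliseconds or
%% the atom 'infinity'. Returns ok, {timeout, RemainingTabs} or
%% {error, Reason}, exactly as mnesia:wait_for_tables/2 does.
wait_all(Timeout) ->
    mnesia:wait_for_tables(user_tables(), Timeout).
```

Calling wait_tabs_sketch:wait_all(60000) would then block until every table is loaded or a minute has passed, whichever comes first.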

In the code above, mnesia sends the list of 60k tables in a single message. It then loops through the list and does a selective receive for the reply belonging to each table. This is a convenient way to do it when the list of Tabs is short, but pretty awful for a very large list, since each iteration may end up scanning a rather large message queue.
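
To make the cost concrete, here is a self-contained sketch that mimics the receive loop outside of mnesia. It is not mnesia code: the module name, the simplified {tab_synced, Tab} message shape and the arrival-order variant are assumptions for illustration only. The first loop waits for replies in list order, as rec_tabs does; the second accepts replies in whatever order they arrive and ticks them off in a map.

```erlang
-module(rec_tabs_sketch).
-export([in_list_order/1, in_arrival_order/1, demo/1]).

%% Same shape as rec_tabs above: wait for each table in list order.
%% Tab is already bound, so each receive matches one specific reply
%% and may have to skip every other message already in the queue.
in_list_order([Tab | Tabs]) ->
    receive
        {tab_synced, Tab} -> in_list_order(Tabs)
    end;
in_list_order([]) ->
    ok.

%% Hypothetical alternative: accept replies in arrival order and
%% remove them from a map of pending tables, so each receive can
%% take the first tab_synced message in the queue.
in_arrival_order(Tabs) ->
    collect(maps:from_list([{T, true} || T <- Tabs])).

collect(Pending) when map_size(Pending) =:= 0 ->
    ok;
collect(Pending) ->
    receive
        {tab_synced, Tab} -> collect(maps:remove(Tab, Pending))
    end.

%% Fill our own mailbox with N replies in reverse list order (the
%% worst case for in_list_order/1) and time both loops.
%% Returns {MicrosInListOrder, MicrosInArrivalOrder}.
demo(N) ->
    Tabs = lists:seq(1, N),
    Fill = fun() ->
                   [self() ! {tab_synced, T} || T <- lists:reverse(Tabs)],
                   ok
           end,
    Fill(),
    {T1, ok} = timer:tc(fun() -> in_list_order(Tabs) end),
    Fill(),
    {T2, ok} = timer:tc(fun() -> in_arrival_order(Tabs) end),
    {T1, T2}.
```

With the replies queued in reverse order, the first receive in in_list_order/1 has to walk past N-1 messages, the next past N-2, and so on, so the total work grows roughly with the square of N. Real replies arrive as tables finish loading, so the queue is not necessarily that full, but with 60k tables the same effect can show up whenever replies arrive faster than, or out of step with, the order in which the loop asks for them.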

This was just one example. Mnesia was not designed with this sort of thing in mind.

BR,
Ulf W

Ulf Wiger, Co-founder & Developer Advocate, Feuerlabs Inc.
http://feuerlabs.com