[erlang-questions] mnesia slow startup

Paul Mineiro paul-trapexit@REDACTED
Sat Jul 12 01:40:17 CEST 2008


Hi.  I'm responding to my own message for search engines to help out the
next guy.

There were two key things we did to improve things:

  * avoiding unnecessary schema transactions.  instead of counting on
mnesia:create_table/2, mnesia:add_table_copy/3 etc. to be idempotent, we
short-circuit them via a mnesia:table_info/2 check first, so no schema
transaction is started when there is nothing to do.  also we avoid
calling mnesia:change_config(extra_db_nodes, Nodes) unless the nodeset
has really changed.
  * -mnesia no_table_loaders 100 (the default is 2): this greatly reduced
the time it took to transfer the data to a restarting node.
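
for the archives, here is a rough sketch of the first point.  it is not our
actual code: the module name and the three function names are made up for
illustration, and it assumes mnesia is already running on the local node.
the only mnesia calls used are table_info/2, create_table/2,
add_table_copy/3 and change_config/2.

```erlang
-module(mnesia_guard).
-export([ensure_table/2, ensure_table_copy/3, maybe_change_nodes/2]).

%% Create Tab only if the schema does not know it yet.
%% mnesia:table_info/2 exits with {aborted, _} for an unknown table,
%% so the schema transaction is only started in that case.
ensure_table(Tab, TabDef) ->
    try mnesia:table_info(Tab, storage_type) of
        _Known -> ok
    catch
        exit:{aborted, _} ->
            case mnesia:create_table(Tab, TabDef) of
                {atomic, ok}      -> ok;
                {aborted, Reason} -> {error, Reason}
            end
    end.

%% Add a copy of Tab on Node only if Node does not already hold one
%% of the given Type (ram_copies | disc_copies | disc_only_copies).
ensure_table_copy(Tab, Node, Type) ->
    case lists:member(Node, mnesia:table_info(Tab, Type)) of
        true  -> ok;
        false ->
            case mnesia:add_table_copy(Tab, Node, Type) of
                {atomic, ok}      -> ok;
                {aborted, Reason} -> {error, Reason}
            end
    end.

%% Call change_config only when the wanted nodeset differs from the
%% one applied last time.  The caller keeps the returned list and
%% passes it back in as OldNodes on the next call.
maybe_change_nodes(OldNodes, NewNodes) ->
    Old = lists:usort(OldNodes),
    New = lists:usort(NewNodes),
    case New =:= Old of
        true  -> {unchanged, Old};
        false ->
            case mnesia:change_config(extra_db_nodes, New) of
                {ok, _Connected} -> {changed, New};
                {error, Reason}  -> {error, Reason}
            end
    end.
```

the nodeset comparison is done against a list the caller remembers rather
than against anything mnesia reports.  that is just one way to decide
whether the nodeset "has really changed".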

so that got restart times down from about 4 hours to 4 minutes.
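
if you would rather not pass the flag on the erl command line, the same
mnesia application parameter can go in a config file loaded with
erl -config.  a minimal fragment, assuming nothing else needs to be set for
mnesia (the value 100 is simply what worked for us, not a recommendation):

```erlang
%% sys.config: equivalent of "-mnesia no_table_loaders 100"
[
 {mnesia, [
   {no_table_loaders, 100}
 ]}
].
```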

-- p

On Wed, 25 Jun 2008, Paul Mineiro wrote:

> hello.
>
> we've been scaling our cluster and we're currently at 24 nodes.  lately
> we've noticed that when a node crashes (by running out of memory, oops,
> software defect), restarting mnesia on that node takes a while.  by
> "restarting mnesia", btw, i mean getting through things like
> mnesia_controller:try_merge_schema/1 .  when we were at, say, 4 nodes we
> never really noticed this.
>
> today we literally went to lunch after a node failed and wasn't
> (finishing) starting up, figuring a full stomach would lubricate
> cognition. we came back and it had resolved itself.  hurray for
> laziness!
>
> however there are still some questions:
>
>   * i'm having a hard time getting some visibility into what is causing
> particular schema transactions to be blocked.  any tricks here?
>
>   * is it expected that certain transactional protocols scale badly with
> the number of nodes?
>
> any feedback would be appreciated.  thanks!
>
> -- p
>
> In an artificial world, only extremists live naturally.
>
>         -- Paul Graham
>



