Mnesia, disconnections and reconnections

Thu Feb 19 11:55:25 CET 2004

On Thu, 19 Feb 2004 09:34:09 +0100 (MET), Hakan Mattsson 
<hakan@REDACTED> wrote:

> On Thu, 19 Feb 2004, Dan Gudmundsson wrote:
>
> Danne> See mnesia:set_master_nodes/[12]
> Danne>
> Danne> However you should reset it to the empty list after re-starting
> Danne> mnesia. It's usually best to reset it with empty list after
> Danne> starting mnesia.
>
> You need to be careful here, as you need to wait for all involved
> tables to actually be loaded from the master node(s) before you safely
> can empty the master node list. Use mnesia:wait_for_tables/2 for this
> purpose.
>
> /Håkan
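
The sequence discussed above (set the master nodes, restart mnesia, wait
for the tables, then clear the master list) might look roughly like the
sketch below. The module and function names are made up for illustration,
and the error handling is deliberately minimal; what to do when the wait
times out depends on your system.

```erlang
-module(master_reload).
-export([reload_from/3]).

%% Hypothetical helper: reload Tabs from MasterNodes, then clear the
%% master node list once every table has actually been loaded.
reload_from(MasterNodes, Tabs, Timeout) ->
    ok = mnesia:set_master_nodes(MasterNodes),
    stopped = mnesia:stop(),
    ok = mnesia:start(),
    case mnesia:wait_for_tables(Tabs, Timeout) of
        ok ->
            %% Only now is it safe to empty the master node list.
            mnesia:set_master_nodes([]);
        {timeout, Remaining} ->
            %% Leave the master nodes in place; tables are still loading.
            {error, {still_loading, Remaining}};
        {error, Reason} ->
            {error, Reason}
    end.
```

Leaving the master node list untouched on timeout is a choice made here so
that a later retry still loads from the intended nodes.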

But of course, there are some tricky cases where the tables will
never load, and if you want to be really safe, you need to handle
those too. There is no generic algorithm that will work in all
cases, so what you do to resolve different situations is dependent
on your particular system.

If I recall correctly, these situations might cause problems:

1) Node A dies, node B logs that; B dies; A restarts, but cannot
    load tables, since it doesn't know that they are the latest copy.
2) Communication lost, A and B both think they are master; comm. restored,
    but now you may have inconsistency in your tables.
3) Comm. lost, and is restored; A and B both detect that there may be
    inconsistency and restart to resolve it; A sets B to master; B sets
    A to master (I don't recall if this leads to a deadlock)
4) Same as 3), but both identify B as master; B dies before A has loaded
    all tables from B; A may now have only partially consistent data.

(1) is difficult to detect. In AXD 301, we do a WFG (wait-for graph) analysis in a special
program that monitors mnesia table loading; we also have the rule that
a node may force-load tables if the other nodes are not there.

(2) You can detect this through mnesia's event mechanism, by writing your
own mnesia event handler that reacts to the inconsistent_database system
event (see the mnesia docs).
Another way to do this is to set -kernel dist_auto_connect once, and
have a backdoor ping (e.g. on UDP). This means that the nodes will not
reconnect automatically unless one of the nodes restarts first. If you
get a UDP ping from a node that is not in the nodes() list, you have
a partitioned network. The UDP ping could also carry enough info that
you can decide which node should be the master, and which should restart.
This also prevents (3) from happening.
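
A minimal sketch of the backdoor ping idea, assuming the nodes run with
-kernel dist_auto_connect once. The module name, the 5 second interval and
the {ping, Node} payload are all invented for illustration; a real ping
would probably carry more info (e.g. something to decide who becomes
master), and this one only logs the suspected partition.

```erlang
-module(backdoor_ping).
-export([start/2, classify/2]).

%% Port: local UDP port. Peers: [{Host, Port}] for the other nodes'
%% ping sockets (hypothetical configuration, not part of mnesia).
start(Port, Peers) ->
    spawn(fun() ->
        {ok, Sock} = gen_udp:open(Port, [binary, {active, true}]),
        {ok, _TRef} = timer:send_interval(5000, tick),
        loop(Sock, Peers)
    end).

loop(Sock, Peers) ->
    receive
        tick ->
            Ping = term_to_binary({ping, node()}),
            lists:foreach(
              fun({Host, P}) -> gen_udp:send(Sock, Host, P, Ping) end,
              Peers),
            loop(Sock, Peers);
        {udp, Sock, _IP, _InPort, Packet} ->
            %% binary_to_term/1 trusts the sender; acceptable only on a
            %% closed network.
            {ping, From} = binary_to_term(Packet),
            case classify(From, nodes()) of
                partitioned ->
                    error_logger:warning_msg(
                      "UDP ping from ~p, which is not in nodes()~n", [From]);
                connected ->
                    ok
            end,
            loop(Sock, Peers)
    end.

%% A ping from a node we are not connected to suggests a partition.
classify(From, Connected) ->
    case lists:member(From, Connected) of
        true -> connected;
        false -> partitioned
    end.
```

For example, backdoor_ping:classify(b@host2, []) returns partitioned,
while the same call with [b@host2] as the connected list returns connected.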

(4) I don't know how to solve this, and don't recall exactly what will
happen.

When using mnesia:wait_for_tables(Tabs, Timeout), setting the Timeout may
be tricky, esp. if you may want your node to force_load tables if the
other nodes stay down. Setting the timeout too short means that
wait_for_tables may time out while tables are still loading. The
{timeout, RemainingTabs} return value tells you which tables remain, and
you can write logic to check that things are moving (which they may not
appear to be, if a huge table is being synched over a slow network).
Setting the timeout too long may mean that you get unnecessary downtime
in your system.
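
One way to handle the timeout trade-off is to wait in short slices and
watch whether the list of remaining tables shrinks. The sketch below does
that; the module name, the slice/retry parameters and the policy of
force-loading only when no other db node is running are assumptions made
for illustration, not a recommendation. Note that counting tables is a
coarse progress measure, so a single huge table can still look stalled.

```erlang
-module(table_wait).
-export([wait/3, progressed/2]).

%% Wait for Tabs in slices of Slice ms. After Retries consecutive slices
%% without progress, force-load if we are alone, otherwise give up.
wait(Tabs, Slice, Retries) ->
    wait(Tabs, Slice, Retries, Retries).

wait(Tabs, Slice, Retries, Left) ->
    case mnesia:wait_for_tables(Tabs, Slice) of
        ok ->
            ok;
        {timeout, Remaining} ->
            case progressed(Tabs, Remaining) of
                true ->
                    wait(Remaining, Slice, Retries, Retries);
                false when Left > 1 ->
                    wait(Remaining, Slice, Retries, Left - 1);
                false ->
                    stalled(Remaining)
            end;
        {error, Reason} ->
            {error, Reason}
    end.

%% Progress means fewer tables remain than we were waiting for.
progressed(Before, After) ->
    length(After) < length(Before).

stalled(Remaining) ->
    case mnesia:system_info(running_db_nodes) -- [node()] of
        [] ->
            %% No other node is up: force-load what is left.
            [{T, mnesia:force_load_table(T)} || T <- Remaining];
        _Others ->
            {error, {stalled, Remaining}}
    end.
```

A call such as table_wait:wait(Tabs, 10000, 6) would then tolerate up to
a minute without any table finishing before deciding what to do.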

Hope I haven't scared you off now.  ;)

/Uffe

-- 
Ulf Wiger, Senior System Architect
EAB/UPD/S
