[erlang-questions] Mnesia: inconsistent views without netsplit?

Garret Smith garret.smith@REDACTED
Tue May 3 17:52:30 CEST 2016


On Wed, Apr 27, 2016 at 12:32 PM, Daniel Dormont
<dan@REDACTED> wrote:
> On Wed, Apr 27, 2016 at 3:03 AM, Dan Gudmundsson <dangud@REDACTED> wrote:
>> 1) No clue. But I'd be interested to hear if you work out what went
>> wrong.
>>
>> 2) mnesia:del_table_copy(...) followed by mnesia:add_table_copy(..) should
>> re-copy the table from the other nodes.
>
> Thanks. I'll give that a try.
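For reference, the re-copy sequence Dan describes might look like the
sketch below. The table name (muc_online_room) is taken from the table
info further down; the node name is a placeholder for whichever node
holds the bad copy:

```erlang
%% Run from any node in the cluster; Node is the node whose copy
%% you want rebuilt from the other replicas.
Node = 'ejabberd@host1',  %% placeholder node name
{atomic, ok} = mnesia:del_table_copy(muc_online_room, Node),
{atomic, ok} = mnesia:add_table_copy(muc_online_room, Node, ram_copies).
```

add_table_copy/3 loads the fresh copy from an active replica, so the
node being repaired does not need to be restarted.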
>
> Along the same lines I was wondering: is there a setting I can use to
> adjust the sensitivity of the system's detection of node disconnects,
> either generically or specifically within Mnesia? My production
> environment appears to have occasional momentary network hiccups (it's
> Amazon EC2 instances spanning zones within a region, for anyone
> curious). I'd like to make it less likely for those hiccups to cause
> Mnesia to enter an inconsistent state, even if it means real failures
> take a little longer to detect.

If the "hiccups" are high latency, you can look at adjusting
net_ticktime, documented in the kernel application reference:
http://erlang.org/doc/man/kernel_app.html
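A node is considered down if nothing is heard from it within
net_ticktime seconds (ticks are exchanged four times per tick period),
so raising it makes disconnect detection slower but more tolerant of
latency spikes. A sketch, assuming the same value is set on every node;
120 is an illustrative figure, not a recommendation:

```erlang
%% In sys.config (kernel application environment):
[{kernel, [{net_ticktime, 120}]}].
%% Or on the command line: erl -kernel net_ticktime 120
```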

>
> thanks,
> Dan
>
>
>>
>> On Tue, Apr 26, 2016 at 9:30 PM Daniel Dormont <dan@REDACTED>
>> wrote:
>>>
>>> Hi all,
>>>
>>> I have a three node Mnesia cluster (hosting a somewhat outdated
>>> version of ejabberd, but I'm not sure that matters). I have a table
>>> that is stored as ram_copies on all three nodes. Yet, this table has
>>> differing numbers of records among the three.
>>>
>>> The table info from one of them is pasted below. Running the same
>>> query on one of my other nodes, I get more or less the same result,
>>> but the "size" is very different: 553 vs 867. And indeed, there are
>>> individual records that turn up in a mnesia:read/2 or
>>> mnesia:dirty_read/2 on one node and not the other.
>>>
>>> Yet, nothing in my log indicates that there was ever a netsplit or
>>> disconnection. So I have two questions:
>>>
>>> 1) What might cause this? and
>>> 2) Is there any way, especially given I know which records are
>>> affected, to force some kind of replication on this table without
>>> completely restarting one of the nodes?
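On question 1: a partition can go unnoticed if nothing listens for it.
Mnesia reports partitions as inconsistent_database system events, which
do not necessarily reach the application log unless something
subscribes to them. A minimal sketch, to be run on each node after
Mnesia has started:

```erlang
%% Subscribe this process to Mnesia system events so partitions
%% are logged when Mnesia detects them.
{ok, _Node} = mnesia:subscribe(system),
receive
    {mnesia_system_event, {inconsistent_database, Context, Node}} ->
        error_logger:error_msg("Mnesia inconsistency (~p) with ~p~n",
                               [Context, Node])
end.
```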
>>>
>>> thanks,
>>> Dan Dormont
>>>
>>>
>>> [{access_mode,read_write},
>>>  {active_replicas,['ejabberd@REDACTED',
>>>                    'ejabberd@REDACTED',
>>>                    'ejabberd@REDACTED']},
>>>  {all_nodes,['ejabberd@REDACTED',
>>>              'ejabberd@REDACTED',
>>>              'ejabberd@REDACTED']},
>>>  {arity,3},
>>>  {attributes,[name_host,pid]},
>>>  {checkpoints,[]},
>>>  {commit_work,[]},
>>>  {cookie,{{1341,344810,207763},'ejabberd@REDACTED'}},
>>>  {cstruct,{cstruct,muc_online_room,set,
>>>                    ['ejabberd@REDACTED',
>>>                     'ejabberd@REDACTED',
>>>                     'ejabberd@REDACTED'],
>>>                    [],[],0,read_write,false,[],[],false,muc_online_room,
>>>                    [name_host,pid],
>>>                    [],[],[],{...},...}},
>>>  {disc_copies,[]},
>>>  {disc_only_copies,[]},
>>>  {frag_properties,[]},
>>>  {index,[]},
>>>  {load_by_force,false},
>>>  {load_node,'ejabberd@REDACTED'},
>>>  {load_order,0},
>>>  {load_reason,{active_remote,'ejabberd@REDACTED'}},
>>>  {local_content,false},
>>>  {majority,false},
>>>  {master_nodes,[]},
>>>  {memory,73643},
>>>  {ram_copies,['ejabberd@REDACTED',
>>>               'ejabberd@REDACTED',
>>>               'ejabberd@REDACTED']},
>>>  {record_name,muc_online_room},
>>>  {record_validation,{muc_online_room,3,set}},
>>>  {type,set},
>>>  {size,867},
>>>  {snmp,[]},
>>>  {storage_properties,...},
>>>  {...}|...]
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
