[erlang-questions] auto-syncing mnesia after a network split

Tue Dec 2 21:03:51 CET 2008

Alex,

On Dec 2, 2008, at 5:18 PM, Alex wrote:

> what happens when you have multiple updates to both sides of the  
> split?  if you just pick the highest vnum, you lose all the  
> transactions from the other side of the split when it rejoins.

You can pick up new (inserted) records by doing a diff of primary keys  
for each table.
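
A minimal sketch of that key diff, written in Python only to keep it
short. It assumes each replica of a table can be viewed as a dict keyed
by primary key; `missing_keys` and `pull_inserts` are hypothetical
names, not mnesia functions.

```python
def missing_keys(local_keys, remote_keys):
    """Return the keys present on the remote replica but absent locally."""
    return set(remote_keys) - set(local_keys)


def pull_inserts(local, remote):
    """Copy records that exist only in `remote` into `local`.

    Both arguments are dicts keyed by primary key (an assumption made
    for this sketch). Returns the set of keys that were copied.
    """
    added = missing_keys(local.keys(), remote.keys())
    for key in added:
        local[key] = remote[key]
    return added
```

Running this once in each direction picks up the inserts from both
sides; it does nothing about records that exist on both sides with
different contents.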

You cannot do anything about deleted records, I think: a record that
was deleted on one side looks just like a record that was inserted on
the other, so you'll have to delete those again somehow. You could
assume that the table replica with the latest timestamp is the right
one and just delete the extra records from the other replica.

Updates are harder. Imagine a bank account that's replicated across
the split nodes, where a customer makes two deposits and each deposit
lands on a different node. If you just take the latest record, you'll
pick up the deposit on one node and miss the other deposit.
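
To put numbers on it, here is a small worked example in Python. The
opening balance, deposit amounts and timestamps are made up for
illustration.

```python
opening = 100

# Each node applied one deposit to its own copy of the account.
node_a = {"balance": opening + 50, "ts": 10}
node_b = {"balance": opening + 30, "ts": 12}

# Keeping only the record with the latest timestamp...
latest = max((node_a, node_b), key=lambda record: record["ts"])
picked = latest["balance"]

# ...drops the deposit made on the other node.
correct = opening + 50 + 30
lost = correct - picked
```

Here `picked` is 130 while the correct balance is 180, so the 50
deposited on the first node is lost.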

I think you can overcome this programmatically, with a timestamp _and_  
a version number. You can have a version table per node with three  
columns: table name, vnum and timestamp. The rest of the tables would  
have just the vnum in their records.

When updating table T, you will first update the version table by  
storing the current time and bumping the vnum for the key T. You will  
then store the vnum in the record of table T that you are updating.
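
A rough sketch of that update step, in Python for brevity. The dicts
stand in for the per-node version table and for table T, and
`versioned_write` is a hypothetical helper. In mnesia both writes would
presumably need to happen inside one transaction.

```python
import time


def versioned_write(version_table, table, table_name, key, value, now=None):
    """Bump the vnum for `table_name`, then stamp the record with it.

    `version_table` maps table name -> (vnum, timestamp).
    `table` maps primary key -> {"value": ..., "vnum": ...}.
    Both layouts are assumptions made for this sketch.
    """
    now = time.time() if now is None else now
    vnum, _ = version_table.get(table_name, (0, None))
    vnum += 1
    # Step 1: store the current time and the bumped vnum for this table.
    version_table[table_name] = (vnum, now)
    # Step 2: store the same vnum in the record being updated.
    table[key] = {"value": value, "vnum": vnum}
    return vnum
```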

You will be able to find the split time by looking at the version  
tables and figuring out when the vnums started to diverge. You can  
then invoke a merge function that figures out, for example, how to  
merge a bunch of bank deposit transactions into a single balance.

You will know the vnum at split time and will only need to consider
the transactions that happened after it. That shouldn't be many
transactions if the split was short.
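
Here is one way the divergence check and the merge could look, again
as a Python sketch. It assumes something the scheme above does not
spell out: that each node keeps its version entries as a log of
(vnum, timestamp, amount) tuples ordered by vnum, rather than only the
latest vnum per table. `split_vnum` and `merge_deposits` are
hypothetical names.

```python
def split_vnum(log_a, log_b):
    """Return the highest vnum up to which both logs hold identical entries."""
    last_common = 0
    for entry_a, entry_b in zip(log_a, log_b):
        if entry_a != entry_b:
            break
        last_common = entry_a[0]
    return last_common


def merge_deposits(log_a, log_b):
    """Merge two diverged deposit logs into a single balance.

    Entries up to the split are shared, so they are counted once.
    Entries after the split happened on one side only, so both sides'
    entries are added.
    """
    split = split_vnum(log_a, log_b)
    shared = sum(amount for vnum, _, amount in log_a if vnum <= split)
    after_a = sum(amount for vnum, _, amount in log_a if vnum > split)
    after_b = sum(amount for vnum, _, amount in log_b if vnum > split)
    return shared + after_a + after_b
```

With a shared opening entry of 100 and then one deposit of 50 on one
node and 30 on the other, the logs diverge after vnum 1 and the merged
balance comes out as 180. Summing works because deposits commute; a
merge function for other kinds of updates would need its own rules.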

What do you think?

--
http://twitter.com/wagerlabs