[erlang-questions] auto-syncing mnesia after a network split
Joel Reymont
joelr1@REDACTED
Tue Dec 2 21:03:51 CET 2008
Alex,
On Dec 2, 2008, at 5:18 PM, Alex wrote:
> what happens when you have multiple updates to both sides of the
> split? if you just pick the highest vnum, you lose all the
> transactions from the other side of the split when it rejoins.
You can pick up new (inserted) records by doing a diff of primary keys
for each table.
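
A minimal sketch of that key diff, written in Python rather than Erlang
so that it does not lean on any particular Mnesia call. Each replica is
modelled as a plain dict from primary key to record, and
missing_records is a hypothetical helper name, not something Mnesia
provides.

```python
def missing_records(local, remote):
    """Return the records whose primary key exists in remote but not in local."""
    new_keys = remote.keys() - local.keys()
    return {key: remote[key] for key in new_keys}


# Two replicas of the same table after a split (illustrative data).
node_a = {1: "alice", 2: "bob"}
node_b = {1: "alice", 3: "carol"}

# Records each side has to copy from the other when the nodes rejoin.
a_needs = missing_records(node_a, node_b)
b_needs = missing_records(node_b, node_a)
```

Running the diff in both directions gives each node the inserts it
missed. It says nothing about records that exist on both sides with
different contents.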
You cannot do anything about deleted records, I think, since a key
diff cannot tell a record deleted on one side from a record inserted
on the other, so you'll just have to delete those again somehow. You
could assume that the table replica with the latest timestamp is the
right one and just delete the extra records from the other replica.
Imagine a bank account that's distributed across the split nodes,
where a customer deposits money twice and the deposits are split
across the nodes. You'll pick up the latest deposit on one node and
miss the other deposit.
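
The lost deposit can be seen in a small Python sketch. The amounts and
timestamps are made up for illustration, and a balance record is
modelled as a (balance, timestamp) tuple.

```python
# Balance record as (balance, timestamp); both nodes start from 100 at split time.
node_a = (100 + 50, 10)   # deposit of 50 lands on node A at t=10
node_b = (100 + 30, 12)   # deposit of 30 lands on node B at t=12

# "Latest timestamp wins": keep whichever replica was written last.
winner = max(node_a, node_b, key=lambda rec: rec[1])
balance_after_merge = winner[0]

# What the customer actually deposited in total.
expected_balance = 100 + 50 + 30
```

Node B's record wins, so the merged balance reflects only the deposit
of 30 and the deposit of 50 made on node A is gone.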
I think you can overcome this programmatically, with a timestamp _and_
a version number. You can have a version table per node with three
columns: table name, vnum and timestamp. The rest of the tables would
have just the vnum in their records.
When updating table T, you will first update the version table by
storing the current time and bumping the vnum for the key T. You will
then store the vnum in the record of table T that you are updating.
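
A rough Python sketch of that update step, with the version table and
table T both modelled as dicts. The update function, its argument
names and the record layout are assumptions made for illustration. In
Mnesia the two writes would presumably go inside a single transaction.

```python
import time


def update(version_table, table, table_name, key, value, now=None):
    """Bump the vnum for table_name, then write the record tagged with that vnum."""
    now = time.time() if now is None else now
    vnum, _ = version_table.get(table_name, (0, None))
    vnum += 1
    # Version table row: table name -> (vnum, timestamp).
    version_table[table_name] = (vnum, now)
    # The data record carries only the vnum, not the timestamp.
    table[key] = {"value": value, "vnum": vnum}
    return vnum


versions = {}
accounts = {}
update(versions, accounts, "accounts", "joel", 100, now=1000.0)
update(versions, accounts, "accounts", "alex", 250, now=1001.0)
```

After the two calls the version table holds vnum 2 for "accounts" and
each record remembers the vnum at which it was written.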
You will be able to find the split time by looking at the version
tables and figuring out when the vnums started to diverge. You can
then invoke a merge function that figures out, for example, how to
merge a bunch of bank deposit transactions into a single balance.
You will know the vnum at split time and will only need to consider
the transactions that happened after. There shouldn't be many
transactions if the split was short.
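
One way this could look, sketched in Python under a few assumptions
that go beyond the description above: each node keeps its deposits as
a log keyed by vnum, both nodes keep bumping the vnum independently
during the split, and the split point is taken to be the highest vnum
up to which the two logs are identical. The names split_vnum and
merge_balance are hypothetical.

```python
def split_vnum(log_a, log_b):
    """Return the highest vnum up to which both logs hold identical entries."""
    vnum = 0
    while (vnum + 1) in log_a and log_a.get(vnum + 1) == log_b.get(vnum + 1):
        vnum += 1
    return vnum


def merge_balance(log_a, log_b):
    """Sum the shared deposits once, then add each side's post-split deposits."""
    split = split_vnum(log_a, log_b)
    shared = sum(amount for v, (_, amount) in log_a.items() if v <= split)
    after_a = sum(amount for v, (_, amount) in log_a.items() if v > split)
    after_b = sum(amount for v, (_, amount) in log_b.items() if v > split)
    return shared + after_a + after_b


# vnum -> (transaction id, deposit amount); vnums 1 and 2 predate the split.
log_a = {1: ("t1", 60), 2: ("t2", 40), 3: ("a1", 50)}
log_b = {1: ("t1", 60), 2: ("t2", 40), 3: ("b1", 30), 4: ("b2", 5)}

split = split_vnum(log_a, log_b)
balance = merge_balance(log_a, log_b)
```

Here the logs diverge at vnum 3, so only the three deposits made after
vnum 2 need merging, and none of them is lost in the final balance.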
What do you think?
--
http://twitter.com/wagerlabs