mnesia and binary/large files
Hakan Mattsson
hakan@REDACTED
Thu Oct 14 10:17:19 CEST 1999
On Wed, 13 Oct 1999, Klacke wrote:
klacke> An obvious mnesia enhancement would be the following:
klacke> Two nodes A + B, one table T.
klacke> A dies, X and Y are inserted in T (on B only), A recovers.
klacke> Now instead of copying the entire table T, we only need to
klacke> redo the insertions of X and Y at A.
klacke>
klacke> Once I started to implement this but got severe headaches. It's
klacke> a lot harder than it sounds.
Perhaps there are some volunteers looking for a headache? ;-)
There are several approaches that could be explored in order to speed
up the table loading mechanism:
- one is to introduce the notion of version identifiers on records in
ets and dets. This would enable the table load mechanism to rely on an
algorithm that looks for records whose version identifiers differ
between the trusted nodes and the node that is starting up. To go a
step further, we could use a common version identifier generator for
all records in a table. This would enable us to use the version
identifier as a kind of timestamp for the entire table. If both nodes
agree that the table timestamp is the same, there is no need to
iterate over the table and compare the version identifier of each
record.
- another is to introduce the notion of durable checkpoints in Mnesia,
that is, checkpoints that survive even if all nodes go down
simultaneously. A durable checkpoint should be activated while all
replicas are active, and the checkpoint should then be moved forward
now and then, in the same manner as checkpoints used for the
purpose of incremental backups. At startup the table load algorithm
could compare the checkpoint contents of the nodes and only transfer
the lost updates. Using checkpoints is not free, but in systems
that already use incremental backups it would be almost free.
- ...
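As a rough illustration of the first approach above, here is a minimal
sketch in Python. It is not Mnesia, ets or dets code: the Replica class,
sync_from and the counter used as the common version generator are all
hypothetical names made up for this example. It assumes the node that is
starting up made no writes of its own while it was away, and it ignores
deleted records, which a real implementation would have to track as well.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory table replica; not an ets/dets or Mnesia API."""
    records: dict = field(default_factory=dict)  # key -> (version, value)
    table_version: int = 0  # highest version written so far ("table timestamp")

    def write(self, key, value, version):
        """Store a record together with the version identifier of the write."""
        self.records[key] = (version, value)
        self.table_version = max(self.table_version, version)


def sync_from(trusted, starting):
    """Bring `starting` up to date with `trusted`; return the keys copied."""
    # Fast path: both nodes agree on the table timestamp, so the
    # per-record comparison can be skipped entirely.
    if trusted.table_version == starting.table_version:
        return []
    copied = []
    for key, (version, value) in trusted.records.items():
        # Copy only records that are missing or carry a different version.
        if starting.records.get(key, (None, None))[0] != version:
            starting.records[key] = (version, value)
            copied.append(key)
    starting.table_version = trusted.table_version
    return sorted(copied)
```

With Klacke's scenario (A dies, X and Y are inserted on B, A recovers),
sync_from(b, a) would copy only X and Y instead of the whole table, and
a second call would take the fast path and copy nothing.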
The real headache comes, when you start to think about backwards
compatibility and smooth on-line upgrade...
klacke>Ps. Now don't go and get all worked up about this, do some
klacke>measurements instead and see how long it takes to repair
klacke>as well as to load tables of different sizes over the network.
klacke>Usually it's ok.
If it is not ok, you should try the fragmented tables feature in
Mnesia. This makes the dets files smaller and therefore shortens
their repair times. Fragmented tables may also be used to distribute
large amounts of data over several nodes, and thereby decrease the
need for copying huge tables between the nodes.
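
To show why fragmenting helps, here is a small Python sketch of spreading
the keys of one large table over several fragments. It is only an
illustration: fragment_of and fragment_sizes are hypothetical helpers, and
the plain modulo hash used here is an assumption made for brevity, not
Mnesia's actual fragment hashing scheme.

```python
import zlib
from collections import Counter


def fragment_of(key, n_fragments):
    """Map a key to a fragment number in 1..n_fragments (plain modulo hash)."""
    return zlib.crc32(repr(key).encode("utf-8")) % n_fragments + 1


def fragment_sizes(keys, n_fragments):
    """Count how many of `keys` land in each fragment."""
    return Counter(fragment_of(key, n_fragments) for key in keys)
```

Each fragment then holds only a share of the records, so each file that
has to be repaired or copied is correspondingly smaller than the
unfragmented table.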
/Håkan