mnesia_frag ????

Tue Jan 17 12:46:18 CET 2006

Hi ... /Uffe

Many thanks for your valuable replies. I hope your information will help
all users of mnesia_frag.

Regarding the NodeList, I used {disc_copies, [node()]} to get the
previously mentioned result.

What I noted is that {disc_copies, [node()]} or {disc_only_copies, [node()]}
combined with {frag_properties, [{n_fragments, 30}]} gives the same result
(no copies on disc, only RAM copies).

But when I change frag_properties to
{frag_properties, [{n_fragments, 30}, {n_disc_copies, 1}]} or
{frag_properties, [{n_fragments, 30}, {n_disc_only_copies, 1}]},
I get fragmented tables on disc:

profile_db.DCD
profile_db_frag2.DCD
profile_db_frag3.DCD
.....
profile_db_frag29.DCD
profile_db_frag30.DCD

In addition, n_disc_only_copies produces extra index files on disc.

Does that mean n_disc_copies overrides {disc_copies, [node()]}? In other
words, what is the difference between the disc_copies value and the
n_disc_copies value on a fragmented table when both are used at the same
time?
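
To make the question concrete, the table definition I am describing can be sketched roughly as below. The attribute names (key, value) are hypothetical placeholders, since I have not shown my real record here; the table name is taken from the file names above. This is only an illustration of the option list, not a statement of how mnesia resolves the two settings.

```erlang
-module(profile_db_tabdef).
-export([tabdef/0, create/0]).

%% Hypothetical record: the real attributes are not shown in this thread.
-record(profile_db, {key, value}).

%% Option list combining a top-level storage type with frag_properties,
%% i.e. the variant that produced the profile_db*.DCD files.
tabdef() ->
    [{attributes, record_info(fields, profile_db)},
     {disc_copies, [node()]},
     {frag_properties, [{n_fragments, 30},
                        {n_disc_copies, 1}]}].

%% Needs a running mnesia with a disc-resident schema on this node.
create() ->
    mnesia:create_table(profile_db, tabdef()).
```

Dropping {n_disc_copies, 1} from frag_properties gives the first variant, the one where I saw only RAM copies.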

Also, is it possible to use a fragmented mnesia table of (say) 3 GB on a
machine with (say) 2 GB of RAM, i.e. less memory than the table size? If we
use disc_copies, will mnesia_frag cope when the table grows beyond 2 GB? In
such a situation, what is the best way to get good performance?

Do mnesia/mnesia_frag and Erlang take advantage of a machine with dual
processors (with hyperthreading) to get the best performance? If not, what
else can we try to maximize performance (if applicable)?

Thanks in advance

Sanjaya Vitharana

----- Original Message -----
From: "Ulf Wiger (AL/EAB)" <ulf.wiger@REDACTED>
To: "Sanjaya Vitharana" <sanjaya@REDACTED>
Cc: <erlang-questions@REDACTED>
Sent: Tuesday, 17 January 2006 04:19 pm
Subject: RE: mnesia_frag ????

Sanjaya Vitharana wrote:
>
> "...But I would strongly recommend against using dirty access
> in combination with fragmented tables."
>
> "Both read and write in fragmented tables are complex operations."
>
> If we use the original function without dirty, how will it
> affect the efficiency of a huge table, something like ~3G
> (3M rec x 1K)?

There is a setup cost when using transactions, which you are
paying anyway if you wrap the whole thing with
mnesia:activity(transaction, fun() -> ... end, mnesia_frag).
It's in the order of 110 usec on my 1 GHz SunBlade.
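
As a minimal sketch of the wrapping mentioned above (the module and function names are made up for illustration), using the four-argument form mnesia:activity(Kind, Fun, Args, AccessModule):

```erlang
-module(frag_access).
-export([write/1, read/2]).

%% Run Fun as a transaction with mnesia_frag as the access module, so
%% that each key is mapped to the fragment that holds it.
frag_tx(Fun) ->
    mnesia:activity(transaction, Fun, [], mnesia_frag).

%% Record is a tuple whose first element is the (base) table name.
write(Record) ->
    frag_tx(fun() -> mnesia:write(Record) end).

%% Returns a list of matching records, or [] if there are none.
read(Tab, Key) ->
    frag_tx(fun() -> mnesia:read({Tab, Key}) end).
```

Callers always pass the base table name; picking the fragment is left to mnesia_frag.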

The size of the table makes very little difference to
performance, except of course while building or loading
the table. Reads and writes are not affected much at all,
esp. if type is 'set' (a hash table).

Furthermore, there is no difference in this regard between
read() and dirty_read(). The only differences are:
- read() will request a lock on the object before reading,
  while dirty_read() will not. The cost of this operation
  will depend on how many replicas exist on the fragment
  where the object resides. If there's only one copy, the
  operation will amount to a gen_server:call(), a check
  for a table lock, and a check for an object lock, and
  finally, of course, an ets:insert in the lock table.
- read() will perform a lookup in the local transaction
  store, whereas dirty_read() will not.

All in all, a transaction-based read() on my SunBlade,
using a fragmented table with 1000 fragments (no replicas),
takes about 270 usec, while a dirty read of the same object
takes about 80 usec. Doing the same transaction-based read
with only 3 fragments takes about 240 usec. So the number of
fragments does make a difference, but a very small one.
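
If you want to repeat this kind of measurement on your own hardware, something along these lines should do. It is a rough sketch, not the code used for the figures above; the module and function names are invented, and the results will depend on the machine and the table.

```erlang
-module(frag_timing).
-export([avg_read_usec/4]).

%% Average wall-clock time in microseconds for N reads of Key from Tab,
%% each run as its own activity of kind Context (e.g. transaction or
%% async_dirty) with mnesia_frag as the access module.
avg_read_usec(Context, Tab, Key, N) when is_integer(N), N > 0 ->
    Read = fun() -> mnesia:read({Tab, Key}) end,
    Loop = fun() ->
                   lists:foreach(
                     fun(_) ->
                             mnesia:activity(Context, Read, [], mnesia_frag)
                     end,
                     lists:seq(1, N))
           end,
    {TotalUsec, ok} = timer:tc(Loop),
    TotalUsec / N.
```

For example, compare frag_timing:avg_read_usec(transaction, Tab, Key, 10000) with the same call using async_dirty.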

> Also, how will the number of fragments affect table
> access? Since the operations are complex (as you said), can
> the number of fragments degrade performance (depending
> on the number of records / table size)? I mean the record
> access time against the number of fragments (I assume the
> number of records in the table is constant).

For all practical purposes, I don't think you will notice
any degradation.

> Also, referring to
> http://www.erlang.org/ml-archive/erlang-questions/200307/msg00052.html.ORIG
> if we change the access_module configuration parameter to
> mnesia_frag, it says "without doing any changes to your old
> code."

This is only partly true. If e.g. your code uses dirty_write()
and dirty_read() directly, then -- as you have noticed -- the
fragmentation functionality will not work as expected. The
same thing goes for code that uses the older function
mnesia:transaction(fun() -> ... end).
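
To illustrate the difference (a sketch with invented function names): a direct dirty_read() goes straight to the table it is given, i.e. the base fragment, whereas the same read wrapped in an activity with mnesia_frag is routed to the fragment holding the key. I use async_dirty here only to keep the comparison like-for-like; the earlier advice against dirty access on fragmented tables still stands.

```erlang
-module(frag_migrate).
-export([old_read/2, new_read/2, new_write/1]).

%% Old style: bypasses mnesia_frag and looks only in the table named Tab
%% (the base fragment). Records stored in other fragments are not found.
old_read(Tab, Key) ->
    mnesia:dirty_read(Tab, Key).

%% Fragment-aware: still a dirty operation, but mnesia_frag maps the key
%% to the right fragment first.
new_read(Tab, Key) ->
    mnesia:activity(async_dirty,
                    fun() -> mnesia:read({Tab, Key}) end,
                    [], mnesia_frag).

new_write(Record) ->
    mnesia:activity(async_dirty,
                    fun() -> mnesia:write(Record) end,
                    [], mnesia_frag).
```

Code that calls mnesia:transaction(Fun) can be changed in the same way, to mnesia:activity(transaction, Fun, [], mnesia_frag).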

Also, you will pay a (slight) performance penalty for
using mnesia_frag on non-fragmented tables: a few tens
of microseconds per read/write operation.

> Referring to my original question, I used
> {disc_copies, NodeList}, so why does mnesia:info() show only
> the schema table as disc_copies?

Your original post doesn't show the actual value of
NodeList. If it's [], then mnesia will by default
add {ram_copies, [node()]}.
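
One way to check where the copies of a table actually ended up is to ask mnesia:table_info/2 for each storage type. A small sketch (the module and function names are invented):

```erlang
-module(copy_info).
-export([copies/1]).

%% For table Tab, list the nodes holding each kind of replica.
copies(Tab) ->
    [{Type, mnesia:table_info(Tab, Type)}
     || Type <- [ram_copies, disc_copies, disc_only_copies]].
```

For a fragmented table, call it on each fragment name (e.g. profile_db_frag2) as well as on the base table.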

BR,
/Uffe