requirements of health care apps
Ulf Wiger
etxuwig@REDACTED
Fri Apr 28 16:49:52 CEST 2000
On Fri, 28 Apr 2000, Hakan Mattsson wrote:
hakan>There are, however, two mechanisms in Mnesia that will probably
hakan>turn out to be showstoppers for your gigantic database:
hakan>
hakan>- repair of dets files. If your system happens to crash and leave
hakan> a dets file in an opened state, it will automatically be repaired
hakan> at the next startup, but this repair takes ages even for rather
hakan> small dets files. Klacke has suggested a clever solution for this
hakan> (safe dets); however, it has not been incorporated in Erlang/OTP.
hakan>
hakan>- remote table load. When Mnesia recovers after a node crash, it will
hakan> copy tables from other nodes hosting a more up-to-date replica of
hakan> the table. If the table is large, it may take quite a while to
hakan> transfer it between the Erlang nodes. This issue is tricky to solve
hakan> without major architectural changes to Mnesia.
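(As a side note on Hakan's first point: whether that automatic repair
kicks in at open time can at least be controlled through the repair
option to dets:open_file/2. A minimal sketch -- the function, table and
file names are made up:)

open_no_repair(Name, File) ->
    %% {repair, false}: refuse to open a file that needs repair, rather
    %% than silently spending a long time repairing it at startup.
    case dets:open_file(Name, [{file, File}, {repair, false}]) of
        {ok, Ref} ->
            {ok, Ref};
        {error, Reason} ->
            %% e.g. when the file was left open by a crash and needs repair
            {error, Reason}
    end.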
In my experience, local table load is also inefficient on very large
tables, unless you keep the table as a disc_only copy (i.e. do not
load it into RAM). This currently means that you cannot make the table
an ordered_set (since that table type exists only in ets, not in dets).
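A minimal sketch of the two variants (table names and attributes are
made up; this is only to show where the type restriction bites):

create_tables() ->
    %% Disc-only copy: the table is not loaded into RAM, but it cannot
    %% be an ordered_set, since dets has no ordered tables.
    {atomic, ok} =
        mnesia:create_table(big_disc_tab,
                            [{disc_only_copies, [node()]},
                             {type, set},
                             {attributes, [key, val]}]),
    %% Disc copy: backed by ets, so ordered_set works, but the whole
    %% table is loaded into RAM at table load time.
    {atomic, ok} =
        mnesia:create_table(big_ram_tab,
                            [{disc_copies, [node()]},
                             {type, ordered_set},
                             {attributes, [key, val]}]),
    ok.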
I recently had an ordered_set table with 2 million objects (roughly
500 MB). It took 36 hours to load -- without repair -- on my 330 MHz
UltraSPARC. Granted, the table resided on the file server, but I don't
think the load time would have been in the order of minutes anyway.
Just to get a feel for how long it takes in general to write objects to
a (plain) file, and then read them into an ets table, I wrote a tiny
test program. Putting the file on the same file server as above, I
got:
1,000 objects (payload: lists:seq(1,500)):
  write: 119 msec (??)
  read : 2.5 sec
10,000 objects:
  write: 14 sec
  read : 25 sec
100,000 objects (174 MB):
  write: 168 sec (2 min 48 sec)
  read : 433 sec
I didn't run a million objects, but I can guess from the above that it
would take >2500 seconds (42 minutes) to read. Even when reading 100,000
objects, my 512 MB Ultra got into some heavy swapping, as the VM grew
to 480 MB (see -- another problem right there).
Conclusion? I think we need to invent some form of large-file I/O, and
an efficient way of dumping a large file into ets, and back.
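(For what it's worth, ets does have ets:tab2file/2 and ets:file2tab/1
for dumping a table to a file and reading it back; a minimal sketch
below, with made-up names and sizes -- whether they hold up for tables
of this size is exactly what is in question:)

dump_and_reload() ->
    T = ets:new(big_tab, [set, public]),
    true = ets:insert(T, [{N, lists:seq(1, 500)} || N <- lists:seq(1, 1000)]),
    %% Dump the whole table to disk ...
    ok = ets:tab2file(T, "big_tab.ets"),
    ets:delete(T),
    %% ... and read it back into a new ets table.
    {ok, T2} = ets:file2tab("big_tab.ets"),
    ets:info(T2, size).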
For the time being, we should use other DBMSes for really large
tables, and let Mnesia handle small-to-medium sized real-time
databases.
I've attached my simple test program, so you can see what I did.
/Uffe
--
Ulf Wiger, Chief Designer AXD 301     <ulf.wiger@REDACTED>
Ericsson Telecom AB                   tfn: +46 8 719 81 95
Varuvägen 9, Älvsjö                   mob: +46 70 519 81 95
S-126 25 Stockholm, Sweden            fax: +46 8 719 43 44
-------------- next part --------------
-module(dbtest).
-compile(export_all).

%% Write N copies of the payload to file F, each record prefixed with
%% a 4-byte big-endian sequence number.
write(N, F) ->
    {ok, Fd} = file:open(F, [write]),
    Bin = term_to_binary(data()),
    Sz = size(Bin),
    io:format("write Sz = ~p~n", [Sz]),
    do_write(1, N, Bin, Fd),
    file:close(Fd).

do_write(N, NMax, Data, Fd) when N =< NMax ->
    file:write(Fd, list_to_binary([i32(N), Data])),
    do_write(N+1, NMax, Data, Fd);
do_write(_N, _NMax, _Data, _Fd) ->
    ok.

%% 32-bit big-endian integer <-> 4-byte list conversions.
i32(Int) when is_binary(Int) ->
    i32(binary_to_list(Int));
i32(Int) when is_integer(Int) ->
    [(Int bsr 24) band 255,
     (Int bsr 16) band 255,
     (Int bsr 8) band 255,
     Int band 255];
i32([X1,X2,X3,X4]) ->
    (X1 bsl 24) bor (X2 bsl 16) bor (X3 bsl 8) bor X4.

i32(X1, X2, X3, X4) ->
    (X1 bsl 24) bor (X2 bsl 16) bor (X3 bsl 8) bor X4.

%% Read the fixed-size records back from file F and insert them into
%% a fresh ets table, keyed on the sequence number.
read(F) ->
    {ok, Fd} = file:open(F, [read]),
    T = ets:new(?MODULE, [set]),
    Sz = size(term_to_binary(data())) + 4,
    io:format("read Sz = ~p~n", [Sz]),
    do_read(Fd, Sz, T).

do_read(Fd, Sz, T) ->
    case file:read(Fd, Sz) of
        {ok, [B1,B2,B3,B4|Bytes]} ->
            Obj = binary_to_term(list_to_binary(Bytes)),
            ets:insert(T, {i32([B1,B2,B3,B4]), Obj}),
            do_read(Fd, Sz, T);
        Other ->
            %% Normally eof; return the table together with the result.
            {T, Other}
    end.

%% The payload stored with each record.
data() ->
    lists:seq(1,500).
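
%% Example use from the shell, timing with timer:tc/3 (the object count
%% and file name below are just examples):
%%
%%   1> c(dbtest).
%%   2> timer:tc(dbtest, write, [10000, "/tmp/dbtest.dat"]).
%%   3> timer:tc(dbtest, read,  ["/tmp/dbtest.dat"]).
%%
%% timer:tc/3 returns {Microseconds, Result}; write/2 returns ok, and
%% read/1 returns {EtsTable, eof} once the whole file has been consumed.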