dets improvements?

Yariv Sadan yarivvv@REDACTED
Thu Jun 15 18:24:55 CEST 2006


>
> Dets stores objects in the external term format, which means that
> binary_to_term() (term_to_binary()) is called whenever objects are
> retrieved (stored). These functions block the emulator (or just the
> scheduler running the Dets process, if SMP), which spoils the soft
> real-time properties for huge terms. Films etc are better stored
> outside of Dets.

Interesting... I don't expect to store objects bigger than a few megs.
I hope such usage won't hurt performance too much.
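
A rough way to gauge that cost is to time the two conversions on a term of about the expected size. The following is a minimal sketch, not a benchmark: the module name `etf_cost` and the function `measure/1` are made up for illustration, and the timings depend heavily on the shape of the term and on the machine.

```erlang
-module(etf_cost).
-export([measure/1]).

%% Time term_to_binary/1 and binary_to_term/1 on Term.
%% Returns {EncodedBytes, EncodeMicros, DecodeMicros}.
measure(Term) ->
    %% timer:tc/3 returns {ElapsedMicroseconds, Result}.
    {EncUs, Bin} = timer:tc(erlang, term_to_binary, [Term]),
    %% Matching against the already-bound Term also checks the round trip.
    {DecUs, Term} = timer:tc(erlang, binary_to_term, [Bin]),
    {byte_size(Bin), EncUs, DecUs}.
```

For example, `etf_cost:measure(lists:seq(1, 1000000)).` times a list whose encoded form is a few megabytes. This only measures the conversion itself, not the disc access Dets performs around it.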

> The repair time was significantly reduced in Erlang/OTP R8B. When
> repairing, data is written and read serially; no random access is
> involved. (By the way, the same goes for copying (open) Dets files
> between nodes, something Mnesia does every now and then.) A full Dets
> table (16 million small objects) should not take more than half an
> hour to repair, at the worst. This is of course a very long time...
> I don't think it's easy to further reduce repair times.

This is much better than the 12 hour figure I had in mind based on a
previous message I had seen on this list. In fact, it may change my
mind about using MySQL instead of Mnesia.

Are there any database gurus out there who know how the rebuilding
cost in dets compares to the disc storage in
MySQL/Postgres/Oracle/etc?

>
> The major problem with Dets as I see it is that the memory allocation
> scheme (a buddy system) is kept in RAM. For a fragmented table with
> millions of objects, the RAM data can amount to several megabytes.
> When closing or syncing a table, this (possibly huge) data structure
> has to be written to disc.

Is there any way to 1) measure the level of this fragmentation and 2)
to manage/reduce it via maintenance operations (preferably, without
taking the whole database/table offline)? I think this monitoring and
maintenance aspect needs to be documented somewhere, because there
will always be a fear that it may run out of control.
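
On both points, a sketch under stated assumptions. `dets:info/2` reports `file_size` and `no_objects` for an open table, but Dets does not appear to expose a fragmentation figure directly, so the bytes-per-object ratio below is only a crude proxy. The dets reference manual mentions closing a table and reopening it with `{repair, force}` as the way to defragment it, which does mean taking that table offline while the file is rewritten. The module name `dets_frag` and both function names are hypothetical.

```erlang
-module(dets_frag).
-export([bytes_per_object/1, compact/1]).

%% Crude proxy for fragmentation: file bytes per stored object.
%% Sample it over time; a steadily rising value under a stable
%% workload hints at growing unused space in the file.
bytes_per_object(Tab) ->
    FileSize = dets:info(Tab, file_size),
    case dets:info(Tab, no_objects) of
        0 -> undefined;
        N when is_integer(N) -> FileSize / N
    end.

%% Offline compaction: close the table, then reopen it with
%% {repair, force} so that the file is rewritten.
%% Assumption: no other process has the table open.
compact(Tab) ->
    %% Read the settings first so the reopen matches the existing file.
    File = dets:info(Tab, filename),
    Type = dets:info(Tab, type),
    KeyPos = dets:info(Tab, keypos),
    ok = dets:close(Tab),
    {ok, Tab} = dets:open_file(Tab, [{file, File}, {type, Type},
                                     {keypos, KeyPos}, {repair, force}]),
    ok.
```

Comparing `bytes_per_object/1` before and after `compact/1` on a copy of a production table gives some idea of how much space was tied up. This applies to a plain Dets table; a table owned by Mnesia should not be closed behind Mnesia's back.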

I don't know too much about database storage algorithms, but is there
any way to improve this allocation scheme? I think it will be
beneficial for many applications that use Mnesia if the "worst case"
fragmentation scenario were mitigated.

I appreciate the response!

Thanks
Yariv


