mnemosyne and BIG databases (Re: mnesia power index -- What about fragmentation?)
Mon Feb 21 00:05:54 CET 2005
On 2005-02-20 22:31:48, Valentin Micic <valentin@REDACTED> wrote:
> On a different note, I've been playing with mnemosyne and
> as much as I liked it, I'm not so sure that it is intended
> for BIG databases. Actually, I've seen the run-time crash
> due to a bad/buggy query (it ran out of memory). This
> is a bit concerning and pretty un-Erlang-like. Should one
> encourage usage of mnemosyne on big databases? Oh, all
> of this is in R9. Hope that R10 brings improvements.
R10 fixes one bug in mnemosyne that had to do with queries
on very big tables (some queries could loop forever).
I don't know if this was the bug you ran into. The problem
of doing joins on very large data sets is tricky to solve
in Erlang, partly because of the way the garbage collector
works: building large data structures on the process
heap causes a lot of memory overhead.
Mnemosyne was initially designed as a proof of concept,
showing how one could use a declarative language and
"set comprehension" syntax for database queries. As I
understand it, mnemosyne also did some very clever
query optimization, which made it clearly outperform e.g.
Oracle on some very complex queries on large data sets.
But already from the start, there was a slight mismatch
in focus between mnesia and mnemosyne. Mnesia focused
on RAM-based databases, where the need for "dirty"
accesses was an important requirement. For most applications
using OTP so far, mnesia has had desirable characteristics,
while mnemosyne has been something of an odd beast.
Partly for this reason, mnemosyne has never really received
the attention it needed in order to become a really good
In R10, OTP takes a clear step away from mnemosyne by
"QLC (Query List Comprehensions) is another solution for queries
to Mnesia, Ets and Dets tables which will be the recommended way
to perform queries. QLC belongs to Stdlib and is described there.
"It is not recommended to use Mnemosyne queries in performance
OTOH, on QLC, the manual states:
"Support for faster join of two tables will be added not later
than in R11. Depending on preferences and priorities some high
level optimizations may be added in the future."
My interpretation is that QLC can hardly be recommended for
big databases either - at least not at this time. QLC might
still be the best way to go for the future, since it is a
more general approach than mnemosyne was. I'm not convinced
that there won't be a price to pay for that generality down
the line, esp. when it comes to handling complex queries on
very large databases, but I'm not an expert on the subject.
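For readers who have not seen QLC yet, a join over two tables looks
roughly like the sketch below. This is only an illustration: the module
name, the table contents and the tuple layouts are made up, and it uses
small ETS tables rather than Mnesia, so it says nothing about how such a
query behaves on a very large data set.

```erlang
-module(qlc_join_sketch).
-export([run/0]).

%% Required for the qlc:q/1 parse transform.
-include_lib("stdlib/include/qlc.hrl").

%% Hypothetical data: employees are {Id, Name, DeptId},
%% departments are {DeptId, DeptName}.
run() ->
    Emp = ets:new(emp, [set]),
    Dept = ets:new(dept, [set]),
    ets:insert(Emp, [{1, "anna", 10}, {2, "bo", 20}]),
    ets:insert(Dept, [{10, "r&d"}, {20, "ops"}]),
    %% Two generators plus an equality filter express the join;
    %% how it is evaluated is left to QLC.
    Q = qlc:q([{Name, DeptName} ||
                  {_Id, Name, D1} <- ets:table(Emp),
                  {D2, DeptName} <- ets:table(Dept),
                  D1 =:= D2]),
    Result = lists:sort(qlc:e(Q)),
    ets:delete(Emp),
    ets:delete(Dept),
    Result.
```

Note that qlc:e/1 collects all answers in a list on the calling
process heap, so the memory concern described above applies to the
result of a large join here as well.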