dets improvements?

Sat Jun 10 22:07:42 CEST 2006

> To be fair, mnesia doesn't target the same applications as e.g. PostgreSQL
> and MySQL. It's difficult to come up with database systems that achieve
> such tight integration with the applications. There is no semantic gap and
> overhead is very low.

I know... I think Mnesia is quite amazing actually. I would much
rather use it than MySQL or Postgres, and that's why it hurts that
it's not really suited for my application -- a standard web site with
forums, etc, running on a hosted server with little RAM. Pretty
standard stuff, actually.

> Recall that mnesia is primarily a dbms for embedded realtime systems.
> Normally, one wants to have more control over what goes on, even if it
> means more work up front.

I think it's great to give the developer the capability of
fragmentation -- clearly essential for clustering -- but I think that
in general things should "just work" out of the box with the least
amount of maintenance. In fact, the "just works" property is what
makes Mnesia so appealing. You write a one liner such as

save_page(Page) ->
  mnesia:transaction(fun()-> mnesia:write(Page) end).

and magically the transaction is broadcast to a cluster of nodes
without your knowing or caring what happens behind the scenes. That's
powerful.
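
To make that concrete, here is a minimal sketch of the table setup that the one-liner relies on. The module name page_store, the page record and its fields are hypothetical (they are not from any real application), and the sketch assumes a disk schema already exists on every node in Nodes and that mnesia is running on all of them.

```erlang
-module(page_store).
-export([init/1, save_page/1]).

%% Hypothetical record, for illustration only.
-record(page, {id, title, body}).

%% Create the table once, with a disc copy on every node in Nodes.
%% Assumes mnesia:create_schema/1 was already run for those nodes
%% and that mnesia has been started on each of them.
init(Nodes) ->
    mnesia:create_table(page,
                        [{attributes, record_info(fields, page)},
                         {disc_copies, Nodes}]).

%% The one-liner from above. mnesia replicates the write to every
%% node that holds a copy of the page table.
save_page(Page) ->
    mnesia:transaction(fun() -> mnesia:write(Page) end).
```

Nothing in save_page/1 mentions the nodes. Where the replicas live is decided by the table definition, which is why the calling code stays a one-liner.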

Then again, I should say that the need for fragmentation is just a
peeve. Rebuilding time for broken tables is the real deal breaker for
me.

> - The most influential application projects for Erlang's
>    development so far simply haven't seen this as a hard
>    requirement.

Yes, I understand... I'm not criticizing the historical design
decisions that were made because clearly Erlang was not created as a
language for building web based forums :) I just think there are
probably a lot of developers like me who are just getting into Erlang
and who would like to use its full power on a $40/month hosted server,
and it's unfortunate that they should run into these same issues.

By the way, I think that if the Erlang ecosystem had more features
catering to hobbyists like me, this graph
(http://www.google.com/trends?q=ruby+on+rails%2C+erlang&ctab=0&geo=all&date=all)
would look quite different.

>
> - It *is* more difficult to make efficient disk storage
>    for dynamically typed data. Especially ordered_set
>    disk storage is extremely difficult to implement
>    efficiently without knowing the type or size of the
>    keys. IMO, in order to compete squarely with
>    conventional DBMSes on large data volumes, mnesia
>    will have to allow type definitions of data.

I didn't realize this point. Hmm... maybe the create_table function
could have more parameters indicating the type of each field?
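
Just to illustrate what I mean, here is a rough sketch that attaches type information to a table using the existing user_properties option of create_table. This is only metadata: as far as I know mnesia does not interpret or enforce it, so it does nothing for storage efficiency by itself. The module name typed_table, the page record, the field_types key and the type names are all made up for the example.

```erlang
-module(typed_table).
-export([create/0, field_types/1]).

%% Hypothetical record, for illustration only.
-record(page, {id, title, body}).

%% Create the table and record the intended type of each field as a
%% user property. mnesia stores the tuple but does not act on it.
create() ->
    mnesia:create_table(page,
        [{attributes, record_info(fields, page)},
         {user_properties,
          [{field_types, [{id, integer},
                          {title, binary},
                          {body, binary}]}]}]).

%% Read the declared types back from the table definition.
%% Returns undefined if the table has no field_types property.
field_types(Tab) ->
    Props = mnesia:table_info(Tab, user_properties),
    proplists:get_value(field_types, Props).
```

A storage backend that knew about such declarations could in principle use them to lay out keys more compactly, which I take to be the point being made above.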

>
> The rdbms contrib has this, even though it's in early beta.

Interesting! I didn't know that.

> Again, rdbms has an embryo to this. It's not fully functional yet. I would
> certainly welcome some help, but it's probably best to verify the
> indexing functionality first.

That's great! I can't wait for the next Erlang-powered Google killer
:) (It recently occurred to me that Erlang would be a great platform
for such a beast.)

> Ironically, there was one from the start - mnemosyne. It received a bad
> reputation for a couple of reasons:
>
> (1) The main applications at the time were real-time applications with
> little or no need for an optimizing
> query engine. Some people misunderstood this to mean that mnemosyne was no
> good, when in fact it was misused, or simply vast overkill for the given
> applications; and
>
> (2) it was really a research project at the time when it was included in
> OTP, and not enough work was put in to make it product quality.
>
> The aim of mnemosyne was, if I've understood things correctly, partly to
> explore the idea of set comprehensions for database queries, and partly to
> make a very advanced query optimizer, which was especially good at really,
> really hairy queries. There are queries that can bring most query engines
> to their knees, and optimization techniques that can bring down query time
> from hours to minutes in certain situations. Mnemosyne was especially good
> at resolving recursive dependencies. But all such processing comes at a
> cost, and in practice, mnemosyne was mostly used for very simple queries,
> where optimization made little or no difference.

I did hear that the OTP group is working on a query optimizer for QLC,
so I wonder what strategies they are taking to avoid the issues that
came up with mnemosyne. Maybe in order to get the best of both worlds
-- complex optimizations and real time performance -- it would make
sense to have prepared queries whose optimizations are only done once
and then cached for future uses.
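
Here is a small sketch of the kind of usage I have in mind, written against QLC as it exists today. It builds a query handle once and evaluates it many times. To keep it self-contained it runs over an ets table of {Id, Title, Views} tuples, which is a made-up layout, and the module name prepared_query is hypothetical too. Note that this only reuses the handle. Whether any planning work is actually saved between evaluations is up to qlc, so treat it as an illustration of the interface rather than of a real optimization cache.

```erlang
-module(prepared_query).
-export([prepare/1, run/1]).

%% Needed for the qlc:q/1 parse transform.
-include_lib("stdlib/include/qlc.hrl").

%% Build the query handle once. Nothing is read from the table here.
%% Selects the id and title of every row with more than 100 views.
prepare(Tab) ->
    qlc:q([{Id, Title} || {Id, Title, Views} <- ets:table(Tab),
                          Views > 100]).

%% Evaluate a previously built handle. Each call reads the current
%% contents of the table, so the same handle can be run repeatedly.
run(QH) ->
    qlc:e(QH).
```

With mnesia the generator would be mnesia:table(Tab) instead of ets:table(Tab), and run/1 would have to be called inside a transaction or another mnesia activity.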

>
> Basically, most things are possible, but there has to be sufficient demand
> for them, and people willing to put in the effort.
>

I don't know how much demand there is for such features... all I can
do is express demand from at least 1 developer here on the mailing
list :) However, I don't think my issues are so esoteric, so hopefully
they will be answered by the time I build Erlang Hobby Project #2.
For Erlang Hobby Project #1, I'll probably use Yaws backed by MySQL or
Postgres, so fortunately I don't have to undergo complete Erlang
withdrawal :)

Best,
Yariv