[erlang-questions] mnesia sync_transactions not fsynced?

Mon Oct 31 04:39:44 CET 2011

On Sun, Oct 30, 2011 at 3:59 AM, Dan Gudmundsson <dangud@REDACTED> wrote:

> As far as mnesia is concerned, it is logged to disc. It has left
> mnesia's call chain, and there is nothing more mnesia can do except
> sync the disk, which is a performance penalty that is not acceptable
> per transaction.
>
>
"Not acceptable" to who? That's what a durable transaction *is*. For most
users of actual databases, it is not acceptable that a (durable, isolated)
transaction does *not* sync the disk when it claims to synchronize and its
results are globally visible.

Imagine building a bank account balance transfer system on a database that
allows globally visible changes to be rolled back. First, I transfer money
to your account. Then, someone asks how much money you have to cover some
payment. Then, the database crashes, and my transfer to you is reverted,
even though the transfer was observed globally from outside the
transaction.
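
To make the scenario concrete, here is a toy model in Python. It is not
mnesia code: `ToyStore` and its methods are invented for illustration, and
"disk" is just a second dictionary. It only shows the ordering problem, where
a commit is acknowledged and readable before anything has been synced.

```python
class ToyStore:
    """Toy model: commits are visible at once but durable only after sync()."""

    def __init__(self):
        self.visible = {}   # what readers see (memory)
        self.durable = {}   # what survives a crash (stands in for the disk)

    def commit(self, key, value):
        # Acknowledged to the caller without syncing anything.
        self.visible[key] = value

    def sync(self):
        # Everything visible so far becomes durable.
        self.durable = dict(self.visible)

    def crash_and_recover(self):
        # After a crash, only the durable state comes back.
        self.visible = dict(self.durable)


store = ToyStore()
store.commit("me", 100)
store.commit("you", 0)
store.sync()

# Transfer 40 from me to you; the commits are acknowledged, not synced.
store.commit("me", 60)
store.commit("you", 40)
observed = store.visible["you"]      # an outside reader sees the transfer

store.crash_and_recover()
after_crash = store.visible["you"]   # the observed transfer has been reverted
```

Whoever acted on `observed` made a decision based on a balance that no
longer exists after recovery.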

Again, you can change the documentation if you want -- but from this
discussion, I've learned something new -- what mnesia calls a transactional
relational database is not the same as what you'll find in a database
systems textbook.

> I have been reminded why the disc_log cache was implemented: when you
> push transactions in a short loop, you are pushing more transactions
> than can be pushed to disc.
>

This is what back pressure is for. If you need more peak throughput than
your disk can provide (when including blocked commits) then you have to use
non-synchronous "transactions" (which really aren't), and maybe a
replicated table store in a second data center, to reduce the risk of
permanent data loss. Mnesia can do this (neat!) similar to how systems like
MongoDB can do this, but that's not a durable, isolated transaction system
in the face of machine failure.
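
As a rough illustration of back pressure, the Python sketch below puts a
bounded queue between the caller and a single log writer. The function name,
the `max_pending` parameter and the one-record-per-line log format are all
assumptions made for the example; this is not how mnesia or disc_log is
implemented. The point is only that the producer blocks when the disk falls
behind, instead of an unbounded message queue building up.

```python
import os
import queue
import tempfile
import threading


def run_with_back_pressure(records, path, max_pending=4):
    """Write records through a bounded queue; return how many were synced."""
    pending = queue.Queue(maxsize=max_pending)  # put() blocks when full
    synced = 0

    def writer():
        nonlocal synced
        with open(path, "ab") as log:
            while True:
                rec = pending.get()
                if rec is None:           # sentinel: the producer is done
                    return
                log.write(rec + b"\n")
                log.flush()               # user-space buffer -> OS
                os.fsync(log.fileno())    # OS cache -> disk
                synced += 1

    t = threading.Thread(target=writer)
    t.start()
    for rec in records:
        pending.put(rec)   # blocks here while the writer is behind
    pending.put(None)
    t.join()
    return synced
```

With a small `max_pending`, a tight producer loop runs at the speed of the
disk rather than filling memory with unwritten transactions.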

> So that everything was stuck in disc_log's message queue.
> The way to improve that was to write larger chunks to the disc.
>

That's fine. A synchronous transaction commit should not be allowed until
that larger block has committed, though. This means that you may collect
many transactions waiting for commit, and they all commit with that
particular block flush at the same time.
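
This is usually called group commit. A minimal sketch in Python follows,
under stated assumptions: `GroupCommitLog`, its fields and the newline
record format are hypothetical, and none of this is taken from mnesia. Each
caller appends its record and then blocks until an fsync that covers the
record has completed. One caller at a time acts as the flusher, and every
record appended before its flush is released by that single fsync.

```python
import os
import tempfile
import threading


class GroupCommitLog:
    """Append-only log; commit() returns only after an fsync covers the record."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._cond = threading.Condition()
        self._written = 0       # sequence number of the last appended record
        self._durable = 0       # highest sequence number covered by an fsync
        self._syncing = False   # True while some caller is inside fsync
        self.fsync_calls = 0

    def commit(self, record):
        with self._cond:
            self._file.write(record + b"\n")
            self._written += 1
            my_seq = self._written
            while self._durable < my_seq:
                if self._syncing:
                    # Someone else is flushing; wait for the outcome.
                    self._cond.wait()
                    continue
                # Become the flusher for everything appended so far.
                self._syncing = True
                target = self._written
                self._file.flush()        # hand buffered bytes to the OS
                self._cond.release()      # let others append during fsync
                try:
                    os.fsync(self._file.fileno())
                finally:
                    self._cond.acquire()
                    self._syncing = False
                    self._cond.notify_all()
                self._durable = target
                self.fsync_calls += 1
        return my_seq

    def close(self):
        self._file.close()
```

Under concurrent load the number of fsync calls can be much smaller than the
number of commits, yet no commit returns before its record is on disk.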

I think a note in the documentation would be useful. An implementation
that blocks sync transactions until the block has been flushed to disk
(thus unblocking many synchronous transactions at once) might be even
better!

Sincerely,

jw