[erlang-questions] Efficient way to store message lists in mnesia

Richard A. O'Keefe ok@REDACTED
Wed Jun 8 06:22:00 CEST 2016

On 8/06/16 11:26 AM, Dmitry Belyaev wrote:
> Alexander, thanks for a great example of a double-linked list 
> implementation in a database.
> As we can see one insert operation requires 3 write operations in the 
> database in this case. Anyway this is still a constant factor applied 
> to a normal write operation.
I can't recall, did this thread start with a presentation of
  - message sizes
  - rates at which messages are produced and consumed
  - number of registered and of active users
  - quality issues such as what proportion of messages may be lost
that kind of stuff?  I mean that's what tells us how many writes
per message are a problem.

It's also important to distinguish between *worst-case* performance and
*amortised* performance.  A message will go into the doubly linked list
once (at O(1) cost) and come out of the doubly linked list once (at O(1) 
so the amortised cost per message of the doubly linked list is O(1) per
message.  It's an unpleasantly high constant factor, but it is a constant
factor.  Reading and deleting a batch of k messages is then O(k), so a
burst of stuff can happen at one time, but reading is going to take O(k)
time anyway, so we're talking about a matter of constant factor.

There are possible hybrid data structures where you choose a batch
size B and buffer recent messages in memory until there are B of them,
and run a doubly linked list of batches.

More information about the erlang-questions mailing list