[erlang-questions] Disk-backed log

John Smith 4crzen62cwqszy68g7al@REDACTED
Sat Jun 18 12:54:34 CEST 2016

For one of my systems in the financial area, I am in need of a disk-backed
log that I could use as a backend for an Event Sourcing/CQRS store.
Recently, I have read a bit about Kafka [1] and it seems like a good fit
but, unfortunately, it runs on the JVM (written in Scala, to be exact) and
depends heavily on ZooKeeper [2] for distribution, while I would prefer
something similar for the Erlang ecosystem. Thus, ideally, I would like to
have something that is:

  * small,
  * durable (checksummed, with a clear recovery procedure),
  * pure Erlang/Elixir (maybe with some native code, but tightly
  * (almost) not distributed - data fits on a single node (at least for
    now; with replication for durability, though).
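
To make the "durable" point concrete, here is a rough sketch of the kind of
record framing and recovery I have in mind. It is written in Python purely
as illustration (the real thing would be Erlang/Elixir), and the layout
(4-byte length, 4-byte CRC32, payload) as well as the names `append` and
`recover` are my own assumptions, not taken from Kafka or any existing
library:

```python
import os
import struct
import zlib

# Hypothetical record header: payload length and CRC32 of the payload,
# both as big-endian unsigned 32-bit integers.
HEADER = struct.Struct(">II")


def append(path, payload):
    """Append one framed record and fsync before returning."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)) + payload)
        f.flush()
        os.fsync(f.fileno())


def recover(path):
    """Return all valid records; cut the file after the last good one."""
    records, good_end = [], 0
    with open(path, "r+b") as f:
        data = f.read()
        pos = 0
        while pos + HEADER.size <= len(data):
            length, crc = HEADER.unpack_from(data, pos)
            start = pos + HEADER.size
            payload = data[start:start + length]
            # Stop at the first truncated or corrupted record.
            if len(payload) < length or zlib.crc32(payload) != crc:
                break
            records.append(payload)
            pos = start + length
            good_end = pos
        # Drop the torn tail so later appends start on a clean boundary.
        f.truncate(good_end)
    return records
```

Recovery here simply reads the whole file into memory, which is fine for a
sketch but would have to become a streaming scan for real segment sizes.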

Before jumping right into implementation, I have some questions:

  1. Is there anything already available that fulfils the above requirements?
  2. Kafka takes a different approach to persistence - instead of keeping
in-process buffers and flushing them to disk, it writes straight to the
filesystem, which in practice means the OS pagecache [3]. Can I achieve the
same thing in Erlang, or does it buffer writes in some other way?
  3. ...also, Kafka has log compaction [4], which can work not only in the
time dimension but also in the key dimension - I need this, as I need to
persist the last state for every key seen (user, transfer, etc.). As far as
I understand, Kafka, like Redis, uses UNIX copy-on-write semantics (process
fork) to avoid needless memory usage for log fragments (segments, in Kafka
nomenclature) that have not changed. Can I mimic similar behaviour in
Erlang? Or, if not, how can I deal with biggish (say, a couple of GB) logs
that need to be compacted?
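
To be precise about what I mean by compaction in the key dimension, the
behaviour I am after is roughly the following. Again, this is only an
illustrative Python sketch; the `(offset, key, value)` record shape and the
name `compact` are assumptions of mine, and a real version would stream
segments from disk rather than hold a list in memory:

```python
def compact(records):
    """Keep only the latest record per key, preserving log order.

    `records` is a list of (offset, key, value) tuples sorted by offset.
    """
    # Pass 1: remember the highest offset seen for every key.
    last_offset = {}
    for offset, key, _value in records:
        last_offset[key] = offset
    # Pass 2: keep a record only if it is the newest one for its key.
    return [(o, k, v) for (o, k, v) in records if last_offset[k] == o]


log = [
    (0, "user:1", "created"),
    (1, "transfer:7", "pending"),
    (2, "user:1", "verified"),
]
compacted = compact(log)
```

Only the key-to-offset map has to live in memory in this scheme, so its
size depends on the number of distinct keys rather than on the log size.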

In other words, I would like to create something like a *Minimum Viable
Log* (in Kafka style), only in Erlang/Elixir. I would be grateful for any
kind of design/implementation hints.

[1] http://kafka.apache.org/
[2] https://zookeeper.apache.org/
[3] http://kafka.apache.org/documentation.html#persistence
[4] https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction
