[erlang-questions] Disk-backed log
Sat Jun 18 12:54:34 CEST 2016
For one of my systems in the financial area, I am in need of a disk-backed
log that I could use as a backend for an Event Sourcing/CQRS store.
Recently, I have read a bit about Kafka and it seems like a good fit,
but, unfortunately, it runs on the JVM (it is written in Scala, to be exact)
and depends heavily on ZooKeeper for distribution, while I would prefer
something similar within the Erlang ecosystem. Thus, ideally, I would like to
have something that is:
* durable (checksummed, with a clear recovery procedure),
* pure Erlang/Elixir (maybe with some native code, but tightly integrated),
* (almost) not distributed - data fits on a single node (at least for now;
with replication for durability, though).
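The "checksummed, with a clear recovery procedure" requirement could be sketched as a simple record framing. The module and frame layout below are illustrative assumptions, not an existing library; only `erlang:crc32/1` is standard OTP:

```erlang
%% Minimal sketch of a checksummed append-only record format,
%% assuming a frame layout of <<CRC32:32, Size:32, Payload/binary>>.
%% Module and function names are hypothetical.
-module(mvlog).
-export([encode/1, decode/1]).

%% Frame a payload with its size and CRC32 so recovery can detect
%% a torn or corrupt tail record after a crash.
encode(Payload) when is_binary(Payload) ->
    Crc = erlang:crc32(Payload),
    <<Crc:32, (byte_size(Payload)):32, Payload/binary>>.

%% Verify one frame; returns {ok, Payload, Rest} on success,
%% {error, corrupt} on a CRC mismatch, {error, truncated} otherwise.
decode(<<Crc:32, Size:32, Payload:Size/binary, Rest/binary>>) ->
    case erlang:crc32(Payload) of
        Crc -> {ok, Payload, Rest};
        _   -> {error, corrupt}
    end;
decode(_) ->
    {error, truncated}.
```

On recovery, one would scan frames from the start of the segment and truncate at the first corrupt or truncated frame, which is roughly what Kafka's log recovery does.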
Before jumping right into implementation, I have some questions:
1. Is there anything already available that fulfils the above requirements?
2. Kafka uses a different approach to persistence - instead of keeping
in-process buffers and transferring data to disk, it writes straight to the
filesystem, which effectively means writing to the OS page cache. Can I
achieve the same thing in Erlang, or does it buffer writes in some other way?
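On question 2, a sketch of what I believe the standard `file` module offers (the filename is just an example): with `raw` mode the file server is bypassed and writes go straight to the OS, so the kernel page cache does the buffering, much as Kafka relies on; Erlang only buffers in user space if you opt in with `{delayed_write, Size, Delay}`:

```erlang
%% Appending to a log segment through the OS page cache.
%% [raw, append] bypasses the Erlang file server; the kernel
%% caches the written pages until it (or we) flush them.
{ok, Fd} = file:open("segment.log", [raw, binary, append]),
ok = file:write(Fd, <<"event-payload">>),
%% Force the data out of the page cache to stable storage
%% when durability matters more than throughput:
ok = file:datasync(Fd),
ok = file:close(Fd).
%% Opting into user-space buffering instead (64 KB or 2 s, whichever first):
%% file:open("segment.log",
%%           [raw, binary, append, {delayed_write, 64 * 1024, 2000}]).
```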
3. ...also, Kafka has log compaction, which can work not only in the time
dimension but also in the key dimension - I need this, as I need to persist
the last state for every key seen (user, transfer, etc.). Like Redis, Kafka
uses the UNIX copy-on-write semantics (a process fork) to avoid needless
memory usage for log fragments (segments, in Kafka nomenclature) that have
not changed. Can I mimic a similar behaviour in Erlang? Or, if not, how can
I deal with biggish (say, a couple of GB) logs that need to be compacted?
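The core of key-based compaction could look like the sketch below (module name hypothetical): fold over records in log order, keeping only the latest value per key, then write the survivors to a fresh segment. Running this in a separate Erlang process over an immutable, already-closed segment file gives a rough stand-in for the fork-based copy-on-write trick, since the live writer never touches that segment:

```erlang
%% Sketch of key-based compaction over a list of {Key, Value}
%% records in log order: the last write for each key wins.
-module(compact).
-export([last_per_key/1]).

last_per_key(Records) ->
    %% Later records overwrite earlier ones for the same key.
    Latest = lists:foldl(
               fun({Key, Value}, Acc) -> maps:put(Key, Value, Acc) end,
               #{}, Records),
    maps:to_list(Latest).
```

For multi-GB logs the fold would stream frames from disk in chunks rather than hold the whole segment in memory; only the key-to-latest-offset index needs to fit in RAM, which is also how Kafka's log cleaner bounds its memory use.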
In other words, I would like to create something like a *Minimum Viable
Log* (in Kafka style), only in Erlang/Elixir. I would be grateful for any
kind of design/implementation hints.