[erlang-questions] Disk-backed log

Tristan Sloughter t@REDACTED
Sat Jun 18 21:24:11 CEST 2016


Do *not* try to use this https://github.com/tsloughter/vonnegut/ :) but
it may help you if you decide to go down the route of implementing the
features you need from something like Kafka in Erlang.
 
I'm adding an additional app that will be responsible for the role that
ZooKeeper plays for Kafka, but in this case it is for chain replication.
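For context, chain replication here just means each node persists a write
and then hands it to its successor, and only the tail acknowledges, so an
ack implies every replica has the entry. A stripped-down sketch of that one
hop, purely illustrative and not vonnegut's actual code (the module name,
log handling, and message shape are made up):

-module(chain_node).
-export([write/3]).

%% Illustrative only: append locally, then hand the entry to the successor.
%% The tail is the only node that acknowledges, so an ack means every
%% replica in the chain has the write.
write(Log, Entry, tail) ->
    ok = disk_log:log(Log, Entry),      %% persist locally; we are the tail
    {ok, acknowledged};
write(Log, Entry, Successor) when is_pid(Successor) ->
    ok = disk_log:log(Log, Entry),      %% persist locally first
    Successor ! {replicate, Entry},     %% then forward down the chain
    {ok, forwarded}.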
 
It uses the same on-disk format as Kafka and the same concept of
topics/partitions, which may come in handy. While I must repeat that it
is very much a work in progress, something I've played around with to
try out other libs like teleport, it may give you a place to start.
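To make the on-disk part concrete, here is a minimal sketch of appending one
record in a Kafka-like segment layout (offset, length, CRC, payload). The
field widths and the module name are illustrative, not vonnegut's or Kafka's
exact wire format:

-module(segment_sketch).
-export([open/1, append/3]).

%% Open a segment file for appending; raw + binary avoids an intermediate
%% Erlang process and keeps writes cheap.
open(Path) ->
    file:open(Path, [raw, binary, append]).

%% Append one record as <<Offset:64, Size:32, CRC:32, Payload/binary>>.
append(Fd, Offset, Payload) when is_binary(Payload) ->
    Crc = erlang:crc32(Payload),
    Record = <<Offset:64/unsigned, (byte_size(Payload)):32/unsigned,
               Crc:32/unsigned, Payload/binary>>,
    ok = file:write(Fd, Record),
    {ok, Offset + 1}.

A real segment would keep the file descriptor open across appends and roll
over to a new file once a size limit is reached.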
 
--
Tristan Sloughter
t@REDACTED
 
 
 
On Sat, Jun 18, 2016, at 05:54 AM, John Smith wrote:
> For one of my systems in the financial area, I need a disk-backed log
> that I could use as a backend for an Event Sourcing/CQRS store.
> Recently, I have read a bit about Kafka [1] and it seems like a good
> fit but, unfortunately, it runs on the JVM (written in Scala, to be
> exact) and depends heavily on ZooKeeper [2] for distribution, while I
> would prefer something native to the Erlang ecosystem. Thus, ideally,
> I would like to have something that is:
>
> * small,
> * durable (checksummed, with a clear recovery procedure),
> * pure Erlang/Elixir (maybe with some native code, but tightly
> integrated),
> * (almost) not distributed - data fits on a single node (at least for
> now; with replication for durability, though).
>
> Before jumping right into implementation, I have some questions:
>
> 1. Is there anything already available that fulfils the above
> requirements?
> 2. Kafka uses a different approach to persistence - instead of filling
> in-process buffers and then transferring data to disk, it writes
> straight to the filesystem, which in practice means the OS page cache
> [3]. Can I achieve the same thing in Erlang, or does it buffer writes
> in some other way? (see the first sketch after this list)
> 3. ...also, Kafka has log compaction [4], which can work not only on a
> time dimension but also on a key dimension - I need this, as I need to
> persist the last state for every key seen (user, transfer, etc.). As in
> Redis, Kafka uses UNIX copy-on-write semantics (a process fork) to
> avoid needless memory usage for log fragments (segments, in Kafka
> nomenclature) that have not changed. Can I mimic a similar behaviour in
> Erlang? Or, if not, how can I deal with biggish (say, a couple of GB)
> logs that need to be compacted? (see the second sketch after this list)
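On question 2: assuming one raw append-only file per partition, Erlang does
not have to buffer in-process at all. With [raw, binary, append] and no
delayed_write option, each file:write/2 hands the bytes to the OS, so the
page cache does the buffering, much as the Kafka docs describe;
file:datasync/1 (or file:sync/1) is only needed when you must force a
durability point. The file name below is made up:

{ok, Fd} = file:open("topic-0.log", [raw, binary, append]),
ok = file:write(Fd, <<"event bytes">>),  %% goes straight to the OS; the page cache buffers it
ok = file:datasync(Fd),                  %% only when you need a durability point
ok = file:close(Fd).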
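On question 3: the BEAM does not give you a Redis-style fork with
copy-on-write pages, but a multi-gigabyte log can still be compacted by
streaming the old segment, keeping only the last value per key, writing the
survivors to a fresh segment, and then swapping the files. A sketch under
the assumption that entries are plain {Key, Value} terms in a disk_log (the
module name is made up; a real implementation would reuse the binary segment
layout above):

-module(compact_sketch).
-export([compact/2]).

%% Stream the old log, remember only the last value seen for each key, and
%% write those survivors to a fresh log; the caller can then replace the old
%% log with the new one.
compact(OldLog, NewLog) ->
    Latest = last_per_key(OldLog, start, #{}),
    maps:fold(fun(K, V, ok) -> ok = disk_log:log(NewLog, {K, V}) end, ok, Latest).

last_per_key(Log, Cont, Acc) ->
    case disk_log:chunk(Log, Cont) of
        eof ->
            Acc;
        {Cont2, Terms} ->
            Acc2 = lists:foldl(fun({K, V}, M) -> M#{K => V} end, Acc, Terms),
            last_per_key(Log, Cont2, Acc2)
    end.

Memory use is then bounded by the number of distinct keys rather than by
the size of the log.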
>
> In other words, I would like to create something like a *Minimum
> Viable Log* (in Kafka style), only in Erlang/Elixir. I would be
> grateful for any kind of design/implementation hints.
>
> [1] http://kafka.apache.org/
> [2] https://zookeeper.apache.org/
> [3] http://kafka.apache.org/documentation.html#persistence
> [4] https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction
 