[ANN] ram, an in-memory distributed KV store

Wed Dec 22 23:56:02 CET 2021

I added cloudi_crdt so that internal (Erlang/Elixir/etc.) CloudI services 
can have an in-memory distributed KV database.  It uses a POLog CRDT with 
the data stored locally in an Erlang map (reads access only the local 
Erlang map).  There is additional functionality (bootstrap and 
clean_vclocks) to ensure fault-tolerance problems are handled (e.g., 
service processes crashing, netsplits, etc.).
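
To make the POLog idea concrete, here is a minimal sketch in Python. It is 
not the cloudi_crdt code or its API: the Replica class, its method names, 
and the tie-break rule (highest origin node name wins among concurrent 
writes) are all assumptions made for illustration. Bootstrap, log cleanup, 
and the network layer are left out. Each write is tagged with a vector 
clock, the log keeps only writes that no later write supersedes, and reads 
touch only the local map.

```python
from dataclasses import dataclass, field


def happened_before(a, b):
    """True when vector clock a is strictly older than vector clock b."""
    return all(a[n] <= b[n] for n in a) and a != b


@dataclass
class Replica:
    """Hypothetical replica, for illustration only (not cloudi_crdt)."""
    node: str
    nodes: tuple                               # fixed membership, known up front
    clock: dict = field(default_factory=dict)  # this replica's vector clock
    polog: list = field(default_factory=list)  # ops: (vclock, origin, key, value)
    data: dict = field(default_factory=dict)   # local map; reads touch only this

    def __post_init__(self):
        self.clock = {n: 0 for n in self.nodes}

    def put(self, key, value):
        """Apply a write locally and return the op to broadcast to other nodes."""
        self.clock[self.node] += 1
        op = (dict(self.clock), self.node, key, value)
        self._deliver(op)
        return op

    def receive(self, op):
        """Deliver an op that another replica broadcast."""
        vclock = op[0]
        for n in self.nodes:
            self.clock[n] = max(self.clock[n], vclock[n])
        self._deliver(op)

    def get(self, key, default=None):
        """Reads never leave the node."""
        return self.data.get(key, default)

    def _deliver(self, op):
        vclock, _origin, key, _value = op
        if op in self.polog:
            return  # duplicate delivery
        same_key = [o for o in self.polog if o[2] == key]
        if any(happened_before(vclock, o[0]) for o in same_key):
            return  # a causally newer write for this key is already logged
        # Drop the logged writes that this op supersedes, then log the op.
        self.polog = [o for o in self.polog
                      if not (o[2] == key and happened_before(o[0], vclock))]
        self.polog.append(op)
        # Whatever remains for this key is mutually concurrent; pick the
        # winner deterministically (assumed rule: highest origin name).
        winner = max((o for o in self.polog if o[2] == key), key=lambda o: o[1])
        self.data[key] = winner[3]
```

With three replicas that receive the same writes in different orders, every 
replica ends up with the same value for a key, because the value is derived 
from the set of unsuperseded writes rather than from arrival order.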

The configuration does need to know the number of nodes that will be 
used, though a node can be replaced without problems.  The amount of 
messaging a POLog requires means that the node count is best kept low 
unless you have fast hardware (so use 4 nodes and think twice before 
trying 64 nodes).  There is basic use in cloudi_service_request_rate and 
more complex use in cloudi_service_funnel.  The 
cloudi_service_request_rate service runs a load test to find the CloudI 
service request throughput that can be sustained without any timeouts.  
The cloudi_service_funnel service acts as a proxy that groups multiple 
service requests into a single service request, so it could be used for a 
distributed fault-tolerant cron setup (with cloudi_service_cron).  The 
relevant links are below:

https://github.com/CloudI/CloudI/blob/develop/src/lib/cloudi_core/src/cloudi_crdt.erl
https://github.com/CloudI/CloudI/blob/develop/src/lib/cloudi_service_request_rate/src/cloudi_service_request_rate.erl
https://github.com/CloudI/CloudI/blob/develop/src/lib/cloudi_service_funnel/src/cloudi_service_funnel.erl
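
As a rough illustration of the node count advice above, the sketch below 
counts messages under a simplifying assumption: every node broadcasts each 
of its updates directly to every other node. The function name is 
hypothetical, and the real cloudi_crdt message pattern may differ (for 
example, it may send additional vclock messages), so treat the numbers as 
a scaling argument and not as a measurement.

```python
def broadcast_messages(nodes: int, updates_per_node: int = 1) -> int:
    """Messages sent when every node broadcasts each update to all other nodes.

    Assumes a full-mesh broadcast; this is an illustrative model only.
    """
    return nodes * (nodes - 1) * updates_per_node


for n in (4, 64):
    # 4 nodes -> 12 messages, 64 nodes -> 4032 messages (one update per node)
    print(n, broadcast_messages(n))
```

Under that assumption, going from 4 to 64 nodes multiplies the node count 
by 16 but the message count by 336.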

Best Regards,
Michael

On 12/21/21 5:57 AM, Roberto Ostinelli wrote:
> Let’s write a database! Well not really, but I think it’s a little sad 
> that there doesn’t seem to be a simple in-memory distributed KV 
> database in Erlang. Many times all I need is a consistent distributed 
> ETS table.
>
> The two main ones I normally consider are:
>
>   * *Riak* which is great, it handles loads of data and is based on
>     DHTs. This means that when there are cluster changes there is a
>     need for redistribution of data and the process needs to be
>     properly managed, with handoffs and so on. It is really great but
>     it’s eventually consistent and on many occasions it may be
>     overkill when all I’m looking for is a simple in-memory ACI(not D)
>     KV solution which can have 100% of its data replicated on every node.
>   * *mnesia* which could be it, but unfortunately requires special
>     attention when initializing tables and making them distributed
>     (which is tricky), handles net splits very badly, needs hacks to
>     resolve conflicts, and does not really support dynamic clusters
>     (additions can be kind of ok, but for instance you can’t remove
>     nodes unless you stop the app).
>   * …other solutions? In general people end up using FoundationDB or
>     Redis (which has master-slave replication), so external to the
>     BEAM. Pity, no?
>
>
> So… :) Well I don’t plan to write a database (since ETS is /awesome/), 
> rather distributing it in a cluster. I’d simply want a distributed ETS 
> solution after all!
>
> I’ve already started the work and released version 0.1.0 of ram:
> https://github.com/ostinelli/ram
>
> Docs are here:
> https://hexdocs.pm/ram
>
> Please note this is a very early stage. It started as an experiment 
> and it might remain one. So feedback is welcome to decide its future!
>
> Best,
> r.
