[erlang-questions] Choice in Distributed Databases for a Key/Value Store

code wiget <>
Mon Sep 18 17:34:39 CEST 2017


Hi,

Thank you all for your replies.

Nathaniel: The reads must be 'eventually' consistent, converging within about a second. The data being updated is user connection information, and a user will be unable to connect if our read does not see the latest write. So a connection attempt that arrives after an update, but before the write is fully committed, will fail. I suppose it is OK if they occasionally cannot connect and just have to reconnect, but ideally they should be able to connect every time.
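One way to tolerate a short replication lag on the connect path is to retry the read for a bounded time before rejecting the connection. Below is a minimal sketch in Python, not a definitive implementation. `read_fn` is a hypothetical callable standing in for whatever store client is used, and it is assumed to return `None` while the write is not yet visible. The one-second budget comes from the requirement above; the polling delay is a placeholder.

```python
import time

def read_with_retry(read_fn, key, budget_s=1.0, delay_s=0.05):
    """Poll read_fn(key) until it returns a value or the time budget runs out.

    read_fn is a hypothetical stand-in for the real store client; it is
    assumed to return None while the write has not yet become visible.
    """
    deadline = time.monotonic() + budget_s
    while True:
        value = read_fn(key)
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            # Give up: the caller rejects the connection and the client reconnects.
            return None
        time.sleep(delay_s)
```

The trade-off is that a connection attempt for a key that truly does not exist now takes the full budget to fail, so the budget should stay close to the expected replication lag.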

So Riak seems like a great solution, but its speed really worries me. We are trying to connect as many clients as possible per server, which is very important because it saves us money. If reads take 2-3x as long, that could slow each server down badly. According to this article: https://github.com/citrusbyte/redis-comparison, Riak is up to 10x slower than Redis. This would really hurt our operations.
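Published comparisons may not match a particular workload, so it can help to time the candidate stores with the same read/write mix the application produces. The following is a rough sketch of such a harness, assuming the roughly 50 reads per write described in the original question. `read_fn` and `write_fn` are hypothetical wrappers around the real client calls; a plain dict stands in for the database so the sketch runs on its own, which means its timings say nothing about any real store.

```python
import statistics
import time

def measure_mix(read_fn, write_fn, keys, reads_per_write=50, rounds=100):
    """Time a read-heavy workload and return counts and median latencies.

    read_fn(key) and write_fn(key, value) are hypothetical callables that
    wrap the real client (Redis, Riak, Aerospike, ...).
    """
    read_times, write_times = [], []
    for i in range(rounds):
        key = keys[i % len(keys)]
        start = time.perf_counter()
        write_fn(key, i)
        write_times.append(time.perf_counter() - start)
        for _ in range(reads_per_write):
            start = time.perf_counter()
            read_fn(key)
            read_times.append(time.perf_counter() - start)
    return {
        "reads": len(read_times),
        "writes": len(write_times),
        "read_median_s": statistics.median(read_times),
        "write_median_s": statistics.median(write_times),
    }

# Stand-in store so the sketch runs without a database.
store = {}
result = measure_mix(store.get, store.__setitem__, ["user:1", "user:2"])
```

Running the same harness against each candidate, from the machines that will host the Erlang nodes, gives numbers that include network round trips rather than only server-side speed.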

To those who suggested Redis Cluster: my problem with it is that it seemed to be in an experimental stage. It also has the problem that if all copies of a node die, the cluster loses that node's data, and it is up to the user to prevent that. All of this has to be handled by the user, which seems like it will get tedious when there are multiple nodes, and all it would take is for one admin to mess it up.
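To make that failure mode concrete: Redis Cluster assigns every key to one of 16384 hash slots (CRC16 of the key, modulo 16384), and each master with its replicas serves a range of slots. If a master and all of its replicas are lost, the keys in those slots are lost with them. Here is a minimal sketch of the slot calculation, assuming Python's `binascii.crc_hqx` with a zero initial value matches the CRC16 variant named in the cluster specification. The helper names and any key names used with them are made up for illustration.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster

def key_slot(key: bytes) -> int:
    """Return the hash slot for a key, honouring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % SLOTS

def keys_lost(keys, dead_slots):
    """List the keys whose slot falls in a range served only by dead nodes."""
    return [k for k in keys if key_slot(k) in dead_slots]
```

Keys that share a hash tag, such as `{user:1}:name` and `{user:1}:age`, land in the same slot, so all fields for one ID live or die together.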

So this is where Aerospike comes in. From what I have read on the web, it comes across as the perfect tool for a distributed version of Redis: https://stackoverflow.com/questions/24482337/how-is-aerospike-different-from-other-key-value-nosql-databases . But for some reason, it doesn’t get as much attention as Redis.

Does anyone have experience with Aerospike? For my application, it seems like a no-brainer.

Thank you all again,
> On Sep 15, 2017, at 2:02 PM, Nathaniel Waisbrot <> wrote:
> 
> Scatter-shot reply:
> 
> Since you're using Redis right now, have you considered Redis Cluster (https://redis.io/topics/cluster-tutorial)?
> 
> I'm using Cassandra and don't feel that it's got a small community or slow pace of updates. There are a lot of NoSQL databases and they all have quite different tradeoffs, which tends to fragment the community, so your expectations may be too high.
> 
> Riak, ElasticSearch, EtcD, MongoDB, etc. You have many (too many!) options. When you say "read speed and consistency" what sort of consistency are you looking for? Is eventual consistency good, or do you require that every read that takes place after a write gets the new data?
> 
> 
> 
> 
>> On Sep 15, 2017, at 12:43 PM, code wiget <> wrote:
>> 
>> Hello everyone,
>> 
>> I am at the point where I have many Erlang nodes, and I am going to have to move to a distributed database. Right now, I am using a basic setup: each Erlang node has a copy of the same Redis DB, and all of those DBs are slaves (non-writable copies) of a master. A big problem with this is obvious: if a node's DB goes down, the node goes down, and if the master goes down, the slaves won’t get updated. So I would like to move to a distributed DB that all of my nodes can read from and write to, and that does not go down.
>> 
>> The nodes do ~50 reads per write, and are constantly reading, so read speed and consistency are my real concerns. I believe read speed will be the main factor in each node’s performance.
>> 
>> Another thing is that all of my data is key/key/value, so it would mimic the structure of ID -> name -> “Fred”, ID -> age -> 20, so I don’t need a SQL DB.
>> 
>> Another big thing is that I don’t need disc copies, as I have a large backup store from which the values are generated.
>> 
>> I have looked at as many options as I can ->
>> 
>> Voldemort: http://project-voldemort.com/
>> - looks perfect, but there are no resources for learning it outside of their docs, and there is no Erlang driver. That is a big deal, because I would have to learn how to write a C driver, and learn everything about Voldemort, just to get it working.
>> 
>> Cassandra: http://cassandra.apache.org/
>> - looks good too, but apparently it has a small community and isn’t updated often
>> 
>> Scalaris: https://github.com/scalaris-team/scalaris/blob/master/user-dev-guide/main.pdf
>> - Looks very cool and seems great, but there is no active community and their GitHub isn’t updated often. This is a distributed, all in-memory database written in Erlang.
>> 
>> 
>> So from my research, which relied heavily on this blog: https://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores , I have narrowed it down to these three.
>> 
>> BUT you are all the real experts and have built huge applications in Erlang, what do you use? What do you have experience in that performs well with Erlang nodes spread across multiple machines and possibly multiple data centers?
>> 
>> Thanks for your time.
>> 
> 
