[erlang-questions] Choice in Distributed Databases for a Key/Value Store

Paul Oliver puzza007@REDACTED
Tue Sep 19 01:00:08 CEST 2017


I'd recommend checking jepsen.io for testing of distributed systems.
There's a very thorough review of Aerospike there with some results that
may give you pause. https://aphyr.com/posts/324-jepsen-aerospike

On Tue, Sep 19, 2017 at 4:54 AM Heinz N. Gies <heinz@REDACTED> wrote:

> I would not put much stock in those ‘benchmarks’; they’re highly bogus, and
> that’s if you’re treating them kindly.
>
> For a start, it uses default settings, and they are not even provided.
> Redis is an in-memory store by default; is it even saving the data? How are
> Riak or Cassandra set up? Unlike Mongo or Redis, those are built to be
> clustered; are the default configs used for them disabling useless
> overhead? Does it mean Riak, which is storing every write on disk, perhaps
> 3 times, is only 10x slower compared to a database that never writes to
> disk and only keeps one copy?
>
> For your own sanity, print that benchmark, find a burn-proof area (safety
> matters!) and set it on fire, then move on and benchmark for yourself with a
> real use case and sensible data.
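
As a rough illustration of "benchmark for yourself", here is a minimal Python sketch of a workload driver using the ~50 reads per write ratio mentioned in the original question further down the thread. Everything in it is an assumption made for illustration: `DictStore` is a hypothetical in-memory stand-in, and `run_workload` and `percentile` are made-up helper names. To measure a real database you would replace `DictStore` with a thin wrapper around that database's client exposing the same `get`/`set` methods, run it against a realistically configured cluster, and use your own keys and values.

```python
import math
import random
import time


class DictStore:
    """Hypothetical in-memory stand-in; swap in a wrapper around a real client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def run_workload(store, n_ops=10_000, reads_per_write=50, n_keys=1_000, seed=0):
    """Issue n_ops operations, one write followed by `reads_per_write` reads,
    and return the per-operation latencies in seconds, split by kind."""
    rng = random.Random(seed)
    latencies = {"read": [], "write": []}
    for i in range(n_ops):
        key = f"user:{rng.randrange(n_keys)}"
        kind = "write" if i % (reads_per_write + 1) == 0 else "read"
        start = time.perf_counter()
        if kind == "write":
            store.set(key, i)
        else:
            store.get(key)
        latencies[kind].append(time.perf_counter() - start)
    return latencies


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    result = run_workload(DictStore())
    for kind, samples in result.items():
        print(kind, len(samples), "ops, p99 =", percentile(samples, 99), "s")
```

Reporting a high percentile rather than an average is a deliberate choice here: tail latency is usually what a connecting client notices.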
>
>
> On 18. Sep 2017, at 17:34, code wiget <codewiget95@REDACTED> wrote:
>
> Hi,
>
> Thank you all for your replies.
>
> Nathaniel: The reads must be 'eventually' consistent, at least within a
> second. The problem is that it updates user connection information, and
> they will be unable to connect if our read does not get information from
> the write. So if we update, a connection attempted before the write is
> fully committed will fail. I suppose it is OK if they cannot connect and
> just have to reconnect, but ideally they should be able to connect every
> time.
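
One way to picture the "within a second" requirement is a bounded retry on the reading side: keep re-reading until the value reflects the write, and give up once the deadline passes. The Python sketch below is only an illustration under assumptions. `read_until_fresh`, `read`, and `is_fresh` are hypothetical names, and how freshness is detected (a version number, a timestamp, a session token) depends entirely on the application and the database.

```python
import time


def read_until_fresh(read, is_fresh, timeout_s=1.0, interval_s=0.05,
                     clock=time.monotonic, sleep=time.sleep):
    """Call read() until is_fresh(value) is true or timeout_s has elapsed.

    `read` and `is_fresh` are caller-supplied callables (hypothetical here);
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        value = read()
        if is_fresh(value):
            return value
        if clock() >= deadline:
            raise TimeoutError("read did not observe the write before the deadline")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated replica that returns a stale version twice before catching up.
    replies = iter([{"version": 1}, {"version": 1}, {"version": 2}])
    value = read_until_fresh(lambda: next(replies),
                             lambda v: v["version"] >= 2,
                             sleep=lambda _seconds: None)
    print(value)
```

This trades a little latency on the rare stale read for fewer failed connections; it does not make the store itself any more consistent.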
>
> So Riak seems like a great solution, but speed-wise it really worries me.
> We are trying to connect as many clients as possible per server; this is
> very important as it saves us money. If the reads take 2-3x as long, this
> could be very slow and bad. According to this article:
> https://github.com/citrusbyte/redis-comparison, Riak is up to 10x slower
> than Redis. This would really hurt our operations.
>
> To those who suggested redis-cluster, my problem with a cluster solution
> is that redis-cluster seemed to be in an experimental stage. It also has
> the problem where if all copies of a node die, then the cluster will lose
> all that data and it is up to the user to not lose that data. All of this
> has to be handled by the user, and this seems like it will get tedious when
> there are multiple nodes and all it would take is for one admin to mess it
> up.
>
> So this is where Aerospike comes in. Reading about them on the web they
> come off as the perfect tool for a version of redis that is distributed:
> https://stackoverflow.com/questions/24482337/how-is-aerospike-different-from-other-key-value-nosql-databases .
> But for some reason, they don’t get as much attention as Redis.
>
> Does anyone have experience with Aerospike? For my application, it seems
> like a no-brainer.
>
> Thank you all again,
>
> On Sep 15, 2017, at 2:02 PM, Nathaniel Waisbrot <nathaniel@REDACTED>
> wrote:
>
> Scatter-shot reply:
>
> Since you're using Redis right now, have you considered Redis Cluster (
> https://redis.io/topics/cluster-tutorial)?
>
> I'm using Cassandra and don't feel that it's got a small community or slow
> pace of updates. There are a lot of NoSQL databases and they all have quite
> different tradeoffs which tends to fragment the community, so your
> expectations may be too high.
>
> Riak, Elasticsearch, etcd, MongoDB, etc. You have many (too many!)
> options. When you say "read speed and consistency" what sort of consistency
> are you looking for? Is eventual consistency good, or do you require that
> every read that takes place after a write gets the new data?
>
>
>
>
> On Sep 15, 2017, at 12:43 PM, code wiget <codewiget95@REDACTED> wrote:
>
> Hello everyone,
>
> I am at the point where I have many Erlang nodes, and I am going to have
> to move to a distributed database. Right now, I am using a basic setup:
> each Erlang node has a copy of the same Redis DB, and all of those DBs are
> slaves (non-writable copies) of a master. A big problem with this is
> obvious: if the DB goes down, the node goes down. If the master goes down,
> the slaves won’t get updated, so I would like to move to a distributed DB
> that all of my nodes can read/write to and that cannot/does not go down.
>
> The nodes do ~50 reads per write, and are constantly reading, so read
> speed and consistency are my real concerns. I believe this will be the
> node’s main speed factor.
>
> Another thing is that all of my data is key/key/value, so it would mimic
> the structure of ID -> name -> “Fred”, ID -> age -> 20, so I don’t need a
> SQL DB.
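
The key/key/value shape described above can be sketched as a two-level map. The Python class below is a hypothetical in-memory illustration (the name `KeyKeyValueStore` and its methods are made up for this example, not any database's API). In Redis terms this shape corresponds to one hash per ID, written with `HSET` and read with `HGET`.

```python
from collections import defaultdict


class KeyKeyValueStore:
    """Hypothetical two-level store: outer key (ID) -> inner key (field) -> value."""

    def __init__(self):
        self._rows = defaultdict(dict)

    def put(self, outer_key, inner_key, value):
        self._rows[outer_key][inner_key] = value

    def get(self, outer_key, inner_key, default=None):
        # .get() on the outer map avoids creating empty rows for unknown IDs.
        return self._rows.get(outer_key, {}).get(inner_key, default)

    def get_all(self, outer_key):
        # Return a copy so callers cannot mutate the stored row by accident.
        return dict(self._rows.get(outer_key, {}))


if __name__ == "__main__":
    store = KeyKeyValueStore()
    store.put(42, "name", "Fred")
    store.put(42, "age", 20)
    print(store.get(42, "name"))
    print(store.get_all(42))
```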
>
> A big thing also is that I don’t need disc copies, as I have a large
> backup store where the values are generated from.
>
> I have looked at as many options as I can ->
>
> Voldemort : http://project-voldemort.com/
> - looks perfect, but there are 0 resources on learning how to use it
> outside of their docs and no Erlang driver, which is huge because I would
> both have to learn how to write a C driver and everything about this just
> to get it to work.
>
> Cassandra: http://cassandra.apache.org/
> - looks good too, but apparently it has a small community and isn’t
> updated often
>
> Scalaris:
> https://github.com/scalaris-team/scalaris/blob/master/user-dev-guide/main.pdf
> - Looks very very cool, seems great, but there is 0 active community and
> their GitHub isn’t updated often. This is a distributed all in-memory
> database, written in Erlang.
>
>
> So from my research, which consisted heavily of this blog:
> https://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores ,
> I have narrowed it down to these three.
>
> BUT you are all the real experts and have built huge applications in
> Erlang, what do you use? What do you have experience in that performs well
> with Erlang nodes spread across multiple machines and possibly multiple
> data centers?
>
> Thanks for your time.
>
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions