[erlang-questions] riak client best practices

Mon Jul 27 19:54:11 CEST 2015

Hi.

- use riakc protobuf client or join services and Riak ring in a common
erlang cluster with the same cookie?

R// Definitely use the riakc protobuf client; it is the mainstream option,
maintained and optimized by Basho.

- open a new connection from riakc to Riak node on each hit or keep a
connection pool? I suppose it should be the pool as sockets are finite.

R// You should definitely use a connection pool.

- on every request connect to a random Riak node from the ring? Or use
dedicated Riak nodes for each service instance, if there are many?

R// I'd say it depends on what you really need and on which option fits
your system's properties best. One common approach is to put a load
balancer in front of Riak; take a look at this link:
http://docs.basho.com/riak/latest/ops/advanced/configs/load-balancing-proxy
On the other hand, depending on your context, a symmetric topology like the
one you mention (one Riak node per service/server node) is not bad, as long
as the traffic over your resources is symmetric too. I assume you will have
a load balancer in front of the server nodes, so the traffic will be
distributed evenly over the Riak nodes as well. The other option is quite
similar: pick a random Riak node from the client, using some balancing
strategy.
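
As a minimal sketch of the "random node" option: the module name
riak_node_picker and the {Host, Port} list format are my own assumptions
for illustration, not part of riakc. The function only chooses a node;
opening the connection is left to whatever client or pool you use.

```erlang
-module(riak_node_picker).
-export([pick/1]).

%% Hypothetical helper: choose one node uniformly at random.
%% Nodes is assumed to be a non-empty list of {Host, Port} tuples.
-spec pick([{string(), pos_integer()}, ...]) -> {string(), pos_integer()}.
pick(Nodes) when is_list(Nodes), Nodes =/= [] ->
    %% rand:uniform(N) returns an integer between 1 and N inclusive.
    lists:nth(rand:uniform(length(Nodes)), Nodes).
```

For example, riak_node_picker:pick([{"riak1", 8087}, {"riak2", 8087}])
returns one of the two tuples (the host names are placeholders). Uniform
random choice is the simplest balancing strategy; round-robin or a
least-connections scheme would slot into the same place.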

- what to do with the new nodes which are joining the ring and the ones
which are removed from the ring? What is the more or less standard way to
notify clients about this?

R// The idea is to do nothing, as far as possible. Riak handles this
itself; it is an internal Riak process. It can also be an admin task, but
it is not something to deal with from your client (what you call a service
instance). You can read more about it here:
http://docs.basho.com/riak/latest/ops/running/handoff/

- what to do in a client if a Riak node temporarily goes offline? Keep
trying to connect to such node until it gets online again? This would
obviously hit latency unless there's a special dedicated process for this.

R// This is one of the drawbacks of one-to-one connections (one Riak node
per service node). You would have to handle reconnection attempts as you
mention, but the idea is to make the wait between attempts longer each
time (e.g. using a linear or exponential factor), and in this way avoid
hitting the DB too much. Again, the recommendation here is probably the
load balancer approach mentioned previously: your service instances point
to the LB, not directly to the DB, so if a Riak node goes down, the
upcoming requests are distributed over the rest of the available Riak
nodes.
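
To make the "longer each time" idea concrete, here is a small sketch of an
exponential backoff calculation. The module name riak_backoff, the
base/maximum parameters and the cap are assumptions of mine for
illustration; it only computes the delays and does not do the
reconnecting itself.

```erlang
-module(riak_backoff).
-export([delay/3, delays/3]).

%% Delay in milliseconds before retry number Attempt (0-based):
%% Base * 2^Attempt, capped at Max so the wait never grows unbounded.
-spec delay(non_neg_integer(), pos_integer(), pos_integer()) -> pos_integer().
delay(Attempt, Base, Max) when is_integer(Attempt), Attempt >= 0 ->
    min(Max, Base * (1 bsl Attempt)).

%% Delays for the first N attempts, mainly useful for inspection.
-spec delays(non_neg_integer(), pos_integer(), pos_integer()) -> [pos_integer()].
delays(N, Base, Max) ->
    [delay(A, Base, Max) || A <- lists:seq(0, N - 1)].
```

With a base of 100 ms and a cap of 1000 ms, riak_backoff:delays(5, 100,
1000) gives [100,200,400,800,1000]. A reconnecting process would sleep (or
use erlang:send_after/3) for delay(Attempt, Base, Max) between attempts
and reset Attempt to 0 once the connection succeeds.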

Finally, I strongly recommend using 'sumo_db' (
https://github.com/inaka/sumo_db), which makes it easier to interact with
Riak and saves you from dealing with things like connection pools,
configuration, complex search, etc.

C. Andres Bolaños