[erlang-questions] Best implementation for a clustered connected client list ?

Bryan Hughes bryan@REDACTED
Sat Jun 1 17:48:43 CEST 2013

Hi Morgan,

Have you taken a look at Basho's Riak Core (it's open source)? They have 
nicely solved the consistent-hashing mapping to vnodes, which allows 
clusters to change size dynamically while remaining stable.


We are looking at using it to implement distribution in our solution.  
I haven't dived into it yet, so I can't speak to what is involved in 
adopting their solution.
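
To illustrate why consistent hashing with virtual nodes tolerates a change 
in cluster size, here is a minimal sketch. It is not Riak Core's API: the 
module name ring_sketch, its functions, and the choice of erlang:phash2/2 
as the hash are all assumptions made for illustration. Each node is hashed 
onto a ring at several points (its "vnodes"), and a key belongs to the first 
point at or after the key's own hash.

```erlang
-module(ring_sketch).
-export([new/2, add_node/3, lookup/2, moved/3]).

%% Hypothetical sketch, NOT Riak Core's API.
%% The ring is a sorted list of {Point, Node} pairs; every node
%% contributes VNodes points, so load spreads fairly evenly.

%% Build a ring from a list of node names.
new(Nodes, VNodes) ->
    lists:foldl(fun(N, Ring) -> add_node(N, VNodes, Ring) end, [], Nodes).

%% Add a node: hash VNodes virtual points for it and merge them in.
add_node(Node, VNodes, Ring) ->
    Points = [{hash({Node, I}), Node} || I <- lists:seq(1, VNodes)],
    lists:usort(Points ++ Ring).

%% Owner of Key: the first point at or after hash(Key), wrapping
%% around to the start of the ring. The ring must not be empty.
lookup(Key, [{_, First} | _] = Ring) ->
    H = hash(Key),
    case lists:dropwhile(fun({P, _}) -> P < H end, Ring) of
        [{_, Node} | _] -> Node;
        [] -> First
    end.

%% Count the keys whose owner differs between two rings.
moved(Keys, OldRing, NewRing) ->
    length([K || K <- Keys, lookup(K, OldRing) =/= lookup(K, NewRing)]).

%% Map any term to a position on a 2^32-point ring.
hash(Term) ->
    erlang:phash2(Term, 4294967296).
```

For example, build Old = ring_sketch:new([a, b, c], 64), then 
New = ring_sketch:add_node(d, 64, Old), and compare them with 
ring_sketch:moved(lists:seq(1, 10000), Old, New). Only keys that now fall 
just before one of d's points change owner, and they all move to d; every 
other key stays on the node it already had. With four equally weighted nodes 
one would expect roughly a quarter of the keys to move, rather than nearly 
all of them as with a plain "hash rem number-of-nodes" scheme.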


On 5/31/13 3:52 AM, Morgan Segalis wrote:
> Hi Dmitry,
> I have thought about consistent hashing.
> The only issue is that consistent hashing works if we have a fixed 
> number of clusters. If we dynamically add another cluster, the hash 
> won't give me the same cluster...
> I might be wrong...
> Right now I have a gateway that chooses the cluster with the fewest 
> connected clients and redirects the client to it. It works like a load 
> balancer, among the other things the gateway does.
> Best regards,
> Morgan.
> On 31 May 2013, at 12:30, Dmitry Kolesnikov <dmkolesnikov@REDACTED 
> <mailto:dmkolesnikov@REDACTED>> wrote:
>> Hello,
>> Your current implementation starts to suffer in performance because of 
>> the large number of messages needed to discover a process's location.
>> You have to define a formal rule relating an "id" to the node 
>> where its process lives. Essentially, I am talking about consistent 
>> hashing.
>> To be honest, I don't see what is wrong with ETS and gproc. I am 
>> using a similar approach for my cluster management: a P2P 
>> methodology
>> where local tables are synced periodically and updates to a local 
>> table are replicated to cluster members. Each node is able to observe 
>> the status of cluster members. Once a node is disconnected, the table 
>> is cleaned up. However, I use that approach only for "internal" 
>> processes; "client connections" are not distributed globally.
>> Best Regards,
>> Dmitry
>> On May 31, 2013, at 1:12 PM, Morgan Segalis <msegalis@REDACTED 
>> <mailto:msegalis@REDACTED>> wrote:
>>> Hi,
>>> Right now I have this implementation:
>>> Each node has a client pool.
>>> Each client pool manages an ETS table, where all clients are inserted 
>>> / removed / searched.
>>> If user A sends a message to user B, it first checks the local ETS 
>>> table; if user B is not found, it asks all nodes whether they have 
>>> user B in their ETS table. If user B is connected to another node, 
>>> that node returns the process pid.
>>> Advantage of this implementation: no synchronization is required if 
>>> one node goes down and comes back up again...
>>> Disadvantage of this implementation: I guess the number of messages 
>>> per second is much higher than the number of connections / 
>>> disconnections per second.
>>> On 31 May 2013, at 00:36, Chris Hicks <khandrish@REDACTED 
>>> <mailto:khandrish@REDACTED>> wrote:
>>>> Please keep in mind that this is the result of a total of about 5 
>>>> seconds of thinking.
>>>> You could have a coordinator on each node which is responsible for 
>>>> communicating with the coordinators on all of the other connected 
>>>> nodes. Your ETS entries would need to be expanded to keep track of 
>>>> the node that the user is connected on as well. The coordinators 
>>>> track the joining/leaving of the cluster of all other nodes and 
>>>> will purge the ETS table of any entries that belong to any recently 
>>>> downed node. As long as you don't have hard real-time requirements, 
>>>> which if you do you're using the wrong tool anyway, then you can 
>>>> come up with a bunch of ways to group together updates between 
>>>> coordinators to make sure they don't get overloaded.
>>>> Without a lot more details on the exact sort of metrics your system 
>>>> needs to be able to handle it's all really just a guessing game, in 
>>>> the end.
>>>> On Thu, May 30, 2013 at 8:38 AM, Morgan Segalis <msegalis@REDACTED 
>>>> <mailto:msegalis@REDACTED>> wrote:
>>>>     Hi everyone,
>>>>     I'm currently looking for better ways to do a clustered
>>>>     connected client list.
>>>>     On every server I have an ets table, in which I insert / delete
>>>>     / search the process of each client connecting to the server.
>>>>     Each row in the ets table is for now a simple {"id",
>>>>     <pid.of.process>}
>>>>     I have tried the gproc module from Ulf Wiger, but when a
>>>>     cluster goes down, everything goes wrong... (especially if it
>>>>     is the elected leader).
>>>>     If a cluster goes down, other clusters should consider that
>>>>     every client connected to that cluster is no longer
>>>>     connected (even if it is just a simple connection error between
>>>>     clusters).
>>>>     If it goes back online, back on the nodes() list, other
>>>>     clusters should consider clients on this cluster back online.
>>>>     What would be in your opinion the best way to do that ?
>>>>     It is a messaging system, so it has to handle massive message
>>>>     passing through processes.
>>>>     Thank you for your help.
>>>>     Morgan.
>>>>     _______________________________________________
>>>>     erlang-questions mailing list
>>>>     erlang-questions@REDACTED <mailto:erlang-questions@REDACTED>
>>>>     http://erlang.org/mailman/listinfo/erlang-questions


Bryan Hughes
*Go Factory*

/"Internet Class, Enterprise Grade"/
