[erlang-questions] Best implementation for a clustered connected client list ?

Bryan Hughes
Sat Jun 1 17:48:43 CEST 2013


Hi Morgan,

Have you taken a look at Basho's Riak Core (it's open source)? They have 
solved the consistent-hashing mapping to vnodes nicely, which allows 
clusters to change size dynamically while remaining stable.

http://basho.com/where-to-start-with-riak-core/
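
To make the point about resizing concrete, here is a minimal consistent-hashing sketch in plain Erlang. It is not Riak Core's API or its vnode layout; the module name `ring_sketch`, the `?POINTS` value, and every function name are hypothetical and chosen only for illustration. The ring is a sorted list of hash points, each node owns several points, and a key belongs to the first point at or after its hash.

```erlang
-module(ring_sketch).
-export([new/1, add_node/2, lookup/2, moved/3]).

%% Hypothetical sketch, not Riak Core's implementation.
%% Number of points each node places on the ring (assumed value).
-define(POINTS, 64).

%% Build a ring from a list of node names.
%% The ring is a sorted list of {HashPoint, Node} tuples.
new(Nodes) ->
    lists:foldl(fun add_node/2, [], Nodes).

%% Add one node by inserting its points; existing points are untouched.
add_node(Node, Ring) ->
    Points = [{erlang:phash2({Node, I}), Node} || I <- lists:seq(1, ?POINTS)],
    lists:sort(Points ++ Ring).

%% The owner of Key is the node at the first point whose hash is
%% greater than or equal to the key's hash, wrapping to the start.
lookup(Key, [{_, First} | _] = Ring) ->
    H = erlang:phash2(Key),
    case lists:dropwhile(fun({P, _}) -> P < H end, Ring) of
        [{_, Node} | _] -> Node;
        [] -> First
    end.

%% Count the keys whose owner differs between two rings.
moved(Keys, RingA, RingB) ->
    length([K || K <- Keys, lookup(K, RingA) =/= lookup(K, RingB)]).
```

With this layout, a key either keeps its owner or moves to the newly added node when the ring grows; it never moves between two existing nodes. Calling `ring_sketch:moved(Keys, A, ring_sketch:add_node(n4, A))` shows how many client ids would have to be relocated.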

We are looking at using it to implement distribution in our solution. 
I haven't dived into it yet, so I can't speak to what is involved in 
adopting it.

Cheers,
Bryan

On 5/31/13 3:52 AM, Morgan Segalis wrote:
> Hi Dmitry,
>
> I have thought about consistent hashing.
>
> The only issue is that consistent hashing works if we have a fixed 
> number of clusters. If we dynamically add another cluster, the hash 
> won't give me the same cluster...
> I might be wrong...
>
> Right now I have a gateway that chooses the cluster with the fewest 
> connected clients and redirects the client to it. It works like a 
> load balancer, among the other things the gateway does.
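
The gateway's selection rule described above can be sketched as follows. The module name `gateway_sketch` and the `{Node, ConnectedClients}` input shape are assumptions; the thread does not say how the gateway obtains the counts.

```erlang
-module(gateway_sketch).
-export([least_loaded/1]).

%% Hypothetical sketch. Counts is a non-empty list of
%% {Node, ConnectedClients} tuples gathered by the gateway.
%% Returns the node with the fewest connected clients;
%% on a tie, the node that appears first in the list wins.
least_loaded([First | Rest]) ->
    {Node, _} = lists:foldl(
                  fun({_, C} = Candidate, {_, Best}) when C < Best -> Candidate;
                     (_, Acc) -> Acc
                  end, First, Rest),
    Node.
```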
>
> Best regards,
> Morgan.
>
>
> On May 31, 2013, at 12:30, Dmitry Kolesnikov wrote:
>
>> Hello,
>>
>> Your current implementation starts to suffer in performance because of 
>> the large number of messages needed to discover a process's location.
>> You have to define a formal rule relating an "id" to the node 
>> where its process lives. Essentially, I am talking about consistent 
>> hashing.
>>
>> To be honest, I don't see what is wrong with ETS and gproc. I am 
>> using a similar approach for my cluster management: a P2P 
>> methodology where local tables are synced periodically and updates 
>> to a local table are replicated to cluster members. Each node can 
>> observe the status of the other cluster members. Once a node is 
>> disconnected, the table is cleaned up. However, I am using that 
>> approach for "internal" processes. "Client connections" are not 
>> distributed globally.
>>
>> Best Regards,
>> Dmitry
>>
>> On May 31, 2013, at 1:12 PM, Morgan Segalis wrote:
>>
>>> Hi,
>>>
>>> Right now, I have this implementation:
>>>
>>> Each node has a client pool.
>>> Each client pool manages an ets table, where all clients are 
>>> inserted / removed / searched.
>>>
>>> If user A sends a message to user B, it first checks the local ets 
>>> table. If B is not found, it asks all nodes whether they have user B 
>>> in their ets tables; if user B is connected to another node, that 
>>> node returns the process pid.
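
The lookup path described above (local ets first, then a broadcast to every connected node) can be sketched as below. The module name `client_lookup`, the table name `clients`, and the 5000 ms timeout are assumptions, and the module is assumed to be loaded on every node. `rpc:multicall/5` and `nodes/0` are standard OTP calls.

```erlang
-module(client_lookup).
-export([init/0, add_client/2, local_lookup/1, lookup/1]).

%% Hypothetical table name; rows are {Id, Pid} as in the original post.
-define(TAB, clients).

%% Create the per-node client table.
init() ->
    ets:new(?TAB, [named_table, public, set]).

add_client(Id, Pid) ->
    ets:insert(?TAB, {Id, Pid}).

%% Look only in this node's table.
local_lookup(Id) ->
    case ets:lookup(?TAB, Id) of
        [{Id, Pid}] -> {ok, Pid};
        [] -> not_found
    end.

%% Check locally first, then ask every connected node.
%% The remote step costs one request per node for each miss.
lookup(Id) ->
    case local_lookup(Id) of
        {ok, Pid} ->
            {ok, Pid};
        not_found ->
            {Replies, _BadNodes} =
                rpc:multicall(nodes(), ?MODULE, local_lookup, [Id], 5000),
            case [P || {ok, P} <- Replies] of
                [Found | _] -> {ok, Found};
                [] -> not_found
            end
    end.
```

Every local miss fans out to all nodes, which is the per-message cost the thread is concerned about.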
>>>
>>> Advantage of this implementation: no synchronization is required if 
>>> one node goes down and comes back up again...
>>> Disadvantage of this implementation: I guess the number of messages 
>>> / sec is much higher than the number of connections/disconnections / sec
>>>
>>>
>>> On May 31, 2013, at 00:36, Chris Hicks wrote:
>>>
>>>> Please keep in mind that this is the result of a total of about 5 
>>>> seconds of thinking.
>>>>
>>>> You could have a coordinator on each node which is responsible for 
>>>> communicating with the coordinators on all of the other connected 
>>>> nodes. Your ETS entries would need to be expanded to keep track of 
>>>> the node that the user is connected on as well. The coordinators 
>>>> track the joining/leaving of the cluster of all other nodes and 
>>>> will purge the ETS table of any entries that belong to any recently 
>>>> downed node. As long as you don't have hard real-time requirements, 
>>>> which if you do you're using the wrong tool anyway, then you can 
>>>> come up with a bunch of ways to group together updates between 
>>>> coordinators to make sure they don't get overloaded.
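
A per-node coordinator along these lines might look like the sketch below. It covers only the purge-on-node-down part; the replication and batching of updates between coordinators is left out. The module name `coordinator_sketch`, the table name `cluster_clients`, and the `{Id, Pid, Node}` row shape are assumptions. `net_kernel:monitor_nodes/1` and `ets:match_delete/2` are standard OTP calls.

```erlang
-module(coordinator_sketch).
-behaviour(gen_server).
-export([start_link/0, add/2, purge_node/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical table name; rows are {Id, Pid, Node}.
-define(TAB, cluster_clients).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Record a client together with the node its process runs on.
add(Id, Pid) ->
    ets:insert(?TAB, {Id, Pid, node(Pid)}).

%% Drop every entry that belongs to Node.
purge_node(Node) ->
    ets:match_delete(?TAB, {'_', '_', Node}).

init([]) ->
    ets:new(?TAB, [named_table, public, set]),
    %% Subscribe to {nodeup, Node} / {nodedown, Node} messages.
    %% The result is deliberately not matched in this sketch.
    _ = net_kernel:monitor_nodes(true),
    {ok, []}.

%% A node left the cluster: treat all of its clients as disconnected.
handle_info({nodedown, Node}, State) ->
    purge_node(Node),
    {noreply, State};
handle_info(_Other, State) ->
    {noreply, State}.

handle_call(_Request, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State) -> {noreply, State}.
```

Purging on `nodedown` means a node that reconnects has to announce its clients again before the others see them as online; handling `{nodeup, Node}` for that resync is not shown here.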
>>>>
>>>> Without a lot more details on the exact sort of metrics your system 
>>>> needs to be able to handle it's all really just a guessing game, in 
>>>> the end.
>>>>
>>>>
>>>> On Thu, May 30, 2013 at 8:38 AM, Morgan Segalis wrote:
>>>>
>>>>     Hi everyone,
>>>>
>>>>     I'm currently looking for better ways to do a clustered
>>>>     connected client list.
>>>>
>>>>     On every server I have an ets table, in which I insert / delete
>>>>     / search the process of each client connecting to the server.
>>>>
>>>>     Each row in the ets table is for now a simple {"id",
>>>>     <pid.of.process>}
>>>>
>>>>     I have tried the gproc module from Ulf Wiger, but when a
>>>>     cluster goes down, everything goes wrong... (especially if it
>>>>     is the elected leader).
>>>>
>>>>     If a cluster goes down, the other clusters should consider
>>>>     every client connected to that cluster as disconnected
>>>>     (even if it is just a simple connection error between
>>>>     clusters).
>>>>     If it comes back online, back in the nodes() list, the other
>>>>     clusters should consider clients on this cluster back online.
>>>>
>>>>     What would be in your opinion the best way to do that ?
>>>>
>>>>     It is a messaging system, so it has to handle massive message
>>>>     passing through processes.
>>>>
>>>>     Thank you for your help.
>>>>
>>>>     Morgan.
>>>>     _______________________________________________
>>>>     erlang-questions mailing list
>>>>     http://erlang.org/mailman/listinfo/erlang-questions

-- 

Bryan Hughes
*Go Factory*
http://www.go-factory.net

/"Internet Class, Enterprise Grade"/

