[erlang-questions] Best implementation for a clustered connected client list ?
Bryan Hughes
bryan@REDACTED
Sat Jun 1 17:48:43 CEST 2013
Hi Morgan,
Have you taken a look at Basho's Riak Core (it's open source)? They have
nicely solved the problem of consistent hashing onto vnodes, which lets a
cluster change size dynamically while remaining stable.
http://basho.com/where-to-start-with-riak-core/
We are looking at using it to implement distribution in our solution. We
haven't dived into it yet, so I can't speak to what is involved in
adopting their solution.
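The core idea, as I understand it, is a fixed ring of partitions that is
reassigned as nodes come and go. A toy, untested sketch of that idea
(not Riak Core's actual API):

-module(ring_sketch).
-export([node_for/2]).

%% The partition count never changes, so an id always hashes to the
%% same partition; only the partition->node assignment moves when the
%% cluster grows or shrinks. (Riak Core claims partitions far more
%% cleverly than this naive modulo assignment.)
-define(PARTITIONS, 64).

node_for(Id, Nodes) ->
    Partition = erlang:phash2(Id, ?PARTITIONS),
    Sorted = lists:sort(Nodes),
    lists:nth(1 + (Partition rem length(Sorted)), Sorted).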
Cheers,
Bryan
On 5/31/13 3:52 AM, Morgan Segalis wrote:
> Hi Dmitry,
>
> I have thought about consistent hashing.
>
> The only issue is that consistent hashing works if we have a fixed
> number of nodes; if we dynamically add another node, the hash won't
> give me the same node for a given id...
> I might be wrong...
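>
> To illustrate, the naive scheme I had in mind (untested sketch):
>
> -module(naive_hash).
> -export([node_for/2]).
>
> %% The chosen node depends on length(Nodes), so growing the cluster
> %% from 3 to 4 nodes remaps roughly 3/4 of all ids to a new node.
> node_for(Id, Nodes) ->
>     lists:nth(1 + erlang:phash2(Id, length(Nodes)), lists:sort(Nodes)).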
>
> Actually, right now I have a gateway which chooses the node with the
> fewest connected clients and redirects the client to it. Among other
> things the gateway does, it works like a load balancer.
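>
> Roughly, that choice is just this (untested sketch, not my actual
> code), given a {Node, Count} list the gateway maintains:
>
> -module(gateway_pick).
> -export([pick_node/1]).
>
> %% Counts :: [{Node, ConnectedClients}], assumed non-empty.
> pick_node(Counts) ->
>     [{Node, _} | _] = lists:keysort(2, Counts),
>     Node.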
>
> Best regards,
> Morgan.
>
>
> On 31 May 2013 at 12:30, Dmitry Kolesnikov <dmkolesnikov@REDACTED> wrote:
>
>> Hello,
>>
>> Your current implementation starts to suffer performance-wise due to
>> the large number of messages needed to discover a process's location.
>> You have to define a formal rule mapping an "id" to the node where
>> its process lives. Essentially, I am talking about consistent
>> hashing.
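>>
>> For instance, an untested sketch of such a rule as a basic hash ring
>> (the module and names are mine, not from any library):
>>
>> -module(chash).
>> -export([owner/2]).
>>
>> %% Nodes sit at fixed points on a 2^32 ring; a key belongs to the
>> %% first node clockwise from its own hash. Adding a node only claims
>> %% the arc between it and its predecessor, so most keys stay put.
>> owner(Key, Nodes) ->
>>     KeyPos = erlang:phash2(Key, 1 bsl 32),
>>     Ring = lists:sort([{erlang:phash2(N, 1 bsl 32), N} || N <- Nodes]),
>>     case [Node || {Pos, Node} <- Ring, Pos >= KeyPos] of
>>         [Next | _] -> Next;
>>         []         -> element(2, hd(Ring))   %% wrap around the ring
>>     end.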
>>
>> To be honest, I do not get what is wrong with ETS and gproc. I am
>> using a similar approach for my cluster management: a P2P methodology
>> where local tables are synced periodically and updates to the local
>> table are replicated to the other cluster members. Each node is able
>> to observe the status of the cluster members, and once a node is
>> disconnected its entries are cleaned up. However, I am using that
>> approach for "internal" processes; "client connections" are not
>> distributed globally.
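>>
>> The replication side is roughly this (untested sketch; the module
>> name, the 'clients' table and the message format are all mine):
>>
>> -module(p2p_sync).
>> -export([register_local/2, handle_replicate/1]).
>>
>> %% Write the update to the local named table, then push the same fact
>> %% to the process registered as p2p_sync on every connected node.
>> register_local(Id, Pid) ->
>>     true = ets:insert(clients, {Id, Pid, node()}),
>>     rpc:abcast(nodes(), ?MODULE, {replicate, {Id, Pid, node()}}),
>>     ok.
>>
>> %% Receiving side, called from the table owner's handle_info/2.
>> handle_replicate({replicate, {Id, Pid, Node}}) ->
>>     true = ets:insert(clients, {Id, Pid, Node}).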
>>
>> Best Regards,
>> Dmitry
>>
>> On May 31, 2013, at 1:12 PM, Morgan Segalis <msegalis@REDACTED> wrote:
>>
>>> Hi,
>>>
>>> Actually, right now I have this implementation:
>>>
>>> Each node has a client pool.
>>> Each client pool manages an ets table in which all clients are
>>> inserted / removed / searched.
>>>
>>> If a user A sends a message to user B, it first checks the local ets
>>> table; if user B is not found there, it asks all the other nodes
>>> whether they have user B in their ets table. If user B is connected
>>> to another node, that node returns the process pid.
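>>>
>>> In code, the lookup is roughly this (simplified and untested;
>>> 'clients' is the named ets table each client pool manages):
>>>
>>> -module(clientpool_lookup).
>>> -export([whereis_user/1]).
>>>
>>> whereis_user(Id) ->
>>>     case ets:lookup(clients, Id) of
>>>         [{Id, Pid}] -> {ok, Pid};
>>>         []          -> ask_peers(Id)
>>>     end.
>>>
>>> %% Broadcast the lookup to every other node; badrpc results and
>>> %% empty lookups simply fail the pattern and are skipped.
>>> ask_peers(Id) ->
>>>     {Replies, _Bad} = rpc:multicall(nodes(), ets, lookup,
>>>                                     [clients, Id]),
>>>     case [Pid || [{_, Pid}] <- Replies] of
>>>         [Pid | _] -> {ok, Pid};
>>>         []        -> offline
>>>     end.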
>>>
>>> Advantage of this implementation: no synchronization is required if
>>> a node goes down and comes back up again...
>>> Disadvantage: I guess the number of lookup messages per second is
>>> much higher than the number of connections/disconnections per
>>> second.
>>>
>>>
>>> On 31 May 2013 at 00:36, Chris Hicks <khandrish@REDACTED> wrote:
>>>
>>>> Please keep in mind that this is the result of a total of about 5
>>>> seconds of thinking.
>>>>
>>>> You could have a coordinator on each node which is responsible for
>>>> communicating with the coordinators on all of the other connected
>>>> nodes. Your ETS entries would need to be expanded to keep track of
>>>> the node that the user is connected on as well. The coordinators
>>>> track the other nodes joining and leaving the cluster and purge the
>>>> ETS table of any entries that belong to a recently downed node. As
>>>> long as you don't have hard real-time requirements (and if you do,
>>>> you're using the wrong tool anyway), you can come up with plenty of
>>>> ways to batch updates between coordinators so that they don't get
>>>> overloaded.
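>>>>
>>>> A minimal, untested sketch of such a coordinator (assuming
>>>> {Id, Pid, Node} rows in a named public table called 'clients'):
>>>>
>>>> -module(coordinator).
>>>> -behaviour(gen_server).
>>>> -export([start_link/0]).
>>>> -export([init/1, handle_call/3, handle_cast/2, handle_info/2]).
>>>>
>>>> start_link() ->
>>>>     gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).
>>>>
>>>> init([]) ->
>>>>     ok = net_kernel:monitor_nodes(true),
>>>>     clients = ets:new(clients, [named_table, public, set]),
>>>>     {ok, no_state}.
>>>>
>>>> handle_call(_Req, _From, S) -> {reply, ok, S}.
>>>> handle_cast(_Msg, S) -> {noreply, S}.
>>>>
>>>> %% A node left the cluster: consider all of its clients offline.
>>>> handle_info({nodedown, Node}, S) ->
>>>>     ets:match_delete(clients, {'_', '_', Node}),
>>>>     {noreply, S};
>>>> %% A node (re)joined: it will re-announce its own clients, so
>>>> %% there is nothing to purge here.
>>>> handle_info({nodeup, _Node}, S) ->
>>>>     {noreply, S};
>>>> handle_info(_Other, S) ->
>>>>     {noreply, S}.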
>>>>
>>>> Without a lot more detail on the exact metrics your system needs
>>>> to be able to handle, it's all really just a guessing game in the
>>>> end.
>>>>
>>>>
>>>> On Thu, May 30, 2013 at 8:38 AM, Morgan Segalis <msegalis@REDACTED> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I'm currently looking for better ways to do a clustered
>>>> connected client list.
>>>>
>>>> On every server I have an ets table, in which I insert / delete
>>>> / search the process of each client connecting to the server.
>>>>
>>>> Each row in the ets table is for now a simple {"id",
>>>> <pid.of.process>}
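>>>>
>>>> In code terms, the client pool does roughly this (simplified and
>>>> untested; the monitor / 'DOWN' part is a possible safeguard against
>>>> crashed clients, not a description of my current code):
>>>>
>>>> -module(clientpool).
>>>> -export([register_client/2, unregister_client/1, handle_down/1]).
>>>>
>>>> register_client(Id, Pid) ->
>>>>     erlang:monitor(process, Pid),
>>>>     true = ets:insert(clients, {Id, Pid}).
>>>>
>>>> unregister_client(Id) ->
>>>>     true = ets:delete(clients, Id).
>>>>
>>>> %% In the table owner's handle_info/2: drop entries whose client
>>>> %% process died without unregistering.
>>>> handle_down({'DOWN', _Ref, process, Pid, _Reason}) ->
>>>>     true = ets:match_delete(clients, {'_', Pid}).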
>>>>
>>>> I have tried the gproc module from Ulf Wiger, but when a node goes
>>>> down, everything goes wrong... (especially if it is the elected
>>>> leader).
>>>>
>>>> If a node goes down, the other nodes should consider every client
>>>> connected to that node as disconnected (even if it is just a
>>>> transient connection error between nodes).
>>>> If it comes back online and reappears in the nodes() list, the
>>>> other nodes should consider its clients online again.
>>>>
>>>> What would be, in your opinion, the best way to do that?
>>>>
>>>> It is a messaging system, so it has to handle massive message
>>>> passing between processes.
>>>>
>>>> Thank you for your help.
>>>>
>>>> Morgan.
--
Bryan Hughes
*Go Factory*
http://www.go-factory.net
/"Internet Class, Enterprise Grade"/