[erlang-questions] Load balancing/multiplexing rpc calls amongst Erlang Nodes

Joshua Muzaaya <>
Wed Oct 24 07:51:39 CEST 2012


Thank you so much Geoff Cant.   This is great. let me think along this line
and try to build something like this.

  <http://www.linkedin.com/pub/muzaaya-joshua/39/2ba/202>
Designed with WiseStamp -
<http://r1.wisestamp.com/r/landing?u=cf16262215eb8784&v=3.11.21&t=1351057844240&promo=10&dest=http%3A%2F%2Fwww.wisestamp.com%2Femail-install%3Futm_source%3Dextension%26utm_medium%3Demail%26utm_campaign%3Dpromo_10>Get
yours<http://r1.wisestamp.com/r/landing?u=cf16262215eb8784&v=3.11.21&t=1351057844240&promo=10&dest=http%3A%2F%2Fwww.wisestamp.com%2Femail-install%3Futm_source%3Dextension%26utm_medium%3Demail%26utm_campaign%3Dpromo_10>



On Wed, Oct 24, 2012 at 2:41 AM, Geoff Cant <> wrote:

> So one cluster load balancing scheme that worked reasonably well for me at
> ngmoco was based on gossip with estimation.
>
> Each node has some kind of metric for how loaded it is (in my case number
> of a certain kind of process running, in your case could be RPCs being
> evaluated), and a reasonable estimation for how much a remote job request
> will affect that metric (in my case spawning a new process remotely would
> increase the process count by one :).
>
> Each node maintains a table of
> {node(), LoadMetric, BroadcastTimestamp}.
>
> * It updates its own entry continually (the local metric should be pretty
> accurate), and broadcasts its metric to other nodes every broadcast period
> (5s).
> * Each node receiving a timely load broadcast (they contain timestamps,
> discard any that are more than 1 gossip period old) overwrites its load
> table entry with the load metric and broadcast timestamp (i.e. every 5s,
> the running estimate of remote load will be brought up to date)
> * Every broadcast period (every time you're about to broadcast), scan the
> load table and delete entries with a timestamp older than 3*broadcast
> period. -- Ignore nodes that are probably faulty/down
> * Each time a node routes/transfers load/sends an RPC to a remote node, it
> bumps the local load table entry for the remote node -- this is the
> estimation step.
>
> Now you have a local ets table of size M rows (cluster cardinality) which
> you can read and make your load balancing decision on. I generally went for
> lowest loaded node if there is more than  one entry and the local node
> otherwise.
>
> The local estimation step is important if you have a deterministic load
> balancing function.
>
>
> You can also extend/modify this implementation to add a
> graceful-cluster-exit scheme for a node by adding an administrative
> mechanism to stop a node broadcasting its own load. The other nodes will
> stop transferring load to it and load on it will eventually finish up.
>
> I'm sorry I don't have an implementation for you to use as I don't have
> permission to release the code. It's not super-hard to write however.
>
> Cheers,
> -Geoff
>
> On 2012-10-23, at 02:52 , Joshua Muzaaya <> wrote:
>
> > Thank you so much. let me try that. But is it not possible to have a
> > non-random method ? one that is intelligent and fair on all the nodes
> >
> >  <http://www.linkedin.com/pub/muzaaya-joshua/39/2ba/202>
> > Designed with WiseStamp -
> > <
> http://r1.wisestamp.com/r/landing?u=cf16262215eb8784&v=3.11.21&t=1350985854695&promo=10&dest=http%3A%2F%2Fwww.wisestamp.com%2Femail-install%3Futm_source%3Dextension%26utm_medium%3Demail%26utm_campaign%3Dpromo_10
> >Get
> > yours<
> http://r1.wisestamp.com/r/landing?u=cf16262215eb8784&v=3.11.21&t=1350985854695&promo=10&dest=http%3A%2F%2Fwww.wisestamp.com%2Femail-install%3Futm_source%3Dextension%26utm_medium%3Demail%26utm_campaign%3Dpromo_10
> >
> >
> >
> >
> > On Tue, Oct 23, 2012 at 10:45 AM, Paul Peregud <
> >wrote:
> >
> >>> find the erlang random generator skewing results making around 50/60
> >> requests hitting one Node while others are waiting
> >>
> >> This is unusual. I've seen PRNG from random module skewing the results,
> >> but never to such extent. Please check if it is properly seeded
> >> (random:seed/0 seeds with predefined constant).
> >>
> >> If seeding is done properly, then you may want to consider switching to
> >> crypto:rand_uniform/2. It's a bit slower, but it produces random
> numbers of
> >> quality better then enough for purpose of load balancing.
> >>
> >>
> >> On Tue, Oct 23, 2012 at 7:22 AM, Joshua Muzaaya <
> >wrote:
> >>
> >>> Yes,i tries using the random method, but because requests are so
> >>> frequently many, you find the erlang random generator skewing results
> >>> making around 50/60 requests hitting one Node while others are waiting.
> >>> Another thing is that, i am not using gen_servers at the Web Server
> layer.
> >>> I am using yaws web server and for each connection, yaws spawns a
> process,
> >>> this process communicates with Mnesia Nodes to query for data. But the
> >>> connections are so many and i wanted to scale the application
> horizontally,
> >>> adding more web servers and more mnesia Nodes. I came to think of a
> load
> >>> balancing middle ware, abstracting my processes from knowing where the
> call
> >>> has hit ( i.e on what mnesia node the call has hit). This middle ware
> >>> ensures that requests are load balanced across my Mnesia DBs.
> >>>
> >>> That is the background of the problem. Its a real-time Web Notification
> >>> system, plugged into a major intranet Management System. However,
> clients
> >>> are many, and yaws is sustaining 30,000 concurrent connections at low
> >>> peaks. I am a software engineer in one of the telecommunications
> companies
> >>> in Africa. I keep running into a few memory problems on single node
> yaws
> >>> server, so i need ti add more web servers to assist. Also, mnesia
> sometime
> >>> will get *** Too many DB Tables ** when requests are too many and too
> >>> frequent. I changed everything to use dirty operations and when i
> by-passed
> >>> the transaction manager, things improved a bit.
> >>>
> >>> I need fellow erlangers to think of a load balancing algorithm in such
> a
> >>> situation. Do you think a process dictionary like GPROC would be so
> useful
> >>> ? i was kinda thinking about it last night, but i wonder how i would
> apply
> >>> it in this case.
> >>>
> >>> Having one gen_server to decide where the request may go, might alos
> slow
> >>> down the application as all requests will have to go through that
> >>> gen_server.
> >>>
> >>>  <http://www.linkedin.com/pub/muzaaya-joshua/39/2ba/202>
> >>> Designed with WiseStamp -
> >>> <
> http://r1.wisestamp.com/r/landing?u=cf16262215eb8784&v=3.11.21&t=1350969121119&promo=10&dest=http%3A%2F%2Fwww.wisestamp.com%2Femail-install%3Futm_source%3Dextension%26utm_medium%3Demail%26utm_campaign%3Dpromo_10
> >Get
> >>> yours<
> http://r1.wisestamp.com/r/landing?u=cf16262215eb8784&v=3.11.21&t=1350969121119&promo=10&dest=http%3A%2F%2Fwww.wisestamp.com%2Femail-install%3Futm_source%3Dextension%26utm_medium%3Demail%26utm_campaign%3Dpromo_10
> >
> >>>
> >>>
> >>>
> >>> On Tue, Oct 23, 2012 at 2:53 AM, Yogish Baliga <>
> wrote:
> >>>
> >>>> One option is to run proxy gen_server on each Mnesia box and register
> >>>> these gen_server pids with pg2. Now you can do load balancing on pg2
> >>>> processes based on message queue length as described here
> >>>>
> >>>> http://dev.lethain.com/load-balancing-across-erlang-process-groups/
> >>>>
> >>>> When I used this method in my last project in Erlang, it gave better
> >>>> result than normal round robin. Under very low load, all requests were
> >>>> redirected to single Mnesia instance.
> >>>>
> >>>> http://dev.lethain.com/load-balancing-across-erlang-process-groups/
> >>>>
> >>>> -- baliga
> >>>>
> >>>>
> >>>> On Mon, Oct 22, 2012 at 2:22 PM, Paul Peregud <
> >wrote:
> >>>>
> >>>>> May you specify why load balancing should be based on time and can
> not
> >>>>> be random? Have you implemented random load balancing? Has it proved
> >>>>> to be insufficient?
> >>>>>
> >>>>> On Mon, Oct 22, 2012 at 9:51 AM, muzaaya_joshua <>
> >>>>> wrote:
> >>>>>> Building from this question (
> >>>>> http://stackoverflow.com/q/5339329/431620 ),
> >>>>>> imagine an application with N Erlang Web Servers, and N/2 Mnesia
> >>>>> Database
> >>>>>> Nodes. The set up is such that the Web Servers, each, runs on its
> own
> >>>>>> hardware server (say HP DL385), and each Mnesia Instance, runs on
> its
> >>>>> own
> >>>>>> hardware Server as well.
> >>>>>>
> >>>>>> Web Servers make rpc:call/4 calls to the back end (the Mnesia DB
> >>>>> Servers).
> >>>>>> The Data is all replicated across all the Mnesia instances. Now, you
> >>>>> want to
> >>>>>> have the calls being made to the Database servers, MULTIPLEXED, more
> >>>>>> precisely ( by TIME), on each Web Server, so that some kind of LOAD
> >>>>>> BALANCING is attained.
> >>>>>>
> >>>>>> If Web Server A makes a connection to Mnesia Instance 3, it cannot
> >>>>> make the
> >>>>>> next connection to the same Instance. All Database Nodes need to be
> >>>>> kept
> >>>>>> busy and not having any one of them idle while the others are
> >>>>> working. The
> >>>>>> Load balancing Algorithm should not be random, but should be aimed
> at
> >>>>>> balancing the load on the Database Servers.
> >>>>>>
> >>>>>> Qn 1: Come up with your load balancing strategy, in such a
> situation.
> >>>>> Also,
> >>>>>> please show with some sample illustrative code, how you would
> >>>>> implement this
> >>>>>> strategy.
> >>>>>>
> >>>>>> Qn 2: If a Mnesia Instance goes down, how would your load balance
> >>>>> Algorithm
> >>>>>> adapt to the changes in the cluster ?
> >>>>>>
> >>>>>> Qn 3: Is there any Erlang library aimed at load balancing of Erlang
> >>>>> Servers
> >>>>>> working within the same system, and calling each other via
> rpc:call/4
> >>>>> ?
>
>


-- 
*Muzaaya Joshua
Systems Engineer
+256774115170*
*"Through it all, i have learned to trust in Jesus. To depend upon His Word"
*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20121024/2ab2de21/attachment.html>


More information about the erlang-questions mailing list