Thank you so much, Geoff Cant. This is great. Let me think along these lines and try to build something like this.<br><br><div class="gmail_quote">On Wed, Oct 24, 2012 at 2:41 AM, Geoff Cant <span dir="ltr">&lt;<a href="mailto:nem@erlang.geek.nz" target="_blank">nem@erlang.geek.nz</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">So one cluster load balancing scheme that worked reasonably well for me at ngmoco was based on gossip with estimation.<br>


<br>

Each node has some kind of metric for how loaded it is (in my case, the number of a certain kind of process running; in your case, it could be the number of RPCs being evaluated), and a reasonable estimate of how much a remote job request will affect that metric (in my case, spawning a new process remotely would increase the process count by one :).<br>


<br>

Each node maintains a table of<br>

{node(), LoadMetric, BroadcastTimestamp}.<br>

<br>

* It updates its own entry continually (the local metric should be pretty accurate), and broadcasts its metric to other nodes every broadcast period (5s).<br>

* Each node receiving a timely load broadcast (broadcasts carry timestamps; discard any that are more than one broadcast period old) overwrites its load table entry for the sender with the received load metric and broadcast timestamp (i.e. every 5s, the running estimate of remote load is brought up to date).<br>


* Every broadcast period (every time you're about to broadcast), scan the load table and delete entries with a timestamp older than 3 * broadcast period, so that nodes that are probably faulty or down are ignored.<br>

* Each time a node routes/transfers load/sends an RPC to a remote node, it bumps the local load table entry for the remote node -- this is the estimation step.<br>

<br>
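
The table maintenance steps above could be sketched in Erlang roughly as follows. This is an illustration under assumptions, not the ngmoco code: the module name load_table, the function names, integer load metrics, and caller-supplied millisecond timestamps are all hypothetical choices.<br>
<pre>
```erlang
-module(load_table).
-export([new/0, record_broadcast/4, prune/3, bump/3, rows/1]).

%% Hypothetical sketch. Rows are {Node, LoadMetric, BroadcastTimestamp}.
%% Loads are integers; timestamps are integers in milliseconds that the
%% caller supplies, so the functions stay easy to test.

new() ->
    ets:new(load_table, [set, public]).

%% Accept a broadcast only if it is at most one broadcast period old,
%% then overwrite the sender's row with the received values.
record_broadcast(Tab, {Node, Load, SentAt}, Now, PeriodMs) ->
    case Now - SentAt > PeriodMs of
        true ->
            stale;
        false ->
            ets:insert(Tab, {Node, Load, SentAt}),
            ok
    end.

%% Delete rows whose timestamp is older than 3 broadcast periods.
%% Keys are collected first so the table is not modified during the fold.
%% Returns the number of rows removed.
prune(Tab, Now, PeriodMs) ->
    MaxAge = 3 * PeriodMs,
    Stale = ets:foldl(
              fun({Node, _Load, Ts}, Acc) when Now - Ts > MaxAge -> [Node | Acc];
                 (_Row, Acc) -> Acc
              end, [], Tab),
    lists:foreach(fun(Node) -> ets:delete(Tab, Node) end, Stale),
    length(Stale).

%% Estimation step: after sending work to Node, add the expected cost
%% to the local estimate of its load. Returns the new estimate.
bump(Tab, Node, Cost) ->
    case ets:member(Tab, Node) of
        true  -> ets:update_counter(Tab, Node, {2, Cost});
        false -> unknown_node
    end.

rows(Tab) ->
    ets:tab2list(Tab).
```
</pre>
In this sketch a node would call record_broadcast/4 for each load broadcast it receives, prune/3 just before each of its own broadcasts, and bump/3 whenever it sends work to a remote node.<br>

<br>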

Now you have a local ets table of M rows (M being the cluster cardinality) which you can read to make your load balancing decision. I generally went for the lowest loaded node if there was more than one entry, and the local node otherwise.<br>


<br>
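
The selection rule described above might look like this in Erlang. The module and function names are hypothetical, and the rows are assumed to be {Node, LoadMetric, BroadcastTimestamp} tuples already read from the load table (for example with ets:tab2list/1).<br>
<pre>
```erlang
-module(load_pick).
-export([pick/2]).

%% Hypothetical helper: with more than one row, return the node with the
%% lowest load metric; otherwise fall back to the local node.
pick([_, _ | _] = Rows, _LocalNode) ->
    %% keysort/2 sorts ascending on the 2nd tuple element (the load) and
    %% is stable, so ties keep their original order.
    [{Node, _Load, _Ts} | _] = lists:keysort(2, Rows),
    Node;
pick(_Rows, LocalNode) ->
    LocalNode.
```
</pre>
Because this function is deterministic, it returns the same node for the same rows until the table changes.<br>

<br>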

The local estimation step is important if you have a deterministic load balancing function: without it, every request routed between two broadcasts would pick the same lowest loaded node.<br>

<br>

<br>

You can also extend/modify this implementation to add a graceful-cluster-exit scheme for a node by adding an administrative mechanism to stop a node broadcasting its own load. The other nodes will stop transferring load to it and load on it will eventually finish up.<br>


<br>

I'm sorry I don't have an implementation for you to use as I don't have permission to release the code. It's not super-hard to write however.<br>

<br>

Cheers,<br>

-Geoff<br>

<div class="im"><br>

On 2012-10-23, at 02:52 , Joshua Muzaaya <<a href="mailto:joshmuza@gmail.com">joshmuza@gmail.com</a>> wrote:<br>

<br>

> Thank you so much. Let me try that. But is it not possible to have a<br>

> non-random method? One that is intelligent and fair to all the nodes?<br>

><br>

</div>


<div><div class="h5">><br>

><br>

><br>

> On Tue, Oct 23, 2012 at 10:45 AM, Paul Peregud <<a href="mailto:paulperegud@gmail.com">paulperegud@gmail.com</a>>wrote:<br>

><br>

>>> find the erlang random generator skewing results making around 50/60<br>

>>> requests hitting one Node while others are waiting<br>

>><br>

>> This is unusual. I've seen the PRNG from the random module skew results,<br>

>> but never to such an extent. Please check that it is properly seeded<br>

>> (random:seed/0 seeds with a predefined constant).<br>

>><br>

>> If seeding is done properly, then you may want to consider switching to<br>

>> crypto:rand_uniform/2. It's a bit slower, but the quality of its random<br>

>> numbers is more than good enough for load balancing.<br>

>><br>

>><br>

>> On Tue, Oct 23, 2012 at 7:22 AM, Joshua Muzaaya <<a href="mailto:joshmuza@gmail.com">joshmuza@gmail.com</a>>wrote:<br>

>><br>

>>> Yes, I tried using the random method, but because requests are so<br>

>>> frequent, you find the Erlang random generator skewing results, with<br>

>>> around 50/60 requests hitting one node while the others are waiting.<br>

>>> Another thing is that I am not using gen_servers at the web server layer.<br>

>>> I am using the yaws web server, and for each connection yaws spawns a process;<br>

>>> this process communicates with Mnesia nodes to query for data. But the<br>

>>> connections are so many that I wanted to scale the application horizontally,<br>

>>> adding more web servers and more Mnesia nodes. I came to think of a load<br>

>>> balancing middleware that abstracts my processes from knowing where the call<br>

>>> has hit (i.e. on which Mnesia node the call has landed). This middleware<br>

>>> ensures that requests are load balanced across my Mnesia DBs.<br>

>>><br>

>>> That is the background of the problem. It's a real-time web notification<br>

>>> system, plugged into a major intranet management system. However, clients<br>

>>> are many, and yaws is sustaining 30,000 concurrent connections at low<br>

>>> peaks. I am a software engineer in one of the telecommunications companies<br>

>>> in Africa. I keep running into memory problems on a single-node yaws<br>

>>> server, so I need to add more web servers to assist. Also, Mnesia sometimes<br>

>>> reports *** Too many DB Tables ** when requests are too many and too<br>

>>> frequent. I changed everything to use dirty operations, and when I bypassed<br>

>>> the transaction manager, things improved a bit.<br>

>>><br>

>>> I need fellow Erlangers to think of a load balancing algorithm for such a<br>

>>> situation. Do you think a process dictionary like GPROC would be useful?<br>

>>> I was thinking about it last night, but I wonder how I would apply<br>

>>> it in this case.<br>

>>><br>

>>> Having one gen_server decide where each request goes might also slow<br>

>>> down the application, as all requests would have to go through that<br>

>>> gen_server.<br>

>>><br>

</div></div>


<div class="HOEnZb"><div class="h5">>>><br>

>>><br>

>>><br>

>>> On Tue, Oct 23, 2012 at 2:53 AM, Yogish Baliga <<a href="mailto:baliga@gmail.com">baliga@gmail.com</a>> wrote:<br>

>>><br>

>>>> One option is to run a proxy gen_server on each Mnesia box and register<br>

>>>> these gen_server pids with pg2. Now you can do load balancing across the pg2<br>

>>>> processes based on message queue length, as described here:<br>

>>>><br>

>>>> <a href="http://dev.lethain.com/load-balancing-across-erlang-process-groups/" target="_blank">http://dev.lethain.com/load-balancing-across-erlang-process-groups/</a><br>

>>>><br>

>>>> When I used this method in my last Erlang project, it gave better<br>

>>>> results than plain round robin. Under very low load, all requests were<br>

>>>> redirected to a single Mnesia instance.<br>

>>>><br>

>>>> -- baliga<br>

>>>><br>

>>>><br>

>>>> On Mon, Oct 22, 2012 at 2:22 PM, Paul Peregud <<a href="mailto:paulperegud@gmail.com">paulperegud@gmail.com</a>>wrote:<br>

>>>><br>

>>>>> Could you explain why load balancing should be based on time and cannot<br>

>>>>> be random? Have you implemented random load balancing? Has it proved<br>

>>>>> to be insufficient?<br>

>>>>><br>

>>>>> On Mon, Oct 22, 2012 at 9:51 AM, muzaaya_joshua <<a href="mailto:joshmuza@gmail.com">joshmuza@gmail.com</a>><br>

>>>>> wrote:<br>

>>>>>> Building from this question (<a href="http://stackoverflow.com/q/5339329/431620" target="_blank">http://stackoverflow.com/q/5339329/431620</a>),<br>

>>>>>> imagine an application with N Erlang Web Servers and N/2 Mnesia Database<br>

>>>>>> Nodes. The setup is such that each Web Server runs on its own<br>

>>>>>> hardware server (say HP DL385), and each Mnesia Instance runs on its own<br>

>>>>>> hardware server as well.<br>

>>>>>><br>

>>>>>> Web Servers make rpc:call/4 calls to the back end (the Mnesia DB Servers).<br>

>>>>>> The data is replicated across all the Mnesia instances. Now, you want to<br>

>>>>>> have the calls made to the Database servers MULTIPLEXED (more<br>

>>>>>> precisely, by TIME) on each Web Server, so that some kind of LOAD<br>

>>>>>> BALANCING is attained.<br>

>>>>>><br>

>>>>>> If Web Server A makes a connection to Mnesia Instance 3, it cannot make the<br>

>>>>>> next connection to the same Instance. All Database Nodes need to be kept<br>

>>>>>> busy, with none of them idle while the others are working. The<br>

>>>>>> load balancing algorithm should not be random, but should be aimed at<br>

>>>>>> balancing the load on the Database Servers.<br>

>>>>>><br>

>>>>>> Qn 1: Come up with your load balancing strategy in such a situation. Also,<br>

>>>>>> please show, with some illustrative sample code, how you would implement this<br>

>>>>>> strategy.<br>

>>>>>><br>

>>>>>> Qn 2: If a Mnesia Instance goes down, how would your load balancing algorithm<br>

>>>>>> adapt to the changes in the cluster?<br>

>>>>>><br>

>>>>>> Qn 3: Is there any Erlang library aimed at load balancing of Erlang Servers<br>

>>>>>> working within the same system, and calling each other via rpc:call/4?<br>

<br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br>*Muzaaya Joshua<br>Systems Engineer<br>+256774115170*<br>*"Through it all, i have learned to trust in Jesus. To depend upon His Word"<br>*<br><br><br>

<br>