Troubleshooting a high-load scenario

Joel Reymont joelr1@REDACTED
Tue Jan 17 15:05:27 CET 2006


On Jan 17, 2006, at 1:28 PM, Matthias Lang wrote:

> Are you working to understand the system, or just twiddling random
> knobs in public in the hope of sudden "problem disappearance"?

I'm trying to understand which knobs to twiddle. I'm having trouble
with that, which is why I'm asking in public.

> I'd _start_ with this experiment
>
>   1. Find a value N such that
>
>          a) a load generator running N bots runs acceptably

I have established that 500 bots from one VM run fine.

>      AND b) a load generator running M bots, where N < M < 2N,
>             does not run acceptably.

I have established that 1000 bots do not run fine on one VM. Running
two VMs with 500 bots each also fails.

>   2. Use two load generators (i.e. separate, otherwise idle
>      machines!), each running N bots.

We ran that, and it appears that the bottleneck could be on the
server. One machine running 500 bots is fine. Two machines running
500 bots each are not.
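
To make the reasoning behind that conclusion explicit, here is a minimal
sketch of how the two results can be read. It assumes step 1 already
showed that a single generator fails somewhere above N bots. The function
name and the result strings are hypothetical, and the verdict is only a
hint about where to look next, not a diagnosis.

```python
def likely_bottleneck(single_generator_n_ok: bool,
                      two_generators_n_each_ok: bool) -> str:
    """Interpret the two-generator experiment (hypothetical helper)."""
    if not single_generator_n_ok:
        # Step 1a has not been satisfied yet, so the test says nothing.
        return "pick a smaller N"
    if two_generators_n_each_ok:
        # 2N bots in total are fine when spread over two machines, so the
        # single generator was what gave out above N.
        return "load generator"
    # Each generator is at a load it handles alone, yet the system fails:
    # the limit is somewhere the two generators share.
    return "server or network"


# Results reported above: one machine with 500 bots is fine,
# two machines with 500 bots each are not.
print(likely_bottleneck(True, False))  # server or network
```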

> Next step: sniff the network and analyse the traffic.

I will look into that.

> N.B. Your description of the problem leaves open the possibility of
> the number of messages being quadratically related to the number of
> subscribers. My experiment above is set up for a linear relation.

Every bot gets notifications about the other bots, so whenever one bot
acts, everyone else gets a notification. Two bots would generate 2
messages for every action, 10 bots would generate 10 messages, etc.
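
A small sketch of that arithmetic, using the count given above (N bots
means N messages per action). The per-round figure rests on an added
assumption that every bot acts once per round, which the description
above does not state. The function names are hypothetical.

```python
def messages_per_action(bots: int) -> int:
    """Messages sent when one bot acts: one per bot, as counted above."""
    return bots


def messages_per_round(bots: int) -> int:
    """Messages sent if every bot acts once (assumed workload)."""
    return bots * messages_per_action(bots)


for n in (500, 1000):
    print(n, messages_per_action(n), messages_per_round(n))
# 500 500 250000
# 1000 1000 1000000
```

Under that assumption, going from 500 to 1000 bots doubles the messages
per action but quadruples the messages per round.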

	Thanks, Joel

--
http://wagerlabs.com/







