[erlang-questions] trouble with erlang or erlang is a ghetto

Ulf Wiger ulf.wiger@REDACTED
Thu Jul 28 13:00:45 CEST 2011

On 28 Jul 2011, at 12:41, Joel Reymont wrote:

> On Jul 28, 2011, at 11:33 AM, Ulf Wiger wrote:
>> The problems with Distributed Erlang are related to a heavy-handed backpressure solution, where processes trying to send to the dist_port are simply suspended if the output queue exceeds a given threshold. When the queue falls under the threshold, all suspended processes are resumed. Since the algorithm doesn't differentiate between processes, this fate can befall the net ticker as well.
> I thought my net splits were due to heavy process messaging traffic and the net ticker messages falling behind. 
> That didn't quite explain it but what you said does.

Yeah, we (or mainly, Michal Ptaszek) had reason to dig into this fairly recently, and found that tuning can really make a big difference. Still, the whole area should be revisited for smarter overload handling.

A particularly interesting fault situation was when this dynamic ended up suspending the rpc server. It could still receive and process requests (spawning dynamic workers for the processing), but was suspended practically every time it tried to send a reply. Eventually, its message queue used up all memory and killed the node. :)
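
To make the mechanism concrete, here is a toy model in Python of the suspend/resume behaviour described above. It is only a sketch under my own assumptions, not the emulator's actual code: the class name DistPort, its methods, and the exact comparison operators are made up for illustration. The one thing it is meant to show is that the check looks only at the queue size, never at who is sending, so a tiny net ticker message is held back just like bulk traffic.

```python
from collections import deque


class DistPort:
    """Toy model of a distribution port with one busy limit (in bytes).

    Hypothetical illustration only; names and details are assumptions.
    """

    def __init__(self, busy_limit):
        self.busy_limit = busy_limit
        self.queue = deque()   # sizes of the messages waiting to be written
        self.queued_bytes = 0
        self.suspended = []    # senders waiting for the queue to drain

    def send(self, sender, size):
        """Queue a message, or suspend the sender if the queue exceeds the limit."""
        if self.queued_bytes > self.busy_limit:
            # No distinction between senders: everyone is treated alike.
            if sender not in self.suspended:
                self.suspended.append(sender)
            return False
        self.queue.append(size)
        self.queued_bytes += size
        return True

    def drain(self, nbytes):
        """Write whole messages worth up to nbytes; return the senders resumed."""
        while self.queue and nbytes >= self.queue[0]:
            size = self.queue.popleft()
            nbytes -= size
            self.queued_bytes -= size
        if self.queued_bytes <= self.busy_limit and self.suspended:
            # Once the queue is back under the limit, resume all senders at once.
            resumed, self.suspended = self.suspended, []
            return resumed
        return []


if __name__ == "__main__":
    port = DistPort(busy_limit=128 * 1024)
    for _ in range(3):
        port.send("bulk", 64 * 1024)          # pushes the queue past the limit
    print(port.send("net_ticker", 16))        # the ticker is suspended too
    print(port.drain(64 * 1024))              # all suspended senders resume
```

In this model, raising busy_limit (the knob that the release notes below say grew from 128 KB to 1 MB and became configurable) simply makes it less likely that the ticker gets caught behind bulk traffic; it does not make the algorithm any smarter about which process it suspends.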

Actually, these changes in R14B01 are relevant:

    OTP-8901  The runtime system is now less eager to suspend processes
	      sending messages over the distribution. The default value of
	      the distribution buffer busy limit has also been increased
	      from 128 KB to 1 MB. This in order to improve throughput.

    OTP-8912  The distribution buffer busy limit can now be configured at
	      system startup. For more information see the documentation of
	      the erl +zdbbl command line flag. (Thanks to Scott Lystig

and possibly also this:

    OTP-8916  The inet driver internal buffer stack implementation has been
	      rewritten in order to reduce lock contention.

Ulf W

Ulf Wiger, CTO, Erlang Solutions, Ltd.
