[erlang-questions] Large-scale Erlang in practice

Fri Feb 3 19:13:57 CET 2012

As some of you know, we operate a medium-to-large-scale Erlang
infrastructure, with about 16 nodes (each of which is a dual-Xeon type
Dell server).

As we put load on the system, we keep finding places where the system
falls over, typically because of N-squared behavior in various parts
of OTP and the runtime library. This is a bit disconcerting, because
any system that scales really should be built from operations with
constant-time (or, worst case, log-time) runtime behavior.

The latest such behavior we've found is gen_udp:send(). This makes a
synchronous call from the calling process to the inet system, and then
waits for a response. Unfortunately, that wait-for-response is a
selective (matched) receive in the calling process. When the calling
process has a mailbox with a number of messages in it, each call to
gen_udp:send() has to scan past all of those messages before it finds
the reply.
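
To make the pattern concrete, here is a minimal sketch of a synchronous
call that waits in a selective receive. It does not touch gen_udp at
all; the module name mailbox_scan, the server_loop/0 stand-in for the
inet side, and the message shapes are all made up for illustration.

```erlang
-module(mailbox_scan).
-export([demo/1, sync_call/2]).

%% Hypothetical stand-in for the inet side: answers every request
%% with a reply tagged by its own pid.
server_loop() ->
    receive
        {request, From, Payload} ->
            From ! {reply, self(), Payload},
            server_loop();
        stop ->
            ok
    end.

%% Same shape as a synchronous send: post a request, then block in a
%% selective receive that only matches the reply. Every message that
%% is already queued and does not match has to be skipped first.
sync_call(Server, Payload) ->
    Server ! {request, self(), Payload},
    receive
        {reply, Server, Result} -> Result
    end.

%% Queue Backlog unrelated messages in our own mailbox, then time one
%% synchronous call made while that backlog is still pending.
%% Returns {Result, ElapsedMicroseconds}.
demo(Backlog) ->
    Server = spawn(fun server_loop/0),
    lists:foreach(fun(I) -> self() ! {work, I} end,
                  lists:seq(1, Backlog)),
    {Micros, Result} = timer:tc(fun() -> sync_call(Server, ping) end),
    Server ! stop,
    flush_work(),
    {Result, Micros}.

%% Remove the synthetic backlog again so the caller's mailbox is clean.
flush_work() ->
    receive
        {work, _} -> flush_work()
    after 0 ->
        ok
    end.
```

Comparing mailbox_scan:demo(0) with mailbox_scan:demo(100000) in a
shell should show the elapsed time of the single call growing with the
backlog, though the exact numbers will depend on the machine and the
release.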

So, consider a burst of, say, 10,000 messages I want to send. If each
of those messages is an input message to a process that calls
gen_udp:send(), then sending any one of them causes a scan of
everything still waiting in the queue, and the burst as a whole costs
on the order of N-squared message inspections.
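
To put a rough number on that burst, here is a back-of-the-envelope
count. It assumes the simplest case: all N messages are already queued
when processing starts, nothing else arrives, and each send has to skip
every message still waiting. The module and function names are just
for illustration.

```erlang
-module(scan_cost).
-export([total_scans/1]).

%% While handling message I (1-based), N - I messages are still in
%% the mailbox and get skipped by the receive that waits for the
%% reply. Summing over the whole burst gives N * (N - 1) / 2.
total_scans(N) when is_integer(N), N >= 0 ->
    lists:sum([N - I || I <- lists:seq(1, N)]).
```

Under those assumptions, scan_cost:total_scans(10000) comes to
49,995,000 skipped messages for a burst of 10,000 sends.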

I might say that "I shouldn't have a long mailbox," but in a
message-based system, that's pretty much impossible to avoid. For
example, if some part of the Internet is having bad weather, ten or
twenty thousand clients will possibly be disconnected at one time,
each of which will cause a process monitor message to be queued with
the supervisor or other such process. If all operations are
constant-time, that doesn't matter, but with each operation being
linear in pending messages or linear in number of supervised
processes, such an event will cause unbounded disruption to the system
at large -- not just the clients affected by the "bad weather."
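
One pure-Erlang way to keep the per-operation cost from depending on
the backlog is to move pending messages out of the mailbox and into a
queue held in process state before doing anything synchronous. The
sketch below assumes a plain process (not a gen_server), a made-up
module name drain_first, a caller-supplied Handle fun, and a 'stop'
message to end the loop. It also swallows every message type without
distinction, which a real process would not want to do.

```erlang
-module(drain_first).
-export([loop/2, drain/1]).

%% Move everything currently in the mailbox into the queue Q without
%% blocking. Each step takes the first message, so it is constant
%% time per message.
drain(Q) ->
    receive
        Msg -> drain(queue:in(Msg, Q))
    after 0 ->
        Q
    end.

%% Handle is a hypothetical callback that may make synchronous calls.
%% Because the backlog lives in the queue and not in the mailbox, a
%% selective receive inside Handle only has to skip messages that
%% arrived since the last drain.
loop(Handle, Q0) ->
    Q1 = drain(Q0),
    case queue:out(Q1) of
        {{value, stop}, _} ->
            ok;
        {{value, Msg}, Q2} ->
            Handle(Msg),
            loop(Handle, Q2);
        {empty, Q2} ->
            %% Nothing pending: block until the next message arrives.
            receive
                Msg -> loop(Handle, queue:in(Msg, Q2))
            end
    end.
```

A worker could be started with something like
spawn(fun() -> drain_first:loop(fun handle/1, queue:new()) end), where
handle/1 is whatever function ends up doing the actual sending.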

So, what's the meaning of this rant? Three things:

1) I'm considering writing some NIFs that allow me to send UDP
datagrams without going through a mailbox scan. For syslog and
statistics, this would be a perfectly fine thing to do.

2) Are there any other
"linear-in-number-of-users-so-n-squared-overall" systems we need to
worry about? Mailbox scans may happen deep inside the standard
libraries, and I'm at the mercy of those implementations, no matter
how careful I am in crafting my own code. (N-squared garbage generated
by using dict for data that mutates often was another "neat"
realization.) What's the next thing that we'll run into?

3) Has there been any attempt at removing these kinds of bottlenecks
on a more systematic level? We're still using R13B03 because we're
on Ubuntu LTS, which is still at version 10.04. However, if there is
a significant, measurable gain to be had by upgrading, it might be
worth it.

Sincerely,

jw

--
Americans might object: there is no way we would sacrifice our living
standards for the benefit of people in the rest of the world.
Nevertheless, whether we get there willingly or not, we shall soon
have lower consumption rates, because our present rates are
unsustainable.