[erlang-questions] Architecture: How to achieve concurrency when gen_server:handle_call waits?

Sun Feb 7 14:49:05 CET 2016

Hi Luke.

On Monday, 8 February 2016 00:20:34 Luke wrote:
> Hi,
> 
> The way I'm getting my head around the actor model & OTP is to imagine it's
> like an office building full of people with specialised jobs (gen_servers),
> and you send emails to people who work on your request right away and reply
> when they're done (for a handle_call). If I have used call then it's as if
> I fire off the email and sit there clicking refresh until I get a reply
> (or like a literal call, I am on hold until they get back to me), but if I
> use cast then I just get on with my day and maybe check my inbox again
> later. I can see a lot of benefits with this kind of organisation of your
> programs and like it a lot.
> 
> My problem is that when a gen_server is working on something they are busy
> and can't answer more calls/casts. Is the correct way to achieve
> concurrency in Erlang to have each gen_server spawn a brand new process and
> then go back to checking their inbox again? In the office building
> metaphor, essentially each worker also has access to an infinite pool of
> interns and they are able to forward tasks, immediately delegating all the
> work away (I used to work at Ericsson, I can see how this model comes
> naturally to them :P)

There is nothing inherently wrong with this way of doing things, but
consider two things:

- Most of the time an Erlang process is less about doing work (else why is
the caller not just spawning his own process?) and more about managing
state. It is the *context* of the call that is significant more than the
algorithm underlying whatever was done.

- How must the reply now be achieved? Either the worker must be given the
full reply reference (which is a bit more complex to deal with, but
absolutely workable) or the original gen_server must receive *another*
message back from its worker and forward it to the caller. This turns the
gen_server into more of a dispatch server than a state server.
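
The first option (handing the reply reference to the worker) can be
sketched roughly like this. The module name delegating_server, the
slow_square request and the timer:sleep/1 stand-in for slow work are all
made up for illustration; gen_server:reply/2 and the {noreply, State}
return are the standard OTP pieces.

```erlang
-module(delegating_server).
-behaviour(gen_server).

-export([start_link/0, slow_square/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Still a blocking call for the caller; the server itself never waits.
slow_square(Server, N) ->
    gen_server:call(Server, {slow_square, N}).

init([]) ->
    {ok, #{}}.

handle_call({slow_square, N}, From, State) ->
    %% Give the reply reference (From) to a linked worker and return at once.
    spawn_link(fun() ->
        timer:sleep(100),               % hypothetical stand-in for slow work
        gen_server:reply(From, N * N)   % the worker answers the caller directly
    end),
    {noreply, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.
```

Note that the worker never sees or changes the server's state, which is
exactly why this shape fits dispatch-style servers better than state
servers.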

Again, there is nothing at all wrong with this model, it's just that very
often this approach is more awkward than it is worth. There *are* some
cases you very occasionally run into where this approach is a very big
win -- but it's not so common that OTP abstracts it out for you, and often
when you do need this the situation is special enough that a generic
solution may not fit (but I could imagine a generalization of the idea
working well all the same: the existing implementation is called
spawn_link).

> If this is indeed the correct way to achieve concurrency, I still have the
> following questions:

First off -- you're *already* achieving concurrency.

The bigger issue is that you're stuck thinking in terms of synchronous
calls. Don't do that. Try seeing if you can write a program that deals
only in casts. Sometimes you can, sometimes you can't -- but you'll develop
a sense for where casts are sufficient and calls are actually necessary.
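
As a rough illustration of a cast-only protocol, here is a minimal sketch
of a counter that never answers a call with data. The module name
cast_counter and the message shapes (increment, {report_to, Pid},
{count, ServerPid, Count}) are assumptions of mine, not anything OTP
defines.

```erlang
-module(cast_counter).
-behaviour(gen_server).

-export([start_link/0, increment/1, report_to/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Fire and forget: returns ok immediately.
increment(Server) ->
    gen_server:cast(Server, increment).

%% Ask the server to send its count to Pid as a plain message later on.
report_to(Server, Pid) ->
    gen_server:cast(Server, {report_to, Pid}).

init([]) ->
    {ok, 0}.

%% No synchronous API in this sketch; any call is refused.
handle_call(_Request, _From, Count) ->
    {reply, {error, unsupported}, Count}.

handle_cast(increment, Count) ->
    {noreply, Count + 1};
handle_cast({report_to, Pid}, Count) ->
    Pid ! {count, self(), Count},
    {noreply, Count}.
```

The requester gets on with its day and picks up {count, ServerPid, Count}
from its mailbox whenever it gets around to it.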

> 1 - Why isn't this done automatically behind the scenes in gen_server? When
> would you ever not want to free up your gen_server to handle more requests?

Usually you don't run into situations where a single gen_server's response
time dictates your entire system's latency. If you have this situation you
have just invented a bottleneck. That's silly. Don't do that. Don't make your
bottlenecks performant, just don't have bottlenecks in the first place.

It is surprising how little centralization you actually need in most
systems if you think carefully about it (or get used to the idea).

> 2 - Is it best practice to spawn each new process under the gen_server, a
> supervisor somewhere, or not at all?

It is usually a good thing to always spawn them under a supervisor. But
there are a variety of approaches. Generally I prefer to have one
simple_one_for_one supervisor for each kind of one-off task that other
processes in the system will want to spawn for parallel computation. But
this also constrains your thinking considerably (which is also usually a
good thing). Sometimes you do need the flexibility of a loose process --
but whenever you do this, spawn_link it, don't just spawn it -- that way if
it dies, its entire branch of the supervision tree dies with it and is
restarted (or not) in an understandable way. You get surprising and weird
things happening if you start spawning detached processes.

As a guideline: Always spawn under a supervisor.
As a hard rule: Always ALWAYS spawn linked to the supervision tree.
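
A minimal sketch of the simple_one_for_one approach, assuming OTP 18 or
later (for the map-based supervisor flags). The module name task_sup and
the helper functions start_task/1 and run/1 are hypothetical, and the
restart intensity numbers are arbitrary.

```erlang
-module(task_sup).
-behaviour(supervisor).

-export([start_link/0, start_task/1, run/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Ask the supervisor to start one more one-off task.
start_task(Fun) when is_function(Fun, 0) ->
    supervisor:start_child(?MODULE, [Fun]).

%% Child start function. It runs inside the supervisor process, so
%% spawn_link/1 links the new task to the supervisor.
run(Fun) ->
    {ok, spawn_link(Fun)}.

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,
                 period => 10},
    %% The argument list given to start_child/2 is appended to the [] here,
    %% so start_task(Fun) ends up calling run(Fun).
    Child = #{id => task,
              start => {?MODULE, run, []},
              restart => temporary,      % one-off work: never restarted
              shutdown => brutal_kill,
              type => worker},
    {ok, {SupFlags, [Child]}}.
```

Something like task_sup:start_task(fun() -> do_work() end) then gives you
a task that lives in the supervision tree instead of floating loose
(do_work/0 being whatever your job is).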

> 3 - If your gen_server is being flooded with messages, would one viable
> solution to achieve concurrency be creating say 10 of the same gen_server
> under a supervisor, and having any processes passing messages to this "job"
> randomly pick one, effectively using probability to reduce traffic to each
> by 1/10 - is there a library/methodology for doing this kind of thing
> already?

This is pooling, and yes, there are tools for it. It is sort of silly to
ever have this problem, though (imo). Whenever I want pooling what I
really want is a bunch of stateless workers I can ride into the ground,
and the only reason I ever want to have a fixed number of them is to
control resource usage. It is also possible to just spawn processes as
needed, though, and let them die whenever they are done doing what they are
doing. gen_servers are *slightly* more expensive to spawn than plain Erlang
processes (for some definition of slightly...).
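
The spawn-as-needed style might look something like the sketch below: one
short-lived linked process per job, no pool at all. The module name
spawn_per_job and the function pmap/2 are illustrative names of mine, and
there is deliberately no limit on how many processes get started, which is
exactly the resource-usage tradeoff mentioned above.

```erlang
-module(spawn_per_job).
-export([pmap/2]).

%% Run Fun on every item, one throwaway linked process per item,
%% and collect the results in the original order.
pmap(Fun, Items) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                %% Linked, so a crashing job takes the caller down with it
                %% instead of leaving it waiting forever.
                spawn_link(fun() -> Parent ! {Ref, Fun(Item)} end),
                Ref
            end || Item <- Items],
    %% Selective receive on each unique Ref keeps the results in order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

Each worker dies as soon as it has sent its result, so there is nothing to
check back in or clean up.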

In brief: there are dogmatic answers either way, and neither is a clean
fit for many cases you'll run into in reality. The only real answer to
pooling vs. spawning is: "it depends".

> 4 - This seems like it would be a common piece of code, is this bundled
> into OTP somewhere? Is this situation what I'm supposed to use gen_event
> for? Or if I'm completely wrong, what is the actual way programs like yaws
> achieve high concurrency, as reading the source code has not revealed the
> answer to me.

For client/connection driven services the typical way is not to have a
connection pool, but to spawn a connection per client.

A way of dealing with this I use in messaging systems, business software
and games where a single user may concurrently log in from several clients
is to have a process spawned to handle each network connection, but a
separate client "controller" spawned to handle each *session* (which
itself may have several connections associated with it).
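
To make the process-per-connection half of that concrete, here is a bare
sketch using gen_tcp. It only echoes data back; the session "controller"
layer is left out entirely. The module name conn_per_client and its
functions are made up for the example, and a real system would put the
acceptor and handlers under a supervisor rather than starting them by
hand like this.

```erlang
-module(conn_per_client).
-export([start/1, accept_loop/1]).

%% Listen on Port (0 lets the OS pick a free one) and start an acceptor.
start(Port) ->
    {ok, Listen} = gen_tcp:listen(Port, [binary,
                                         {active, false},
                                         {reuseaddr, true}]),
    Acceptor = spawn_link(?MODULE, accept_loop, [Listen]),
    {ok, Listen, Acceptor}.

%% Accept forever, giving every client connection its own process.
accept_loop(Listen) ->
    case gen_tcp:accept(Listen) of
        {ok, Socket} ->
            Handler = spawn_link(fun() -> receive go -> echo(Socket) end end),
            %% Make the handler the socket owner before it starts reading.
            ok = gen_tcp:controlling_process(Socket, Handler),
            Handler ! go,
            accept_loop(Listen);
        {error, _Reason} ->
            ok
    end.

%% Per-connection process: echo until the peer goes away, then exit normally.
echo(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} ->
            case gen_tcp:send(Socket, Data) of
                ok -> echo(Socket);
                {error, _} -> gen_tcp:close(Socket)
            end;
        {error, _} ->
            gen_tcp:close(Socket)
    end.
```

A session controller would sit behind these handlers, with each handler
forwarding what it reads to the controller for its user's session.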

There isn't anything in OTP that abstracts that away. I'd love to write
a library for this -- but no money and no time. Poof! Anyway, writing
this isn't hard, just sort of annoying to do more than once.

-Craig