[erlang-questions] Best way to handle multiple gen_server

Tue Feb 14 21:33:05 CET 2017

One thing to think about when you spawn a gen_server for each request is
what should happen if the system decides to terminate the processes
immediately around it. One way is to hook them into a simple_one_for_one
supervisor, so they have a place in the supervision tree. If, however, they
are truly one-off processes, you can often just spawn_link them under the
process that started them. Do note that an error in your process then
terminates the processes it is linked to, unless they trap exits.
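
To make the supervisor option concrete, here is a minimal sketch of a
simple_one_for_one supervisor that starts one temporary child per request.
The module name request_sup and the helpers start_request/1 and
start_worker/1 are made up for illustration. The worker is a bare proc_lib
process to keep the example in one module; with a real gen_server you would
point the child spec at its start_link function instead.

```erlang
-module(request_sup).
-behaviour(supervisor).

-export([start_link/0, start_request/1, start_worker/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Start one child per request. For simple_one_for_one, the list given to
%% start_child/2 is appended to the argument list in the child spec.
start_request(Fun) when is_function(Fun, 0) ->
    supervisor:start_child(?MODULE, [Fun]).

%% Hypothetical worker start function. It runs inside the supervisor, so
%% spawn_link ties the new process to the supervision tree.
start_worker(Fun) ->
    {ok, proc_lib:spawn_link(Fun)}.

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 10,
                 period => 10},
    ChildSpec = #{id => request_worker,
                  start => {?MODULE, start_worker, []},
                  restart => temporary,   % one-off work: never restart
                  shutdown => 5000,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.
```

Calling request_sup:start_request(fun() -> do_work() end) then gives you a
process that is cleaned up when the supervisor shuts down (do_work/0 stands
in for whatever the request needs).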

In general, avoid "naked" spawns where you have a process that is not tied
to anything, unless you know it is a process that is guaranteed to
terminate in a short amount of time. Otherwise you are looking at leaks.

A gen_server can terminate itself by returning {stop, normal, State} from a
callback. This is seen as a graceful exit, and for a transient or temporary
child it does not create an error in the supervisor it is tied to. OTP
treats the exit reasons shutdown and {shutdown, Reason} the same way, but a
normal termination is probably fine in this case.
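
As a rough sketch of that pattern: a gen_server that does one piece of
work, sends the result back, and stops with reason normal. The module name
oneshot_worker, the {result, Pid, Value} message shape and the
start_link/2 arguments are assumptions for the example, not anything OTP
prescribes.

```erlang
-module(oneshot_worker).
-behaviour(gen_server).

-export([start_link/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% ReplyTo receives {result, WorkerPid, Value} once Fun has been run.
start_link(ReplyTo, Fun) when is_pid(ReplyTo), is_function(Fun, 0) ->
    gen_server:start_link(?MODULE, {ReplyTo, Fun}, []).

init({ReplyTo, Fun}) ->
    %% Defer the work so init/1 returns quickly and the starter is not blocked.
    self() ! run,
    {ok, {ReplyTo, Fun}}.

handle_call(_Request, _From, State) ->
    {reply, {error, unsupported}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(run, {ReplyTo, Fun} = State) ->
    ReplyTo ! {result, self(), Fun()},
    %% Stopping with reason 'normal' is the graceful exit described above.
    {stop, normal, State};
handle_info(_Other, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

If Fun crashes instead, the server exits with an abnormal reason, and the
link or supervisor sees that as an error.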

As for running thousands of requests, it is usually not a problem, unless
each of your requests constructs a process with a memory footprint of
several hundred megabytes. My usual ballpark figure is that you can have a
million processes in less than a gigabyte of memory. YMMV depending on
memory footprint of course.
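
If you want to check the ballpark on your own node, something like the
following gives a rough per-process figure. It is only a sketch: the module
name proc_mem is made up, the processes are idle and hold no data, and the
number includes a small overhead for the pid list kept by the caller.

```erlang
-module(proc_mem).
-export([bytes_per_process/1]).

%% Spawn N idle processes and report the average growth of process memory.
%% The spawns are unlinked, but every process is told to stop before we
%% return, so nothing is leaked.
bytes_per_process(N) when is_integer(N), N > 0 ->
    Before = erlang:memory(processes),
    Pids = [spawn(fun() -> receive stop -> ok end end)
            || _ <- lists:seq(1, N)],
    After = erlang:memory(processes),
    [Pid ! stop || Pid <- Pids],
    (After - Before) div N.
```

Multiply the result by the number of processes you expect, then add
whatever state each request actually carries.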

It is often beneficial to just spawn a process per work unit you have as
this maximizes your concurrency level in the system. It also avoids
pipelined architectures, since each unit of work can simply do what it
needs to do. One thing to beware of, though: while Erlang has no trouble
creating, say, 20,000 requests to a foreign system, that foreign system may
not be able to cope with that load. In this case, you often use some kind
of queueing process that makes sure you limit the outbound communication to
the foreign system. Database pools (github.com/devinus/poolboy) and HTTP
pools are examples of such limiters. As are Ulf Wiger's 'jobs' framework,
and my own 'safetyvalve' framework. Another example is if your processes
will be, say, resizing images, which can take up lots of memory. You may
want to spawn 20,000 processes, but have a system which hands out "tokens"
once in a while to limit the concurrency to something like 8-16 actually
doing work. Here something like https://github.com/duomark/epocxy and its
Concurrency Fount might come in handy as well (Jay Nelson is the author).
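
To illustrate the token idea, here is a small hand-rolled limiter. It is
not how jobs, safetyvalve, poolboy or epocxy work internally; the module
name token_limiter and its run/2 API are invented for this sketch, and it
assumes each process holds at most one token at a time.

```erlang
-module(token_limiter).
-behaviour(gen_server).

-export([start_link/1, run/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link(MaxTokens) when is_integer(MaxTokens), MaxTokens > 0 ->
    gen_server:start_link(?MODULE, MaxTokens, []).

%% Block until a token is free, run Fun, then hand the token back.
run(Limiter, Fun) ->
    ok = gen_server:call(Limiter, acquire, infinity),
    try
        Fun()
    after
        gen_server:cast(Limiter, {release, self()})
    end.

init(MaxTokens) ->
    {ok, #{free => MaxTokens, waiting => queue:new(), holders => #{}}}.

%% No token free: park the caller; it is answered later from next/1.
handle_call(acquire, From, #{free := 0, waiting := Q} = S) ->
    {noreply, S#{waiting := queue:in(From, Q)}};
handle_call(acquire, {Pid, _Tag}, S) ->
    {reply, ok, grant(Pid, S)}.

handle_cast({release, Pid}, S) ->
    {noreply, release(Pid, S)}.

%% A holder died without releasing: reclaim its token.
handle_info({'DOWN', _Ref, process, Pid, _Reason}, S) ->
    {noreply, release(Pid, S)};
handle_info(_Other, S) ->
    {noreply, S}.

terminate(_Reason, _S) -> ok.

code_change(_OldVsn, S, _Extra) -> {ok, S}.

grant(Pid, #{free := N, holders := H} = S) ->
    Ref = erlang:monitor(process, Pid),
    S#{free := N - 1, holders := H#{Pid => Ref}}.

release(Pid, #{free := N, holders := H} = S) ->
    case maps:find(Pid, H) of
        {ok, Ref} ->
            erlang:demonitor(Ref, [flush]),
            next(S#{free := N + 1, holders := maps:remove(Pid, H)});
        error ->
            S
    end.

%% Pass a freed token to the longest-waiting caller, if there is one.
next(#{waiting := Q} = S) ->
    case queue:out(Q) of
        {{value, {Pid, _Tag} = From}, Q1} ->
            gen_server:reply(From, ok),
            grant(Pid, S#{waiting := Q1});
        {empty, _} ->
            S
    end.
```

With {ok, L} = token_limiter:start_link(8), you can still spawn 20,000
workers, but only 8 of them are inside token_limiter:run(L, Fun) doing real
work at any moment.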

As for losing a mailbox because a process dies: if you think about it, this
is the same as if the message had been lost in transit to the mailbox. Your
system must be able to cope with message loss like this, though you can
usually assume the loss rate to be very low (say one in a million). It
makes for a more robust system, and also, it allows you to seamlessly
divide work over multiple nodes later, should it be necessary. If you
can, prefer a system which uses idempotent mechanisms: retry after a while
if no answer arrives or the event didn't go through. Make sure that
retrying something which has already been done returns the same result as
the first attempt.
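
A sketch of both halves of that advice, under assumed names (module idem,
call_with_retry/3, apply_once/3): the client retries when no answer comes
back, and the server side remembers results by request id so a repeated
request returns the first answer instead of doing the work again. The Seen
map would normally live in the state of whatever process owns the
operation.

```erlang
-module(idem).
-export([call_with_retry/3, apply_once/3]).

%% Client side: call Fun up to Attempts times, sleeping DelayMs between
%% tries. Anything other than {ok, Result} counts as "no answer".
call_with_retry(_Fun, 0, _DelayMs) ->
    {error, no_answer};
call_with_retry(Fun, Attempts, DelayMs) when Attempts > 0 ->
    case Fun() of
        {ok, Result} ->
            {ok, Result};
        _NoAnswer ->
            timer:sleep(DelayMs),
            call_with_retry(Fun, Attempts - 1, DelayMs)
    end.

%% Server side: run Fun only the first time ReqId is seen. A retry with
%% the same ReqId gets the stored result back.
apply_once(ReqId, Fun, Seen) when is_map(Seen) ->
    case maps:find(ReqId, Seen) of
        {ok, Result} ->
            {Result, Seen};
        error ->
            Result = Fun(),
            {Result, Seen#{ReqId => Result}}
    end.
```

The request id has to be chosen by the client and reused on every retry,
otherwise the server cannot tell a retry from a new request.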

On Tue, Feb 14, 2017 at 6:13 PM Alex S. <alex0player@REDACTED> wrote:

>
> > On 14 Feb 2017, at 20:06, Felipe Beline <fe.belineb@REDACTED> wrote:
> >
> > Hi,
> >
> > My first question is: if I have a gen_server started by a supervisor and
> > it's handling a request (but has several others waiting in the queue),
> > then it dies for some reason and the supervisor restarts it, will the
> > requests in the queue be lost?
> >
> > My other question is: if I want to create a process of this specific
> > gen_server for each request that I have (so that several instances run
> > in "parallel"), but after each one finishes executing and returns the
> > calculated value I want it to terminate itself, what is the proper way
> > to implement it?
> >
> > Another doubt: let's say that several thousand requests are made at the
> > same time, and one instance of the server is created for each one.
> > Should that be "ok" :)? Or should I create a limited number of servers
> > and then distribute the requests over them? In this case, would the OTP
> > pool help me?
> >
> > Cheers, Felipe
> The requests in queue will be lost, but your clients will receive an error
> and can retry.
>
> As far as spawning short-lived tasks, I suggest looking into proc_lib and
> its start/init_ack capabilities. It unfortunately conceals a little bit of
> debugging info right now but that’ll be fixed in OTP 20.
>
> I wouldn’t recommend the `pool` module personally. Handling several thousand
> requests simultaneously should be no problem, but that of course depends on
> the nature of handling. I’d say if you cannot cache any data between
> requests, there’s no real point in pooling.