[erlang-questions] Using gen_server or writing a new behavior

Fri Jan 16 20:55:52 CET 2009

On Fri, Jan 16, 2009 at 18:15, Ciprian Dorin, Craciun
<ciprian.craciun@REDACTED> wrote:
>> * each request is a process that looks at the method, url, etc, and
>> dispatches to functions that parse what is needed, such as posted
>> bodies or query args.
>> * all the data goes into a function that calls plain ordinary
>> gen_servers (if they need to at all!), still in the same process
>> * the result is sent to a template function that renders the result
>> values to something htmlish, and still in the same process.
>
>    What do you mean by "calls plain ordinary gen_servers still in the
> same process"? Calling a gen_server implies sending a message to a
> gen_server process, right?

That was a bit unclear. I mean that the controller might not need to
call a gen_server at all. Simple form validation can be done in the
request process, i.e. if the form values don't match regexps and such,
then it can just throw an exception or serve the form again with the
complaints added.

But if the submitted form is valid it can take it to the next level
and actually perform the side effects, either by calling a
gen_server or by running a mnesia transaction.
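
A minimal sketch of that split, under stated assumptions: the module
name signup_request, the registered server name user_store, the form
fields and the regexps are all made up for illustration and come from
nowhere in this thread. Validation is a pure function that runs in the
request process; only a valid form goes on to the gen_server.

```erlang
-module(signup_request).
-export([handle/1, validate/1]).

%% Hypothetical controller: runs entirely in the per-request process.
%% `user_store` is an assumed registered gen_server.
handle(Form) ->
    case validate(Form) of
        ok ->
            %% Only valid input reaches the shared, serialized resource.
            gen_server:call(user_store, {create, Form});
        {error, Complaints} ->
            %% Invalid input never leaves the request process:
            %% serve the form again with the complaints added.
            {render_form, Form, Complaints}
    end.

%% Pure function: no messages, no shared state.
%% Form is a proplist of {Field, String}; the checks are illustrative.
validate(Form) ->
    Checks = [{name,  "^[a-z]+$"},
              {email, "^[^@]+@[^@]+$"}],
    Failed = [Field || {Field, Regexp} <- Checks,
                       not matches(proplists:get_value(Field, Form, ""),
                                   Regexp)],
    case Failed of
        [] -> ok;
        _  -> {error, Failed}
    end.

matches(Value, Regexp) ->
    re:run(Value, Regexp, [{capture, none}]) =:= match.
```

Any number of request processes can run validate/1 at the same time;
they only queue up at the gen_server:call, and only if they pass.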

>> This is because sending each request directly into a gen_server will
>> limit concurrency. While it processes one request, others could be
>> _waiting to be served_. The latter being the problem.
>
>    Yes, indeed this would happen, and this would be a
> disadvantage... (Limiting concurrency, adding message passing overhead)

(Message passing has very low overhead, check out the ring
benchmarks.) It's all about the concurrency. It just doesn't make sense
to perform form validation in a gen_server and have other processes
waiting to have their forms validated. These validations don't depend
on each other's completion order.

>    But there are two reasons I would like to do this (that is using
> gen_server like processes):
>    * first is that it looks more Erlang-ish, than dispatching the
> request to a module:function... (the Erlang way is to have lots of
> processes right?);
>    * second, I could create multiple request handlers that are used
> in a round robin fashion; (and thus control the load on a given set of
> URLs, without having fancy code behind it, just by limiting the number
> of registered handlers);

I've never felt that the Erlang way is to throw processes at things
for the fun of it. When to use processes has been discussed on the
list without any clear rules of thumb emerging. It really calls for a
case-by-case analysis. Here I just don't see what you gain. Two
requests are "embarrassingly parallelizable" until they go for the
same resource. Let them be dynamically created processes, one per
request.

I do not buy the second argument because the process scheduler already
does the job of making sure processes get a fair slice of the CPU.
If you want to distribute requests over multiple nodes then it might
be sensible, but reverse proxies for HTTP already provide a mechanism
for it.

>   As a side note, this is what bothers me with most Erlang http
> servers: they use module:callback dispatchers and not processes... It
> adds performance, but reduces flexibility / OTP-way of doing things
> (if there is such a thing)...

It is because the HTTP server has a process created for that TCP
connection, which has parsed the HTTP headers and now goes on to the
callback for processing the request. Yaws, mochiweb, iserve... it is
all one-process-per-request already.

I appreciate the OTP behaviours for long-running services. For a
single HTTP request I don't bother. The HTTP port listener is a
long-running process that tends to be supervised and an OTP behaviour.
If it crashes it is a problem. If a single HTTP request crashes it is
not much of a problem: as long as the HTTP listener is up it can start
a new acceptor, so it is an isolated problem for just that request.
Just log/notify some error status.
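
To illustrate the isolation, here is a small sketch. It is not how
Yaws or mochiweb are actually written; the module name request_proc
and the serve/2 function are hypothetical. The handler runs in a
fresh, monitored process, and a crash there shows up in the caller
only as a 'DOWN' message it can log. Unlike a real listener, this
caller waits for the outcome, just to make the result visible.

```erlang
-module(request_proc).
-export([serve/2]).

%% Runs Handler(Request) in a fresh, monitored process and waits for
%% the outcome. A crash in the handler ends only that process.
serve(Handler, Request) ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MonRef} =
        spawn_monitor(fun() -> Parent ! {Ref, Handler(Request)} end),
    receive
        {Ref, Response} ->
            %% Drop the monitor and any 'DOWN' already in the mailbox.
            erlang:demonitor(MonRef, [flush]),
            {ok, Response};
        {'DOWN', MonRef, process, Pid, Reason} ->
            %% Log/notify here; the caller itself keeps running.
            {error, Reason}
    end.
```

For example, request_proc:serve(fun(X) -> X * 2 end, 2) returns
{ok, 4}, while a handler that calls exit(boom) gives {error, boom}
and the calling process carries on.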

What flexibility do you lose?  Doing the gen_server:reply/2 trick by
having a gen_server spawn new processes just seems like duplication to
get what you already had with callback-based http servers.
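
For reference, a sketch of the trick, assuming it means returning
{noreply, State} from handle_call/3 and answering later from a spawned
process. The module name reply_trick and the {run, Fun} message are
invented for the example.

```erlang
-module(reply_trick).
-behaviour(gen_server).
-export([start_link/0, request/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

start_link() -> gen_server:start_link(?MODULE, [], []).

%% Client API: blocks until some process calls gen_server:reply/2.
request(Server, Fun) -> gen_server:call(Server, {run, Fun}).

init([]) -> {ok, nostate}.

handle_call({run, Fun}, From, State) ->
    %% Hand the work to a fresh process and return at once, so the
    %% server can take the next call while this one is still running.
    spawn(fun() -> gen_server:reply(From, Fun()) end),
    {noreply, State}.

handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

The server ends up as little more than a dispatcher that spawns one
process per request, which is the duplication I mean.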