Advantages of a large number of threads cf other approaches?

Shawn Pearce <>
Thu Feb 19 01:15:29 CET 2004


I think Sean forgot to post this to the list, and replied directly to
me... His "anyone?" comment at the end gives it away.  :)

When a process forks in UNIX to create a child, all of its open file
descriptors are automatically made available in the child.  When the
child uses an exec call to start a different program within that
process, descriptors with the FD_CLOEXEC flag set are closed
automatically and the rest remain open.  The flag is set or cleared
through fcntl(2).
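
To make the flag concrete, here is a minimal sketch in Python, whose
fcntl module wraps the same fcntl(2) call.  The helper names set_cloexec
and survives_exec are made up for this illustration.  Note one
difference from C: Python sets FD_CLOEXEC on every descriptor it
creates, so the flag has to be cleared explicitly.

```python
import fcntl
import socket


def set_cloexec(fd: int, enabled: bool) -> None:
    """Set or clear FD_CLOEXEC on fd via fcntl(2) (hypothetical helper)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    if enabled:
        flags |= fcntl.FD_CLOEXEC
    else:
        flags &= ~fcntl.FD_CLOEXEC
    fcntl.fcntl(fd, fcntl.F_SETFD, flags)


def survives_exec(fd: int) -> bool:
    """True if fd would stay open across exec, i.e. FD_CLOEXEC is clear."""
    return not (fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
before = survives_exec(sock.fileno())   # Python created it close-on-exec
set_cloexec(sock.fileno(), False)       # clear the flag
after = survives_exec(sock.fileno())    # now it would be inherited by exec
sock.close()
```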

The idea here with gen_tcp would be to:

 - use a "node launcher" that creates the incoming UDP and TCP sockets
 - clears the FD_CLOEXEC bit on each one so the fd stays open across exec
 - forks a process for each node
 - execs erlang in each child

 - use gen_tcp and gen_udp's {fd, F} options to get 'real' sockets in
   erlang.
 - create new driver calls to allow adding and removing that fd from
   the set of FDs being monitored by the erts event loop.
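
The launcher half of that list can be sketched as below.  This is only
an illustration under stated assumptions: launch_nodes, make_argv and
CHILD are hypothetical names, and the child is a tiny Python script
rather than an Erlang node, since the exact command line for starting
erlang with the fd is not given here.  A real launcher would exec
erlang and hand it the fd number for use with the {fd, F} option.

```python
import os
import socket
import sys

# Stand-in for a node: adopt the inherited fd and read one datagram.
CHILD = (
    "import socket, sys\n"
    "s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM,"
    " fileno=int(sys.argv[1]))\n"
    "s.settimeout(5)\n"
    "s.recvfrom(1024)\n"
)


def launch_nodes(sock, count, make_argv):
    """Fork `count` children; each execs the command make_argv(index, fd)."""
    fd = sock.fileno()
    os.set_inheritable(fd, True)  # clears FD_CLOEXEC so exec keeps the fd
    pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            try:
                argv = make_argv(index, fd)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if building argv or exec failed
        pids.append(pid)
    return pids


def demo():
    """Launch two children on one UDP socket and feed each a packet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    pids = launch_nodes(
        sock, 2, lambda i, fd: [sys.executable, "-c", CHILD, str(fd)]
    )
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in pids:
        sender.sendto(b"ping", sock.getsockname())
    statuses = [os.waitpid(pid, 0)[1] for pid in pids]
    sender.close()
    sock.close()
    return statuses
```

Each child sees the socket under the same fd number as the launcher,
because fork and exec both preserve descriptor numbers.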

The last step matters because some OSes may tell multiple nodes there
is data ready when only one UDP packet has actually come in.  If the
socket is in blocking mode, one node will get the packet, while the
other nodes will wake up, enter the driver, and block trying to read a
packet that is already gone.  Clearly not good.
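
The wake-up problem can be reproduced in a single process.  In this
sketch a dup() of the socket stands in for a second node that inherited
it across fork, and try_recv is a hypothetical helper.  The socket is
put in non-blocking mode, so the reader that loses the race gets an
error back instead of blocking.

```python
import select
import socket


def try_recv(sock, size=1024):
    """Return a datagram, or None if another reader already took it."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None


shared = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
shared.bind(("127.0.0.1", 0))
shared.setblocking(False)
other = shared.dup()  # second handle on the same underlying socket

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"one packet", shared.getsockname())

# Both handles are reported readable, although only one packet arrived.
ready, _, _ = select.select([shared, other], [], [], 1.0)

# The first read consumes the packet; the second finds nothing.
results = [try_recv(s) for s in (shared, other)]

for s in (sender, other, shared):
    s.close()
```

With a blocking socket the second recv would hang until another packet
arrived, which is the situation described above.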

Some UNIX variants have a "feature" (bug) where they wake up multiple
processes, telling them a UDP packet or new TCP connection is available.
The processes come out of select/poll, hit read/accept and more than
one of them gets the same TCP socket or the same UDP packet.  Now you
have more than one node attempting to process the data, and you may
generate two different replies when only one was necessary.

I guess you could try to have the Erlang nodes elect who will handle
the connection/packet that just arrived, but there is no way for
them to know if they have duplicates or not.  So life could get
ugly.  But if the gen_tcp/udp drivers support dynamically enabling and
disabling the server socket in the erts event loop, life is ok.
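
As an analogy only, this is what enabling and disabling a server socket
in an event loop looks like with Python's selectors module.  It is not
the erts driver interface, which would need the new calls proposed
above; ToggleableListener is a hypothetical name.

```python
import selectors
import socket


class ToggleableListener:
    """Listening socket that can be added to or removed from an event loop."""

    def __init__(self, selector, sock):
        self.selector = selector
        self.sock = sock
        self.enabled = False

    def enable(self):
        """Start monitoring the socket for incoming connections."""
        if not self.enabled:
            self.selector.register(self.sock, selectors.EVENT_READ)
            self.enabled = True

    def disable(self):
        """Stop monitoring; pending connections stay queued in the kernel."""
        if self.enabled:
            self.selector.unregister(self.sock)
            self.enabled = False


selector = selectors.DefaultSelector()
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(5)
listener = ToggleableListener(selector, server)

# A client connects while the listener is disabled.
client = socket.create_connection(server.getsockname())

events_while_disabled = selector.select(timeout=0.1)  # nothing is monitored
listener.enable()
events_while_enabled = selector.select(timeout=1.0)   # connection is seen

client.close()
listener.disable()
server.close()
selector.close()
```

Under an accept lock, only the node currently holding the lock would
call enable(); the others would leave the socket disabled and never be
woken for it.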

This would work very well on a multiple CPU machine, as each CPU
can have its own dedicated node, epmd+normal erlang monitoring can
be used to track the state of the other nodes, and life is very sweet.

So how about it?  :-)


Sean Hinde <> wrote:
> On 18 Feb 2004, at 01:51, Shawn Pearce wrote:
> >Well, given that erts is bound to a single processor, you would need to
> >create a cluster of erts nodes, all running yaws, with some type of load
> >balancing front end.  This is one area Apache really shines in, as it
> >easily allows this to be setup: because Apache is multi-process already,
> >it can easily share the single TCP server socket with all of its
> >siblings and decide who gets the next request.
> >
> >Does anyone think it might be possible to modify gen_tcp in such a way
> >that we could use multiple nodes on the same system all bound to the
> >same TCP port, and using some sort of accept lock between them?  I'd
> >think this could be done something like this:
> 
> This method was suggested to me earlier today in a completely different 
> context but using UDP sockets. A strange co-incidence indeed.
> 
> For UDP sockets the documentation would seem to suggest an existing 
> mechanism using the option:
> 
> "{fd,Fd}
> If a UDP socket has somehow been opened without  using gen_udp, use 
> this option to pass in the file descriptor for it and create a Socket 
> for it."
> 
> But just getting the FD (using inet:getfd(Socket)) and trying this in
> another Erlang node with gen_udp:fdopen(FD, []) doesn't work (probably
> obvious in hindsight).
> 
> The usage described to me earlier today is that 1 UNIX process opens 
> the socket and then forks additional child processes which get access 
> to the file descriptor. I guess that UNIX treats child processes in a
> special way (allows them to receive data on another process's FD).
> 
> Anyway. If it is possible this would be a VERY attractive way to take 
> advantage of multiple CPU machines.
> 
> Anyone?
> 
> Sean

-- 
Shawn.


