[erlang-questions] gen_fsm and active sockets

Tue Jan 15 21:16:38 CET 2008

Hi Dave,

I'm not an Erlang guru, by any stretch of the imagination, but here's my
take.

As far as I know, you can easily have a large number of processes,
even in the hundreds of thousands. Each process has a small memory
overhead, on the order of a kilobyte. For 500K processes
you will need roughly 256-512MB of RAM. If your processes are mostly
blocked waiting in their receive loops, they should use essentially
no CPU while idle. This can be easily handled by any semi-decent
server-grade machine.
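
If you would rather measure than take my word for it, something like
the following should do. It is only a rough sketch: the module name
proc_mem and the function measure/1 are made up for illustration, and
the average is slightly inflated by the list of pids held by the
calling process.

```erlang
-module(proc_mem).
-export([measure/1]).

%% Spawn N processes that block in receive, then report roughly how
%% much memory each one costs, in bytes, as seen by the VM.
measure(N) when is_integer(N), N > 0 ->
    Before = erlang:memory(processes),
    Pids = [spawn(fun idle/0) || _ <- lists:seq(1, N)],
    After = erlang:memory(processes),
    %% process_info/2 gives the size of a single process in bytes,
    %% including its stack, heap and internal structures.
    {memory, One} = erlang:process_info(hd(Pids), memory),
    %% Clean up: tell every idle process to exit.
    lists:foreach(fun(Pid) -> Pid ! stop end, Pids),
    {{avg_bytes, (After - Before) div N},
     {one_process_bytes, One},
     {process_limit, erlang:system_info(process_limit)}}.

%% An idle process: blocked in receive until told to stop.
idle() ->
    receive
        stop -> ok
    end.
```

Calling proc_mem:measure(10000) from the shell returns the three
tagged values. The numbers depend on the word size and the OTP
release, and N has to stay below the reported process_limit (which
can be raised with the +P flag to erl).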

Now look at it this way: I'm not sure what you are trying to do, but  
if you're using permanent connections, such as TCP sockets, then it's  
likely that you will run out of file descriptors before you run out  
of process capacity.  So I would not be so much concerned about  
doubling the number of processes in a scenario like option #1.  On  
the other hand, with option #1, you increase the number of messages  
moving in your system, which will up your CPU consumption (extra data  
copies, extra GC activity).  Depending on your application and your  
setup, this might or might not be a problem.
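
To make the extra hop in option #1 concrete, here is a minimal sketch
of what I imagine the dedicated socket process would look like. The
module name sock_relay and the event terms {socket_event, Data} and
socket_closed are assumptions of mine, not anything from your code.

```erlang
-module(sock_relay).
%% gen_fsm is deprecated in newer OTP releases (gen_statem replaces
%% it); this only silences the compiler warning there.
-compile(nowarn_deprecated_function).
-export([start_link/1]).

%% Start the relay. Fsm is the pid of the gen_fsm process. The caller
%% is expected to hand the active socket over to the returned pid
%% with gen_tcp:controlling_process/2.
start_link(Fsm) ->
    spawn_link(fun() -> loop(Fsm) end).

loop(Fsm) ->
    receive
        {tcp, _Socket, Data} ->
            %% The extra hop: the data arrives here as one message
            %% and is sent on to the FSM process as a second one.
            gen_fsm:send_event(Fsm, {socket_event, Data}),
            loop(Fsm);
        {tcp_closed, _Socket} ->
            %% Tell the FSM the connection is gone, then exit.
            gen_fsm:send_event(Fsm, socket_closed)
    end.
```

Every packet is delivered twice, once to the relay and once to the
FSM, which is where the extra copying and GC work would come from.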

Anyways, that's just my $0.02.  It would be great if anybody with
experience in these situations would like to comment.

Cheers,
Mihai

On Jan 15, 2008, at 1:51 PM, Dave Smith wrote:

> Greetings,
>
> I'm using gen_fsm to manage a socket connection, and have a question
> about how one SHOULD use gen_fsm for this purpose. I'm specifically
> using the "active" mode for the socket, so I'm currently receiving
> events via the handle_info/3 callback. I understand why the socket
> events arrive there, but I wonder what the best way to pass the event
> along to the actual FSM is. I see there being 3 possible options:
>
> 1. Receive socket events on a dedicated process and pass events into
> gen_fsm via that process. Upside is that this provides nice separation
> of socket and fsm logic. Downside there is that I'm doubling the
> number of processes -- i.e. I had one process per socket, now I have
> two. That's not a big problem with a couple of thousand connections,
> but once I'm in the 20k-30k connections realm, I'm not quite sure what
> the implications of doubling the number of processes are. Is it
> "normal" in a production system to run 100k+ processes? Note: I'm
> still recovering from pthreads land, where 100k+ threads is a scary,
> scary thing -- so maybe this concern over # of processes is a
> threading world "hangover" :)
>
> 2. Receive socket events in handle_info and invoke
> gen_fsm:send_event() from there. This seems like the "obvious"
> approach, but it feels wrong -- I'm already in the process and don't
> really want to queue up another event. Again, possibly a "hangover"
> from non-Erlang land.
>
> 3. Receive socket events in handle_info, then do a
> ?MODULE:StateName({socket_event ...}, State). Avoids (perceived)
> overhead of approach #2, but...is this a good idea?!
>
> Hopefully this isn't a stupid/obvious question -- I'm finding that
> erlang has a tendency to turn "common sense" on its head (in a good
> way). Any guidance from the gurus would be happily accepted.. :)
>
> D.

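For reference, options #2 and #3 from the quoted message could look
roughly like the sketch below. It is an illustration under
assumptions, not a recommendation: the module name conn_fsm, the
single state idle and the {socket_event, Data} term are all
hypothetical.

```erlang
-module(conn_fsm).
-behaviour(gen_fsm).
%% gen_fsm is deprecated in newer OTP releases (gen_statem replaces
%% it); this only silences the compiler warning there.
-compile(nowarn_deprecated_function).

-export([start_link/0]).
-export([init/1, idle/2, handle_info/3, handle_event/3,
         handle_sync_event/4, terminate/3, code_change/4]).

start_link() ->
    gen_fsm:start_link(?MODULE, [], []).

init([]) ->
    {ok, idle, []}.

%% State function: collects the socket data it is handed.
idle({socket_event, Data}, Received) ->
    {next_state, idle, [Data | Received]}.

%% Option #3: call the current state function directly. Its return
%% value has the same shape handle_info/3 must return, so it can be
%% passed straight through.
%%
%% Option #2 would instead queue the event to ourselves:
%%   gen_fsm:send_event(self(), {socket_event, Data}),
%%   {next_state, StateName, StateData};
%% which is handled only after messages already in the mailbox.
handle_info({tcp, _Socket, Data}, StateName, StateData) ->
    ?MODULE:StateName({socket_event, Data}, StateData);
handle_info(_Info, StateName, StateData) ->
    {next_state, StateName, StateData}.

handle_event(_Event, StateName, StateData) ->
    {next_state, StateName, StateData}.

handle_sync_event(_Event, _From, StateName, StateData) ->
    {reply, ok, StateName, StateData}.

terminate(_Reason, _StateName, _StateData) ->
    ok.

code_change(_OldVsn, StateName, StateData, _Extra) ->
    {ok, StateName, StateData}.
```

With option #3 the socket data is processed in the same callback
invocation, with no second trip through the mailbox. Every state
function then has to accept the {socket_event, Data} term.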