[erlang-questions] spawn problem

Tue Mar 27 04:17:05 CEST 2007

On 27 Mar 2007, at 1:17 pm, Fernando Ipar <fipar@REDACTED> wrote:

> test(_, 0) ->
>         true;
> test(Command, Threads) when number(Threads) ->
>         spawn('os', 'cmd', [Command]),
>         test(Command, Threads - 1).

You do not need quotation marks around atoms like os and cmd.

>
> start() ->
>         Command = list_to_atom(os:getenv("Command")),
>         Threads = list_to_integer(os:getenv("Threads")),
>         test(Command,Threads).

You do not need to convert Command to an atom; the documentation for
os:cmd/1 says
     cmd(Command) -> string()
Types:
     Command = string() | atom()

Mind you, sending an atom in a message to a process is cheaper than
sending a string, but since atoms are not garbage collected, there's an
advantage to using strings in a long-running process.
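
Putting those two remarks together, the quoted code could be written
roughly as follows. This is only a sketch: the module name spawn_test is
made up for illustration, the guard is spelled is_integer/1 (slightly
stricter than the number/1 test in the original, and it also rejects
negative counts), and os:getenv/1 is assumed to find both variables set
(it returns false otherwise, which would crash list_to_integer/1).

```erlang
-module(spawn_test).
-export([start/0, test/2]).

%% Spawn Threads Erlang processes, each of which runs os:cmd(Command).
%% Command stays a string; no quotes are needed around os and cmd.
test(_, 0) ->
    true;
test(Command, Threads) when is_integer(Threads), Threads > 0 ->
    spawn(os, cmd, [Command]),
    test(Command, Threads - 1).

%% Reads the same environment variables as the original post.
%% Assumes both are set; os:getenv/1 returns false when they are not.
start() ->
    Command = os:getenv("Command"),
    Threads = list_to_integer(os:getenv("Threads")),
    test(Command, Threads).
```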

> My problem is that if I call this with more than 508 'Threads' (bear
> with me on the variable name selection..) I get ... emfile
...
> open files                      (-n) 1024

508 is suspiciously close to 512, which is half of 1024.
Doesn't it look to you as though os:cmd/1 is opening TWO files?

Looking in lib/kernel/src/os.erl,

	cmd/1 -> unix_cmd/1 -> spawns unix_cmd/1 -> creates a port using start_port/0
	-> open_port({spawn,"sh -s unix:cmd  2>&1"}, [stream])
	The command is then sent to the STANDARD INPUT of that new process
	and the STANDARD OUTPUT of the process is read and returned to the
	original caller.
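
To make the mechanism concrete, here is a much simplified sketch of the
same idea. It is not the code in os.erl: the module name port_sketch,
the use of the exit_status option, and the trailing "exit" line are my
own assumptions, chosen so the example knows when the shell has
finished. It only accepts a string Command, and a command that itself
reads standard input would swallow the "exit" line.

```erlang
-module(port_sketch).
-export([run/1]).

%% Start a shell that reads commands from its standard input, write the
%% command to it, and collect everything the shell prints.
run(Command) ->
    Port = open_port({spawn, "sh -s 2>&1"}, [stream, exit_status]),
    %% The command travels over the port to the shell's stdin.
    port_command(Port, [Command, "\nexit\n"]),
    collect(Port, []).

%% Accumulate output chunks until the shell reports its exit status.
collect(Port, Acc) ->
    receive
        {Port, {data, Bytes}} ->
            collect(Port, [Bytes | Acc]);
        {Port, {exit_status, _Status}} ->
            lists:flatten(lists:reverse(Acc))
    end.
```

While such a port is open, it holds the descriptors used to talk to the
child's standard input and output, which is where the two-per-call cost
comes from.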

It would be a little clearer if the documentation for erlang:open_port/2
said explicitly that open_port({spawn,...},[...]) used up two file
descriptors, but this text:
	use_stdio
	    This is only valid for {spawn, Command}.
	    It allows the standard input and output (file descriptors 0 and 1)
	    of the spawned (UNIX) process for communication with Erlang.
does give you a strong hint.

Why does os:cmd/1 send the command to the shell's standard input,
	sh -s unix:cmd <"your command string"  ,
rather than
	sh -c "your command string" </dev/null ?

Not being the author, I can't say for sure, but there is an upper bound
on the size of a command in the second case which does not apply in the
first (if the Command is passed as a string, because atoms have their
own limits).

> While this project in particular is not important, I'll appreciate any
> help with this problem, since I'd like to know how many processes I can
> create in erlang, and what tweaking needs to be
> done to the OS (if any).

Well, now you know the answer.  If sysconf(_SC_OPEN_MAX) is N, the
answer is "a little bit less than N/2", and apparently it's (N-8)/2:
with N = 1024 that gives exactly the 508 you observed.

You can create any number of processes; the limit is on the number that
can be active at the same time.  The limit is determined by
sysconf(_SC_CHILD_MAX) -- 25813 on my machine -- and
sysconf(_SC_OPEN_MAX)/2 minus a bit -- my machine limits the number of
open files to 256, so I can only have a little over 100 os:cmd/1
processes active at the same time.
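
The arithmetic can be written down as a small helper. Treat it as a
rule of thumb only: the module and function names are invented here,
and the constant 8 is inferred from the single 508-out-of-1024
observation in this thread, so the number of descriptors already in use
may well differ on another system or emulator version.

```erlang
-module(fd_estimate).
-export([max_cmds/1]).

%% Rough estimate of how many os:cmd/1 calls can be active at once,
%% given the open-file limit. Assumes two descriptors per call and
%% about 8 descriptors already taken (an empirical guess, see above).
max_cmds(OpenMax) when is_integer(OpenMax), OpenMax >= 8 ->
    (OpenMax - 8) div 2.
```

With a limit of 1024 this gives 508, and with a limit of 256 it gives
124, which fits "a little over 100".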

Since there is no guaranteed way for an Erlang program to determine
whether there will be enough file descriptors left or not, a reasonable
extension might be to add a {timeout,T} option to erlang:open_port/2 to
wait up to T milliseconds for sufficient file descriptors to become
available again.  Until something like that is done, the simplest thing
is to keep the number of OS processes you create low, and the next
simplest is to write your own wrapper which catches emfile, waits for
some time, and retries, giving up after some number of retries.
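
A wrapper of that kind might look something like the sketch below. The
module name emfile_retry and the function with_retry/3 are hypothetical,
and the sketch assumes that the emfile failure reaches the calling
process as an exception with reason emfile. That holds when you call
open_port/2 yourself; whether it holds for os:cmd/1 depends on whether
your OTP version opens the port in the caller or in a helper process.

```erlang
-module(emfile_retry).
-export([with_retry/3]).

%% Run Fun(). If it fails with reason emfile, sleep WaitMs milliseconds
%% and try again, at most Retries more times. Any other exception is
%% passed through untouched.
%% Returns {ok, Result} on success or {error, emfile} after giving up.
with_retry(Fun, Retries, WaitMs)
  when is_function(Fun, 0), is_integer(Retries), Retries >= 0 ->
    try Fun() of
        Result ->
            {ok, Result}
    catch
        _:emfile when Retries > 0 ->
            timer:sleep(WaitMs),
            with_retry(Fun, Retries - 1, WaitMs);
        _:emfile ->
            {error, emfile}
    end.
```

For example, emfile_retry:with_retry(fun() -> open_port({spawn, "sh -s
2>&1"}, [stream]) end, 5, 100) would try up to six times, 100
milliseconds apart, before returning {error, emfile}.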