[erlang-questions] max nodes on one server

Angel <>
Sat Mar 27 15:56:52 CET 2010


On Friday, 26 March 2010 19:01:14 Erwan MAS wrote:
> Hello ,
> 
> I tried to start many nodes on one server .
> 
> Currently i can not  have more than 1017 nodes .
> 
> My system is rhel 5.4 .
> 
> On the shell , before i start the command , i set the max open file limit
>  with ulimit -n 4096 .
> 
> my erl command is :
> 
> erl -hidden -connect_all false -rsh ssh +K true -env ERL_MAX_PORTS 4096 +P 134217727
> 
> I think i reach a limit but i dont know which ?
> 
> 
> 
> ps :
> My code is  :
> 
> startnodes(NumberOfNodes, NodeNamePrefix) ->
>         startnodes1(NumberOfNodes, NodeNamePrefix, []).
> startnodes1(0, _NodeNamePrefix, Acc) ->
>         Acc;
> startnodes1(NumberOfNodes, NodeNamePrefix, Acc) ->
>         NodeName = "benchnode" ++ atom_to_list(NodeNamePrefix) ++
>                    integer_to_list(NumberOfNodes),
>         Args = "-setcookie " ++ atom_to_list(erlang:get_cookie()),
>         Pas = case init:get_argument(pa) of
>                   error ->
>                       "";
>                   {ok, Palist} ->
>                       lists:foldl(
>                         fun(X, Str) ->
>                                 Str ++ " -pa " ++ filename:absname(X)
>                         end, "",
>                         lists:append(Palist))
>               end,
>         Res = slave:start_link("localhost", NodeName, Args ++ Pas),
>         case Res of
>                 {ok, Node} ->
>                         startnodes1(NumberOfNodes - 1, NodeNamePrefix, Acc ++ [Node]);
>                 Error ->
>                         io:format("Error ~p OK~n", [Error]),
>                         io:format("~p processes was not started.~n", [NumberOfNodes]),
>                         Acc
>         end.
> 
Hi

What's the error you see?

I played with slave last year and got several timeouts when starting large
numbers of nodes.

Maybe this can help you:

I was using pool and came across failures caused by the startup timing logic
(fixed at 32 secs). I was using a remote code server for my nodes, so the more
nodes I spawned, the more load was put on the code server, and eventually new
nodes weren't able to start up in time, so slave killed them.

I modified slave to allow a configurable timeout, and that solved my problem...

I've attached my code (it's old, from around R12...)

The function wait_for_slave checks for a new option {timeout,TimeOut} and uses
it to set the after clause while it waits for new nodes to start up.

Now I can pass {timeout, 100000} to start_link, and slave waits nicely for the
nodes.
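
To illustrate the idea only (this is not the attached myslave.erl, nor the
OTP slave source), here is a minimal sketch of a receive whose after clause
takes its value from a {timeout, Ms} option. The module name timeout_sketch,
the function wait_for_reply/2 and the demo are hypothetical; the 32000 ms
default is just the 32 secs figure mentioned above.

```erlang
-module(timeout_sketch).
-export([wait_for_reply/2, demo/0]).

%% Assumed default, taken from the 32 secs mentioned in the post.
-define(DEFAULT_TIMEOUT, 32000).

%% Wait for a {Tag, Result} message. Options is a property list that may
%% contain {timeout, Milliseconds}; otherwise the default is used.
wait_for_reply(Tag, Options) ->
    Timeout = proplists:get_value(timeout, Options, ?DEFAULT_TIMEOUT),
    receive
        {Tag, Result} ->
            {ok, Result}
    after Timeout ->
        {error, timeout}
    end.

%% A stand-in for a slow-starting node: it replies after 200 ms.
%% The first wait (50 ms) gives up too early, the second (1000 ms) succeeds.
demo() ->
    Parent = self(),
    Tag = make_ref(),
    spawn(fun() -> timer:sleep(200), Parent ! {Tag, started} end),
    Short = wait_for_reply(Tag, [{timeout, 50}]),
    Long = wait_for_reply(Tag, [{timeout, 1000}]),
    {Short, Long}.
```

Calling timeout_sketch:demo() shows the same effect I saw with slave: a wait
that is too short reports a timeout even though the reply arrives a moment
later, while a longer one picks it up.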



/Angel


Most people know C is not so high level....
                ...Everybody else just got assembler overdose
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: myslave.erl
Type: text/x-erlang
Size: 9452 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20100327/a72dba16/attachment.bin>

