[erlang-questions] A problem with exit erlang node.

Imants Cekusins imantc@REDACTED
Thu Nov 13 00:35:37 CET 2014


Could the 'gateway' node close its open sockets before shutting down?

Maybe pause for a second after the gateway node receives the stop request and
stops the apps, but before it calls erlang:halt()?
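
If the node still hangs now and then, the stop script could also guard
against it. Below is a minimal bash sketch, not something from the original
scripts: wait_or_kill is a hypothetical helper that polls a PID for a given
number of seconds and sends SIGKILL only if the process is still alive after
that. It assumes the caller already knows the PID of the beam process.

```shell
# Hypothetical helper (not part of the original scripts).
# Usage: wait_or_kill PID TIMEOUT_SECONDS
# Returns 0 if the process exited on its own, 1 if it had to be killed.
wait_or_kill() {
    local pid=$1
    local timeout=$2
    local waited=0

    # kill -0 sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Grace period is over: force the process down.
            kill -9 "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In stop_all.sh this could be called right after ./stop_gateway.sh, with the
PID looked up by something like pgrep -f -- '-name gateway@' (the pattern is
an assumption about how the node shows up in the process list). This only
works around the hang; it does not explain why halt() gets stuck.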

On 12 Nov 2014 17:14, "adam chan" <114999420@REDACTED> wrote:

Hi List,

I have a problem stopping an Erlang node.
When I call erlang:halt(), the node appears dead but the process does not
exit, and the CPU goes up to 100%.

Here is the situation:
I'm running OTP R15B02 on CentOS 6.3.

I have 3 nodes named 'server', 'unite' and 'gateway', which are connected to
each other.
The 'gateway' node listens on a port, receives socket data from clients,
and forwards it to 'server' and 'unite'.
The response data from 'server' and 'unite' is sent back to the client
through the 'gateway' node too.

When I want to stop all 3 nodes, the 'gateway' node occasionally (with small
probability) fails to exit completely.
The nodes run inside screen sessions on Linux; the start scripts look like
this:

*[start_all.sh]*
...
/usr/bin/screen -dmS server -s $ScriptPath/start_server.sh $Log
...
/usr/bin/screen -dmS unite -s $ScriptPath/start_unite.sh $Log
...
/usr/bin/screen -dmS gateway -s $ScriptPath/start_gateway.sh $Log

*[start_gateway.sh]*
#!/bin/bash
cd /data/web/server/server/config
ulimit -s 262140
erl -kernel inet_dist_listen_min 40001 -kernel inet_dist_listen_max 40100 \
    +P 1024000 +K true -smp disable -name gateway@REDACTED -setcookie abc \
    -boot start_sasl -config gs_main -pa ../ebin -s gs_main start \
    -extra 192.168.7.100 9001 2


I stop the nodes in the order 'gateway' -> 'unite' -> 'server'.
The stop scripts look like this:
*[stop_all.sh]*
#!/bin/bash
cd /data/web/server/server/scripts/
chmod +x stop_gateway.sh
chmod +x stop_unite.sh
chmod +x stop_server.sh
./stop_gateway.sh
./stop_unite.sh
./stop_server.sh

*[stop_gateway.sh]*
#!/bin/bash
cd /data/web/server/server/config
erl -noshell -hidden -name stop_gateway@REDACTED -setcookie abc -pa ../ebin \
    -eval "rpc:call('gateway@REDACTED', gs_main, stop, [])." -s c q


*[gs_main.erl]*
-define(SERVER_APPS, [sasl, gs_main]).
...
stop() ->
    ok = stop_applications(?SERVER_APPS),
    erlang:halt().


The 'server' and 'unite' nodes exit completely every time, and the screen
sessions running them exit as well.
But the 'gateway' node occasionally (with small probability) does not exit,
and its screen session remains too:

*[root@REDACTED logs]# screen -ls*
There are screens on:
        20107.gateway  (Detached)

*[root@REDACTED logs]# ps -ef | grep gateway*
root     20107     1  0 Nov10 ?        00:00:00 /usr/bin/SCREEN -dmS
gateway -s /data/web/server/server/scripts/start_gateway.sh -L -c
/data/web/server/server/var/logs/screenrc_gateway
root     20110 20107  0 Nov10 pts/7    00:00:00 /bin/bash
/data/web/server/server/scripts/start_gateway.sh
root     20111 20110 90 Nov10 pts/7    1-19:56:53
/usr/local/lib/erlang/erts-5.9.2/bin/beam -P 1024000 -K true -- -root
/usr/local/lib/erlang -progname erl -- -home /root -- -kernel
inet_dist_listen_min 40001 -kernel inet_dist_listen_max 40100 -smp disable
-name gateway@REDACTED -setcookie abc -boot start_sasl -config gs_main
-pa ../ebin -s gs_main start -extra 192.168.7.100 9001 2

*[root@REDACTED logs]# strace -c -p 20111*
Process 20111 attached - interrupt to quit
^CProcess 20111 detached

The strace command shows nothing here, and one CPU core keeps running at 100%.
The end of the 'gateway' node's log says the application has exited:
*[gateway.log]*
=INFO REPORT==== 11-Nov-2014::10:21:18 ===
    application: gs_main
    exited: stopped
    type: temporary

It seems that an endless loop occurred after the =INFO REPORT= was printed.
The node has not really exited; otherwise the 'ps -ef | grep gateway'
command would not find process 20111.

Any ideas?
Thanks in advance.

------------------
Adam Chan

