heart does not restart node launched with run_erl

erlang-questions <>
Wed Jan 4 22:27:57 CET 2006


Hi all,
  I ran into a weird problem. I have an embedded application that is started with run_erl from a .sh script. I also use heart to restart the application. HEART_COMMAND is set to launch the same start.sh script that was used to start the application initially. At startup, the process tree looks as follows:

 3196 ?        S      0:00 /home/drpdev/erts-5.4.10/bin/run_erl -daemon /home/drpdev/var/tmp/drp /home/drpdev/var/log/drp -exec /home/drpdev/bin/start_erl
 3202 pts/2    Ssl+   0:02  _ /home/drpdev/erts-5.4.10/bin/beam -- -root /home/drpdev -progname drip -- -home /home/drpdev -boot /home/drpdev/releases/1.
 3222 ?        Ss     0:00      _ heart -pid 3202
 3227 ?        Ss     0:00      _ inet_gethost 4
 3228 ?        S      0:00      |   _ inet_gethost 4
 3229 ?        Ss     0:00      _ sh -s disksup
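
For reference, the setup described above can be sketched roughly as below. This is only an illustration, not the actual drip.sh: the paths are copied from the process listing, the quoted "exec ..." form follows the run_erl documentation, and the function names, the DRY_RUN switch, and the absence of start_erl arguments are assumptions made for the sketch.

```shell
#!/bin/sh
# Sketch only: paths come from the ps listing; start_erl arguments are not
# shown in the post, so none are passed here (assumption).
ERTS_BIN=/home/drpdev/erts-5.4.10/bin
PIPE_DIR=/home/drpdev/var/tmp/drp
LOG_DIR=/home/drpdev/var/log/drp
START_ERL=/home/drpdev/bin/start_erl

# heart executes this command to restart the system when heartbeats stop.
HEART_COMMAND="/home/drpdev/bin/drip.sh start"
export HEART_COMMAND

# Print the run_erl command line without running it (hypothetical helper).
run_erl_cmd() {
    printf '%s -daemon %s %s "exec %s"\n' \
        "$ERTS_BIN/run_erl" "$PIPE_DIR" "$LOG_DIR" "$START_ERL"
}

# With DRY_RUN=1 only print the command; otherwise wait and launch the node.
start_node() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        run_erl_cmd
    else
        sleep 3
        "$ERTS_BIN/run_erl" -daemon "$PIPE_DIR" "$LOG_DIR" "exec $START_ERL"
    fi
}
```

Running `DRY_RUN=1 start_node` prints the command line that would be executed, which makes it easy to compare against what ps reports.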

To test the restart, I kill pid 3202 and see the following:

 3222 ?        Ss     0:00 heart -pid 3202
 3196 ?        S      0:00 /home/drpdev/erts-5.4.10/bin/run_erl -daemon /home/drpdev/var/tmp/drp /home/drpdev/var/log/drp -exec /home/drpdev/bin/start_erl
 3202 ?        Zs     0:02  _ [beam] <defunct>


Next, heart launches the script:

 3253 ?        S      0:00    /bin/bash /home/drpdev/bin/drip.sh start
 3272 ?        S      0:00        _ sleep 3
 3196 ?        S      0:00 /home/drpdev/erts-5.4.10/bin/run_erl -daemon /home/drpdev/var/tmp/drp /home/drpdev/var/log/drp -exec /home/drpdev/bin/start_erl
 3202 ?        Zs     0:02  _ [beam] <defunct>

The sleep 3 is right before it calls the run_erl command to start the embedded application. Note that the old run_erl (pid 3196) is still hanging around although the node itself (pid 3202) is defunct.
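
A process shown as defunct (state starting with Z) has exited, but its parent has not yet collected its exit status. Since beam (pid 3202) is listed under run_erl (pid 3196), this is consistent with run_erl not having reaped it yet. A small check like the one below can make that state explicit while testing. It is a sketch that assumes a procps-style ps supporting `-o stat=`; the function names are made up for illustration.

```shell
#!/bin/sh
# Print the ps state field (e.g. "Ss", "Zs") for a pid; empty if the pid is gone.
# Assumes a ps that understands the "stat" output keyword (procps, BSD).
proc_state() {
    ps -o stat= -p "$1" 2>/dev/null | tr -d ' '
}

# Succeed when a state string marks a zombie, i.e. it starts with "Z".
is_defunct_state() {
    case "$1" in
        Z*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Report whether a pid is alive, defunct, or gone.
describe_pid() {
    state=$(proc_state "$1")
    if [ -z "$state" ]; then
        echo "$1 gone"
    elif is_defunct_state "$state"; then
        echo "$1 defunct"
    else
        echo "$1 alive"
    fi
}
```

For example, `describe_pid 3202; describe_pid 3196` could be run right after killing the node and again after drip.sh has called run_erl.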

When drip.sh calls run_erl, the old run_erl (pid 3196) goes away, but no new run_erl process appears. The application is not started either. erlang.log.1 does not show [...]. I see the following in run_erl.log:

-------
Pty master read; run_erl [3196] Wed Jan  4 15:59:37 2006
Pty master read; run_erl [3196] Wed Jan  4 16:00:46 2006
Pty master read; run_erl [3196] Wed Jan  4 16:00:51 2006
Pty master read; run_erl [3279] Wed Jan  4 16:00:54 2006
/home/drpdev/erts-5.4.10/bin/run_erl: pid is : 3279
run_erl [3196] Wed Jan  4 16:00:54 2006
FIFO read; run_erl [3196] Wed Jan  4 16:00:54 2006
OK
run_erl [3196] Wed Jan  4 16:00:54 2006
Pty master read; run_erl [3196] Wed Jan  4 16:00:54 2006
Pty master read; run_erl [3196] Wed Jan  4 16:00:54 2006
Pty master read; run_erl [3196] Wed Jan  4 16:00:54 2006
Erlang closed the connection.
-------
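
Because entries from both run_erl instances are interleaved in this log, it can help to separate them by the pid in the `run_erl [PID]` stamp. The helpers below are a rough sketch based only on the line format visible in the excerpt; the function names are hypothetical.

```shell
#!/bin/sh
# List the distinct pids found in "run_erl [PID]" stamps.
# Reads log text on stdin and prints one pid per line, sorted.
run_erl_pids() {
    sed -n 's/.*run_erl \[\([0-9][0-9]*\)].*/\1/p' | sort -u
}

# Print only the lines stamped by the given pid (first argument).
lines_for_pid() {
    grep "run_erl \[$1]"
}
```

Usage would be along the lines of `run_erl_pids < run_erl.log` followed by `lines_for_pid 3279 < run_erl.log` to see what the new instance logged.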

I am curious why the new run_erl process (pid 3279) did not start. Also, why did the old run_erl (pid 3196) not terminate until the new run_erl attempted to start? I verified that this is not a coincidence: the old run_erl remains in the process list until a new run_erl is started.

Please let me know if anyone else has experienced a similar issue. If needed, I can provide additional info or config files, but I am not sure at this point which ones would help.

Thank you.
Dmitry Korsun
IDT Corp.