[erlang-questions] Starting Erlang as a Unix daemon
Serge Aleynikov
saleyn@REDACTED
Sat Jan 10 04:23:41 CET 2009
Sam,
Actually there are already tools that can help you with this task.
Use run_erl (http://www.erlang.org/doc/apps/erts/index.html) to start
your node (without providing the -daemon option). In your shell script,
run run_erl as a background task, then either use erl_call (from
erl_interface) to attempt to connect to the node, or just grep
erlang.log.1 (produced by run_erl) for the expected successful startup
message.
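The log-grepping variant could be sketched roughly as below. This is only an illustration, not the script we actually run: the function name wait_for_startup, the marker text "mynode started", the paths and the 30-second timeout are all hypothetical, and the marker has to be whatever your own application prints once it is up.

```shell
#!/bin/sh
# Sketch only: the function name, paths, marker text and timeout are
# assumptions made for illustration, not part of run_erl itself.

# wait_for_startup LOGFILE PATTERN TIMEOUT_SECS
# Polls LOGFILE once per second until it contains the fixed string
# PATTERN. Returns 0 if the string is found, 1 if TIMEOUT_SECS polls
# pass without a match.
wait_for_startup() {
    logfile=$1
    pattern=$2
    remaining=$3
    while [ "$remaining" -gt 0 ]; do
        if [ -f "$logfile" ] && grep -F -q -- "$pattern" "$logfile"; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}

# Example use in an init script (not executed here):
#   run_erl /tmp/mynode/ /var/log/mynode "exec erl -sname mynode" &
#   if wait_for_startup /var/log/mynode/erlang.log.1 "mynode started" 30
#   then echo "OK"; exit 0
#   else echo "FAILED"; exit 1
#   fi
```

One caveat with this sketch: if erlang.log.1 still holds output from an earlier run, an old marker line would match, so the script should either start from a fresh log directory or print a marker that is unique per start.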
Lately I've been using this approach quite successfully to install and
run nodes with the help of chkconfig on Linux. The outer shell script is
run as root, and all it does is start run_erl in the background as a
non-privileged user, detect its death, and restart the process at a
throttled rate, to avoid overly frequent restarts in case of a bad
configuration or the like.
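A restart loop with throttling might look something like the following. It is a minimal sketch under my own assumptions rather than our production script: the name run_throttled and its parameters are made up for illustration, it relies on "date +%s" (available on Linux, not strictly POSIX), and it assumes the supervised command stays in the foreground and exits when the node dies.

```shell
#!/bin/sh
# Sketch only: run_throttled and its parameters are hypothetical.

# run_throttled MIN_UPTIME MAX_FAST_EXITS DELAY COMMAND [ARGS...]
# Runs COMMAND in the foreground and restarts it whenever it exits.
# An exit after less than MIN_UPTIME seconds counts as a "fast exit".
# Each fast exit is followed by a DELAY-second pause; after
# MAX_FAST_EXITS consecutive fast exits the function gives up and
# returns 1. A run that lasts at least MIN_UPTIME resets the counter.
run_throttled() {
    min_uptime=$1
    max_fast=$2
    delay=$3
    shift 3
    fast=0
    while :; do
        started=$(date +%s)
        "$@"
        uptime=$(( $(date +%s) - started ))
        if [ "$uptime" -lt "$min_uptime" ]; then
            fast=$((fast + 1))
            if [ "$fast" -ge "$max_fast" ]; then
                return 1
            fi
            sleep "$delay"
        else
            fast=0
        fi
    done
}

# Example use from a root-owned wrapper (not executed here; the user
# name "erlang" and the paths are assumptions):
#   run_throttled 60 5 10 su -s /bin/sh erlang -c \
#       'run_erl /tmp/mynode/ /var/log/mynode "exec erl -sname mynode"' &
```

Giving up after a run of fast exits is one possible policy; sleeping for progressively longer intervals and retrying indefinitely would be another.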
We've had some bad experiences with the -heart option to erl in
combination with run_erl. As I recall, two issues came up:
1. run_erl has some race conditions related to signal handling and
calling select(). We are actually running a patched version of run_erl
that fixes the issue.
http://www.erlang.org/pipermail/erlang-patches/2006-January/000147.html
2. On a few occasions, on a busy node, we've seen heart miss some
heartbeat replies and kill the monitored node. Some details on this
can be found here:
http://erlang.org/pipermail/erlang-questions/2006-December/024365.html
By contrast, the shell-based startup alternative described above
has been working quite stably.
Serge
Sam Bobroff wrote:
> I already use a system very like the one described on your web page but
> I'm having several problems with it. They are:
>
> * The init script should return success (a zero exit code and an "OK"
> message) if and only if the server has started successfully. Running
> "erl -detached" (as your script does) unfortunately always returns
> success. It's just not acceptable to have a service report that it's
> started successfully when it hasn't.
> * Running the stop init script should only return success if it has
> actually managed to stop the server. (*)
>
> Without these properties a system can't be (safely) used from package
> managers like RPM and it also looks bad to system administrators (who
> have to use "ps" and look at some log file just to see if it's started
> up).
>
> I think I'll have to write a C wrapper program to handle this. It could
> register itself as a C node and run Erlang as a (non-detached) child
> process. It could then display any output from Erlang during the start
> up phase on the terminal, and once it receives a specific message from
> the Erlang application (via a simple message to the C node) it could
> detach itself from the terminal and return a result code (leaving Erlang
> running in the background). If that sounds useful to anyone else, let
> me know and I'll try to open source it :-)
>
> (Even better would be to include this functionality in erl itself, and
> provide a BIF that you could call from within your Erlang application to
> complete the detachment and return a result code from erl to the shell.)
>
> Cheers,
>
> Sam.
>
> (*) I've got this part working fairly well by using a system similar to
> yours but by also passing the result of the stop function out to the
> init script using halt(1) or halt(0).