[erlang-questions] Making an Ubuntu daemon out of a release while using heart

Mon Mar 14 07:00:09 CET 2011

On 10/03/11 05:32, ori brost wrote:
> I have a release of a server program in Erlang that I would like to turn
> into a daemon. Currently I did this by writing a script (based on the
> release management script generated by rebar) which runs the Erlang VM
> synchronously, and writing an upstart .conf file which runs the script.
>
> My problem is that I want to use erl -heart. I have noticed that
> whenever the VM dies, the heart process restarts it, but then the VM starts
> a new heart process and kills the old one. This means I have no permanent
> PIDs to follow. I was thinking of implementing my solution by having the
> script run a while-loop that will only exit if there is no heart process in
> the system (for now, assume there is no other erlang program running on the
> machine), and have upstart think that the script is the daemon.
>
> Does anyone have a recommendation for a better solution? Or some general
> recommendations on how to make a daemon out of an Erlang server? Preferably
> using upstart?
>
> Best Regards,
>
> Ori Bar, Software Engineer
> Nivertech Ltd
>
> email:    ori.bar@REDACTED
>
Hi Ori,

    We faced similar problems when deploying an Erlang application to
RedHat machines and ended up writing our own replacement for "run_erl"
and "heart" (as a single program) that we call "erld". If you're
interested I could probably send you the source (we're intending to
release it as open source eventually anyway). It's intended for running
an Erlang VM as a "well behaved" Unix daemon.

    The problems we hit (and which it solves), are, from memory:

* PID confusion: The only PID that a unix init script can get hold of
when starting Erlang is the PID of the beam.smp process, and when that
is restarted by heart, it changes and the init script can't know what
process to stop or kill. run_erl doesn't seem to support a "pidfile" option.

* Zombie Erlang: If heart dies (or is killed), beam.smp will
restart it. If beam.smp dies, heart will restart it. This makes it
rather hard to kill a system that is crashing badly (in fact if you
don't know how to stop-but-not-kill a unix process you may have to
reboot the server!).

* Bad detach behavior: run_erl detaches from the console before it's
actually started anything so an init script can't possibly report "OK"
or "FAIL"... it just has to say "OK" and hope that the program has
started. Admins have to check the logs every time to see if it's worked.
(This is REALLY annoying to our system admins.) Properly behaved daemons
won't detach until after they've started up *successfully*. Any fatal
errors should be reported to the console.

* Which log is current? The log rotation system that's built in has a
bizarre rotation strategy that we don't like (the .1 .2 .4 .5 thing that
you've probably seen).

* Because run_erl will happily create any number of sockets in /tmp,
it's easy to accidentally start multiple (probably non-functional)
copies of your application.
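
To illustrate the pidfile and multiple-copies points, here is a minimal
sketch of how a wrapper with a stable PID could claim a pidfile under an
exclusive lock. It is written in Python for brevity and is not erld's
actual code; the function name `acquire_pidfile` and the idea of using
flock() for this are assumptions made purely for illustration.

```python
import fcntl
import os


def acquire_pidfile(path):
    """Take an exclusive lock on `path` and record our PID in it.

    Returns the open file object (keep it open for the daemon's
    lifetime), or None if another process already holds the lock.
    """
    # Open without truncating, so a failed attempt does not wipe the
    # PID recorded by the instance that is already running.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    f = os.fdopen(fd, "r+")
    try:
        # Non-blocking: fail immediately if someone else has the lock.
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        f.close()
        return None
    # We own the lock: replace any stale contents with our own PID.
    f.seek(0)
    f.truncate()
    f.write("%d\n" % os.getpid())
    f.flush()
    return f
```

An init script can then read a PID from the file that stays the same
however often the VM underneath is restarted, and a second copy started
by accident gets None back and can exit with an error instead of running
alongside the first. The kernel drops the lock when the holder exits, so
a stale pidfile left by a crash does not block the next start.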

It also does a few other things:
* Reopens log files on SIGHUP.
* Captures any stdout from the Erlang VM.
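
For readers unfamiliar with the SIGHUP convention, the following is a
rough sketch of what "reopen log files on SIGHUP" usually means. It is in
Python and is not taken from erld; the class `ReopenableLog` and the
helper `install_sighup_handler` are hypothetical names. The point is that
an external tool can rename the current log and then send SIGHUP, after
which the daemon starts a fresh file under the original name.

```python
import os
import signal


class ReopenableLog:
    """Append-only log file that is reopened after a SIGHUP."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "a")
        self._reopen_requested = False

    def request_reopen(self, signum=None, frame=None):
        # Signal handler: only set a flag here and do the real work
        # on the next write, outside the handler.
        self._reopen_requested = True

    def write(self, line):
        if self._reopen_requested:
            # Close the (possibly renamed) old file and open the
            # configured path again, creating it if necessary.
            self._file.close()
            self._file = open(self.path, "a")
            self._reopen_requested = False
        self._file.write(line + "\n")
        self._file.flush()

    def close(self):
        self._file.close()


def install_sighup_handler(log):
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGHUP, log.request_reopen)
```

With this arrangement the file with the plain name is always the current
one, and rotation policy can be left to an external tool such as
logrotate rather than being built into the daemon.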

I suspect even fairly big Erlang Unix apps (e.g. ejabberd, yaws) face
these problems, but I haven't yet seen any other solutions. Does anyone
know of one?

Peace,
Sam.