Implications of setting SIGCHLD in relation to NIFs

Lukas Larsson lukas@REDACTED
Mon Nov 16 20:53:47 CET 2020

On Mon, Nov 16, 2020 at 7:12 PM José Valim <jose.valim@REDACTED> wrote:

> Hi everyone,
> I am working on Tensorflow bindings and, at some point, Tensorflow forks a
> child process to invoke a separate program. Unfortunately, when running
> inside the Erlang VM, Tensorflow fails when calling waitpid, in exactly
> this line <>.
> After some debugging, we found out that the root cause is that the Erlang VM
> sets SIGCHLD to SIG_IGN. According to the waitpid docs <>:
> > If the calling process sets SIGCHLD to SIG_IGN, and the process has no
> > unwaited for children that were transformed into zombie processes, the
> > calling thread blocks until all of the children of the process terminate,
> > at which time waitpid() returns -1 with errno set to ECHILD.
> Calling os:set_signal(sigchld, default) fixes the issue; however, it
> leaves me wondering:
> 1. Is it safe to set sigchld back to default? Or is the VM expecting it to
> be ignored? Are there any implications we should be aware of?
> 2. In case it is safe to have it as a default, why is it being ignored in
> the first place?

The VM itself does not care, but some other systems do, e.g. Docker.

It should be fine to change it, as long as you are aware that you will leak
zombies if Erlang is run as PID 1.

Calling waitpid in a NIF may work now, but we give no guarantee that it
will keep working in the future. In fact, before OTP-19, doing that would
have broken a lot of code.
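
For illustration only, here is a minimal C sketch of the waitpid behaviour
quoted above. It is not taken from Tensorflow or from the VM sources. The
helper name wait_for_child_with and the exit status 7 are made up for this
example. It forks a child, waits for it with SIGCHLD set to the given
disposition, and then restores the previous disposition.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): fork a child that exits
 * with status 7, then waitpid() for it while SIGCHLD is set to
 * `disposition` (SIG_IGN or SIG_DFL). Returns the child's exit status on
 * success, or -errno if a call fails. */
int wait_for_child_with(void (*disposition)(int))
{
    struct sigaction sa, old;
    sa.sa_handler = disposition;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, &old) != 0)
        return -errno;

    int result;
    pid_t pid = fork();
    if (pid < 0) {
        result = -errno;
    } else if (pid == 0) {
        _exit(7);                      /* child: terminate immediately */
    } else {
        int status = 0;
        pid_t r;
        do {                           /* retry if interrupted by a signal */
            r = waitpid(pid, &status, 0);
        } while (r < 0 && errno == EINTR);

        if (r < 0)
            result = -errno;           /* ECHILD when SIGCHLD is ignored */
        else if (WIFEXITED(status))
            result = WEXITSTATUS(status);
        else
            result = -EINVAL;          /* child did not exit normally */
    }

    /* Put the previous SIGCHLD disposition back. */
    sigaction(SIGCHLD, &old, NULL);
    return result;
}
```

On Linux, wait_for_child_with(SIG_IGN) should return -ECHILD, while
wait_for_child_with(SIG_DFL) should return 7. Note that signal dispositions
are process-wide, so flipping SIGCHLD like this from one thread of a
multi-threaded program affects every other thread as well.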


More information about the erlang-questions mailing list