[erlang-questions] output to standard_error crashes sometimes on R18

Lukas Larsson garazdawi@REDACTED
Fri Aug 7 18:58:50 CEST 2015


Hello,

Many thanks for the detailed analysis. I've gotten the similar reports from a couple of places.

I'll take a look at it when I get back from summer vacation.

Lukas


> On 6 aug 2015, at 12:41, Andreas Schultz <aschultz@REDACTED> wrote:
> 
> Hi,
> 
> Since I upgraded to 18.0.2, my Erlang application Linux sometimes crash
> with egain in standard error. The crash is triggered by something like this:
> 
>   io:format(standard_error, "started ~s~n", [application]).
> 
> The actual crash looks like this:
> 
> ** Generic server standard_error_sup terminating 
> ** Last message in was {'EXIT',<0.28.0>,eagain}
> ** When Server state == {state,standard_error,undefined,<0.28.0>,{local,standard_error_sup}}
> ** Reason for termination == 
> ** eagain
> 2015-08-06 09:14:41 =CRASH REPORT====
>  crasher:
>    initial call: supervisor_bridge:standard_error/1
>    pid: <0.27.0>
>    registered_name: standard_error_sup
>    exception exit: {eagain,[{gen_server,terminate,7,[{file,"gen_server.erl"},{line,826}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}
>    ancestors: [kernel_sup,<0.10.0>]
>    messages: []
>    links: [<0.11.0>]
>    dictionary: []
>    trap_exit: true
>    status: running
>    heap_size: 376
>    stack_size: 27
>    reductions: 155
>  neighbours:
> 2015-08-06 09:14:41 =SUPERVISOR REPORT====
>     Supervisor: {local,kernel_sup}
>     Context:    child_terminated
>     Reason:     eagain
>     Offender:   [{pid,<0.27.0>},{id,standard_error},{mfargs,{standard_error,start_link,[]}},{restart_type,temporary},{shutdown,2000},{child_type,supervisor}]
> 
> I have straced erl and it does get an EAGAIN on fd 2:
> 
> 14209 09:14:41.170429 writev(2, [{"started gen_listener_tcp\n", 25}], 1 <unfinished ...>
> 14209 09:14:41.170446 <... writev resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
> 
> So whats going on here???
> 
> After some searching I found a nice explanation about duped fd's and file status flags
> here: http://stackoverflow.com/a/9677130 (a quick test program verifies that this indeed
> happens on Linux).
> 
> So according to that, setting any of the stdio file descriptors to non-blocking, would
> set all of them to non-blocking. And sure enough, strace shows this:
> 
>  14170 09:14:39.787004 fcntl(1, F_SETFL, O_RDWR|O_NONBLOCK|O_LARGEFILE) = 0
> 
> (pid 14170 is the main erl process, pid 14209 one of the threads in erl)
> 
> I have stopped digging through this at that point. Clearly, the standard_error
> process or the underlying port driver should handle the EAGAIN gracefully, but
> fail to do so.
> 
> Andreas
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions



More information about the erlang-questions mailing list