[erlang-questions] How to diagnose stuck Erlang node

Kirill Zaborsky qrilka@REDACTED
Fri Oct 28 10:47:30 CEST 2011


As for the message queue, the crash dump viewer says "The dump is truncated,
no data available", so I have no more information :-\
epmd -names showed the node running but I could not contact it.
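
For reference, a minimal sketch of how a dump can be requested from the OS
side with SIGUSR1, as mentioned earlier in the thread. The function name
request_crash_dump is hypothetical, and the pid argument is assumed to be the
OS pid of the beam process (e.g. as reported by ps):

```shell
#!/bin/sh
# Hypothetical helper (not part of OTP): ask a running Erlang emulator
# to write a crash dump by sending it SIGUSR1.
# Usage: request_crash_dump <os-pid-of-beam>
request_crash_dump() {
    pid="$1"
    # kill -0 delivers no signal; it only checks that the process
    # exists and that we are allowed to signal it.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # On SIGUSR1 the emulator writes a crash dump and exits. The dump is
    # named erl_crash.dump in the emulator's working directory unless the
    # ERL_CRASH_DUMP environment variable points somewhere else.
    kill -USR1 "$pid"
}
```

It would be called as `request_crash_dump 12345`, where 12345 is only a
placeholder for the real beam pid.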

Kind regards,
Kirill Zaborsky

2011/10/28 Ahmed Omar <spawn.think@REDACTED>

> Are you able to expand the message queue of the user_drv process? That might
> give some information.
> Did you check the epmd status before dumping?
>
> On Fri, Oct 28, 2011 at 10:10 AM, Kirill Zaborsky <qrilka@REDACTED> wrote:
>
>> Just 2 days have passed and the Erlang node got stuck once again.
>> This time I killed it with SIGUSR1 and received a crash dump.
>> Checking all the logs on the host didn't give any hints as to where the
>> problem may be.
>> In the crash dump the only suspicious thing is that user_drv has a message
>> queue length of 7550. The program counter points
>> to user_drv:server_loop/5 + 48. Is there any way to find out which
>> instruction in the source code that corresponds to?
>> BTW, the crash dump viewer says that the crash dump was truncated. Is there
>> any way to get a full crash dump?
>> The system is running R14B03, if it matters.
>> Any advice is welcome.
>>
>> Kind regards,
>> Kirill Zaborsky
>>
>> 2011/10/26 Kirill Zaborsky <qrilka@REDACTED>
>>
>>> Recently we have found some problems with our Erlang application:
>>> For some time the system works OK (e.g. before today it ran with no
>>> problems for at least 17 days). Then something happens and it gets stuck.
>>> It does not respond to pings, and the HTTP interface (mochiweb) gives no
>>> replies. The only thing that can be observed is the standard "ALIVE"
>>> message sent to stdout every 15 minutes when there is no output to stdout.
>>> Messages from the logs show nothing special before logging stops.
>>> The only thing I could do is kill the emulator. That gives me the
>>> opportunity to restart the system but gives no additional information about
>>> the root of the problem.
>>> On the JVM it's possible to get a thread dump (using the QUIT signal). Is
>>> there some way to "manually" force the Erlang emulator to produce a crash
>>> dump without using erlang:halt/1?
>>> Are there other ways to diagnose this problem that I should take a
>>> look at?
>>>
>>> Kind regards,
>>> Kirill Zaborsky
>>>
>>
>>
>
>
> --
> Best Regards,
> - Ahmed Omar
> http://nl.linkedin.com/in/adiaa
> Follow me on twitter
> @spawn_think <http://twitter.com/#!/spawn_think>
>
>