[erlang-questions] How to diagnose stuck Erlang node

Ahmed Omar spawn.think@REDACTED
Fri Oct 28 13:23:12 CEST 2011


Maybe providing some information about what your application is doing might
help?

On Fri, Oct 28, 2011 at 10:47 AM, Kirill Zaborsky <qrilka@REDACTED> wrote:

> About the message queue: the crash dump viewer says "The dump is truncated,
> no data available", so I've got no more information :-\
> epmd -names showed the node running but I could not contact it.
>
> Kind regards,
> Kirill Zaborsky
>
>
> 2011/10/28 Ahmed Omar <spawn.think@REDACTED>
>
>> Are you able to expand the message queue of the user_drv process? That
>> might give some information.
>> Did you check epmd status before dumping?
>>
>> On Fri, Oct 28, 2011 at 10:10 AM, Kirill Zaborsky <qrilka@REDACTED> wrote:
>>
>>> Just 2 days passed and the Erlang node got stuck once again.
>>> This time I killed it with SIGUSR1 and received a crash dump.
>>> Checking all the logs on the host didn't give any hints as to where the
>>> problem may be.
>>> The only suspicious thing in the crash dump is that user_drv has a message
>>> queue length of 7550. The program counter points
>>> to user_drv:server_loop/5 + 48. Is there any way to find out which
>>> instruction in the source code that corresponds to?
>>> BTW, the crash dump viewer says that the crash dump was truncated. Is there
>>> any way to get a full crash dump?
>>> The system is running R14B03, if it matters.
>>> Any advice is welcome.
>>>
>>> Kind regards,
>>> Kirill Zaborsky
>>>
>>> 2011/10/26 Kirill Zaborsky <qrilka@REDACTED>
>>>
>>>> Recently we have found some problems with our Erlang application.
>>>> For some time the system works OK (e.g. before today it ran with no
>>>> problems for at least 17 days). Then something happens and it gets stuck.
>>>> It does not respond to pings, and the HTTP interface (mochiweb) gives no
>>>> replies. The only thing that can be observed is the standard "ALIVE"
>>>> message sent to stdout every 15 minutes when there is no output to stdout.
>>>> Messages in the logs show nothing special before logging stops.
>>>> The only thing I could do was kill the emulator. That lets me
>>>> restart the system but gives no additional information about
>>>> the root of the problem.
>>>> On the JVM it's possible to get a thread dump (using the QUIT signal). Is
>>>> there some way to "manually" force the Erlang emulator to produce a crash
>>>> dump without using erlang:halt/1?
>>>> Are there other ways to diagnose this problem that I should take a
>>>> look at?
>>>>
>>>> Kind regards,
>>>> Kirill Zaborsky
>>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> - Ahmed Omar
>> http://nl.linkedin.com/in/adiaa
>> Follow me on twitter
>> @spawn_think <http://twitter.com/#!/spawn_think>
>>
>>
>


-- 
Best Regards,
- Ahmed Omar
http://nl.linkedin.com/in/adiaa
Follow me on twitter
@spawn_think <http://twitter.com/#!/spawn_think>