[erlang-questions] Tracking down the reason for my Segmentation Fault (Core dump) problems.

Felix Gallo felixgallo@REDACTED
Thu Feb 12 00:43:57 CET 2015


I was wondering whether it might be a 32-bit installation, which can run
out of address space very quickly in situations like that.

Besides OS-level memory examination tools, the best (in my opinion) tools
for diagnosing issues like this are eper (https://github.com/massemanet/eper)
and recon (http://ferd.github.io/recon/).  The redoubtable and inestimably
worthy Mr. Hebert gives an excellent overview of the debugging process in
situations just like this in his free ebook "Stuff Goes Bad: Erlang in
Anger" (http://www.erlang-in-anger.com/).
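
In case it helps while reading the core_backtrace quoted below, here is a
rough Python sketch for pulling out just the crashing thread's frames. The
SAMPLE string is a hand-trimmed, well-formed copy of the crash-thread entries
from that paste (the original is truncated, so it won't parse as-is), and
crash_frames is a helper name I made up for illustration.

```python
import json

# Hand-trimmed sample in the same shape as the quoted core_backtrace.
# Only the function names of the crashing thread are reproduced here.
SAMPLE = """
{ "signal": 11,
  "executable": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp",
  "stacktrace": [
    { "crash_thread": true,
      "frames": [
        {"function_name": "sweep_one_area"},
        {"function_name": "erts_garbage_collect"},
        {"function_name": "erts_gc_after_bif_call"},
        {"function_name": "nbif_3_gc_after_bif"}
      ] },
    { "frames": [ {"address": 1101651978} ] }
  ] }
"""


def crash_frames(backtrace):
    """Return the crashing thread's function names, innermost frame first."""
    for thread in backtrace.get("stacktrace", []):
        if thread.get("crash_thread"):
            # Frames without symbol information are shown as "??".
            return [f.get("function_name", "??") for f in thread.get("frames", [])]
    return []


bt = json.loads(SAMPLE)
frames = crash_frames(bt)
print("signal", bt["signal"])
for depth, name in enumerate(frames):
    print(f"#{depth} {name}")
```

Read that way, the signal 11 was raised inside the garbage collector
(sweep_one_area, called from erts_garbage_collect) running right after a BIF
call. As far as I know the nbif_ prefix belongs to HiPE's native-code BIF
glue, so it may be worth checking whether any of the modules involved are
HiPE-compiled, but treat that as a guess rather than a diagnosis.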

On Wed, Feb 11, 2015 at 3:32 PM, Gene Sher <corticalcomputer@REDACTED>
wrote:

> Though as noted it happened on various Erlang installations for me, the
> one I am currently using and on which I also get the Seg Fault is:
>
> Erlang/OTP 17 [erts-6.3] [source-f9282c6] [64-bit] [smp:24:24]
> [async-threads:10] [hipe] [kernel-poll:false]
>
>
> On Wed, Feb 11, 2015 at 6:29 PM, Felix Gallo <felixgallo@REDACTED> wrote:
>
>> What does it say when you type 'erl' at the command line?  Example:
>>
>> Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:2:2] [async-threads:10]
>> [hipe] [kernel-poll:false]
>>
>> Eshell V6.0  (abort with ^G)
>> 1>
>>
>> On Wed, Feb 11, 2015 at 2:54 PM, Gene Sher <corticalcomputer@REDACTED>
>> wrote:
>>
>>> Hello List,
>>>
>>> Hardware: E5-Xeon 2697 v2, 32GB of RAM.
>>> OSes tried: Xubuntu 14.04.1 LTS, CentOS 7, Ubuntu 12.04 LTS
>>> Erlang versions the code was tried on: Erlang/OTP 17, R16, & R14
>>>
>>> I have an issue where every time I use processes which contain within
>>> themselves large data structures (Large deep learning single process
>>> nodes), after just a minute or so Erlang core dumps. The amount of RAM used
>>> is only about 2GB, so it can't be the system running out of memory, and it's
>>> only using about 10 cores, since I'm only running 10 such processes. Now
>>> the same code, the same program, the same platform, functions without a
>>> problem when I keep these processes small (substantially smaller monolithic
>>> NN-module in each process). Everything is written purely in Erlang (No NIFs
>>> were involved in this particular NN code).
>>>
>>> What exactly is happening? Is something running out of space? Can anyone
>>> recommend what option during the Erlang startup I should perhaps modify to
>>> alleviate the issue?
>>>
>>> There are no crash dump files that I can find, but I did get a core
>>> backtrace produced during one of these crashes when I was running
>>> erts-5.10.4, here is a partial paste of it:
>>>
>>> ccpp-2015-02-11-09\:18\:41-2273/core_backtrace:
>>> {   "signal": 11
>>> ,   "executable": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>> ,   "stacktrace":
>>>       [ {   "crash_thread": true
>>>         ,   "frames":
>>>               [ {   "address": 5352736
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1158432
>>>                 ,   "function_name": "sweep_one_area"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 5367589
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1173285
>>>                 ,   "function_name": "erts_garbage_collect"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 5369251
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1174947
>>>                 ,   "function_name": "erts_gc_after_bif_call"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 5871217
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1676913
>>>                 ,   "function_name": "nbif_3_gc_after_bif"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 } ]
>>>         }
>>>       , {   "frames":
>>>               [ {   "address": 1101651978
>>>                 ,   "build_id_offset": 1101651978
>>>                 } ]
>>>         }
>>>       , {   "frames":
>>>               [ {   "address": 139994957883141
>>>                 ,   "build_id":
>>> "18562ee0363bc9bd7101610bd86469aa426d0c44"
>>>                 ,   "build_id_offset": 46853
>>>                 ,   "function_name": "pthread_cond_wait@@GLIBC_2.3.2"
>>>                 ,   "file_name": "/lib64/libpthread.so.0"
>>>                 }
>>>               , {   "address": 6128777
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1934473
>>>                 ,   "function_name": "ethr_cond_wait"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 4665919
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 471615
>>>                 ,   "function_name": "sys_msg_dispatcher_func"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 6134325
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1940021
>>>                 ,   "function_name": "thr_wrapper"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 139994957868531
>>>                 ,   "build_id":
>>> "18562ee0363bc9bd7101610bd86469aa426d0c44"
>>>                 ,   "build_id_offset": 32243
>>>                 ,   "function_name": "start_thread"
>>>                 ,   "file_name": "/lib64/libpthread.so.0"
>>>                 }
>>>               , {   "address": 139994952778157
>>>                 ,   "build_id":
>>> "23d9f6f74c80c45a602094e5016f047bfc4d046c"
>>>                 ,   "build_id_offset": 1008045
>>>                 ,   "function_name": "__clone"
>>>                 ,   "file_name": "/lib64/libc.so.6"
>>>                 } ]
>>>         }
>>>       , {   "frames":
>>>               [ {   "address": 139994957894237
>>>                 ,   "build_id":
>>> "18562ee0363bc9bd7101610bd86469aa426d0c44"
>>>                 ,   "build_id_offset": 57949
>>>                 ,   "function_name": "read"
>>>                 ,   "file_name": "/lib64/libpthread.so.0"
>>>                 }
>>>               , {   "address": 5741674
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1547370
>>>                 ,   "function_name": "signal_dispatcher_thread_func"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 6134325
>>>                 ,   "build_id":
>>> "69494bd95d056f5549e80b6fe507e55af574137f"
>>>                 ,   "build_id_offset": 1940021
>>>                 ,   "function_name": "thr_wrapper"
>>>                 ,   "file_name":
>>> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>>>                 }
>>>               , {   "address": 139994957868531
>>>                 ,   "build_id":
>>> "18562ee0363bc9bd7101610bd86469aa426d0c44"
>>>                 ,   "build_id_offset": 32243
>>>                 ,   "function_name": "start_thread"
>>>                 ,   "file_name": "/lib64/libpthread.so.0"
>>>                 }
>>>               , {   "address": 139994952778157
>>>                 ,   "build_id":
>>> "23d9f6f74c80c45a602094e5016f047bfc4d046c"
>>>                 ,   "build_id_offset": 1008045
>>>                 ,   "function_name": "__clone"
>>>                 ,   "file_name": "/lib64/libc.so.6"
>>>                 } ]
>>>         }
>>> ...
>>>
>>> Thanks in advance for any suggestions and help,
>>> -Gene