[erlang-questions] Tracking down the reason for my Segmentation Fault (Core dump) problems.

Lukas Larsson lukas@REDACTED
Thu Feb 12 09:52:05 CET 2015


Hello Gene,

Without a full core file, I cannot do much more than guess at what might
be wrong. What has most likely happened is that something got corrupted on
the heap of that process. Normally I would blame a badly written NIF or
linked-in driver, but since you say you don't have anything like that, we
can rule that out. Without more information (i.e. a full core file) it is
extremely hard to tell what has gone wrong.
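
For what it is worth, the crashing thread in the partial backtrace you pasted
is inside the garbage collector (sweep_one_area, called from
erts_garbage_collect). That would be consistent with a corrupted process heap,
but it does not say what corrupted it. As a minimal sketch, assuming the
core_backtrace file is well-formed JSON with the layout shown in your paste,
something like the following Python could pull out the crashing thread's
frames. The function name crash_thread_functions and the trimmed sample report
are made up for illustration; the real file has more fields and more threads.

```python
import json


def crash_thread_functions(backtrace_text):
    """Return the function names of the crashing thread, innermost frame first.

    Assumes the report layout seen in the pasted core_backtrace: a top-level
    "stacktrace" list of threads, one of which carries "crash_thread": true.
    """
    report = json.loads(backtrace_text)
    for thread in report.get("stacktrace", []):
        if thread.get("crash_thread"):
            # Frames without symbol information have no "function_name" key.
            return [frame.get("function_name", "??")
                    for frame in thread.get("frames", [])]
    return []


# Hypothetical, trimmed copy of the pasted report, closed so that it parses.
sample = """
{ "signal": 11,
  "executable": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp",
  "stacktrace": [
    { "crash_thread": true,
      "frames": [
        {"address": 5352736, "function_name": "sweep_one_area"},
        {"address": 5367589, "function_name": "erts_garbage_collect"},
        {"address": 5369251, "function_name": "erts_gc_after_bif_call"},
        {"address": 5871217, "function_name": "nbif_3_gc_after_bif"}
      ] },
    { "frames": [ {"address": 1101651978} ] }
  ] }
"""

print(" <- ".join(crash_thread_functions(sample)))
```

To run it against the real report, read that file and pass its contents to
the function. This only summarises where the VM crashed; it is no substitute
for the full core file.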

This is most likely not due to the VM running out of memory; you would see
a different error if that were to happen.

Lukas

On Wed, Feb 11, 2015 at 11:54 PM, Gene Sher <corticalcomputer@REDACTED>
wrote:

> Hello List,
>
> Hardware: Xeon E5-2697 v2, 32GB of RAM.
> OSes tried: Xubuntu 14.04.1 LTS, CentOS 7, Ubuntu 12.04 LTS
> Erlang versions the code was tried on: Erlang/OTP 17, R16, & R14
>
> I have an issue where every time I use processes that hold large data
> structures (large deep-learning nodes, each in a single process), Erlang
> core dumps after just a minute or so. Only about 2GB of RAM is in use, so
> it can't be the system running out of memory, and it's only using about 10
> cores, since I'm only running 10 such processes. The same code, the same
> program, on the same platform, runs without a problem when I keep these
> processes small (a substantially smaller monolithic NN module in each
> process). Everything is written purely in Erlang (no NIFs were involved in
> this particular NN code).
>
> What exactly is happening? Is something running out of space? Can anyone
> recommend which Erlang startup option I should perhaps modify to
> alleviate the issue?
>
> There are no crash dump files that I can find, but I did get a core
> backtrace produced during one of these crashes when I was running
> erts-5.10.4. Here is a partial paste of it:
>
> ccpp-2015-02-11-09\:18\:41-2273/core_backtrace:
> {   "signal": 11
> ,   "executable": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> ,   "stacktrace":
>       [ {   "crash_thread": true
>         ,   "frames":
>               [ {   "address": 5352736
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1158432
>                 ,   "function_name": "sweep_one_area"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 5367589
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1173285
>                 ,   "function_name": "erts_garbage_collect"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 5369251
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1174947
>                 ,   "function_name": "erts_gc_after_bif_call"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 5871217
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1676913
>                 ,   "function_name": "nbif_3_gc_after_bif"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 } ]
>         }
>       , {   "frames":
>               [ {   "address": 1101651978
>                 ,   "build_id_offset": 1101651978
>                 } ]
>         }
>       , {   "frames":
>               [ {   "address": 139994957883141
>                 ,   "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
>                 ,   "build_id_offset": 46853
>                 ,   "function_name": "pthread_cond_wait@@GLIBC_2.3.2"
>                 ,   "file_name": "/lib64/libpthread.so.0"
>                 }
>               , {   "address": 6128777
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1934473
>                 ,   "function_name": "ethr_cond_wait"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 4665919
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 471615
>                 ,   "function_name": "sys_msg_dispatcher_func"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 6134325
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1940021
>                 ,   "function_name": "thr_wrapper"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 139994957868531
>                 ,   "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
>                 ,   "build_id_offset": 32243
>                 ,   "function_name": "start_thread"
>                 ,   "file_name": "/lib64/libpthread.so.0"
>                 }
>               , {   "address": 139994952778157
>                 ,   "build_id": "23d9f6f74c80c45a602094e5016f047bfc4d046c"
>                 ,   "build_id_offset": 1008045
>                 ,   "function_name": "__clone"
>                 ,   "file_name": "/lib64/libc.so.6"
>                 } ]
>         }
>       , {   "frames":
>               [ {   "address": 139994957894237
>                 ,   "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
>                 ,   "build_id_offset": 57949
>                 ,   "function_name": "read"
>                 ,   "file_name": "/lib64/libpthread.so.0"
>                 }
>               , {   "address": 5741674
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1547370
>                 ,   "function_name": "signal_dispatcher_thread_func"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 6134325
>                 ,   "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
>                 ,   "build_id_offset": 1940021
>                 ,   "function_name": "thr_wrapper"
>                 ,   "file_name": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
>                 }
>               , {   "address": 139994957868531
>                 ,   "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
>                 ,   "build_id_offset": 32243
>                 ,   "function_name": "start_thread"
>                 ,   "file_name": "/lib64/libpthread.so.0"
>                 }
>               , {   "address": 139994952778157
>                 ,   "build_id": "23d9f6f74c80c45a602094e5016f047bfc4d046c"
>                 ,   "build_id_offset": 1008045
>                 ,   "function_name": "__clone"
>                 ,   "file_name": "/lib64/libc.so.6"
>                 } ]
>         }
> ...
>
> Thanks in advance for any suggestions and help,
> -Gene