[erlang-questions] Tracking down the reason for my Segmentation Fault (Core dump) problems.
Felix Gallo
felixgallo@REDACTED
Thu Feb 12 00:29:49 CET 2015
What does it say when you type 'erl' at the command line? Example:
Erlang/OTP 17 [erts-6.0] [source] [64-bit] [smp:2:2] [async-threads:10]
[hipe] [kernel-poll:false]
Eshell V6.0 (abort with ^G)
1>
On Wed, Feb 11, 2015 at 2:54 PM, Gene Sher <corticalcomputer@REDACTED>
wrote:
> Hello List,
>
> Hardware: E5-Xeon 2697 v2, 32GB of RAM.
> OSes tried: Xubuntu 14.04.1 LTS, CentOS 7, Ubuntu 12.04 LTS
> Erlang versions the code was tried on: Erlang/OTP 17, R16, & R14
>
> I have an issue where, whenever I run processes that hold large data
> structures (large single-process deep learning nodes), Erlang core dumps
> after just a minute or so. Only about 2GB of RAM is in use, so it can't be
> the system running out of memory, and it's only using about 10 cores,
> since I'm only running 10 such processes. The same code, the same program,
> on the same platform runs without a problem when I keep these processes
> small (a substantially smaller monolithic NN module in each process).
> Everything is written purely in Erlang (no NIFs were involved in this
> particular NN code).
>
> What exactly is happening? Is something running out of space? Can anyone
> recommend which Erlang startup option I should modify to alleviate the
> issue?
>
> There are no crash dump files that I can find, but I did get a core
> backtrace produced during one of these crashes when I was running
> erts-5.10.4. Here is a partial paste of it:
>
> ccpp-2015-02-11-09\:18\:41-2273/core_backtrace:
> { "signal": 11
> , "executable": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> , "stacktrace":
> [ { "crash_thread": true
> , "frames":
> [ { "address": 5352736
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1158432
> , "function_name": "sweep_one_area"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 5367589
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1173285
> , "function_name": "erts_garbage_collect"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 5369251
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1174947
> , "function_name": "erts_gc_after_bif_call"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 5871217
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1676913
> , "function_name": "nbif_3_gc_after_bif"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> } ]
> }
> , { "frames":
> [ { "address": 1101651978
> , "build_id_offset": 1101651978
> } ]
> }
> , { "frames":
> [ { "address": 139994957883141
> , "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
> , "build_id_offset": 46853
> , "function_name": "pthread_cond_wait@@GLIBC_2.3.2"
> , "file_name": "/lib64/libpthread.so.0"
> }
> , { "address": 6128777
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1934473
> , "function_name": "ethr_cond_wait"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 4665919
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 471615
> , "function_name": "sys_msg_dispatcher_func"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 6134325
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1940021
> , "function_name": "thr_wrapper"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 139994957868531
> , "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
> , "build_id_offset": 32243
> , "function_name": "start_thread"
> , "file_name": "/lib64/libpthread.so.0"
> }
> , { "address": 139994952778157
> , "build_id": "23d9f6f74c80c45a602094e5016f047bfc4d046c"
> , "build_id_offset": 1008045
> , "function_name": "__clone"
> , "file_name": "/lib64/libc.so.6"
> } ]
> }
> , { "frames":
> [ { "address": 139994957894237
> , "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
> , "build_id_offset": 57949
> , "function_name": "read"
> , "file_name": "/lib64/libpthread.so.0"
> }
> , { "address": 5741674
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1547370
> , "function_name": "signal_dispatcher_thread_func"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 6134325
> , "build_id": "69494bd95d056f5549e80b6fe507e55af574137f"
> , "build_id_offset": 1940021
> , "function_name": "thr_wrapper"
> , "file_name":
> "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
> }
> , { "address": 139994957868531
> , "build_id": "18562ee0363bc9bd7101610bd86469aa426d0c44"
> , "build_id_offset": 32243
> , "function_name": "start_thread"
> , "file_name": "/lib64/libpthread.so.0"
> }
> , { "address": 139994952778157
> , "build_id": "23d9f6f74c80c45a602094e5016f047bfc4d046c"
> , "build_id_offset": 1008045
> , "function_name": "__clone"
> , "file_name": "/lib64/libc.so.6"
> } ]
> }
> ...
>
> Thanks in advance for any suggestions and help,
> -Gene
>
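A minimal Python sketch for pulling the crashing thread out of a
core_backtrace file like the one quoted above. It assumes the JSON layout
shown in the paste (a "stacktrace" list in which the crashing entry carries
"crash_thread": true). The helper name crash_frames and the trimmed SAMPLE
string are illustrative only and not part of the original report; addresses
and build IDs are left out for brevity.

```python
import json

# Trimmed copy of the quoted backtrace: only the crashing thread's function
# names and one anonymous thread are kept (assumption: same layout as the paste).
SAMPLE = """
{ "signal": 11
, "executable": "/usr/lib64/erlang/erts-5.10.4/bin/beam.smp"
, "stacktrace":
  [ { "crash_thread": true
    , "frames":
      [ { "function_name": "sweep_one_area" }
      , { "function_name": "erts_garbage_collect" }
      , { "function_name": "erts_gc_after_bif_call" }
      , { "function_name": "nbif_3_gc_after_bif" } ]
    }
  , { "frames": [ { "address": 1101651978 } ] }
  ]
}
"""


def crash_frames(backtrace):
    """Return the crashing thread's function names, innermost frame first."""
    for thread in backtrace.get("stacktrace", []):
        if thread.get("crash_thread"):
            # Frames without symbol information are shown as "??".
            return [f.get("function_name", "??") for f in thread.get("frames", [])]
    return []


report = json.loads(SAMPLE)
frames = crash_frames(report)
print("signal:", report["signal"])
print(" <- ".join(frames))
```

On the sample this lists the four frames of the crashing thread, with
sweep_one_area innermost and the garbage-collection entry points as its
callers. To use it on a real file, replace json.loads(SAMPLE) with
json.load() on the opened core_backtrace file.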