[erlang-questions] beam.smp crash reproducibly

Maxim Sokhatsky maxim@REDACTED
Wed Nov 12 23:16:14 CET 2014


Hello!

tl;dr We have managed to crash beam.smp reproducibly.
      The crash reproduces consistently on both 17.3 and R16B02.
      It happens under heavy load. No HiPE. No NIFs. ulimit is OK.

We have a very simple application that consumes a RabbitMQ queue and stores the data in mnesia disc_copies tables. For that purpose we use the RabbitMQ client stack amqp_client/rabbit_common, wrapped in our simple library synrc/mqs (300 LOC), along with a very simple wrapper over mnesia, synrc/kvs (200 LOC). The machine is powerful, with 16GB of RAM, and performance is good. However, once memory consumption reaches about 10GB, the system dumps core. We first used the original Ubuntu 12.04 package of R16B02, which lacked the symbol information needed for a bug report. So we built Erlang 17.3 from source with kerl, and the situation did not change.
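
To make the memory growth described above easier to track before the crash, it could be sampled from inside the node. The following is only a minimal sketch under stated assumptions: the module name mem_probe and its functions are hypothetical and are not part of synrc/mqs or synrc/kvs. It relies only on erlang:memory/1 and error_logger.

```erlang
-module(mem_probe).
-export([start/1, snapshot/0]).

%% Hypothetical helper, not part of the application described above.
%% Spawns a process that logs a memory snapshot every IntervalMs milliseconds.
start(IntervalMs) when is_integer(IntervalMs), IntervalMs > 0 ->
    spawn(fun() -> loop(IntervalMs) end).

loop(IntervalMs) ->
    error_logger:info_msg("memory snapshot (MB): ~p~n", [snapshot()]),
    timer:sleep(IntervalMs),
    loop(IntervalMs).

%% Returns selected erlang:memory/1 counters converted to whole megabytes.
snapshot() ->
    Keys = [total, processes, binary, ets, system],
    [{K, erlang:memory(K) div (1024 * 1024)} || K <- Keys].
```

Calling mem_probe:start(5000) would log a snapshot every five seconds. Since disc_copies tables are also held in RAM, the ets and binary counters are probably the ones to watch first.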

Here is the GDB session we retrieved from the core file, along with detailed information about the application, build procedure, etc.:

         1. https://gist.github.com/5HT/e35d58b76bc25680e17b
         2. https://gist.github.com/5HT/224c569df807f1e337aa

We heard that 17.3 has some unstable memory allocators. But the crashes were also reproducible on R16B02. So we decided not to panic, and instead to ask the community how to perform further checks and to calmly plan a regression test kit.

As you can see, the core files contain information about the allocators and GC. We think the problem is there. That leads us to the following questions for the community:

         1. Which memory allocators do you suggest we try first?
         2. What other steps should we perform?
         3. What do you think is the real cause of the problem?
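
As a starting point for the allocator question, the allocators active in the running emulator can be listed from the shell. This is a minimal sketch, assuming a hypothetical module name alloc_probe; it uses only erlang:system_info/1 and does not change any allocator settings.

```erlang
-module(alloc_probe).
-export([summary/0]).

%% Hypothetical helper: lists the alloc_util-based allocators of the
%% running emulator together with whether each one is enabled.
summary() ->
    [{A, enabled(erlang:system_info({allocator, A}))}
     || A <- erlang:system_info(alloc_util_allocators)].

%% system_info({allocator, A}) returns false for a disabled allocator,
%% undefined for an unrecognized one, and a list of instances otherwise.
enabled(false) -> false;
enabled(undefined) -> false;
enabled(Instances) when is_list(Instances) -> true.
```

The allocator settings themselves are changed with the +M emulator flags described in the erts_alloc documentation.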

The sources are really simple; I can send them to the Ericsson OTP team on request. The application itself is only 1332 LOC.

—
Synrc Research Center  
Maxim




