[erlang-questions] Strange BEAM slowdown

dmkolesnikov@REDACTED dmkolesnikov@REDACTED
Mon Feb 22 19:28:34 CET 2016


Hello,

I've seen similar behavior when the message processing rate was slower than the message arrival rate. I would use etop to check whether some process has a large mailbox or an unusually high reduction count, and use the recon tool to check for binary leaks.
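
If etop is not at hand, the mailbox check can be approximated from a remote shell using only the standard library. This is just a minimal sketch: the module name mq_top and its function names are made up for illustration, and it walks every process, so use it with care on a node with a very large process count.

```erlang
-module(mq_top).
-export([top/1, binary_bytes/0]).

%% Return up to N live processes with the longest message queues,
%% as [{QueueLen, Pid}], longest first.
%% process_info/2 returns 'undefined' for a process that has already
%% exited; the pattern in the inner generator silently skips those.
top(N) when is_integer(N), N >= 0 ->
    Lens = [{Len, Pid} || Pid <- erlang:processes(),
                          {message_queue_len, Len} <-
                              [erlang:process_info(Pid, message_queue_len)]],
    lists:sublist(lists:reverse(lists:sort(Lens)), N).

%% Total bytes currently allocated for binaries on this node.
%% A value that keeps growing while the load is steady is a hint
%% (not proof) that binaries are being held somewhere.
binary_bytes() ->
    erlang:memory(binary).
```

Calling mq_top:top(10) a few times, some seconds apart, should show whether one particular process keeps falling behind. If recon is installed, recon:proc_count(message_queue_len, 10) and recon:bin_leak(10) give similar information with less effort.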

There is a very good book by Fred Hebert, "Erlang in Anger"; it might shed some light on your problem:

https://s3.amazonaws.com/erlang-in-anger/text.v1.0.2.pdf

Best Regards, 
Dmitry


> On 22 Feb 2016, at 19:14, Timothy Legant <tlegant@REDACTED> wrote:
> 
> Hello,
> 
> We have an application where we read a huge volume of small messages
> from ZMQ sockets and distribute them to Erlang processes.  We are
> seeing strange behavior where, after a short while, beam.smp's load
> drops quite a bit and then the data begins queuing, eating memory
> until we either stop the program or the Linux OOM killer does it for
> us.
> 
> DETAILS
> -------
> CentOS release 6.6 (Final)
> Erlang/OTP 17 [erts-6.4] [source-2e19e2f] [64-bit] [smp:56:56] [async-threads:20] [hipe] [kernel-poll:true]
> 
> beam.smp is started with the flags: +sbt db +sub true
> 
> We have 60+ data sources (TCP/ZMQ sockets), each of which feeds an
> independent set of processes; there is no interaction between the
> processes handling the data from one socket and the processes handling
> data from other sockets.
> 
> Our first implementation used the erlzmq2 library to read the socket.
> We then parsed the messages in Erlang and sent Erlang terms to the
> data handling processes.
> 
> After seeing the problem behavior we suspected that the repeated calls
> to erlzmq:recv() and parsing in Erlang might be the cause of the
> backup so we rewrote that code as a NIF (background thread + several
> API calls).  Our NIF implementation reads the ZMQ socket, parses the
> data and then sends it to the data handling processes.  We (obviously,
> I suppose) create one of these background threads for each of the 60+
> data source sockets.
> 
> Despite the entirely different implementation of ZMQ handling, parsing
> and dispatch of the data, we are seeing the same issue: first the load
> drops off precipitously and then the data starts queuing in the ZMQ
> socket buffers and the program is unusable.
> 
> 
> We are curious if anyone has seen this sort of behavior with BEAM or
> might have suggestions on where to look for the issue.
> 
> 
> Thanks,
> 
> Tim
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions


