[erlang-questions] Strange BEAM slowdown
dmkolesnikov@REDACTED
Mon Feb 22 19:28:34 CET 2016
Hello,
I've seen similar behavior when the message processing rate was slower than the message arrival rate. I would use etop to check whether some process has a large mailbox or a very high reduction count, and use the recon tool to check for leaked binaries.
There is a very good book by Fred Hebert, Erlang in Anger; it might shed some light on your problem:
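As a rough illustration of the mailbox/reductions check, here is a minimal sketch that uses only standard BIFs, so it can be compiled or pasted into a node without extra dependencies. The module name mailbox_check and the function top/2 are made up for this example; they are not part of etop or recon.

```erlang
-module(mailbox_check).
-export([top/2]).

%% Return up to N {Pid, Value} pairs with the largest value of Attr.
%% Attr is expected to be message_queue_len or reductions.
top(Attr, N) when is_atom(Attr), is_integer(N), N > 0 ->
    %% process_info/2 returns undefined for a process that has already
    %% exited; the pattern in the generator silently skips those.
    Rows = [{Pid, Val} || Pid <- erlang:processes(),
                          {_, Val} <- [erlang:process_info(Pid, Attr)]],
    %% Sort ascending on the value, then reverse to get largest first.
    Sorted = lists:reverse(lists:keysort(2, Rows)),
    lists:sublist(Sorted, N).
```

For example, mailbox_check:top(message_queue_len, 10) lists the ten processes with the longest mailboxes, and mailbox_check:top(reductions, 10) the ten that have done the most work. If recon is available on the node, recon:proc_count(message_queue_len, 10) should give the same kind of answer, and recon:bin_leak(10) is the usual starting point for the binary check.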
https://s3.amazonaws.com/erlang-in-anger/text.v1.0.2.pdf
Best Regards,
Dmitry
> On 22 Feb 2016, at 19:14, Timothy Legant <tlegant@REDACTED> wrote:
>
> Hello,
>
> We have an application where we read a huge volume of small messages
> from ZMQ sockets and distribute them to Erlang processes. We are
> seeing strange behavior where, after a short while, beam.smp's load
> drops quite a bit and then the data begins queuing, eating memory
> until we either stop the program or the Linux OOM killer does it for
> us.
>
> DETAILS
> -------
> CentOS release 6.6 (Final)
> Erlang/OTP 17 [erts-6.4] [source-2e19e2f] [64-bit] [smp:56:56] [async-threads:20] [hipe] [kernel-poll:true]
>
> beam.smp is started with the flags: +sbt db +sub true
>
> We have 60+ data sources (TCP/ZMQ sockets), each of which feeds an
> independent set of processes; there is no interaction between the
> processes handling the data from one socket and the processes handling
> data from other sockets.
>
> Our first implementation used the erlzmq2 library to read the socket.
> We then parsed the messages in Erlang and sent Erlang terms to the
> data handling processes.
>
> After seeing the problem behavior we suspected that the repeated calls
> to erlzmq:recv() and parsing in Erlang might be the cause of the
> backup so we rewrote that code as a NIF (background thread + several
> API calls). Our NIF implementation reads the ZMQ socket, parses the
> data and then sends it to the data handling processes. We (obviously,
> I suppose) create one of these background threads for each of the 60+
> data source sockets.
>
> Despite the entirely different implementation of ZMQ handling, parsing
> and dispatch of the data, we are seeing the same issue: first the load
> drops off precipitously and then the data starts queuing in the ZMQ
> socket buffers and the program is unusable.
>
>
> We are curious if anyone has seen this sort of behavior with BEAM or
> might have suggestions on where to look for the issue.
>
>
> Thanks,
>
> Tim