[erlang-questions] Troubling gen_tcp.send/3 performance

Matthew Shapiro <>
Sun Dec 4 03:45:25 CET 2016

Well, this is on Windows 10, and all applications are running locally. I
suppose I could throw this on a Linux box somewhere and test to see if
localhost is just broken.

On Sat, Dec 3, 2016 at 9:36 PM, <> wrote:

> Another avenue you could check would be your operating system, nic driver,
> and the physical layer. For example, I traced issues with symptoms much
> like yours to buffer problems inside a consumer wifi access point.  It's
> possible there is a misconfiguration or limitation somewhere between you
> and the receiver.
>
> Anecdotally, Erlang is commonly used to ship many hundreds of concurrent
> video streams from a single box, so your failure at 1 stream is not
> expected.
>
> F.
>
> On Dec 3, 2016, at 5:47 PM, Matthew Shapiro <> wrote:
>
> I posted this question in the Elixir forums a day or so ago, but I wanted
> to put it here as well to gain visibility by people who have more
> experience with the internals of Erlang, since my question is related more
> to the Erlang libraries rather than Elixir itself.
>
> ## Summary
>
> I am trying to create a media streaming server in Elixir, with an initial
> focus on RTMP publishing and playback.  I chose Elixir/Erlang because it
> seemed like a perfect candidate but I seem to be having trouble.
> The testing setup is 3 applications, 1 RTMP publisher (3rd party OBS
> studio), 1 RTMP viewer (VLC), and my Elixir server.  Both the publisher and
> viewer connect to my Elixir server over localhost.  The publisher sends the
> Elixir server video and audio data and each packet gets relayed off to the
> viewer, all over TCP.  The publisher is currently set to send 2500kbps, and
> network traffic shows it pretty close to this.
> When running the test I notice the video is stuttering a lot.  VLC debug
> messages show it's receiving frames inconsistently and trying to compensate
> for it.
> After getting help from people in IRC and looking through observer, I
> think I have pretty much pinpointed the issue to the `:gen_tcp.send()`
> calls being slow, so slow in fact I have observed up to 5-10 seconds just
> to push out an individual send call.
> Since I know Erlang is heavily used in switches I can't believe that the
> performance I'm getting is normal.  Lowering my video's bitrate to 500kbps
> does show smoother playback but I can still tell there is an issue.
> For reference, the code I have so far is up at
> https://github.com/KallDrexx/mmids-temp.  Note that this is a temporary
> repository; I plan to split each of the apps up into their own
> repositories, slap an MIT license on them, then upload them to hex once I
> have this thing stabilized.
> Based on diagnostics I coded, a 2500kbps video is averaging 200-250
> messages per second going from the publisher to the viewing client.
>
> # What is the architecture?
>
> The general architecture I have right now is that when any type of client
> connects I utilize `ranch` to spawn a `gen_server`.  This server receives
> TCP binary (using `active_once` and `raw` flags), attempts to deserialize
> any RTMP messages contained in it, reacts to messages that can/should be
> reacted to, and sends any responses back to the client.  This all
> occurs within a single `gen_server` and no other processes are involved.
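
A minimal sketch of the per-connection process described above, assuming plain `:gen_tcp` in place of `ranch` and a byte echo in place of RTMP handling. `EchoConn` and `activate/1` are hypothetical names for illustration and are not taken from the repository.

```elixir
defmodule EchoConn do
  use GenServer

  def start_link(socket), do: GenServer.start_link(__MODULE__, socket)

  # Called once socket ownership has been handed to this process.
  def activate(pid), do: GenServer.call(pid, :activate)

  @impl true
  def init(socket), do: {:ok, socket}

  @impl true
  def handle_call(:activate, _from, socket) do
    {:reply, :inet.setopts(socket, active: :once), socket}
  end

  @impl true
  def handle_info({:tcp, socket, data}, socket) do
    # Stand-in for "deserialize RTMP, react, respond": echo the bytes back.
    :ok = :gen_tcp.send(socket, data)
    # Re-arm so that exactly one more {:tcp, ...} message is delivered.
    :ok = :inet.setopts(socket, active: :once)
    {:noreply, socket}
  end

  def handle_info({:tcp_closed, socket}, socket), do: {:stop, :normal, socket}
end

# Loopback demo: port 0 lets the OS pick a free port.
{:ok, listen} = :gen_tcp.listen(0, [:binary, packet: :raw, active: false])
{:ok, port} = :inet.port(listen)
{:ok, client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
{:ok, accepted} = :gen_tcp.accept(listen, 1_000)
{:ok, conn} = EchoConn.start_link(accepted)
:ok = :gen_tcp.controlling_process(accepted, conn)
:ok = EchoConn.activate(conn)
```

With `active: :once` the process receives one `{:tcp, socket, data}` message per re-arm, so a slow `:gen_tcp.send/2` inside the handler also delays the next read on that connection.
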
> For demonstration purposes when a viewing client requests playback I use
> `pg2` to subscribe to a specific channel for audio and video data.
> Publishing clients that are publishing a/v data on that same stream key
> push that data to all subscribed clients.  The viewing clients then receive
> the a/v data, serialize them into RTMP messages, serialize them into
> binary, then send them off across the network pipe.
>
> # What have I tried?
>
> First I tried utilizing `:os.system_time(:milli_seconds)` to determine
> how long any audio/video data packet took from deserializing from the
> publisher to right before binary serialization of the client.  I noticed
> that it would start out extremely fast and then pauses would occur (long
> 5-10 second pauses) and then batches of packets would get processed, then
> another pause, etc...
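
A small sketch of that kind of timing measurement. It swaps in `System.monotonic_time/1` for `:os.system_time/1`, on the assumption that only elapsed time matters here (wall-clock time can jump, monotonic time cannot). `Timing.measure/1` is a hypothetical helper name.

```elixir
defmodule Timing do
  # Runs the zero-arity function and returns {result, elapsed_ms}.
  def measure(fun) when is_function(fun, 0) do
    started = System.monotonic_time(:millisecond)
    result = fun.()
    {result, System.monotonic_time(:millisecond) - started}
  end
end

# Example: time a block that stands in for "deserialize, relay, serialize".
{_result, elapsed_ms} = Timing.measure(fn -> Process.sleep(20) end)
IO.puts("packet handling took #{elapsed_ms} ms")
```

Wrapping only the `:gen_tcp.send/2` call in the same helper would separate time spent in the send from time spent waiting in the message queue.
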
> Then I was reminded about observer, and I loaded it and saw the following
> graph: https://dl.dropboxusercontent.com/u/6753359/observer1.PNG.  The
> I/O graph told me that while inbound traffic was smooth, outbound was being
> staggered.
> I then opened the process for the server managing the viewing client.  I
> noticed the message queue length was constantly increasing, never
> decreasing, and the process was constantly stuck in the `prim_inet:send/3`
> function.
> In doing some Googling I came across [this thread](http://erlang.2086793.n4.nabble.com/why-is-gen-tcp-send-slow-td2106954.html)
> talking about slow
> `send()` performance, and while it didn't have a definite fix it did
> mention batching up the binary for the send() call so I wasn't calling it
> 200 times every second.
> The first thing I tried was to utilize a timer.  Instead of calling
> `send()` for every message I put the binary in an iodata queue held in the
> gen_server's state.  I then added `:timer.send_interval(100, :send_queue)`
> to my initialization thinking I could send data once every 100ms.
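
The timer-based batching can be sketched roughly as follows. This is an illustration under assumptions, not the code from the repository: `BatchingSender`, `enqueue/2` and the injected `send_fun` are hypothetical, and in the real server `send_fun` would be something like `fn data -> :gen_tcp.send(socket, data) end`.

```elixir
defmodule BatchingSender do
  use GenServer

  # send_fun is a one-argument function that receives the queued iodata.
  def start_link(send_fun, interval_ms \\ 100) do
    GenServer.start_link(__MODULE__, {send_fun, interval_ms})
  end

  def enqueue(pid, binary), do: GenServer.cast(pid, {:enqueue, binary})

  @impl true
  def init({send_fun, interval_ms}) do
    # Delivers :send_queue to this process every interval_ms milliseconds.
    {:ok, _tref} = :timer.send_interval(interval_ms, :send_queue)
    {:ok, %{send_fun: send_fun, queue: []}}
  end

  @impl true
  def handle_cast({:enqueue, binary}, state) do
    # Nesting as [queue, binary] appends in O(1) and preserves order.
    {:noreply, %{state | queue: [state.queue, binary]}}
  end

  @impl true
  def handle_info(:send_queue, %{queue: []} = state), do: {:noreply, state}

  def handle_info(:send_queue, state) do
    # One send call per tick, however many messages were queued.
    :ok = state.send_fun.(state.queue)
    {:noreply, %{state | queue: []}}
  end
end
```

Because the queue is iodata, nothing is copied or concatenated until the data reaches the socket.
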
> This did not give any better results outside of managing the message queue
> better.  What I noticed with observer and this timer was odd in that I
> would keep pressing the refresh hotkey and I would see my queue keep
> growing for up to 5-10 seconds, and then go down to zero again.  This repeated
> over and over, and every refresh it was still stuck on `prim_inet:send/3`.
> It seems to me that send is just taking a ridiculous amount of time.
> Changing the timer interval up or down did not really help noticeably.
> The next thing I tried was to stop the interval and send every X times I
> try to send a message, allowing me to batch messages together but make
> smaller batches than the interval method caused.  This didn't help by a
> noticeable amount either, and was worse for managing the message queue.
> Finally I tried tweaking the watermark values (even having them go up to
> 64k) but I could not stop prim_inet:send/3 from causing my process to wait
> upwards of 15 seconds.
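
For reference, a sketch of how the watermark values can be set and read back on a connected socket. The sizes are illustrative only, and the loopback pair exists just to have a real socket to operate on. `send_timeout` is included as an assumption about what may help diagnosis: with it set, a stalled send returns `{:error, :timeout}` instead of blocking indefinitely.

```elixir
# Loopback pair so the options can be applied to a real connected socket.
{:ok, listen} = :gen_tcp.listen(0, [:binary, active: false])
{:ok, port} = :inet.port(listen)
{:ok, _client} = :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])
{:ok, socket} = :gen_tcp.accept(listen, 1_000)

# Illustrative values only.
:ok =
  :inet.setopts(socket,
    high_watermark: 65_536,
    low_watermark: 32_768,
    send_timeout: 5_000
  )

# Read the options back, plus the number of bytes waiting in the send queue.
{:ok, opts} = :inet.getopts(socket, [:high_watermark, :low_watermark, :send_timeout])
{:ok, stats} = :inet.getstat(socket, [:send_pend])
IO.inspect(opts, label: "socket options")
IO.inspect(stats, label: "send queue")
```

Once the socket's send queue grows past `high_watermark`, senders are suspended until it drains below `low_watermark`, so a steadily growing `send_pend` would suggest the receiving side is not reading fast enough.
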
>
> # So what now?
>
> I'm not quite sure how to proceed from here.  I can't believe that sending
> data via TCP is really that bad on a VM that I hear so much praise for
> regarding low latency and soft-realtime behavior.
> At the end of the day when the final system is built I am hoping to
> get 50 inputs sending data to 150 outputs (based on current performance
> I've seen from other third party products), each connection (in and out)
> dealing with around 3Mbps of audio/video data.  So it's a bit disconcerting
> that I can't even get 1 in 1 out working reliably.
> Does anyone have any advice on where I go from here?