[erlang-questions] Need help with async disk IO and thread pool on many devices (more than 1 Gbit/s )

Wed Apr 3 09:58:38 CEST 2013

Hi.

The situation is as follows: erlyvideo is running on a Xeon E5 server with
16 SATA HDDs attached. RAID and other nightmare devices have been removed;
only raw disks are mounted in Linux.

2500 clients are watching video at a combined rate of about 1600 Mbit/s.
They download chunks of video via HTTP. Each chunk requires two disk pread
calls, one for the video chunk and one for the audio chunk, so the disk is
read in contiguous blocks of about 300-700 KB.

Everything is OK; response time is under 50 ms.

But when a video is being uploaded to one HDD, that disk becomes
unresponsive, and then the whole server becomes unresponsive.

I think that all 160 async threads get blocked in pread calls to this
device, and thus the whole system becomes unresponsive.

So the problem is not raw throughput or RPS; there are no more than 400
RPS.
The problem is the usual one: how to handle errors and keep the system soft
and responsive.
The web server must not fail with a 500 timeout error after 5 seconds of
waiting; it should respond with 503 within several milliseconds.
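
To make the "503 within several milliseconds" requirement concrete, here is a
minimal sketch of per-device admission control. It is written in Python only
for illustration, not as erlyvideo or Erlang VM code. The names `DeviceGate`
and `handle_request` and the in-flight limit are assumptions, not anything
that exists today.

```python
import threading


class DeviceGate:
    """Caps the number of in-flight reads for one device (hypothetical helper)."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_enter(self):
        # Non-blocking acquire: returns False at once when the device is saturated.
        return self._slots.acquire(blocking=False)

    def leave(self):
        self._slots.release()


def handle_request(gate, read_chunk):
    """Return (status, body); read_chunk is any callable that does the disk read."""
    if not gate.try_enter():
        # Shed load immediately instead of queueing behind a stuck disk.
        return 503, b""
    try:
        return 200, read_chunk()
    finally:
        gate.leave()
```

With one gate per disk, a request for a saturated disk is rejected without
ever touching a reader thread, so requests for the other disks are not
delayed by it.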

The problem I've met is not specific to video; it may be common to any web
server. I don't see any reason why Erlang should be slower than nginx, but
I do see some minor problems that don't allow it to reach the same
predictability.

There are the following ideas for solving this problem:
1) Spawn a separate node per device and route all disk requests for that
device to it. Communicate via TCP. It is dumb, but it is a solution.

2) Spawn a pool of separate file reader processes for each opened file. It
may be even better than the previous option, because the OS will fully
schedule them.

3) Add Linux async disk IO to the Erlang VM. The question is: why isn't it
already done? Maybe there are some problems with it?

4) Add some affinity between ports and async threads. But frankly speaking,
it will require adding a feature that allows dynamically changing the size
of this thread pool.
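
What ideas 2) and 4) have in common is isolation: reads for one device should
only be able to occupy threads that belong to that device. Below is a rough
sketch of that idea, again in Python rather than Erlang and only under stated
assumptions. `DevicePools` is a hypothetical name, 10 threads per device is
simply 160 threads divided by 16 disks, and the 50 ms timeout is taken from
the normal response time mentioned above.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as ReadTimeout


class DevicePools:
    """One small thread pool per device; names and sizes are illustrative."""

    def __init__(self, threads_per_device=10):
        self._threads = threads_per_device
        self._pools = {}
        self._lock = threading.Lock()

    def _pool(self, device):
        # Lazily create the pool the first time a device is used.
        with self._lock:
            if device not in self._pools:
                self._pools[device] = ThreadPoolExecutor(max_workers=self._threads)
            return self._pools[device]

    def pread(self, device, fd, length, offset, timeout=0.05):
        """Read through the device's own pool; return None if it takes too long."""
        future = self._pool(device).submit(os.pread, fd, length, offset)
        try:
            return future.result(timeout=timeout)
        except ReadTimeout:
            future.cancel()  # only has an effect if the read has not started yet
            return None      # the caller can answer 503
```

A stalled disk can then block at most its own threads, and callers waiting on
it give up after the timeout. The sketch does not unblock a thread that is
already stuck inside pread, and the queue in front of a stuck pool is
unbounded, so a real version would also need a cap on pending requests per
device.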