[erlang-questions] Obsolete exported functions file:raw_{read, write}_file_info/2 - why?
Scott Lystig Fritchie
fritchie@REDACTED
Wed Sep 14 23:17:03 CEST 2011
Attila Rajmund Nohl <attila.r.nohl@REDACTED> wrote:
arn> That's interesting. I was also chasing a performance problem a
arn> couple of weeks ago (the server crawled to a halt for a minute or
arn> two, the load of the Linux OS went over 50, then everything went
arn> back to normal) and noticed that the file_server process used a lot
arn> of CPU. My solution(?) was to randomize the jobs that would write
arn> to the disk, so the 30 processes tried not to write to the disk at
arn> the same time.
Hrm, you didn't mention how you're writing to disk. If you're using
file:write_file/2, that's another function that ends up going through
the file_server_2 process.
If you're opening file descriptors for each file (and opening them in
'raw' mode), then you probably should also be using the +A flag when
starting the VM so that slow file I/O won't block the rest of the VM's
computation.
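
A minimal sketch of the raw-mode approach described above, assuming a
hypothetical module name (raw_write_sketch) and helper names that are
not from the original post. It uses only file:open/2, file:write/2,
file:close/1 and erlang:system_info/1. The second function reports the
size of the async thread pool that the +A flag configures, which is 0
if the VM was started without +A on releases where that is the default.

```erlang
-module(raw_write_sketch).
-export([write_raw/2, async_pool_size/0]).

%% Write Data to Path through a raw-mode file descriptor.
%% A raw descriptor belongs to the calling process, so these writes
%% are not funnelled through the file_server_2 process.
write_raw(Path, Data) ->
    case file:open(Path, [write, raw, binary]) of
        {ok, Fd} ->
            try
                file:write(Fd, Data)
            after
                %% Always release the descriptor, even if the write fails.
                file:close(Fd)
            end;
        {error, _Reason} = Error ->
            Error
    end.

%% Number of async threads in the pool set by "erl +A N".
async_pool_size() ->
    erlang:system_info(thread_pool_size).
```

For example, start the VM with "erl +A 16" and call
raw_write_sketch:write_raw("/tmp/example.dat", <<"hello">>) from each
writer process; the path here is only a placeholder.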
And if you're already using separate file descriptors and the +A flag,
simply doing too much I/O at once can be a Bad Idea. If your disk is
overloaded (measure using "iostat -x 1" or equivalent), then all you can
really do is wait(*). If you believe that your disk(s) are not yet
saturated and that your bottleneck is the Erlang VM, you may have too
much parallel I/O relative to the size of the +A I/O worker pool.
For example, say that you use "+A 4" and (as mentioned above) have 50
file writes happening simultaneously. The code in efile_drv.c assigns a
worker Pthread based on the Erlang port number(**). So you could have
many ports' worth of I/O assigned to each worker Pthread, 12-13 on
average if everything is sync'ed perfectly. In a pathological worst
case, where you repeatedly opened 4 ports and closed 3 of every 4, you
could have all 50 remaining ports assigned to the same worker
Pthread.(***)
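
The arithmetic above can be sketched as a small simulation. This is
not the driver's code: it assumes that the worker is chosen as the port
number modulo the pool size, which is one simple way to assign "based
on the Erlang port number". The module and function names are
hypothetical.

```erlang
-module(port_pool_sketch).
-export([load_per_worker/2]).

%% Assumption (not taken from efile_drv.c): a port is served by worker
%% number (PortNumber rem PoolSize).
%% Returns [{WorkerIndex, PortCount}], one entry per worker thread.
load_per_worker(PortNumbers, PoolSize) ->
    Empty = [{W, 0} || W <- lists:seq(0, PoolSize - 1)],
    lists:foldl(
      fun(Port, Acc) ->
              W = Port rem PoolSize,
              %% W is already bound, so this match looks up that worker.
              {W, N} = lists:keyfind(W, 1, Acc),
              lists:keyreplace(W, 1, Acc, {W, N + 1})
      end, Empty, PortNumbers).
```

With 50 consecutive port numbers and "+A 4",
port_pool_sketch:load_per_worker(lists:seq(1, 50), 4) returns
[{0,12},{1,13},{2,13},{3,12}]. If only every fourth port number
survives, port_pool_sketch:load_per_worker(lists:seq(4, 200, 4), 4)
returns [{0,50},{1,0},{2,0},{3,0}]: all 50 ports land on one worker
while the other three sit idle.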
-Scott
(*) Or fundamentally change the data you're writing and how you do it
and when.
(**) The assignment of port -> worker thread is *not* done by "first
idle Pthread in the pool".
(***) The lack of visibility into this part of the efile_drv.c driver is
a motivating reason that got me active in patching the VM to add DTrace
probes. See https://github.com/slfritchie/otp/tree/dtrace-experiment
for source. Also, Dustin Sallings is hacking on DTrace, starting from
a different direction; see https://github.com/dustin/otp/tree/dtrace.