[erlang-bugs] file_server processes very large numbers of messages slowly

Fri Sep 16 09:15:47 CEST 2011

On Thu, Sep 15, 2011 at 06:12:15PM +0100, Alexandru Scvorţov wrote:
> Hi,
> 
> If you try to do a lot of IO concurrently (e.g. 100 000 calls to
> file:read_file_info/1), you notice that the calls take longer and longer
> as the number of concurrent requests rises.
> 
> As far as we (RabbitMQ) can tell, the bottleneck seems to be
> file_server.
> 
> For instance, running the attached program (100 000 calls to file:r_f_i) with:
>   erlc breaky.erl && erl +P 1048576 -s breaky break_fileserver -s init stop
> you can see this happening clearly.  The first calls, which happen when
> file_server is congested with a large number of messages on its queue,
> take a long time to run; as the messages are processed, the processing
> speed increases as well (the first column shows the time to serve 1000
> requests).
> 
> Our guess is that this happens because of the selective receive in
> prim_file:drv_get_response/1.  That seems to cause a full scan of the
> mailbox (which obviously takes longer when there are a lot of messages).
> 
> Does this sound right?  Is there any workaround other than not doing a lot
> of file IO concurrently?

I'd guess you are right, except that through 'raw' files you can do more
_file IO_ concurrently. But file:read_file_info/1 does not have any 'raw'
possibility.
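
To make the suspected cost concrete, here is a minimal sketch of a
selective receive that has to skip over a long mailbox. It is not the
prim_file code; the module name selrecv_demo and the message shapes
are made up for illustration only.

```erlang
-module(selrecv_demo).
-export([run/1]).

%% Put N unrelated messages in our own mailbox, then time a selective
%% receive for a message that was sent after all of them. The receive
%% has to test every queued {noise, _} message before it finds 'wanted'.
run(N) ->
    Self = self(),
    [Self ! {noise, I} || I <- lists:seq(1, N)],
    Self ! wanted,
    {Micros, ok} = timer:tc(fun() -> receive wanted -> ok end end),
    %% Drain the noise so the caller's mailbox is left clean.
    drain(N),
    Micros.

drain(0) -> ok;
drain(K) -> receive {noise, _} -> drain(K - 1) end.
```

Comparing selrecv_demo:run(1000) with selrecv_demo:run(1000000) in a
shell should show the time for that single receive growing with the
queue length, which is the same effect a server process sees when it
waits for one particular reply while requests keep piling up.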

The 'raw' vs non-raw (cooked?) modes have some history and implications.
An Erlang node can run on a diskless machine, using file_server_2 on
another node and thereby seeing that other node's filesystem. So the whole
file server subsystem is geared around this concept. 'Raw' files instead
see the actual filesystem of the local node, so the two can be different
filesystems. In most networks (e.g. Unix ones) the interesting filesystems
of one machine look the same from all machines, which makes 'raw' files
look like just a speed optimization instead of a different concept.

So you can open a file in 'raw' mode, and read from and write to it,
without bothering a central bottleneck, but the meta operations do
not (yet) have a 'raw' option. The file you open with the 'raw' option
could be a different one than the one you just called read_file_info
on if you are running as a file server slave against another node's
file_server_2.
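
As a sketch of what that looks like in practice, the following reads a
whole file through a 'raw' handle using only file:open/2, file:read/2
and file:close/1. The module name raw_read_demo and the 64 kB chunk
size are arbitrary choices for the example.

```erlang
-module(raw_read_demo).
-export([read_all/1]).

%% Read a whole file through a 'raw' handle and return {ok, Binary}.
read_all(Path) ->
    {ok, Fd} = file:open(Path, [read, raw, binary]),
    try
        read_loop(Fd, [])
    after
        ok = file:close(Fd)
    end.

%% Read fixed-size chunks until eof, collecting them in reverse order.
read_loop(Fd, Acc) ->
    case file:read(Fd, 65536) of
        {ok, Data}       -> read_loop(Fd, [Data | Acc]);
        eof              -> {ok, iolist_to_binary(lists:reverse(Acc))};
        {error, _} = Err -> Err
    end.
```

Note that a 'raw' handle is meant to be used only by the process that
opened it, so each concurrent reader needs to open its own.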

Therefore, what is lacking is read_file_info in a 'raw' variant,
as well as advise/4, change_group/2, change_owner/2,3, change_time/2,3 ...
... write_file/2,3, write_file_info/2: a total of around 28 operations.
A strategy is needed: either a naming scheme, an option list added
to all calls, or a file:raw(Operation, ...) ...
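
Purely to illustrate the last alternative, a dispatcher of that shape
could look roughly like the sketch below. Everything in it is
hypothetical: the module raw_ops_sketch is not part of OTP, the list
of allowed operations is an arbitrary small subset, and it assumes
that the internal, undocumented prim_file module exports
read_file_info/1, read_link_info/1 and list_dir/1, which may change
between releases.

```erlang
-module(raw_ops_sketch).
-export([raw/2]).

%% Hypothetical dispatcher, not an OTP API: run a whitelisted meta
%% operation in the calling process instead of going via file_server_2.
raw(Op, Args) when is_atom(Op), is_list(Args) ->
    Arity = length(Args),
    case lists:member({Op, Arity}, allowed()) of
        %% Assumption: prim_file (internal module) exports these calls.
        true  -> apply(prim_file, Op, Args);
        false -> {error, {unsupported_raw_operation, Op, Arity}}
    end.

%% Arbitrary subset chosen for the example.
allowed() ->
    [{read_file_info, 1}, {read_link_info, 1}, {list_dir, 1}].
```

A call such as raw_ops_sketch:raw(read_file_info, ["breaky.erl"]) would
then always look at the local filesystem, with the caveat described
above for diskless nodes.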

/ Raimo

> 
> Cheers,
> Alex
> 

> -module(breaky).
> -compile([export_all]).
> 
> break_fileserver() ->
>     N = 100000,
>     Self = self(),
>     [spawn(fun () ->
>                    {ok, _} = file:read_file_info("breaky.erl"),
>                    Self ! done
>            end) || _ <- lists:seq(1,N)],
>     gather(erlang:now(), N).
> 
> gather(_, 0) ->
>     ok;
> gather(Now, N) ->
>     Now2 = if N rem 1000 =:= 0 ->
>                    Now1 = erlang:now(),
>                    io:format("~p: ~p ~p~n", [timer:now_diff(Now1, Now) / 1000,
>                                              N,
>                                              process_info(whereis(file_server_2),
>                                                           message_queue_len)]),
>                    Now1;
>               true -> Now
>     end,
>     receive
>         done -> gather(Now2, N-1)
>     end.


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB