interesting I/O bottleneck

James Hague james.hague@REDACTED
Tue Jun 1 15:31:08 CEST 2010


I've got an application which reads through directory trees, compares file
dates, sorts lists of files, that sort of thing. I'm not loading files so
much as calling file:list_dir and file:read_file_info. It's slower than I
expected it to be, so I ran it through eprof. The result is that over 55% of
the time is spent in file:file_name. Even functions I expected to be
slightly expensive, like building a dict of all the filenames in a tree, are
irrelevant in comparison.
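
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of profiling a directory scan with eprof. The module name scan_profile_sketch and the scan/1 workload are made up for illustration and are not the application described above; only the eprof and file calls are standard OTP.

```erlang
-module(scan_profile_sketch).
-export([run/1]).

%% Hypothetical workload: list one directory and stat every entry.
%% Entries that cannot be stat'ed simply yield {error, Reason} tuples.
scan(Dir) ->
    {ok, Names} = file:list_dir(Dir),
    [file:read_file_info(filename:join(Dir, Name)) || Name <- Names].

%% Profile scan/1 under eprof, print the per-function totals,
%% and return the number of entries that were examined.
run(Dir) ->
    {ok, _Pid} = eprof:start(),
    {ok, Infos} = eprof:profile(fun() -> scan(Dir) end),
    eprof:analyze(total),
    eprof:stop(),
    length(Infos).
```

Calling scan_profile_sketch:run(".") from the shell prints a table of functions with call counts and time percentages, which is where a hot spot like file:file_name would show up.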

file:file_name looks like this:

file_name(N) ->
    try
        file_name_1(N)
    catch Reason ->
        {error, Reason}
    end.

file_name_1([C|T]) when is_integer(C), C > 0, C =< 255 ->
    [C|file_name_1(T)];
file_name_1([H|T]) ->
    file_name_1(H) ++ file_name_1(T);
file_name_1([]) ->
    [];
file_name_1(N) when is_atom(N) ->
    atom_to_list(N);
file_name_1(_) ->
    throw(badarg).

I didn't realize until looking at the source that a filename can be a deep
list of characters and atoms. If it were an iolist, then the entire function
could just go away, but that wouldn't handle atoms. As it stands, even an
already-flat string is rebuilt one character at a time, and nested parts are
joined with ++, so this function is surprisingly expensive.
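
One possible way around the cost, as a rough sketch and not the OTP implementation: check first whether the name is already a flat string, and only fall back to the deep traversal when it is not. The module fname_sketch and the helpers is_flat/1 and deep/1 are names invented here; deep/1 just repeats the file_name_1 logic quoted above.

```erlang
-module(fname_sketch).
-export([file_name/1]).

%% Hypothetical variant of file:file_name/1 with a fast path.
file_name(N) when is_atom(N) ->
    atom_to_list(N);
file_name(N) when is_list(N) ->
    case is_flat(N) of
        %% Already a flat string: return it unchanged, no copying.
        true ->
            N;
        %% Deep or invalid list: fall back to the original traversal.
        false ->
            try deep(N)
            catch throw:Reason -> {error, Reason}
            end
    end;
file_name(_) ->
    {error, badarg}.

%% Tail-recursive check that allocates nothing.
is_flat([C|T]) when is_integer(C), C > 0, C =< 255 ->
    is_flat(T);
is_flat([]) ->
    true;
is_flat(_) ->
    false.

%% Same clauses as file_name_1 in the quoted source.
deep([C|T]) when is_integer(C), C > 0, C =< 255 ->
    [C|deep(T)];
deep([H|T]) ->
    deep(H) ++ deep(T);
deep([]) ->
    [];
deep(N) when is_atom(N) ->
    atom_to_list(N);
deep(_) ->
    throw(badarg).
```

The common case of a plain string then costs a single pass with no list construction, while deep lists and atoms behave as before. Whether this actually helps would have to be measured on the real workload.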

