[erlang-questions] Fast directory walker

Stanislaw Klekot erlang.org@REDACTED
Sat Dec 10 00:50:27 CET 2016


On Fri, Dec 09, 2016 at 11:15:58PM +0000, Frank Muller wrote:
> I would like to improve the speed of my directory walker.
> 
> walk(Dir) ->
>     {ok, Files} = prim_file:list_dir(Dir),
>     walk(Dir, Files).

Why prim_file:list_dir() instead of file:list_dir()? The former is an
undocumented internal function.

[...]
> Compared to almost anything i found on the web, it’s still very slow:
> > timer:tc(fun() -> dir:walk("/usr/share") end).
> {4662361,ok}

What exactly is this "anything you found on the web"? And how did you
run your comparisons? There's a large difference between the first and
second consecutive runs, caused by the OS's directory cache, and there's
a large difference between simply walking through the directory and
walking while printing something to the screen for every file.
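
To make the cache effect visible, one could time several consecutive
runs of the same walk and compare them. The following is only a rough
sketch: the module name walkbench and the function time_runs/2 are made
up for illustration, and the only library call it relies on is
timer:tc/1.

```erlang
-module(walkbench).
-export([time_runs/2]).

%% Run Fun N times in a row and return the elapsed time of each run,
%% in microseconds, in the order the runs were made.
time_runs(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    [element(1, timer:tc(Fun)) || _ <- lists:seq(1, N)].
```

Called as walkbench:time_runs(fun() -> dir:walk("/usr/share") end, 3),
it returns three timings. If the first one is much larger than the
rest, the difference is most likely the directory cache warming up and
not the walker itself.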

Then there's also your use of filelib:is_dir() followed by
filelib:file_size(), which means two stat(2) calls per file, while you
only need one (file:read_file_info() returns both the type and the
size).
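
A minimal sketch of a walker that stats each entry only once could look
like the code below. It is not your original code: the module name
dirwalk and the choice to return the total size of regular files are
assumptions made for the example. It uses the documented
file:list_dir/1 and file:read_file_info/1 and the #file_info{} record
from kernel's file.hrl.

```erlang
-module(dirwalk).
-export([walk/1]).

-include_lib("kernel/include/file.hrl").

%% Walk Dir recursively and return the total size, in bytes, of all
%% regular files found below it. Unreadable directories count as 0.
walk(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} ->
            lists:foldl(fun(Name, Acc) ->
                                Acc + visit(filename:join(Dir, Name))
                        end, 0, Names);
        {error, _Reason} ->
            0
    end.

%% A single read_file_info/1 call yields both the type and the size,
%% so no separate is_dir/file_size lookups are needed.
visit(Path) ->
    case file:read_file_info(Path) of
        {ok, #file_info{type = directory}} ->
            walk(Path);
        {ok, #file_info{type = regular, size = Size}} ->
            Size;
        _Other ->
            0   % devices, sockets, unreadable entries and so on
    end.
```

Note that file:read_file_info/1 follows symbolic links, just as
filelib:is_dir() does. If links should not be followed,
file:read_link_info/1 is the call to use instead.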

-- 
Stanislaw Klekot


