<div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">Hi Stanislaw</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">First, I don't care if I've to use documented/undocumented calls as long as I can achieve my goal: faster dir walking.</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">And you're right, here is a detailed comparison with other scripting languages:</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">In my /usr/share, there’s:</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">2580 directories</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">28953 files</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">1. 
Erlang (no io:format/2 output, just recursion):</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">walk(Dir) -></div><div class="">    {ok, Files} = file:list_dir(Dir),</div><div class="">    walk(Dir, Files).</div><div class=""><br class=""></div><div class="">walk(Dir, [ Basename | Rest ]) -></div><div class="">    Path = filename:join([ Dir, Basename ]),</div><div class="">    case filelib:is_dir(Path) of</div><div class="">        true  -></div><div class="">            walk(Path);</div><div class="">        false -></div><div class="">            %% io:format("~s~n", [Path]),</div><div class="">            filelib:file_size(Path)</div><div class="">    end,</div><div class="">    walk(Dir, Rest);</div><div class="">walk(_, []) -></div><div class="">    ok.</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">timer:tc(fun() -> dir:walk("/usr/share") end).</div><div class="">{4662361,ok}</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">2. 
Python (this code even computes the total size of the directory):</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">From: <a href="http://stackoverflow.com/questions/1392413/calculating-a-directory-size-using-python" class="">http://stackoverflow.com/questions/1392413/calculating-a-directory-size-using-python</a></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">import os</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">def get_size(start_path = '.'):</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">    total_size = 0</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">    for dirpath, dirnames, filenames in os.walk(start_path):</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">        for f in filenames:</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">            fp = os.path.join(dirpath, f)</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">            total_size += os.path.getsize(fp)</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">    return total_size</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">print get_size()</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">$ cd /usr/share</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">$ time dir_walker.py</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">432034130</div><div class="">0.25 real         0.13 user         0.10 sys</div></div><div class="" 
style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">3. Perl (same, computes the directory size):</div><div class=""><a href="http://www.perlmonks.org/?node_id=168974" class="">http://www.perlmonks.org/?node_id=168974</a></div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">use File::Find;</div><div class="">my $size = 0;</div><div class="">find(sub { $size += -s if -f $_ }, "/usr/share");</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">$ time perl dir_walker.pl</div><div class="">432034130</div><div class="">0.13 real         0.05 user         0.08 sys</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">4. 
Ruby (same, computes the directory size):</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">def directory_size(path)</div><div class="">  path &lt;&lt; '/' unless path.end_with?('/')</div><div class="">  raise RuntimeError, "#{path} is not a directory" unless File.directory?(path)</div><div class="">  total_size = 0</div><div class="">  Dir["#{path}**/*"].each do |f|</div><div class="">    total_size += File.size(f) if File.file?(f) &amp;&amp; File.size?(f)</div><div class="">  end</div><div class="">  total_size</div><div class="">end</div><div class="">puts directory_size '/usr/share'</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">$ time walker.rb</div><div class="">432028422</div><div class="">0.21 real         0.09 user         0.11 sys</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">5. 
Lua:</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">From: <a href="http://lua-users.org/wiki/DirTreeIterator" class="">http://lua-users.org/wiki/DirTreeIterator</a></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">require "lfs"</div><div class=""><br class=""></div><div class="">function dirtree(dir)</div><div class="">  assert(dir and dir ~= "", "directory parameter is missing or empty")</div><div class="">  if string.sub(dir, -1) == "/" then</div><div class="">    dir=string.sub(dir, 1, -2)</div><div class="">  end</div><div class=""><br class=""></div><div class="">  local function yieldtree(dir)</div><div class="">    for entry in lfs.dir(dir) do</div><div class="">      if entry ~= "." and entry ~= ".." then</div><div class="">        entry=dir.."/"..entry</div><div class=""><span class="Apple-tab-span" style="white-space:pre">      </span>local attr=lfs.attributes(entry)</div><div class=""><span class="Apple-tab-span" style="white-space:pre">  </span>coroutine.yield(entry,attr)</div><div class=""><span class="Apple-tab-span" style="white-space:pre">       </span>if attr.mode == "directory" then</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>  yieldtree(entry)</div><div class=""><span class="Apple-tab-span" style="white-space:pre">      </span>end</div><div class="">      end</div><div class="">    end</div><div class="">  end</div><div class=""><br class=""></div><div class="">  return coroutine.wrap(function() yieldtree(dir) end)</div><div class="">end</div><div class=""><br class=""></div><div class="">for filename, attr in dirtree("/usr/share") do</div><div class="">      print(attr.mode, filename)</div><div class="">end</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" 
style="font-family:UICTFontTextStyleBody;font-size:17px">$ luarocks install luafilesystem</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">$ time lua walker.lua > /dev/null</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><div class="">0.30 real         0.16 user         0.14 sys</div></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br class=""></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">Do you need more?</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px"><br></div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">Thanks for your help.</div><div class="" style="font-family:UICTFontTextStyleBody;font-size:17px">/Frank</div>
<div class="gmail_quote"><div>On Sat, Dec 10, 2016 at 00:51, Stanislaw Klekot &lt;<a href="mailto:erlang.org@jarowit.net">erlang.org@jarowit.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Dec 09, 2016 at 11:15:58PM +0000, Frank Muller wrote:<br class="gmail_msg"><br>> I would like to improve the speed of my directory walker.<br class="gmail_msg"><br>><br class="gmail_msg"><br>> walk(Dir) -><br class="gmail_msg"><br>>     {ok, Files} = prim_file:list_dir(Dir),<br class="gmail_msg"><br>>     walk(Dir, Files).<br class="gmail_msg"><br><br class="gmail_msg"><br>Why prim_file:list_dir() instead of file:list_dir()? The former is an<br class="gmail_msg"><br>undocumented internal function.<br class="gmail_msg"><br><br class="gmail_msg"><br>[...]<br class="gmail_msg"><br>> Compared to almost anything I found on the web, it’s still very slow:<br class="gmail_msg"><br>> > timer:tc(fun() -> dir:walk("/usr/share") end).<br class="gmail_msg"><br>> {4662361,ok}<br class="gmail_msg"><br><br class="gmail_msg"><br>What is this "anything you found on the web"? And how did you run<br class="gmail_msg"><br>your comparisons? There's a large difference between the first and second<br class="gmail_msg"><br>consecutive runs caused by the OS's directory cache, and there's a large<br class="gmail_msg"><br>difference between simply walking through the directory and walking with<br class="gmail_msg"><br>printing something to the screen for every file.<br class="gmail_msg"><br><br class="gmail_msg"><br>Then there's also your using filelib:is_dir() and then<br class="gmail_msg"><br>filelib:file_size(), which means two stat(2) calls, while you only need<br class="gmail_msg"><br>to do it once per file (file:read_file_info()).<br class="gmail_msg"><br><br class="gmail_msg"><br>--<br class="gmail_msg"><br>Stanislaw Klekot<br class="gmail_msg"><br></blockquote></div></div>
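<div>The last suggestion in the quoted message (one stat(2)-style call per entry instead of filelib:is_dir() followed by filelib:file_size()) might look roughly like the sketch below. The module name dir_walker, the use of file:read_link_info/1 rather than file:read_file_info/1 so that symlinks are not followed, and returning the summed size are all assumptions made for illustration; none of this is code from the thread, and it has not been timed against the figures above.</div>

```erlang
-module(dir_walker).
-export([walk/1]).

%% Provides the #file_info{} record returned by file:read_link_info/1.
-include_lib("kernel/include/file.hrl").

%% Returns the total size in bytes of all regular files below Dir.
%% Unreadable directories contribute 0 instead of crashing the walk.
walk(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} ->
            lists:foldl(fun(Name, Acc) ->
                                Acc + visit(filename:join(Dir, Name))
                        end, 0, Names);
        {error, _Reason} ->
            0
    end.

%% One stat-like call per entry: the same #file_info{} record carries
%% both the entry type and its size.
visit(Path) ->
    case file:read_link_info(Path) of
        {ok, #file_info{type = directory}} ->
            walk(Path);
        {ok, #file_info{type = regular, size = Size}} ->
            Size;
        _Other ->
            0   % symlinks, devices and errors are skipped
    end.
```

<div>It could be timed the same way as the original, for example with timer:tc(fun() -> dir_walker:walk("/usr/share") end), preferably on a second run so the OS directory cache is warm for every language being compared.</div>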