<div style="white-space:pre-wrap">This is kind of bullshit (sorry ;) ... in the end, this is what the helpers in filelib do:<br><a href="https://github.com/erlang/otp/blob/maint/lib/stdlib/src/filelib.erl#L257">https://github.com/erlang/otp/blob/maint/lib/stdlib/src/filelib.erl#L257</a><br><br>Unless you have a better algorithm in mind, I don't see the point of rewriting something that already exists ...</div><br><div class="gmail_quote"><div dir="ltr">On Sat, 10 Dec 2016 at 09:36, Sergej Jurečko <<a href="mailto:sergej.jurecko@gmail.com">sergej.jurecko@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto" class="gmail_msg">Stop using filelib functions. Use file:read_file_info and file:list_dir.</div><div dir="auto" class="gmail_msg"><div dir="auto" class="gmail_msg"><br class="gmail_msg"></div><div dir="auto" class="gmail_msg">Sergej</div></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">On Dec 10, 2016 9:29 AM, "Frank Muller" <<a href="mailto:frank.muller.erl@gmail.com" class="gmail_msg" target="_blank">frank.muller.erl@gmail.com</a>> wrote:<br type="attribution" class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_msg"><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">Hi Stanislaw</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">First, I don't care whether I have to use documented or undocumented calls, as long as I achieve my goal: faster dir walking.</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">And you're 
right, here is a detailed comparison with other scripting languages:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">In my /usr/share, there’s:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">2580 directories</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">28953 files</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">1. Erlang (no io:format/1, just recurse):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">walk(Dir) -></div><div class="gmail_msg">    {ok, Files} = file:list_dir(Dir),</div><div class="gmail_msg">    walk(Dir, Files).</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">walk(Dir, [ Basename | Rest ]) -></div><div class="gmail_msg">    Path = filename:join([ Dir, Basename ]),</div><div class="gmail_msg">    case filelib:is_dir(Path) of</div><div class="gmail_msg">        true  -></div><div class="gmail_msg">            walk(Path);</div><div class="gmail_msg">        false -></div><div class="gmail_msg">          %%  io:format("~s~n", [Path]),</div><div class="gmail_msg">            filelib:file_size(Path)</div><div class="gmail_msg">    end,</div><div class="gmail_msg">    walk(Dir, Rest);</div><div class="gmail_msg">walk(_, []) -></div><div class="gmail_msg">    ok.</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">timer:tc(fun() -> 
dir:walk("/usr/share") end).</div><div class="gmail_msg">{4662361,ok}</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">2. Python (this code even counts the size of the dir):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">From: <a href="http://stackoverflow.com/questions/1392413/calculating-a-directory-size-using-python" class="gmail_msg" target="_blank">http://stackoverflow.com/questions/1392413/calculating-a-directory-size-using-python</a></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">import os</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">def get_size(start_path='.'):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">    total_size = 0</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">    for dirpath, dirnames, filenames in os.walk(start_path):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">        for f in filenames:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">            fp = os.path.join(dirpath, f)</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">            total_size += os.path.getsize(fp)</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">    return total_size</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">print(get_size())</div><div 
style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ cd /usr/share</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ time dir_walker.py</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">432034130</div><div class="gmail_msg">0.25 real         0.13 user         0.10 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">3. Perl (same, counts dir size):</div><div class="gmail_msg">From: <a href="http://www.perlmonks.org/?node_id=168974" class="gmail_msg" target="_blank">http://www.perlmonks.org/?node_id=168974</a></div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">use File::Find;</div><div class="gmail_msg">my $size = 0;</div><div class="gmail_msg">find(sub { $size += -s if -f $_ }, "/usr/share");</div><div class="gmail_msg">print "$size\n";  # print the total, as the output below implies</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">$ time perl dir_walker.pl</div><div class="gmail_msg">432034130</div><div class="gmail_msg">0.13 real         0.05 user         0.08 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">4. Ruby (same, counts dir size):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">def directory_size(path)</div><div class="gmail_msg">  path += '/' unless path.end_with?('/')</div><div class="gmail_msg">  raise RuntimeError, "#{path} is not a directory" unless File.directory?(path)</div><div class="gmail_msg">  total_size = 0</div><div class="gmail_msg">  Dir["#{path}**/*"].each do |f|</div><div class="gmail_msg">    total_size += File.size(f) if File.file?(f) && File.size?(f)</div><div class="gmail_msg">  end</div><div class="gmail_msg">  total_size</div><div class="gmail_msg">end</div><div class="gmail_msg">puts directory_size '/usr/share'</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">$ time walker.rb</div><div class="gmail_msg">432028422</div><div class="gmail_msg">0.21 real         0.09 user         0.11 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">5. 
Lua:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">From: <a href="http://lua-users.org/wiki/DirTreeIterator" class="gmail_msg" target="_blank">http://lua-users.org/wiki/DirTreeIterator</a></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">require "lfs"</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">function dirtree(dir)</div><div class="gmail_msg">  assert(dir and dir ~= "", "directory parameter is missing or empty")</div><div class="gmail_msg">  if string.sub(dir, -1) == "/" then</div><div class="gmail_msg">    dir=string.sub(dir, 1, -2)</div><div class="gmail_msg">  end</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">  local function yieldtree(dir)</div><div class="gmail_msg">    for entry in lfs.dir(dir) do</div><div class="gmail_msg">      if entry ~= "." and entry ~= ".." 
then</div><div class="gmail_msg">        entry=dir.."/"..entry</div><div class="gmail_msg">        local attr=lfs.attributes(entry)</div><div class="gmail_msg">        coroutine.yield(entry,attr)</div><div class="gmail_msg">        if attr.mode == "directory" then</div><div class="gmail_msg">          yieldtree(entry)</div><div class="gmail_msg">        end</div><div class="gmail_msg">      end</div><div class="gmail_msg">    end</div><div class="gmail_msg">  end</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">  return coroutine.wrap(function() yieldtree(dir) end)</div><div class="gmail_msg">end</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">for filename, attr in dirtree("/usr/share") do</div><div class="gmail_msg">  print(attr.mode, filename)</div><div class="gmail_msg">end</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ luarocks install luafilesystem</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ time lua walker.lua > /dev/null</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">0.30 real         0.16 user         0.14 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" 
class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">Do you need more?</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">Thanks for your help.</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">/Frank</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_quote gmail_msg"><div class="gmail_msg">On Sat, 10 Dec 2016 at 00:51, Stanislaw Klekot <<a href="mailto:erlang.org@jarowit.net" class="gmail_msg" target="_blank">erlang.org@jarowit.net</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Dec 09, 2016 at 11:15:58PM +0000, Frank Muller wrote:<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">> I would like to improve the speed of my directory walker.<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">> walk(Dir) -><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">>     {ok, Files} = prim_file:list_dir(Dir),<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">>     walk(Dir, Files).<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg"><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">Why prim_file:list_dir() instead of file:list_dir()? 
The former is<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">an undocumented internal function.<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg"><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">[...]<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">> Compared to almost anything I found on the web, it’s still very slow:<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">> > timer:tc(fun() -> dir:walk("/usr/share") end).<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">> {4662361,ok}<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg"><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">What is this "anything you found on the web"? And how did you run<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">your comparisons? 
There's a large difference between the first and second<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">consecutive runs, caused by the OS's directory cache, and there's a large<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">difference between simply walking through the directory and walking while<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">printing something to the screen for every file.<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg"><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">Then there's also your use of filelib:is_dir() followed by<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">filelib:file_size(), which means two stat(2) calls, while you only need<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">to do it once per file (file:read_file_info()).<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg"><br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">--<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg">Stanislaw Klekot<br class="m_1984904274731870973m_480222179950307766gmail_msg gmail_msg"><br class="gmail_msg"></blockquote></div></div>
<br class="gmail_msg">_______________________________________________<br class="gmail_msg">
erlang-questions mailing list<br class="gmail_msg">
<a href="mailto:erlang-questions@erlang.org" class="gmail_msg" target="_blank">erlang-questions@erlang.org</a><br class="gmail_msg">
<a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" class="gmail_msg" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br class="gmail_msg">
<br class="gmail_msg"></blockquote></div></div>
</blockquote></div>
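<div style="white-space:pre-wrap">For reference, here is a minimal sketch of the approach suggested in the quoted replies: file:list_dir/1 plus a single file:read_file_info/1 per entry, instead of filelib:is_dir/1 followed by filelib:file_size/1. The module name dir_walk and the function total_size/1 are made up for illustration, and silently skipping unreadable entries is an assumption. It has not been benchmarked against the numbers quoted above.

```erlang
-module(dir_walk).
-export([total_size/1]).

%% Provides the #file_info{} record returned by file:read_file_info/1.
-include_lib("kernel/include/file.hrl").

%% Sum the sizes of all regular files below Dir.
%% Assumption: directories that cannot be listed count as 0 bytes.
total_size(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} ->
            lists:foldl(
              fun(Name, Acc) ->
                      Acc + entry_size(filename:join(Dir, Name))
              end, 0, Names);
        {error, _Reason} ->
            0
    end.

%% One read_file_info/1 call per entry gives both the type and the size.
%% Like filelib:is_dir/1, this follows symlinks.
entry_size(Path) ->
    case file:read_file_info(Path) of
        {ok, #file_info{type = directory}} ->
            total_size(Path);
        {ok, #file_info{type = regular, size = Size}} ->
            Size;
        _Other ->
            %% Devices, sockets, broken links, permission errors: ignored.
            0
    end.
```

It can be timed the same way as the original walker, e.g. timer:tc(fun() -> dir_walk:total_size("/usr/share") end). Swapping in file:read_link_info/1 would avoid following symlinks.</div>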