<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">read_file_info does the job of is_dir and file_size in a single call. That was the intention.<div class=""><br class=""></div><div class="">Also, use file:read_file_info(Name, [raw])<br class=""><div dir="auto" class=""><br class=""></div><div dir="auto" class="">Sergej</div><div dir="auto" class=""><br class=""></div><div><blockquote type="cite" class=""><div class="">On 10 Dec 2016, at 09:42, Benoit Chesneau <<a href="mailto:bchesneau@gmail.com" class="">bchesneau@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div style="white-space:pre-wrap" class="">this is kind of bullshit (sorry ;)... in the end this is what the helpers in filelib do:<br class=""><a href="https://github.com/erlang/otp/blob/maint/lib/stdlib/src/filelib.erl#L257" class="">https://github.com/erlang/otp/blob/maint/lib/stdlib/src/filelib.erl#L257</a><br class=""><br class="">unless you have a better algorithm in mind, I don't see the point of rewriting something that already exists ...</div><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Sat, 10 Dec 2016 at 09:36, Sergej Jurečko <<a href="mailto:sergej.jurecko@gmail.com" class="">sergej.jurecko@gmail.com</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto" class="gmail_msg">Stop using filelib functions. 
Use file:read_file_info and file:list_dir.</div><div dir="auto" class="gmail_msg"><div dir="auto" class="gmail_msg"><br class="gmail_msg"></div><div dir="auto" class="gmail_msg">Sergej</div></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">On Dec 10, 2016 9:29 AM, "Frank Muller" <<a href="mailto:frank.muller.erl@gmail.com" class="gmail_msg" target="_blank">frank.muller.erl@gmail.com</a>> wrote:<br type="attribution" class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="gmail_msg"><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">Hi Stanislaw</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">First, I don't care if I've to use documented/undocumented calls as long as I can achieve my goal: faster dir walking.</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">And you're right, here is a detailed comparison with other scripting languages:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">In my /usr/share, there’s:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">2580 directories</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">28953 files</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">1. 
Erlang (no io:format/2, just recurse):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">walk(Dir) -></div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;{ok, Files} = file:list_dir(Dir),</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;walk(Dir, Files).</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">walk(Dir, [ Basename | Rest ]) -></div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;Path = filename:join([ Dir, Basename ]),</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;case filelib:is_dir(Path) of</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;true -></div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;walk(Path);</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;false -></div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;%% io:format("~s~n", [Path]),</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;filelib:file_size(Path)</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;end,</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;walk(Dir, Rest);</div><div class="gmail_msg">walk(_, []) -></div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;ok.</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">timer:tc(fun() -> dir:walk("/usr/share") end).</div><div class="gmail_msg">{4662361,ok}</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">2. 
Python (this code even counts the size of the dir):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">From: <a href="http://stackoverflow.com/questions/1392413/calculating-a-directory-size-using-python" class="gmail_msg" target="_blank">http://stackoverflow.com/questions/1392413/calculating-a-directory-size-using-python</a></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">import os</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">def get_size(start_path = '.'):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;total_size = 0</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;for dirpath, dirnames, filenames in os.walk(start_path):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for f in filenames:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fp = os.path.join(dirpath, f)</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;total_size += os.path.getsize(fp)</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;return total_size</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">print(get_size())</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ cd /usr/share</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ time dir_walker.py</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div 
class="gmail_msg">432034130</div><div class="gmail_msg">0.25 real 0.13 user 0.10 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">3. Perl (same, counts dir size)</div><div class="gmail_msg"><a href="http://www.perlmonks.org/?node_id=168974" class="gmail_msg" target="_blank">http://www.perlmonks.org/?node_id=168974</a></div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">use File::Find; </div><div class="gmail_msg">my $size = 0; </div><div class="gmail_msg">find(sub { $size += -s if -f $_ }, "/usr/share");</div><div class="gmail_msg">print "$size\n";</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">$ time perl dir_walker.pl</div><div class="gmail_msg">432034130</div><div class="gmail_msg">0.13 real 0.05 user 0.08 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">4. 
Ruby (same, counts dir size):</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">def directory_size(path)</div><div class="gmail_msg">&nbsp;&nbsp;path << '/' unless path.end_with?('/')</div><div class="gmail_msg">&nbsp;&nbsp;raise RuntimeError, "#{path} is not a directory" unless File.directory?(path)</div><div class="gmail_msg">&nbsp;&nbsp;total_size = 0</div><div class="gmail_msg">&nbsp;&nbsp;Dir["#{path}**/*"].each do |f|</div><div class="gmail_msg">&nbsp;&nbsp;&nbsp;&nbsp;total_size += File.size(f) if File.file?(f) && File.size?(f)</div><div class="gmail_msg">&nbsp;&nbsp;end</div><div class="gmail_msg">&nbsp;&nbsp;total_size</div><div class="gmail_msg">end</div><div class="gmail_msg">puts directory_size '/usr/share'</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">$ time walker.rb</div><div class="gmail_msg">432028422</div><div class="gmail_msg">0.21 real 0.09 user 0.11 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">5. 
Lua:</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">From: <a href="http://lua-users.org/wiki/DirTreeIterator" class="gmail_msg" target="_blank">http://lua-users.org/wiki/DirTreeIterator</a></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">require "lfs"</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">function dirtree(dir)</div><div class="gmail_msg"> assert(dir and dir ~= "", "directory parameter is missing or empty")</div><div class="gmail_msg"> if string.sub(dir, -1) == "/" then</div><div class="gmail_msg"> dir=string.sub(dir, 1, -2)</div><div class="gmail_msg"> end</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"> local function yieldtree(dir)</div><div class="gmail_msg"> for entry in lfs.dir(dir) do</div><div class="gmail_msg"> if entry ~= "." and entry ~= ".." 
then</div><div class="gmail_msg"> entry=dir.."/"..entry</div><div class="gmail_msg"><span class="gmail_msg m_1984904274731870973m_480222179950307766Apple-tab-span" style="white-space:pre-wrap"> </span>local attr=lfs.attributes(entry)</div><div class="gmail_msg"><span class="gmail_msg m_1984904274731870973m_480222179950307766Apple-tab-span" style="white-space:pre-wrap"> </span>coroutine.yield(entry,attr)</div><div class="gmail_msg"><span class="gmail_msg m_1984904274731870973m_480222179950307766Apple-tab-span" style="white-space:pre-wrap"> </span>if attr.mode == "directory" then</div><div class="gmail_msg"><span class="gmail_msg m_1984904274731870973m_480222179950307766Apple-tab-span" style="white-space:pre-wrap"> </span> yieldtree(entry)</div><div class="gmail_msg"><span class="gmail_msg m_1984904274731870973m_480222179950307766Apple-tab-span" style="white-space:pre-wrap"> </span>end</div><div class="gmail_msg"> end</div><div class="gmail_msg"> end</div><div class="gmail_msg"> end</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg"> return coroutine.wrap(function() yieldtree(dir) end)</div><div class="gmail_msg">end</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">for filename, attr in dirtree("/usr/share") do</div><div class="gmail_msg"> print(attr.mode, filename)</div><div class="gmail_msg">end</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ luarocks install luafilesystem</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">$ time lua walker.lua > /dev/null</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div class="gmail_msg">0.30 real 0.16 user 0.14 sys</div></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div 
style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">Do you need more?</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><br class="gmail_msg"></div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">Thanks for your help.</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg">/Frank</div><div style="font-family:UICTFontTextStyleBody;font-size:17px" class="gmail_msg"><div style="font-family:Helvetica;font-size:12px;word-wrap:break-word" class="gmail_msg"><div style="word-wrap:break-word" class="gmail_msg"><div style="word-wrap:break-word" class="gmail_msg"><div style="word-wrap:break-word" class="gmail_msg"><div style="word-wrap:break-word" class="gmail_msg"><div style="margin:0px;line-height:normal;min-height:14px" class="gmail_msg"><span class="gmail_msg"></span></div></div></div></div></div></div><span style="font-family:Helvetica;font-size:12px" class="gmail_msg"><div class="gmail_msg"><span class="gmail_msg"><br class="gmail_msg"></span></div></span></div><div class="gmail_quote gmail_msg"><div class="gmail_msg">On Sat, 10 Dec 
2016 at 00:51, Stanislaw Klekot <<a href="mailto:erlang.org@jarowit.net" class="gmail_msg" target="_blank">erlang.org@jarowit.net</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Dec 09, 2016 at 11:15:58PM +0000, Frank Muller wrote:<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> I would like to improve the speed of my directory walker.<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> walk(Dir) -><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> {ok, Files} = prim_file:list_dir(Dir),<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> walk(Dir, Files).<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg"><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">Why prim_file:list_dir() instead of file:list_dir()? 
The former is an<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">undocumented internal function.<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg"><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">[...]<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> Compared to almost anything I found on the web, it’s still very slow:<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> > timer:tc(fun() -> dir:walk("/usr/share") end).<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">> {4662361,ok}<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg"><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">What is this "anything you found on the web"? And how did you run<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">your comparisons? 
There's a large difference between the first and second<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">consecutive runs, caused by the OS's directory cache, and there's a large<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">difference between simply walking through the directory and walking while<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">printing something to the screen for every file.<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg"><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">Then there's also your use of filelib:is_dir() followed by<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">filelib:file_size(), which means two stat(2) calls, while you only need<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">to do it once per file (file:read_file_info()).<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg"><br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">--<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg">Stanislaw Klekot<br class="gmail_msg m_1984904274731870973m_480222179950307766gmail_msg"><br class="gmail_msg"></blockquote></div></div>
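<div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">For reference, the "one stat per entry" idea from this thread could look roughly like the sketch below. It is only an illustration, not anyone's actual code: the module name dir_walk and the function total_size/1 are made up for the example, and it sums file sizes so the result is comparable with the scripts above. It relies on file:list_dir/1 and file:read_file_info/2 with the raw option, and on the #file_info{} record from kernel's file.hrl.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg" style="white-space:pre-wrap">-module(dir_walk).                        %% hypothetical module name
-export([total_size/1]).

%% Provides the #file_info{} record returned by file:read_file_info/2.
-include_lib("kernel/include/file.hrl").

%% Sum the sizes of all regular files below Dir.
%% Directories that cannot be listed are skipped and count as 0.
total_size(Dir) ->
    case file:list_dir(Dir) of
        {ok, Names} ->
            lists:foldl(
              fun(Name, Acc) ->
                      Acc + entry_size(filename:join(Dir, Name))
              end, 0, Names);
        {error, _Reason} ->
            0
    end.

%% A single read_file_info call yields both the type and the size,
%% replacing the filelib:is_dir/1 + filelib:file_size/1 pair.
%% The raw option bypasses the file server process.
%% Like filelib:is_dir/1, this follows symbolic links.
entry_size(Path) ->
    case file:read_file_info(Path, [raw]) of
        {ok, #file_info{type = directory}} ->
            total_size(Path);
        {ok, #file_info{type = regular, size = Size}} ->
            Size;
        _Other ->
            0
    end.</div><div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">It can be timed the same way as before, e.g. timer:tc(fun() -> dir_walk:total_size("/usr/share") end). Running it twice and comparing the second run keeps the OS directory cache from skewing the comparison.</div>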
<br class="gmail_msg">_______________________________________________<br class="gmail_msg">
erlang-questions mailing list<br class="gmail_msg">
<a href="mailto:erlang-questions@erlang.org" class="gmail_msg" target="_blank">erlang-questions@erlang.org</a><br class="gmail_msg">
<a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" class="gmail_msg" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br class="gmail_msg">
<br class="gmail_msg"></blockquote></div></div>
</blockquote></div>
</div></blockquote></div><br class=""></div></body></html>