[erlang-questions] escript cutting output

Jesper Louis Andersen jesper.louis.andersen@REDACTED
Sun Nov 20 16:26:53 CET 2011


On 2011-11-20 07:53, Bob Gustafson wrote:
> $ cat b_read.rb
> #!/usr/env ruby
>
> fin = File.open("buffer.out","r")
> count = {}
> fin.each_line do|line|
>    line.each_char do|c|
>      if count[c] == nil
>         count[c] = 1
>      else
>         count[c] += 1
>      end
>    end
> end

Not tested, but here is a general idea. Define a module and export the 
relevant function.

-module(foo).

-export([go/1]).

Now, go/1 opens the file in raw mode with read_ahead, so reads are 
buffered and file:read_line/1 gets a small speedup. Opening the file in 
binary mode is even faster, but for now this will definitely do.

go(FN) ->
     %% fin = File.open("buffer.out","r")
     {ok, Fd} = file:open(FN, [read, read_ahead, raw]),
     R = frequency_count(Fd),
     file:close(Fd),
     {ok, R}.

To count frequencies, we read in the first line and carry a dictionary 
along to update as we go. The dictionary starts out empty.

frequency_count(IODev) ->
     %% count = {}
     frequency_count(IODev, file:read_line(IODev), dict:new()).

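Two patterns. Either there is a line, in which case we process it, or 
there are no more lines, in which case we return the dictionary. Note 
that an {error, Reason} result from file:read_line/1 matches neither 
clause, so a read error crashes the function.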

%% fin.each_line do |line|
frequency_count(IODev, {ok, Line}, Dict) ->
     frequency_count(IODev, file:read_line(IODev),
                     update_line(Line, Dict));
frequency_count(_IODev, eof, Dict) -> Dict.

Same game for a single line: either there is another character or there 
isn't. When the line is exhausted, we return the dict. Otherwise, the 
nil-check-then-increment scheme in the Ruby code is what 
dict:update_counter/3 does for us, so we apply it to one character at a 
time.

%%   line.each_char do |c|
update_line([Char | Rest], Dict) ->
%%     if count[c] == nil
%%        count[c] = 1
%%     else
%%        count[c] += 1
%%     end
     NewDict = dict:update_counter(Char, 1, Dict),
     update_line(Rest, NewDict);
update_line([], Dict) -> Dict.
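
To look at the result of foo:go/1, something along these lines should 
do. This is only a sketch: the module name foo_report and both function 
names are made up for illustration, and it prints character codes rather 
than the characters themselves so that newlines do not garble the output.

```erlang
-module(foo_report).

-export([sorted_counts/1, print_counts/1]).

%% Turn the dict into a list of {Char, Count} pairs, sorted by
%% character code.
sorted_counts(Dict) ->
    lists:sort(dict:to_list(Dict)).

%% Print one "Code Count" pair per line.
print_counts(Dict) ->
    lists:foreach(fun({Char, Count}) ->
                          io:format("~p ~p~n", [Char, Count])
                  end,
                  sorted_counts(Dict)).
```

In the shell, assuming both modules are compiled:

    {ok, D} = foo:go("buffer.out"),
    foo_report:print_counts(D).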

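On the remark about binaries above: a minimal sketch of one way to go in 
that direction, under some assumptions. It uses file:read_file/1 to 
slurp the whole file as one binary instead of opening it with the binary 
option, so it only suits files that fit in memory, and it counts bytes 
rather than characters. The module name foo_bin and the helper 
count_bytes are hypothetical names, not part of the code above.

```erlang
-module(foo_bin).

-export([go/1, count_bytes/1]).

%% Read the whole file into memory as a single binary, then count.
go(FN) ->
    {ok, Bin} = file:read_file(FN),
    {ok, count_bytes(Bin)}.

%% Start with the empty dictionary, as before.
count_bytes(Bin) ->
    count_bytes(Bin, dict:new()).

%% Either there is another byte or the binary is exhausted.
count_bytes(<<Byte, Rest/binary>>, Dict) ->
    count_bytes(Rest, dict:update_counter(Byte, 1, Dict));
count_bytes(<<>>, Dict) ->
    Dict.
```

The structure mirrors update_line/2: one clause per byte, one clause for 
the empty case.
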
-- 
Jesper Louis Andersen
   Erlang Solutions, Copenhagen, DK
