[erlang-questions] Charset conversion / stylistic question.

Ulf Wiger (TN/EAB) ulf.wiger@REDACTED
Thu Apr 19 10:13:00 CEST 2007


 
Tim Becker wrote:
>
> > map_1(_F, <<>>) -> [];
> > map_1(F, Bin) ->
> >   {C, Rest} = F(Bin),
> >   [C | map_1(F, Rest)].
> 
> Just to make sure I understand. The above example isn't
> tail recursive either, correct? The [|] list
> construction gets executed after the recursive call
> to map_1 ...

This is true. However, as Thomas commented, tail recursion
isn't everything. The alternatives that make the above function
tail recursive will probably not be faster.
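
To make the comparison concrete, here is a minimal sketch of both
shapes side by side. The module name map_sketch and the function
names map_body/2 and map_tail/2 are made up for illustration; the
tail-recursive variant uses the usual accumulate-then-reverse idiom.

```erlang
-module(map_sketch).
-export([map_body/2, map_tail/2]).

%% Body-recursive: the cons cell is built after the recursive
%% call returns, so each element keeps a stack frame alive.
map_body(_F, <<>>) -> [];
map_body(F, Bin) ->
    {C, Rest} = F(Bin),
    [C | map_body(F, Rest)].

%% Tail-recursive: results are accumulated in reverse order and
%% reversed once at the end.
map_tail(F, Bin) -> map_tail(F, Bin, []).

map_tail(_F, <<>>, Acc) -> lists:reverse(Acc);
map_tail(F, Bin, Acc) ->
    {C, Rest} = F(Bin),
    map_tail(F, Rest, [C | Acc]).
```

With F = fun(<<B, Rest/binary>>) -> {B + 1, Rest} end, both
functions return [2,3,4] for <<1,2,3>>. The tail-recursive one
trades stack space for an extra pass in lists:reverse/1, which is
why it is not automatically faster.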

There are some threads in the archive that discuss 
tail recursion. Here are some samples:

Richard O'Keefe giving some perspective on tail-recursion:
http://www.erlang.org/ml-archive/erlang-questions/200309/msg00028.html

Me reporting a case where passing funs rather than waiting
for functions to return gave a 9000x speedup (extreme case):
http://erlang.org/ml-archive/erlang-questions/200112/msg00032.html

In this particular instance, I believe it was a case of
deep recursion with functions creating lots of garbage.
Kostis Sagonas suggested that HiPE would have handled it
better then, since it scans the stack, and would pick
up the garbage.


> So what I've changed is:
> 
> the binary map just converts the binary to a list which
> uses lists:map/2:
> 
>   map (Function, <<B/binary>>) ->
>       lists:map(Function, binary_to_list(B));
>   map (_Function, <<>>) -> [].

That's certainly a convenient way to do it.
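
As a possible alternative on OTP releases that support bit string
comprehensions (R12B and later, so an assumption about the
environment), the intermediate list from binary_to_list/1 can be
avoided. The module name bin_map is hypothetical.

```erlang
-module(bin_map).
-export([map/2]).

%% Apply Function to every byte of Bin and collect the results
%% in a list, walking the binary directly.
map(Function, Bin) when is_function(Function, 1), is_binary(Bin) ->
    [Function(B) || <<B>> <= Bin].
```

For example, bin_map:map(fun(B) -> B + 1 end, <<1,2,3>>) returns
[2,3,4], and an empty binary gives [].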


> I've moved all the conversion functions into a separate 
> module, so I'd have one module per charset later. Still:
>   ...
>   cp037_to_iso_8859_1 (129) -> 97; % a
>   cp037_to_iso_8859_1 (130) -> 98; % b
>   cp037_to_iso_8859_1 (131) -> 99; % c
>   ...
> 
> seems sort of inefficient, especially considering the
> size of the source to convert UCS :)

The matching of function clause patterns is actually
very efficient. And I don't think you need to worry
about lines of code either. There are examples of 
generated parser modules with 3 functions spanning
more than 20 000 lines of code.

Writing the actual code by hand might be a bit boring,
but generating such code is quite easy:

Consider the following module:
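-module(gen_keyvals).

-export([mod/3]).

%% Returns abstract forms for a module Mod that exports Function/1,
%% with one clause per {Key, Value} pair in Vals. Keys must be atoms.
mod(Mod, Function, Vals) when is_atom(Mod), is_atom(Function) ->
    [{attribute, 1, module, Mod},
     {attribute, 1, export, [{Function, 1}]},
     gen_function(Function, Vals)].

%% Each pair becomes one clause: F(K) -> V.
gen_function(F, Vals) ->
    {function, 1, F, 1,
     [{clause, 1, [{atom, 1, K}], [],
       [erl_parse:abstract(V)]} ||
         {K,V} <- Vals]}.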

An example:

Eshell V5.5.3.1  (abort with ^G)
1> c(gen_keyvals).
{ok,gen_keyvals}
2> gen_keyvals:mod(m,foo,[{a,1},{b,2},{c,3}]).
[{attribute,1,module,m},
 {attribute,1,export,[{foo,1}]},
 {function,1,
           foo,
           1,
           [{clause,1,[{atom,1,a}],[],[{integer,0,1}]},
            {clause,1,[{atom,1,b}],[],[{integer,0,2}]},
            {clause,1,[{atom,1,c}],[],[{integer,0,3}]}]}]
3> compile:forms(v(2)).
{ok,m,
    <<70,79,82,49,0,0,1,184,66,69,65,77,65,116,111,109,0,0,0,51,0,0,0,8,1,109,
      ...>>}
4> code:load_binary(m,"m.beam",element(3,v(3))).
{module,m}
5> m:foo(a).
1
6> m:foo(c).
3
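
For a charset table the keys are integers rather than atoms. A
rough sketch of the same idea follows, assuming erl_parse:abstract/1
is used for the keys as well as the values (a departure from the
module above), and with compile and load folded into one call. The
names gen_table, cp037 and to_latin1 are made up for illustration.

```erlang
-module(gen_table).
-export([load/3]).

%% Build, compile and load a module Mod exporting Function/1,
%% with one clause per {Key, Value} pair in Vals.
%% Assumption: keys are literal terms (here integers), so
%% erl_parse:abstract/1 can turn them into clause patterns.
load(Mod, Function, Vals) when is_atom(Mod), is_atom(Function) ->
    Forms = [{attribute, 1, module, Mod},
             {attribute, 1, export, [{Function, 1}]},
             {function, 1, Function, 1,
              [{clause, 1, [erl_parse:abstract(K)], [],
                [erl_parse:abstract(V)]} ||
                  {K, V} <- Vals]}],
    {ok, Mod, Bin} = compile:forms(Forms),
    code:load_binary(Mod, atom_to_list(Mod) ++ ".beam", Bin).
```

After gen_table:load(cp037, to_latin1, [{129,97},{130,98},{131,99}]),
the call cp037:to_latin1(129) returns 97. A key that is not in the
table fails with function_clause, just like the hand-written version.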

If you want the generated code in a .erl file, this is 
also easily accomplished:

9> [erl_pp:form(F) || F <- v(2)].
[[[[[[45,"module"]],[[40,[["m"],41]]]],".\n"]],
 [[[["-export"],[[40,[[91,[[["foo",47,"1"]],93]],41]]]],".\n"]],
 [[[[[[["foo",[[40,["a",41]]]]]," ->"],["\n    ",["1",59]]],
    [10,[[[["foo",[[40,["b",41]]]]]," ->"],["\n    ",["2",59]]]],
    [10,[[[["foo",[[40,["c",41]]]]]," ->"],["\n    ",["3"]]]]],
   ".\n"]]]
10> io:format("~s~n", [v(9)]).               
-module(m).
-export([foo/1]).
foo(a) ->
    1;
foo(b) ->
    2;
foo(c) ->
    3.
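
Outside the shell, the same pretty-printing step can write the file
directly. This is only a sketch: the module name gen_source is
hypothetical, and it assumes the generated source contains nothing
but Latin-1 characters, so that the erl_pp output is valid iodata
for file:write_file/2.

```erlang
-module(gen_source).
-export([write/2]).

%% Pretty-print each abstract form and write the result to FileName.
%% Returns ok, or {error, Reason} from file:write_file/2.
write(FileName, Forms) ->
    Source = [erl_pp:form(F) || F <- Forms],
    file:write_file(FileName, Source).
```

For example, gen_source:write("m.erl", gen_keyvals:mod(m, foo, [{a,1}]))
should leave an m.erl that can be compiled with c(m) as usual.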


Perhaps a bit over the top as beginner's advice,
but you can always return to it later, if you
want. (:

BR,
Ulf W



