[erlang-questions] Parameterized module idioms

Thu Apr 22 01:03:44 CEST 2010

There's quite a lot to reply to, and I don't have the time right now.
I'll just make a few points.

> Here there is a recompilation and selection of module. But if we  
> want to compute at startup time some value to serve as "parameter" ,  
> that value will have to be stored somewhere in a globally accessible  
> data structure like ets, to be consulted by my_config:dumplimit/1.  
> That can sometimes be slow.

If we want to compute at startup time some value to serve as a
"parameter", that value will have to be stored somewhere,
AND THAT SOMEWHERE CAN BE A MODULE.

There is no law that says a module can't be written by a program.

Here's an actual transcript.

Eshell V5.7.2  (abort with ^G)
1> c(reload).
{ok,reload}
2> reload:put(27).
{module,reloadable}
3> reload:get().
27
4> reload:put(42).
{module,reloadable}
5> reload:get().
42
6> reload:put(137).
{module,reloadable}
7> reload:get().
137
8> reload:put([<<1>>,<<2,3>>,<<4,5,6>>]).
{module,reloadable}
9> reload:get().
[<<1>>,<<2,3>>,<<4,5,6>>]

Now here's the proof-of-concept implementation.
Using compile:forms(..., [binary,...]) we can get the
same effect without touching the file system, and in
a "real" implementation that's what I'd do.

-module(reload).
-export([put/1,get/0]).

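put(Datum) ->
     %% Write a one-function module whose body is the datum itself.
     {ok,Device} = file:open("reloadable.erl", [write]),
     io:format(Device,
         "-module(reloadable).~n-export([datum/0]).~ndatum() ->~n    ~p .~n",
         [Datum]),
     ok = file:close(Device),
     %% Compile it, drop any old version, and load the new one.
     {ok,reloadable} = compile:file("reloadable", []),
     code:purge(reloadable),
     code:load_file(reloadable).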

get() ->
     reloadable:datum().

And yes, I do realise that what we have here is a global mutable
variable with an amazingly slow assignment statement,
but that's precisely what a changeable configuration parameter IS.

The overhead of setting the parameter up (or changing it) is
moderately high (although using compile:forms(..., [binary,...])
would be more efficient as well as safer).  But the overhead of
*using* the parameter is the same as the overhead of any
cross-module function call.  And if that weren't tolerable, we
wouldn't be using Erlang.
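
For concreteness, here is a minimal sketch of the compile:forms
variant mentioned above. It is not the code that produced the
transcript. The module name reload_forms is made up for illustration,
and it assumes the datum is a term erl_parse:abstract/1 can represent
(numbers, atoms, lists, tuples, binaries; not pids or funs).

```erlang
-module(reload_forms).
-export([put/1, get/0]).

%% Build, in memory, the abstract forms for
%%   -module(reloadable).
%%   -export([datum/0]).
%%   datum() -> Datum.
%% then compile them to a binary and load it. No file is written.
put(Datum) ->
    Forms = [{attribute, 1, module, reloadable},
             {attribute, 2, export, [{datum, 0}]},
             {function, 3, datum, 0,
              [{clause, 3, [], [], [erl_parse:abstract(Datum)]}]}],
    {ok, reloadable, Binary} = compile:forms(Forms, [binary]),
    %% As in the file-based version, purge any old code first.
    code:purge(reloadable),
    %% The file name is only informational; nothing is read from disk.
    code:load_binary(reloadable, "reloadable.erl", Binary).

get() ->
    reloadable:datum().
```

It is used exactly like reload: reload_forms:put(27) followed by
reload_forms:get().  Note that code:purge/1 kills any process still
running the old version; code:soft_purge/1 is the gentler choice if
that matters.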

And of course this can be generalised in a whole lot of ways.
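
One such generalisation, sketched under assumptions: generate one
function clause per parameter, so a single generated module holds a
whole configuration.  The names reload_many, reloadable_many, put_all/1
and lookup/1 are hypothetical, the list of pairs is assumed to be
non-empty, and the keys are assumed to be simple terms such as atoms.

```erlang
-module(reload_many).
-export([put_all/1, lookup/1]).

%% put_all([{Key, Value}, ...]) generates and loads
%%   -module(reloadable_many).
%%   -export([lookup/1]).
%%   lookup(Key1) -> Value1;
%%   lookup(Key2) -> Value2; ...
put_all(Pairs) ->
    Clauses = [{clause, 3, [erl_parse:abstract(K)], [],
                [erl_parse:abstract(V)]}
               || {K, V} <- Pairs],
    Forms = [{attribute, 1, module, reloadable_many},
             {attribute, 2, export, [{lookup, 1}]},
             {function, 3, lookup, 1, Clauses}],
    {ok, reloadable_many, Binary} = compile:forms(Forms, [binary]),
    code:purge(reloadable_many),
    code:load_binary(reloadable_many, "reloadable_many.erl", Binary).

%% Reading a parameter is a plain cross-module call with clause
%% selection on the key; an unknown key fails with function_clause.
lookup(Key) ->
    reloadable_many:lookup(Key).
```

For example, after reload_many:put_all([{dumplimit,100},{port,8080}]),
reload_many:lookup(dumplimit) returns 100.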

>
> Now I remember that one of the "permissible" uses for the process  
> dictionary is to store "parameters" written once but never changed  
> later. This kind of use ties code with the process structure and the  
> use of get/1 can be slow.

Slow?  Time for some numbers.

6> getput:k(100000000).
[{variant,constant},{result,100000000},{time,530}]
7> getput:b(100000000).
[{variant,direct},{result,100000000},{time,1730}]
8> getput:t(100000000).
[{variant,dictionary},{result,100000000},{time,2290}]

Here's the code that produced those:

-module(getput).
-export([t/1, b/1, k/1]).

t(N) ->
     put(key, 1),
     {T0,_} = statistics(runtime),
     R = loop(N, 0),
     {T1,_} = statistics(runtime),
     [{variant,dictionary},{result,R}, {time,T1-T0}].

loop(0, R) -> R;
loop(N, R) -> loop(N-1, R+get(key)).

b(N) ->
     {T0,_} = statistics(runtime),
     R = loup(N, 0),
     {T1,_} = statistics(runtime),
     [{variant,direct},{result,R}, {time,T1-T0}].

loup(0, R) -> R;
loup(N, R) -> loup(N-1, R+(N div N)).

k(N) ->
     {T0,_} = statistics(runtime),
     R = lowp(N, 0),
     {T1,_} = statistics(runtime),
     [{variant,constant},{result,R}, {time,T1-T0}].

lowp(0, R) -> R;
lowp(N, R) -> lowp(N-1, R+1).

So, with the times in milliseconds for 10^8 iterations,
  - the loop that just  adds 1 takes  5.3 ns per iteration
  - the loop that adds N div N takes 17.3 ns per iteration
  - the loop that uses get/1   takes 22.9 ns per iteration
Subtracting the 5.3 ns of loop overhead, we conclude that
  - N div N  takes 12 ns
  - get(key) takes 17.6 ns
and therefore
  EITHER my benchmark and interpretation are hopelessly fouled up
  OR using get/1 is NOT particularly slow.

To be honest, I incline to the former; how can looking something up
in a hash table be so good compared with an integer division?

While it may be true that get/1 _can_ be slow (I'd need to see the
numbers), you should never just _assume_ that get/1 is slow for
the use you intend to make of it.
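
To put the module-as-parameter approach on the same footing, one could
add a variant in the same style that fetches the value through a
fully qualified call.  This is only a sketch: the module name
getput_ext and the functions x/1 and one/0 are made up, and it assumes
that a ?MODULE:one() call costs about the same as a call into another
module, since both go through the export table.  No timings are given
because I have not measured it.

```erlang
-module(getput_ext).
-export([x/1, one/0]).

%% Stands in for a generated reloadable:datum/0 that returns 1.
one() -> 1.

x(N) ->
    {T0,_} = statistics(runtime),
    R = loop(N, 0),
    {T1,_} = statistics(runtime),
    [{variant,external},{result,R},{time,T1-T0}].

%% The fully qualified call forces an external call on each iteration.
loop(0, R) -> R;
loop(N, R) -> loop(N-1, R + ?MODULE:one()).
```

Run it as getput_ext:x(100000000) and compare the time with the
constant and dictionary variants above.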
>
> A quite common example would be several instances of a web server  
> together listening in different ports.

It's not clear why the port should be part of a web server's context
rather than part of its state, or why these instances need to be all
together in a single Erlang node (because if they aren't, module
parameters offer us no convenience), or why if they are all in a single
node "they" shouldn't be "it", a single system listening on several
ports and doing load balancing of some sort.
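
To illustrate the "state" alternative, here is a minimal sketch of a
server process that keeps its port in its own state, so several
instances can coexist in one node without any module parameter.  The
module name port_state and its API are hypothetical, it uses maps (so
it needs a much later Erlang than the one in the transcript), and it
only opens the listening socket; accepting connections is left out.

```erlang
-module(port_state).
-behaviour(gen_server).
-export([start_link/1, port/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start one instance listening on Port (0 lets the OS pick a port).
start_link(Port) ->
    gen_server:start_link(?MODULE, Port, []).

%% Ask an instance which port it is listening on.
port(Pid) ->
    gen_server:call(Pid, port).

init(Port) ->
    {ok, Listen} = gen_tcp:listen(Port, [binary, {active, false}]),
    {ok, Actual} = inet:port(Listen),
    %% The port is ordinary per-process state.
    {ok, #{port => Actual, listen => Listen}}.

handle_call(port, _From, State = #{port := Port}) ->
    {reply, Port, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Two instances are then just port_state:start_link(8080) and
port_state:start_link(8081), each answering port_state:port(Pid) with
its own value.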
>