[erlang-questions] Parameterized module idioms

Wed Apr 21 04:01:24 CEST 2010

On Apr 20, 2010, at 11:25 PM, Paulo Sérgio Almeida wrote:

> Closures must be passed somehow, if they are to be used in a  
> function. If I am writing many functions which use some nonrelated  
> parameters, some closures are being passed around. That is already  
> too much pollution for my tastes.

If they really *are* nonrelated, they should be separate parameters
anyway.  If they are related, they can be in one data structure (which
might be a closure).  "Pollution" is not a property of closures, but of
a coding style.

>
>>> Trivial, but requires a lot of boilerplate code and certainly isn't
>>> any easier to understand or debug than parameterized modules. It also
>>> becomes impossible to write a useful type spec if you use closures
>>> like that.
>> (a) Such boilerplate code as is required can be automatically generated.
>
> If some code would need to be generated if I didn't have  
> parameterized modules, then parameterized modules already give me  
> something.

What?  Automatic code generation isn't anything you need to be aware of,
let alone involved with.
>
> Let's look at some examples (from a homemade "database" I wrote some  
> time ago).
>
> The more simple use may be for substituting constants with something  
> calculated at runtime; e.g.
>
> Instead of
>
>  need_dump(Tab, LogOps) -> LogOps > ?DUMPLIMIT * ets:info(Tab, size).
>
> I can have
>
>  need_dump(Tab, LogOps) -> LogOps > DumpLimit * ets:info(Tab, size).
>
> with no change in the interface of the function.

But this is the Functional Programming Lesson:
   there *is* a change in the interface of the function!

In the first case, the function uses nothing but its arguments.
In the second case, the function has an extra parameter (DumpLimit).
The function is _really_

     need_dump(%Hidden%, Tab, LogOps) ->
	LogOps > %Hidden%#%hidden%.DumpLimit * ets:info(Tab, size).

How it's _compiled_ is a separate issue; I'm arguing about what it  
_means_.
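
To make the point concrete, here is a minimal sketch of the same
function with the hidden parameter written out.  The module name, the
#params{} record and the demo values are hypothetical, chosen only for
illustration; this is not how the compiler represents parameterized
modules.

```erlang
-module(need_dump_explicit).
-export([need_dump/3, demo/0]).

%% Hypothetical record standing in for the hidden module parameters.
-record(params, {dump_limit}).

%% The extra argument that the parameterized version keeps out of sight.
need_dump(#params{dump_limit = DumpLimit}, Tab, LogOps) ->
    LogOps > DumpLimit * ets:info(Tab, size).

%% Small self-check with made-up values: 2 rows and a limit of 4,
%% so the threshold is 8.
demo() ->
    Tab = ets:new(need_dump_explicit_tab, [set]),
    true = ets:insert(Tab, [{a, 1}, {b, 2}]),
    Params = #params{dump_limit = 4},
    Result = {need_dump(Params, Tab, 9), need_dump(Params, Tab, 8)},
    true = ets:delete(Tab),
    Result.
```

Calling need_dump_explicit:demo() exercises the function just above
and just at the threshold.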
>
> But it becomes much more interesting when modules themselves are  
> used as parameters of other modules.
>
> The first version had a constant directory path defined and used by  
> a module pets_lib that served as a sort of general library. Then  
> other modules used pets_lib.
>
> For example, a module pets_tm would have somewhere:
>
>  Res = pets_lib:delete_table(Tab),
>
> How do I make the path (as well as many other configuration  
> parameters) a value chosen at runtime, with little effort in  
> rewriting the code which didn't contemplate such possibility  
> beforehand?

The problem is that you can't.
Yes, you *can* replace pets_lib: by Pets_Lib:, but now
  - either you have to pass Pets_Lib around all over the place,
    which doesn't count as "little effort", or
  - you have to pass Pets_Lib as a parameter to the module containing
    this call, which transitively affects its callers as well, ...
  - and you had better first take care to rename any existing
    occurrences of "Pets_Lib" to something else.

Perhaps we can name this a "module parameter cascade".
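
A rough sketch of the first bullet, assuming a hypothetical three-level
call chain (remove/2, cleanup/2 and shutdown/2 are invented names, and
delete_table/1 here is only a stand-in for pets_lib:delete_table/1).
Only the innermost function uses the module variable, yet every caller
above it has to carry it.

```erlang
-module(cascade_demo).
-export([delete_table/1, remove/2, cleanup/2, shutdown/2]).

%% Stand-in for pets_lib:delete_table/1, so the sketch is self-contained.
delete_table(Tab) -> {deleted, Tab}.

%% The only function that actually needs the module variable.
remove(Pets_Lib, Tab) -> Pets_Lib:delete_table(Tab).

%% These two merely pass Pets_Lib along to their callee.
cleanup(Pets_Lib, Tabs) -> [remove(Pets_Lib, T) || T <- Tabs].

shutdown(Pets_Lib, Tabs) -> {stopped, cleanup(Pets_Lib, Tabs)}.
```

For example, cascade_demo:shutdown(cascade_demo, [a, b]) passes the
module name through all three levels before it is finally used.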

> Making pets_lib a parameterized module. What is the impact of that  
> on client code? A simple change to:
>
>  Res = Lib:delete_table(Tab),
>
> It looks pretty much the same, but now we have this Lib variable.

Looks, as they say, can be deceiving.

> If we were using closures, the closure would have to be passed  
> somehow (who knows how many levels of invocations) until it was  
> available to the function which performs this invocation.

(a) People seriously underestimate what closures can do.
(b) This is not an argument for modules with parameters,
     it is an argument for nested functions.
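
As a small illustration of (a), a closure can capture the limit once
and still present the original two-argument shape to its users.  This
is only a sketch; make_need_dump/1 and the demo values are hypothetical
names and numbers, not part of the code under discussion.

```erlang
-module(closure_demo).
-export([make_need_dump/1, demo/0]).

%% Capture DumpLimit once; the returned fun takes (Tab, LogOps)
%% exactly as the original need_dump/2 did.
make_need_dump(DumpLimit) ->
    fun(Tab, LogOps) -> LogOps > DumpLimit * ets:info(Tab, size) end.

%% One row and a limit of 10, so the threshold is 10.
demo() ->
    Tab = ets:new(closure_demo_tab, [set]),
    true = ets:insert(Tab, {k, v}),
    NeedDump = make_need_dump(10),
    Result = [NeedDump(Tab, 11), NeedDump(Tab, 10)],
    true = ets:delete(Tab),
    Result.
```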

> But if the pets_lib instantiation is a parameter of pets_tm, then I  
> can use statements like the above all over pets_tm by doing a simple:
>
> :%s/pets_lib/Lib/g

You had better pray desperately to whatever god(s) you recognise
that there are no other occurrences of Lib, and while you're at
it, beg forgiveness for breaking the name link between the module
pets_lib and the module instance variable Lib (which would be
better as Pets_Lib, so that the only difference between the module
and the instance variable is capitalisation).
>
> and making pets_tm parameterized; e.g.
>
> -module(pets_tm, [Lib, DumpLimit]).

Can anyone explain to me why the syntax for this is not

	-module(pets_tm(Lib, DumpLimit))

?  Not a reason of the form "this was the quickest hack to the parser",
but a reason why this is the _right_ syntax?
>
> Then we only need to glue modules together at service starting time;  
> e.g.

That is, you are making a change to a module which requires
  - *remote* compensatory changes
  - to possibly *many* service startups
  - which previously never mentioned the module in question at all.

And before you reply, no, I am *not* saying that using closures will
fix these issues.  But using closures *will* make you try hard to
think of ways of addressing the problem-space issues that don't have
these solution-space consequences in the first place.

One thing that bothers me is that we used to have one kind of hot
loading.  Now we have at least three.

(a) classic replacement of a plain module with another.
(b) replacement of a parameterised module, with consequences on
     existing instances
(c) replacement of an instance by another instance with different
     parameter values.

How do you do (c)?

The thing is that adding a parameter to a plain module
converts all cases of (a) involving that module to cases of
(b) or (c).  In particular, replacing one ?DUMP_LIMIT by another
turns an (a) -- handled by existing tools -- into a (c).

I understand ML-style signatures, structures, and functors well
enough to use them with some confidence.  (Although I was _always_
surprised when a new problem with the framework was pointed out.)
But they are *static*.

I understand object orientation tolerably well.  I can read and write
C++ as long as it doesn't use too many templates.  I'm fluent in
Smalltalk, and I used to do a fair bit of Eiffel.  So I understand
constructing webs of objects and replacing one of them.

Erlang modules with parameters are neither one thing nor the other.
They are not pure static things like ML functors, because the
underlying modules can be hot-reloaded.  They are not quite OO
because there doesn't seem to be any simple way to do (c).

Reverting to the pets_lib example,

- changing from a compile-time limit to a startup-configurable
   limit is not a trivial DESIGN change, so a technique that makes
   it look like a trivial CODE change makes me suspicious and anxious

- it can obviously be done by querying a configuration file at run
   time (the way C programs call sysconf() instead of using
   templates or macros).

Querying a configuration at run time is fully compatible with all
the old Erlang tools.  Indeed, from the point of view of system
maintenance, being able to trace the catalogue lookups in someone
else's large body of code and find out just who needs what when
can be a help.
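
One possible shape for such a run-time query, sketched with the
standard application environment (application:get_env/2 and
application:set_env/3).  The application name pets, the key dump_limit
and the fallback value are assumptions made for this example only.

```erlang
-module(pets_config_demo).
-export([dump_limit/0, need_dump/2, demo/0]).

%% Hypothetical application name; not taken from the original code.
-define(APP, pets).
%% Arbitrary illustrative default, used when nothing is configured.
-define(FALLBACK_LIMIT, 4).

%% Look the limit up every time it is needed, so it can change at run time.
dump_limit() ->
    case application:get_env(?APP, dump_limit) of
        {ok, Limit} -> Limit;
        undefined   -> ?FALLBACK_LIMIT
    end.

%% Same interface as the original need_dump/2.
need_dump(Tab, LogOps) ->
    LogOps > dump_limit() * ets:info(Tab, size).

%% Three rows; reconfigure the limit between two identical calls.
demo() ->
    Tab = ets:new(pets_config_demo_tab, [set]),
    true = ets:insert(Tab, [{a, 1}, {b, 2}, {c, 3}]),
    ok = application:set_env(?APP, dump_limit, 2),
    Before = need_dump(Tab, 7),      % threshold is 2 * 3 = 6
    ok = application:set_env(?APP, dump_limit, 5),
    After = need_dump(Tab, 7),       % threshold is 5 * 3 = 15
    ok = application:unset_env(?APP, dump_limit),
    true = ets:delete(Tab),
    {Before, After}.
```

Because every lookup goes through dump_limit/0, tracing that one
function shows who asks for the value and when.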

The catalogue can be a module.
Let's take the data base example.

> need_dump(Tab, LogOps) -> LogOps > ?DUMPLIMIT * ets:info(Tab, size).

We want to make DUMP_LIMIT something that can be configured at
run time.  But that's easy!

     need_dump(Tab, LogOps) ->
         LogOps > my_config:dump_limit(Tab) * ets:info(Tab, size).

where

     -module(my_config).
     -export([dump_limit/1]).

     %% ?DUMP_LIMIT must be defined in this module (-define or an include).
     dump_limit(_Tab) -> ?DUMP_LIMIT.

To change the configured value, load a new version of my_config.
To select a value at system startup, select which version of the
configuration module to load.

The use-case for modules with parameters (if there is one) is
where there are grounds for believing that there may need to
be multiple distinct instances of the same module at the same
time AND where the module parameter cascade is tolerable.