[erlang-questions] Parameterized module idioms
Paulo Sergio Almeida
Wed Apr 21 17:31:51 CEST 2010
Richard O'Keefe wrote:
> If they really *are* nonrelated, they should be separate parameters anyway.
> If they are related, they can be in one data structure (which might be
> a closure). "Pollution" is not a property of closures, but of a coding
The pollution I mean is the passing around along the invocation chain
from the "root" function up to the "leaf" functions, of several possibly
>> If some code would need to be generated if I didn't have parameterized
>> modules, then parameterized modules already give me something.
> What? Automatic code generation isn't anything you need to be aware of,
> let alone involved with.
Ok. I can buy that. It is some extra infrastructure but not my problem.
>> Instead of
>> need_dump(Tab, LogOps) -> LogOps > ?DUMPLIMIT * ets:info(Tab, size).
>> I can have
>> need_dump(Tab, LogOps) -> LogOps > DumpLimit * ets:info(Tab, size).
>> with no change in the interface of the function.
> But this is the Functional Programming Lesson:
> there *is* a change in the interface of the function!
The interface is what matters to client code, not internals. The
interface has not changed:
Before: I supply an atom and an integer and get a boolean.
After: I supply an atom and an integer and get a boolean.
The client code that performs the invocation of the function does not
see a change in the interface and does not need to be changed. More
precisely: the invocations in the same module (intra-module calls) do
not need to be changed. Invocations in other modules will be changed in
that now the module name in the call is a variable; i.e. from m:f(P) to
> In the first case, the function uses nothing but its arguments.
> In the second case, the function has an extra parameter (DumpLimit).
> The function is _really_
> need_dump(%Hidden%, Tab, LogOps) ->
> LogOps > %Hidden%#%hidden%.DumpLimit * ets:info(Tab, size).
> How it's _compiled_ is a separate issue; I'm arguing about what it _means_.
But it almost seems like you are talking about how it is compiled ;)
What the function _uses_ is irrelevant to its interface. It is as if
saying that the values captured by a closure are part of its interface,
because the closure definition uses them.
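
To make the closure comparison concrete, here is a minimal sketch. The module name dump_demo, the helper make_need_dump/1 and the table name demo_tab are made up for illustration; they are not part of my library. The fun that callers receive still takes a table name and an integer and returns a boolean; DumpLimit is captured, not passed.

```erlang
-module(dump_demo).
-export([make_need_dump/1, demo/0]).

%% Hypothetical helper: returns a fun of arity 2.
%% DumpLimit is captured by the closure; callers never pass it.
make_need_dump(DumpLimit) ->
    fun(Tab, LogOps) -> LogOps > DumpLimit * ets:info(Tab, size) end.

demo() ->
    %% A named table, so Tab is an atom as in the discussion above.
    Tab = ets:new(demo_tab, [named_table]),
    true = ets:insert(Tab, [{a, 1}, {b, 2}]),
    NeedDump = make_need_dump(4),
    %% The table holds 2 objects, so the threshold is 4 * 2 = 8.
    Result = {NeedDump(Tab, 9), NeedDump(Tab, 8)},
    true = ets:delete(Tab),
    Result.
```

Calling dump_demo:demo() should give {true, false}: the client supplies the same two arguments whatever limit was captured.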
That now an extra parameter is used is just an implementation choice,
not a fundamental thing in the concept. One could have another
implementation (not that I am saying it would be a good idea) where the
module source is compiled at runtime with the parameters substituted and
then loaded, resulting in a module like
which has the same functions, with the exact same interface, and no
hidden extra parameter. And when a client does
what _really_ happens could be
which would not be much different than doing now:
M = lists,
with no hidden extra parameters being passed. Such an implementation
would have very different consequences, both disadvantages like runtime
instantiation costs and possible errors, and advantages like possible
optimizations using runtime knowledge, and I am not saying it would be
realistic or in the spirit of what we expect. I am just saying that the
interface remains the same, but now the function belongs to a module
that is only computed at runtime and that for intra-module calls even
that is irrelevant.
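
As an illustration only (this is not the snippet from the original message, and the module name modvar_demo is made up), calling through a variable bound to an ordinary module name already works in plain Erlang, and nothing extra is passed to the function:

```erlang
-module(modvar_demo).
-export([demo/0]).

demo() ->
    %% The module name is an ordinary atom held in a variable.
    M = lists,
    Direct = lists:reverse([1, 2, 3]),
    %% Same function, same arguments; only the module is chosen at runtime.
    ViaVar = M:reverse([1, 2, 3]),
    Direct =:= ViaVar.
```

Here modvar_demo:demo() returns true, since both calls reach lists:reverse/1 with identical arguments.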
>> For example, a module pets_tm would have somewhere:
>> Res = pets_lib:delete_table(Tab),
>> How do I make the path (as well as many other configuration
>> parameters) a value chosen at runtime, with little effort in rewriting
>> the code which didn't contemplate such possibility beforehand?
> The problem is that you can't.
Yes I can ;)
I actually could.
> Yes, you *can* replace pets_lib: by Pets_Lib:, but now
> - either you have to pass Pets_Lib around all over the place,
> which doesn't count as "little effort", or
> - you have to pass Pets_Lib as a parameter to the module containing
Exactly: if it is a parameter in the client module, then it is little
effort. It really WAS little effort in rewriting my library.
> this call, which transitively affects its callers as well, ...
This is an interesting point; I don't know whether it has been much
discussed here. I have noticed that, to allow such little-effort
changes, client modules end up having parameters themselves. This is
a sort of "viral" phenomenon, which we want to contain.
This viral aspect made me think that when building an abstraction, the
module(s) that are exposed to the outside world should NOT be
parameterized, while the modules used internally can be parameterized if
it helps productivity in writing code.
This is what I ended up with, looking at each module definition:
pets_gc.erl:-module(pets_gc, [Lib, MaxTids, CollectRatio, PurgeRatio]).
pets_loader.erl:-module(pets_loader, [Lib, MaxReaders, MaxInserters]).
pets_tm.erl:-module(pets_tm, [Lib, DumpLimit]).
pets_writer.erl:-module(pets_writer, [Lib, SyncDelay]).
The only module that clients use, "pets", is not parameterized. The
modules used internally are either parameterized:
pets_gc, pets_lib, pets_loader, pets_tm, pets_writer
or not parameterized:
> - and you had better first take care to rename any existing occurrences
> of "Pets_Lib" to something else
Of course. But it is easy: invent a nice name, then check that it
doesn't occur in any module.
> Perhaps we can name this a "module parameter cascade".
>> Making pets_lib a parameterized module. What is the impact of that on
>> client code? A simple change to:
>> Res = Lib:delete_table(Tab),
>> It looks pretty much the same, but now we have this Lib variable.
> Looks, as they say, can be deceiving.
>> If we were using closures, the closure would have to be passed somehow
>> (who knows how many levels of invocations) until it was available to
>> the function which performs this invocation.
> (a) People seriously underestimate what closures can do.
> (b) This is not an argument for modules with parameters,
> it is an argument for nested functions.
Not sure what you mean. Using closures will have a greater impact on
client code, and also on the implementation code that I am trying to reuse.
>> But if the pets_lib instantiation is a parameter of pets_tm, then I
>> can use statements like the above all over pets_tm by doing a simple:
> You had better pray desperately to whatever god(s) you recognise
> that there are no other occurrences of Lib, and while you're at
easy: grep Lib *erl and look at the result.
> it, beg forgiveness for breaking the name link between the module
> pets_lib and the module instance variable Lib (which would be
> better as Pets_Lib, so that the only difference between the module
> and the instance variable is capitalisation).
This is a curious point. I am aware that I broke the connection. It was
intentional. Before, it was pets_lib only because the application is
called pets. But I see it as my general library for this application. I
would prefer not to worry about what the application is going to be
called, or to have to change "pets_lib" to "amnesia_lib" all over if I
decide to rename it. Prefixing module names this way is the normal
convention in Erlang, where one has to worry about global module
namespace pollution. Another advantage of using variables for modules,
like "Lib" above, is having to worry less about that kind of pollution,
and about what the application is going to be called. ;)
>> Then we only need to glue modules together at service starting time; e.g.
> That is, you are making a change to a module which requires
> - *remote* compensatory changes
> - to possibly *many* service startups
> - which previously never mentioned the module in question at all.
I don't understand what you mean. But all this gluing is done in the
"main" for the internal modules of the application. As I exemplified, it
was easy to do. The result is also easy to reason about. After
instantiation, everything will behave as if the values passed to the
glued modules had been -define'ed constants; no strange side-effects;
referential transparency; functional style using "POF"s: plain old
functions. (Does this term exist? ;))
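
A rough sketch of this kind of gluing, written with closures instead of parameterized modules so that it also runs where that experimental feature is not available. Everything here is hypothetical: glue_demo, new_lib/1, new_tm/2, the map keys, the path and the numbers are stand-ins, not my actual pets code.

```erlang
-module(glue_demo).
-export([main/0]).

%% Hypothetical stand-in for an instantiated library:
%% a map of funs that all close over the same Path.
new_lib(Path) ->
    #{table_file =>
          fun(Tab) -> filename:join(Path, atom_to_list(Tab) ++ ".tab") end}.

%% Hypothetical stand-in for an instantiated transaction manager,
%% "parameterized" by a library instance and a dump limit.
new_tm(Lib, DumpLimit) ->
    #{need_dump => fun(Size, LogOps) -> LogOps > DumpLimit * Size end,
      dump_file => fun(Tab) -> (maps:get(table_file, Lib))(Tab) end}.

%% All gluing happens once, here; afterwards the values behave
%% like constants as far as the glued functions are concerned.
main() ->
    Lib = new_lib("/tmp/pets"),
    Tm = new_tm(Lib, 4),
    NeedDump = maps:get(need_dump, Tm),
    DumpFile = maps:get(dump_file, Tm),
    {NeedDump(2, 9), DumpFile(accounts)}.
```

After main/0 has built Lib and Tm, nothing in the glued funs can observe a change to the path or the limit, which is the "as if -define'ed" behaviour described above.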
> Let's take the data base example.
>> need_dump(Tab, LogOps) -> LogOps > ?DUMPLIMIT * ets:info(Tab, size).
> We want to make DUMP_LIMIT something that can be configured at
> run time. But that's easy!
> need_dump(Tab, LogOps) ->
> LogOps > my_config:dump_limit(Tab) * ets:info(Tab, size).
> dump_limit(_Tab) -> ?DUMP_LIMIT.
> To change the configured value, load a new version of my_config.
> To select a value at system startup, select which version of the
> configuration module to load.
Here there is a recompilation and a selection of module. But if we want to
compute at startup time some value to serve as a "parameter", that value
will have to be stored somewhere in a globally accessible data structure
like ets, to be consulted by my_config:dump_limit/1. That can sometimes
Now I remember that one of the "permissible" uses for the process
dictionary is to store "parameters" written once but never changed
later. This kind of use ties code with the process structure and the use
of get/1 can be slow. Parameterized modules can make some of these uses
> The use-case for modules with parameters (if there is one) is
> where there are grounds for believing that there may need to
> be multiple distinct instances of the same module at the same
> time AND where the module parameter cascade is tolerable.
A quite common example would be several instances of a web server
running together, listening on different ports. But in this case the
module(s) exposed to the client would be parameterized, contrary to my
"guideline" above. The client would then possibly store the instances in
some data structure, to avoid being parameterized itself and so contain
the "viral" impact of parameterized modules.