Dependency injection in Erlang (Digression from: Longstanding issues: structs & standalone Erlang)

Fri Feb 24 07:57:22 CET 2006

Mikael Karlsson wrote:
[...]
> I am still
> missing a component binding controller. Coming from objectweb,
> have you thought anything about a (a la Objectweb) Fractal
> binding (and attribute) controller for Erlang gen_servers?

I was one of the initial designers of Fractal, so I am quite 
fluent in Fractal concepts ;-), and like you I am missing some 
of Fractal features in Erlang. I even started implementing a 
generic Fractal membrane in Erlang. 

A problem with OTP is that it takes a mostly static viewpoint: it 
is mainly about code (modules), and the runtime architecture 
is static (the hierarchy of supervisors is generally static, 
processes have global names set statically, client processes 
depend statically on such global names, etc.).
On the other hand, Fractal has a purely dynamic viewpoint, in the 
spirit of the OSI RM-ODP standard (Reference Model of Open 
Distributed Processing). On some points Fractal is more advanced 
than RM-ODP (separation between membrane and content in every 
object / component), and on some points it is more restricted 
(the only allowed interactions are interrogations, not signals or 
announcements or flows), but it maps to RM-ODP quite well.

The strong point of Fractal/RM-ODP is that the architecture 
is a dynamic concern: components (objects, in RM-ODP) are bound 
together through (possibly distributed) bindings, which are a kind 
of channel allowing interactions. Think of them as the channels 
and routes in SDL for instance, but dynamically reconfigurable. 
Fractal/RM-ODP also allows composite objects, i.e. objects that 
contain objects. In Fractal, these form hierarchies of control 
domains (composite components do control their content).
Those concepts can be found almost as-is in Erlang/OTP: Erlang 
processes are like RM-ODP objects, hierarchies of supervisors and 
processes are similar to hierarchies of composite components in 
Fractal (hierarchy of control domains)...
Erlang/OTP even offers runtime reflective access to the hierarchy 
of supervisors.

But one feature is missing: Dependency Injection.
In Erlang, bindings are implicit. That is, there are bindings (in 
the engineering viewpoint: when first sending a message to a 
remote node, a TCP connection is implicitly opened, etc.), 
but a process does not need to create and destroy bindings 
explicitly before and after sending a message.
The problem, since bindings are not explicit, is that it becomes 
impossible to manage the architecture at runtime: it is 
impossible to know which process would send messages to which 
process. Therefore, if I want to replace a process with another, 
there is no general way to replace its name (pid) in the state 
of the processes that depended on it. Hence, I can't replace the 
process.

In Erlang/OTP, a limited solution is the use of naming domains 
(node-local and global). But this is very limited. It makes it 
possible to replace a named process by another process with the 
same name: if client processes access that process through the 
naming domain, and not directly using its pid, then the 
replacement is possible and transparent. However, if for 
instance two processes 'client1' and 'client2' send messages to 
the same process 'server' through its name (e.g. 'server' is a 
name registered in the global naming domain), it is not possible 
at runtime to reconfigure 'client1' and 'client2' to make them 
access two different server processes. The problem is that 
usually in Erlang/OTP (in all the Erlang programs I have seen), 
the names of server processes are statically specified in the code 
of client modules, and there is no way to modify them at 
runtime.
And it is even difficult to modify them statically, because it 
generally requires modifying the code. As a consequence, this 
even makes it difficult to reuse code (modules): a module that 
sends messages to a process named 'myniceserver' cannot easily 
be reused without the module that starts and implements the 
process named 'myniceserver'.

That's where Dependency Injection is necessary.
Cf. Fowler's excellent article: 
http://www.martinfowler.com/articles/injection.html
The purpose of Fractal's binding controllers is essentially to 
offer dependency injection, but they can do more (they can 
create distributed bindings, etc.).
In Erlang, we already have bindings, although implicit, so I 
believe that we need only to add a dependency injection 
mechanism, as an OTP design principle.

Dependency injection would consist in having a naming domain 
local to every process, and letting the mapping between 
process-local names and real process names or pids be done from 
outside of the process implementation (hence, the architecture 
becomes a separate concern, implemented separately from 
functional code).

There are two ways to do that in Erlang. Let's take gen_server as 
an example.

1- constructor injection: simply pass the pids or names of all 
the processes messages will be sent to, in the Args parameter, 
e.g.:
gen_server:start_link(client1, [server1], []),
gen_server:start_link(client1, [server2], []),
gen_server:start_link(client2, [server1, server10, AnyOtherArg], 
[]),
...
In the modules, the naming domain could be implemented generically 
as a dictionary in the process state:

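-module(client2).
-behaviour(gen_server).

%% deps: dict mapping process-local names to pids or registered names
-record(state, {deps, anything}).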

init([FooServerName, BarServerName, Anything]) ->
    Deps = dict:new(),
    Deps2 = dict:store('foo', FooServerName, Deps),
    Deps3 = dict:store('bar', BarServerName, Deps2),
    {ok, #state{deps = Deps3, anything = Anything}}.

handle_cast(Request, State) ->
    Deps = State#state.deps,
    {ok, FooServerName} = dict:find('foo', Deps),
    %% e.g., forward the request:
    gen_server:cast(FooServerName, {hey, Request}),
    {noreply, State}.

Drawback: there is no way to modify dependencies after the 
process is started. This is solved with method 2-.

Alternatively, the process names could be stored directly in the 
State record in that case (one record field for every 
dependency), but it makes reflective access more difficult, cf. 
method 2- below...

2- getter/setter/interface injection: implement gen_server calls 
to get / set dependencies, i.e. to modify the process' local 
naming domain. E.g.:

-module(client2).
...
handle_call({getdep, Key}, _From, State) ->
    Deps = State#state.deps,
    {reply, dict:find(Key, Deps), State};
handle_call({setdep, Key, Pid}, _From, State) ->
    %% should check that Key is valid...
    Deps = State#state.deps,
    NewDeps = dict:store(Key, Pid, Deps),
    NewState = State#state{deps = NewDeps},
    {reply, ok, NewState}.

Of course, it is preferable to implement both approaches 
simultaneously. In addition, we could also add, as in Fractal, the 
distinction between optional and mandatory client interfaces, and 
the distinction between singleton and collection interfaces.
And maybe it would be more efficient to use the process 
dictionary directly (using get/1 and put/2)...??
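
As a minimal sketch of both approaches combined, the module below 
takes its initial dependencies through init/1 and lets them be 
read and changed later through getdep / setdep calls. The module 
name di_client, the API functions start_link/1, getdep/2, setdep/3 
and forward/2, and the convention of passing dependencies as a 
list of {LocalName, Target} pairs are all assumptions made for 
this illustration, not an existing OTP interface.

```erlang
-module(di_client).
-behaviour(gen_server).

%% Hypothetical API, for illustration only.
-export([start_link/1, getdep/2, setdep/3, forward/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% deps: dict mapping process-local names to pids or registered names.
-record(state, {deps}).

%% Constructor injection: Deps is a list of {LocalName, Target} pairs.
start_link(Deps) ->
    gen_server:start_link(?MODULE, Deps, []).

%% Getter/setter injection, usable at any time after start.
getdep(Client, Key) ->
    gen_server:call(Client, {getdep, Key}).

setdep(Client, Key, Target) ->
    gen_server:call(Client, {setdep, Key, Target}).

%% Ask the client to forward Request to whatever 'foo' is bound to.
forward(Client, Request) ->
    gen_server:cast(Client, {forward, Request}).

init(Deps) ->
    {ok, #state{deps = dict:from_list(Deps)}}.

handle_call({getdep, Key}, _From, State) ->
    %% Replies {ok, Target}, or error if Key is unbound.
    {reply, dict:find(Key, State#state.deps), State};
handle_call({setdep, Key, Target}, _From, State) ->
    NewDeps = dict:store(Key, Target, State#state.deps),
    {reply, ok, State#state{deps = NewDeps}}.

handle_cast({forward, Request}, State) ->
    %% The functional code only knows the local name 'foo'.
    case dict:find(foo, State#state.deps) of
        {ok, Target} -> gen_server:cast(Target, {hey, Request});
        error -> ok   % unbound dependency: drop the request
    end,
    {noreply, State}.

handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

With this, two clients started as di_client:start_link([{foo, 
Server1}]) can later be pointed at different servers with 
di_client:setdep(Client, foo, Server2), without touching the 
module's code.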

Attribute control should be done the same way: through init/1 
parameters, and through gen_server calls ({getattr, Attr} and 
{setattr, Attr, Val}). Although both concerns seem very similar 
that way, they must be separate (i.e. we must not mix binding 
and attribute control) because the callbacks have different 
semantics. For instance, when setting a dependency (setdep 
call), one would like to automatically link/1 the client and the 
server process.
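
The linking idea could be sketched as a small helper called from 
the setdep clause. The module name di_link and the function 
rebind/3 are hypothetical, and the sketch assumes that 
dependencies are pids rather than registered names:

```erlang
-module(di_link).
-export([rebind/3]).

%% Bind Key to NewPid in Deps and move the link from the old
%% target (if any) to the new one. Meant to be called from the
%% client's handle_call({setdep, Key, NewPid}, ...) clause, so that
%% link/1 and unlink/1 act on the client process itself.
%% Assumption: dependencies are pids, not registered names.
rebind(Key, NewPid, Deps) when is_pid(NewPid) ->
    %% Link to the new server first, so the client is never unlinked
    %% from both servers at once.
    link(NewPid),
    case dict:find(Key, Deps) of
        {ok, OldPid} when is_pid(OldPid), OldPid =/= NewPid ->
            unlink(OldPid);
        _ ->
            ok
    end,
    dict:store(Key, NewPid, Deps).
```

In the setdep clause, NewDeps = di_link:rebind(Key, Pid, Deps) 
would then replace the plain dict:store/3 call. Note that linking 
to a pid that is already dead delivers a noproc exit signal to the 
caller, so a client that does not trap exits would terminate.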

The dependency injection implementation above is the very 
minimum, but it allows many things already: transparent 
interposition, application-specific distributed bindings 
implemented in Erlang (e.g. one could implement a transparent 
proxy process between communicating processes, to do load 
balancing between several server processes, or to do group 
communication transparently...), etc.
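
As an illustration of such interposition, here is a minimal 
sketch of a load-balancing proxy. The module name rr_proxy and 
its start/1 function are hypothetical. The proxy forwards every 
message it receives, unchanged, to a fixed list of server pids in 
round-robin order; it does not monitor the servers or handle 
their failure.

```erlang
-module(rr_proxy).
-export([start/1]).

%% Start a proxy that forwards each incoming message, unchanged,
%% to the given (non-empty) list of server pids in turn.
start([_ | _] = Servers) ->
    spawn(fun() -> loop(Servers, []) end).

%% Pending: servers not yet used in this round.
%% Used: servers already used in this round, most recent first.
loop([], Used) ->
    %% Round finished: start again in the original order.
    loop(lists:reverse(Used), []);
loop([Next | Pending], Used) ->
    receive
        Msg ->
            Next ! Msg,
            loop(Pending, [Next | Used])
    end.
```

A client would be bound to the proxy instead of a server, e.g. 
with gen_server:call(Client, {setdep, foo, Proxy}). Since the 
proxy forwards messages untouched, the client's casts reach the 
servers as ordinary casts.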
Of course, if we want to implement generic membranes as in 
Fractal, we would have to add a lot of things around, but 
functional modules would not have to implement more than the DI 
callbacks shown above, just as in Fractal/Java.
My opinion is: KISS for developers, and be as Erlang- and 
OTP-compliant as possible.

For instance, I don't like ErlCOM, which imposes a lot of 
non-functional code in modules (although the concepts are the 
same as in Fractal, both being rooted in RM-ODP):
http://www.ist-runes.org/docs/publications/EUC05.pdf
It tries to translate implementation solutions that make sense in 
object-oriented, statically typed languages, into Erlang/OTP. 
But I think that it does not fit Erlang/OTP well.
Formally defining interface signatures statically, for example, 
is a good idea in statically typed languages such as 
Java (and I am a strong advocate of that), but makes little 
sense in a dynamic language such as Erlang, where it 
restricts flexibility for little gain.
Indeed, the set of messages that can be received or sent by 
an Erlang process at a given time may change during the 
process' lifetime. In RM-ODP terms, its set of server and client 
interfaces may change over time. Consider guards 
in receive statements, or in handle_cast callbacks:

test(State) ->
    receive
        {sayhello, Arg}
        when State#state.acceptsayhello == true ->
            ...
    end.

handle_cast({sayhello, Arg}, State)
    when State#state.acceptsayhello == true ->
    ...

This cannot be captured in ErlCOM or Fractal, which consider that 
the set of client and server interfaces, and their signatures, 
(i.e. the component's type) do not change after component 
creation.

One should not impose such limitations in Erlang. So let's keep 
it simple, stupid...
And again, I think that the only thing that should be imposed on 
developers is the implementation of DI callbacks as described 
above.
Any other control should be implemented outside of the functional 
modules' implementations, and should even be optional.

-- 
Romain LENGLET