[erlang-questions] RFC: On `inet_tcp_dist` and `erl_epmd` interaction
Ciprian Dorin Craciun
ciprian.craciun@REDACTED
Tue Oct 25 16:00:59 CEST 2011
Ok... As not many responded to my email (actually no-one :) ), I've
prepared a small branch based on the R14B04 release, which implements
what I proposed.
My patches (3 small ones) are found at:
https://github.com/cipriancraciun/otp/tree/patches/erl_epmd-as-proper-gen_server
To fetch:
git fetch git://github.com/cipriancraciun/otp.git patches/erl_epmd-as-proper-gen_server
To compare my patches:
https://github.com/cipriancraciun/otp/compare/patches%2Ferl_epmd-as-proper-gen_server
https://github.com/cipriancraciun/otp/compare/patches%2Ferl_epmd-as-proper-gen_server.patch
(I hope I got "submitting patches" right. :) )
I'll wait a couple of days, and if there is no feedback (or better,
if there is positive feedback), I'll submit it to the `erlang-patches`
mailing list.
Ciprian.
P.S.: I've seen that there is also another patch pending in the `pu`
branch related to `erl_epmd`, which adds support for IPv6. I think my
patch won't apply cleanly over it (as we touch the same functions),
but from what I've seen the fix-up is trivial. How should I handle
this situation? (I think I should prepare a fourth patch to merge with
`pu`, right?)
On Mon, Oct 24, 2011 at 16:27, Ciprian Dorin Craciun
<ciprian.craciun@REDACTED> wrote:
> == Summary ==
>
> I've found out that it is "theoretically" possible to override the
> behavior of the default `erl_epmd` module with a custom, but
> "compatible" module, without touching the `kernel` application (only
> through configuration directives). I've labeled this method as
> "theoretical" because the way in which the modules `erl_epmd` and
> `inet_tcp_dist` (or any of the `inet_*_dist` family) interact makes
> them inseparable.
>
> I'm writing this email as I want to help in enabling the
> overriding of the default `erl_epmd` module in a correct, simple, and
> the least intrusive method possible. (By "I want to help" I mean I am
> offering to discuss, write, document and test the code.)
>
>
> == Problem description ==
>
> As stated, there is a function `net_kernel:epmd_module`, which,
> according to the (source code) documentation, should (quote): "return
> module_name of erl_epmd or similar gen_server_module".
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/net_kernel.erl#L1283
>
> Unfortunately its only usage is in `erl_distribution.erl` to start
> the `gen_server` process.
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/erl_distribution.erl#L39
>
> All the other important modules (`inet_*_dist`, `net_adm`) use
> the module `erl_epmd` directly, without the `net_kernel` indirection.
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/inet_tcp_dist.erl#L70
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/inet_tcp_dist.erl#L254
>
> As a result it is impossible to actually replace the way in which
> `inet_*_dist` modules resolve the transport layer address (more
> exactly the port) of the other nodes.
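
To make the missing indirection concrete, here is a minimal sketch of how a
caller could resolve the port through whichever module
`net_kernel:epmd_module` returns, instead of naming `erl_epmd` directly. The
module name `epmd_lookup_sketch` and the helper `resolver/0` are only
illustrative, and this is not the code from my patches:

```erlang
-module(epmd_lookup_sketch).
-export([resolver/0, port_please/2]).

%% Ask net_kernel which module is configured; the default is erl_epmd.
resolver() ->
    net_kernel:epmd_module().

%% Resolve the port of a node through the configured module rather than
%% through a hard-coded call to erl_epmd:port_please/2.
port_please(Name, Host) ->
    Mod = resolver(),
    Mod:port_please(Name, Host).
```

With something like this in the `inet_*_dist` modules, a replacement module
would only have to export functions compatible with those in `erl_epmd`.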
>
>
> == Problem analysis ==
>
> I think there are two possible purposes of `net_kernel:epmd_module`:
> a) to give the name of a module which should export a `start_link`
> function, which in turn spawns a process, registering under the name
> `erl_epmd` and responding to `erl_epmd` messages in a proper manner
> (thus implementing the "internal" `erl_epmd` protocol); (and as a
> backend, maybe the TCP EPMD protocol;)
> b) or to give the name of a module which should export the
> `register_node/2`, `port_please/2`, `names/0`, and `names/1` functions
> which should act according to the specs in `erl_epmd` (thus
> implementing the `erl_epmd` "interface" / behavior);
>
> As such there is a decision between "implementing a message
> protocol" or "implementing an interface". I.e.:
> * in the first case (implementing the `erl_epmd` internal
> protocol) the overriding module receives messages, and responds to
> them in a proper manner; but the "clients" still use the `erl_epmd`
> module as a frontend (which in turn sends messages to the named
> `erl_epmd` process);
> * in the second case *all* clients should use the overriding
> module (via `net_kernel:epmd_module`), and this one in its turn is
> free to implement the "interface" functions as it sees fit as long as
> it doesn't break the spec;
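
For the second case, a replacement module could look roughly like the sketch
below. Everything in it is an assumption made for illustration: the module
name `static_epmd_sketch`, the fixed name-to-port table, the dummy creation
number, and the choice to return `ignore` from `start_link/0` because no
helper process is needed. The return shapes (`{ok, Creation}`,
`{port, Port, Version}`, `noport`, `{ok, [{Name, Port}]}`) follow what
`erl_epmd` returns, as far as I read its source:

```erlang
-module(static_epmd_sketch).
-export([start_link/0, register_node/2, port_please/2, names/0, names/1]).

%% Hypothetical fixed table of node names and ports.
-define(PORTS, [{"alpha", 4370}, {"beta", 4371}]).

%% No process is started; a supervisor accepts `ignore` from a start function.
start_link() ->
    ignore.

%% Nothing is registered anywhere; return a dummy creation number.
register_node(_Name, _Port) ->
    {ok, 1}.

%% Look the name up in the table; 5 is the distribution protocol version.
port_please(Name, _Host) ->
    case lists:keyfind(to_list(Name), 1, ?PORTS) of
        {_, Port} -> {port, Port, 5};
        false -> noport
    end.

names() ->
    names("localhost").

%% Every host is assumed to run the same set of nodes.
names(_Host) ->
    {ok, ?PORTS}.

%% Accept the node name either as an atom or as a string.
to_list(Name) when is_atom(Name) -> atom_to_list(Name);
to_list(Name) when is_list(Name) -> Name.
```

Such a module only helps if all clients really go through
`net_kernel:epmd_module`, which is exactly what is missing today.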
>
> Now the way in which `net_kernel:epmd_module` is used (only once,
> to start the server) and the fact that all `inet_*_dist` modules use
> the `erl_epmd` module directly suggest that the initial plan was to
> go with solution a) -- i.e. the overriding module should register a
> process under the well established name, and it should respond to
> messages. (This is also suggested by the documentation quote: "or
> similar gen_server_module".)
>
> Unfortunately the way in which the `erl_epmd` module is implemented
> suggests method b). Actually it is even worse:
> * half of the functionality is implemented by delegating work to a
> `gen_server` process, see `register_node` function:
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/erl_epmd.erl#L108
> * and half is implemented by directly executing the code in the
> "client" process, see `port_please` and `names` functions, which in
> turn call `get_port` and `get_names`:
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/erl_epmd.erl#L292
> https://github.com/erlang/otp/blob/OTP_R14B04/lib/kernel/src/erl_epmd.erl#L418
>
>
> == Solution ==
>
> In my view, method a) (as presented above, i.e. implementing the
> internal `erl_epmd` protocol by a named process) is the one most
> "in-line" with OTP principles. (But even b) could work.)
>
> Thus, in order to touch the existing code as little as possible, I
> would propose to:
> * update `erl_epmd` module, so that all the "public" functions
> (i.e. `port_please`, `names`, etc.) in fact send a message through
> `gen_server:call` to that process registered under the `erl_epmd` name
> (as `register_node` does);
> * have the default `handle_call` implementation in `erl_epmd`
> spawn a new process which calls the internal `get_port` or
> `get_names` and replies to the original call via `gen_server:reply`;
> (to keep the concurrency model as is now, without serializing
> requests);
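
The two points above can be sketched as a small `gen_server`. This is only an
illustration under stated assumptions, not the code in my branch: the module
name `epmd_server_sketch` is made up, the process registers under its own
module name so that it cannot clash with the real `erl_epmd` process, and
`lookup_port/2` is a hypothetical stand-in for the internal `get_port`:

```erlang
-module(epmd_server_sketch).
-behaviour(gen_server).

-export([start_link/0, port_please/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% The proposal would use the erl_epmd name here instead.
-define(SERVER, ?MODULE).

start_link() ->
    gen_server:start_link({local, ?SERVER}, ?MODULE, [], []).

%% Public function: it only sends a request to the registered process.
port_please(Name, Host) ->
    gen_server:call(?SERVER, {port_please, Name, Host}).

init([]) ->
    {ok, nostate}.

%% Run the lookup in a short-lived process and answer the caller with
%% gen_server:reply/2, so a slow lookup does not block other requests.
handle_call({port_please, Name, Host}, From, State) ->
    spawn(fun() -> gen_server:reply(From, lookup_port(Name, Host)) end),
    {noreply, State};
handle_call(_Request, _From, State) ->
    {reply, {error, unknown_request}, State}.

handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.

%% Hypothetical stand-in for the real lookup done by get_port.
lookup_port("alpha", _Host) -> {port, 4370, 5};
lookup_port(_Name, _Host) -> noport.
```

An overriding module would then only need to register a process under the
agreed name and answer the same requests, doing the lookup however it sees
fit.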
>
>
> == Conclusion ==
>
> For me -- and the project I'm involved in -- it is really
> imperative to be able to replace the way in which ports are resolved.
> I could do this by branching OTP, and maintaining a set of patches.
> But I would prefer (and I think it could benefit others too) to "fix"
> the current situation.
>
> As stated in the summary, I'm offering to write the patch and test
> it. But before I come up with a patch, I want to ask for feedback as
> maybe I've missed something. Therefore any feedback is very important
> to me.
>
> Thanks for the time (as the email is quite long) :)
> Ciprian.
>
>
> P.S.: The reason I want to replace the current `erl_epmd` module I
> can describe in a different thread. (There are actually two different
> but related reasons, one not being directly tied to this problem, but
> both are related to the `-no_epmd` option, which I've tried to discuss
> in a previous thread.)