<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Hi!<br>
On 11/07/2012 08:03 AM, Dmitry Demeshchuk wrote:<br>
</div>
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
Hello, list.
<div><br>
</div>
<div>As you may know, epmd may sometimes misbehave: it loses nodes
and doesn't add them back, for example (unless you do some
magic, like this: <a moz-do-not-send="true"
href="http://sidentdv.livejournal.com/769.html">http://sidentdv.livejournal.com/769.html</a>
).</div>
<div><br>
</div>
</blockquote>
First of all, we have no bug reports where epmd loses nodes except
if you deliberately kill epmd or deliberately disconnect. I
unfortunately cannot read the article you are referring to (the
language is not one I understand), so I cannot explain what's going
on there.<br>
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div>A while ago, Peter Lemenkov had a wonderful idea: epmd could
actually be written in Erlang instead. The EPMD protocol is very
simple, and it's much easier to implement all the failover
scenarios in Erlang than in C. So far, here's a prototype of
his: <a moz-do-not-send="true"
href="https://github.com/lemenkov/erlpmd">https://github.com/lemenkov/erlpmd</a></div>
<div><br>
</div>
</blockquote>
Failover is usually not needed; epmd is a single process on a
machine and should only stop if the machine stops. What scenario are
we talking about here?<br>
<br>
As epmd works today, a distributed Erlang node connects to a *local*
epmd (it's after all just a portmapper, similar to many other
portmappers) and tells it what name and port number it has. When
the beam process ends (in some way or another) the socket gets
closed and epmd is more or less instantly informed. Epmd survives
starts and stops of Erlang nodes on the machine and is the single
database mapping ports for Erlang distribution on the host.<br>
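<br>
To make the registration step concrete, here is a minimal sketch (the
module name is made up, and this is not the actual erl_epmd code) of
registering a name with the local epmd via ALIVE2_REQ, assuming the
documented message layout:<br>
<pre>%% Sketch only: register Name/DistPort with the local epmd (port 4369).
-module(epmd_alive_sketch).
-export([register_name/2]).

-define(EPMD_PORT, 4369).

register_name(Name, DistPort) when is_list(Name), is_integer(DistPort) ->
    %% {packet, 2} adds the 2-byte length prefix that epmd requests use.
    {ok, Sock} = gen_tcp:connect({127,0,0,1}, ?EPMD_PORT,
                                 [binary, {packet, 2}, {active, false}]),
    NameBin = list_to_binary(Name),
    Req = <<120,                        %% ALIVE2_REQ
            DistPort:16,
            77, 0,                      %% 77 = normal node, 0 = TCP/IPv4
            5:16, 5:16,                 %% highest/lowest distribution version
            (byte_size(NameBin)):16, NameBin/binary,
            0:16>>,                     %% empty Extra field
    ok = gen_tcp:send(Sock, Req),
    %% The ALIVE2_RESP reply is not length-prefixed, so read it raw.
    ok = inet:setopts(Sock, [{packet, raw}]),
    {ok, <<121, 0, Creation:16>>} = gen_tcp:recv(Sock, 4),   %% ALIVE2_RESP, ok
    %% Keep Sock open: closing it is what tells epmd the node is gone.
    {ok, Sock, Creation}.</pre>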
<br>
If we were to implement epmd in Erlang with that scheme, the first
Erlang node either has to survive for all of the host's lifespan or
has to transfer the ownership of the open sockets (ALIVE-sockets) to
"the next" node to take over the task of epmd. Note that these nodes
may not be in the same cluster, epmd is bound to a machine, not an
Erlang cluster. Erlang VM's participating in different Erlang
clusters may exist on the same machine. This would be feasible if we
had an *extra* Erlang node for port mapping, which of course could
be a working solution.<br>
<br>
To implement this in Erlang, using the already present distributed
Erlang machines, would probably require a different mechanism for
registering and unregistering nodes. Looking out for closed sockets
will not do, as we would need to monitor nodes that have no connection
to us (or they would have to re-establish such a connection, which
is not needed today). A reliable takeover by nodes participating in
different clusters could also be implemented; it is in no way
impossible, of course. You would also need to reopen the known
port when taking over, so there would be a race, or rather a short
time with no epmd listening. All clients would have to handle that.<br>
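<br>
As a hedged illustration of that last point, a node taking over the
epmd role would have to retry opening the well-known port until the
previous owner has released it (names below are made up):<br>
<pre>%% Sketch only: retry listening on epmd's well-known port during a takeover.
-module(epmd_takeover_sketch).
-export([take_over_listen/0]).

take_over_listen() ->
    case gen_tcp:listen(4369, [binary, {reuseaddr, true}, {active, false}]) of
        {ok, LSock} ->
            {ok, LSock};
        {error, eaddrinuse} ->
            %% The previous owner still holds the port; during this window
            %% there is no epmd listening, and clients have to cope with that.
            timer:sleep(100),
            take_over_listen();
        Error ->
            Error
    end.</pre>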
<br>
Implementing a simpler epmd for a machine with only one Erlang
node is far easier and could be useful for small embedded systems.
In that case we would not need to change the protocol. Usage would be
limited, of course.<br>
<br>
You could also rewrite epmd in Erlang and have an extra
(non-distributed) Erlang machine resident in the system (after all, it
would be more or less the same thing as having a C program
resident). That would not require complicated takeover scenarios,
but would increase the memory footprint slightly. An implementation
in Erlang could cover both the single VM system and a solution with
an extra Erlang machine, which would be nice.<br>
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div>When hacking it, I've noticed several things:</div>
<div><br>
</div>
<div>1. When we send ALIVE2_REQ and receive ALIVE2_RESP, we
establish a TCP connection, the closing of which is a signal of node
disconnection. This approach does have a point, since we can use
keep-alive and periodically check that the node is still there at
the TCP level. But then some weird thing follows:</div>
</blockquote>
Note that these are local connections. Keep-alive has nothing to do
with it: the loopback detects a close and informs immediately.
Keep-alive detects network problems (badly) and is only useful when
talking across a real network.<br>
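<br>
To illustrate what epmd relies on here, this is roughly the shape of
the server side (a hypothetical sketch, not the real epmd source): once
a node has registered over a connection, the server only has to wait
for the socket to close to learn that the node is gone:<br>
<pre>%% Sketch only: the registration socket doubles as a liveness signal.
-module(epmd_watch_sketch).
-export([watch/2]).

watch(Sock, Name) ->
    ok = inet:setopts(Sock, [{active, once}]),
    receive
        {tcp_closed, Sock} ->
            %% On the loopback this arrives more or less immediately when
            %% the beam process dies; no TCP keep-alive is involved.
            io:format("node ~s unregistered (connection closed)~n", [Name]);
        {tcp, Sock, _Data} ->
            watch(Sock, Name)
    end.</pre>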
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div><br>
</div>
<div>2. When we send other control messages from a node connected
to epmd, we establish a new TCP connection each time. We could use
the main connection instead. Was it a design decision, or is it
just a legacy thing?</div>
</blockquote>
When you communicate with epmd after ALIVE is sent, you establish a
connection to the epmd *on the host you want to connect to*, which
is only the same epmd as you used for registration if the Erlang
node you want to talk to is on the same host as you are.
You are looking for a port on the particular machine that your
remote Erlang machine resides on. Only in the local case could you
reuse your connection, which would only add a special case with very
little gain.<br>
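<br>
For concreteness, here is a minimal sketch of that lookup (a made-up
module, assuming the documented PORT_PLEASE2_REQ/PORT2_RESP layout and,
for simplicity, that the whole reply arrives in one read):<br>
<pre>%% Sketch only: ask the epmd on Host which port node Name listens on.
-module(epmd_lookup_sketch).
-export([lookup/2]).

-define(EPMD_PORT, 4369).

lookup(Host, Name) when is_list(Name) ->
    %% Note: this connects to the epmd on the *remote* host, not the local one.
    {ok, Sock} = gen_tcp:connect(Host, ?EPMD_PORT,
                                 [binary, {packet, 2}, {active, false}]),
    ok = gen_tcp:send(Sock, [122, Name]),        %% 122 = PORT_PLEASE2_REQ
    ok = inet:setopts(Sock, [{packet, raw}]),    %% the reply has no length prefix
    Reply = gen_tcp:recv(Sock, 0),               %% epmd closes after answering
    gen_tcp:close(Sock),
    case Reply of
        {ok, <<119, 0, Port:16, _Rest/binary>>} ->   %% PORT2_RESP, Result = 0
            {ok, Port};
        {ok, <<119, _Error, _/binary>>} ->
            noport;
        Other ->
            {error, Other}
    end.</pre>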
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div><br>
</div>
<div>3. The client (node) part of epmd seems to be all implemented
in C and sealed inside ERTS. However, it seems like this code
could be implemented inside the net_kernel module instead (or
something similar).</div>
</blockquote>
<div>erl_epmd is the module, and it's called by net_kernel. On that
side, nothing of the epmd communication is written in C except the
inet driver itself. The epmd daemon is of course written in C, but
it's not part of the VM.<br>
</div>
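<br>
So the client side can already be used from Erlang directly; a small
usage sketch (the wrapper module name is made up):<br>
<pre>%% Sketch only: ask the epmd on Host for Name's distribution port.
-module(epmd_client_sketch).
-export([where_is/2]).

where_is(Name, Host) ->
    case erl_epmd:port_please(Name, Host) of
        {port, Port, Version} -> {ok, Port, Version};
        _Other                -> undefined
    end.</pre>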
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div><br>
</div>
<div><br>
</div>
<div>Why bother and switch to Erlang when everything is already
written and working? First of all, sometimes it doesn't really
work in big clusters (see my first link). And, secondly, using
Erlang we can easily extend the protocol. For example, add an
auto-discovery feature, which has been discussed on the list a
lot. Add an ability for a node to reconnect if its TCP session
has been terminated for some reason. Add lookups of nodes by
prefix (like, "give me all nodes that match mynode@*"). The list
can probably be extended further.</div>
</blockquote>
I think a lot of this should be solved in the client, which is
already written in Erlang. Rewriting the server might just add
complexity, at least if you want to solve it in the already running
distributed nodes, with takeover and whatnot.<br>
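<br>
For example, the "lookup by prefix" idea can be done purely in the
client today, on top of the existing protocol (a sketch with a made-up
module name; erl_epmd:names/1 asks an epmd for all names registered on
that host):<br>
<pre>%% Sketch only: client-side filtering of registered names by prefix.
-module(epmd_prefix_sketch).
-export([nodes_by_prefix/2]).

nodes_by_prefix(Host, Prefix) ->
    case erl_epmd:names(Host) of
        {ok, Names} ->
            %% Names is a list of {NodeName, Port}; keep the matching ones.
            [{N, P} || {N, P} <- Names, lists:prefix(Prefix, N)];
        Error ->
            Error
    end.</pre>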
<br>
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div>Do you think such a thing (with full backwards compatibility,
of course) could go upstream? Also, a question for epmd
maintainers: is it going to change at all, or is the protocol
considered complete enough for its purposes?</div>
</blockquote>
We have thought about a distributed epmd over the years, but have
never considered it worth the effort, due to the takeover complexity
etc. Portmapping is really basic functionality; you wouldn't want to
mess that up. A separate Erlang machine would maybe be a solution,
but as epmd is such a simple program, we have not really thought it
worth the extra memory footprint.<br>
<br>
So it would not be the easiest thing to convince us to take
upstream, but given a well-thought-through solution, we could get
rid of some maintenance - Erlang is after all far nicer to maintain
than C... One could also make it possible to choose between different
epmd solutions; that way we would cover the cases where people
would not want an extra Erlang machine for portmapping. More
elaborate things could then be experimented with in the
Erlang-written epmd.<br>
<br>
If you can isolate a bug or explain a malfunction in the current
epmd, it would be a great contribution!<br>
<br>
<blockquote
cite="mid:CANH2pztO-VT8u3GRB1bHOUKuOWA6+X+Hb0M1TVn9NxV9aoZ+5A@mail.gmail.com"
type="cite">
<div>
<div><br>
</div>
-- <br>
Best regards,<br>
Dmitry Demeshchuk<br>
</div>
<br>
</blockquote>
Cheers,<br>
/Patrik<br>
<br>
</body>
</html>