[erlang-questions] Monitor process on remote node
Alex Gunin
guninalexander@REDACTED
Fri Jun 5 13:59:53 CEST 2015
Yes, you are right. I was describing the situation where the first node believes that the second has died, but the second believes that the first is alive.
I have never run into this situation either.
But in my case I need to drop some data in process P2 when P1 stops monitoring it, and I must do that with strong guarantees.
Do I need additional heartbeat messages for this, or can I trust Erlang's distributed monitor/link functionality?
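
To make the question concrete, here is a minimal sketch of what P2 could look like if it combined both approaches: it relies on the monitor, and also treats a missed application-level heartbeat as a failure. The module name `p2_guard`, the `{heartbeat, P1}` message shape, and the `drop/1` helper are assumptions made for illustration; only `erlang:monitor/2`, `erlang:demonitor/2` and the `'DOWN'` message format come from the runtime.

```erlang
-module(p2_guard).
-export([start/3]).

%% Hypothetical entry point: spawn P2 holding Data for as long as P1 is
%% believed to be reachable. Timeout is the heartbeat deadline in milliseconds.
start(P1, Data, Timeout) ->
    spawn(fun() ->
        Ref = erlang:monitor(process, P1),
        loop(P1, Ref, Data, Timeout)
    end).

loop(P1, Ref, Data, Timeout) ->
    receive
        {'DOWN', Ref, process, P1, _Reason} ->
            %% _Reason is noconnection when the connection to P1's node is
            %% lost, so a network failure and a crash take the same path.
            drop(Data);
        {heartbeat, P1} ->
            %% Assumed message: P1 sends this periodically to show it is alive.
            loop(P1, Ref, Data, Timeout)
    after Timeout ->
        %% No heartbeat arrived in time: give up on P1 even though no
        %% 'DOWN' message was delivered, and discard any late 'DOWN'.
        erlang:demonitor(Ref, [flush]),
        drop(Data)
    end.

%% Placeholder for whatever "dropping the data" means in the application.
drop(_Data) ->
    ok.
```

With only the `'DOWN'` clause, how quickly P2 notices a silent network failure depends on the kernel's `net_ticktime` setting. The `after` clause puts that bound under the application's control instead, at the cost of P1 having to send the heartbeats.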
> On 05 Jun 2015, at 14:21, zxq9 <zxq9@REDACTED> wrote:
>
> On Friday, 5 June 2015 13:53:40 Alex Gunin wrote:
>> We have two processes: P1 on node N1 and P2 on node N2.
>> P1 is monitoring P2 and P2 is monitoring P1.
>> Is it possible, after some network failures/problems, that P1 receives a {'DOWN',…} message but P2 doesn't? Both processes are alive the whole time.
>
> From the perspective of either P1 or P2 there is no difference between a network failure and a process crash: either process becoming unavailable for whatever reason is a failure that generates a 'DOWN' message.
>
> There may be some amazing edge case where the gap between N1 and N2 recognizing the network problem is significant, but that is part of why synchronous messaging is built on top of asynchronous messaging (which is true of both OTP 'call' and TCP).
>
> Now that I've mentioned TCP, are there cases where one side of the connection doesn't recognize that the connection on the other end has been dropped? Of course -- temporarily. There are brief periods of this that must exist in distributed Erlang as well, but that's part of what timeouts are for.
>
> In any case, the runtime's support for monitors and links has been so robust that I've never encountered a situation where a node dropping off introduced unexpected hangs in my system. That is, unless I have let some network fallacies creep into my code... especially when I initially code assuming everything is running within a single node and start cheating here and there for performance. (So far that has always turned out to be optimization I never needed anyway! Ugh!)
>
> It would be very interesting to learn about the exact mechanism underlying Erlang's distributed monitor/link functionality -- but at the moment I'm too busy trying to solve customer problems to care beyond the fact that it works in a remarkably reliable way.
>
> -Craig
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions