[erlang-questions] R12B erlang node restart after system clock change

Stephen Han kruegger@REDACTED
Wed Sep 9 06:18:09 CEST 2009


Thanks for the information. I was able to resolve the issue right after
seeing your email, and I am now getting the same behavior as in R10B. I thought
it was my mistake during the code upgrade. However, today I downloaded a fresh
R13B01 for my own fun and found the same typo in the code again. Then I
realized it was not really my mistake from the beginning.

I am still using a Linux 2.4 kernel, which is why I am hitting this issue.

In heart.c


CORRET_USING_TIMES is a typo. I did not bother to look up when it was
introduced, but it should be CORRECT_USING_TIMES; otherwise a system clock
change will restart the node by default on a 2.4 system.
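
To illustrate why a one-letter misspelling matters here, the following is a
minimal sketch and not the actual heart.c code. It assumes the build defines
the correctly spelled CORRECT_USING_TIMES; the two helper functions are
hypothetical names made up for this example. The preprocessor treats an
unknown macro name as simply "not defined", so the misspelled check fails
silently with no compiler warning.

```c
/* Assumption for this sketch: the build system defines the correctly
 * spelled macro, e.g. via -DCORRECT_USING_TIMES on the command line. */
#define CORRECT_USING_TIMES 1

/* Hypothetical helper: reports whether a check against the misspelled
 * name is satisfied. Nothing defines CORRET_USING_TIMES, so the #if
 * branch is dropped without any diagnostic. */
int typo_branch_enabled(void)
{
#if defined(CORRET_USING_TIMES)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: the same check with the name spelled correctly. */
int fixed_branch_enabled(void)
{
#if defined(CORRECT_USING_TIMES)
    return 1;
#else
    return 0;
#endif
}
```

With the define above, typo_branch_enabled() returns 0 and
fixed_branch_enabled() returns 1, which matches the symptom: the
times()-based correction is never selected even though it was requested.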


On Wed, Aug 26, 2009 at 1:29 AM, Ulf Wiger

> The documentation for heart does say:
> "It should be noted that if the system clock is adjusted with more than
> HEART_BEAT_TIMEOUT seconds, heart will timeout and try to reboot the system.
> This can happen, for example, if the system clock is adjusted automatically
> by use of NTP (Network Time Protocol)."
> (...even in R10B).
> However, the reason why you're not seeing this in R10B is, I think,
> that heart.c has been re-written to use the system timestamp by
> default, whereas it derived timestamps from system ticks in R10.
> One relevant difference in the code seems to be:
> /*
>  * Implement time correction using times() call even on Linuxes
>  * that can simulate gethrtime with clock_gettime, no use implementing
>  * a phony gethrtime in this file as the time questions are so infrequent.
>  */
> #endif
> Timestamps are still simulated on WIN32 or if HAVE_GETHRTIME is not
> defined, but HEART_CORRECT_USING_TIMES is.
> (Please verify for yourself by reading erts/etc/common/heart.c,
> as this is not documented, from what I can tell, and you should
> never draw conclusions based solely on my sloppy reading of C code).
> Perhaps using ticks whenever possible would be the best strategy
> for heart.c, as it is hardly a feature that it goes berserk if
> someone dabbles with the system clock. It doesn't need hi-res
> timestamps to begin with, as no one in their right mind would
> set HEART_BEAT_TIMEOUT to something in the millisecond range
> (I don't really recommend anything less than a minute, actually,
> as heart is just a last resort, and /will/ interfere with
> crash dump generation too, if given a chance).
> BR,
> Ulf W
> Stephen Han wrote:
>> Hi
>> I am facing an issue where the Erlang node is restarted by "heart" whenever
>> I move the system clock forward. It seems beam got a KILL signal and
>> "heart" restarted the node. The node got restarted even when I moved the
>> system clock forward by 1 minute.
>> FYI, I am using OTP R12B-3.
>> The problem is I am not even sure whether the node got restarted by our
>> application or by Erlang/OTP.
>> However, this is not reproducible in our old software, which used
>> R10B-8.
>> Have any changes been made after R10B that cause an Erlang node to
>> restart if the system clock moves forward?
>> Can you suggest a good method for debugging this kind of problem?
>> regards,
> --
> Ulf Wiger
> CTO, Erlang Training & Consulting Ltd
> http://www.erlang-consulting.com
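
The tick-based strategy suggested in the quoted message can be sketched
roughly as follows. This is an illustrative sketch only, not code from
heart.c: the function names tick_now, ticks_to_seconds and
heartbeat_timed_out are hypothetical. It relies on the POSIX times() call,
whose return value counts clock ticks since an arbitrary point in the past
rather than reading the wall clock, so setting the system time should not
affect the measured interval. The tick counter can wrap around, which this
sketch does not handle.

```c
#include <sys/times.h>
#include <unistd.h>

/* Hypothetical helper: current tick count as returned by times().
 * The value is only meaningful as a difference between two readings. */
clock_t tick_now(void)
{
    struct tms buf;
    return times(&buf);
}

/* Hypothetical helper: convert a tick interval to seconds.
 * ticks_per_sec would normally come from sysconf(_SC_CLK_TCK). */
double ticks_to_seconds(clock_t start, clock_t end, long ticks_per_sec)
{
    return (double)(end - start) / (double)ticks_per_sec;
}

/* Hypothetical helper: returns 1 if at least timeout_sec seconds of
 * tick time have passed since the last heartbeat, 0 otherwise. */
int heartbeat_timed_out(clock_t last_beat, clock_t now,
                        long ticks_per_sec, int timeout_sec)
{
    return ticks_to_seconds(last_beat, now, ticks_per_sec)
           >= (double)timeout_sec;
}
```

A caller would record tick_now() on each heartbeat and later compare it
against a fresh tick_now() reading, passing sysconf(_SC_CLK_TCK) as
ticks_per_sec.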
