<div class="gmail_quote">When I implemented now() in Erjang, I found a lock-free algorithm for it which has the right "timestamp must be monotonically increasing" property.<br>I did wonder at the time whether it would make sense to suggest using it in BEAM as well. I didn't (although I was proud of the simple solution I'd discovered); mostly because the corresponding BEAM source, with all its smoothing algorithms and support for different platforms, was perhaps a bit intimidating.<br>

Now, triggered by the mention on this list that the mutex used in now() is actually being a bottleneck (and having spent the past couple of months ruthlessly refactoring other people's organically grown code), I dare suggest it :-)<br>

<br>The algorithm could be used as-is to at least reduce the extent of the timeofday section in get_now(), if used for adjusting for that property after a timestamp has been obtained.<br>(This is assuming that we have true atomic operations of at least 64-bit width; this assumption may of course fail on some platforms. Among the supported (or interesting) platforms, do any have SMP but not such atomic operations?)<br>

It would be even better to eliminate the locking in (the common code path of) get_now() altogether, but there are quite a few variants for obtaining the timestamp (3 implementations of get_tolerant_timeofday()).<br><br>The timer reading and smoothing code seems to have complexity of the "don't change unless you're really sure what you're doing" kind; bearing that in mind, here's what I could imagine for locking-reducing improvements:<br>

<br>0) The code as it is now has two parts: a "read timer, adjust timer value according to state, and adjust state" part, and a "ensure monotonicity" part.<br>1) Split the former into a lock-less "read+adjust" part, and a locked "adjust state" part which is done seldomly. This can be done if the state for adjusting can be read atomically.<br>

2) The main algorithm for now() is then:<br>// Warning - pseudo-code ahead - I hope it's not too detailed for the purpose<br>{<br>  read = adjusted_timer_value();<br>  if (time_to_adjust_timer_state(read)) {<br>    lock()<br>

    adjust_timer_state(read);<br>    unlock()<br>    time = adjusted_timer_value() // Re-read<br>  } else time = read;<br>  return ensure_monotonically_increasing_now(time)<br>}<br><br>3) Here, time_to_adjust_timer_state() is also lock-free, taking advantage of the fact that it doesn't matter if two threads get a "yes" answer at the same time; if nothing else, the condition can be re-tested in adjust_timer_state().<br>

<br>static erts_smp_atomic_t  adjust_time;<br>bool time_to_adjust_timer_state(long read_time) {<br> return erts_smp_atomic_read(adjust_time) - read_time > ADJUST_INTERVAL;<br>}<br><br>adjust_time is then updated at the end of adjust_timer_state().<br>

<br>4) Finally, the ensure_monotonically_increasing_now() simply implements the algorithm I discovered for Erjang; here it is in the current Java expression:<br>{ // parameter is named 'micros'<br>                 long prev;<br>

                while ((prev = latest_now.get()) < micros) {<br>                        if (latest_now.compareAndSet(prev,micros)) {<br>                                return micros;<br>                        }<br>                }<br>

                return latest_now.incrementAndGet();<br>}<br><br>So, there it is: ensure that the state needed for the timer correction is readable in a lock-free manner; that it is updated within mutex protection but seldomly; and that timer reading and adjustment itself does not change the global state. Then the common path can use just atomic ops, not locks.<br>

The Erjang implementation of now() is quite a bit faster than that of BEAM. It doesn't have the same smoothing logic, of course, but I like to believe that the lack of locking is the major reason.<br><br>Would the proposed change be worth it, do you think?<br>

<br>2012/7/12 Wei Cao <span dir="ltr"><<a href="mailto:cyg.cao@gmail.com" target="_blank">cyg.cao@gmail.com</a>></span><br>[snip]<br><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


Finally, I use lcnt to locate all lock collisions in erlang VM, found<br>

the mutex timeofday is the bottleneck.<br>

<br>

                                                          lock<br>

                location  #tries  #collisions  collisions [%]  time<br>

[us]  duration [%]<br>

<br>

    -----                        --------- ------- ------------<br>

--------------- ---------- -------------<br>

<br>

timeofday        'beam/erl_time_sup.c':939  895234       551957<br>

 61.6551    3185159       23.5296<br>

<br>

timeofday        'beam/erl_time_sup.c':971  408006       264498<br>

 64.8270    1473816       10.8874<br>

<br>

<br>

the mutex timeofday is locked each time erts_check_io is invoked to<br>

"sync the machine's idea of time", erts_check_io is executed hundreds<br>

of thounds of times per second, so it's locked too much times, hence<br>

reduce performance.<br>

<br>

I solved this problem by moving the sync operation into a standalone<br>

thread, invoked 1 time per millisecond<br></blockquote></div><br>