<br><br><div class="gmail_quote">On Fri, Mar 2, 2012 at 11:07 PM, Richard O'Keefe <span dir="ltr"><<a href="mailto:ok@cs.otago.ac.nz">ok@cs.otago.ac.nz</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br>

</div>In fact the negative integers are the only points on the real line where factorial is<br>

undefined.  There are, however, an infinite number of them, and that doesn't mean that<br>

guards are inexpressible.<br></blockquote><div><br></div><div>I was thinking more along the lines of topologies that look something like:</div><div><br></div><div><a href="http://www.wolframalpha.com/input/?i=1+%2F%28+x+mod+y%29">http://www.wolframalpha.com/input/?i=1+%2F%28+x+mod+y%29</a></div>

<div><br></div><div>which can become very expensive to sample with a Monte Carlo method or more complex systems like:</div><div><br></div><div><a href="http://www.wolframalpha.com/input/?i=Duffing+Differential+Equation">http://www.wolframalpha.com/input/?i=Duffing+Differential+Equation</a></div>

<div><br></div><div>which in most forms can only be approximated through sampling.  There are a lot of interesting surfaces that, when you sample them more densely to gain resolution, end up degenerating into the run-away case.  Because the run-away can be either positive or negative, a guard may not help, especially when the topology is driven by a heuristic (i.e. I don't know the equation ahead of time, and I can't generate guards on the fly).</div>
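<div><br></div><div>To make the sampling problem concrete, here is a minimal sketch in Python (not Erlang, purely for brevity). It draws random points from the 1/(x mod y) surface, using math.fmod as a stand-in for "mod", and counts how often a sample lands close enough to a singularity to run away in either direction. The function name, the sampling box, and the threshold are my own assumptions for illustration.</div>

```python
import math
import random


def sample_surface(n, bound=10.0, threshold=100.0, seed=0):
    """Sample f(x, y) = 1 / fmod(x, y) at n random points and classify each one.

    All names and default values here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    counts = {"finite": 0, "runaway_pos": 0, "runaway_neg": 0, "undefined": 0}
    for _ in range(n):
        x = rng.uniform(-bound, bound)
        y = rng.uniform(-bound, bound)
        if y == 0.0:
            # fmod(x, 0) is undefined
            counts["undefined"] += 1
            continue
        m = math.fmod(x, y)  # result carries the sign of x
        if m == 0.0:
            # exact singularity: 1 / 0
            counts["undefined"] += 1
            continue
        value = 1.0 / m
        if value > threshold:
            counts["runaway_pos"] += 1
        elif value < -threshold:
            counts["runaway_neg"] += 1
        else:
            counts["finite"] += 1
    return counts


if __name__ == "__main__":
    print(sample_surface(100_000))
```

<div>Samples run away in both directions, and a guard that only rejects exact zeros would not catch the near-singular ones.</div>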

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I just tried it on a Mac.  "CPU A Temperature Diode" was initially at 32C.<br>

Running fac(-1), it rose over a couple of minutes to 38C, and then fell back to<br>

34C, where it remained. <br></blockquote><div><br></div><div>I have a Mac with a busted thermal sensor, and it will continue to heat up until the whole thing locks.  Your test actually proved that you have a working thermal sensor and that the OS was safely limiting how quickly you were wasting CPU and power.  The devices we're building do have thermal sensors, so I can guarantee that they exist, and I can also guarantee that they are available in user mode (I control the kernel as well).  What interests me is that Erlang has its own scheduler and threading model on top of the OS's, which means relying on Linux to "tune" performance this way is like taking a big stick and beating all the Erlang processes for one bad apple.  All of our servers back off CPU frequency, and with designs like big.LITTLE from ARM this is becoming more commonplace.  What do you do when the outside air temperature is 50C, and your CPU can no longer keep up with the throughput needed to ensure real-time response once it steps below a certain power threshold?</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I have programs that saturate I/O without anything being wrong.</blockquote><div><br></div><div>I have programs where saturating I/O causes a failure, because they can no longer guarantee real-time response.  Having tooling in Erlang so that one can identify a run-away I/O process would be generally useful.  The wrongness of a program is a matter of how well it fits its purpose.  You might be able to tolerate I/O saturation.  In my case, our customers get very angry (and litigious).</div>

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I've been reading about self-stabilising systems lately, and the rock bottom answer seems<br>

to be "watchdog timers".  A watchdog timer would certainly have caught this.<br></blockquote><div><br></div><div>Watchdog timers help, but you typically need more instrumentation to make reasonable decisions.  In one system, I have a watchdog checking each long-running process every second, and a secondary process measuring the average message volume processed by each entity.  We also have each process announce memory pressure, GC state, I/O totals, and CPU load at a per-OS-process level, which are used to determine how to dynamically route data through the system.  Without sufficient data, a watchdog will not make the right decision.</div>
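<div><br></div><div>As a rough sketch of what I mean by a watchdog that uses more than a timer, here is the decision logic in Python (again not Erlang, and not our actual code). The field names, limits, and action labels are all hypothetical. The point is only that the heartbeat check alone would report "ok" for a process that is alive but drowning.</div>

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading for one monitored process (all field names are hypothetical)."""
    name: str
    seconds_since_heartbeat: float
    messages_per_second: float
    queue_length: int
    memory_bytes: int


def classify(sample, baseline_rate, *, heartbeat_limit=1.0,
             queue_limit=10_000, memory_limit=512 * 1024 * 1024):
    """Pick an action for one process from several signals, not just a timer.

    The limits are illustrative defaults, not recommendations.
    """
    # Plain watchdog: the process missed its once-per-second check.
    if sample.seconds_since_heartbeat > heartbeat_limit:
        return "restart"
    # Alive, but its backlog or memory use is growing past the limit.
    if sample.queue_length > queue_limit or sample.memory_bytes > memory_limit:
        return "shed_load"
    # Alive and within limits, but handling under half its usual volume.
    if baseline_rate > 0 and sample.messages_per_second < 0.5 * baseline_rate:
        return "reroute"
    return "ok"


if __name__ == "__main__":
    busy = Sample("worker_1", 0.2, 100.0, 20_000, 1024)
    print(busy.name, classify(busy, baseline_rate=100.0))
```

<div>A timer-only watchdog would pass worker_1 in the usage example, since it answered its heartbeat on time; the queue length is what flags it.</div>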

</div><br>Basically, what I'd be interested in seeing is power, heat, time, cycles, and I/O traffic accounting at a per-Erlang-process level, so that the supervisors could do a better job of managing a system under stress.<br clear="all">

<div><br></div>-- <br>-=-=-=-=-=-=-=-=-=-=- <a href="http://blog.dloh.org/">http://blog.dloh.org/</a><br>