[erlang-questions] Frying pan bug

Richard O'Keefe ok@REDACTED
Mon Mar 5 04:46:32 CET 2012


On 4/03/2012, at 4:54 AM, David Goehrig wrote:
> 
> Basically, what I'd be interested in seeing is power, heat, time, cycles, and I/O traffic accounting on a per-Erlang process level, so that the supervisors could do a better job at managing a system under stress.

Problem 1:  the machine I'm typing on has two cores, but only one temperature diode.
If Erlang process X is running on Core 0, and
   Erlang process Y is running on Core 1, and
   process X is doing something to make the chip overheat
the temperature diode cannot tell me which core is producing the
heat, so it cannot tell me which Erlang process is doing it.

Problem 2:  something I have not been able to find an answer to
yet (and I've asked a couple of people I was sure would know)
is how *fast* the temperature diodes are.  Or more accurately,
given a certain power change by a core and the thermal
characteristics of the chip as a whole, and given that the
sensors only report to the nearest 1 degree C, how long does
it take before a power change causes a change in the reading
from the temperature diode?  If that delay is not less than
half the time slicing interval used by the scheduler, it's not
clear to me that you can discriminate two processes running on
the same core.
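
To make the criterion in Problem 2 concrete, here is a rough Python sketch.
It assumes, purely for illustration, that the chip responds to a step change
in power like a first-order system, T(t) = T0 + rise * (1 - exp(-t/tau)).
That model, the function names, and every number you might feed it are
assumptions, not measurements; real thermal behaviour is likely messier.
The delay computed is the worst case, where the temperature must move a full
sensor step before the reading changes.

```python
import math


def detection_delay(tau_s, steady_rise_c, resolution_c=1.0):
    """Worst-case seconds before the reading moves by one sensor step.

    Assumes a first-order step response with time constant tau_s and an
    eventual temperature rise of steady_rise_c (both hypothetical inputs).
    Returns math.inf if the eventual rise never reaches one sensor step.
    """
    if steady_rise_c <= resolution_c:
        return math.inf
    # Solve resolution = rise * (1 - exp(-t / tau)) for t.
    return -tau_s * math.log(1.0 - resolution_c / steady_rise_c)


def can_discriminate(delay_s, time_slice_s):
    """The criterion from the text: delay must be under half a time slice."""
    return delay_s < time_slice_s / 2.0


if __name__ == "__main__":
    # Illustrative values only: 50 ms time constant, 10 degree C eventual rise,
    # 1 ms time slice.
    delay = detection_delay(tau_s=0.05, steady_rise_c=10.0)
    print(delay, can_discriminate(delay, time_slice_s=0.001))
```

With a time constant much longer than the time slice, the delay swamps the
slice and the two processes on one core cannot be told apart this way.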

I can imagine an averaging method to sort of deal with problem 2,
but it doesn't survive problem 1.

Another alternative would be to monitor the power used by each core,
if that is possible, but the readings I see are frankly insane.
(As are the f_bsize results that I get back from statvfs() on a Mac;
can the OS *really* be recommending 32 MiB as the block size for
read() on a particular partition?  But that's another story.)
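
For anyone who wants to check the statvfs() figures on their own machine,
a minimal Python sketch follows.  It uses os.statvfs from the standard
library (POSIX systems only); the helper name block_size_hints and the
choice of "/" as the path are just for illustration.  It reports f_bsize
alongside f_frsize so the two can be compared.

```python
import os


def block_size_hints(path="/"):
    """Return the block-size fields statvfs() reports for the given path."""
    st = os.statvfs(path)
    return {"f_bsize": st.f_bsize, "f_frsize": st.f_frsize}


if __name__ == "__main__":
    # Values differ by OS and file system; no particular output is implied.
    print(block_size_hints("/"))
```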




