[erlang-bugs] Different handling of floating point underflows between Linux and Solaris-based OSes

Fri Dec 26 19:07:41 CET 2014

Corey Cossentino writes:
 > I sent this yesterday but it doesn't look like it went through, so
 > apologies if anyone gets this twice.
 > 
 > 
 > Calculating math:pow(2, -1075) returns 0 on Linux, but causes an
 > exception on a Solaris-based system. This was causing some crashes in
 > RabbitMQ when it tries to calculate math:exp with inputs less than
 > -745.133.
 > 
 > Using OTP 17.4 on OmniOS r151006.
 > 
 > --
 > 
 > Erlang/OTP 17 [erts-6.3] [source] [smp:24:24] [async-threads:10]
 > [hipe] [kernel-poll:false]
 > 
 > Eshell V6.3  (abort with ^G)
 > 1> math:pow(2, -1074.999).
 > 5.0e-324
 > 2> math:pow(2, -1074) * math:pow(2, -1).
 > 0.0
 > 3> math:pow(2, -1075).
 > ** exception error: an error occurred when evaluating an arithmetic
 > expression
 >      in function  math:pow/2
 >         called as math:pow(2,-1075)
 > 4> math:exp(-745).
 > 5.0e-324
 > 5> math:exp(-746).
 > ** exception error: an error occurred when evaluating an arithmetic
 > expression
 >      in function  math:exp/1
 >         called as math:exp(-746)

I can reproduce this on Solaris 10 / SPARC.

I have reviewed the situation with matherr() on Linux/glibc and Solaris 10,
and I believe a reasonable resolution is to remove the #if !NO_FPE_SIGNALS
block in matherr(), so it reduces to a single "return 1;".

There are problems with checking math routine results for errors in general,
and the matherr() interface in particular.

1. The VM relies on !isfinite() to detect if a math routine failed.
   This appears to work on most systems, but there is a potential problem
   in how various systems and libm implementations behave: while most
   return HUGE_VAL (== INFINITY) on overflows, some return HUGE which is
   a large but finite value.  Solaris' cc -Xt does the latter, but gcc on
   Solaris does the former.  On my glibc-based Linux systems, matherr(3)
   lists HUGE as the return value on overflows for some routines, but my
   tests indicate that HUGE_VAL is returned instead, which, while good, is
   inconsistent with parts of the documentation.

   It's entirely possible that other libm implementations also return HUGE
   rather than HUGE_VAL on overflows, which thoroughly breaks our !isfinite()
   test.  On Linux there are at least 3 non-glibc libc/libm implementations,
   and who knows what's in all those *BSD variants.

2. matherr(), when properly enabled, is also called in situations the VM does
   not consider to be errors, in particular the underflow case you reported.
   When FP exceptions are also enabled, matherr() sets the FP exception flag,
   causing underflows to erroneously trigger errors.

   However, on systems where plain HUGE is returned for overflows, matherr()
   + FP exceptions may be the only viable way of detecting those errors.

3. As you discovered, matherr() isn't enabled by default on Linux.
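
To make point 1 concrete, here is a minimal C sketch of a result check based
on !isfinite(). It is not the VM's actual code: looks_like_error() and
FINITE_HUGE_STANDIN are hypothetical names, and DBL_MAX only stands in for a
large finite overflow value, since the real HUGE constant is platform-specific.

```c
#include <float.h>
#include <math.h>

/* Hypothetical stand-in for a libm that reports overflow with a large
   finite value; the real HUGE constant differs between platforms. */
#define FINITE_HUGE_STANDIN DBL_MAX

/* Returns 1 if a plain !isfinite() test would flag the result as an
   error (infinity or NaN), 0 otherwise. */
int looks_like_error(double result)
{
    return !isfinite(result);
}
```

Where HUGE_VAL is infinity, looks_like_error(HUGE_VAL) is true, but the finite
stand-in passes as a valid result, so such an overflow would go unnoticed.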

As long as we limit ourselves to systems that consistently return HUGE_VAL
on overflows, as Linux/glibc and Solaris with gcc do, we don't need matherr()
to detect errors, which is why having it just return 1 should be OK.
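
The distinction this relies on can be sketched in plain C, assuming IEEE 754
double arithmetic with default rounding. The helper names are hypothetical,
and the arithmetic only mimics what a HUGE_VAL-returning libm reports: an
underflow yields a finite 0.0, as in the math:pow(2, -1074) * math:pow(2, -1)
line of the session above, while an overflow yields infinity.

```c
#include <float.h>
#include <math.h>

/* Smallest positive subnormal double, 2^-1074 (shown as 5.0e-324 in the
   Erlang shell session above). */
#define MIN_SUBNORMAL 0x1p-1074

/* Halving 2^-1074 underflows: the result rounds to 0.0, which is finite. */
double underflowed(void)
{
    volatile double x = MIN_SUBNORMAL;
    return x * 0.5;
}

/* Doubling DBL_MAX overflows: the result is +infinity under IEEE 754
   default rounding. */
double overflowed(void)
{
    volatile double x = DBL_MAX;
    return x * 2.0;
}

/* The error test discussed in this thread: only non-finite results count. */
int is_math_error(double result)
{
    return !isfinite(result);
}
```

Here is_math_error(underflowed()) is false and is_math_error(overflowed()) is
true, which is the behaviour the !isfinite() test needs.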

Can you run the emulator test suite on your Solaris system, first with
vanilla 17.4 and then with the proposed code change, and check that the
test suite results are the same?

/Mikael