[erlang-bugs] Different handling of floating point underflows between Linux and Solaris-based OSes

Corey Cossentino corey@REDACTED
Wed Dec 31 17:41:28 CET 2014


OK, I finished running the tests on an OmniOS virtual machine. I'm not
completely sure how to interpret the results, but it looks like a lot
of tests are failing in both the patched and unpatched versions.

The differences I can see between the two runs, based on the
index.html file that was generated:
 tests.common_test_test - went from 1 failure to 3 with the code change
 tests.tools_test - went from 1 failure to 2 with the code change

Is there a file I should send over that would give more information?

On Fri, Dec 26, 2014 at 1:07 PM, Mikael Pettersson <mikpelinux@REDACTED> wrote:
> Corey Cossentino writes:
>  > I sent this yesterday but it doesn't look like it went through, so
>  > apologies if anyone gets this twice.
>  >
>  >
>  > Calculating math:pow(2, -1075) returns 0 on Linux, but causes an
>  > exception on a Solaris-based system. This was causing some crashes in
>  > RabbitMQ when it tries to calculate math:exp with inputs less than
>  > -745.133.
>  >
>  > Using OTP 17.4 on OmniOS r151006.
>  >
>  > --
>  >
>  > Erlang/OTP 17 [erts-6.3] [source] [smp:24:24] [async-threads:10]
>  > [hipe] [kernel-poll:false]
>  >
>  > Eshell V6.3  (abort with ^G)
>  > 1> math:pow(2, -1074.999).
>  > 5.0e-324
>  > 2> math:pow(2, -1074) * math:pow(2, -1).
>  > 0.0
>  > 3> math:pow(2, -1075).
>  > ** exception error: an error occurred when evaluating an arithmetic
>  > expression
>  >      in function  math:pow/2
>  >         called as math:pow(2,-1075)
>  > 4> math:exp(-745).
>  > 5.0e-324
>  > 5> math:exp(-746).
>  > ** exception error: an error occurred when evaluating an arithmetic
>  > expression
>  >      in function  math:exp/1
>  >         called as math:exp(-746)
>
> I can reproduce this on Solaris 10 / SPARC.
>
> I have reviewed the situation with matherr() on Linux/glibc and Solaris 10,
> and I believe a reasonable resolution is to remove the #if !NO_FPE_SIGNALS
> block in matherr(), so it reduces to a single "return 1;".
>
> There are problems with checking math routine results for errors in general,
> and the matherr() interface in particular.
>
> 1. The VM relies on !isfinite() to detect if a math routine failed.
>    This appears to work on most systems, but there is a potential problem
>    in how various systems and libm implementations behave: while most
>    return HUGE_VAL (== INFINITY) on overflows, some return HUGE which is
>    a large but finite value.  Solaris' cc -Xt does the latter, but gcc on
>    Solaris does the former.  On my glibc-based Linux systems, matherr(3)
>    lists HUGE as the return value on overflows for some routines, but my
>    tests indicate that HUGE_VAL is returned instead, which, while good, is
>    inconsistent with parts of the documentation.
>
>    It's entirely possible that other libm implementations also return HUGE
>    rather than HUGE_VAL on overflows, which thoroughly breaks our !isfinite()
>    test.  On Linux there are at least 3 non-glibc libc/libm implementations,
>    and who knows what's in all those *BSD variants.
>
> 2. matherr(), when properly enabled, is also called in situations the VM does
>    not consider to be errors, in particular the underflow case you reported.
>    When FP exceptions are also enabled, matherr() sets the FP exception flag,
>    causing underflows to erroneously trigger errors.
>
>    However, on systems where plain HUGE is returned for overflows, matherr()
>    + FP exceptions may be the only viable way of detecting those errors.
>
> 3. As you discovered, matherr() isn't enabled by default on Linux.
>
> As long as we limit ourselves to systems that consistently return HUGE_VAL
> on overflows, as Linux/glibc and Solaris w/ gcc do, we don't need matherr()
> to detect errors, which is why having it just return 1 should be OK.
>
> Can you run the emulator test suite on your Solaris system, first with
> vanilla 17.4 and then with the proposed code change, and check that the
> test suite results are the same?
>
> /Mikael
