[erlang-bugs] Different handling of floating point underflows between Linux and Solaris-based OSes
Corey Cossentino
corey@REDACTED
Wed Dec 31 17:41:28 CET 2014
OK, finished running the tests on an OmniOS virtual machine. I'm not
completely sure how to interpret the results, but it looks like a lot
of tests are failing in both the patched and unpatched versions.
The differences I can see between the two runs, based on the
index.html file that was generated:
tests.common_test_test - went from 1 failure to 3 with the code change
tests.tools_test - went from 1 failure to 2 with the code change
Is there a file I should send over that would give more information?
On Fri, Dec 26, 2014 at 1:07 PM, Mikael Pettersson <mikpelinux@REDACTED> wrote:
> Corey Cossentino writes:
> > I sent this yesterday but it doesn't look like it went through, so
> > apologies if anyone gets this twice.
> >
> >
> > Calculating math:pow(2, -1075) returns 0 on Linux, but causes an
> > exception on a Solaris-based system. This was causing some crashes in
> > RabbitMQ when it tries to calculate math:exp with inputs less than
> > -745.133.
> >
> > Using OTP 17.4 on OmniOS r151006.
> >
> > --
> >
> > Erlang/OTP 17 [erts-6.3] [source] [smp:24:24] [async-threads:10]
> > [hipe] [kernel-poll:false]
> >
> > Eshell V6.3 (abort with ^G)
> > 1> math:pow(2, -1074.999).
> > 5.0e-324
> > 2> math:pow(2, -1074) * math:pow(2, -1).
> > 0.0
> > 3> math:pow(2, -1075).
> > ** exception error: an error occurred when evaluating an arithmetic
> > expression
> > in function math:pow/2
> > called as math:pow(2,-1075)
> > 4> math:exp(-745).
> > 5.0e-324
> > 5> math:exp(-746).
> > ** exception error: an error occurred when evaluating an arithmetic
> > expression
> > in function math:exp/1
> > called as math:exp(-746)
>
> I can reproduce this on Solaris 10 / SPARC.
>
> I have reviewed the situation with matherr() on Linux/glibc and Solaris 10,
> and I believe a reasonable resolution is to remove the #if !NO_FPE_SIGNALS
> block in matherr(), so it reduces to a single "return 1;".
>
> There are problems with checking math routine results for errors in general,
> and the matherr() interface in particular.
>
> 1. The VM relies on !isfinite() to detect if a math routine failed.
> This appears to work on most systems, but there is a potential problem
> in how various systems and libm implementations behave: while most
> return HUGE_VAL (== INFINITY) on overflows, some return HUGE which is
> a large but finite value. Solaris' cc -Xt does the latter, but gcc on
> Solaris does the former. On my glibc-based Linux systems, matherr(3)
> lists HUGE as the return value on overflows for some routines, but my
> tests indicate that HUGE_VAL is returned instead, which while good is
> inconsistent with parts of the documentation.
>
> It's entirely possible that other libm implementations also return HUGE
> rather than HUGE_VAL on overflows, which thoroughly breaks our !isfinite()
> test. On Linux there are at least 3 non-glibc libc/libm implementations,
> and who knows what's in all those *BSD variants.
>
> 2. matherr(), when properly enabled, is also called in situations the VM does
> not consider to be errors, in particular the underflow case you reported.
> When FP exceptions are also enabled, matherr() sets the FP exception flag,
> causing underflows to erroneously trigger errors.
>
> However, on systems where plain HUGE is returned for overflows, matherr()
> + FP exceptions may be the only viable way of detecting those errors.
>
> 3. As you discovered, matherr() isn't enabled by default on Linux.
>
> As long as we limit ourselves to systems that consistently return HUGE_VAL
> on overflows, as Linux/glibc and Solaris w/ gcc do, we don't need matherr()
> to detect errors, which is why having it just return 1 should be OK.
>
> Can you run the emulator test suite on your Solaris system, first with
> vanilla 17.4 and then with the proposed code change, and check that the
> test suite results are the same?
>
> /Mikael