[erlang-questions] IEEE-754 subnormals parsing and handling problems and bugs

Tue Dec 27 23:48:25 CET 2011

Hello!

Interesting stuff! Do you have any information about how other high
level languages handle this? Maybe looking at how Ruby and Python
behave in these edge cases will help to create a good solution.

Too bad that a change which would create exceptions when working with
small floats breaks backwards compatibility quite severely. Any ideas
on how to do this without breaking backwards compatibility? For
list_to_float it would be easy, but float division and multiplication
are harder.
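
To illustrate the list_to_float case, here is a minimal sketch of an
opt-in strict wrapper that leaves the existing BIF untouched. The module
name strict_float and the functions parse/1 and is_subnormal/1 are
hypothetical, not part of OTP. It assumes that an underflowing input
should raise badarg, mirroring what the overflow case already does.

```erlang
-module(strict_float).
-export([parse/1, is_subnormal/1]).

%% Smallest positive normal IEEE-754 double (2^-1022).
-define(MIN_NORMAL, 2.2250738585072014e-308).

%% True for nonzero floats whose magnitude is below the smallest normal double.
is_subnormal(F) when is_float(F) ->
    A = abs(F),
    A > 0.0 andalso A < ?MIN_NORMAL.

%% Like erlang:list_to_float/1, but raises badarg when the result is
%% subnormal, or when a nonzero literal was rounded down to zero.
parse(S) ->
    F = erlang:list_to_float(S),
    Underflow = is_subnormal(F)
        orelse (F == 0.0 andalso mantissa_nonzero(S)),
    case Underflow of
        true -> erlang:error(badarg, [S]);
        false -> F
    end.

%% Checks only the digits before any exponent marker, so that "0.0e5"
%% still counts as a genuine zero.
mantissa_nonzero(S) ->
    Mantissa = lists:takewhile(fun(C) -> C =/= $e andalso C =/= $E end, S),
    lists:any(fun(C) -> C >= $1 andalso C =< $9 end, Mantissa).
```

Because this is a separate function, existing callers of list_to_float/1
keep their current behaviour, and only code that asks for strict parsing
gets the exception.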

Lukas

On Tue, Dec 27, 2011 at 11:30 PM, Witold Baryluk
<baryluk@REDACTED> wrote:
> Hello,
>
> I found a problem when parsing small numbers:
>
> 1> list_to_float("0."++lists:duplicate(322, $0)++"1").
> 1.0e-323
> 2> list_to_float("0."++lists:duplicate(323, $0)++"1").
> 0.0
>
> This contrasts with the handling of big numbers:
>
> 3> list_to_float("1"++lists:duplicate(308, $0)++".0").
> 1.0e308
> 4> list_to_float("1"++lists:duplicate(309, $0)++".0").
> ** exception error: bad argument
>     in function  list_to_float/1
>        called as list_to_float("1000000...[lots of zeros removed]......000000000.0")
>
> Example 2 surely should throw an error. But actually so should example 1,
> because it creates a so-called subnormal number (also known as a denormal
> number or underflow value). Take a look at these two examples:
>
> 5> list_to_float("0."++lists:duplicate(322, $0)++"123456789").
> 1.0e-323
> 6> list_to_float("0."++lists:duplicate(300, $0)++"123456789").
> 1.23456789e-301
>
>
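Whether a given result is subnormal can be checked by decoding its bit
pattern. The following is a minimal sketch; the module float_class and
the function classify/1 are hypothetical names, and it relies on the
standard binary64 layout (1 sign bit, 11 exponent bits, 52 fraction
bits): a zero exponent field with a nonzero fraction marks a subnormal.

```erlang
-module(float_class).
-export([classify/1]).

%% Classifies an Erlang float by its IEEE-754 binary64 fields.
%% Erlang floats are never infinite or NaN, so three classes suffice.
classify(F) when is_float(F) ->
    <<_Sign:1, Exp:11, Frac:52>> = <<F/float>>,
    case {Exp, Frac} of
        {0, 0} -> zero;       % +0.0 or -0.0
        {0, _} -> subnormal;  % exponent field 0, fraction nonzero
        _      -> normal
    end.
```

With this sketch, classify(1.0e-323) gives subnormal, while
classify(1.23456789e-301) gives normal.
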
> One can check how arithmetic exception handling works here:
>
> 7> 1.0e200 * 1.0e200.
> ** exception error: bad argument in an arithmetic expression
>     in operator  */2
>        called as 1.0e200 * 1.0e200
>
> But this doesn't happen for small numbers:
>
> 8> 0.123456789e-300.
> 1.23456789e-301
> 9> 0.123456789e-400.
> 0.0
> 10> 0.123456789e-320.
> 1.235e-321
>
> 11> 0.123456789e-100 * 0.123456789e-100.
> 1.524157875019052e-202
> 12> 0.123456789e-200 * 0.123456789e-200.
> 0.0
>
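For arithmetic, an explicit checked operation is one way to get the
exception without changing the built-in operator. This is only a sketch:
the module checked_arith and the function mul/2 are hypothetical, and
raising badarith on underflow is an assumption chosen to match the
existing overflow behaviour.

```erlang
-module(checked_arith).
-export([mul/2]).

%% Smallest positive normal IEEE-754 double (2^-1022).
-define(MIN_NORMAL, 2.2250738585072014e-308).

%% Multiplies two floats and raises badarith when the product of two
%% nonzero operands is subnormal or has been flushed to zero.
%% Overflow is already trapped by the runtime.
mul(X, Y) when is_float(X), is_float(Y) ->
    P = X * Y,
    Underflow = abs(P) < ?MIN_NORMAL
        andalso X /= 0.0 andalso Y /= 0.0,
    case Underflow of
        true -> erlang:error(badarith, [X, Y]);
        false -> P
    end.
```

Here mul(0.123456789e-100, 0.123456789e-100) returns the same value as
the plain operator, while mul(0.123456789e-200, 0.123456789e-200) raises
instead of returning 0.0.
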
> Why are infinities trapped, but subnormals not? Because there is no good
> syntax for NaNs and infinities, but there is for subnormals and zero?
> As for speed, subnormal processing is in fact much slower than handling
> infinities. This means that, for example, adding lots of numbers near
> 1.0e-320 can be a few times slower than adding normal numbers. Other
> operations, like square root, trigonometry or multiplication, can have an
> even bigger performance impact.
>
> 13> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-200)]).
> {164756,1.0000000000056682e-195}
> 14> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-300)]).
> {163616,9.99999999972789e-296}
> 15> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-310)]).
> {238753,1.0000000000158536e-305}
> 16> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-320)]).
> {471354,9.98012605e-316}
> 17> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-321)]).
> {471283,9.881313e-317}
> 18> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-322)]).
> {471398,9.881313e-318}
> 19> timer:tc(lists, sum, [lists:duplicate(1000000, 0.1e-323)]).
> {164053,0.0}
>
> % on my machine (Athlon). I also tested on an Intel Core2 in 32-bit mode,
> % and the differences are much bigger, up to 20 times slower!
>
>
> I found similar behaviour in erl_scan:string/1, which is used by the
> compiler to parse shell input and source files.
>
> 20> erl_scan:string("1"++lists:duplicate(309, $0)++".0").
> {error,{1,erl_scan,{illegal,float}},1}
> 21> erl_scan:string("1"++lists:duplicate(308, $0)++".0").
> {ok,[{float,1,1.0e308}],1}
>
> Everything is fine here, infinities are trapped.
>
> Unfortunately, for underflows the compiler behaves similarly wrongly:
>
> 22> erl_scan:string("0."++lists:duplicate(322, $0)++"1").
> {ok,[{float,1,1.0e-323}],1} % should return error
> 23> erl_scan:string("0."++lists:duplicate(323, $0)++"1").
> {ok,[{float,1,0.0}],1} % should return error
>
> string:to_float also behaves in the same way:
>
> 24> string:to_float("0."++lists:duplicate(322, $0)++"1").
> {1.0e-323,[]} % should return error
> 25> string:to_float("0."++lists:duplicate(323, $0)++"1").
> {0.0,[]} % should return error
>
> 26> string:to_float("1"++lists:duplicate(308, $0)++".0").
> {1.0e308,[]}
> 27> string:to_float("1"++lists:duplicate(309, $0)++".0").
> {error,no_float}
>
>
>
> I think it should be fixed, so that more reliable software can be written,
> such as statistics software. Subnormals can easily appear when summing
> and multiplying small numbers; this is normally rare, but it should throw
> an error rather than produce bad results, because even smart summation
> algorithms, like the Kahan scheme, will not fix this problem.
>
> Such floats should certainly not be allowed in source code, or when
> parsing from a string. They should probably also not be allowed to result
> from any built-in arithmetic functions.
>
>
> Another example:
>
> 30> math:exp(100).
> 2.6881171418161356e43
> 31> math:exp(1000).
> ** exception error: bad argument in an arithmetic expression
>     in function  math:exp/1
>        called as math:exp(1000)
>
> Infinity is trapped, but underflow is not:
>
> 32> math:exp(-100).
> 3.720075976020836e-44
> 33> math:exp(-1000).
> 0.0
>
>
> I also found that scientific notation behaves in the same way:
>
> 34> list_to_float("1.123456789e-320").
> 1.1235e-320
> 35> list_to_float("1.123456789e-330").
> 0.0
>
>
> My last argument is about hardware support. For example, many new
> ARM processors support hardware floating point computations, but often
> without support for subnormals! This makes them not fully IEEE-754
> compliant, unless running in software floating point mode, which is
> slower, especially if it also needs to handle subnormals!
>
>
> I'm using a 32-bit CPU:
>
> model name      : AMD Athlon(tm)
> stepping        : 2
> cpu MHz         : 1154.450
> cache size      : 256 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 1
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow up
>
>
> Erlang version 1:14.b.4-dfsg-1, installed from Debian GNU/Linux testing.
> I do not know the exact compilation flags, but here is some information:
>
> 1> erlang:system_info(system_version).
> "Erlang R14B04 (erts-5.8.5) [source] [rq:1] [async-threads:0] [hipe] [kernel-poll:false]\n"
> 2> erlang:system_info(system_architecture).
> "i486-pc-linux-gnu"
> 3> erlang:system_info(build_type).
> opt
> 4> erlang:system_info(c_compiler_used).
> {gnuc,{4,6,1}}
> 5> erlang:system_info(debug_compiled).
> false
> 6> erlang:system_info(smp_support).
> false
> 7> erlang:system_info(threads).
> true
>
>
>
>
> Speed differences for subnormals on an Intel Core2 (32-bit mode), same compiler, same options:
>
> 1> timer:tc(lists, sum, [lists:duplicate(10000000, 0.1e-300)]).
> {278703,9.999999998591641e-295}
> 3> timer:tc(lists, sum, [lists:duplicate(10000000, 0.1e-310)]).
> {1049095,9.999999997470606e-305}
> 5> timer:tc(lists, sum, [lists:duplicate(10000000, 0.1e-320)]).
> {3501330,9.980126046e-315}
>
> % about 12 times slower, and obviously a loss of precision.
>
> Beyond infinities, NaN, and denormalized numbers, there is also
> signed zero, but this is handled without problems:
>
> 10> list_to_float("0.0") =:= list_to_float("-0.0").
> true
> 11> math:sqrt(list_to_float("-0.0")).
> 0.0
>
> This is acceptable, because signed zero is mostly useful together with NaN
> and infinity support, and because we do not have them, ignoring the
> signed zero problem is a good solution.
>
>
> I found no earlier discussion of this on the erlang-questions list, so I
> hope it is worth discussing.
>
> Also, the Erlang Reference Manual User's Guide doesn't mention anything on the matter.
>
> Regards,
> Witek
>
>
>
>
> --
> Witold Baryluk