[erlang-questions] String representations of floating point numbers

Sat Mar 16 12:32:07 CET 2019

   Hi, Bob,

On Fri, Mar 15, 2019 at 05:32:54PM -0700, Bob Ippolito wrote:
> If you dive into the implementation it's effectively a wrapper around
> strtod from C with a validation pass that is more strict than the strtod
> standard.
> 
> https://github.com/erlang/otp/blob/OTP-21.3/erts/emulator/sys/win32/sys_float.c#L54
> https://github.com/erlang/otp/blob/OTP-21.3/erts/emulator/sys/unix/sys_float.c#L732
> 
> So as far as a regex goes it would be something like this:
> 
> [+-]?\d+[.,]\d+([eE][+-]?\d+)?
> 
> The major differences between this and other popular float grammars are:
> 
> * At least one digit is required in each part
> * Both integer and fractional parts are required, even if there's an
> exponent part (so "1", ".1", "1e-1" would not be valid)
> * The decimal separator is either , or . (the implementation will try the
> other if necessary to compensate for a different locale)

   Thanks for the confirmation. That's more or less what I discovered
while playing around with list_to_float. It's the first two cases that
are the problems for me, because the spec I'm working to(*) says that
"1." and ".3" are valid floats, for example, as is "1e-1".
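
   As a rough illustration of the grammar described in the quoted
message, here is a minimal sketch of a pre-check using the re module.
The module and function names are hypothetical, and the pattern simply
mirrors the quoted regex (anchored, with the exponent group optional);
it has not been checked against the ERTS source, so treat it as an
approximation of what list_to_float/1 accepts.

```erlang
-module(float_grammar).
-export([looks_like_erlang_float/1]).

%% Assumption: this pattern is the regex quoted in the thread, anchored
%% at both ends, with the exponent part made optional.
-define(FLOAT_RE, "\\A[+-]?\\d+[.,]\\d+([eE][+-]?\\d+)?\\z").

%% Returns true if Text has the shape described in the thread:
%% digits, a separator, digits, and an optional exponent.
-spec looks_like_erlang_float(string()) -> boolean().
looks_like_erlang_float(Text) ->
    re:run(Text, ?FLOAT_RE, [{capture, none}]) =:= match.
```

   Under that assumption, "-1.0e-23" passes the check, while "-1",
"-1e-23", ".3" and "1." do not, which matches the shell session quoted
further down.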

   Just for the record, here's the code I'm using to convert a Turtle
double or decimal (the former in scientific notation; the latter
without the E) into a form suitable for list_to_float/1:

    [...]
    % W3C's description of a float is wider than Erlang's. We need to
    % split up the number into a few parts to add extra characters
    % where necessary so that list_to_float/1 will work right.
    F = case string:lexemes(Text, "eE") of
        [M, E] ->
            fixup_decimal(M) ++ "e" ++ E;
        [M] ->
            fixup_decimal(M)
    end,
    O = lagra_model:new_literal(list_to_float(F)),
    [...]

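-spec fixup_decimal(string()) -> string().
fixup_decimal(M) ->
    % string:split/2 keeps empty parts ("1." -> ["1", ""]), whereas
    % string:lexemes/2 drops them, so the empty-part clauses below
    % could never match with lexemes.
    case string:split(M, ".") of
        [I] ->
            % No decimal point at all: "1" -> "1.0"
            I++".0";
        [I, ""] ->
            % Trailing point: "1." -> "1.0"
            I++".0";
        ["", J] ->
            % Leading point: ".3" -> "0.3"
            "0."++J;
        [_, _] ->
            % Both parts present: already acceptable
            M
    end.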
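
   One case the helper above does not cover is a sign directly before
a bare decimal point, such as "-.3", which Turtle allows. A minimal
sketch of one way to handle it is to strip the sign first and put it
back afterwards. The module name and the to_float/1 and split_sign/1
functions are hypothetical, not part of the original code, and the
sketch uses string:split/2, which keeps empty parts, rather than
string:lexemes/2, which drops them.

```erlang
-module(turtle_float).
-export([to_float/1]).

%% Hypothetical wrapper: converts a Turtle decimal or double lexical
%% form into an Erlang float. Assumes Text is already a valid Turtle
%% number; anything else raises badarg from list_to_float/1 or a
%% case_clause error.
-spec to_float(string()) -> float().
to_float(Text) ->
    {Sign, Rest} = split_sign(Text),
    Fixed = case string:lexemes(Rest, "eE") of
        [M, E] -> fixup_decimal(M) ++ "e" ++ E;
        [M]    -> fixup_decimal(M)
    end,
    list_to_float(Sign ++ Fixed).

%% Separate an optional leading sign from the rest of the number.
-spec split_sign(string()) -> {string(), string()}.
split_sign([$- | Rest]) -> {"-", Rest};
split_sign([$+ | Rest]) -> {"", Rest};
split_sign(Rest)        -> {"", Rest}.

%% Pad the mantissa so both integer and fractional parts are present.
-spec fixup_decimal(string()) -> string().
fixup_decimal(M) ->
    case string:split(M, ".") of
        [I]     -> I ++ ".0";
        [I, ""] -> I ++ ".0";
        ["", J] -> "0." ++ J;
        [_, _]  -> M
    end.
```

   With this sketch, to_float("-.3") goes through "0.3" and then
"-0.3" before reaching list_to_float/1, and to_float("1e-1") goes
through "1.0e-1".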

   Hugo.

(*) W3C's Turtle recommendation.

> On Fri, Mar 15, 2019 at 3:52 PM Hugo Mills <hugo@REDACTED> wrote:
> 
> >    Where in the manual is the set of allowable string representations
> > of floating point numbers documented? I'd have expected it to be here:
> >
> > http://erlang.org/doc/reference_manual/data_types.html
> >
> > ... but apparently not.
> >
> >    Specifically, I'm trying to use list_to_float/1, and I've been
> > trying to reverse engineer it:
> >
> > 1> list_to_float("-1").
> > ** exception error: bad argument
> >      in function  list_to_float/1
> >         called as list_to_float("-1")
> > 2> list_to_float("-1.0").
> > -1.0
> > 3> list_to_float("-1.0e-23").
> > -1.0e-23
> > 4> list_to_float("-1e-23").
> > ** exception error: bad argument
> >      in function  list_to_float/1
> >         called as list_to_float("-1e-23")
> > 5> list_to_float(".3").
> > ** exception error: bad argument
> >      in function  list_to_float/1
> >         called as list_to_float(".3")
> >
> >    An actual written specification would be really handy here. Even
> > just a regex or EBNF for them. I'm writing a parser for something
> > where the definition of floating point literals isn't quite the same
> > as Erlang's, and it's a bit painful.
> >
> >    Hugo.
> >

-- 
Hugo Mills             | Nostalgia isn't what it used to be.
hugo@REDACTED carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |