This EEP adds floating point literals in bases other than ten, such as 16 or 2, to allow an exact text representation, as also found in Ada and C99/C++17.
Computers represent floating point numbers in binary, but such numbers
are typically printed using base ten, for example 0.314159265e1. In
order to preserve the exact bit level precision when writing a printed
number as text and later reading it back again, it is better to use a
base that matches the internally used base, such as 16 for a compact
but still exact representation, or 2 for visualizing or writing down
the exact internal format. One particular case where such exact
representations are useful is in code generating tools.
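As a rough illustration of the round-trip argument, the following sketch prints a double both ways and reads it back. It is written in Python rather than Erlang, and it uses Python's own C99-style hexadecimal format, not the syntax proposed here. The value 1/3 and the choice of ten significant decimal digits are arbitrary assumptions made for the example.

```python
# Assumption: Python floats are IEEE 754 doubles, as on common platforms.
x = 1.0 / 3.0

hex_text = x.hex()      # hexadecimal mantissa with a power-of-two exponent
dec_text = f"{x:.9e}"   # decimal text with ten significant digits

# Read both texts back and compare with the original value.
hex_roundtrip = float.fromhex(hex_text)
dec_roundtrip = float(dec_text)

print(hex_text, hex_roundtrip == x)   # the hexadecimal text reads back exactly
print(dec_text, dec_roundtrip == x)   # the shortened decimal text does not
```

A decimal form with enough digits (17 significant digits for a double) also reads back to the same value, but the hexadecimal form shows the stored bits directly.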
Some other languages have support for floating point literals in other
bases. Notably, C99/C++17 floating-point literals can be written in
hexadecimal, as e.g. 0xf.ffp8, where the p indicates that the exponent
is a power of 2, and in the Ada programming language, the
corresponding syntax is 16#F.FF#E8. The latter should look familiar to
Erlang users, because it is from Ada that the <Base>#<Numeral> syntax
was borrowed. (Ada however requires a final # even for integers, e.g.
16#fffe#.) Ada also allows base 2 in a floating point number, e.g.
2#0.1111_1111#E8, using underscores as separators just like Erlang
does.
Where Erlang differs from Ada is that Ada does not allow the base to
be larger than 16. Hence, in Ada, 2#111#, 7#10#, and 16#7# are all the
same number, but 17#7# is illegal, while Erlang allows any base up to
36 in its integer literals, e.g. 36#z. It should also be noted that
the letter p, which C99 hexadecimal literals use to mark the exponent,
is a valid digit in Erlang for bases above 25, whereas C99 only allows
digits up to f (and could not use e for exponents in hex, since e is
itself a hexadecimal digit). Because the Ada notation requires a #
character before the exponent, it has no ambiguity between digits and
exponent indicator.
Staying with the Ada notation then seems to be the wise choice, both for consistency and because it makes it trivial to keep allowing any base up to 36 also for floating point literals in Erlang.
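The digit and exponent-marker ambiguity discussed above can be checked with a few lines of Python. This is only a sketch: the helper name is_digit_in_base is hypothetical, and the digit alphabet (0-9 followed by a-z, case-insensitive) is assumed to match the one Erlang uses for integer literals up to base 36.

```python
import string

# Digit alphabet for bases up to 36: 0-9 followed by a-z.
DIGITS = string.digits + string.ascii_lowercase

def is_digit_in_base(ch, base):
    """Return True if the single character ch is a valid digit in base (2..36)."""
    if len(ch) != 1:
        return False
    value = DIGITS.find(ch.lower())
    return 0 <= value < base

# The three spellings of seven mentioned in the text.
same = int("111", 2) == int("10", 7) == int("7", 16) == 7

# 'p' has digit value 25, so it is a digit from base 26 upward;
# 'e' has digit value 14, so it is already a digit in base 16.
p_in_16 = is_digit_in_base("p", 16)
p_in_26 = is_digit_in_base("p", 26)
e_in_16 = is_digit_in_base("e", 16)

print(same, p_in_16, p_in_26, e_in_16)
```

With a # required before the exponent, a scanner never has to decide whether a trailing e or p belongs to the digits or starts the exponent.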
Examples:
2#0.111
2#0.10101#e8
16#ff.ff
16#fefe.fefe#e16
32#vrv.vrv#e15
It should be noted that both the base and the exponent are always
interpreted in base ten. Only the digits between the two # characters
are interpreted using the given base. Because Erlang uses the #
character at the start of constructs such as maps #{...}, we do not
want to allow a final trailing # in a number, like Ada does. If there
is a second #, it must be followed by the exponent.
In addition to the current based notation:
base # based_numeral
(borrowing Ada’s terminology), where base is a decimal number and
based_numeral is a sequence of digits in 0-9 and a-z or A-Z,
optionally separated with _, we extend the parser to also allow:
base # based_numeral.based_numeral [ # exponent ]
where exponent is the letter e or E followed by an optionally signed
decimal number, exactly as in ordinary decimal floating point
literals.
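The extended notation above can be sketched as a small recognizer in Python. This is not the reference implementation, and the name parse_based_float is hypothetical. It makes two assumptions that the text does not state: the exponent scales the value by a power of the given base, following the Ada convention, and underscores are accepted only inside the based numerals, not in the base or the exponent.

```python
import re
from fractions import Fraction

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
NUMERAL = r"[0-9a-zA-Z]+(?:_[0-9a-zA-Z]+)*"

# base # based_numeral . based_numeral [ # exponent ]
LITERAL = re.compile(
    r"(?P<base>[0-9]+)#"
    rf"(?P<int>{NUMERAL})\.(?P<frac>{NUMERAL})"
    r"(?:#[eE](?P<exp>[+-]?[0-9]+))?"
)

def parse_based_float(text):
    """Convert a based float literal to a Python float (hypothetical helper)."""
    m = LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not a based float literal: {text!r}")
    base = int(m["base"])                      # the base is read in base ten
    if not 2 <= base <= 36:
        raise ValueError(f"base out of range: {base}")
    int_digits = m["int"].replace("_", "").lower()
    frac_digits = m["frac"].replace("_", "").lower()
    mantissa = 0
    for ch in int_digits + frac_digits:        # digits use the given base
        digit = DIGITS.index(ch)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        mantissa = mantissa * base + digit
    # The exponent is read in base ten. Assumption: it is a power of the base.
    exponent = int(m["exp"] or 0) - len(frac_digits)
    # Exact rational arithmetic, rounded once when converting to a float.
    return float(Fraction(mantissa) * Fraction(base) ** exponent)

print(parse_based_float("2#0.111"))
print(parse_based_float("16#ff.ff"))
print(parse_based_float("2#0.10101#e8"))
```

Under these assumptions, a literal with a trailing # and no exponent, such as 16#ff.ff#, is rejected with a ValueError. Integer literals such as 16#ff are also rejected, since the sketch only covers the floating point form.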
A reference implementation exists in the hexbinfloat branch of the
author’s GitHub account, together with a GitHub pull request to the
Erlang/OTP repository.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.