Author:: Richard Carlsson <carlsson.richard(at)gmail(dot)com>
Status:: Draft
Type:: Standards Track
Created:: 7-Jan-2025
Erlang-Version:: OTP-28.0

EEP 75: Based Floating Point Literals #

Abstract #

This EEP adds floating point literals using other bases than ten, such as 16 or 2, for exact text representation, as also found in Ada and C99/C++17.

Rationale #

Computers represent floating point numbers in binary, but such numbers are typically printed using base ten, for example 0.314159265e1. In order to preserve the exact bit level precision when writing a printed number as text and later reading it back again, it is better to use a base that matches the internally used base, such as 16 for a compact but still exact representation, or 2 for visualizing or writing down the exact internal format. One particular case where such exact representations are useful is in code generating tools.

Some other languages have support for floating point literals in other bases. Notably, C99/C++17 floating-point literals can be written in hexadecimal, as e.g. 0xf.ffp8, where the p indicates that the exponent is a power of 2, and in the Ada programming language, the corresponding syntax is 16#F.FF#E8. The latter should look familiar to Erlang users, because it is from Ada that the <Base>#<Numeral> syntax was borrowed. (Ada however requires a final # even for integers, e.g. 16#fffe#.) Ada also allows base 2 in a floating point number, e.g. 2#0.1111_1111#E8, using underscores as separators just like Erlang does.

Where Erlang differs from Ada is that Ada does not allow the base to be larger than 16. Hence, in Ada, 2#111#, 7#10#, and 16#7# are all the same number, but 17#7# is illegal, while Erlang allows any base up to 36 in its integer literals, e.g. 36#z. It should also be noted that in the C99 hexadecimal literals, the letter p for the exponent is a valid digit in bases above 25 in Erlang, whereas C99 only allows digits up to f (and could not use e for exponents in hex). Because the Ada notation requires a # character before the exponent, it has no ambiguity between digits and exponent indicator.

Staying with the Ada notation then seems to be the wise choice, both for consistency and because it makes it trivial to keep allowing any base up to 36 also for floating point literals in Erlang.

Examples:

2#0.111
2#0.10101#e8
16#ff.ff
16#fefe.fefe#e16
32#vrv.vrv#e15

It should be noted that both the base and the exponent are always interpreted in base ten. Only the digits between the two # characters are interpreted using the given base. Because Erlang uses the # characters at the start of constructs such as maps #{...}, we do not want to allow a final trailing # in a number, like Ada does. If there is a second #, it must be followed by the exponent.

Specification #

In addition to the current based notation:

base # based_numeral

(borrowing Ada’s terminology), where base is a decimal number and based_numeral is a sequence of digits in 0-9 and a-z or A-Z, optionally separated with _, we extend the parser to also allow:

base # based_numeral.based_numeral [ # exponent ]

where exponent is the letter e or E followed by an optionally signed decimal number, exactly as in ordinary decimal floating point literals.

Reference Implementation #

A reference implementation exists in the hexbinfloat branch of the author’s GitHub account, together with a GitHub pull request to the Erlang/OTP repository.

Copyright #

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.