[eeps] EEP-51: Underscore digit separator in numeric literals

Сергей Прохоров seriy.pr@REDACTED
Thu Oct 10 01:50:22 CEST 2019


It was also submitted as PR on GitHub https://github.com/erlang/eep/pull/7

    Author: Sergey Prokhorov <seriy(dot)pr(at)gmail(dot)com>
    Status: Draft
    Type: Standards Track
    Erlang-Version: 23.0
    Created: 07-Oct-2019
    Post-History:
****
EEP 51: Underscore digit separator in numeric literals
----


Abstract
========

This EEP extends numeric literal syntax, allowing underscore `_` as digits
separator. It extends syntax of integer, float and based integer literals.


Copyright
=========

This document has been placed in the public domain.


Specification
=============

Current syntax for numeric literals can be described by following scheme:


    DIGIT = [0-9]
    BASE_DIGIT = [0-9a-zA-Z]
    SIGN = [+-]

    INTEGER = SIGN? DIGIT+

    FLOAT = SIGN? DIGIT+ '.' DIGIT+ ('e' SIGN? DIGIT+)?

    BASED_INTEGER = SIGN? DIGIT+ '#' BASE_DIGIT+


EEP extends this syntax by allowing single underscore `_` character as a
separator between two `DIGIT` or `BASE_DIGIT` characters. I.e., characters
from
both left and right sides of `_` should be digits. Leading, trailing or
repeated
underscores should not be allowed as well as underscores next to `-` `.`
`#` `e`.
Separator is only used to visually separate groups of characters. It doesn't
change semantics of literal and is removed by lexer.

Following are examples of valid numeric literals:


    123_456
    1_2_3_4_5
    123_456.789_123
    1.0e1_23
    16#DEAD_BEEF
    2#1100_1010_0011


And following are examples of literals that are not allowed:


    _123  % will be interpreted as variable name
    123_
    123__456  % only single underscore allowed
    123_.456  % underscore can only separate digits, not other symbols
    123._456
    16#_1234
    16#1234_


The underscores are to be lost when representing abstract forms or terms as
strings (`erl_pp`, `io_lib`).


Motivation
==========

Digit separator becomes especially handy for relatively long literals like
thousands separator for decimals `Million = 1_000_000`, `SecondsInDay =
86_400`
byte separator for hexadecimal `16#DEAD_BEEF_CAFE`. Group separator makes
such literals more readable (when used wisely) and makes it easier to
visualy
spot typos like `MicroSecond = Second * 100000`.
The digit separator feature was adopted by multiple programming languages
starting from at least Ada83.


Rationale
=========

Underscore character was choosen as separator because it is used by multiple
other languages such as "Ada 83", "C#", "Java 7", "Perl 2.0", "Python 3.6",
"Ruby", "Rust", "Swift" and "Go". Also it doesn't conflict with current
Erlang
syntax (unlike, for example, `,`).
The only ambiguous decision was about where exactly digit separators should
be
allowed. It was decided that underscore should be allowed only between
digits
as meant by being a "digit separator".


Backwards Compatibility
=======================

All existing Erlang code remains acceptable. The implementation is done
entirely in the Erlang lexer, so, no changes are needed in parser and
even tools that examine ASTs will be unaffected.
External tools like code editors and syntax highlighters might need to be
updated to be able to recognize new syntax for numeric literals.


Reference Implementation
========================

The reference implementation is provided in a form of pull-request on GitHub

    https://github.com/erlang/otp/pull/2324
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/eeps/attachments/20191010/619bc3cb/attachment.htm>


More information about the eeps mailing list