This EEP proposes new syntax for UTF-8 binary string literals and patterns to bring them into line with list-strings.
List-strings (i.e. strings represented by lists of unicode codepoints) have a
convenient syntax: "This is my string"
, but the corresponding syntax for
UTF-8 encoded binary strings is more cumbersome: <<"This is my string"/utf8>>
.
Here, we propose a lightweight, alternative syntax for UTF-8 binary string literals:
Str = b"This is my string".
and patterns:
case Str of
b"This is my string" -> ok;
_ -> error
end.
Early during compilation and shell evaluation, the new syntax would be desugared to
the corresponding existing syntax, e.g. b"This is my string"
is rewritten to
<<"This is my string"/utf8>>
.
TBD
The new syntax is invalid in older releases, so would not impact existing code.
The implementation of the new syntax would purely be via an early re-writing step, so would be desugared to the existing representation before later compiler stages. This would mean bytecode would be unaffected, but debugging/AST data would reflect the new source represenation.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.