This EEP proposes the addition of a new, strict variant of all
existing generators (list, bit string and map). Currently existing
generators are relaxed: they ignore terms in the right-hand side
expression that do not match the left-hand side pattern. Strict
generators on the other hand shall fail with exception badmatch
.
The motivation for strict generators is that relaxed generators can hide the presence of unexpected elements in the input data of a comprehension. For example consider the below snippet:
[{User, Email} || #{user := User, email := Email} <- all_users()]
This list comprehension would filter out users that don’t have an email
address. This may be an issue if we suspect potentially incorrect input
data, like in case all_users/0
would read the users from a JSON file.
Therefore cautious code that would prefer crashing instead of silently
filtering out incorrect input would have to use a more verbose map
function:
lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
all_users())
Unlike the generator, the anonymous function would crash on a user without an email address. Strict generators would allow similar semantics in comprehensions too:
[{User, Email} || #{user := User, email := Email} <:- all_users()]
This generator would crash (with a badmatch
error) if the pattern
wouldn’t match an element of the list.
The proposed operators for strict generators are <:-
(for lists
and maps) and <:=
(for bit strings) instead of <-
and <=
. This
syntax was chosen because <:-
and <:=
somewhat resemble the =:=
operator that tests whether two terms match, and at the same time keep
the operators short and easy to type. Having the two types of operators
differ by a single character, :
, also makes the operators easy to
remember as “:
means strict.”
There are numerous other ways of achieving the goal of crashing upon encountering non-matching input:
Converting the comprehension to a map
call, just as in the example
of the previous section.
Moving the pattern matching that may fail to the result expression of the comprehension:
[begin #{user:= User, email := Email} = Usr, {User, Email} end || Usr <- all_users()]
Moving the pattern matching that may fail to a filter in the comprehension (which would still have to evaluate to a boolean, making this solution quite awkward looking):
[{User, Email} || Usr <- all_users(), (#{user := User, email := Email} = Usr) > 0]
The same as before, but using binders proposed in EEP 12:
[{User, Email} || Usr <- all_users(), #{user := User, email := Email} = Usr]
Most of these solutions are much more verbose than the strict generator syntax, undermining the compactness of comprehensions. But there are more serious issues too:
1 would not work for bit string comprehensions, because there isn’t
a map
function available for bit strings.
2 would make it impossible to use sub terms of the pattern (Usr
in
this example) in subsequent generators or filters.
The intent of the code isn’t clear in most of these solutions (definitely not in 3, and arguably in 4 and 2).
None of these techniques can solve the problem of final non-matching bits of a bit string.
The issue with bit string generators is that unlike lists and maps, bit strings don’t have natural “elements” the generator could iterate over. Instead the pattern in the generator dictates where to split the bit string into parts. This could inevitably lead to situations where the bit string ends with some bits that don’t match the pattern, like in this example:
[X || <<X:16>> <- <<1,2,3>>]
The existing generator would skip these final non-matching bits, and this behaviour cannot be changed due to backward compatibility reasons. It is only possible to guarantee that a bit string generator would fully consume its input by introducing a new type of bit string generator or some other new syntax that would alter the behaviour of the existing generator.
Current implementation: PR #8625.
There are no backward compatibility issues with the proposed new syntax,
since the <:-
and <:=
operators used to be invalid syntax in
previous versions of Erlang.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.