Author:
Dániel Szoboszlay <dszoboszlay@gmail.com>
Status:
Final/28.0 Implemented in OTP version 28.0
Type:
Standards Track
Created:
01-Jul-2024
Erlang-Version:
28
Post-History:
https://erlangforums.com/t/eep-70-non-filtering-generators/3937 https://github.com/erlang/otp/pull/8625

EEP 70: Strict and relaxed generators #

Abstract #

This EEP proposes the addition of a new, strict variant of all existing generators (list, bit string and map). Currently existing generators are relaxed: they ignore terms in the right-hand side expression that do not match the left-hand side pattern. Strict generators on the other hand shall fail with exception badmatch.

Rationale #

The motivation for strict generators is that relaxed generators can hide the presence of unexpected elements in the input data of a comprehension. For example consider the below snippet:

[{User, Email} || #{user := User, email := Email} <- all_users()]

This list comprehension would filter out users that don’t have an email address. This may be an issue if we suspect potentially incorrect input data, like in case all_users/0 would read the users from a JSON file. Therefore cautious code that would prefer crashing instead of silently filtering out incorrect input would have to use a more verbose map function:

lists:map(fun(#{user := User, email := Email}) -> {User, Email} end,
          all_users())

Unlike the generator, the anonymous function would crash on a user without an email address. Strict generators would allow similar semantics in comprehensions too:

[{User, Email} || #{user := User, email := Email} <:- all_users()]

This generator would crash (with a badmatch error) if the pattern wouldn’t match an element of the list.

The proposed operators for strict generators are <:- (for lists and maps) and <:= (for bit strings) instead of <- and <=. This syntax was chosen because <:- and <:= somewhat resemble the =:= operator that tests whether two terms match, and at the same time keep the operators short and easy to type. Having the two types of operators differ by a single character, :, also makes the operators easy to remember as “: means strict.”

Alternate Designs #

There are numerous other ways of achieving the goal of crashing upon encountering non-matching input:

  1. Converting the comprehension to a map call, just as in the example of the previous section.

  2. Moving the pattern matching that may fail to the result expression of the comprehension:

    [begin #{user:= User, email := Email} = Usr, {User, Email} end || Usr <- all_users()]

  3. Moving the pattern matching that may fail to a filter in the comprehension (which would still have to evaluate to a boolean, making this solution quite awkward looking):

    [{User, Email} || Usr <- all_users(), (#{user := User, email := Email} = Usr) > 0]

  4. The same as before, but using binders proposed in EEP 12:

    [{User, Email} || Usr <- all_users(), #{user := User, email := Email} = Usr]

Most of these solutions are much more verbose than the strict generator syntax, undermining the compactness of comprehensions. But there are more serious issues too:

  • 1 would not work for bit string comprehensions, because there isn’t a map function available for bit strings.

  • 2 would make it impossible to use sub terms of the pattern (Usr in this example) in subsequent generators or filters.

  • The intent of the code isn’t clear in most of these solutions (definitely not in 3, and arguably in 4 and 2).

  • None of these techniques can solve the problem of final non-matching bits of a bit string.

The issue with bit string generators is that unlike lists and maps, bit strings don’t have natural “elements” the generator could iterate over. Instead the pattern in the generator dictates where to split the bit string into parts. This could inevitably lead to situations where the bit string ends with some bits that don’t match the pattern, like in this example:

[X || <<X:16>> <- <<1,2,3>>]

The existing generator would skip these final non-matching bits, and this behaviour cannot be changed due to backward compatibility reasons. It is only possible to guarantee that a bit string generator would fully consume its input by introducing a new type of bit string generator or some other new syntax that would alter the behaviour of the existing generator.

Reference Implementation #

Current implementation: PR #8625.

Backward Compatibility #

There are no backward compatibility issues with the proposed new syntax, since the <:- and <:= operators used to be invalid syntax in previous versions of Erlang.

Copyright #

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.