[eeps] Commit: r39 - eeps/trunk

Thu Aug 14 09:55:50 CEST 2008

Author: raimo
Date: 2008-08-14 09:55:50 +0200 (Thu, 14 Aug 2008)
New Revision: 39

Modified:
   eeps/trunk/eep-0019.txt
Log:
This revision fixes an incomplete edit,
adds two more examples from Erlang/OTP sources,
and spells out an alternative translation.


Modified: eeps/trunk/eep-0019.txt
===================================================================

--- eeps/trunk/eep-0019.txt	2008-08-12 14:12:09 UTC (rev 38)
+++ eeps/trunk/eep-0019.txt	2008-08-14 07:55:50 UTC (rev 39)
@@ -1,5 +1,5 @@
-EEP: 19
-Title: Comprehension multigenerators
+EEP: XXX
+Title: Extensions to comprehensions
 Version: $Revision$
 Last-Modified: $Date$
 Author: Richard A. O'Keefe <ok@REDACTED>
@@ -7,7 +7,7 @@
 Type: Standards Track
 Erlang-Version: R12B-4
 Content-Type: text/plain
-Created: 30-Jul-2008
+Created: 14-Aug-2008
 Post-History:
 
 
@@ -19,6 +19,8 @@
 
     This is related to EEP 12 [1], but is independent of it.
 
+
+
 Specification
 
     Currently, Erlang has
@@ -54,7 +56,8 @@
 
     if we go with EEP 12.
 
-    The effect of P1 <- E1 && ... Pn <- En
+    Roughly speaking, ignoring errors and side effects,
+    the effect of P1 <- E1 && ... Pn <- En
     is the effect of {P1,...,Pn} <- zip(E1, ..., En)
     where
 
@@ -73,9 +76,96 @@
         P [<-] E   with   P <- E
     and then applying the translation above.
 
+    In the presence of errors, the behaviour of && is not precisely
+    the same as using zip.  We need to specify the actual behaviour
+    more precisely.  For brevity, I ignore binary enumeration.  Both
+    tuple enumeration and tuple comprehension are currently defined
+    by rewriting to plain list comprehension, so that's all we need
+    to worry about for now.
 
+    A list comprehension has the form [E || C1, ..., Cn]
+    where each Ci is
+        - a generator Pattern <- List_Expression
+        - a binding   Pattern =  Any_Expression
+        - a "guard"   Other_Expression that should give true or false.
+    This acts like
 
+        R = [],
+        <| E || [C1, ..., Cn] |>(R),
+        reverse(R)
 
+    where
+
+        <| E || [] |>(R)
+    =>  R = [E | R]             % reassign R
+
+        <| E || [Pi <- Ei|Cs] |>(R)
+    =>  Ti = Ei
+        Label: case Ti
+                 of [Pi|X] -> Ti = X % reassign Ti
+                              <| E || Cs |>(R)
+                              goto Label
+                  ; [_ |X] -> Ti = X % reassign Ti
+                              goto Label
+                  ; []     -> ok                              
+               end
+        
+        <| E || [Pi = Ei|Cs] |>(R)
+    =>  case Ei
+          of Pi -> <| E || Cs |>(R)
+           ; _  -> ok
+        end
+
+        <| E || [Ei|Cs] |>(R)
+    =>  case Ei
+          of true  -> <| E || Cs |>(R)
+           ; false -> ok
+        end
+        
+    In these translations, pattern matching syntax is used, with the
+    intent that the variables which are unbound according to the
+    normal rules of Erlang, and thus get bound by the Pi <- or Pi =
+    matching, are treated *as if* unbound in the code to be generated,
+    ignoring whatever values they might previous have had.  That also
+    applies when R or Ti appears on the left of a pattern match; the
+    fact that the variable really was bound is ignored and a simple
+    assignment is done.
+
+    This does involve (re-)assignment to local variables in the code
+    to be generated, but it does NOT involve user-visible assignment
+    and it does NOT involve mutable data structures.  It is no more
+    problematic for the language or the runtime system than reusing a
+    dead register is.
+
+    Handling multi-list enumeration is a simple, albeit schematic,
+    change to the rule for enumeration.
+
+        <| E || [Pi1 <- Ei1 && Pi2 <- Ei2 && ... && Pik <- Eik|Cs] |>(R)
+    =>  Ti1 = Ei1
+        ...
+        Tik = Eik
+        Label: case {Ti1,...,Tik}
+                 of {[Pi1|X1], ..., [Pik,Xk]} ->
+                    Ti1 = X1    % reassign
+                    ...
+                    Tik = Xk    % reassign
+                    <| E || Cs |>(R)
+                    goto label
+                  ; {[_|X1], ..., [_|Xk]} ->
+                    Ti1 = X1    % reassign
+                    ...
+                    Tik = Xk    % reassign
+                  ; {[], ..., []} ->
+                    ok
+               end
+        
+    Note that the use of tuple syntax in the case expression and the
+    case clauses does not imply the literal creation of a tuple in
+    the generated code, only that k values are to be matched against
+    k patterns in each case clause.
+
+
+
 Motivation
 
     "How do I iterate over several lists at once?" is a moderately
@@ -105,8 +195,31 @@
             || T <- Tests && X <- Xs && Y <- Ys
             ].
 
+    This code from module dialyzer_dep
 
+        merge_outs([#output{type=list, content=L1}|Left],
+                   #output{type=list, content=L2}) ->
+          NewList = [merge_outs([X, Y]) || {X, Y} <- lists:zip(L1, L2)],
+          merge_outs(Left, output(NewList));
 
+    would become
+
+        merge_outs([#output{type=list, content=L1}|Left],
+                    #output{type=list, content=L2]) ->
+            merge_outs(Left, output(
+                [merge_outs([X,Y]) || X <- L1 && Y <- L2]));
+
+    This code from forward_args/3 in module dialyzer_dataflow
+
+        NewArgTypes = [t_sup(X, Y) ||
+                       {X, Y} <- lists:zip(ArgTypes, OldArgTypes)],
+
+    would become
+
+        NewArgTypes = [t_sup(X, Y) || X <- ArgTypes && Y <- OldArgTypes],
+
+
+
 Rationale
 
     This is a case where no invention is required, really.
@@ -136,7 +249,7 @@
         foo([_|Xs]) -> foo(Xs);
         foo([]) -> [].
 
-    With a multigenerator, the translation is similar.
+    With a multi-sequence generator, the translation is similar.
 
         [g(X, Y) || X <- Xs && Y <- Ys, X > Y]
     
@@ -151,8 +264,45 @@
         bar([_|Xs], [_|Ys]) -> bar(Xs, Ys);
         bar([], []) -> [].
 
+    The specification above gives the kind of translation I would like
+    to see; I do have an implementation in mind (based on Pop-2) that
+    doesn't need the reversal but don't know how it would fit in BEAM.
 
+    One obvious question is whether we need this at all.  Why not just
+    get people to write calls to lists:zip and get the compiler to
+    optimise them?  One answer is that this notation is much clearer;
+    the programmer's *intent* is to advance along two or more lists
+    at the same time, not to create a list of pairs.  When you want to
+    create a list of pairs, lists:zip/2 is the perfect way to do it.
+    A more important answer is that the proposed notation is NOT a
+    simple optimisation of equivalent code using lists:zip/2.
 
+        [E || {P,Q} <- lists:zip(A, B)]    % "zip version"
+
+    fails at once if A and B are not proper lists of the same length.
+
+        [E || P <- A && Q <- B]            % "Clean version"
+
+    eventually fails if A and B are not proper lists of the same
+    length, but may have evaluated E (which may have had side effects)
+    many times before that.  So an Erlang compiler would not be
+    allowed to replace the "zip version" by the "Clean version" unless
+    it could prove both that A and B were lists (which may be within
+    the abilities of the Dialyzer) and that they were exactly the same
+    length (which as far as I know isn't).
+
+    However, a multi-sequence generator and a single-sequence one
+    using calls to lists:zip/2 are clearly *similar*, so they should
+    eventually react to lists of different length the same way.
+    In Haskell, zipping two lists of different length acts as if the
+    longer were truncated to the length of the shorter.  Since
+    Haskell has lazy evaluation, lists may be infinite, so you can't
+    afford to wait until the end to start a comprehension.  Since
+    Erlang is strict, and since mistakes are common, lists:zip/2 in
+    Erlang makes sense as it is.
+
+
+
 Backwards Compatibility
 
     The "operator" '&&' is not legal syntax anywhere in Erlang