[eeps] EEP XXX: Pattern-test operator

Sun Apr 22 22:17:34 CEST 2012

[how do you tell gmail about the "mailing list" concept, so that it will
default to "reply to list", rather than "reply to sender"?...]

I've been considering how to implement this ["?=" / better codegen for
guards] without disturbing things too much - in particular, without making
changes to Core Erlang.
I believe that I can argue that the desired extentions simply cannot in
general be represented with Core Erlang (at least not without duplicating
code or tests).

These examples may not be minimal, but should illustrate that there is a
problem:
  a) receive T when element(1,T) ?= <<1,_>> -> ... end
  b) receive T when L ?= tuple_size(T), element(1,T) ?= <<D:L>> -> D end
These don't even include 'orelse'.

a) is quite simple, but could be fixed with appropriate new guard BIFs for
binaries. Lacking such BIFs, there is a problem even without guard->body
bindings.

Den 22. apr. 2012 00.32 skrev Robert Virding <rvirding@REDACTED>:

The actual pattern match compiler does a pretty straight forward
> compilation of guards. It the optimiser which does some very weird
> things and seems pretty incomprehensible.

Yes... only after dumping intermediate representations did I realize that
it wasn't a half-baked attempt to improve handling of "andalso" and
"orelse" in guards that caused the weird beam code. Rather, it's the
peep-hole optimizer which does quite a bit work to make the code look good
(and normally succeeds, except that it isn't optimized for handling guards).

> One though I had when doing
> the pattern match compiler was whether we could instead move patterns
> in the head explicitly into the guard, have everything there, and work
> on that. That would optimise things like type tests in the guard. And
> in the case variable binding. There was a paper describing this but I
> can't remember it's name or author.
>
A "Core Erlang 2" which relied on "if" rather than "case" as the central
branching construct, would certainly ensure that guards are treated no
worse than patterns.
I don't know how variable bindings would be handled most elegantly in that
setting, but surely it can be done.

That would be quite a disruptive change, however.  Changing just
#c_clause{} might be a more realistic approach.  My previous suggestion for
that should be modified, though, to allow for bindings in 'orelse'.

I would personally like to have variable binding in the guards. I have
> a current use for it, so hurry up. :-)
>
Doing the best I can in my spare time :-)
For the moment, we need a plan which is both general enough, sufficiently
simple to implement, and palatable to the OTP team...
Oh, and an updated semantics spec, of course.  Richard, which EEP should we
continue with?

 Robert
>
> On 19 April 2012 00:00, Erik Søe Sørensen <eriksoe@REDACTED> wrote:
> > Guards seem to be given a rather stepmotherly treatment in the
> compiler...
> > whether any decent code is generated appear to be entirely up to the
> mercy
> > of the peephole optimizer. That's worse than I'd imagined.
> >
> > For instance,
> >   g3(X) when X=:=foo orelse
> >          X=:=bar orelse
> >          X=:=baz -> ok.
> > generates the rather asymmetric
> >
> > {select_val,{x,0},{f,11},{list,[{atom,foo},{f,12},{atom,bar},{f,12}]}}.
> >     {label,11}.
> >       {test,is_eq_exact,{f,9},[{x,0},{atom,baz}]}.
> >     {label,12}.
> >       {move,{atom,ok},{x,0}}.
> >       return.
> >
> > And worse, in some cases like
> >   h(X,Y) when (X=:=a andalso Y>0) -> aa;
> >   h(X,Y) when (X=:=b andalso Y>0) -> bb;
> >   h(X,Y) when (X=:=c andalso Y>0) -> cc.
> > the size of the generated code is quadratic in the size of the source
> code:
> >     {select_val,{x,0},
> >                 {f,29},
> >
> {list,[{atom,a},{f,31},{atom,b},{f,33},{atom,c},{f,35}]}}.
> >   {label,31}.
> >     {test,is_lt,{f,32},[{integer,0},{x,1}]}.
> >     {move,{atom,aa},{x,0}}.
> >     return.
> >   {label,32}.
> >     {select_val,{x,0},{f,29},{list,[{atom,b},{f,33},{atom,c},{f,35}]}}.
> >   {label,33}.
> >     {test,is_lt,{f,34},[{integer,0},{x,1}]}.
> >     {move,{atom,bb},{x,0}}.
> >     return.
> >   {label,34}.
> >     {test,is_eq_exact,{f,29},[{x,0},{atom,c}]}.
> >   {label,35}.
> >     {test,is_lt,{f,29},[{integer,0},{x,1}]}.
> >     {move,{atom,cc},{x,0}}.
> >     return.
> > (Notice the tables in the select_val instructions.)
> >
> > This bad behaviour happens after the transition to beam code (in
> > beam_dead:combine_eqs(), afaict).
> > Looks like guards really would like to be taken into account during
> pattern
> > match compilation... (no offence, Dijkstra).
> >
> > Anyway, the first roadblock appears to be that at the time when match
> > compilation occurs (at the core->kernel representation transition),
> guards
> > have already been flattened, meaning that what was at guard top-level in
> the
> > source code is more difficult to find.
> > For this reason, I'm considering a representation change of Core's
> clauses -
> > from
> >   -record(c_clause, {anno=[], pats,       % pats :: [Tree],
> >                      guard,               % guard :: Tree,
> >                      body}).              % body :: Tree
> > to
> >   -record(c_clause, {anno=[], pats,
> >                      guard_pats :: [{Tree, Tree}], % {ExprToTest,
> Pattern}
> >                      guard_exp :: Tree,
> >                      body}).
> > such that the guard is split into a part that can integrated into the
> > pattern matching step, and a remainder that can't.
> > For instance, the guard "when Y > 0, element(1,X) =:= foo" would be
> > represented as
> >   guard_pats = [{E1, P1}], guard_exp=E2
> > where E1 = repr. of "element(1,X)", P1 = repr. of "'foo'", and E2 =
> repr. of
> > "Y > 0".
> >
> > Handling "P ?= E" at guard top level would then follow quite naturally:
> by
> > including {E,P} in guard_pats.
> > (As for scoping, it is handled in an earlier phase.)
> > Integrating type tests need a bit more, but that can be left for later if
> > necessary.
> >
> > /Erik
> >
> > Den 18. apr. 2012 14.54 skrev Erik Søe Sørensen <eriksoe@REDACTED>:
> > [snip]
> >
> >> Handling 'orelse' might be tricky, though; the current approach for ';'
> >> appears to be to simply duplicate the clause and let a downstream
> optimizer
> >> recognize the duplication, so I suppose that will work for 'orelse' as
> well.
> >> (Looking at generated code, I notice that 'orelse' and 'andalso' in
> guards
> >> gives rise (in R14B03) to some surprising - and suboptimal - code.)
> >
> >
> >
> >
> > _______________________________________________
> > eeps mailing list
> > eeps@REDACTED
> > http://erlang.org/mailman/listinfo/eeps
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/eeps/attachments/20120422/7ece4646/attachment.htm>