[eeps] New EEP draft: recursive funs

Raimo Niskanen raimo+eeps@REDACTED
Fri May 27 16:00:36 CEST 2011


I'm sorry, we have changed EEP format into Markdown.
See EEP 33: http://www.erlang.org/eeps/eep-0033.html

I am not sure if this has been communicated properly,
so I would appreciate very much if you could reformat
your EEPs int Markdown according to EEP 33. Otherwise
I can't dig into that until sometime the next week...

Sorry about the inconveniance.

/ Raimo Niskanen, EEP Editor



On Fri, May 27, 2011 at 07:55:34PM +1200, ok@REDACTED wrote:
> Erlang *functions* can have multiple clauses and multiple arguments
> and can be recursive.
> Erlang *funs* can have multiple clause and multiple arguments
> but cannot be recursive.
> 
> We can fix that in a very natural way, which does not invalidate
> the foundations of Erlang memory management, and fortuitously
> makes hot loading safer in the presence of funs, even if you do
> not use recursive ones.
> 
> This doesn't solve all known problems, and in particular,
> doesn't solve the problem of mutually recursive funs, but I
> don't expect a solution to that to arrive short of a radical
> redesign of the whole language (which Joe is thinking about,
> bless him).
> 

> EEP: XXX
> Title: funs with names
> Version: $Revision: 14 $
> Last-Modified: $Date: 2007-06-29 16:24:01 +0200 (Fri, 29 Jun 2007) $
> Author: Richard A. O'Keefe <ok@REDACTED>
> Status: Draft
> Type: Standards Track
> Erlang-Version: R14B-4
> Content-Type: text/plain
> Created: 27-May-2011
> Post-History:
> 
> 
> 
> Abstract
> 
>     The syntax of funs is extended to allow a variable name to
>     be consistently present before each argument list.  This
>     allows funs to be recursive.  The knot is tied in the
>     opcodes that apply a fun to its arguments, so no change to
>     the garbage collector design is required.
> 
> 
> 
> Specification
> 
>     Currently, there are three forms for a fun:
> 
> 	fun Name/Arity
> 	fun Module:Name/Arity
> 
>     and
> 
>         fun Fun_Clause {; Fun_Clause}... end
> 
>     We add another form:
> 
>         fun Variable Fun_Clause {; Variable Fun_Clause}... end
> 
>     If any Fun_Clause has a Variable, all must, and they must all
>     be the same variable.  Like the variables in the argument list(s),
>     this variable is local to the fun, not shared with its context.
>     Within the fun, the variable is bound to the value of the fun
>     expression.
> 
>     There are several possible ways to implement this.  One is
>     rather neat because it preserves the cycle-freedom of the
>     data structures the garbage collector has to deal with.
> 
>     One way to implement existing funs is this:
> 
>     (a) Create an auxiliary function with a generated name
> 
> 	<foo>(...,X1,...,Xk) ...;
> 	...
> 	<foo>(...,X1,...,Xk) ....
> 
>         having the same argument lists, guards, and clause bodies
>         as the fun, except that each variable shared with the context
>         appears as an extra argument.
> 
>     (b) Translate the fun expression as
> 
> 	'%mk-fun'({fun <foo>/n+k, X1, ..., Xk})
> 
>         which gives the tuple a special tag to say that it represents
>         a fun value.
> 
>     (c) Translate  Foo(E1,...,Em)
>         as A1 := E1, ..., Am := Em; funcall_m(Foo)
>         where the funcall_m instruction checks that its argument is
>         a closure expecting m arguments, moves the X1,...,Xk fields
>         of the tuple to argument registers A<m+1>..A<m+k>, and then
>         jumps to the address in the first field.
> 
>     All it takes to implement recursive funs is
> 
>     (a') Create an auxiliary function
> 
> 	<foo>(...,X1,...,Xk,Variable) ...;
> 	...
> 	<foo>(...,X1,...,Xk,Variable) ....
> 
>     (b') Translate the fun expression as
> 
> 	'%mk-rec-fun'({fun <foo>/<n+k+1>, X1, ..., Xk})
> 
>          which simply applies a second special tag.
> 
>     (c') The funcall_m opcode acts the same for both old and
>          recursive funs, except that just before jumping, it
>          adds tne fun value Foo itself as argument A<m+k+1>.
>          This "ties the knot".
> 
>     So a recursive fun takes no more space or time to create than
>     an existing one, and does not involve creating any cycles of
>     pointers.  Its code can be inserted into the failure path for
>     the funcall_m instructions, whatever their form.
> 
> 
> 
> 
> Motivation
> 
>     Fun names can serve three purposes.
> 
>     First, they can simply be documentation.  For example,
> 
> 	cfun_files(CFun) ->
> 	    fun(F1, F2) ->
> 		    [[?OBJ(T1,_) | _] | _] = F1,
> 		    [[?OBJ(T2,_) | _] | _] = F2,
> 		    CFun(T1, T2)
> 	    end.
> 
>     can be written as
> 
> 	cfun_files(CFun) ->
> 	    fun Compare([[?OBJ(T1,_)|_]|_], [[?OBJ(T2,_)|_]|_]) ->
> 		CFun(T1, T2)
> 	    end.
> 
>     A named fun whose name is not used can be implemented as if
>     the name were not there.
> 
>     Second, the fun's name can be built into its generated name.
>     At the time of writing, we might have
> 
> 	'-F/N-fun-K-'
> 
>     where F/N is the name of the function that includes the fun
>     and K is the number of earlier funs in F/N.  We could build
>     the name in instead, using
> 
> 	'-F/N-fun-Name-[K-]'
> 
>     where K is present only if the outer function contains more
>     than one fun with the same name.  The point of this is that
>     such names are more likely to be useful after hot loading.
>     For example, if we start with
> 
> 	f(...Xs, Ys, ...) ->
> 	    ...
> 	    sort(Xs, fun X_Key({_,N,_}) -> N end),
> 	    sort(Ys, fun Y_Key({N,_,_}) -> N end),
> 	    ...
> 
>     and then we revise it, swapping the two calls to sort/2.
>     With named funs, the two funs retain their generated names,
>     and the module is safe.  With anonymous functions, the
>     chances are that the two funs with swap names; oops!
> 
>     Third, a frequently asked question in the Erlang mailing
>     list is "why can't I have recursive funs?" to which we
>     will now be able to rely, "you can; here is what they
>     look like."
> 
>     This still does not permit mutually recursive funs, but
>     people do not seem to ask for that much.
> 
>     Finally, the next time someone argues that Erlang syntax
>     is inconsistent because function clauses have repeated
>     names and fun clauses do not, we shall be able to reply
>     "but fun clauses CAN have repeated names and probably
>     should."
> 
> 
> 
> 
> Rationale
> 
>     There really seemed to be only two main questions.
>     
>     What should the scope of the fun name variable be?
>     Some variables in a fun are shared between the fun
>     and its context.  Doing that would let us write
> 
>     f(...) ->
>         fun G(...) -> ... end,
>         fun H(...) -> ... end,
>         ... use G and H ...
> 
>     rather like using nested "define" in Scheme, except that
>     while H could use G, G couldn't use H.
> 
>     Since you do not get mutual recursion this way, you should
>     not be tricked into thinking you might.  It's better that
>     you have to write
> 
>     f(...) ->
>         GG = fun G(...) -> ... end,
>         HH = fun H(...) -> ... end,
>         ... use GG and HH ...
> 
>     so that you understand clearly what you are getting.
> 
>     While variables in the body of a fun clause may be shared
>     with the context, variables in the arguments are not,
>     something I have found confusing.  At least this way the
>     fun name follows the same scope rule as the variables in
>     the argument list right next to it.
> 
>     The other main question was whether recursive fun values
>     should be exactly the same representation as existing
>     fun values, but with a cycle in it (tying the knot at
>     construction time), or whether to introduce a new tag
>     (tying the knot at call time).  The lack of cycles in
>     Erlang heaps has been a major factor in the design of
>     several garbage collectors.  I would expect changing
>     that to be an order of magnitude harder than the
>     changes required for this proposal.  It was seeing that
>     the knot could be tied at call with (without slowing
>     down calls to existing funs) that made me dare to hope
>     that this proposal might some day be accepted.
> 
>     The main issue now is that this does not let us define
>     a group of mutually recursive local functions.
>     Adopting this proposal now might get in the way of a
>     better proposal that handles mutual recursion as well.
> 
>     I don't see such a proposal as being likely to arrive soon.
> 
>     There is a special case of this where the fun name is used
>     only in tail call positions, which can be handled entirely
>     by the compiler generating a jump back to the beginning.
>     This need not have any consequences for the run time system
>     at all.
> 
> 
> 
> Backwards Compatibility
> 
>     Code that does not use the new feature does not change its
>     meaning.  There may be code that relies on the form of
>     generated function names; that would need changing.
> 
>     All syntax tools would need to be revised to handle the new form.
>     Existing parse transforms might well fail on code containing the
>     new form, but would work unchanged on code that does not.
> 
>     At least one new instruction is needed to create suitably
>     distinguished closures.  Existing programs that analyse BEAM
>     files will not understand this until they are revised.
> 
>     As described under 'motivation', naming functions is
>     useful even if you do not use the name in any clause body.
>     This means that we can have a staged delivery of the feature.
> 
>     1.  Make the parser recognise fun names and check their identity.
>         Have it report an error if the fun name is used in a body.
>         Have it erase the fun names from the AST before any
>         downstream tool sees it.
> 
>         At this stage, fun names may serve as documentation.
> 
>     2.  Upgrade the downstream tools to recognise an extended 'fun'
>         AST node with two extra fields:  the fun name as an atom and
>         a flag saying whether it is not used, used only in tail
>         position, or used more generally.
> 
>         Upgrade the parser to report fun names, but retain the
>         check that they are not used.  Test the down stream tools.
> 
>     3.  Modify the compiler to use the new, safer, form of generated
>         name.  Ensure that the generated names are accessed only
>         through an interface, so all is consistent.
> 
>         At this stage, fun names help to reduce the danger from
>         code revisions that add, remove, or re-order funs; a
>         change that does not alter the number of funs with a
>         particular name in a function should not change its name.
> 
>     4.  (Optional.)  Revise the code generator to accept the fun
>         name in tail call position and generate a jump.  Modify
>         the parser to allow this.
> 
>         At this point, it is possible to pass a loop as a parameter,
>         like a list traversal or a binary search.  No changes to the
>         representation of Erlang terms or the BEAM engine have been
>         required yet.
> 
>     5.  Add a new tag.  Revise the funcall instructions to check for
>         it if the existing check fails, and push the closure itself.
>         Add a new instruction to make a new closure.  Revise the
>         Erlang term representation to encode recursive funs.  Revise
>         the type test instructions to recognise the new values.
>         Teach HiPE what to do.
> 
>         This is the last stage.
> 
> 
> 
> Reference Implementation
> 
>     None in this draft.  Stage 1 can be done fairly easily.
>     Stage 2 would be hard for me because I'm not even sure what
>     all the relevant modules are.
> 
> 
> 
> References
> 
>     None.
> 
> 
> 
> Copyright
> 
>     This document has been placed in the public domain.
> 
> 
> 
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End:
> _______________________________________________
> eeps mailing list
> eeps@REDACTED
> http://erlang.org/mailman/listinfo/eeps


-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB



More information about the eeps mailing list