[eeps] New EEP draft: recursive funs
Raimo Niskanen
raimo+eeps@REDACTED
Fri May 27 16:00:36 CEST 2011
I'm sorry, we have changed EEP format into Markdown.
See EEP 33: http://www.erlang.org/eeps/eep-0033.html
I am not sure if this has been communicated properly,
so I would appreciate very much if you could reformat
your EEPs int Markdown according to EEP 33. Otherwise
I can't dig into that until sometime the next week...
Sorry about the inconveniance.
/ Raimo Niskanen, EEP Editor
On Fri, May 27, 2011 at 07:55:34PM +1200, ok@REDACTED wrote:
> Erlang *functions* can have multiple clauses and multiple arguments
> and can be recursive.
> Erlang *funs* can have multiple clause and multiple arguments
> but cannot be recursive.
>
> We can fix that in a very natural way, which does not invalidate
> the foundations of Erlang memory management, and fortuitously
> makes hot loading safer in the presence of funs, even if you do
> not use recursive ones.
>
> This doesn't solve all known problems, and in particular,
> doesn't solve the problem of mutually recursive funs, but I
> don't expect a solution to that to arrive short of a radical
> redesign of the whole language (which Joe is thinking about,
> bless him).
>
> EEP: XXX
> Title: funs with names
> Version: $Revision: 14 $
> Last-Modified: $Date: 2007-06-29 16:24:01 +0200 (Fri, 29 Jun 2007) $
> Author: Richard A. O'Keefe <ok@REDACTED>
> Status: Draft
> Type: Standards Track
> Erlang-Version: R14B-4
> Content-Type: text/plain
> Created: 27-May-2011
> Post-History:
>
>
>
> Abstract
>
> The syntax of funs is extended to allow a variable name to
> be consistently present before each argument list. This
> allows funs to be recursive. The knot is tied in the
> opcodes that apply a fun to its arguments, so no change to
> the garbage collector design is required.
>
>
>
> Specification
>
> Currently, there are three forms for a fun:
>
> fun Name/Arity
> fun Module:Name/Arity
>
> and
>
> fun Fun_Clause {; Fun_Clause}... end
>
> We add another form:
>
> fun Variable Fun_Clause {; Variable Fun_Clause}... end
>
> If any Fun_Clause has a Variable, all must, and they must all
> be the same variable. Like the variables in the argument list(s),
> this variable is local to the fun, not shared with its context.
> Within the fun, the variable is bound to the value of the fun
> expression.
>
> There are several possible ways to implement this. One is
> rather neat because it preserves the cycle-freedom of the
> data structures the garbage collector has to deal with.
>
> One way to implement existing funs is this:
>
> (a) Create an auxiliary function with a generated name
>
> <foo>(...,X1,...,Xk) ...;
> ...
> <foo>(...,X1,...,Xk) ....
>
> having the same argument lists, guards, and clause bodies
> as the fun, except that each variable shared with the context
> appears as an extra argument.
>
> (b) Translate the fun expression as
>
> '%mk-fun'({fun <foo>/n+k, X1, ..., Xk})
>
> which gives the tuple a special tag to say that it represents
> a fun value.
>
> (c) Translate Foo(E1,...,Em)
> as A1 := E1, ..., Am := Em; funcall_m(Foo)
> where the funcall_m instruction checks that its argument is
> a closure expecting m arguments, moves the X1,...,Xk fields
> of the tuple to argument registers A<m+1>..A<m+k>, and then
> jumps to the address in the first field.
>
> All it takes to implement recursive funs is
>
> (a') Create an auxiliary function
>
> <foo>(...,X1,...,Xk,Variable) ...;
> ...
> <foo>(...,X1,...,Xk,Variable) ....
>
> (b') Translate the fun expression as
>
> '%mk-rec-fun'({fun <foo>/<n+k+1>, X1, ..., Xk})
>
> which simply applies a second special tag.
>
> (c') The funcall_m opcode acts the same for both old and
> recursive funs, except that just before jumping, it
> adds tne fun value Foo itself as argument A<m+k+1>.
> This "ties the knot".
>
> So a recursive fun takes no more space or time to create than
> an existing one, and does not involve creating any cycles of
> pointers. Its code can be inserted into the failure path for
> the funcall_m instructions, whatever their form.
>
>
>
>
> Motivation
>
> Fun names can serve three purposes.
>
> First, they can simply be documentation. For example,
>
> cfun_files(CFun) ->
> fun(F1, F2) ->
> [[?OBJ(T1,_) | _] | _] = F1,
> [[?OBJ(T2,_) | _] | _] = F2,
> CFun(T1, T2)
> end.
>
> can be written as
>
> cfun_files(CFun) ->
> fun Compare([[?OBJ(T1,_)|_]|_], [[?OBJ(T2,_)|_]|_]) ->
> CFun(T1, T2)
> end.
>
> A named fun whose name is not used can be implemented as if
> the name were not there.
>
> Second, the fun's name can be built into its generated name.
> At the time of writing, we might have
>
> '-F/N-fun-K-'
>
> where F/N is the name of the function that includes the fun
> and K is the number of earlier funs in F/N. We could build
> the name in instead, using
>
> '-F/N-fun-Name-[K-]'
>
> where K is present only if the outer function contains more
> than one fun with the same name. The point of this is that
> such names are more likely to be useful after hot loading.
> For example, if we start with
>
> f(...Xs, Ys, ...) ->
> ...
> sort(Xs, fun X_Key({_,N,_}) -> N end),
> sort(Ys, fun Y_Key({N,_,_}) -> N end),
> ...
>
> and then we revise it, swapping the two calls to sort/2.
> With named funs, the two funs retain their generated names,
> and the module is safe. With anonymous functions, the
> chances are that the two funs with swap names; oops!
>
> Third, a frequently asked question in the Erlang mailing
> list is "why can't I have recursive funs?" to which we
> will now be able to rely, "you can; here is what they
> look like."
>
> This still does not permit mutually recursive funs, but
> people do not seem to ask for that much.
>
> Finally, the next time someone argues that Erlang syntax
> is inconsistent because function clauses have repeated
> names and fun clauses do not, we shall be able to reply
> "but fun clauses CAN have repeated names and probably
> should."
>
>
>
>
> Rationale
>
> There really seemed to be only two main questions.
>
> What should the scope of the fun name variable be?
> Some variables in a fun are shared between the fun
> and its context. Doing that would let us write
>
> f(...) ->
> fun G(...) -> ... end,
> fun H(...) -> ... end,
> ... use G and H ...
>
> rather like using nested "define" in Scheme, except that
> while H could use G, G couldn't use H.
>
> Since you do not get mutual recursion this way, you should
> not be tricked into thinking you might. It's better that
> you have to write
>
> f(...) ->
> GG = fun G(...) -> ... end,
> HH = fun H(...) -> ... end,
> ... use GG and HH ...
>
> so that you understand clearly what you are getting.
>
> While variables in the body of a fun clause may be shared
> with the context, variables in the arguments are not,
> something I have found confusing. At least this way the
> fun name follows the same scope rule as the variables in
> the argument list right next to it.
>
> The other main question was whether recursive fun values
> should be exactly the same representation as existing
> fun values, but with a cycle in it (tying the knot at
> construction time), or whether to introduce a new tag
> (tying the knot at call time). The lack of cycles in
> Erlang heaps has been a major factor in the design of
> several garbage collectors. I would expect changing
> that to be an order of magnitude harder than the
> changes required for this proposal. It was seeing that
> the knot could be tied at call with (without slowing
> down calls to existing funs) that made me dare to hope
> that this proposal might some day be accepted.
>
> The main issue now is that this does not let us define
> a group of mutually recursive local functions.
> Adopting this proposal now might get in the way of a
> better proposal that handles mutual recursion as well.
>
> I don't see such a proposal as being likely to arrive soon.
>
> There is a special case of this where the fun name is used
> only in tail call positions, which can be handled entirely
> by the compiler generating a jump back to the beginning.
> This need not have any consequences for the run time system
> at all.
>
>
>
> Backwards Compatibility
>
> Code that does not use the new feature does not change its
> meaning. There may be code that relies on the form of
> generated function names; that would need changing.
>
> All syntax tools would need to be revised to handle the new form.
> Existing parse transforms might well fail on code containing the
> new form, but would work unchanged on code that does not.
>
> At least one new instruction is needed to create suitably
> distinguished closures. Existing programs that analyse BEAM
> files will not understand this until they are revised.
>
> As described under 'motivation', naming functions is
> useful even if you do not use the name in any clause body.
> This means that we can have a staged delivery of the feature.
>
> 1. Make the parser recognise fun names and check their identity.
> Have it report an error if the fun name is used in a body.
> Have it erase the fun names from the AST before any
> downstream tool sees it.
>
> At this stage, fun names may serve as documentation.
>
> 2. Upgrade the downstream tools to recognise an extended 'fun'
> AST node with two extra fields: the fun name as an atom and
> a flag saying whether it is not used, used only in tail
> position, or used more generally.
>
> Upgrade the parser to report fun names, but retain the
> check that they are not used. Test the down stream tools.
>
> 3. Modify the compiler to use the new, safer, form of generated
> name. Ensure that the generated names are accessed only
> through an interface, so all is consistent.
>
> At this stage, fun names help to reduce the danger from
> code revisions that add, remove, or re-order funs; a
> change that does not alter the number of funs with a
> particular name in a function should not change its name.
>
> 4. (Optional.) Revise the code generator to accept the fun
> name in tail call position and generate a jump. Modify
> the parser to allow this.
>
> At this point, it is possible to pass a loop as a parameter,
> like a list traversal or a binary search. No changes to the
> representation of Erlang terms or the BEAM engine have been
> required yet.
>
> 5. Add a new tag. Revise the funcall instructions to check for
> it if the existing check fails, and push the closure itself.
> Add a new instruction to make a new closure. Revise the
> Erlang term representation to encode recursive funs. Revise
> the type test instructions to recognise the new values.
> Teach HiPE what to do.
>
> This is the last stage.
>
>
>
> Reference Implementation
>
> None in this draft. Stage 1 can be done fairly easily.
> Stage 2 would be hard for me because I'm not even sure what
> all the relevant modules are.
>
>
>
> References
>
> None.
>
>
>
> Copyright
>
> This document has been placed in the public domain.
>
>
>
> Local Variables:
> mode: indented-text
> indent-tabs-mode: nil
> sentence-end-double-space: t
> fill-column: 70
> coding: utf-8
> End:
> _______________________________________________
> eeps mailing list
> eeps@REDACTED
> http://erlang.org/mailman/listinfo/eeps
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
More information about the eeps
mailing list