[erlang-questions] Reassigning variables

Fri Mar 20 04:33:00 CET 2009

So far we have one concrete example where reassignment would
be useful, plus several allegations that this is not unusual.
>
> Now remember, there's still NO need to add "for" loops and "while"
> loops to Erlang.  But re-using names within the same function?  Yes
> please, providing it can be done in a clean way; I realize it can
> conflict with pattern matching if not done correctly.

That's throwing roses at it.

Let's recall the approaches on the table at the moment:

(1)  f(....., St0) ->
         {...,St1} = g(....,St0),
         {...,St2} = h(....,St1),
         {...,St3} = u(....,St2),
         {...,St3}.

(2)  f(....., St) ->
         {...,r(St)} = g(....,St),
         {...,r(St)} = h(....,St),
         {...,r(St)} = u(....,St),
         {...,St}.

      What this saves over (1) is the need for new names.
      Presumably someone who likes this doesn't _want_ to
      refer to any but the most recent state, so the loss
      of the opportunity to do so counts as a gain.
      But there is a heavy cost:  if this is mixed with
      much other code and you want to see where "St" came
      from you cannot easily find "it", because "it" is
      really "them".

      It takes _more_ keyboard activity to type (2) than (1).
      Where (1) has StX on the right and StY on the left, (2) has
      St and r(St), which is one character more per update.
      It also hurts that something that looks like a function call
      isn't one.  A different syntax, maybe "*St", could fix both of
      those issues.

      However, (2) and (1) share the same fundamental problem, so
      I have to regard (2) as not usefully different.  The state
      threading is still explicit.  And just as it is possible to
      mistakenly re-use a numbered version of a state variable,
      {...,St5} = p(...,St5), so it is possible to mistakenly omit
      the "reassign" marker, writing {...,St} = p(...,St) when
      {...,r(St)} or {...,*St} or whatever the syntax is was meant.

(3)  -changer(core, [f/..,g/..,h/..,u/..]).

      f(....) ->
         ... = g(....),
         ... = h(....),
         ... = u(....),
         ...

      For each -changer(T) declaration that applies to a function,
      the function gets an extra argument, and if at least one
      applies, the updated changer values are returned in a tuple
      with the original result.  The threading is now the
      responsibility of the translator, not of the programmer.
      It is now *impossible* for the programmer to get the
      threading wrong.
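
To make the comparison concrete, here is a minimal sketch of style (1)
with the dots filled in.  Everything named here is an assumption made
up for illustration: the module thread_demo, the helper fresh/1, and
the choice of a bare counter as the "state".  None of it comes from
the compiler sources under discussion.

```erlang
-module(thread_demo).
-export([three_labels/1]).

%% Hypothetical helper: the "state" is just a counter.  Each call
%% returns a fresh label together with the updated state.
fresh(St) ->
    {{label, St}, St + 1}.

%% Style (1): every intermediate state gets its own numbered name,
%% and the programmer is responsible for passing the right one on.
three_labels(St0) ->
    {A, St1} = fresh(St0),
    {B, St2} = fresh(St1),
    {C, St3} = fresh(St2),
    {[A, B, C], St3}.
```

Calling thread_demo:three_labels(0) returns
{[{label,0},{label,1},{label,2}],3}.  Writing fresh(St1) twice by
mistake would still compile and would hand out the same label twice,
which is the hazard that (1) and (2) share.  Under (3) the body would
not mention the state at all, and the translator would presumably
generate threading of roughly this shape.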

But there's another thought.  The need for "state updates" scattered
all through the compiler source code is caused at least as much by
the decision to _have_ a separate "state" data structure as it is by
limitations of the language.

In Prolog, when you are building a data structure to represent some
kind of intermediate language, you often put variables in it.  So I
might generate
	[push_local(X),push_integer(1),add,pop_local(X),...]
where X represents an as-yet unchosen location.  Then a later pass
can come along and fill in the numbers.  I can create a new
representation of a run-time variable just by mentioning a new
logical variable and leave the filling in until later.

You can't do that in Erlang, but you *could* have an intermediate
language that included some kind of 'let' construct.  And given a
tokeniser that tags each token with its source location, you can
use source locations to make things unique.  For example, if I
were translating
	for (Init; Test; Step) { Body }
to
         Init
         goto L
     L1  Body
     L2  Step
     L   Test [true->L1, false->L3]
     L3
so that continue->L2 and break->L3, I could do this with
code something like

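	%% Pos is the source position of the "for" token.
	%% Init and the initial jump to the test are not shown.
	L1 = {l,Pos},
	L2 = {c,Pos},
	L3 = {b,Pos},
	{let_label,[L1,L2,L3],
	  [{place,L1},
	   statement(Body, [{continue,L2},{break,L3}|Context]),
	   {place,L2},
	   statement(Step, Context),
	   test(Test, L1, L3),
	   {place,L3}
	  ]}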

And then a separate pass over this data structure would thread
the appropriate state through.
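
As a rough sketch of what such a separate pass might look like, the
module below walks the structure, gives each label bound by a
let_label a fresh number, and flattens the result.  The module name
label_pass, the function resolve/1, and the instruction forms
{goto,L} and {branch,Test,LTrue,LFalse} are assumptions for
illustration; in particular {branch,...} merely stands in for
whatever test(Test, L1, L3) would produce.  It uses maps, so it
assumes OTP 17 or later.

```erlang
-module(label_pass).
-export([resolve/1]).

%% resolve(Code) -> FlatCode.
%% Code is a (possibly nested) list of instructions.  Hypothetical
%% instruction set: {let_label, Labels, Body}, {place, L}, {goto, L},
%% {branch, Test, LTrue, LFalse}; anything else is copied unchanged.
resolve(Code) ->
    {Flat, _Next} = resolve(Code, #{}, 0),
    Flat.

%% resolve(Code, Env, Next) -> {FlatCode, Next1}.
%% Env maps symbolic labels to numbers.  Next is the only state,
%% and it is threaded in this pass and nowhere else.
resolve([], _Env, Next) ->
    {[], Next};
resolve([H | T], Env, Next0) ->
    {Hs, Next1} = resolve(H, Env, Next0),
    {Ts, Next2} = resolve(T, Env, Next1),
    {Hs ++ Ts, Next2};
resolve({let_label, Labels, Body}, Env0, Next0) ->
    %% Bind each symbolic label to the next free number.
    {Env1, Next1} = lists:foldl(fun bind/2, {Env0, Next0}, Labels),
    resolve(Body, Env1, Next1);
resolve({place, L}, Env, Next) ->
    {[{place, maps:get(L, Env)}], Next};
resolve({goto, L}, Env, Next) ->
    {[{goto, maps:get(L, Env)}], Next};
resolve({branch, Test, LT, LF}, Env, Next) ->
    {[{branch, Test, maps:get(LT, Env), maps:get(LF, Env)}], Next};
resolve(Other, _Env, Next) ->
    {[Other], Next}.

bind(Label, {Env, Next}) ->
    {Env#{Label => Next}, Next + 1}.
```

With Pos = {3,1}, resolving
{let_label,[{l,Pos},{c,Pos},{b,Pos}],
 [{place,{l,Pos}},[body],{place,{c,Pos}},[step],
  {branch,test,{l,Pos},{b,Pos}},{place,{b,Pos}}]}
gives [{place,0},body,{place,1},step,{branch,test,0,2},{place,2}].
The functions that build the structure never see the counter.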

One thing about this that commends itself to me is that it doesn't
rely on creating atoms, of which Erlang has a limited stock.  It
had never occurred to me that compiling an Erlang file might use
up atoms that did not occur in the source code.  From now on I shall
be using erlc more and c(...) less.
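
For contrast, here is a small sketch of the two ways of naming a
label.  The module label_names and both functions are hypothetical,
written only to show the difference.

```erlang
-module(label_names).
-export([atom_label/2, tuple_label/2]).

%% Builds a brand-new atom such as 'l_17'.  Every distinct name
%% takes a slot in the atom table, and atoms are never reclaimed.
atom_label(Kind, N) when is_atom(Kind), is_integer(N) ->
    list_to_atom(atom_to_list(Kind) ++ "_" ++ integer_to_list(N)).

%% Builds a tuple from an existing atom and a source position.
%% No new atom is created, however many labels are generated.
tuple_label(Kind, Pos) when is_atom(Kind) ->
    {Kind, Pos}.
```

On recent OTP releases, erlang:system_info(atom_count) and
erlang:system_info(atom_limit) report how much of the stock has
been used.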

So (2) eliminated the *numbers* but left everything else there,
(3) eliminated the *appearance* of the state but it still existed,
while the last idea, call it (4), is to eliminate the state
*entirely* from the main functions by putting "variable-binding"
operations in the data structure that's being built.