geometric memory growth

Ulf Wiger (AL/EAB) ulf.wiger@REDACTED
Mon Nov 28 14:02:08 CET 2005

Thomas Lindgren wrote:
> > and semantically, the access _has_ to be done at the
> > binding occurrence - if it is delayed, one has to lug
> > the entire record along to make sure that the record
> > field access has the same result as it would have had
> > at the binding occurrence.
>
> I would say that semantically this evaluation MUST NOT be
> done at the binding occurrence, IF we can observe a
> functional difference compared to the original program
> evaluated naively.

In the case of a record access (assuming that a guard test
or other check asserts that the value is in fact a record),
you can't observe a functional difference.
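To illustrate the claim, here is a small sketch (module,
record and field names are mine, not from the thread): when a
guard guarantees the argument is a record, hoisting the field
access to the binding occurrence cannot change behaviour:

```erlang
-module(hoist).
-export([f/1]).

-record(state, {count}).

f(S) when is_record(S, state) ->
    %% The guard guarantees S is a #state{}, so reading the
    %% field here gives the same result as reading it inside
    %% the fun - no observable functional difference.
    C = S#state.count,
    fun() -> C end.
```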

I guess it all boils down to how we want to view record
field access.

(One could imagine that the compiler would warn about 
cases where record field access is used on something that
isn't known to be a record. Of course, there could be 
other, similar cases where the programmer simply doesn't
realise that by being more specific, the compiler is given
more opportunities to make the code efficient.)

Of course, the really naive approach to binding the fun's
environment would be to include _all_ bound variables
every time.
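To make the memory issue concrete, here is a sketch (names
invented for illustration) of the difference between a fun
that drags the whole record into its environment and one that
captures only the fields it needs:

```erlang
-module(env).
-export([whole_record/1, fields_only/1]).

-record(state, {a, b, big}).

%% The fun references S, so its environment holds the entire
%% record - including the potentially huge 'big' field.
whole_record(#state{} = S) ->
    fun() -> S#state.a + S#state.b end.

%% Only A and B end up in the fun's environment; 'big' can be
%% reclaimed as soon as the record itself is garbage.
fields_only(#state{a = A, b = B}) ->
    fun() -> A + B end.
```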

> You have told the compiler that the 
> program shall behave AS IF the evaluation occurs inside the 
> closure; changing program behaviour is usually considered
> bad form.

So if I wrote:

f() -> 
  X = 5,
  fun() -> X + 5 end.

it would be bad form to reduce it to:

f() ->
  fun() -> 10 end.

Exercise for the reader: compile such code with the
-S flag and see what comes out.  (:

(Hint: '+' essentially compiles down to a call to
erlang:'+'/2.)

Now change to X = foo, and see what happens. (*)

Apparently, it's OK to do such constant propagation
in some cases, but not in others. That's fine, I
don't question that. The question is: where do you
draw the line?

> For plain record accesses: assuming safeness, the compiler 
> basically has to consider the following trade off for 
> profitability: if we extract N fields from the record State, 
> when is the cost of creating a closure with (N-1) more free 
> variables less expensive than performing the accesses every 
> time the closure is used? How many times is the closure used? 

Of course, the compiler can consider none of those, since it
has no way of knowing. It can, of course, guess... and if it
guessed in my case, it was completely off base.

> The compiler usually resorts to heuristics at this point. For 
> a production compiler, the idea is usually to avoid making 
> the rare but spectacularly bad decisions :-)

And in this particular case, importing the whole record
was a spectacularly bad decision (whether it was mine or
the compiler's).


(*) For those without a working erlang compiler, here's
the answer:
1) In the first case, the compiler optimizes away the
   addition, and simply replaces it with a constant in the
   fun.
2) In the second case, the compiler warns that the expression
   would cause a badarith exception at runtime, and promptly
   inserts a call to erlang:error/1 in the fun, rather than
   calling the erlang:'+'/2 function with bad arguments.
   BTW, if you do the same thing with an obviously bad 
   record field access, the compiler does likewise: warns
   that it's going to crash, and replaces the record field
   access with a call to erlang:error/1.
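For reference, a minimal sketch of the "obviously bad"
record field access described above (module and record names
are mine):

```erlang
-module(bad).
-export([g/0]).

-record(r, {f}).

g() ->
    X = not_a_record,
    %% The compiler can see that X is the atom 'not_a_record',
    %% warns that this access will fail, and replaces the field
    %% access with a call to erlang:error/1 - so calling the
    %% returned fun crashes.
    fun() -> X#r.f end.
```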
