[erlang-questions] moving process state - copy on write - heap memory increase

Wed Sep 12 09:38:14 CEST 2012

On 12/09/2012, at 9:45 AM, Roland wrote that
creating a term with a great many shared subterms
and then sending it to another process "unshared" the subterms.

There's a comment in his code, "% simulate state transfer / loss of semantics".

The problem is that there are *two* levels of semantics for a
language like Erlang:
 - a "value" semantics
 - a "cost" semantics
and subterm sharing is part of the cost semantics
but it is NOT part of the value semantics.

For example,

    copy_pairs([{K,V}|Pairs]) -> [{K,V}|copy_pairs(Pairs)];
    copy_pairs([])            -> [].

is allowed by the value semantics *either* to
create new {K,V} pairs *or* to share the old
ones, even though the two choices have different costs.
(Yes, I know this is a restriction of the identity function.)

This is one place where the divergence between the value
semantics and the cost semantics is observable.  There are
more than just stylistic reasons to write

    copy_pairs([P={_,_}|Pairs]) -> [P|copy_pairs(Pairs)];
    copy_pairs([])              -> [].
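
A rough way to look at the cost side of the two versions above is to
measure heap words.  This is only a sketch: the module name cost_demo and
its function names are made up for illustration, and it relies on
erts_debug:size/1 (shared subterms counted once) and
erts_debug:flat_size/1 (nothing counted as shared), which are
undocumented debugging aids.  Whether the first version really allocates
fresh tuples depends on the compiler, so no particular numbers are
promised here; the value semantics allows either outcome.

```erlang
-module(cost_demo).
-export([copy_rebuild/1, copy_share/1, report/0]).

%% Rebuilds each pair as written; a compiler is free to reuse the old one.
copy_rebuild([{K,V}|Pairs]) -> [{K,V}|copy_rebuild(Pairs)];
copy_rebuild([])            -> [].

%% Reuses each matched pair explicitly.
copy_share([P={_,_}|Pairs]) -> [P|copy_share(Pairs)];
copy_share([])              -> [].

report() ->
    %% Built at run time so the pairs live on the process heap.
    Pairs = [{N, integer_to_list(N)} || N <- lists:seq(1, 100)],
    A = copy_rebuild(Pairs),
    B = copy_share(Pairs),
    %% The value semantics cannot tell the two results apart.
    true = (A =:= Pairs) andalso (B =:= Pairs),
    %% Words needed to hold the original and the copy together.
    [{rebuild, erts_debug:size({Pairs, A})},
     {share,   erts_debug:size({Pairs, B})},
     {flat,    erts_debug:flat_size({Pairs, A})}].
```

Calling cost_demo:report() returns three word counts.  The share figure
can never exceed the rebuild figure, and neither can exceed the flat one;
how far apart they are is exactly the part the value semantics does not
specify.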

term_to_binary and message sending are additional places
where the value and cost semantics diverge.  I thought it
had always been clear from the Erlang documentation that
these things linearise the term they are given.
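
The linearising can be made visible with a small experiment.  What
follows is a sketch under assumptions, not a statement about any
particular release: sharing_demo and its functions are hypothetical
names, and erts_debug:size/1 and erts_debug:flat_size/1 are undocumented
debugging functions.  The term is built at run time, with each level
pairing the previous level with itself, so its flat size roughly doubles
per level while its shared size grows by one small tuple.

```erlang
-module(sharing_demo).
-export([build/1, sizes/1, roundtrip/1, via_message/1]).

%% Build a heavily shared term: Depth levels of {T, T} over one list.
build(Depth) when is_integer(Depth), Depth >= 0 ->
    Leaf = lists:seq(1, 10),          % created at run time, on the heap
    build(Depth, Leaf).

build(0, T) -> T;
build(N, T) -> build(N - 1, {T, T}).

%% {words with sharing counted once, words with no sharing at all}
sizes(T) -> {erts_debug:size(T), erts_debug:flat_size(T)}.

%% Push the term through the external format and back.
roundtrip(T) -> binary_to_term(term_to_binary(T)).

%% Send T to a fresh process and have it report the sizes of what arrived.
via_message(T) ->
    Parent = self(),
    Ref = make_ref(),
    Pid = spawn(fun() ->
                    receive {Ref, Msg} -> Parent ! {Ref, sizes(Msg)} end
                end),
    Pid ! {Ref, T},
    receive {Ref, Sizes} -> Sizes end.
```

With T = sharing_demo:build(10), sizes(T) gives two very different
numbers, while sizes(roundtrip(T)) gives the flat number twice, because
the external format has no way to express sharing.  On an emulator that
flattens messages, via_message(T) would be expected to report the same
thing; the term is equal to the original in every case.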

There's a close connection between term copying and (some
forms of) garbage collection.  Indeed, the first interpreter
I ever wrote for a Lisp-like language reused some of the GC
code in the copier.

I imagine most of us would be very happy with term_to_binary
and message sending preserving sharing as long as it didn't
slow existing code down more than a percent or two.  Not
because we _want_ to pass terms with lots of sharing but
because it's so much nicer not to have to worry about that.