[erlang-questions] Why are messages between processes copied?
Richard Carlsson
carlsson.richard@REDACTED
Thu Feb 23 11:48:23 CET 2012
On 02/23/2012 10:58 AM, Thomas Lindgren wrote:
> I think Richard Carlsson et al tried something similar, which I
> recall as follows (sorry if it's garbled): allocate data that may be
> sent in a global heap, while definitely private data goes in the
> local heap of the process. Classify the data using static analysis.
> Message sends then become pointer management rather than copying. I'm
> not sure of the outcome, percent gained etc.
Yes, it was part of the Hybrid Heap scheme. With the hybrid heap, you
allocate data on the local heaps first, but if the data gets sent as a
message, you move it to a global "message heap". Data that is already on
the message heap doesn't need to be copied again, so redirecting a
received message (or parts thereof) to another process just copies the
pointer. Testing where the data resides is done at runtime as part of
the send operation, and consists of a cheap pointer comparison, so
static analysis is not strictly needed, but only data that gets sent
more than once will benefit from the "raw" hybrid heap.
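
To make the send path concrete, here is a minimal toy model in Python. It is only a sketch under assumptions, not the actual VM code: the names `Heap`, `MESSAGE_HEAP`, `send` and the address ranges are all hypothetical, and a "term" is a single cell rather than a structure that would have to be traversed and copied piece by piece.

```python
# Toy model of the hybrid-heap send path. All names are hypothetical;
# this is not the BEAM implementation.

class Heap:
    """A contiguous address range with simple bump allocation."""

    def __init__(self, base, size):
        self.base, self.limit, self.top = base, base + size, base
        self.cells = {}              # address -> stored value

    def alloc(self, value):
        if self.top >= self.limit:
            raise MemoryError("heap full")
        addr = self.top
        self.cells[addr] = value
        self.top += 1
        return addr

    def contains(self, addr):
        # The "cheap pointer comparison": a range check on the address.
        return self.base <= addr < self.limit


MESSAGE_HEAP = Heap(base=1_000_000, size=1_000)
copies = 0                           # how many times send() had to copy


def send(sender_heap, mailbox, addr):
    """Deliver the term at addr, copying only if it is still local."""
    global copies
    if not MESSAGE_HEAP.contains(addr):
        addr = MESSAGE_HEAP.alloc(sender_heap.cells[addr])
        copies += 1
    mailbox.append(addr)             # the receiver gets just the pointer
    return addr


p1 = Heap(base=0, size=100)          # local heap of the first process
p2 = Heap(base=100, size=100)        # local heap of the second process
mbox2, mbox3 = [], []

local = p1.alloc(("hello", [1, 2, 3]))
shared = send(p1, mbox2, local)      # first send: copied to the message heap
send(p2, mbox3, mbox2[0])            # forwarding: pointer only, no copy
```

In this model the first send performs the single copy, and forwarding the received message to a third process only passes the same address along.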
The main improvements could be had from what we named "message
analysis", figuring out what data structures were likely to end up as a
message and allocating them directly on the message heap. This analysis
had some interesting properties, since neither under- nor
overapproximation would cause errors (because the runtime checks for
copying will still be done).
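
The safety property can be illustrated with another small Python sketch. Again, this is an assumption-laden toy: the `likely_message` flag merely stands in for the result of the analysis, and `Term`, `allocate`, `send` and `run` are hypothetical names, not part of any real allocator.

```python
# Toy model: an allocation-site hint plus a runtime check at send time.
# Hypothetical names; not the actual analysis or the BEAM allocator.

LOCAL, MESSAGE = "local", "message"


class Term:
    def __init__(self, value, area):
        self.value = value
        self.area = area             # which heap the term currently lives on


def allocate(value, likely_message):
    """likely_message stands in for the compile-time analysis result."""
    return Term(value, MESSAGE if likely_message else LOCAL)


def send(term, stats):
    """The runtime check is always performed, whatever the hint said."""
    if term.area == LOCAL:
        term = Term(term.value, MESSAGE)   # copy to the message heap
        stats["copies"] += 1
    return term


def run(hint_sent, hint_private):
    stats = {"copies": 0}
    sent = allocate({"req": 42}, hint_sent)       # this term is sent
    private = allocate([0] * 4, hint_private)     # this one never is
    delivered = send(sent, stats)
    return delivered.value, private.area, stats["copies"]


exact = run(hint_sent=True, hint_private=False)
under = run(hint_sent=False, hint_private=False)  # analysis missed a message
over = run(hint_sent=True, hint_private=True)     # analysis guessed too much
```

All three runs deliver the same message. In this model, underapproximation only costs one extra copy at send time, and overapproximation only means that some private data ends up on the message heap.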
Message analysis (http://dl.acm.org/citation.cfm?id=1146813) is
essentially the inverse of what Masklinn mentioned, escape analysis.
Because of Erlang's dynamic nature with modules reloaded on the fly,
escape analysis has to be so conservative that it misses most
opportunities for local allocation and forces most data to be allocated
on the global heap. This makes it more like a variant of the shared heap
architecture, which has too many problems with garbage collection to be
really viable.
Our measurements with the hybrid heap showed a lot of promise. However,
this work was done before the support for multithreading was added to
Beam, and ironically, while we saw it as the future heap architecture
for a multithreaded Beam, it was the major hackage done in the VM to
support multithreading that broke our implementation. By then, both
Jesper Wilhelmsson and I had moved on to full-time jobs elsewhere, so
the work was orphaned. I still think it's probably a very good idea, if
Ericsson can find the people and time to get it running again. But it
will take quite a bit of knowledge, in particular about the GC (which
was Jesper's area).
/Richard