[erlang-questions] Why intra-node messages are copied and not refcounted, rationale?

Thu May 7 16:27:17 CEST 2015

On Thu, May 7, 2015 at 3:07 PM, Nahuel Greco <ngreco@REDACTED> wrote:

> What's the rationale of accepting a copying cpu/memory overhead? Why a
> refcounting mechanism like the used for binaries is not desirable for all
> the data structures?

Really good questions!

Why copy and not pass by reference? It is implementation specific, as the
Erlang language allows for both semantics, and there were a variant of the
VM which used varying reference-passing tricks in place of the current
copying solution. The biggest disadvantage is that if you send very large
messages, then the copying overhead is large and costs performance. But
there are also some important advantages to copying. Each process can have
its own heap arena for instance, so this breaks up the memory space into
many small chunks which can be individually garbage collected. Erlang is a
soft realtime system, so you can't afford long garbage collection pauses.
This is usually really good for GC pause times as they can be quite small.
The second big advantage is that the semantics are the same for local
processes as well as distributed processes. You *have* to copy if the
process lives on another node anyway. The third advantage is that data
becomes local to a process, and thus local to a processing core. There are
advantages to this model on a system where memory access is highly
non-uniform, because you can in theory pick memory close to the processing
core.

As an example, I have been brewing on some latency benchmarks for
webservers. A sneak peak is the median response time of this test:

out.go: 50.000%    1.72ms
out.undertow: 50.000%    1.53ms
out.cowboy: 50.000%    6.33ms

where clearly, Erlang is far slower than a Go or Undertow, a Java-based
webserver. But once we look at the 99.99th percentile, things are different:

out.go: 99.990%   58.75ms
out.undertow: 99.990%   47.90ms
out.cowboy: 99.990%   38.62ms

and at the 99.999th percentile it gets really funny:

out.go: 99.999%  216.45ms
out.undertow: 99.999%   64.61ms
out.cowboy: 99.999%   45.09ms

what you see here is garbage collection overhead because you have such
large heaps to traverse, or that there are some other factor imposing
additional latency for a few of the calls.

Why not use refcounting? One price of refcounting is unpredictable
performance. If you ref-count a large list and free it up, then the GC has
to get rid of this list, and it will take quite some time. Many refcounting
GC's runs this as a background job to handle this. In fact, Refcounting GCs
are dual to Mark&Sweep GCs[0]. In the binary heap, a binary can contain no
pointers, which means reclamation of memory is always constant.
Furthermore, without optimizations, refcounting GCs tend to perform badly.
With optimizations, they start to look like Mark&Sweep collectors, as
written in [0].

In other words, both decisions are design decisions which tend to yield
good soft realtime performance on current hardware, and predictable
performance, especially in the worst case. You may not want a system that
stalls for at least 216ms every 100.000th call. But there are no rules in
the Erlang semantics which constrains us from coming up with another
message passing scheme, should we invent one that is even better. It is
just that the current behavior is highly desirable in the systems where
Erlang tend to get used.

[0]
http://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon04Unified.pdf

-- 
J.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20150507/f021673e/attachment.htm>