[erlang-questions] Data locality and the Erlang Runtime

Wed Dec 11 16:03:56 CET 2013

On Wed, Dec 11, 2013 at 11:34 AM, Vincent Siliakus <zambal@REDACTED> wrote:

> Last night I was reading the following article:
> http://gameprogrammingpatterns.com/data-locality.html. It gives a nice
> overview about the importance of data locality for performance critical
> code these days. So I started wondering how much of this applies to my
> favourite runtime (I say runtime because I use Elixir more often than
> Erlang lately).
>

It is hard to say exactly how a given piece of code will execute on a
system. The general consensus in recent years has been that optimizing for
data locality and limiting DRAM accesses speeds things up. This is mostly
due to how CPU caches work: clock cycles today are almost "free" compared
to a trip to main memory. The Erlang BEAM VM is quite pointer-heavy in the
sense that we usually don't pack data that tightly and prefer to use a
tree-like term structure. The advantage of this approach is that it gives
a better and easier path to handling persistence and immutability, which
are cornerstones of writing large functional programs.
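
As a rough illustration of the pointer-heavy layout, here is a minimal
sketch that measures how many heap words a few terms occupy. The module name
locality_demo and its function names are made up for this example, and
erts_debug:flat_size/1 is an undocumented debugging helper in ERTS, so treat
its use here as an assumption and not as a stable API.

```erlang
-module(locality_demo).
-export([list_words/1, tuple_words/1, binary_bytes/1]).

%% Heap words used by a list of N small integers.
%% Each cons cell takes two words (head and tail); small integers are
%% immediates stored directly in the head word.
list_words(N) ->
    erts_debug:flat_size(lists:seq(1, N)).

%% Heap words used by a tuple of N small integers:
%% one header word plus one word per element.
tuple_words(N) ->
    erts_debug:flat_size(list_to_tuple(lists:seq(1, N))).

%% Payload size in bytes when the same values are packed one byte each
%% into a binary (values are reduced modulo 256 so they fit in a byte).
binary_bytes(N) ->
    byte_size(list_to_binary([X rem 256 || X <- lists:seq(1, N)])).
```

On a 64-bit VM a word is 8 bytes, so locality_demo:list_words(1000) reports
2000 words (16000 bytes), while locality_demo:binary_bytes(1000) reports a
1000-byte payload. The binary only works as a replacement when the values
fit the packed width, which is part of why tight packing is awkward here.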

Furthermore, you have relatively little control over the concrete memory
layout of data in BEAM, which makes it hard to pack data tightly and hard
to apply the kinds of tricks described in the article. The "structural zoo"
of Erlang is not that strong, and because the language is dynamic, you have
limited control over representation. If you want to exploit memory layout,
there are better languages out there.

However, the article is written for single-core imperative code. One
advantage you have on the BEAM is that small process heaps tend to stay in
cache. This makes their garbage collection and operation faster, and it
improves locality for the heap. Also, the article completely forgets to
touch on the fact that modern systems have multiple cores. In such a
system, immutability and copying can help drive up parallelism, which in
turn means you can get more cores to do productive work. So the picture is
not as clear-cut as one might believe. Multiple cores change everything,
especially with respect to data mutation, which becomes more expensive.
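
To make the copying point concrete, here is a minimal sketch of a parallel
map. The module name pmap_demo and the function pmap/2 are hypothetical and
not part of OTP. Each element is copied into a freshly spawned process,
which computes on its own small heap and sends the result back as a message
(again a copy), so no worker ever mutates data another worker can see.

```erlang
-module(pmap_demo).
-export([pmap/2]).

%% Apply F to every element of List, one process per element.
%% Results are returned in the same order as the input list.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                %% A unique reference tags each worker's reply.
                Ref = make_ref(),
                %% The closure (F, X, Parent, Ref) is copied to the
                %% new process's heap when it is spawned.
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Selective receive on each Ref in turn preserves input order.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

For example, pmap_demo:pmap(fun(X) -> X * 2 end, [1, 2, 3]) returns
[2, 4, 6]. Spawning one process per element is only for illustration; real
code would usually split the input into larger chunks to amortize the
copying cost.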