[erlang-questions] In-process memory sharing doesn’t seem to work

Tue Jun 19 16:30:36 CEST 2018

I have a potentially big in-memory data structure which I need to maintain in two copies, master and slave. The slave, however, will be synchronized only periodically (currently once per second), and it may accept ephemeral changes (lost with each synchronization). The changes will be relatively minor (most of the data structure will be preserved intact). For real-world context, see the quoted documentation at https://github.com/tendermint/abci/issues/274 . A natural way to handle this problem without bothering about efficiency would be to put the slave state in a separate process and to synchronize it periodically by sending a message from the master containing the current data dump. However, sending messages (and spawning new processes) involves copying data, while making an assignment in the same process should make it possible to benefit from memory sharing.
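To make the in-process variant concrete, here is a minimal sketch of what I have in mind. The module name replica_sketch, the message shapes and the helper functions are all made up for illustration; only the one-second interval comes from the description above. Both copies live in the arguments of one loop, and synchronization is just a rebinding of the slave to the master term, which should not copy anything.

```erlang
-module(replica_sketch).
-export([start/1, update/3, sizes/1]).

%% Hypothetical sketch: one process holds both copies of the state.
%% Note that spawn/1 itself copies Initial into the new process once.
start(Initial) ->
    spawn(fun() ->
        erlang:send_after(1000, self(), sync),
        loop(Initial, Initial)
    end).

%% Apply Fun to the master or to the slave copy (asynchronously).
update(Pid, Which, Fun) when Which =:= master; Which =:= slave ->
    Pid ! {update, Which, Fun},
    ok.

%% Ask the process for {SharedWords, FlatWords} of both copies together.
sizes(Pid) ->
    Ref = make_ref(),
    Pid ! {sizes, self(), Ref},
    receive
        {Ref, Sizes} -> Sizes
    after 5000 -> timeout
    end.

loop(Master, Slave) ->
    receive
        {update, master, Fun} ->
            loop(Fun(Master), Slave);
        {update, slave, Fun} ->
            %% Ephemeral change, dropped at the next sync.
            loop(Master, Fun(Slave));
        sync ->
            %% Rebinding in the same process: no message, no copy.
            erlang:send_after(1000, self(), sync),
            loop(Master, Master);
        {sizes, From, Ref} ->
            Both = {Master, Slave},
            From ! {Ref, {erts_debug:size(Both), erts_debug:flat_size(Both)}},
            loop(Master, Slave)
    end.
```

Only the two small integers are sent back by sizes/1, so the measurement itself does not copy the state. If sharing works as expected, the first number should stay well below the second as long as the slave has diverged only slightly from the master.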

There does not seem to be much documentation about sharing, but it is described at http://erlang.org/doc/efficiency_guide/processes.html#id70776 (“Loss of sharing”), and there is a discussion at http://erlang.org/pipermail/erlang-questions/2009-September/046436.html . The problem is that memory measurements don’t seem to confirm that sharing actually occurs. For demonstration I’ll use the hog module from http://erlang.org/pipermail/erlang-bugs/2007-November/000488.html .
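For readers who do not want to fetch the hog module, the following is a small stand-in, not the original code. It assumes that the tree is built from 2-tuples whose two elements point to the same subtree, with an atom at the leaves. Under that assumption a 2-tuple takes 3 words, which would be consistent with the sizes reported in the transcript for depth 25: 75 = 3 × 25 with sharing, and 100663293 = 3 × (2^25 − 1) without it.

```erlang
-module(sharing_demo).
-export([tree/1, sizes/1]).

%% Assumed shape of the tree (a reconstruction, not hog:tree/1 itself):
%% each level is a 2-tuple whose elements are the very same subtree.
tree(0) -> nil;
tree(N) when N > 0 ->
    Sub = tree(N - 1),
    {Sub, Sub}.

%% Returns {WordsWithSharing, WordsIfFlattened} for a tree of given depth.
sizes(Depth) ->
    T = tree(Depth),
    {erts_debug:size(T), erts_debug:flat_size(T)}.
```

erts_debug:size/1 counts every shared subterm once, while erts_debug:flat_size/1 counts the term as it would look after being copied without sharing, so the gap between the two grows exponentially with the depth.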

Erlang/OTP 21 [RELEASE CANDIDATE 2] [erts-10.0] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe]

Eshell V10.0  (abort with ^G)
1> c(hog).
hog.erl:2: Warning: export_all flag enabled - all functions will be exported
hog.erl:29: Warning: variable 'Tree' is unused
{ok,hog}
2> MemoryFun = fun() -> round(memory(processes_used) / 1024 / 1024) end.
#Fun<erl_eval.20.6975923>
3> MemoryFun().
4
4> T = hog:tree(25), ok.
ok
5> erts_debug:size(T).
75
6> erts_debug:flat_size(T).
100663293

So far, so good.

7> MemoryFun().
1814

This already looks suspicious. An interesting fact about erlang:memory/1 is that its result looks completely unstable: running it again after a while gives a different value:

8> MemoryFun().
5739

Meanwhile, top reports 20,3% memory usage out of 7843892 KiB total (about 1555 MiB), so it’s closer to the first result.
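Since erlang:memory(processes_used) is a total over all Erlang processes on the node, a per-process breakdown might show more clearly where the memory sits. The helper below is only a sketch; the module name mem_top and the function top/1 are made up, while erlang:processes/0 and erlang:process_info/2 are standard BIFs.

```erlang
-module(mem_top).
-export([top/1]).

%% Returns the N processes using the most memory as [{Bytes, Pid}],
%% largest first. Processes that exit during the scan make
%% process_info/2 return undefined and are skipped by the pattern.
top(N) ->
    Sizes = [{Bytes, Pid}
             || Pid <- erlang:processes(),
                {memory, Bytes} <- [erlang:process_info(Pid, memory)]],
    lists:sublist(lists:reverse(lists:sort(Sizes)), N).
```

Calling mem_top:top(5) between the measurements and comparing the pids with self() would show whether the memory belongs to the process that evaluated the expression or to some other process.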

Interesting things also happen when we force garbage collection:

9> garbage_collect(), MemoryFun().
1814

This looks better; however, after a while the bigger result comes back, although it takes two more calls:

10> MemoryFun().
1814
11> MemoryFun().
5739

Let’s now create a tuple which references the same tree multiple times:

12> X = {T, T, T, T}, ok.
ok
eheap_alloc: Cannot allocate 5668310376 bytes of memory (of type "heap").

Crash dump is being written to: erl_crash.dump...done

The generated erl_crash.dump takes 6 gigabytes. It’s interesting that we get “ok”, which suggests that there is sufficient memory for X and that the Erlang shell tries to allocate the huge number of bytes for something else.

Also, the output of hog:main/0 is completely different from the one in the original report:

1> hog:main().
Creating tree:: allocated 0 bytes, heap size 376 bytes
Receiving tree:: allocated 0 bytes, heap size 233 bytes
true

Does this mean that memory sharing no longer works in Erlang? Or maybe it is a shell-related problem?
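One way to separate the two possibilities might be to repeat the experiment from a compiled module, so that the shell never holds the term, and to compare the size seen in the building process with the size seen by a process that received the term in a message. The sketch below assumes the same tree shape as in the size arithmetic above (2-tuples with both elements pointing to one subtree); the module share_check and its function names are hypothetical.

```erlang
-module(share_check).
-export([run/1]).

%% Assumed stand-in for hog:tree/1, not the original code.
tree(0) -> nil;
tree(N) when N > 0 ->
    Sub = tree(N - 1),
    {Sub, Sub}.

%% Returns {LocalWords, RemoteWords}: the size of {T, T, T, T} measured
%% with sharing in the process that built it, and measured the same way
%% in a process that received it as a message.
run(Depth) ->
    T = tree(Depth),
    X = {T, T, T, T},
    Local = erts_debug:size(X),
    Parent = self(),
    Ref = make_ref(),
    Pid = spawn(fun() ->
        receive
            {Ref, Term} -> Parent ! {Ref, erts_debug:size(Term)}
        end
    end),
    %% Sending X is the step that may lose sharing.
    Pid ! {Ref, X},
    receive
        {Ref, Remote} -> {Local, Remote}
    after 5000 -> timeout
    end.
```

The depth should be kept small (for example share_check:run(10)), because the received copy may be fully expanded. If the local number stays small while the remote one explodes, sharing within a process still works and the loss happens on copying between processes.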