[erlang-questions] Why are messages between processes copied?

Anthony Molinaro anthonym@REDACTED
Thu Feb 23 21:34:01 CET 2012


On Thu, Feb 23, 2012 at 12:57:14PM -0600, Tristan Sloughter wrote:
> I don't see it mentioned in this thread and thought it should. Large
> binaries are not stored in the heap (but instead a pointer to it is) and
> are reference counted, correct?
> 
> Would this fact not make it possible to easily test placing certain
> messages on this "shared space"? Instead of people trying to put Erlang on
> a shared heap -- which I think is a bad idea.

I decided to test the speed difference between sending a message as a
plain term and sending it as a binary, so I tried the following:

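-module(send_speed).

%% export_all keeps the test short; timer:tc/3 needs send_to_processes/2
%% to be exported.
-compile(export_all).

%% Time sending the same data as a list and as a binary to NumProcesses
%% freshly spawned processes. The receivers are spawned while the argument
%% list is built, so only the sends are timed.
%% Returns {ListSize, BinSize, Mterm, Mbin}, both times in microseconds.
time_it(NumProcesses, ListLength) ->
  Term = random_list(ListLength),
  Bin = term_to_binary(Term),
  {Mterm, ok} =
    timer:tc (send_speed, send_to_processes, [Term, spawn_N(NumProcesses)]),
  {Mbin, ok} =
    timer:tc (send_speed, send_to_processes, [Bin, spawn_N(NumProcesses)]),
  {iolist_size (Term), iolist_size (Bin), Mterm, Mbin}.

%% A list of N random integers in 1..255.
random_list(N) ->
  random:seed(now()),
  [ random:uniform(255) || _ <- lists:seq(1,N) ].

%% Spawn N processes that each wait for one message and then exit.
spawn_N (N) ->
  [ spawn(fun() -> receive _ -> ok end end) || _ <- lists:seq(1,N) ].

%% Send Thing to every process in the list.
send_to_processes(Thing, Processes) ->
  [ P ! Thing || P <- Processes ],
  ok.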

Then I ran this in the shell as follows:

Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0]
[hipe] [kernel-poll:false]

Eshell V5.8.5  (abort with ^G)
1> c(send_speed).
{ok,send_speed}
2> [ send_speed:time_it (10000, L) || L <- lists:seq (10,200,10) ].
[{10,14,7656,9396},
 {20,24,8892,7580},
 {30,34,10291,8261},
 {40,44,10766,8411},
 {50,54,11213,8080},
 {60,64,11697,7802},
 {70,74,12578,8802},
 {80,84,13281,10130},
 {90,94,13686,9701},
 {100,104,13589,10325},
 {110,114,15563,9724},
 {120,124,21094,9751},
 {130,134,18604,10466},
 {140,144,19157,10178},
 {150,154,19716,9812},
 {160,164,18653,9582},
 {170,174,19541,10453},
 {180,184,20433,8898},
 {190,194,21181,10227},
 {200,204,20349,9916}]
3> 

The four columns are the size of the list (one byte per element), the
size in bytes of the binary, and the times in microseconds (as reported
by timer:tc) to send the list and the binary, respectively, to 10000
processes.  So it seems like they correlate pretty well up until the
magic 64 byte value, at which point the sending of binaries seems to
remain cheaper.
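
As a rough check on the second column: the binary is always four bytes
larger than the list because term_to_binary encodes a short list of
byte-sized integers as a version byte, a one-byte tag and a two-byte
length, followed by the bytes themselves.  The sketch below shows that
layout and which list lengths produce a binary above 64 bytes.  The module
and function names are made up for illustration, and the 64-byte figure is
simply the limit assumed in the discussion above.

```erlang
-module(bin_layout).
-export([header/1, over_limit/1]).

%% Split the external term format of a list of bytes into its parts.
%% Assumes List is non-empty, holds only integers 0..255 and has fewer
%% than 65536 elements, so term_to_binary/1 uses the compact "string"
%% encoding (hypothetical helper, not part of the original test).
header(List) ->
    <<Version, Tag, Len:16, Payload/binary>> = term_to_binary(List),
    {Version, Tag, Len, byte_size(Payload)}.

%% Of the given list lengths, return those whose encoded binary is
%% larger than 64 bytes (the assumed heap-binary limit).
over_limit(Lengths) ->
    [ N || N <- Lengths,
           byte_size(term_to_binary(lists:duplicate(N, 1))) > 64 ].
```

For example, bin_layout:header(lists:seq(1,10)) returns {131,107,10,10}:
four header bytes plus ten payload bytes, which matches the 10/14 row.
With the lengths used in the run above, bin_layout:over_limit/1 returns
every length from 70 upwards, since a 60-element list encodes to exactly
64 bytes.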

Now, I don't do anything with the message on the other side, and once
you call binary_to_term you'll incur a copy out of the shared space.  But
if you were able to do something with the binary without turning it back
into the term, this might not be so bad.
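
A minimal sketch of what "doing something with the binary" could look
like on the receiving side: summing the bytes by pattern matching on the
binary directly, without ever calling binary_to_term.  The module name
bin_peek and the function sum_bytes/1 are hypothetical, and the match
assumes the compact encoding term_to_binary produces for a non-empty list
of integers 0..255 with fewer than 65536 elements.

```erlang
-module(bin_peek).
-export([sum_bytes/1]).

%% Sum the bytes carried in term_to_binary(ListOfBytes) without decoding
%% it back into a list. The first four bytes are the version byte, the
%% tag byte and the 16-bit length; the rest is the payload.
sum_bytes(<<131, 107, Len:16, Payload:Len/binary>>) ->
    sum(Payload, 0).

%% Walk the payload one byte at a time; no list is ever built.
sum(<<Byte, Rest/binary>>, Acc) -> sum(Rest, Acc + Byte);
sum(<<>>, Acc) -> Acc.
```

A receiver could then do something like
receive Bin when is_binary(Bin) -> bin_peek:sum_bytes(Bin) end
instead of decoding first.  Whether this is actually cheaper than
binary_to_term followed by lists:sum would need measuring; I have not
timed it.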

Hopefully I didn't mess up my test too horribly,

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym@REDACTED>


