<div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">As Dmitry says, the closure has to be sent along when the function is spawned on the other node. If the closure is large, you can expect the 11 MB to be sent at the moment of the spawn.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">The best way around it is to avoid capturing a large body of data in the function you spawn. My guess is that the closure references a large map or list, which in turn gets copied along with it. In some situations this also hurts when you spawn the same function locally, because the captured data is copied to the new process, so there is good reason to avoid it there as well (though there are some caveats, for example if the referenced data is part of the literal arena in the memory allocation system).</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 19, 2019 at 4:47 PM Filip Niksic &lt;<a href="mailto:fniksic@seas.upenn.edu">fniksic@seas.upenn.edu</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif">Hi all,</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">I am trying to understand why a node sends 11 MB of unknown data to another node while spawning a process on that node.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Let me briefly explain my setup. There are two nodes involved: main and a. I am running them in two Docker containers, which in turn are running in a simulated network in which I can inspect and analyze network traffic using Wireshark. 
Once the nodes are started, main spawns a process on node a with spawn_link(). In Wireshark I can observe an exchange of ErlDP (distribution protocol) packets. The spawn_link causes a colossal REG_SEND message to be sent from main to a; the message has a length of 11011057 bytes (11 MB) and is broken into 7605 TCP packets.</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Now, it has to be noted that one of the arguments to the spawned process is a function closure. Could it be that this closure causes the runtime to pack all of its data structures and pass them along with the message? If so, how can such a situation be avoided? Is there some general rule of thumb that function closures should not be passed as arguments in a distributed setting?<br></div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif">Thanks,</div><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><div><div dir="ltr" class="gmail-m_30155594762566719gmail_signature"><div dir="ltr"><div><font face="verdana,sans-serif">Filip<br></font></div><div><font face="verdana,sans-serif"><br></font></div></div></div></div></div>
_______________________________________________<br>
erlang-questions mailing list<br>
<a href="mailto:erlang-questions@erlang.org" target="_blank">erlang-questions@erlang.org</a><br>
<a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature">J.</div>