[erlang-questions] Discussion and proposal regarding rpc scalability

José Valim jose.valim@REDACTED
Thu Feb 11 22:04:07 CET 2016

Hello everyone,

I was reading the publication "Investigating the Scalability Limits of
Distributed Erlang
<http://www.dcs.gla.ac.uk/~amirg/publications/DE-Bench.pdf>" and one of the
conclusions is:

*> We observed that distributed Erlang scales linearly up to 150 nodes when
no global command is made. Our results reveal that the latency of rpc calls
rises as cluster size grows. This shows that spawn scales much better than
rpc and using spawn instead of rpc in the sake of scalability is advised. *

The reason why is highlighted in a previous section:

*> To find out why rpc’s latency increases as the cluster size grows, we
need to know more about rpc. (...) There is a generic server process (gen
server) on each Erlang node which is named rex. This process is responsible
for receiving and handling all rpc requests that come to an Erlang node.
After handling the request, generated results will be returned to the
source node. In addition to user applications, rpc is also used by many
built-in OTP modules, and so it can be overloaded as a shared service.*

In other words, the more applications we have relying on rpc, the more
likely rpc will become a bottleneck and increase latency. I believe we have
three options here:

1. Promote spawn over rpc, as the paper conclusion did (i.e. mention spawn
in the rpc docs and so on)
2. Leave things as is
3. Allow "more scalable" usage of rpc by supporting application specific
rpc instances

In particular, my proposal for 3 is to allow developers to spawn their own
rpc processes. In other words, we can expose:

%% start your own rpc
rpc:start_link(my_app_rpc)

%% invoke your own rpc at the given node
rpc:call({my_app_rpc, nodename}, foo, bar, [1, 2, 3])

This is a very simple solution that moves the bottleneck away from rpc's
rex process since developers can place their own rpc processes in their
application's tree. The code changes required to support this feature are
also minimal and almost all at the API level, i.e. supporting a tuple where
today a node is expected, or accepting the name as an argument, mimicking the
API provided by gen_server that rpc relies on. We won't change
implementation details. Finally, I believe it will provide a more
predictable usage of rpc.
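To make the idea concrete, here is a minimal sketch of an application-specific rpc server built directly on gen_server. It is not the rpc module's implementation and not the proposed patch: the module name app_rpc, the message format {call, M, F, A}, and the error tuples are all assumptions chosen for illustration. It only shows that a {Name, Node} destination maps naturally onto gen_server's own server reference.

```erlang
-module(app_rpc).
-behaviour(gen_server).

-export([start_link/1, call/4, call/5]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start a locally registered rpc server; meant to live in the
%% application's own supervision tree.
start_link(Name) when is_atom(Name) ->
    gen_server:start_link({local, Name}, ?MODULE, [], []).

%% Run apply(M, F, A) through the server registered as Name on Node.
call(Dest, M, F, A) ->
    call(Dest, M, F, A, infinity).

call({Name, Node}, M, F, A, Timeout) ->
    try
        gen_server:call({Name, Node}, {call, M, F, A}, Timeout)
    catch
        exit:Reason -> {badrpc, Reason}
    end.

init([]) ->
    {ok, nostate}.

handle_call({call, M, F, A}, From, State) ->
    %% Evaluate each request in its own process so that one slow call
    %% does not block the requests queued behind it.
    spawn(fun() ->
        Reply = try apply(M, F, A)
                catch Class:Reason -> {badrpc, {'EXIT', {Class, Reason}}}
                end,
        gen_server:reply(From, Reply)
    end),
    {noreply, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

With this in place, app_rpc:start_link(my_app_rpc) followed by app_rpc:call({my_app_rpc, node()}, lists, sum, [[1, 2, 3]]) routes the request through the application's own server rather than rex.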

Feedback is appreciated!

*José Valim*
Skype: jv.ptec
Founder and Director of R&D