[erlang-questions] Structure of a *scalable* application?

Michael Loftis mloftis@REDACTED
Fri Aug 22 18:24:39 CEST 2014


On Fri, Aug 22, 2014 at 7:24 AM, Chris Pacejo <colanderman@REDACTED> wrote:
> I can't find much info on this on the web or in the docs... how does
> one structure a *scalable* Erlang application?  By this, I mean a
> system consisting of multiple nodes, each of which runs the same code,
> so that we can take advantage of more CPU/memory/disk resources.
>
> Structures I can think of:
>
> 1. Single instance of application, one worker launched on each node (using rpc).
> 2. Multiple instances of application, one per node, each with one worker.
> 3. Multiple nodes, not connected, communicating only via TCP (ick!).
> 4. Something else??

Those are all design decisions, none of which will affect your scaling
much IMO, but one of your assumptions, only one worker per node, your
"scaling" would suck, you'd be taking no advantage of parallelism
within a node (And in general in Erlang a node is a machine but
certainly not always) . It is completely application and workload
dependent as to how exactly and where exactly to parallelize.  Most
erlang applications are designed to run an instance per node of the
application module, they then RPC to other instances as necessary (see
mnesia).  I can't think of any offhand where a single application
instance runs across multiple nodes but I'm sure it exists.  Mostly
you've got one application instance per node to manage that node and
they'll communicate with each other to work together.  But you'll
basically rarely, if ever, see a single worker, when your'e talking
about "Scaling" since that means parallelism in this context, because
that single worker just doesn't scale for any workload.  But...some
workloads cannot be parallelized.  Normally though about the only time
you have a single worker is when you get down to say, a single
external connection.  That particular connection will often have a
worker process, or it may be handled as part of a pool of worker
processes.  Erlang processes (not nodes, not OS processes) are
lightweight.  This is Erlang 101.

Further as to 3) the erlang distribution protocol runs over TCP.  It
has it's own scaling limitations as some have found out - f/ex if
you're pushing many megabyte files at a high rate, it might be better
to have your own RPC mechanism to transfer those rather than sending
them over erlang distribution directly - the distribution protocol
comes down to a single thread on a single socket and can become a
bottleneck for inter-node communications.

Really though the best advice is to get it right first, then get it
fast.  Early optimization often isn't nearly as much of a win as
people think.
>
> Thanks.
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://erlang.org/mailman/listinfo/erlang-questions



-- 

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler



More information about the erlang-questions mailing list