[erlang-questions] Massive Numbers of Actors vs. Massive Numbers of Objects vs. ????

Wed Feb 29 15:09:50 CET 2012

On Tue, Feb 28, 2012 at 11:03 AM, Miles Fidelman
<mfidelman@REDACTED> wrote:
> Folks,
>
> I'm trying to get a handle on core technology for an application that's
> going to involve massive numbers of entities - where the entities want to
> have characteristics that draw from both the object and actor models.
>
> Think of something like massive numbers of stored email messages, where each
> message is addressable, can respond to events, and in some cases can
> initiate events - for example, on receiving an email message, a reader can
> fill in a form, and have that update every copy of the message spread across
> dozens (or hundreds, or thousands) of mailboxes/folders distributed across
> the Internet. Or where an email message, stored in some folder, can wake up
> and send a reminder.
>
> One sort of wants to blend characteristics of:
> - messages (small, static, easy to store huge numbers, easy to move around)
> - objects (data and methods bound together, inheritance, ...)
> - actors (massive concurrency, active)
>
> The topic has come up before, in discussions of active objects, reactive
> objects, concurrent objects, etc. - I'm wondering what the current state of
> the art and practice look like.
>
> I'm thinking that Erlang might be a nice operating environment for such a
> beast, but wonder at what point one hits limits in the numbers of actors
> floating around.  I'm also wondering what other environments might blend
> these characteristics.
>
> Thoughts? Comments?

Erlang is an excellent option if you want to build reliable software
without spending a lot of time/money.

The terms "objects" and "actors" are pretty generalized in your
description, but I suspect you can draw from characteristics of both
in building a solution in Erlang.

If this system is going to live "throughout the Internet" you're
talking about multiple Erlang nodes running on multiple servers.

If you want to scale beyond a few hundred such nodes, you should not
rely on "distributed Erlang" (starting named nodes that communicate
using Erlang's built-in message passing primitives). You'll need some
communication protocol that works well over unreliable networks. The
usual suspects here are HTTP (REST, WebSockets) and 0MQ.

If a node's state needs to survive a crash, restart, etc. you need a
scheme to persist that state and recover it on startup. There are lots
of options here in Erlang: Mnesia and DETS ship with OTP, or you can
use an external database.
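
As a rough illustration of persist-and-recover, here is a minimal sketch
using DETS, the disk-based term storage that ships with OTP. The module
name entity_store and the save/3 and load/2 helpers are made up for this
example, and it opens and closes the table on every call for simplicity,
which a real system would probably avoid.

```erlang
-module(entity_store).
-export([save/3, load/2]).

%% Hypothetical helper: write one entity's state to a DETS file on disk.
%% File is a filename string, Key identifies the entity.
save(File, Key, State) ->
    {ok, Tab} = dets:open_file(entity_store, [{file, File}]),
    ok = dets:insert(Tab, {Key, State}),
    ok = dets:close(Tab).

%% Hypothetical helper: read the state back, e.g. when a node restarts.
%% Returns {ok, State} or not_found.
load(File, Key) ->
    {ok, Tab} = dets:open_file(entity_store, [{file, File}]),
    Result = case dets:lookup(Tab, Key) of
                 [{Key, State}] -> {ok, State};
                 [] -> not_found
             end,
    ok = dets:close(Tab),
    Result.
```

A process would call entity_store:save/3 when its state changes and
entity_store:load/2 from its init code after a restart.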

Whether each "entity" in your system maps to an Erlang process is a
matter of design -- but Erlang does give you options that you don't
typically have in other language environments. The usual test is,
"does the activity correspond to a real-world thread of execution?"
While Erlang processes are cheap relative to OS processes or threads,
they still have a cost -- you have to weigh the trade-offs.
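
One way to put a number on that cost is to measure it on your own
machine. The sketch below spawns N idle processes and reports the
approximate memory used per process. The module and function names are
made up for this example, and the result depends on the Erlang release
and emulator settings. N must stay under the emulator's process limit,
which can be raised with the +P flag to erl.

```erlang
-module(entity_cost).
-export([bytes_per_process/1]).

%% Spawn N idle processes and return the approximate number of bytes
%% each one adds to the memory the emulator reports for processes.
bytes_per_process(N) when is_integer(N), N > 0 ->
    Before = erlang:memory(processes),
    Pids = [spawn(fun idle/0) || _ <- lists:seq(1, N)],
    After = erlang:memory(processes),
    %% Tell every spawned process to exit so nothing is left running.
    lists:foreach(fun(Pid) -> Pid ! stop end, Pids),
    (After - Before) div N.

%% An idle "entity": it does nothing until it is told to stop.
idle() ->
    receive
        stop -> ok
    end.
```

Multiplying the result of entity_cost:bytes_per_process(100000) by the
number of entities you expect per node gives a first estimate of whether
one process per entity fits in memory.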

I think you'll find that Erlang provides outstanding tools/facilities
for building the type of system you're describing -- but you'll need to
steer the design based on your goals and the laws of physics :)

Garrett