[erlang-questions] Massive Numbers of Actors vs. Massive Numbers of Objects vs. ????

Wed Feb 29 22:36:07 CET 2012

Garrett Smith wrote:
> On Tue, Feb 28, 2012 at 11:03 AM, Miles Fidelman
> <mfidelman@REDACTED>  wrote:
>> Folks,
>>
>> I'm trying to get a handle on core technology for an application that's
>> going to involve massive numbers of entities - where the entities want to
>> have characteristics that draw from both the object and actor models.
>>
> Erlang is an excellent option if you want to build reliable software
> without spending a lot of time/money.

Well, that much I get :-)
>
> The terms "objects" and "actors" are pretty generalized in your
> description, but I suspect you can draw from characteristics of both
> in building a solution in Erlang.

Basic notion is to deal with huge numbers of HTML emails containing 
executable JavaScript, and make them addressable - e.g., send out a 
document by email, then follow up with an update that applies an edit to 
the original document.  Not unlike a patch applied to a piece of code.

Three conventional models present themselves, none of which quite fits:

i. pure messaging - use message-ids and in-reply-to: headers to sort and 
order messages, the client software applies "patches" to the initial 
message -- seems a bit too rigid in terms of having to build all the 
update logic into the clients

ii. object model - first message is treated as an object, subsequent 
messages are passed to an update method - allows each object to have a 
different update method - storing things in an object-oriented database 
with triggers seems like a viable approach

iii. actor model - first message is an actor, updates are simply 
messages addressed to the first message, once an update is delivered, 
it's up to the actor to deal with it
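
To make the difference between (i) and (iii) concrete, here is a minimal 
sketch in Python using the standard library's email package.  The names 
make_msg, Entity, and deliver, and the "old=>new" patch format, are 
made up for illustration; nothing here is an existing system.  The point 
is only that the update logic lives in the entity created from the first 
message, and that routing is done on the In-Reply-To: header.

```python
from email.message import EmailMessage


def make_msg(msg_id, body, in_reply_to=None):
    """Build a message; an update names its target via In-Reply-To."""
    msg = EmailMessage()
    msg["Message-ID"] = msg_id
    if in_reply_to is not None:
        msg["In-Reply-To"] = in_reply_to
    msg.set_content(body)
    return msg


class Entity:
    """Hypothetical addressable entity created from the first message."""

    def __init__(self, first):
        self.address = str(first["Message-ID"])
        self.text = first.get_content().strip()

    def receive(self, update):
        # Each entity decides how to apply an update. This one assumes
        # an invented "old=>new" body and does a plain text replacement.
        old, new = update.get_content().strip().split("=>", 1)
        self.text = self.text.replace(old, new)


def deliver(entities, msg):
    """Create an entity for a first message, or route an update to it."""
    target = msg["In-Reply-To"]
    if target is None:
        entity = Entity(msg)
        entities[entity.address] = entity
        return entity
    entity = entities[str(target)]
    entity.receive(msg)
    return entity
```

Delivering a first message with body "status: draft" and then an update 
"draft=>final" addressed to it leaves the entity holding "status: final"; 
the client never needs to know how the patch was applied.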

Message and object databases are relatively well understood, and 
supported by relatively mature technology.

The actor model seems conceptually cleaner when thinking about 
independent, addressable entities that can receive and react to incoming 
messages - but I haven't seen any examples of systems built around huge 
numbers of persistent actors - particularly ones where the actors are 
largely inactive (in this case, a file cabinet full of messages that 
are largely dormant, but each one might be updated every once in a long 
while, and some might "wake up" every once in a while - say, to send a 
reminder to someone).

In some sense, I'm describing an "actor-oriented database" - a place to 
park large numbers of persistent actors, surrounded by mechanisms to 
deliver messages, and allow them to wake up after timeouts.
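
A rough sketch of what I mean, in Python with an in-memory SQLite table 
standing in for the persistent store.  ActorStore, park, deliver, 
wake_due, and reminder_behavior are all hypothetical names, and a real 
system would need per-actor behaviors, durability, and concurrency 
control that this leaves out.  Dormant actors are just rows; one is 
loaded only when a message arrives or its wake-up time has passed.

```python
import json
import sqlite3


class ActorStore:
    """Hypothetical store: parked actors are rows of serialized state."""

    def __init__(self, behavior):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE actors ("
            "address TEXT PRIMARY KEY, state TEXT NOT NULL, wake_at REAL)"
        )
        self.behavior = behavior  # function(state, event) -> (state, wake_at)

    def park(self, address, state, wake_at=None):
        """Persist an actor's state and optional wake-up time."""
        self.db.execute(
            "INSERT OR REPLACE INTO actors VALUES (?, ?, ?)",
            (address, json.dumps(state), wake_at),
        )

    def _activate(self, address, event):
        # Load the dormant actor, let it react, then park it again.
        row = self.db.execute(
            "SELECT state FROM actors WHERE address = ?", (address,)
        ).fetchone()
        if row is None:
            raise KeyError(address)
        state, wake_at = self.behavior(json.loads(row[0]), event)
        self.park(address, state, wake_at)
        return state

    def deliver(self, address, message):
        """Deliver a message to the actor at the given address."""
        return self._activate(address, {"type": "message", "body": message})

    def wake_due(self, now):
        """Wake every actor whose timeout has passed as of `now`."""
        due = self.db.execute(
            "SELECT address FROM actors "
            "WHERE wake_at IS NOT NULL AND wake_at <= ?",
            (now,),
        ).fetchall()
        return [
            self._activate(address, {"type": "timeout", "now": now})
            for (address,) in due
        ]


def reminder_behavior(state, event):
    """Example behavior: record updates, fire a single reminder."""
    if event["type"] == "message":
        state["updates"].append(event["body"])
        return state, state["remind_at"]  # keep any pending reminder
    state["reminded"] = True
    state["remind_at"] = None
    return state, None
```

Something outside the store (a cron job, a timer loop) would call 
wake_due periodically; the actors themselves cost nothing while parked.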

I'm kind of surprised somebody hasn't built such a beast - at least as a 
research experiment.

> If this system is going to live "throughout the Internet" you're
> talking about multiple Erlang nodes running on multiple servers.
>
> If you want to scale beyond a few hundred such nodes, you should not
> rely on "distributed Erlang" (starting named nodes that communicate
> using Erlang's built in message passing primitives). You'll need some
> communication protocol that works well over unreliable networks. The
> usual suspects here are HTTP (REST, web sockets) and 0MQ.

Yup.  That's actually where I'm focusing - communications infrastructure 
for distributing "entities" and then for distributing messages among 
those entities.  But... moving forward ultimately requires making some 
choices about how those entities are going to be represented, stored, 
and accessed.

Representation is pretty much a given - HTML+JavaScript - easy to move 
around (email, nntp, Atom, XMPP, http), web browsers are pretty much the 
universal gui.

Storage, addressing, and invocation are the current open questions.  Mbox 
files (or even MH-style directories) don't quite do it.  My first, 
simple-minded thought is to store each message in a document-oriented 
database, specifically CouchDB - which seems to do most of what I need.  
Also been toying with using an XML database, specifically eXist (AtomPub 
is a nice interface for dealing with document-like things).

But... doing due diligence in trying to survey the landscape for 
something more general purpose.

-- 
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra