[erlang-questions] looking into actor-based algorithms?

Thu May 28 15:09:33 CEST 2015

On 05/27, Rich Morin wrote:
>In particular, I'd like to find examples of actor-based algorithms
>(i.e., algorithms which rely on the actor model).  I'm particularly
>interested in graph analysis and presentation, but I'd be delighted
>to hear about anything that seems relevant.
>

In my experience, this tends to happen a lot less than it could.  
Erlang's main objective is fault-tolerance, and not being "an actor 
shotgun" where you can take a problem and shoot it in the face with 
actors.

It's not that you *can't* do it, it's that people often *won't* do it.  
In live production systems I've seen built in Erlang, actors will rather 
be used architecturally, to properly isolate and regroup system 
components that should fail together or separately.
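
A minimal sketch of that distinction, using plain links and monitors
rather than a full OTP supervision tree. The module and function names
(failure_groups, together/0, separately/0) are hypothetical and only
there for illustration; a real system would more likely express the
same grouping through supervisor restart strategies.

```erlang
-module(failure_groups).
-export([together/0, separately/0]).

%% Linked processes form one failure unit: when the worker crashes,
%% the partner it is linked to is taken down with the same reason.
together() ->
    Self = self(),
    Partner = spawn(fun() ->
        W = spawn_link(fun() -> receive crash -> exit(boom) end end),
        Self ! {worker, W},
        %% Block forever; only the link can end this process.
        receive never -> ok end
    end),
    Ref = erlang:monitor(process, Partner),
    Worker = receive {worker, Pid} -> Pid end,
    Worker ! crash,
    receive
        {'DOWN', Ref, process, Partner, Reason} -> {partner_died, Reason}
    after 1000 ->
        partner_alive
    end.

%% A monitor only observes: the worker crashes, the observer gets a
%% 'DOWN' message and keeps running.
separately() ->
    Worker = spawn(fun() -> receive crash -> exit(boom) end end),
    Ref = erlang:monitor(process, Worker),
    Worker ! crash,
    receive
        {'DOWN', Ref, process, Worker, Reason} -> {observer_alive, Reason}
    after 1000 ->
        no_crash_seen
    end.
```

Calling failure_groups:together() should report that the partner died
along with the worker, while failure_groups:separately() should report
that the observer survived the same crash.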

Literature with actors seems to also take the route of agent-based 
systems or automata; meaning you will get assumptions that you have 
large numbers of underpowered nodes communicating over the network with 
very little concurrency on each, or assumptions that you have access to 
things like shared memory or *must* enforce it.

Erlang falls into a fun gap where it has a lot of local concurrency 
while still supporting distribution, and it turns out that (in my 
experience) few papers, books, or algorithms seem to be tailored to that 
specific mix.

One book that particularly fits the bill is The Handbook of 
Neuroevolution Through Erlang, by Gene I.  Sher 
(https://www.springer.com/gp/book/9781461444626) which has been written 
with the explicit goal of doing Neuroevolution with Erlang.

>A few weeks ago, I asked for help on elixir-lang-talk.  José Valim
>suggested this book, which I have been reading with _great_ interest:
>
> Distributed Algorithms for Message-Passing Systems
> Michel Raynal, Springer Berlin Heidelberg, 2013
> http://www.amazon.com/dp/B00DPE0EXG
>

It's rather well-written and goes in depth on a lot of the basic (or 
rather foundational) concepts of distributed computing, and more 
advanced ones too.

It's approachable, although not as much if you have an aversion to 
mathematical notation, but it's possible to pull through by focusing on 
the pseudo code and descriptions.

An annoying thing about the book is that it never ever mentions 
failures, time outs, and network issues in distributed algorithms. I 
went the entire book thinking that nearly none of the material in there 
was of practical use.

Then you go to the afterword, after 470 pages of content and algorithm 
reading, down to the section "A Series of Books" where it is suddenly 
revealed to you that the book's purpose was in fact to have algorithms 
about failure-free asynchronous systems (these things don't really 
exist; they're always failure-prone in some way! They're more of a 
mental framework to explore the algorithms themselves).

The author does recommend alternative books to get that content, but hot 
damn would I have liked to know about it beforehand, maybe in the 
preface or something, given there is one already.

In any case, the approach taken in the book is pretty cool for getting 
into that decentralized mode of thought about organizing actors 
together, but the 
disconnect from the real world in how it's built means it should be seen 
more as foundational work leading the reader to other sources for 
real-world implementations in my opinion.

The other 'gotcha' is that for most of these algorithms, given they work 
mostly in a failure-free case and that very often you'll want 
node-boundaries to be important in how you organize topology, you may 
end up with better practical results using a centralized approach on 
each node, and then going on and doing that on all nodes; you sooner or 
later end up with algorithms much more similar to "provide a service and 
register yourself" or something a bit like the IP networks' organization 
with routers gatekeeping local clusters.

In practice, that's why Erlang looks like the canonical 'microservices' 
(or nano-services, as Garrett re-dubbed them last week) approach people 
seem to embrace. It ends up being easier and more practical to have a 
group of actors provide a given functionality or service, register them 
in some node-local or global registry, and contact them that way.
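
As a rough sketch of what "provide a service and register yourself"
can look like, here is a tiny gen_server registered under a node-local
name. The module name kv_service and its store/2 and fetch/1 functions
are made up for this example; only the gen_server calls are standard
OTP.

```erlang
-module(kv_service).
-behaviour(gen_server).

-export([start_link/0, store/2, fetch/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the service and register it under a node-local name, so
%% clients never need to know its pid.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Clients contact the service through the registered name.
store(Key, Value) ->
    gen_server:call(?MODULE, {store, Key, Value}).

fetch(Key) ->
    gen_server:call(?MODULE, {fetch, Key}).

%% The state is a plain map owned by this one process.
init([]) ->
    {ok, #{}}.

handle_call({store, Key, Value}, _From, State) ->
    {reply, ok, State#{Key => Value}};
handle_call({fetch, Key}, _From, State) ->
    %% maps:find/2 returns {ok, Value} or the atom error.
    {reply, maps:find(Key, State), State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Swapping {local, ?MODULE} for {global, ?MODULE} in start_link/0, and
calling through {global, ?MODULE}, would register the same service
cluster-wide instead of per node.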

I'm sorry if I end up sounding a bit negative; I'd be really eager to 
see fancypants good algorithms implementable in Erlang (or Elixir, or 
LFE, or efene, etc.) in a production setting, but I haven't seen many.

So there seems to be this divide between "I toy with this and use Erlang 
as an environment to experiment" and "I ship actual systems" when it 
comes to using actors in specific algorithms that are hand-made for 
actors.

Part of that is probably that large systems shipped in Erlang often have 
more users than processors, so designing your concurrency/parallelism to 
be per-user (for example) is likely to give you good performance and 
processor usage while retaining a mostly sequential model for each of 
them; this makes it a lot easier to reason about than a set of 
inherently actor-based algorithms, and I'm not surprised to see that 
approach win in the wild.
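
To illustrate the per-user idea, here is a small sketch with one
process per user, each keeping its own state and handling its messages
one at a time. The module user_session, its functions, and the running
total it keeps are all hypothetical stand-ins for whatever per-user
state a real system would hold.

```erlang
-module(user_session).
-export([start/1, add/2, total/1, demo/0]).

%% One process per user; its state is only ever touched sequentially.
start(UserId) ->
    spawn(fun() -> loop(UserId, 0) end).

%% Asynchronous update: fire and forget.
add(Pid, N) ->
    Pid ! {add, N},
    ok.

%% Synchronous read, tagged with a unique reference.
total(Pid) ->
    Ref = make_ref(),
    Pid ! {total, self(), Ref},
    receive
        {Ref, Total} -> Total
    after 1000 ->
        timeout
    end.

loop(UserId, Total) ->
    receive
        {add, N} ->
            loop(UserId, Total + N);
        {total, From, Ref} ->
            From ! {Ref, Total},
            loop(UserId, Total)
    end.

%% The sessions run concurrently with one another, but each one
%% processes its own mailbox in order.
demo() ->
    Sessions = [{U, start(U)} || U <- [alice, bob, carol]],
    lists:foreach(
        fun({_User, Pid}) ->
            lists:foreach(fun(N) -> add(Pid, N) end, [1, 2, 3])
        end,
        Sessions),
    [{U, total(Pid)} || {U, Pid} <- Sessions].
```

The parallelism comes from having many users, not from anything clever
inside a single session, which is why each loop/2 stays easy to reason
about.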

Regards,
Fred.