[erlang-questions] How to think and reason when designing process- and message-centric systems?

Fri Dec 16 00:35:23 CET 2016

On 15/12/2016 12:04, IRLeif wrote:
> Dear Erlang community,
>
> This is my first email to the mailing list. I apologize in advance if 
> this is odd or off-topic.
>
> Coming from an object-oriented and data-centric background, I have 
> cognitive difficulties when it comes to conceptualizing, thinking 
> about and designing systems consisting of modules, processes and 
> key-value data stores.
>
> My brain reverts to thinking about classes, objects, inheritance 
> trees, encapsulation and SQL-style relational data models. I'm afraid 
> this could lead to unidiomatic Erlang system architectures and 
> implementations, which would be undesirable.
>
> Here are some of the essential complexities I have difficulties grasping:
>
> A) Identifying discrete modules and processes and finding good names 
> for them.
> B) Appointing supervisor and worker modules; defining process hierarchies.
> C) Deciding which processes should communicate with each other and how.
> D) Designing a sensible persistent data model with Mnesia or other 
> NoSQL data models (e.g. using CouchDB).
> E) Deciding which processes should read and write persistent data records.
> F) Incorporating global modules/"shared facilities" like event 
> handlers, loggers, etc.
> G) Visualizing the system architecture, processes and communication 
> lines; what kind of graphics to use.
> H) Organizing source code files into separate projects and directory 
> structures.
>
> Questions:
>
> 1) How do you unlearn "bad habits" from object-oriented way of thinking?
> 2) How do you think and reason about process-centric systems designs?
> 3) When designing a new system, how do you approach the above activities?
>

Well, for me Erlang is like a better version of OOP: it is how classes 
should have been implemented from the beginning. You simply have to 
mentally map classes to actors <https://en.wikipedia.org/wiki/Actor_model> 
and method invocation to message passing.

An actor is like a class that creates "living" entities instead of 
pieces of data. Given the definition of an actor (in an Erlang module), 
you can create multiple actors that share the same code. Each actor 
maintains its own state and exposes an API through which other actors 
can read or update that state.
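
To make this concrete, here is a minimal sketch of such an actor written 
as a plain Erlang process. The module name counter_actor and its 
functions are made up for illustration; real code would more likely use 
an OTP behaviour, but the idea is the same: the state lives inside the 
process and the exported functions are the only way to reach it.

```erlang
-module(counter_actor).
-export([start/1, increment/1, value/1, stop/1]).

%% Spawn a new actor. Every call creates an independent process that
%% runs the same code but keeps its own private state.
start(Initial) ->
    spawn(fun() -> loop(Initial) end).

%% Public API: asynchronous update, no reply expected.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Public API: synchronous read. The unique reference lets us match
%% the reply that belongs to this particular request.
value(Pid) ->
    Ref = make_ref(),
    Pid ! {value, self(), Ref},
    receive
        {Ref, Value} -> Value
    after 5000 ->
        timeout
    end.

stop(Pid) ->
    Pid ! stop,
    ok.

%% The state exists only as the argument of this recursive loop.
loop(Count) ->
    receive
        increment ->
            loop(Count + 1);
        {value, From, Ref} ->
            From ! {Ref, Count},
            loop(Count);
        stop ->
            ok
    end.
```

Calling counter_actor:start(0) twice gives two actors that share the 
code above but count independently of each other.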

In OOP an instance of a class is simply some piece of memory to keep the 
state - instance variables. Unless responsibility for the state is 
delegated to specific class or instance methods, any method can change 
the state in a haphazard way. You need to coordinate which methods update 
which variables, add setters and getters, and restrict the variables with 
private/protected access modifiers, just to protect the state from 
becoming undefined or corrupted.

An actor does the same, but better. Whereas you can find a way to update 
an instance without invoking any of its methods (e.g. using pointers), it 
is not possible to update the state of an actor if it doesn't expose an 
API for that - Erlang doesn't provide any language construct to do it 
(from the outside you can only kill an actor).

But the main difference is that whereas instance methods run 
sequentially in the order in which they are called, actors actually run 
in parallel. If you have ever tried to implement a concurrent 
computation in OOP, you will know that the two don't go together very 
well. If you have 10 processors and 100 instances of classes, you can't 
distribute the instances across the processors/servers to share the 
load. It doesn't even make sense. You would need to use some message 
passing library, e.g. MPI, to distribute such code across processors.

However, in Erlang the distribution is done by the VM (BEAM), whose 
schedulers transparently deliver messages sent between actors no 
matter which processor those actors are running on. Actors can even be 
spread across multiple nodes connected over a network and still exchange 
messages in exactly the same way.
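
As a rough sketch of what that buys you, the hypothetical module below 
(par_demo is my own name, not a standard library module) runs one 
process per list element and lets the BEAM schedulers decide which core 
runs which process. It is deliberately naive: if F crashes in a worker, 
the caller waits forever, so treat it as an illustration only.

```erlang
-module(par_demo).
-export([pmap/2]).

%% Apply F to every element of List, using one process per element.
%% Nothing here mentions cores or threads; scheduling is left to the VM.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                %% Each worker computes its result and mails it back,
                %% tagged with a unique reference.
                spawn(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Collect the replies in the original order by waiting for each
    %% reference in turn (selective receive).
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

For example, par_demo:pmap(fun(X) -> X * X end, lists:seq(1, 100)) 
starts 100 short-lived actors. The send operator (Pid ! Message) is 
written the same way whether Pid lives on the local node or on a remote 
one.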

In Erlang an actor is a process, usually implemented with one of the 
OTP behaviours, e.g. gen_server 
<http://erlang.org/doc/man/gen_server.html>, gen_fsm 
<http://erlang.org/doc/man/gen_fsm.html>. During execution an Erlang 
application can have as many running processes/actors as an OOP 
application can have created instances of classes.
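
Here is a minimal gen_server sketch, assuming a made-up "account" 
module, to show how the class/method analogy looks in practice: 
start_link/1 plays the role of a constructor, the exported API functions 
play the role of methods, and the callback argument Balance is the 
private state. Only the callbacks needed for the example are 
implemented.

```erlang
-module(account).
-behaviour(gen_server).

%% Public API (the "methods")
-export([start_link/1, deposit/2, withdraw/2, balance/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

%% Roughly a constructor: starts a new process holding its own balance.
start_link(InitialBalance) ->
    gen_server:start_link(?MODULE, InitialBalance, []).

%% Asynchronous request: the caller does not wait for a reply.
deposit(Pid, Amount) when Amount > 0 ->
    gen_server:cast(Pid, {deposit, Amount}).

%% Synchronous requests: the caller blocks until the server replies.
withdraw(Pid, Amount) when Amount > 0 ->
    gen_server:call(Pid, {withdraw, Amount}).

balance(Pid) ->
    gen_server:call(Pid, balance).

%% Callbacks run inside the server process, one message at a time,
%% so no locking around the state is needed.
init(InitialBalance) ->
    {ok, InitialBalance}.

handle_call({withdraw, Amount}, _From, Balance) when Amount =< Balance ->
    {reply, ok, Balance - Amount};
handle_call({withdraw, _Amount}, _From, Balance) ->
    {reply, {error, insufficient_funds}, Balance};
handle_call(balance, _From, Balance) ->
    {reply, Balance, Balance}.

handle_cast({deposit, Amount}, Balance) ->
    {noreply, Balance + Amount}.
```

Each {ok, Pid} returned by account:start_link/1 is a separate actor, 
much like each call to "new" returns a separate instance.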

You could start by trying to:
  - map global state (global variables, data held in static 
classes/instances used by the whole application) to ets 
<http://erlang.org/doc/man/ets.html> or mnesia 
<http://erlang.org/doc/man/mnesia.html>.
  - map singletons and classes that maintain state or provide access to 
resources to actors/processes.
  - map classes that only provide a collection of functions to Erlang 
modules.
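
For the first point, a small sketch of "global state in ets" could look 
like the module below. The module name app_config, the table name and 
the functions store/2 and fetch/2 are all hypothetical; the ets calls 
themselves (ets:new/2, ets:insert/2, ets:lookup/2) are the standard 
ones. Note that an ets table is owned by the process that creates it 
and is deleted when that process exits, so in a real system init/0 
would normally be called from a long-lived process.

```erlang
-module(app_config).
-export([init/0, store/2, fetch/2]).

-define(TABLE, app_config_table).

%% Create a named, public table that any process on the node can use.
%% The calling process becomes the owner of the table.
init() ->
    ets:new(?TABLE, [named_table, public, set, {read_concurrency, true}]),
    ok.

%% Insert or overwrite a value; in a 'set' table keys are unique.
store(Key, Value) ->
    true = ets:insert(?TABLE, {Key, Value}),
    ok.

%% Read a value directly from the table, without messaging any
%% process; fall back to Default when the key is absent.
fetch(Key, Default) ->
    case ets:lookup(?TABLE, Key) of
        [{Key, Value}] -> Value;
        [] -> Default
    end.
```

A module like this is also an instance of the third point: apart from 
the table it wraps, it is just a collection of functions with no 
process of its own.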

Many books discussing OOP try to map the world to classes and instances 
(e.g. a class 'washing machine' that has methods 'wash', 'dry', etc). 
For me it's easier to map the surrounding world to actors rather than 
classes because objects in reality don't wait for each other. They 
exist, respond to actions and can change in real time independently of 
each other - exactly like actors.

I hope this helps
Greg
