[erlang-questions] Granularity of process (and module) identification. Was: Supervision strategies to automatically restart dynamically added children

Wed Mar 9 12:03:07 CET 2011

On Wed, 09 Mar 2011 09:56:02 +1100, Lukas Larsson  
<lukas.larsson@REDACTED> wrote:

> You can still do a logical separation between a player and a game, it  
> does not have to be physical. By creating an opaque datatype for each  
> player which you only can access through the player module you put a  
> separation in between the two concepts but still keep the same  
> concurrent activities as there are in the world. i.e.
>
> -module(player).
> %% Private record
> -record(player, {name}).
>
> create() ->
>   #player{}.
>
> set_name(Name, Player) ->
>   Player#player{ name = Name }.
>
> get_name(Player) ->
>   Player#player.name.
>
> etc.
>
> This makes it (at least for me) possible to enforce a logical model upon
> my physical restrictions which makes my code easier to read and also to
> maintain when upgrading.
>
> Lukas
> ----- Original Message -----
> From: "Dhananjay Nene" <dhananjay.nene@REDACTED>
> To: erlang-questions@REDACTED
> Sent: Tuesday, March 8, 2011 10:18:49 PM GMT +01:00 Amsterdam / Berlin /  
> Bern / Rome / Stockholm / Vienna
> Subject: [erlang-questions] Granularity of process (and module)  
> identification. Was: Supervision strategies to automatically restart  
> dynamically added children
>
> On Tue, Mar 8, 2011 at 4:13 PM, Edmond Begumisa
> <ebegumisa@REDACTED> wrote:
>> On Tue, 08 Mar 2011 17:52:29 +1100, Dhananjay Nene
>> <dhananjay.nene@REDACTED> wrote:
>>
>>> On Mon, Mar 7, 2011 at 5:08 AM, Edmond Begumisa
>>> <ebegumisa@REDACTED> wrote:
>>>>
>>>> Hi Dhananjay,
>>>>
>>>> I too struggled with this exact question for quite some time so I'll
>>>> chime in here on the two techniques I used to solve it...
>>>> On Thu, 03 Mar 2011 05:02:06 +1100, Dhananjay Nene
>>>> <dhananjay.nene@REDACTED> wrote:
>>>>
>>>>>
>>>>> Question in short : If I have a supervisor which has a number of
>>>>> dynamic children, how do I set up a mechanism where in case of a
>>>>> complete system crash, all the dynamic children restart at the point
>>>>> they were when the system (including the supervisor) crashed.
>>>>>
>>>>> Question in long :
>>>>> =============
>>>>>
>>>>> Sample Context : A bowling game
>>>>> -------------------------------------------------
>>>>>
>>>>> Let's say I am writing the software necessary to track various games
>>>>> at a bowling alley. I've set up the following processes :
>>>>>
>>>>> a. Lanes : If there are 10 lanes, there are 10 processes, one for each
>>>>> lane. These stay fixed for the entire duration of the program
>>>>> b. Games : A group of players might get together to start a game on a
>>>>> free lane. A new game will get created to track the game through its
>>>>> completion. When the game is over, this process shall terminate
>>>>> c. Players : Each game has a number of players. One process
>>>>> "player_game" is started per player. Sample state of a player game
>>>>> would include current score for the player and if the last two rolls
>>>>> were strike or a spare. For the purpose of brevity, the remainder of
>>>>> this mail only refers to this process and ignores the others
>>>>>
>>>>
>>>> You could reduce complexity by having each lane process maintain its
>>>> current game (players and scores) as part of its state. The game and
>>>> player_game processes appear unnecessarily confusing to me.
>>>>
>>>
>>> Interesting point. The lanes are the only static aspects of the game.
>>> I tried to consider whether it would make any difference from a client
>>> API perspective, but I imagine for a client, there is no particular
>>> reason to believe a lane is a better or worse abstraction than a game
>>> (or a player_game).
>>>
>>>>> Objective :
>>>>> ---------------
>>>>>
>>>>> Assuming this is a single node implementation, if the machine were to
>>>>> crash, upon machine / node restart, all the player_games should be
>>>>> restarted and should be at the point where the player_games were when
>>>>> the machine crashed.
>>>>>
>>>>> Possible supervision strategy :
>>>>> --------------------------------------
>>>>>
>>>>> 1. Create a simple_one_for_one supervisor player_game_sup which upon
>>>>> starting up for the first time would have no children associated with
>>>>> them. Use supervisor:start_child to start each process
>>>>> 2. The supervisor creates an entry in a database (say mnesia) every
>>>>> time it launches a new process
>>>>> 3. Each player_game updates the entry every time the score gets
>>>>> modified. Upon termination that entry gets deleted
>>>>> 4. Post crash, the supervisor is started again (say after an
>>>>> application restart or via another supervisor)
>>>>> 5. (Here's the difference). By default the supervisor will not restart
>>>>> the dynamically added children (all the player_games). However we
>>>>> modify the init code to inspect the database and launch a player_game
>>>>> for each record it finds.
>>>>
>>>> How? I don't think you can instruct a simple_one_for_one supervisor to
>>>> create children from its init/1 callback. From the documentation...
>>>>
>>>> http://www.erlang.org/doc/man/supervisor.html#Module:init-1
>>>>
>>>> "...No child process is then started during the initialization phase,
>>>> but all children are assumed to be started dynamically using
>>>> supervisor:start_child/2..."
>>>
>>> Fair point. Wasn't something that struck me as an issue then, but yes,
>>> supervisor starting dynamic children inside init doesn't quite rock.
>>>
>>>> AFAIK, creating dynamic children (calling supervisor:start_child/2) has
>>>> to be done after the supervisor has initialised by a process other than
>>>> the supervisor process.
>>>
>>> Certainly. And your separate modeling of a lane_ldr (later down this
>>> mail) helps that.
>>>
>>>> This is normally not a problem if you are calling start_child/2 during
>>>> the "normal" operation of the application because the supervisor in
>>>> question is likely to already be up. But here, you want to call
>>>> start_child/2 at *startup*. From my experience with this precise matter,
>>>> this requires some process coordination.
>>>>
>>>>> The player_game initialises itself to the
>>>>> current state as in the database and the game(s) can continue where
>>>>> it/they left off.
>>>>>
>>>>> My questions :
>>>>> --------------------
>>>>> a. Does it make sense to move the responsibility to the supervisor to
>>>>> update the database each time a new player game is started or
>>>>> completed ?
>>>>
>>>> I personally don't see the advantage of doing this. Besides (as per my
>>>> understanding of OTP design principles), a supervisor's job should be
>>>> just that -- supervising workers and not doing work itself.
>>>>
>>>> Doing this from your worker gen_servers makes more sense to me and
>>>> seems more natural, i.e. reading the scores from the DB during
>>>> player_game:init and writing them every time a score gets bumped or
>>>> something similar.
>>>>
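
A minimal sketch of that "read in init, write on every bump" pattern, under
loud assumptions: the module name score_keeper_sketch is made up, a plain
file written with term_to_binary/1 stands in for the database (mnesia was
the suggestion in the thread), and the "score" is just a running sum of
pins rather than real bowling scoring.

```erlang
-module(score_keeper_sketch).
-behaviour(gen_server).

-export([start_link/1, bump/2, score/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Path is a hypothetical file that holds the persisted score.
start_link(Path) -> gen_server:start_link(?MODULE, Path, []).

bump(Pid, Pins) -> gen_server:call(Pid, {bump, Pins}).
score(Pid)      -> gen_server:call(Pid, score).

init(Path) ->
    %% Resume from the persisted score if the file exists, else start at 0.
    Score = case file:read_file(Path) of
                {ok, Bin}       -> binary_to_term(Bin);
                {error, enoent} -> 0
            end,
    {ok, {Path, Score}}.

handle_call({bump, Pins}, _From, {Path, Score0}) ->
    Score1 = Score0 + Pins,
    %% Persist before replying; a failed write crashes the worker.
    ok = file:write_file(Path, term_to_binary(Score1)),
    {reply, Score1, {Path, Score1}};
handle_call(score, _From, {_Path, Score} = State) ->
    {reply, Score, State}.

handle_cast(_Msg, State)  -> {noreply, State}.
handle_info(_Info, State) -> {noreply, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Starting a second worker on the same path after the first one has stopped
should pick up the last written score, which is the restart behaviour the
original question was after.
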
>>>
>>> I agree
>>>
>>>
>>>> Possible supervision strategy 2a: (Loader version)
>>>> --------------------------------------------------
>>>>
>>>> Rather than separate dynamic children for players and games as in
>>>> Strategy 1, instead, each lane stores, as part of its state, info on the
>>>> current game (the players playing on the lane and their state/scores).
>>>> The supervision tree might look like this...
>>>>
>>>>          alley_sup
>>>>         /         \
>>>>  lane_ldr  ___lanes_sup_____
>>>>          /       |     :   \
>>>>       lane(1)  lane(2) .. lane(N)
>>>>
>>>> * Application has a startup configuration parameter no_of_lanes which
>>>> comes from a conf file or the .app file and is loaded by the alley_sup...
>>>>
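
A rough single-module sketch of that tree, to make the coordination
concrete. Everything here is an assumption for illustration: the module name
alley_sketch, the start order (lanes_sup first, then lane_ldr), the
rest_for_one strategy, and the trivial lane process. The loader is a
short-lived process that calls supervisor:start_child/2 on lanes_sup once
that supervisor is already up.

```erlang
-module(alley_sketch).
-behaviour(supervisor).

-export([start_link/1, init/1]).
-export([start_lanes_sup/0, start_loader/1, start_lane/1]).

%% alley_sup in the diagram. NoOfLanes plays the role of no_of_lanes.
start_link(NoOfLanes) ->
    supervisor:start_link({local, alley_sup}, ?MODULE, {alley, NoOfLanes}).

start_lanes_sup() ->
    supervisor:start_link({local, lanes_sup}, ?MODULE, lanes).

init({alley, NoOfLanes}) ->
    %% lanes_sup is listed first so it is running before the loader starts.
    %% With rest_for_one, a restart of lanes_sup should also start the
    %% loader again, which repopulates the lanes.
    Children =
        [{lanes_sup, {?MODULE, start_lanes_sup, []},
          permanent, infinity, supervisor, [?MODULE]},
         {lane_ldr, {?MODULE, start_loader, [NoOfLanes]},
          transient, 5000, worker, [?MODULE]}],
    {ok, {{rest_for_one, 3, 10}, Children}};
init(lanes) ->
    {ok, {{simple_one_for_one, 3, 10},
          [{lane, {?MODULE, start_lane, []},
            permanent, 5000, worker, [?MODULE]}]}}.

%% lane_ldr: adds one lane per id, then exits normally (transient children
%% are not restarted after a normal exit).
start_loader(NoOfLanes) ->
    Pid = proc_lib:spawn_link(
            fun() ->
                lists:foreach(
                  fun(Id) ->
                      {ok, _} = supervisor:start_child(lanes_sup, [Id])
                  end,
                  lists:seq(1, NoOfLanes))
            end),
    {ok, Pid}.

%% Stand-in lane process; a real one would be a gen_server holding the game.
start_lane(Id) ->
    {ok, proc_lib:spawn_link(fun() -> lane_loop(Id) end)}.

lane_loop(Id) ->
    receive
        {From, which_lane} ->
            From ! {lane, Id},
            lane_loop(Id)
    end.
```

After alley_sketch:start_link(3) and a brief pause for the loader to run,
supervisor:which_children(lanes_sup) should list three lanes.
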
>>>
>>> This is a suggestion that's really had me thinking. I suspect there's a
>>> bit of the traditional OO modeling experience which is grumbling about
>>> not being able to model a game or a player game.
>>
>> It's not that you can't model them, it's that you don't need to.
>>
>> One mantra in Erlang literature (e.g. Cesarini & Thompson, pg110), is to
>> create a process for every concurrent *activity* you observe in the real
>> world and not every *task* you observe. So you don't necessarily need to
>> use a process for every "object" you see in the real world.
>>
>> With this in mind, my immediate interpretation of your application was in
>> two ways:
>>
>> A)
>>
>> * You have a bowling alley which has lanes.
>> * Different _lanes_ can be *concurrently* used at the same time: map
>> these to processes.
>> * Only 1 player can use a _lane_ at a time: no need for player
>> processes.
>> * Only 1 game can take place on a _lane_ at a time: no need for game
>> processes.
>> * It follows that players and their game are just the state of each
>> concurrently used _lane_.
>>
>> So you only need processes for lanes.
>>
>> Alternatively, B)
>>
>> * You have a bowling alley where people play games.
>> * Several _games_ can be *concurrently* played at the same time: map
>> these to processes.
>> * Only 1 player can make a _game_ play at a time: no need for player
>> processes.
>> * Only 1 lane can be used per _game_: no need for lane processes.
>> * It follows that players and their lane are just the state of each
>> concurrently played game.
>>
>> So you only need processes for games.
>>
>> A) *might* be easier to implement than B) when you have to interact with
>> hardware that manages the lane machinery, which is why I suggested it.
>> But either way, you only need *one* class of processes. IMO, introducing
>> more just complicates matters unnecessarily.
>
> Understood. Here's a dump of my thoughts.
>
> Yes, we need to have up to only as many processes as the number of
> lanes (either the lanes themselves or maximum one game per lane). Let's
> assume we model a game as a process.
>
> There's a lot of state that's maintained at a player level, and very
> little at a game level. As a goal, separation of game and player (and
> the states) seems desirable to support this separation of concerns.
> Perhaps we could still model these using separate modules. Yet, given
> how state is carried over from one handle_call/handle_cast into the
> next, not modeling a player as a process forces the internal state of
> the player to be completely accessible to the game - most of it, it
> simply does not need access to.

Not necessarily. To reinforce what Lukas said above, the process could treat
the player and game elements of its state as something it doesn't understand
but keeps hold of. It then takes actions on these elements by calling
other modules player.erl and game.erl that do understand them. In this
case, those modules would not need to create their own processes to have
some separation.

I'll expand on Lukas's example coz I've found it can be hard to visualise  
if you've come from OO...

=== lane.erl ===
-behaviour(gen_server).
..
init(Id) ->
     ..
    % We don't know what we're creating, we just know it represents a game
    {ok, game:create(Id)}.

handle_cast({add_player, PlayerName}, Game0) ->
     {noreply, game:add_player(Game0, PlayerName)};
handle_cast({game_play, PlayerName, PinsDown}, Game0) ->
     % game:play must succeed otherwise lane/game state will be incorrect
     Game1 = game:play(Game0, PlayerName, PinsDown),
     ok = reset_pins(),
     {noreply, Game1}.

handle_call({get_score, PlayerName}, _From, Game) ->
    case Reply = game:get_score(Game, PlayerName) of
       {ok, _Score} ->
          {reply, Reply, Game};
       _Error ->
          % Same, but making it clear that game:get_score is allowed to fail
          {reply, Reply, Game}
    end.

reset_pins() ->
   .. maybe talk to some hardware driver ..

=== game.erl ===
-export([create/1, add_player/2, play/3, get_score/2]).

create(LaneId) ->
     .. create path from LaneId ..
     AllPlayers = try read_game(PersistPath)
                  catch
                    _:Why ->
                     ..
                     []
                  end,
     {LaneId,PersistPath,AllPlayers}. % Understood only by game.erl

add_player({LaneId,PersistPath,AllPlayers0}, PlayerName) ->
     % Players are kept as {Name, PlayerState} pairs, keyed on the name
     {LaneId, PersistPath, lists:keystore(PlayerName, 1, AllPlayers0,
                                          {PlayerName, player:create()})}.

%% Part of Jasper's error kernel -- assert happy-case
play({LaneId,PersistPath,AllPlayers0}, PlayerName, PinsDown) ->
     Player0 = proplists:get_value(PlayerName, AllPlayers0),
     false = undefined == Player0,
     AllPlayers1 = lists:keyreplace(PlayerName, 1, AllPlayers0,
                                     {PlayerName,
                                      player:bump_score(Player0, PinsDown)}),
     ok = write_game(PersistPath, AllPlayers1),
     {LaneId,PersistPath,AllPlayers1}.

get_score({_,_,AllPlayers}, PlayerName) ->
     case proplists:get_value(PlayerName, AllPlayers) of
         undefined ->
             {badarg, PlayerName};
         Player ->
             {ok, player:get_score(Player)}
     end.

%%% Internal functions
read_game(PersistPath) ->
    ..
write_game(PersistPath, AllPlayers) ->
    ..

=== player.erl ===

-export([create/0, bump_score/2, get_score/1]).

%% Jasper's Error Kernel: This state and operations against it must be
%% correct.
-record(state, {frame = 0,
                 shot = 1,
                 bonus_shot = false,
                 last_shot = normal,
                 prior_to_last_shot = normal,
                 max_pins = 10,
                 score = 0}).
create() ->
     #state{}. % Only understood by player.erl

bump_score(#state{} = Player, PinsDown) ->
    ..
get_score(#state{} = Player) ->
    ..
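
For something that compiles as-is, here is a cut-down sketch of the same
idea. The assumptions are loud ones: the module name player_sketch is made
up, the state is reduced to a single score field, and "scoring" is a plain
sum of pins rather than real bowling rules. The -opaque declaration is an
addition not in the outline above; it lets Dialyzer complain about any other
module that looks inside the record.

```erlang
-module(player_sketch).

-export([create/0, bump_score/2, get_score/1]).
-export_type([player/0]).

%% Private record: only this module knows its shape.
-record(state, {score = 0}).

%% Other modules may hold and pass around a player(), but Dialyzer will
%% flag them if they match on or build the record themselves.
-opaque player() :: #state{}.

-spec create() -> player().
create() ->
    #state{}.

%% Simplified on purpose: adds the pins, ignoring strikes and spares.
-spec bump_score(player(), non_neg_integer()) -> player().
bump_score(#state{score = Score} = Player, PinsDown) ->
    Player#state{score = Score + PinsDown}.

-spec get_score(player()) -> non_neg_integer().
get_score(#state{score = Score}) ->
    Score.
```

A caller such as game.erl would only ever do something like
P1 = player_sketch:bump_score(P0, 7) and hand P1 back to whoever keeps the
state, without knowing what is inside it.
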

> This stems from the fact that in
> erlang, state is maintained at a process level and not at a module
> level.
>
> I would imagine, this is a dilemma which is not entirely uncommon.
> Modeling a low level intricate module as a process, allows the finer
> details of the state to be contained in the implementation
> modules/processes, and can allow the policy to be maintained in a
> higher level module/process. This can also allow for easier way to
> change implementations (eg. hypothetically, in this case the precise
> semantics of scoring - one could go as far as defining a player to be
> behaviour). I am fully aware these are thoughts which stem from a
> classical OO experience.
>
> That begs the question -> are there situations where experienced
> erlang programmers choose to model processes not because they run
> concurrently, but because they have different independently
> encapsulated states, and in addition could also be helpful for
> separating out implementation specific behaviour.

[Keeping in mind that I'm not an "experienced Erlang programmer"]

State machines come to mind (gen_fsm).

You could certainly go your original route. Process creation is fast.
Message passing is fast. Just be careful not to inadvertently create an
N-to-1-to-N routing for messages between processes. This can become a
bottleneck when a program grows and/or becomes very busy. Tracing and
debugging get a little trickier too.

- Edmond -

> Or is there another
> programming feature / trick that I am not aware of which could help
> resolve these conflicting objectives?
>
> Dhananjay
>
