Granularity of process (and module) identification. Was: Supervision strategies to automatically restart dynamically added children

Tue Mar 8 22:18:49 CET 2011

On Tue, Mar 8, 2011 at 4:13 PM, Edmond Begumisa
<ebegumisa@REDACTED> wrote:
> On Tue, 08 Mar 2011 17:52:29 +1100, Dhananjay Nene
> <dhananjay.nene@REDACTED> wrote:
>
>> On Mon, Mar 7, 2011 at 5:08 AM, Edmond Begumisa
>> <ebegumisa@REDACTED> wrote:
>>>
>>> Hi Dhananjay,
>>>
>>> I too struggled with this exact question for quite some time so I'll
>>> chime
>>> in here on the two techniques I used to solve it...
>>> On Thu, 03 Mar 2011 05:02:06 +1100, Dhananjay Nene
>>> <dhananjay.nene@REDACTED> wrote:
>>>
>>>>
>>>> Question in short : If I have a supervisor which has a number of
>>>> dynamic children, how do I set up a mechanism where in case of a
>>>> complete system crash, all the dynamic children restart at the point
>>>> they were when the system (including the supervisor) crashed.
>>>>
>>>> Question in long :
>>>> =============
>>>>
>>>> Sample Context : A bowling game
>>>> -------------------------------------------------
>>>>
>>>> Let's say I am writing the software necessary to track various
>>>> games at a bowling alley. I've set up the following
>>>> processes :
>>>>
>>>> a. Lanes : If there are 10 lanes, there are 10 processes, one for each
>>>> lane. These stay fixed for the entire duration of the program
>>>> b. Games : A group of players might get together to start a game on a
>>>> free lane. A new game will get created to track the game through its
>>>> completion. When the game is over, this process shall terminate
>>>> c. Players : Each game has a number of players. One process
>>>> "player_game" is started per player. Sample state of a player game
>>>> would include current score for the player and if the last two rolls
>>>> were strike or a spare. For the purpose of brevity, the remainder of
>>>> this mail only refers to this process and ignores the others
>>>>
>>>
>>> You could reduce complexity by having each lane process maintain its
>>> current game (players and scores) as part of its state. The game and
>>> player_game processes appear unnecessarily confusing to me.
>>>
>>
>> Interesting point. The lanes are the only static aspects of the game.
>> I tried to consider whether it would make any difference from a client
>> API perspective, but I imagine for a client, there is no particular
>> reason to believe a lane is a better or worse abstraction than a game
>> (or a player_game).
>>
>>>> Objective :
>>>> ---------------
>>>>
>>>> Assuming this is a single node implementation, if the machine were to
>>>> crash, upon machine / node restart, all the player_games should be
>>>> restarted and should be at the point where the player_games were when
>>>> the machine crashed.
>>>>
>>>> Possible supervision strategy :
>>>> --------------------------------------
>>>>
>>>> 1. Create a simple_one_for_one supervisor player_game_sup which upon
>>>> starting up for the first time would have no children associated with
>>>> them. Use supervisor:start_child to start each process
>>>> 2. The supervisor creates an entry in a database (say mnesia) every
>>>> time it launches a new process
>>>> 3. Each player_game updates the entry every time the score gets
>>>> modified. Upon termination that entry gets deleted
>>>> 4. Post crash, the supervisor is started again (say after an
>>>> application restart or via another supervisor)
>>>> 5. (Here's the difference). By default the supervisor will not restart
>>>> the dynamically added children (all the player_games). However we
>>>> modify the init code to inspect the database and launch a player_game
>>>> for each record it finds.
>>>
>>> How? I don't think you can instruct a simple_one_for_one supervisor to
>>> create children from its init/1 callback. From the documentation...
>>>
>>> http://www.erlang.org/doc/man/supervisor.html#Module:init-1
>>>
>>> "...No child process is then started during the initialization phase, but
>>> all children are assumed to be started dynamically using
>>> supervisor:start_child/2..."
>>
>> Fair point. Wasn't something that struck me as an issue then, but yes,
>> supervisor starting dynamic children inside init doesn't quite rock.
>>
>>> AFAIK, creating dynamic children (calling supervisor:start_child/2) has
>>> to
>>> be done after the supervisor has initialised by a process other than the
>>> supervisor process.
>>
>> Certainly. And your separate modeling of a lane_ldr (later down this
>> mail) helps that.
>>
>>> This is normally not a problem if you are calling start_child/2 during
>>> the
>>> "normal" operation of the application because the supervisor in question
>>> is
>>> likely to already be up. But here, you want to call start_child/2 at
>>> *startup*. From my experience with this precise matter, this requires
>>> some
>>> process coordination.
>>>
>>>> The player_game initialises itself to the
>>>> current state as in the database and the game(s) can continue where
>>>> it/they left off.
>>>>
>>>> My questions :
>>>> --------------------
>>>> a. Does it make sense to move the responsibility to the supervisor to
>>>> update the database each time a new player game is started or
>>>> completed ?
>>>
>>> I personally don't see the advantage of doing this. Besides (as per my
>>> understanding of OTP design principles), a supervisor's job should be
>>> just
>>> that -- supervising workers and not doing work itself.
>>>
>>> Doing this from your worker gen_servers makes more sense to me and
>>> seems more natural, i.e. reading the scores from the DB during
>>> player_game:init and writing them every time a score gets bumped or
>>> something similar.
>>>
>>
>> I agree
>>
>>
>>> Possible supervision strategy 2a: (Loader version)
>>> --------------------------------------------------
>>>
>>> Rather than separate dynamic children for players and games as in
>>> Strategy
>>> 1, instead, each lane stores, as part of its state, info on the current
>>> game (the players playing on the lane and their state/scores). The
>>> supervision tree might look like this...
>>>
>>>          alley_sup
>>>         /         \
>>>  lane_ldr  ___lanes_sup_____
>>>          /       |     :   \
>>>       lane(1)  lane(2) .. lane(N)
>>>
>>> * Application has a startup configuration parameter no_of_lanes which
>>> comes
>>> from a conf file or the .app file and loaded by the alley_sup...
>>>
>>
>> This is a suggestion that's really had me thinking. I suspect there's a
>> bit of the traditional OO modeling experience which is grumbling about
>> not being able to model a game or a player game.
>
> It's not that you can't model them, it's that you don't need to.
>
> One mantra in Erlang literature (e.g. Casarini & Thompson, pg110), is to
> create a process for every concurrent *activity* you observe in the real
> world and not every *task* you observe. So you don't necessarily need to use
> a process for every "object" you see in the real world.
>
> With this in mind, my immediate interpretation of your application was in
> two ways:
>
> A)
>
> * You have a bowling alley which has lanes.
> * Different _lanes_ can be *concurrently* used at the same time: map these
> to processes.
> * Only 1 player can use a _lane_ at a time: no need for player processes.
> * Only 1 game can take place on a _lane_ at a time: no need for game
> processes.
> * It follows that players and their game are just the state of each
> concurrently used _lane_.
>
> So you only need processes for lanes.
>
> Alternatively, B)
>
> * You have a bowling alley where people play games.
> * Several _games_ can be *concurrently* played at the same time: map these
> to processes.
> * Only 1 player can make a _game_ play at a time: no need for player
> processes.
> * Only 1 lane can be used per _game_: no need for lane processes.
> * It follows that players and their lane are just the state of each
> concurrently played game.
>
> So you only need processes for games.
>
> A) *might* be easier to implement than B) when you have to interact with
> hardware that manages the lane machinery, which is why I suggested it. But
> either way, you only need *one* class of processes. IMO, introducing more
> just complicates matters unnecessarily.

Understood. Here's a dump of my thoughts.

Yes, we need at most as many processes as there are lanes (either the
lanes themselves or at most one game per lane). Let's assume we model a
game as a process.

There's a lot of state that's maintained at a player level, and very
little at a game level. As a goal, separating game and player (and
their states) seems desirable to support this separation of concerns.
Perhaps we could still model these using separate modules. Yet, given
how state is carried over from one handle_call/handle_cast into the
next, not modeling a player as a process forces the internal state of
the player to be completely accessible to the game, even though the
game does not need access to most of it. This stems from the fact that
in Erlang, state is maintained at a process level and not at a module
level.
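
As a rough sketch of the "separate modules" idea: the player state could
live in its own module behind an opaque type, with the game process
holding the value but only touching it through exported functions. The
module name player, its functions and the record fields are all made up
for illustration, and score/1 is a plain pin count that ignores
strike/spare bonuses. Note that -opaque is only checked by Dialyzer;
nothing stops a caller from inspecting the tuple at runtime.

```erlang
-module(player).
-export([new/1, roll/2, score/1, name/1]).
-export_type([player/0]).

%% Hypothetical player state; rolls are kept most recent first.
-record(player, {name :: binary(), rolls = [] :: [0..10]}).

%% Opaque: Dialyzer flags code outside this module that looks inside.
-opaque player() :: #player{}.

%% Create a player with no rolls yet.
-spec new(binary()) -> player().
new(Name) when is_binary(Name) ->
    #player{name = Name}.

%% Record one roll and return the updated state.
-spec roll(0..10, player()) -> player().
roll(Pins, #player{rolls = Rolls} = P)
  when is_integer(Pins), Pins >= 0, Pins =< 10 ->
    P#player{rolls = [Pins | Rolls]}.

%% Simplified score: total pins knocked down, no bonuses.
-spec score(player()) -> non_neg_integer().
score(#player{rolls = Rolls}) ->
    lists:sum(Rolls).

-spec name(player()) -> binary().
name(#player{name = Name}) ->
    Name.
```

A game gen_server could then keep a list of player:player() values in
its own state and call only player:roll/2 and player:score/1 from
handle_call/handle_cast.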

I would imagine this is a dilemma which is not entirely uncommon.
Modeling a low-level, intricate module as a process allows the finer
details of the state to be contained in the implementation
modules/processes, and can allow the policy to be maintained in a
higher-level module/process. This can also make it easier to
change implementations (e.g. hypothetically, in this case the precise
semantics of scoring - one could go as far as defining a player to be
a behaviour). I am fully aware these are thoughts which stem from a
classical OO experience.
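
To illustrate the behaviour idea, here is a minimal sketch in which the
scoring rules sit behind a callback module. The behaviour name scoring,
the callback score/1 and the helper total/2 are hypothetical; the module
doubles as its own default implementation (a plain pin count, no
strike/spare bonuses) only to keep the sketch in one file.

```erlang
-module(scoring).
-export([total/2, score/1]).

%% Callback that every scoring module is expected to export.
-callback score(Rolls :: [0..10]) -> non_neg_integer().

%% Delegate to whichever callback module the caller chose.
-spec total(module(), [0..10]) -> non_neg_integer().
total(Mod, Rolls) when is_atom(Mod), is_list(Rolls) ->
    Mod:score(Rolls).

%% Default implementation: total pins knocked down.
-spec score([0..10]) -> non_neg_integer().
score(Rolls) ->
    lists:sum(Rolls).
```

An alternative rule set would be another module declaring
-behaviour(scoring) and exporting its own score/1; the game would pass
that module name to scoring:total/2.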

That raises the question: are there situations where experienced
Erlang programmers choose to model processes not because they run
concurrently, but because they have different independently
encapsulated states, and because doing so could also help with
separating out implementation-specific behaviour? Or is there another
programming feature / trick that I am not aware of which could help
resolve these conflicting objectives?

Dhananjay

-- 
-----------------------------------------------------------------------------------
http://blog.dhananjaynene.com twitter: @dnene