[erlang-patches] Non-overlapping Application Distribution Node Sets

Fredrik <>
Fri May 3 10:42:57 CEST 2013


On 05/03/2013 10:07 AM, Vance Shipley wrote:
> Fetch here:
>
>     git fetch git://github.com/vances/otp.git non_overlap_application_distribution
>
> Browse here:
>
>     https://github.com/vances/otp/commit/61f4da70e32bf745d96455b6d2f2ca42c4e4a3a7
>
> Commit message:
>
> Support non-overlapping application distribution nodes
>
> Currently all known nodes should have the same value for the
> kernel application's 'distributed' environment variable.  It
> is not expected that any application will be distributed on
> more than one set of nodes.
>
> It should be possible to distribute an application between
> multiple non-overlapping sets of nodes.  For example with this
> system configuration file on nodes  and :
>
>     [{kernel,
>        [{distributed, [{app_no, [, ]}]},
>         {sync_nodes_optional, [, ]},
>         {sync_nodes_timeout, 5000}]}].
>
> ... and this system configuration file on nodes  and :
>
>     [{kernel,
>        [{distributed, [{app_no, [, ]}]},
>         {sync_nodes_optional, [, ]},
>         {sync_nodes_timeout, 5000}]}].
>
> Other applications may be distributed involving some other
> combination of these nodes without interference.
>
> This patch adds checks in dist_ac to ignore DAC protocol
> messages of an application from nodes not included in that
> application's distribution specification locally.
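>
> As a rough illustration only (this is not the code in the commit;
> the module and function names are hypothetical), such a check
> could look like the sketch below.  It assumes the local
> 'distributed' value is passed in as a list, and that messages
> about an application missing from that list are also ignored:
>
>     -module(dist_filter_sketch).
>     -export([accept_from/3]).
>
>     %% Distributed is the local value of the kernel 'distributed'
>     %% parameter: a list of {App, Nodes} or {App, Timeout, Nodes},
>     %% where each element of Nodes is a node or a tuple of nodes.
>     %% Returns true only if FromNode is listed for App locally.
>     accept_from(App, FromNode, Distributed) ->
>         case find_nodes(App, Distributed) of
>             false -> false;
>             Nodes -> lists:member(FromNode, flatten(Nodes))
>         end.
>
>     %% Look up the node list for App, with or without a timeout.
>     find_nodes(App, [{App, Nodes} | _]) -> Nodes;
>     find_nodes(App, [{App, _Timeout, Nodes} | _]) -> Nodes;
>     find_nodes(App, [_ | Rest]) -> find_nodes(App, Rest);
>     find_nodes(_App, []) -> false.
>
>     %% Expand tuples of equal-priority nodes into a flat list.
>     flatten(Nodes) ->
>         lists:flatmap(fun(N) when is_tuple(N) -> tuple_to_list(N);
>                          (N) -> [N]
>                       end, Nodes).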
>
>
> Rationale:
>
> We often want to have active/standby pairs for applications
> while also having multiple instances of the application running
> on different nodes.  When nodes within the cluster are communicating
> in order to, for example, distribute mnesia tables, suddenly there
> is a potential conflict between these otherwise unrelated node pairs.
>
> Currently there is a window of time during node (re)starts where a
> conflict may occur.  This patch simply corrects this error case.
>
> The documentation is somewhat unclear as to whether the configuration
> above is legal or not.  The fact that, other than in the race condition
> noted above, this distribution does work as expected allows one to use
> the more liberal interpretation that when it says:
>
>     "All involved nodes must have the same value for distributed and
>      sync_nodes_timeout, or the behaviour of the system is undefined."
>
> "Involved nodes" refers to nodes involved in this application's distribution.
> With this patch that interpretation holds true.
>
> Tests:
>
> The existing tests are extended to support the following application
> distribution configuration:
>
>     :
>        [{kernel,
>           [{sync_nodes_optional, [, ]},
>            {distributed,
>               [{app1, [, , ]},
>                {app2, 2000, [, , ]},
>                {app_sp, 1000, [{, }, ]},
>                {app_no, 1000, [, ]}]}]}].
>
>     :
>        [{kernel,
>           [{sync_nodes_optional, [, ]},
>            {distributed,
>               [{app1, [, , ]},
>                {app2, 2000, [, , ]},
>                {app_sp, 1000, [{, }, ]},
>                {app_no, 1000, [, ]}]}]}].
>
>     :
>        [{kernel,
>           [{sync_nodes_optional, [, , ]},
>            {distributed,
>               [{app1, [, , ]},
>                {app2, 2000, [, , ]},
>                {app_sp, 1000, [{, }, ]},
>                {app_no, 1000, [, ]}]}]}].
>
>     :
>        [{kernel,
>           [{sync_nodes_optional, []},
>            {distributed,
>               [{app_no, 1000, [, ]}]}]}].
>
> The Cp4 node is added along with the app_no application which is
> distributed in two active/standby pairs on Cp1/Cp2 and Cp3/Cp4.
> The tests check that these pairs are unaffected by the other applications'
> starts, stops, failovers and takeovers, and that the two pairs do not
> affect each other.
>
Hello Vance,
Could you please rebase this patch onto the current maint branch?
Thanks,

-- 

BR Fredrik Gustafsson
Erlang OTP Team


