[erlang-questions] Orber load balancing and fault tolerance

Thu Dec 21 16:16:43 CET 2006

Hello!

Since Orber cannot know for sure whether, for example, two objects are
required to execute within the same context, the user must take an active
role in creating the correct object types and configuring Orber properly.

How to achieve fault tolerance? Let's assume that there are two nodes in
the Orber domain (10.0.0.1 & 10.0.0.2). Then there are a few things you
must consider:

 (1) Which type of object shall I create, and must the objects share a
common view (i.e. state)?

 (2) Shall the IOR be transient or not?

 (3) How can a request be redirected from one node to the other in
case of a failover/takeover?

Object type - first of all, the request must be executable on both nodes.
Hence, the server cannot be the default type or global ({regname, {global,
Term}}; see also corba:create and Module_Interface in the reference manual,
and Orber Stubs/Skeletons in the user's guide). Instead you shall choose
a pseudo object ({pseudo, Boolean}) or {regname, {local, Atom}}. The
latter shall be supervised ({sup_child, Boolean}), but in both cases you
might need to make sure they share the same state. Since Orber already
uses Mnesia, keeping the shared state there would probably be the smartest
solution. If you want the IOR to be transient (i.e. you can terminate the
service), you should go for a local object.
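
As a rough sketch of the two alternatives above, assuming a stub module
named 'MyModule_MyInterface' generated from your own IDL (the module name
and the registered name my_server are made up for illustration), the
objects could be created along these lines:

```erlang
%% Hypothetical stub module generated from an IDL interface
%% MyModule::MyInterface; replace with your own.

%% Alternative 1: a pseudo object. No server process is started; the
%% call-back module executes in the context of the caller.
Pseudo = 'MyModule_MyInterface':oe_create([], [{pseudo, true}]).

%% Alternative 2: a locally registered server, intended to be started
%% from a supervisor. With {sup_child, true} the reply is expected to
%% have the form {ok, Pid, ObjRef} instead of a bare object reference.
{ok, Pid, Obj} =
    'MyModule_MyInterface':oe_create_link([],
                                          [{regname, {local, my_server}},
                                           {sup_child, true}]).
```

The same creation code would run on both nodes, so that either node can
serve a request for the object.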

Redirect request - you can use gratuitous ARP. This way the client will
not notice that the IP address now points to a different machine. You can
also configure Orber to add several IP addresses to the exported IORs. For
example:

%% Node 1
 erl> corba:orb_init([{ip_address,{multiple,["10.0.0.1","10.0.0.2"]}}]).

or

 erl> NewIOR = corba:add_alternate_iiop_address(IOR, "10.0.0.2", Port).

%% Node 2
 erl> corba:orb_init([{ip_address,{multiple,["10.0.0.2","10.0.0.1"]}}]).

or

 erl> NewIOR = corba:add_alternate_iiop_address(IOR, "10.0.0.1", Port).

The client-side ORB will try the remaining addresses if the first one
fails. If none of the profiles or alternate addresses are available, a
system exception is thrown.

If a request fails (e.g. with a COMM_FAILURE exception), the client can
resolve a new reference from the name service. With the orbInitRef
parameter set as below, corba:resolve_initial_references("NameService")
returns an IOR representing the first accessible name service:

 erl> corba:orb_init([{orbInitRef,
                      ["NameService=corbaloc::10.0.0.1/NameService",
                       "NameService=corbaloc::10.0.0.2/NameService"]}]).

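The "resolve a new reference and try again" step could be wrapped roughly
as below. This is only a sketch: the module and function names are made
up, and it assumes that Orber reports system exceptions as
{'EXCEPTION', Record}, where the first element of the record names the
exception. Resolvers is a list of funs that each return a reference (for
example by looking up a name in one of the name services), and Invoke
performs the actual request.

```erlang
-module(failover_sketch).
-export([call_with_failover/2]).

%% Try each resolver in turn until one request succeeds.
%% Resolvers: list of zero-arity funs, each returning an object reference.
%% Invoke:    one-arity fun performing the request on a reference.
call_with_failover([], _Invoke) ->
    {error, no_reference_available};
call_with_failover([Resolve | Rest], Invoke) ->
    try
        {ok, Invoke(Resolve())}
    catch
        %% Assumption: a communication failure shows up as
        %% {'EXCEPTION', Record} with 'COMM_FAILURE' as the record tag.
        %% Any other exception is not caught and propagates to the caller.
        _:{'EXCEPTION', E} when element(1, E) =:= 'COMM_FAILURE' ->
            call_with_failover(Rest, Invoke)
    end.
```

Only COMM_FAILURE triggers a retry here; whether other system exceptions
should do the same is up to the application.
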
The OMG has released an FT (Fault Tolerant) specification, but this
requires that both ORBs support it. Orber supports a small part of the
specification and has been prepared to be compliant with FT.

What about load balancing? This requires that both nodes are active. The
user must implement a solution that resolves IORs from both nodes in, for
example, round-robin order. One can also add more nodes to the Orber domain
and let 10.0.0.1 & 10.0.0.2 "forward" requests to the internal nodes (in
this case you should use pseudo objects on the front nodes, which only
relay the requests).
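
A minimal round-robin picker, as one possible way to do the user-side
part, might look like this. The module name is made up and the references
can be any terms (in practice the IORs resolved from each node); nothing
here is Orber-specific.

```erlang
-module(rr_sketch).
-export([new/1, next/1]).

%% The state is a queue of references. next/1 takes the reference at the
%% front and rotates it to the back, so references are handed out in turn.
new(Refs) when Refs =/= [] ->
    queue:from_list(Refs).

next(Q0) ->
    {{value, Ref}, Q1} = queue:out(Q0),
    {Ref, queue:in(Ref, Q1)}.
```

The caller keeps the returned queue and passes it to the next call, e.g.
{Ref, Q1} = rr_sketch:next(Q0), and then sends the request to Ref.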

/Nick

On Wed, 20 Dec 2006, Dominic Williams wrote:

> Hello,
> 
> The Orber User's guide states:
> 
>   "A multi-node Orber makes it possible to load balance and
>   create a more fault tolerant system."
> 
> Is there any more documentation about this?
> 
> Does load balancing and fault tolerance happen automatically
> in a multi-node Orber, or does some specific configuration
> need to be done? How does the balancing and tolerance
> behave?
> 
> Thanks,
> 
> Dominic Williams
> http://www.dominicwilliams.net