systematic global registration discrepancies

Valentin Nechayev netch@REDACTED
Tue Nov 23 15:10:24 CET 2010


Hi,
we are using an Erlang cluster of 20-25 nodes, all of which reside on
different hosts. We are experiencing systematic problems with global
registration, of the following kinds:

1. An attempt to register via global:register_name() hangs indefinitely
(we have watched it hang for a few hours, until our patience ran out).

2. A name which is successfully reported as registered disappears from the
registered name lists on all nodes (including the registering one!)

We had to add monitoring of global functionality which stops the node where
registration hangs. It periodically detects a registration failure and stops
nodes; usually this is a group of 7-10 nodes per such failure. But it can't
detect the second case (silent disappearance).
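
For illustration, a watchdog of this kind could be sketched roughly as
follows. This is only a minimal sketch under stated assumptions, not the
actual monitoring code: the module name global_watchdog, the probe name, and
the interval and timeout values are all hypothetical.

```erlang
-module(global_watchdog).
-export([start/0, start/2, probe/1]).

%% Hypothetical defaults: probe every 60 s, allow each probe 30 s.
-define(INTERVAL_MS, 60000).
-define(TIMEOUT_MS, 30000).

start() ->
    start(?INTERVAL_MS, ?TIMEOUT_MS).

%% Spawn the watchdog loop; returns its pid.
start(IntervalMs, TimeoutMs) ->
    spawn(fun() -> loop(IntervalMs, TimeoutMs) end).

%% Probe periodically; stop this node on the first failed probe.
loop(IntervalMs, TimeoutMs) ->
    case probe(TimeoutMs) of
        ok ->
            timer:sleep(IntervalMs),
            loop(IntervalMs, TimeoutMs);
        {error, Reason} ->
            error_logger:error_msg("global probe failed: ~p~n", [Reason]),
            init:stop()
    end.

%% Register and unregister a throwaway, node-specific name in a helper
%% process, so that a hanging global:register_name/2 blocks only the helper.
probe(TimeoutMs) ->
    Parent = self(),
    Ref = make_ref(),
    Name = {global_watchdog_probe, node()},
    Helper = spawn(fun() ->
        Result = global:register_name(Name, self()),
        global:unregister_name(Name),
        Parent ! {Ref, Result}
    end),
    receive
        {Ref, yes} -> ok;
        {Ref, no}  -> {error, name_rejected}
    after TimeoutMs ->
        %% The helper is presumably stuck inside global; discard it.
        exit(Helper, kill),
        {error, timeout}
    end.
```

Calling global_watchdog:start() on each node would start the loop. Such a
probe only notices a hanging or rejected registration; it says nothing about
names that were registered successfully and vanish later.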

We use R12B5; an upgrade is planned but is not possible for our next
release.

Has anybody seen this? Please suggest how to debug such a problem.


-netch-


More information about the erlang-questions mailing list