Deadlock in global? (we see global:random_sleep)
Eric Newhuis
enewhuis@REDACTED
Tue Apr 27 22:41:11 CEST 2004
I've heard that use of the global app is dangerous because the
algorithms do not recover properly if there is a temporary partitioning
of the network that divides a single cloud into multiple clouds and
then rejoins them.
Today we had a multiple-node deadlock after running solid for several
months. In fact the last time this happened was due to accidental
manual tinkering. Our distributed application uses global to ensure
only a single instance of a particular resource is connecting to our
server farm.
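
For readers unfamiliar with this pattern, here is a minimal sketch of how global is commonly used to keep a single instance of a process across a cluster. It is an illustration only, not our production code: the module name single_instance, the helper ensure_started/1 and the trivial loop/0 are hypothetical. Only global:whereis_name/1 and global:register_name/2 are the real API.

```erlang
-module(single_instance).
-export([ensure_started/1, loop/0]).

%% Hypothetical helper: start the resource process unless some node in
%% the cluster has already registered one under Name.
ensure_started(Name) ->
    case global:whereis_name(Name) of
        undefined ->
            Pid = spawn(?MODULE, loop, []),
            %% register_name/2 returns yes on success and no if the
            %% name was taken in the meantime.
            case global:register_name(Name, Pid) of
                yes ->
                    {ok, Pid};
                no ->
                    %% Another process won the race; discard ours and
                    %% return whatever is registered now (this may be
                    %% undefined if the winner has already exited).
                    exit(Pid, kill),
                    {ok, global:whereis_name(Name)}
            end;
        Existing when is_pid(Existing) ->
            {ok, Existing}
    end.

%% Placeholder for the real resource process (assumption).
loop() ->
    receive
        stop -> ok;
        _Other -> loop()
    end.
```

Calling single_instance:ensure_started(my_resource) twice, from any connected nodes, should return the same pid both times as long as the registered process is alive.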
All our processes appear to be hung waiting on global:random_sleep.
Does anyone know whether there is a "known problem" with global? If
so, can someone describe what that problem is and perhaps point us in
the direction of alternatives?
The problem seems quite rare, so if it is a bug in global or an
unsolvable problem, perhaps a workaround for now would be some means
of resetting the global state?