[erlang-patches] Re: gen:call({global, Name}, ...)

Geoff Cant nem@REDACTED
Thu Mar 31 20:16:16 CEST 2011


Henrik Nord <henrik@REDACTED> writes:

> On 03/31/2011 12:28 PM, Henrik Nord wrote:
>> On 03/30/2011 03:19 AM, Geoff Cant wrote:
>>> Hi all, I discovered today that gen:call({global, Name}, Label, Request,
>>> Timeout) calls global:safe_whereis_name(Name) to look up the Pid of a
>>> globally registered name.
>>>
>>> global:safe_whereis_name/1 doesn't seem to offer any particular safety
>>> and, more importantly, serializes all global name lookups on a node.
>>> (Using global:whereis_name/1 instead is just an ets lookup.)
>>>
>>> Can we safely make a change like
>>> https://github.com/archaelus/otp/commit/4f6e8a147b3c600eef2dd05f8ce0d51cf9c35383 
>>>
>>> in gen.erl and improve call time and reduce the load on
>>> global_name_server at a stroke?
>>>
>>> This git repo contains the patch I'm thinking of:
>>> git fetch git://github.com/archaelus/otp.git gen_where
>>>
>>> Cheers,
>> Hello
>>
>> Thank you!
>> This branch is now cooking in 'pu'
>>
>
> Hello
> This has apparently been tested before, and found to be unsafe.
> So I'm pulling it out.

Hi Henrik, thanks for looking at this patch.

It would be great if the OTP team could explain this a little further.

gen.erl is using the undocumented function
global:safe_whereis_name/1. This function seems to retry (an unbounded
number of times) until the global lock is not set, and only then does
the same lookup as global:whereis_name/1. That ensures no name
registrations are being created, deleted or changed while the
safe_whereis_name lookup occurs.

To my mind, this doesn't seem to add much safety. The whereis_name/1
operation is a single ets:lookup, and ets guarantees that this read is
atomic, so the lookup can't observe a half-applied change made by
operations that require the global lock anyway.
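
To make the two lookup paths concrete, here is a minimal sketch of the
difference as I understand it. It does not use global's real table or
server: the module, table and process names are all made up for
illustration, and the "serialized" path only models the queueing, not
the lock check or the retries.

```erlang
-module(lookup_sketch).
-export([start/0, register_name/2, direct_whereis/1, serialized_whereis/1]).

%% Hypothetical names; this is not the table or server used by global.
-define(TAB, lookup_sketch_names).
-define(SERVER, lookup_sketch_server).

%% Start a process that owns the name table and answers requests one at a time.
start() ->
    Parent = self(),
    Pid = spawn(fun() ->
                        register(?SERVER, self()),
                        ets:new(?TAB, [named_table, protected, set]),
                        Parent ! {self(), ready},
                        loop()
                end),
    receive {Pid, ready} -> {ok, Pid} end.

%% Writes always go through the owning process.
register_name(Name, Pid) ->
    request({register, Name, Pid}).

%% Path 1: read the table in the calling process (a single ets:lookup/2).
direct_whereis(Name) ->
    lookup(Name).

%% Path 2: ask the owning process, so every lookup queues in its mailbox.
serialized_whereis(Name) ->
    request({whereis, Name}).

request(Msg) ->
    Server = whereis(?SERVER),
    Ref = make_ref(),
    Server ! {Ref, self(), Msg},
    receive {Ref, Reply} -> Reply end.

loop() ->
    receive
        {Ref, From, {register, Name, Pid}} ->
            ets:insert(?TAB, {Name, Pid}),
            From ! {Ref, ok};
        {Ref, From, {whereis, Name}} ->
            From ! {Ref, lookup(Name)}
    end,
    loop().

lookup(Name) ->
    case ets:lookup(?TAB, Name) of
        [{Name, Pid}] -> Pid;
        [] -> undefined
    end.
```

Both paths return the same answer for a given table state. The only
difference is that the second one makes every caller on the node wait
its turn behind a single process.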

We could easily get stale results from global - a Pid for a name that is
immediately changed after the call, a Pid that was just unregistered on
another node, and probably a host of other things, but safe_whereis_name
doesn't appear to protect us from these situations either.
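
As an illustration of why the lookup function can't be what protects a
caller from a stale Pid, here is a rough sketch of a call that uses
the plain lookup and then monitors the result. This is a hypothetical
helper, not the gen.erl source; the module and function names are
mine, and I'm assuming the target is a gen_server that understands the
usual '$gen_call' message.

```erlang
-module(stale_sketch).
-export([call_global/3]).

%% Hypothetical helper modelled loosely on a gen-style call.
call_global(Name, Request, Timeout) ->
    case global:whereis_name(Name) of
        undefined ->
            {error, noproc};
        Pid when is_pid(Pid) ->
            %% The lookup result may already be stale here, so monitor the
            %% Pid; a dead or vanished process shows up as a 'DOWN' message.
            Ref = erlang:monitor(process, Pid),
            Pid ! {'$gen_call', {self(), Ref}, Request},
            receive
                {Ref, Reply} ->
                    erlang:demonitor(Ref, [flush]),
                    {ok, Reply};
                {'DOWN', Ref, process, Pid, Reason} ->
                    {error, Reason}
            after Timeout ->
                erlang:demonitor(Ref, [flush]),
                {error, timeout}
            end
    end.
```

Whichever function does the lookup, the window between getting the Pid
and sending to it is still there, so the caller has to handle it
either way.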

Is there an important edge case I'm missing here? I'd love to know,
because I'm trying a patch similar in effect to this one at work: we
can't afford to serialize all process name lookups through global
(gen:call({global, Name}, ...) is the dominant part of the call time
here, and it's getting up to 100+ ms).

Cheers,
-- 
Geoff Cant


