Detect pid reuse

Thu Jul 9 12:35:41 CEST 2020

Hi Dinislam,

How many processes are you spawning per second?
We (in our biggest production cluster) are spawning more than 10000
processes per second with highly variable lifetimes (from milliseconds
to hours) on a system that has been running for more than a year, and
we have never faced a problem with PID reuse (our logic also depends on
PID uniqueness).
Erlang tries quite hard to keep PIDs unique (the probability of hitting
PID reuse is very small) - isn't it possible that the problem is
somewhere else?

If you need more uniqueness you can use e.g. "erlang:make_ref/0" (a
reference is also what "gen:call" uses to connect requests with
responses) as the token you suggested - I am not aware of any other
workaround.
You can also try experimenting with the size of the Erlang process
table - it can affect the probability of PID reuse.
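
A minimal sketch of the token idea, in case it helps. Everything here
(the module name token_demo, the message shapes, the stale_token error)
is made up for illustration and is not taken from your code or from
OTP: the worker gets a fresh reference when it starts, callers keep
{Pid, Token} together, and the worker only serves requests that carry
its own token. The monitor reference doubles as the request id, in the
same spirit as gen:call.

```erlang
-module(token_demo).
-export([start/0, call/2, demo/0]).

%% Hypothetical worker: it is started with a unique token and only
%% serves requests that carry that same token.
start() ->
    Token = erlang:make_ref(),
    Pid = spawn(fun() -> loop(Token) end),
    {Pid, Token}.

loop(Token) ->
    receive
        %% Token is already bound, so this clause matches only
        %% requests addressed to this incarnation of the worker.
        {request, From, ReqId, Token, Msg} ->
            From ! {reply, ReqId, {ok, Msg}},
            loop(Token);
        %% Any other token means the caller meant a different process.
        {request, From, ReqId, _OtherToken, _Msg} ->
            From ! {reply, ReqId, {error, stale_token}},
            loop(Token)
    end.

%% The monitor reference is used as the request id, so a reply can be
%% matched to its request and a dead worker is noticed via 'DOWN'.
call({Pid, Token}, Msg) ->
    ReqId = erlang:monitor(process, Pid),
    Pid ! {request, self(), ReqId, Token, Msg},
    receive
        {reply, ReqId, Reply} ->
            erlang:demonitor(ReqId, [flush]),
            Reply;
        {'DOWN', ReqId, process, _Pid, Reason} ->
            {error, {down, Reason}}
    after 5000 ->
        erlang:demonitor(ReqId, [flush]),
        {error, timeout}
    end.

demo() ->
    {Pid, Token} = start(),
    {ok, ping} = call({Pid, Token}, ping),
    %% Simulate a caller that still holds a token from an earlier
    %% incarnation of the worker.
    {error, stale_token} = call({Pid, erlang:make_ref()}, ping),
    exit(Pid, kill),
    ok.
```

A process that merely happens to have the same PID would answer with
stale_token (if it speaks this protocol at all) instead of silently
handling the request. For the process table size, the limit is set at
start-up with the "+P" flag of erl (for example "erl +P 1048576") and
the value in effect can be read with
erlang:system_info(process_limit).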

Jan Chochol

On Thu, Jul 9, 2020 at 11:03 AM Dinislam Salikhov
<Dinislam.Salikhov@REDACTED> wrote:
>
> Unfortunately, registering a process with a name doesn't help much. It does reduce the time window in which the race may occur, though.
> For instance, when gen_server:call/3 is invoked, the library code calls whereis(Name) to get the pid and then sends it a message {'$gen_call', ...}. So between erlang:whereis/1 and erlang:send/2, the pid may be reused (actually, it is between erlang:whereis/1 and erlang:monitor/2 followed by erlang:send/2, so we would monitor the wrong process).
> See lib/stdlib/src/gen.erl which is used by lib/stdlib/src/gen_server.erl
>
> > If you have multiple connections to any given db (a pool of pools, if
> > you will), using a process group module like pg makes this easy.
>
> Never used it before. I'll have a look. Thanks for the reference.
>
> Dinislam Salikhov
> ________________________________________
> From: Aaron Seigo <aseigo@REDACTED>
> Sent: Thursday, July 9, 2020 10:26 AM
> To: Dinislam Salikhov
> Cc: erlang-questions@REDACTED
> Subject: Re: Detect pid reuse
>
> On 2020-07-06 14:09, Dinislam Salikhov wrote:
> > If I want to send a command to the database, I search for the pid of
> > the corresponding connection (in supervisor's children list). And
>
> Perhaps register the processes with a name so that instead of searching
> for a literal pid, which may indeed change and requires more bookkeeping
> in your application code, you look up the relevant connection by name in a
> process registry. Should the old connection go away, the new one takes
> over the same name.
>
> If you have multiple connections to any given db (a pool of pools, if
> you will), using a process group module like pg makes this easy.
>
> Even then, you'll obviously need to handle the failure case of the
> process exiting between the message being sent and the response being
> received, but at least the lookup will be consistent.
>
> --
> Aaron Seigo