[erlang-questions] pid representation in external term format

Serge Aleynikov serge@REDACTED
Fri Sep 13 20:00:09 CEST 2013


Ah, I was looking at stale source code where Serial was 3 bits rather than
13.

Still, 28 bits is a pretty small number when it comes to other languages
interfacing with Erlang over distributed transport where a new Pid can be
used for each RPC call. Finding an available Pid number that could be
reused after the counter wrap-around wouldn't be as cheap as merely
incrementing a counter.
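
To illustrate the cost difference, here is a minimal sketch in C of a pid
number allocator for a non-Erlang node. It is an assumption-laden example, not
code from erl_interface: the names pid_alloc, pid_alloc_take and friends are
hypothetical, and the in-use bitmap is just one possible way to track live
pids. Before the first wrap-around every call succeeds on the first probe;
afterwards the loop may have to scan past numbers that are still in use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator for a non-Erlang node; not taken from erl_interface. */
struct pid_alloc {
    uint32_t space;   /* number of distinct pid numbers, e.g. 1u << 28 */
    uint32_t next;    /* next candidate number */
    uint8_t *in_use;  /* one bit per pid number */
};

/* bits is the width of the pid number space (28 in the current VM). */
int pid_alloc_init(struct pid_alloc *a, unsigned bits)
{
    assert(bits >= 3 && bits <= 28);
    a->space = UINT32_C(1) << bits;
    a->next = 0;
    a->in_use = calloc(a->space / 8, 1);
    return a->in_use ? 0 : -1;
}

/* Returns a free pid number, or -1 if every number is currently in use. */
int64_t pid_alloc_take(struct pid_alloc *a)
{
    for (uint32_t tried = 0; tried < a->space; tried++) {
        uint32_t n = a->next;
        a->next = (a->next + 1) & (a->space - 1);  /* wrap around */
        if (!(a->in_use[n >> 3] & (1u << (n & 7)))) {
            a->in_use[n >> 3] |= (uint8_t)(1u << (n & 7));
            return (int64_t)n;
        }
    }
    return -1;
}

/* Marks a pid number as free again once its mailbox is destroyed. */
void pid_alloc_release(struct pid_alloc *a, uint32_t n)
{
    assert(n < a->space);
    a->in_use[n >> 3] &= (uint8_t)~(1u << (n & 7));
}

void pid_alloc_destroy(struct pid_alloc *a)
{
    free(a->in_use);
    a->in_use = NULL;
}
```

With bits set to 28 the bitmap alone takes 32 MiB, and the sketch only avoids
collisions with pids that are still alive; it does nothing about stale
references to a dead pid whose number has been handed out again.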

When the pid counter wraps around, how does the VM ensure that the
incremented Pid value doesn't collide with an existing Pid?

It doesn't seem like upgrading erl_interface to 64 bit representation of
pids would be a huge matter, since pid structure already has two 32-bit
integers (id + serial), so it would just be necessary to adjust the masks
when encoding/decoding values.
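
As a rough sketch of the mask adjustment, the C fragment below splits a 28-bit
pid value into the two 32-bit fields and joins them back. The widths (15-bit
id, 13-bit serial) come from this thread; the bit layout (id in the low bits,
serial above it) and every name here (pid_fields, pid_split, pid_join, the
PID_* macros) are assumptions made for illustration, not the actual
erl_interface definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; widths as discussed in this thread. */
#define PID_NUM_BITS   15u
#define PID_SER_BITS   13u
#define PID_DATA_BITS  (PID_NUM_BITS + PID_SER_BITS)   /* 28 bits in total */
#define PID_NUM_MASK   ((UINT32_C(1) << PID_NUM_BITS) - 1u)
#define PID_SER_MASK   ((UINT32_C(1) << PID_SER_BITS) - 1u)
#define PID_DATA_MASK  ((UINT32_C(1) << PID_DATA_BITS) - 1u)

/* The two 32-bit integers carried for a pid (id + serial). */
struct pid_fields {
    uint32_t id;
    uint32_t serial;
};

/* Split a 28-bit pid value; assumed layout: id in the low bits. */
struct pid_fields pid_split(uint32_t data)
{
    struct pid_fields f;
    assert((data & ~PID_DATA_MASK) == 0);
    f.id = data & PID_NUM_MASK;
    f.serial = (data >> PID_NUM_BITS) & PID_SER_MASK;
    return f;
}

/* Inverse of pid_split: rebuild the 28-bit value from the two fields. */
uint32_t pid_join(struct pid_fields f)
{
    return ((f.serial & PID_SER_MASK) << PID_NUM_BITS) | (f.id & PID_NUM_MASK);
}
```

Widening the representation would then mostly mean changing the two width
macros and the integer types, provided both sides of the connection agree on
the new sizes.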

So, if there's a way to resurrect this issue and increase the Pid
representation to 60 bits in a 64-bit environment, this would really be
helpful.


On Fri, Sep 13, 2013 at 12:27 PM, Rickard Green <rickard@REDACTED> wrote:

> Up to somewhere around the R9 release, pids were composed of a 15-bit index
> into the process table and a 3-bit serial. After that we use 28 bits, i.e.,
> all bits available in an immediate term on a 32-bit machine. Number of bits
> used as index is determined at boot-time instead of at compile-time.
>
> This can, quite easily, be extended to 60 bits on 64-bit machines when it
> comes to the VM. I've already implemented this once in the VM, but dropped
> it since it required backward incompatible changes and a lot of work in
> erl_interface/ei. This made us settle for 28 bits on all machines
> (unfortunately).
>
> The _PID_NUM_SIZE of 15 bits is more or less only still there in order to
> make the textual representation of pids look (somewhat) the same as they
> used to before R9.
>
> Since 2^28 is quite a small number, pids may be reused quite fast in a
> system that spawns a lot of processes. All 28 bits of the pid are
> passed over the distribution.
>
> Regards,
> Rickard Green
>
>  Yes, what I meant was:
>> Why wouldn't the Erlang VM be changed to use 4 bytes (32 bits) for the
>> Serial and 4 bytes (32 bits) for the Id (giving us 46 more bits), so that
>> the sizes match what the External Term Format allows?  Is this change on
>> the Erlang VM roadmap?
>>
>> On 09/12/2013 04:05 PM, Serge Aleynikov wrote:
>>
>>> It seems to me that you were mistaken as the 4 (id) and 4 (serial) in
>>> the External Term Format specification (*) indicate the sizes in bytes
>>> rather than in bits.
>>>
>>> (*) http://www.erlang.org/doc/apps/erts/erl_ext_dist.html#id87011
>>>
>>>
>>> On Thu, Sep 12, 2013 at 6:22 PM, Michael Truog wrote:
>>>
>>>     On 09/11/2013 09:58 AM, Serge Aleynikov wrote:
>>>
>>>>     Presently the PID representation in external term format is limited
>>>> to the following:
>>>>
>>>>     http://www.erlang.org/doc/apps/erts/erl_ext_dist.html#id87011
>>>>
>>>>     ./erts/emulator/beam/erl_node_container_utils.h:#define ERTS_MAX_PID_NUMBER ((1 << _PID_NUM_SIZE) - 1)
>>>>     ./erts/emulator/beam/erl_term.h:#define _PID_NUM_SIZE 15
>>>>
>>>>     ID is limited to 15 bits
>>>>     Serial is limited to 3 bits
>>>>
>>>>     So in total a PID consists of 18 bits, and therefore it seems that
>>>> the number of pids on any remote node cannot exceed 2^18 (262144).  While
>>>> it may seem like a large number, when creating a node in other languages
>>>> that implement Erlang distributed transport (e.g. C/C#/Java) and
>>>> create/destroy mailboxes, the local pid counter used to create unique Pids
>>>> can easily go over that limit.  The work-around is to cache freed local
>>>> pids and resurrect them when pid counter wraps around 2^18 boundary.
>>>>
>>>>     This brings the question of whether that limitation is still
>>>> necessary in the current version of distribution.  Internally Pids use a
>>>> wider representation (is it 28 bits?), so aside from supporting older
>>>> versions of beam (which can be worked around through flags in distributed
>>>> transport) is there any valid reason not to increase the pid maximum
>>>> numbering limit?
>>>>
>>>>     BTW, as a side note, how is the same problem addressed in the beam
>>>> when the pid ID counter reaches that limit? Does it make it possible that a
>>>> newly assigned Pid becomes non-unique? (I.e. if some entity still maintains
>>>> a reference to an old Pid that died, and later after the pid ID counter
>>>> wrapped around, a new Pid was assigned the same Pid ID number of a
>>>> previously dead Pid, then the entity that had the reference to the old Pid
>>>> with the same ID, could send a message to it that would not be valid for
>>>> the new Pid.)
>>>>
>>>>     Serge
>>>>
>>>>
>>>>
>>>>
>>>     Why wouldn't the Erlang VM be changed to use 4 bits for the Serial
>>> and 4 bits for the Id, so that the sizes match what the External Term
>>> Format allows?  Is this change on the Erlang VM roadmap?
>>>
>>>
>>>
>>>
> --
> Rickard Green, Erlang/OTP, Ericsson AB.
>