term_to_binary/2 with atom cache and/or pid_info/1

Rickard Green rickard@REDACTED
Tue Mar 23 15:46:19 CET 2021

On Mon, Mar 22, 2021 at 12:08 PM Loïc Hoguin <lhoguin@REDACTED> wrote:

> Hello,
> Currently the Erlang Term Format has two variants:
>  * the full featured format that includes different forms of atom caches
>  * the simpler term_to_binary/1 format that does not
> This is not a satisfying state of affairs: sometimes we want to use
> term_to_binary/1 for protocols or when exchanging data, but the lack
> of atom cache can result in us sending a lot of 'undefined' atoms in
> string form.
>   => Should term_to_binary/1 allow setting up an atom cache?
>      Perhaps the cache could be maintained as a map to be encoded
>      separately by the user. This could also allow predefining
>      the most common atoms that could then never be sent (for
>      example #{undefined => 1, true => 2, false => 3}). Whatever
>      the interface, we should reuse as much of the distribution
>      header atom cache code as possible.
> An alternative would be to build our own format loosely based on the
> Erlang Term Format. But in that scenario we end up lacking at least
> the pid_info/1 and ref_info/1 functions that would allow us to encode
> a pid/reference without having to use either term_to_binary/1 or
> {pid,ref}_to_list/1. On the other side the pid/reference can be
> recomposed via a pid_from_info/1 or ref_from_info/1 type of function.
> These functions can be useful to have regardless of the answer to the
> first question above. For example pid_info/1 is used in Mnesia here:
> https://github.com/erlang/otp/blob/master/lib/mnesia/src/mnesia_locker.erl#L1270
> And also in RabbitMQ here, as well as pid_from_info/1:
> https://github.com/rabbitmq/rabbitmq-server/blob/master/deps/rabbit/src/pid_recomposition.erl
> I've also been writing similar code when experimenting with custom
> distribution drivers.
>   => Should erlang:pid_info/1 and erlang:pid_from_info/1 be added?
>      This is the strongest case as there's code in the wild
>      already doing this.
>   => Should erlang:ref_info/1 and erlang:ref_from_info/1 be added?
> It's possible that ports and funs may benefit as well, but I have
> a hard time figuring out when we would want to use a port that
> way, and for funs I believe we already have everything we need
> as long as they're not anonymous funs.
> Cheers,
> --
> Loïc Hoguin

It is a bit unfortunate that the "creation" value of the node part is so
well hidden since the full identifier of a node is its nodename together
with its creation. It would have been nice if the node/1 BIF had returned
'{Nodename, Creation}' instead of just 'Nodename', but that is too late to
change now. Perhaps a nid/1 BIF?
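
To illustrate how hidden the creation is today, here is a minimal sketch of what a nid/1-style helper could look like for pids. It is not an existing BIF; the module name and the function are hypothetical. It assumes OTP 23 or later, where term_to_binary/1 encodes every pid as NEW_PID_EXT (tag 88), and it has to go via the external term format to reach the creation:

```erlang
-module(nid_sketch).
-export([nid/1]).

%% Hypothetical helper, not an OTP BIF: returns {Nodename, Creation} for a pid.
%% Assumes the pid is encoded as NEW_PID_EXT (tag 88), laid out as the node
%% atom followed by Id:32, Serial:32 and Creation:32 (all big-endian).
nid(Pid) when is_pid(Pid) ->
    <<131, 88, Rest/binary>> = term_to_binary(Pid),
    %% The last 12 bytes are Id, Serial and Creation; the node atom precedes them.
    NodeLen = byte_size(Rest) - 12,
    <<_NodeBin:NodeLen/binary, _Id:32, _Serial:32, Creation:32>> = Rest,
    {node(Pid), Creation}.
```

Calling nid_sketch:nid(self()) would return the local nodename paired with the creation of the running node instance. Ports and references would need their own clauses, since their encodings differ.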

Currently pids, ports and references are the data types that contain node
identifiers, and these are also the types that the node/1 BIF can handle.

I think it is reasonable to have functionality for creating such data types
from full information, so that alternative protocols won't have to go via
the external term format.
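
For comparison, this is roughly what user code has to do today to take a pid apart and put it back together. It is only a sketch under assumptions: the names pid_info/1 and pid_from_info/1 are borrowed from the proposal quoted above, the 4-tuple shape and the module name are invented for illustration, and the encoding is assumed to be NEW_PID_EXT (tag 88) as emitted by OTP 23 and later:

```erlang
-module(pid_compose_sketch).
-export([pid_info/1, pid_from_info/1]).

%% Hypothetical: split a pid into {Node, Id, Serial, Creation}.
%% NEW_PID_EXT ends with Id:32, Serial:32, Creation:32 (big-endian),
%% preceded by the encoded node atom.
pid_info(Pid) when is_pid(Pid) ->
    <<131, 88, Rest/binary>> = term_to_binary(Pid),
    NodeLen = byte_size(Rest) - 12,
    <<_NodeBin:NodeLen/binary, Id:32, Serial:32, Creation:32>> = Rest,
    {node(Pid), Id, Serial, Creation}.

%% Hypothetical: rebuild a pid from the tuple produced by pid_info/1.
%% The node atom is encoded by term_to_binary/1 and spliced in without
%% its leading version byte (131).
pid_from_info({Node, Id, Serial, Creation}) when is_atom(Node) ->
    <<131, NodeBin/binary>> = term_to_binary(Node),
    binary_to_term(<<131, 88, NodeBin/binary, Id:32, Serial:32, Creation:32>>).
```

With this sketch, pid_compose_sketch:pid_from_info(pid_compose_sketch:pid_info(self())) gives back self(). Dedicated BIFs would make the detour through term_to_binary/1 and binary_to_term/1 unnecessary.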

Rickard Green, Erlang/OTP, Ericsson AB