[erlang-questions] Changing the representation of sets

Tue Apr 25 15:17:00 CEST 2017

On Tue, Apr 25, 2017 at 1:17 PM Attila Rajmund Nohl <attila.r.nohl@REDACTED>
wrote:

> It would affect everybody who saves the opaque data using an earlier
> OTP version, then reads it in newer OTP version (e.g. after upgrade).
> Or those who run two nodes on different OTP versions (e.g. during an
> upgrade).
>

I like to think of dict and sets as "opaque" data structures. If you are
relying on their internal representation, you will run into trouble. So
changing the representation going forward should definitely be possible.

This leaves backwards compatibility. If the maps-optimized module can read
old sets representations and convert them into the new format on the fly,
it may be possible to gradually replace the old representations with the
new.

One thing to look out for in that process is data at rest, stored years
ago in a database. At some point you would have to reprocess such data, or
supply a conversion module which can handle these old formats.
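
A minimal sketch of what such a conversion layer could look like. The
module name set_compat, its functions, and the map layout #{Elem => []} are
assumptions made for illustration, not an existing OTP API. Only sets:is_set/1
and sets:to_list/1 are used on old values, so the code never looks inside
the old representation:

```erlang
-module(set_compat).
-export([upgrade/1, is_element/2, add_element/2, migrate_binary/1]).

%% Assumed new representation: a map whose keys are the set elements.
%% Any map is taken to be in the new format already.
upgrade(Set) when is_map(Set) ->
    Set;
%% Anything else must be a set in the old format. It is converted through
%% the public sets API only.
upgrade(Old) ->
    case sets:is_set(Old) of
        true  -> maps:from_list([{E, []} || E <- sets:to_list(Old)]);
        false -> error(badarg)
    end.

%% Operations accept either format and always answer in the new one, so
%% old values are replaced gradually as they pass through the module.
is_element(E, Set) ->
    maps:is_key(E, upgrade(Set)).

add_element(E, Set) ->
    maps:put(E, [], upgrade(Set)).

%% Data at rest: decode a stored term, upgrade it, and encode it again.
%% The caller decides when to write the result back to the database.
migrate_binary(Bin) when is_binary(Bin) ->
    term_to_binary(upgrade(binary_to_term(Bin))).
```

Each call to upgrade/1 on an old value costs a full traversal, so in
practice you would want to store the upgraded value rather than convert on
every access.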

I also think that any system must provide some measure of pushing things
forward. That is, each major release of Erlang could contain a limited set
of things you now have to do differently, with a clear upgrade path from
earlier versions. As long as the set is limited, we can probably handle the
rewrites needed. If you value backwards compatibility forever, you run
the risk of getting stale, never upgrading anything.

As for the increased memory copy pressure: I think this should be fixed in
the context of "maps" and not be part of the argument as to why one would
keep the old sets representation.

As an aside: I've long wanted a way to "tag" an Erlang term as
"do-not-touch-this". That is, to provide functions:

seal(Tag, Term) -> Sealed
unseal(Tag, Sealed) -> Term

which "hide" the representation of Term when printed, replacing it with some
kind of opaque representation (think of how a fun is printed). For debugging
purposes, an auto-unseal could be necessary.
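
Something close to this can be approximated today with a closure. This is
only a sketch under that assumption, and the module name seal is made up for
the example. The sealed value is a fun, so the shell prints it as #Fun<...>
rather than showing the term:

```erlang
-module(seal).
-export([seal/2, unseal/2]).

%% Wrap Term in a fun that only gives it back when called with the same
%% Tag (compared with =:=).
seal(Tag, Term) ->
    fun(T) when T =:= Tag -> Term end.

%% Calling the fun with the wrong tag fails with function_clause.
unseal(Tag, Sealed) when is_function(Sealed, 1) ->
    Sealed(Tag).
```

This hides the term from casual printing and pattern matching, but it is
not a security boundary: erlang:fun_info(Sealed, env) still exposes the
captured values. A fun is also tied to the module that created it, which
makes it a poor fit for data that has to survive code upgrades or be stored
on disk.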

One reason for this is that I can make a representation abstract, so users
of a library are unlikely to rely on the representation being a certain
structure and accidentally couple their code tightly to mine. While we
are all mutually consenting adults, experience has shown people tend to
rely on internals quite often.