[erlang-questions] Extending term external format to support shared substructures

Richard O'Keefe ok@REDACTED
Tue Mar 31 04:57:27 CEST 2009


On 31 Mar 2009, at 2:19 pm, Matthew Dempsky wrote:

> Does anyone have opinions on extending the external term format to
> support shared substructures?

It already does this for atoms, so it would not be alien to the
spirit of the external term format.  What's more, UBF also has
provision for this.  One naturally wonders: what would it be
like to unify the external term format and UBF, and would there be
any point in it?  -- The external term format could be a lot more
compact than it currently is, but UBF would move somewhat in the
other direction.

The issue is how much it costs and how much it gains.
Encoding with a reuse dictionary requires a minimum of two passes
over the term.  *IF* this reduces the size of the output, it's a
win; if it doesn't, it's a loss.
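
To make the two passes concrete, here is a minimal sketch in Python
rather than Erlang.  Everything in it is an assumption made for
illustration: terms are modelled as nested tuples with str/int leaves,
the output is a list of made-up tokens ("atom", "tuple", "def", "ref"),
and none of this is the real external term format or UBF.  Pass 1
counts how often each compound subterm occurs; pass 2 emits the term
and replaces repeats with back-references.

```python
from collections import Counter

def encode_plain(term):
    """One pass, no sharing: every occurrence is written out in full."""
    out = []
    def emit(t):
        if isinstance(t, tuple):
            out.append(("tuple", len(t)))
            for child in t:
                emit(child)
        else:
            out.append(("atom", t))
    emit(term)
    return out

def count_subterms(term, counts):
    """Pass 1: count occurrences of each compound subterm."""
    if isinstance(term, tuple):
        counts[term] += 1
        if counts[term] == 1:
            # Descend only the first time: later occurrences will be
            # emitted as a single "ref", so their children never appear.
            for child in term:
                count_subterms(child, counts)

def encode_shared(term):
    """Two passes: count first, then emit with def/ref tokens."""
    counts = Counter()
    count_subterms(term, counts)
    ids = {}      # subterm -> id assigned at its first occurrence
    out = []
    def emit(t):
        if isinstance(t, tuple) and counts[t] > 1:
            if t in ids:
                out.append(("ref", ids[t]))
                return
            ids[t] = len(ids)
            out.append(("def", ids[t]))
        if isinstance(t, tuple):
            out.append(("tuple", len(t)))
            for child in t:
                emit(child)
        else:
            out.append(("atom", t))
    emit(term)
    return out

point = ("point", 1, 2)
line = ("line", point, point, point)
print(len(encode_plain(line)), len(encode_shared(line)))   # 14 9
tiny = ((), ())
print(len(encode_plain(tiny)), len(encode_shared(tiny)))   # 3 4
```

The first example is the win (14 tokens down to 9); the second is the
loss, where the "def" marker costs more than the repeat it saves.  The
sketch counts tokens rather than bytes, and it hashes tuples by value,
which is itself not free, so it says nothing about real-world costs.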

I've wondered whether it would be a good idea to have
  - one decoder, which is always ready to deal with shared
    substructure
  - two encoders, one that looks for shared substructure and one
    that doesn't.
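
The decoder half of that split can be sketched as follows, in Python
and under invented assumptions: a stream is a list of tokens, where
("atom", V) is a leaf, ("tuple", N) is followed by N encoded elements,
("def", I) labels the term that follows it, and ("ref", I) stands for
the term labelled I.  This token format is hypothetical and is not the
real external term format.  The point is only that a single decoder
reads output from either kind of encoder, because a stream without
sharing simply never contains "def" or "ref".

```python
def decode(tokens):
    """Decode a token list, with or without def/ref tokens in it."""
    table = {}    # id -> already decoded subterm
    pos = 0

    def read():
        nonlocal pos
        kind, value = tokens[pos]
        pos += 1
        if kind == "atom":
            return value
        if kind == "tuple":
            return tuple(read() for _ in range(value))
        if kind == "def":
            term = read()
            table[value] = term
            return term
        if kind == "ref":
            return table[value]
        raise ValueError(f"unknown token kind: {kind!r}")

    term = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after term")
    return term

# Output such as a non-sharing encoder might produce.
plain_stream = [
    ("tuple", 3), ("atom", "pair"),
    ("tuple", 2), ("atom", "x"), ("atom", 1),
    ("tuple", 2), ("atom", "x"), ("atom", 1),
]
# Output such as a sharing encoder might produce for the same term.
shared_stream = [
    ("tuple", 3), ("atom", "pair"),
    ("def", 0), ("tuple", 2), ("atom", "x"), ("atom", 1),
    ("ref", 0),
]

a = decode(plain_stream)
b = decode(shared_stream)
print(a == b)          # True
print(b[1] is b[2])    # True
```

Both streams decode to the same term; with the shared stream the two
inner tuples are also the same object in memory, so the sharing
survives the round trip.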

Oddly enough, I have no firm opinions as yet, other than thinking
it would be a really good idea to have some figures on what NOT
looking for shared structure is costing us.