[erlang-questions] Extending term external format to support shared substructures

Bjorn Gustavsson <>
Tue Mar 31 11:46:12 CEST 2009

On Tue, Mar 31, 2009 at 6:37 AM, Matthew Dempsky <> wrote:
> On Mon, Mar 30, 2009 at 6:19 PM, Matthew Dempsky <> wrote:
>> Unless anyone is strongly opposed to the idea, I'll work on a
>> proof-of-concept patch.
> Applying the patch below to R13A extends binary_to_term to support the
> 'D' and 'w' type tags as I described above.  For example:
> 1> binary_to_term(<<131,$D,3:32, $k,5:16,"hello", $k,5:16,"world",
> $h,2,$w,0:32,$w,1:32, $l,4:32,$w,2:32,$w,2:32,$w,2:32,$w,2:32,$j>>).
> [{"hello","world"},
>  {"hello","world"},
>  {"hello","world"},
>  {"hello","world"}]
> (This example uses a three element dictionary: the strings "hello" and
> "world" are the first two words, the tuple {"hello", "world"} is the
> third, using references to the dictionary instances; finally, the term
> value is a length-4 list using 4 references to the dictionary tuple.)
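
For readers who want to see how the example decodes, here is a minimal sketch in Erlang of the proposed format. It is not the C patch itself. The module and function names are hypothetical, only the tags used in the example are handled, and the layout (a 32-bit entry count after 'D', a 32-bit zero-based index after 'w') is inferred from the example above.

```erlang
-module(dict_ext_sketch).
-export([decode/1, example/0]).

%% Decode a binary that starts with the version byte, the proposed 'D'
%% tag, a 32-bit dictionary size, the dictionary entries, and the term.
decode(<<131, $D, N:32, Rest0/binary>>) ->
    {Dict, Rest1} = decode_dict(N, Rest0, []),
    {Term, <<>>} = decode_term(Rest1, Dict),
    Term.

%% Entries are decoded in order; each one may refer to earlier entries.
decode_dict(0, Bin, Acc) ->
    {list_to_tuple(lists:reverse(Acc)), Bin};
decode_dict(N, Bin, Acc) ->
    {Term, Rest} = decode_term(Bin, list_to_tuple(lists:reverse(Acc))),
    decode_dict(N - 1, Rest, [Term | Acc]).

%% 'w': reference to dictionary entry I (zero-based).
decode_term(<<$w, I:32, Rest/binary>>, Dict) when I < tuple_size(Dict) ->
    {element(I + 1, Dict), Rest};
%% 'k': STRING_EXT, a 16-bit length followed by the bytes.
decode_term(<<$k, Len:16, Str:Len/binary, Rest/binary>>, _Dict) ->
    {binary_to_list(Str), Rest};
%% 'h': SMALL_TUPLE_EXT, an 8-bit arity followed by the elements.
decode_term(<<$h, Arity:8, Rest0/binary>>, Dict) ->
    {Elems, Rest1} = decode_seq(Arity, Rest0, Dict, []),
    {list_to_tuple(Elems), Rest1};
%% 'l': LIST_EXT, a 32-bit length, the elements, then the tail.
decode_term(<<$l, Len:32, Rest0/binary>>, Dict) ->
    {Elems, Rest1} = decode_seq(Len, Rest0, Dict, []),
    {Tail, Rest2} = decode_term(Rest1, Dict),
    {Elems ++ Tail, Rest2};
%% 'j': NIL_EXT, the empty list.
decode_term(<<$j, Rest/binary>>, _Dict) ->
    {[], Rest}.

%% Decode N consecutive terms.
decode_seq(0, Bin, _Dict, Acc) ->
    {lists:reverse(Acc), Bin};
decode_seq(N, Bin, Dict, Acc) ->
    {Term, Rest} = decode_term(Bin, Dict),
    decode_seq(N - 1, Rest, Dict, [Term | Acc]).

%% The binary from the example above.
example() ->
    decode(<<131,$D,3:32, $k,5:16,"hello", $k,5:16,"world",
             $h,2,$w,0:32,$w,1:32,
             $l,4:32,$w,2:32,$w,2:32,$w,2:32,$w,2:32,$j>>).
```

Calling dict_ext_sketch:example() returns the same four-element list as the shell session above. Each 'w' reference returns the dictionary entry that was already built, so the four list elements refer to one tuple rather than four copies.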

Thanks for the patch. Even though it looks fine, we will not include it
until the code for encoding a term has been written.

> The patch isn't entirely minimal; it also fixes a decoding problem for
> zero length LIST_EXT structures, avoids allocating extra heap cells
> for list structures, and refactors some of the integer unpacking to
> use get_int16 and get_int32.

Thanks for pointing out those issues.

I will include the use of the macros and the correction of the number of
heap words needed for lists in R13B.

Your fix for a zero length LIST_EXT doesn't seem to be correct, though.
Have you tried:


My correction for this problem will appear in the next R13B
snapshot (hopefully tomorrow) at:


(It is also a good idea to use a snapshot as the base for further patches,
as I have eliminated the deep recursion in term_to_binary/1.)

> On a related note, I don't really understand why decoded_size keeps a
> stack of values for 'terms'.  It seems like it should just be possible
> to keep a running grand-total rather than pushing and popping from a
> stack.

The stack is there to make sure that all terms are properly nested. We try
to do as much error checking as possible while calculating the size.
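
As a rough illustration of that idea, the sketch below walks an encoded term, adds up heap words, and keeps a stack with one entry per open container that counts how many terms are still expected there. It is an assumption-laden Erlang model, not the decoded_size code in the emulator: the module name, the error atoms, and the small subset of tags are all hypothetical choices, and the word counts (arity + 1 for a tuple, 2 per list cell) are the usual heap costs rather than values taken from the patch.

```erlang
-module(size_walk_sketch).
-export([heap_words/1]).

%% Returns {ok, Words} for a well-formed encoding, {error, Reason} otherwise.
heap_words(<<131, Bin/binary>>) ->
    walk(Bin, [1], 0);              % exactly one top-level term is expected
heap_words(_) ->
    {error, bad_version}.

%% The stack holds, for each open container, the number of terms still
%% expected inside it.
walk(Bin, [0 | Stack], Words) ->    % innermost container is complete
    walk(Bin, Stack, Words);
walk(<<>>, [], Words) ->            % input and stack finished together
    {ok, Words};
walk(<<>>, _Open, _Words) ->        % input ended inside a container
    {error, truncated};
walk(_Extra, [], _Words) ->         % bytes left after the top-level term
    {error, trailing_data};
walk(<<$a, _:8, Rest/binary>>, [N | S], W) ->        % SMALL_INTEGER_EXT
    walk(Rest, [N - 1 | S], W);
walk(<<$j, Rest/binary>>, [N | S], W) ->             % NIL_EXT
    walk(Rest, [N - 1 | S], W);
walk(<<$k, Len:16, _:Len/binary, Rest/binary>>, [N | S], W) ->
    walk(Rest, [N - 1 | S], W + 2 * Len);            % string: Len list cells
walk(<<$h, Arity:8, Rest/binary>>, [N | S], W) ->
    walk(Rest, [Arity, N - 1 | S], W + Arity + 1);   % header + elements
walk(<<$l, Len:32, Rest/binary>>, [N | S], W) ->
    walk(Rest, [Len + 1, N - 1 | S], W + 2 * Len);   % elements + tail
walk(_, _, _) ->
    {error, bad_term}.
```

For example, heap_words(term_to_binary({1, "hi"})) gives {ok, 7} (three words for the tuple, four for the two list cells), while cutting the binary short inside the tuple gives {error, truncated}.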

Björn Gustavsson, Erlang/OTP, Ericsson AB
