[erlang-questions] json_to_term EEP

Richard A. O'Keefe ok@REDACTED
Wed Jul 30 03:34:03 CEST 2008

It would be nice if people would read the EEP.

On 30 Jul 2008, at 2:55 am, Hynek Vychodil wrote:
> I would prefer to always have strings in *one* format and not  
> special case keys with atoms sometimes. Otherwise to be certain you  
> would have to match both atom and binary to find key. Unless you  
> *always* use atoms for keys, which could easily explode.

In the EEP, json_to_term(IO_Data, Options) has an option
	{label,binary}
or	{label,atom}
or	{label,existing_atom}
There is no corresponding option for strings, which are
always binaries.  (The idea is that strings are
unpredictable data, whereas labels are predictable structure.)
{label,binary} says to leave all labels as binaries.
     This would have been intolerable before <<"...">> syntax
     was introduced; now the main thing is that it wastes space.
{label,atom} says to convert to an atom any label that CAN
     be converted to an atom, the main limitation being that
     Erlang atoms are not yet Unicode-ready.  (Someone else has
     an EEP about that, I believe.)  This is perfect for
     communicating with a TRUSTED source, just like receiving
     Erlang term_to_binary() values and decoding them.
{label,existing_atom} means that a module that mentions
     certain atoms in pattern matches against formerly-JSON
     labels can be confident of finding those atoms, while
     other labels may remain binaries.
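
A minimal sketch of how the three label modes differ, written in plain
Erlang. The module name label_sketch and the function convert_label/2 are
hypothetical and not part of the EEP; the sketch also assumes the
binary_to_atom/2 and binary_to_existing_atom/2 BIFs of later OTP releases.
It illustrates only the per-label decision, not a JSON parser.

```erlang
-module(label_sketch).
-export([convert_label/2]).

%% convert_label(Label, Mode) -> atom() | binary()
%% Label is a JSON object key held as a UTF-8 binary.
%% Mode mirrors the EEP's {label, Mode} option; this helper itself is
%% an illustration, not the proposed implementation.

%% binary: leave every label exactly as it arrived.
convert_label(Label, binary) when is_binary(Label) ->
    Label;
%% atom: convert whenever the label CAN be an atom, creating new atoms
%% as needed; fall back to the binary when the BIF refuses (for
%% example, the label is too long to be an atom).
convert_label(Label, atom) when is_binary(Label) ->
    try binary_to_atom(Label, utf8)
    catch error:_ -> Label
    end;
%% existing_atom: convert only if the atom already exists in the node;
%% unknown labels stay binaries, so no new atoms are ever created.
convert_label(Label, existing_atom) when is_binary(Label) ->
    try binary_to_existing_atom(Label, utf8)
    catch error:badarg -> Label
    end.
```

With this sketch, convert_label(<<"ok">>, existing_atom) gives the atom ok,
while a label that no loaded code mentions comes back unchanged as a binary.
Note that no intermediate list is built in any of the three modes.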

Options are a way of coping with different people's different
situations and needs; the trick is to have just enough of them.

> I argue unification,

Unification of what with what?

> so transforming all to atom is insecure and result is don't use this  
> way at all.

WITHIN a trust boundary, all is well.  Not all communication
crosses trust boundaries, otherwise term_to_binary() would be
of little or no use.

> Aside non-uniformity of  list_to_existing_atom way, there is  
> performance drawback too. For each key you must call  
> list_to_existing_atom(binary_to_list(X)) and binary_to_list causes  
> GC pressure in this usage. I would not have use this variant, too.

What performance drawback?  What call to binary_to_list()?  Whoever said
the binary EXISTED in the first place?  The EEP is a proposal for  
these conversion functions in the Erlang core, eventually to be
implemented in C.  So implemented, the alleged performance drawback  
does not exist.

> P.S.: Why non-uniform is problem.

It is a problem for people who EXPECT a uniform translation,
and not for people who don't.

> One can argue, it looks nicer. OK. One can argue, binary->atom  
> transformation is done only for exists atoms and all atoms which  
> used in comparisons are exists. BAD, imagine for example store  
> Erlang term for long time or send to other nodes

Again, you are overlooking the fact that different people have
different needs, and that the translation of labels can be (and
IS, in the EEP) an OPTION.  You are also overlooking the fact
that *considered as JSON*, the forms are entirely equivalent,
and that since JSON explicitly says that the order of key:value
pairs does not matter, there is uncertainty about precisely
what Erlang term you get anyway.

In fact, for binary storage, conversion to existing atoms is
*better* than conversion to binaries, because the Erlang
term-to-binary format uses a compression scheme for atoms
that it does not use for binaries.

Admittedly, the answer to that is to extend the compression
scheme to binaries as well.
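
One rough way to check the size difference on a particular OTP release is
to encode the same label text both ways and compare. The module name
size_sketch and the function encoded_sizes/1 are hypothetical. The sketch
measures only what term_to_binary/1 emits for a single term; it does not
exercise any atom caching between nodes, and the exact byte counts depend
on the release.

```erlang
-module(size_sketch).
-export([encoded_sizes/1]).

%% encoded_sizes(Label) -> {AtomBytes, BinaryBytes}
%% Encodes the same text once as an atom and once as a binary with
%% term_to_binary/1 and returns the two encoded sizes in bytes.
%% Creating the atom here is acceptable only because this is a
%% measurement on trusted input.
encoded_sizes(Label) when is_binary(Label) ->
    Atom = binary_to_atom(Label, utf8),
    {byte_size(term_to_binary(Atom)), byte_size(term_to_binary(Label))}.
```

For a short label such as <<"name">>, the atom form should come out a few
bytes smaller, since the binary form carries a wider length field.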
