[erlang-questions] json_to_term EEP
Richard A. O'Keefe
ok@REDACTED
Wed Jul 30 03:34:03 CEST 2008
It would be nice if people would read the EEP.
On 30 Jul 2008, at 2:55 am, Hynek Vychodil wrote:
> I would prefer to always have strings in *one* format and not
> special case keys with atoms sometimes. Otherwise to be certain you
> would have to match both atom and binary to find key. Unless you
> *always* use atoms for keys, which could easily explode.
In the EEP, json_to_term(IO_Data, Options) has an option
{label,binary}
or {label,atom}
or {label,existing_atom}
There is no corresponding option for strings, which are
always binaries. (The idea is that strings are
unpredictable data, whereas labels are predictable structure.)
{label,binary} says to leave all labels as binaries.
This would have been intolerable before <<"...">> syntax
was introduced; now the main thing is that it wastes space.
{label,atom} says to convert to an atom any label that CAN
be converted to an atom, the main limitation being that
Erlang atoms are not yet Unicode-ready. (Someone else has
an EEP about that, I believe.) This is perfect for
communicating with a TRUSTED source, just like receiving
Erlang term_to_binary() values and decoding them.
{label,existing_atom} means that a module that mentions
certain atoms in pattern matches against formerly-JSON
labels can be confident of finding those atoms, while
other labels may remain binaries.
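
As a rough illustration of the three settings, here is a minimal sketch in plain Erlang. The module name label_sketch and the function convert_label/2 are hypothetical and are not taken from the EEP. The sketch assumes labels arrive as UTF-8 binaries and uses the binary_to_atom/2 and binary_to_existing_atom/2 BIFs found in current Erlang/OTP releases, whereas the EEP proposes doing the conversion inside the decoder itself.

```erlang
-module(label_sketch).
-export([convert_label/2]).

%% convert_label(Label, Mode) -> atom() | binary()
%% Label is a JSON object key as a UTF-8 binary; Mode is the value of
%% the {label, Mode} option. Hypothetical helper, not part of the EEP.
convert_label(Label, binary) when is_binary(Label) ->
    %% Leave every label as a binary.
    Label;
convert_label(Label, atom) when is_binary(Label) ->
    %% Convert when possible; fall back to the binary otherwise
    %% (for example, when the label is too long to be an atom).
    try binary_to_atom(Label, utf8)
    catch error:_ -> Label
    end;
convert_label(Label, existing_atom) when is_binary(Label) ->
    %% Convert only if the atom is already in the atom table,
    %% so untrusted input can never create new atoms.
    try binary_to_existing_atom(Label, utf8)
    catch error:badarg -> Label
    end.
```

With existing_atom, a clause such as f({name, V}) in a loaded module guarantees that the atom name exists, so the label "name" comes back as that atom, while a label no module mentions stays a binary.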
Options are a way of coping with different people's different
situations and needs; the trick is to have just enough of them.
> I argue unification,
Unification of what with what?
> so transforming all to atom is insecure and result is don't use this
> way at all.
WITHIN a trust boundary, all is well. Not all communication
crosses trust boundaries, otherwise term_to_binary() would be
of little or no use.
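
The same distinction shows up when decoding term_to_binary() output. The sketch below is an illustration only: the module trust_sketch and the function decode/2 are hypothetical names. It assumes the [safe] option of binary_to_term/2 from current Erlang/OTP releases, which refuses to create atoms that do not already exist.

```erlang
-module(trust_sketch).
-export([decode/2]).

%% decode(Bin, Source) -> {ok, Term} | {error, badarg}
%% Source says which side of the trust boundary Bin came from.
%% Hypothetical helper for illustration.
decode(Bin, trusted) when is_binary(Bin) ->
    %% Inside the trust boundary: decode freely, new atoms allowed.
    {ok, binary_to_term(Bin)};
decode(Bin, untrusted) when is_binary(Bin) ->
    %% Across the trust boundary: [safe] rejects data that would
    %% create new atoms, much as {label,existing_atom} does for labels.
    try {ok, binary_to_term(Bin, [safe])}
    catch error:badarg -> {error, badarg}
    end.
```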
>
> Aside non-uniformity of list_to_existing_atom way, there is
> performance drawback too. For each key you must call
> list_to_existing_atom(binary_to_list(X)) and binary_to_list causes
> GC pressure in this usage. I would not have use this variant, too.
What performance drawback? What call to binary_to_list()? Whoever said
the binary EXISTED in the first place? The EEP is a proposal for putting
these conversion functions in the Erlang core, eventually to be
implemented in C. So implemented, the alleged performance drawback
simply does not exist.
>
> P.S.: Why non-uniform is problem.
It is a problem for people who EXPECT a uniform translation,
and not for people who don't.
> One can argue, it looks nicer. OK. One can argue, binary->atom
> transformation is done only for exists atoms and all atoms which
> used in comparisons are exists. BAD, imagine for example store
> Erlang term for long time or send to other nodes
Again, you are overlooking the fact that different people have
different needs, and that the translation of labels can be (and
IS, in the EEP) an OPTION. You are also overlooking the fact
that *considered as JSON*, the forms are entirely equivalent,
and that since JSON explicitly says that the order of key:value
pairs does not matter, there is uncertainty about precisely
what Erlang term you get anyway.
In fact, for binary storage, conversion to existing atoms is
*better* than conversion to binaries, because the Erlang
term-to-binary format uses a compression scheme for atoms
that it does not use for binaries.
Admittedly, the answer to that is to extend the compression
scheme to binaries as well.