[erlang-questions] Why EEP-0018 "JSON bifs" (and conforming libraries) are "wrong" about object encoding (i.e. `[{}]`)

Wed Aug 27 11:53:34 CEST 2014

Hello,

EEP18 will probably never be implemented in OTP, and if it is, it will probably end up using maps to represent objects instead of tuples and lists. Remember that EEP18 was written long before maps were implemented.

Regards,

-- 
Anthony Ramine

On 26 Aug 2014, at 23:30, Ciprian Dorin Craciun <ciprian.craciun@REDACTED> wrote:

>    Today I've made a small attempt at finding a suitable JSON
> encoding / decoding library that fits my needs.  (Until now I was using
> `mochijson2` [1], which seems to be rather old...)  However the
> current email is not about my requirements, or the libraries I found,
> but about the direction in which recent JSON libraries are heading,
> namely how objects are encoded as `[{}] | list({string(),
> json()})`.
> 
>    First of all a small survey about existing implementations:
>    * `mochijson2` [1] uses (what I call) "mochi", i.e. `{struct,
> list({string(), json()})}`;
>    * `jsx` [2] uses EEP-0018, i.e. `[{}] | list({string(), json()})`;
>    * `jiffy` [3] uses EEP-0018;
>    * `ejson` [4] uses something in between "mochi" and EEP-0018,
> namely `{list({string(), json()})}`;
>    * `jsonx` [5] uses "mochi";
>    * `kvc` [6] (although a querier) uses either "mochi" or EEP-0018;
>    * `valijate` [7] (although only a validator) uses something
> similar to "mochi";
>    * others are either "mochi"-compliant or EEP-0018 compliant;
> (many more seem to be EEP-0018 than "mochi";)
> 
>    Granted, EEP-0018 clearly states that a library could offer
> the user the option to choose how an object is encoded as an
> Erlang term (options A through F), and that the library could then
> correctly interpret the chosen form.  (Unfortunately none of the
> encoding / decoding libraries supports this choice.)
> 
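To make the surveyed shapes concrete, here is a minimal sketch of one JSON object, `{"name": "joe", "age": 30}`, written in each of the three term shapes listed above. The module and function names are hypothetical, and representing keys and strings as binaries is an assumption (the survey writes `string()`).

```erlang
-module(json_shapes).
-export([mochi/0, eep18/0, ejson/0, empty/0]).

%% "mochi": the property list is wrapped in a tagged tuple.
mochi() -> {struct, [{<<"name">>, <<"joe">>}, {<<"age">>, 30}]}.

%% EEP-0018 (as described in the survey): a bare property list.
eep18() -> [{<<"name">>, <<"joe">>}, {<<"age">>, 30}].

%% `ejson`: the property list is wrapped in a 1-tuple.
ejson() -> {[{<<"name">>, <<"joe">>}, {<<"age">>, 30}]}.

%% The empty object {} in the same three shapes, in the same order.
empty() -> [{struct, []}, [{}], {[]}].
```

Only the EEP-0018 shape makes the object itself a list, which is where the ambiguities discussed in the rest of the message come from.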
> 
>    At first glance there seems to be nothing wrong with either approach
> (either "mochi" or EEP-0018).  Except maybe the fact that when writing
> a multi-headed pattern-matching function, in the case of EEP-0018 one
> must match the object first, then the list, or else the object might
> be misinterpreted as a list.
> 
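The clause-ordering point can be sketched as follows. This is only an illustration with hypothetical module and function names, assuming JSON strings are binaries so that a list is never a string.

```erlang
-module(json_kind).
-export([eep18/1, mochi/1]).

%% EEP-0018 shape: the object clauses must come before the generic
%% list clause, otherwise every object is reported as an array.
eep18([{}]) -> object;
eep18([{_Key, _Value} | _]) -> object;
eep18(List) when is_list(List) -> array;
eep18(_Other) -> scalar.

%% "mochi" shape: the tag settles it, so clause order is irrelevant.
mochi(List) when is_list(List) -> array;
mochi({struct, Props}) when is_list(Props) -> object;
mochi(_Other) -> scalar.
```

Moving the `is_list` clause of `eep18/1` to the top would still compile, but it would silently classify every object as an array.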
>    Although I do have to say that "mochi" does have a clearly
> unambiguous way to detect the value type, that is almost impossible to
> get wrong.  But granted that deconstructing EEP-0018 compliant objects
> with plain Erlang libraries (like `proplists`) is slightly more
> straight-forward, because one can use the object value as a list,
> whereas in "mochi" one would need to first extract the list, which
> boils down to `Object` vs. `element(2, Object)`.
> 
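The `Object` vs. `element(2, Object)` difference looks roughly like this when looking up a key with `proplists`. The module and function names are hypothetical, and binary keys are an assumption.

```erlang
-module(json_get).
-export([eep18/2, mochi/2]).

%% EEP-0018 shape: the object itself is the property list.
eep18(Key, Object) ->
    proplists:get_value(Key, Object).

%% "mochi" shape: unwrap the tagged tuple first.
mochi(Key, {struct, Props}) ->
    proplists:get_value(Key, Props).
```

The "mochi" version costs one extra pattern in the function head, and in exchange it fails with a `function_clause` error when handed something that is not an object.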
>    Therefore why do I say EEP-0018 (and conforming libraries) are
> wrong?  (In fact so wrong that I felt the need to write such a lengthy
> email?)  Because when one wants to extend the proposed JSON term
> syntax, or perhaps use it for something else (but still related to
> JSON), things start to crumble.
> 
> 
>    Let's take for example `valijate` [7] which allows one to easily
> validate (among others) JSON values with a simple schema.  For example
> one can say `{array, string}` to denote a schema which matches any
> list made only of strings.
> 
>    One could write the `validate` function for EEP-0018 as:
> 
> ~~~~ (not tested)
> validate (List, {array, ElementSchema}) ->
>    case List of
>        %% objects are lists too, so rule them out first
>        [{}] -> false;
>        [Head | _] when is_tuple (Head) -> false;
>        _ when is_list (List) ->
>            lists:foldl(
>                fun (true, A) -> A; (false, _) -> false end, true,
>                lists:map (
>                    fun (Element) -> validate (Element, ElementSchema) end,
>                    List));
>        _ -> false
>    end;
> ~~~~
> 
>    Compared with the following for "mochi":
> 
> ~~~~ (not tested)
> validate (List, {array, ElementSchema}) ->
>    if
>        is_list (List) ->
>            lists:foldl(
>                fun (true, A) -> A; (false, _) -> false end, true,
>                lists:map (
>                    fun (Element) -> validate (Element, ElementSchema) end,
>                    List));
>        true -> false
>    end;
> ~~~~
> 
>    No big issue so far, except a few extra matches (which could
> become tiresome to write if we have more than one type of schema that
> applies to lists).
> 
> 
>    Let's move a little bit further.  Say we now want to be able to
> write in `valijate` something like the following Erlang type:
> `[list(string()), list(integer())]` (i.e. a list made of exactly two
> elements, the first a list of strings, the second a list of integers).
> Although I don't know how (or if) `valijate` is able to express this,
> I would have expected something like this to work:
> 
>      [{array, string}, {array, integer}]
> 
>    Let's complicate it further and assume that we want to be able to
> validate if a list is a set (i.e. non-repeating elements), and we
> introduce a new schema type called `set`.  Let's see what a schema
> would look like for a list made of exactly two elements, the first an
> array of strings, the second a set of integers:
> 
>      [{array, string}, {set, integer}]
> 
>    However, before writing the implementation, let's imagine what a
> schema for an object would look like.  Assuming we want to keep as
> close as possible to the original EEP-0018 proposed term syntax, one
> could imagine something like this:
> 
>    * for an object with any number of attributes, whose keys and
> values must independently match the given schemas (i.e. the expected
> object behaves like a dictionary):
> 
>      [{<schema-for-key>, <schema-for-value>}]
> 
>    * for an object with as many attributes as given by the tuples,
> whose keys exactly match the given key literals (maybe in a different
> order), and whose values match the given schemas (i.e. the expected
> object behaves like a record):
> 
>      [{<literal-key-1>, <schema-for-value-1>}, {<literal-key-2>,
> <schema-for-value-2>}, ...]
> 
>    Let's try to give two examples:
>    * an object with any key, and a string as value (pick any):
>      [{string, string}]
>      [{any, string}]
>    * an object with exactly one key named either "string", or "any",
> whose value is a string:
>      [{string, string}]
>      [{any, string}]
> 
>    Let's also try to provide the schema for an object with exactly
> two attributes, one named `array` and with a string value, the second
> named `set` with an integer value:
> 
>      [{array, string}, {set, integer}]
> 
>    Darn...  I can't discern between the schema for a dictionary of
> strings or the schema for a record with a single attribute named
> "string" and a string value.  Similarly I can't discern between the
> schema for a two element array (one element a list, the other a set)
> or a two attribute object one named "array", the other "set".
> (Granted I can start complicating the schema syntax, but that would
> get further from a "simple" approach where the schema resembles
> closely the actual value.)
> 
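The ambiguity can be shown mechanically. The sketch below lists every reading that applies to an EEP-0018-style schema term under the rules described above; the module, the function names, and the use of atoms for literal keys are all hypothetical, chosen only to mirror the examples in the message.

```erlang
-module(schema_readings).
-export([readings/1]).

%% Returns every interpretation that fits the given schema term.
readings(Schema) when is_list(Schema) ->
    Tests = [{fixed_list, fun is_fixed_list/1},
             {dictionary, fun is_dictionary/1},
             {record, fun is_record_schema/1}],
    [Name || {Name, Test} <- Tests, Test(Schema)].

%% The schema vocabulary used in the examples.
is_schema(string) -> true;
is_schema(integer) -> true;
is_schema(any) -> true;
is_schema({array, S}) -> is_schema(S);
is_schema({set, S}) -> is_schema(S);
is_schema(_) -> false.

%% A list whose every element is itself a schema.
is_fixed_list(L) -> lists:all(fun is_schema/1, L).

%% A single {KeySchema, ValueSchema} pair.
is_dictionary([{K, V}]) -> is_schema(K) andalso is_schema(V);
is_dictionary(_) -> false.

%% One or more {LiteralKey, ValueSchema} pairs.
is_record_schema([]) -> false;
is_record_schema(L) ->
    lists:all(fun ({K, V}) -> is_atom(K) andalso is_schema(V);
                  (_) -> false
              end, L).
```

With these rules, `readings([{string, string}])` yields both `dictionary` and `record`, and `readings([{array, string}, {set, integer}])` yields both `fixed_list` and `record`, so a validator has no way to pick one.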
> 
>    Let's see if "mochi" could do it:
>    * for a dictionary:
>      {object, {<schema-for-key>, <schema-for-value>}}
>    * for a record:
>      {object, [{<literal-key-1>, <schema-for-value-1>}, ...]}
>    * (an actual matching JSON would be encoded as:)
>      {object, [{<literal-key-1>, <actual-value-1>}, ...]}
> 
>    For example:
>      {object, {string, string}} -- the string dictionary
>      {object, [{string, string}]} -- the record with a string attribute
>      {object, [{array, string}, {set, integer}]} -- the object with
> "array" and "set" attributes
>      [{array, string}, {set, integer}] -- the list with an array of
> strings and a set of integers
> 
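By contrast, the tagged schema syntax can be classified with one clause per form. This is a rough sketch with a hypothetical module name, covering only the schema forms that appear in the examples above.

```erlang
-module(tagged_schema).
-export([kind/1]).

%% Each schema form has exactly one matching clause.
kind({object, {_KeySchema, _ValueSchema}}) -> dictionary;
kind({object, Fields}) when is_list(Fields) -> record;
kind({array, _ElementSchema}) -> array;
kind({set, _ElementSchema}) -> set;
kind(List) when is_list(List) -> fixed_list;
kind(Atom) when is_atom(Atom) -> scalar.
```

Here `kind({object, [{array, string}, {set, integer}]})` is a `record` while `kind([{array, string}, {set, integer}])` is a `fixed_list`, so the two cases that collided before are now distinct terms.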
> 
>    OK...  Let's assume that the validation use-case is not of
> interest, and we could live with a certain syntax for the JSON values
> and another one for the schema.  Fine!
> 
>    Say however that we want now to make a small extension to our
> favourite JSON encoding library to better suit the following scenario:
> we need to implement a small web-service which gets a JSON from
> somewhere as a binary (maybe from a database or file), i.e.
> <<"{...}">>, of which we are certain it is correctly encoded, thus we
> don't need to parse it, and we need to "wrap" it into another JSON
> which holds some meta-data, like for example `{"ok" : true, "outcome"
> : <the-JSON-we-got-from-somewhere>}`.  However as stated we would like
> to reuse our favourite JSON library to encode such a wrapper JSON,
> but without first parsing the JSON binary.
> 
>    We could therefore extend the JSON library to allow us to put
> inside JSON terms, values which are to be "pasted" directly in the
> result (this is equivalent to the `RawMessage` from Go's JSON
> library).
> 
>    However the question is how to flag these raw values?  We
> obviously can't use a binary as that is the representation of a
> string.  We could choose something like `{raw, <<"...">>}`.
> 
>    OK, let's now try to see what a list composed of a single element,
> namely the raw JSON, would look like:
> 
>      [{raw, <<"...">>}]
> 
>    Doesn't that resemble an object with an attribute named "raw" and
> a string as value?  In EEP-0018 it surely does, but in "mochi" it
> doesn't (that would be `{object, [{raw, <<"...">>}]}`).
> 
>    (Or maybe we want to be able to use records directly as JSON
> values, tagged like `{record, <actual-record>}` which would call an
> external formatter, or even simpler directly using `<actual-record>`
> provided that its tag isn't `object`.  Or perhaps `{dict,
> <actual-dict>}`, `{gb_tree, <actual-gb-tree>}`, etc.)
> 
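A minimal sketch of the raw-value idea, assuming "mochi"-shaped input with `{struct, Props}` objects and a hypothetical `{raw, Binary}` extension. It is not the API of any library named in this thread, and it deliberately skips string escaping and floats.

```erlang
-module(raw_json).
-export([encode/1]).

%% Encodes a "mochi"-shaped term to a JSON binary.
%% Limitations: no string escaping, no floats.
encode(Term) -> iolist_to_binary(enc(Term)).

enc(true) -> "true";
enc(false) -> "false";
enc(null) -> "null";
enc(N) when is_integer(N) -> integer_to_binary(N);
enc(S) when is_binary(S) -> ["\"", S, "\""];
%% Hypothetical extension: paste an already encoded value verbatim.
enc({raw, Json}) when is_binary(Json) -> Json;
enc({struct, Props}) when is_list(Props) ->
    ["{", join([[enc(K), ":", enc(V)] || {K, V} <- Props]), "}"];
enc(List) when is_list(List) ->
    ["[", join([enc(E) || E <- List]), "]"].

%% Puts a comma between consecutive encoded items.
join([]) -> [];
join([H | T]) -> [H | [[",", X] || X <- T]].
```

For example, `raw_json:encode({struct, [{<<"ok">>, true}, {<<"outcome">>, {raw, <<"{\"a\":1}">>}}]})` produces the wrapper without parsing the inner binary. In the EEP-0018 shape the same extension cannot be added safely, because `[{raw, Bin}]` would already mean an object with a key named "raw".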
> 
>    I hope that the two given examples support my case against
> `[{}]`.  (I.e. it hampers extensibility of the proposed JSON term
> syntax.)
> 
> 
>    Moreover the other choice `{[{key, value}, ...]}` is only marginally
> better than EEP-0018, because it suggests that people can match an
> object by simply testing `is_tuple (JSON)`, which would make
> implementing extensions like the raw message one harder.
> 
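The `is_tuple` concern can be sketched like this, with hypothetical module and function names and the same hypothetical `{raw, Binary}` extension as before.

```erlang
-module(ejson_kind).
-export([loose/1, strict/1]).

%% Tempting but too broad: any tuple counts as an object.
loose(Json) when is_tuple(Json) -> object;
loose(Json) when is_list(Json) -> array;
loose(_Other) -> scalar.

%% Matching the exact 1-tuple shape leaves room for other tuples.
strict({Props}) when is_list(Props) -> object;
strict({raw, Bin}) when is_binary(Bin) -> raw;
strict(Json) when is_list(Json) -> array;
strict(_Other) -> scalar.
```

Code written in the `loose/1` style reports `{raw, Bin}` as an object, so adding the extension later means revisiting every such guard.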
> 
>    The reason that I wrote this email is because I have invested
> quite some time in writing a "few" JSON utility functions (including
> complex schema validation, destructuring, etc.) which heavily use and
> extend the "mochi" variant.  Based on this experience and a small
> analysis I've done today, I concluded that EEP-0018 would be quite
> cumbersome for expressing any kind of extension without a lot of
> pattern-matching to catch the extensions.  However by no means do I
> expect developers to change their libraries to suit such a usage, I
> only wanted to provide a counter-argument to EEP-0018.  Moreover, now
> that Erlang has hash objects, hopefully these can be used to express
> objects, and this problem would go away.
> 
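If objects were represented as maps, as both messages suggest, the classification could look like the sketch below. The module name is hypothetical, `{raw, Binary}` is the same hypothetical extension as above, and this is not a description of any existing library.

```erlang
-module(map_kind).
-export([kind/1]).

%% With maps for objects, lists and tuples stay free for other uses.
kind(Map) when is_map(Map) -> object;
kind(List) when is_list(List) -> array;
kind({raw, Bin}) when is_binary(Bin) -> raw;
kind(_Other) -> scalar.
```

In this shape `[{raw, Bin}]` is unambiguously a one-element array, and clause order no longer matters.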
>    Hopefully I haven't offended anyone, (I apologize in advance,)
>    Ciprian.
> 
> 
>    [1] https://github.com/mochi/mochiweb/blob/master/src/mochijson2.erl
>    [2] https://github.com/talentdeficit/jsx
>    [3] https://github.com/davisp/jiffy
>    [4] https://github.com/benoitc/ejson
>    [5] https://github.com/iskra/jsonx
>    [6] https://github.com/etrepum/kvc
>    [7] https://github.com/eriksoe/valijate