[erlang-questions] No JSON/MAPS interoperability in 17.0?

Fri Mar 14 16:20:13 CET 2014

Hi Joe,

I think the library you have described (does 99% of the work) is the
equivalent of bait-and-switch at the language/library level:

1) it's not just a question of whether floating point numbers survive
   the round trip, but also integers.
2) json tags are more or less strings and expect utf-8. Currently, we
   'support' utf8 atoms but we don't. See
   http://www.erlang.org/erldoc?q=list_to_atom.
   This doesn't mention what you do when trying to encode a map which
   uses keys such as '1.0', 1.0, <<"1.0">>, and "1.0" at the same
   time. We currently have 4 data types that may need an identical
   representation once converted.

   Whoops, that doesn't work super well and may in fact cover far less
   than 99% of the cases. We have to consider all the other cases such
   as just 1, 1.0, "1.00", "1.000", ..., and so on.

3) That can be made to work
4) No opinion on this one
5) This can also be read as "best effort not corrupting values read"
   which scares me a lot if the end result is not "raise an error when
   you can't figure it out reliably"
6) Amen to that.
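
A minimal sketch of the collision described in point 2, assuming a
hypothetical "render every key as a JSON string" rule. The module and
function names (json_key_collision, key_to_json/1, colliding_keys/1) are
made up for illustration and do not come from any existing library; the
float formatting is deliberately simplistic.

```erlang
-module(json_key_collision).
-export([key_to_json/1, colliding_keys/1]).

%% Hypothetical best-effort rule: turn any scalar map key into the
%% binary that would be used as the JSON object key.
key_to_json(K) when is_atom(K)    -> atom_to_binary(K, utf8);
key_to_json(K) when is_binary(K)  -> K;
key_to_json(K) when is_list(K)    -> unicode:characters_to_binary(K);
key_to_json(K) when is_integer(K) -> integer_to_binary(K);
%% Assumption: one decimal is "good enough"; this is lossy on purpose.
key_to_json(K) when is_float(K)   -> float_to_binary(K, [{decimals, 1}]).

%% Return [{JsonKey, ErlangKeys}] for every JSON key that more than one
%% distinct Erlang key would be encoded to.
colliding_keys(Map) ->
    Grouped = lists:foldl(
                fun(K, Acc) ->
                        J = key_to_json(K),
                        Prev = case maps:find(J, Acc) of
                                   {ok, Ks} -> Ks;
                                   error    -> []
                               end,
                        maps:put(J, [K | Prev], Acc)
                end, #{}, maps:keys(Map)),
    [{J, Ks} || {J, Ks} <- maps:to_list(Grouped), length(Ks) > 1].
```

Calling json_key_collision:colliding_keys/1 on
#{'1.0' => a, 1.0 => b, <<"1.0">> => c, "1.0" => d} reports a single
JSON key, <<"1.0">>, claimed by all four Erlang keys. An encoder then
has to either drop three values silently or raise an error.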

This doesn't even take into account the issue that by using atoms by
default, you're actively building a source of memory leaks into the
library. This guarantees that every tutorial out there will recommend
not using the standard library for your tasks.
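
For illustration, here is one common way to avoid that leak, sketched
under the assumption that tags may stay binaries when no matching atom
is already loaded. The module name json_tag and the function tag/1 are
hypothetical; binary_to_existing_atom/2 is the real BIF. Atoms are not
garbage collected and the atom table is bounded, so a decoder that
creates a new atom for every unknown tag lets remote input exhaust it.

```erlang
-module(json_tag).
-export([tag/1]).

%% Map a JSON tag to an atom only if that atom already exists in the
%% VM; otherwise keep the tag as a binary. Untrusted input can therefore
%% never add entries to the atom table.
tag(Bin) when is_binary(Bin) ->
    try
        binary_to_existing_atom(Bin, utf8)
    catch
        error:badarg -> Bin
    end.
```

With this, json_tag:tag(<<"ok">>) yields the atom ok, while a tag the
system has never seen comes back unchanged as a binary. The cost is
that callers must handle both atoms and binaries as keys, which is
itself one of the tradeoffs a library has to document.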

What I'm getting at here is that while your scheme might work for 99% of
possible JSON -> Erlang decodings, it will do so in a risky way that
cannot be advocated.

If you consider all the possible Erlang -> JSON mappings (and this is
where the biggest problem always is), then it covers far, far less than
99% of the cases, because there is not one good way to do it (how do you
even represent binaries that are not UTF-8 and distinguish them from
actual strings? You validate the entire thing unless you want to create
unparseable JSON).
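
A minimal sketch of that validation step, assuming a hypothetical
policy where valid UTF-8 binaries become JSON strings and everything
else is base64-encoded. The module name json_string_check, the function
classify/1, and the {string, _} / {bytes, _} tags are illustrative
only; unicode:characters_to_binary/3 and base64:encode/1 are standard
library calls.

```erlang
-module(json_string_check).
-export([classify/1]).

%% Decide how a binary could be emitted as JSON. The whole binary is
%% scanned: characters_to_binary/3 returns the input unchanged when it
%% is valid UTF-8, and an {error, ...} or {incomplete, ...} tuple
%% otherwise.
classify(Bin) when is_binary(Bin) ->
    case unicode:characters_to_binary(Bin, utf8, utf8) of
        Bin -> {string, Bin};                 % safe to emit as a JSON string
        _   -> {bytes, base64:encode(Bin)}    % assumption: base64 fallback
    end.
```

Note that the base64 fallback only moves the problem: once written out,
the result is just another JSON string, so the decoder cannot tell it
apart from real text without an out-of-band convention.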

I used the words bait-and-switch and I mean it. This is one of the
points where Jose Valim and I disagree the most.

I hate, absolutely hate solutions that bring you 70% of the way (to use
a number we discussed between ourselves). Why? Because you start by
learning the 70% solution and then it doesn't work. Suddenly you have to
go out and look for the 99% and the 100% solutions out there.

Except you now have a crusty code base full of legacy, and you start
supporting one, two libraries you never envisioned. And you find
out you're married to encoders and decoders that decided to do things
differently, but you just don't have the time to fix everything right
now.

You start seeing node crashes because someone decided atoms are a good
way to receive unvalidated data without first having implemented a
scheme to GC them (say EEP-20:
http://www.erlang.org/eeps/eep-0020.html).

You start being pretty angry that a language so focused on getting
production systems live has a standard library that leaves you hanging
out to dry.

Then you get even angrier when you figure out a crapload of frameworks
in the wild all use that faulty function because it was in the standard
library.

Then you get extremely angry and leave for Go or some other
language when you figure out the community already had solutions to
these problems in the year 2014 (and before!) but they were just
overlooked because we wanted an easy JSON implementation in the stdlib.

I can't for the life of me see the benefit of canonizing a bad library
when tradeoffs are already known and worked around in the wild.

What we should focus on is explaining these tradeoffs and making it easy
to show the different options. Currently, picking a JSON lib is hard
because there is such a poor match between what you can possibly
encode in Erlang and how you can translate this back and forth with
JSON, not just because it's not in the standard library.

Not speaking about the problem doesn't make it go away; it makes it more
surprising, which is not a desirable property.

Regards,
Fred.

On 03/14, Joe Armstrong wrote:
> This is what most libraries do - they work 99% of the time.
> 
> (( the 99% is horrible - just when I think a library is great I find -
>    *but I can't do X* then I'm in trouble - but I guess this is in the
>    nature of the beast - only pure mathematical functions have the kind
>    of "platonic interfaces" that will never change - real world problems
>    with time and space involved are just plain messy ))
> 
> If we freeze the JSON design and say:
> 
>     1) floating point number will not survive a round-trip
>     2) JSON tags will be erlang atoms
>     3) Terms are not-infinite and rather small (few MB at max)
>     4) we want a parse tree
>     5) best effort at sorting out character set encodings
>     6) Pure erlang
> 
> Then the problem becomes "easy" and can be knocked out in a few lines of code
> using maps:from_list and maps:to_list.
> 
> And *because* it's easy nobody writes a library to do this.
> 
> The trouble is a beginner probably does not find this easy and would
> appreciate a library that worked 99% of the time on simple problems.
> 
> /Joe
>