[erlang-questions] No JSON/MAPS interoperability in 17.0?

Carsten Bormann cabo@REDACTED
Thu Mar 13 23:40:13 CET 2014


>> The original specification of JSON said that it was a
>> *binary* format where characters must be encoded as UTF-8.
>> RFC 7159 says that it SHALL be UTF-8, UTF-16, or UTF-32
>> but then forbids the use of a BOM!
> 
> I guess this encoding problem must be solved the same way as content type - this information has to be provided outside of content.

It’s much simpler than that: the encoding is UTF-8.  Always.

(RFC 7159 removed the language that RFC 4627 had about autodetecting the character encoding scheme from the initial bytes, in part because the old language assumed that only arrays and maps (“objects”) could be top-level, but also because UTF-16 and UTF-32 are not in real-world use for JSON.  There wasn’t enough energy to remove the fiction of UTF-16 or UTF-32 support from the document, because there was a feeling that doing so would “break” something.  Break what?  This is one of the few failings of the process that led to RFC 7159.)

The whole text:

   JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32.  The default
   encoding is UTF-8, and JSON texts that are encoded in UTF-8 are
   interoperable in the sense that they will be read successfully by the
   maximum number of implementations; there are many implementations
   that cannot successfully read texts in other encodings (such as
   UTF-16 and UTF-32).

In other words: Don’t do that (where that = UTF-16 or UTF-32).
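
As a rough illustration of what "UTF-8, always" means in practice, here is a minimal Python sketch. The helper names encode_json and decode_json are made up for this example; only the standard json module is assumed. The writer always emits UTF-8, and the reader refuses anything that does not decode and parse as UTF-8 JSON.

```python
import json


def encode_json(value) -> bytes:
    """Serialize a value to JSON bytes, always as UTF-8 (hypothetical helper)."""
    # ensure_ascii=False writes non-ASCII characters directly instead of
    # as \uXXXX escapes; either form is valid JSON.
    return json.dumps(value, ensure_ascii=False).encode("utf-8")


def decode_json(data: bytes):
    """Parse JSON bytes, assuming UTF-8 and nothing else (hypothetical helper)."""
    # Strict decoding: invalid UTF-8 raises UnicodeDecodeError, and input in
    # another encoding that happens to decode fails in json.loads instead.
    # Both exceptions are subclasses of ValueError.
    return json.loads(data.decode("utf-8"))


payload = encode_json({"greeting": "Grüße"})
print(payload)
print(decode_json(payload))
```

A UTF-16 input handed to decode_json fails with a ValueError rather than being silently guessed at, which is the behaviour the RFC text warns many implementations already have.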

More specifically, the Internet Media Type application/json does *not* have a charset parameter, because it doesn’t need one.
(Autodetection would still work, but you have to work out the details yourself.)
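
For anyone who does want to work out those details, the following Python sketch shows one possible approach. It is not the RFC 4627 table verbatim: that table looked at the first four octets and assumed the first two characters were ASCII, which no longer holds once a top-level string is allowed. The sketch relies only on the assumption that the first character of a JSON text is ASCII, and that there is no BOM. The function names are hypothetical.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a BOM-less JSON text from its leading zero bytes.

    Assumes the first character is ASCII, which holds for any JSON text
    (whitespace, or the first character of a value).
    """
    head = data[:4]
    if len(head) == 4:
        # An ASCII character in UTF-32 has three zero bytes around it.
        if head[0] == 0 and head[1] == 0 and head[2] == 0 and head[3] != 0:
            return "utf-32-be"
        if head[0] != 0 and head[1] == 0 and head[2] == 0 and head[3] == 0:
            return "utf-32-le"
    if len(head) >= 2:
        # An ASCII character in UTF-16 has one zero byte next to it.
        if head[0] == 0 and head[1] != 0:
            return "utf-16-be"
        if head[0] != 0 and head[1] == 0:
            return "utf-16-le"
    # No zero byte in the first character: treat it as UTF-8.
    return "utf-8"


def load_json_bytes(data: bytes):
    """Decode with the guessed encoding, then parse (hypothetical helper)."""
    return json.loads(data.decode(detect_json_encoding(data)))


sample = '{"name": "Grüße"}'
for enc in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
    print(enc, detect_json_encoding(sample.encode(enc)))
```

The UTF-32 checks come first because a little-endian UTF-32 ASCII character also matches the UTF-16LE pattern in its first two bytes. A BOM, if one is present anyway, is not handled here.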

Grüße, Carsten



