[erlang-questions] json_to_term EEP

Wed Jul 30 02:50:39 CEST 2008

On 29 Jul 2008, at 6:10 pm, Willem de Jong wrote:
> How about a SAX-like API?

(1) Anyone who wants such a design can produce their own design,
     AND their own code.  The EEP I am concerned with is a DVM-
     like design (Document *Value* Model).

(2) In the XML world, there are several reasons for being
     interested in SAX-like designs (why the H*LL they could
     not bring themselves to say ESIS-like, when ESIS was the
     traditional SGML model for the event stream, I cannot
     imagine, unless it was sheer NIH).

     (A) You can start processing a document without waiting for
         the end.  If people have JSON applications where they
         need to start, say, processing the properties of an
         "object" before knowing what other properties it may
         have, then such a design may be useful for them.  See
         JSON-RPC note below.

     (B) You can process a HUGE document without having to hold
         all of it in memory.  This was a major issue back in the
         days of 16-bit machines; one of the merits of Troff was
         that it produced pages "on-line", and pipelines
         involving SGML and Troff (or similar) made sense.  These
         days, there are some amazingly large RDF files around,
         so again, not having to hold the whole thing makes sense.
         If people have JSON applications where they want to send
         100s of MB of data as JSON, such a design may be useful
         for them.

         The 'man' documentation kit on Solaris works in very much
         this way:  SGML documentation => events => hacky program
         that converts element edges to Troff macros => Troff.

     (C) You may be able to filter an event stream so as to yield
         the effect of selecting (or removing) elements.  I've done
         more of this than I care to remember, piping the output of
         nsgmls (or of the SWI Prolog SGML parser) through AWK
         scripts.  Think "subset of XPath" and you'll get the idea.
         This is really a special case of (A) and (B).  People who
         have a need for filtering lengthy JSON streams and want
         to reduce latency could use such a design.
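
To make (C) concrete, here is a minimal Python sketch of filtering a
JSON event stream.  The event names and the functions events/1 and
drop_key/2 are made up for illustration; they are not taken from any
existing library.  To keep it short the events are generated from an
already-parsed value, whereas a real streaming parser would emit them
while still reading its input.

```python
import json

def events(value):
    """Yield SAX-like (kind, data) events for a parsed JSON value."""
    if isinstance(value, dict):
        yield ("start_object", None)
        for key, item in value.items():
            yield ("key", key)
            yield from events(item)
        yield ("end_object", None)
    elif isinstance(value, list):
        yield ("start_array", None)
        for item in value:
            yield from events(item)
        yield ("end_array", None)
    else:
        yield ("value", value)

def drop_key(stream, unwanted):
    """Filter an event stream, removing one key and its entire value."""
    skipping = False   # True while inside the value being removed
    depth = 0          # nesting depth within that value
    for kind, data in stream:
        if skipping:
            if kind in ("start_object", "start_array"):
                depth += 1
            elif kind in ("end_object", "end_array"):
                depth -= 1
            if depth == 0:
                skipping = False   # the removed value is complete
            continue
        if kind == "key" and data == unwanted:
            skipping = True
            depth = 0
            continue
        yield (kind, data)

doc = json.loads('{"a": 1, "secret": {"x": [1, 2]}, "b": [true]}')
kept = list(drop_key(events(doc), "secret"))
```

Note that even this small filter has to carry two pieces of state
(skipping and depth) from one event to the next.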

(3) In the functional programming world, SAX is less attractive,
     because the usual techniques for using an ESIS/SAX-like interface
     are heavily stateful.

     Once I had my Document Value Model kit, I found doing things the
     "functional" way over documents as trees was so much easier than
     doing things the ESIS/SAX-like way that I now work with entire
     forms whenever I can, and this is *C* programming I'm talking
     about, where stateful is supposed to be easy.
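
A small Python sketch of the contrast, under assumed names (widest,
WidestHandler and feed are hypothetical, not from any library).  The
task is to find the largest number of properties in any one "object".
Over a tree it is a short recursive function; over an event stream the
handler has to keep an explicit stack.

```python
def widest(value):
    """Tree style: largest property count of any object in the value."""
    if isinstance(value, dict):
        return max([len(value)] + [widest(v) for v in value.values()])
    if isinstance(value, list):
        return max([0] + [widest(v) for v in value])
    return 0

class WidestHandler:
    """Event style: the same answer, but with hand-managed state."""
    def __init__(self):
        self.stack = []   # one property counter per open object
        self.best = 0

    def start_object(self):
        self.stack.append(0)

    def key(self, name):
        self.stack[-1] += 1

    def end_object(self):
        self.best = max(self.best, self.stack.pop())

def feed(value, handler):
    """Stand-in for an event-emitting parser: drive the handler from a
    value that has already been parsed."""
    if isinstance(value, dict):
        handler.start_object()
        for name, item in value.items():
            handler.key(name)
            feed(item, handler)
        handler.end_object()
    elif isinstance(value, list):
        for item in value:
            feed(item, handler)

doc = {"a": {"x": 1, "y": 2, "z": 3}, "b": [{"p": 1}]}
handler = WidestHandler()
feed(doc, handler)
```

Both widest(doc) and handler.best give the same answer here; the
difference is in how much bookkeeping the caller has to write.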

(4) The JSON RFC (RFC 4627) makes it clear that JSON "messages", if I
     may call them that, may only be "arrays" or "objects"; a number or
     a string must be inside something else.  In cases where an ESIS/
     SAX-like interface might have made sense, it would be more usual
     using JSON to send a stream of self-contained forms that can be
     easily processed one at a time as entire things.
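
As a rough Python illustration of "a stream of self-contained forms",
the sketch below pulls complete JSON values one at a time out of a
buffer using the standard json module's JSONDecoder.raw_decode.  The
function name forms is an assumption for this example, and the input
is a string already in memory; a real receiver would read from a
socket or file and top the buffer up as needed.

```python
import json

def forms(text):
    """Yield each top-level JSON array or object found in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        # Enforce the rule described above: only arrays and objects
        # may appear at the top level.
        if not isinstance(value, (dict, list)):
            raise ValueError("top-level form must be an array or object")
        yield value

received = list(forms('{"id": 1} [1, 2]\n{"id": 2}'))
```

Each form is handed to the caller as an entire value, so it can be
processed with ordinary tree-walking code before the next one is read.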

(5) The JSON-RPC 1.1 draft (I haven't looked at 1.0) hints at some
     kind of ESIS/SAX-like interface when it says that arguments
     should be sent in such an order that the receiver can process
     them when it gets them.  How are people actually using JSON-RPC?
     Is there that much to gain, in actual practice?

(6) Not on topic, but I can't help feeling that Linux D-Bus would be
     nicer if it used JSON...

> See for example http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/.
> I can imagine that it would be easy to create any of the forms
> proposed in this thread based on such an API.

The thing is, it wouldn't be NEARLY as easy as NOT using such an API.
Several Erlang JSON implementations have been mentioned or displayed
in this thread already.  They are not particularly hard to write.
I'd say they are MUCH harder to design than to write!  And the ones
I have read would definitely have been *harder* to code using an
ESIS/SAX-like interface.

> On the other hand it would allow you to do things that you wouldn't
> be able to do with a parser that produces a complete representation
> at once (in particular: parsing very big documents), and it would be
> better suited to support a 'data mapper' approach like the Erlang
> ASN.1 implementation, Google's Protocol Buffers or erlsom.

The question is whether the things that an ESIS/SAX-like interface
let you do are things that people particularly *want* to do with JSON.
I have no idea.

The world has room for both "value" interfaces and "event stream"
interfaces.

Obviously an ESIS-like interface is possible
because we can trivially map JSON to XML:

	number =>	<number value="numeric string"/>
	string =>	<string value="string"/>
	array  =>	<array>e1...en</array>
	object =>	<object><slot name="n1">e1</slot>...</object>

So a JSON parser could simply emit the same event stream
(using *precisely* a SAX interface) as an XML parser
*would* have emitted given the equivalent XML.
That is, you would not have a new *interface*, just a new
*parser* that reused your existing "SAX" interface.
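
A minimal Python sketch of that idea, using the ContentHandler
interface from the standard xml.sax package as the "existing SAX
interface".  The names emit and Recorder are invented for the example.
The mapping above says nothing about true, false and null, so this
sketch rejects them rather than guess at element names, and it walks
an already-parsed value instead of parsing incrementally.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

def emit(value, handler):
    """Report a JSON value to a SAX ContentHandler as the XML above."""
    if isinstance(value, bool) or value is None:
        # Not covered by the mapping; refusing is an assumption made here.
        raise ValueError("true, false and null are not in the mapping")
    if isinstance(value, (int, float)):
        handler.startElement("number", AttributesImpl({"value": str(value)}))
        handler.endElement("number")
    elif isinstance(value, str):
        handler.startElement("string", AttributesImpl({"value": value}))
        handler.endElement("string")
    elif isinstance(value, list):
        handler.startElement("array", AttributesImpl({}))
        for item in value:
            emit(item, handler)
        handler.endElement("array")
    elif isinstance(value, dict):
        handler.startElement("object", AttributesImpl({}))
        for name, item in value.items():
            handler.startElement("slot", AttributesImpl({"name": name}))
            emit(item, handler)
            handler.endElement("slot")
        handler.endElement("object")
    else:
        raise TypeError("not a JSON value: %r" % (value,))

class Recorder(ContentHandler):
    """An ordinary SAX handler that just records what it is told."""
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

recorder = Recorder()
emit({"n": [1, "a"]}, recorder)
```

Any handler written for XML input could be passed in place of
Recorder; nothing in it knows the events came from JSON.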