[erlang-questions] json_to_term EEP
Richard A. O'Keefe
ok@REDACTED
Wed Jul 30 02:50:39 CEST 2008
On 29 Jul 2008, at 6:10 pm, Willem de Jong wrote:
> How about a SAX-like API?
(1) Anyone who wants such a design can produce their own design,
AND their own code. The EEP I am concerned with is a
DVM-like design (Document *Value* Model).
(2) In the XML world, there are several reasons for being
interested in SAX-like designs (why the H*LL they could
not bring themselves to say ESIS-like, when ESIS was the
traditional SGML model for the event stream, I cannot
imagine, unless it was sheer NIH).
(A) You can start processing a document without waiting for
the end. If people have JSON applications where they
need to start, say, processing the properties of an
"object" before knowing what other properties it may
have, then such a design may be useful for them. See
JSON-RPC note below.
(B) You can process a HUGE document without having to hold
all of it in memory. This was a major issue back in the
days of 16-bit machines; one of the merits of Troff was
that it produced pages "on-line", and pipelines
involving SGML and Troff (or similar) made sense. These
days, there are some amazingly large RDF files around,
so again, not having to hold the whole thing makes sense.
If people have JSON applications where they want to send
100s of MB of data as JSON, such a design may be useful
for them.
The 'man' documentation kit on Solaris works in very much
this way: SGML documentation => events => hacky program
that converts element edges to Troff macros => Troff.
(C) You may be able to filter an event stream so as to yield
the effect of selecting (or removing) elements. I've done
more of this than I care to remember piping the output of
nsgmls (or of the SWI Prolog SGML parser) through AWK
scripts. Think "subset of XPath" and you'll get the idea.
This is really a special case of (A) and (B). People who
have a need for filtering lengthy JSON streams and want
to reduce latency could use such a design.
(3) In the functional programming world, SAX is less attractive,
because the usual techniques for using an ESIS/SAX-like interface
are heavily stateful.
Once I had my Document Value Model kit, I found doing things the
"functional" way over documents as trees was so much easier than
doing things the ESIS/SAX-like way that I now work with entire
forms whenever I can, and this is *C* programming I'm talking
about, where stateful is supposed to be easy.
(4) The JSON RFC makes it clear that JSON "messages", if I may call
them that, may only be "arrays" or "objects"; a number or a
string must be inside something else. In cases where an
ESIS/SAX-like interface might have made sense, it would be more usual
using JSON to send a stream of self-contained forms that can be
easily processed one at a time as entire things.
(5) The JSON-RPC 1.1 draft (I haven't looked at 1.0) hints at some
kind of ESIS/SAX-like interface when it says that arguments
should be sent in such an order that the receiver can process
them when it gets them. How are people actually using JSON-RPC?
Is there that much to gain, in actual practice?
(6) Not on topic, but I can't help feeling that Linux D-Bus would be
nicer if it used JSON...
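
Point (4) above, a stream of self-contained forms handled one at a time as entire things, might look roughly like this. It is only a sketch in Python, assuming the forms arrive concatenated in a single text buffer with optional whitespace between them; the helper name `iter_forms` is made up for illustration, and the standard-library call doing the real work is `json.JSONDecoder.raw_decode`.

```python
import json

def iter_forms(text):
    """Yield each top-level JSON value found in `text`, one at a time."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos] in " \t\r\n":
            pos += 1
        if pos == end:
            break
        # Decode exactly one complete value and learn where it stopped.
        value, pos = decoder.raw_decode(text, pos)
        yield value

# Three self-contained forms sent back to back (example data, invented).
stream = '{"id": 1, "op": "add"}\n[1, 2, 3]\n{"id": 2, "op": "del"}'
forms = list(iter_forms(stream))
```

Each form is handed over as a complete value, so the receiver gets bounded memory use per message without needing an event-stream interface.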
> See for example http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/
> . I can imagine that it would be easy to create any of the forms
> proposed in this thread based on such an API.
The thing is, it wouldn't be NEARLY as easy as NOT using such an API.
Several Erlang JSON implementations have been mentioned or displayed
in this thread already. They are not particularly hard to write.
I'd say they are MUCH harder to design than to write! And the ones
I have read would definitely have been *harder* to code using an
ESIS/SAX-like interface.
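
To make the comparison concrete, here is roughly what rebuilding a value from an event stream involves. This is a sketch only, in Python; the event names (`start_object`, `key`, `value`, and so on) and the function `build` are invented for illustration and are not taken from the p6r parser or from any implementation mentioned in this thread.

```python
def build(events):
    """Rebuild a value from (kind, payload) events; event names are made up."""
    stack = []     # containers that have been opened but not yet closed
    keys = []      # pending key for each open container (None for arrays)
    result = None

    def emit(value):
        # Attach a finished value to whatever container is currently open.
        nonlocal result
        if not stack:
            result = value
        elif isinstance(stack[-1], list):
            stack[-1].append(value)
        else:
            stack[-1][keys[-1]] = value

    for kind, payload in events:
        if kind == "start_array":
            stack.append([])
            keys.append(None)
        elif kind == "start_object":
            stack.append({})
            keys.append(None)
        elif kind == "key":
            keys[-1] = payload
        elif kind == "value":
            emit(payload)
        elif kind in ("end_array", "end_object"):
            done = stack.pop()
            keys.pop()
            emit(done)
        else:
            raise ValueError("unknown event: %r" % (kind,))
    return result

# Events a hypothetical parser might produce for {"a": [1, 2]}.
events = [
    ("start_object", None), ("key", "a"),
    ("start_array", None), ("value", 1), ("value", 2), ("end_array", None),
    ("end_object", None),
]
rebuilt = build(events)
```

The explicit stack and the pending-key bookkeeping are exactly the state that a direct recursive parser gets for free from its own call stack.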
> On the other hand it would allow you to do things that you wouldn't
> be able to do with a parser that produces a complete representation
> at once (in particular: parsing very big documents), and it would be
> better suited to support a 'data mapper' approach like the Erlang
> ASN.1 implementation, Google's Protocol Buffers or erlsom.
The question is whether the things that an ESIS/SAX-like interface
lets you do are things that people particularly *want* to do with JSON.
I have no idea.
The world has room for both "value" interfaces and "event stream"
interfaces.
Obviously an ESIS-like interface is possible
because we can trivially map JSON to XML:
number => <number value="numeric string"/>
string => <string value="string"/>
array => <array>e1...en</array>
object => <object><slot name="n1">e1</slot>...</object>
So a JSON parser could simply emit the same event stream
(using *precisely* a SAX interface) as an XML parser
*would* have emitted given the equivalent XML.
That is, you would not have a new *interface*, just a new
*parser* that reused your existing "SAX" interface.
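
The mapping above can be sketched as a function that drives an ordinary SAX-style handler. This is an illustration in Python rather than a real streaming parser: it decodes the whole text with `json.loads` first and only shows the shape of the event stream. The names `emit_events` and `Recorder` are hypothetical, a plain dict stands in for the SAX attributes object, and since the mapping lists no element for true, false or null, the sketch simply rejects them.

```python
import json
from xml.sax.handler import ContentHandler

def emit_events(value, handler):
    """Walk a decoded JSON value, calling SAX-style element callbacks."""
    if isinstance(value, bool) or value is None:
        # The mapping in the text gives no element for true/false/null.
        raise TypeError("no mapping given for %r" % (value,))
    if isinstance(value, (int, float)):
        handler.startElement("number", {"value": repr(value)})
        handler.endElement("number")
    elif isinstance(value, str):
        handler.startElement("string", {"value": value})
        handler.endElement("string")
    elif isinstance(value, list):
        handler.startElement("array", {})
        for item in value:
            emit_events(item, handler)
        handler.endElement("array")
    elif isinstance(value, dict):
        handler.startElement("object", {})
        for name, item in value.items():
            handler.startElement("slot", {"name": name})
            emit_events(item, handler)
            handler.endElement("slot")
        handler.endElement("object")
    else:
        raise TypeError("not a JSON value: %r" % (value,))

class Recorder(ContentHandler):
    """Records the element events it receives, for inspection."""
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs)))

    def endElement(self, name):
        self.events.append(("end", name))

recorder = Recorder()
emit_events(json.loads('{"n": [1, "x"]}'), recorder)
```

Any handler written against the `startElement`/`endElement` callbacks could be passed in place of `Recorder`, which is the sense in which only the parser is new and the interface is reused.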