[erlang-questions] fast JSON parser in C
Wed Jul 30 03:26:37 CEST 2008
2008/7/25 Chris Anderson <jchris@REDACTED>
> On Thu, Jul 24, 2008 at 6:56 PM, Bob Ippolito <bob@REDACTED> wrote:
> > I'd be curious to know if leex/yecc can do any better than mochijson2
> > (which is written by hand), especially considering that it uses
> > binaries instead of strings.
> The version of mochijson2 that is in CouchDB's trunk (not in use) is
> about twice as fast as cjson - which is used by CouchDB currently.
> After spending a day writing my own leex/yecc parser, it turns out to
> be about 3 times slower than cjson, and about 6 times slower than
> mochijson2. I could probably optimize the grammar definition to try to
> make it faster. I was hoping that the magic of parser-generators would
> give me a big jump on the hand-crafted code. Since it didn't, I don't
> plan to pursue it any further.
> If anyone wants to see the leex/yecc code, I can put it online.
I have looked at Chris's code and done some optimising of the leex .xrl
file. It now runs about 3.5 times faster than before and is only about 20%
slower than mochijson2, at least for a limited range of input files. Note
that Chris's code uses lists for input as well as for strings, while the
mochiweb code uses binaries. How much this affects the speed I don't know.
The difference is *much* less than I expected, and the leex/yecc code is
*much* *much* easier to read and understand. If I were doing it I might
consider handwriting the scanner and using the yecc generated parser. One
benefit of the leex generated scanner is that it is re-entrant, which means
it can easily handle input coming in chunks, as well as being directly
usable in the i/o system.
The yecc grammar is trivial.
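To illustrate what re-entrancy buys you when input arrives in chunks, here is a rough sketch. It is written in Python rather than Erlang, it is not leex's generated code or API, and the names ChunkScanner, feed and scan_chunks are made up for this example. The token patterns are deliberately simplified (numbers are not validated). The point is only that the scanner keeps the unfinished tail of one chunk and resumes from it when the next chunk arrives.

```python
import re

# Simplified JSON-like token patterns (assumption: good enough for a demo,
# not a validating JSON lexer).
TOKEN_RE = re.compile(r"""
    (?P<ws>[ \t\r\n]+)
  | (?P<punct>[{}\[\],:])
  | (?P<string>"(?:[^"\\]|\\.)*")
  | (?P<number>[-0-9][-+0-9.eE]*)
  | (?P<literal>true|false|null)
""", re.VERBOSE)


class ChunkScanner:
    """Hypothetical re-entrant scanner: state survives between feed() calls."""

    def __init__(self):
        self.pending = ""  # unscanned tail left over from earlier chunks

    def feed(self, chunk, final=False):
        """Return the complete tokens found so far; keep any partial token."""
        buf = self.pending + chunk
        pos = 0
        tokens = []
        while pos < len(buf):
            m = TOKEN_RE.match(buf, pos)
            if m is None:
                break  # may be the start of a token that ends in a later chunk
            if m.lastgroup == "number" and m.end() == len(buf) and not final:
                break  # more digits may still arrive
            if m.lastgroup != "ws":
                tokens.append((m.lastgroup, m.group()))
            pos = m.end()
        self.pending = buf[pos:]
        if final and self.pending:
            raise ValueError("cannot scan: %r" % self.pending)
        return tokens


def scan_chunks(chunks):
    """Feed every chunk in order, then flush whatever is still pending."""
    scanner = ChunkScanner()
    tokens = []
    for chunk in chunks:
        tokens.extend(scanner.feed(chunk))
    tokens.extend(scanner.feed("", final=True))
    return tokens
```

Splitting the input at arbitrary points, even in the middle of a string or a number, should give the same token list as scanning it in one piece, which is the property that makes such a scanner usable on streamed input.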
As yet leex cannot generate a scanner which will work directly on binaries,
but this is no real problem but will unfortunately result in code almost
Parse tools rule,