[erlang-questions] fast JSON parser in C

Robert Virding rvirding@REDACTED
Wed Jul 30 03:26:37 CEST 2008


2008/7/25 Chris Anderson <jchris@REDACTED>

> On Thu, Jul 24, 2008 at 6:56 PM, Bob Ippolito <bob@REDACTED> wrote:
> > I'd be curious to know if leex/yecc can do any better than mochijson2
> > (which is written by hand), especially considering that it uses
> > binaries instead of strings.
>
> The version of mochijson2 that is in CouchDB's trunk (not in use) is
> about twice as fast as cjson - which is used by CouchDB currently.
>
> After spending a day writing my own leex/yecc parser, it turns out to
> be about 3 times slower than cjson, and about 6 times slower than
> mochijson2. I could probably optimize the grammar definition to try to
> make it faster. I was hoping that the magic of parser-generators would
> give me a big jump on the hand-crafted code. Since it didn't, I don't
> plan to pursue it any further.
>
> If anyone wants to see the leex/yecc code, I can put it online.


I have looked at Chris's code and done some optimising of the leex .xrl
file. It now goes about 3.5 times faster than before and is only about 20%
slower than mochijson2, at least for a limited range of input files. Note,
though, that Chris uses lists both for the input and for strings, while the
mochiweb code uses binaries. How much this affects the speed I don't know.

The difference is *much* less than I expected, and the leex/yecc version is
*much* *much* easier to read and understand. If I were doing it I might
consider handwriting the scanner and using the yecc-generated parser. One
benefit of the leex-generated scanner is that it is re-entrant, which means
it can easily handle input coming in chunks, as well as being directly
usable in the i/o system.

The yecc grammar is trivial.
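
To give an idea of how small it is, here is a rough sketch of what a yecc
grammar for JSON can look like. This is not Chris's actual file. The file
name, the token shapes ({Category, Line, Value} for strings and numbers,
{Category, Line} for everything else) and the {obj, Members} wrapper for
objects are all assumptions made for the example.

```erlang
%% json_parser.yrl (hypothetical file name)
%% Assumes the scanner delivers tokens of the form
%%   {string, Line, Value}, {number, Line, Value},
%%   {'{', Line}, {'}', Line}, {'[', Line}, {']', Line},
%%   {',', Line}, {':', Line}, {true, Line}, {false, Line}, {null, Line}.

Nonterminals value object members pair array elements.
Terminals '{' '}' '[' ']' ',' ':' string number true false null.
Rootsymbol value.

%% A value is an object, an array or one of the literal tokens.
value -> object : '$1'.
value -> array  : '$1'.
value -> string : element(3, '$1').
value -> number : element(3, '$1').
value -> true   : true.
value -> false  : false.
value -> null   : null.

%% Objects become {obj, [{Key, Value}]} (an assumed representation).
object -> '{' '}'         : {obj, []}.
object -> '{' members '}' : {obj, '$2'}.

members -> pair             : ['$1'].
members -> pair ',' members : ['$1' | '$3'].

pair -> string ':' value : {element(3, '$1'), '$3'}.

%% Arrays become plain Erlang lists.
array -> '[' ']'          : [].
array -> '[' elements ']' : '$2'.

elements -> value              : ['$1'].
elements -> value ',' elements : ['$1' | '$3'].
```

Running yecc:file/1 on such a file generates an Erlang module exporting
parse/1, which takes the list of tokens produced by the scanner.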

As yet leex cannot generate a scanner which works directly on binaries.
Adding this is no real problem, but it will unfortunately result in almost
duplicated code.

Parse tools rule,

Robert