[erlang-questions] erljson - a fast JSON parser in Erlang

Paulo Sérgio Almeida psa@REDACTED
Tue Jul 29 01:02:58 CEST 2008


Hi all,

now that I have caught your attention with the subject ... ;)


Given the sudden interest in JSON, I am making available yet another 
JSON parser I wrote some time ago. I never ended up polishing it, but 
considering the risk of forgetting about it for some months ... again, 
here it goes as is.

I wanted something fast rather than fancy, which decodes JSON into 
terms I can store directly in my data structures, avoiding further 
post-processing of the decoded terms. My design decisions were:

- use [] for arrays;

- use lists of pairs for objects:
   - the empty object is distinguished from the empty array using a 
special case [{}] for the empty object; empty objects are rare (at least 
for my purposes I never used them) and even with this representation one 
can still use things like lists:keysearch or list comprehensions such as 
[K || {K,_} <- Object];
   - objects are distinguishable from arrays; e.g.

is_object([T|_]) when is_tuple(T) -> true;
is_object(_) -> false.

   - decoded objects are ordered by key; this imposes a negligible 
overhead and is useful when needed (I use it frequently); the choice is 
debatable, but it is a single line of code and easy to change;
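
A minimal sketch of the representation described above, not taken from 
erljson.erl. Only is_object/1 comes from this message; the module name 
and the other functions (is_empty_object/1, get/2, from_pairs/1) are 
hypothetical and only illustrate how [{}] and key ordering might be 
handled.

```erlang
-module(json_repr_sketch).
-export([is_object/1, is_empty_object/1, get/2, from_pairs/1]).

%% Objects are non-empty lists whose first element is a tuple;
%% [{}] stands for the empty object, [] for the empty array.
is_object([T|_]) when is_tuple(T) -> true;
is_object(_) -> false.

is_empty_object([{}]) -> true;
is_empty_object(_) -> false.

%% Look up Key in an object. The {} marker of an empty object never
%% matches, because lists:keyfind/3 skips tuples that are too small.
get(Key, Object) ->
    case lists:keyfind(Key, 1, Object) of
        {Key, Value} -> {ok, Value};
        false -> error
    end.

%% Build an object from decoded pairs, ordered by key (hypothetical
%% helper; the real parser may sort differently).
from_pairs([]) -> [{}];
from_pairs(Pairs) -> lists:keysort(1, Pairs).
```

Since the pairs are ordinary 2-tuples, the usual lists:key* functions 
and list comprehensions work on them unchanged.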

- strings are decoded to existing atoms, or binaries otherwise;
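
One way the "existing atom, else binary" rule could be written, as a 
sketch only. It uses binary_to_existing_atom/2 from current OTP 
releases, which may not be what erljson.erl does; the module and 
function names are hypothetical.

```erlang
-module(json_string_sketch).
-export([to_term/1]).

%% Return the atom if one with this name already exists in the VM,
%% otherwise keep the binary. No new atoms are ever created, so
%% untrusted input cannot fill the atom table.
to_term(Bin) when is_binary(Bin) ->
    try
        binary_to_existing_atom(Bin, utf8)
    catch
        %% Any failure is treated as "no such atom".
        error:_ -> Bin
    end.
```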

- I was not interested in string processing, only in storing strings to 
send them later on; therefore I did not worry about char sets or about 
decoding into code points; the parser is char set agnostic and can 
operate on ASCII, ISO Latin-* or UTF-8;
   - but if \uxxxx is used in the input, it assumes UTF-8 and decodes the 
escape to UTF-8 with its own binary-based functions, avoiding general 
UTF processing libraries.
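
A sketch of how a \uxxxx escape might be turned into UTF-8 bytes using 
only the bit syntax. This is an illustration, not the code in 
erljson.erl; the names are hypothetical, and surrogate pairs (code 
points above 16#FFFF written as two escapes) are not handled.

```erlang
-module(json_unicode_sketch).
-export([unescape/1, utf8/1]).

%% Turn the four hex digits that follow "\u" into UTF-8 bytes.
unescape(<<Hex:4/binary>>) ->
    utf8(binary_to_integer(Hex, 16)).

%% Hand-rolled UTF-8 encoding of a code point below 16#10000.
utf8(C) when C >= 0, C < 16#80 ->
    <<C>>;
utf8(C) when C >= 16#80, C < 16#800 ->
    <<2#110:3, (C bsr 6):5,
      2#10:2, (C band 16#3F):6>>;
utf8(C) when C >= 16#800, C < 16#10000 ->
    <<2#1110:4, (C bsr 12):4,
      2#10:2, ((C bsr 6) band 16#3F):6,
      2#10:2, (C band 16#3F):6>>.
```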

- fast encoding to iolists; special care is taken to check whether any 
escaping is needed, so that binaries in the structure being encoded can 
be reused as they are. (A sample test on a small example consisting 
mostly of objects gave encoding about 4 times faster than mochijson2; 
decoding was about the same speed.)
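
The "check first, reuse the binary" idea can be sketched as follows. 
This is an assumption about the approach, not the actual erljson.erl 
code, and all names are hypothetical.

```erlang
-module(json_encode_sketch).
-export([encode_string/1]).

%% If nothing needs escaping, the original binary goes into the
%% iolist untouched; only otherwise is it walked byte by byte.
encode_string(Bin) when is_binary(Bin) ->
    case needs_escape(Bin) of
        false -> [$", Bin, $"];
        true  -> [$", escape(Bin), $"]
    end.

needs_escape(<<>>) -> false;
needs_escape(<<C, _/binary>>)
  when C =:= $"; C =:= $\\; C < 16#20 -> true;
needs_escape(<<_, Rest/binary>>) -> needs_escape(Rest).

escape(Bin) -> [escape_char(C) || <<C>> <= Bin].

escape_char($")  -> "\\\"";
escape_char($\\) -> "\\\\";
escape_char($\n) -> "\\n";
escape_char($\r) -> "\\r";
escape_char($\t) -> "\\t";
escape_char(C) when C < 16#20 ->
    %% Remaining control characters become \u00XX.
    io_lib:format("\\u~4.16.0B", [C]);
escape_char(C) -> C.
```

In the common case the result is a three-element iolist that shares 
the input binary, so no bytes are copied until the iolist is written 
out.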

- encoding is liberal about the object keys it accepts: integers, 
floats, atoms, lists and binaries; it still produces valid JSON, in 
which keys are always strings;
   - as a consequence, the current version may not round-trip 
Erlang -> JSON -> Erlang, since object keys are only decoded to atoms 
or binaries; this could possibly change; some thought and polishing needed.
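
To illustrate the round-trip issue, here is a sketch of a liberal key 
conversion. The module and function are hypothetical and the float 
formatting is an arbitrary choice; erljson.erl may format keys 
differently.

```erlang
-module(json_key_sketch).
-export([key_to_binary/1]).

%% Accept several Erlang types as object keys and turn each into
%% the text of a JSON string (escaping would still be applied later).
key_to_binary(K) when is_binary(K)  -> K;
key_to_binary(K) when is_atom(K)    -> atom_to_binary(K, utf8);
key_to_binary(K) when is_integer(K) -> integer_to_binary(K);
key_to_binary(K) when is_float(K)   ->
    iolist_to_binary(io_lib:format("~p", [K]));
key_to_binary(K) when is_list(K)    -> iolist_to_binary(K).
```

With such a conversion the integer key 1 is written as the JSON string 
"1", which a decoder that only yields atoms or binaries reads back as 
<<"1">> (or an atom, if one of that name exists), not as 1.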

That's it. Comments and suggestions are welcome. It has not been tested 
much, and bugs may remain. I attach it here. I can also make it 
available elsewhere. Trapexit?

Regards,
Paulo


-------------- next part --------------
A non-text attachment was scrubbed...
Name: erljson.erl
Type: text/x-erlang
Size: 11622 bytes
Desc: not available
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20080729/c1d27e8a/attachment.bin>

