[erlang-questions] Re: [neotoma] [ANN] Neotoma 1.5

Sean Cribbs seancribbs@REDACTED
Tue Mar 15 13:58:14 CET 2011


One thing I neglected to emphasize in the announcement was that where
Neotoma's parsing functions would return lists or even single
characters before, they now return binaries. Some grammars will
require drastic changes to accommodate this difference.  My apologies
to those who tried the new version and had difficulties (e.g. Cliff).
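
To illustrate the kind of change involved, here is a minimal sketch of
a helper that turns a matched run of digits into an integer. The
module and function names are hypothetical and not part of Neotoma;
the only assumption is that the matched node used to arrive as a
(possibly nested) list of characters and now arrives as a binary or an
iolist of binaries.

```erlang
-module(int_transform).
-export([to_integer_old/1, to_integer_new/1]).

%% Before 1.5 (assumed shape): Node is a possibly nested list of
%% characters, e.g. "42" or ["4", "2"].
to_integer_old(Node) ->
    list_to_integer(lists:flatten(Node)).

%% From 1.5 on (assumed shape): Node is a binary or an iolist of
%% binaries, e.g. <<"42">> or [<<"4">>, <<"2">>].
to_integer_new(Node) ->
    list_to_integer(binary_to_list(iolist_to_binary(Node))).
```

Because a plain string is also a valid iolist, the second form accepts
both shapes, which may ease migrating a grammar piece by piece.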

Regarding Tony's question about future performance improvements, I
think the next one is obvious: although switching to binaries helped
reduce copying, it can be reduced further by no longer memoizing the
input in its various states of consumption and instead scanning and
splitting the single input binary when fetching a memoized result (h/t
Ben Black).  This will require adding another counter to the "Index"
tuple that keeps the raw byte-offset.  I also want to investigate using
the process dictionary a bit more instead of ETS, or at least to
benchmark the two approaches against each other.
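
The byte-offset idea could look roughly like the sketch below. It is
not Neotoma's implementation: the three-element index shape, the
module name, the memo key layout and the contract of Fun are all
assumptions made for illustration. The point is only that the memo
table stores results keyed by offset, and the remaining input is
recovered by slicing the one original binary.

```erlang
-module(offset_memo).
-export([rest/2, memoize/3]).

%% Hypothetical extended index: the third element is the proposed
%% raw byte offset into the original input.
-type index() :: {{line, pos_integer()},
                  {column, pos_integer()},
                  {offset, non_neg_integer()}}.

%% Recover the unconsumed input by slicing the original binary.
%% The match yields a sub-binary referencing Input, not a copy.
-spec rest(binary(), index()) -> binary().
rest(Input, {_, _, {offset, Offset}}) ->
    <<_:Offset/binary, Rest/binary>> = Input,
    Rest.

%% Memoize the outcome of Rule at Index in the process dictionary.
%% Fun is called with Index on a miss (assumed contract); only its
%% return value is stored, never the remaining input.
-spec memoize(atom(), index(), fun((index()) -> term())) -> term().
memoize(Rule, {_, _, {offset, Offset}} = Index, Fun) ->
    Key = {memo, Rule, Offset},
    case get(Key) of
        {cached, Result} ->
            Result;
        undefined ->
            Result = Fun(Index),
            put(Key, {cached, Result}),
            Result
    end.
```

Swapping get/put for ets:lookup/ets:insert on a private table would
give the other side of the comparison, and both variants could be
timed with timer:tc.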

I've also been thinking about what should happen for Neotoma 2.0,
although as you can see by the release schedule over the last year, it
has not been high enough on my priorities.  Nevertheless, here's my
wishlist:

1) Generate an AST from the metagrammar/grammar parser instead of raw
code, decoupling parsing from code generation.
2) Create some EQC generators that can use the AST of your grammar to
provide automated parser tests.
3) Allow code generation into multiple languages from the AST. One
could generate Erlang, LFE, Reia, Efene, or any language for that
matter. I'm thinking this is what I REALLY wanted to do when I
experimented with parse_transform early on.
4) Let the grammar designer select whether to use binaries or lists as
output of parsing functions.
5) Create analysis tools for the grammar AST that can discover
problems and suggest optimizations.
6) Create a "regular expression" terminal in the metagrammar that can
more succinctly capture strings which currently require combinations
of string literals, character classes and repetition operators.
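
As a rough picture of items 1 and 5, a grammar AST could be plain
tagged tuples that analysis tools walk before any code is generated.
The node shapes and the module below are hypothetical and are not
Neotoma's; the sketch only shows one simple check, namely finding
nonterminals that are referenced but never defined.

```erlang
-module(peg_ast).
-export([undefined_rules/1]).

%% Hypothetical AST shapes (assumed for this sketch):
%%   {rule, Name, Expr}
%%   {seq, [Expr]} | {choice, [Expr]}
%%   {star, Expr} | {plus, Expr} | {optional, Expr}
%%   {nonterminal, Name} | {string, Binary} | {charclass, Binary}

%% Return the sorted, de-duplicated names of nonterminals that are
%% referenced somewhere in Rules but have no {rule, Name, _} entry.
undefined_rules(Rules) ->
    Defined = [Name || {rule, Name, _} <- Rules],
    Referenced = lists:flatmap(fun({rule, _, Expr}) -> refs(Expr) end,
                               Rules),
    lists:usort(Referenced) -- Defined.

%% Collect every nonterminal name mentioned inside an expression.
refs({nonterminal, Name}) -> [Name];
refs({seq, Exprs})        -> lists:flatmap(fun refs/1, Exprs);
refs({choice, Exprs})     -> lists:flatmap(fun refs/1, Exprs);
refs({star, Expr})        -> refs(Expr);
refs({plus, Expr})        -> refs(Expr);
refs({optional, Expr})    -> refs(Expr);
refs({string, _})         -> [];
refs({charclass, _})      -> [].
```

The same tuples could, in principle, be handed to a code generator for
any target language or to a test-data generator, which is the
decoupling the list above is after.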

Thanks for using Neotoma, and as always I'll gladly accept patches to
implement any of the above.

Cheers,

Sean

