[erlang-questions] BNF/EBNF Grammar for Erlang

Joe Armstrong erlang@REDACTED
Sat Nov 12 18:44:50 CET 2011


If you dig deeper you have two alternatives:

1) Learn yecc (it's like bison/yacc etc.). Yacc grammars are
   easy (ish) to read (once you understand the syntax) but difficult
   to debug if you get them wrong.

2) Search for an EBNF/BNF/PEG grammar for Erlang

Of the two 1) is the quickest alternative. I have never ever seen
a complete type 2) grammar for Erlang - subsets yes - complete grammars no.
It would take a considerable amount of work to
make a type 2) grammar for Erlang and even if you found one you would never
know if the grammar described the same language
as a type 1) grammar - equivalence of grammars is undecidable
in general.

If you want to play with the parser the following function is
useful:

string2exprs(Str) ->
    case erl_scan:tokens([], Str ++ ". ", 1) of
        {done, {ok, Toks, _}, []} ->
            case erl_parse:parse_exprs(Toks) of
                {ok, Exprs} ->
                    {ok, Exprs};
                {error, {Line, Mod, Arg}} ->
                    EStr = io_lib:format("~s", [apply(Mod, format_error, [Arg])]),
                    Msg = lists:flatten(EStr),
                    io:format("~n***PARSE ERROR in line:~w ~s~n", [Line, Msg]),
                    io:format("Str=~s~n", [Str]),
                    error
            end;
        Other ->
            io:format("~n***SCAN ERROR:~p~n", [Other]),
            error
    end.

This is a very simple interface to the generated parser. So for example, if
you add this to the module mymod you can run it like this:

> mymod:string2exprs("case foo(X) of 1 -> sqrt(Y) end").
{ok,[{'case',1,
             {call,1,{atom,1,foo},[{var,1,'X'}]},
             [{clause,1,
                      [{integer,1,1}],
                      [],
                      [{call,1,{atom,1,sqrt},[{var,1,'Y'}]}]}]}]}

If you read the above code you'll see how to get from the
world of strings to tokens using erl_scan:tokens, and from tokens
to parse trees using erl_parse:parse_exprs.
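
To make the two steps easier to see in isolation, here is a minimal
sketch that splits them into separate functions. The module name
scan_parse_demo and its function names are made up for illustration.
It uses erl_scan:string/1 rather than erl_scan:tokens/3 (an assumption
made for brevity: the whole input is available at once, so no
continuation is needed), and it simply crashes on bad input instead of
printing an error. The last function also hands the parse tree to
erl_eval, which is one thing you can do with the result.

```erlang
-module(scan_parse_demo).
-export([tokens/1, exprs/1, eval/1]).

%% Step 1: string -> tokens.
%% The parser expects the token list to end with a dot, so one is appended.
tokens(Str) ->
    {ok, Toks, _EndLoc} = erl_scan:string(Str ++ "."),
    Toks.

%% Step 2: tokens -> list of expressions in the abstract format.
exprs(Str) ->
    {ok, Exprs} = erl_parse:parse_exprs(tokens(Str)),
    Exprs.

%% Step 3 (optional): evaluate the parse tree with no variable bindings.
eval(Str) ->
    {value, Value, _Bindings} =
        erl_eval:exprs(exprs(Str), erl_eval:new_bindings()),
    Value.
```

For example, scan_parse_demo:eval("1 + 2 * 3") returns 7, and
scan_parse_demo:tokens("1 + 2") shows the intermediate token list.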

The module erl_parse is automatically generated from the grammar.

The above term is built by the mysterious right-hand sides that follow
the colons in the productions. The production responsible here is at
lines 378-380 of the grammar, i.e.:

case_expr -> 'case' expr 'of' cr_clauses 'end' :
	{'case',?line('$1'),'$2','$4'}.
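
A small sketch of how that right-hand side shows up in the result: the
term has the shape {'case', Line, Expr, Clauses}, where Expr is what
the production calls '$2' and Clauses is '$4'. The module and function
names below are made up for illustration, and the code assumes the
input is a single case expression (anything else fails the match).

```erlang
-module(case_shape).
-export([parts/1]).

%% Parse a single case expression and pull out the pieces that the
%% production's action put into the tuple.
parts(Str) ->
    {ok, Toks, _} = erl_scan:string(Str ++ "."),
    {ok, [{'case', _Line, Expr, Clauses}]} = erl_parse:parse_exprs(Toks),
    %% Expr    corresponds to '$2' (the expr between 'case' and 'of')
    %% Clauses corresponds to '$4' (the cr_clauses before 'end')
    {Expr, length(Clauses)}.
```

Calling case_shape:parts("case foo(X) of 1 -> sqrt(Y) end") gives back
the parse tree of foo(X) together with a clause count of 1.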


Have fun

/Joe


On Fri, Nov 11, 2011 at 9:56 PM, Ryan Molden <ryanmolden@REDACTED> wrote:

> Great, thanks, I will give the YRL a shot (ignoring everything to the
> right of the ':' may really be all that is needed; it wasn't clear to me if
> that was somehow essential in expressing what the YRL was trying to
> represent).
>
> Ryan
>
> On Fri, Nov 11, 2011 at 12:38 PM, Joe Armstrong <erlang@REDACTED> wrote:
>
>> I don't think there are any BNF/EBNF grammars for Erlang;
>> there might be grammars for subsets of the language but not the
>> entire language.  The problem with BNF/EBNF/PEG grammars
>> is that decent error reporting is very difficult. In practice
>> virtually all languages use hand-written recursive descent parsers
>> or LALR(1) parsers for which acceptable error recovery strategies exist.
>>
>> The official grammar is: (official means this is the actual grammar used
>> by the erlang compiler and all tools)
>>
>> https://github.com/erlang/otp/blob/master/lib/stdlib/src/erl_parse.yrl
>>
>> If you look at the productions they are very similar to yacc
>> productions:
>>
>> For example a tuple is defined like this (line 330,331)
>>
>> tuple -> '{' '}'       : {tuple,?line('$1'),[]}.
>> tuple -> '{' exprs '}' : {tuple,?line('$1'),'$2'}.
>>
>> If you forget about the stuff after the ':' this
>> reads
>>
>>    tuple -> '{' '}'
>>    tuple -> '{' exprs '}'
>>
>>
>> ie a tuple is either {} or { exprs }
>>
>> The part after the ':' defines the parse tree that
>> is returned if the expression is recognised.
>>
>> A production like:
>>
>>
>> a -> b c d : {something, '$2'}
>>
>> means if we match an 'a' then the parse
>> tree we want returned is {something, '$2'}
>>
>>
>>
>> '$1' is the parse tree of b, '$2' is the parse tree
>> of c etc.
>>
>> and exprs (line 445/446) is defined
>>
>> exprs -> expr           : ['$1'].
>> exprs -> expr ',' exprs : ['$1' | '$3'].
>>
>>
>> ie exprs is an expr or a comma separated
>> sequence of expr's
>>
>> In the original yacc this would be written
>> something like
>>
>> exprs: expr {$$ = [$1]}
>> | expr ',' exprs {$$ = [$1|$3]}
>>
>> The yecc manual is at http://www.erlang.org/doc/man/yecc.html
>>
>> the command
>>
>> > erlc erl_parse.yrl
>>
>> runs yecc on the grammar and generates erl_parse.erl, which
>> can then be compiled into a beam file like any other module
>>
>> Cheers
>>
>> /Joe
>>
>>
>>
>>
>> On Fri, Nov 11, 2011 at 5:08 PM, Ryan Molden <ryanmolden@REDACTED> wrote:
>>
>>> Howdy fellow Erlangeers.  I was wondering if anyone knew of a BNF/EBNF
>>> grammar for Erlang?  Joe Armstrong pointed me at the YRL grammar at github;
>>> unfortunately, I am YRL illiterate so it looked mostly like gibberish to me.
>>>
>>> Ryan
>>>
>>> _______________________________________________
>>> erlang-questions mailing list
>>> erlang-questions@REDACTED
>>> http://erlang.org/mailman/listinfo/erlang-questions
>>>
>>>
>>
>