[erlang-questions] problem making my first parser

Sat Sep 26 09:51:00 CEST 2015

On Saturday 26 September 2015 09:31:31 Roelof Wobben wrote:
> Op 26-9-2015 om 09:08 schreef zxq9:
> > On Saturday 26 September 2015 09:01:30 Roelof Wobben wrote:
> >> Why do I then see this error message ?
> > What happens if you try `[1] >= $0, [1] =< $9.`?
> >
> > How about `[1] =:= "+".` ?
> >
> > What are you actually comparing here?
> 
> A list with a string/char.   But I check the tail of a list which 
> contains a char with + or 0 - 9.
> 
> > Do you care about plus signs as strings, or do you care about plus as a symbol that represents an operation?
> 
> At this point I only care about the plus sign as a string. Later on the 
> parse function I will make a operation based on this.

You care about the plus sign as a token, not a string. There is a difference.

What is the difference among the following?

scan([H|R], L) when H =:= "+" -> scan(R, [plus|L]).

scan([H|R], L) when H =:= $+ -> scan(R, [H|L]).

scan([$+|R], L) -> scan(R, [fun plus/2|L]).

Three different ways to decide what you received, and three different ways to continue on. Think about which combination might be most useful to you, and which might be difficult to deal with (or even broken).

In other words:
  Since you are going to have to deal with the resulting list later, which list will be the easiest to deal with?
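
To make the comparison concrete, here is a rough sketch of one possible combination: match the character directly in the function head and emit an atom as the token. The module name scan_sketch, the atom `plus`, and the restriction to single-digit numbers are all assumptions made for illustration, not something taken from your code.

```erlang
-module(scan_sketch).
-export([scan/1]).

%% Hypothetical sketch: turn a string such as "1+2" into a token list.
%% Note that a string is a list of characters, so each head H is an
%% integer like $+ (43), never a one-element string like "+" ([43]).
scan(Input) -> scan(Input, []).

%% End of input: the accumulator was built backwards, so reverse it.
scan([], Acc) -> lists:reverse(Acc);
%% A plus character becomes the atom plus.
scan([$+ | Rest], Acc) -> scan(Rest, [plus | Acc]);
%% A single digit character becomes the integer it represents.
scan([H | Rest], Acc) when H >= $0, H =< $9 -> scan(Rest, [H - $0 | Acc]).
%% Anything else has no matching clause, so the call crashes.
```

With this, scan_sketch:scan("1+2") returns [1,plus,2], and input such as "1*2" crashes with a function_clause error instead of producing a half-correct list.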

But that is forward thinking about the output. What might be the most useful way to define the *input* that you are going to accept? Maybe "1+1" all crunched together isn't the friendliest way to receive this input. Simple delimitation like "1 + 1" or even "1,+,1" certainly allows more opportunities for "tokenize-or-reject/crash" on bad input. What about the ordering of the symbols? After all, "1" is not actually 1; it is a character that represents 1 (hence the conversion from a character to an integer), the same way that neither "+" nor $+ is actually `plus(A, B) -> A + B`.
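
As a sketch of the delimited-input idea, assuming space-separated input like "1 + 1": split on the delimiter first, then convert each piece or crash. The module name tokens_sketch and the helper to_token/1 are hypothetical; string:tokens/2 and list_to_integer/1 are standard library functions.

```erlang
-module(tokens_sketch).
-export([tokenize/1]).

%% Hypothetical sketch: tokenize space-delimited input such as "1 + 1".
%% string:tokens/2 splits the string on any of the separator characters.
tokenize(Input) ->
    [to_token(Word) || Word <- string:tokens(Input, " ")].

%% Here the comparison really is against the string "+", because each
%% Word is itself a string (a list of characters).
to_token("+") -> plus;
%% Anything else must be an integer; list_to_integer/1 crashes with
%% badarg on input it cannot convert, which is the "reject" case.
to_token(Digits) -> list_to_integer(Digits).
```

Here tokens_sketch:tokenize("12 + 30") returns [12,plus,30], so multi-digit numbers come for free, and tokens_sketch:tokenize("1 + x") crashes with badarg.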

Hm... that's interesting: when writing functions we write function(Thingy1, Thingy2) -- which is not infix notation. So infix "1 + 1" would look like infix "1 plus 1", but converted to a function it would look like "plus(1, 1)", which as an S-expression would look like "(plus 1 1)" or "(+ 1 1)", which in Reverse Polish notation would look like "1 1 +", and as a list might look like [1, 1, plus] or [plus, 1, 1]. Think about this a bit.
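
To illustrate why the ordering matters, here is a minimal sketch that evaluates the postfix list form [1, 1, plus] with nothing but a stack. The module name rpn_sketch and the use of the atom `plus` are assumptions for illustration; this is one possible approach, not the only way to consume such a list.

```erlang
-module(rpn_sketch).
-export([eval/1]).

%% Hypothetical sketch: evaluate a Reverse Polish token list such as
%% [1, 1, plus] using a list as the stack.
eval(Tokens) -> eval(Tokens, []).

%% No tokens left: a well-formed expression leaves exactly one value.
eval([], [Result]) -> Result;
%% An operator pops its two operands and pushes the result.
eval([plus | Rest], [B, A | Stack]) -> eval(Rest, [A + B | Stack]);
%% A number is simply pushed onto the stack.
eval([N | Rest], Stack) when is_integer(N) -> eval(Rest, [N | Stack]).
```

With this, rpn_sketch:eval([1, 1, plus]) returns 2. Malformed input such as [1, plus] matches no clause and crashes, which is the same "accept or crash" behaviour as before.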

Why are these notations useful? Why have they been invented? Why do programming languages have functions written like "plus(A, B)" when there are operators like "A + B"?

These are not merely philosophic questions. They are central to understanding how notations, grammars, parsing, etc. work.

-Craig