[erlang-questions] some language changes

Thu May 24 02:59:03 CEST 2007

On 23 May 2007, at 8:01 pm, Vlad Dumitrescu wrote:

> Hi,
>
> On 5/23/07, ok <ok@REDACTED> wrote:
>> This preprocessor is just 16 SLOC of AWK.  For *THIS* we are to
>> make Erlang lexical structure more complicated and to break editor
>> support for the language?
>
> Could you please elaborate on why a preprocessor doesn't break the
> editor support for the language?

It's a fair question, and the answer of course is that it does, BUT.
This applies to any preprocessor and any language.  Once you add your
own stuff to bend the rules, the editor can no longer rely on the
rules holding.

BUT!  This only applies to source files that use the preprocessor.
You don't HAVE to use the preprocessor.  You can continue producing
source code without it.  There is *A* language which the editor and
other tools can understand.  And people can still work directly in
that language without THEIR editor support being compromised in any
way.

> The way I see it is that if the said editor is to provide advanced
> support, it needs to be able to work with the textual raw
> representation, not the preprocessed one, because that is the one the
> developer sees and uses.

This argument works for an editor and a pretty printer.  It does not
work for anything else much, including compilers, static checkers,
cross reference programs, you name it.  My home-brew tools for grovelling
around in C code include ones that work on the raw source (because it is
their job to report what includes what or what defines what), on the
object code (because it is their job to construct who-depends-on-what
graphs), and on the output of the preprocessor (because it is their job
to check a particular configuration).  There is room for all of these.

The crucial point is that if a language has syntax that is mutable by
dynamically plugging in user code, then only an editor that implements
that language can be SURE of supporting ANYTHING in that language.

Prolog's syntax isn't _that_ mutable, and my own editor has a Prolog
parser with a mildly extended set of operators built in.  It's enough
for my own stuff, but tends to fall down on code written by people
who have added a whole lot of extra operators.  ("Move backward/forward
over clause" works, but "check clause" doesn't.)  At least Prolog's
lexical structure isn't pluggable (so "move over compound term in
f(...) form" is completely reliable).  Quintus got over the problem by
having Emacs ask Prolog to do things.

Now, suppose you want good support for Erlang in Vim.  Currently, you
can do quite a bit.  If the lexical structure of the language becomes
mutable, then (a) you can't EVER rely on Vim to get it right and (b) you
lose support just when you need it most.  So, give up on Vim.  The big
fancy things people are using these days are Eclipse and NetBeans.  (I
was unable to install NetBeans on my SPARC/Solaris 2.9/Java 1.5 box,
and I am abjectly daunted by the size of the Eclipse book.)  Both of
those are done in Java, not Erlang.  Good luck handling mutable lexical
structure for an extended Erlang in *those*!

Take the machine where I am typing this message.  I have Vim.
I have Emacs.  I have Xcode.  I have SubEthaEdit.  I have all sorts
of stuff.  What I *don't* have at the moment is an Erlang installation.
(That's on another, and another kind of, machine.)  Is there any reason
I shouldn't be able to edit Erlang?

OK, let's be honest here.  One of my home-brew tools is a program that
reads source code in a variety of languages, tokenises it, and can
either selectively include or exclude tokens or can write it out with
styles and/or colours in LaTeX, (X)HTML, Troff, ANSI terminal, or plain
text.  For example, I use
      find . -name "*.?rl" | xargs m2h -ex | sloc | wc
to count SLOC for Erlang.  If Erlang lexical structure becomes pluggable,
how will my poor little C program adapt?
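
To make concrete what a fixed lexical structure buys a small tool, here is
a minimal sketch of a SLOC counter.  It is NOT the m2h/sloc pipeline
mentioned above (that is a C program); it is an Erlang illustration that
leans on the standard erl_scan module.  The module name sloc_sketch and the
function count/1 are made up for this example.

```erlang
-module(sloc_sketch).
-export([count/1]).

%% Count the lines of Text that carry at least one Erlang token.
%% erl_scan:string/1 drops comments and white space by default, so
%% comment-only lines and blank lines produce no tokens and are not counted.
%% Assumption: Text scans cleanly; a scan error raises badmatch here.
count(Text) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Text),
    Lines = [erl_scan:line(Token) || Token <- Tokens],
    length(lists:usort(Lines)).
```

For instance, sloc_sketch:count("%% header\nf() ->\n\n    ok.\n") returns 2.
A token that spans several lines (a long string, say) is counted once, on
the line where it starts.  The point is only that such a counter is a few
lines long because the token rules are fixed; with a pluggable lexer it
would have to run whatever user code defines the tokens.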

How does my "poor little C program" cope with Lisp read-macros?
That's easy.  IT DOESN'T.  It's for me, and it relies on me not using
them (or not using them in ways that will confuse it).  Why don't I
fix it?  Because I can't.

> For example, I might be interested in renaming a normal -define()
> macro and get all uses replaced with the new name. This means that the
> lexer and parser used by the editor need to include the concept of a
> macro.

I have written extensively saying why the C-envious preprocessor in
Erlang was and is a bad idea.  (Thanks to E. E. Smith for the phrase
"was and is".)

> At the same time, the editor needs to be able to map back the final
> AST to the actual concrete syntax, for example in order to map compile
> errors to the right place in the code even if that code contains
> macros and parse transforms. [*]

Which side are you arguing?  It seems to me that in this paragraph you
are making a very strong case for a simple fixed syntax without
read-macros or extensive parse transforms.

At the moment, there are some things that even the dumbest editor can
rely on when processing Erlang:
(1) the lexical structure is fixed and fairly simple.
(2) top level items are ended by full stops, which have no other job.
There is one thing that one normally expects to rely on, which I use
all the time, and that is parenthesis matching.  (And other bracket
matching as well.)  Thanks to a rather dubious decision in the Erlang
C-envious preprocessor, no, you cannot rely on that.  One form of
syntax bending has ALREADY resulted in a useful error-detecting/
preventing editor facility being rendered untrustworthy.  If we are
talking about language changes, I think by far the biggest payoff
would come from removing the C-envious preprocessor, and I have written
extensively on how that can be done.
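
A small illustration of how parenthesis matching can be defeated, assuming
the decision in question is that a macro body is an arbitrary token
sequence that need not be bracket-balanced.  The module and macro names
are invented for this sketch.

```erlang
-module(unbalanced_sketch).
-export([demo/0]).

%% The macro body is the three tokens  lists : reverse  followed by an
%% opening parenthesis.  The last ")" before the full stop closes -define
%% itself, so the "(" after reverse is left open inside the macro body.
-define(OPEN, lists:reverse().

%% After preprocessing this reads  lists:reverse([1, 2, 3]).
%% In the raw text the final ")" has no visible partner.
demo() ->
    ?OPEN [1, 2, 3]).
```

The module compiles, and unbalanced_sketch:demo() returns [3,2,1], yet an
editor that matches brackets on the raw text will flag both the -define
line and the body of demo/0 as unbalanced.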

I myself proposed a method of extending Erlang syntax, so that you
could do things like

     -syntax(sql, 'lib/syntaxes/sql').
     ...
         sql (......)

which, however, had two key restrictions:
(a) lexical structure was still Erlang lexical structure, and
(b) the stuff after the syntax keyword had to be bracket-balanced.
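
A hedged sketch of what those two restrictions give a tool: because the
embedded text is still made of ordinary Erlang tokens and is
bracket-balanced, the end of the block can be found with the ordinary
scanner and a stack, without knowing anything about the plugged-in syntax.
The -syntax form above is a proposal, not existing Erlang, and the module
below (balance_sketch, balanced/1) is invented for illustration only.

```erlang
-module(balance_sketch).
-export([balanced/1]).

%% True when every (, [ and { in Text is closed in the right order.
%% Only token categories are inspected, so a ")" inside a string or a
%% quoted atom is not mistaken for a bracket.
%% Assumption: Text scans cleanly; a scan error raises badmatch here.
balanced(Text) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Text),
    check([erl_scan:category(Token) || Token <- Tokens], []).

%% Stack holds the closing brackets still expected, innermost first.
check([], Stack) ->
    Stack =:= [];
check([T | Rest], Stack) when T =:= '('; T =:= '['; T =:= '{' ->
    check(Rest, [closer(T) | Stack]);
check([T | Rest], Stack) when T =:= ')'; T =:= ']'; T =:= '}' ->
    case Stack of
        [T | Popped] -> check(Rest, Popped);
        _ -> false
    end;
check([_Other | Rest], Stack) ->
    check(Rest, Stack).

closer('(') -> ')';
closer('[') -> ']';
closer('{') -> '}'.
```

With this, balance_sketch:balanced("sql (select name from users where id in (1, 2, 3))")
is true, and dropping the final ")" makes it false.  Binary brackets and
begin/end pairs are left out to keep the sketch short.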

I am not against SYNTAX extension mechanisms (like Lisp macros,
especially Scheme hygienic macros).  What I am arguing against
is LEXICAL extension mechanisms.

> So IMHO if the language is extended in any way, via preprocessor or
> not, the editor still needs to be updated if it is going to be able to
> understand the changes.

If the preprocessor is *OPTIONAL* (so that Joe might use it for his
regular expressions, but I wouldn't) then he can pay the price of the
editor not understanding his lexical extensions without ME having to
pay the price for something I am not using.