[erlang-questions] Indentation of multiline strings

Tue Feb 11 01:34:05 CET 2014

It is now more years than I care to recall since I wrote
the essay "Delenda est preprocessor."

It is extremely hard for indent(1) to do a good job of C.
Which is why it doesn't, really.

For example, here is some legal C code.

    if (border) printf("\\hline");
    rflag = border;
    for_each_element_child(e0, i, j, e1)
        printf(rflag ? "\\\\\n" : "\n");
        rflag = true;
        cflag = false;
        for_each_element_child(e1, k, l, e2)
            if (cflag) printf(" & ");
            cflag = true;
            walk_paragraph("", e2, "");
        end_each_element_child
    end_each_element_child
    if (border) printf("\\\\\\hline");

This is part of a program that turns slides marked up in SGML
into LaTeX.  It outputs the body of a table.

	for_each_element_child(Parent, I, J, Child)
	    <code for element children>
	if_no_element_child
	    <code for no elements>
	end_each_element_child

iterates over the children of Parent.  The variable I
counts all children.  The variable J counts those children
that are elements (as opposed to processing instructions,
comments, PCDATA, unresolved entity references, &c).  Child
is bound to the corresponding child.  Whenever such a Child
is found, <code for element children> is executed.  If Parent
has no child that is an element, <code for no elements> is
executed; that part is optional.

- If an indenter formats this with no knowledge of preprocessing,
  it will generate something unspeakably horrible.
- If an indenter is smart enough to look inside the macros, it
  will produce something that makes sense, but since
	- for_each_element_child has four left curly braces
	- if_no_element_child has }} if (...) {{
	- end_each_element_child has four right curly braces
  the result will still not be something you want to read.
- Only if an indenter can be told about *these* macros specifically
  will anything reasonable happen.
- You really really REALLY do not want to do this stuff without
  such macros.  You don't want to see how

	for_each_descendant(Ancestor, Descendant)
	    <recursion without recursion>
	end_each_descendant

  is implemented.  You really don't.
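To make the brace counts above concrete, here is one way macros with
that shape could be written.  This is only a sketch and not the
definitions used in the program quoted earlier: struct node, its
fields, the hidden fe_*_ variables, the choice to start J at 1 for the
first element child, and the helper join_element_children are all
assumptions made for illustration.  The loop deliberately runs one
step past the last child so that the "no elements" branch has a place
to execute exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical document tree; not the data structure of the real program. */
enum kind { ELEMENT, PCDATA, COMMENT };

struct node {
    enum kind kind;
    const char *text;            /* element name or character data */
    const struct node **child;   /* array of nchildren children */
    int nchildren;
};

/* Opens four blocks: outer scope, for loop, element test, user body.
   The loop runs once past the end with Child == NULL as a sentinel. */
#define for_each_element_child(Parent, I, J, Child)                     \
    { const struct node *fe_p_ = (Parent); int I, J = 0, fe_n_ = 0;     \
      for (I = 0; I <= fe_p_->nchildren; I++) {                         \
          const struct node *Child =                                    \
              I < fe_p_->nchildren ? fe_p_->child[I] : NULL;            \
          int fe_end_ = (Child == NULL);                                \
          if (!fe_end_ && Child->kind == ELEMENT) { J = ++fe_n_; {

/* Closes the body and the element test, then opens the fallback,
   which fires only on the sentinel step when no element was seen. */
#define if_no_element_child  }} if (fe_end_ && fe_n_ == 0) {{

/* Closes whatever four blocks are open at this point. */
#define end_each_element_child  }}}}

/* Joins the names of parent's element children with " & ", or writes
   "(none)" if there are none.  Returns the number of element children. */
int join_element_children(const struct node *parent, char *out, size_t cap)
{
    int count = 0;

    assert(cap > 0);
    out[0] = '\0';
    for_each_element_child(parent, i, j, c)
        if (j > 1) strncat(out, " & ", cap - strlen(out) - 1);
        strncat(out, c->text, cap - strlen(out) - 1);
        count = j;
    if_no_element_child
        strncat(out, "(none)", cap - strlen(out) - 1);
    end_each_element_child
    return count;
}
```

The body of join_element_children has no visible braces at all, which
is exactly what defeats an indenter that has not been told about these
three names.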

In practice, this means that the tolerably large amount of code
I've written marching over SGML/XML documents -- and it may be
C, but it's *still* simpler than XSLT! -- *cannot* be indented
with indent(1) and still less can it be indented with emacs.

What is the relevance of this to Erlang?

Well, Erlang copied the C preprocessor, and must suffer the
consequences.

On 10/02/2014, at 8:01 PM, Bengt Kleberg wrote:

> Greetings,
> 
> One reason against a standalone formatter is that it could introduce
> bugs in the code. That this is only a problem with a standalone
> formatter is probably based upon that nobody would use a editor based
> formatter and then not compile/test after wards.

I do not believe that it IS "only a problem with a standalone
formatter" and I for one always check the output of indent(1).
> 
> To make a stand alone formatter with better level of confidence I
> suggest using Erlang/OTP tools to create one of the intermediate level
> formats (Core Erlang, "parse trees", the simplest/quickest/...) that
> removes formatting. If this is done before and after the standalone
> formatting, and the result is the same, then no bugs have been
> introduced.

I've always liked indenters that could cope with imperfect inputs.