[erlang-questions] atoms with newlines

Valentin Micic
Thu Feb 27 01:33:06 CET 2014


On 26 Feb 2014, at 10:30 PM, Richard A. O'Keefe wrote:

> 
> On 26/02/2014, at 10:56 PM, Valentin Micic wrote:
> 
>> If this is *really* necessary, wouldn't it be less confusing (or even more consistent) to consider:
>> 
>> 1>  'foo'  \
>> 1>  'bar'.
>> foobar
>> 
>> whereas: 
>> 
>> 3> 'foo
>> 3>bar'
>> 'foo\nbar'
>> 
>> includes a new line -- as it already does.
> 
> No, and on two grounds.
> 
> First, backslash escapes are not a special notation
> confined to Erlang.  They are supported in a wide
> variety of languages.  Not just the "usual suspects"
> in the C family.  There's Python, for example.  And
> logic programming languages and functional languages too:
> >>> print "a\
> ... b";
> ab					-- Python
> 
> > "a\
> - b";;
> val it : string = "ab"			-- F# and O'CAML

I do see your point (and feel your frustration); however, you are using a logical fallacy to reinforce your argument, as you are making an appeal to authority -- others are using it, thus it has to be right.
Also, even if we assume that your argument may be valid for a particular context, say, strings in Python, it would not make much sense to consider it in this situation, for you know very well that strings and atoms are not the same thing.


> ?- X = 'a\
> |    b'.
> X = ab.					-- Prolog and Mercury
> 
> So having backslash+newline *silently accepted* and
> do *exactly the opposite of what people expect* cannot
> be a good thing.

Using similar reasoning people may also expect that:

A = 1,
A = A + 1

should not crash a program (as it doesn't in other programming languages), yet in Erlang it does. It may not be what people want, expect or like if they are new to Erlang, but it is a part of the language and it behaves consistently -- it always causes a crash.
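
To make the point concrete, here is a minimal sketch of that behaviour. The module name rebind_demo and the function run/0 are made up for illustration only; the try/catch is there just so the failure can be inspected instead of terminating the caller.

```erlang
-module(rebind_demo).
-export([run/0]).

%% A is bound to 1 by the first match. The second line is not an
%% assignment but a pattern match of the bound value 1 against 2,
%% so it fails with a badmatch error.
run() ->
    A = 1,
    try
        A = A + 1,
        unreachable
    catch
        error:{badmatch, Value} ->
            %% Value is the right-hand side that failed to match.
            {badmatch, Value}
    end.
```

Calling rebind_demo:run() returns {badmatch,2}. The compiler may warn that the match can never succeed, but the module still compiles, and the failure is the same every time.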

By the same token, having a rule which allows atoms to contain "special" characters if one encloses them within a pair of apostrophes cannot be considered consistent if a single backslash followed by a new-line character has yet another special meaning, e.g. elimination of the new-line character. (*)

I think that the current implementation is consistent with its intent -- allowing atoms that contain special characters, and having a new-line in this context treated as a special character. One may question whether a backslash ( \ ) followed by pressing the enter key should produce the same result as pressing the enter key without a preceding backslash -- namely, both cases producing a single '\n'; but this was not what you've been complaining about, so I would not like to enter that conversation.

> We already have two ways to get a newline into an Erlang
> string or atom:  a bare newline or a \n escape.  We really
> do NOT need a third, especially one that is basically an
> implementation accident.

I think you are making an assumption that this was "an implementation accident".
I am making an assumption that this was a well considered and deliberate effort.
In both cases we are talking about assumptions, so why present them as facts?

> 
> It would be excusable to make backslash+newline an
> always-reported syntax error.  People would get a nasty
> surprise, but at least they wouldn't silently get the
> wrong value in their program.  In fact I think that
> *every* \<char> combination that is not explicitly
> documented should be reported as a syntax error.

Why is it that you're ignoring the fact that the intent behind the usage of apostrophes in atom construction is documented?
Also, if you place a new-line during the construction of an atom without apostrophes, a syntax error will indeed be reported.
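
One way to check this without typing into the shell is to hand the same text to the scanner and parser directly. This is only a sketch: the module name bare_newline_demo is invented for the example, while erl_scan:string/1 and erl_parse:parse_exprs/1 are the standard library calls.

```erlang
-module(bare_newline_demo).
-export([run/0]).

run() ->
    %% Source text: foo, a real new-line, bar, then a full stop.
    {ok, Unquoted, _} = erl_scan:string("foo\nbar."),
    %% The same text, but enclosed in apostrophes.
    {ok, Quoted, _} = erl_scan:string("'foo\nbar'."),
    %% Return both parse results side by side for comparison.
    {erl_parse:parse_exprs(Unquoted), erl_parse:parse_exprs(Quoted)}.
```

I would expect the first element of the result to be an {error, _} tuple (two atoms in a row do not form an expression) and the second to be an {ok, _} tuple holding a single atom.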

> 
> Second, it is currently the case that backslash has a
> special meaning in Erlang *ONLY* inside quoted literals
> (counting $\t as a quoted literal).  Your "less confusing
> (or even more consistent)" proposal introduces a new
> thing that *looks* like an operator but *isn't* one.

As opposed to not introducing a new operator, but definitely behaving as if there were one (e.g. one that eliminates the new-line character altogether)?

> It is difficult to see how that is less confusing than
> following common practice or more consistent with anything.


Sorry to hear that.
Let's agree to disagree.

Kind regards

V/


(*) In my view, it is nothing more than a historical accident (or a whim of someone in a position of authority a long time ago) that yielded the elimination of the new-line character when preceded by a backslash, whilst in any other case it is the backslash that is eliminated.
For example:

\"  results in a double quote -- backslash is eliminated;
\\  results in a backslash (having first backslash eliminated);
\002 results in an integer value of two -- backslash is eliminated;
\n results in an integer value of 0x0a -- backslash is eliminated;

Why, then, should a sequence such as:

"This is a test  \  0x0a
string"

(Where 0x0a represents an explicit new-line, as typed using a keyboard -- basically an Enter key) 

result in a string "This is a test string", with both backslash and new-line eliminated?
And yet:

"This is a test\n string"

would result in the insertion of a new-line character:

This is a test
 string

How is this not confusing, let alone inconsistent? Common practice notwithstanding. 
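
For anyone who wants to check what their own Erlang installation does, here is a rough sketch. The module name escape_demo is made up; erl_scan:string/1 is the standard scanner entry point. The first four matches restate the escapes listed above. The last two scan a quoted atom once with a bare new-line and once with a backslash followed by a new-line, so the two results can be compared directly.

```erlang
-module(escape_demo).
-export([run/0]).

run() ->
    [34] = "\"",     % backslash eliminated, double quote kept
    [92] = "\\",     % first backslash eliminated
    [2]  = "\002",   % octal escape, integer value two
    [10] = "\n",     % integer value 0x0a
    %% Scanner input: apostrophe, foo, real new-line, bar, apostrophe.
    {ok, [{atom, _, Bare}], _} = erl_scan:string("'foo\nbar'"),
    %% Scanner input: apostrophe, foo, backslash, real new-line, bar,
    %% apostrophe.
    {ok, [{atom, _, Escaped}], _} = erl_scan:string("'foo\\\nbar'"),
    {atom_to_list(Bare), atom_to_list(Escaped)}.
```

The first element is "foo\nbar". I am deliberately not stating the second: it depends on how the scanner in a given OTP release treats backslash followed by new-line, which is exactly the point under dispute.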



