[erlang-questions] Erlang Syntax and "Patterns" (Again)

Emil Holmstrom <>
Sat Mar 19 17:34:47 CET 2016

I fail to see the significance of the type system in this case. It doesn't
stop Erlang from having a char() type, does it? It has float(), integer(),
atom(), etc. Forcing lists to have a single element type would still be
possible even though Erlang is dynamically typed. Unfortunately, iolist()
would have to go. Maybe I am missing something obvious?
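
To make that concrete, here is a minimal sketch of what a distinct character
type could look like at run time, emulated with a tagged tuple. Nothing here
exists in Erlang/OTP: the module name char_sketch, the {char, N} tag and all
four functions are made up for illustration only.

```erlang
-module(char_sketch).
-export([from_integer/1, to_integer/1, is_char/1, is_string/1]).

%% Hypothetical representation: a character is a tagged tuple, so it can
%% never be confused with a plain integer at run time.
from_integer(N) when is_integer(N), N >= 0, N =< 16#10FFFF ->
    {char, N}.

%% Explicit conversion back to the code point.
to_integer({char, N}) -> N.

is_char({char, N}) when is_integer(N), N >= 0, N =< 16#10FFFF -> true;
is_char(_) -> false.

%% A "string" is a proper list in which every element is a tagged character.
is_string([]) -> true;
is_string([H | T]) -> is_char(H) andalso is_string(T);
is_string(_) -> false.
```

With this, [22,87,65,84,33] is plainly a list of integers, and only a list
built through from_integer/1 counts as a string. The check is still done at
run time, so no static typing is needed.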


On Fri, 18 Mar 2016 at 18:15, Hynek Vychodil <> wrote:

> A superlative suggestion sir, with only two minor drawbacks: one, Erlang
> is a dynamically typed language and two, Erlang is a dynamically typed
> language. I know that technically that’s only one drawback, but I thought
> it was such a big one it was worth mentioning twice.
> Hynek
> On Fri, Mar 18, 2016 at 5:30 PM, Emil Holmstrom <> wrote:
>> I am probably repeating what someone else has already said in some other
>> similar thread.
>> The confusion between strings and [integer()] would have been greatly
>> reduced if char() existed, $a wouldn't have to be syntactic sugar for 97
>> but would actually be "character a". You would have to explicitly convert
>> char() -> integer() and vice versa. This is how strings are implemented in
>> ML and Haskell.
>> Regarding character encoding: inside Erlang, Unicode could always be
>> assumed; conversion between different character encodings could be done on
>> I/O.
>> /emil
>> On Fri, 18 Mar 2016 at 00:51, Richard A. O'Keefe <> wrote:
>>> On 17/03/16 11:53 pm, Steve Davis wrote:
>>> > > ROK said:
>>> > > Yawn.
>>> > (What am I doing trying to argue with ROK??? Am I MAD?)
>>> >
>>> > 1) Why is it people rant about "string handling" in Erlang?
>>> Because it is not the same as Java.
>>> >
>>> > 2) Principle of least surprise:
>>> > 1> [H|T] = [22,87,65,84,33].
>>> > [22,87,65,84,33]
>>> > 2> H.
>>> > 22
>>> > 3> T.
>>> > "WAT!"
>>> This is a legitimate complaint, but it confuses two things.
>>> There is *STRING HANDLING*, which is fine, and
>>> there is *LIST PRINTING*, which causes the confusion.
>>> For comparison, DEC-10 Prolog, PDP-11 Prolog, C-Prolog, and Quintus Prolog
>>> all did STRING HANDLING as lists of character codes, but
>>> all did LIST PRINTING without ever converting lists of numbers to strings.
>>> The answer was that there was a library procedure to print a list of
>>> integers as a string and you could call that whenever you wanted to,
>>> such as in a user-defined pretty-printing procedure.  Here's a transcript
>>> from SICStus Prolog:
>>> | ?- write([65,66,67]).
>>> [65,66,67]
>>> yes
>>> | ?- write("ABC").
>>> [65,66,67]
>>> yes
>>> The heuristic used by the debugger in some Prologs was that a list of
>>> integers between 32 and 126 inclusive was printed as a string; that
>>> broke down with Latin 1, and broke harder with Unicode.  The simple
>>> behaviour mandated by the standard that lists of integers print as
>>> lists of integers confuses people once, then they learn that string
>>> quotes are an input notation, not an output notation, and if they want
>>> string notation in output, they have to call a special procedure to get
>>> it.
>>> The ISO Prolog committee introduced a horrible alternative which the
>>> DEC-10 Prolog designers had experienced in some Lisp systems and
>>> learned to hate: flip a switch and "ABC" is read as ['A','B','C']. The
>>> principal reason given for that was that the output was semi-readable.
>>> One of my arguments against it was that this required every Prolog
>>> system to be able to hold 17*2**16 atoms, and I knew for a fact that
>>> many would struggle to do so.  The retort was "they must be changed
>>> to make a special case for one-character atoms".  Oh well, no silver
>>> bullet.
>>> That does serve as a reminder, though, that using [a,b,c] instead of
>>> [$a,$b,$c] is *possible* in Erlang.
>>> Just to repeat the basic point: the printing of (some) integer lists as
>>> strings is SEPARABLE from the issue of how strings are represented and
>>> processed; that could be changed without anything else in the language
>>> changing.
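The separation can already be seen in Erlang's own formatting directives:
~w always prints a list as a list, while ~p applies the "looks printable"
heuristic. A small sketch follows; the module name print_sketch and its two
function names are invented for illustration, and only io_lib:format/2 and
lists:flatten/1 are standard library calls.

```erlang
-module(print_sketch).
-export([as_list/1, as_heuristic/1]).

%% ~w writes the term with standard syntax: a list of integers
%% always comes out as a list of integers.
as_list(L) ->
    lists:flatten(io_lib:format("~w", [L])).

%% ~p pretty-prints and shows a list as a quoted string when every
%% element is a printable character.
as_heuristic(L) ->
    lists:flatten(io_lib:format("~p", [L])).
```

Here as_list("WAT!") gives "[87,65,84,33]", while as_heuristic([87,65,84,33])
gives the quoted string. The list [22,87,65,84,33] stays a list either way,
because 22 is not a printable character.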
>>> >
>>> > 3) A codec should be perfectly reversible i.e. X = encode(decode(X)).
>>> > Without tagging, merely parsing out a string as a list is not
>>> > perfectly reversible.
>>> Here you are making a demand that very few other programming languages
>>> can support.  For example, take JavaScript.  "\u0041" is read as "A",
>>> and you are not going to get "\u0041" back from "A".  You're not even
>>> going to get "\x41" back from it, even though "\x41" == "A".
>>> Or take Erlang, where
>>> 1> 'foo bar'.
>>> 'foo bar'
>>> 2> 'foobar'.
>>> foobar
>>> with the same kind of thing happening in Prolog.
>>> And of COURSE reading [1 /* one */, 2 /* deux */, 4 /* kvar */]
>>> in JavaScript preserves the comments so that re-encoding the
>>> data structure restores the input perfectly.  </sarc>
>>> Or for that matter consider floating point numbers, where
>>> even the languages that produce the best possible conversions
>>> cannot promise that encode(decode(x)) == x.
>>> No, I'm sorry, this "perfectly reversible codec" requirement sets up
>>> a standard that NO programming language I'm aware of satisfies.
>>> It is, in fact, a straw man.  What you *can* ask, and what some
>>> language designers and implementers strive to give you, is
>>>      decode(encode(decode(x))) == decode(x).
>>> But to repeat the point made earlier, the way that lists of plausible
>>> character codes are printed is SEPARABLE from the way strings are
>>> represented and handled and in an ancestral language is SEPARATE.
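The weaker property decode(encode(decode(x))) == decode(x) can be checked
for Erlang terms with a short sketch. It assumes "decode" means parsing one
term from source text and "encode" means writing it back with ~w. The module
roundtrip_sketch and its function names are made up for this example, while
erl_scan:string/1 and erl_parse:parse_term/1 are standard library calls.

```erlang
-module(roundtrip_sketch).
-export([decode/1, encode/1, stable/1]).

%% decode: source text -> term. The text must hold exactly one
%% Erlang term terminated by a full stop.
decode(Text) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(Text),
    {ok, Term} = erl_parse:parse_term(Tokens),
    Term.

%% encode: term -> source text in the writer's canonical form.
encode(Term) ->
    lists:flatten(io_lib:format("~w.", [Term])).

%% The property that can reasonably be asked for.
stable(Text) ->
    decode(encode(decode(Text))) =:= decode(Text).
```

With the atom example above, encode(decode("'foobar'.")) yields "foobar.",
which differs from the input text, yet stable("'foobar'.") holds.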
>>> >
>>> > 4) What is the right way to implement the function is_string(List)
>>> > correctly?
>>> >
>>> > *ducks*
>>> That really is a "have you stopped beating your wife, answer yes or no"
>>> sort of question.
>>> It depends on the semantics you *want* it to have.  The Quintus
>>> library didn't provide any such predicate, but it did provide
>>> plausible_chars(Term)
>>>   when Term is a sequence of integers satisfying
>>>   is_graphic(C) or is_space(C),
>>>   possibly ending with a tail that is a variable or
>>>   a variable bound by numbervars/3.
>>> Notice the careful choice of name:  not IS (certainly) a string,
>>> but is a PLAUSIBLE list of characters.
>>> It was good enough for paying customers to be happy with the
>>> module it was part of (which was the one offering the
>>> non-usual portray_chars(Term) command).
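A rough Erlang rendering of that predicate might look like the sketch below.
It is an approximation under stated assumptions, not the Quintus code:
"graphic" is taken to mean ASCII 33..126, "space" to mean the blank plus
the control characters 9..13, and the unbound-tail case is dropped because
Erlang lists cannot end in a variable. The module name plausible_sketch is
invented.

```erlang
-module(plausible_sketch).
-export([plausible_chars/1]).

%% True for a proper list whose elements all look like text characters.
%% This says "plausibly text", not "certainly a string".
plausible_chars([]) -> true;
plausible_chars([C | T]) when is_integer(C) ->
    (is_graphic(C) orelse is_space(C)) andalso plausible_chars(T);
plausible_chars(_) -> false.

%% Assumption: visible ASCII characters only.
is_graphic(C) -> C >= 33 andalso C =< 126.

%% Assumption: blank, tab, newline, vertical tab, form feed, carriage return.
is_space(C) -> C =:= $\s orelse (C >= $\t andalso C =< $\r).
```

Note that the empty list counts as plausible, which is exactly the kind of
choice that depends on the semantics you want. The standard library's
io_lib:printable_list/1 makes a similar judgement for the ~p directive.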
>>> One of the representations Quintus used for strings (again, a
>>> library feature, not a core language feature) was in Erlang
>>> notation {external_string,FileName,Offset,Length}, an idea
>>> that struck the customer I developed it for as a great
>>> innovation, when I'd simply stolen it from Smalltalk!
>>> The thing is that STRINGS ARE WRONG for most things,
>>> however represented.  For example, when Java changed
>>> the representation of String so that slicing became a
>>> costly operation, I laughed, because I had my own representation
>>> of strings that provided O(1) concatenation as well as cheap
>>> slicing.  (Think Erlang iolists and you won't be far wrong.)
>>> The Pop2 language developed and used at Edinburgh
>>> represented file names as lists, e.g., [/dev/null] was in
>>> Erlang notation ['/',dev,'/',null].  This made file name
>>> manipulation easier than representing them as strings.
>>> Any time there is internal structure, any time there is scope
>>> for sharing substructure, any time you need to process
>>> the parts of a string, strings are wrong.
>>> The PERL lesson is that regular expressions are a fantastic
>>> tool for doing the wrong thing quite simply.
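For readers who have not met iolists, the O(1) concatenation mentioned above
can be sketched as follows. The module iolist_sketch and its function names
are hypothetical; iolist_to_binary/1 and iolist_size/1 are built-in functions.

```erlang
-module(iolist_sketch).
-export([concat/2, to_binary/1]).

%% Concatenation builds one new two-element list around the operands.
%% Neither operand is copied or traversed, so the cost is constant.
concat(A, B) -> [A, B].

%% The nested structure is flattened once, when a contiguous
%% binary is actually needed (typically at the I/O boundary).
to_binary(IoList) -> iolist_to_binary(IoList).
```

For example, to_binary(concat("foo", [<<"bar">>, $!])) produces the binary
<<"foobar!">>, and the parts stay shared until that final step.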
>>> _______________________________________________
>>> erlang-questions mailing list
>>> http://erlang.org/mailman/listinfo/erlang-questions