[erlang-questions] Strings - deprecated functions

Joe Armstrong erlang@REDACTED
Thu Nov 23 17:39:33 CET 2017


On Wed, Nov 22, 2017 at 9:28 PM, Loïc Hoguin <essen@REDACTED> wrote:
> Calm down. Considering how ubiquitous the string module is, these functions
> are not going to be removed for at least a few years. That gives you plenty
> of time to understand the new string module.

If you change it in 1000 years' time you're really going to confuse everybody.

Programs in 3020 will want to know which millennium the code was written in.

There is no shortage of names.

Call the new module string_vsn1 and NOT string; then string and string_vsn1
can co-exist *forever*.

/Joe




>
> Perhaps during this journey you can help make the documentation for the
> module more user friendly by sending patches or opening tickets at
> bugs.erlang.org. I'll admit that the current documentation does confuse me
> personally, though I've not needed to use it yet.
>
> Unfortunately languages are complex and Unicode is therefore also complex.
> There's no real way around that. Even if you target English speakers it's
> likely that you will need Unicode, because many things, such as names or
> addresses, require it. So even if it feels like you won't need it (and
> maybe you won't) it's a good idea to be ready for it.
>
> I wouldn't say latin1 is widely used anymore. Almost everything uses
> Unicode nowadays. Nearly everything has switched to Unicode; Erlang is one
> of the last. Even your email was sent encoded in utf-8.
>
> On 11/22/2017 08:43 PM, lloyd@REDACTED wrote:
>>
>> Dear Gods of Erlang,
>>
>> "This module has been reworked in Erlang/OTP 20 to handle
>> unicode:chardata() <http://erlang.org/doc/man/unicode.html#type-chardata>
>> and operate on grapheme clusters. The old functions
>> <http://erlang.org/doc/man/string.html#oldapi> that only work on Latin-1
>> lists as input are still available but should not be used. They will be
>> deprecated in Erlang/OTP 21."
>>
>> I'm sorry. I've brought up this issue before and got lots of push back.
>>
>> But every time I look up tried and true and long-used string functions to
>> find that they are deprecated and will be dropped in future Erlang releases
>> my blood pressure soars. Both my wife and my doctor tell me that at my age
>> this is a dangerous thing.
>>
>> I do understand the importance and necessity of Unicode. And applaud the
>> addition of Unicode functions.
>>
>> But the deprecated string functions have a long history. The English
>> language and Latin-1 characters are widely used around the world.
>>
>> Yes, it should be easy for programmers to translate code from one user
>> language to another. But I'm not convinced that the Gods of Erlang have
>> found the optimal solution by dropping all Latin-1 string functions.
>>
>> My particular application is directed toward English speakers. So, until
>> further notice, I have no use for Unicode.
>>
>> I don't want to sound like a nationalist pig, but I think dropping the
>> Latin-1 string functions from future Erlang releases is a BIG mistake.
>>
>> I look up tokens/2, a function that I use fairly frequently, and I see
>> that it's deprecated. I look up the suggested replacement and I see
>> lexemes/2.
>>
>> So I ask, what the ... is a lexeme? I look it up in Merriam-Webster and I
>> see that a lexeme is "a meaningful linguistic unit."
>>
>> Meaning what? I just want to turn "this and that" into "This And That."
>>
>> I read further in the Erlang docs and I see "grapheme cluster." WHAT THE
>> ... IS A GRAPHEME CLUSTER?
>>
>> I look up "grapheme" in Merriam-Webster. Oh it is now all so clear: "a
>> unit of a writing system."
>>
>> Ah yes, grapheme is defined in the docs. But I have to read and re-read
>> the definition to understand what the Gods of Erlang mean by a "grapheme
>> cluster." And I'm still not sure I get it.
>>
>> It sounds like someone took a linguistics class and is trying to show off.
>>
>> But now I've spent 30 minutes, time that I don't have to waste, trying to
>> figure out how to do a simple manipulation of "this and that." Recurse the
>> next time I want to look up a string function in the Erlang docs.
>>
>> SOLUTION
>>
>> Keep the Latin-1 string functions. Put them in a separate library if
>> necessary. Or put the new Unicode functions in a separate library. But don't
>> arbitrarily drop them.
>>
>> Some folks have suggested that I maintain my own library of the deprecated
>> Latin-1 functions. But why should I have to do that? How does that help other
>> folks with the same issue?
>>
>> Bottom line: please please please do not drop the existing Latin-1 string
>> functions.
>>
>> Please don't.
>>
>> Best wishes,
>>
>> LRP
>>
>>
>>
>> _______________________________________________
>> erlang-questions mailing list
>> erlang-questions@REDACTED
>> http://erlang.org/mailman/listinfo/erlang-questions
>>
>
> --
> Loïc Hoguin
> https://ninenines.eu
