<div dir="ltr"><div>Someone has to write a good (non-blocking) binding for ICU :) I think it would be easier than reinventing the wheel. i18n [1] was a good start but looks abandoned these days. I also disliked that it loads its resources into ETS at startup.<br><br>[1] <a href="https://github.com/erlang-unicode/i18n">https://github.com/erlang-unicode/i18n</a><br><br></div><div>- benoît<br></div></div><br><div class="gmail_quote"><div dir="ltr">On Sat, Jan 14, 2017 at 3:56 PM Loïc Hoguin <<a href="mailto:essen@ninenines.eu">essen@ninenines.eu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">We need support for locales before we can do proper operations on text.<br class="gmail_msg">
Just Unicode isn't enough.<br class="gmail_msg">
<br class="gmail_msg">
On 01/14/2017 03:18 PM, John Doe wrote:<br class="gmail_msg">
> Indeed, Unicode uppercase/lowercase is one of the most essential<br class="gmail_msg">
> string features that don't yet exist in the Erlang stdlib. I'm aware<br class="gmail_msg">
> of the problems with some letters and scripts, such as the German ß/SS or<br class="gmail_msg">
> the Turkish I, but having upper/lower in the stdlib is still a must, IMO. The<br class="gmail_msg">
> problem is that uppercase/lowercase would require support for Unicode<br class="gmail_msg">
> normalization.<br class="gmail_msg">
><br class="gmail_msg">
> 2017-01-14 1:34 GMT+03:00 Michał Muskała <<a href="mailto:michal@muskala.eu" class="gmail_msg" target="_blank">michal@muskala.eu</a>>:<br class="gmail_msg">
><br class="gmail_msg">
> I fully agree there are no languages that deal with strings<br class="gmail_msg">
> perfectly. That said there are those that are better at it and those<br class="gmail_msg">
> that aren't so good. A language, where I need to look for a library<br class="gmail_msg">
> to upcase or downcase my own name, fits into the second group in my<br class="gmail_msg">
> book.<br class="gmail_msg">
><br class="gmail_msg">
><br class="gmail_msg">
> Michał.<br class="gmail_msg">
><br class="gmail_msg">
> On 13 Jan 2017, 13:20 +0100, Jesper Louis Andersen<br class="gmail_msg">
> <<a href="mailto:jesper.louis.andersen@gmail.com" class="gmail_msg" target="_blank">jesper.louis.andersen@gmail.com</a>>, wrote:<br class="gmail_msg">
>> Richard is indeed right, depending on what your definition of<br class="gmail_msg">
>> "String" is.<br class="gmail_msg">
>><br class="gmail_msg">
>> If a "String" is "An array of characters from some alphabet", then<br class="gmail_msg">
>> you need to take into account Strings are Unicode codepoints in<br class="gmail_msg">
>> practice. This is also the most precise definition from a<br class="gmail_msg">
>> technical point of view.<br class="gmail_msg">
>><br class="gmail_msg">
>> When I wrote my post, I was--probably incorrectly--assuming the<br class="gmail_msg">
>> older notion of a "String" where the representation is either<br class="gmail_msg">
>> ASCII or something like ISO-8859-15. In this case, a string<br class="gmail_msg">
>> coincides with a stream of bytes.<br class="gmail_msg">
>><br class="gmail_msg">
>> Data needs parsing. A lot of data comes in as some kind of stringy<br class="gmail_msg">
>> representation: UTF-8, byte array (binary), and so on.<br class="gmail_msg">
>><br class="gmail_msg">
>> And of course, that isn't the whole story, since there are<br class="gmail_msg">
>> examples of input which are not string-like in their forms.<br class="gmail_msg">
>><br class="gmail_msg">
>><br class="gmail_msg">
>> On Fri, Jan 13, 2017 at 2:34 AM Richard A. O'Keefe<br class="gmail_msg">
>> <<a href="mailto:ok@cs.otago.ac.nz" class="gmail_msg" target="_blank">ok@cs.otago.ac.nz</a>> wrote:<br class="gmail_msg">
>><br class="gmail_msg">
>><br class="gmail_msg">
>><br class="gmail_msg">
>> On 13/01/17 8:56 AM, Jesper Louis Andersen wrote:<br class="gmail_msg">
>> > Strings are really just streams of bytes.<br class="gmail_msg">
>><br class="gmail_msg">
>> That was true a long time ago. Maybe.<br class="gmail_msg">
>> But it isn't anywhere near accurate as a description<br class="gmail_msg">
>> of Unicode:<br class="gmail_msg">
>> - Unicode is made of 21-bit code points, not bytes.<br class="gmail_msg">
>> - Most possible code points are not defined.<br class="gmail_msg">
>> - Some of those that are defined are defined as<br class="gmail_msg">
>> "it is illegal to use this".<br class="gmail_msg">
>> - Unicode sequences have *structure*; it is simply<br class="gmail_msg">
>> not the case that every sequence of allowable<br class="gmail_msg">
>> Unicode code points is a legal Unicode string.<br class="gmail_msg">
>> - As a special case of that, if s is a non-empty<br class="gmail_msg">
>> valid Unicode string, it is not true that every<br class="gmail_msg">
>> substring of s is a valid Unicode string.<br class="gmail_msg">
>><br class="gmail_msg">
>> In case you were thinking of UTF-8, not all byte<br class="gmail_msg">
>> sequences are valid UTF-8.<br class="gmail_msg">
>><br class="gmail_msg">
>> Byte streams are as important as you say, but it's<br class="gmail_msg">
>> really hard to see the software for a radar or a<br class="gmail_msg">
>> radio telescope as processing strings...<br class="gmail_msg">
>><br class="gmail_msg">
>> _______________________________________________<br class="gmail_msg">
>> erlang-questions mailing list<br class="gmail_msg">
>> <a href="mailto:erlang-questions@erlang.org" class="gmail_msg" target="_blank">erlang-questions@erlang.org</a><br class="gmail_msg">
>> <a href="http://erlang.org/mailman/listinfo/erlang-questions" rel="noreferrer" class="gmail_msg" target="_blank">http://erlang.org/mailman/listinfo/erlang-questions</a><br class="gmail_msg">
><br class="gmail_msg">
><br class="gmail_msg">
<br class="gmail_msg">
--<br class="gmail_msg">
Loïc Hoguin<br class="gmail_msg">
<a href="https://ninenines.eu" rel="noreferrer" class="gmail_msg" target="_blank">https://ninenines.eu</a><br class="gmail_msg">
</blockquote></div>
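<div dir="ltr">Several points raised in the thread can be checked with a short script. The following is a minimal sketch in Python (chosen only for illustration; it is not Erlang and not ICU), and the helper names <code>is_valid_utf8</code> and <code>is_encodable</code> are hypothetical. It shows that not every byte sequence is valid UTF-8, that a slice of a valid encoded string need not be valid, that surrogate code points cannot be encoded, and that default, locale-independent case mapping already runs into the German ß and Turkish I cases.</div>

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_encodable(text: str) -> bool:
    """Return True if every code point in text may be encoded as UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


name = "Michał"                 # 6 code points
encoded = name.encode("utf-8")  # 7 bytes: U+0142 takes two bytes (C5 82)

print(len(name), len(encoded))
print(is_valid_utf8(encoded))      # the whole encoded string
print(is_valid_utf8(encoded[:6]))  # cut in the middle of the last character
print(is_valid_utf8(b"\x80"))      # a lone continuation byte
print(is_encodable("\ud800"))      # a surrogate code point

# Default case mapping, with no locale taken into account.
print("straße".upper())  # one character becomes two, so the length changes
print("I".lower())       # Turkish text would expect dotless U+0131 instead
```

<div dir="ltr">The last two lines are the reason locale support matters: the default mapping gives a single answer for "I", while the answer a Turkish reader expects is different, so a locale-aware library such as ICU is needed to choose between them.</div>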