[erlang-questions] Erlang basic doubts about String, message passing and context switching overhead

Benoit Chesneau bchesneau@REDACTED
Sat Jan 14 16:29:59 CET 2017


Someone has to write a good (non-blocking) binding of ICU :) I think it
would be easier than reinventing the wheel. i18n [1] was a good start but
looks abandoned these days. Also, I disliked how it loads resources into
ETS at startup.

[1] https://github.com/erlang-unicode/i18n

- benoît

On Sat, Jan 14, 2017 at 3:56 PM Loïc Hoguin <essen@REDACTED> wrote:

> We need support for locales before we can do proper operations on text.
> Just Unicode isn't enough.
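
A minimal sketch of why locale matters for casing, written in Python only because its built-in `str.upper()` and `str.lower()` apply the default, locale-independent Unicode mappings. The helpers `upper_turkish` and `lower_turkish` are hypothetical names invented for this illustration; they hand-code the Turkish dotted/dotless I rule and are not a substitute for a real locale library such as ICU.

```python
# Illustration only: these helpers are hypothetical, not part of any library.
# They hand-code the Turkish tailoring for the letter I on top of the
# default Unicode case mapping.

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def upper_turkish(text: str) -> str:
    # Turkish uppercases "i" to dotted capital I; the default mapping
    # already sends dotless small i to plain "I".
    return text.replace("i", DOTTED_CAPITAL_I).upper()


def lower_turkish(text: str) -> str:
    # Turkish lowercases dotted capital I to "i" and plain "I" to dotless
    # small i. Both substitutions must happen before the default mapping.
    return (
        text.replace(DOTTED_CAPITAL_I, "i")
        .replace("I", DOTLESS_SMALL_I)
        .lower()
    )


if __name__ == "__main__":
    word = "istanbul"
    print(word.upper())         # default Unicode mapping
    print(upper_turkish(word))  # Turkish tailoring
```

The same input produces two different, equally correct uppercase forms depending on the language of the text, which is information a code-point-level API cannot infer on its own.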
>
> On 01/14/2017 03:18 PM, John Doe wrote:
> > Indeed, Unicode uppercase/lowercase conversion is one of the most
> > essential string features that doesn't exist yet in the Erlang stdlib.
> > I'm aware of the problems with some letters and scripts, such as the
> > German ß (SS) or the Turkish I, but having upper/lower in the stdlib is
> > still a must, IMO. The problem is that uppercase/lowercase conversion
> > would require support for Unicode normalization.
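
A short sketch of the two issues mentioned above, again in Python for illustration: case conversion can change the length of a string (the German sharp s), and two canonically equivalent spellings of the same text uppercase to different code point sequences unless the result is normalized. The helper name `upper_nfc` is an assumption made for this example.

```python
import unicodedata


def upper_nfc(text: str) -> str:
    """Uppercase with the default Unicode mapping, then normalize to NFC."""
    return unicodedata.normalize("NFC", text.upper())


street = "stra\u00dfe"   # "strasse" spelled with the sharp s (U+00DF)
composed = "\u00e9"      # e with acute accent as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

if __name__ == "__main__":
    # The sharp s has no single-code-point uppercase in the default
    # mapping, so the result is one code point longer than the input.
    print(len(street), len(street.upper()))

    # Without normalization the two spellings uppercase differently.
    print(composed.upper() == decomposed.upper())

    # After NFC normalization they compare equal.
    print(upper_nfc(composed) == upper_nfc(decomposed))
```

This is why an upper/lower API that works on code point lists one element at a time cannot be complete: the mapping is one-to-many in places and interacts with normalization.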
> >
> > 2017-01-14 1:34 GMT+03:00 Michał Muskała <michal@REDACTED>:
> >
> >     I fully agree there are no languages that deal with strings
> >     perfectly. That said, there are those that are better at it and
> >     those that aren't so good. A language where I need to look for a
> >     library to upcase or downcase my own name fits into the second
> >     group in my book.
> >
> >
> >     Michał.
> >
> >     On 13 Jan 2017, 13:20 +0100, Jesper Louis Andersen
> >     <jesper.louis.andersen@REDACTED>, wrote:
> >>     Richard is indeed right, depending on what your definition of
> >>     "String" is.
> >>
> >>     If a "String" is "An array of characters from some alphabet", then
> >>     you need to take into account that strings are sequences of
> >>     Unicode code points in practice. This is also the most precise
> >>     definition from a technical point of view.
> >>
> >>     When I wrote my post, I was--probably incorrectly--assuming the
> >>     older notion of a "String" where the representation is either
> >>     ASCII or something like ISO-8859-15. In this case, a string
> >>     coincides with a stream of bytes.
> >>
> >>     Data needs parsing. A lot of data comes in as some kind of stringy
> >>     representation: UTF-8, byte array (binary), and so on.
> >>
> >>     And of course, that isn't the whole story, since there are
> >>     examples of input which are not string-like in their forms.
> >>
> >>
> >>     On Fri, Jan 13, 2017 at 2:34 AM Richard A. O'Keefe
> >>     <ok@REDACTED> wrote:
> >>
> >>
> >>
> >>         On 13/01/17 8:56 AM, Jesper Louis Andersen wrote:
> >>         > Strings are really just streams of bytes.
> >>
> >>         That was true a long time ago.  Maybe.
> >>         But it isn't anywhere near accurate as a description
> >>         of Unicode:
> >>           - Unicode is made of 21-bit code points, not bytes.
> >>           - Most possible code points are not defined.
> >>           - Some of those that are defined are defined as
> >>             "it is illegal to use this".
> >>           - Unicode sequences have *structure*; it is simply
> >>             not the case that every sequence of allowable
> >>             Unicode code points is a legal Unicode string.
> >>           - As a special case of that, if s is a non-empty
> >>             valid Unicode string, it is not true that every
> >>             substring of s is a valid Unicode string.
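
The points in the list above can be checked with a few lines of Python and its standard `unicodedata` module. This is a sketch under stated assumptions: the helper names are invented for the example, and "a substring that begins with a combining mark" is only one possible reading of the claim about substrings.

```python
import unicodedata

MAX_CODE_POINT = 0x10FFFF  # highest code point Unicode allows


def is_surrogate(code_point: int) -> bool:
    # General category "Cs": reserved for UTF-16, never a character.
    return unicodedata.category(chr(code_point)) == "Cs"


def is_unassigned_or_noncharacter(code_point: int) -> bool:
    # General category "Cn" covers unassigned code points and the
    # permanently reserved noncharacters such as U+FFFF.
    return unicodedata.category(chr(code_point)) == "Cn"


def starts_with_combining_mark(text: str) -> bool:
    # A non-zero canonical combining class means the code point attaches
    # to a preceding base character.
    return bool(text) and unicodedata.combining(text[0]) != 0


if __name__ == "__main__":
    print(MAX_CODE_POINT.bit_length())            # bits needed for a code point
    print(is_surrogate(0xD800))
    print(is_unassigned_or_noncharacter(0xFFFF))

    accented = "e\u0301"                           # "e" + COMBINING ACUTE ACCENT
    print(starts_with_combining_mark(accented))
    print(starts_with_combining_mark(accented[1:]))  # slice drops the base
```

Slicing `accented` at index 1 leaves a combining mark with nothing to attach to, which is the kind of substring the last bullet warns about.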
> >>
> >>         In case you were thinking of UTF-8, not all byte
> >>         sequences are valid UTF-8.
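
To make the UTF-8 remark concrete, here is a small Python sketch that relies on the strict built-in decoder to reject malformed input. The function name `is_valid_utf8` is an assumption for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    samples = {
        "plain ASCII": b"hello",
        "two-byte sequence": "\u00e9".encode("utf-8"),
        "byte that never occurs in UTF-8": b"\xff",
        "overlong encoding of NUL": b"\xc0\x80",
        "truncated two-byte sequence": "\u00e9".encode("utf-8")[:1],
        "encoded surrogate": b"\xed\xa0\x80",
    }
    for label, data in samples.items():
        print(label, is_valid_utf8(data))
```

The truncated case also shows that cutting a valid UTF-8 byte string at an arbitrary offset can yield an invalid one.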
> >>
> >>         Byte streams are as important as you say, but it's
> >>         really hard to see the software for a radar or a
> >>         radio telescope as processing strings...
> >>
> >>     _______________________________________________
> >>     erlang-questions mailing list
> >>     erlang-questions@REDACTED <mailto:erlang-questions@REDACTED>
> >>     http://erlang.org/mailman/listinfo/erlang-questions
> >>     <http://erlang.org/mailman/listinfo/erlang-questions>
>
> --
> Loïc Hoguin
> https://ninenines.eu