[erlang-questions] Erlang basic doubts about String, message passing and context switching overhead
Benoit Chesneau
bchesneau@REDACTED
Sat Jan 14 16:29:59 CET 2017
Someone has to write a good non-blocking binding of ICU :) I think it
would be easier than reinventing the wheel. i18n [1] was a good start but
looks abandoned these days. Also, I disliked that it loads its resources
into ETS at startup.

[1] https://github.com/erlang-unicode/i18n
- benoît
On Sat, Jan 14, 2017 at 3:56 PM Loïc Hoguin <essen@REDACTED> wrote:
> We need support for locales before we can do proper operations on text.
> Just Unicode isn't enough.
>
> On 01/14/2017 03:18 PM, John Doe wrote:
> > Indeed, Unicode uppercase/lowercase is one of the most essential
> > string features that doesn't exist yet in the Erlang stdlib. I'm aware
> > of the problems with some letters and scripts, such as the German ß (SS)
> > or the Turkish I, but still, having upper/lower in stdlib is a must, IMO.
> > The problem is that uppercase/lowercase would require support for
> > Unicode normalization.
> >
> > 2017-01-14 1:34 GMT+03:00 Michał Muskała <michal@REDACTED>:
> >
> > I fully agree there are no languages that deal with strings
> > perfectly. That said, there are those that are better at it and those
> > that aren't so good. A language where I need to look for a library
> > to upcase or downcase my own name fits into the second group in my
> > book.
> >
> >
> > Michał.
> >
> > On 13 Jan 2017, 13:20 +0100, Jesper Louis Andersen
> > <jesper.louis.andersen@REDACTED>, wrote:
> >> Richard is indeed right, depending on what your definition of
> >> "String" is.
> >>
> >> If a "String" is "an array of characters from some alphabet", then
> >> you need to take into account that strings are sequences of Unicode
> >> code points in practice. This is also the most precise definition
> >> from a technical point of view.
> >>
> >> When I wrote my post, I was--probably incorrectly--assuming the
> >> older notion of a "String" where the representation is either
> >> ASCII or something like ISO-8859-15. In this case, a string
> >> coincides with a stream of bytes.
> >>
> >> Data needs parsing. A lot of data comes in as some kind of stringy
> >> representation: UTF-8, byte array (binary), and so on.
> >>
> >> And of course, that isn't the whole story, since there are
> >> examples of input which are not string-like in their forms.
> >>
> >>
> >> On Fri, Jan 13, 2017 at 2:34 AM Richard A. O'Keefe
> >> <ok@REDACTED> wrote:
> >>
> >>
> >>
> >> On 13/01/17 8:56 AM, Jesper Louis Andersen wrote:
> >> > Strings are really just streams of bytes.
> >>
> >> That was true a long time ago. Maybe.
> >> But it isn't anywhere near accurate as a description
> >> of Unicode:
> >> - Unicode is made of 21-bit code points, not bytes.
> >> - Most possible code points are not defined.
> >> - Some of those that are defined are defined as
> >>   "it is illegal to use this".
> >> - Unicode sequences have *structure*; it is simply
> >>   not the case that every sequence of allowable
> >>   Unicode code points is a legal Unicode string.
> >> - As a special case of that, if s is a non-empty
> >>   valid Unicode string, it is not true that every
> >>   substring of s is a valid Unicode string.
> >>
> >> In case you were thinking of UTF-8, not all byte
> >> sequences are valid UTF-8.
> >>
> >> Byte streams are as important as you say, but it's
> >> really hard to see the software for a radar or a
> >> radio telescope as processing strings...
> >>
> >> _______________________________________________
> >> erlang-questions mailing list
> >> erlang-questions@REDACTED
> >> http://erlang.org/mailman/listinfo/erlang-questions
> >
>
> --
> Loïc Hoguin
> https://ninenines.eu
>
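
For readers following the quoted discussion, here is a minimal sketch of three points raised in it: locale-independent case mapping (German ß, Turkish I), the need for normalization before comparing text, and the fact that not every byte sequence is valid UTF-8. It uses Python only as an illustration language. Nothing here is an Erlang stdlib or ICU API, and the helper `is_valid_utf8` and all variable names are made up for this example.

```python
import unicodedata


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Case mapping without a locale: the default Unicode mapping turns the
# German sharp s into two letters, so upper-casing changes the length.
upper_strasse = "straße".upper()

# Turkish expects "I" to lowercase to dotless i (U+0131); the
# locale-independent mapping gives a plain "i" instead.
lower_capital_i = "I".lower()

# Normalization: e-acute can be one code point or "e" plus a combining
# accent. The two spellings only compare equal after normalizing.
precomposed = "\u00e9"
decomposed = "e\u0301"
equal_raw = precomposed == decomposed
equal_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

# Byte sequences that are not valid UTF-8.
lone_continuation = b"\x80"          # continuation byte with no lead byte
encoded_surrogate = b"\xed\xa0\x80"  # U+D800, a surrogate code point
cut_in_half = precomposed.encode("utf-8")[:1]  # slice through a 2-byte character
```

The Turkish case is the one that cannot be fixed without locale information: the same input "I" has two correct lowercase forms depending on the language of the text, which is why a binding to a locale-aware library such as ICU keeps coming up in this thread.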