[erlang-questions] Erlang basic doubts about String, message passing and context switching overhead
Richard A. O'Keefe
Fri Jan 13 02:33:56 CET 2017
On 13/01/17 8:56 AM, Jesper Louis Andersen wrote:
> Strings are really just streams of bytes.
That was true a long time ago. Maybe.
But it isn't anywhere near accurate as a description
of Unicode:
- Unicode is made of 21-bit code points, not bytes.
- Most possible code points are not defined.
- Some of those that are defined are defined as
"it is illegal to use this".
- Unicode sequences have *structure*; it is simply
not the case that every sequence of allowable
Unicode code points is a legal Unicode string.
- As a special case of that, if s is a non-empty
valid Unicode string, it is not true that every
substring of s is a valid Unicode string.
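A few of the points above can be checked mechanically. What follows
is only a rough sketch in Python, not anything from the original
thread; the helper names (can_encode, count_unassigned) are made up
for illustration, and the counts depend on the Unicode version that
ships with the interpreter. Whether a string that begins with a
combining mark should count as "invalid" depends on the definition
in use; the last lines only show that slicing can separate a mark
from its base.

```python
import unicodedata

MAX_CODE_POINT = 0x10FFFF  # highest code point Unicode allows


def can_encode(cp: int) -> bool:
    """Hypothetical helper: can this code point be encoded as UTF-8?"""
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


def count_unassigned() -> int:
    """Hypothetical helper: count code points with general category Cn
    (reserved or noncharacter) in this interpreter's Unicode database."""
    return sum(
        1
        for cp in range(MAX_CODE_POINT + 1)
        if unicodedata.category(chr(cp)) == "Cn"
    )


# Code points need 21 bits, not 8.
print(MAX_CODE_POINT.bit_length())

# Surrogates (U+D800..U+DFFF) are code points, but may not be encoded.
print(can_encode(0x41), can_encode(0xD800))

# U+FFFF is a noncharacter: defined, but not meant for interchange.
print(unicodedata.category(chr(0xFFFF)))

# How much of the code space has no assigned character at all.
print(count_unassigned(), "of", MAX_CODE_POINT + 1)

# Slicing a string can cut a combining mark loose from its base.
s = "e\u0301"          # 'e' followed by COMBINING ACUTE ACCENT
tail = s[1:]
print(unicodedata.combining(tail[0]) != 0)
```

The counting loop walks the whole code space, so it takes a moment
to run.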
In case you were thinking of UTF-8, not all byte
sequences are valid UTF-8.
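As a small illustration of the UTF-8 point, here is a sketch in
Python (not from the original thread) that leans on the strict
decoder in the standard library. The function name is_valid_utf8 and
the sample byte strings are chosen only for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


samples = {
    "euro sign": b"\xe2\x82\xac",            # well-formed 3-byte sequence
    "truncated euro sign": b"\xe2\x82",      # final byte missing
    "overlong NUL": b"\xc0\x80",             # overlong forms are forbidden
    "encoded surrogate": b"\xed\xa0\x80",    # U+D800 may not be encoded
    "stray 0xFF": b"\xff",                   # 0xFF never occurs in UTF-8
}

for label, data in samples.items():
    print(label, is_valid_utf8(data))
```

Only the first sample is accepted; the other four are arbitrary
bytes that no conforming UTF-8 decoder should let through.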
Byte streams are as important as you say, but it's
really hard to see the software for a radar or a
radio telescope as processing strings...