[erlang-questions] byte() vs. char() use in documentation
Mon May 2 11:58:24 CEST 2011
On Mon, May 02, 2011 at 09:35:18AM +0000, Robert Virding wrote:
> ----- "Raimo Niskanen" <> wrote:
> > On Mon, May 02, 2011 at 12:01:49AM +0100, James Churchman wrote:
> > > more of a question than an actual answer, but in erlang can erlang
> > > strings (therefore io-lists) be utf-16?
> > A string is a list of unicode code points.
> > An IO-list is a list of binaries or bytes.
> > >
> > > I assume that binaries are obviously only ever utf8 representation,
> > > but a list of ints can obviously contain numbers above 255.
> > You can choose your binary representation. See erlang man page
> > unicode(3).
> > >
> > > so maybe (??) the answer is
> > >
> > > a) iolist CAN be a char() (.. this is surely especially true if the
> > > data is only being messaged through erlang from other systems)
> > No. byte().
> As a string is a list of unicode code points and an iolist can contain a string, the iolist's element type must also be char().
No. As it stands now, a string is a list of unicode code points and
cannot be contained in an iolist.
This became messy when char() was re-defined from latin-1 character
to unicode character. That changed string(), which in turn changed
iolist(), and the latter change was incorrect.
We must clean up the mess, either by completing the notion of char()
being unicode and hence rewriting iolist() to contain byte() and binary(),
or by reverting to char() being a latin-1 char and using unicode:char()
and unicode:string() where that is correct...
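The byte() vs. unicode code point distinction can be checked directly.
What follows is only a minimal sketch: the module name iolist_vs_string
is made up for the example, and it assumes the behaviour described in
the unicode(3) man page. A list whose elements all fit in byte() is
accepted as an iolist, a list holding a code point above 255 is not,
and the unicode module lets you pick the binary representation.

```erlang
%% Minimal sketch; the module name is hypothetical.
-module(iolist_vs_string).
-export([demo/0]).

demo() ->
    Latin1  = "abc",          % every element fits in byte()
    Unicode = [16#20AC],      % EURO SIGN, a code point above 255

    %% A list of bytes is a valid iolist().
    <<"abc">> = iolist_to_binary(Latin1),

    %% A code point above 255 is not a byte(), so this list is not an
    %% iolist() and the conversion fails with badarg.
    {'EXIT', {badarg, _}} = (catch iolist_to_binary(Unicode)),

    %% The unicode module accepts code points; UTF-8 is the default
    %% output encoding.
    <<16#E2, 16#82, 16#AC>> = unicode:characters_to_binary(Unicode),

    %% Another representation can be chosen, e.g. UTF-16 (big endian).
    <<16#20AC:16>> = unicode:characters_to_binary(Unicode, unicode, utf16),
    ok.
```

After compiling the module, iolist_vs_string:demo() should return ok if
every match above holds.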
/ Raimo Niskanen, Erlang/OTP, Ericsson AB