[erlang-questions] unicode:characters_to_list

Thu Mar 22 17:17:31 CET 2012

On 2012-03-22, at 16:48 , Michael Uvarov wrote:
> Also, in utf-8 each code point can be encoded using from 1 to 6 bytes.

1 to 4: Unicode is defined from U+0000 to U+10FFFF, and code points beyond
this range are to be considered ill-formed. UTF-8 can encode U+10FFFF in 4
bytes (with room to spare), and RFC 3629 restricted its definition to the
same range as Unicode (the original definition did indeed allow for
encoding 31 bits in up to 6 bytes).
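
For anyone who wants to check the byte counts, here is a minimal sketch in
Python (chosen only because it is quick to run; the helper name utf8_len and
the list of boundary code points are made up for this example). It encodes
the first and last code point of each length class and then confirms that
anything past U+10FFFF is rejected, both as a code point and as a byte
sequence in the old 5-byte form.

```python
# First and last code point of each UTF-8 length class under RFC 3629.
boundaries = [0x00, 0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF]


def utf8_len(cp):
    """Return the number of bytes UTF-8 uses to encode code point cp."""
    return len(chr(cp).encode("utf-8"))


lengths = {cp: utf8_len(cp) for cp in boundaries}
for cp, n in lengths.items():
    print(f"U+{cp:04X}: {n} byte(s)")

# U+110000 is outside the Unicode range, so it is not a code point at all.
try:
    chr(0x110000)
    beyond_range_accepted = True
except ValueError:
    beyond_range_accepted = False

# A lead byte of 0xF8 introduced a 5-byte sequence in the original UTF-8
# definition; a decoder following RFC 3629 must reject it.
try:
    b"\xf8\x88\x80\x80\x80".decode("utf-8")
    old_form_accepted = True
except UnicodeDecodeError:
    old_form_accepted = False

print("U+110000 accepted:", beyond_range_accepted)
print("5-byte form accepted:", old_form_accepted)
```

The "room to spare" is visible in the bit layout: a 4-byte sequence carries
21 payload bits, enough for values up to 1FFFFF, while Unicode stops at
10FFFF.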