[erlang-questions] Erlang string datatype [cookbook entry #1 - unicode/UTF-8 strings]

Michael Uvarov <>
Sat Oct 22 09:23:10 CEST 2011


I think we need polymorphic strings: a string can be in any Unicode encoding.
A string is a Unicode binary that also carries information about its encoding.
When we need the string in some other form or encoding, the VM converts it
to that encoding.

For example (proposed syntax, not valid Erlang today),
get_string(S as utf8)  ->
    ok.
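
Something close to this can be approximated in current Erlang without VM
support. Below is a minimal sketch, assuming a hypothetical module `ustring`
that represents a string as a tagged tuple {ustring, Encoding, Binary}. Only
unicode:characters_to_binary/3 is a real OTP call; the module name, the tuple
layout and get_string/2 are illustrative assumptions, not an existing API.

```erlang
-module(ustring).
-export([new/2, get_string/2]).

%% Hypothetical representation: the binary travels with its encoding tag.
%% Encoding is any value accepted by the OTP unicode module,
%% e.g. utf8, {utf16, big}, {utf16, little}, {utf32, big}.
new(Bin, Encoding) when is_binary(Bin) ->
    {ustring, Encoding, Bin}.

%% Return the bytes in the requested encoding, converting only if needed.
get_string({ustring, Encoding, Bin}, Encoding) ->
    %% Already stored in the requested encoding: no conversion.
    Bin;
get_string({ustring, From, Bin}, To) ->
    case unicode:characters_to_binary(Bin, From, To) of
        Out when is_binary(Out) -> Out;
        Error -> erlang:error({bad_unicode, Error})
    end.
```

With this sketch, ustring:get_string(ustring:new(<<"abc">>, utf8), {utf16, little})
returns <<97,0,98,0,99,0>>. A real VM-level implementation could avoid the
tuple wrapper and cache converted forms, but that is outside this sketch.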

Also, if we store information about the encoding, we can easily move strings
between machines with different endianness.
We also need an API for NIFs, something like:
enif_get_unicode_string(env, term, out, output_encoding :: utf8 | utf16be | utf16le | ...),
enif_make_unicode_string(env, term, in, input_encoding).
Then we can write NIFs for ICU4C using this API.
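
To illustrate the endianness point: the same text has different bytes in
UTF-16BE and UTF-16LE, and the bytes alone do not say which is which, so the
tag has to travel with the data. This is a small sketch using only the OTP
unicode module; the module name `endian_demo` is made up for the example. It
does not show the NIF side, since the enif_*_unicode_string functions above
are a proposal and, as far as I know, not part of erl_nif.

```erlang
-module(endian_demo).
-export([roundtrip/1]).

%% Encode Text (a UTF-8 binary) as UTF-16 in both byte orders,
%% then decode each result using its matching encoding tag.
roundtrip(Text) when is_binary(Text) ->
    BE = unicode:characters_to_binary(Text, utf8, {utf16, big}),
    LE = unicode:characters_to_binary(Text, utf8, {utf16, little}),
    %% Decoding with the matching tag recovers the original text;
    %% these matches crash if the round trip does not hold.
    Text = unicode:characters_to_binary(BE, {utf16, big}, utf8),
    Text = unicode:characters_to_binary(LE, {utf16, little}, utf8),
    {BE, LE}.
```

For example, endian_demo:roundtrip(<<"A">>) returns {<<0,65>>, <<65,0>>}:
two different byte sequences for the same character.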

--
Best regards,
Uvarov Michael


