[erlang-questions] Fwd: String encoding and character set

Dmitrii 'Mamut' Dimandt dmitriid@REDACTED
Wed Jan 17 11:23:04 CET 2007


Do list_to_binary/binary_to_list preserve code points? That is, does L1 = binary_to_list(list_to_binary(L2)) imply that L1 = L2? If not, then we lose an effective way of sending strings as binaries.
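
A minimal sketch of how one could check this. The module name roundtrip_demo and the function roundtrips/1 are made up for illustration. It relies only on documented behaviour: list_to_binary/1 accepts an iolist (integers 0..255 and binaries, possibly nested) and raises badarg for anything else, and binary_to_list/1 always returns a flat list of bytes.

```erlang
-module(roundtrip_demo).
-export([roundtrips/1]).

%% Hypothetical helper: true when the list survives the round trip
%% through a binary unchanged, false otherwise.
roundtrips(L) ->
    try binary_to_list(list_to_binary(L)) =:= L
    catch
        %% list_to_binary/1 rejects integers outside 0..255
        error:badarg -> false
    end.
```

With this, roundtrips("hello") and roundtrips([233]) are true, since every element fits in one byte. roundtrips([1046, 1091, 1082]) is false, because code points above 255 make list_to_binary/1 fail with badarg. roundtrips(["ab", "c"]) is also false: the nested list is flattened on the way in, so "abc" comes back. So the round trip appears to hold only for flat lists of integers in the range 0..255.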


Romain Lenglet wrote:
> As Robert explained, the current convention for representing strings in 
> Erlang is a flat list of Unicode code-points as integers. Every element 
> in such a list is a character, represented by its Unicode code-point 
> integer value. The 11th character of a string is the 11th element in the 
> list. If you want to encode such a string, you are free to do so, and 
> that is relatively easy. But the current convention is to represent 
> strings *unencoded*, as such lists of Unicode code points.
>   
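
To make the quoted convention concrete, here is a small sketch. The module name codepoints_demo is made up, and the unicode module it uses was added to OTP after this thread was written, so treat it as an assumption about a later release rather than something available at the time. The string is a flat list of code points, indexing is by character, and encoding to UTF-8 is an explicit, separate step.

```erlang
-module(codepoints_demo).
-export([demo/0]).

demo() ->
    %% Six Cyrillic characters, written as Unicode code points
    %% (U+041F U+0440 U+0438 U+0432 U+0435 U+0442).
    S = [1055, 1088, 1080, 1074, 1077, 1090],
    6 = length(S),                          % one list element per character
    1080 = lists:nth(3, S),                 % the 3rd character is the 3rd element
    Bin = unicode:characters_to_binary(S),  % explicit encoding step (UTF-8)
    12 = byte_size(Bin),                    % each of these takes two bytes in UTF-8
    S = unicode:characters_to_list(Bin),    % decoding restores the code points
    ok.
```

The matches inside demo/0 act as assertions, so a call that returns ok means every step held. Note that list_to_binary(S) would fail for this list, since none of its elements fit in a byte.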



