string literal in binary construction

Robert Virding robert.virding@REDACTED
Wed Jul 26 00:39:51 CEST 2006


Tony Rogvall wrote:

>>
>> Hmmm
>>
>> <<A:32, String/string, B/binary>>
>>
>
> Hmmm, what is a string ?  ;-)
>
> String of what....
>
> lets expand the context:
>
> String/utf-8
>
> String/utf-16-big String/utf-16-little
>
> String/utf-32-big String/utf-32-little
>
> String/ascii
>
> String/iso-8859-1
>
> And so on.
>
> ( I could survive with utf extensions )
>
>
>
>> would certainly be useful (easy to implement too)
>
>
> Should not totally kill any one to implement either ;-)

Again I have two comments:

1. What does it mean? Is String a list of ... what? 32-bit Unicode 
codes? 8-bit characters in utf? Or what? And what is then put in the 
binary? 8-bit characters, 16-bit characters, or what?

2. The follow-up is of course what does it mean to use it in a match? 
The fun comes when you have a binary of, say, utf-8 and match on it: do I 
get bytes or unicode codes? And why?
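
To make the ambiguity in point 2 concrete, here is a minimal sketch of the 
two readings, written with the ordinary bit syntax only. The module name 
match_demo and both function names are made up for illustration, and the 
decoder deliberately handles only one- and two-byte utf-8 sequences.

```erlang
-module(match_demo).
-export([first_byte/1, first_code/1]).

%% Reading 1: matching yields the first raw byte of the binary.
first_byte(<<Byte, _/binary>>) ->
    Byte.

%% Reading 2: matching yields the first Unicode code, decoded from utf-8.
%% Hypothetical helper: only 1- and 2-byte sequences are handled here.
first_code(<<0:1, Code:7, _/binary>>) ->
    Code;
first_code(<<2#110:3, Hi:5, 2#10:2, Lo:6, _/binary>>) ->
    (Hi bsl 6) bor Lo.
```

For the utf-8 binary <<16#C3, 16#A9>> the first reading gives 195, the 
second gives 233 (16#E9). Both are defensible answers, which is exactly 
the problem.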

I would prefer a set of conversion functions between strings (lists) of 
32-bit unicode codes and binary representations of that string, for example:

string_to_utf8, string_to_utf16, utf8_to_string etc.

Or better yet put them in module string so we get:

string:to_utf8, string:to_utf16, string:from_utf8, etc.

From my point of view the meaning is much clearer and it feels less 
hardwired. It should be possible to put these conversion functions in a 
separate file and make it easy to add new conversions.
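
As a rough sketch of what such a separate file could look like, assuming 
a string is a list of integer Unicode codes: the module name ustring is 
hypothetical (chosen so as not to touch the real string module), and the 
code does no validation of surrogates or overlong forms.

```erlang
-module(ustring).
-export([to_utf8/1, from_utf8/1]).

%% Encode a list of Unicode codes as a utf-8 binary.
to_utf8(String) when is_list(String) ->
    list_to_binary([encode(C) || C <- String]).

%% Decode a utf-8 binary into a list of Unicode codes.
from_utf8(Bin) when is_binary(Bin) ->
    decode(Bin, []).

%% One clause per utf-8 sequence length (1 to 4 bytes).
encode(C) when is_integer(C), C >= 0, C < 16#80 ->
    <<C>>;
encode(C) when is_integer(C), C >= 16#80, C < 16#800 ->
    <<2#110:3, (C bsr 6):5,
      2#10:2, (C band 16#3F):6>>;
encode(C) when is_integer(C), C >= 16#800, C < 16#10000 ->
    <<2#1110:4, (C bsr 12):4,
      2#10:2, ((C bsr 6) band 16#3F):6,
      2#10:2, (C band 16#3F):6>>;
encode(C) when is_integer(C), C >= 16#10000, C < 16#110000 ->
    <<2#11110:5, (C bsr 18):3,
      2#10:2, ((C bsr 12) band 16#3F):6,
      2#10:2, ((C bsr 6) band 16#3F):6,
      2#10:2, (C band 16#3F):6>>.

%% Mirror image of encode/1; crashes on input that is not well formed.
decode(<<>>, Acc) ->
    lists:reverse(Acc);
decode(<<0:1, C:7, Rest/binary>>, Acc) ->
    decode(Rest, [C | Acc]);
decode(<<2#110:3, A:5, 2#10:2, B:6, Rest/binary>>, Acc) ->
    decode(Rest, [(A bsl 6) bor B | Acc]);
decode(<<2#1110:4, A:4, 2#10:2, B:6, 2#10:2, C:6, Rest/binary>>, Acc) ->
    decode(Rest, [(A bsl 12) bor (B bsl 6) bor C | Acc]);
decode(<<2#11110:5, A:3, 2#10:2, B:6, 2#10:2, C:6, 2#10:2, D:6,
         Rest/binary>>, Acc) ->
    decode(Rest, [(A bsl 18) bor (B bsl 12) bor (C bsl 6) bor D | Acc]).
```

With this, ustring:to_utf8([16#20AC]) gives <<16#E2, 16#82, 16#AC>>, and 
ustring:from_utf8/1 turns that binary back into [16#20AC]. A utf-16 
variant would be another pair of functions in the same style, with no 
change to the bit syntax itself.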

Robert


