[erlang-questions] binary typed schema-less protocol

Tony Rogvall tony@REDACTED
Tue Jul 30 00:24:06 CEST 2013


On 29 jul 2013, at 23:52, Richard A. O'Keefe <ok@REDACTED> wrote:

> 
> On 29/07/2013, at 11:55 PM, Tony Rogvall wrote:
> 
>> I am also pretty pleased with cson :-)
>> Looks very nice.
>> 
>> A small question.
>> How are strings encoded?
>> 
>> a)  <utf8-octet-string-length>"<utf8-chars> 
>> b)  <number-of-unicode-chars>"<integers>*
>> c)  <number-of-unicode-chars>"<utf8-char>*
>> d) Other?
> 
> Well, I was doing this in my Smalltalk system, where Strings
> are mutable arrays of Unicode, and where byte streams support
> a #nextUtf8 method.  So having read a <number of unicode
> chars> and noticed the " the next step is
> 
> 	str := String new: n.
> 	1 to: n do: [:i | str at: i put: stream nextUtf8].
> 	^str
> 
> So it's (c). 
> 

Thanks. That was what I was afraid of.
Luckily I remembered that there is now support for UTF-8 in Erlang's binary syntax:

%% Decode I code points (not bytes) from the binary; returns {Chars, Rest}.
decode_string(0, Cs, Acc) ->
    {lists:reverse(Acc), Cs};
decode_string(I, <<Char/utf8,Cs/binary>>, Acc) ->
    decode_string(I-1,Cs,[Char|Acc]).
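
For anyone who wants to try this, here is a minimal self-contained sketch built around the two clauses above. The module name cson_string, the decode/1 and encode/1 wrappers, and the decode_count/2 helper are hypothetical. It also assumes the character count is written as ASCII decimal digits before the double quote, which the thread does not spell out.

```erlang
-module(cson_string).
-export([decode/1, encode/1]).

%% Assumption: the count is ASCII decimal digits followed by a double
%% quote, i.e. option (c): <number-of-unicode-chars>"<utf8-char>*
decode(Bin) ->
    {N, Rest} = decode_count(Bin, 0),
    decode_string(N, Rest, []).

%% Accumulate decimal digits until the '"' separator is reached.
decode_count(<<$", Rest/binary>>, N) ->
    {N, Rest};
decode_count(<<D, Rest/binary>>, N) when D >= $0, D =< $9 ->
    decode_count(Rest, N * 10 + (D - $0)).

%% Same logic as in the message: consume N code points, not N bytes.
%% The /utf8 segment type decodes one code point per match.
decode_string(0, Cs, Acc) ->
    {lists:reverse(Acc), Cs};
decode_string(I, <<Char/utf8, Cs/binary>>, Acc) ->
    decode_string(I - 1, Cs, [Char | Acc]).

%% Inverse direction: the prefix is the number of code points in the
%% list, not byte_size/1 of the UTF-8 encoded binary.
encode(Chars) when is_list(Chars) ->
    Utf8 = unicode:characters_to_binary(Chars),
    <<(integer_to_binary(length(Chars)))/binary, $", Utf8/binary>>.
```

With this, cson_string:decode(<<"3\"", 229/utf8, 228/utf8, 246/utf8, "!">>) yields a tuple equal to {[229,228,246], <<"!">>}: three code points are consumed from six bytes, and the trailing byte is returned untouched. Had the format been option (a), a single <<Str:Len/binary, Rest/binary>> match would have split the string off without walking it character by character.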

So all is well.

/Tony



