[erlang-questions] EEP 10
Per Melin
per.melin@REDACTED
Thu May 15 23:45:00 CEST 2008
2008/5/15 David Mercer <dmercer@REDACTED>:
> I don't know of an "easy" way to write using the current bit syntax:
>
> <<Ch/utf8,_/binary>> = BinString
>
> I'd have thought we would need the utf8 (and utf16) binary extensions to do
> this easily. What is the "easy" way to do this without the proposed
> extensions?
This is off the top of my head and I haven't really tested it, but I
think it should work. I don't know whether it falls within what you'd
call easy, though I think it's easy enough.
Calling the function u(BinStr) should return Ch if u/1 is:
u(<<2#11110:5, Ch:27, _/binary>>) -> 2#11110 bsl 27 bor Ch;
u(<<2#1110:4, Ch:20, _/binary>>) -> 2#1110 bsl 20 bor Ch;
u(<<2#110:3, Ch:13, _/binary>>) -> 2#110 bsl 13 bor Ch;
u(<<Ch:8, _/binary>>) -> Ch.
Though, after googling for other solutions (maybe I should have done
that first?), I realise that this would also happily process broken
UTF-8 without complaint.
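
For comparison, here is a minimal sketch of a stricter variant. It is not
from the original post: the module name utf8_sketch, the function
first_codepoint/1 and the helper check/2 are made up for illustration.
Unlike u/1 above, which returns the encoded bytes packed into one
integer, this version returns the decoded code point, and it rejects
bad continuation bytes, overlong encodings, surrogates and values above
16#10FFFF.

```erlang
-module(utf8_sketch).
-export([first_codepoint/1]).

%% Hypothetical helper, not part of the original post.
%% Returns the code point of the first UTF-8 character in the binary,
%% or {error, invalid_utf8} if the leading bytes are not well-formed.

%% 1 byte: 0xxxxxxx
first_codepoint(<<0:1, Ch:7, _/binary>>) ->
    Ch;
%% 2 bytes: 110xxxxx 10xxxxxx
first_codepoint(<<2#110:3, A:5, 2#10:2, B:6, _/binary>>) ->
    check((A bsl 6) bor B, 16#80);
%% 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
first_codepoint(<<2#1110:4, A:4, 2#10:2, B:6, 2#10:2, C:6, _/binary>>) ->
    check((A bsl 12) bor (B bsl 6) bor C, 16#800);
%% 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
first_codepoint(<<2#11110:5, A:3, 2#10:2, B:6, 2#10:2, C:6,
                  2#10:2, D:6, _/binary>>) ->
    check((A bsl 18) bor (B bsl 12) bor (C bsl 6) bor D, 16#10000);
%% Anything else: empty input, stray continuation byte, truncated sequence.
first_codepoint(_) ->
    {error, invalid_utf8}.

%% Min is the smallest code point that legitimately needs this many bytes.
check(Ch, Min) when Ch < Min ->
    {error, invalid_utf8};                      % overlong encoding
check(Ch, _Min) when Ch >= 16#D800, Ch =< 16#DFFF ->
    {error, invalid_utf8};                      % UTF-16 surrogate range
check(Ch, _Min) when Ch > 16#10FFFF ->
    {error, invalid_utf8};                      % beyond the Unicode range
check(Ch, _Min) ->
    Ch.
```

With this sketch, utf8_sketch:first_codepoint(<<16#C3, 16#A9>>) gives
16#E9, while <<16#C3, 16#28>> (a lead byte followed by a non-continuation
byte) gives {error, invalid_utf8}.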
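(And it would be more efficient to reverse the order of the above
function definitions, provided the single-byte clause then matches
<<0:1, Ch:7, _/binary>> so that it does not catch every input.)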
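
A sketch of what the reordered definitions might look like, assuming the
4-byte clause uses 27 payload bits so that the pattern covers exactly 32
bits. The module name utf8_order_sketch is made up for illustration. The
single-byte clause pins the top bit to 0, since a bare Ch:8 placed first
would match every non-empty binary and shadow the other clauses. Like the
original u/1, this returns the raw encoded bytes as one integer and does
not validate continuation bytes.

```erlang
-module(utf8_order_sketch).
-export([u/1]).

%% Shortest sequences first. Every clause pins its leading bits, so no
%% clause can shadow a later one.
u(<<0:1, Ch:7, _/binary>>)        -> Ch;                       % 0xxxxxxx
u(<<2#110:3, Ch:13, _/binary>>)   -> (2#110 bsl 13) bor Ch;    % 2 bytes
u(<<2#1110:4, Ch:20, _/binary>>)  -> (2#1110 bsl 20) bor Ch;   % 3 bytes
u(<<2#11110:5, Ch:27, _/binary>>) -> (2#11110 bsl 27) bor Ch.  % 4 bytes
```

A lone continuation byte such as <<16#80>> matches none of these clauses,
so the call fails with a function_clause error instead of returning a
value.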