[erlang-bugs] Bug in unicode characters_to_list trap
Thu May 2 15:56:36 CEST 2013
On 05/02/2013 10:48 AM, Patrik Nyblom wrote:
> Hi David!
> On 05/01/2013 06:35 PM, David Buckley wrote:
>> Simple test session:
>> [ 17:28 ] :~% erl
>> Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4]
>> [async-threads:0] [hipe] [kernel-poll:false]
>> Eshell V5.9.1 (abort with ^G)
>> 1> <<_, RR/binary>> = <<$a,164,161,$b>>.
>> 2> RR.
>> 3> unicode:characters_to_list(RR).
>> 4> unicode:characters_to_list(list_to_binary(binary_to_list(RR))).
> Yep - that's a bug, no doubt...
> Can you try a source code patch when I've found a cure?
A small patch is attached. The full patch will of course also contain a
test case, but this is the minimal fix.
It would be great if you would also test it; I will meanwhile prepare a
fix in maint...
>> I'm using Debian's default erlang build, but I've verified the bug on
>> various others, and can't see it in the release notes.
>> Description: The latter two calls should return the same value, as
>> list_to_binary(binary_to_list(RR)) =:= RR.
>> I would guess that the code in Erlang's guts is taking the failure offset
>> into the binary part as an offset into the full binary. At least, the
>> return values are consistent with this.
> Good guess, I agree.
And, you were absolutely right!
>> Workaround is just to call list_to_binary(binary_to_list()) on your data
>> before calling unicode:characters_to_list on it. Or manually offsetting
>> into the binary yourself in the case of a failed parse.
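For anyone stuck on an affected release, the first workaround quoted above can be wrapped in a small helper. This is only a sketch: the module and function names are made up for illustration and are not part of OTP, and it assumes the input is a plain binary rather than mixed chardata.

```erlang
-module(unicode_workaround).
-export([characters_to_list_copied/1]).

%% Hypothetical helper, not part of OTP.
%% Round-trips the binary through a list so that the value handed to
%% unicode:characters_to_list/1 is a freshly built binary and not a
%% sub-binary of some larger one (e.g. the RR produced by a binary match).
characters_to_list_copied(Bin) when is_binary(Bin) ->
    Fresh = list_to_binary(binary_to_list(Bin)),
    unicode:characters_to_list(Fresh).
```

With the RR from the session above, unicode_workaround:characters_to_list_copied(RR) should give the same result as call 4. The price is one extra copy of the data per call, so it is probably only worth using until a fixed release is installed.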
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 765 bytes
Desc: not available