[erlang-bugs] Binary memory reuse issue in unicode:characters_to_list

Patrik Nyblom pan@REDACTED
Wed Aug 14 10:29:38 CEST 2013


Hi!

This bug was fixed in the latest release. See 
https://github.com/erlang/otp/commit/0ebffb2b55bd1870bfbe0ea47aa94017d7917084 
for details.

Cheers,
Patrik

On 08/13/2013 02:03 PM, James Wheare wrote:
> Just found this extremely unexpected behaviour when using binary
> pattern matching and unicode:characters_to_list
>
> http://pastebin.com/7EYEhu0Z
>
> Given a 2 byte binary, e.g. <<65,128>> (65 = letter "A", 128 = invalid
> standalone utf8 byte)
>
> <<Char:8,Rest/binary>> = <<65,128>>,
> Char = 65,
> Rest = <<128>>.
>
> unicode:characters_to_list(Rest) should return the error tuple
> {error, [], <<128>>}, but instead it returns {error, [], "A"}
>
> unicode:characters_to_list(<<128>>) produces the desired result even
> though it should be identical.
>
> Making a copy will also give the desired result:
> Rest2 = <<Rest/binary>>,
> unicode:characters_to_list(Rest2).
>
> Is this related to binary optimisations detailed here?
> http://www.erlang.org/doc/efficiency_guide/binaryhandling.html
>
> Seems like a bug in the unicode nif.
>
> Note that it doesn't reproduce in all environments, even with the
> same Erlang version. Two identical Linux VMs running under
> VirtualBox on two separate host machines produced different results
> (one showed the bug, one didn't)
> _______________________________________________
> erlang-bugs mailing list
> erlang-bugs@REDACTED
> http://erlang.org/mailman/listinfo/erlang-bugs
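
For anyone wanting to check whether a given installation is affected, the report above can be turned into a small module. This is only a sketch: the module name subbin_check and the function check/1 are made up for illustration, and binary:copy/1 is used as one explicit way of forcing a fresh copy (the report used <<Rest/binary>> instead). The binary is passed in as an argument rather than written as a literal inside the function, on the assumption that this keeps the match from being resolved at compile time.

```erlang
-module(subbin_check).
-export([check/1]).

%% Hypothetical helper, not part of the original report or of OTP.
%% Splits off the first byte and converts the remainder three ways:
%% as the sub-binary produced by the match, as an explicit copy,
%% and as a freshly written literal for comparison.
check(Bin) ->
    <<_Char:8, Rest/binary>> = Bin,
    FromSub = unicode:characters_to_list(Rest),
    FromCopy = unicode:characters_to_list(binary:copy(Rest)),
    FromLiteral = unicode:characters_to_list(<<128>>),
    {FromSub, FromCopy, FromLiteral}.
```

Calling subbin_check:check(<<65,128>>) on a release that contains the fix should give three identical elements, each being the {error, [], <<128>>} tuple the report expects. If the first element differs from the other two, the sub-binary is being handled incorrectly. Since the report says the problem did not reproduce everywhere, a clean result on one machine does not prove an unpatched release is safe.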



