[erlang-bugs] Strings handled differently in the shell and compiled modules
Fri Feb 19 14:00:30 CET 2010
On Thu, Feb 18, 2010 at 02:27:32PM -0800, Geoff Cant wrote:
> I've just been working on some code and came across a surprising result
> and wonder if it's a bug.
> If I create a module with a Unicode string:
> test() ->
> Then the following is true in the shell:
> unitest:test() =/= "©|®|???|[\\-\\.!,]".
> That is, the string literal in the module is a list of UTF-8 bytes and
> the shell string literal is a list of Unicode code points; string
> literals have a different value depending on their context.
> Have I simply missed something in the documentation that says this is
> the expected behaviour? If not, then it'd be nice if shell code and
> module code behaved as similarly as possible.
It might be a terminal and locale problem.
What does this produce?
1> io:format("~w~n", ["©|®|???|[\\-\\.!,]"]).
2> io:format("~w~n", [unitest:test()]).
And, at the OS shell prompt (not the Erlang shell):
$ env | grep '^LC_'
$ echo $LANG
$ cat >test.txt
$ hexdump -C test.txt
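
The cat/hexdump step shows which bytes the terminal actually sends for a
non-ASCII character. As a rough sketch of what to look for (assuming a
POSIX shell with printf, od and xargs; the helper name show_bytes is made
up for this illustration and is not part of any tool mentioned above),
the copyright sign is two bytes when encoded as UTF-8 and one byte when
encoded as ISO-8859-1:

```shell
# show_bytes is a hypothetical helper: it prints stdin as
# space-separated hex bytes on a single line.
show_bytes() {
    od -An -tx1 | xargs
}

# "©" (U+00A9) encoded as UTF-8: two bytes, c2 a9.
printf '\302\251' | show_bytes

# "©" encoded as ISO-8859-1 (Latin-1): one byte, a9.
printf '\251' | show_bytes
```

If hexdump shows c2 a9 for a typed "©", the terminal is sending UTF-8; a
lone a9 would point to a Latin-1 terminal or locale.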
> Geoff Cant
/ Raimo Niskanen, Erlang/OTP, Ericsson AB