[erlang-questions] escript argument encoding
Reid Draper
reiddraper@REDACTED
Thu Mar 20 19:38:50 CET 2014
I’m running into an issue where OSX and Linux seem to treat the command-line arguments to an escript differently. Using the following escript:
#!/usr/bin/env escript
%% -*- erlang -*-
%%! +pc unicode
main([Args]) ->
    io:setopts([{encoding, utf8}]),
    io:format("~w~n", [Args]),
    io:format("~ts~n", [Args]).
Both my OSX and Linux (Ubuntu 13.10) boxes have their LANG set to en_US.UTF-8. I’m running the escript like so:
./sample.escript سلام
On OSX, the escript seems to treat `Args` as a list of Unicode code points:
./sample.escript سلام
[1587,1604,1575,1605]
سلام
On Linux, it seems to treat the input as a list of UTF-8 bytes, with each byte turned into an integer. The Erlang Unicode guide calls this representation 'Lists of UTF-8 Bytes' [1].
./sample.escript سلام
[216,179,217,132,216,167,217,133]
سÙاÙ
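To make the relationship between the two outputs concrete, here is a minimal sketch that converts the byte list seen on Linux into the code-point list seen on OSX. The helper name bytes_to_codepoints/1 is hypothetical (it is not part of OTP), and the sketch assumes the argument really is a list of UTF-8 bytes:

```erlang
#!/usr/bin/env escript
%% -*- erlang -*-

%% Hypothetical helper: reinterpret a list of UTF-8 bytes (integers 0..255)
%% as a list of Unicode code points.
bytes_to_codepoints(Bytes) ->
    unicode:characters_to_list(list_to_binary(Bytes), utf8).

main(_) ->
    %% The byte list printed on Linux for the argument above.
    Bytes = [216,179,217,132,216,167,217,133],
    %% Prints [1587,1604,1575,1605], the list printed on OSX.
    io:format("~w~n", [bytes_to_codepoints(Bytes)]).
```

This only applies to the byte-list form: list_to_binary/1 fails with badarg on a list that already contains code points above 255, so it cannot be applied blindly on both platforms.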
How do I get both OSX and Linux to treat the input of the escript as a list of code points?
Thanks,
Reid
[1] http://www.erlang.org/doc/apps/stdlib/unicode_usage.html