[erlang-questions] puzzled with this charset/encoding -related behaviour

Attila Rajmund Nohl attila.r.nohl@REDACTED
Sat Oct 14 10:12:19 CEST 2017


2017-10-14 4:21 GMT+02:00 Alexandre Karpov <alexakarpov@REDACTED>:
> TL;DR: how do I run erl which understands Unicode?
>
> Or, in more detail:
>
> (Disclaimer: this official documentation got me really humbled:
> http://www1.erlang.org/doc/apps/stdlib/unicode_usage.html
> , and just a little bit scared =) )
>
> Judging by my S/O question, which got 3 upvotes and no answers, I'm not the
> only one wondering:
> https://stackoverflow.com/questions/46735539/erlang-regexp-matching-on-chinese-characters
>
> Here's the gist of the problem:
>
> 57> "абв".
>
> [1072,1073,1074]
>
> The codes are correct Unicode for the [Cyrillic] characters - which means my
> Terminal didn't fail to understand my keyboard's input =) but Erlang shell
> didn't recognize Terminal's input as printable characters. And it is my
> understanding that this is exactly why this call fails:
>
> 25> re:run("йцу.asd", xmerl_regexp:sh_to_awk("*.*"), [{capture, none}]).
> ** exception error: bad argument
>      in function  re:run/3
>         called as re:run([1081,1094,1091,46,97,115,100], "^(.*\\..*)$", [{capture,none}])

Try

re:run(<<"йцу.asd"/utf8>>, xmerl_regexp:sh_to_awk("*.*"), [{capture, none}]).
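
The binary form works because re:run/3 then receives plain UTF-8 bytes instead of a list containing integers above 255, which it rejects as iodata unless told otherwise. An alternative is to keep the list and pass the unicode option. Below is a minimal sketch of both variants, assuming a stock OTP install; the module name re_unicode_demo and the function demo/0 are made up for illustration, and only re:run/3, xmerl_regexp:sh_to_awk/1 and unicode:characters_to_binary/1 come from OTP.

```erlang
-module(re_unicode_demo).
-export([demo/0]).

%% Hypothetical demo module, not part of the original thread.
demo() ->
    %% Per the error message above, this expands to "^(.*\\..*)$".
    Pattern = xmerl_regexp:sh_to_awk("*.*"),

    %% "йцу.asd" as a list of code points, exactly as the shell showed it.
    Subject = [1081,1094,1091,46,97,115,100],

    %% Variant 1: keep the list and tell re that it holds Unicode
    %% code points rather than bytes.
    R1 = re:run(Subject, Pattern, [unicode, {capture, none}]),

    %% Variant 2: encode to a UTF-8 binary first; re then matches on
    %% bytes, so the unicode option is not required for this pattern.
    Bin = unicode:characters_to_binary(Subject),
    R2 = re:run(Bin, Pattern, [{capture, none}]),

    {R1, R2}.
```

After compiling with c(re_unicode_demo), calling re_unicode_demo:demo() should return {match,match}. Matching on bytes is adequate here because the pattern only looks for a literal dot; a pattern that counts or classifies characters would need the unicode option even with a binary subject. As for the shell printing [1072,1073,1074] instead of the string, starting it as erl +pc unicode should make it treat such lists as printable.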


