[erlang-questions] Strings as Lists

Sean Hinde sean.hinde@REDACTED
Thu Feb 14 17:24:05 CET 2008


Has anyone here noticed that Erlang ships with a library module for
charset conversion (xmerl_ucs)? It may take you some part of the way.
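
As a rough sketch of what "part of the way" could look like, the module below decodes the UTF-8 byte list from the quoted message into Unicode code points with xmerl_ucs:from_utf8/1 and re-encodes it with xmerl_ucs:to_utf8/1. The module name ucs_sketch and its function names are made up for illustration. This only covers encoding conversion; case mapping and normalization are not handled here.

```erlang
-module(ucs_sketch).
-export([decode/1, char_count/1, roundtrip/1]).

%% Hypothetical helper module, for illustration only.

%% Decode a list of UTF-8 bytes into a list of Unicode code points.
%% xmerl_ucs:from_utf8/1 exits if the input is not valid UTF-8.
decode(Utf8Bytes) ->
    xmerl_ucs:from_utf8(Utf8Bytes).

%% Count code points rather than bytes.
%% Note: a code point is still not the same thing as a user-perceived
%% character when combining marks are present.
char_count(Utf8Bytes) ->
    length(decode(Utf8Bytes)).

%% Decode and re-encode; valid UTF-8 input should come back unchanged.
roundtrip(Utf8Bytes) ->
    xmerl_ucs:to_utf8(decode(Utf8Bytes)).
```

With L = [208,191,209,128,208,184,208,178,208,181,209,130], ucs_sketch:decode(L) gives [1087,1088,1080,1074,1077,1090], so length/1 reports 6 code points instead of 12 bytes, and maps or folds over the decoded list at least see one element per code point.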

Sean

On 14 Feb 2008, at 09:37, Hasan Veldstra wrote:

>
> Erlang currently sucks for working with Unicode, and as a
> consequence, sucks for working with strings.
>
> This isn't a fault of the language, just the lack of libraries.
>
> Pretending that lists with a bit of DIY are good enough doesn't help.
>
> Yeah, you can load text in any Unicode encoding into an Erlang list
> with no problems... but there's much more to supporting Unicode than
> that.
>
> For example, say you've got the string "привет" (which is
> Russian for "hi") encoded in UTF-8 in list L:
>
> L = [208, 191, 209, 128, 208, 184, 208, 178, 208, 181, 209, 130]
>
> Now say you want to convert it to uppercase. Well, you can't.
> string:to_upper() won't work, as the only encoding it's aware of is
> ISO Latin-1.
>
> As soon as you've got text in anything other than ISO Latin-1, the
> arguments about niceties of being able to do maps/folds/
> comprehensions on lists pretending to be strings become void. You
> can't reliably iterate over each character in a UTF-8 or UTF-16
> string in a plain list, because they are variable-width encodings.
> Neither could you do it even if your strings were in UTF-32, because
> they may have composed characters, and you'd have to normalize the
> string first... and then you're well on your way to re-implementing
> Unicode in Erlang yourself. Good luck.
>
> Anyway, I've been working on an Erlang Unicode string library based
> on ICU (http://www.icu-project.org/) for the past week. It's coming
> along nicely, and I'll release an alpha version in another week or so.
>
> Erlang is a great language and platform, and non-existent Unicode
> support is probably the biggest drawback it has. I hope we'll get it
> fixed soon.
>
>
> --
> http://12monkeys.co.uk
> http://hypernumbers.com
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions