[erlang-questions] Strings and Text Processing
Thomas Lindgren
thomasl_erlang@REDACTED
Sat Dec 29 17:59:08 CET 2012
----- Original Message -----
> From: Masklinn <masklinn@REDACTED>
...
>
> I don't think the former and the latter match. Erlang/OTP can be nice at
> string processing where "string" is understood as "sequence of bytes",
> but it remains rather ungood at *text* processing: *as far as I know*,
> aside from encoding and decoding UTFs it has very limited support for
> it[0]: no support (note: by "support" I mean "support built into the
> core distribution", it's always possible to call into ICU) for
> UnicodeData queries (codepoint meta-information), Unicode case folding,
> grapheme cluster handling, the important text-processing annexes (UAX 14
> "line breaking algorithm", UAX 15 "normalization forms", UAX 29 "text
> segmentation") or standards (UTS 10 "collation algorithm" and UTS 18
> "regular expressions" as well as, for other parts of the system but also
> part of Unicode itself, UTS 35 "LDML" and its data-formatting and
> data-parsing components), …
Good point. A strong Erlang Unicode library implementing the above would be very nice.
(I'm not a great fan of drivers myself.)
Best regards to Arcturus,
Thomas