[erlang-questions] Strings and Text Processing

Thomas Lindgren thomasl_erlang@REDACTED
Sat Dec 29 17:59:08 CET 2012

----- Original Message -----
> From: Masklinn <masklinn@REDACTED>
...
> 
> I don't think the former and the latter match. Erlang/OTP can be nice at
> string processing where "string" is understood as "sequence of bytes",
> but it remains rather ungood at *text* processing: *as far as I know*,
> aside from encoding and decoding UTFs it has very limited support for
> it[0]: no support (note: by "support" I mean "support built into the
> core distribution", it's always possible to call into ICU) for
> UnicodeData queries (codepoint meta-information), Unicode case folding,
> grapheme cluster handling, the important text-processing annexes (UAX 14
> "line breaking algorithm", UAX 15 "normalization forms", UAX 29 "text
> segmentation") or standards (UTS 10 "collation algorithm" and UTS 18
> "regular expressions", as well as, for other parts of the system but
> also part of Unicode itself, UTS 35 "LDML" and its data-formatting and
> data-parsing components), …


Good point. A strong Erlang Unicode library implementing the above would be very nice.

(I'm not a great fan of drivers myself.)
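
To make the "bytes versus text" distinction concrete, here is a minimal sketch of what a few of the listed features look like in a standard library that ships them. It is written in Python rather than Erlang purely for illustration, and the helper name same_text is hypothetical. It covers only UnicodeData queries, NFC normalization (UAX 15) and case folding, and its comparison is a simplification of full canonical caseless matching. Grapheme clusters, line breaking and collation are not covered.

```python
import unicodedata

# The same visible text, encoded as two different code point sequences.
composed = "\u00e9"      # "é" as one precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def same_text(a, b):
    """Hypothetical helper: compare after NFC normalization and case folding.

    A simplification: full canonical caseless matching as defined by the
    Unicode standard involves additional normalization passes.
    """
    def norm(s):
        return unicodedata.normalize("NFC", s).casefold()
    return norm(a) == norm(b)


# A naive sequence comparison treats the two spellings as different.
print(composed == decomposed)
print(len(composed), len(decomposed))

# A text-level comparison treats them as the same.
print(same_text(composed, decomposed))

# UnicodeData query: name and general category of a code point.
print(unicodedata.name("\u0301"), unicodedata.category("\u0301"))

# Case folding is more than lowercasing: "ß" folds to "ss".
print(same_text("Straße", "STRASSE"))
```

The point is only that comparing sequences and comparing text give different answers, which is the gap a built-in library would have to close.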

Best regards to Arcturus,
Thomas 


