In essence, it sounds like representing strings as
lists of Unicode characters (either as separate
character objects, or simply as integers? (*)), with
ways to encode/decode to other formats (UTF-8, SCSU),
would be a quite reasonable way forward, then? (And, I
suppose, libraries to work with Unicode.)
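
To make the "list of integers plus encode/decode" idea
concrete, here is a minimal sketch. It assumes a current
Erlang/OTP release, where the standard unicode module
provides characters_to_binary/1 and characters_to_list/1;
the module name ustring_sketch and the function demo/0 are
made up for illustration only, and SCSU is not covered.

```erlang
-module(ustring_sketch).
-export([demo/0]).

%% Hypothetical example: a string is a plain list of
%% integer code points, one list element per character.
demo() ->
    Str = [16#48, 16#E5, 16#20AC],             % U+0048, U+00E5, U+20AC
    Utf8 = unicode:characters_to_binary(Str),  % encode the list as UTF-8
    Str = unicode:characters_to_list(Utf8),    % decode; must match the original
    %% Characters in the list vs. bytes in the encoded form.
    {length(Str), byte_size(Utf8)}.
```

The list length counts characters, while the binary size
counts encoded bytes, so the two numbers differ as soon as
a character falls outside ASCII.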

Other useful comments have appeared too. Maybe we
should collect them and see. Are there any programming
language issues to be considered? Backwards


(*) I seem to recall that the old Erlang-5
specification proposed a separate character datatype.
It seems somewhat useful to me, but I haven't really
thought it through.

