Strings (was: Re: are Mnesia tables immutable?)
Andrew Lentvorski
bsder@REDACTED
Thu Jun 29 08:27:24 CEST 2006
Richard A. O'Keefe wrote:
> So we have two possible approaches here:
We have more than that, but how about choice 0:
0) We leave strings alone and simply declare them by fiat to be lists of
integers holding UTF-8 encoded text.
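
To make the representation concrete, here is a small sketch. It is in Python rather than Erlang, purely as an illustration, and the helper names to_int_list and from_int_list are made up for this example. It assumes "list of integers" means one integer per UTF-8 byte.

```python
def to_int_list(text):
    """Return text as a list of integers, one per UTF-8 byte."""
    return list(text.encode("utf-8"))


def from_int_list(ints):
    """Rebuild the text from a list of UTF-8 byte values."""
    return bytes(ints).decode("utf-8")


# Plain ASCII text is the same list of character codes it always was.
ascii_list = to_int_list("abc")        # [97, 98, 99]

# A non-ASCII character simply becomes more than one integer.
accented_list = to_int_list("\u00e9")  # e-acute: [195, 169]
```

Under that assumption, anyone who stays within ASCII sees exactly the lists they see today.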
This has the advantage that strings survive very nicely inside BEAM
files without making any code changes to the Erlang system. It also
means that the current term-to-binary stuff works just fine if a bit
verbose. UTF-8 is documented everywhere on the planet and survives old
systems because it never produces a zero byte except to encode NUL itself.
It is also easy to recognize: a byte either is plain ASCII or has its high
bit set, so it cannot be mistaken for ASCII. Therefore, dropped bytes are
usually fairly easy to spot, and decoding can resume at the next character
so that not all of the information is lost.
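
A rough sketch of those two properties, again in Python for illustration only. The names decode_lossy and has_stray_nul are hypothetical, and the replacement-character behaviour shown is Python's own decoder, not anything Erlang provides.

```python
def decode_lossy(ints):
    """Decode a list of UTF-8 byte values, replacing damaged
    sequences with U+FFFD instead of giving up."""
    return bytes(ints).decode("utf-8", errors="replace")


def has_stray_nul(text):
    """True if encoding text as UTF-8 yields a zero byte even
    though the text contains no NUL character."""
    return "\x00" not in text and 0 in text.encode("utf-8")


intact = list("a\u00e9b".encode("utf-8"))   # [97, 195, 169, 98]

# Drop the lead byte of the two-byte character.
damaged = intact[:1] + intact[2:]           # [97, 169, 98]

# The damage is visible, and the "a" and "b" around it survive.
recovered = decode_lossy(damaged)           # "a\ufffdb"
```

The orphaned byte 169 has its high bit set, so it cannot be misread as ASCII, and the decoder picks up again at the next character.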
This requires *0* lines of code and no understanding by those who stay
within the ASCII character set.
There should, however, be a module which converts between the internal
format and the various other encodings. Probably the end result of
encoding should be a binary object. That way, if you want to put a
string on the wire in a particular encoding, you can. If you don't want
to, you don't have to. And there will always be someone who doesn't
want to.
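
Something along these lines is what I have in mind for that module, sketched in Python for illustration. The function names encode_as and decode_from are hypothetical, the encoding names are Python's codec names, and the internal format is assumed to be a list of UTF-8 byte values.

```python
def encode_as(ints, encoding):
    """Convert the internal list of UTF-8 byte values into a
    binary (bytes) object in the requested encoding."""
    text = bytes(ints).decode("utf-8")
    return text.encode(encoding)


def decode_from(data, encoding):
    """Convert a binary in some external encoding back into the
    internal list of UTF-8 byte values."""
    text = data.decode(encoding)
    return list(text.encode("utf-8"))


internal = [195, 169]                      # e-acute as UTF-8 bytes
wire = encode_as(internal, "latin-1")      # b"\xe9"
back = decode_from(wire, "latin-1")        # [195, 169]
```

Code that never calls the module keeps working on plain lists, which is the whole point.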
-a