[erlang-questions] Erlang Syntax and "Patterns" (Again)
Fred Hebert
mononcqc@REDACTED
Thu Mar 17 13:24:30 CET 2016
On 03/17, Steve Davis wrote:
>3) A codec should be perfectly reversible i.e. X = encode(decode(X)).
>Without tagging, merely parsing out a string as a list is not perfectly
>reversible.
>
>4) What is the right way to implement the function is_string(List) correctly?
>
Those two are tricky in a funny way, because 'String' as a standalone
type does not convey enough information.
What you need to be aware of, for example, is encoding: is it ISO-8859-1
(latin1), a flavor of Unicode (UTF-8, UTF-16, UTF-32, UCS-* etc.), or
another variant such as CP-1252, plain ASCII, and so on? Erlang lists
also let you specify strings as raw Unicode code point sequences rather
than under any specific encoding.
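As a rough sketch of the difference between a code point list and its
encoded forms, the module below encodes the same list under a few
encodings. The module and function names (enc_demo, encodings/1) are
made up for illustration; unicode:characters_to_binary/3 is the
standard library call.

```erlang
-module(enc_demo).
-export([encodings/1]).

%% Encode one list of Unicode code points under several encodings and
%% return the resulting bytes, tagged with the encoding that was used.
encodings(CodePoints) ->
    [{Enc, unicode:characters_to_binary(CodePoints, unicode, Enc)}
     || Enc <- [latin1, utf8, {utf16, big}, {utf32, big}]].
```

For the single code point 233 (e with an acute accent),
enc_demo:encodings([233]) gives <<233>> for latin1, <<195,169>> for
utf8, <<0,233>> for {utf16,big} and <<0,0,0,233>> for {utf32,big}: one
string, four different byte sequences.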
Many of these share the same basic format, so "Hello, World!" shows up
the same in most of them (if you omit byte-order marks). That means you
cannot *detect* the encoding from the data; in most cases it has to be
carried from the input or the specification. Once that information is
in place, you can tag the data appropriately or otherwise make sure you
know its meaning.
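A minimal sketch of that tagging idea, assuming the caller already knows
the encoding from the input or specification. The {text, Encoding,
CodePoints} tuple and the tagged_text module are hypothetical, not an
existing library; only the unicode module calls are real.

```erlang
-module(tagged_text).
-export([decode/2, encode/1]).

%% Decode bytes under an encoding the caller supplies, and keep that
%% encoding next to the code points. Invalid input raises an error.
decode(Bytes, Encoding) when is_binary(Bytes) ->
    case unicode:characters_to_list(Bytes, Encoding) of
        CodePoints when is_list(CodePoints) ->
            {text, Encoding, CodePoints};
        _ErrorOrIncomplete ->
            erlang:error({invalid_text, Encoding})
    end.

%% Encode back with the encoding stored in the tag, which is what makes
%% X = encode(decode(X, Encoding)) hold for valid input.
encode({text, Encoding, CodePoints}) ->
    unicode:characters_to_binary(CodePoints, unicode, Encoding).
```

With Bytes = <<195,169>>, decode(Bytes, latin1) returns
{text,latin1,[195,169]} while decode(Bytes, utf8) returns
{text,utf8,[233]}. Both round-trip back to Bytes through encode/1, but
only because the tag remembers which reading was chosen; the bytes
alone do not say.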
is_string(List) may tell you true or false, but that result does not
tell you whether you can do anything with the string in your libraries,
whether two strings can be merged together, or whether they have been
normalized to a fixed point.
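To illustrate, here is one possible is_string/1, written as a sketch
rather than as the right answer. The str_check module and same_text/2
are hypothetical names, and unicode:characters_to_nfc_list/1 assumes
OTP 20 or later.

```erlang
-module(str_check).
-export([is_string/1, same_text/2]).

%% True for a proper, flat list of integers in the Unicode code point
%% range. It says nothing about encoding or normalization.
is_string([C | Rest]) when is_integer(C), C >= 0, C =< 16#10FFFF ->
    is_string(Rest);
is_string([]) ->
    true;
is_string(_) ->
    false.

%% Compare two code point lists after normalizing both to NFC.
same_text(A, B) ->
    unicode:characters_to_nfc_list(A) =:= unicode:characters_to_nfc_list(B).
```

Both [233] and [101,769] (e followed by a combining acute accent) pass
is_string/1 and render as the same character, yet [233] =:= [101,769]
is false; they only compare equal through same_text/2. The list
[195,169] passes as well, even though it may be UTF-8 bytes that were
never decoded.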
Strings are trickier than that, sadly.