[erlang-questions] Erlang Syntax and "Patterns" (Again)

Fred Hebert mononcqc@REDACTED
Thu Mar 17 13:24:30 CET 2016


On 03/17, Steve Davis wrote:
>3) A codec should be perfectly reversible i.e. X = encode(decode(X)).
>Without tagging, merely parsing out a string as a list is not perfectly
>reversible.
>
>4) What is the right way to implement the function is_string(List) correctly?
>

Those two are kind of tricky because 'String' as a standalone type 
does not convey enough information yet.

What you need to be aware of, for example, is encoding: is it ISO-8859-1 
(latin1), a flavor of Unicode (UTF-8, UTF-16, UTF-32, UCS-*, etc.), or 
another variant such as CP-1252, plain ASCII, and so on? Erlang lists 
also let you specify strings as raw Unicode codepoint sequences rather 
than under any specific encoding.
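
To make the difference concrete, here is a rough sketch (the module and 
function names are made up for illustration) that takes one codepoint 
list and encodes it three ways with the stdlib `unicode` module:

```erlang
-module(enc_demo).
-export([encodings/0]).

%% "é" as a raw Unicode codepoint list: the single codepoint 233 (U+00E9).
%% The list itself carries no encoding; the bytes only appear once we
%% pick one.
encodings() ->
    Codepoints = [233],
    Latin1 = unicode:characters_to_binary(Codepoints, unicode, latin1),
    Utf8   = unicode:characters_to_binary(Codepoints, unicode, utf8),
    Utf16  = unicode:characters_to_binary(Codepoints, unicode, {utf16, big}),
    {Latin1, Utf8, Utf16}.
```

Calling `enc_demo:encodings()` should give `{<<233>>, <<195,169>>, 
<<0,233>>}`: the same character, three different byte sequences.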

Many of these share the same basic format, so "Hello, World!" shows up 
the same in most of them (if you omit byte-order marks). That means you 
cannot *detect* the encoding from the data; in most cases it has to be 
carried along from the input or the specification. Then, once it's in 
place, you can tag the data appropriately or make sure you know its 
meaning.
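
A minimal sketch of what tagging could look like, assuming a 
hypothetical `{text, Encoding, Bytes}` tuple as the wire-side value (the 
tuple shape and all function names are my own invention, not an 
existing API). With the encoding carried along, the round trip from 
point 3 becomes possible for well-formed input:

```erlang
-module(tag_demo).
-export([same_bytes/0, decode/1, encode/1]).

%% ASCII-only text produces identical bytes in latin1 and UTF-8,
%% so the bytes alone cannot tell you which encoding was meant.
same_bytes() ->
    S = "Hello, World!",
    Latin1 = unicode:characters_to_binary(S, unicode, latin1),
    Utf8   = unicode:characters_to_binary(S, unicode, utf8),
    Latin1 =:= Utf8.

%% Decode tagged bytes into a codepoint list, keeping the tag.
%% Assumes Bytes is valid in Encoding; on invalid input
%% unicode:characters_to_list/2 returns an error tuple instead of a list.
decode({text, Encoding, Bytes}) when is_binary(Bytes) ->
    {Encoding, unicode:characters_to_list(Bytes, Encoding)}.

%% Encode back using the encoding that was carried along.
encode({Encoding, Chars}) when is_list(Chars) ->
    {text, Encoding, unicode:characters_to_binary(Chars, unicode, Encoding)}.
```

With that, `X = {text, latin1, <<233>>}` satisfies 
`X =:= tag_demo:encode(tag_demo:decode(X))`, whereas the bare list 
`[233]` on its own could not tell you which bytes to write back.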

is_string(List) may tell you true or false, but the result does not 
tell you whether you can do anything with the string in your libraries, 
merge it with other strings, or be sure it has been normalized to a 
fixed point.
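
For illustration, here is one naive way such a check could be written 
(a sketch only; the definition of "string" used here, a proper list of 
valid Unicode codepoints, is an assumption and just one of several 
possible choices):

```erlang
-module(is_string_demo).
-export([is_string/1]).

%% True for a proper list whose elements are all Unicode scalar values.
%% Improper lists and non-lists yield false.
is_string([]) -> true;
is_string([C | Rest]) -> is_codepoint(C) andalso is_string(Rest);
is_string(_) -> false.

%% Valid codepoints are 0..16#10FFFF, excluding the surrogate range.
is_codepoint(C) when is_integer(C), C >= 0, C < 16#D800 -> true;
is_codepoint(C) when is_integer(C), C > 16#DFFF, C =< 16#10FFFF -> true;
is_codepoint(_) -> false.
```

This returns true for "Hello", but also for [1,2,3], for [], and for 
[104,195,169], which could be three codepoints or could be the UTF-8 
bytes of a two-character string. The answer is technically correct and 
still leaves the questions above open.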

Strings are trickier than that, sadly.


