[erlang-questions] Erlang Syntax and "Patterns" (Again)

Steve Davis steven.charles.davis@REDACTED
Thu Mar 17 16:07:05 CET 2016


Indeed! Which is, I think, why they are best left as the "opaque" binaries that they are when parsing, leaving the final decision about their content to the presentation layer that they are implicitly targeting.
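
As a rough illustration of the reversibility point, here is a minimal sketch of a codec that never interprets its text fields. The wire format (a 16-bit big-endian length followed by that many bytes) and the module name opaque_codec are made up for this example, not taken from any real protocol.

```erlang
-module(opaque_codec).
-export([decode/1, encode/1, roundtrip_ok/1]).

%% Hypothetical wire format: zero or more fields, each a 16-bit
%% big-endian length followed by that many bytes.
%% Fields are returned as binaries and never interpreted as text.
%% Malformed input crashes with function_clause (no error handling here).
decode(<<>>) ->
    [];
decode(<<Len:16, Field:Len/binary, Rest/binary>>) ->
    [Field | decode(Rest)].

%% Re-emit the fields byte for byte. Assumes each field is at most
%% 65535 bytes, which always holds for fields produced by decode/1.
encode(Fields) when is_list(Fields) ->
    << <<(byte_size(F)):16, F/binary>> || F <- Fields >>.

%% The reversibility property: X = encode(decode(X)).
roundtrip_ok(Wire) when is_binary(Wire) ->
    encode(decode(Wire)) =:= Wire.
```

Because decode/1 hands back the original bytes untouched, the round trip holds regardless of what encoding the bytes are in; that question is left to whoever finally displays them.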

/s

> On Mar 17, 2016, at 7:24 AM, Fred Hebert <mononcqc@REDACTED> wrote:
> 
> On 03/17, Steve Davis wrote:
>> 3) A codec should be perfectly reversible i.e. X = encode(decode(X)).
>> Without tagging, merely parsing out a string as a list is not perfectly
>> reversible.
>> 
>> 4) What is the right way to implement the function is_string(List) correctly?
>> 
> 
> Those two are tricky in a funny way because 'String' as a standalone type does not convey enough information.
> 
> What you need to be aware of, for example, is encoding: is it ISO-8859-1 (latin1), a flavor of Unicode (UTF-8, UTF-16, UTF-32, UCS-*, etc.), another variant such as CP-1252, plain ASCII, and so on? Erlang lists also let you represent strings as raw Unicode code point sequences rather than under any specific encoding.
> 
> Many of these share the same basic format, so "Hello, World!" looks the same in most of them (if you omit byte-order marks). You therefore cannot *detect* the encoding from the data; in most cases it has to be carried from the input or the specification. Once that information is in place, you can tag the string appropriately or otherwise make sure you know its meaning.
> 
> is_string(List) may tell you true or false, but the result does not tell you whether your libraries can do anything with the list, whether two such lists can safely be merged, or whether they have been normalized to a fixed point.
> 
> Strings are trickier than that, sadly.
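
To make the quoted point concrete, here is a small sketch. The module name string_guess and both function names are hypothetical; only unicode:characters_to_list/2 is a standard library call. The is_string/1 shown is one possible heuristic (a proper list of valid Unicode code points), not a definitive answer to question 4.

```erlang
-module(string_guess).
-export([is_string/1, interpretations/1]).

%% Heuristic only: true for a proper list whose elements are all
%% integers in the Unicode code point range, excluding surrogates.
%% Note that [] and [1,2,3] pass as well.
is_string([]) ->
    true;
is_string([C | Rest]) when is_integer(C), C >= 0, C < 16#D800;
                           is_integer(C), C > 16#DFFF, C =< 16#10FFFF ->
    is_string(Rest);
is_string(_) ->
    false.

%% Decode the same bytes under two assumed encodings.
%% For bytes that are not valid UTF-8, the utf8 entry is an
%% {error, _, _} or {incomplete, _, _} tuple instead of a list.
interpretations(Bytes) when is_binary(Bytes) ->
    [{Enc, unicode:characters_to_list(Bytes, Enc)} || Enc <- [latin1, utf8]].
```

For <<"Hello, World!">> both interpretations come out identical, so nothing in the data reveals the encoding. For <<195,169>> they differ ([195,169] as latin1, [233] as UTF-8), yet both lists satisfy is_string/1, so the predicate cannot say which reading was intended.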



