Binary, List and Tuple Inequalities (Paradox?)

zxq9 zxq9@REDACTED
Fri Nov 1 08:31:45 CET 2019


On 2019/11/01 6:03, Valentin Micic wrote:
>> On 31 Oct 2019, at 21:43, Richard O'Keefe <raoknz@REDACTED> wrote:
>>
>> You can represent all sorts of information as binaries,
>> but the binary doesn't know.
>> Your intentions are not part of the binary.

...

>> The best we can hope for in a "global" term ordering is
>> that it satisfy the axioms of a total order.  That's a useful
>> thing to have.
>>
>> Want something else?  Program it.

> Yes, indeed, and I’ve implied that much — I do want to program it, but I also wanted to know what people think  — if one aspires to program for others, it seems reasonable to consider dominant logic on the issue.
> 
> I started by thinking that less cannot be more, e.g. two octets cannot be less than three.
> 
> Now I am increasingly convinced that one should never compare Binaries that are not of the same size.
> However, I am not so sure what to do if, or rather when, one has to (compare binaries of unequal sizes).

Not so fast. You MUST be able to compare things in *some* way or else 
you cannot establish a global order, and without that an entire universe 
of useful things is closed to you.
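
To make the point concrete, here is a small sketch of what the built-in term order already gives you. The module and function names are made up for illustration; the comparisons themselves use only the standard operators and lists:sort/1.

```erlang
-module(term_order_demo).
-export([examples/0, sorted_mixed/0]).

%% Each element is a comparison that holds under the standard term order.
examples() ->
    [<<1,2>> < <<1,2,3>>,   % a binary that is a prefix of another sorts first
     <<2>> > <<1,2,3>>,     % otherwise the first differing byte decides, not the size
     [255] < <<0>>,         % any list sorts before any binary
     {} < []].              % any tuple sorts before any list

%% Because the order is total over all terms, a mixed list can always be sorted:
%% numbers, then atoms, then tuples, then lists, then binaries.
sorted_mixed() ->
    lists:sort([<<"b">>, "a", {x}, foo, 1.5, 1]).
```

Note that "two octets less than three" only holds when the shorter binary is a prefix of the longer one; in every other case the first differing byte settles it.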

Binaries are a primitive data type.
Strings can NEVER be a primitive data type.

With this in mind, you could do useful things with custom types such as 
"binary string", "list string", "mixed string"...

   -type bstring() :: {bstring, binary()}.
   -type lstring() :: {lstring, string()}.
   -type mstring() :: {mstring, iolist()}.

Your types could also (whoa!) include meta that indicates the subtype of 
the string data -- not merely whether it is encoded as UTF-8, but what the 
collation rule is intended to be, whether the encoding is canonical and 
if so which form, etc. (In my region this is all quite a big deal...)

   -type bstring() :: {bstring, encoding(), collation(), lang(), binary()}.

Now you could REALLY do some useful sorting!
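
A minimal sketch of what that might look like, assuming a tagged tuple along the lines above. Everything here is hypothetical: the module name, the two collation atoms (codepoint, caseless), and the choice to ignore the lang() field are assumptions made for illustration, not an existing library. The only library calls used are unicode:characters_to_list/2, string:casefold/1 (OTP 20 or later) and lists:sort/2.

```erlang
-module(tagged_string_demo).
-export([compare/2, sort/1]).

%% Hypothetical metadata types, for illustration only.
-type encoding()  :: utf8 | latin1.
-type collation() :: codepoint | caseless.
-type lang()      :: atom().
-type bstring()   :: {bstring, encoding(), collation(), lang(), binary()}.

%% Decode the payload to a list of code points using the declared encoding.
-spec to_codepoints(bstring()) -> [char()].
to_codepoints({bstring, Encoding, _Collation, _Lang, Bin}) ->
    case unicode:characters_to_list(Bin, Encoding) of
        Chars when is_list(Chars) -> Chars;
        _InvalidOrIncomplete      -> error(badarg)
    end.

%% Build the sort key that the declared collation asks for.
-spec key(bstring()) -> unicode:chardata().
key({bstring, _, codepoint, _, _} = S) -> to_codepoints(S);
key({bstring, _, caseless, _, _} = S)  -> string:casefold(to_codepoints(S)).

%% Ordering function for lists:sort/2: true when A sorts before or equal to B.
%% Only strings that declare the same collation are comparable in this sketch;
%% anything else fails with function_clause.
-spec compare(bstring(), bstring()) -> boolean().
compare({bstring, _, Coll, _, _} = A, {bstring, _, Coll, _, _} = B) ->
    key(A) =< key(B).

-spec sort([bstring()]) -> [bstring()].
sort(Strings) ->
    lists:sort(fun compare/2, Strings).
```

With the caseless collation, <<"apple">> sorts before <<"Banana">>, whereas the raw binaries sort the other way round. Likewise a Latin-1 <<233>> and a UTF-8 <<195,169>> compare as equal, because the comparison runs over decoded code points rather than bytes.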

But not on binaries. They are primitives. The fact we have a universal 
order across ALL primitives is great and something that goes 
unappreciated until you run into a problem where it is critical to have 
such an order and you lack it.

This discussion stems from a failure to distinguish between a complex 
type (strings) and a language primitive.

In most modern OOP languages strings are objects (whether immutable or 
not) and carry a fair amount of meta with them. When you rip the arms 
and legs off an object you are essentially left with a tuple that 
carries the core data itself along with useful descriptive meta. That's 
what is needed to resolve the string sorting problem: the string itself 
plus meta to describe its nature.

There is nothing even remotely weird about the way Erlang's sort order 
works. The term order is a platform-level concern in Erlang. If we want 
arbitrary collation sorts over a true (complex) string type, we're going 
to need to write a library to get it.

-Craig


