Binary, List and Tuple Inequalities (Paradox?)

Wed Oct 30 14:34:53 CET 2019

> On 28 Oct 2019, at 10:02, Raimo Niskanen <raimo.niskanen@REDACTED> wrote:
> 
> I'd like to defend the current term order of lists, tuples and
> binaries.
> 
> Lists are compared à la string comparison, which avoids having to
> traverse the whole list when comparing.  Binaries are compared as lists
> since both are used for handling strings, and also to be compared like
> memcmp does it.
> 
> Tuples are used for general complex terms.  They can be compared size
> first without having to be fully traversed, as opposed to lists, which
> is useful in other contexts.  You can thereby choose your data
> structure to select comparison.
> -- 
> Raimo Niskanen
> 

Thanks Raimo,

Your argument seems to rest on one… well, misconception. I don’t think it is correct to imply that binaries are used exclusively for handling strings; that is only one of their uses.

For example, here is syntax that is not at all “string-like”:

(tsdb_1_1@REDACTED)3593> <<1:16>> < <<2:16>>. 
true
(tsdb_1_1@REDACTED)3594> <<50:16>> < <<36:16>>.
false

As you can see, this behaves the way we would expect when comparing two integer values: 1 is less than 2, and likewise <<1:16>> is less than <<2:16>>. Technically, we are not talking about strings here, but rather about two 16-bit integers.

I can also say that:

(tsdb_1_1@REDACTED)3597> 1 < 000002.          
true

When we write: 

(tsdb_1_1@REDACTED)3593> <<1:16>>. 

It would be wrong to pretend that we are talking about a string (i.e. in the alphanumeric sense). This clearly denotes the integer value 1 stored using big-endian encoding (i.e. network byte order).

Thus, when we write <<2:16>> we get <<0,2>>, and when we write <<2:24>> we get <<0,0,2>>. These values are *not* intended to be strings, but integers.
So, when we add leading zeroes, we do not change the integer value.
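
To make the encoding point concrete, here is a quick check that can be pasted into the Erlang shell. It is only a sketch: binary:decode_unsigned/1 (from the standard binary module, big-endian by default) is my choice for reading the bytes back as an integer.

```erlang
%% The bit syntax pads the integer with leading zero bytes.
<<2:16>> =:= <<0,2>>.
<<2:24>> =:= <<0,0,2>>.

%% Reading both binaries back as big-endian unsigned integers
%% gives the same value, so the leading zero does not change it.
binary:decode_unsigned(<<0,2>>) =:= binary:decode_unsigned(<<0,0,2>>).
```

All three expressions evaluate to true.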

So why, then, is:

(tsdb_1_1@REDACTED)3600> <<1:16>> < <<2:24>>.
false

First, we are clearly using integer syntax to encode integer values: the first integer value is encoded using 16 bits, and the second using 24 bits.
It just so happens that 16 bits are used to encode the value 1, and 24 bits to encode the value 2.

Thus, since 16 bits are less than 24 bits (in length), and 1 is also less than 2, one may expect this to yield TRUE. Yet somehow, two octets are NOT LESS than three, nor is 1 LESS than 2!

I think this cannot pass the “red-face test”, and thus does not deserve defending.

Contrast this with the way tuples are handled:

(tsdb_1_1@REDACTED)3666> {1,1} < {1,2}.
true
(tsdb_1_1@REDACTED)3667> {1,3} < {1,2}.
false
(tsdb_1_1@REDACTED)3668> {1,3} < {1,2,3}.
true

Considering that Binaries may be used to encode ANYTHING, shouldn’t they be handled the same way as tuples instead of:

(tsdb_1_1@REDACTED)3624> <<1,1>> < <<1,2>>.
true
(tsdb_1_1@REDACTED)3625> <<1,3>> < <<1,2>>.
false
(tsdb_1_1@REDACTED)3626> <<1,3>> < <<1,2,3>>.
false

As I said in my previous email, I do not expect Erlang to change, and for my own “intents and purposes” I am still considering whether:

(tsdb_1_1@REDACTED)3626> <<1,3>> < <<1,2,3>>.
false

should be given more credence than, say, TRUE... if nothing else, because two octets are less than three octets.

In other words, if a three-element tuple, regardless of its content, can never be less than any two-element tuple, why shouldn't the same hold for binaries?
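
For my own purposes, a tuple-like ordering of binaries could be sketched roughly as below. This is only an illustration: the module name bin_order and the function size_first_less/2 are hypothetical names of mine, not anything that exists in OTP. The sketch wraps each binary in a {Size, Binary} pair, so that the standard term order compares the sizes first and looks at the contents only when the sizes are equal.

```erlang
-module(bin_order).
-export([size_first_less/2]).

%% Hypothetical helper (not part of OTP): "less than" for binaries,
%% ordered by size first, the way tuples are ordered.
%% Both pairs are two-element tuples, so Erlang compares them element
%% by element: byte_size first, then the binaries themselves.
size_first_less(A, B) when is_binary(A), is_binary(B) ->
    {byte_size(A), A} < {byte_size(B), B}.
```

With this, bin_order:size_first_less(<<1,3>>, <<1,2,3>>) evaluates to true, while bin_order:size_first_less(<<1,3>>, <<1,2>>) is still false, which mirrors the tuple results above.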

Kind regards

V/
