Off-topic from Binary, List and Tuple Inequalities (Paradox?)

Brujo Benavides elbrujohalcon@REDACTED
Thu Oct 31 03:23:38 CET 2019


*P.S.: By the way, this is why I don't like passing fixed length lists as
arguments for init/1 in gen_* callbacks (such as init([Some, Args, Here])
-> ...). These should be tuples instead, they are not variable length!*

YES!!! Yes yes!! N+1 times yes! I've said this many times: the only
argument that init/1 has should be called Arg, not Args... And if you don't
have anything to pass in to that function as an argument on start_link,
just use 'undefined' or any other sensibly named atom... Even {} is better
than [].
Using lists is not only semantically incorrect, it's also often confused
with mfa (i.e. module, function & arguments, like in spawn/3) and people
then define init with multiple arguments ��‍♂️

On Wed, 30 Oct 2019 at 20:13 Dániel Szoboszlay <dszoboszlay@REDACTED>
wrote:

> Hi Valentin,
>
> I think you should acknowledge that the ordering of terms as defined by
> the Erlang language does not aim to fit any semantic interpretation of the
> data you may have. The sole purpose of the ordering is to be well defined,
> so you can create e.g. a binary search tree that holds whatever data. For
> that purpose any crazy ordering like 1 < false < 0 < [1, 2, foobar] <
> <0.12.34> < 42 < [] < 30 < baz ... would work fine, as long as the place of
> every term in this sequence is well defined. There's only one constraint
> (apart from not violating the mathematical properties of a strict ordering
> relation), and that is to make comparison of terms reasonably efficient (my
> above example would fail this check).
>
> So you feel <<1:16>> should be less than <<2:24>>? That's because you
> interpret <<0,1>> and <<0,0,2>> as variable length encoding of integers. If
> you have this use case, than write a custom comparison function and sort
> your binaries with that. I may have a use case where the same two binaries
> mean operation #1 (with no arguments) and operation #0 (with a single
> argument 2), so I would have to write a different comparison function (but
> as lucky as I am, I may just use erlang:'<'/2 this time, phew!).
>
> But if you would like to know why I find the built-in ordering of terms
> logical, it is because both lists and binaries are usually used to
> represent variable sized data, while tuples are for fixed size data. I mean
> that {1,2} and {1,2,3} typically describe semantically different things,
> e.g. a 2D coordinate and a semver version number. Their relative ordering
> is as irrelevant for most applications as the ordering of {1,2} and
> <0.0.1>. Ordering by tuple size first makes some sense because it will keep
> types together: just like all integers are less than atoms, all 2D
> coordinates are less than version numbers.
>
> Binaries may represent fixed size data (e.g. a packet header), but than it
> doesn't matter whether you order lexicographically or size first, because
> all binaries in your domain will be the same size. Most often however
> binaries represent variable sized data, such as payload in a packet. And
> for variable sized data the human convention is lexicographical ordering. A
> printed dictionary may sort words by length first, and it would be a
> (mostly) usable dictionary, but that's just not how things are by
> convention.
>
> Cheers,
> Dániel
>
> P.S.: By the way, this is why I don't like passing fixed length lists as
> arguments for init/1 in gen_* callbacks (such as init([Some, Args, Here])
> -> ...). These should be tuples instead, they are not variable length!
>
> On Wed, 30 Oct 2019 at 14:35, Valentin Micic <v@REDACTED> wrote:
>
>>
>> On 28 Oct 2019, at 10:02, Raimo Niskanen <raimo.niskanen@REDACTED>
>> wrote:
>>
>> I'd like to defend the current term order of lists, tuples and
>> binaries.
>>
>> Lists are compared a'la string comparison, which avoids having to
>> traverse the whole list when comparing.  Binaries are compared as lists
>> since both are used for handling strings, and also to be compared like
>> memcmp does it.
>>
>> Tuples are used for general complex terms.  They can be compared size
>> first whithout having to be fully traversed, as opposed to lists, which
>> is useful in other contexts.  You can thereby choose your data
>> structure to select comparision.
>> --
>> Raimo Niskanen
>>
>>
>> Thanks Raimo,
>>
>> Your argument seems to have one…well, misconception. I don’t think it
>> would be correct to imply that Binaries are used exclusively for handling
>> strings, but rather, one of their uses may be that.
>>
>> For example, here's a syntax that is not anything “string-like”:
>>
>>
>> (tsdb_1_1@REDACTED)3593> <<1:16>> < <<2:16>>.
>> true
>> (tsdb_1_1@REDACTED)3594> <<50:16>> < <<36:16>>.
>> false
>>
>>
>> As you could see, this pretty much behaves the way we would expect when
>> we compare two integer values, e.g.
>>
>> 1 is less than 2, and so is <<1:16>> < <<2:16>> Technically, we are not
>> talking about strings here, but, rather two 16-bit integers.
>>
>> I can also say that:
>>
>> (tsdb_1_1@REDACTED)3597> 1 < 000002.
>> true
>>
>> When we write:
>>
>> (tsdb_1_1@REDACTED)3593> <<1:16>>.
>>
>> It would be wrong to pretend that we’re actually talking about a strings
>> (e.g. in alphanumeric sense). This clearly means that the integer value of
>> 1 stored using Big-Endian encoding (e.g. network byte order).
>>
>> Thus, when we write:  <<2:16>> we get <<0,2>>. When we write: <<2:24>> we
>> get <<0,0,2>>... these values are *not* intended to be strings, but
>> integers.
>> So, when we add leading zeroes, we do not change the integer value.
>>
>> So why is then:
>>
>> (tsdb_1_1@REDACTED)3600> <<1:16>> < <<2:24>>.
>> false
>>
>> First, we’re clearly use integer syntax to encode integer values, then we
>> have the first integer value encoded using 16-bits, and the second integer
>> value encoded using 24-bits.
>> It just happens so, that 16-bits is used to encode the value of 1, and
>> 24-bits to encode the value of two.
>>
>> Thus, since 16-bits are less then 24-bits (in length), but also 1 is less
>> than 2, one may expect this to yield TRUE. Yet somehow,  two octets are NOT
>> LESS than there, nor 1 is NOT LESS than 2!
>>
>> I think this cannot pass the "red-face test”, and thus does not deserve
>> defending.
>>
>> Contrast this with the way tuples are handled:
>>
>> (tsdb_1_1@REDACTED)3666> {1,1} < {1,2}.
>> true
>> (tsdb_1_1@REDACTED)3667> {1,3} < {1,2}.
>> false
>> (tsdb_1_1@REDACTED)3668> {1,3} < {1,2,3}.
>> true
>>
>> Considering that Binaries may be used to encode ANYTHING, shouldn’t they
>> be handled the same way as tuples instead of:
>>
>>
>> (tsdb_1_1@REDACTED)3624> <<1,1>> < <<1,2>>.
>> true
>> (tsdb_1_1@REDACTED)3625> <<1,3>> < <<1,2>>.
>> false
>> (tsdb_1_1@REDACTED)3626> <<1,3>> < <<1,2,3>>.
>> false
>>
>>
>> As I said in my previous email, I do not expect Erlang to change, and for
>> my "own intents and purposes” I am still considering if:
>>
>> (tsdb_1_1@REDACTED)3626> <<1,3>> < <<1,2,3>>.
>> false
>>
>> should be given more credence than, say TRUE... if nothing else, because
>> two octets are less than three octets.
>>
>> In other words, if a three-element-tuple, regardless of its content,
>>  could never be less than any given two-elements-tuple, why shouldn't the
>> same hold for Binaries?
>>
>> Kind regards
>>
>> V/
>>
>>
>> --
<https://about.me/elbrujohalcon?promo=email_sig&utm_source=product&utm_medium=email_sig&utm_campaign=gmail_api&utm_content=thumb>
Brujo Benavides
about.me/elbrujohalcon
<https://about.me/elbrujohalcon?promo=email_sig&utm_source=product&utm_medium=email_sig&utm_campaign=gmail_api&utm_content=thumb>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://erlang.org/pipermail/erlang-questions/attachments/20191030/5576c195/attachment.htm>


More information about the erlang-questions mailing list