Binary, List and Tuple Inequalities (Paradox?)

Richard O'Keefe raoknz@REDACTED
Thu Oct 31 20:43:25 CET 2019


You can represent all sorts of information as binaries,
but the binary doesn't know.  It may be that you mean
<<65,66,67>> to represent "ABC"; it may be that you
mean to represent 4276803; it may be that you mean
to represent 'play A below middle c for 66 milliseconds
on instrument 67'.  Your intentions are not part of the
binary.  So there is no way to have a sort order for
binaries (or anything else) that perfectly suits all use
cases.  Even for strings, English dictionaries and
telephone books use *different* orders and neither of
them matches strcmp().

For Prolog, I managed in 1984 to come up with a
justification for the ordering of terms of different kinds
(coherence between @< and subtypes) which the ISO
committee chose to ignore.

The best we can hope for in a "global" term ordering is
that it satisfy the axioms of a total order.  That's a useful
thing to have.

Want something else?  Program it.

On Thu, 31 Oct 2019 at 23:53, Raimo Niskanen
<raimo.niskanen@REDACTED> wrote:
>
> On Wed, 2019-10-30 at 15:34 +0200, Valentin Micic wrote:
> > > On 28 Oct 2019, at 10:02, Raimo Niskanen <
> > > raimo.niskanen@REDACTED> wrote:
> > >
> > > I'd like to defend the current term order of lists, tuples and
> > > binaries.
> > >
> > > Lists are compared a'la string comparison, which avoids having to
> > > traverse the whole list when comparing.  Binaries are compared as
> > > lists
> > > since both are used for handling strings, and also to be compared
> > > like
> > > memcmp does it.
> > >
> > > Tuples are used for general complex terms.  They can be compared
> > > size
> > > first whithout having to be fully traversed, as opposed to lists,
> > > which
> > > is useful in other contexts.  You can thereby choose your data
> > > structure to select comparision.
> > > --
> > > Raimo Niskanen
> > >
> >
> > Thanks Raimo,
> >
> > Your argument seems to have one…well, misconception. I don’t think it
> > would be correct to imply that Binaries are used exclusively for
> > handling strings, but rather, one of their uses may be that.
>
> I did not mean to imply that binaries are used exclusevily for handling
> strings.  I merely said that both lists and binaries are used for
> handling strings, i.e that strings is one use case.
>
> Although I confess that I think it is an important use case, if not
> even a very important use case.
>
> When binaries were introduced it was already recognized that having
> strings as lists is a clumsy representation, especially with 64-bit
> then on the way with almost double the space waste.  So one could
> foresee that binaries would be used for storing strings.
>
> And to not have strings as binaries sort like strings as lists would be
> a really bad idea for that use case.
>
> As for now, Elixir uses binaries as their default representation for
> strings, so having them sort as strings is vital for Elixir.
> (Unicode has shot a big hole in that argument, but still true for
> latin-1)
>
> >
> > For example, here's a syntax that is not anything “string-like”:
> >
> >
> > (tsdb_1_1@REDACTED)3593> <<1:16>> < <<2:16>>.
> > true
> > (tsdb_1_1@REDACTED)3594> <<50:16>> < <<36:16>>.
> > false
> >
> :
> :
> > As I said in my previous email, I do not expect Erlang to change, and
> > for my "own intents and purposes” I am still considering if:
> >
> > (tsdb_1_1@REDACTED)3626> <<1,3>> < <<1,2,3>>.
> > false
> >
> > should be given more credence than, say TRUE... if nothing else,
> > because two octets are less than three octets.
> >
> > In other words, if a three-element-tuple, regardless of its content,
> >  could never be less than any given two-elements-tuple, why shouldn't
> > the same hold for Binaries?
>
> If you want to compare terms as integers, then represent them as
> integers!  Erlang bignums can be big enough for most purposes.
>
> Erlang does not have operator overloading or subclassing, and has a
> global term order.  So exactly one sort order has to be chosen for a
> type, and then a guess has to be made of which one is "best" or "most
> useful".
>
> The one chosen for binaries was string sort order, because that is how
> C compares binaries, and strings seemed to be a very important use case
> for binaries, which Elixir kind of has proven after the fact.
>
> The tuple sort order could have been chosen for binaries, but as I said
> above that would be bad for the string use case.
>
> Also, regarding "useful"; tuple sort order for binaries can easily be
> implemented by first comparing sizes then comparing content, whilst
> should you have tulpe sort order and want to implement list sort order
> you would have to loop byte by byte.  Therefore the list sort order can
> be regarded as more useful since it can be a building block for
> efficient tuple sort order...
>
> Best regards
> / Raimo Niskanen
>
> >
> > Kind regards
> >
> > V/
> >
> >



More information about the erlang-questions mailing list