[erlang-questions] Bug in sets module? Equal sets aren't equal

Richard A. O'Keefe ok@REDACTED
Mon Feb 20 23:36:15 CET 2017



On 21/02/17 2:58 AM, Hugo Mills wrote:

About a problem that occurs in practically every programming language
that has a built-in structural equality, going back to Lisp.

The problem is that we have an ABSTRACT data type A
with a CONCRETE representation C such that

for all x in C there is a unique y in A such that x represents y

for all y in A there is some x in C such that x represents y

-- if we didn't have these laws C wouldn't be a concrete
representation for A --

BUT

exists y in A such that #{x in C | x represents y} > 1

some abstract value has multiple representations.

Perhaps the simplest example of this is the queue-as-back-to-back
lists example, where A = sequence of T and
C = { list(T), list(T) } such that {X,Y} represents X ++ reverse(Y).

Then trivially {[],[1]} and {[1],[]} represent the same sequence.

If you have a typed language like Haskell or an OO language like
Smalltalk so that you can redefine equality for C to match equality
for A, you're away laughing.

If there is a built-in structural equality like EQUAL in Lisp
or = (==) in Prolog, or the analogue in Erlang, then your
program is ALWAYS using equality of representations.

That means that when you have such a data type, you HAVE to
write your own equality function for it, and you HAVE to remember
to call that function instead of the built-in equality.

Sometimes C may be a canonical representation for A.
For example, if A is set of T and C is strictly increasing
list of T, then representations are unique and we don't need
to distinguish between equality wrt A and equality wrt C.



To put it another way, it is a DEFECT in Erlang's set module
that there is no sets:equal(S1, S2).

The same defect is present in ordsets, there is no
ordsets:equal(S1, S2).  Wait, didn't I just say that the
ordered representation is canonical?  Yes, if done right,
but ordsets is defined to use ==, so [1] and [1.0] are,
as far as ordsets is concerned, the same set.

1> X = ordsets:from_list([1]).
[1]
2> Y = ordsets:from_list([1.0]).
[1.0]
3> ordsets:is_subset(X, Y).
true
4> ordsets:is_subset(Y, X).
true
5> X == Y.
true
6> X =:= Y.
false

If you were thinking of having a set of ordsets, think again.

The same defect applies to gb_sets where there is no
gb_sets:equal(S1, S2).

>    Is this expected behaviour, or a bug in the sets module?

In a sense, both.  It is expected behaviour that equality of
concrete representations is a refinement of equality of
abstract values.  It is a bug that there is no sets:equal/2
to both solve the problem and by its presence in the interface
remind you that the problem exists.




More information about the erlang-questions mailing list