[erlang-questions] Bug in sets module? Equal sets aren't equal
Richard A. O'Keefe
ok@REDACTED
Mon Feb 20 23:36:15 CET 2017
On 21/02/17 2:58 AM, Hugo Mills wrote:
About a problem that occurs in practically every programming language
that has a built-in structural equality, going back to Lisp.
The problem is that we have an ABSTRACT data type A
with a CONCRETE representation C such that
for all x in C there is a unique y in A such that x represents y
for all y in A there is some x in C such that x represents y
-- if we didn't have these laws C wouldn't be a concrete
representation for A --
BUT
exists y in A such that #{x in C | x represents y} > 1
some abstract value has multiple representations.
Perhaps the simplest example of this is the queue-as-back-to-back
lists example, where A = sequence of T and
C = { list(T), list(T) } such that {X,Y} represents X ++ reverse(Y).
Then trivially {[],[1]} and {[1],[]} represent the same sequence.
If you have a typed language like Haskell or an OO language like
Smalltalk so that you can redefine equality for C to match equality
for A, you're away laughing.
If there is a built-in structural equality like EQUAL in Lisp
or = (==) in Prolog, or the analogue in Erlang, then your
program is ALWAYS using equality of representations.
That means that when you have such a data type, you HAVE to
write your own equality function for it, and you HAVE to remember
to call that function instead of the built-in equality.
Sometimes C may be a canonical representation for A.
For example, if A is set of T and C is strictly increasing
list of T, then representations are unique and we don't need
to distinguish between equality wrt A and equality wrt C.
To put it another way, it is a DEFECT in Erlang's set module
that there is no sets:equal(S1, S2).
The same defect is present in ordsets, there is no
ordsets:equal(S1, S2). Wait, didn't I just say that the
ordered representation is canonical? Yes, if done right,
but ordsets is defined to use ==, so [1] and [1.0] are,
as far as ordsets is concerned, the same set.
1> X = ordsets:from_list([1]).
[1]
2> Y = ordsets:from_list([1.0]).
[1.0]
3> ordsets:is_subset(X, Y).
true
4> ordsets:is_subset(Y, X).
true
5> X == Y.
true
6> X =:= Y.
false
If you were thinking of having a set of ordsets, think again.
The same defect applies to gb_sets where there is no
gb_sets:equal(S1, S2).
> Is this expected behaviour, or a bug in the sets module?
In a sense, both. It is expected behaviour that equality of
concrete representations is a refinement of equality of
abstract values. It is a bug that there is no sets:equal/2
to both solve the problem and by its presence in the interface
remind you that the problem exists.
More information about the erlang-questions
mailing list