Finding unique elements of a list
Richard A. O'Keefe
ok@REDACTED
Fri Dec 3 01:52:38 CET 2004
If you want to remove duplicate elements from a list but otherwise
preserve the order of the elements, so that, e.g.,
remove_dups([3,1,4,1,5,9,2,6,5,3,5,8,9]) -> [3,1,4,5,9,2,6,8]
then there is an O(N.lgN) method which goes like this:
(1) pair each element with its position number in the list
O(N)
(2) sort the list of pairs (using whole pairs as keys)
O(N.lgN)
(3) remove adjacent duplicates
O(N)
(4) switch the {Elem,Posn} pairs to {Posn,Elem} pairs
O(U) where 1 <= U <= N, U is the number of unique elements
(5) sort the list of {Posn,Elem} pairs
O(U.lgU)
(6) strip off the positions, {Posn,Elem} -> Elem
In Haskell notation, to keep this message short,
remove_dups :: Ord a => [a] -> [a]
remove_dups xs = [x | (_,x) <- sort (remdups (sort (zip xs [1..])))]
where remdups [] = []
remdups ((x,n) : xns) = (n,x) : remdup2 xns x
remdup2 ((y,_) : xns) x | y == x = remdup2 xns x
remdup2 xns _ = remdups xns
This is a little over twice as slow as simply sorting the elements and
removing adjacent duplicates, so it's only something you would use if
the original order of the elements mattered.
More information about the erlang-questions
mailing list