Finding unique elements of a list

Luke Gorrie luke@REDACTED
Tue Dec 7 08:48:54 CET 2004


James Hague <jamesh@REDACTED> writes:

> Is there a standard Erlang library function or idiom for removing duplicate
> items from an unsorted list?  Ideally the elements in the final list should
> be in the order they were passed in, except without duplicates:

I've always liked this one that Magnus Fröberg showed me years ago:

  remove_dups([])    -> [];
  remove_dups([H|T]) -> [H | [X || X <- T, X /= H]].

That's O(n^2) on basic list operations, but my rule-of-thumb is that's
fine if it's not used in a loop and the inputs are a thousand elements
or less.

To be faster I think Vance-style with a dictionary or set to get O(N)
or O(N log N) is the way, and I think this is reasonably nice:

  remove_dups_linear(L) -> remove_dups_linear(L, gb_sets:new()).

  remove_dups_linear([], _) -> [];
  remove_dups_linear([H|T], S) ->
      case gb_sets:is_member(H, S) of
          true  -> remove_dups_linear(T, S);
          false -> [H|remove_dups_linear(T, gb_sets:insert(H, S))]
      end.

Cheers,
Luke





More information about the erlang-questions mailing list