[erlang-questions] Intersection of two ordered sets

Tue May 12 08:50:10 CEST 2009

Hi.

Sorry for hi-jacking this thread, but a question comes to my mind.
Is there a performance diffrence between:
     intersection([H1|T1], [H2|T2]) when H1 > H2 ->
         intersection([H1|T1], T2);
     intersection([H1|T1], [H2|T2]) when H1 < H2 ->
         intersection(T1, [H2|T2]);
     intersection([H|T1], [H|T2]) ->
         [H|intersection(T1, T2)].

And:
     intersection([H1|_] = L1, [H2|T2]) when H1 > H2 ->
         intersection(L1, T2);
     intersection([H1|T1], [H2|_] = L2) when H1 < H2 ->
         intersection(T1, L2);
     intersection([H|T1], [H|T2]) ->
         [H|intersection(T1, T2)].

Which one is the optimal, I'm leaning towords the second later
("[H1|_] = L1" ... "L1")?

If this is the case would it be a "simple task" for the compiler to
transform these kind of statment?
Would this be a valid EEP?

Kind regards
Andreas Hillqvist

2009/5/10 Richard Carlsson <richardc@REDACTED>:
> Nicholas Dronen wrote:
>> Does this look like a sane implementation for finding the intersection
>> of two ordered sets?  There are implementations in the standard erlang
>> libraries (e.g. ordsets.erl), but they're a bit different and I wanted
>> to make sure I'm not doing anything wildly wrong, either in terms of
>> correctness or performance.  (I'm not reinventing the wheel here --
>> just trying to learn erlang by implementing whatever comes to mind.)
>>
>>     intersection(L1, L2) -> intersection(L1, L2, []).
>>
>>     intersection([H1|T1], [H2|T2], Result) when H1 == H2 ->
>>         intersection(T1, T2, [H1|Result]);
>>     intersection([H1|T1], [H2|T2], Result) when H1 > H2 ->
>>         intersection([H1|T1], T2, Result);
>>     intersection([H1|T1], [H2|T2], Result) when H1 < H2 ->
>>         intersection(T1, [H2|T2], Result);
>>     intersection(_, _, Result) -> lists:reverse(Result).
>
> Sane, yes. However:
>
> First, there's probably no point in using an accumulator and reversing
> at the end; a plain recursive version has about the same efficiency,
> generates less garbage, and is easier to read.
>
> Second, by always testing for equality (which is the least likely case)
> in the first clause, you're generally wasting a test per element.
>
> Third, if you put the equality test after the > and < tests instead,
> you don't really need to actually do the test at all. If neither of
> those tests are true, the elements have to be equal.
>
> And fourth, in clause 2 you are deconstructing [H1|T1] and then
> reconstructing it again (the compiler won't catch this reuse),
> and similarly for [H2|T2] in clause 3.
>
>     /Richard
> _______________________________________________
> erlang-questions mailing list
> erlang-questions@REDACTED
> http://www.erlang.org/mailman/listinfo/erlang-questions
>