[erlang-questions] Question about reverse list of recursion functions
ok@REDACTED
ok@REDACTED
Thu Dec 31 01:09:34 CET 2015
> Hi, guys! I recently started to learn Erlang (migrating from Ruby) by
> reading âErlang Programmingâ Cesarini book. After the 3rd chapter
> there is a task to implement quick sort and I implemented it like this:
> https://gist.github.com/ka8725/f3fcc264e12bcefa6035
This was an exercise, so whatever you learn from is right.
In practice, you should be using the list reversal and sorting
functions from the lists: module.
Instead of returning a compound data structure, it's often
useful to think "continuations":
f(...) -> ...; f(...) -> {A,B,C}.
g(...) -> {A,B,C} = f(...), h(A, B. C).
=>
g(...) -> f(.....).
f(.....) -> ...; f(.....) -> h(A, B, C, ...).
In the case of things like sorting, you can think of even and
odd layers, working in opposite directions. I'll leave you
to work out the details.
Once you've eliminated reversing, the next thing would be
to reduce concatenation. (The wow-neat version of qsort
in Prolog uses list differences for this purpose; Erlang
would pass an accumulator, so
qsort(List, Acc) === qsort(List) ++ Acc)
However, you've missed the REALLY major point about the
code, the reason you should never use it.
(A) It's quicksort, which has O(N**2) worst case.
Yes, I know the books say this will almost never happen,
but in practice it happens surprisingly often.
As Bentley and McIlroy pointed out in their "Engineering
a Quicksort" paper, repeated values can kill a simple-
minded quicksort. You need a "fat pivot", that is, you
need to split the list THREE ways:
part([X|Xs], P, L, E, G, A) ->
if X < P -> part(Xs, P, [X|L], E, G, A)
; P < X -> part(Xs, P, L, E, [X|G], A)
; true -> part(Xs, P, L, [X|E], G, A)
end;
part([], _, L, E, G, A) ->
sort(L, E ++ sort(G, A)).
sort(List = [_,Pivot|_], Acc) ->
part(List, Pivot, [], [], [], Acc);
sort(List, Acc) ->
List ++ Acc.
sort(List) ->
sort(List, []).
Oh, by the way, you will notice that this does no
reversal. Why would it? Yes, the intermediate
lists L and G are delivered in reverse order, but
we're going to *sort* them, so we don't *care* what
order they are in. We would care if we were trying
to make a stable sorting routine, but quicksort is
not a stable sorting routine anyway. Reversing a
list whose order you don't care about is pointless.
(B) Even with a "fat pivot" quicksort can still go
quadratic. Taking the first element of the list
as the pivot means it WILL go quadratic if the
input is already sorted or nearly sorted, which turns
out to be quite common in practice. This is why
array-based quicksorts use median-of-three pivot
selection and the Bentley & McIlroy version uses
the median of three medians of three for large enough
inputs.
Many people are impressed by the name "quick"sort and
don't look into the background. When Hoare invented quick
sort, he (a) already worked out most of the improvements
people talk about and (b) KNEW that on a good day with the
wind behind it quicksort would do more comparisons than
merge sort. BUT he needed to sort a bunch of numbers on
a machine with an amount of memory that would disgrace a
modern singing greeting card and just didn't have enough
memory for workspace. If comparisons are costly, it's
easy for a well-written merge sort to beat quicksort.
For what it's worth, I have tested the code above but not
benchmarked it.
More information about the erlang-questions
mailing list