Divide and Conquer
Given a problem P and an input I, let us denote this problem instance by (P, I).
If size(I) is small, e.g. size(I) = 1, the instance is probably easy to solve.
If size(I) = n is large, a divide-and-conquer approach takes the following steps:
- [Divide] Divide (P, I) into smaller subproblem instances (P1, I1), (P2, I2), …, (Pa, Ia), where size(Ij) < n for all j = 1, 2, …, a.
- [Conquer] Solve each (Pj, Ij), obtaining solution Oj, for all j = 1, 2, …, a.
- [Combine] Combine the solutions O1, O2, …, Oa to obtain the solution O for (P, I).
In the [Conquer] step above, if size(Ij) is small, then (Pj, Ij) is solved
directly. However, if size(Ij) is not small enough, we apply the
same divide-and-conquer process to (Pj, Ij) recursively.
Since the divide-and-conquer method starts with the largest problem (the top
level) and proceeds to solve smaller and smaller subproblems (going down the
recursion tree), it is a top-down approach.
For the divide-and-conquer approach to be applicable, we must have:
- The problem instance (P, I) can be divided naturally into subproblem instances (Pj, Ij), j = 1, 2, …, a, for some integer a ≥ 2.
- Each subproblem Pj is the same problem as P, and size(Ij) < size(I).
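As a concrete, if tiny, illustration of the pattern (an example of ours, not an algorithm from these notes), consider finding the maximum of an array by divide and conquer:

int dc_max (int a[], int low, int high)
/* Return the maximum of a[low..high] by divide and conquer.
 * (Illustrative sketch only; the name dc_max is ours.) */
{
    int mid, left, right;
    if (low == high)                       /* size 1: solve directly */
        return a[low];
    mid = (low + high) / 2;                /* divide */
    left  = dc_max(a, low, mid);           /* conquer */
    right = dc_max(a, mid + 1, high);      /* conquer */
    return (left > right) ? left : right;  /* combine */
}

Here a = 2 and b = 2 in the notation below, and the dividing and combining steps take constant time.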
Assume that in the dividing step, a problem instance of size n is
always partitioned into a subinstances of size n/b, for some
constant integers a ≥ 1 and b > 1.
Let D(n) and C(n) be the times for dividing and combining, respectively.
If T(n) is the running time of this divide-and-conquer algorithm, then we have
\[
T(n) = a\,T(n/b) + D(n) + C(n).
\]
For some special f(n) := D(n) + C(n), the master theorem can be used to
solve for T(n).
The recurrence
\[
T(n) = a\,T(n/b) + f(n),
\]
where a ≥ 1 and b > 1 are constant integers, has asymptotic solution
- T(n) = Θ(n^{log_b a}) if f(n) = O(n^{log_b a - ε}) for some constant ε > 0.
- T(n) = Θ(n^{log_b a} lg n) if f(n) = Θ(n^{log_b a}).
- T(n) = Θ(f(n)) if f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0 and a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n.
The recurrence
\[
T(n) = a\,T(n/b) + \Theta(n^c),
\]
where c is a positive constant, has solutions
- T(n) = Θ(n^c) if a < b^c.
- T(n) = Θ(n^c lg n) if a = b^c.
- T(n) = Θ(n^{log_b a}) if a > b^c.
The recurrence
\[
T(n) = a\,T(n/b) + \Theta(1)
\]
has solutions
- T(n) = Θ(lg n) if a = 1.
- T(n) = Θ(n^{log_b a}) if a > 1.
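To see how these results are applied, here are three worked examples (ours, not from the notes; the first recurrence is mergesort's, derived later in this chapter):
\[
T(n) = 2T(n/2) + \Theta(n):\quad a = 2,\ b = 2,\ c = 1,\ a = b^c \;\Rightarrow\; T(n) = \Theta(n \lg n).
\]
\[
T(n) = T(n/2) + \Theta(1):\quad a = 1 \;\Rightarrow\; T(n) = \Theta(\lg n) \quad\text{(binary search)}.
\]
\[
T(n) = 4T(n/2) + \Theta(n):\quad a = 4 > b^c = 2 \;\Rightarrow\; T(n) = \Theta(n^{\log_2 4}) = \Theta(n^2).
\]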
We notice that the straightforward sorting algorithms (bubble sort,
insertion sort, and selection sort) all have O(n²) running time.
They belong to the class of comparison sorting algorithms,
which are based on an abstract linear ordering relation `<' between keys.
We want to know the theoretically best worst-case running time achievable
by this class of algorithms.
That is, we are trying to figure out, in the worst case, at least how many key
comparisons must be performed by any such algorithm.
To sort an input of n different keys, a comparison sorting algorithm performs
a series of comparisons to determine the correct order of the keys.
Such an algorithm can be described by its decision tree.
The decision tree of a sorting algorithm for n different keys is a
regular (or full) binary tree in which each internal node
represents a comparison of two keys Ki and Kj.
The left subtree corresponds to the case Ki < Kj, and the right subtree
corresponds to Ki > Kj. An external node contains a permutation
π of {1, 2, …, n} that represents the order of the keys determined by the
comparisons made along the path to it.
Figure 3.1 shows the decision tree of a sorting algorithm on an input of three
different keys.
[Figure 3.1 omitted.]
Since all permutations of {1, 2, …, n} are possible, and there are n!
of them, there are at least n! external nodes in this decision tree.
We know that a regular binary tree with m external nodes must have at least
⌈lg m⌉ + 1 levels. On a path from the root
to an external node, each internal node represents a comparison of keys performed for this
particular input. Hence in the worst case, the algorithm has to perform as
many comparisons as the number of levels in the tree minus one, which is at least
⌈lg(n!)⌉ ≈ n lg n.
Any comparison sorting algorithm on n keys must perform approximately
n lg n comparisons in the worst case.
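The approximation ⌈lg(n!)⌉ ≈ n lg n can be justified without the full Stirling formula; the following two-line bound is a sketch of our own:
\[
\lg(n!) = \sum_{i=1}^{n} \lg i \le n \lg n,
\qquad
\lg(n!) \ge \sum_{i=\lceil n/2 \rceil}^{n} \lg i \ge \frac{n}{2} \lg \frac{n}{2},
\]
so lg(n!) = Θ(n lg n). (Stirling's formula sharpens this to lg(n!) = n lg n - n lg e + O(lg n).)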
Can we find a sorting algorithm that has this best worst-case performance?
It turns out that a straightforward application
of the divide-and-conquer method to the sorting problem can almost achieve
this. The resulting algorithm is mergesort. The idea is illustrated
in Figure 3.2.
[Mergesort]
Sort the array a[low..high].
- [Stopping criterion] If low ≥ high, stop.
- [Divide] Set mid ← ⌊(low + high)/2⌋.
- [Conquer] Sort the subarrays a[low..mid] and a[mid+1..high] by recursive calls to mergesort.
- [Combine] Merge the two sorted subarrays into a sorted array.
[Figure 3.2 omitted.]
Suppose the combining step above is done by a procedure merge; then the
algorithm can be implemented as in Figure 3.3.
void merge (int a[], int low, int mid, int high);  /* defined below */

void mergesort (int a[], int low, int high)
/* Sort array a[low..high] */
{
    int mid = (low + high) / 2;
    if (low < high) {
        mergesort(a, low, mid);
        mergesort(a, mid + 1, high);
        merge(a, low, mid, high);
    }
}

void merge (int a[], int low, int mid, int high)
/* Merge sorted subarrays a[low..mid] and a[mid+1..high]
 * into sorted array a[low..high] */
{
    /* MAX_SIZE is a constant at least the size of the input array */
    int b[MAX_SIZE], i = low, j = mid + 1, k = 0;
    /* Merge into b while both halves are nonempty */
    while (i <= mid && j <= high)
        b[k++] = (a[i] < a[j]) ? a[i++] : a[j++];
    /* Copy the rest of the first half */
    while (i <= mid)
        b[k++] = a[i++];
    /* Copy the rest of the second half */
    while (j <= high)
        b[k++] = a[j++];
    /* Copy back into a */
    for (k = 0; k <= high - low; k++)
        a[low + k] = b[k];
}
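The code above assumes a constant MAX_SIZE at least as large as the input array. A minimal test driver (ours, with an assumed MAX_SIZE) might look like:

#include <stdio.h>

#define MAX_SIZE 100   /* must be at least the size of the array being sorted */

/* mergesort and merge as in Figure 3.3 */

int main(void)
{
    int a[] = {5, 3, 8, 2, 9, 1};
    int i, n = 6;
    mergesort(a, 0, n - 1);
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);    /* prints: 1 2 3 5 8 9 */
    printf("\n");
    return 0;
}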
It is obvious that merge runs in Θ(n) time. Let T(n) be
the running time of mergesort. We have
\[
T(n) = 2T(n/2) + \Theta(n), \qquad n > 1.
\]
Theorem 3 implies that T(n) = Θ(n lg n), which attains the
theoretically best worst-case running time.
Let us take key comparison as the basic operation, and let
W(n) be the number of basic operations performed by mergesort
in the worst case. It is obvious that merge does n - 1 comparisons
on an input of size n in the worst case. Therefore we have
\[
W(n) = 2\,W(n/2) + n - 1, \qquad W(1) = 0.
\]
Let n = 2^k for simplicity. Solving for W(n), we have
\begin{align*}
W(2^k) &= 2\,W(2^{k-1}) + 2^k - 1 \\
       &= 2^2\,W(2^{k-2}) + 2\cdot 2^k - (1 + 2) \\
       &\;\;\vdots \\
       &= 2^k\,W(1) + k\cdot 2^k - (1 + 2 + \cdots + 2^{k-1}) \\
       &= n\,\lg n - n + 1.
\end{align*}
W(n) > lg(n!), which is the least number of comparisons performed by
any comparison sorting algorithm. However,
\[
\lim_{n \to \infty} \frac{W(n)}{\lg(n!)} = 1.
\]
This proves the following theorem.
Mergesort is asymptotically optimal.
Since merge requires an additional array as big as the input,
mergesort does not sort in place.
To implement function calls, the computer has to do a lot of housekeeping
chores, e.g. pushing local variables, parameters, and return addresses
onto the recursion stack, and saving the contents of the registers.
To return from a recursive call, the values of the local
variables and parameters of the previous call must be restored,
along with the contents of the registers of the previous call.
Then control can branch back to the return address found on
top of the recursion stack.
When the subarrays in mergesort are short, this overhead for recursion
dominates the running time.
In this case, nonrecursive algorithms run faster.
Hence, the overall performance of mergesort can be improved if it
switches to a nonrecursive sorting algorithm for short subarrays,
say of size less than 20. Figure 3.4 shows such an improved version.
/* M is a constant determined by experiments */
void mergesort (int a[], int low, int high)
{
    int mid = (low + high) / 2;
    if (high - low >= M) {
        mergesort(a, low, mid);
        mergesort(a, mid + 1, high);
        merge(a, low, mid, high);
    }
    else /* switch to a nonrecursive sort */
        other_sort(a + low, high - low + 1);
}
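Figure 3.4 and the later quicksort variants call other_sort without defining it. A natural candidate is insertion sort, which is efficient on short arrays; the following is a possible sketch of such a helper (ours, not from the notes):

void other_sort (int a[], int n)
/* Sort the n-element array a[0..n-1] by insertion sort.
 * (A plausible stand-in for the undefined other_sort.) */
{
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = a[i];
        /* shift the larger elements right to make room for key */
        for (j = i - 1; j >= 0 && a[j] > key; j--)
            a[j + 1] = a[j];
        a[j + 1] = key;
    }
}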
Quicksort is another fast sorting algorithm based on the
divide-and-conquer paradigm.
It has many different versions. The following version is chosen because its
analysis is simple.
[Quicksort]
Sort the array a[low..high].
- [Stopping criterion] If low ≥ high, stop.
- [Divide] The array a[low..high] is partitioned (rearranged) into three parts a[low..c-1], a[c], and a[c+1..high] such that each element in a[low..c-1] is ≤ a[c] and each element in a[c+1..high] is ≥ a[c].
- [Conquer] The two subarrays a[low..c-1] and a[c+1..high] are sorted by recursive calls to quicksort.
- [Combine] Since the subarrays are sorted in place, no work is needed to combine them: the entire array a[low..high] is now sorted.
The idea of quicksort is illustrated in Figure 3.5.
[Figure 3.5 omitted.]
In pseudocode, quicksort can be described as:
quicksort (a, low, high)
// Sort a[low..high].
{
    if (low < high) then {
        c ← partition(a, low, high);
        quicksort(a, low, c-1);
        quicksort(a, c+1, high);
    }
}
[Partition]
Rearrange array a[low..high] and return an index c such that
\[
low \le s < c < t \le high \;\Rightarrow\; a[s] \le a[c] \le a[t].
\]
- Set x ← a[low]. (x is called the partitioning element, or the pivot.)
- Set i ← low and j ← high + 1.
- Move i up, within bounds, until a key ≥ x is found, and move j down, within bounds, until a key ≤ x is found.
- If i < j, then swap a[i] and a[j] and repeat Step 3; else swap the pivot and a[j], and return j.
The following facts imply that partition is correct:
at any time during the process of partition,
- all keys to the left of i are ≤ x, and
- all keys to the right of j are ≥ x.
Figure 3.6 shows how partition works.
An implementation of partition and quicksort is shown in Figure 3.7.
[Figure 3.6 omitted.]
int partition (int a[], int low, int high)
/* Return c such that elements in a[low..c-1] <=
 * a[c] <= elements in a[c+1..high] */
{
    int t, x = a[low];
    int i = low, j = high + 1;
    while (1) {
        /* move i up to the next key >= x */
        do { i++; } while (i <= high && a[i] < x);
        /* move j down to the next key <= x */
        do { j--; } while (i <= j && a[j] > x);
        if (i < j) {  /* swap a[i] and a[j] */
            t = a[i]; a[i] = a[j]; a[j] = t;
        }
        else {        /* swap the pivot a[low] and a[j] */
            a[low] = a[j]; a[j] = x;
            return j;
        }
    }
}

void quicksort (int a[], int low, int high)
/* Sort array a[low..high] */
{
    int c;
    if (low < high) {
        c = partition(a, low, high);
        quicksort(a, low, c - 1);
        quicksort(a, c + 1, high);
    }
}
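As a sanity check (the driver and the trace are ours), consider running the code above on a small array:

#include <stdio.h>

/* partition and quicksort as in Figure 3.7 */

int main(void)
{
    int a[] = {5, 3, 8, 2, 9, 1};
    int i, n = 6;
    /* First call to partition, with pivot x = a[0] = 5:
     * i stops at 8, j stops at 1, and they are swapped;
     * then i stops at 9, j stops at 2, the indices have
     * crossed, so the pivot is swapped with a[j], giving
     * {2, 3, 1, 5, 9, 8} and c = 3. */
    quicksort(a, 0, n - 1);
    for (i = 0; i < n; i++)
        printf("%d ", a[i]);    /* prints: 1 2 3 5 8 9 */
    printf("\n");
    return 0;
}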
For simplicity, we assume that there are no equal keys in the array.
The result we will obtain is also valid with equal keys, but the
derivation is more intricate.
We choose key comparison as the basic operation.
Notice that procedure quicksort is only a driver making calls to
partition, where all the basic operations are performed.
We observe that in partition, one comparison is done each time an index is
advanced, and the two indices eventually cross. Hence n + 1 comparisons
are required, where n is the number of entries in the subarray.
In general, let C(n) be the number of basic operations performed on an
input of size n. We have
\[
C(n) = C(s-1) + C(n-s) + n + 1,
\]
where s is the final position of the pivot within the subarray.
The value of s is not fixed; it depends on the particular input.
The best case occurs when s ≈ n/2 for all recursive calls.
Let B(n) denote C(n) in the best case. Ideally, we have
\[
B(n) = 2B(n/2) + n + 1,
\]
which gives B(n) = Θ(n lg n).
The worst case occurs when s = 1 or s = n for all recursive calls.
Let W(n) denote C(n) in the worst case. We have
\begin{align*}
W(n) &= W(n-1) + n + 1 \\
     &= W(n-2) + n + (n+1) \\
     &\;\;\vdots \\
     &= W(1) + 3 + 4 + \cdots + n + (n+1) \\
     &= \Theta(n^2).
\end{align*}
This is as bad as the straightforward quadratic sorting algorithms.
Assume the array is randomly ordered. Since s can be any of 1, 2,
…, n, the randomness assumption implies that
\[
\Pr(s = i) = 1/n, \qquad \forall i = 1, 2, \ldots, n.
\]
Let A(n) denote C(n) in the average case. Then
\[
A(n) = \frac{1}{n} \sum_{s=1}^{n} \bigl[ A(s-1) + A(n-s) + n + 1 \bigr],
\]
which implies
\[
A(n) = n + 1 + \frac{2}{n} \sum_{t=0}^{n-1} A(t),
\]
and hence
\[
n A(n) = n(n+1) + 2 \sum_{t=0}^{n-1} A(t). \tag{3.1}
\]
With n substituted by n - 1, we have
\[
(n-1) A(n-1) = (n-1)n + 2 \sum_{t=0}^{n-2} A(t). \tag{3.2}
\]
Subtracting (3.2) from (3.1) yields
\[
n A(n) - (n-1) A(n-1) = 2n + 2 A(n-1)
\;\Longrightarrow\;
\frac{A(n)}{n+1} = \frac{2}{n+1} + \frac{A(n-1)}{n}, \qquad \forall n.
\]
Hence
\begin{align*}
\frac{A(n)}{n+1}
  &= \frac{2}{n+1} + \frac{2}{n} + \frac{A(n-2)}{n-1} \\
  &= \frac{2}{n+1} + \frac{2}{n} + \frac{2}{n-1} + \frac{A(n-3)}{n-2} \\
  &\;\;\vdots \\
  &= \frac{2}{n+1} + \frac{2}{n} + \frac{2}{n-1} + \cdots + \frac{2}{3} + \frac{A(1)}{2} \\
  &= 2\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n+1}\right) - 3 \qquad (A(1) = 0) \\
  &\approx 2 \ln n.
\end{align*}
Thus A(n) ≈ 2n ln n = (2 ln 2) n lg n ≈ 1.39 n lg n. This shows that
quicksort does only about 39% more comparisons in the average case than
in the best case.
There is still much room for improvement.
We chose a[low] as the pivot only for convenience; this is not necessary.
However, the choice of the pivot has a tremendous impact on performance.
Indeed, the above analysis shows that choosing a[low] as
the pivot leads to the worst-case behavior when the input is already sorted,
in increasing or decreasing order. These cases are unlikely under the
randomness assumption, but in practice the assumption may not hold:
for many applications, it is not unusual for the inputs to be almost
sorted. Under these circumstances, quicksort is extremely slow.
To avoid this, we can choose another entry a[p] as the pivot, swap
it into the first position, and let partition proceed as before.
There are many ways of choosing the pivot:
- p = ⌊(low + high)/2⌋. This leads to the best-case behavior for inputs that are almost sorted.
- p = a random number between low and high. The resulting version is called randomized quicksort. The randomness assumption used in the average-case analysis is true in this situation.
- Median-of-three: choose any three entries of the input, e.g. the first, the last, and the middle entries, and take their median as the pivot. This can effectively eliminate the worst-case behavior; a possible implementation is sketched below.
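The figures below call a helper choose(a, low, high) that returns the chosen pivot position. The helper is not defined in the notes, so the following median-of-three version is a sketch of ours:

int choose (int a[], int low, int high)
/* Return the index of the median of a[low], a[mid], a[high],
 * where mid is the middle position. (A sketch; not from the notes.) */
{
    int mid = (low + high) / 2;
    if (a[low] <= a[mid]) {
        if (a[mid] <= a[high])
            return mid;                           /* a[low] <= a[mid] <= a[high] */
        return (a[low] <= a[high]) ? high : low;  /* a[mid] is the largest */
    }
    else {
        if (a[low] <= a[high])
            return low;                           /* a[mid] < a[low] <= a[high] */
        return (a[mid] <= a[high]) ? high : mid;  /* a[low] is the largest */
    }
}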
Like mergesort, quicksort is very inefficient for small
subarrays. We should avoid making recursive calls if the subarrays
are not long enough, as shown in Figure 3.8.
/* M is a constant determined by experiments */
void quicksort (int a[], int low, int high)
{
    int c, p, t;
    if (high - low >= M) {
        p = choose(a, low, high);
        t = a[low]; a[low] = a[p]; a[p] = t;   /* swap a[low] and a[p] */
        c = partition(a, low, high);
        quicksort(a, low, c - 1);
        quicksort(a, c + 1, high);
    }
    else /* switch to a nonrecursive sort */
        other_sort(a + low, high - low + 1);
}
Even though partition splits the input in place,
quicksort is not an in-place sorting algorithm, since the depth
of the recursion stack depends on the input size n.
In the worst case, the depth could be as bad as Θ(n).
However, if we carefully order the two recursive calls,
always handling the smaller part first, the depth can be kept
under lg n, provided the compiler turns the second, tail call into a jump.
Figure 3.9 shows how this can be done.
/* M is a constant determined by experiments */
void quicksort (int a[], int low, int high)
{
    int c, p, t;
    if (high - low >= M) {
        p = choose(a, low, high);
        t = a[low]; a[low] = a[p]; a[p] = t;   /* swap a[low] and a[p] */
        c = partition(a, low, high);
        if (c - low < high - c) {   /* left part is smaller */
            quicksort(a, low, c - 1);
            quicksort(a, c + 1, high);
        }
        else {                      /* right part is smaller */
            quicksort(a, c + 1, high);
            quicksort(a, low, c - 1);
        }
    }
    else
        other_sort(a + low, high - low + 1);
}
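Figure 3.9 keeps the depth small only if the compiler eliminates the second, tail call. If it does not, the same O(lg n) stack bound can be enforced explicitly by recursing on the smaller part and looping on the larger one; the following variant is a sketch of ours:

void quicksort (int a[], int low, int high)
{
    int c, p, t;
    while (high - low >= M) {
        p = choose(a, low, high);
        t = a[low]; a[low] = a[p]; a[p] = t;   /* swap a[low] and a[p] */
        c = partition(a, low, high);
        if (c - low < high - c) {      /* left part is smaller */
            quicksort(a, low, c - 1);  /* recurse on the smaller part */
            low = c + 1;               /* loop on the larger part */
        }
        else {
            quicksort(a, c + 1, high);
            high = c - 1;
        }
    }
    other_sort(a + low, high - low + 1);
}

Since each recursive call handles at most half of the current subarray, the recursion depth is at most lg n regardless of the compiler.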
There are various reasons for removing recursion:
- Some programming languages do not support recursion, e.g. Fortran, Cobol, and assembly languages.
- Removing recursion can help us understand the implications and hidden pitfalls of recursion.
- A nonrecursive version could run faster.
A nonrecursive quicksort is given in Figure 3.10.
nonrecursive_quicksort (int a[], int n)
/* Sort array a of size n */
{
int p, low = 0, high = n - 1;
STACK S = create_stack();
while (1) {
while (high - low >= M) {
p = choose(a, low, high);
swap a[p] and a[low];
c = partition(a, low, high);
push(S, c + 1); push(S, high);
high = c - 1;
}
other_sort(a + low, high - low + 1);
if (!stack_empty(S)) {
high = pop(S); low = pop(S);
}
else
return;
}
}
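Figure 3.10 assumes an integer stack type STACK with operations create_stack, push, pop, and stack_empty. None of these are defined in the notes, so the following fixed-size version is only a sketch of what they might look like:

#include <stdlib.h>

#define STACK_MAX 4096  /* must cover the worst-case number of pending
                         * subarray bounds, O(n) for Figure 3.10 as written */

typedef struct {
    int item[STACK_MAX];
    int top;
} *STACK;

STACK create_stack (void)
{
    STACK S = malloc(sizeof *S);
    S->top = 0;
    return S;
}

void push (STACK S, int x) { S->item[S->top++] = x; }
int  pop (STACK S)         { return S->item[--S->top]; }
int  stack_empty (STACK S) { return S->top == 0; }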
3.4 Exercises
- Devise a 3-way merge sorting algorithm, and compare its performance with
the 2-way mergesort given in class.
-
Suppose in a divide-and-conquer algorithm, we always divide a problem of
size n into n subproblems of size n/2, the dividing and combining steps
taking constant time. Write the recurrence for the running time T(n)
for solving a problem of size n, and solve the recurrence for T(n)
asymptotically.
-
What is the running time of the following algorithm?
void try (int n)
{
    if (n < 2)
        dosomething();   /* runs in constant time */
    else {
        try(n - 1);
        try(n - 2);
    }
}
-
Let a1, a2, …, an be a sequence of real numbers (may be negative).
The maximum subsum problem is to find
\[
\max\{ a_i + \cdots + a_j : 1 \le i \le j \le n \}.
\]
Design a divide-and-conquer algorithm for this problem that runs in
Θ(n lg n) time.
[Hint: At the first stage, the maximizing segment may
- come from the left half of the original sequence, or
- come from the right half, or
- straddle both halves.
The last case can be handled directly in linear time.]
-
Suppose in a divide-and-conquer algorithm, we always divide a problem of
size n into n subproblems of size n/3, the dividing and combining steps
taking linear time. Write the recurrence for the running time T(n)
for solving a problem of size n, and solve the recurrence for T(n)
asymptotically.
-
Modify algorithm quicksort to an algorithm that finds the kth smallest
item in the list a[1], a[2], …, a[n] without sorting the list.
3.5 Additional Problems
- Suppose we have a straightforward algorithm for a problem that runs
in Θ(n²) time. Suppose we can devise a
divide-and-conquer algorithm that divides an input into two inputs half as big, and takes
n lgn time to divide the problem and n lgn time to combine the
solutions to get a solution for the original input. Is the divide-and-conquer algorithm
less efficient than the straightforward scheme?
Justify your answer.
-
A sorting algorithm is said to be stable if at the end of the method,
identical elements occur in the same relative order as in the original
unsorted list. Are mergesort and quicksort stable?
Justify your answer.
- (Brassard and Bratley)
Let T[1..n] be a sorted array of integers, some of which may be negative
but all of which are different. Give an algorithm that is able to find an
index i such that 1 ≤ i ≤ n and T[i] = i,
provided such an index exists. Your algorithm should have running time O(log n).
- (Shamos)
Let X[1..n] and Y[1..n] contain two sets of integers, each sorted in
nondecreasing order. Write an algorithm which finds the median of the 2n
combined elements. [Hint: use binary search]
- (Horowitz and Sahni)
Let u and v be two n-bit numbers, where for simplicity n is a power of 2.
The traditional multiplication algorithm requires O(n²) operations.
A divide-and-conquer based algorithm splits the numbers into two equal parts,
writing u = a 2^{n/2} + b and v = c 2^{n/2} + d, and computes the product as
\[
uv = (a\,2^{n/2} + b)(c\,2^{n/2} + d) = ac\,2^n + (ad + bc)\,2^{n/2} + bd.
\]
The multiplications ac, ad, bc, and bd are done recursively.
- Determine the computing time of the above algorithm.
- If ad + bc is computed as (a + b)(c + d) - ac - bd, what is the computing time?
- (Winograd)
Let n = 2p, V = (v1, …, vn), W = (w1, …, wn). Then we can compute the
vector product VW by the formula
\[
VW = \sum_{1 \le i \le p} (v_{2i-1} + w_{2i})(v_{2i} + w_{2i-1})
   - \sum_{1 \le i \le p} v_{2i-1} v_{2i}
   - \sum_{1 \le i \le p} w_{2i-1} w_{2i},
\]
which requires 3n/2 multiplications. Show how to use this formula for the
multiplication of two n × n matrices, giving a method which requires
n^3/2 + n^2 multiplications rather than the usual n^3 multiplications.