Divide and Conquer

Given a problem P and an input I, let us denote this problem instance by (P, I). If size(I) is small, e.g. size(I) = 1, the instance is probably easy to solve. If size(I) = n is large, a divide-and-conquer approach takes the following steps:

  1. [Divide] Divide (P, I) into smaller subproblem instances
    (P1, I1), (P2, I2), …, (Pa, Ia),
    where size(Ij) < n for all j = 1, 2, …, a.
  2. [Conquer] Solve each (Pj, Ij), obtaining solution Oj, for all j = 1, 2, …, a.
  3. [Combine] Combine solutions O1, O2, …, Oa to obtain solution O for (P, I).
In step (2) above, if size(Ij) is small, then (Pj,Ij) is solved directly. However, if size(Ij) is not small enough, we apply the same divide-and-conquer process on (Pj,Ij) recursively.

Since the divide-and-conquer method starts with the largest problem (the top level) and proceeds to solve smaller and smaller subproblems (going down the recursion tree), it is a top-down approach.

For the divide-and-conquer approach to be applicable, we must have the following:

  1. The problem instance (P, I) can be divided naturally into subproblem instances (Pj, Ij), j = 1, 2, …, a, for some integer a ≥ 2.
  2. Each subproblem Pj is the same as P, and size(Ij) < size(I).
Assume that in the dividing step, a problem instance of size n is always partitioned into a subinstances of size n/b, for some constant integers a ≥ 1 and b > 1. Let D(n) and C(n) be the times for dividing and combining, respectively. If T(n) is the running time of this divide-and-conquer algorithm, then we have
   T(n) = a T(n/b) + D(n) + C(n)   if n > 1,
   T(n) = Θ(1)                     if n = 1.
Write f(n) = D(n) + C(n). For f(n) of suitable form, the master theorem can be used to solve for T(n). The recurrence

   T(n) = a T(n/b) + f(n)   if n > 1,
   T(n) = Θ(1)              if n = 1,

where a ≥ 1 and b > 1 are constant integers, has asymptotic solution

  1. T(n) = Θ(n^(log_b a)) if f(n) = O(n^(log_b a − ε)) for some constant ε > 0.
  2. T(n) = Θ(n^(log_b a) lg n) if f(n) = Θ(n^(log_b a)).
  3. T(n) = Θ(f(n)) if f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n.
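For example: the recurrence T(n) = 4T(n/2) + n falls under case 1, since n^(log_2 4) = n² and f(n) = n = O(n^(2−ε)) with ε = 1, giving T(n) = Θ(n²). The recurrence T(n) = 2T(n/2) + n falls under case 2, since f(n) = Θ(n^(log_2 2)) = Θ(n), giving T(n) = Θ(n lg n). The recurrence T(n) = T(n/2) + n falls under case 3, since f(n) = n = Ω(n^(0+ε)) with ε = 1 and f(n/2) = f(n)/2, giving T(n) = Θ(n).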

The recurrence
   T(n) = a T(n/b) + Θ(n^c)   if n > 1,
   T(n) = Θ(1)                if n = 1,
where c is a positive constant, has solutions

  1. T(n) = Θ(n^c) if a < b^c.
  2. T(n) = Θ(n^c lg n) if a = b^c.
  3. T(n) = Θ(n^(log_b a)) if a > b^c.
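For instance, the recurrence T(n) = 3T(n/2) + Θ(n), which arises in part 2 of Additional Problem 5 below, has a = 3, b = 2, c = 1; since 3 > 2^1, case 3 applies and T(n) = Θ(n^(log_2 3)) = Θ(n^1.585…).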
The recurrence
   T(n) = a T(n/b) + Θ(1)   if n > 1,
   T(n) = Θ(1)              if n = 1
has solutions

  1. T(n) = Θ(lg n) if a = 1.
  2. T(n) = Θ(n^(log_b a)) if a > 1.
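Although binary search is not discussed in this chapter, it is the canonical instance of the case a = 1: searching a sorted array examines the middle entry and recurses on one half, so T(n) = T(n/2) + Θ(1), and hence T(n) = Θ(lg n). A minimal sketch in C:

   int binary_search (int a[], int low, int high, int key)
   /* Return an index i in [low..high] with a[i] == key, or -1 */
   {
      int  mid;

      if (low > high)
         return -1;               /* empty range: not found */
      mid = (low + high) / 2;
      if (a[mid] == key)
         return mid;
      else if (a[mid] < key)
         return binary_search(a, mid + 1, high, key);   /* right half */
      else
         return binary_search(a, low, mid - 1, key);    /* left half */
   }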
We notice that the straightforward sorting algorithms (bubble sort, insertion sort, and selection sort) all run in O(n²) time. They belong to the class of comparison sorting algorithms, which are based on an abstract linear ordering relation `<' between keys. We want to know the theoretically best worst-case running time for this class of algorithms; that is, in the worst case, at least how many key comparisons must be performed by any such algorithm.

To sort an input of n different keys, a comparison sorting algorithm does a series of comparisons to determine the correct order of the keys. Such an algorithm can be described by its decision tree.

The decision tree of a sorting algorithm for n different keys is a regular (or full) binary tree in which each internal node represents a comparison of two keys Ki and Kj. The left subtree corresponds to the case Ki < Kj, and the right subtree corresponds to Ki > Kj. An external node contains a permutation π of {1, 2, …, n} that represents the order
   Kπ(1) < Kπ(2) < … < Kπ(n).
Figure 3.1 shows the decision tree of a sorting algorithm on an input of three different keys.


[Figure 3.1: picture omitted]

Since all permutations of {1, 2, …, n} are possible, and there are n! of them, there must be at least n! external nodes in this decision tree. We know that a regular binary tree with m external nodes must have at least ⌈lg m⌉ + 1 levels. On a path from the root to an external node, each internal node represents a comparison of keys performed for this particular input. Hence, in the worst case, the algorithm has to perform as many comparisons as the number of levels in the tree minus one, which is at least ⌈lg(n!)⌉ ≈ n lg n.

Any comparison sorting algorithm on n keys must perform approximately n lg n comparisons in the worst case.
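For example, with n = 3 there are 3! = 6 possible orders, so any comparison sort needs at least ⌈lg 6⌉ = 3 comparisons in the worst case, and 3 comparisons indeed suffice for three keys.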

Can we find a sorting algorithm that has this best worst-case performance? It turns out that a straightforward application of the divide-and-conquer method on a sorting problem can almost achieve this. The resulting algorithm is mergesort. The idea is illustrated in Figure 3.2.

[Mergesort] Sort the array a[low..high].

  1. [Stopping criterion] If low ≥ high, stop.
  2. [Divide] Set mid ← ⌊(low + high)/2⌋.
  3. [Conquer] Sort the subarrays a[low..mid] and a[mid+1..high] by recursive calls to mergesort.
  4. [Combine] Merge the two sorted subarrays into a sorted array.


[Figure 3.2: picture omitted]

Suppose Step (4) above is done by a procedure merge; then the algorithm can be implemented as in Figure 3.3.

   void merge (int a[], int low, int mid, int high);

   void mergesort (int a[], int low, int high)
   /* Sort array a[low..high] */
   {
      int  mid = (low + high) / 2;

      if (low < high) {
         mergesort(a, low, mid);
         mergesort(a, mid + 1, high);
         merge(a, low, mid, high);
      }
   }

   void merge (int a[], int low, int mid, int high)
   /* Merge sorted subarrays a[low..mid] and a[mid+1..high]
    * into sorted array a[low..high].  MAX_SIZE is assumed to be
    * defined elsewhere as an upper bound on the array size. */
   {
      int  b[MAX_SIZE], i = low, j = mid + 1, k = 0;

      /* Merge into b until one half is exhausted */
      while (i <= mid && j <= high)
         b[k++] = (a[i] < a[j]) ? a[i++] : a[j++];

      /* Copy rest of first half */
      while (i <= mid)
         b[k++] = a[i++];

      /* Copy rest of second half */
      while (j <= high)
         b[k++] = a[j++];

      /* Copy back into a */
      for (k = 0; k <= high - low; k++)
         a[low + k] = b[k];
   }
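A minimal driver showing how the routines above are called; the value of MAX_SIZE here is an arbitrary illustrative choice, assumed to bound the array length:

   #include <stdio.h>

   #define MAX_SIZE 100   /* illustrative bound for merge's buffer */

   /* ... mergesort and merge as in Figure 3.3 ... */

   int main (void)
   {
      int  a[] = { 9, 4, 7, 1, 8, 3 };
      int  i, n = sizeof(a) / sizeof(a[0]);

      mergesort(a, 0, n - 1);
      for (i = 0; i < n; i++)
         printf("%d ", a[i]);    /* prints: 1 3 4 7 8 9 */
      printf("\n");
      return 0;
   }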

It is obvious that merge runs in Θ(n) time. Let T(n) be the running time of mergesort. We have

   T(n) = 2T(n/2) + Θ(n),   for n > 1.

Theorem 3 implies that T(n) = Θ(n lg n), which attains the theoretically best worst-case running time.

Let us take key comparison as the basic operation, and let W(n) be the number of basic operations performed by mergesort in the worst case. It is obvious that merge does n-1 comparisons with an input of size n in the worst case. Therefore we have
   W(n) = 2W(n/2) + n − 1   if n > 1,
   W(n) = 0                 if n = 1.
Let n = 2^k for simplicity. Solving for W(n), we have

   W(n) = 2W(2^(k−1)) + 2^k − 1
        = 2^2 W(2^(k−2)) + 2·2^k − (1 + 2)
        = …
        = 2^k W(1) + k·2^k − (1 + 2 + … + 2^(k−1))
        = k·2^k − (2^k − 1)
        = n lg n − n + 1.
Note that W(n) > lg(n!), which is the least number of comparisons performed by any comparison sorting algorithm. However, since lg(n!) = n lg n − Θ(n) by Stirling's approximation,

   lim_{n→∞} W(n) / lg(n!) = 1.
This proves the following:

Mergesort is asymptotically optimal.

Since merge requires an additional array as big as the input, mergesort does not sort in place.

To implement function calls, the computer has to do a lot of housekeeping chores: pushing local variables, parameters, and return addresses onto the recursion stack, and saving the contents of all registers. To return from a recursive call, the values of the local variables and parameters of the previous call must be restored, along with the register contents; only then can control branch back to the return address found on top of the recursion stack. When the subarrays in mergesort are short, this overhead for recursion dominates the running time, and nonrecursive algorithms run faster. Hence, the overall performance of mergesort can be improved if it switches to a nonrecursive sorting algorithm for short subarrays, say of size less than 20. Figure 3.4 shows such an improved version.

      /* M is a constant determined by experiments */

      void mergesort (int a[], int low, int high)
      {
         int  mid = (low + high) / 2;

         if (high - low >= M) {
            mergesort(a, low, mid);
            mergesort(a, mid + 1, high);
            merge(a, low, mid, high);
         }
         else  /* switch to a nonrecursive sort */
            other_sort(a + low, high - low + 1);
      }
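The procedure other_sort is left unspecified above; any simple nonrecursive sort will do. A minimal sketch using insertion sort, with the signature implied by the call other_sort(a + low, high - low + 1):

   void other_sort (int a[], int n)
   /* Sort array a[0..n-1] by insertion sort */
   {
      int  i, j, key;

      for (i = 1; i < n; i++) {
         key = a[i];
         for (j = i - 1; j >= 0 && a[j] > key; j--)
            a[j + 1] = a[j];     /* shift larger keys one slot right */
         a[j + 1] = key;
      }
   }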

Quicksort is another fast sorting algorithm based on the divide-and-conquer paradigm. It has many different versions. The following version is chosen because its analysis is simple.

[Quicksort] Sort the array a[low..high].

  1. [Stopping criterion] If low ≥ high, stop.
  2. [Divide] The array a[low..high] is partitioned (rearranged) into three parts a[low..c−1], a[c], and a[c+1..high], such that each element in a[low..c−1] is ≤ a[c] and each element in a[c+1..high] is ≥ a[c].
  3. [Conquer] The two subarrays a[low..c−1] and a[c+1..high] are sorted by recursive calls to quicksort.
  4. [Combine] Since the subarrays are sorted in place, no work is needed to combine them: the entire array a[low..high] is now sorted.

The idea of quicksort is illustrated in Figure 3.5.


[Figure 3.5: picture omitted]

In pseudocode, quicksort can be described as:

   quicksort (a, low, high)
   // Sort a[low..high].
   {
      if (low < high) then {
         c ← partition(a, low, high);
         quicksort(a, low, c - 1);
         quicksort(a, c + 1, high);
      }
   }

[Partition] Rearrange the array a[low..high] and return an index c such that

   low ≤ s < c < t ≤ high  ⇒  a[s] ≤ a[c] ≤ a[t].

  1. Set x ← a[low]. (x is called the partitioning element, or the pivot.)
  2. Set i ← low and j ← high + 1.
  3. Move i up, within bounds, until a key ≥ x is found, and move j down, within bounds, until a key ≤ x is found.
  4. If i < j, then swap a[i] and a[j] and repeat Step 3; otherwise swap the pivot and a[j], and return j.
The following facts imply that partition is correct: at any time during the process of partition, every key in a[low+1..i−1] is ≤ x and every key in a[j+1..high] is ≥ x. When the indices cross (i ≥ j), we also have a[j] ≤ x, so swapping the pivot a[low] with a[j] puts the pivot into its final sorted position.

Figure 3.6 shows how partition works. An implementation of partition and quicksort is shown in figure 3.7.


[Figure 3.6: picture omitted]

int partition (int a[], int low, int high)
/* Return c such that elements in a[low..c-1] <=
 * a[c] <= elements in a[c+1..high] */
{
    int  t, x = a[low], i = low, j = high + 1;

    while (1) {
       /* move i */
       do { i++; } while (i <= high && a[i] < x);

       /* move j */
       do { j--; } while (i <= j && a[j] > x);

       if (i < j) { /* swap a[i] and a[j] */
          t = a[i]; a[i] = a[j]; a[j] = t;
       }
       else { /* Swap a[low] and a[j] */
          a[low] = a[j];  a[j] = x;
          return (j);
       }
   }
}
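For instance, on input a = {5, 3, 8, 2, 9, 1} with pivot x = 5, index i stops at 8 and j at 1, which are swapped to give {5, 3, 1, 2, 9, 8}; then i stops at 9 and j at 2, the indices having crossed, so the pivot is swapped with a[j], yielding {2, 3, 1, 5, 9, 8} and return value 3.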

void quicksort (int a[], int low, int high)
/* Sort array a[low..high] */
{
   int  c;

   if (low < high) {
      c = partition(a, low, high);
      quicksort(a, low, c - 1);
      quicksort(a, c + 1, high);
   }
}

For simplicity, we assume that there are no equal keys in the array. The result we will obtain is also valid with equal keys, but the derivation is more intricate.

We choose key comparison as the basic operation. Notice that the procedure quicksort is only a driver that makes calls to partition, where all the basic operations are performed. We observe that in partition, one comparison is done each time an index is advanced, and the two indices eventually cross. Hence n + 1 comparisons are required, where n is the number of entries in the subarray. In general, let C(n) be the number of basic operations performed with input size n. We have
   C(n) = C(s−1) + C(n−s) + n + 1   if n > 1 (where 1 ≤ s ≤ n),
   C(n) = 0                         if n = 0 or 1.
The value of s is not fixed; it depends on the particular input. The best case occurs when s ≈ n/2 for all recursive calls. Let B(n) denote C(n) in the best case. Ideally, we have

   B(n) ≈ 2B(n/2) + n + 1  ⇒  B(n) ≈ n lg n.
The worst case occurs when s = 1 or s = n for all recursive calls. Let W(n) denote C(n) in the worst case. We have

   W(n) = W(0) + W(n−1) + n + 1
        = W(n−1) + (n + 1)
        = W(n−2) + n + (n + 1)
        = …
        = W(1) + 3 + 4 + … + n + (n + 1)
        ≈ n²/2.
This is as bad as the straightforward quadratic sorting algorithms.

Now assume the array is randomly ordered. Since s can be any of 1, 2, …, n, the randomness assumption implies that

   Pr(s = i) = 1/n   for all i = 1, 2, …, n.
Let A(n) denote C(n) in the average case. Then
   A(n) = (1/n) ∑_{s=1}^{n} [A(s−1) + A(n−s) + n + 1],

which implies

   A(n) = n + 1 + (2/n) ∑_{t=0}^{n−1} A(t),

and hence

   n A(n) = n(n+1) + 2 ∑_{t=0}^{n−1} A(t).                    (3.1)
With n replaced by n−1, we have

   (n−1) A(n−1) = (n−1)n + 2 ∑_{t=0}^{n−2} A(t).              (3.2)
Subtracting (3.2) from (3.1) yields

   n A(n) − (n−1) A(n−1) = 2n + 2A(n−1)

   ⇒  A(n)/(n+1) = 2/(n+1) + A(n−1)/n   for all n.
Hence

   A(n)/(n+1) = 2/(n+1) + A(n−1)/n
              = 2/(n+1) + 2/n + A(n−2)/(n−1)
              = 2/(n+1) + 2/n + 2/(n−1) + A(n−3)/(n−2)
              = …
              = 2/(n+1) + 2/n + 2/(n−1) + … + 2/3 + A(1)/2
              = 2(1 + 1/2 + 1/3 + … + 1/(n+1)) − 3
              ≈ 2(ln(n+1) + γ) − 3
              ≈ 2 ln n.
Thus A(n) ≈ 2n ln n = (2 ln 2) n lg n ≈ 1.39 n lg n. This shows that quicksort does only about 39% more comparisons in the average case than in the best case.

There is still much room for improvement. We chose a[low] as the pivot only for convenience; this is not necessary. However, the choice of the pivot has a tremendous impact on performance. Indeed, from the above analysis, choosing a[low] as the pivot leads to the worst-case behavior if the input is already sorted, in increasing or decreasing order. These cases are unlikely under the randomness assumption, but in practice the assumption may not hold: for many applications, it is not unusual that the inputs are almost sorted, and under these circumstances quicksort is extremely slow. To avoid this, we can choose another entry a[p] as the pivot, swap it into the first position, and let partition proceed as before.

There are many ways of choosing the pivot:

  1. p = ⌊(low + high)/2⌋. This leads to the best-case behavior for inputs that are almost sorted.
  2. p = a random number between low and high. The resulting version is called randomized quicksort. The randomness assumption used in the average-case analysis holds in this situation.
  3. Median-of-three: choose any three entries of the input, e.g. the first, the last, and the middle entries, and take their median as the pivot. This can effectively eliminate the worst-case behavior; a sketch of such a choose function is given below.
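A sketch of a median-of-three choose, with the signature used by the calls choose(a, low, high) in Figures 3.8 and 3.9 (the name comes from those figures; this particular implementation is only one possibility):

   int choose (int a[], int low, int high)
   /* Return the index of the median of a[low], a[mid], a[high] */
   {
      int  mid = (low + high) / 2;

      if (a[low] < a[mid]) {
         if (a[mid] < a[high]) return mid;          /* low < mid < high */
         return (a[low] < a[high]) ? high : low;    /* median of low, high */
      }
      else {
         if (a[low] < a[high]) return low;          /* mid <= low < high */
         return (a[mid] < a[high]) ? high : mid;    /* median of mid, high */
      }
   }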
Like mergesort, quicksort is very inefficient for small subarrays. We should avoid making recursive calls if the subarrays are not long enough, as shown in Figure 3.8.

   /* M is a constant determined by experiments */

   void quicksort (int a[], int low, int high)
   {
      int  c, p, t;

      if (high - low >= M) {
         p = choose(a, low, high);
         t = a[low];  a[low] = a[p];  a[p] = t;   /* swap a[low] and a[p] */
         c = partition(a, low, high);
         quicksort(a, low, c - 1);
         quicksort(a, c + 1, high);
      }
      else  /* switch to a nonrecursive sort */
         other_sort(a + low, high - low + 1);
   }
   

Even though partition splits the input in place, quicksort is not an in-place sorting algorithm, since the depth of the recursion stack depends on the input size n. In the worst case, the depth can be as bad as Θ(n). However, if we carefully order the two recursive calls so that the smaller subarray is always sorted first, the depth can be kept near lg n. Figure 3.9 shows how this can be done; see the remark following it.

   /* M is a constant determined by experiments */

   void quicksort (int a[], int low, int high)
   {
      int  c, p, t;

      if (high - low >= M) {
         p = choose(a, low, high);
         t = a[low];  a[low] = a[p];  a[p] = t;   /* swap a[low] and a[p] */
         c = partition(a, low, high);
         if (c - low < high - c) {
            quicksort(a, low, c - 1);    /* left side is smaller: do it first */
            quicksort(a, c + 1, high);
         }
         else {
            quicksort(a, c + 1, high);   /* right side is smaller: do it first */
            quicksort(a, low, c - 1);
         }
      }
      else
         other_sort(a + low, high - low + 1);
   }
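Strictly speaking, the version above bounds the depth only if the compiler eliminates the second (tail) recursive call; otherwise both calls still consume stack. A sketch that guarantees O(lg n) depth regardless of the compiler, by recursing only on the smaller side and looping on the larger (same assumptions about M, choose, and other_sort as above):

   void quicksort (int a[], int low, int high)
   {
      int  c, p, t;

      while (high - low >= M) {
         p = choose(a, low, high);
         t = a[low];  a[low] = a[p];  a[p] = t;   /* swap a[low] and a[p] */
         c = partition(a, low, high);
         if (c - low < high - c) {        /* left side is smaller */
            quicksort(a, low, c - 1);     /* recurse on at most n/2 keys */
            low = c + 1;                  /* iterate on the larger side */
         }
         else {
            quicksort(a, c + 1, high);
            high = c - 1;
         }
      }
      other_sort(a + low, high - low + 1);
   }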
   

There are various reasons for removing recursion:

  1. Some programming languages do not support recursion, e.g. Fortran, Cobol, and assembly languages.
  2. Removing recursion can help us understand the implications and hidden pitfalls of recursion.
  3. A nonrecursive version could run faster.
A nonrecursive quicksort is given in Figure 3.10.

   void nonrecursive_quicksort (int a[], int n)
   /* Sort array a of size n */
   {
      int  c, p, t, low = 0, high = n - 1;
      STACK  S = create_stack();

      while (1) {

         while (high - low >= M) {
            p = choose(a, low, high);
            t = a[low];  a[low] = a[p];  a[p] = t;   /* swap a[low] and a[p] */
            c = partition(a, low, high);
            push(S, c + 1);  push(S, high);   /* save the right subarray */
            high = c - 1;                     /* continue with the left one */
         }

         other_sort(a + low, high - low + 1);

         if (!stack_empty(S)) {
            high = pop(S);  low = pop(S);
         }
         else
            return;
      }
   }
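Note that this nonrecursive version always pushes the right subarray and continues with the left one, so on unfavorable inputs its explicit stack can still grow to Θ(n) entries; pushing the larger subarray and continuing with the smaller one, as in Figure 3.9, would keep the stack at about lg n entries.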
   

3.4 Exercises

  1. Devise a 3-way merge sorting algorithm, and compare its performance with the 2-way mergesort given in class.
  2. Suppose in a divide-and-conquer algorithm, we always divide a problem of size n into n subproblems of size n/2, the dividing and combining steps taking constant time. Write the recurrence for the running time T(n) for solving a problem of size n, and solve the recurrence for T(n) asymptotically.
  3. What is the running time of the following algorithm?

       void try (int n)
       {
          if (n < 2)
             dosomething();   /* runs in constant time */
          else
          {
             try(n - 1);
             try(n - 2);
          }
       }
    
    
  4. Let a1, a2, …, an be a sequence of real numbers (may be negative). The maximum subsum problem is to find max{ ai + … + aj : 1 ≤ i ≤ j ≤ n }.

    Design a divide-and-conquer algorithm for this problem that runs in Θ(n lg n) time.

    [Hint: At the first stage, the maximizing segment may

    1. come from the left half of the original sequence, or
    2. come from the right half, or
    3. straddle both halves.
    The last case can be handled directly in linear time.]
  5. Suppose in a divide-and-conquer algorithm, we always divide a problem of size n into n subproblems of size n/3, the dividing and combining steps taking linear time. Write the recurrence for the running time T(n) for solving a problem of size n, and solve the recurrence for T(n) asymptotically.
  6. Modify algorithm quicksort to an algorithm that finds the kth smallest item in the list a[1], a[2], …, a[n] without sorting the list.

3.5 Additional Problems

  1. Suppose we have a straightforward algorithm for a problem that runs in Θ(n²) time. Suppose we can devise a divide-and-conquer algorithm that divides an input into two inputs half as big, and takes n lg n time to divide the problem and n lg n time to combine the solutions to get a solution for the original input. Is the divide-and-conquer algorithm less efficient than the straightforward scheme? Justify your answer.
  2. A sorting algorithm is said to be stable if at the end of the method, identical elements occur in the same relative order as in the original unsorted list. Are mergesort and quicksort stable? Justify your answer.
  3. (Brassard and Bratley) Let T[1..n] be a sorted array of integers, some of which may be negative but all of which are different. Give an algorithm that is able to find an index i such that 1 ≤ i ≤ n and T[i] = i, provided such an index exists. Your algorithm should have running time O(log n).
  4. (Shamos) Let X[1..n] and Y[1..n] contain two sets of integers, each sorted in nondecreasing order. Write an algorithm which finds the median of the 2n combined elements. [Hint: use binary search]
  5. (Horowitz and Sahni) Let u and v be two n-bit numbers, where for simplicity n is a power of 2. The traditional multiplication algorithm requires O(n²) operations. A divide-and-conquer based algorithm splits the numbers into two equal parts, computing the product as

       uv = (a·2^(n/2) + b)(c·2^(n/2) + d) = ac·2^n + (ad + bc)·2^(n/2) + bd.

    The multiplications ac, ad, bc, and bd are done recursively.

    1. Determine the computing time of the above algorithm.
    2. If ad + bc is computed as (a + b)(c + d) − ac − bd, what is the computing time?
  6. (Winograd) Let n = 2p, V = (v_1, …, v_n), W = (w_1, …, w_n). Then we can compute the vector product VW by the formula

       ∑_{1≤i≤p} (v_{2i−1} + w_{2i})(v_{2i} + w_{2i−1}) − ∑_{1≤i≤p} v_{2i−1} v_{2i} − ∑_{1≤i≤p} w_{2i−1} w_{2i},

    which requires 3n/2 multiplications. Show how to use this formula for the multiplication of two n × n matrices, giving a method which requires n³/2 + n² multiplications rather than the usual n³ multiplications.


