by William Shoaff with lots of help
You can download a postscript version of this file (which is prettier) at
Recursion is a core idea in computer science. Our goal is to learn introductory techniques useful in analyzing the time and space complexity of recursive algorithms. Such programs fall into the divide and conquer paradigm for problem solving. As every general knows, the key to this strategy is in the division. To illustrate the ideas to be presented, we'll start with a division process that is successful because it leads to a slow process.
This story, attributed to Lucas, gives us hope that the world will be around for a number of years yet. To paraphrase it: once God created the world, he established an order of monks in Asia and set before them the task of moving 64 golden disks arranged on a diamond needle to a second needle. The disks were graduated in size, and God established the rule that a larger disk could never be placed on a smaller one. To complete the task without violating the rule, a third diamond needle was provided. God decreed that once all the disks have been moved, the world will end.
We will see that $2^{64} - 1$ moves are required; thus one can easily calculate that about three hundred billion years will elapse before the world is finished, if the monks can move one disk per second. To see this we make the approximation $2^{10} \approx 10^3$, leading to the estimate of about $10^{19}$ moves. At one move a second, that's $10^{19}$ seconds, and since there are about 3 billion seconds in a century, it will take about $3 \times 10^{9}$ centuries, that is, about $3 \times 10^{11}$ years.
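The rough estimate above can be compared with exact arithmetic. The following is a minimal sketch, not part of the original notes: the class name `HanoiDoomsday` and the 365.25-day year are assumptions made for illustration. It computes $2^{64} - 1$ exactly and converts seconds to years.

```java
import java.math.BigInteger;

final class HanoiDoomsday {
    // Exact number of moves for the given number of disks: 2^disks - 1.
    static BigInteger moves(int disks) {
        return BigInteger.ONE.shiftLeft(disks).subtract(BigInteger.ONE);
    }

    // Whole years needed at one move per second.
    // Assumption of this sketch: a year is 365.25 days.
    static BigInteger years(int disks) {
        BigInteger secondsPerYear = BigInteger.valueOf(31_557_600L); // 365.25 * 24 * 3600
        return moves(disks).divide(secondsPerYear);
    }

    public static void main(String[] args) {
        System.out.println("moves: " + moves(64));
        System.out.println("years: " + years(64));
    }
}
```

The exact figure comes out somewhat under 600 billion years. It is larger than the back-of-the-envelope value because the approximation rounds $1.6 \times 10^{19}$ down to $10^{19}$; the order of magnitude is the same.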
The Hanoi monks could use this scheme to move all the disks:
Find a method to move the top 63 disks from the first needle to the third. Then move the 64th disk from the first to the second. And finish by moving the 63 disks from the third needle to the second. We can easily translate this into a recursive program, where Strings first, second, and third represent the needles and integers 1 through 64 the disks.
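<Tower of Hanoi>=
public void towerOfHanoi(String first, String second, String third, int disks) {
    // Move `disks` disks from needle `first` to needle `second`, using `third` as the spare.
    if (disks > 0) {
        // Clear the way: move the top disks-1 disks from first to third.
        towerOfHanoi(first, third, second, disks - 1);
        System.out.println("Moving disk " + disks + " from " + first + " to " + second);
        // Finish: move those disks-1 disks from third onto second.
        towerOfHanoi(third, second, first, disks - 1);
    }
}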
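To experiment with the program, it helps to collect the moves rather than print them. The sketch below is one possible way to do that. The class `HanoiMoves`, the `solve` helper, the needle names "A", "B", "C", and the `disk:from->to` encoding are all assumptions introduced here, not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;

final class HanoiMoves {
    // Same recursion as towerOfHanoi(), but each move is recorded as "disk:from->to".
    static void towerOfHanoi(String first, String second, String third,
                             int disks, List<String> moves) {
        if (disks > 0) {
            towerOfHanoi(first, third, second, disks - 1, moves);
            moves.add(disks + ":" + first + "->" + second);
            towerOfHanoi(third, second, first, disks - 1, moves);
        }
    }

    // Moves needed to transfer `disks` disks from needle A to needle B.
    static List<String> solve(int disks) {
        List<String> moves = new ArrayList<>();
        towerOfHanoi("A", "B", "C", disks, moves);
        return moves;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 4; n++) {
            System.out.println(n + " disks: " + solve(n).size() + " moves");
        }
        System.out.println(solve(2));
    }
}
```

Counting the recorded moves for small inputs gives a quick sanity check on any formula proposed for the number of moves.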
The first resource we wish to measure is the number of moves made by the monks in completing their task. To do so, let $M(n) = M_n$ name the number of moves needed to transfer $n$ disks from one needle to another (we'll often go back and forth between functional ($M(n)$) and subscript ($M_n$) notation, depending on what seems appropriate). The towerOfHanoi() program, called with $n$ disks, makes $M_{n-1}$ moves in transferring the top $n - 1$ disks from the first to the third needle. It makes 1 move in transferring the $n$th disk from the first to the second needle. And finally, it makes $M_{n-1}$ moves while taking $n - 1$ disks from the third to the second needle. Summing these move counts yields the recurrence relation
$$M_n = 2M_{n-1} + 1. \qquad (1)$$
Equation 1 is called a relation since it has multiple solutions. We can fix on one solution by providing an initial condition. For our problem $M_0 = 0$ is a reasonable initial condition, as no moves are necessary when there are no disks to move.
Recurrence relations are called difference equations when the initial condition is specified as part of the recursion formula. The study of difference equations using the sum and difference calculus is a fascinating topic predating differential and integral calculus. Difference equation theory parallels the theory of differential equations; the analogies between the two are beautiful. We will not explore these ideas here. Our goal is more immediate: how does one ``solve'' the Tower of Hanoi recurrence?
A simple, yet useful tool is to explore the sequence defined by the recurrence to see if it is familiar. We know $M_0 = 0$, so using equation 1, $M_1 = 2 \cdot 0 + 1 = 1$, $M_2 = 2 \cdot 1 + 1 = 3$, and $M_3 = 2 \cdot 3 + 1 = 7$. If you're a computer scientist, you recognize this as the start of the sequence defined by $M_n = 2^n - 1$, which gives the largest unsigned integer representable with $n$ bits.
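Exploring the sequence is easy to automate. The sketch below (the class and method names are illustrative assumptions) iterates the recurrence $M_n = 2M_{n-1} + 1$ from $M_0 = 0$ and prints each term next to the conjectured value $2^n - 1$.

```java
final class HanoiSequence {
    // M_n computed directly from the recurrence M_n = 2 M_{n-1} + 1 with M_0 = 0.
    static long movesByRecurrence(int n) {
        long m = 0;
        for (int i = 1; i <= n; i++) {
            m = 2 * m + 1;
        }
        return m;
    }

    // Conjectured closed form 2^n - 1; a long holds it for 0 <= n <= 62.
    static long movesClosedForm(int n) {
        return (1L << n) - 1;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + ": " + movesByRecurrence(n) + " vs " + movesClosedForm(n));
        }
    }
}
```

Agreement on the sampled terms suggests the formula; it does not prove it.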
But exploring a few cases should never be substituted for a proof, unless one does not have the tools for a proof. Here, several proof techniques exist and provide an arsenal for solving difference equations.
Suppose we can deduce a solution from sampling the sequence. Then the solution can often be proved correct by using induction. For the Tower of Hanoi recurrence (equation 1), let's pretend that $M_n = 2^n - 1$ is correct. Induction requires us to demonstrate the proposed solution is valid for a base case. Here, we know $M_0 = 2^0 - 1 = 0$ is correct. Then we must establish the inductive step: pretending the formula is correct for $n - 1$, we show it is correct for the next value $n$. That is, if $M_{n-1} = 2^{n-1} - 1$ then from equation 1
$$M_n = 2M_{n-1} + 1 = 2(2^{n-1} - 1) + 1 = 2^n - 1.$$
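One simple method for solving difference equations is back substitution. To see how this works, we'll start with the recurrence
$$M_n = 2M_{n-1} + 1 \qquad (2)$$
and its previous incarnation:
$$M_{n-1} = 2M_{n-2} + 1. \qquad (3)$$
Substituting the expression for $M_{n-1}$ from 3 where $M_{n-1}$ occurs in 2 yields
$$M_n = 2(2M_{n-2} + 1) + 1 = 2^2 M_{n-2} + 2 + 1. \qquad (4)$$
Now we also know that
$$M_{n-2} = 2M_{n-3} + 1,$$
and substituting this into 4 gives
$$M_n = 2^3 M_{n-3} + 2^2 + 2 + 1.$$
Making an inductive leap we find
$$M_n = 2^n M_0 + 2^{n-1} + \cdots + 2 + 1 = 2^n M_0 + 2^n - 1,$$
and since $M_0 = 0$ we know
$$M_n = 2^n - 1.$$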
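Summing factors are the forward elimination counterpart to back substitution. A key idea is telescoping sums:
$$\sum_{k=1}^{n} (a_k - a_{k-1}) = a_n - a_0.$$
For the Tower of Hanoi equation we can make it telescope by writing it in the form $M_n - 2M_{n-1} = 1$ and multiplying each earlier instance by the next power of 2:
$$\begin{aligned}
M_n - 2M_{n-1} &= 1 \\
2M_{n-1} - 2^2 M_{n-2} &= 2 \\
2^2 M_{n-2} - 2^3 M_{n-3} &= 2^2 \\
2^3 M_{n-3} - 2^4 M_{n-4} &= 2^3 \\
&\;\;\vdots \\
2^{n-1} M_1 - 2^n M_0 &= 2^{n-1}
\end{aligned}$$
Adding these $n$ equations, every intermediate term on the left cancels, leaving $M_n - 2^n M_0 = 1 + 2 + \cdots + 2^{n-1} = 2^n - 1$. Since $M_0 = 0$, again $M_n = 2^n - 1$.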
The solution to the general difference equation
Let's apply the Master theorem to binary search:
Let's apply the Master theorem to mergeSort.
Finally, let's apply the Master theorem to the Tower of Hanoi. Notice that
Here's how it works.
Let's pretend that $m = r(n) = b^n$ for some fixed base $b$ and exponential parameter $m$. The function $r(n)$ is the reparameterization. Notice that when $n$ is replaced by $n - 1$, the new parameter is divided by $b$: $m$ is replaced by $m/b$. Now let's also rename $M(n) = M(\log_b m) = \widehat{M}(m)$ so that we have:
$$\begin{aligned}
\widehat{M}(m) &= M(\log_b m) \\
&= M(n) \\
&= 2M(n - 1) + 1 \\
&= 2\widehat{M}(m/b) + 1.
\end{aligned}$$
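The change of parameter can be checked numerically. This sketch assumes base $b = 2$ and uses the hypothetical method names `m` and `mHat` for the recurrence before and after the substitution $m = 2^n$; the base case `mHat(1) = 0` corresponds to $M(0) = 0$ because $2^0 = 1$.

```java
final class Reparameterize {
    // Original recurrence in n: M(n) = 2 M(n-1) + 1, M(0) = 0.
    static long m(int n) {
        return n == 0 ? 0 : 2 * m(n - 1) + 1;
    }

    // Reparameterized recurrence in m = 2^n: Mhat(m) = 2 Mhat(m/2) + 1, Mhat(1) = 0.
    // Only meaningful when the argument is a power of two.
    static long mHat(long m) {
        return m == 1 ? 0 : 2 * mHat(m / 2) + 1;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            long power = 1L << n;
            System.out.println("n=" + n + " M(n)=" + m(n) + " Mhat(2^n)=" + mHat(power));
        }
    }
}
```

The two columns agree, and both equal $m - 1$: a recurrence that is exponential in $n$ is linear in the new parameter $m$.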
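A recurrence of the form
$$a_0 t_n + a_1 t_{n-1} + \cdots + a_k t_{n-k} = 0$$
is a homogeneous linear recurrence with constant coefficients.
By renaming $t_n = x^n$ we can turn a linear recurrence with constant coefficients into a polynomial equation: dividing $a_0 x^n + a_1 x^{n-1} + \cdots + a_k x^{n-k} = 0$ by $x^{n-k}$ gives
$$a_0 x^k + a_1 x^{k-1} + \cdots + a_k = 0.$$
The left side is the characteristic polynomial of the recurrence.
Now, theoretically, every polynomial of degree $k$ can be factored as a product of linear terms; specifically, the fundamental theorem of algebra states:
$$a_0 x^k + a_1 x^{k-1} + \cdots + a_k = a_0 (x - r_1)(x - r_2) \cdots (x - r_k),$$
where the roots $r_1, \ldots, r_k$ may be complex and need not be distinct.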
Let $r$ be any root of the characteristic polynomial. Then $t_n = r^n$ is a solution of the recurrence:
$$\begin{aligned}
a_0 t_n + a_1 t_{n-1} + \cdots + a_k t_{n-k} &= a_0 r^n + a_1 r^{n-1} + \cdots + a_k r^{n-k} \\
&= r^{n-k}\left(a_0 r^k + a_1 r^{k-1} + \cdots + a_k\right) \\
&= 0.
\end{aligned}$$
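Consider the Fibonacci characteristic polynomial $x^2 - x - 1$. Using the quadratic formula, we find roots
$$r_1 = \frac{1 + \sqrt{5}}{2} \quad \text{and} \quad r_2 = \frac{1 - \sqrt{5}}{2}.$$
But to the point, the Fibonacci numbers are given by
$$F_n = c_1 r_1^n + c_2 r_2^n,$$
where the constants $c_1$ and $c_2$ are determined by the initial conditions:
$$\begin{aligned}
F_0 &= c_1 r_1^0 + c_2 r_2^0 \\
&= c_1 + c_2 \\
&= 0, \\
F_1 &= c_1 r_1 + c_2 r_2 \\
&= c_1 \frac{1 + \sqrt{5}}{2} - c_1 \frac{1 - \sqrt{5}}{2} \\
&= c_1 \sqrt{5} \\
&= 1,
\end{aligned}$$
so $c_1 = 1/\sqrt{5}$ and $c_2 = -1/\sqrt{5}$.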
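The root-based formula can be compared against the recurrence itself. The sketch below is illustrative only: the class and method names are assumptions, and it takes $c_1 = 1/\sqrt{5}$, $c_2 = -1/\sqrt{5}$, which are the values that the initial conditions $F_0 = 0$, $F_1 = 1$ force. Because it uses floating-point arithmetic, the result is rounded to the nearest integer and should only be trusted for moderate $n$.

```java
final class FibonacciClosedForm {
    // F_n from the recurrence F_n = F_{n-1} + F_{n-2}, F_0 = 0, F_1 = 1.
    static long fibByRecurrence(int n) {
        long prev = 0; // F_0
        long curr = 1; // F_1
        for (int i = 0; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return prev;
    }

    // F_n = c1 * r1^n + c2 * r2^n, evaluated in doubles and rounded.
    static long fibByRoots(int n) {
        double sqrt5 = Math.sqrt(5.0);
        double r1 = (1 + sqrt5) / 2;
        double r2 = (1 - sqrt5) / 2;
        double c1 = 1 / sqrt5;
        double c2 = -1 / sqrt5;
        return Math.round(c1 * Math.pow(r1, n) + c2 * Math.pow(r2, n));
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 20; n++) {
            System.out.println(n + ": " + fibByRecurrence(n) + " vs " + fibByRoots(n));
        }
    }
}
```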
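Now let's consider the non-homogeneous Tower of Hanoi recurrence. Writing it for $n$ and for $n - 1$ and subtracting the second equation from the first eliminates the constant term:
$$\begin{aligned}
M_n - 2M_{n-1} &= 1 \\
M_{n-1} - 2M_{n-2} &= 1 \\
M_n - 3M_{n-1} + 2M_{n-2} &= 0.
\end{aligned}$$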
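As a worked example, the homogeneous recurrence $M_n - 3M_{n-1} + 2M_{n-2} = 0$ can be solved with the characteristic polynomial. Being second order, it needs two initial values; the sketch below assumes $M_0 = 0$ and takes $M_1 = 1$ from the original recurrence.

```latex
\begin{aligned}
x^2 - 3x + 2 &= (x - 1)(x - 2), && \text{roots } r_1 = 1,\ r_2 = 2 \\
M_n &= c_1 \cdot 1^n + c_2 \cdot 2^n \\
M_0 &= c_1 + c_2 = 0, \qquad M_1 = c_1 + 2c_2 = 1 \\
c_2 &= 1, \qquad c_1 = -1 \\
M_n &= 2^n - 1
\end{aligned}
```

This agrees with the solution found by the other techniques.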
More generally, if the non-homogeneous recurrence has the form
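The above discussion of non-homogeneous recurrences led to characteristic polynomials with repeated roots. A root $r$ has multiplicity $m$ if the characteristic polynomial can be factored as
$$p(x) = (x - r)^m q(x), \qquad q(r) \neq 0.$$
When a root $r$ has multiplicity $m$, we can show that
$$t_n = r^n,\quad t_n = n r^n,\quad t_n = n^2 r^n,\quad \ldots,\quad t_n = n^{m-1} r^n$$
are all solutions of the recurrence.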
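Generating functions offer a powerful tool for solving recurrences. The idea is simple, but can involve some convoluted symbol manipulation. We start with an infinite sequence $a_0, a_1, a_2, \ldots$ and package it as the coefficients of a power series, $A(z) = \sum_{n \ge 0} a_n z^n$, called the generating function of the sequence.
A two-step process solves difference equations using generating functions. The difference equation defines a sequence $a_0, a_1, a_2, \ldots$ of unknown terms. First, use the difference equation to find a closed form for the generating function; second, expand that closed form into a series and read off the coefficients.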
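Let's apply this method to the Tower of Hanoi recurrence. The sequence $M_0, M_1, M_2, \ldots$ has terms $M_n$ representing the number of moves when $n$ disks are stacked on the first needle. The recurrence $M_n = 2M_{n-1} + 1$, $n \ge 1$, is multiplied by $z^n$ and summed from 1 to $\infty$. Writing $M(z) = \sum_{n \ge 0} M_n z^n$ for the generating function, we have
$$M(z) = M_0 + \sum_{n=1}^{\infty} M_n z^n = M_0 + \sum_{n=1}^{\infty} (2M_{n-1} + 1) z^n, \qquad (10)$$
$$\sum_{n=1}^{\infty} (2M_{n-1} + 1) z^n = 2z \sum_{n=1}^{\infty} M_{n-1} z^{n-1} + \sum_{n=1}^{\infty} z^n = 2z M(z) + \frac{z}{1 - z}. \qquad (11)$$
Since $M_0 = 0$, combining these gives $M(z) = 2zM(z) + z/(1 - z)$, and solving for $M(z)$ yields
$$M(z) = \frac{z}{(1 - z)(1 - 2z)}. \qquad (12)$$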
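The second step is to expand this expression into a series. We have:
$$\begin{aligned}
M(z) &= \frac{z}{(1 - z)(1 - 2z)} \\
&= \frac{1}{1 - 2z} - \frac{1}{1 - z} \\
&= \sum_{n=0}^{\infty} (2z)^n - \sum_{n=0}^{\infty} z^n \\
&= \sum_{n=0}^{\infty} 2^n z^n - \sum_{n=0}^{\infty} z^n \\
&= \sum_{n=0}^{\infty} (2^n - 1) z^n,
\end{aligned}$$
so the coefficient of $z^n$ is $M_n = 2^n - 1$.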
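The series expansion can also be checked mechanically by multiplying power series. The following sketch (class and method names are assumptions) builds the coefficients of $z/((1 - z)(1 - 2z))$ as a shifted product of the two geometric series $1/(1 - z)$ and $1/(1 - 2z)$.

```java
final class HanoiGeneratingFunction {
    // Coefficients of z^0 .. z^(terms-1) in z / ((1 - z)(1 - 2z)).
    // Keep terms <= 62 so every coefficient fits in a long.
    static long[] coefficients(int terms) {
        long[] geometric = new long[terms]; // 1/(1 - z)  = sum of z^i
        long[] doubling = new long[terms];  // 1/(1 - 2z) = sum of 2^j z^j
        for (int i = 0; i < terms; i++) {
            geometric[i] = 1;
            doubling[i] = 1L << i;
        }
        long[] result = new long[terms];
        for (int n = 1; n < terms; n++) {
            // The leading factor z shifts the product of the two series by one place.
            for (int j = 0; j <= n - 1; j++) {
                result[n] += geometric[n - 1 - j] * doubling[j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long[] c = coefficients(11);
        for (int n = 0; n < c.length; n++) {
            System.out.println("coefficient of z^" + n + " = " + c[n]);
        }
    }
}
```

Each coefficient is a partial sum $1 + 2 + \cdots + 2^{n-1}$, which is another way to see where $2^n - 1$ comes from.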
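Consider a recurrence in the form that appears in the Master theorem, $T(n) = aT(n/b) + f(n)$. Reversing the reparameterization, set $n = b^m$ and write $\widehat{T}(m) = T(b^m)$, so that
$$\begin{aligned}
\widehat{T}(m) &= T(n) \\
&= aT(n/b) + f(n) \\
&= a\widehat{T}(m - 1) + f(b^m).
\end{aligned}$$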
To test our mettle, we will use our difference equation solution techniques on a collection of recursive algorithms. We start with some classic algorithms that sort elements selected from a sample space with a total (also called linear) order.
With some loss of generality, we will assume the sorting problem we are to solve has an instance described by
The code for mergeSort() is in another file which you can visit should you forget the details of how it is implemented.
Under the usual uniform-cost assumption, each move of a word from one array to another has unit cost and each comparison between words has unit cost. Other incidental operations are not counted.
Since, on the initial call, there are $n$ words to move from words[] into tempWords[] and then back into words[], there are $2n$ String moves. Also there are $n$ String comparisons on the initial call, as k goes from first=0 to last=n-1. Adding the cost of the two recursive calls on half-size arrays, the recurrence relation for the time complexity $T(n)$ of mergeSort() is
$$T(n) = 2T(n/2) + 3n.$$
A reasonable initial condition is $T(1) = 1$, which takes into account the cost of a subroutine call which does no real work. From the Master theorem, we can conclude that mergeSort() has time complexity
$$T(n) = \Theta(n \lg n),$$
that is, mergeSort() has a best, worst, and average case time complexity that grows as $n \lg n$.
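To see the $n \lg n$ growth concretely, one can evaluate a recurrence of this shape directly. The sketch below assumes the cost model described above ($2n$ moves plus $n$ comparisons per call, so $T(n) = 2T(n/2) + 3n$ with $T(1) = 1$) and restricts $n$ to powers of two; the class and method names are illustrative. For $n = 2^k$ this recurrence has the exact solution $3n \lg n + n$, which the code compares against.

```java
final class MergeSortCost {
    // T(n) = 2 T(n/2) + 3n with T(1) = 1, for n a power of two.
    static long cost(long n) {
        if (n == 1) {
            return 1;
        }
        return 2 * cost(n / 2) + 3 * n;
    }

    // Exact solution for n = 2^k: 3 n lg n + n.
    static long closedForm(int k) {
        long n = 1L << k;
        return 3 * n * k + n;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 10; k++) {
            System.out.println("n=" + (1L << k) + " T(n)=" + cost(1L << k)
                    + " closed form=" + closedForm(k));
        }
    }
}
```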
What is the space complexity $S(n)$ of mergeSort()? In the merge section of the code a new array of size last-first+1 is declared. This is largest when last=n-1 and first=0. We must assume that a smart compiler reuses and reclaims space, and so the space complexity is $O(n)$.
Another issue is stack space required to support recursion. A pointer to the words to be sorted and the range being sorted and merged must be stored, that is, three items. Since the recursion depth is O(lg n), we find that the additional space complexity due to recursion does not alter the big-O space complexity mentioned above.
The code for quickSort() is in another file which you can visit should you forget the details of how it is implemented.
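That partition requires $O(n)$ operations is clear from analyzing the single for loop. In fact, using $n + 1$ as the cost of partition will make the analysis smooth, so we assume the difference equation for quickSort() is
$$T(n) = T(p) + T(n - p - 1) + n + 1,$$
where $p$ is the number of elements that end up on one side of the pivot.
In the worst case partitioning will create an empty array on one side of the pivot and an array of size $n - 1$ on the other side, that is, $p = 0$ (or $p = n - 1$). If this occurs on every call then
$$T(n) = T(n - 1) + T(0) + n + 1,$$
which grows quadratically: $T(n) = \Theta(n^2)$.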
What are the instances where this worst case behavior occurs?
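Intuitively, in the best case, the pivot will partition the array into (nearly) equal sized sub-arrays, that is, $p = n/2$. When this happens at each recursive stage, the recurrence for quickSort() is
$$T(n) = 2T(n/2) + n + 1,$$
which has the same form as the mergeSort() recurrence and likewise grows as $n \lg n$.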
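In the average case, any pivot from $p = 0$ to $p = n - 1$ can occur. We will assume each of these $n$ cases occurs with equal probability $1/n$. Thus, the recurrence is:
$$T(n) = n + 1 + \frac{1}{n} \sum_{p=0}^{n-1} \left[ T(p) + T(n - p - 1) \right] = n + 1 + \frac{2}{n} \sum_{p=0}^{n-1} T(p).$$
Multiplying through by $n$ gives
$$nT(n) = n(n + 1) + 2 \sum_{p=0}^{n-1} T(p). \qquad (18)$$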
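Terms of the form $nT(n)$ in equation 18 suggest differentiation of the generating function $G(z) = \sum_{n \ge 0} T(n) z^n$. Notice that
$$\begin{aligned}
G'(z) - zG'(z) &= \sum_{n \ge 1} nT(n) z^{n-1} - \sum_{n \ge 0} nT(n) z^n \\
&= \sum_{n \ge 0} \left[ (n + 1)T(n + 1) - nT(n) \right] z^n \\
&= \sum_{n \ge 0} \left[ 2(n + 1) + 2T(n) \right] z^n \\
&= 2 \sum_{n \ge 0} (n + 1) z^n + 2 \sum_{n \ge 0} T(n) z^n \\
&= \frac{2}{(1 - z)^2} + 2G(z),
\end{aligned}$$
where the third line subtracts equation 18 at $n$ from equation 18 at $n + 1$. This is a first-order linear differential equation for $G(z)$. Assuming the initial condition $T(0) = 0$, so that $G(0) = 0$, its solution is
$$\begin{aligned}
G(z) &= \frac{2}{(1 - z)^2} \ln \frac{1}{1 - z} \\
&= 2 \left( \sum_{j \ge 0} (j + 1) z^j \right) \left( \sum_{k \ge 1} \frac{z^k}{k} \right) \\
&= 2 \sum_{n \ge 1} \left[ \sum_{k=1}^{n} \frac{n - k + 1}{k} \right] z^n \\
&= \sum_{n \ge 1} \left[ 2(n + 1)H_n - 2n \right] z^n,
\end{aligned}$$
where $H_n = 1 + 1/2 + \cdots + 1/n$ is the $n$th harmonic number. Hence $T(n) = 2(n + 1)H_n - 2n$, which grows as $2n \ln n$.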
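The average-case recurrence is easy to tabulate numerically. The sketch below is an illustration, not the author's code: it assumes the recurrence $T(n) = n + 1 + \frac{2}{n}\sum_{p=0}^{n-1} T(p)$ with the initial condition $T(0) = 0$ (an assumption of this sketch), and compares the values with $2(n + 1)H_n - 2n$, the closed form this recurrence has under that initial condition.

```java
final class QuickSortAverage {
    // Tabulates T(0..max) from T(n) = n + 1 + (2/n) * (T(0) + ... + T(n-1)), T(0) = 0.
    static double[] byRecurrence(int max) {
        double[] t = new double[max + 1];
        double prefix = 0; // running sum T(0) + ... + T(n-1)
        for (int n = 1; n <= max; n++) {
            prefix += t[n - 1];
            t[n] = n + 1 + 2.0 * prefix / n;
        }
        return t;
    }

    // 2(n+1)H_n - 2n, where H_n is the n-th harmonic number.
    static double closedForm(int n) {
        double harmonic = 0;
        for (int k = 1; k <= n; k++) {
            harmonic += 1.0 / k;
        }
        return 2 * (n + 1) * harmonic - 2 * n;
    }

    public static void main(String[] args) {
        double[] t = byRecurrence(16);
        for (int n = 1; n <= 16; n *= 2) {
            System.out.println("n=" + n + " T(n)=" + t[n] + " closed form=" + closedForm(n));
        }
    }
}
```

Since $H_n$ grows like $\ln n$, the table makes the $n \lg n$ character of the average case visible even for small $n$.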
Now let's consider a different type of problem. When the product $c = a \cdot b$ of two positive integers exceeds a computer's word size, we can accept an approximate answer or devise a multiple-precision approach to record the exact product. The use of large integers in encryption algorithms makes multi-precision arithmetic problems interesting.
Let's pretend we're programming a computer with a positive integer register word size of $n$ bits. We'll also pretend that positive integers $a$ and $b$ lie in the range from 0 to $2^n - 1$. Of course, their product $c$ can be as large as $2^{2n} - 2^{n+1} + 1$ and require as many as $2n$ bits to represent exactly.
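To get started, we'll partition a positive integer $a$ into most and least significant half-words. That is, write $a$ in its radix 2 form:
$$\begin{aligned}
a &= a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + \cdots + a_1 2^1 + a_0 2^0 \\
&= 2^{n/2} \left( a_{n-1} 2^{n/2 - 1} + \cdots + a_{n/2} \right) + \left( a_{n/2 - 1} 2^{n/2 - 1} + \cdots + a_0 \right) \\
&= 2^{n/2} a_{\mathrm{hi}} + a_{\mathrm{lo}},
\end{aligned}$$
where $a_{\mathrm{hi}}$ and $a_{\mathrm{lo}}$ denote the most and least significant half-words of $a$.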
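Multiplication of two $n$-bit integers $a$ and $b$ can then be written as
$$\begin{aligned}
a \cdot b &= \left( 2^{n/2} a_{\mathrm{hi}} + a_{\mathrm{lo}} \right) \left( 2^{n/2} b_{\mathrm{hi}} + b_{\mathrm{lo}} \right) \\
&= 2^n \left( a_{\mathrm{hi}} b_{\mathrm{hi}} \right) + 2^{n/2} \left( a_{\mathrm{hi}} b_{\mathrm{lo}} + a_{\mathrm{lo}} b_{\mathrm{hi}} \right) + a_{\mathrm{lo}} b_{\mathrm{lo}} \\
&= 2^n \left( a_{\mathrm{hi}} b_{\mathrm{hi}} \right) + 2^{n/2} \left( (a_{\mathrm{hi}} + a_{\mathrm{lo}})(b_{\mathrm{hi}} + b_{\mathrm{lo}}) - a_{\mathrm{hi}} b_{\mathrm{hi}} - a_{\mathrm{lo}} b_{\mathrm{lo}} \right) + a_{\mathrm{lo}} b_{\mathrm{lo}},
\end{aligned}$$
where $b_{\mathrm{hi}}$ and $b_{\mathrm{lo}}$ are the half-words of $b$. The first expansion uses four half-size products; the second needs only three.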
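If we let $T(n)$ denote the cost of multiplying two $n$-bit numbers, then the first expansion of the product, with its four half-size multiplications, produces the recurrence
$$T(n) = 4T(n/2) + cn,$$
where the term $cn$, for some constant $c$, accounts for the shifts and additions.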
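The half-word idea can be tried out with arbitrary-precision integers. The following is a sketch under stated assumptions rather than a tuned implementation: it uses the variant with three half-size products, computing the middle term as $(a_{hi} + a_{lo})(b_{hi} + b_{lo})$ minus the two outer products; the names `HalfWordMultiply`, `aHi`, `aLo`, and the 16-bit base case are choices made here for illustration, and `BigInteger.multiply` stands in for a single-word hardware multiply.

```java
import java.math.BigInteger;

final class HalfWordMultiply {
    // Multiplies non-negative a and b by splitting each operand at bit n/2.
    // n is a nominal bit width; the split identity holds even if an operand is wider.
    static BigInteger multiply(BigInteger a, BigInteger b, int n) {
        if (n <= 16) {
            return a.multiply(b); // base case: treated as one machine multiply
        }
        int half = n / 2;
        BigInteger mask = BigInteger.ONE.shiftLeft(half).subtract(BigInteger.ONE);
        BigInteger aHi = a.shiftRight(half);
        BigInteger aLo = a.and(mask);
        BigInteger bHi = b.shiftRight(half);
        BigInteger bLo = b.and(mask);

        BigInteger hiProduct = multiply(aHi, bHi, half);
        BigInteger loProduct = multiply(aLo, bLo, half);
        // The sums may carry into one extra bit; the recursion still splits them correctly.
        BigInteger cross = multiply(aHi.add(aLo), bHi.add(bLo), half)
                .subtract(hiProduct)
                .subtract(loProduct);

        // a*b = 2^(2*half) * hiProduct + 2^half * cross + loProduct
        return hiProduct.shiftLeft(2 * half).add(cross.shiftLeft(half)).add(loProduct);
    }

    public static void main(String[] args) {
        BigInteger a = BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);
        BigInteger b = new BigInteger("123456789ABCDEF", 16);
        System.out.println(multiply(a, b, 64));
        System.out.println(a.multiply(b));
    }
}
```

With three recursive calls instead of four, the cost recurrence has the shape $T(n) = 3T(n/2) + cn$, which is the reason the second expansion is of interest.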
Given an array of words, we'd like to find the minimum and maximum words, that is, if the words were ordered, the minimum would be first and the maximum would be last. A brute force algorithm simply scans the array and maintains information about the minimum and maximum words. It is easy to count that the worst case time complexity of naiveMinMax() is O(n); in the worst case 2n - 2 compares are made.
The code for the brute force scan of an array to find the minimum and maximum is in another file which you can visit should you forget the details of how it is implemented. Let's determine the number of comparisons in the average case for this algorithm. For each iteration of the loop at least one compare is always made, and a second compare is made if and only if word[i] is not greater than all of the previous words. We'll assume all the words are distinct to rule out the ``or equal'' case. Over all $n!$ permutations of words, it is easy to convince oneself that word[i] will be greater than the previous $i - 1$ words with probability $1/i$. To see this, notice word[2] will be greater than word[1] 1/2 of the time, word[3] will be greater than word[1] and word[2] 1/3 of the time, and so on. In general, $i$ words can be arranged in $i!$ ways, and out of these $(i - 1)!$ arrangements will have the maximum word last.
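We can compute the average case complexity by adding, from $i = 1$ to $i = n$, the probability of making one compare (1) times the cost of one compare (1) plus the probability of making a second compare ($1 - 1/i$) times the cost of the second compare (1):
$$\sum_{i=1}^{n} \left[ 1 + \left( 1 - \frac{1}{i} \right) \right] = 2n - H_n,$$
where $H_n = 1 + 1/2 + \cdots + 1/n$ is the $n$th harmonic number.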
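The claim that the maximum lands last in $(i - 1)!$ of the $i!$ arrangements, that is, with probability $1/i$, can be confirmed by brute force for small $i$. This sketch (the names are illustrative assumptions) enumerates every permutation of $1, \ldots, i$ and counts those whose last entry is the largest. It expects $i \ge 1$.

```java
final class MaxLastCount {
    // Number of permutations of {1..i} whose last element is the maximum i (requires i >= 1).
    static long countMaxLast(int i) {
        return permute(new int[i], new boolean[i + 1], 0);
    }

    // Fills perm[pos..] with every unused value in turn and counts qualifying permutations.
    private static long permute(int[] perm, boolean[] used, int pos) {
        int i = perm.length;
        if (pos == i) {
            return perm[i - 1] == i ? 1 : 0;
        }
        long count = 0;
        for (int v = 1; v <= i; v++) {
            if (!used[v]) {
                used[v] = true;
                perm[pos] = v;
                count += permute(perm, used, pos + 1);
                used[v] = false;
            }
        }
        return count;
    }

    static long factorial(int n) {
        long f = 1;
        for (int k = 2; k <= n; k++) {
            f *= k;
        }
        return f;
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 7; i++) {
            System.out.println("i=" + i + ": " + countMaxLast(i) + " of " + factorial(i));
        }
    }
}
```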
Next, let's develop a recursive solution to the min-max problem.
public String[] recursiveMinMax(String[] words, int low, int high) {
    String[] minMax = new String[2];

For a one element array the minimum and maximum are identical.

    if (low == high) {
        minMax[0] = words[low];
        minMax[1] = words[low];
        return minMax;
    }

For a two element array, one compare is made and based on the outcome the minimum and maximum words can be set.

    if (low == high - 1) {
        if (words[low].compareTo(words[high]) < 0) {
            minMax[0] = words[low];
            minMax[1] = words[high];
        } else {
            minMax[0] = words[high];
            minMax[1] = words[low];
        }
        return minMax;
    }

Now when there are more than two elements, we'll divide the array in half (roughly), determine the minimum and maximum of both halves, compare the two minimums to determine the global minimum, and compare the two maximums to determine the global maximum.
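Putting the three cases together gives a complete method. The version below is a sketch: the two base cases follow the fragments above, while the divide-and-combine step is written here from the prose description and may differ from the author's code. The wrapper class `MinMax` and the midpoint computation are assumptions.

```java
final class MinMax {
    // Returns {minimum, maximum} of words[low..high]; assumes low <= high.
    static String[] recursiveMinMax(String[] words, int low, int high) {
        String[] minMax = new String[2];
        if (low == high) { // one element: it is both the minimum and the maximum
            minMax[0] = words[low];
            minMax[1] = words[low];
            return minMax;
        }
        if (low == high - 1) { // two elements: a single compare settles both
            if (words[low].compareTo(words[high]) < 0) {
                minMax[0] = words[low];
                minMax[1] = words[high];
            } else {
                minMax[0] = words[high];
                minMax[1] = words[low];
            }
            return minMax;
        }
        // More than two elements: solve each half, then combine with two compares.
        int mid = low + (high - low) / 2;
        String[] left = recursiveMinMax(words, low, mid);
        String[] right = recursiveMinMax(words, mid + 1, high);
        minMax[0] = left[0].compareTo(right[0]) < 0 ? left[0] : right[0];
        minMax[1] = left[1].compareTo(right[1]) > 0 ? left[1] : right[1];
        return minMax;
    }

    public static void main(String[] args) {
        String[] words = {"pear", "apple", "fig", "zebra", "kiwi"};
        String[] result = recursiveMinMax(words, 0, words.length - 1);
        System.out.println("min=" + result[0] + " max=" + result[1]);
    }
}
```

Each call on more than two elements makes two recursive calls on halves plus two compares, which is the structure a recurrence for the comparison count would capture.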