Growth of Functions
A central topic in algorithm analysis is comparing the time (or space) efficiency of algorithms for the same problem. Since the running time of an algorithm is expressed as a function of the input size, the comparison reduces to classifying the growth rates of these functions.
Relations among the five asymptotic notations (O, o, Ω, ω, and Θ):
Asymptotic notations can be used in equations and inequalities, as well as in certain calculations. For example,
2n^2 + 3n + 1 = Θ(n^2) + Θ(n) + Θ(1) = Θ(n^2).
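A quick numeric illustration (ours, not a proof) of why the lower-order terms are absorbed: the ratio of the whole expression to n^2 approaches a constant as n grows.

```python
def f(n):
    return 2 * n**2 + 3 * n + 1

# The ratio f(n) / n^2 tends to the constant 2, consistent with
# f(n) = Theta(n^2): the 3n and 1 terms vanish in the limit.
for n in (10, 100, 1000, 10000):
    print(n, f(n) / n**2)
```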
In algorithm analysis, the most common conclusions are worst-case costs expressed as O(f(n)), which can be obtained by keeping only the dominant (fastest-growing) term of the function, as in the analysis of the Insertion Sort algorithm.
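As a sketch of that analysis (the code below is our illustration, not taken from the notes), the nested loops of Insertion Sort show where the worst-case O(n^2) bound comes from:

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # In the worst case (reverse-sorted input) this inner loop runs
        # j times, so the total work is 1 + 2 + ... + (n-1); keeping only
        # the dominant term of that sum gives the O(n^2) bound.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```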
In general, the "divide-and-conquer" approach uses D(n) time to divide a problem into a subproblems of the same type, each with 1/b of the original size. After the subproblems are solved, their solutions are combined in C(n) time to get the solution for the original problem. This process is repeated on the subproblems until the input size becomes so small that the problem can be solved in constant time. This approach gives us the following recurrence:

T(n) = a T(n/b) + D(n) + C(n),

with T(n) = Θ(1) when n is below some constant size.
For example, for the Merge Sort algorithm, we have

T(n) = 2T(n/2) + Θ(n).

This is the case because each time an array of size n is divided into two halves, and merging the sorted results takes linear time.
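A minimal Merge Sort sketch (our illustration) makes the two terms of the recurrence visible: two recursive calls on halves, plus a linear-time merge.

```python
def merge_sort(a):
    """T(n) = 2T(n/2) + Theta(n): two half-size subproblems, linear merge."""
    if len(a) <= 1:              # base case: constant time
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])   # first T(n/2)
    right = merge_sort(a[mid:])  # second T(n/2)
    # Merge step: each element is moved exactly once, so this is Theta(n).
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```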
Often we need to solve the recurrence, so as to obtain a closed-form running time that contains no recursive calls.
One way to solve a recurrence is the substitution method, which uses mathematical induction to prove that a previously guessed bound is correct.
For example, if the recurrence is

T(n) = 2T(⌊n/2⌋) + n,

one reasonable guess is T(n) = O(n lg n). To show that this is indeed the case, we can prove that T(n) ≤ c n lg n for an appropriate choice of the constant c > 0. We start by assuming that this bound holds for ⌊n/2⌋, that is, T(⌊n/2⌋) ≤ c ⌊n/2⌋ lg(⌊n/2⌋); substituting into the recurrence then yields

T(n) ≤ 2(c ⌊n/2⌋ lg(⌊n/2⌋)) + n
     ≤ c n lg(n/2) + n
     = c n lg n − c n + n
     ≤ c n lg n,

where the last step holds as long as c ≥ 1.
Furthermore, mathematical induction requires us to show that our solution holds for the boundary conditions, that is, that we can choose the constant c large enough so that T(n) ≤ c n lg n holds there. For this example, if the boundary condition is T(1) = 1, then no constant c works at n = 1, because c · 1 · lg 1 = 0 < 1. To resolve this issue, we take n = 2 as the base case instead: since T(2) = 2T(1) + 2 = 4, the bound T(2) ≤ c · 2 · lg 2 = 2c holds as long as c ≥ 2.
In summary, we can use c ≥ 2 (also checking n = 3 directly, since its subproblem has size 1), and we have shown that T(n) ≤ c n lg n for all n ≥ 2.
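The proved bound can also be sanity-checked numerically (a finite check, not a substitute for the induction; it assumes the boundary condition T(1) = 1 from above):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # The recurrence T(n) = 2T(floor(n/2)) + n, with T(1) = 1.
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

# Verify T(n) <= 2 n lg n (the bound with c = 2) over a finite range.
assert all(T(n) <= 2 * n * math.log2(n) for n in range(2, 4096))
```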
To get a good guess for a recurrence, one useful technique is to draw a recursion tree. For example, if the recurrence is

T(n) = 3T(n/4) + c n^2,

we can create a recursion tree in which the root costs c n^2 and has three children, one for each subproblem of size n/4.
To determine the height of the tree, note that the subproblem size at depth i is n/4^i, which reaches 1 when n/4^i = 1, that is, when i = log_4 n. So the tree has log_4 n + 1 levels.
At level i, there are 3^i nodes, each with a cost c(n/4^i)^2, so the total cost of the level is (3/16)^i c n^2.
At the leaf level, the number of nodes is 3^(log_4 n), which is equal to n^(log_4 3) (see page 54). At that level, each node costs T(1), so the total is n^(log_4 3) T(1), which is Θ(n^(log_4 3)).
Summing the per-level costs gives a decreasing geometric series dominated by the root's cost c n^2, so the guess T(n) = O(n^2) can be obtained (see Section 4.4 of the textbook for the calculation).
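As a rough numeric check of this guess (assuming c = 1 and a constant base case, both of which the notes leave open):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 3T(floor(n/4)) + n^2, taking c = 1 and T(n) = 1 for n < 4
    # (assumptions for illustration only).
    if n < 4:
        return 1
    return 3 * T(n // 4) + n * n

# The ratio T(n)/n^2 stays bounded as n grows (it levels off near 16/13),
# supporting the guess T(n) = O(n^2).
for n in (16, 256, 4096):
    print(n, T(n) / n**2)
```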
Finally, there is a "master method" that provides a recipe for solving recurrences of the form T(n) = a T(n/b) + f(n), with constants a ≥ 1 and b > 1. Intuitively, the theorem says that the solution to the recurrence is determined by the larger of the costs at the top and bottom levels of the recursion tree:

1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).
The above example falls into Case 3.
The proof of the master theorem is given in the textbook. It is important to realize that the three cases of the master theorem do not cover all the possibilities.
As a special case, when a = b we have log_b a = 1, and the three results above simplify to (1) Θ(n), (2) Θ(n lg n), and (3) Θ(f(n)), respectively, depending on whether f(n) grows slower than, exactly as fast as, or faster than linear. We can see that Merge Sort (a = b = 2, f(n) = Θ(n)) belongs to Case 2.
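The case analysis can be sketched in code for the common situation f(n) = Θ(n^k) (the function name master_case and the restriction to polynomial f are our assumptions; for polynomial f, the regularity condition of Case 3 holds automatically):

```python
import math

def master_case(a, b, k):
    """Classify T(n) = a*T(n/b) + Theta(n^k) by the master theorem.

    Returns 1, 2, or 3 for the matching case. Only handles
    polynomial driving functions f(n) = n^k.
    """
    crit = math.log(a, b)          # the critical exponent log_b a
    if abs(k - crit) < 1e-9:       # f(n) matches n^(log_b a): Case 2
        return 2
    return 1 if k < crit else 3    # slower: Case 1; faster: Case 3

# Merge Sort: a = b = 2, k = 1     -> Case 2, Theta(n lg n)
# Tree example: a = 3, b = 4, k = 2 -> Case 3, Theta(n^2)
```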