For example, "sorting" is a problem whose input is a sequence of items of a certain type, with an order (a transitive relation) defined between any pair of them, and whose output is a sequence of the same items, arranged according to the given relation. A sorting algorithm should specify, step by step, how to turn any valid input into the corresponding output in finite time, and then stop.
An algorithm is correct if for every valid input instance, it halts with the correct output (with respect to the specification of the problem). An algorithm is incorrect if there is a valid input instance for which the algorithm produces an incorrect answer or no answer at all (i.e. does not halt).
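For sorting, the specification can be turned into a small checker. The following is a minimal sketch, not part of the original notes: the function name `is_valid_sort_output` is hypothetical, and it assumes the items are hashable and comparable with `<=`.

```python
from collections import Counter


def is_valid_sort_output(original, result):
    """Return True if `result` is a correct sorted version of `original`."""
    # Same items: the output must be a rearrangement of the input
    # (assumes hashable items so that Counter can tally them).
    same_items = Counter(original) == Counter(result)
    # Ordered: every item is <= its successor.
    ordered = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    return same_items and ordered
```

A checker like this only tests individual instances. Correctness in the sense above requires the property to hold for every valid input, which calls for a proof rather than testing.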
For a given problem, if there are multiple candidate algorithms, which one should be used? There are several factors to be considered:
When the (input, output, or intermediate) data of a problem contain multiple items, they are usually organized in a data structure, which represents both the data items and the relations among them. A data structure can be specified either abstractly, in terms of the operations that can be carried out on it, or concretely, in terms of the storage organization and the algorithms implementing the operations. In the design and selection of data structures, the analysis of the algorithms involved is a central topic.
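As an illustration of the two kinds of specification, the sketch below describes a stack abstractly (only its operations) and then concretely (a list as storage). The stack itself and the names `Stack` and `ListStack` are illustrative assumptions and do not come from the notes.

```python
from typing import Protocol, TypeVar

T = TypeVar("T")


class Stack(Protocol[T]):
    """Abstract specification: only the operations are named."""

    def push(self, item: T) -> None: ...
    def pop(self) -> T: ...
    def is_empty(self) -> bool: ...


class ListStack:
    """Concrete specification: a Python list as storage, top at the end."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        # Raises IndexError if the stack is empty.
        return self._items.pop()

    def is_empty(self):
        return not self._items
```

Analyzing the algorithms behind `push` and `pop` for a given storage organization is what allows one concrete structure to be preferred over another.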
Programming means implementing an algorithm in a computer language. A program is language-specific, but an algorithm is language-independent.
For a given program, the actual time it takes to solve a given problem instance depends on
The common practice is to define a size for each instance of the problem (which intuitively measures the relative difficulty of the instance), then to represent the number of executions of a certain operation as a function of the instance size. Finally, the growth rate of this function as the instance size increases is used as the indicator of the efficiency of the algorithm.
With such a procedure, algorithm analysis becomes a purely mathematical problem, which is well-defined, and the result has universal validity.
Though it is a powerful technique, we need to be aware that many relevant factors are ignored in this process. Therefore, if for some reason some of these factors have to be taken into consideration, the traditional approach to algorithm analysis may no longer be appropriate.
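The procedure can be tried on a very small case. The sketch below assumes linear search as the algorithm, the list length as the instance size, and the equality comparison as the counted operation. None of these choices comes from the notes, and the function names are hypothetical.

```python
def linear_search_comparisons(items, target):
    """Return the number of equality comparisons a linear search performs."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            break
    return comparisons


def worst_case_counts(sizes):
    """Count comparisons for each size when the target is absent."""
    # An absent target forces a comparison with every item in the list.
    return [linear_search_comparisons(list(range(n)), -1) for n in sizes]


print(worst_case_counts([10, 20, 40]))  # [10, 20, 40]
```

Doubling the instance size doubles the count, so the counted operation grows linearly with the size under these assumptions.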
Also, some of the decisions made during the formalization process, such as the definition of problem size and the selection of the operation to be counted, are not always obvious or unique, and different decisions may lead to different conclusions.
More complicated blocks, such as loops, can be built from the above instructions.
The correctness of the algorithm is proven by checking the loop invariant, a proposition about a relation, such as
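At the start of each iteration of the for loop of lines 1-8, the subarray A[1 .. j − 1] consists of the elements originally in A[1 .. j − 1], but in sorted order.
We must show three things about a loop invariant: that it is true before the first iteration of the loop (initialization), that if it is true before an iteration it remains true before the next iteration (maintenance), and that when the loop terminates the invariant gives a property that helps show the algorithm is correct (termination).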
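The invariant can also be checked mechanically while the loop runs. The following is a sketch of insertion sort in Python under two assumptions: indices are 0-based, so the invariant concerns `a[0 .. j-1]`, and the `assert` statements are added for illustration only. Passing assertions on sample inputs are evidence, not a proof.

```python
def insertion_sort(a):
    """Sort list `a` in place, asserting the loop invariant as it runs."""
    original = list(a)  # kept only so the invariant can be checked
    for j in range(1, len(a)):
        # Invariant: a[0..j-1] holds the elements originally in
        # a[0..j-1], but in sorted order.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        # Shift elements larger than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # At termination the invariant covers the whole array.
    assert a == sorted(original)
    return a
```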
For sorting, it is natural to use the number of keys to be sorted as input size, and it is assumed that a constant amount of time is required to execute each line of our pseudocode (except comments).
Now let us mark the cost of each line and the number of times it is executed:
In the algorithm, n is length[A]. In lines 5-7, t_j is the number of times the while loop test is executed for that value of j.
The running time of the algorithm is
For a given n, T(n) depends on the values of t_j, which change from instance to instance.
The best case of the algorithm happens when the array is already sorted, so that t_j = 1 for j = 2, 3, ..., n, and the function becomes
The worst case of the algorithm happens when the array is reverse sorted, so that t_j = j for j = 2, 3, ..., n, and the function becomes
which is a quadratic function of n.
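The best-case and worst-case values of t_j can be observed directly by instrumenting the algorithm. The sketch below is an assumption-laden illustration: the function name `while_test_counts` is hypothetical, indices are 0-based, and the returned list corresponds to t_2, ..., t_n in the 1-based notation of the text.

```python
def while_test_counts(a):
    """Run insertion sort on a copy of `a` and return [t_2, ..., t_n]."""
    a = list(a)
    counts = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        t = 1  # the first evaluation of the while condition
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            t += 1  # the condition is evaluated again after each shift
        a[i + 1] = key
        counts.append(t)
    return counts


print(while_test_counts([1, 2, 3, 4, 5]))  # [1, 1, 1, 1]
print(while_test_counts([5, 4, 3, 2, 1]))  # [2, 3, 4, 5]
```

For the reverse-sorted input with n = 5, the counts sum to 14 = n(n + 1)/2 − 1, which is where the quadratic term comes from.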
Usually, the analysis of algorithms concentrates on the worst case. Though the average case is also important, it is harder to analyze.
The merge procedure Merge(A, p, q, r) is given in the following:
It first copies the two (sorted) subarrays A[p...q] and A[q+1...r] into two separate arrays L and R, puts a special sentinel value at the end of each of them, then merges the two back into the original array A[p...r]. Its running time is a linear function of n = r − p + 1, the number of elements being merged.
The following is the merge sort algorithm, which recursively calls itself on the two halves of the array, then merges the results together.
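The description above can be sketched in Python as follows. This is an illustration under stated assumptions: indices are 0-based and inclusive, the keys are numbers, and `math.inf` plays the role of the sentinel, which the notes do not specify.

```python
import math


def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive) back into a[p..r]."""
    # Copy both runs and append a sentinel that compares greater than
    # any finite key, so neither copy is ever read past its end.
    left = a[p:q + 1] + [math.inf]
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    # Exactly r - p + 1 iterations: one per element written back.
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


data = [2, 4, 5, 7, 1, 2, 3, 6]
merge(data, 0, 3, 7)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]
```

The single loop over k makes the linear running time visible: each iteration does a constant amount of work and writes one element.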
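A sketch of the recursive structure in Python is given below. To keep the block self-contained, the merge step uses the standard library function `heapq.merge` in place of the sentinel-based Merge procedure. The 0-based inclusive indices and the default arguments are assumptions made for this example.

```python
from heapq import merge as merge_sorted


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] (inclusive) in place and return `a`."""
    if r is None:
        r = len(a) - 1
    if p < r:  # a run of at most one element is already sorted
        q = (p + r) // 2
        merge_sort(a, p, q)        # sort the left half
        merge_sort(a, q + 1, r)    # sort the right half
        # Merge the two sorted halves back into a[p..r].
        a[p:r + 1] = list(merge_sorted(a[p:q + 1], a[q + 1:r + 1]))
    return a


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```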
The correctness and efficiency of this algorithm can be analyzed similarly.
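As a sketch of the efficiency analysis, assume that n is a power of two, that the merge step on n elements costs cn for some constant c, and that T(1) = c. These simplifying assumptions are introduced here for illustration. The running time then satisfies a recurrence that can be expanded level by level:

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= \cdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= cn\log_2 n + cn .
\end{aligned}
```

Under these assumptions the running time grows as n log n, which is slower growth than the quadratic worst case of insertion sort.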