**CIS 9615. Analysis of Algorithms**
**Greedy Algorithms**

### 1. The idea

*Greedy algorithm*: optimization by making the locally optimal choice at each step, without looking ahead.
Compared with dynamic programming: usually faster, though it may miss the optimal solution.
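As a quick illustration of how a locally optimal choice can miss the global optimum, consider making change with coin denominations 1, 3, and 4 (an illustrative currency, not from the lecture): always taking the largest coin that fits gives 6 = 4 + 1 + 1, while the optimum is 6 = 3 + 3. A minimal sketch:

```python
def greedy_change(coins, amount):
    """Repeatedly take the largest coin that still fits (the locally optimal choice)."""
    picked = []
    for c in sorted(coins, reverse=True):
        while amount >= c:
            amount -= c
            picked.append(c)
    return picked if amount == 0 else None

print(greedy_change([1, 3, 4], 6))  # → [4, 1, 1]: three coins, though 3 + 3 uses only two
```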

- Matrix-Chain Multiplication: in each step, eliminate the largest row/column number. Complexity: O(n lg n).
- Optimal Binary Search Tree: in each step, choose a root by considering both probability and balance. Complexity: O(n^{2}).

Under certain conditions, locally optimal decisions do lead to a globally optimal solution.
A problem exhibits *optimal substructure* if an optimal solution to the problem contains optimal solutions to its sub-problems. Roughly speaking, a greedy algorithm is optimal if it has the *greedy-choice property*: the locally optimal (greedy) choice is always part of some globally optimal solution.

### 2. Example: Activity selection

Problem: for a given set of activities *a*_{1}, ..., *a*_{n}, each *a*_{i} using a resource exclusively in the interval [*s*_{i}, *f*_{i}), find a schedule that maximizes the number of mutually compatible activities.
Dynamic programming solution: consider *S*_{ij}, the set of activities that start after *a*_{i} finishes and finish before *a*_{j} starts, and finally obtain the size of a maximum compatible subset of *S*_{0,n+1} (using sentinel activities *a*_{0} and *a*_{n+1}). For each *S*_{ij}, a maximum compatible subset can be found by trying every *a*_{k} in the set, recursively reducing the problem to *S*_{ik} and *S*_{kj}.
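In symbols, writing *c*[*i*, *j*] for the size of a maximum compatible subset of *S*_{ij}, the recursion is:

$$
c[i,j] =
\begin{cases}
0 & \text{if } S_{ij} = \emptyset \\
\max_{a_k \in S_{ij}} \{\, c[i,k] + c[k,j] + 1 \,\} & \text{otherwise}
\end{cases}
$$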

Greedy solution: repeatedly pick the activity that is compatible with the current schedule and has the earliest finish time, and add it to the schedule.

The solution can be proved optimal by an exchange argument: the earliest-finishing activity belongs to some optimal solution, since the first activity of any optimal solution can be replaced by it without creating a conflict (its finish time is no later).

A recursive algorithm, initially called as *Recursive-Activity-Selector(s, f, 0, n+1)*, with the activities sorted by finish time.
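The recursion can be sketched in Python as follows (a sketch, not the lecture's pseudocode verbatim; the sentinel *f*[0] = 0 lets the initial call start from *k* = 0, and the activity data are the standard CLRS example):

```python
def recursive_activity_selector(s, f, k, n):
    """Greedily select activities compatible with a_k from among a_{k+1}..a_n.
    Activities are 1-indexed and sorted by finish time; f[0] = 0 is a sentinel."""
    m = k + 1
    while m <= n and s[m] < f[k]:   # skip activities that start before a_k finishes
        m += 1
    if m <= n:                      # a_m: earliest finish time among compatible activities
        return [m] + recursive_activity_selector(s, f, m, n)
    return []

# Standard CLRS activity data, sorted by finish time (index 0 holds the sentinel).
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(recursive_activity_selector(s, f, 0, 11))  # → [1, 4, 8, 11]
```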

A more efficient iterative algorithm:
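A sketch of the iterative version, under the same assumptions (1-indexed arrays sorted by finish time, sentinel *f*[0] = 0; the data are the standard CLRS activities):

```python
def greedy_activity_selector(s, f):
    """Iterative greedy activity selection: one pass, O(n) after the O(n lg n) sort."""
    n = len(s) - 1
    A = [1]                # a_1 has the earliest finish time overall
    k = 1                  # index of the most recently added activity
    for m in range(2, n + 1):
        if s[m] >= f[k]:   # a_m is compatible with everything selected so far
            A.append(m)
            k = m
    return A

s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(greedy_activity_selector(s, f))  # → [1, 4, 8, 11]
```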

What are the growth orders of these algorithms?

### 3. Example: Huffman coding

The idea is to use variable-length codes for symbols, giving frequent symbols shorter codes.
Assuming binary prefix coding, a solution can be represented by a binary tree, with each leaf corresponding to a symbol and its path from the root giving the symbol's codeword. An optimal code corresponds to a full binary tree with minimum expected path length.

In the following algorithm, *C* is the set of symbols, and *f* gives their frequency values. *Q* is a min-priority queue, implemented by a heap.
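A sketch of the procedure in Python, using `heapq` as the priority queue *Q* (the tie-breaker counter keeps heap entries comparable; the `huffman` function and its nested-tuple tree representation are illustrative, not the lecture's pseudocode):

```python
import heapq

def huffman(freq):
    """Build a binary prefix code for {symbol: frequency}; return {symbol: codeword}."""
    # Heap entries: (weight, tie_breaker, tree); a tree is a symbol or a (left, right) pair.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # extract the two least frequent trees
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (t1, t2)))  # merge them
        counter += 1
    codes = {}
    def walk(tree, path):
        if isinstance(tree, tuple):
            walk(tree[0], path + "0")     # left edge labeled 0
            walk(tree[1], path + "1")     # right edge labeled 1
        else:
            codes[tree] = path
    walk(heap[0][2], "")
    return codes

codes = huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print(codes)  # code lengths: a:1, b:3, c:3, d:3, e:4, f:4
```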

Example: for symbols a–f with frequencies 45, 13, 12, 16, 9, and 5 (the standard CLRS example), the algorithm first merges f and e (weight 14), then c and b (25), then (f, e) with d (30), then (c, b) with (f, e, d) (55), and finally a with the rest, yielding the codewords a = 0, c = 100, b = 101, d = 111, f = 1100, e = 1101.


The running time of the algorithm is O(n lg n): it performs n − 1 merge steps, each costing O(lg n) for the extract-min and insert operations on a heap of at most n elements.

The average code length equals the sum of the probability values stored in the internal nodes of the tree: each symbol's probability is added into every internal node on its root-to-leaf path, so the sum counts each probability once per edge of that path, i.e., probability times depth.

Correctness of the algorithm: it has the greedy-choice property (the two least frequent symbols can be made sibling leaves at maximum depth in some optimal tree) and optimal substructure (replacing those two leaves by a single merged symbol gives a smaller instance, whose optimal tree extends back to an optimal tree for the original).

Compared to optimal binary search tree: both build a tree of minimum expected depth from given probabilities, but a search tree must preserve the key order (so only the root can be chosen at each step, leading to dynamic programming), while a Huffman tree may arrange its leaves in any order (so the greedy merge succeeds).

### 4. Related topics

**Hill Climbing**: when an optimization problem can be represented as searching for the maximum value of a function over a space, *hill climbing* is a greedy algorithm that moves to the best neighboring point in each step, stopping at a local maximum.
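A minimal sketch of hill climbing (the objective function and neighborhood below are illustrative assumptions, not from the lecture):

```python
def hill_climb(f, x, neighbors):
    """Greedy local search: move to the best neighbor while it improves f."""
    while True:
        best = max(neighbors(x), key=f)
        if f(best) <= f(x):
            return x      # no neighbor improves f: a local maximum
        x = best

# Illustrative: maximize f(x) = -(x - 3)^2 over the integers, stepping by ±1.
print(hill_climb(lambda x: -(x - 3) ** 2, 0, lambda x: [x - 1, x + 1]))  # → 3
```

Because it never looks beyond the immediate neighborhood, the search can stop at a local rather than global maximum.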
**Best-First Search**: if the next step is instead selected from the neighborhood of *all* explored points, hill climbing becomes *best-first search*.