CIS 9615. Analysis of Algorithms

Divide-and-Conquer on Linear Data Structures

### 1. Linear Data Structures

Data structure: data items with a relation among them, as well as operations on the structure, with respect to the items and the relation.

Relation: usually assumed to be unique, binary, and directional. Predecessor and successor. Order among data items.

Major common operations on data structures: search, insert, delete, traversal, sorting.

A data structure is defined either abstractly (by relation and operations specification) or concretely (by storage and procedure description).

Linear data structures: each item has a unique predecessor and a unique successor, except the first item (no predecessor) and the last item (no successor).

Size of instance: length of description is usually proportional to the number of data items.

Major operations to be analyzed: search and sorting.

### 2. Search

The most typical case is search by key, though search can also be by order (predecessor or successor, minimum or maximum key) or by condition. Also, search-for-any versus search-for-all.

Comparison-based search: efficiency is measured by the number of key comparisons.

Linear (sequential) search: key comparison one-by-one. Return value. Iterative and recursive form. Best case and worst case.
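A minimal Python sketch of linear search in both forms, assuming the items are held in a list and that a miss is reported as -1 (both are illustrative choices, not fixed by these notes). The iterative version also returns the number of key comparisons, so the best case (1) and worst case (n) can be observed directly.

```python
def linear_search(items, key):
    """Iterative form: return (index, comparisons); index is -1 if key is absent."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1          # one key comparison per item visited
        if item == key:
            return i, comparisons
    return -1, comparisons


def linear_search_rec(items, key, i=0):
    """Recursive form: search items[i:], return the index or -1."""
    if i == len(items):           # nothing left to check
        return -1
    if items[i] == key:
        return i
    return linear_search_rec(items, key, i + 1)
```

Searching `[7, 3, 9]` for `7` costs one comparison (best case); searching for an absent key costs three (worst case).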

Does search direction matter? Can the (asymptotic) efficiency in either case be improved?

To rule out more than one item per comparison, the items must be ordered.

Binary search: check the middle item, then go left or right. Iterative and recursive form. Best case and worst case.
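A sketch of binary search in Python, assuming a list sorted in ascending order and counting each probe of a middle item as one (three-way) key comparison. The function names and the -1 convention for a miss are assumptions made for illustration.

```python
def binary_search(items, key):
    """Iterative form: return (index, probes); index is -1 if key is absent."""
    lo, hi = 0, len(items) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2      # check the middle item
        probes += 1
        if items[mid] == key:
            return mid, probes
        if items[mid] < key:
            lo = mid + 1          # go right
        else:
            hi = mid - 1          # go left
    return -1, probes


def binary_search_rec(items, key, lo=0, hi=None):
    """Recursive form: search items[lo..hi], return the index or -1."""
    if hi is None:
        hi = len(items) - 1
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if items[mid] == key:
        return mid
    if items[mid] < key:
        return binary_search_rec(items, key, mid + 1, hi)
    return binary_search_rec(items, key, lo, mid - 1)
```

On seven ordered items the best case is one probe (the key is the middle item) and the worst case is three, which matches the logarithmic bound.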

Can the worst case (asymptotic) efficiency be improved? Why?

What if the checking point is moved from the n/2 of binary search to n/4 or n-4?

How about binary search on an unordered linear data structure? Or on an ordered linked list?

What if the address is a one-to-one function of the key? Searching without element comparison.

Hash table, ideal situation and actual situation.

Hash function and collision handling. Best case and worst case.
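One common way to handle collisions is separate chaining; the sketch below is an illustration under that assumption, not necessarily the scheme intended by these notes. The class name `ChainedHashTable`, the bucket count, and the use of Python's built-in `hash` as the hash function are all hypothetical choices.

```python
class ChainedHashTable:
    """Hash table with separate chaining: each bucket is a list of (key, value)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Hash function: map the key to a bucket address.
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))   # a collision simply extends the chain

    def search(self, key):
        # Linear search, but only within one bucket.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None
```

In the best case every chain has length at most one and a search inspects a single item; in the worst case all keys land in one bucket and the search degenerates to a linear search over n items.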

Is a hash table a linear data structure? Why?

### 3. Sorting

Assuming an order among the key values.

Linear sorting: insertion sort and selection sort. Recursive forms:

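```
INSERTION-SORT(A, n)
  IF n > 1
    THEN INSERTION-SORT(A, n-1)    // sort A[1..n-1] first
         INSERTION(A, n)           // then insert A[n] into the sorted prefix

SELECTION-SORT(A, n)
  IF n > 1
    THEN SELECTION(A, n)           // move the largest of A[1..n] to A[n]
         SELECTION-SORT(A, n-1)    // then sort the remaining A[1..n-1]
```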
Operations to be counted: element comparison and element assignment.

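INSERTION(A, n) inserts A[n] into the already sorted A[1..n-1]; SELECTION(A, n) moves the largest item of A[1..n] to A[n] (for ascending order). Both take at most linear time. Iterative and recursive form.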
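The two recursive forms can be sketched in Python as follows, assuming ascending order, a 0-indexed list sorted in place, and helper bodies for `insertion` and `selection` that are one plausible reading of INSERTION(A, n) and SELECTION(A, n) rather than a definition taken from these notes.

```python
def insertion(a, n):
    """Insert a[n-1] into the already sorted prefix a[0..n-2]."""
    x = a[n - 1]
    i = n - 2
    while i >= 0 and a[i] > x:    # shift larger items one place to the right
        a[i + 1] = a[i]
        i -= 1
    a[i + 1] = x


def insertion_sort(a, n):
    if n > 1:
        insertion_sort(a, n - 1)  # sort the first n-1 items
        insertion(a, n)           # then insert the n-th item


def selection(a, n):
    """Swap the largest item of a[0..n-1] into position n-1."""
    m = 0
    for i in range(1, n):         # always n-1 comparisons
        if a[i] > a[m]:
            m = i
    a[m], a[n - 1] = a[n - 1], a[m]


def selection_sort(a, n):
    if n > 1:
        selection(a, n)           # fix the last position
        selection_sort(a, n - 1)  # then sort the first n-1 items
```

Note that `insertion` stops early on already sorted input, while `selection` always scans all n items; this difference is what separates the best cases of the two algorithms.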

Best case and worst case for each of the two sorting algorithms. Which one is more efficient?

Binary sorting: merge-sort and quick-sort. Recursive forms:

```
MERGE-SORT(A, p, r)
  IF p < r
    THEN q <- floor((p + r) / 2)   // midpoint, rounded down
         MERGE-SORT(A, p, q)
         MERGE-SORT(A, q+1, r)
         MERGE(A, p, q, r)         // merge sorted A[p..q] and A[q+1..r]

QUICK-SORT(A, p, r)
  IF p < r
    THEN q <- PARTITION(A, p, r)   // A[q] ends up in its final position
         QUICK-SORT(A, p, q-1)
         QUICK-SORT(A, q+1, r)
```
Iterative and recursive forms of sorting algorithms. Best case and worst case.
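A Python sketch of the two recursive forms, with 0-indexed inclusive bounds `p` and `r`. The notes do not specify MERGE or PARTITION, so the bodies below are assumptions: `merge` copies the two halves into temporary lists, and `partition` uses the Lomuto scheme with the last item as pivot.

```python
def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] back into a[p..r]."""
    left = a[p:q + 1]
    right = a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left when right is exhausted or left's head is not larger.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)


def partition(a, p, r):
    """Lomuto partition around pivot a[r]; return the pivot's final index."""
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def quick_sort(a, p, r):
    if p < r:
        q = partition(a, p, r)
        quick_sort(a, p, q - 1)
        quick_sort(a, q + 1, r)
```

Merge sort does its work after the recursive calls (in `merge`), while quick sort does its work before them (in `partition`); with this pivot choice, already sorted input is a worst case for `quick_sort`.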

What if PARTITION(A, p, r) always returns (p+r)/2? How about p+(r-p)/100? Or p+100 (when r-p > 100)?

In general, size reduction in recursion often takes the form of n/c or n-c, and they usually lead to different growth orders.
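As an illustration, here are the standard recurrences for the two kinds of size reduction, written for a constant c > 1 in the n/c cases and c ≥ 1 in the n-c cases. With constant extra work per step (as in search):

$$
T(n) = T(n/c) + \Theta(1) \;\Rightarrow\; T(n) = \Theta(\lg n),
\qquad
T(n) = T(n-c) + \Theta(1) \;\Rightarrow\; T(n) = \Theta(n).
$$

With linear extra work per step (as in sorting):

$$
T(n) = 2\,T(n/2) + \Theta(n) \;\Rightarrow\; T(n) = \Theta(n \lg n),
\qquad
T(n) = T(n-c) + \Theta(n) \;\Rightarrow\; T(n) = \Theta(n^2).
$$

For quick-sort, a split at p+(r-p)/100 gives approximately

$$
T(n) = T(n/100) + T(99n/100) + \Theta(n) = \Theta(n \lg n),
$$

whereas a split at p+100 gives

$$
T(n) = T(100) + T(n-101) + \Theta(n) = \Theta(n^2).
$$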

Can sorting be more efficient than O(n lg n)? Why? Give a proof.
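For comparison-based sorting, the usual argument is the decision-tree bound, sketched here. Any comparison sort on n distinct keys corresponds to a binary decision tree with at least n! leaves, one per possible ordering of the input, so its height h (the worst-case number of comparisons) satisfies

$$
2^h \ge n! \;\Rightarrow\; h \ge \lg(n!) = \sum_{k=1}^{n} \lg k \;\ge\; \frac{n}{2}\,\lg\frac{n}{2} = \Omega(n \lg n).
$$

The bound applies only when element comparison is the sole source of information about the keys.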

Linear-cost sorting: limited range and granularity. Sorting without (or with restricted) element comparison.
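Counting sort is one example of sorting without element comparison; the sketch below assumes the keys are integers in the range 0..k-1 (the "limited range" condition), and the function name and parameter `k` are chosen for illustration.

```python
def counting_sort(keys, k):
    """Return the keys in ascending order; every key must be an int in range(k)."""
    counts = [0] * k
    for key in keys:              # one pass to count each key value
        counts[key] += 1
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)   # emit each value as often as it occurred
    return out
```

The cost is proportional to n + k, so it is linear in n only when the key range k grows no faster than n.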