**Divide-and-Conquer on Linear Data Structures**

*Relation*: usually assumed to be unique, binary, and directional. Predecessor and successor. Order among data items.

Major common operations on data structures: *search*, *insert*, *delete*, *traversal*, *sorting*.

A data structure is defined either *abstractly* (by relation and operations specification) or *concretely* (by storage and procedure description).

Linear data structures: each item has a unique predecessor and a unique successor, except at the two ends.

Size of instance: length of description is usually proportional to the number of data items.

Major operations to be analyzed: *search* and *sorting*.

Comparison-based search, efficiency and number of key comparisons.

Linear (sequential) search: key comparison one-by-one. Return value. Iterative and recursive form. Best case and worst case.
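As an illustrative sketch (function names are my own), linear search in both iterative and recursive form, returning the index of the key or -1:

```python
def linear_search(a, key):
    """Iterative linear search: one key comparison per item."""
    for i, x in enumerate(a):
        if x == key:
            return i          # best case: first item, 1 comparison
    return -1                 # worst case: len(a) comparisons

def linear_search_rec(a, key, i=0):
    """Recursive form of the same scan."""
    if i == len(a):
        return -1
    if a[i] == key:
        return i
    return linear_search_rec(a, key, i + 1)
```

Best case is one comparison (key at the starting end); worst case is *n* comparisons (key absent or at the far end).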

Does search direction matter? Can the (asymptotic) efficiency in either case be improved?

To rule out more than one item after a comparison: ordered items.

Binary search: check the middle item, then go left or right. Iterative and recursive form. Best case and worst case.
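A sketch of binary search on a sorted array, iterative and recursive (names are my own):

```python
def binary_search(a, key):
    """Iterative binary search on sorted a; returns index or -1."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2   # check the middle item
        if a[mid] == key:
            return mid         # best case: found at the first probe
        elif a[mid] < key:
            lo = mid + 1       # go right
        else:
            hi = mid - 1       # go left
    return -1                  # worst case: ~lg n probes

def binary_search_rec(a, key, lo=0, hi=None):
    """Recursive form: each call halves the remaining range."""
    if hi is None:
        hi = len(a) - 1
    if lo > hi:
        return -1
    mid = (lo + hi) // 2
    if a[mid] == key:
        return mid
    if a[mid] < key:
        return binary_search_rec(a, key, mid + 1, hi)
    return binary_search_rec(a, key, lo, mid - 1)
```

Each comparison rules out half of the remaining items, giving the O(lg *n*) worst case.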

Can the worst case (asymptotic) efficiency be improved? Why?

What if the checking point is moved from the n/2 of binary search to n/4 or n-4?

How about binary search on unordered linear data structure? Or on ordered linked list?

What if the address is a one-to-one function of the key? Searching without element comparison.

Hash table, ideal situation and actual situation.

Hash function and collision handling. Best case and worst case.
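A minimal sketch of a hash table with chaining as the collision-handling strategy (class and method names are my own; a real implementation would also resize):

```python
class ChainedHashTable:
    """Hash table: address computed from the key; collisions chained in buckets."""

    def __init__(self, m=8):
        self.buckets = [[] for _ in range(m)]   # m chains

    def _index(self, key):
        return hash(key) % len(self.buckets)    # hash function -> bucket address

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for pair in bucket:
            if pair[0] == key:                  # key already present: update
                pair[1] = value
                return
        bucket.append([key, value])             # collision: append to the chain

    def search(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None
```

Best case: each chain has length O(1), so search is O(1). Worst case: all keys collide into one chain and search degrades to a linear scan.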

Is a hash table a linear data structure? Why?

Linear sorting: insertion sort and selection sort. Recursive forms:

```
INSERTION-SORT(A, n)
  IF n > 1
    THEN INSERTION-SORT(A, n-1)
         INSERTION(A, n)

SELECTION-SORT(A, n)
  IF n > 1
    THEN SELECTION(A, n)
         SELECTION-SORT(A, n-1)
```

Operations to be counted: element comparison and element assignment.

Both `INSERTION(A, n)` and `SELECTION(A, n)` take linear time. Iterative and recursive form.

Best case and worst case for each of the two sorting algorithms. Which one is more efficient?
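The recursive forms above can be sketched in Python as follows (helper semantics assumed: `INSERTION(A, n)` inserts `A[n-1]` into the sorted prefix; `SELECTION(A, n)` moves the maximum of `A[0..n-1]` to position `n-1`):

```python
def insertion(a, n):
    """Insert a[n-1] into the sorted prefix a[0..n-2]; linear time."""
    key, i = a[n - 1], n - 2
    while i >= 0 and a[i] > key:   # element comparisons
        a[i + 1] = a[i]            # element assignments
        i -= 1
    a[i + 1] = key

def insertion_sort(a, n):
    if n > 1:
        insertion_sort(a, n - 1)
        insertion(a, n)

def selection(a, n):
    """Move the maximum of a[0..n-1] to position n-1; linear time."""
    m = max(range(n), key=lambda i: a[i])
    a[m], a[n - 1] = a[n - 1], a[m]

def selection_sort(a, n):
    if n > 1:
        selection(a, n)
        selection_sort(a, n - 1)
```

Note the contrast: insertion sort's comparison count is input-sensitive (linear on sorted input, quadratic in the worst case), while selection sort always does a quadratic number of comparisons.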

Binary sorting: merge-sort and quick-sort. Recursive forms:

```
MERGE-SORT(A, p, r)
  IF p < r
    THEN q <- (p + r) / 2
         MERGE-SORT(A, p, q)
         MERGE-SORT(A, q+1, r)
         MERGE(A, p, q, r)

QUICK-SORT(A, p, r)
  IF p < r
    THEN q <- PARTITION(A, p, r)
         QUICK-SORT(A, p, q-1)
         QUICK-SORT(A, q+1, r)
```

Iterative and recursive forms of sorting algorithms. Best case and worst case.

What if `PARTITION(A, p, r)` always returns (*p+r*)/2? How about *p+(r-p)*/100? Or *p*+100 (when *r-p* > 100)?
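Both recursions can be sketched in Python; for `PARTITION` I use the Lomuto scheme with `A[r]` as pivot (an assumption — the pseudocode does not fix the scheme):

```python
def merge(a, p, q, r):
    """Combine the sorted runs a[p..q] and a[q+1..r]."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

def merge_sort(a, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)

def partition(a, p, r):
    """Lomuto partition with a[r] as pivot; returns the pivot's final index."""
    pivot, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quick_sort(a, p, r):
    if p < r:
        q = partition(a, p, r)
        quick_sort(a, p, q - 1)
        quick_sort(a, q + 1, r)
```

Merge-sort always splits at the middle, so its recursion depth is lg *n* regardless of input; quick-sort's split point depends on the data, which is exactly what the `PARTITION` questions above probe.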

In general, size reduction in recursion often takes the form of *n/c* or *n-c*, and they usually lead to different growth orders.
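A sketch of the two patterns, assuming the stated work per recursive call:

- *T*(*n*) = *T*(*n*/2) + O(1) gives O(lg *n*) (binary search)
- *T*(*n*) = *T*(*n*−1) + O(1) gives O(*n*) (linear search)
- *T*(*n*) = 2*T*(*n*/2) + O(*n*) gives O(*n* lg *n*) (merge-sort)
- *T*(*n*) = *T*(*n*−1) + O(*n*) gives O(*n*²) (selection sort, worst-case insertion sort)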

Can sorting be more efficient than O(*n* lg *n*)? Why?
Proof sketch: any comparison sort corresponds to a binary decision tree with at least *n*! leaves (one per permutation), so its height — the worst-case number of comparisons — is at least lg(*n*!) = Θ(*n* lg *n*).

Linear-cost sorting: limited range and granularity. Sorting without (or with restricted) element comparison.
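A sketch of one such linear-cost method, counting sort, assuming keys are integers in the limited range 0..k (no element-to-element comparisons are made):

```python
def counting_sort(a, k):
    """Sort integers in 0..k by counting occurrences; O(n + k) time."""
    count = [0] * (k + 1)
    for x in a:                    # tally each key's frequency
        count[x] += 1
    out = []
    for v in range(k + 1):         # emit keys in value order
        out.extend([v] * count[v])
    return out
```

The comparison lower bound does not apply because the algorithm indexes by key value instead of comparing elements; the trade-off is the restriction on key range and granularity.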