Binary Search Trees
A binary tree is a tree in which each node has at most two children: a left successor and a right successor.
To represent a binary tree, each node usually has two pointers to its successors, though it may also contain a pointer to its predecessor.
To represent a general tree, we can add an array or linked list to each node to point to its successors, or convert the tree into a corresponding binary tree following the "left-child, right-sibling" mapping.
In the discussion of heaps, we saw that a (complete) binary tree can be mapped into an array.
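A systematic visit of the nodes of a tree is called a "walk" or "traversal". For a binary tree, there are three common walk orders, all defined recursively: preorder (visit the node, then its left subtree, then its right subtree), inorder (left subtree, node, right subtree), and postorder (left subtree, right subtree, node).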
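The recursive definitions translate almost directly into code. The following is a minimal Python sketch of the preorder, inorder, and postorder walks; the `Node` class and the `visit` callback are assumptions made for illustration, not part of the original notes.

```python
class Node:
    """A binary tree node; `left` and `right` are None when absent."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def preorder(node, visit):
    """Visit the node first, then its left subtree, then its right subtree."""
    if node is not None:
        visit(node.key)
        preorder(node.left, visit)
        preorder(node.right, visit)

def inorder(node, visit):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is not None:
        inorder(node.left, visit)
        visit(node.key)
        inorder(node.right, visit)

def postorder(node, visit):
    """Visit the left subtree, then the right subtree, then the node."""
    if node is not None:
        postorder(node.left, visit)
        postorder(node.right, visit)
        visit(node.key)

# Example tree:
#       4
#      / \
#     2   6
#    / \
#   1   3
root = Node(4, Node(2, Node(1), Node(3)), Node(6))
pre, ino, post = [], [], []
preorder(root, pre.append)    # pre  == [4, 2, 1, 3, 6]
inorder(root, ino.append)     # ino  == [1, 2, 3, 4, 6]
postorder(root, post.append)  # post == [1, 3, 2, 6, 4]
```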
These orders can be followed manually by tracing the outline drawn around the tree:
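A binary search tree (BST) is a binary tree in which, for every node, the keys in its left subtree are no larger than the node's key and the keys in its right subtree are no smaller. Given this definition, an inorder walk of a binary search tree lists all of its nodes in sorted order. In this sense, a BST is "sorted" horizontally. In comparison, a heap is sorted vertically, so it maintains only a partial order among the nodes in the tree.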
For such a binary tree, search is similar to binary search in an array.
The algorithm follows a path from the root to where the node is, or should be, in the tree. The running time is therefore proportional to the length of that path, and the worst-case running time is proportional to the height of the tree.
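As a rough illustration, a recursive search might look like the following Python sketch. The `Node` class and the name `tree_search` are assumptions chosen for this example; the function returns `None` when the key is absent.

```python
class Node:
    """A BST node; `left` and `right` are None when absent."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def tree_search(x, k):
    """Return the node with key k in the subtree rooted at x, or None."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search(x.left, k)   # k can only be in the left subtree
    return tree_search(x.right, k)      # k can only be in the right subtree

# Example BST:
#         8
#        / \
#       3   10
#      / \    \
#     1   6    14
root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
found = tree_search(root, 6)    # the node holding key 6
missing = tree_search(root, 7)  # None: 7 would belong to the right of 6
```

Each recursive call moves one level down, so the number of calls is bounded by the height of the tree plus one.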
The non-recursive version of the algorithm:
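A minimal Python sketch of an iterative search, assuming a simple `Node` class with `key`, `left`, and `right` fields (these names are illustrative assumptions):

```python
class Node:
    """A BST node; `left` and `right` are None when absent."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def iterative_tree_search(x, k):
    """Walk down from x until key k is found or the path runs out."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x

# Example BST:
#         8
#        / \
#       3   10
#      / \    \
#     1   6    14
root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
hit = iterative_tree_search(root, 14)   # the node holding key 14
miss = iterative_tree_search(root, 9)   # None
```

The loop replaces the tail recursion, so no call stack grows with the depth of the tree.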
We can treat the search for the minimum and maximum keys as special cases of the search operation. In these cases, the comparisons along the path become unnecessary, and the algorithm simply follows one direction to its end: left for the minimum and right for the maximum.
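The two special cases can be sketched in Python as follows. The `Node` class and the function names are assumptions for illustration, and both functions assume a non-empty subtree.

```python
class Node:
    """A BST node; `left` and `right` are None when absent."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def tree_minimum(x):
    """Follow left pointers to the node with the smallest key under x."""
    while x.left is not None:
        x = x.left
    return x

def tree_maximum(x):
    """Follow right pointers to the node with the largest key under x."""
    while x.right is not None:
        x = x.right
    return x

# Example BST:
#         8
#        / \
#       3   10
#      / \    \
#     1   6    14
root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
smallest = tree_minimum(root).key   # 1
largest = tree_maximum(root).key    # 14
```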
Given a node x in a binary search tree where all keys are distinct, the successor of x is the node with the smallest key greater than x.key. In an inorder tree walk, this node immediately follows x. The following algorithm assumes that each node has a pointer to its parent. If x has a right subtree, its successor is the minimum node in that subtree; otherwise its successor is the closest ancestor of x whose left subtree contains x. If no such ancestor exists, x holds the largest key and has no successor.
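The two cases can be sketched in Python as below. The `Node` class, which sets each child's `parent` pointer in its constructor, and the function names are assumptions made for this example.

```python
class Node:
    """A BST node with a parent pointer, set when children are attached."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self

def tree_minimum(x):
    """Return the leftmost node of the subtree rooted at x."""
    while x.left is not None:
        x = x.left
    return x

def tree_successor(x):
    """Return the node that follows x in an inorder walk, or None."""
    if x.right is not None:
        # Case 1: the successor is the minimum of the right subtree.
        return tree_minimum(x.right)
    # Case 2: climb until we leave a left subtree.
    y = x.parent
    while y is not None and x is y.right:
        x = y
        y = y.parent
    return y

# Example BST:
#           8
#          / \
#         3   10
#        / \    \
#       1   6    14
#          / \
#         4   7
n7 = Node(7)
n3 = Node(3, Node(1), Node(6, Node(4), n7))
n14 = Node(14)
root = Node(8, n3, Node(10, None, n14))

after_3 = tree_successor(n3)     # node 4: minimum of 3's right subtree
after_7 = tree_successor(n7)     # node 8: first ancestor with 7 on its left
after_14 = tree_successor(n14)   # None: 14 is the maximum
```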
The Tree-Predecessor algorithm is symmetric to this one.
Starting from the minimum node and repeatedly calling Tree-Successor gives a non-recursive inorder tree walk algorithm.
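A sketch of such a walk in Python, written as a generator. The `Node` class with parent pointers and all function names are illustrative assumptions.

```python
class Node:
    """A BST node with a parent pointer, set when children are attached."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self

def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x

def tree_successor(x):
    if x.right is not None:
        return tree_minimum(x.right)
    y = x.parent
    while y is not None and x is y.right:
        x = y
        y = y.parent
    return y

def inorder_walk(root):
    """Yield keys in sorted order without recursion or an explicit stack."""
    if root is None:
        return
    x = tree_minimum(root)
    while x is not None:
        yield x.key
        x = tree_successor(x)

# Example BST:
#           8
#          / \
#         3   10
#        / \    \
#       1   6    14
#          / \
#         4   7
root = Node(8,
            Node(3, Node(1), Node(6, Node(4), Node(7))),
            Node(10, None, Node(14)))
keys = list(inorder_walk(root))   # [1, 3, 4, 6, 7, 8, 10, 14]
```

Although a single successor call can take O(h) time, the whole walk crosses each edge at most twice, so visiting all n nodes takes O(n) time in total.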
All the search algorithms on a BST run in O(h) time, where h is the height of the tree.
The following algorithm inserts node z into BST T (assuming z is not already in T):
In the algorithm, x traces a path to the insertion point, and y indicates the parent of x.
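A minimal Python sketch of insertion, using the same roles for x and y as described above. The `Node` and `BST` classes and the name `tree_insert` are assumptions for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None

class BST:
    """A tree handle that only stores the root pointer."""
    def __init__(self):
        self.root = None

def tree_insert(T, z):
    """Insert node z into tree T; z becomes a new leaf."""
    y = None          # trailing pointer: parent of x
    x = T.root        # traces the path down to the insertion point
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.parent = y
    if y is None:
        T.root = z    # the tree was empty
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z

T = BST()
for k in [5, 3, 8, 1, 4, 9]:
    tree_insert(T, Node(k))
# Resulting shape:
#       5
#      / \
#     3   8
#    / \   \
#   1   4   9
```

The shape of the tree depends on the insertion order: the same keys inserted in sorted order would form a single chain of right children.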
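The deletion algorithm is more complicated, because after a non-leaf node is deleted, the "hole" in the structure needs to be filled by another node. There are three possibilities: the node has no children, so it is simply removed; the node has exactly one child, which takes its place; or the node has two children, in which case its successor takes its place.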
In the following algorithm, z is an input argument referring to the node to be deleted from the BST T, and the local variable y refers to its successor.
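One way to sketch deletion in Python is shown below. It follows a common textbook formulation that uses a subtree-replacement helper; the `transplant` helper, the `Node` and `BST` classes, and all function names are assumptions for this example rather than the notes' own pseudocode.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

class BST:
    def __init__(self):
        self.root = None

def tree_insert(T, z):
    y, x = None, T.root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.parent = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z
    return z

def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x

def transplant(T, u, v):
    """Replace the subtree rooted at u with the subtree rooted at v."""
    if u.parent is None:
        T.root = v
    elif u is u.parent.left:
        u.parent.left = v
    else:
        u.parent.right = v
    if v is not None:
        v.parent = u.parent

def tree_delete(T, z):
    """Remove node z from T, keeping the BST property."""
    if z.left is None:
        transplant(T, z, z.right)       # no children, or only a right child
    elif z.right is None:
        transplant(T, z, z.left)        # only a left child
    else:
        y = tree_minimum(z.right)       # successor of z; it has no left child
        if y.parent is not z:
            transplant(T, y, y.right)   # detach y from its current position
            y.right = z.right
            y.right.parent = y
        transplant(T, z, y)             # y takes z's place
        y.left = z.left
        y.left.parent = y

def inorder_keys(x):
    return [] if x is None else inorder_keys(x.left) + [x.key] + inorder_keys(x.right)

T = BST()
nodes = {k: tree_insert(T, Node(k)) for k in [15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9]}
tree_delete(T, nodes[9])     # leaf
tree_delete(T, nodes[6])     # two children; successor 7 is its right child
tree_delete(T, nodes[15])    # root; successor 17 lies deeper in the right subtree
remaining = inorder_keys(T.root)   # [2, 3, 4, 7, 13, 17, 18, 20]
```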
Both of the above algorithms run in O(h) time, where h is the height of the tree.
Since all major operations on a BST run in O(h) time, the height of a binary search tree determines the worst-case running time. For a binary tree with n nodes, the best case (a complete binary tree) is h = Θ(lg(n)), and the worst case (a linear list) is h = Θ(n). A randomly formed BST has an expected height h = Θ(lg(n)).
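The effect of insertion order on height can be observed with a small experiment. The following Python sketch builds one tree from sorted keys and one from shuffled keys and measures both heights; the helper names, the choice of n = 1023, and the fixed random seed are assumptions for illustration.

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def build(keys):
    """Insert keys one by one into an initially empty BST; return the root."""
    root = None
    for k in keys:
        node = Node(k)
        if root is None:
            root = node
            continue
        x = root
        while True:
            if k < x.key:
                if x.left is None:
                    x.left = node
                    break
                x = x.left
            else:
                if x.right is None:
                    x.right = node
                    break
                x = x.right
    return root

def height(root):
    """Height in edges (-1 for an empty tree), using an explicit stack."""
    best = -1
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best

n = 1023
min_height = n.bit_length() - 1          # floor(lg n): height of a complete tree

sorted_height = height(build(range(n)))  # sorted input degenerates to a chain

keys = list(range(n))
random.Random(0).shuffle(keys)           # fixed seed so the run is repeatable
random_height = height(build(keys))

print(min_height, random_height, sorted_height)
```

The sorted input produces a chain of height n - 1, while the shuffled input typically yields a height that is a small constant multiple of lg(n).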