CIS 3223. Data Structures and Algorithms

Self-balancing search trees (1)

 

0. Conceptual review

Data structure: data items with relations among them, plus operations

Linear data structure, search and sorting

Tree and binary tree (shape), binary search tree (shape and order)

Major operations: search, insert, delete, and traversal

Time costs of the operations: worst case and shape of the tree

 

1. AVL tree

As we have seen, the time spent on an operation (search, insert, delete) in a binary search tree depends on the height of the tree (which is the number of levels in the tree). For a given number of nodes, a balanced tree is the shortest. Therefore, keeping the tree balanced as it changes improves efficiency.

An AVL tree is a height-balanced binary search tree: for every node in an AVL tree, the heights of the left subtree and the right subtree differ by at most 1.

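The balance factor of a node is hL - hR, where hL is the height of its left subtree and hR is the height of its right subtree. (The opposite convention, hR - hL, works equally well as long as it is used consistently. These notes use hL - hR, so a positive balance factor means left-heavy.)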
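
As a rough illustration of these definitions, here is a minimal Python sketch. The names Node, height, balance_factor, and is_avl are chosen for this example only and do not come from the course code. Height is counted in levels, so an empty tree has height 0. Note that is_avl checks only the height condition, not the search-tree ordering.

```python
class Node:
    """A binary tree node; `left` and `right` are None when absent."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Number of levels in the subtree rooted at `node` (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """hL - hR: positive means left-heavy, negative means right-heavy."""
    return height(node.left) - height(node.right)


def is_avl(node):
    """True if every node's balance factor is -1, 0, or +1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_avl(node.left)
            and is_avl(node.right))


# The chain C -> B -> A leans entirely to the left.
chain = Node("C", Node("B", Node("A")))
print(balance_factor(chain), is_avl(chain))  # 2 False
```

Recomputing heights recursively like this is simple but slow; a practical implementation would more likely store the height (or the balance factor) in each node.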

After an insertion or deletion, the balance factor of some node may become +2 or -2. The tree is then "rebalanced" by moving some nodes around to restore its balance.

 

2. Simple rebalance

Let's start with binary search trees with 3 nodes. There are 5 possible shapes, but only one of them is balanced.

Example: each node shows its key over its balance factor

                C/2
                /
              B/1
              /
            A/0 

This is considered an LL situation. The first unbalanced node (C/2) is left-heavy (+2), and so is its left child (+1).

To rebalance the tree without changing the relative order of nodes (as defined in binary search tree), we can use a rotation.

The result of a right rotation would be

           B/0
          /  \
        A/0  C/0
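
The right rotation on this three-node example can be sketched in Python as follows. The Node class and the function name rotate_right are assumptions made for the illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(root):
    """Lift root.left into root's place and return the new subtree root."""
    pivot = root.left
    root.left = pivot.right   # pivot's right subtree moves under the old root
    pivot.right = root        # the old root becomes the pivot's right child
    return pivot


tree = Node("C", Node("B", Node("A")))   # the LL chain C -> B -> A
tree = rotate_right(tree)
print(tree.left.key, tree.key, tree.right.key)  # A B C
```

The caller has to store the returned node wherever the old root was referenced, since the subtree has a new root after the rotation.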

Another example:

             C/2
            /
          A/-1
            \
             B/0 

This is an LR situation. The unbalanced node is left-heavy (+2), and its left child is right-heavy (-1). It is fixed by a left rotation followed by a right rotation.

After the left rotation of the bottom 2 nodes:

             C/2
            /
          B/1
          /
        A/0 

After right rotation:

           B/0
          /  \
        A/0  C/0
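
The two-step fix for the LR example might look like this in Python. This is a sketch with illustrative names (Node, rotate_left, rotate_right), not the course's reference code.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(root):
    """Lift root.right into root's place and return the new subtree root."""
    pivot = root.right
    root.right = pivot.left
    pivot.left = root
    return pivot


def rotate_right(root):
    """Mirror image of rotate_left."""
    pivot = root.left
    root.left = pivot.right
    pivot.right = root
    return pivot


# LR shape: C has left child A, and A has right child B.
tree = Node("C", Node("A", None, Node("B")))

tree.left = rotate_left(tree.left)        # step 1: turns LR into LL
after_step_1 = (tree.key, tree.left.key, tree.left.left.key)

tree = rotate_right(tree)                 # step 2: the ordinary LL fix
after_step_2 = (tree.left.key, tree.key, tree.right.key)

print(after_step_1)  # ('C', 'B', 'A')
print(after_step_2)  # ('A', 'B', 'C')
```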

There are two mirror-image situations: RR and RL.

 

3. General rebalance

We showed a very simple case with just 3 nodes. In general, many more nodes may be involved in the rotation.

After an insertion or deletion, "out-of-balance" nodes (with balance factor 2 or -2) can appear only on the path from the root to the insertion/deletion location. If we look at the "out-of-balance" node that is farthest from the root, there are only 4 canonical forms.

LL case:

              C/2
             /   \
          B/1     CR
         /   \
      A/0     BR
     /   \
   AL     AR

Here AL, AR, BR, and CR have the same height.

From properties of binary search trees, we know the following: AL < A < AR < B < BR < C < CR

This requires a right rotation. B becomes the root, and C becomes its right child. C no longer has B as its left subtree; it takes B's former right subtree, BR, instead. B and C end up with balance factors of 0. The balance factors of the other nodes remain the same as before.

          B/0
         /   \
      A/0     C/0
     /  \    /   \
   AL    AR  BR   CR

The order AL < A < AR < B < BR < C < CR still holds.
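
One way to convince yourself that the rotation keeps the order and shortens the tree is to run it on a small instance. In the sketch below, the keys 1 to 7 stand in for AL, A, AR, B, BR, C, CR, and each of the four subtrees is assumed to be a single leaf. The helper names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_right(root):
    pivot = root.left
    root.left = pivot.right   # BR moves from B to C
    pivot.right = root
    return pivot


def in_order(node):
    """Keys of the subtree in left-node-right order."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def height(node):
    """Number of levels (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


# AL=1, A=2, AR=3, B=4, BR=5, C=6, CR=7
a = Node(2, Node(1), Node(3))
b = Node(4, a, Node(5))
c = Node(6, b, Node(7))

before = (in_order(c), height(c))
root = rotate_right(c)
after = (in_order(root), height(root))

print(before)  # ([1, 2, 3, 4, 5, 6, 7], 4)
print(after)   # ([1, 2, 3, 4, 5, 6, 7], 3)
```

The in-order sequence is identical before and after, while the height drops from 4 levels to 3.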

LR Case:

               C/2
              /   \
          A/-1     CR
         /    \
       AL     B/0
             /   \
            BL    BR

Assume that all the subtrees AL, BL, BR, and CR have the same height. For the order among the nodes, here we have: AL < A < BL < B < BR < C < CR.

A left rotation (around A) gives us:

              C/2
             /   \
          B/1     CR
         /   \
       A/0    BR
      /   \    
     AL    BL

then a right rotation (around C) gives us a balanced tree:

          B/0
         /   \
      A/0     C/0
     /   \   /   \
    AL    BL BR   CR

The order AL < A < BL < B < BR < C < CR still holds.
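
The balance factors shown in the three LR diagrams can be reproduced with a short Python sketch. As before, the keys 1 to 7 stand in for AL, A, BL, B, BR, C, CR, each subtree is assumed to be a single leaf, and all names are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    return height(node.left) - height(node.right)   # hL - hR


def rotate_left(root):
    pivot = root.right
    root.right = pivot.left
    pivot.left = root
    return pivot


def rotate_right(root):
    pivot = root.left
    root.left = pivot.right
    pivot.right = root
    return pivot


# AL=1, A=2, BL=3, B=4, BR=5, C=6, CR=7
b = Node(4, Node(3), Node(5))
a = Node(2, Node(1), b)
c = Node(6, a, Node(7))


def factors():
    """Current balance factors of the three named nodes."""
    return {"A": balance_factor(a), "B": balance_factor(b), "C": balance_factor(c)}


start = factors()
c.left = rotate_left(a)      # left rotation around A
middle = factors()
root = rotate_right(c)       # right rotation around C
end = factors()

print(start)   # {'A': -1, 'B': 0, 'C': 2}
print(middle)  # {'A': 0, 'B': 1, 'C': 2}
print(end)     # {'A': 0, 'B': 0, 'C': 0}
```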

The RL and RR cases are symmetric.

These four cases cover all possibilities. For an out-of-balance node (i.e., with balance factor 2 or -2), if its child on the heavy side has balance factor 1 or -1, the situation maps directly onto one of the above four cases: 2/1 to LL, 2/-1 to LR, -2/-1 to RR, and -2/1 to RL. Please note that if the third node (A in LL, B in LR and RL, and C in RR) has balance factor 1 or -1, we can still use the above procedure to rebalance the tree. If the heavy-side child has balance factor 0 (which can happen only after a deletion, never after an insertion), the situation is treated as LL or RR, and the tree will be balanced.
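
The case selection can be written as a small function of the two balance factors. This is a sketch assuming the hL - hR convention; the function name classify is made up for the example.

```python
def classify(node_bf, child_bf):
    """Name the rebalancing case.

    node_bf  -- balance factor of the out-of-balance node (+2 or -2)
    child_bf -- balance factor of its child on the heavy side (-1, 0, or +1)
    """
    if node_bf == 2:
        # Left-heavy node: a child factor of 0 is handled like LL.
        return "LL" if child_bf >= 0 else "LR"
    if node_bf == -2:
        # Right-heavy node: a child factor of 0 is handled like RR.
        return "RR" if child_bf <= 0 else "RL"
    raise ValueError("node is not out of balance")


for pair in [(2, 1), (2, -1), (-2, -1), (-2, 1), (2, 0), (-2, 0)]:
    print(pair, classify(*pair))
```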

An AVL tree is maintained by making some changes to the algorithm for binary search tree insertion and deletion. We can visualize the change as retracing the path we took to do the insertion/deletion and checking to see whether the change caused a node to become unbalanced. If necessary, a rebalance action is taken.
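
A recursive insertion makes the "retracing" natural: as each recursive call returns, it passes back through one node on the insertion path and can rebalance it. The sketch below is one possible arrangement, not the course's implementation. It assumes each node stores its own height (in levels), ignores duplicate keys, and uses illustrative function names.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1            # levels in the subtree rooted here


def h(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(h(node.left), h(node.right))


def bf(node):
    return h(node.left) - h(node.right)     # hL - hR


def rotate_right(root):
    pivot = root.left
    root.left = pivot.right
    pivot.right = root
    update(root)                   # root is now below pivot, so update it first
    update(pivot)
    return pivot


def rotate_left(root):
    pivot = root.right
    root.right = pivot.left
    pivot.left = root
    update(root)
    update(pivot)
    return pivot


def rebalance(node):
    update(node)
    if bf(node) == 2:
        if bf(node.left) < 0:                       # LR: reduce to LL first
            node.left = rotate_left(node.left)
        return rotate_right(node)                   # LL
    if bf(node) == -2:
        if bf(node.right) > 0:                      # RL: reduce to RR first
            node.right = rotate_right(node.right)
        return rotate_left(node)                    # RR
    return node


def insert(node, key):
    """Insert key into the subtree and return the subtree's (possibly new) root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node                # duplicate key: ignored (an assumption)
    return rebalance(node)         # executed on the way back up the path


root = None
for key in range(1, 8):            # sorted input would degenerate a plain BST
    root = insert(root, key)
print(root.key, root.height)       # 4 3
```

Inserting 1 through 7 in sorted order into an ordinary binary search tree would give a chain of 7 levels; with rebalancing the result has 3 levels.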

In a balanced binary search tree, the complexity of search, insertion, and deletion is O(log n), so it is more efficient overall than linear data structures such as arrays and linked lists, where at least one of these operations takes O(n) time.
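
To get a feel for why the height stays logarithmic, consider the sparsest AVL tree with a given number of levels: it consists of a root, a sparsest subtree one level shorter, and a sparsest subtree two levels shorter. The sketch below (the function name min_nodes is made up for the example) computes that minimum node count.

```python
def min_nodes(levels):
    """Fewest nodes an AVL tree with the given number of levels can have."""
    if levels == 0:
        return 0
    smaller, current = 0, 1            # minimum counts for 0 and 1 levels
    for _ in range(levels - 1):
        # root + sparsest subtree one level shorter + one two levels shorter
        smaller, current = current, 1 + current + smaller
    return current


for levels in (1, 2, 3, 4, 5, 10):
    print(levels, min_nodes(levels))
# 1 1
# 2 2
# 3 4
# 4 7
# 5 12
# 10 143
```

Because the minimum node count grows exponentially with the number of levels, the number of levels grows only logarithmically with the number of nodes.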

AVL tree demo on the web: an animation.

A demo applet:

After each insertion or deletion, the current tree is displayed. The tree is shown sideways, with a parent on the left, its children on the right, and the left child above the right child. For example, in the following display, B is the parent, A is the left child, and C is the right child.

        A
     B
        C

In the sample applet, the height and balance factor of a node are displayed after the String in it.
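
A sideways display of this kind can be produced by visiting the left subtree, then the node, then the right subtree, indenting each key by its depth. The following Python sketch is not the applet's code; the function name sideways and the three-space indent are assumptions, and it prints only the key, without the height and balance factor.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def sideways(node, depth=0, indent="   "):
    """Return the display lines: left subtree, then the node, then right subtree."""
    if node is None:
        return []
    return (sideways(node.left, depth + 1, indent)
            + [indent * depth + str(node.key)]
            + sideways(node.right, depth + 1, indent))


tree = Node("B", Node("A"), Node("C"))
print("\n".join(sideways(tree)))
```

For the three-node tree above, this prints A and C indented by one level, with B flush left on the line between them.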