**B-trees and Fibonacci Heaps**

A 2-3 tree is a search tree in which each node contains 1 or 2 key values. A 2-node has 1 key and, unless it is a leaf, 2 children; a 3-node has 2 keys and, unless it is a leaf, 3 children. All leaves are on the same level.

                 [|7|]
                /     \
           [|3|]       [|11|15|]
           /   \       /   |   \
         [1]   [5]   [9] [13] [17 19]

Searching a 2-3 tree is similar to searching a binary search tree, except that within each 3-node, two comparisons are needed in the worst case.

**Insertion**: In a 2-3 tree, a new key value is initially inserted into a leaf, according to the order of a search tree. Then there are the following possibilities:

- If the leaf was a 2-node before the insertion, it becomes a 3-node.
- If the leaf was a 3-node, it is split into two 2-nodes, and the middle value is inserted into the parent. This process may repeat at the parent. If the root splits, a new root is generated.
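
The two insertion cases above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names `Node23`, `_insert`, and `insert` are hypothetical, keys are assumed to be distinct, and a node is allowed to hold 3 keys for a moment before it is split.

```python
class Node23:
    """A 2-3 tree node: 1 or 2 sorted keys; `children` is empty for a leaf."""

    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []


def _insert(node, key):
    """Insert key below node. Return (middle_key, new_right_node) if node split, else None."""
    # Index of the child (or key slot) where the new key belongs.
    i = sum(1 for k in node.keys if k < key)
    if not node.children:
        node.keys.insert(i, key)              # new keys always land in a leaf
    else:
        split = _insert(node.children[i], key)
        if split is None:
            return None
        middle, right = split                 # the child split: absorb its middle key
        node.keys.insert(i, middle)
        node.children.insert(i + 1, right)
    if len(node.keys) <= 2:
        return None                           # a 2-node became a 3-node: done
    # Three keys: split into two 2-nodes and pass the middle key up.
    middle = node.keys[1]
    right = Node23(node.keys[2:], node.children[2:])
    node.keys, node.children = node.keys[:1], node.children[:2]
    return middle, right


def insert(root, key):
    """Insert key and return the root, which is new if the old root split."""
    split = _insert(root, key)
    if split is None:
        return root
    middle, right = split
    return Node23([middle], [root, right])


root = Node23([1])
for k in range(2, 8):
    root = insert(root, k)
print(root.keys, [c.keys for c in root.children])  # [4] [[2], [6]]
```

Returning the split result to the caller keeps the "this process may repeat at the parent" step in one place: each level either absorbs the middle key or splits in turn.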

**Deletion**: To remove a key value from a 2-3 tree, the first step is to search for it. If it is in a leaf, simply remove it. If it is in an internal node, replace it by its predecessor (or successor), which must be in a leaf, and remove that key from the leaf.

After the removal of the key value, if a leaf becomes empty, there are two possibilities:

- If it has a 3-node sibling, a key is moved from the sibling to the parent, and the key in the parent is moved into the empty leaf.
- If it has no 3-node sibling, it is merged with a sibling, together with a key from the parent. This process may repeat at the parent. If the root becomes empty, it is removed.

A 2-3-4 tree extends a 2-3 tree by allowing 4-nodes, so each non-leaf node can have 2, 3, or 4 children.

The operations of 2-3-4 trees are similar to those of 2-3 trees.

There is a mapping between a 2-3-4 tree and a Red-Black tree. In the following, a lower case letter represents a key, and an upper case letter represents a subtree.

        [|b|]                  b
        /   \                 / \
       A     C               A   C

       [|b|d|]               b                 d
       /  |  \              / \               / \
      A   C   E            A   d     or      b   E
                              / \           / \
                             C   E         A   C

      [|b|d|f|]                 d
      / |   | \               /   \
     A  C   E  G             b     f
                            / \   / \
                           A   C E   G

Since each node in a 2-3-4 tree corresponds to one black node (plus at most two red nodes) in a Red-Black tree, the height of a 2-3-4 tree corresponds to the number of black nodes on the path from the root to a leaf in the Red-Black tree. In the worst case, red and black nodes alternate along a path, so that count is half of the number of comparisons needed.
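
As a rough sketch of the mapping above, the following Python function turns one 2-3-4 node into its Red-Black equivalent. The names `RBNode` and `to_red_black` are hypothetical, the subtrees are assumed to be already converted (or `None`), and for a 3-node only one of the two valid shapes is produced.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RBNode:
    key: Any
    color: str                          # "black" or "red"
    left: Optional["RBNode"] = None
    right: Optional["RBNode"] = None


def to_red_black(keys, subtrees):
    """Map one 2-3-4 node (1 to 3 keys, len(keys) + 1 subtrees) to a black
    node that carries the extra keys as red children."""
    if len(keys) == 1:                  # 2-node: a single black node
        b = keys[0]
        A, C = subtrees
        return RBNode(b, "black", A, C)
    if len(keys) == 2:                  # 3-node: black node plus one red child
        b, d = keys
        A, C, E = subtrees
        # Assumption: always lean right; the left-leaning shape is equally valid.
        return RBNode(b, "black", A, RBNode(d, "red", C, E))
    if len(keys) == 3:                  # 4-node: black node with two red children
        b, d, f = keys
        A, C, E, G = subtrees
        return RBNode(d, "black",
                      RBNode(b, "red", A, C),
                      RBNode(f, "red", E, G))
    raise ValueError("a 2-3-4 node holds 1, 2, or 3 keys")


node = to_red_black(["b", "d", "f"], [None, None, None, None])
print(node.key, node.left.key, node.right.key)  # d b f
```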

If you have a very large database that cannot fit in main memory, you need to store it on disk and find a way to search it efficiently. Disk accesses are very slow compared to memory accesses, so you want a method that needs few of them. If a binary tree is stored in secondary memory (with pointers to disk addresses), searching it takes too long, because each level visited can cost one disk access.

The height of a tree is what determines the number of comparisons and disk accesses we need to make. A binary tree with n keys has a height of at least about log2(n), because each node stores only 1 key and has 2 branches. If we can come up with a technique for representing a tree node that stores more keys and has more branches, we could store all the keys in a data structure that is not as deep and thus search it with fewer disk accesses.
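
To put rough numbers on this argument, the sketch below computes the fewest levels a completely full tree needs to hold a given number of keys. The function name `min_levels` and the figures (one million keys, 100 branches per node) are illustrative assumptions, not values from these notes.

```python
def min_levels(n_keys, max_children):
    """Fewest levels of a tree that holds n_keys when every node has up to
    max_children branches and max_children - 1 keys."""
    levels, capacity = 0, 0
    while capacity < n_keys:
        levels += 1
        # A completely full tree with this many levels holds m**levels - 1 keys.
        capacity = max_children ** levels - 1
    return levels


print(min_levels(1_000_000, 2))    # 20
print(min_levels(1_000_000, 100))  # 4
```

If each level visited costs one disk access, the wide tree needs about a fifth of the accesses of the binary tree in this example.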

A B-tree is a way of doing this. A *B-tree with a minimum degree t* (*t* is 2 or more) has the following properties:

- If a non-leaf node contains *n* keys, it has *n + 1* children.
- For every internal node (that is neither the root nor a leaf), its number of children is between *t* and *2t*.
- All leaf nodes are at the same level of the tree.
- All keys stored in a node are in sequential order.
- A node with *n* keys has the following structure:

{address of B-tree node for keys < key 1}

{key 1, address of data for key 1, address of B-tree node with keys > key 1 and < key 2}

{key 2, address of data for key 2, address of B-tree node with keys > key 2 and < key 3}

{key 3, address of data for key 3, address of B-tree node with keys > key 3 and < key 4}

. . .

{key n, address of data for key n, address of B-tree node with keys > key n}

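Example: an order 5 B-tree (here the order *m* = 5 is the maximum number of children, so each node other than the root contains 2-4 keys, and each internal node other than the root has 3-5 children)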

                    [|20|30|42|]
         /---------/  /      \  \---------\
    [ 10 15 ]   [ 25 28 ]   [ 32 34 ]   [ 44 55 ]

Searching a B-tree for a target key, starting at the root:

    If pointer to node is null
        target key is not in tree - return null
    Else
        Search current node
        If target key is found
            return disk address of item with that key
        Else if target key is < first key on node
            Search first child of current node
        Else if key k < target key < key k+1
            Search the child between key k and key k+1
        Else if target key is > last key on node
            Search last child of current node
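
The search procedure above might look like this in Python. It is a sketch under some assumptions: nodes live in memory rather than on disk, `BTreeNode`, `leaf`, and `search` are hypothetical names, and the strings in `data` only stand in for the disk addresses of the records.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class BTreeNode:
    keys: list                                     # sorted keys
    data: list                                     # data[i]: stand-in for the address of keys[i]'s record
    children: list = field(default_factory=list)   # empty for a leaf


def leaf(*keys):
    """Helper for the demo: a leaf whose records are labelled 'addr<key>'."""
    return BTreeNode(list(keys), [f"addr{k}" for k in keys])


def search(node, target):
    """Return the record address for target, or None if it is not in the tree."""
    while node is not None:
        # Position of the first key >= target within this node.
        i = bisect_left(node.keys, target)
        if i < len(node.keys) and node.keys[i] == target:
            return node.data[i]
        # Child i holds exactly the keys that fall between keys[i-1] and keys[i].
        node = node.children[i] if node.children else None
    return None


# The order 5 example tree from above.
root = BTreeNode([20, 30, 42], ["addr20", "addr30", "addr42"],
                 [leaf(10, 15), leaf(25, 28), leaf(32, 34), leaf(44, 55)])
print(search(root, 28))  # addr28
print(search(root, 31))  # None
```

The binary search inside a node replaces the key-by-key comparison of the pseudocode; with only a few keys per node either choice works, and the number of nodes visited is the same.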

Now let's add the following entries: 8, 18, 26, 36, 39, 43

                             [|20|30|42|]
           /----------------/  /      \  \---------------\
    [ 8 10 15 18 ]   [ 25 26 28 ]   [ 32 34 36 39 ]   [ 43 44 55 ]

At this point 2 of the leaf nodes are full and 2 are not. Let's insert in one of the full leaf nodes and see what happens.

Insert 37: 37 is not in the root, so insert it into the child node that holds the keys between 31 and 41, inclusive. That node would then contain 32 34 36 37 39, which is too big, so split it into two nodes and pass the middle value (36) up.

                            [|20|30|36|42|]
           /---------------/  /    |    \  \--------------\
    [ 8 10 15 18 ]  [ 25 26 28 ]  [ 32 34 ]  [ 37 39 ]  [ 43 44 55 ]

Now let's insert into the other node that is full. Remember that all insertions begin at a leaf node. Insert 12 into the leftmost leaf: 8 10 12 15 18 is too big, so split it and pass 12 up.

The parent node becomes 12 20 30 36 42, which is too big, so split it in two and pass 30 up. The new tree has 3 levels (it is fine if the root has fewer than *m/2* entries). From this example, we see that B-trees grow from the bottom up rather than from the top down.

                                   [|30|]
                     /-------------/    \--------------\
                [|12|20|]                           [|36|42|]
               /    |    \                         /    |    \
    [ 8 10 ]  [ 15 18 ]  [ 25 26 28 ]   [ 32 34 ]  [ 37 39 ]  [ 43 44 55 ]
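
The insertions in this example can be replayed with a short Python sketch. It assumes the order 5 tree of the example (at most 4 keys per node), distinct keys, and in-memory nodes; `ORDER`, `Node`, `_insert`, and `insert` are hypothetical names, and a node briefly holds 5 keys before it is split.

```python
ORDER = 5  # maximum number of children; a node holds at most ORDER - 1 keys


class Node:
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = children or []   # empty for a leaf


def _insert(node, key):
    """Insert key below node. Return (middle_key, new_right_node) if node split, else None."""
    i = sum(1 for k in node.keys if k < key)     # slot or child for the key
    if not node.children:
        node.keys.insert(i, key)                 # all insertions begin at a leaf
    else:
        split = _insert(node.children[i], key)
        if split is None:
            return None
        middle, right = split                    # the child split: take its middle key
        node.keys.insert(i, middle)
        node.children.insert(i + 1, right)
    if len(node.keys) < ORDER:
        return None                              # still fits
    # Too big: split in two and pass the middle key up.
    mid = len(node.keys) // 2
    middle = node.keys[mid]
    right = Node(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys, node.children = node.keys[:mid], node.children[:mid + 1]
    return middle, right


def insert(root, key):
    """Insert key and return the root; a split root gets a new parent."""
    split = _insert(root, key)
    if split is None:
        return root
    middle, right = split
    return Node([middle], [root, right])


# Start from the first example tree, then add the same keys as in the text.
root = Node([20, 30, 42],
            [Node([10, 15]), Node([25, 28]), Node([32, 34]), Node([44, 55])])
for k in [8, 18, 26, 36, 39, 43, 37, 12]:
    root = insert(root, k)
print(root.keys, [c.keys for c in root.children])  # [30] [[12, 20], [36, 42]]
```

The only place the tree gains a level is in `insert`, when the old root splits, which is the bottom-up growth described above.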

If we delete from a leaf node, there is no problem unless the leaf is left with fewer than *(m-1)/2* entries. Then we have to merge it with an adjacent leaf, together with the key in the parent that separates the two. If the resulting node has too many entries, we have to split it in two and send the middle key up, as for insertion.

If the key we wish to delete is not in a leaf node, we replace it with the next larger entry in the B-tree. This is analogous to finding the leftmost node in the right subtree of a binary search tree: follow the pointer to the child just after the key, and then follow leftmost pointers until you reach a leaf. Replace the key to be deleted with the smallest key in that leaf, and then delete that key from the leaf. Merge with an adjacent leaf if necessary (see the process for deleting from a leaf node).

Let us delete 44 (in a leaf) from the above tree:

                                   [|30|]
                     /-------------/    \--------------\
                [|12|20|]                           [|36|42|]
               /    |    \                         /    |    \
    [ 8 10 ]  [ 15 18 ]  [ 25 26 28 ]   [ 32 34 ]  [ 37 39 ]  [ 43 55 ]

Now delete 18, which is in a leaf. Since the remaining leaf (15) is too small, merge it with its neighbor and the separating key 20 to get 15 20 25 26 28. That is too big, so split it and move 25 up. The result is like a rotation in an AVL tree.

                                   [|30|]
                     /-------------/    \--------------\
                [|12|25|]                           [|36|42|]
               /    |    \                         /    |    \
    [ 8 10 ]  [ 15 20 ]  [ 26 28 ]      [ 32 34 ]  [ 37 39 ]  [ 43 55 ]

Delete 36: replace it with 37 and delete the old 37 from its leaf. That leaf (39) is now too small, so merge 32, 34, the new 37, and 39; these fit in one full node.

                                   [|30|]
                     /-------------/    \-----------\
                [|12|25|]                         [|42|]
               /    |    \                       /      \
    [ 8 10 ]  [ 15 20 ]  [ 26 28 ]      [ 32 34 37 39 ]  [ 43 55 ]

Now the node with 42 has only one key, so it must be merged with its sibling and the root key: 12 25 30 42 forms a new root node. The tree height is reduced by 1.

                         [|12|25|30|42|]
          /-------------/  /    |    \  \-------------\
    [ 8 10 ]  [ 15 20 ]  [ 26 28 ]  [ 32 34 37 39 ]  [ 43 55 ]

Every n-node B-tree has height O(lg n), and every major operation works along a path from the root to a leaf.
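
The height bound can be made concrete with a short derivation. It assumes the minimum-degree definition given earlier (every node other than the root has at least *t* - 1 keys and, if internal, at least *t* children), a tree holding *n* ≥ 1 keys, and a height *h* counted in edges from the root to a leaf. The root holds at least 1 key, and there are at least 2*t*^(*i*-1) nodes at depth *i*, so:

```latex
n \;\ge\; 1 + (t-1)\sum_{i=1}^{h} 2\,t^{\,i-1}
  \;=\; 1 + 2(t-1)\,\frac{t^{h}-1}{t-1}
  \;=\; 2t^{h} - 1
\qquad\Longrightarrow\qquad
h \;\le\; \log_t \frac{n+1}{2}
```

The base of the logarithm is *t*, which is why a larger minimum degree gives a shallower tree for the same number of keys.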

In a Fibonacci heap, each node x contains a pointer to its parent and a pointer to an arbitrary one of its children. The children of a node are linked together in a circular, doubly linked "child list" in an arbitrary order. The "degree" of a node indicates its number of children.

Defined in this way, certain operations can be carried out in O(1) time. INSERT simply adds the new node as the root of a new tree and updates the heap root when necessary; UNION concatenates the tree-root lists of two heaps and decides the heap root. Complicated structure maintenance happens only after the minimum value (the heap root) is removed. After that, the children of the removed node are treated as roots of separate trees, and then trees of the same degree are merged repeatedly until no two roots have the same degree.
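
A simplified Python sketch of these three operations is given below. It is not a full Fibonacci heap: plain Python lists stand in for the circular, doubly linked root and child lists (so `union` here is not O(1)), parent pointers are left out, and decrease-key and deletion of other nodes are omitted. The names `FibNode`, `FibHeap`, and `_consolidate` are hypothetical.

```python
class FibNode:
    def __init__(self, key):
        self.key = key
        self.children = []          # stand-in for the linked child list

    @property
    def degree(self):
        return len(self.children)   # number of children


class FibHeap:
    def __init__(self):
        self.roots = []             # stand-in for the linked list of tree roots
        self.min = None             # the heap root: the root with the smallest key

    def insert(self, key):
        node = FibNode(key)
        self.roots.append(node)     # the new node is the root of a new tree
        if self.min is None or key < self.min.key:
            self.min = node

    def union(self, other):
        self.roots.extend(other.roots)   # concatenate the two root lists
        if other.min is not None and (self.min is None or other.min.key < self.min.key):
            self.min = other.min

    def extract_min(self):
        smallest = self.min
        if smallest is None:
            return None
        self.roots.remove(smallest)
        self.roots.extend(smallest.children)   # its children become roots
        self._consolidate()
        return smallest.key

    def _consolidate(self):
        by_degree = {}                         # degree -> the one root kept with that degree
        for node in self.roots:
            while node.degree in by_degree:
                other = by_degree.pop(node.degree)
                if other.key < node.key:
                    node, other = other, node
                node.children.append(other)    # larger key goes below; degree grows by 1
            by_degree[node.degree] = node
        self.roots = list(by_degree.values())
        self.min = min(self.roots, key=lambda n: n.key, default=None)


heap = FibHeap()
for k in [7, 3, 9, 1, 5]:
    heap.insert(k)
print([heap.extract_min() for _ in range(5)])  # [1, 3, 5, 7, 9]
```

All the merging work sits in `_consolidate`, which runs only from `extract_min`; `insert` and `union` never restructure any tree.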

A Fibonacci heap can also support other operations, like deleting a non-root node, decreasing a key value, etc.

For this type of structure, it usually makes more sense to analyze the average cost of each operation over a worst-case sequence of operations, which is called the "amortized cost" of the operation. For example, if a stack is implemented as an array that is reallocated to a larger size (say, doubled) whenever it fills up, *push* usually takes O(1) time, but the worst case is O(n) when space reallocation happens. Since the latter happens only after the former has happened O(n) times, the amortized cost is still O(1). Chapter 17 provides a detailed description of this analysis technique.
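
The stack example can be checked with a small Python experiment. The sketch assumes the array doubles in size whenever it is full, which is one common policy and not something these notes specify; `ArrayStack` and its `copies` counter are hypothetical names used only to count the reallocation work.

```python
class ArrayStack:
    """Stack on a fixed-size array that is doubled when full (assumed policy)."""

    def __init__(self):
        self.capacity = 1
        self.slots = [None] * self.capacity
        self.size = 0
        self.copies = 0             # elements moved during all reallocations so far

    def push(self, value):
        if self.size == self.capacity:
            # Expensive case: allocate twice the space and move every element.
            self.capacity *= 2
            new_slots = [None] * self.capacity
            for i in range(self.size):
                new_slots[i] = self.slots[i]
                self.copies += 1
            self.slots = new_slots
        self.slots[self.size] = value   # cheap case: O(1)
        self.size += 1


stack = ArrayStack()
for i in range(1000):
    stack.push(i)
print(stack.copies)  # 1023
```

The 1000 pushes trigger reallocations at sizes 1, 2, 4, ..., 512, for 1023 copies in total. That is fewer than 2 copies per push on average, so the amortized cost of *push* stays O(1).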

Here is a comparison between binary heap and Fibonacci heap: