CSC Lecture notes on Trees


I. Trees, Binary Trees

1. Basic Concepts

2. Binary Trees

3. Mathematical Properties of binary trees

  1. Number of nodes on each level i is at most 2i
    • level 0: 20 = 1 ... root
    • level 1: 21 = 2
    • level 2: 22 = 4

    Those are upper bound numbers which hold when a tree is fully expanded (i.e., a tree of depth d is complete and all nodes at depth 0 through d-1 have two children).

  2. Total number of nodes in a (general) binary tree of depth d is
    • upper bound: 20 + 21 + 22 + .. + 2d = 2d+1 -1 (by geometric sequence) -- when a tree is fully expanded
    • lower bound: d + 1 -- when all internal nodes have 1 child (called a degenerate tree)

  3. Depth of a binary tree with N nodes >= floor(lgN)
    • lower bound: floor(lgN) .. when a tree is fully expanded
    • upper bounf: N - 1 .. , when a tree is a degenerate tree

      e.g. floor(lg31) = 4

Questions:

  1. How many nodes are in a fully expanded binary tree of depth 5? Of them, how many are internal, and how many are leaf nodes?



  2. What is the depth of a complete binary tree with 200 nodes?



4. Class BinaryNode

class BinaryNode
{
  // constructors
  BinaryNode( Comparable o ) { element = o; }
  BinaryNode( Comparable o, BinaryNode lt, BinaryNode rt )
  {
    element  = o;
    left     = lt;
    right    = rt;
  }
  // fields
  Comparable element;
  BinaryNode left;
  BinaryNode right;
}

5. Inorder/Preorder/Postorder Tree Traversals

             A
           /   \
          B     F
           \   / 
           C   G
          / \
         D   E

There are three different orders to visit nodes:

  1. Preorder -- 1) visit current node, 2) go left, then 3) go right
  2. Inorder -- 1) go left, 2) visit current node, then 3) go right
  3. Postorder -- 1) go left, 2) go right, then 3) visit current node
Preorder:  A B C D E F G
Inorder:   B D C E A G F
Postorder: D E C B G F A

6. Recursion and Binary Trees

7. Expression Tree

Preorder:  - * 3 5 2
Inorder:   3 * 5 - 2
Postorder: 3 5 * 2 -

II.  Binary Search Trees (BST's)

1. Concepts

2. Basic Schemes

  1. Find (binary search)
    • To find an element x, start from the root and go down the tree RECURSIVELY.
    • At any node n,
      • if x equals n.element --> return n (reference to the existing node)
      • if x < n.element --> go left
      • if x > n.element --> go right

  2. Insertion
    • New nodes are always inserted as leaf nodes -- NO nodes are inserted as internal nodes.
    • To insert a new item, we must go down the tree as if the new item is searched using binary search. When we come to a node which has NULL to the position (left or right child) where the new item should be found we place the new node there.
    • Example:
             20                                  20 
            / \                                 / \      
          12   31                             12   31
         / \   / \        => after           / \   / \   
        9  18 25  41        inserting 30    9  18 25  41 
          /       / \                         /    \  / \ 
         15      35  60                      15    30 35 60 
                   \                                    \ 
                   40                                   40
      
    • The shape of the tree depends on the order of the inserted values. For example, if the values 30, 2, 42, 15 are inserted in that order,
      1) insert 30    2) insert 2    3) insert 42    4) insert 15
      
          30              30             30               30
                         /              / \              / \
                        2              2   42           2   42
                                                         \
                                                         15
      

      However, if the same values are inserted in a different order: 2, 15, 30, 42

      1) insert 2    2) insert 15    3) insert 30    4) insert 42
      
          2               2               2               2
                           \               \               \
                            15              15              15
                                              \               \
                                               30              30
                                                                 \
                                                                  42

      Thus, depending on the order of insertion, a sequence of insertions may yield a degenerate tree.

  3. Deletion
    • Deletion is more complicated: since nodes in a tree are connected, we may need to reorganize the tree if internal nodes are deleted.
    • Basically, there are three cases:

      Case 1: a leaf node is deleted

      • This case is simple -- just delete the node and NULL out the pointer from its parent (either parent's right or left child)
      • Example:
                  10                              10
                 /  \                            /  \
                5    15     => after            5    15
               / \   /          deleting 6     /     /
              3   6 12                        3     12

      Case 2: an internal node with one child is deleted

      • This case is also relatively simple -- just replace the node with its non-NULL child (either left or right)
      • Example 1) -- delete node with one left child
                  10                               10
                 /  \                             /  \
                5    15     => after             5    12
               / \   /          deleting 15     / \   / \
              3   6 12                         3   6 11  13
                    / \
                  11  13
        
      • Example 2) -- delete node with one right child
                  10 
                 /  \
                5    15     => after
               / \     \        deleting 15
              3   6    20
                       / \
                     17  23
        
      • Both Example 1) and 2) work because all nodes in the right subtree of node with 10 have greater values (a defining property of BST).

      Case 3: an internal node with two children is deleted

      1. Replace data with the smallest data in the right subtree.
      2. Remove this data from the right subtree.

      Example: delete 31

             20 
            / \ 
          12   31
         / \   / \        => after
        9  18 25  41        deleting 31
          /       / \
         15      35  60
                   \
                   40

NOTE: The other way is also possible, that is, to replace data with the largest data in the left subtree. 

3. Class BinarySearchTree

 


III. Balanced Search Trees


Introduction

Balanced search trees are trees whose height in the worst case is O(log n)and for which the operations INSERT, DELETE, and SEARCH can be implemented in O(log n)time. Complete binary trees, if used as search trees, cannot be updated after an insertion or deletion in time less than Ω(n)They are too rigid. The key to the design of balanced search trees is the introduction of glue, or degrees of freedom. This glue takes the form of red nodes in a red-black tree, nodes with 2 children (instead of three) in 2-3 trees, and nodes with a height-imbalance of one in AVL trees.

Binary search trees (see Figure 1) work well for many applications but they are limiting because of their bad worst-case performance (height = O(n)). A binary search tree with this worst-case structure is no more efficient than a regular linked list.

Figure 1: Binary search tree


Figure 2: Worst case binary search tree


Types of Balanced Search Trees

There exist over one hundred types of balanced search trees. Some of the more common types are: AVL trees, B-trees, and red-black trees. B-trees are a category of tree designed to work well on magnetic disks and other direct-access secondary storage devices (Cormen et al., 1996, p:381).

Heights of balanced search trees

red-black tree: height 2 log2(n+1)
AVL tree: height (1.44042...)*log2 n
2-3 tree: height 1 * log2n


AVL tree

AVL trees were invented by two computer chess specialists in Russia, Adelson-Velskii and Landis in 1962 (hence the acronym AVL). >

B-tree

A B-tree of order m is a tree exhibiting the following properties:

  1. the root is either a leaf or has between 2 and m children
  2. all nonleaf nodes (except the root) have between [m/2] and m children
  3. all leaves are at the same depth

B-trees of order 4 are known as 2-3-4 trees, and a B-tree of order 3 is known as a 2-3 tree. The accompanying Java applet demonstrates insert operations on a 2-3 tree. In fact, 2-3 trees were "invented" by J.E. Hopcroft in 1970 (Cormen et al., 1996, p:399), prior to B-trees in 1972 by Bayer and McCreight.



Red-black trees

Red-black trees are standard binary trees that satisfy the following conditions:

  1. every node is either red or black
  2. every external node (nil) is black
  3. no red node can have a red parent
  4. every simple path from a node to an external contains the same number of black nodes

[W: 4.4] AVL Trees

1. Basics

2. Insertion

 

 



Sparse AVL trees

Sparse AVL trees are defined as AVL trees of height h with the fewest possible nodes. Figure 3 shows sparse AVL trees of heights 0, 1, 2, and 3.



Figure 3: Structure of an AVL tree



Analysis of AVL tree organizations

We first analyze the number of nodes nk in a sparse AVL tree of height k.

From Figure 3, we can deduce the following recurrence relation:

which looks like the Fibonacci sequence given by:

Adding 1 on both sides of the equalities we obtain:

Taking to be , we rewrite the recurrence relation as:

which is the usual Fibonacci recurrence relation.

By inspecting the Fibonacci sequence, one suspects that it grows somehow "exponentially" i.e. that it is of the form: , for some suitable constants A and c that we have to find.

Replacing by in the Fibonacci recurrence relation, we get:

After cancelling A, dividing by on both sides, the following quadratic equation:

is solved with the usual method, and:

The theory for linear homogeneous recurrence relations (cf: Rosen, K. "Discrete Mathematics and its Applications", third edition.) says that the general solution of such a recurrence relation is given by:

Note that: is an integer. Also:

since , we have fast.

We still have to find the two unknowns and . Using the initial conditions in the recurrence relation, we know that and , so the constants are extracted from:

Solving we get: and,

Hence, a sparse AVL tree of height k has:

If an AVL tree in general has height k, and n nodes, then:

and therefore, after trivial operations,

Note: is known under the name golden ratio and powers of it are very close to be integers.