CSC393 Feb19

slide version

single file version

Contents

  1. Binary Search Trees - Performance
  2. Perfectly Balanced Binary Search Trees
  3. Balanced Binary Search Trees
  4. AVL Trees
  5. Height of an AVL tree
  6. Estimating f(h)
  7. Upper Bound for the Height of an AVL tree
  8. Insertion into an AVL Tree
  9. The Rotate Methods
  10. The 2 Rotate Methods
  11. Fix Unbalanced Node with rotateLeft
  12. RotateLeft (Before Insertion)
  13. RotateLeft (After Insertion)
  14. The rotateLeft method
  15. Insertion in a Binary Search Tree (not balanced)
  16. Insertion in an AVL Tree (balanced Binary Search Tree)
  17. Insertion in an AVL (with balancing code)
  18. Symbol Table Keys
  19. Problem
  20. Two kinds of symbol tables
  21. Symbol Tables with comparable keys
  22. The ceiling function
  23. Implementation of the ceiling function

Binary Search Trees - Performance[1] [top]

Binary Search Trees have worst case performance for the symbol table operations - put and get - the same as for a linked list.

To guarantee these operations have order of growth log(N), we need to keep the trees balanced.

Keeping the tree balanced guarantees that the paths have length proportional to log(N).

Both put and get must traverse one path and so traversing a path will cost at most log(N) operations.

But we have to add to this the cost of rebalancing.

So the cost of keeping the tree balanced must be small so that the order of growth of put and get remains log(N).

Perfectly Balanced Binary Search Trees[2] [top]

One definition of balanced is too restrictive:

For every node, the height of the left and right subtrees of the node should be equal.

How many nodes are in such a tree if the height if h?

Answer:

      2h + 1 - 1
    

Proof:

Correct for h = 0.

If T(h) is such a tree of height h > 0, then its left and right subtrees are of height h - 1.

So the number of nodes in each subtree must be:

      2h - 1
    

The total number of nodes in T(h) is then

      2 * (2h - 1) + 1 = 2h + 1 - 2 + 1 =
      2h + 1 - 1
    

So only "balanced trees" with this definition could only exist provided the number of nodes is of this this special form!

So we need a slightly less restrictive definition.

Balanced Binary Search Trees[3] [top]

So what should it mean for a BST to be balanced?

There are several possibilities that all lead to all paths in the tree having length at most proportional to log(N).

Two different kinds of balanced BSTs using slightly different definitions of balanced are:

AVL Trees[4] [top]

AVL trees are Binary Search Trees with the additional property that the methods maintain a "balanced" property:

      For every node in an AVL tree, the heights of its children differ by
      at most 1.
    

For the purpose of this definition (and also the implementation), an empty child will be considered to be at height -1.

The AVL tree below shows the height of each node.

      The children of 600 have heights 

       0 (left child 550) and
      -1 (right child empty).

      and these differ by only 1.
    

Height of an AVL tree[5] [top]

What is the an upper bound for the height of an AVL tree with N nodes?

This can be determined if we can answer the reverse question:

What is a lower bound for the number of nodes in an AVL tree of height h?

Let f(h) be the minimum number of nodes that can be in an AVL tree of height h.

Then

      f(-1) = 0
      f(0) = 1
      f(1) = 2
    

For h >= 1,

Estimating f(h)[6] [top]

      f(h) = 1 + f(h-1) + f(h-2)
    

Since f(h-1) >= f(h-2), we get

      f(h) > 2f(h-2)   (for all h >= 1)
    
f(7) > 2f(5) f(8) > 2f(6)
f(7) > 22f(3) f(8) > 22f(4)
f(7) > 23f(1) f(8) > 23f(2)
f(7) > 232 f(8) > 24f(0)
f(7) > 24 f(8) > 24

In general,

      f(h) > 2ceiling(h/2) > 2h/2
    

where ceiling(x) means the smallest integer that is >= x.

Upper Bound for the Height of an AVL tree[7] [top]

Let h be the maximum height for an AVL tree with N nodes. Then f(h) must be <= N.

  2h/2 < f(h) <= N

Taking log2 we get

      h/2 < log2(N)

      or

      h < 2log2(N)


    

So the height of an AVL tree with N nodes is O(log(N)).

Insertion into an AVL Tree[8] [top]

Maintaining the AVL Tree property, requires that the insert includes some code to rebalance portions of the tree if the property would otherwise be violated.

For example if we try to insert the value 525 into this tree in the usual way as a child of 550, the height of 550 would become 1, while the right child (empty) of 600 is still at height -1, a difference of 2.

So the node with 600 would become unbalanced!

The Rotate Methods[9] [top]

This problem is solved by:

      1. First insert in the usual way for a binary search tree either in
      the left subtree or right subtree. (Duplicates are not allowed in
      this version.)
      2. Check if the heights of the two children subtrees differs by 2.
      If so, then rotate nodes to restore the AVL properties.
    

Fix Unbalanced Node with rotateLeft[11] [top]

Returning to the example AVL tree, we try to insert 190:

The rotateLeft(Node t) method is called where t is the unbalanced Node (100 in this example) to rotate the t and its right child to the left.

The result is

What happened? How many links were changed?

What is the code for void rotateLeft(Node t)?

RotateLeft (Before Insertion)[12] [top]

      
                 100 [h=3]
                /   \
               /     \
      [h=1]   A      150 [h=2]
                    /   \
                   /     \
            [h=1] B       C [h=1]
    

Inserting 190 will make subtree C at 175 have height 2 with children of height 0 and 1
(175 will still be balanced).

This makes the subtree at 150 height 3 with children of height 1 and 2
(150 will still be balanced)

But this makes the subtree at 100 unbalanced. Its children will have heights 1 and 3
(100 will be unbalanced)!

RotateLeft (After Insertion)[13] [top]

      
           100 [h=3]                                           150 [h=3] 
          /   \                                               /   \                              
         /     \                                             /     \                             
  [h=1] A      150 [h=2 -> 3]            =>           [h=2] 100     C [h=2]  
              /   \                                        /   \                         
             /     \                                      /     \                        
      [h=1] B       C [h=1 -> 2]                   [h=1] A       B [h=1]
    

The rotateLeft method[14] [top]

Here is the method:

  private Node rotateLeft( Node t )
  {
    // Rotate left with p's right child
    Node r = ?;
    t.right = ?;
    r.left = ?;

    // Adjust the heights since the nodes have been rotated.
    t.height = max( height(p.left), height(p.right) ) + 1;
    r.height = max( height(r.left), height(r.right) ) + 1;

    return r;
  }

Note: The height method just returns the height of
the node passed to it, but also handles the case if null is
passed. In the later case, height returns -1.
    

Insertion in a Binary Search Tree (not balanced)[15] [top]


  Node * insert(const Key& k, const Value& v, Node *rt) {
    if (rt == NULL) {
      rt = new Node(k,v);
    } else if (k < rt->key) {
      rt->left = insert(k, v, rt->left);
    } else if (k > rt->key) {
      rt->right = insert(k, v, rt->right);
    }
    return rt;
  }

Insertion in an AVL Tree (balanced Binary Search Tree)[16] [top]


 Node * insert(const Key& k, const Value& v, Node *rt) {
    if (rt == NULL) {
      rt = new Node(k,v);
    } else if (k < rt->key) {
      rt->left = insert(k, v, rt->left);
      if ((height(rt->left) - height(rt->right)) > 1) {
        // balance ?? 
      
        if (k < rt->left->key) {
          rt = rotateRight(rt);
        } else {
          rt ->left = rotateLeft(rt->left);
          rt = rotateRight(rt);
        }
      

      }
    } else if (k > rt->key) {
      rt->right = insert(k, v, rt->right);
      if ((height(rt->right) - height(rt->left)) > 1) {
        // balance ??
      
        if( k > rt->->key) {
          rt = rotateLeft(rt);
        } else {
          rt->right = rotateRight(rt->right);
          rt = rotateLeft(rt);
        }
      
      }
    }
    rt->ht = max(height(rt->left), height(rt->right)) + 1;
    return rt;
  }

Insertion in an AVL (with balancing code)[17] [top]


 Node * insert(const Key& k, const Value& v, Node *rt) {
    if (rt == NULL) {
      rt = new Node(k,v);
    } else if (k < rt->key) {
      rt->left = insert(k, v, rt->left);
      if ((height(rt->left) - height(rt->right)) > 1) {
        // balance 
      
        if (k < rt->left->key) {
          rt = rotateRight(rt);
        } else {
          rt ->left = rotateLeft(rt->left);
          rt = rotateRight(rt);
        }
      

      }
    } else if (k > rt->key) {
      rt->right = insert(k, v, rt->right);
      if ((height(rt->right) - height(rt->left)) > 1) {
        // balance
      
        if( k > rt->right->key) {
          rt = rotateLeft(rt);
        } else {
          rt->right = rotateRight(rt->right);
          rt = rotateLeft(rt);
        }
      
      }
    }
    rt->ht = max(height(rt->left), height(rt->right)) + 1;
    return rt;
  }

Symbol Table Keys[18] [top]

Do the keys in a symbol table need to be comparable for...

Problem[19] [top]

Read dow jones stock closing data from a file (1895 to 2005) then be able to provide the closing average for any input date.

  1. What if the date is before 1895 or after 2005?
  2. What if the date is in the range 1895 to 2005, but was a weekend date or a holiday?

Two kinds of symbol tables[20] [top]

  1. Require keys to be comparable
  2. Do not require keys to be comparable

Symbol Tables with comparable keys[21] [top]

We want additional methods for these symbol tables:

      Key min();
      Key max();
      void deleteMin();
      void deleteMax();
      Key floor(const Key& key);
      Key ceiling(const Key& key);
      
    

The ceiling function[22] [top]

What is the specification of the ceiling function?

How would it help to handle the dow jones stock problem?

Implementation of the ceiling function[23] [top]

How would we implement ceiling for

  1. SequentialSearchST
  2. BinarySearchST
  3. BST
  4. avlBST