- In a binary search tree, each node has key & value, w/ ordering restriction to support efficient search
- Binary search tree (BST): Binary tree where each node has
Comparable
key (& associated value) & satisfies restriction that key in any node is larger than keys in all nodes in node's left subtree & smaller than keys in all nodes in node's right subtree
Basic Implementation
- BST represents set of keys (& associated values)
- many different BSTs that represent same set
Analysis
- Assume keys (uniformly) random → inserted in random order
- BSTs dual to quicksort
- Node at root of tree = first partitioning item in quicksort (no keys to left larger, no keys to right smaller)
- Subtrees built recursively, corresponds to quicksort's recursive subarray sorts
- Adding depths of all nodes = internal path length of tree
- Search hits in BST built from random keys require compares, on average
- = total internal path length of BST built from inserting randomly ordered distinct keys
- Insertions & search misses in BST built from random keys require compares, on average
- Insertions and search misses take 1 more compare, on average, than search hits
- = external path length
- = internal path length
- = # of internal nodes
- Need at least 1 extra up from for each node w/ at least 1 external child → if is leaf node w/ 2 external children, have to count second child by tracing nodes down from root → need extra up from
- Can consider recursively/inductively as well
- # of empty links or external nodes → → need to add 1 for root comparison →
- Like quicksort, standard deviation of # of compares known to be low → formulas increasingly accurate as ↑
Order-based Methods & Deletion
- Important reason BSTs widely used b/c allow keeping keys in order → basis for implementing ordered symbol-table API
Floor & Ceiling
- Floor of key = largest key in BST
key
- If given key
key
key at root of BST → floor of key
must be in left subtree
- If
key
key at root → floor of key
could be in right subtree
- In right subtree only if key
key
in right subtree
- If not (or if
key
= key at root) → key at root = floor of key
- Interchanging right & left (and & , & ) =
ceiling()
Delete
- Choice of using only successor as replacement = arbitrary & not symmetric
- Worthwhile to choose at random between predecessor & successor
Analysis
- Given tree → height determines worst-case of all BST operations (except range search, which incurs additional cost proportional to # of keys returned)
- Average height of BST built from random keys logarithmic → approaches for large
- BST & quicksort both rely on randomization for optimal performance
- Worst case for both when input is in (reverse) sorted order