Midterm 2 Notes

Interfaces

  • Every method in interface must be public (public by default)
  • Variables will be public static final, no instance variables for interfaces

Abstract Classes

  • abstract keyword for abstract methods
  • Can provide any kind of variables & methods
  • Subclasses can only extend one (abstract) class

Autoboxing

  • Arrays never autoboxed/unboxed (e.g. Integer[] cannot be used in place of int[] & vice versa)
  • Can cast int to Integer

Promotion/Primitive Widening

  • Similar thing happens when moving from primitive type w/ narrower range to wider range
    • Value is promoted
    • double wider than int → can pass int as arg to method that declares double param
  • To move from wider format to narrower format, must use casting

Immutability

  • final helps compiler ensure immutability, not guarantee
    • Neither necessary nor sufficient for immutability
    • Can assign value only once (in constructor of class or initializer)
  • Declaring reference final does not make object referred to by reference immutable
    • public final ArrayDeque<String> d = new ArrayDeque<>();
      • Memory box d not allowed to point at any other ArrayDeque, can't be changed to point at different ArrayDeque
      • Referenced ArrayDeque itself can change

Generic Methods

  • Types inferred from type of object passed in

Type Upper Bounds

  • Can use extends keyword as type upper bound
    • Used as statement of fact, doesn't change definition/behavior of generic method parameter

Checked vs Unchecked Exceptions

  • Any subclass of RuntimeException or Error is unchecked
  • All other Throwables are checked

Iteration

  • Instantiate non-static nested class (inner class) → use instance.new
  • Implement Iterable interface to support enhanced for loop
    • iterator() method must return object that implements Iterator interface

Packages

  • Cannot import/use/access code from default package from within different package

Access Control

  • Access based only on static types

Access Control w/ Inheritance & Packages

  • protected modifier allows package members & subclasses to use class member
  • Package private: no modifier → allows classes from same package, but not subclasses to access member

Access Levels

Modifier Class Package Subclass World
public Y Y Y Y
protected Y Y Y N
no modifier Y Y N N
private Y N N N

Access Control at the Top Level

  • Two levels: public, no modifier (package-private)
    • Can't declare top level class as private/protected
  • No such thing as a sub-package, ug.joshh.Animal & ug.joshh.Plant = 2 completely different packages

.equals()

  • Default implementation of .equals() uses ==
  • JUnit assertEquals uses .equals()
  • .equals() parameter must take Object, cast to actual type w/in .equals() method
  • Generally will need:
    • Reference check
    • null check
    • Class check w/ .getClass()
    • Cast to same type
    • Check fields

Asymptotics

  • If function has runtime with order of growth :
    • Running function on input size of 1000 & input size of 10000 may not be large enough to exhibit quadratic behavior (i.e. takes 100 times longer to run)
      • notation may abstract away potentially very large constant factors that may initially entirely determine the overall runtime for small inputs
  • Use arithmetic/geometric sum formulas
    • Arithmetic:
    • Geometric:
      • Infinite ():
  • May need to state what if not explicitly in problem
  • If → false b/c actual could be scaled less than

Using Potentials for Amortization

  • Amortized Analysis
  • Associate potential w/ operation that tracks "saved up" time from cheap operations for spending on expensive ones, starting from
  • Define cost of operation as
  • On cheap operations, pick
  • On expensive operations, pick
  • Goals for ArrayList:
    • Requires choosing
    • Cost for operations is 1 for non-powers of 2, & for powers of 2
    • For high cost ops, need in bank, have previous operations to reach balance
    • ArrayList amortized insertion/removal if utilizing geometric resizing
  • On average, each op takes constant time, arrays = good lists
    • Rigorously show by overestimating constant time of each operation & proving resulting potential never

Disjoint Sets

Implementation Constructor connect isConnected
QuickFindDS
QuickUnionDS
WeightedQuickUnionDS
  • Tree height = # of edges from root of deepest node
    • Root has depth 0

Weighted Quick Union

  • Modify quick-union to avoid tall trees
    • Track tree size (number of elements), also works similarly for height
    • Always link root of smaller tree to larger tree
  • Max depth of any item is
    • Depth of element x only increases when tree T1 that contains x linked below other tree T2
      • Occurs only when weight(T2) >= weight(T1)
      • Size of resulting tree is at least doubled
      • Depth of element x incremented only when weight(T2) >= weight(T1) & resulting tree at least doubled in size
    • Tree containing x can double in size at most times, when starting from just 1 element
    • Max depth starts at 0 for only 1 element (root) → increments at most times, max depth =

Path Compression

  • When doing isConnected(15, 10), tie all nodes seen to the root
    • Additional cost insignificant (same order of growth)
  • Kinda memoization/dynamic programming?
  • Path compression results in union/connected operations very close to amortized constant time
  • operations on nodes =
  • Tighter bound: , where is the inverse Ackermann function



Trees, BSTs

Tree

  • Constraint: Exactly one path between any 2 nodes

BST

  • Consequence of rules = no duplicate keys allowed in BST
  • Random inserts take on average only each
  • Insertion of random data yields bushy BST
    • On random data, order of growth for get/put operations = logarithmic
  • Randomly deleting and inserting from tree changes height from to
    • Hibbard deletion results in order of growth
  • Deletion not commutative

TreeMap/TreeSet

  • BST internally
  • Sorted/ordered Map
  • Self-balancing, height
  • Insertion/removal/searching →

Balanced BSTs

B-Tree

  • B-tree of order also called 2-3-4 tree (or 2-4 tree)
    • # of children node can have, (e.g. 2-3-4 tree node may have 2, 3, or 4 children)
  • B-tree of order also called 2-3 tree

Perfect Balance & Logarithmic Height

  • Max # of splitting operations per insert:
    • Time per insert/contains:

Tree Rotation

  • Preserves search tree property
  • Given arbitrarily unbalanced tree, sequence of rotations that will yield balanced tree
  • Balanced search tree = tree w/in constant factor of 2

Left-Leaning Red Back Tree (LLRB)

  • Every path from root to leaf has same # of black links
    • Imposes balance on LLRB
    • Black edges in LLRB connect 2-3 nodes in 2-3 tree
    • 2-3 tree balanced on black edges → LLRB also balanced on black edges
      • Guaranteed logarithmic performance for insert
  • Walking along red edges analogous to walking through elements of stuffed node in B-tree
  • # of red edges used on any given path from root to bottom of tree constrained
  • At most red edges for every black edge along path
    • Height along any given path in red-black tree at most
    • 2-3 tree (which is balanced), corresponding red-black tree that has depth
  • Searching LLRB tree for key just like BST
    • Red edges only matter in insertions
    • Red edges just like black edges for searching

Maintaining Isometry Through Rotations

  • isometry between 2-3 tree & LLRB
  • Implementation of LLRB based on maintaining isometry
  • When performing LLRB operations, pretend as if 2-3 tree
  • Preservation of isometry involves tree rotations

Preserving Isometry After Addition/Insertion Operations

  • Violations for 2-3 trees:
    • Existence of 4-nodes
  • Operations for fixing 2-3 tree violations:
    • Splitting 4-node
  • Violations for LLRBs:
    • 2 red children
    • 2 consecutive red links
    • Right red child (wrong representation)
  • Operations for fixing LLRB tree violations:
    • Tree rotations & color flips

Summary

  • 2-3 & 2-3-4 trees have perfect balance
    • Height guaranteed logarithmic
    • After insert/delete → at most 1 split operation per level of tree
      • Height logarithmic → splits
      • insert/delete
    • Hard to implement
  • LLRBs mimic 2-3 tree behavior using color flipping & tree rotation
    • Height guaranteed logarithmic
    • After insert/delete → at most 1 color flip or rotation per level of tree
      • Height logarithmic → flips/rotations
      • insert/delete
    • Easier to implement, constant factor faster than 2-3 or 2-3-4 tree

Hashing

Hash Tables

  • Never store mutable objects in HashSet or HashMap
  • Never override equals w/out also overriding hashCode

Hash Functions

  • Computing hash function consists of 2 steps:
    1. Compute hashCode (integer between &
    2. Computing index = hashCode

Default hashCodes()

  • All Objects have hashCode() function
  • Default: returns this (address of object)

Negative .hashCodes in Java

  • In Java, -1 % 4 == -1 → use Math.floorMod instead
  • When finding bucket to insert into, use bucketNum = (o.hashCode() & 0x7FFFFFFF) % M
    • M = # of buckets
    • Can flip negative bit or use Math.floorMod

Good hashCodes()

  • Wide variety/range of objects ("random")
  • Deterministic
  • If 2 objects equal by .equals() → must have same hashCode()
  • Efficiently computed
  • Doesn't use mutable fields

.equals()

@Override
public boolean equals(Object other) {
    if (this == o) {
        return true;
    }

    if (o == null || getClass() != o.getClass()) {
        return false;
    }

    SimpleOomage that = (SimpleOomage) o;

    return (red == that.red) && (green == that.green) && (blue == that.blue); // check equality of fields
}

Summary

  • W/ good hashCode() & resizing, operations are amortized
  • Store & retrieval does not require items to be Comparable (unlike (balanced) BST)
Data Structure contains(x) insert(x)
Linked List
Bushy BSTs (used by TreeSet)
Unordered Array
Hash Table (used by HashSet)

Priority Queues & Heaps

Priority Queue Interface

/** (Min) Priority Queue: Allowing tracking & removal of smallest item in priority queue */
public interface MinPQ<Item> {
    public void add(Item x);
    public Item getSmallest();
    public Item removeSmallest();
    public int size();
}
  • Only allows interaction w/ smallest item at any given time → better performance than List/Set
  • Useful for keeping track of "smallest", "largest", "best", etc. seen so far

Heaps

  • Binary min-heap: Binary tree that is complete & obeys min-heap property
    • Min-heap property: Every node both its children
    • Complete: Missing items only at bottom level (if any), all nodes as far left as possible

Insertion

  • Add to end of heap temporarily
  • Swim up hierarchy

Delete Min

  • Swap last item in heap into root
  • Sink down hierarchy (promote "better" successor)

Tree Representation

  • Store keys in array, offset everything by 1 spot
  • Leave spot 0 empty
  • Makes computation of children/parents "nicer"
    • leftChild(k) = k * 2
    • rightChild(k) = k * 2 + 1
    • parent(k) = k / 2

BST → Heap

  • Swim (heapify up) all nodes in level order (analogous to adding element to heap)
  • Sink all nodes in reverse level order (analogous to removing element from heap)

Heap Implementation of a Priority Queue

Operation Ordered Array Bushy BST (items w/ same priority hard to handle) Hash Table Heap
add
getSmallest
removeSmallest
  • Position in tree/heap = priority
  • Heap is time amortized (resize backing array)
    • Worst case inserting items:
      • Averaged over items →
  • BST can have constant getSmallest by keeping pointer to smallest
  • Heaps handle duplicate priorities much more naturally than BSTs
  • Array based heaps take less memory

Data Structures Summary

Search Data Structures

Data Structure Storage Operation(s) Primary Retrieval Operation Retrieve By:
List add(key)
insert(key, index)
get(index) index
Map put(key, value) get(key) key identity
Set add(key) containsKey(key) key identity
PQ add(key) getSmallest() key order/size
Disjoint Sets conenct(int1, int2) isConnected(int1, int2) 2 int values

Advanced Trees, including Geometric

Traversals

Tree Traversal

Range Finding

  • Instead of traversing entire tree, only want to look at items in certain range

Pruning & findInRange Runtime

  • Pruning: Restrict search to only nodes containing path(s) to answer(s)
  • Runtime

Spatial Trees

Handling Multidimensional Data: Quadtrees

  • Generalization of BST
  • Think of items as oriented in particular 2D direction (e.g NW/NE/SW/SE instead of binary </>)
  • Can have mutliple different quadtrees for same set of data
    • Just like BST, insertion order determines topology of QuadTree
  • If on boundary line, usually assume = → >
  • Pruning
    • Analogous to binary search on BST (eliminate unnecessary search spaces)
    • Ignore branches if no possible way branch could contain wanted item

Graphs

Graph Types

  • Directed: Edges have notion of direction (one-way)
    • Acyclic: No cycles (lead back to start)
    • Cyclic:
  • Undirected: No notion of directionality, can traverse either way (bidirectional?)
    • Acyclic: No cycles (lead back to start) w/out reusing any edges
    • Cyclic:
  • Any graph w/ cycle = cyclic
    • If not → acyclic
  • Terminology
  • Graph Processing Problems

Graph Representations

Depth First Traversal

  • Implementation
  • Properties
    • Guaranteed to reach every reachable node
    • Runs in time
      • Every edge used at most once, total # of vertex considerations = # of edges
        • # of times need to consider vertex # of edges incident on it
        • May be faster for some problems which quit early on some stopping condition (e.g. connectivity)
  • Postorder = return order

Graph Traversals

Graph Problems

  • Runtime
    • Each vertex visited exactly once, each visit = constant time
    • Each edge considered once
  • Space
    • Recursive call stack depth
    • Space of recursive algorithm = depth of call stack

Graph Traversals

  • Level-order/BFS for vertices at same distance from source can be in any permutation
  • If multiple paths to a vertex, BFS always visits based on closest path
  • Traversals & Graph Problems
    • Detect cycle by running DFS → cycle if node already on active call stack encountered, runtime

Topological Sorting

  • Indegree 0 vertices → , have to look at lists w/ total length
  • Only works if graph is acyclic → if cyclic b/c can't have circular dependencies
    • No indegree 0 vertices → cyclic →
  • Topological ordering only possible iff graph is directed acyclic graph (no directed cycles)
  • Reverse postorder
  • Linearizes graph
  • Graph Problems

Implementation

public class DepthFirstOrder {

    private boolean[] marked;
    private Stack<Integer> reversePostorder; // using stack analogous to reversing list b/c LIFO

    public DepthFirstOrder(Digraph G) {
        reversePostorder = new Stack<>();
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++) {
            if (!marked[v]) {
                dfs(G, v);
            }
        }
    }

    private void dfs(Digraph G, int v) {
        marked[v] = true;
        for (int w : G.adj(v)) {
            if (!marked[w]) {
                dfs(G, w);
            }
        }
        reversePostorder.push(v); // after each DFS is done, "visit" vertex by pushing on stack
    }
}
  • Works even when starting from vertices not indegree 0

Regular Expressions

Regular Expressions in Java

  • Default String method matches matches entire String, but not substrings
  • group(0) = first match of entire pattern, entire match
    • group(i), i > 0 = parenthesized groupings

Java

Maps

  • Map returns null when calling get on key not in Map
  • Map can have keys w/ null values

results matching ""

    No results matching ""