Midterm 2 Notes
Interfaces
- Every method in interface must be
public
(public
by default) - Variables will be
public static final
, no instance variables for interfaces
Abstract Classes
abstract
keyword forabstract
methods- Can provide any kind of variables & methods
- Subclasses can only
extend
one (abstract
) class
Autoboxing
- Arrays never autoboxed/unboxed (e.g.
Integer[]
cannot be used in place ofint[]
& vice versa) - Can cast
int
toInteger
Promotion/Primitive Widening
- Similar thing happens when moving from primitive type w/ narrower range to wider range
- Value is promoted
double
wider thanint
→ can passint
as arg to method that declaresdouble
param
- To move from wider format to narrower format, must use casting
Immutability
final
helps compiler ensure immutability, not guarantee- Neither necessary nor sufficient for immutability
- Can assign value only once (in constructor of class or initializer)
- Declaring reference
final
does not make object referred to by reference immutablepublic final ArrayDeque<String> d = new ArrayDeque<>();
- Memory box
d
not allowed to point at any otherArrayDeque
, can't be changed to point at differentArrayDeque
- Referenced
ArrayDeque
itself can change
- Memory box
Generic Methods
- Types inferred from type of object passed in
Type Upper Bounds
- Can use
extends
keyword as type upper bound- Used as statement of fact, doesn't change definition/behavior of generic method parameter
Checked vs Unchecked Exceptions
- Any subclass of
RuntimeException
orError
is unchecked - All other
Throwable
s are checked
Iteration
- Instantiate non-static nested class (inner class) → use
instance.new
- Implement
Iterable
interface to support enhanced for loopiterator()
method must return object that implementsIterator
interface
Packages
- Cannot import/use/access code from default package from within different package
Access Control
- Access based only on static types
Access Control w/ Inheritance & Packages
protected
modifier allows package members & subclasses to use class member- Package private: no modifier → allows classes from same package, but not subclasses to access member
Access Levels
Modifier | Class | Package | Subclass | World |
---|---|---|---|---|
public |
Y | Y | Y | Y |
protected |
Y | Y | Y | N |
no modifier | Y | Y | N | N |
private |
Y | N | N | N |
Access Control at the Top Level
- Two levels:
public
, no modifier (package-private)- Can't declare top level class as
private
/protected
- Can't declare top level class as
- No such thing as a sub-package,
ug.joshh.Animal
&ug.joshh.Plant
= 2 completely different packages
.equals()
- Default implementation of
.equals()
uses==
- JUnit
assertEquals
uses.equals()
.equals()
parameter must takeObject
, cast to actual type w/in.equals()
method- Generally will need:
- Reference check
null
check- Class check w/
.getClass()
- Cast to same type
- Check fields
Asymptotics
- If function has runtime with order of growth :
- Running function on input size of 1000 & input size of 10000 may not be large enough to exhibit quadratic behavior (i.e. takes 100 times longer to run)
- notation may abstract away potentially very large constant factors that may initially entirely determine the overall runtime for small inputs
- Running function on input size of 1000 & input size of 10000 may not be large enough to exhibit quadratic behavior (i.e. takes 100 times longer to run)
- Use arithmetic/geometric sum formulas
- Arithmetic:
- Geometric:
- Infinite ():
- Arithmetic:
- May need to state what if not explicitly in problem
- If → false b/c actual could be scaled less than
Using Potentials for Amortization
- Amortized Analysis
- Associate potential w/ operation that tracks "saved up" time from cheap operations for spending on expensive ones, starting from
- Define cost of operation as
- On cheap operations, pick → ↑
- On expensive operations, pick
- Goals for
ArrayList
:- Requires choosing
- Cost for operations is 1 for non-powers of 2, & for powers of 2
- For high cost ops, need in bank, have previous operations to reach balance
ArrayList
amortized insertion/removal if utilizing geometric resizing
- On average, each op takes constant time, arrays = good lists
- Rigorously show by overestimating constant time of each operation & proving resulting potential never
Disjoint Sets
Implementation | Constructor | connect |
isConnected |
---|---|---|---|
QuickFindDS |
|||
QuickUnionDS |
|||
WeightedQuickUnionDS |
- Tree height = # of edges from root of deepest node
- Root has depth 0
Weighted Quick Union
- Modify quick-union to avoid tall trees
- Track tree size (number of elements), also works similarly for height
- Always link root of smaller tree to larger tree
- Max depth of any item is
- Depth of element
x
only increases when treeT1
that containsx
linked below other treeT2
- Occurs only when
weight(T2) >= weight(T1)
- Size of resulting tree is at least doubled
- Depth of element
x
incremented only whenweight(T2) >= weight(T1)
& resulting tree at least doubled in size
- Occurs only when
- Tree containing
x
can double in size at most times, when starting from just 1 element - Max depth starts at 0 for only 1 element (root) → increments at most times, max depth =
- Depth of element
Path Compression
- When doing
isConnected(15, 10)
, tie all nodes seen to the root- Additional cost insignificant (same order of growth)
- Kinda memoization/dynamic programming?
- Path compression results in union/connected operations very close to amortized constant time
- operations on nodes =
- Inverse of super exponentiation
- Tighter bound: , where is the inverse Ackermann function
Trees, BSTs
Tree
- Constraint: Exactly one path between any 2 nodes
BST
- Consequence of rules = no duplicate keys allowed in BST
- Random inserts take on average only each
- Insertion of random data yields bushy BST
- On random data, order of growth for get/put operations = logarithmic
- Randomly deleting and inserting from tree changes height from to
- Hibbard deletion results in order of growth
- Deletion not commutative
TreeMap
/TreeSet
BST
internally- Sorted/ordered
Map
- Self-balancing, height
- Insertion/removal/searching →
Balanced BSTs
B-Tree
- B-tree of order also called 2-3-4 tree (or 2-4 tree)
- # of children node can have, (e.g. 2-3-4 tree node may have 2, 3, or 4 children)
- B-tree of order also called 2-3 tree
Perfect Balance & Logarithmic Height
- Max # of splitting operations per insert:
- Time per insert/contains:
Tree Rotation
- Preserves search tree property
- Given arbitrarily unbalanced tree, sequence of rotations that will yield balanced tree
- Balanced search tree = tree w/in constant factor of 2
Left-Leaning Red Back Tree (LLRB)
- Every path from root to leaf has same # of black links
- Imposes balance on LLRB
- Black edges in LLRB connect 2-3 nodes in 2-3 tree
- 2-3 tree balanced on black edges → LLRB also balanced on black edges
- Guaranteed logarithmic performance for
insert
- Guaranteed logarithmic performance for
- Walking along red edges analogous to walking through elements of stuffed node in B-tree
- # of red edges used on any given path from root to bottom of tree constrained
- At most red edges for every black edge along path
- Height along any given path in red-black tree at most
- 2-3 tree (which is balanced), corresponding red-black tree that has depth
- Searching LLRB tree for key just like BST
- Red edges only matter in insertions
- Red edges just like black edges for searching
Maintaining Isometry Through Rotations
- isometry between 2-3 tree & LLRB
- Implementation of LLRB based on maintaining isometry
- When performing LLRB operations, pretend as if 2-3 tree
- Preservation of isometry involves tree rotations
Preserving Isometry After Addition/Insertion Operations
- Violations for 2-3 trees:
- Existence of 4-nodes
- Operations for fixing 2-3 tree violations:
- Splitting 4-node
- Violations for LLRBs:
- 2 red children
- 2 consecutive red links
- Right red child (wrong representation)
- Operations for fixing LLRB tree violations:
- Tree rotations & color flips
Summary
- 2-3 & 2-3-4 trees have perfect balance
- Height guaranteed logarithmic
- After
insert
/delete
→ at most 1 split operation per level of tree- Height logarithmic → splits
insert
/delete
- Hard to implement
- LLRBs mimic 2-3 tree behavior using color flipping & tree rotation
- Height guaranteed logarithmic
- After
insert
/delete
→ at most 1 color flip or rotation per level of tree- Height logarithmic → flips/rotations
insert
/delete
- Easier to implement, constant factor faster than 2-3 or 2-3-4 tree
Hashing
Hash Tables
- Never store mutable objects in
HashSet
orHashMap
- Never override
equals
w/out also overridinghashCode
Hash Functions
- Computing hash function consists of 2 steps:
- Compute
hashCode
(integer between & - Computing index =
hashCode
- Compute
Default hashCodes()
- All
Objects
havehashCode()
function - Default: returns
this
(address of object)
Negative .hashCode
s in Java
- In Java,
-1 % 4 == -1
→ useMath.floorMod
instead - When finding bucket to insert into, use
bucketNum = (o.hashCode() & 0x7FFFFFFF) % M
M
= # of buckets- Can flip negative bit or use
Math.floorMod
Good hashCodes()
- Wide variety/range of objects ("random")
- Deterministic
- If 2 objects equal by
.equals()
→ must have samehashCode()
- Efficiently computed
- Doesn't use mutable fields
.equals()
@Override
public boolean equals(Object other) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
SimpleOomage that = (SimpleOomage) o;
return (red == that.red) && (green == that.green) && (blue == that.blue); // check equality of fields
}
Summary
- W/ good
hashCode()
& resizing, operations are amortized - Store & retrieval does not require items to be
Comparable
(unlike (balanced) BST)
Data Structure | contains(x) |
insert(x) |
---|---|---|
Linked List | ||
Bushy BSTs (used by TreeSet ) |
||
Unordered Array | ||
Hash Table (used by HashSet ) |
Priority Queues & Heaps
Priority Queue Interface
/** (Min) Priority Queue: Allowing tracking & removal of smallest item in priority queue */
public interface MinPQ<Item> {
public void add(Item x);
public Item getSmallest();
public Item removeSmallest();
public int size();
}
- Only allows interaction w/ smallest item at any given time → better performance than
List
/Set
- Useful for keeping track of "smallest", "largest", "best", etc. seen so far
Heaps
- Binary min-heap: Binary tree that is complete & obeys min-heap property
- Min-heap property: Every node both its children
- Complete: Missing items only at bottom level (if any), all nodes as far left as possible
Insertion
- Add to end of heap temporarily
- Swim up hierarchy
Delete Min
- Swap last item in heap into root
- Sink down hierarchy (promote "better" successor)
Tree Representation
- Store keys in array, offset everything by 1 spot
- Leave spot 0 empty
- Makes computation of children/parents "nicer"
leftChild(k) = k * 2
rightChild(k) = k * 2 + 1
parent(k) = k / 2
BST → Heap
- Swim (heapify up) all nodes in level order (analogous to adding element to heap)
- Sink all nodes in reverse level order (analogous to removing element from heap)
Heap Implementation of a Priority Queue
Operation | Ordered Array | Bushy BST (items w/ same priority hard to handle) | Hash Table | Heap |
---|---|---|---|---|
add |
||||
getSmallest |
||||
removeSmallest |
- Position in tree/heap = priority
- Heap is time amortized (resize backing array)
- Worst case inserting items:
- Averaged over items → →
- Worst case inserting items:
- BST can have constant
getSmallest
by keeping pointer to smallest - Heaps handle duplicate priorities much more naturally than BSTs
- Array based heaps take less memory
Data Structures Summary
Search Data Structures
Data Structure | Storage Operation(s) | Primary Retrieval Operation | Retrieve By: |
---|---|---|---|
List | add(key) insert(key, index) |
get(index) |
index |
Map | put(key, value) |
get(key) |
key identity |
Set | add(key) |
containsKey(key) |
key identity |
PQ | add(key) |
getSmallest() |
key order/size |
Disjoint Sets | conenct(int1, int2) |
isConnected(int1, int2) |
2 int values |
Advanced Trees, including Geometric
Traversals
Tree Traversal
- Level Order
- Traverse top-to-bottom, left-to-right
- Nodes "visited" in given order
- Depth First Traversals
- Preorder (root, left, right), Inorder (left, root, right), Postorder (left, right, root)
- Level Order Traversal
- Iterative Deepening
- Runtime b/c each new level doubles amount of work done & # of nodes visited
- Exponential in height of tree, → height logarithmic in # of nodes, → overall runtime linear in # of nodes,
- Spindly Runtime
- Tree Height & Runtime
- Runtime b/c each new level doubles amount of work done & # of nodes visited
- Iterative Deepening
Range Finding
- Instead of traversing entire tree, only want to look at items in certain range
Pruning & findInRange
Runtime
- Pruning: Restrict search to only nodes containing path(s) to answer(s)
- Runtime
Spatial Trees
Handling Multidimensional Data: Quadtrees
- Generalization of BST
- Think of items as oriented in particular 2D direction (e.g NW/NE/SW/SE instead of binary </>)
- Can have mutliple different quadtrees for same set of data
- Just like BST, insertion order determines topology of QuadTree
- If on boundary line, usually assume = → >
- Pruning
- Analogous to binary search on BST (eliminate unnecessary search spaces)
- Ignore branches if no possible way branch could contain wanted item
Graphs
Graph Types
- Directed: Edges have notion of direction (one-way)
- Acyclic: No cycles (lead back to start)
- Cyclic:
- Undirected: No notion of directionality, can traverse either way (bidirectional?)
- Acyclic: No cycles (lead back to start) w/out reusing any edges
- Cyclic:
- Any graph w/ cycle = cyclic
- If not → acyclic
- Terminology
- Graph Processing Problems
Graph Representations
- Common Simplification
- Degree = # of edges incident on graph node
- Graph API
- Adjacency Matrix
- Runtime
- Rows = start nodes, columns = end nodes
- Edge Sets
- Adjacency Lists
- Runtime
- If sparse graph (few edges) →
- If dense graph (many edges) →
- Bare-Bones Undirected Graph Implementation
- Runtime
- Summary
Depth First Traversal
- Implementation
- Properties
- Guaranteed to reach every reachable node
- Runs in time
- Every edge used at most once, total # of vertex considerations = # of edges
- # of times need to consider vertex # of edges incident on it
- May be faster for some problems which quit early on some stopping condition (e.g. connectivity)
- Every edge used at most once, total # of vertex considerations = # of edges
- Postorder = return order
Graph Traversals
Graph Problems
- Runtime
- Each vertex visited exactly once, each visit = constant time
- Each edge considered once
- Space
- Recursive call stack depth
- Space of recursive algorithm = depth of call stack
Graph Traversals
- Level-order/BFS for vertices at same distance from source can be in any permutation
- If multiple paths to a vertex, BFS always visits based on closest path
- Traversals & Graph Problems
- Detect cycle by running DFS → cycle if node already on active call stack encountered, runtime
Topological Sorting
- Indegree 0 vertices → , have to look at lists w/ total length
- Only works if graph is acyclic → if cyclic b/c can't have circular dependencies
- No indegree 0 vertices → cyclic →
- Topological ordering only possible iff graph is directed acyclic graph (no directed cycles)
- Reverse postorder
- Linearizes graph
- Graph Problems
Implementation
public class DepthFirstOrder {
private boolean[] marked;
private Stack<Integer> reversePostorder; // using stack analogous to reversing list b/c LIFO
public DepthFirstOrder(Digraph G) {
reversePostorder = new Stack<>();
marked = new boolean[G.V()];
for (int v = 0; v < G.V(); v++) {
if (!marked[v]) {
dfs(G, v);
}
}
}
private void dfs(Digraph G, int v) {
marked[v] = true;
for (int w : G.adj(v)) {
if (!marked[w]) {
dfs(G, w);
}
}
reversePostorder.push(v); // after each DFS is done, "visit" vertex by pushing on stack
}
}
- Works even when starting from vertices not indegree 0
Breadth First Search
Regular Expressions
- Introducing the Regular Expression
[]
= set
Regular Expressions in Java
- Default
String
methodmatches
matches entireString
, but not substrings group(0)
= first match of entire pattern, entire matchgroup(i)
,i > 0
= parenthesized groupings
Java
Maps
Map
returnsnull
when callingget
on key not inMap
Map
can have keys w/null
values