Non-Linear Data Structures
Hierarchical and networked data representations.
Trees: binary trees and BSTs
A tree is a hierarchical non-linear data structure consisting of nodes connected by edges. Unlike linear structures (arrays, linked lists), trees represent hierarchical relationships.
Tree Terminology:
- Root: The topmost node (no parent)
- Parent: A node with children
- Child: A node connected below another node
- Leaf: A node with no children
- Edge: Connection between two nodes
- Path: Sequence of nodes connected by edges
- Height: Length of longest path from root to leaf
- Depth/Level: Distance from root to a node
- Subtree: Tree formed by a node and its descendants
Example Tree:
1 (root, height=2, depth=0)
/ \
2 3 (depth=1)
/ \ \
4 5 6 (leaves, depth=2)
Binary Tree:
A binary tree is a tree where each node has at most two children (left and right).
Types of Binary Trees:
- Full Binary Tree: Every node has 0 or 2 children
1 / \ 2 3 / \ 4 5 - Complete Binary Tree: All levels filled except possibly the last, which fills left to right
1 / \ 2 3 / \ / 4 5 6 - Perfect Binary Tree: All internal nodes have 2 children, all leaves at same level
1 / \ 2 3 / \ / \ 4 5 6 7 - Degenerate/Skewed Tree: Each node has only one child (essentially a linked list)
1 \ 2 \ 3
Binary Tree Implementation:
class TreeNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
class BinaryTree:
def __init__(self):
self.root = None
def insert_left(self, parent, value):
"""Insert node as left child"""
if parent is None:
return None
new_node = TreeNode(value)
parent.left = new_node
return new_node
def insert_right(self, parent, value):
"""Insert node as right child"""
if parent is None:
return None
new_node = TreeNode(value)
parent.right = new_node
return new_node
# Usage:
tree = BinaryTree()
tree.root = TreeNode(1)
left_child = tree.insert_left(tree.root, 2)
right_child = tree.insert_right(tree.root, 3)
tree.insert_left(left_child, 4)
tree.insert_right(left_child, 5)
Binary Search Tree (BST):
A BST is a binary tree with an ordering property:
- Left subtree contains only nodes with values less than the parent
- Right subtree contains only nodes with values greater than the parent
- Both subtrees are also BSTs
Example BST:
8
/ \
3 10
/ \ \
1 6 14
/ \ /
4 7 13
Property: 1 < 3 < 4 < 6 < 7 < 8 < 10 < 13 < 14
BST Operations:
1. Search - O(h) where h is height:
def search(root, target):
"""Search for a value in BST"""
if root is None or root.value == target:
return root
if target < root.value:
return search(root.left, target) # Search left
else:
return search(root.right, target) # Search right
# Iterative version:
def search_iterative(root, target):
current = root
while current is not None:
if target == current.value:
return current
elif target < current.value:
current = current.left
else:
current = current.right
return None
2. Insert - O(h):
def insert(root, value):
"""Insert value into BST"""
if root is None:
return TreeNode(value)
if value < root.value:
root.left = insert(root.left, value)
elif value > root.value:
root.right = insert(root.right, value)
# If value == root.value, do nothing (no duplicates)
return root
# Usage:
root = None
root = insert(root, 8)
root = insert(root, 3)
root = insert(root, 10)
root = insert(root, 1)
3. Delete - O(h):
def delete(root, value):
"""Delete value from BST"""
if root is None:
return None
# Find the node to delete
if value < root.value:
root.left = delete(root.left, value)
elif value > root.value:
root.right = delete(root.right, value)
else:
# Node found! Three cases:
# Case 1: Leaf node (no children)
if root.left is None and root.right is None:
return None
# Case 2: One child
if root.left is None:
return root.right
if root.right is None:
return root.left
# Case 3: Two children
# Find inorder successor (min value in right subtree)
successor = find_min(root.right)
root.value = successor.value
root.right = delete(root.right, successor.value)
return root
def find_min(root):
"""Find minimum value node"""
while root.left is not None:
root = root.left
return root
def find_max(root):
"""Find maximum value node"""
while root.right is not None:
root = root.right
return root
4. Find Min/Max - O(h):
# Min: leftmost node
# Max: rightmost node
BST Properties:
- Inorder traversal gives sorted sequence
- Average case: O(log n) for search, insert, delete
- Worst case: O(n) when tree becomes skewed
- Space: O(n) for storing n nodes
BST vs Array vs Linked List:
| Operation | Sorted Array | Linked List | BST (balanced) |
|---|---|---|---|
| Search | O(log n) | O(n) | O(log n) |
| Insert | O(n) | O(1)* | O(log n) |
| Delete | O(n) | O(1)* | O(log n) |
| Min/Max | O(1) | O(n) | O(log n) |
| Sorted order | Built-in | No | Inorder traversal |
*assuming position is known
Applications of BST:
- Databases: Indexing for fast queries
- File systems: Directory structures
- Expression trees: Compiler design
- Symbol tables: Compiler implementation
- Priority queues: When dynamic updates needed
Example: Building and Using a BST:
class BST:
def __init__(self):
self.root = None
def insert(self, value):
self.root = self._insert_recursive(self.root, value)
def _insert_recursive(self, node, value):
if node is None:
return TreeNode(value)
if value < node.value:
node.left = self._insert_recursive(node.left, value)
elif value > node.value:
node.right = self._insert_recursive(node.right, value)
return node
def search(self, value):
return self._search_recursive(self.root, value)
def _search_recursive(self, node, value):
if node is None or node.value == value:
return node
if value < node.value:
return self._search_recursive(node.left, value)
return self._search_recursive(node.right, value)
def inorder(self):
"""Return sorted values"""
result = []
self._inorder_recursive(self.root, result)
return result
def _inorder_recursive(self, node, result):
if node:
self._inorder_recursive(node.left, result)
result.append(node.value)
self._inorder_recursive(node.right, result)
# Usage:
bst = BST()
values = [8, 3, 10, 1, 6, 14, 4, 7, 13]
for val in values:
bst.insert(val)
print(bst.inorder()) # [1, 3, 4, 6, 7, 8, 10, 13, 14] - sorted!
print(bst.search(6) is not None) # True
print(bst.search(5) is not None) # False
Common BST Problems:
- Validate if a tree is a BST
- Find kth smallest/largest element
- Lowest common ancestor
- Convert sorted array to balanced BST
- Find range of values in BST
Tree traversals: inorder/preorder/postorder
Tree traversal is the process of visiting all nodes in a tree in a specific order. There are two main categories: Depth-First Search (DFS) and Breadth-First Search (BFS).
Depth-First Traversals (DFS):
Visit children before siblings. Three main types:
1. Inorder Traversal (Left → Root → Right):
Visit left subtree, then root, then right subtree.
Tree: 4
/ \
2 6
/ \ / \
1 3 5 7
Inorder: 1, 2, 3, 4, 5, 6, 7 (sorted for BST!)
def inorder(root):
"""Inorder traversal: Left → Root → Right"""
if root is None:
return []
result = []
result.extend(inorder(root.left)) # Left
result.append(root.value) # Root
result.extend(inorder(root.right)) # Right
return result
# Iterative version using stack:
def inorder_iterative(root):
result = []
stack = []
current = root
while current or stack:
# Go to leftmost node
while current:
stack.append(current)
current = current.left
# Process node
current = stack.pop()
result.append(current.value)
# Visit right subtree
current = current.right
return result
Use cases:
- Get sorted values from BST
- Expression tree evaluation (infix notation)
2. Preorder Traversal (Root → Left → Right):
Visit root first, then left subtree, then right subtree.
Tree: 4
/ \
2 6
/ \ / \
1 3 5 7
Preorder: 4, 2, 1, 3, 6, 5, 7
def preorder(root):
"""Preorder traversal: Root → Left → Right"""
if root is None:
return []
result = []
result.append(root.value) # Root
result.extend(preorder(root.left)) # Left
result.extend(preorder(root.right)) # Right
return result
# Iterative version:
def preorder_iterative(root):
if root is None:
return []
result = []
stack = [root]
while stack:
node = stack.pop()
result.append(node.value)
# Push right first (so left is processed first)
if node.right:
stack.append(node.right)
if node.left:
stack.append(node.left)
return result
Use cases:
- Copy a tree
- Get prefix expression (prefix notation)
- Serialize/deserialize a tree
3. Postorder Traversal (Left → Right → Root):
Visit left subtree, then right subtree, then root.
Tree: 4
/ \
2 6
/ \ / \
1 3 5 7
Postorder: 1, 3, 2, 5, 7, 6, 4
def postorder(root):
"""Postorder traversal: Left → Right → Root"""
if root is None:
return []
result = []
result.extend(postorder(root.left)) # Left
result.extend(postorder(root.right)) # Right
result.append(root.value) # Root
return result
# Iterative version (more complex):
def postorder_iterative(root):
if root is None:
return []
result = []
stack = []
last_visited = None
current = root
while current or stack:
# Go to leftmost node
while current:
stack.append(current)
current = current.left
peek_node = stack[-1]
# If right child exists and not yet processed
if peek_node.right and last_visited != peek_node.right:
current = peek_node.right
else:
result.append(peek_node.value)
last_visited = stack.pop()
return result
Use cases:
- Delete a tree (delete children before parent)
- Get postfix expression (postfix notation)
- Calculate directory sizes
Breadth-First Traversal (BFS / Level-Order):
Visit nodes level by level, left to right.
Tree: 4
/ \
2 6
/ \ / \
1 3 5 7
Level-order: 4, 2, 6, 1, 3, 5, 7
from collections import deque
def level_order(root):
"""Level-order traversal (BFS)"""
if root is None:
return []
result = []
queue = deque([root])
while queue:
node = queue.popleft()
result.append(node.value)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
return result
# Level-by-level (grouped):
def level_order_grouped(root):
"""Return nodes grouped by level"""
if root is None:
return []
result = []
queue = deque([root])
while queue:
level_size = len(queue)
current_level = []
for _ in range(level_size):
node = queue.popleft()
current_level.append(node.value)
if node.left:
queue.append(node.left)
if node.right:
queue.append(node.right)
result.append(current_level)
return result
# Returns: [[4], [2, 6], [1, 3, 5, 7]]
Use cases:
- Find shortest path in unweighted tree
- Level-order processing
- Check if tree is complete
- Print tree level by level
Traversal Comparison:
| Traversal | Order | Use Case | Data Structure |
|---|---|---|---|
| Inorder | Left-Root-Right | Get sorted values (BST) | Stack (recursion) |
| Preorder | Root-Left-Right | Copy tree, serialize | Stack |
| Postorder | Left-Right-Root | Delete tree, calculate size | Stack |
| Level-order | Level by level | Shortest path, level processing | Queue |
Visual Summary:
Tree: 1
/ \
2 3
/ \
4 5
Preorder: 1 → 2 → 4 → 5 → 3 (DLR: Data-Left-Right)
Inorder: 4 → 2 → 5 → 1 → 3 (LDR: Left-Data-Right)
Postorder: 4 → 5 → 2 → 3 → 1 (LRD: Left-Right-Data)
Level: 1 → 2 → 3 → 4 → 5 (Top-to-bottom, left-to-right)
Time and Space Complexity:
- Time: O(n) - visit each node once
- Space:
- Recursion: O(h) for call stack (h = height)
- Iterative: O(h) for explicit stack/queue
- Level-order: O(w) where w = max width
Practical Example: Expression Tree:
"""
Expression: (3 + 5) * 2
Tree: *
/ \
+ 2
/ \
3 5
Inorder (infix): 3 + 5 * 2 (need parentheses)
Preorder (prefix): * + 3 5 2
Postorder (postfix): 3 5 + 2 *
"""
def evaluate_expression_tree(root):
"""Evaluate arithmetic expression tree"""
if root is None:
return 0
# Leaf node (operand)
if root.left is None and root.right is None:
return int(root.value)
# Evaluate subtrees
left_val = evaluate_expression_tree(root.left)
right_val = evaluate_expression_tree(root.right)
# Apply operator (postorder: process children first!)
if root.value == '+':
return left_val + right_val
elif root.value == '-':
return left_val - right_val
elif root.value == '*':
return left_val * right_val
elif root.value == '/':
return left_val / right_val
# This uses postorder traversal internally!
Heaps: min-heap and max-heap
A heap is a specialized tree-based data structure that satisfies the heap property. It’s typically implemented as a complete binary tree stored in an array.
Heap Property:
- Min-Heap: Parent ≤ Children (smallest at root)
- Max-Heap: Parent ≥ Children (largest at root)
Why Heaps?
- Efficiently find min/max: O(1)
- Efficiently insert/delete: O(log n)
- Foundation for priority queues
- Used in heapsort algorithm
Min-Heap Example:
1
/ \
3 2
/ \ / \
7 5 4 6
Array representation: [1, 3, 2, 7, 5, 4, 6]
Property: Parent ≤ Children at every node
Max-Heap Example:
10
/ \
7 9
/ \ / \
3 5 6 8
Array representation: [10, 7, 9, 3, 5, 6, 8]
Property: Parent ≥ Children at every node
Array Implementation:
For a node at index i:
- Left child:
2*i + 1 - Right child:
2*i + 2 - Parent:
(i - 1) // 2
class MinHeap:
def __init__(self):
self.heap = []
def parent(self, i):
return (i - 1) // 2
def left_child(self, i):
return 2 * i + 1
def right_child(self, i):
return 2 * i + 2
def swap(self, i, j):
self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
def insert(self, value):
"""Insert value - O(log n)"""
# Add to end
self.heap.append(value)
# Bubble up to maintain heap property
current = len(self.heap) - 1
while current > 0:
parent = self.parent(current)
if self.heap[current] < self.heap[parent]:
self.swap(current, parent)
current = parent
else:
break
def extract_min(self):
"""Remove and return minimum - O(log n)"""
if len(self.heap) == 0:
raise IndexError("Heap is empty")
if len(self.heap) == 1:
return self.heap.pop()
# Store minimum
min_value = self.heap[0]
# Move last element to root
self.heap[0] = self.heap.pop()
# Bubble down to maintain heap property
self._heapify_down(0)
return min_value
def _heapify_down(self, i):
"""Restore heap property by bubbling down"""
size = len(self.heap)
while True:
smallest = i
left = self.left_child(i)
right = self.right_child(i)
# Find smallest among node and its children
if left < size and self.heap[left] < self.heap[smallest]:
smallest = left
if right < size and self.heap[right] < self.heap[smallest]:
smallest = right
# If node is in correct position, done
if smallest == i:
break
# Swap and continue
self.swap(i, smallest)
i = smallest
def peek(self):
"""Return minimum without removing - O(1)"""
if len(self.heap) == 0:
raise IndexError("Heap is empty")
return self.heap[0]
def size(self):
return len(self.heap)
def is_empty(self):
return len(self.heap) == 0
# Usage:
min_heap = MinHeap()
min_heap.insert(5)
min_heap.insert(3)
min_heap.insert(7)
min_heap.insert(1)
print(min_heap.peek()) # 1 (minimum)
print(min_heap.extract_min()) # 1
print(min_heap.peek()) # 3 (new minimum)
Max-Heap Implementation:
class MaxHeap:
"""Similar to MinHeap but with opposite comparisons"""
def __init__(self):
self.heap = []
def parent(self, i):
return (i - 1) // 2
def left_child(self, i):
return 2 * i + 1
def right_child(self, i):
return 2 * i + 2
def swap(self, i, j):
self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
def insert(self, value):
"""Insert value - O(log n)"""
self.heap.append(value)
current = len(self.heap) - 1
# Bubble up (compare with > instead of <)
while current > 0:
parent = self.parent(current)
if self.heap[current] > self.heap[parent]:
self.swap(current, parent)
current = parent
else:
break
def extract_max(self):
"""Remove and return maximum - O(log n)"""
if len(self.heap) == 0:
raise IndexError("Heap is empty")
if len(self.heap) == 1:
return self.heap.pop()
max_value = self.heap[0]
self.heap[0] = self.heap.pop()
self._heapify_down(0)
return max_value
def _heapify_down(self, i):
"""Restore heap property"""
size = len(self.heap)
while True:
largest = i
left = self.left_child(i)
right = self.right_child(i)
# Find largest among node and children
if left < size and self.heap[left] > self.heap[largest]:
largest = left
if right < size and self.heap[right] > self.heap[largest]:
largest = right
if largest == i:
break
self.swap(i, largest)
i = largest
def peek(self):
"""Return maximum without removing - O(1)"""
if len(self.heap) == 0:
raise IndexError("Heap is empty")
return self.heap[0]
Building a Heap from Array - O(n):
def heapify(arr):
"""Build min-heap from array in O(n) time"""
n = len(arr)
# Start from last non-leaf node and heapify down
for i in range(n // 2 - 1, -1, -1):
_heapify_down(arr, n, i)
def _heapify_down(arr, n, i):
"""Helper to heapify subtree rooted at i"""
smallest = i
left = 2 * i + 1
right = 2 * i + 2
if left < n and arr[left] < arr[smallest]:
smallest = left
if right < n and arr[right] < arr[smallest]:
smallest = right
if smallest != i:
arr[i], arr[smallest] = arr[smallest], arr[i]
_heapify_down(arr, n, smallest)
# Usage:
arr = [5, 3, 7, 1, 9, 4]
heapify(arr)
print(arr) # Valid min-heap structure
Python’s heapq Module:
import heapq
# Min-heap (default)
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 3)
heapq.heappush(heap, 7)
heapq.heappush(heap, 1)
print(heapq.heappop(heap)) # 1 (minimum)
print(heapq.heappop(heap)) # 3
print(heap[0]) # 5 (peek at minimum)
# Build heap from list
data = [5, 3, 7, 1, 9, 4]
heapq.heapify(data) # Converts to min-heap in-place
# For max-heap, negate values:
max_heap = []
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -3)
heapq.heappush(max_heap, -7)
max_val = -heapq.heappop(max_heap) # 7
Applications:
1. Priority Queue:
class PriorityQueue:
"""Task queue with priorities"""
def __init__(self):
self.heap = []
self.counter = 0 # For tie-breaking
def push(self, item, priority):
heapq.heappush(self.heap, (priority, self.counter, item))
self.counter += 1
def pop(self):
return heapq.heappop(self.heap)[2] # Return item only
# Usage:
pq = PriorityQueue()
pq.push("task1", priority=3)
pq.push("task2", priority=1) # Highest priority
pq.push("task3", priority=2)
print(pq.pop()) # "task2" (priority 1)
2. Find K Largest/Smallest Elements:
def find_k_largest(arr, k):
"""Find k largest elements - O(n log k)"""
# Use min-heap of size k
heap = arr[:k]
heapq.heapify(heap)
for num in arr[k:]:
if num > heap[0]:
heapq.heapreplace(heap, num)
return heap
def find_k_smallest(arr, k):
"""Find k smallest elements - O(n log k)"""
return heapq.nsmallest(k, arr) # Built-in
# Usage:
arr = [3, 1, 5, 12, 2, 11, 9, 8, 4]
print(find_k_largest(arr, 3)) # [9, 11, 12]
print(find_k_smallest(arr, 3)) # [1, 2, 3]
3. Heapsort - O(n log n):
def heapsort(arr):
"""Sort array using heap"""
# Build max-heap
n = len(arr)
for i in range(n // 2 - 1, -1, -1):
_heapify_down_max(arr, n, i)
# Extract elements one by one
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0] # Swap root with last
_heapify_down_max(arr, i, 0) # Heapify reduced heap
def _heapify_down_max(arr, n, i):
largest = i
left = 2 * i + 1
right = 2 * i + 2
if left < n and arr[left] > arr[largest]:
largest = left
if right < n and arr[right] > arr[largest]:
largest = right
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i]
_heapify_down_max(arr, n, largest)
# Usage:
arr = [5, 3, 7, 1, 9, 4]
heapsort(arr)
print(arr) # [1, 3, 4, 5, 7, 9]
4. Merge K Sorted Lists:
def merge_k_sorted_lists(lists):
"""Merge k sorted lists using min-heap"""
heap = []
result = []
# Add first element from each list
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst[0], i, 0)) # (value, list_idx, elem_idx)
while heap:
val, list_idx, elem_idx = heapq.heappop(heap)
result.append(val)
# Add next element from same list
if elem_idx + 1 < len(lists[list_idx]):
next_val = lists[list_idx][elem_idx + 1]
heapq.heappush(heap, (next_val, list_idx, elem_idx + 1))
return result
# Usage:
lists = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(merge_k_sorted_lists(lists)) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
Heap Complexity:
| Operation | Time Complexity |
|---|---|
| Insert | O(log n) |
| Extract Min/Max | O(log n) |
| Peek Min/Max | O(1) |
| Build Heap | O(n) |
| Heapsort | O(n log n) |
| Space | O(n) |
Heap vs BST:
| Feature | Heap | BST |
|---|---|---|
| Structure | Complete binary tree | Can be unbalanced |
| Order | Partial (heap property) | Full (left < root < right) |
| Find min/max | O(1) | O(h) |
| Search arbitrary | O(n) | O(h) |
| Insert | O(log n) | O(h) |
| Delete min/max | O(log n) | O(h) |
| Array representation | Yes (efficient) | No |
When to use Heaps:
- Need quick access to min/max only
- Implementing priority queues
- K largest/smallest problems
- Median maintenance
- Task scheduling
- Graph algorithms (Dijkstra’s, Prim’s)
Balanced trees: AVL and Red-Black
Balanced trees are BSTs that automatically maintain a balanced structure to ensure O(log n) operations in the worst case.
Why Balanced Trees?
A regular BST can degenerate into a linked list:
Unbalanced (worst case: O(n)):
1
\
2
\
3
\
4
Balanced (O(log n)):
2
/ \
1 3
\
4
AVL Tree (Adelson-Velsky and Landis):
An AVL tree is a self-balancing BST where the height difference between left and right subtrees (balance factor) is at most 1 for every node.
Balance Factor: height(left subtree) - height(right subtree)
- Must be in {-1, 0, 1}
-
If BF > 1, rebalancing needed
Example AVL Tree:
8 (BF=0)
/ \
4 10 (BF=0)
/ \ \
2 6 12
BF BF BF
0 0 0
AVL Rotations:
When balance factor violates AVL property, perform rotations:
1. Right Rotation (LL case):
y (BF=2) x
/ / \
x → z y
/
z
Code:
def right_rotate(y):
x = y.left
T = x.right
x.right = y
y.left = T
# Update heights
y.height = 1 + max(height(y.left), height(y.right))
x.height = 1 + max(height(x.left), height(x.right))
return x # New root
2. Left Rotation (RR case):
x (BF=-2) y
\ / \
y → x z
\
z
Code:
def left_rotate(x):
y = x.right
T = y.left
y.left = x
x.right = T
# Update heights
x.height = 1 + max(height(x.left), height(x.right))
y.height = 1 + max(height(y.left), height(y.right))
return y # New root
3. Left-Right Rotation (LR case):
z (BF=2) z y
/ / / \
x → y → x z
\ /
y x
Solution: Left rotate x, then right rotate z
4. Right-Left Rotation (RL case):
x (BF=-2) x y
\ \ / \
z → y → x z
/ \
y z
Solution: Right rotate z, then left rotate x
AVL Tree Implementation:
class AVLNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
self.height = 1
class AVLTree:
def height(self, node):
return node.height if node else 0
def balance_factor(self, node):
if not node:
return 0
return self.height(node.left) - self.height(node.right)
def update_height(self, node):
if node:
node.height = 1 + max(self.height(node.left), self.height(node.right))
def right_rotate(self, y):
x = y.left
T = x.right
x.right = y
y.left = T
self.update_height(y)
self.update_height(x)
return x
def left_rotate(self, x):
y = x.right
T = y.left
y.left = x
x.right = T
self.update_height(x)
self.update_height(y)
return y
def insert(self, root, value):
# Standard BST insertion
if not root:
return AVLNode(value)
if value < root.value:
root.left = self.insert(root.left, value)
elif value > root.value:
root.right = self.insert(root.right, value)
else:
return root # Duplicates not allowed
# Update height
self.update_height(root)
# Get balance factor
balance = self.balance_factor(root)
# Rebalance if needed
# Left-Left case
if balance > 1 and value < root.left.value:
return self.right_rotate(root)
# Right-Right case
if balance < -1 and value > root.right.value:
return self.left_rotate(root)
# Left-Right case
if balance > 1 and value > root.left.value:
root.left = self.left_rotate(root.left)
return self.right_rotate(root)
# Right-Left case
if balance < -1 and value < root.right.value:
root.right = self.right_rotate(root.right)
return self.left_rotate(root)
return root
# Usage:
avl = AVLTree()
root = None
values = [10, 20, 30, 40, 50, 25]
for val in values:
root = avl.insert(root, val)
# Tree remains balanced after each insertion
Red-Black Tree:
A Red-Black tree is a self-balancing BST where each node has a color (red or black) with specific properties that ensure balance.
Properties:
- Every node is either red or black
- Root is always black
- All leaves (NIL) are black
- Red nodes cannot have red children (no two red nodes in a row)
- Every path from root to leaf has the same number of black nodes (black-height)
Example Red-Black Tree:
B:8
/ \
R:4 R:12
/ \ / \
B:2 B:6 B:10 B:14
B = Black, R = Red
Red-Black Tree Operations:
Insertion - O(log n):
- Insert as in regular BST (node starts as red)
- Fix violations:
- Recolor nodes
- Perform rotations if needed
Recoloring and Rotation Cases:
class RBNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
self.parent = None
self.color = "RED" # New nodes are red
class RBTree:
def __init__(self):
self.NIL = RBNode(None)
self.NIL.color = "BLACK"
self.root = self.NIL
def left_rotate(self, x):
y = x.right
x.right = y.left
if y.left != self.NIL:
y.left.parent = x
y.parent = x.parent
if x.parent is None:
self.root = y
elif x == x.parent.left:
x.parent.left = y
else:
x.parent.right = y
y.left = x
x.parent = y
def right_rotate(self, y):
x = y.left
y.left = x.right
if x.right != self.NIL:
x.right.parent = y
x.parent = y.parent
if y.parent is None:
self.root = x
elif y == y.parent.right:
y.parent.right = x
else:
y.parent.left = x
x.right = y
y.parent = x
def insert(self, value):
# BST insertion
node = RBNode(value)
node.left = self.NIL
node.right = self.NIL
parent = None
current = self.root
while current != self.NIL:
parent = current
if node.value < current.value:
current = current.left
else:
current = current.right
node.parent = parent
if parent is None:
self.root = node
elif node.value < parent.value:
parent.left = node
else:
parent.right = node
# Fix Red-Black properties
self.fix_insert(node)
def fix_insert(self, node):
while node.parent and node.parent.color == "RED":
if node.parent == node.parent.parent.left:
uncle = node.parent.parent.right
if uncle.color == "RED":
# Case 1: Uncle is red - recolor
node.parent.color = "BLACK"
uncle.color = "BLACK"
node.parent.parent.color = "RED"
node = node.parent.parent
else:
if node == node.parent.right:
# Case 2: Node is right child - left rotate
node = node.parent
self.left_rotate(node)
# Case 3: Node is left child - right rotate and recolor
node.parent.color = "BLACK"
node.parent.parent.color = "RED"
self.right_rotate(node.parent.parent)
else:
# Mirror cases
uncle = node.parent.parent.left
if uncle.color == "RED":
node.parent.color = "BLACK"
uncle.color = "BLACK"
node.parent.parent.color = "RED"
node = node.parent.parent
else:
if node == node.parent.left:
node = node.parent
self.right_rotate(node)
node.parent.color = "BLACK"
node.parent.parent.color = "RED"
self.left_rotate(node.parent.parent)
self.root.color = "BLACK" # Root is always black
# Usage:
rb_tree = RBTree()
values = [10, 20, 30, 40, 50, 25]
for val in values:
rb_tree.insert(val)
AVL vs Red-Black Comparison:
| Feature | AVL Tree | Red-Black Tree |
|---|---|---|
| Balance | Strictly balanced | Loosely balanced |
| Height | ≤ 1.44 log n | ≤ 2 log n |
| Rotations on insert | Up to 2 | Up to 2 |
| Rotations on delete | O(log n) | Up to 3 |
| Search | Faster | Slightly slower |
| Insert/Delete | Slower (more rotations) | Faster |
| Use case | Search-heavy | Insert/delete-heavy |
| Memory | Less (only height) | More (color bit) |
Practical Usage:
- AVL: Database indexing, lookup-intensive applications
- Red-Black: Java TreeMap, C++ std::map, Linux kernel
When to use Balanced Trees:
- Need guaranteed O(log n) operations
- Can’t tolerate worst-case O(n) of regular BST
- Frequent insertions/deletions (Red-Black)
- Mostly searches (AVL)
- Implementing ordered maps/sets
Complexity Summary:
| Operation | BST (worst) | AVL | Red-Black |
|---|---|---|---|
| Search | O(n) | O(log n) | O(log n) |
| Insert | O(n) | O(log n) | O(log n) |
| Delete | O(n) | O(log n) | O(log n) |
| Space | O(n) | O(n) | O(n) |
Graph representations: list vs matrix
A graph G = (V, E) consists of:
- V: Set of vertices (nodes)
- E: Set of edges (connections between vertices)
Graph Types:
- Undirected: Edges have no direction (Facebook friendships)
- Directed (Digraph): Edges have direction (Twitter follows)
- Weighted: Edges have values (road distances)
- Unweighted: Edges have no values (social connections)
Two Main Representations:
1. Adjacency Matrix:
A 2D array where matrix[i][j] = 1 if edge exists from vertex i to j.
Undirected Graph Example:
Graph: 0---1
| |
2---3
Matrix:
0 1 2 3
0 [0, 1, 1, 0]
1 [1, 0, 0, 1]
2 [1, 0, 0, 1]
3 [0, 1, 1, 0]
Note: Symmetric for undirected graphs
Directed Graph Example:
Graph: 0→1
↑ ↓
2←3
Matrix:
0 1 2 3
0 [0, 1, 0, 0]
1 [0, 0, 0, 1]
2 [1, 0, 0, 0]
3 [0, 0, 1, 0]
Note: Not symmetric for directed graphs
Weighted Graph:
Graph: 0--5--1
| |
3 2
| |
2--4--3
Matrix (∞ for no edge):
0 1 2 3
0 [0, 5, 3, ∞]
1 [5, 0, ∞, 2]
2 [3, ∞, 0, 4]
3 [∞, 2, 4, 0]
Implementation:
class GraphMatrix:
def __init__(self, num_vertices, directed=False):
self.num_vertices = num_vertices
self.directed = directed
# Initialize matrix with zeros
self.matrix = [[0] * num_vertices for _ in range(num_vertices)]
def add_edge(self, u, v, weight=1):
"""Add edge from u to v"""
if 0 <= u < self.num_vertices and 0 <= v < self.num_vertices:
self.matrix[u][v] = weight
if not self.directed:
self.matrix[v][u] = weight # For undirected
def remove_edge(self, u, v):
"""Remove edge from u to v"""
if 0 <= u < self.num_vertices and 0 <= v < self.num_vertices:
self.matrix[u][v] = 0
if not self.directed:
self.matrix[v][u] = 0
def has_edge(self, u, v):
"""Check if edge exists"""
return self.matrix[u][v] != 0
def get_neighbors(self, u):
"""Get all neighbors of u"""
neighbors = []
for v in range(self.num_vertices):
if self.matrix[u][v] != 0:
neighbors.append(v)
return neighbors
def print_graph(self):
for row in self.matrix:
print(row)
# Usage:
g = GraphMatrix(4, directed=False)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 3)
g.add_edge(2, 3)
g.print_graph()
2. Adjacency List:
An array of lists where list[i] contains all vertices adjacent to vertex i.
Undirected Graph Example:
Graph: 0---1
| |
2---3
List:
0 → [1, 2]
1 → [0, 3]
2 → [0, 3]
3 → [1, 2]
Directed Graph Example:
Graph: 0→1
↑ ↓
2←3
List:
0 → [1]
1 → [3]
2 → [0]
3 → [2]
Weighted Graph:
Graph: 0--5--1
| |
3 2
| |
2--4--3
List (vertex, weight):
0 → [(1, 5), (2, 3)]
1 → [(0, 5), (3, 2)]
2 → [(0, 3), (3, 4)]
3 → [(1, 2), (2, 4)]
Implementation:
from collections import defaultdict
class GraphList:
def __init__(self, directed=False):
self.graph = defaultdict(list)
self.directed = directed
def add_edge(self, u, v, weight=None):
"""Add edge from u to v"""
if weight is not None:
self.graph[u].append((v, weight))
if not self.directed:
self.graph[v].append((u, weight))
else:
self.graph[u].append(v)
if not self.directed:
self.graph[v].append(u)
def remove_edge(self, u, v):
"""Remove edge from u to v"""
if v in self.graph[u]:
self.graph[u].remove(v)
if not self.directed and u in self.graph[v]:
self.graph[v].remove(u)
def has_edge(self, u, v):
"""Check if edge exists"""
return v in self.graph[u]
def get_neighbors(self, u):
"""Get all neighbors of u"""
return self.graph[u]
def get_vertices(self):
"""Get all vertices"""
return list(self.graph.keys())
def print_graph(self):
for vertex in self.graph:
print(f"{vertex} → {self.graph[vertex]}")
# Usage - Unweighted:
g = GraphList(directed=False)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 3)
g.add_edge(2, 3)
g.print_graph()
# Usage - Weighted:
g_weighted = GraphList(directed=True)
g_weighted.add_edge(0, 1, weight=5)
g_weighted.add_edge(0, 2, weight=3)
g_weighted.add_edge(1, 3, weight=2)
g_weighted.print_graph()
Comparison: Adjacency Matrix vs List:
| Operation | Matrix | List |
|---|---|---|
| Space | O(V²) | O(V + E) |
| Add edge | O(1) | O(1) |
| Remove edge | O(1) | O(V) |
| Check if edge exists | O(1) | O(V) |
| Find all neighbors | O(V) | O(degree) |
| Iterate all edges | O(V²) | O(V + E) |
When to use Matrix:
- Dense graphs (E ≈ V²)
- Need fast edge lookup (is there edge u→v?)
- Graph is small
- Implementing algorithms that check many edges (Floyd-Warshall)
When to use List:
- Sparse graphs (E « V²)
- Need to iterate over neighbors frequently
- Memory is a concern
- Most real-world graphs (social networks, web graphs)
Space Comparison Example:
Graph with 1000 vertices, 5000 edges:
- Matrix: 1000 × 1000 = 1,000,000 entries
- List: 1000 + 2×5000 = 11,000 entries (if undirected)
List is ~90x more space-efficient!
Practical Usage:
# Dense graph (many edges) - use matrix
adj_matrix = [[0]*n for _ in range(n)]
# Sparse graph (few edges) - use list
adj_list = defaultdict(list)
# Most real-world scenarios: adjacency list
# Examples: social networks, web graphs, road networks
Other Graph Representations (less common):
3. Edge List:
# Simple list of edges
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
# Weighted:
edges = [(0, 1, 5), (0, 2, 3), (1, 3, 2)]
# Good for: algorithms that process all edges (Kruskal's MST)
4. Incidence Matrix:
# Rows = vertices, Columns = edges
# matrix[v][e] = 1 if vertex v is incident to edge e
# Rarely used, but useful for some theoretical proofs
Graph traversals: BFS and DFS
Graph traversal is the process of visiting all vertices in a graph. Two fundamental approaches:
1. Breadth-First Search (BFS):
Explore level by level: visit all neighbors before going deeper.
Algorithm:
- Start at source vertex
- Visit all unvisited neighbors
- Then visit neighbors of those neighbors
- Continue until all reachable vertices visited
Uses Queue (FIFO)
Visual Example:
Graph: 1 --- 2
| \ |
| \ |
3 --- 4
BFS from 1:
Level 0: 1
Level 1: 2, 3, 4
Order: 1 → 2 → 3 → 4
Implementation (Adjacency List):
from collections import deque
def bfs(graph, start):
"""BFS traversal"""
visited = set()
queue = deque([start])
visited.add(start)
result = []
while queue:
vertex = queue.popleft()
result.append(vertex)
# Visit all unvisited neighbors
for neighbor in graph[vertex]:
if neighbor not in visited:
visited.add(neighbor)
queue.append(neighbor)
return result
# Usage:
graph = {
1: [2, 3, 4],
2: [1, 4],
3: [1, 4],
4: [1, 2, 3]
}
print(bfs(graph, 1)) # [1, 2, 3, 4]
BFS Level-by-Level:
def bfs_by_level(graph, start):
"""BFS with level tracking"""
visited = set([start])
queue = deque([(start, 0)]) # (vertex, level)
levels = {}
while queue:
vertex, level = queue.popleft()
levels[vertex] = level
for neighbor in graph[vertex]:
if neighbor not in visited:
visited.add(neighbor)
queue.append((neighbor, level + 1))
return levels
# Returns: {1: 0, 2: 1, 3: 1, 4: 1}
BFS Shortest Path (unweighted):
def bfs_shortest_path(graph, start, end):
"""Find shortest path using BFS"""
if start == end:
return [start]
visited = set([start])
queue = deque([(start, [start])]) # (vertex, path)
while queue:
vertex, path = queue.popleft()
for neighbor in graph[vertex]:
if neighbor not in visited:
new_path = path + [neighbor]
if neighbor == end:
return new_path
visited.add(neighbor)
queue.append((neighbor, new_path))
return None # No path found
# Usage:
path = bfs_shortest_path(graph, 1, 4)
print(path) # [1, 4] (shortest path from 1 to 4)
BFS Applications:
- Find shortest path in unweighted graph
- Level-order processing
- Web crawling
- Social network analysis (degrees of separation)
- Broadcasting in networks
- GPS navigation (unweighted roads)
2. Depth-First Search (DFS):
Explore as deep as possible before backtracking.
Algorithm:
- Start at source vertex
- Visit an unvisited neighbor
- Recursively visit that neighbor’s neighbors
- Backtrack when no unvisited neighbors
Uses Stack (LIFO) - often via recursion
Visual Example:
Graph: 1 --- 2
| \ |
| \ |
3 --- 4
DFS from 1:
Path: 1 → 2 → 4 → 3 (one possible order)
Recursive Implementation:
def dfs_recursive(graph, start, visited=None):
"""DFS traversal (recursive)"""
if visited is None:
visited = set()
visited.add(start)
result = [start]
for neighbor in graph[start]:
if neighbor not in visited:
result.extend(dfs_recursive(graph, neighbor, visited))
return result
# Usage:
graph = {
1: [2, 3, 4],
2: [1, 4],
3: [1, 4],
4: [1, 2, 3]
}
print(dfs_recursive(graph, 1)) # [1, 2, 4, 3] (one possible order)
Iterative Implementation (explicit stack):
def dfs_iterative(graph, start):
"""DFS traversal (iterative)"""
visited = set()
stack = [start]
result = []
while stack:
vertex = stack.pop()
if vertex not in visited:
visited.add(vertex)
result.append(vertex)
# Add neighbors to stack (in reverse for consistent order)
for neighbor in reversed(graph[vertex]):
if neighbor not in visited:
stack.append(neighbor)
return result
# Usage:
print(dfs_iterative(graph, 1)) # [1, 2, 4, 3]
DFS with Path Tracking:
def dfs_all_paths(graph, start, end, path=None):
"""Find all paths from start to end"""
if path is None:
path = []
path = path + [start]
if start == end:
return [path]
paths = []
for neighbor in graph[start]:
if neighbor not in path: # Avoid cycles
new_paths = dfs_all_paths(graph, neighbor, end, path)
paths.extend(new_paths)
return paths
# Usage:
all_paths = dfs_all_paths(graph, 1, 4)
print(all_paths) # [[1, 2, 4], [1, 4], [1, 3, 4]]
DFS Cycle Detection:
def has_cycle_undirected(graph, start, visited=None, parent=None):
"""Detect cycle in undirected graph"""
if visited is None:
visited = set()
visited.add(start)
for neighbor in graph[start]:
if neighbor not in visited:
if has_cycle_undirected(graph, neighbor, visited, start):
return True
elif neighbor != parent:
# Back edge found (not to parent) → cycle!
return True
return False
def has_cycle_directed(graph, vertex, visited, rec_stack):
"""Detect cycle in directed graph"""
visited.add(vertex)
rec_stack.add(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
if has_cycle_directed(graph, neighbor, visited, rec_stack):
return True
elif neighbor in rec_stack:
# Back edge to ancestor → cycle!
return True
rec_stack.remove(vertex)
return False
DFS Applications:
- Detect cycles
- Topological sorting
- Finding connected components
- Solving mazes and puzzles
- Backtracking problems
- Path finding (not necessarily shortest)
BFS vs DFS Comparison:
| Feature | BFS | DFS |
|---|---|---|
| Data Structure | Queue | Stack/Recursion |
| Memory | O(V) - wider | O(h) - deeper |
| Shortest Path | ✅ Yes (unweighted) | ❌ No |
| Completeness | ✅ Yes | ✅ Yes (if checks visited) |
| Time | O(V + E) | O(V + E) |
| Use Case | Shortest path, level-order | Cycle detection, topological sort |
Complete Graph Traversal:
If graph has multiple connected components:
def dfs_all_components(graph):
"""Visit all vertices in all components"""
visited = set()
components = []
for vertex in graph:
if vertex not in visited:
component = []
dfs_component(graph, vertex, visited, component)
components.append(component)
return components
def dfs_component(graph, vertex, visited, component):
visited.add(vertex)
component.append(vertex)
for neighbor in graph[vertex]:
if neighbor not in visited:
dfs_component(graph, neighbor, visited, component)
# Example: graph with 2 components
graph = {
1: [2], 2: [1], # Component 1
3: [4], 4: [3] # Component 2
}
components = dfs_all_components(graph)
print(components) # [[1, 2], [3, 4]]
Practical Example: Social Network:
class SocialNetwork:
def __init__(self):
self.graph = defaultdict(list)
def add_friendship(self, person1, person2):
self.graph[person1].append(person2)
self.graph[person2].append(person1)
def degrees_of_separation(self, person1, person2):
"""Find shortest connection between two people (BFS)"""
if person1 == person2:
return 0
visited = set([person1])
queue = deque([(person1, 0)])
while queue:
person, distance = queue.popleft()
for friend in self.graph[person]:
if friend == person2:
return distance + 1
if friend not in visited:
visited.add(friend)
queue.append((friend, distance + 1))
return -1 # Not connected
def find_all_friends_at_distance(self, person, distance):
"""Find all friends at exactly 'distance' hops away"""
visited = set([person])
queue = deque([(person, 0)])
result = []
while queue:
current, dist = queue.popleft()
if dist == distance:
result.append(current)
continue
if dist < distance:
for friend in self.graph[current]:
if friend not in visited:
visited.add(friend)
queue.append((friend, dist + 1))
return result
# Usage:
network = SocialNetwork()
network.add_friendship("Alice", "Bob")
network.add_friendship("Bob", "Charlie")
network.add_friendship("Charlie", "David")
print(network.degrees_of_separation("Alice", "David")) # 3
print(network.find_all_friends_at_distance("Alice", 2)) # ["Charlie"]
Summary:
- BFS: Use for shortest path, level-order, minimum hops
- DFS: Use for cycle detection, topological sort, exploring all paths
- Both: O(V + E) time, essential for graph algorithms
Applications and modeling
Non-linear data structures (trees and graphs) model countless real-world systems and solve diverse problems.
Tree Applications:
1. File Systems:
class FileNode:
def __init__(self, name, is_file=False):
self.name = name
self.is_file = is_file
self.children = [] if not is_file else None
self.size = 0 if is_file else None
def add_child(self, child):
if not self.is_file:
self.children.append(child)
def calculate_size(self):
"""Calculate total size of directory tree"""
if self.is_file:
return self.size
total = 0
for child in self.children:
total += child.calculate_size()
return total
# Example:
root = FileNode("root")
docs = FileNode("Documents")
file1 = FileNode("report.pdf", is_file=True)
file1.size = 1024
docs.add_child(file1)
root.add_child(docs)
print(f"Total size: {root.calculate_size()} KB")
2. HTML DOM:
HTML Tree:
html
├── head
│ └── title
└── body
├── h1
├── p
└── div
└── img
Each tag is a node in a tree structure
3. Database Indexes (B-Trees):
Used in:
- MySQL, PostgreSQL (B+ trees)
- Fast range queries
- Sorted traversal
- O(log n) search/insert/delete
4. Decision Trees (ML):
Age > 30?
/ \
Yes No
/ \
Approved Income > 50k?
/ \
Yes No
| |
Approved Rejected
Used for classification and regression
5. Abstract Syntax Trees (Compilers):
"""
Expression: (3 + 5) * 2
AST: *
/ \
+ 2
/ \
3 5
"""
def evaluate_ast(node):
if node.is_operand:
return node.value
left = evaluate_ast(node.left)
right = evaluate_ast(node.right)
if node.operator == '+':
return left + right
elif node.operator == '*':
return left * right
6. Huffman Coding (Compression):
"""
Build optimal prefix-free code tree
Characters with higher frequency get shorter codes
"""
import heapq
def huffman_encoding(text):
# Build frequency table
freq = {}
for char in text:
freq[char] = freq.get(char, 0) + 1
# Build heap of (frequency, node)
heap = [[weight, [char, ""]] for char, weight in freq.items()]
heapq.heapify(heap)
# Build Huffman tree
while len(heap) > 1:
lo = heapq.heappop(heap)
hi = heapq.heappop(heap)
for pair in lo[1:]:
pair[1] = '0' + pair[1]
for pair in hi[1:]:
pair[1] = '1' + pair[1]
heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
return dict(heapq.heappop(heap)[1:])
# Usage:
codes = huffman_encoding("hello world")
print(codes) # {'h': '000', 'e': '001', 'l': '01', ...}
Graph Applications:
1. Social Networks:
"""
- Vertices: Users
- Edges: Friendships, Follows
"""
class SocialGraph:
def __init__(self):
self.graph = defaultdict(list)
def add_follow(self, follower, followee):
self.graph[follower].append(followee)
def mutual_friends(self, user1, user2):
"""Find common connections"""
friends1 = set(self.graph[user1])
friends2 = set(self.graph[user2])
return friends1 & friends2
def suggest_friends(self, user):
"""Friend suggestions (friends of friends)"""
suggestions = set()
friends = set(self.graph[user])
for friend in friends:
for friend_of_friend in self.graph[friend]:
if friend_of_friend != user and friend_of_friend not in friends:
suggestions.add(friend_of_friend)
return suggestions
2. Web Page Ranking (PageRank):
- Vertices: Web pages
- Edges: Hyperlinks
- PageRank: Importance based on incoming links
Used by Google's original algorithm
3. Maps and Navigation:
"""
- Vertices: Locations/Intersections
- Edges: Roads
- Weights: Distances, travel times
"""
class MapGraph:
def __init__(self):
self.graph = defaultdict(list)
def add_road(self, loc1, loc2, distance):
self.graph[loc1].append((loc2, distance))
self.graph[loc2].append((loc1, distance))
def dijkstra_shortest_path(self, start, end):
"""Find shortest path using Dijkstra's algorithm"""
distances = {start: 0}
pq = [(0, start)]
visited = set()
parent = {}
while pq:
curr_dist, curr = heapq.heappop(pq)
if curr in visited:
continue
visited.add(curr)
if curr == end:
# Reconstruct path
path = []
while curr in parent:
path.append(curr)
curr = parent[curr]
path.append(start)
return path[::-1], distances[end]
for neighbor, weight in self.graph[curr]:
distance = curr_dist + weight
if neighbor not in visited and (neighbor not in distances or distance < distances[neighbor]):
distances[neighbor] = distance
parent[neighbor] = curr
heapq.heappush(pq, (distance, neighbor))
return None, float('inf') # No path found
# Usage:
map_graph = MapGraph()
map_graph.add_road("A", "B", 5)
map_graph.add_road("A", "C", 3)
map_graph.add_road("B", "D", 2)
map_graph.add_road("C", "D", 4)
path, distance = map_graph.dijkstra_shortest_path("A", "D")
print(f"Path: {path}, Distance: {distance}")
4. Course Prerequisites:
"""
- Vertices: Courses
- Edges: Prerequisites (directed)
- Topological sort: Valid course order
"""
def topological_sort(graph):
"""Find valid course order (DFS-based)"""
visited = set()
stack = []
def dfs(course):
visited.add(course)
for prereq in graph[course]:
if prereq not in visited:
dfs(prereq)
stack.append(course)
for course in graph:
if course not in visited:
dfs(course)
return stack[::-1]
# Example:
courses = {
"CS101": [],
"CS102": ["CS101"],
"CS201": ["CS102"],
"CS202": ["CS102"],
"CS301": ["CS201", "CS202"]
}
order = topological_sort(courses)
print(order) # Valid course order
5. Network Flow:
- Vertices: Routers, servers
- Edges: Network connections
- Weights: Bandwidth, capacity
Applications:
- Internet routing
- Traffic simulation
- Supply chain optimization
6. Dependency Resolution (Package Managers):
"""
- Vertices: Software packages
- Edges: Dependencies
- Problem: Find installation order
"""
def resolve_dependencies(packages):
"""Resolve package dependencies (topological sort)"""
# Build graph of dependencies
graph = defaultdict(list)
in_degree = defaultdict(int)
for pkg, deps in packages.items():
for dep in deps:
graph[dep].append(pkg)
in_degree[pkg] += 1
if pkg not in in_degree:
in_degree[pkg] = 0
# Kahn's algorithm (BFS-based topological sort)
queue = deque([pkg for pkg in in_degree if in_degree[pkg] == 0])
order = []
while queue:
pkg = queue.popleft()
order.append(pkg)
for dependent in graph[pkg]:
in_degree[dependent] -= 1
if in_degree[dependent] == 0:
queue.append(dependent)
if len(order) != len(packages):
return None # Cycle detected - circular dependency!
return order
# Usage:
packages = {
"app": ["lib-a", "lib-b"],
"lib-a": ["lib-c"],
"lib-b": ["lib-c"],
"lib-c": []
}
install_order = resolve_dependencies(packages)
print(install_order) # ["lib-c", "lib-a", "lib-b", "app"]
7. Recommendation Systems:
"""
- Bipartite graph: Users and Items
- Edges: Interactions (purchases, views, ratings)
"""
def recommend_items(user, user_item_graph, user_user_graph):
"""Recommend items based on similar users"""
similar_users = user_user_graph[user]
recommendations = set()
for similar_user in similar_users:
for item in user_item_graph[similar_user]:
if item not in user_item_graph[user]:
recommendations.add(item)
return recommendations
8. Circuit Design:
- Vertices: Components
- Edges: Wires
- Applications: Layout optimization, signal routing
Problem Modeling Checklist:
Use Trees When:
- ✅ Hierarchical relationships
- ✅ One path between any two nodes
- ✅ Clear parent-child structure
- Examples: File systems, DOM, organizational charts
Use Graphs When:
- ✅ Many-to-many relationships
- ✅ Multiple paths between nodes
- ✅ Network-like structure
- Examples: Social networks, maps, dependencies
Complexity Summary:
| Problem Type | Structure | Key Algorithm |
|---|---|---|
| File system | Tree | Tree traversal |
| Shortest path | Graph | BFS/Dijkstra |
| Task scheduling | DAG | Topological sort |
| Social network | Graph | BFS/DFS |
| Compression | Tree | Huffman coding |
| Syntax parsing | Tree | Tree traversal |
| Network routing | Graph | Shortest path |
Key Takeaway: Most real-world systems are naturally represented as non-linear structures. Understanding trees and graphs is essential for modeling and solving complex problems efficiently.