Graphs
Nodes and edges that model relationships and connectivity.
1. Adjacency list vs adjacency matrix
Two primary representations for storing graph structure:
Adjacency List:
- Structure: Array/dict of lists, each storing neighbors
- Space: O(V + E) where V = vertices, E = edges
- Best for: Sparse graphs (few edges)
Implementation:
# Using dictionary
graph = {
    0: [1, 2],
    1: [0, 3, 4],
    2: [0, 4],
    3: [1],
    4: [1, 2]
}
# Using list of lists
graph = [
    [1, 2],     # neighbors of vertex 0
    [0, 3, 4],  # neighbors of vertex 1
    [0, 4],     # neighbors of vertex 2
    [1],        # neighbors of vertex 3
    [1, 2]      # neighbors of vertex 4
]
# With weights (weighted graph)
graph = {
    0: [(1, 5), (2, 3)],  # (neighbor, weight)
    1: [(0, 5), (3, 2), (4, 1)],
    2: [(0, 3), (4, 4)],
    3: [(1, 2)],
    4: [(1, 1), (2, 4)]
}
Advantages:
- ✓ Space-efficient for sparse graphs
- ✓ Fast to iterate over neighbors: O(degree)
- ✓ Easy to add edges
- ✓ Natural for most graph algorithms
Disadvantages:
- ✗ Checking if edge exists: O(degree) or O(log degree) with sorted list
- ✗ Slightly more complex to implement
Adjacency Matrix:
- Structure: 2D array where matrix[i][j] = 1 if edge exists
- Space: O(V²)
- Best for: Dense graphs (many edges)
Implementation:
# Unweighted graph (0 = no edge, 1 = edge)
n = 5
matrix = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0]
]
# Weighted graph (0 or inf = no edge, value = weight)
from math import inf
matrix = [
    [0, 5, 3, inf, inf],
    [5, 0, inf, 2, 1],
    [3, inf, 0, inf, 4],
    [inf, 2, inf, 0, inf],
    [inf, 1, 4, inf, 0]
]
# Check if edge exists
def has_edge(matrix, u, v):
    return matrix[u][v] != 0  # or != inf for weighted
# Get all neighbors
def get_neighbors(matrix, u):
    return [v for v in range(len(matrix)) if matrix[u][v] != 0]
Advantages:
- ✓ O(1) edge lookup: matrix[i][j]
- ✓ O(1) edge insertion/deletion
- ✓ Simple implementation
- ✓ Good for dense graphs
Disadvantages:
- ✗ O(V²) space even for sparse graphs
- ✗ O(V) to iterate over neighbors
- ✗ Wasteful for sparse graphs
Comparison table:
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Space | O(V + E) | O(V²) |
| Add edge | O(1) | O(1) |
| Remove edge | O(degree) | O(1) |
| Check edge | O(degree) | O(1) |
| Get neighbors | O(degree) | O(V) |
| Iterate all edges | O(V + E) | O(V²) |
| Best for | Sparse graphs | Dense graphs |
When to use each:
- Adjacency List: Most real-world graphs (social networks, web graphs, road networks)
- Adjacency Matrix: Dense graphs, need fast edge queries, matrix operations (Floyd-Warshall)
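To make the trade-off concrete, here is a minimal sketch (assuming vertices are labeled 0..n-1, as in the examples above) that converts the dictionary-based adjacency list into an adjacency matrix:
def list_to_matrix(graph, n):
    # Allocate an n x n matrix of zeros, then mark each stored edge.
    matrix = [[0] * n for _ in range(n)]
    for u in graph:
        for v in graph[u]:
            matrix[u][v] = 1
    return matrix
graph = {0: [1, 2], 1: [0, 3, 4], 2: [0, 4], 3: [1], 4: [1, 2]}
print(list_to_matrix(graph, 5)[0])  # [0, 1, 1, 0, 0]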
2. Directed vs undirected graphs
Understanding edge directionality:
Undirected Graph:
- Edges: Bidirectional, (u, v) = (v, u)
- Representation: Store edge in both directions or mark as undirected
- Examples: Friendships, road networks (two-way streets)
Implementation:
# Adjacency list (store both directions)
def add_undirected_edge(graph, u, v):
    graph[u].append(v)
    graph[v].append(u)
# Example: friendship network
graph = {
    'Alice': ['Bob', 'Charlie'],
    'Bob': ['Alice', 'David'],
    'Charlie': ['Alice', 'David'],
    'David': ['Bob', 'Charlie']
}
Properties:
- Degree: Number of edges incident to vertex
- Handshaking lemma: Sum of degrees = 2 × E
- Edge count: Each edge counted once
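A minimal check of the handshaking lemma on the friendship graph above (each undirected edge is stored in both adjacency lists, so summing list lengths counts every edge twice):
graph = {
    'Alice': ['Bob', 'Charlie'],
    'Bob': ['Alice', 'David'],
    'Charlie': ['Alice', 'David'],
    'David': ['Bob', 'Charlie']
}
degree_sum = sum(len(neighbors) for neighbors in graph.values())
edge_count = degree_sum // 2  # each edge stored twice
print(degree_sum, edge_count)  # 8 4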
Directed Graph (Digraph):
- Edges: One-way, (u, v) ≠ (v, u)
- Representation: Store edge in one direction only
- Examples: Twitter follows, web links, dependencies
Implementation:
# Adjacency list (store one direction)
def add_directed_edge(graph, u, v):
    graph[u].append(v)
# Example: Twitter follows
graph = {
    'Alice': ['Bob', 'Charlie'],    # Alice follows Bob and Charlie
    'Bob': ['David'],               # Bob follows David
    'Charlie': ['Alice', 'David'],  # Charlie follows Alice and David
    'David': []                     # David follows no one
}
Properties:
- In-degree: Number of incoming edges
- Out-degree: Number of outgoing edges
- Edge count: Each edge counted once
- Can have cycles even with 2 vertices: u → v, v → u
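A small sketch, assuming the Twitter-style adjacency list above, that computes out-degree and in-degree for every user:
graph = {
    'Alice': ['Bob', 'Charlie'],
    'Bob': ['David'],
    'Charlie': ['Alice', 'David'],
    'David': []
}
out_degree = {u: len(graph[u]) for u in graph}
in_degree = {u: 0 for u in graph}
for u in graph:
    for v in graph[u]:
        in_degree[v] += 1
print(out_degree)  # {'Alice': 2, 'Bob': 1, 'Charlie': 2, 'David': 0}
print(in_degree)   # {'Alice': 1, 'Bob': 1, 'Charlie': 1, 'David': 2}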
Visual comparison:
Undirected:
A --- B
|     |
C --- D
Edge (A,B) means:
- A can reach B
- B can reach A
Directed:
A --> B
|     ↓
C <-- D
Edge A→B means:
- A can reach B
- B cannot reach A (unless B→A exists)
Conversion:
# Undirected to directed (create both directions)
def undirected_to_directed(undirected):
    directed = {v: [] for v in undirected}
    for u in undirected:
        for v in undirected[u]:
            if v not in directed[u]:
                directed[u].append(v)
            if u not in directed[v]:
                directed[v].append(u)
    return directed
# Directed to undirected (ignore direction)
def directed_to_undirected(directed):
    undirected = {v: [] for v in directed}
    for u in directed:
        for v in directed[u]:
            if v not in undirected[u]:
                undirected[u].append(v)
            if u not in undirected[v]:
                undirected[v].append(u)
    return undirected
3. Weighted vs unweighted
Edges with or without associated costs:
Unweighted Graph:
- Edges: All have equal cost (implicitly weight = 1)
- Use case: Simple connectivity, shortest path = fewest edges
- Examples: Social networks, maze solving
Implementation:
# Adjacency list
graph = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2]
}
# Shortest path: BFS finds path with fewest edges
from collections import deque
def shortest_path_unweighted(graph, start, end):
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        node, dist = queue.popleft()
        if node == end:
            return dist
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return -1  # No path
Weighted Graph:
- Edges: Have associated costs/weights
- Use case: Optimization problems, realistic modeling
- Examples: Road networks (distances), flight routes (costs)
Implementation:
# Adjacency list with weights
graph = {
    0: [(1, 4), (2, 1)],  # (neighbor, weight)
    1: [(0, 4), (3, 5)],
    2: [(0, 1), (3, 2)],
    3: [(1, 5), (2, 2)]
}
# Shortest path: Dijkstra's algorithm for weighted graphs
import heapq
def shortest_path_weighted(graph, start, end):
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    pq = [(0, start)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == end:
            return d
        if d > dist[node]:
            continue
        for neighbor, weight in graph[node]:
            new_dist = d + weight
            if new_dist < dist[neighbor]:
                dist[neighbor] = new_dist
                heapq.heappush(pq, (new_dist, neighbor))
    return -1  # No path
Weight types:
- Positive weights: Most common, use Dijkstra’s
- Negative weights: Use Bellman-Ford (see the sketch after this list); watch for negative cycles
- Zero weights: Valid, but may need special handling
- Other weight meanings: Weights can represent metrics besides distance (time, cost, reliability)
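Since Dijkstra's assumes non-negative weights, the following is a minimal Bellman-Ford sketch for graphs that may contain negative edge weights (assuming the same (neighbor, weight) adjacency-list format; it returns None if a negative cycle is reachable):
def bellman_ford(graph, start):
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    # Relax every edge V-1 times
    for _ in range(len(graph) - 1):
        for u in graph:
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    # One more pass: any further improvement means a negative cycle
    for u in graph:
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                return None
    return dist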
Example: Road network:
# Cities and distances (km)
roads = {
    'NYC': [('Boston', 215), ('Philadelphia', 95)],
    'Boston': [('NYC', 215)],
    'Philadelphia': [('NYC', 95), ('Washington', 140)],
    'Washington': [('Philadelphia', 140)]
}
# Find shortest route from NYC to Washington
# Answer: NYC → Philadelphia → Washington = 235 km
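For instance, the shortest_path_weighted (Dijkstra) function defined earlier in this section can be run directly on this adjacency list:
print(shortest_path_weighted(roads, 'NYC', 'Washington'))  # 235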
4. BFS and DFS traversals
Fundamental algorithms for exploring graphs:
Breadth-First Search (BFS):
- Strategy: Explore level by level (nearest first)
- Data structure: Queue (FIFO)
- Time: O(V + E)
- Space: O(V)
Implementation:
from collections import deque
def bfs(graph, start):
    visited = set([start])
    queue = deque([start])
    result = []
    while queue:
        node = queue.popleft()
        result.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return result
# Example traversal:
# Graph: 0 -- 1 -- 3
# | |
# 2 -- 4
# BFS from 0: [0, 1, 2, 3, 4]
BFS applications:
- Shortest path in unweighted graph
- Level-order traversal
- Connected components
- Bipartite checking
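The last of these follows directly from BFS: 2-color the graph level by level and fail if any edge joins two vertices of the same color. A minimal sketch, assuming the adjacency-list format used above:
from collections import deque
def is_bipartite(graph):
    color = {}
    for start in graph:  # handle disconnected graphs
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in color:
                    color[neighbor] = 1 - color[node]
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return False  # edge between same-colored vertices
    return True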
Depth-First Search (DFS):
- Strategy: Explore as deep as possible first
- Data structure: Stack (recursion or explicit)
- Time: O(V + E)
- Space: O(V)
Implementation:
# Recursive
def dfs_recursive(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    result = [node]
    for neighbor in graph[node]:
        if neighbor not in visited:
            result.extend(dfs_recursive(graph, neighbor, visited))
    return result
# Iterative
def dfs_iterative(graph, start):
    visited = set()
    stack = [start]
    result = []
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.add(node)
            result.append(node)
            for neighbor in reversed(graph[node]):
                if neighbor not in visited:
                    stack.append(neighbor)
    return result
# Example traversal:
# Graph: 0 -- 1 -- 3
# | |
# 2 -- 4
# DFS from 0: [0, 1, 3, 4, 2] or [0, 2, 4, 1, 3]
DFS applications:
- Cycle detection
- Topological sorting (see the sketch after this list)
- Strongly connected components
- Path finding
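As an example of the topological-sorting application, a minimal DFS-based sketch for a directed acyclic graph (same adjacency-list format; it assumes the input has no cycles):
def topological_sort(graph):
    visited = set()
    order = []
    def dfs(node):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor)
        order.append(node)  # append after all descendants are finished
    for node in graph:
        if node not in visited:
            dfs(node)
    return order[::-1]  # reverse post-order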
Comparison:
| Feature | BFS | DFS |
|---|---|---|
| Data structure | Queue | Stack/Recursion |
| Order | Level by level | Deep first |
| Shortest path | Yes (unweighted) | No |
| Memory | Often more (stores whole frontier) | Often less (stores current path); both O(V) worst case |
| Completeness | Yes | Yes (with visited set) |
5. Connected components and cycles
Analyzing graph structure:
Connected Components (undirected graphs):
- Definition: Maximal subgraphs where every pair of vertices is connected
- Algorithm: Run BFS/DFS from unvisited vertices
Implementation:
def count_components(graph):
    visited = set()
    count = 0
    def dfs(node):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor)
    for node in graph:
        if node not in visited:
            dfs(node)
            count += 1
    return count
def find_components(graph):
    visited = set()
    components = []
    def dfs(node, component):
        visited.add(node)
        component.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor, component)
    for node in graph:
        if node not in visited:
            component = []
            dfs(node, component)
            components.append(component)
    return components
# Example:
# Graph: 0-1-2 3-4 5
# Components: [[0,1,2], [3,4], [5]]
Cycle Detection:
Undirected graph:
def has_cycle_undirected(graph):
    visited = set()
    def dfs(node, parent):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            elif neighbor != parent:
                return True  # Back edge found
        return False
    for node in graph:
        if node not in visited:
            if dfs(node, -1):
                return True
    return False
Directed graph (using colors):
def has_cycle_directed(graph):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    def dfs(node):
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                return True  # Back edge (cycle)
            if color[neighbor] == WHITE and dfs(neighbor):
                return True
        color[node] = BLACK
        return False
    for node in graph:
        if color[node] == WHITE:
            if dfs(node):
                return True
    return False
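A quick usage check on two small hypothetical directed graphs:
dag = {0: [1], 1: [2], 2: []}
cyclic = {0: [1], 1: [2], 2: [0]}
print(has_cycle_directed(dag))     # False
print(has_cycle_directed(cyclic))  # True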
Union-Find for connected components:
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.components = n
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False
        if self.rank[px] < self.rank[py]:
            self.parent[px] = py
        elif self.rank[px] > self.rank[py]:
            self.parent[py] = px
        else:
            self.parent[py] = px
            self.rank[px] += 1
        self.components -= 1
        return True
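A quick usage sketch on the component example above (vertices 0-5, edges 0-1, 1-2, 3-4):
uf = UnionFind(6)
for u, v in [(0, 1), (1, 2), (3, 4)]:
    uf.union(u, v)
print(uf.components)             # 3
print(uf.find(0) == uf.find(2))  # True: same component
print(uf.find(0) == uf.find(3))  # False: different components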
6. Applications: social networks, maps
Real-world uses of graph structures:
Social Networks:
- Vertices: Users
- Edges: Friendships, follows, connections
- Type: Usually undirected (Facebook) or directed (Twitter)
Applications:
- Friend recommendations: Find friends of friends
def recommend_friends(graph, user):
    friends = set(graph[user])
    recommendations = set()
    for friend in friends:
        for friend_of_friend in graph[friend]:
            if friend_of_friend != user and friend_of_friend not in friends:
                recommendations.add(friend_of_friend)
    return recommendations
- Influence analysis: Find most connected users
def find_influencers(graph, top_k=10):
    # Calculate degree centrality
    degrees = [(node, len(graph[node])) for node in graph]
    degrees.sort(key=lambda x: x[1], reverse=True)
    return degrees[:top_k]
- Community detection: Find clusters of users
- Six degrees of separation: Shortest path between users
Maps and Navigation:
- Vertices: Locations, intersections
- Edges: Roads, paths
- Weights: Distance, time, cost
Applications:
- Route planning: Shortest path (Dijkstra’s, A*)
def find_route(map_graph, start, end):
    # e.g., reuse the Dijkstra implementation (shortest_path_weighted) from earlier
    return dijkstra(map_graph, start, end)
- Traffic optimization: Find alternative routes
- Delivery routing: Traveling salesman problem
- Public transit: Multi-modal routing
Web and Internet:
- Vertices: Web pages, routers
- Edges: Hyperlinks, network connections
- Type: Directed (links), weighted (bandwidth)
Applications:
- PageRank: Rank web pages by importance
- Web crawling: BFS/DFS to discover pages
- Network routing: Find optimal packet paths
- Load balancing: Distribute traffic
Other applications:
- Dependency graphs: Build systems, package managers
- Recommendation systems: User-item bipartite graphs
- Biology: Protein interaction networks, phylogenetic trees
- Chemistry: Molecular structures
- Game AI: State space search, pathfinding
7. Complexities and storage
Understanding performance characteristics:
Time complexities:
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Add vertex | O(1) | O(V²) (resize) |
| Add edge | O(1) | O(1) |
| Remove vertex | O(V + E) | O(V²) |
| Remove edge | O(degree) | O(1) |
| Check edge | O(degree) | O(1) |
| Get neighbors | O(degree) | O(V) |
| BFS/DFS | O(V + E) | O(V²) |
| Dijkstra | O((V+E) log V) | O(V²) |
Space complexities:
| Representation | Space | Best For |
|---|---|---|
| Adjacency List | O(V + E) | Sparse graphs (E ≪ V²) |
| Adjacency Matrix | O(V²) | Dense graphs (E ≈ V²) |
| Edge List | O(E) | Simple storage, sorting |
Graph density:
- Sparse: E = O(V), few edges
  - Example: Social networks (avg degree ~100)
  - Use adjacency list
- Dense: E = O(V²), many edges
  - Example: Complete graph (all pairs connected)
  - Use adjacency matrix
Memory calculations:
# Example: 1 million vertices
V = 1_000_000
# Sparse graph (avg degree = 10)
E_sparse = 5_000_000                 # ~5M edges
list_memory = (V + E_sparse) * 8     # bytes -> ~48 MB
matrix_memory = V * V * 1            # bytes -> ~1 TB (impractical!)
# Dense graph (E = V²/2)
E_dense = 500_000_000_000            # 500B edges
list_memory = (V + E_dense) * 8      # bytes -> ~4 TB
matrix_memory = V * V * 1            # bytes -> ~1 TB (better!)
Optimization techniques:
- Compressed sparse row (CSR): Efficient sparse matrix storage (see the sketch after this list)
- Adjacency array: Flatten adjacency list for cache efficiency
- Bit vectors: Use bits instead of bytes for unweighted graphs
- Graph databases: Neo4j, specialized storage for large graphs
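A minimal sketch of the CSR idea (an illustrative layout, not any particular library's API): concatenate all neighbor lists into one flat array, with an offsets array marking where each vertex's neighbors begin.
# Build CSR arrays from the 5-vertex adjacency list used in section 1.
adj = [[1, 2], [0, 3, 4], [0, 4], [1], [1, 2]]
offsets = [0]
neighbors = []
for nbrs in adj:
    neighbors.extend(nbrs)
    offsets.append(len(neighbors))
# offsets   = [0, 2, 5, 7, 8, 10]
# neighbors = [1, 2, 0, 3, 4, 0, 4, 1, 1, 2]
def csr_neighbors(u):
    return neighbors[offsets[u]:offsets[u + 1]]
print(csr_neighbors(1))  # [0, 3, 4]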
Practical considerations:
- Small graphs (V < 1000): Either representation works
- Medium graphs (V < 100K): Adjacency list usually better
- Large graphs (V > 1M): Must use adjacency list or specialized storage
- Very dense graphs: Consider adjacency matrix or compressed formats