Algorithms 2
Section 1: Graphs and Path-Finding Algorithms
2
Graphs
A graph, G = (V, E), is a set V of vertices together with a set of edges E ⊆ V × V. We usually care about finite graphs.
3
Representing a graph
There are two basic representations of E: adjacency lists and adjacency matrices.
Adjacency lists are stored in an array of length |V| where A[u] stores a pointer to
a linked list of the vertices, v ∈ V such that (u, v) ∈ E. If G is weighted, the lists
store tuples (v, w) such that (u, v) ∈ E and weight(u, v) = w.
4
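To make the two representations concrete, here is a minimal Python sketch of both for a small, hypothetical weighted digraph (the vertex numbering and weights are invented for illustration):

```python
# A small weighted, directed graph on vertices 0..3 (hypothetical example).
edges = [(0, 1, 1.3), (0, 2, 2.1), (1, 3, 5.6), (2, 3, 4.9)]
n = 4

# Adjacency lists: adj[u] holds (v, w) pairs for each edge (u, v) of weight w.
adj = [[] for _ in range(n)]
for u, v, w in edges:
    adj[u].append((v, w))

# Adjacency matrix: M[u][v] holds the weight, or None if the edge is absent.
M = [[None] * n for _ in range(n)]
for u, v, w in edges:
    M[u][v] = w

print(adj[0])   # [(1, 1.3), (2, 2.1)]
print(M[1][3])  # 5.6
```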
Example Adjacency Matrix
[Figure: a 10-vertex undirected graph and its 10 × 10 adjacency matrix. A cell (u, v) holds 1 if the edge {u, v} is present.]
💡 This undirected graph is represented with an adjacency matrix. The shaded cells do not need to be stored due to symmetry.
5
Example Adjacency Lists
[Figure: a weighted graph on vertices 1–9 and its adjacency lists. Each list A[u] stores (v, w) pairs:
1: (2, 1.3)
2: (3, 2.1) (4, 5.6)
3: (6, 3.0)
4: (5, 4.9)
5: (2, 5.3) (6, 7.7)
6: (5, 7.7) (7, 4.0)
7: (8, 1.0) (9, 5.0)
8: (5, 5.9) (6, 8.6) (7, 1.0)
9: (8, 6.9)]
Adjacency matrix: compact for dense graphs (no pointers, and only 1 bit per entry if unweighted); can (approximately) halve storage if G is undirected; O(|V|²) to iterate through all edges.
Adjacency lists: compact for sparse graphs; cannot halve storage for undirected graphs (without significantly worsening the time complexity); O(|E|) to iterate through all edges.
7
Other terminology [1]
The transpose of a directed graph G = (V, E) is the graph GT = (V, ET), which (by
transposing the edge matrix) has all the directed edges reversed.
The in-degree and out-degree of a vertex in a directed graph are the numbers of
incoming and outgoing edges, respectively. The degree of a vertex (in a directed
or undirected graph) is the number of edges incident at that vertex.
The square of a graph G = (V, E) is the graph G2 = (V, E2), in which an edge (u,v)
is present if there is a path between u and v in G consisting of at most two edges.
8
Other terminology [2]
A complete graph (also fully connected graph) is one with E = V x V.
A connected graph is one where every pair of vertices are connected by at least
one path (not edge!).
An induced subgraph of G = (V, E) is another graph G’ = (V’, E’) where V’ ⊆ V
and E’ is that subset of E consisting of all edges (u, v) ∈ E where u,v ∈ V’.
A clique within a graph G is any induced subgraph that is complete.
The complement graph of G = (V, E) is the graph Ḡ = (V, Ē) where
Ē = { (u, v) | u, v ∈ V ∧ (u, v) ∉ E }.
A graph is acyclic if no vertex can be reached by a path from itself.
9
Other terminology [3]
Vertex colouring is the task of assigning colours to each v ∈ V such that no
adjacent vertices have the same colour.
Edge colouring is the task of assigning colours to each edge e ∈ E such that no
adjacent edges have the same colour.
Face colouring is the task of assigning colours to each face of a planar graph
such that no adjacent faces have the same colour. A planar graph can be drawn
on a plane such that no two edges intersect (other than at their vertices). A face is
a region bounded by edges (including the infinite-area region around the
‘outside’).
10
Breadth First Search, BFS(G, s)
BFS can be used on directed and undirected graphs.
BFS on a graph is slightly more complex than on a tree because we have to worry
about duplicate ‘discoveries’ of a vertex.
11
BFS(G, s) – for trees!
1 for v in G.V
2     v.marked = false
3 Q = new Queue
4 ENQUEUE(Q, s)
5 while !QUEUE-EMPTY(Q)
6     u = DEQUEUE(Q)
7     u.marked = true
8     for v in u.adjacent
9         ENQUEUE(Q, v)
Line 7 is a placeholder. You should ‘process’ node u in whatever way makes sense for your algorithm.
Marking a node to say we’ve been here is a trivial thing to do (and pointless if s is the root because we’ll visit
everywhere in the tree so all vertices will end up marked).
When this terminates, all nodes reachable from s will have been marked (or ‘processed’ in any other way your algorithm wishes to process them at line 7).
[Figure: a small graph with source s adjacent to vertices 1, 2 and 3, where 1 is also adjacent to 2.]
Enqueue s. Dequeue s → mark s, enqueue 1, 2 and 3. Q = [1, 2, 3]
Dequeue 1 → mark 1, enqueue 2. Q = [2, 3, 2]
The memory consumption for the queue is more than necessary. (Line 7 prevents infinite looping on cyclic input.)
To fix this, we need to record when a vertex has been inserted into the queue already and avoid inserting it a second time. Searching the queue would be O(q) in the length, q, of the queue so let’s try a bit harder…
14
BFS(G, s) – for graphs!
1 for v in G.V
2     v.marked = false
3     v.pending = false
4 s.pending = true
5 Q = new Queue
6 ENQUEUE(Q, s)
7 while !QUEUE-EMPTY(Q)
8     u = DEQUEUE(Q)
9     u.marked = true        // or other processing
10    for v in G.E.adj[u]
11        if !v.pending
12            ENQUEUE(Q, v)
13            v.pending = true
💡 The expected running time is O(|V|) for lines 1–3 and O(|E|) for 7–13 so O(|V|+|E|) overall.
💡 Why is line 4 necessary? Provide a graph that would cause this algorithm to go wrong without line 4. 15
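A runnable Python sketch of the same idea, using a set in place of the per-vertex pending attribute (the graph and vertex names are illustrative):

```python
from collections import deque

def bfs(adj, s):
    """BFS over adjacency lists (a dict of lists); returns vertices in visit order."""
    pending = {s}            # vertices that have already been placed on the queue
    order = []               # stands in for 'marking'/processing each dequeued vertex
    q = deque([s])
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in pending:   # enqueue each vertex at most once
                pending.add(v)
                q.append(v)
    return order

adj = {"s": ["a", "b"], "a": ["b"], "b": ["s", "c"], "c": []}
print(bfs(adj, "s"))  # ['s', 'a', 'b', 'c']
```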
BFS(G, s) – with immutable graphs
1 let M = new HashTable
2 let P = new HashTable
3 HASH-INSERT(P, s)
4 Q = new Queue
5 ENQUEUE(Q, s)
6 while !QUEUE-EMPTY(Q)
7     u = DEQUEUE(Q)
8     HASH-INSERT(M, u)      // or other processing
9     for v in G.E.adj[u]
10        if !HASH-HAS-KEY(P, v)
11            ENQUEUE(Q, v)
12            HASH-INSERT(P, v)
This program puts all vertices reachable from s into the hash table M but you could do any other processing you like at line 8.
💡 We can store the pending set in a hash table if the graph vertices do not have a ‘pending’ attribute or we
cannot modify the graph itself (such as in a multithreaded program – see IB Concurrent and Distributed
Systems). 16
2-Vertex Colourability (for a connected, undirected graph)
Input: a connected, undirected graph, G = (V, E)
Output: true if G.V can be coloured using two colours; false otherwise.
Pick an arbitrary vertex, s. Set s.colour = BLACK. BFS from s, colouring the first
level as RED, the next level BLACK, etc. ⇒ O(|V| + |E|) = O(|E|) since connected.
When the BFS completes, scan over the edges checking whether any adjacent
vertices have the same colour. Return true/false as appropriate. ⇒ O(|E|)
O(|V| + |E|) overall (O(|E|) since connected) – for adjacency list representations.
Both steps, and the overall cost, are O(|V|²) if adjacency matrices are used.
18
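A hedged Python sketch of the two-colouring test, assuming the graph is connected and given as adjacency lists; the colouring pass and the final edge scan mirror the two steps above:

```python
from collections import deque

def two_colourable(adj, s):
    """Attempt to 2-colour a connected, undirected graph by BFS from s.
    adj maps each vertex to its neighbours; returns True iff no edge joins
    two vertices of the same colour."""
    colour = {s: 0}                      # 0 = BLACK, 1 = RED, alternating by BFS level
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in colour:
                colour[v] = 1 - colour[u]
                q.append(v)
    # Second pass: scan every edge checking for a conflict.
    return all(colour[u] != colour[v] for u in adj for v in adj[u])

square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}       # even cycle: 2-colourable
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}                # odd cycle: not
print(two_colourable(square, 1), two_colourable(triangle, 1))  # True False
```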
SSAD_HOPCOUNT(G, s)
1 for v in G.V
2     v.pending = false
3     v.d = ∞
4     v.𝛑 = NIL
5 s.pending = true
6 s.d = 0
7 s.𝛑 = NIL
8 Q = new Queue
9 ENQUEUE(Q, s)
10 while !QUEUE-EMPTY(Q)
11     u = DEQUEUE(Q)
12     for v in G.E.adj[u]
13         if !v.pending
14             v.pending = true
15             v.d = u.d + 1
16             v.𝛑 = u
17             ENQUEUE(Q, v)
💡 Subtlety! We have not provided the paths, only a data structure from which paths can be extracted.
To find the path from s to v, start at v, follow v.𝛑 until v.𝛑 = NIL, then reverse the list of vertices visited. 19
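A Python sketch of SSAD_HOPCOUNT plus the path-extraction step just described (the sample graph is a made-up example):

```python
from collections import deque
from math import inf

def ssad_hopcount(adj, s):
    """Single-source shortest (hop-count) distances d and predecessors pi, via BFS."""
    d = {v: inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if d[v] == inf:          # doubles as the 'pending' flag
                d[v] = d[u] + 1
                pi[v] = u
                q.append(v)
    return d, pi

def path_to(pi, v):
    """Follow pi pointers back to the source, then reverse."""
    path = []
    while v is not None:
        path.append(v)
        v = pi[v]
    return list(reversed(path))

adj = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
d, pi = ssad_hopcount(adj, "s")
print(d["d"], path_to(pi, "d"))   # 3 ['s', 'a', 'c', 'd']
```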
Analysis of SSAD_HOPCOUNT(G, s)
Initialisation loop (lines 1–4) costs Θ(|V|). Lines 5–7 are O(1).
Line 8: initialising a new Queue with max length V takes O(1) to O(|V|) time,
depending on how the memory allocator works and the queue implementation.
The WHILE loop and nested FOR loop eventually process every edge at most
once (exactly once – the worst case – when G is connected) so this is O(|E|) if we
use an adjacency list representation.
Each vertex is enqueued and dequeued (both O(1)) at most once: total O(|V|) time.
20
Analysis of SSAD_HOPCOUNT(G, s) with Adjacency Matrix
10 while !QUEUE-EMPTY(Q)
11 u = DEQUEUE(Q)
12 for v in G.V
13 if G.E.M[u][v]==1 && !v.pending
To use an adjacency matrix, we cannot loop through adjacent vertices on line 12.
Instead, we loop through all vertices and process them only if the edge matrix
(G.E.M) contains 1 (edge present) in position u,v.
This increases the cost of the loops to O(|V|²) (not Θ(|V|²) because disconnected vertices will never be enqueued/dequeued).
21
Correctness of SSAD_HOPCOUNT(G, s) [1]
Goal: prove that, when SSAD_HOPCOUNT terminates, for all v ∈ G.V, v.d is the
length of a shortest path from s to v. (‘a’ as equal-shortest paths are possible.)
Let the shortest-path distance 𝛅(s, v) be the actual shortest path length: the minimum number of edges on any path between s and v. If there is no path between s and v, we say that 𝛅(s, v) = ∞.
Lemma 1: for any edge (u, v) ∈ G.E, 𝛅(s, v) ≤ 𝛅(s, u) + 1, because any path from s to u can be extended by the edge (u, v) to reach v.
22
Correctness of SSAD_HOPCOUNT(G, s) [2]
Next we prove Lemma 2: on termination, for all v ∈ G.V we have v.d ≥ 𝛅(s, v).
The induction hypothesis is that for all v ∈ G.V we have v.d ≥ 𝛅(s, v).
Base case: immediately before the WHILE loop begins, we have v.d = ∞ ≥ 𝛅(s, v) for all vertices except the source, where s.d = 0 = 𝛅(s, s).
23
Correctness of SSAD_HOPCOUNT(G, s) [3]
Inductive case: the WHILE/FOR loops only change the value of v.d if v was not
pending when the loop began, so non-pending nodes are those we must consider.
The hypothesis tells us that u.d ≥ 𝛅(s, u) and the assignment v.d = u.d + 1
(line 15) gives us that:
v.d = u.d + 1
≥ 𝛅(s, u) + 1
≥ 𝛅(s, v) by Lemma 1
v.d is never changed again because its pending flag is set on line 14.
The result follows by induction.
💡 This tells us that the algorithm does not set any v.d to be too low but we have not yet proved that it has
set any v.d low enough to correspond to a shortest path length. 24
Correctness of SSAD_HOPCOUNT(G, s) [4]
Next we show that the queue only ever contains vertices with at most two different
values of v.d, using induction on the number of queue operations. Lemma 3: After
each call to ENQUEUE and DEQUEUE, 𝛟 holds:
𝛟: if Q = v1, v2, v3, .. vx (head .. tail) then vx.d ≤ v1.d + 1 and vi.d ≤ vi+1.d for i =
1..x-1
The induction hypothesis, 𝛟, thus holds after every DEQUEUE and ENQUEUE.
26
Correctness of SSAD_HOPCOUNT(G, s) [6]
A corollary of Lemma 3 is useful: if SSAD_HOPCOUNT enqueues va before vb then
va.d ≤ vb.d on termination.
Proof: vertices are only given a finite value once during the execution of the
algorithm. Lemma 3 tells us that the ‘d’ attributes of queued elements are ordered
so va.d ≤ vb.d on termination. This comes directly from 𝛟 when a and b are in the
queue simultaneously, and we appeal to the transitivity of ≤ when a and b are not
simultaneously in the queue.
27
Correctness of SSAD_HOPCOUNT(G, s) [7]
Finally, we can prove the correctness of SSAD_HOPCOUNT on a directed or
undirected input graph, G. Explicitly, we want to show that the algorithm:
- Really does find all vertices v ∈ G.V that are reachable from s; and
- Really does terminate with v.d = 𝛅(s, v) for all v ∈ G.V.
To further prove that the paths discovered are correct, we must additionally show
that:
- One of the shortest paths from s to v is a shortest path from s to v.𝛑 followed
by the edge (v.𝛑, v).
28
Correctness of SSAD_HOPCOUNT(G, s) [8]
We use a proof by contradiction. If the algorithm doesn’t work then at least one
vertex was assigned an incorrect ‘d’ value. Let v be the vertex with the minimum 𝛅
(s, v) that has an incorrect v.d upon termination.
By Lemma 2, v.d ≥ 𝛅(s, v) so, since there’s an error, v.d > 𝛅(s, v). Furthermore, v
must be reachable from s as, otherwise, we would have 𝛅(s, v) = ∞ ≥ v.d which
contradicts v.d > 𝛅(s, v).
29
Correctness of SSAD_HOPCOUNT(G, s) [9]
Let u be the node on a shortest path from s to v that comes immediately before v.
𝛅(s, v) = 𝛅(s, u) + 1 so 𝛅(s, u) < 𝛅(s, v).
Because we chose v to be the incorrect vertex with minimum 𝛅(s, v), and 𝛅(s, u) < 𝛅(s, v), vertex u cannot also be incorrect; hence 𝛅(s, u) = u.d.
30
Correctness of SSAD_HOPCOUNT(G, s) [10]
When u was dequeued, vertex v might have been in one of three states:
Case 1: if v has not yet been enqueued then v.pending = false so the IF statement
that is executed when u is dequeued and processed will set v.d = u.d + 1. This
contradicts v.d > u.d + 1 so vertex v cannot fall under case 1.
31
Correctness of SSAD_HOPCOUNT(G, s) [11]
Case 2: if v is in the queue when u is dequeued and processed then some earlier vertex, w, must have encountered v as an adjacency and enqueued it, setting v.d = w.d + 1. Since w was dequeued before u, the corollary to Lemma 3 gives w.d ≤ u.d, and so v.d ≤ u.d + 1.
This contradicts v.d > u.d + 1 so vertex v cannot fall under case 2.
32
Correctness of SSAD_HOPCOUNT(G, s) [12]
Case 3: if v has already been dequeued when u is dequeued then v.d ≤ u.d, by the
corollary to Lemma 3. This contradicts v.d > u.d + 1 so vertex v cannot fall under
case 3.
All three cases yield a contradiction. As there were no mistakes (hopefully!) in the
consideration of the three cases, we are left to conclude the mistake must lie in
the assumption that led to the three cases, i.e. there can be no v with minimum 𝛅
(s, v) where v.d is incorrect.
If there is no “first time” that the algorithm goes wrong, then it must be correct!
33
Correctness of SSAD_HOPCOUNT(G, s) [13]
To show that the paths are correct (over and above their lengths being correct), we
simply note that the algorithm assigns v.𝛑 = u whenever it assigns v.d = u.d + 1
during the processing of edge (u, v) so, since v.d finishes at the correct value it
must be the case that a shortest path from s to v can be obtained by taking any
shortest path from s to v.𝛑 followed by the direct edge from v.𝛑 to v.
34
Predecessor Subgraph
Consider the edges (v.𝛑, v) for v ∈ G.V \ {s} computed by
SSAD_HOPCOUNT(G,s). (We remove s since s.𝛑 = NIL ∉ V so (s.𝛑, s) would not
be a valid edge.)
- VPSG = { v ∈ G.V | v.𝛑 ≠ NIL } ∪ {s} // i.e. all vertices reachable from s
- EPSG = { (v.𝛑, v) | v ∈ VPSG \ {s} }
35
Depth First Search, DFS(G)
DFS is similar to BFS but uses a stack instead of a queue, or a recursive
implementation can use the call stack to govern the exploration order (next slide).
DFS is often used on undirected graphs and no source vertex is specified: in this
case, DFS picks any vertex as the source, explores everything reachable, and
repeats with another randomly-chosen (and as yet unvisited) vertex as the source
until all vertices have been visited. This yields a forest (multiple trees).
Let “time” be a global clock that (effectively) numbers events in exploration order.
36
DFS(G)
1 for v in G.V
2     v.marked = false
3     v.𝛑 = NIL
4 time = 0
5 for s in G.V
6     if !s.marked
7         DFS-HELPER(G, s)

DFS-HELPER(G, u)
1 time = time + 1
2 u.discover_time = time
3 u.marked = true
4 for v in G.E.adj[u]
5     if !v.marked
6         v.𝛑 = u
7         DFS-HELPER(G, v)
8 time = time + 1
9 u.finish_time = time
💡 The running time is Θ(|V| + |E|) because all vertices and edges will eventually be explored. 37
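The same DFS, sketched in Python with an explicit global clock; the dictionaries stand in for the marked, discover_time, finish_time and 𝛑 attributes (the sample graph is invented):

```python
def dfs(adj):
    """DFS over the whole graph, recording discover/finish times and predecessors."""
    time = 0
    discover, finish, pi = {}, {}, {}

    def helper(u):
        nonlocal time
        time += 1
        discover[u] = time
        for v in adj[u]:
            if v not in discover:      # 'marked' == has a discover time
                pi[v] = u
                helper(v)
        time += 1
        finish[u] = time

    for s in adj:                      # restart from every unvisited vertex: a forest
        if s not in discover:
            pi[s] = None
            helper(s)
    return discover, finish, pi

adj = {1: [2, 3], 2: [4], 3: [], 4: [], 5: [1]}
disc, fin, pi = dfs(adj)
print(disc, fin)   # vertex 5 starts its own tree because 1..4 are already visited
```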
What’s the time?
v.discover_time is the global time value when DFS first considered v.
v.finish_time is the global time value when DFS finished recursing into all the
descendants of v.
38
Example DFS
[Figure: a sequence of slides stepping DFS through an example directed graph on eight vertices (the same graph reused for the strongly connected components example later), annotating each vertex first with its discover time and then with its finish time as the global clock advances.]
Classification of edges
We can classify the edges in G.E into four kinds:
1. An edge (u, v) ∈ G.E is a tree edge if v was discovered by exploring (u, v).
2. An edge (u, v) ∈ G.E is a back edge if it connects u to some ancestor, v, in the depth-first tree. (Self-loops, which can occur in directed graphs, count as back edges.)
3. An edge (u, v) ∈ G.E is a forward edge if it is not in the depth-first tree and
connects u to a descendant, v, in the tree.
4. All the other edges are cross edges and can run between vertices in the
same depth-first tree provided one vertex is not an ancestor of the other, or
they can run between depth-first trees (only possible in a directed graph).
⚠ These are not mutually exclusive: edges can have multiple classifications! 55
Properties [1]
● Every edge in an undirected graph is either a tree edge or a back edge.
56
Properties [2]
● Given an undirected graph, DFS will identify the connected components
(because it doesn’t matter which vertices we explore from when the edges are
undirected). The number of times DFS calls DFS-HELPER is the number of
connected components.
● If we run DFS on a directed acyclic graph (DAG) and then sort the vertices by finish time in descending order, we have a topological sort of the original graph!
57
Strongly Connected Components
The Strongly Connected Components problem is defined as follows.
Input: a directed graph, G = (V, E).
Output: the partition of V into maximal subsets such that, within each subset, every pair of vertices u, v is mutually reachable (u ⤳ v and v ⤳ u).
58
Strongly Connected Components Problem Instance
[Figure: an input graph, G = (V, E), directed, on vertices 1–8 and, below it, the same graph with its strongly connected components highlighted.]
59
💡 See CLRS chapter 22
STRONGLY-CONNECTED-COMPONENTS(G)
1. Run DFS on G to populate the finish_time for each vertex v ∈ G.V.
2. Compute GT
3. Run DFS on GT but in the main loop of DFS, call DFS-HELPER on vertices in
order of descending finish_time as computed in step 1.
4. For each tree in the forest produced by DFS(GT), output the vertices as a
separate strongly connected component of G.
60
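A compact Python sketch of these four steps (this is the Kosaraju formulation of the algorithm; the example graph is invented):

```python
def strongly_connected_components(adj):
    """Kosaraju-style SCCs: DFS for finish order, then DFS on the transpose."""
    order = []                                   # vertices in increasing finish time
    seen = set()

    def dfs1(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                dfs1(v)
        order.append(u)

    for u in adj:
        if u not in seen:
            dfs1(u)

    # Transpose: reverse every edge.
    adj_t = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            adj_t[v].append(u)

    sccs, assigned = [], set()
    def dfs2(u, comp):
        assigned.add(u)
        comp.append(u)
        for v in adj_t[u]:
            if v not in assigned:
                dfs2(v, comp)

    for u in reversed(order):                    # descending finish time
        if u not in assigned:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)
    return sccs

adj = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
print(strongly_connected_components(adj))        # [[1, 3, 2], [4, 5]] (order may vary)
```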
SCC1: Run DFS on original graph
[Figure: the example graph annotated with the discover/finish times from the earlier DFS run.]
SCC2: Compute GT
[Figure: the same graph with every edge reversed.]
SCC3a: Reverse sort nodes by finish time, DFS in that order
SCC3b: Continue DFS in that order
[Figure: successive DFS trees grown in GT, taking start vertices in descending order of finish time from step 1; each tree of the resulting forest is output as one strongly connected component of G.]
67
Shortest Path Problems [1]
Input: a directed, weighted graph, G = (V, E), with its weight function w: E → ℝ.
We define the weight of a path, p = v₀, v₁, v₂, … vₖ, as the sum of its edge weights: w(p) = ∑ᵢ₌₁..ₖ w(vᵢ₋₁, vᵢ).
The edge weights can represent any additive metric: time, cost, distance.
68
Shortest Path Problems [2]
The shortest path weight from u to v, 𝛅(u, v) = ∞ if there is no path from u to v,
and 𝛅(u, v) = minp(w(p)) otherwise, where the minimisation over p considers all
paths u ⤳ v.
A shortest path from u to v is any such path, p, with w(p) = 𝛅(u, v).
BFS solved one variant of the shortest path problem: the single-source shortest path problem for unweighted graphs or, equivalently, weighted graphs where all the edge weights have the same (finite, positive) value.
69
Shortest Path Problems [3]
What’s the output? Actually, there are several kinds of shortest path problems!
Single-Source Shortest Paths (SSSP): find the shortest paths from a single, specified source vertex to every other vertex.
Single-Destination Shortest Paths: find the shortest paths from every vertex to a single, specified destination vertex. Same as SSSP in GT.
Single-Pair Shortest Path: find the shortest path from u to v (both specified).
Best known algorithm has the same worst case cost as best SSSP algorithms.
All-Pairs Shortest Paths: find the shortest path between every pair of vertices.
70
Complications
Negative edge weights complicate matters: if a negative-weight cycle is reachable from the source then shortest paths to some vertices are undefined (we could go around the cycle forever, reducing the weight). It turns out that more efficient algorithms are possible in a subset of cases, such as graphs with no negative edge weights or no cycles.
71
BELLMAN-FORD(G, w, s)
This finds shortest paths from s ∈ G.V to every vertex in G.V that is reachable
from s – single source shortest paths – in O(|V||E|) time.
If the algorithm finds a negative weight cycle, it returns false. This indicates that
there is no solution to the single-source shortest paths problem for G.
If there is no negative weight cycle, it returns true. This indicates that the paths
found are valid. Paths are acyclic (they exclude zero-weight cycles).
Shortest paths are not returned explicitly but are encoded as 𝛑 attributes. This
takes less time to produce and no additional time to consume.
72
BELLMAN-FORD(G, w, s)
1 for v in G.V
2     v.d = ∞
3     v.𝛑 = NIL
4 s.d = 0
5 for i = 1 to |G.V|-1
6     for (u,v) in G.E RELAX(u, v, w)
7 for (u,v) in G.E if v.d > u.d + w(u, v) return false
8 return true

RELAX(u, v, w)
1 if v.d > u.d + w(u, v)
2     v.d = u.d + w(u, v)
3     v.𝛑 = u
Initialisation Θ(|V|). Line 5 runs Θ(|V|) times and line 6 takes Θ(|E|). Final check Θ(|E|). Overall O(|V||E|). 73
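A direct Python transcription of the pseudocode, sketched under the assumption that the graph is given as an edge list (the sample edges are only loosely modelled on the example that follows):

```python
from math import inf

def bellman_ford(vertices, edges, s):
    """edges is a list of (u, v, w) triples. Returns (ok, d, pi); ok is False if a
    negative-weight cycle is reachable from s."""
    d = {v: inf for v in vertices}
    pi = {v: None for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):           # |V| - 1 passes
        for u, v, w in edges:                    # relax every edge
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                pi[v] = u
    for u, v, w in edges:                        # final check for negative cycles
        if d[u] + w < d[v]:
            return False, d, pi
    return True, d, pi

vertices = ["A", "B", "E", "F", "G"]
edges = [("E", "A", 5), ("E", "G", 3), ("E", "F", 7), ("G", "F", 3), ("A", "B", 1)]
print(bellman_ford(vertices, edges, "E"))
```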
Example: BELLMAN-FORD(G, w, “E”)
[Figure: a weighted, directed graph on vertices A–G with source E, shown alongside a table of each vertex’s d and 𝛑 attributes and the fixed order (i–ix) in which the edges in E are relaxed.]
Initialisation: E.d = 0; every other vertex has d = ∞ and 𝛑 = NIL.
Iteration i=1: A.d = 5 (𝛑 = E), F.d = 7 (𝛑 = E), G.d = 3 (𝛑 = E); the rest stay at ∞.
💡 The order of the edges means we relax E→G too late to get F.d = 6 in this iteration.
Iteration i=2: B.d = 6 (𝛑 = A), C.d = 7 (𝛑 = A), D.d = 9 (𝛑 = C), F.d = 6 (𝛑 = G).
💡 C and D changed in the same iteration, due to the order of edges. This only speeds up convergence.
Remaining iterations: no further changes; the table is stable at A = 5, B = 6, C = 7, D = 9, E = 0, F = 6, G = 3.
Termination: final check – is there any edge with v.d > u.d + w(u, v)? No ⇒ return true. The distances, and the shortest paths encoded by the 𝛑 attributes, are valid.
Negative Cycle Example: BELLMAN-FORD(G, w, “E”)
[Figure: the same graph, but with two edge weights made negative so that a cycle reachable from E (through A, C and D) has negative total weight; again shown with the table of d and 𝛑 attributes after each iteration.]
Initialisation: E.d = 0; every other vertex has d = ∞ and 𝛑 = NIL.
Iteration i=1: A.d = −7 (𝛑 = E), F.d = 7 (𝛑 = E), G.d = 3 (𝛑 = E).
Iteration i=2: B.d = −6, C.d = −5, D.d = −3, E.d = −2, F.d = −1, G.d = 1.
Subsequent iterations: the d values never stabilise – each pass around the negative cycle lowers them further (by the last iteration shown, A.d = −13 and F.d = −7).
Termination: final check – is there any edge with v.d > u.d + w(u, v)? Yes, e.g. (A, C): C.d = −9 > A.d + w(A, C) = −13 + 2 ⇒ return false. The distances are invalid and there are no shortest paths.
87
Special cases for DAGs
Many important problems give rise to directed graphs that are naturally acyclic.
It turns out that we can solve this special case of the single-source shortest paths problem with lower asymptotic time complexity than the general case: Θ(|V| + |E|). The idea is to topologically sort the vertices (using DFS finish times, as above) and then relax the outgoing edges of each vertex once, in topological order.
Initialisation (lines 1–4 of the pseudocode) is Θ(|V|). Topological sort is Θ(|V| + |E|). Relaxing the edges (lines 6–8) is Θ(|E|). Total Θ(|V| + |E|). 88
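A sketch of this DAG special case in Python: topologically sort by DFS finish time, then relax each vertex's out-edges once in that order (the weighted DAG below is a made-up example):

```python
from math import inf

def dag_shortest_paths(adj, s):
    """Single-source shortest paths in a weighted DAG given as adjacency lists of
    (vertex, weight) pairs."""
    order, seen = [], set()
    def dfs(u):
        seen.add(u)
        for v, _ in adj[u]:
            if v not in seen:
                dfs(v)
        order.append(u)                 # appended at finish time
    for u in adj:
        if u not in seen:
            dfs(u)
    topo = list(reversed(order))        # descending finish time = topological order

    d = {v: inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    for u in topo:                      # relax each vertex's out-edges exactly once
        for v, w in adj[u]:
            if d[u] + w < d[v]:         # RELAX(u, v, w)
                d[v] = d[u] + w
                pi[v] = u
    return d, pi

adj = {"s": [("a", 2), ("b", 6)], "a": [("b", 3)], "b": [("c", 1)], "c": []}
print(dag_shortest_paths(adj, "s"))     # b reached via a with weight 5, c with 6
```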
Optimal Substructure
● If p = u ⤳ v = u, .. vi, .. vj, .. v is a shortest path from u to v through the
weighted edges of some graph G,
● …and it goes via vi and vj in that order (although not necessarily adjacently),
● …then the subpath from vi ⤳ vj is a shortest path from vi to vj.
This means we can look to dynamic programming methods and greedy algorithms
to provide efficient solutions to problems involving shortest paths! Let’s see how
to exploit this in other algorithms.
89
DIJKSTRA(G, w, s)
Dijkstra’s algorithm solves the single-source shortest paths problem using a
greedy strategy to exploit the optimal substructure of shortest paths.
Dijkstra’s algorithm works on directed graphs with non-negative edge weights, i.e.
w(u,v) ≥ 0 for all (u, v) ∈ G.E.
The greedy algorithm achieves a lower cost than Bellman-Ford (albeit that
Bellman-Ford can handle negative edges and detects negative cycles).
90
DIJKSTRA(G, w, s)
1 for v in G.V
2     v.d = ∞
3     v.𝛑 = NIL
4 s.d = 0
5 S = EMPTY-SET
6 Q = new PriorityQueue(G.V)
7 while !PQ-EMPTY(Q)
8     u = PQ-EXTRACT-MIN(Q)
9     S = S ∪ {u}
10    for v in G.E.adj[u]
11        RELAX(u, v, w)
💡 The priority queue uses the ‘d’ attribute as the ordering key. Changing ‘d’ (in RELAX) implicitly calls
DECREASE-KEY.
💡 Note that the set, S, of nodes whose shortest paths have been found, is not used. We could delete
lines 5 and 9 without consequence. S is included in most presentations of Dijkstra’s Algorithm because
Dijkstra’s original description used it, and we will use it for the proof of correctness. 91
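A Python sketch using the standard-library binary heap; because heapq has no DECREASE-KEY, stale queue entries are simply skipped when popped, which is a common workaround rather than the lecture's formulation. The edge list is only loosely based on the running example:

```python
import heapq
from math import inf

def dijkstra(adj, s):
    """Dijkstra over adjacency lists of (vertex, weight) pairs, non-negative weights."""
    d = {v: inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    q = [(0, s)]
    while q:
        du, u = heapq.heappop(q)
        if du > d[u]:                    # stale entry: u was already finalised
            continue
        for v, w in adj[u]:              # RELAX each outgoing edge
            if d[u] + w < d[v]:
                d[v] = d[u] + w
                pi[v] = u
                heapq.heappush(q, (d[v], v))
    return d, pi

adj = {"E": [("A", 5), ("F", 7), ("G", 3)], "G": [("F", 3)],
       "A": [("B", 1), ("C", 2)], "B": [], "C": [("D", 2)], "D": [], "F": []}
d, pi = dijkstra(adj, "E")
print(d)   # {'E': 0, 'A': 5, 'F': 6, 'G': 3, 'B': 6, 'C': 7, 'D': 9}
```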
Example: DIJKSTRA(G, w, “E”)
[Figure: the same weighted, directed graph on vertices A–G with source E, shown alongside the table of d and 𝛑 attributes and the contents of the priority queue after each iteration.]
Initialisation: E.d = 0, every other d = ∞; the priority queue holds all seven vertices.
Iteration 1: extract E and relax its out-edges: A.d = 5 (𝛑 = E), F.d = 7 (𝛑 = E), G.d = 3 (𝛑 = E).
Iteration 2: extract G (d = 3); relaxing G→F improves F.d to 6 (𝛑 = G).
Iteration 3: extract A (d = 5); B.d = 6 (𝛑 = A), C.d = 7 (𝛑 = A).
Iterations 4 and 5: extract F (d = 6) and then B (d = 6); no improvements.
Iteration 6: extract C (d = 7); D.d = 9 (𝛑 = C).
Iteration 7: extract D (d = 9); the priority queue is now empty.
Termination: shortest paths output. The final table matches the Bellman-Ford example: A = 5, B = 6, C = 7, D = 9, E = 0, F = 6, G = 3, with the same 𝛑 attributes.
100
Correctness of DIJKSTRA(G, w, s) [1]
We want to show that when DIJKSTRA runs on a directed graph, G = (V, E), with
non-negative edge weights and source s, it terminates with v.d = 𝛅(s, v) for all
v ∈ G.V.
We show that the following property, ɸ, is true at the start of each iteration of the WHILE loop (lines 7–11):
ɸ: for every vertex v ∈ S, v.d = 𝛅(s, v).
101
Correctness of DIJKSTRA(G, w, s) [2]
Initialisation: at the start of the first iteration, S = ∅ so ɸ is vacuously true.
Maintenance: suppose, for a contradiction, that u is the first vertex added to S for which u.d ≠ 𝛅(s, u) at the moment it is added.
We know that u ≠ s since s.d = 0 = 𝛅(s, s), and hence S ≠ ∅ when u was added.
There must be some path s ⤳ u to be found, since otherwise u.d = ∞ = 𝛅(s, u), and hence some shortest path to be found.
102
Correctness of DIJKSTRA(G, w, s) [3]
Before we add u to S, the shortest path p = s ⤳ u can be split p = s ⤳ y ⤳ u,
where y ∉ S is the first vertex in p not to be in S. Let x ∈ S be the predecessor to
y in path p then we can write p = s ⤳p1 x → y ⤳p2 u. (Either/Both p1 and p2 might
be empty.)
We know that x.d = 𝛅(s, x) when x was added to S because u is the first vertex for
which this failed. The edge (x, y) was relaxed in the iteration that added x to S so
we know that y.d = 𝛅(s, y) – this is known as the convergence property.
Convergence property of RELAX(i, j):
If s ⤳ i → j is a shortest path in G and i.d = 𝛅(s, i) before edge (i, j) is relaxed,
then j.d = 𝛅(s, j) afterwards.
103
Correctness of DIJKSTRA(G, w, s) [4]
Proof of Convergence property of RELAX(i, j):
If s ⤳ i → j is a shortest path in G and i.d = 𝛅(s, i) before edge (i, j) is relaxed,
then j.d = 𝛅(s, j) afterwards.
Relaxing (i, j) sets j.d ≤ i.d + w(i, j) = 𝛅(s, i) + w(i, j) = 𝛅(s, j), the last step because s ⤳ i → j is a shortest path to j.
And since we know that j.d never underestimates 𝛅(s, j), we have j.d = 𝛅(s, j).
104
Correctness of DIJKSTRA(G, w, s) [5]
Back to Dijkstra. Since the weights are non-negative and y is before u in our
shortest path, p, we know that 𝛅(s, y) ≤ 𝛅(s, u) and hence…
y.d = 𝛅(s, y)
≤ 𝛅(s, u)
≤ u.d (since we assume u.d is incorrect and it cannot be less)
Both u ∉ S and y ∉ S when u was taken from the priority queue, and u was the minimum, so u.d ≤ y.d.
Combining the two chains gives y.d = 𝛅(s, y) ≤ 𝛅(s, u) ≤ u.d ≤ y.d, so all of these are equal and, in particular, u.d = 𝛅(s, u). This contradicts our assumption, so ɸ is maintained.
105
Correctness of DIJKSTRA(G, w, s) [6]
Termination: when we terminate, the priority queue, Q, is empty. Since Q = V \ S
we must have processed all vertices when DIJKSTRA terminates.
Therefore the maintained property applies to every vertex and we have that
v.d = 𝛅(s, v) for all v ∈ V, i.e. ɸ is true and we have proved the correctness of
Dijkstra’s algorithm.
It follows that the predecessor subgraph G𝛑, is a shortest path tree rooted at s,
i.e. not only are the distances correct but the paths obtained by following the 𝛑
attributes are also correct.
106
Analysis of DIJKSTRA(G, w, s)
The initialisation takes Θ(|V|) time. Initialising a priority queue takes O(1) to O(|V|)
depending on the type of priority queue (and memory allocator) used.
107
Analysis of DIJKSTRA(G, w, s) with an array / hash table
We can implement the priority queue using an array (or hash table) holding (d, 𝛑)
for each vertex v ∈ [1, 2, .. |V|]. PQ-INSERT takes O(1) time per vertex.
PQ-EXTRACT-MIN takes O(|V|) time to search the array for the smallest ‘d’.
PQ-EMPTY is O(1) because we can keep a counter. PQ-DECREASE-KEY is O(1), because we must only change ‘d’ in one array position.
The total is O(|V|) for insertions + O(|V|²) for extractions + O(|E|) for decrease-keys, i.e. O(|V|²) overall.
108
Analysis of DIJKSTRA(G, w, s) with a min-heap
For a min-heap keyed by ‘d’, PQ-INSERT takes O(lg |V|) time per vertex, or,
smarter, we can insert all vertices then run FULL-HEAPIFY to build a heap in O(|V|)
time (although if s is first there is no need since all other keys are infinite).
PQ-EXTRACT-MIN takes O(lg |V|) time. PQ-EMPTY is O(1) because we can keep a
counter. PQ-DECREASE-KEY is O(lg |V|), to REHEAPIFY that node in the heap.
The final cost is O(|V| + |V| lg |V| + |V|·1 + |E| lg |V|) = O((|V| + |E|) lg |V|).
(initialisation + extractions + empty checks + decrease-keys)
All-Pairs Shortest Paths Problem
Input: a directed, weighted graph, G = (V, E), with its weight function w: E → ℝ.
Output: a |V|×|V| matrix, D = (dij), where dij = 𝛅(i, j) is the shortest path weight from i to j (∞ if j is unreachable from i).
One solution is obvious: run a single-source shortest path algorithm with each vertex v ∈ G.V in turn as the source.
110
All-Pairs Shortest Paths via BELLMAN-FORD
Running BELLMAN-FORD once with each vertex v ∈ G.V as the source costs O(|V| · |V||E|) = O(|V|²|E|) overall. This works even when edge weights are negative (and still reports negative cycles).
111
All-Pairs Shortest Paths via DIJKSTRA
If the edge weights are non-negative, w(i, j) ≥ 0 for all i,j ∈ G.V, we can use
Dijkstra’s algorithm.
Using a heap for the priority queue, each source costs O((|V| + |E|) lg |V|) and
overall we have O((|V| + |E|) |V| lg |V|) running time.
Using an asymptotically optimal priority queue (as we shall see later in the
course), we can achieve O(|V|2 lg |V| + |V||E|) overall running time.
112
Matrix Methods
We use the adjacency matrix representation.
If G.E.M is the square matrix of edge weights, consider the matrix G.E.M x G.E.M
(i.e. the matrix multiplied by itself).
- If we reinterpret the scalar + and scalar * operations that are used in matrix
multiplication as MIN and + respectively then…
- Element (i, j) in the resulting matrix is MINk{ M[i][k] + M[k][j] } over all k ∈ G.V (because regular multiplication would set (i, j) to ∑k M[i][k] · M[k][j] over all k).
This adds one ‘hop’ to the end of all paths represented in the left matrix.
113
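A small sketch of one (MIN, +) 'multiplication' in Python (matrix values are illustrative):

```python
from math import inf

def min_plus_multiply(A, B):
    """One 'multiplication' in the (MIN, +) semiring: C[i][j] = min_k (A[i][k] + B[k][j]).
    Applied to a weight matrix, this extends every known path by one more hop."""
    n = len(A)
    C = [[inf] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            C[i][j] = min(A[i][k] + B[k][j] for k in range(n))
    return C

# Weight matrix: W[i][j] = edge weight, 0 on the diagonal, inf where no edge exists.
W = [[0, 3, inf],
     [inf, 0, 4],
     [inf, inf, 0]]
W2 = min_plus_multiply(W, W)      # shortest paths using at most 2 edges
print(W2[0][2])                   # 7, via the intermediate vertex 1
```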
Repeated squaring
Because no shortest path needs more than |V| − 1 edges, the matrix (G.E.M)^x (under the MIN/+ ‘multiplication’) is a matrix of all shortest path weights provided x ≥ |V| − 1.
Rather than multiplying |V| − 1 times, we can square repeatedly – (G.E.M)^2, (G.E.M)^4, (G.E.M)^8, … – so only ⌈lg(|V| − 1)⌉ matrix ‘multiplications’ are needed, each costing O(|V|³), for O(|V|³ lg |V|) overall.
114
Dynamic Programming on Graphs: Floyd-Warshall [1]
We can use dynamic programming to solve the all-pairs shortest path problem.
For any i,j ∈ G.V, consider a minimum weight path p = i ⤳ j that only has
intermediate vertices in a subset {1, 2, .. k} ⊆ G.V.
115
Dynamic Programming on Graphs: Floyd-Warshall [2]
For any i,j ∈ G.V, consider a minimum weight path p = i ⤳ j that only has intermediate vertices in a subset {1, 2, .. k} ⊆ G.V.
Either p avoids vertex k entirely, in which case it is also a minimum weight path using intermediates only in {1, 2, .. k-1}; or p passes through k (exactly once, since p has minimum weight), splitting into i ⤳ k and k ⤳ j, each a minimum weight path using intermediates only in {1, 2, .. k-1}.
Dynamic Programming on Graphs: Floyd-Warshall [3]
This observation gives us a dynamic programming approach! Working bottom-up,
the minimum weight paths i ⤳ j using no intermediates are the edge weights.
For k = 1 to |G.V|
For each i,j ∈ G.V
Lookup the min weight path i ⤳ j only using vertices {1, 2, .. k-1} [x]
Lookup the min weight paths i ⤳ k and k ⤳ j using only {1, 2, .. k-1}[y,z]
Set the min weight path i ⤳j using {1, 2, .. k} as MIN(x, y+z)
The two “Lookup” steps refer to smaller instances of the same problem that have
already been solved. The “Set…” step saves a value that will be looked up later.
117
FLOYD-WARSHALL(G, w)
1 D(0) = w
2 for k = 1 to |G.V|
3     let D(k) = (dij(k)) be a new matrix
4     for i = 1 to |G.V|
5         for j = 1 to |G.V|
6             dij(k) = min(dij(k-1), dik(k-1) + dkj(k-1))
7 return D(|G.V|)
💡 Floyd-Warshall finds the matrix of all-pairs shortest path lengths in O(|V|³) running time. 118
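A Python sketch of the algorithm; note it updates a single matrix in place, a standard space-saving variant of the D(0) … D(|V|) formulation above (the weight matrix is a made-up example):

```python
from math import inf

def floyd_warshall(W):
    """All-pairs shortest path weights from an n x n weight matrix W
    (W[i][j] = inf where there is no edge, 0 on the diagonal)."""
    n = len(W)
    D = [row[:] for row in W]                     # copy so W is not modified
    for k in range(n):                            # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

W = [[0, 3, 8, inf],
     [inf, 0, inf, 1],
     [inf, 4, 0, inf],
     [2, inf, inf, 0]]
print(floyd_warshall(W)[0][3])    # 4: 0 -> 1 -> 3 with weight 3 + 1
```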
Extensions to Floyd-Warshall
1. In parallel with D(k), keep a matrix 𝚷(k) = (𝛑ij(k)) where 𝛑ij(k) is the predecessor
of j in a minimum weight path from i using intermediates in {1, 2, .. k}.
a. Initialise 𝛑ij(0) to NIL if i = j or (i,j) ∉ G.E; and to i otherwise.
b. Set 𝛑ij(k) to 𝛑ij(k-1) or 𝛑kj(k-1) corresponding to which of the two options was selected by MIN on
line 6.
2. To compute the transitive closure of G.E, G.E*, run Floyd-Warshall with
w(i, j) = 1 for all (i, j) ∈ G.E. Interpret the output matrix, D = (dij), as follows:
a. If dij < ∞ then (i, j) ∈ G.E*
b. Otherwise, (i, j) ∉ G.E*
To compute G.E*, we can also interpret G.E as Booleans (edge = true) then run
Floyd-Warshall with MIN interpreted as Boolean OR and + as AND.
119
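A sketch of extension 2 in Python, interpreting the matrix entries as Booleans; here the diagonal is treated as reachable (every vertex reaches itself by the empty path), which is an assumption of this sketch:

```python
def transitive_closure(adj_matrix):
    """Floyd-Warshall with MIN read as OR and + as AND: reach[i][j] becomes True
    exactly when there is a path from i to j."""
    n = len(adj_matrix)
    reach = [[bool(adj_matrix[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

M = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
print(transitive_closure(M)[0][2])   # True: 0 -> 1 -> 2
```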
Johnson’s Algorithm
Johnson’s algorithm solves the all-pairs shortest paths problem with expected
running time O(|V|2 lg |V| + |V||E|).
Johnson’s algorithm can handle negative edge weights, and will detect negative
cycles and report that no solution exists.
120
Reweighting [1]
In order to run Dijkstra’s algorithm with every vertex as the source, we need to
ensure there are no negative edge weights.
Specifically, we require a new set of edge weights, 𝒘(u, v), such that (a) 𝒘(u, v) ≥ 0 for every edge (u, v) ∈ G.E, and (b) a path is a shortest path under 𝒘 exactly when it is a shortest path under w.
We cannot add a bias, b, to every edge weight such that b + w(u, v) ≥ 0 for all
(u, v) ∈ G.E because paths are different lengths: longer paths would be
penalised.
121
Reweighting [2]
Define 𝒘(u, v) = w(u, v) + h(u) - h(v)
where h : V → ℝ is a function mapping vertices to real numbers.
We cannot add a bias, b, to every edge weight such that b + w(u, v) ≥ 0 for all
(u, v) ∈ G.E because paths are different lengths: longer paths would be
penalised.
122
Reweighting [3]
It is easy to show that the weight of any cycle is unchanged by the reweighting, so there is a negative cycle under 𝒘 exactly when there is one under w: consider a cyclic path p = v1, v2, .. vn, v1. The sum of edge weights under 𝒘 is
𝒘(p) = 𝒘(v1, v2) + 𝒘(v2, v3) + … + 𝒘(vn, v1)
= w(v1, v2) + h(v1) − h(v2) + w(v2, v3) + h(v2) − h(v3) + … + w(vn, v1) + h(vn) − h(v1)
= w(v1, v2) + w(v2, v3) + … + w(vn, v1)        (the h terms telescope and cancel)
= w(p)
If p = v1, v2, .. vn, is a shortest path under w then it also is under 𝒘 because
𝒘(p) = w(p) + h(v1) - h(vn) but h(v1) and h(vn) do not depend on the path. If some
path v1 ⤳ vn minimises w(p), it must also minimise 𝒘(p).
123
Reweighting [4]
From our input graph G = (V, E), construct an augmented graph, G’ = (V’, E’): add a new source vertex s (V’ = V ∪ {s}) and a zero-weight edge (s, v) for every v ∈ V (E’ = E ∪ {(s, v) | v ∈ V}). Run BELLMAN-FORD(G’, w, s) and set h(v) = 𝛅(s, v) = v.d for every v.
Note: this ensures that 𝒘(u, v) = w(u, v) + h(u) − h(v) ≥ 0, because h(v) ≤ h(u) + w(u, v) (the triangle inequality for shortest path weights).
124
JOHNSON(G, w)
1 Compute G’ = (G.V ∪ {s}, G.E ∪ {(s,v) | v ∈ G.V}), with w(s,v) = 0
2 if !BELLMAN-FORD(G’, w, s) then error(“Negative cycle!”)
      // h(x) = x.d = 𝛅(s,x), as computed by Bellman-Ford
3 for (u,v) in G.E 𝒘(u,v) = w(u,v) + G’.V[u].d - G’.V[v].d
4 let D = (duv) be a new matrix
5 for u in G.V
6     DIJKSTRA(G, 𝒘, u)
7     for v in G.V duv = G.V[v].d - G’.V[u].d + G’.V[v].d
      // undo the reweighting to restore original weights
8 return D
125
Analysis of JOHNSON(G, w)
● (Line 1) Computing G’ costs O(|V|) time.
● (Line 2) BELLMAN-FORD takes O(|V’||E’|) = O(|V||E|).
● (Line 3) Calculating new edge weights take O(|E|) time.
● (Line 6) DIJKSTRA run |G.V| times costs O(|V|2 lg |V| + |V||E|) time (using a
clever priority queue), or O(|V||E| lg |V|) with a heap.
● (Line 7) Un-reweighting costs O(|V|²).
Overall, the DIJKSTRA runs dominate, giving the O(|V|² lg |V| + |V||E|) expected running time quoted earlier.
126
Algorithms 2
Section 2: Graphs and Subgraphs
127
💡 Reference: CLRS2 chapter 26
Flow Networks [1]
A flow network is a directed graph G = (V, E) in which each edge (u, v) ∈ E has a non-negative capacity c(u, v) ≥ 0; if (u, v) ∉ E then c(u, v) = 0, and we do not allow both (u, v) and (v, u) to be edges.
V contains two distinguished vertices, s and t, known as the source and the sink of the flow.
128
Flow Networks [2]
All vertices are on some path s ⤳ v ⤳ t so |E| ≥ |V| - 1 (every vertex other than s
must have at least one inbound edge).
Put another way, we can delete any vertex v (and its incident edges) if v is not
reachable from s or t is not reachable from v. For the problems we want to solve,
such vertices never alter the solution.
129
Definition of a flow
A flow f(u, v) in G is a function of type V ⨉ V → ℝ with two properties:
- Capacity constraint: 0 ≤ f(u, v) ≤ c(u, v) for all u, v ∈ V.
- Flow conservation: for every u ∈ V \ {s, t}, Σv ∈ V f(v, u) = Σv ∈ V f(u, v) (flow in equals flow out).
f(u, v) is defined between all pairs of vertices in G and is known as the flow from u to v.
130
Value of a flow
We denote the value of a flow f as |f|, where |f| = Σv ∈ V f(s, v) − Σv ∈ V f(v, s), i.e. the total flow out of the source minus the total flow into the source.
The second term is usually zero because there is no flow into the source but, as we will see, we want to generalise the networks to which we apply this idea and the edges into the source will not always carry zero flow.
131
Maximum Flow Problem
Input: a flow network, i.e. a directed graph G = (V, E) with edge capacities
c(u, v) ≥ 0, and two distinguished vertices s,t ∈ V being the source and sink.
Output: a flow f whose value |f| is maximum over all flows in the network. Note that we seek to determine the flow, not just the flow value.
132
Antiparallel edges [1]
We said that we do not allow having both (u, v) and (v, u) be edges in E.
Several algorithms that solve the Maximum Flow Problem require this.
We cannot simplify antiparallel edges to a single edge with the net capacity because we might want to use only the capacity in the smaller magnitude direction, or all the capacity in the larger direction.
[Figure: two vertices joined by antiparallel edges of capacities 7 and 4.]
133
Antiparallel edges [2]
We can handle antiparallel edges by introducing additional vertices to split one of
the edges. Two new edges are assigned the same capacity as the original they
replace.
This means we can require no antiparallel edges without limiting the set of
problems our algorithms can solve.
[Figure: one of the antiparallel edges (capacity 7) is split via a new intermediate vertex into two edges, each of capacity 7; the other edge (capacity 3) is unchanged.]
134
Supersources and Supersinks [1]
If we want to model a system where flow originates from multiple sources (s1 .. sm) and is consumed by multiple sinks (t1 .. tn), we can add additional vertices and edges: a single supersource s with an edge (s, si) of capacity ∞ to each original source, and a single supersink t with an edge (tj, t) of capacity ∞ from each original sink.
This reduces the multiple source, multiple sink problem to the single source, single
sink problem. We lose no generality by only considering solutions to the single
source, single sink problem.
135
Supersources and Supersinks [2]
[Figure: a multi-source, multi-sink network with the supersource and supersink added; all the new edges have capacity ∞.]
Three ideas underpin the Ford-Fulkerson method, covered next:
1. Residual networks
2. Augmenting paths
3. Cuts
137
Residual Networks
Given a flow network G = (V, E) and a flow f, the residual network Gf contains
residual edges showing how we can change the flow:
1. If an edge (u, v) ∈ E and f(u, v) < c(u, v) then we can add more flow to the
edge: up to c(u, v) - f(u, v) more. NB: there is no edge if f(u, v) = c(u, v) !!
2. If f(u, v) > 0 then we can cancel flow that is already present by adding flow in
the reverse direction: up to f(u, v) along edge (v, u) [note reverse direction]
Note that (2) allows the residual network to contain edges not in G, and Gf might
include antiparallel edges.
138
Residual Capacity
Given a flow network G = (V, E) and a flow f, the residual edges (u, v) in the residual network Gf have residual capacities cf(u, v) where
cf(u, v) = c(u, v) − f(u, v) if (u, v) ∈ E;  cf(u, v) = f(v, u) if (v, u) ∈ E;  and cf(u, v) = 0 otherwise.
139
Augmentation
Any flow f’ in the residual network Gf can be added to the flow f to make a valid flow in G, because the flow assigned to every edge cannot exceed its capacity and cannot become negative. This is augmentation, written as f ⭡ f’:
(f ⭡ f’)(u, v) = f(u, v) + f’(u, v) − f’(v, u) for (u, v) ∈ E, and 0 otherwise.
The value of the augmented flow is |f ⭡ f’| = |f| + |f’|.
Augmenting Paths
Given a flow network G and a flow f, an augmenting path is a simple path p from
s to t in the residual network.
The maximum amount by which we can increase the flow along each edge in p is called the residual capacity of the path p:
cf(p) = min { cf(u, v) | (u, v) is on p }
Notice that if we augment flow f with the residual capacities along each edge (u, v)
on p, then we get a flow with strictly larger value: |f ⭡ fp| = |f| + |fp| > |f|.
141
FORD-FULKERSON(G, s, t)
1 Initialise flow f to 0 on all edges
2 while there exists an augmenting path p in
the residual network Gf
3 augment the flow f along p
4 return f
142
Why do we get a maximum flow?
We saw that Ford-Fulkerson augments a flow using augmenting paths until no
more augmenting paths can be found.
The Max-Flow Min-Cut Theorem tells us that this technique will work.
143
Cuts [1]
A cut (S, T) of a flow network G = (V, E) is a partition of V into S and T = V \ S
such that s ∈ S and t ∈ T.
For a flow f, we define the net flow f(S, T) across the cut (S, T) as f(S, T) = ∑u∈S ∑v∈T f(u, v) − ∑u∈S ∑v∈T f(v, u).
Given a flow network G with source s and sink t, and a flow f, let (S, T) be any cut
of G. The net flow across (S, T) is f(S, T) = |f|. (The proof follows from the
definition of flow conservation.)
144
Cuts [2]
The capacity of the cut (S, T) is c(S, T) = ∑u∈S ∑v∈T c(u, v).
A minimum cut of a network is a cut whose capacity is minimum over all cuts of
the network.
The value of any flow f in a flow network G is bounded from above by the capacity
of any cut of G.
145
Max-Flow Min-Cut Theorem
If f is a flow in a flow network G = (V, E) with source s and sink t, then the following
conditions are equivalent:
1. f is a maximum flow in G;
2. The residual network Gf contains no augmenting paths; and
3. |f| = c(S, T) for some cut (S, T) of G.
146
Proof of Max-Flow Min-Cut Theorem [1]
1. f is a maximum flow in G;
2. The residual network Gf contains no augmenting paths; and
3. |f| = c(S, T) for some cut (S, T) of G.
(1 ⇒ 2, by contraposition): suppose Gf contains an augmenting path p. Then augmenting f by fp (the flow that pushes cf(p) along p) gives a flow of value |f| + |fp| > |f|, so f was not a maximum flow. Note that |fp| > 0 because we did not add edges to Gf with zero capacity.
147
Proof of Max-Flow Min-Cut Theorem [2]
1. f is a maximum flow in G;
2. The residual network Gf contains no augmenting paths; and
3. |f| = c(S, T) for some cut (S, T) of G.
(3 ⇒ 1): remember that the value of any flow f in a flow network G is bounded from above by the capacity of any cut of G, i.e. |f| ≤ c(S, T). So if |f| = c(S, T) for some cut, no flow can have a larger value than f, and f must be a maximum flow.
148
Proof of Max-Flow Min-Cut Theorem [3]
1. f is a maximum flow in G;
2. The residual network Gf contains no augmenting paths; and
3. |f| = c(S, T) for some cut (S, T) of G.
Suppose Gf has no augmenting paths (so no paths from s to t). Consider the
partition (S, T) where S = {v ∈ V | ∃ path from s to v in Gf}, and T = V \ S.
149
Proof of Max-Flow Min-Cut Theorem [4]
If (u, v) ∈ E then we must have f(u, v) = c(u, v) since, otherwise, there would be
residual capacity on the edge and (u, v) would be in Ef. That would place v ∈ S.
If (v, u) ∈ E then we must have f(v, u) = 0 because, otherwise, cf(u, v) = f(v, u) > 0
and we would have (u, v) ∈ Ef. That also places v ∈ S.
f(S, T) = ∑u∈S ∑v∈T f(u, v) - ∑u∈S ∑v∈T f(v, u) = ∑u∈S ∑v∈T c(u, v) - 0 = c(S, T)
Since the net flow across any cut equals |f|, this gives |f| = c(S, T), proving 2 ⇒ 3.
150
Proof of Max-Flow Min-Cut Theorem [5]
1. f is a maximum flow in G;
2. The residual network Gf contains no augmenting paths; and
3. |f| = c(S, T) for some cut (S, T) of G.
We have proven that 1 => 2 and that 2 => 3 and that 3 => 1, which suffices to
show the equivalence of all three statements in the Max-Flow Min-Cut Theorem.
151
Basic FORD-FULKERSON(G, s, t)
1 for (u, v) in G.E (u, v).f = 0
2 while there exists a path p from s to t in Gf
3 cf(p) = min{cf(u, v) | (u, v) is in p}
4 for (u, v) in p
5 if (u, v) ∈ G.E
6 (u, v).f = (u, v).f + cf(p)
7 else
8 (v, u).f = (v, u).f - cf(p)
152
Termination and Analysis
Interestingly, Ford-Fulkerson can fail to terminate if the edge capacities are
irrational numbers: augmenting paths can add tiny amounts of additional flow in a
series that is not convergent.
If all the capacities are integers, this cannot occur. We can find augmenting paths
using breadth-first search or depth first search, costing O(|Ef|) each time, which is
O(|E|) each time.
With integral capacities, the flow must increase by at least 1 each iteration so the
cost of FORD-FULKERSON is O(|E| |f*|), where f* is the maximum flow.
153
Optimisation: EDMONDS-KARP(G, s, t)
We find shortest augmenting paths using breadth-first search on the residual network but with edge weights all set to 1.
It can be shown that this algorithm has O(|V| |E|²) running time.
154
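A hedged Python sketch of the Edmonds-Karp refinement: BFS finds a shortest augmenting path in the residual network, which is then saturated. Capacities are kept in nested dictionaries and flow is tracked implicitly via residual capacities; the example network is invented:

```python
from collections import deque

def edmonds_karp(capacity, s, t):
    """Max flow value; capacity[u][v] is the edge capacity (0 or absent if no edge)."""
    cf = {u: dict(capacity[u]) for u in capacity}          # residual capacities
    for u in capacity:                                     # ensure reverse entries exist
        for v in capacity[u]:
            cf.setdefault(v, {}).setdefault(u, 0)

    def bfs_path():
        """Shortest augmenting path as a parent map, or None if t is unreachable."""
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c in cf[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    q.append(v)
        return None

    flow = 0
    while (parent := bfs_path()) is not None:
        bottleneck, v = float("inf"), t                    # residual capacity of the path
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, cf[u][v])
            v = u
        v = t
        while parent[v] is not None:                       # augment along the path
            u = parent[v]
            cf[u][v] -= bottleneck
            cf[v][u] += bottleneck
            v = u
        flow += bottleneck
    return flow

capacity = {"s": {"a": 10, "b": 5}, "a": {"b": 15, "t": 10},
            "b": {"t": 10}, "t": {}}
print(edmonds_karp(capacity, "s", "t"))   # 15
```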
Maximum Bipartite Matchings
Given an undirected graph G = (V, E), a matching M ⊆ E is a set of edges such that at most one edge of M is incident at each vertex v ∈ V.
155
Bipartite Graphs
An undirected bipartite graph G = (V, E) is one whose vertices can be partitioned as V = V1 ∪ V2 such that E ⊆ V1 ⨉ V2, i.e. every edge has one endpoint in V1 and the other in V2.
We also assume that every vertex has at least one incident edge.
156
Maximum Bipartite Matching Problem
Input: an undirected bipartite graph G = (V, E) where V = V1 ∪ V2 and E ⊆ V1 ⨉ V2.
Output: a matching M ⊆ E of maximum cardinality.
158
Maximum Matchings in Unweighted Bipartite Graphs
MAXIMUM-MATCHING(G):
1 let M = ∅
2 do
3     let a = FIND-AUGMENTING-PATH(G, M)
4     M = M ⊕ a        // ⊕ is symmetric difference; if a = NULL, M is unchanged
5 while (a != NULL)
6 return M
159
Proof: ∃ augmenting path until M is maximum [1]
Let M’ be a maximum matching, and let M be the matching we have at the moment, with |M| < |M’|. Consider the edges in the symmetric difference M ⊕ M’ and the subgraph of G that they form.
Notice that we cannot have two edges from M’ meeting at a vertex, nor two from M, since M’ and M are both matchings.
160
Proof: ∃ augmenting path until M is maximum [2]
We can find isolated vertices.
161
Proof: ∃ augmenting path until M is maximum [3]
We can have chains of even length, and loops (alternating cycles, which necessarily have even length).
162
Proof: ∃ augmenting path until M is maximum [4]
There are no other options because the maximum degree of any vertex is two: if
we had three or more incident edges at any vertex, at least two would need to
come from M or M’, which is impossible because both are matchings.
We know that |M’| > |M| if M is not maximum, and in that case there must be at
least one more edge from M’ than from M in the symmetric difference.
The loops and even chains use the same number of edges from M’ and M so
there must be at least one odd-length chain with one more edge from M’ than M.
163
Finding Augmenting Paths
A simple method is to run a variant of BFS or DFS starting from each unmatched
vertex in whichever of V1 and V2 has fewer unmatched vertices.
The search is constrained to alternate: it takes an edge (u, v) ∉ M for its first step, then an edge in M, then one not in M, and so on.
Repeat until either an augmenting path is found or the search gets stuck with no
further edges to follow. If so, start a new search from the next unmatched starting
vertex. If no search finds an augmenting path, there is none to be found.
164
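A Python sketch of this augmenting-path search for an unweighted bipartite graph, written recursively; the names and the tiny example graph are illustrative:

```python
def max_bipartite_matching(adj):
    """Augmenting-path matching. adj maps each vertex in V1 to its neighbours in V2.
    For each unmatched u in V1 we search for an alternating path ending at an
    unmatched V2 vertex, flipping matched/unmatched edges along it."""
    match = {}                       # match[v2] = the V1 vertex currently matched to v2

    def try_augment(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere.
            if v not in match or try_augment(match[v], visited):
                match[v] = u
                return True
        return False

    matching_size = 0
    for u in adj:                    # start a search from every V1 vertex
        if try_augment(u, set()):
            matching_size += 1
    return matching_size, match

adj = {"u1": ["v1", "v2"], "u2": ["v1"], "u3": ["v2", "v3"]}
print(max_bipartite_matching(adj)[0])   # 3: all three V1 vertices get matched
```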
Cost of Finding Augmenting Paths
The algorithm can find at most |V|/2 augmenting paths (each one increases |M| by one, and a matching has at most |V|/2 edges).
Each search costs O(|V| + |E|) = O(|E|) here (because every vertex has at least one incident edge), so the total cost is O(|V||E|).
We can do better…
165
HOPCROFT-KARP(G)
1 let M = ∅
2 do
3 a[] = ALL-VERTEX-DISJOINT-SHORTEST-AUGMENTING-PATHS(G, M)
4 M = M ⊕ a1 ⊕ a2 ⊕ … ⊕ aa.length
5 while (a.length != 0)
6 return M
166
ALL-VERTEX-DISJOINT-SHORTEST-AUGMENTING-PATHS(G, M)
These are minimum length augmenting paths for M, with no common vertices.
It can be shown that the WHILE loop requires only O(√|V|) iterations (by considering the maximum number of augmenting paths that can be found in each iteration), giving O(√|V| · |E|) running time overall.
167
Beware: Maximum and Maximal Matchings
A maximum matching is what we have been finding: a largest cardinality subset of non-adjacent edges in an input graph.
A maximal matching is merely one to which no further edge can be added without breaking the matching property; a maximal matching is not necessarily maximum.
168
💡 Reference: CLRS2 chapter 23
Minimum Spanning Trees
Input: a connected, undirected graph G = (V, E) with a real-valued weight function w defined on E.
Output: an acyclic subset T ⊆ E that connects all the vertices and whose total weight w(T) is minimal, where w(T) = ∑(u,v) ∈ T w(u,v).
Because T does connect all the vertices and T is acyclic, it must be that the edges
in T form a tree. Any such tree is a spanning tree. T is not (necessarily) rooted.
A minimum spanning tree is a spanning tree with minimum total edge weights,
and need not be unique.
169
Minimum Spanning Tree Example
[Figure: a small weighted, undirected graph (edge weights 1, 2 and 3) and one of its minimum spanning trees, which keeps the cheaper edges while remaining connected and acyclic.]
170
Computing Minimum Spanning Trees
We will see two iterative, greedy algorithms that exploit safe edges.
As the algorithms run, edges (u, v) ∈ E are added to A, always preserving the
property that A ⊆ T, for some T that is a minimum spanning tree. A safe edge is
one that can be added without violating the property.
Iteration continues until there are no more safe edges, at which point, A = T.
171
Cut, Cross, Respect, and Light
A cut (S, V \ S) of an undirected graph G = (V, E) is a partition of V.
An edge (u, v) ∈ E crosses the cut (S, V \ S) if one of its endpoints is in S and the other is in V \ S. A cut respects a set A of edges if no edge in A crosses the cut.
An edge crossing a cut is a light edge if its weight is the minimum of any edge crossing the cut. The minimum weight crossing the cut is unique but light edges are not necessarily unique: multiple crossing edges might have the same weight.
172
Safe Edge Theorem
Let G = (V, E) be a connected, undirected graph with real-valued weight function w defined on E. Let A ⊆ E be included in some minimum spanning tree for G, let (S, V \ S) be any cut of G that respects A, and let (u, v) be a light edge crossing (S, V \ S). Then (u, v) is a safe edge for A.
173
Proof of the Safe Edge Theorem [1]
Let T be a minimum spanning tree that includes A.
If T does not contain (u, v), we can show that another minimum spanning tree T’
exists and includes A ∪ {(u, v)}. This makes (u, v) a safe edge for A.
174
Proof of the Safe Edge Theorem [2]
Add (u, v) to T and note that this forms a cycle (since T is a spanning tree and
must already contain some unique path p = u ⤳ v).
(u, v) crosses the cut (S, V \ S), and there must be at least one edge in T, on the
path p, that also crosses the cut (since T is connected).
Let (x, y) be such an edge. (x, y) is not in A because the cut respects A.
Remove (x, y) from T and add (u, v) instead: call this T’. T’ must be connected
and acyclic (a tree).
175
Proof of the Safe Edge Theorem [3]
Calculate a bound on the weight of edges in T’: w(T’) = w(T) − w(x, y) + w(u, v) ≤ w(T).
The final inequality is because (u, v) is a light edge crossing (S, V \ S), i.e. for any
other edge (x, y) crossing the cut, w(u, v) ≤ w(x, y).
Since T was a minimum spanning tree, T’ must also be a minimum spanning tree.
So why is (u, v) a safe edge for A? That’s because A ⊆ T’ since A ⊆ T and the
removed edge (x, y) ∉ A, so A ∪ {(u, v)} ⊆ T’. Because T’ is an MST, (u, v) is a
safe edge for A.
176
Corollary
Let G = (V, E) be a connected, undirected graph with real-valued weight function w
defined on E. Let A be a subset of E that is included in some minimum spanning
tree for G, and let C = (VC, EC) be a connected component (tree) in the forest
GA = (V, A). If (u, v) is a light edge connecting C to some other component in G A
then (u, v) is a safe edge for A.
177
Kruskal’s Algorithm
Kruskal’s algorithm finds safe edges to add to a growing forest of trees by finding
least-weight edges that connect any two trees in the forest.
The corollary tells us that any such edge must be a safe edge (for either tree)
because it is the lightest edge crossing the cut that separates that tree from the
rest of the graph.
178
MST-KRUSKAL(G, w)
1 A = ∅
2 S = new DisjointSet; for v in G.V MAKE-SET(S, v)
3 MERGE-SORT(G.E) // Or any other non-decreasing sort
4 for (u, v) in G.E // Can also stop if |A|=|V|-1
5 if !IN-SAME-SET(S, u, v)
6 A = A ∪ {(u, v)}
7 UNION(S, u, v)
8 return A
179
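A runnable Python sketch of MST-KRUSKAL with a simple union-find (path compression plus union by rank); the edge list is a made-up example:

```python
def mst_kruskal(n, edges):
    """edges is a list of (w, u, v) with vertices 0..n-1. Union-find keeps
    IN-SAME-SET/UNION effectively O(1)."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):                          # find set representative, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):                      # union by rank
        rx, ry = find(x), find(y)
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1

    A = []
    for w, u, v in sorted(edges):         # non-decreasing order of weight
        if find(u) != find(v):            # u and v are in different trees: safe edge
            A.append((u, v, w))
            union(u, v)
    return A

edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (2, 2, 3), (3, 1, 3)]
print(mst_kruskal(4, edges))              # [(0, 1, 1), (1, 2, 2), (2, 3, 2)]
```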
Analysis of MST-KRUSKAL(G, w)
Creating a disjoint set with |V| separate sets costs Θ(|V|).
In the worst case, the FOR loop runs to completion: |E| iterations performing 1
IN-SAME-SET check each, and |V|-1 calls to UNION across all the iterations.
The total cost is ~O(|E| + |V|) since both disjoint-set representation operations cost
~O(1).
The sort costs O(|E| lg |E|), which is O(|E| lg |V|) since G is connected.
The cost of sorting dominates and we state that MST-KRUSKAL costs O(|E| lg |V|).
180
Prim’s Algorithm
Prim’s algorithm maintains that A is a single tree (not a forest), and adds safe edges between the tree and an isolated vertex, to increase the size of the tree until |A| = |V| − 1. Prim’s algorithm starts from an arbitrary vertex r ∈ V.
The corollary tells us that any such edge must be a safe edge because it is the lightest edge crossing the cut that separates the tree from the rest of the graph.
181
MST-PRIM(G, w, r)
1 Q = new PriorityQueue
2 for v in G.V v.key = ∞; v.𝛑 = NIL; PQ-ENQUEUE(Q, v)
3 PQ-DECREASE-KEY(Q, r, 0)
4 while !PQ-IS-EMPTY(Q)
5 u = PQ-EXTRACT-MIN(Q)
6 for v in G.E.adj[u]
7 if v ∈ Q && w(u, v) < v.key
8 v.𝛑=u; v.key=w(u,v); PQ-DECREASE-KEY(Q,v,v.key)
182
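A Python sketch of MST-PRIM using the standard-library binary heap; as with the Dijkstra sketch earlier, duplicate heap entries stand in for PQ-DECREASE-KEY, so this is the binary-heap variant rather than the Fibonacci-heap one analysed below (the example graph is invented):

```python
import heapq
from math import inf

def mst_prim(adj, r):
    """Prim's algorithm. adj[u] is a list of (v, w) pairs for an undirected graph
    (each edge listed in both directions). Returns the MST as (pi[v], v, weight) edges."""
    key = {v: inf for v in adj}
    pi = {v: None for v in adj}
    key[r] = 0
    in_tree = set()
    q = [(0, r)]
    while q:
        _, u = heapq.heappop(q)
        if u in in_tree:                 # stale entry: u already extracted
            continue
        in_tree.add(u)
        for v, w in adj[u]:
            if v not in in_tree and w < key[v]:
                key[v], pi[v] = w, u
                heapq.heappush(q, (w, v))
    return [(pi[v], v, key[v]) for v in adj if pi[v] is not None]

adj = {0: [(1, 1), (2, 3)], 1: [(0, 1), (2, 2), (3, 3)],
       2: [(0, 3), (1, 2), (3, 2)], 3: [(1, 3), (2, 2)]}
print(mst_prim(adj, 0))   # [(0, 1, 1), (1, 2, 2), (2, 3, 2)]
```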
Analysis of MST-PRIM(G, w, r) [1]
If we use a Fibonacci Heap as the implementation of the Priority Queue ADT then
the |V| calls to PQ-ENQUEUE cost O(1) amortised each (FH-INSERT).
The WHILE loop executes |V| times and each call to FH-EXTRACT-MIN costs
O(lg V) time so the total time is O(|V| lg |V|). (We should sum the costs as the size
of the PQ decreases but this over-approximation turns out to be asymptotically
accurate.)
183
Analysis of MST-PRIM(G, w, r) [2]
Across all iterations of the WHILE loop, the FOR loop covers every edge exactly
twice (once in each direction).
We need to test for membership of the Priority Queue, which is not a supported
operation in the ADT. We can implement this with a bit string: one bit per vertex,
initialise to 11..1 and set bits to zero when extracted; test membership looks at the
corresponding bit. This test for membership becomes O(1) time, and the updates
do not add to the corresponding big-O costs because they are O(1) per vertex.
The (at most |E|) calls to PQ-DECREASE-KEY cost O(1) amortised each for the Fibonacci Heap implementation.
184
Analysis of MST-PRIM(G, w, r) [3]
The total cost of MST-PRIM(G, w, r) is O(|E| + |V| lg |V|) amortised, which is better
than MST-KRUSKAL.
Either term could be dominant, depending on the size of the edge set.
If we used a binary heap, MST-PRIM(G, w, r) would cost O(|E| lg |V| + |V| lg |V|), which is O(|E| lg |V|) because G is connected, so |E| ≥ |V| − 1.
185
Algorithms 2
Section 3: Advanced Data Structures
186
Amortised Analysis
Sometimes, a worst-case analysis is too pessimistic.
For example, consider a vector: an array that grows when necessary by allocating
an array of twice the size and copying existing elements into the new array. The
worst case cost of INSERT would assume that resizing is necessary.
Three common methods can be used to give more representative cost estimates:
1. Aggregate Analysis
2. The Accounting Method
3. The Potential Method
187
Aggregate Cost of Vector Insert
Let’s start with an array of 16 items. The first 16 inserts take O(1) time. The 17th
insert allocates an array of size 32, copies 16 items in O(n) time since n=16 at that
point, then inserts one more item in O(1) time. The next 15 inserts take O(1) time
and the next uses O(n) time again.
Over N inserts, the copying work forms a geometric series 16 + 32 + 64 + … ≤ 2N, and the per-item work is N·O(1), which has sum ∈ O(N). Dividing by N inserts, we conclude that the typical cost per insert is O(N)/N = O(1) amortised, per item.
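A small Python simulation of the aggregate argument, counting the copies a doubling vector performs (the initial capacity of 16 matches the example above):

```python
def simulate_inserts(n, initial_capacity=16):
    """Count the element copies performed by a doubling vector over n inserts."""
    capacity, size, copies = initial_capacity, 0, 0
    for _ in range(n):
        if size == capacity:          # full: allocate double the space and copy across
            copies += size
            capacity *= 2
        size += 1                     # the O(1) insert itself
    return copies

for n in (1_000, 1_000_000):
    print(n, simulate_inserts(n), simulate_inserts(n) / n)   # ratio stays below 2
```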
Accounting Method
We declare the amortised cost for each operation as the amount we charge our customer. Amortised costs might exceed the actual costs, with the excess going
customer. Amortised costs might exceed the actual costs, with the excess going
into a ‘credit’ account. When an amortised cost is less than the actual cost, the
‘credit’ pays for the shortfall.
The accounting method yields a valid set of amortised costs provided for any
sequence of operations, the total amortised cost is an upper bound for the actual
cost, and the credit never goes negative.
189
Potential Method
The potential method is similar but does not attribute ‘credit’ to particular
operations or items within the data structure.
ɸ(di) is the potential of the data structure in each state, i, that it can get into through sequences of the supported operations. We require that ɸ(initial) = 0 and that ɸ(di) ≥ 0 for all states, i.
Each operation’s amortised cost is the sum of the actual cost and the change in potential caused by the operation: amortised-costi = actual-costi + ɸ(di) − ɸ(di−1).
190
Example of the Potential Method [1]
Suppose we have a binary counter stored as a list of bits. We can use the
potential function to calculate the amortised cost of INCREMENT, which adds one to
the current value represented in binary by the string of bits.
We can use the number of 1s in the list of bits, bi, after the ith increment as the
potential function mapping any state of the list of bits to potential ≥ 0.
191
Example of the Potential Method [2]
If the ith INCREMENT operation resets ri bits from 1 to 0, the total actual cost is at most ri + 1: from the least significant bit, we walk the string of bits either setting a 0 to a 1 and terminating, or setting a 1 to a 0 and rippling to the next bit.
The change in potential is ɸ(bi) − ɸ(bi−1) ≤ 1 − ri (we clear ri ones and set at most one), so the amortised cost is at most (ri + 1) + (1 − ri) = 2.
The total amortised cost for any sequence is an upper bound for the actual costs.
All checks pass so n INCREMENT operations have amortised O(n) cost: O(1) each.
192
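A small Python check of the aggregate behaviour the potential method predicts for the binary counter (the counter width and number of increments are arbitrary):

```python
def increment(bits):
    """Increment a little-endian list of bits; returns the actual cost (bits touched)."""
    i = 0
    while i < len(bits) and bits[i] == 1:   # each reset of a 1 ripples to the next bit
        bits[i] = 0
        i += 1
    if i == len(bits):
        bits.append(0)
    bits[i] = 1
    return i + 1                            # r resets plus one bit set

bits, total = [0] * 8, 0
for n in range(1, 257):
    total += increment(bits)
print(total, total / 256)                   # total actual cost stays below 2 per call
```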
Abstract Data Types (ADTs)
We used the acronym ADT (three times) in Algorithms 1 but have yet to properly define it. An Abstract Data Type specifies a set of operations (an interface) and their behaviour, independently of any particular data structure used to implement them.
We have seen some examples already: a stack is an ADT and our implementation using an array is a data structure that implements the interface.
193
💡 Reference: CLRS (2nd ed) chapter 19
Binomial Heaps
Binomial Heaps implement the Mergeable Priority Queue ADT:
194
Binomial Heaps vs ordinary Heaps
The heaps we saw in Algorithms 1 perform all these operations in O(lg n) time or
better except for DESTRUCTIVE-UNION.
195
Binomial Trees [1]
A Binomial Heap is a collection of Binomial Trees.
In a Binomial Tree, each node keeps its children in a strictly ordered list: these are
not binary trees.
A Binomial Tree, Bk, is formed by linking two Bk-1 trees together such that the root
of one is the leftmost child of the other. B0 is a single node.
196
Binomial Trees [2]
These characteristics follow from the recursive definition of the Binomial Tree, B_k:
1. There are 2^k nodes in the tree (note that these are not binary trees!)
2. The height of the tree is k
3. There are exactly C(k, i) = “k choose i” nodes at depth i, for i = 0, 1, …, k
4. The root has degree k, which is greater than that of any other node
5. The children of the root are ordered: k-1, k-2, …, 0 and child i is the root of a
subtree B_i obeying these defining characteristics.
197
Binomial Trees [3]
[Figure: the Binomial Trees B_0, B_1, B_2, B_3 and B_4, drawn by depth; B_4 is a Binomial Tree with depth 4.]
198
Binomial Heaps
We build a Binomial Heap, H, out of Binomial Trees, as follows.
- Each Binomial Tree in H obeys the min-heap property: each node’s key is
greater than or equal to that of its parent.
- For any non-negative integer k, there is at most one Binomial Tree in H with
root node having degree k.
Notice this means that the overall minimum key must be one of the roots of the
Binomial Trees.
199
Binary Structure
Because the Binomial Tree B_i has 2^i nodes, it follows that a Binomial Heap with n
nodes must contain the trees B_i corresponding to the 1s in the binary representation of
n.
The “Root List” is held in increasing degree order.
[Figure: a Binomial Heap, H, with 11 nodes. 11 = 8 + 2 + 1 = 2^3 + 2^1 + 2^0, i.e. 11 = 1011₂,
so H contains the trees B_0, B_1 and B_3.]
200
Binomial Heap Data Structure [1]
To represent this structure, we need six attributes in each node:
1. Key
2. Payload
3. Next sibling pointer
4. Parent pointer
5. Child pointer (to ONE child)
6. Degree (number of children)
[Figure: a node with its Degree, Parent, Key, Payload, Sibling and Child fields.]
BH-CREATE()
return NIL
202
BH-PEEK-MIN(bh)
The minimum key has to be one of the roots.
We perform a sequential scan through the root list to find the minimum.
The root list contains at most ⌊lg n⌋ + 1 Binomial Trees so this is O(⌊lg n⌋).
203
BH-DESTRUCTIVE-UNION(bh1, bh2) [1]
First consider the task of merging two Binomial Trees of the same degree.
BH-MERGE(bt1, bt2) makes bt2 become the first child of bt1 (increasing
bt1.degree in the process). This is achieved by setting bt2.sibling = bt1.child and
then bt1.child = bt2, and setting bt2.parent = bt1.
This is O(1) and maintains the order of the child list (characteristic #5 of Binomial
Trees): descending order of degree.
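A minimal Python sketch of this linking step (hypothetical node class using the six fields listed earlier; names are illustrative):

class BinomialTreeNode:
    def __init__(self, key, payload=None):
        self.key = key
        self.payload = payload
        self.sibling = None
        self.parent = None
        self.child = None      # pointer to ONE child (the leftmost, highest-degree one)
        self.degree = 0

def bh_merge(bt1, bt2):
    # Assumes bt1.degree == bt2.degree and bt1.key <= bt2.key.
    bt2.parent = bt1
    bt2.sibling = bt1.child    # new child goes at the front, keeping the child list in descending degree order
    bt1.child = bt2
    bt1.degree += 1
    return bt1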
204
BH-DESTRUCTIVE-UNION(bh1, bh2) [2]
Now we can merge two Binomial Heaps.
bh1 and bh2 each have a root list that is sorted by increasing order of degree. We
merge these in order, using BH-MERGE whenever we encounter two trees of the same
degree (the root with the smaller key remains in the root list). This ensures that the resulting
Binomial Heap has at most one Binomial Tree of each degree and preserves the
property that the root list contains at most ⌊lg n⌋ + 1 Binomial Trees.
Because BH-MERGE is O(1), the running time of the operation to merge the two
root lists is O(⌊lg n1⌋ + ⌊lg n2⌋) and this is O(⌊lg n⌋), where n is the total number of
nodes in the merged Binomial Heap.
205
BH-INSERT(bh, (key, payload))
The process to insert one new (key, payload) pair is to:
1. Create a new Binomial Heap containing a single B_0 node holding (key, payload)
2. BH-DESTRUCTIVE-UNION it with bh
Both steps together take O(lg n) time.
206
BH-EXTRACT-MIN(bh)
This is also straightforward!
1. Cut the Binomial Tree containing the old minimum out of the root list
a. Use BH-PEEK-MIN to find the minimum if you don’t have a pointer to it already.
2. Reverse the old minimum’s child list
3. BH-DESTRUCTIVE-UNION the (reversed) child list and the root list
All three steps can be achieved in O(lg n) time since that dominates both the
length of the root list and the largest degree (child list length) of any node.
Note that we do not need to find the new minimum because the BH-PEEK-MIN
operation searches for it each time.
207
BH-DECREASE-KEY(bh, ptr_k, nk)
ptr_k is a pointer to the node containing the key we wish to decrease.
Remember that this node is a node in a Binomial Tree, which is min-heap ordered!
We decrease the key using the same method as on a Min-Heap, in O(lg n) time:
repeatedly exchange the node’s (key, payload) with its parent’s while the key is smaller
than the parent’s key, walking up towards the root of its Binomial Tree.
208
💡 Reference: CLRS2 chapter 20 / CLRS3 chapter 19
Fibonacci Heaps
Fibonacci Heaps implement the Mergeable Priority Queue ADT:
The low costs are what make Fibonacci Heaps special. Let’s see how it’s done!
💡 DELETE(fh, ptr_k) is DECREASE-KEY(fh, ptr_k, -∞); EXTRACT-MIN(fh). O(lg n) amortised
💡 INCREASE-KEY(fh, ptr_k, nk) is DELETE(fh, ptr_k); INSERT(fh, (nk, ptr_k.payload)). O(lg n) amortised
209
Fibonacci Heap Data Structure [1]
Fibonacci Heap nodes have eight attributes:
1. Key
2. Payload
3. Left sibling pointer
4. Right sibling pointer
5. Parent pointer
6. Child pointer (to ONE child)
7. Degree (number of children)
8. Marked flag
[Figure: a node with its Degree, Parent, Marked, Key, Payload, Left, Right and Child fields.]
FH-CREATE()
return (NIL, 0)
211
Fibonacci Heap Data Structure [3]
[Figure: an example Fibonacci Heap, fh = (pointer into the root list, 12), with 12 nodes.
Each node is labelled with its degree, marked flag and key; parent, child and sibling
pointers link the nodes, and the roots form the cyclic root list.]
212
Fibonacci Heap Data Structure [4]
A collection of binomial min-heaps, held unordered in a doubly linked cyclic list.
The children of every node are held in unordered doubly linked cyclic lists.
If a node’s key is decreased and becomes smaller than the parent’s key then it
violates the heap property and cannot remain in its current place in the heap: it is cut
out of its parent’s child list and dropped into the root list (see FH-DECREASE-KEY, below).
213
New FibHeapNode(k, p)
We initialise the 8 fields to create a valid 1-item Fibonacci Heap:
- Key = k
- Payload = p
- Left = <pointer to itself>
- Right = <pointer to itself>
- Parent = NIL
- Child = NIL
- Marked = false
- Degree = 0
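A Python rendering of this constructor (a sketch; field names follow the list above):

class FibHeapNode:
    # A freshly created node is itself a valid one-item Fibonacci Heap.
    def __init__(self, key, payload=None):
        self.key = key
        self.payload = payload
        self.left = self       # cyclic doubly linked sibling list containing only this node
        self.right = self
        self.parent = None
        self.child = None
        self.marked = False
        self.degree = 0

def fh_create():
    # The heap itself is a (pointer into the root list, node count) pair: FH-CREATE() = (NIL, 0)
    return (None, 0)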
214
FH-DESTRUCTIVE-UNION(fh1, fh2)
1 let (p1, n1) = fh1, (p2, n2) = fh2
2 if (p1 == NIL) return fh2
3 if (p2 == NIL) return fh1
4 DLL-SPLICE(p1, p1.left, p2, p2.right)
5 if (p1.key ≤ p2.key) return (p1, n1+n2)
6 return (p2, n1+n2)

DLL-SPLICE(a, b, c, d)
1 a.left = c
2 c.right = a
3 b.right = d
4 d.left = b

[Figure: nodes b, a, c, d in the two cyclic lists, before and after DLL-SPLICE joins them.]
215
FH-INSERT(fh, (k, p))
1 let fh2 = new FibHeapNode(k, p)
2 return FH-DESTRUCTIVE-UNION(fh, fh2)
Notice that this does handle the case where fh is an empty Fibonacci Heap (see
line 2 of FH-DESTRUCTIVE-UNION).
Notice that this does not put the new key into the correct place in a binomial heap
structure. Instead, it “dumps” the new key into the root list.
216
FH-PEEK-MIN(fh)
1 let (p, _) = fh
2 if p == NIL
3     return NIL
4 else
5     return (p.key, p.payload)

FH-COUNT(fh)
1 let (_, n) = fh
2 return n
217
FH-DECREASE-KEY(fh, ptr_k, nk)
1 if (ptr_k.key < nk) error “New key is not smaller!”
2 ptr_k.key = nk; ptr_k_orig = ptr_k
3 if ptr_k.parent != NIL && ptr_k.key < ptr_k.parent.key
4 do if (!ptr_k.parent.marked)
5 CHOP-OUT(fh, ptr_k); break
6 else parent = ptr_k.parent; CHOP-OUT(fh, ptr_k); ptr_k = parent // save the parent first: CHOP-OUT sets ptr_k.parent to NIL
7 while ptr_k.parent != NIL
8 if (fh.p.key > nk) fh.p = ptr_k_orig
218
Private helper function CHOP-OUT(fh, ptr_k)
1 if (ptr_k.parent.degree == 1) ptr_k.parent.child = NIL // ptr_k was the only child of its parent
2 else if (ptr_k.parent.child == ptr_k) // ptr_k was the child the parent pointed to
3     ptr_k.parent.child = ptr_k.left
4 ptr_k.parent.degree = ptr_k.parent.degree - 1 // Parent has one fewer child
5 if ptr_k.parent.parent != NIL ptr_k.parent.marked = true // Parent has lost a child
6 ptr_k.left.right = ptr_k.right; ptr_k.right.left = ptr_k.left // Cut ptr_k out of the parent’s child list
7 ptr_k.parent = NIL
8 ptr_k.left = ptr_k.right = ptr_k // Prepare this node to enter the root list
9 ptr_k.marked = false
10 DLL-SPLICE(fh.p, fh.p.left, ptr_k, ptr_k.right) // Splice it into the root list
FH-EXTRACT-MIN(fh) [1]
- If n==1 then this is the last node in the Fibonacci Heap so return (NIL, 0).
The new minimum key has to be one of the children of the old minimum, or one of
the other keys in the root list. We begin by dropping the current minimum’s
children into the root list:
- If p.child != NIL then set v.parent=NIL and v.marked=false for all nodes, v, in
the p.child list; and then call DLL-SPLICE(p, p.left, p.child, p.child.right)
220
FH-EXTRACT-MIN(fh) [2]
Now we can cut the old minimum out of the root list:
- p.left.right = p.right
- p.right.left = p.left
- p = p.left
- n=n-1
We need to walk around the root list looking for the new minimum key. Because
each entry in the root list is a min-heap, we know the overall minimum cannot be
deep into any of the heaps, but it could be any of the roots as there is no ordering
between them.
221
FH-EXTRACT-MIN(fh) [3]
- let start = p, t = p.right
- while t != start
- if (t.key < p.key) then p = t
- t = t.right
p and n are now set correctly so we could say that we’re done and return (p, n).
Although that would implement the operations correctly, it would not achieve the
asymptotic costs we claimed.
It turns out that all we need to do is clean up the heap at this point.
222
FH-EXTRACT-MIN(fh) [4]
We are going to need an array with D(n) + 1 = ⌊log𝜑 n⌋ + 1 elements, initialised to
NIL (‘n’ is the node count after removing the old minimum). 𝜑 = (1+√5)/2. D(n) is
the maximum degree of any node in a Fibonacci Heap with n nodes.
While we are walking around the root list, we combine heaps with the same
degree, using the array to remember where we last saw a heap with each degree.
⚠ The array length must be 1 + max_degree. We will prove this value later. 223
FH-EXTRACT-MIN(fh) [5]
When considering node ‘t’ in the root list we…
- if A[t.degree] == NIL
- A[t.degree] = t
- else
- old_start = start
- old_start_right = start.right
- merge(t, A[t.degree])
- if (old_start.parent != NIL) start = old_start_right
224
FH-EXTRACT-MIN(fh) [6]
MERGE(a, b):
225
FH-EXTRACT-MIN(fh) [7]
- else
- A[a.degree] = NIL
- a.degree = a.degree + 1
- b.left.right = b.right; b.right.left = b.left
- b.left = b.right = b
- if (a.degree == 1)
- a.child = b
- else
- DLL-SPLICE(a.child, a.child.left, b, b.right)
- if (A[a.degree] != NIL) MERGE(a, A[a.degree])
- else A[a.degree] = a
226
Intuition behind Fibonacci Heaps
Before we prove the asymptotic running times for the key operations, let’s get an
intuition for why they’re so cheap.
To get started, let’s consider only the Priority Queue ADT operations:
CREATE, INSERT, PEEK-MIN, EXTRACT-MIN.
It’s obvious that our implementations of CREATE and PEEK-MIN are O(1): the code
only performs a fixed number of fixed-time operations. No loops in either case.
227
Intuition behind Fibonacci Heap INSERT
It’s also obvious that INSERT does a constant amount of work when it is called but
we might consider that it is only doing part of the job: it is putting the item into the
data structure but not into the correct place in the data structure. This has two consequences:
1. You have to come back and do the rest of the work later; and
2. There is a price to pay for putting “spurious” items into the root list: it costs
more time to run extract-min.
228
Intuition behind Fibonacci Heap EXTRACT-MIN [1]
EXTRACT-MIN:
- Drops the old minimum’s children into the root list – O(1)
- Because the lists are cyclic, we do not need to walk to the start/end of either to append lists
- Because the lists are unordered, it does not matter where we join the two lists together
- We do have to set the parent pointers to NIL and the marked flags to false: come back to this!
- Cuts the old minimum out of the root list – O(1)
- We do not need to search for the node containing the minimum (we have a pointer to it)
- The root list is doubly linked so we can delete in O(1) time
- Walks around the root list looking for the new minimum – O(r), r: len root list
- It’s O(1) work per item to compare it to the running minimum
- It’s O(1) per item to set parent=NIL and marked=false so we can absorb the earlier costs!
229
Intuition behind Fibonacci Heap EXTRACT-MIN [2]
EXTRACT-MIN is doing O(r) work but we only charge the customer O(lg n), so there
is a shortfall to explain with an amortised analysis.
Suppose that every O(1) INSERT also puts O(1) money into a bank account, and that
money can be used to pay for work that is done later. O(1) money can be used to
pay for any constant amount of work.
We need to spend O(r - lg n) money from the bank account to balance the books
for EXTRACT-MIN.
We can only spend the money once so what about the second EXTRACT-MIN?
230
Intuition behind Fibonacci Heap EXTRACT-MIN [3]
The reason that r can exceed k·lg n is that there is “junk” in the root list that shouldn’t be
there: all the keys we inserted cheekily in the wrong place!
It’s OK to use the bank account to pay for scanning through those keys once to
find the new minimum but we have to make sure those keys do not need to be
scanned the next time we run extract-min.
This is exactly achieved by combining roots of the same degree: trees begin as
single nodes and are combined into 2s, 4s, 8s, 16s, etc. so if we have n keys then
we have at most 1 + lg n min-heaps in the root list, i.e. the root list shrinks from r to
~lg n, and r - k·lg n is exactly the correct amount of money to balance the books!
⚠ This informal intuition does not account for DECREASE-KEY. (We use “lg n”, not “log𝜑 n”.) 231
Intuition behind Fibonacci Heap DECREASE-KEY [1]
To account for DECREASE-KEY, we note that it takes O(1) time to replace the key,
compare it to the parent (thanks to the parent pointer) and, if necessary, to cut it
out of the parent’s child list and splice it into the root list (thanks to both being
doubly linked). Even if we cut the parent out of its sibling list as well, that’s still
only O(1) work.
The problem comes when the parent’s parent is already marked, and its parent,
and so forth – we do an amount of work that is proportional to the height of that
min-heap and this is not O(1).
Same trick! Actual cost is O(h); customer pays O(1); bank funds O(h - 1). How?
232
Intuition behind Fibonacci Heap DECREASE-KEY [2]
Every time we remove a child and mark its parent node (the parent not already
being marked), we put 2x O(1) amounts of money into our bank account.
DECREASE-KEY costs O(1) so this does not change its asymptotic cost.
When that marked node loses another child, one of those O(1) amounts of money
can pay for its removal and splicing into the root list. If its parent is also marked
then it, too, has 2x O(1) amounts of money in the bank, one of which can be used
to pay for it to fall into the root list. This continues and since each level in the
min-heap pays for its own O(1) work, the total paid is the O(h - 1) shortfall.
BUT… don’t all those items in the root list add to the cost of EXTRACT-MIN?!
233
Intuition behind Fibonacci Heap DECREASE-KEY [3]
Yes, all those decreased-keys in the root list do increase the cost of EXTRACT-MIN
but we have that second O(1) amount of money that we haven’t spent yet.
The second O(1) amount of money is what pays for the additional costs of
scanning the root list during the next EXTRACT-MIN operation. Since that
recombines trees leaving O(lg n) items in the root list, it only needs to be spent
once.
The decreased keys and marked parents that fell into the root list have a
corresponding O(1) amount of money in the bank, mirroring the O(1) money
contributed by INSERT for new keys, and paying for their clean-up.
234
Formal analysis: amortised analysis
We need a potential function that is non-negative, zero for the empty data
structure, and sufficient to “pay for” the expensive steps in the algorithms such that
we can claim amortised costs:
235
Coming up with a potential function
● It often helps to compare the ideal state of your data structure with the actual
state, which might be “damaged” by the cheeky operations that did something
cheaply but imperfectly.
○ The potential needs to be (at least) what it would cost to fix the damage.
● Consider the actual cost of the operations you need to perform and what
you want to charge for them, since the difference is what you need the
potential to cover. Which other operations lead to these operations being
more expensive than they might be, and could you get them to pay for the
clean-up in advance?
236
Potential for a Fibonacci Heap
It turns out that a good potential function is ɸ = r + 2m, where r is the number of
items in the root list and m is the number of marked nodes.
- Is ɸ zero for the initial state, and never negative?
- Yes, because CREATE returns (NIL, 0): the root list is empty and there are no
marked nodes (no nodes at all); and thereafter r ≥ 0 and m ≥ 0, so ɸ ≥ 0.
237
Amortised analysis of FH-INSERT(fh, (k,p))
Start with any Fibonacci Heap, fh, (including the empty case), with potential
ɸ1 = r + 2m.
FH-INSERT adds (via a call to the destructive union operation) one item to the root
list. The new node is never marked. Existing marked nodes remain marked;
existing unmarked nodes remain unmarked. r increases by 1; m is unchanged.
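Completing the calculation the slide sets up: ɸ2 = (r + 1) + 2m = ɸ1 + 1, so the amortised cost is the actual cost plus the change in potential, i.e. O(1) + 1 ∈ O(1).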
238
Amortised analysis of FH-DESTRUCTIVE-UNION(fh1, fh2)
Start with Fibonacci Heaps, fh1 and fh2, with potentials
ɸ1 = r1 + 2m1 and ɸ2 = r2 + 2m2.
FH-DESTRUCTIVE-UNION splices the root lists together, adds n1 and n2, and
compares the min keys to return a value. No marked flags are changed. The
potential of the combined Fibonacci Heap is ɸ3 = (r1 + r2) + 2(m1 + m2) = ɸ1 + ɸ2,
so the total potential is unchanged and the amortised cost is just the O(1) actual cost.
⚠ The input heaps cannot be used after the call to FH-DESTRUCTIVE-UNION so they do not need their
potential to “repair” them: that potential can be repurposed to repair the damage on the combined heap! 239
Amortised analysis of FH-DECREASE-KEY(fh, ptr_k, nk) [1]
Start with any Fibonacci Heap, fh, with potential ɸ1 = r + 2m.
If we decreased the key but it remains greater than its parent, then no changes
are made to the root list, nor to any marked flags.
The potential is unchanged so the amortised cost is the actual cost in this case,
and is clearly ∈ O(1).
240
Amortised analysis of FH-DECREASE-KEY(fh, ptr_k, nk) [2]
If the ptr_k.key becomes smaller than its parent, but the parent is not marked then
the decreased key falls into the root list and the parent becomes marked.
If the node pointed to by ptr_k was marked before, it becomes unmarked when it
falls into the root list.
In both cases, the work done immediately is constant and the contribution to the
potential is constant so the amortised cost is ∈ O(1).
241
Amortised analysis of FH-DECREASE-KEY(fh, ptr_k, nk) [3]
If the ptr_k.key becomes smaller than its parent, and the parent is already marked
then the parent falls into the root list (and becomes unmarked), and its parent
might do the same. Suppose that, in total, ‘a’ ancestor nodes fall into the root list.
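The argument breaks off here; a hedged completion using the standard bookkeeping: the actual cost is O(a + 1) (one O(1) cut and splice per node that falls, plus O(1) other work). The root list grows by a + 1; the a ancestors that fell were marked and become unmarked, and at most one further ancestor becomes newly marked, so the change in potential is at most (a + 1) + 2(1 - a) = 3 - a. With the unit of potential chosen large enough to pay for one cut, the amortised cost is O(a + 1) + (3 - a) ∈ O(1).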
243
Amortised analysis of FH-EXTRACT-MIN(fh) [2]
● When we combine nodes of the same degree, we loop through each node in
the root list:
○ If we do not remove a node from the root list (now or through later merges with it), we do
constant work on it because we put it in the array and do not remove it.
○ If we do remove a node from the root list, we only compare it to one other before doing so: one
is found in O(1) time using the array, and we can only remove a node once.
● The total work to merge nodes is O(r + D(n))
● This leaves at most D(n)+1 nodes in the root list because, if there were more,
there would be two items in the same array position and we would have
merged them. (Remember the array length was D(n)+1.)
244
Amortised analysis of FH-EXTRACT-MIN(fh) [3]
Now we can analyse the change in potential. Note that no nodes changed their
marked flag during the merges.
Potential before = r + 2m
⚠ Worst case because fewer items in the root list would release more potential to pay for work. Also, if
any of the old minimum’s children were marked, unmarking them would release potential. 245
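A hedged completion of the calculation: after the merges at most D(n) + 1 roots remain and no node is newly marked, so the potential afterwards is at most (D(n) + 1) + 2m and the change in potential is at most (D(n) + 1) - r. The actual work was O(r + D(n)), so, choosing the unit of potential large enough to cover the O(1) work per root, the amortised cost is O(r + D(n)) + (D(n) + 1) - r ∈ O(D(n)), which the remaining slides show is O(lg n).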
Amortised analysis of FH-EXTRACT-MIN(fh) [4]
To show that the cost of EXTRACT-MIN is O(lg n), we need to show that D(n) is
bounded from above by k.lg n.
We will show that D(n) ≤ ⌊log𝜑 n⌋ where 𝜑 = (1+√5)/2 is the golden ratio.
For any node x, define size(x) to be the total number of nodes in the heap rooted
at node x, including node x itself. (Node x need not be in the root list.)
246
Amortised analysis of FH-EXTRACT-MIN(fh) [5]
Lemma 1: let x be a node in Fibonacci Heap; if x has degree k then let its children
be c1, c2, .. ck in the order they were added as children of x; we have that
c1.degree ≥ 0 and ci.degree ≥ i-2 for i=2..k.
Proof
c1’s degree must be at least zero because any node’s degree is non-negative.
x and ci had the same degree when they were merged, and x had i-1 children at
that point. Since then, ci can have lost at most one child (since losing a second
would have removed it from x’s parentage) so ci.degree ≥ i-2.
247
Amortised analysis of FH-EXTRACT-MIN(fh) [6]
Fibonacci numbers, indexed as the 0th, 1st, …, are defined by
fib(0) = 0, fib(1) = 1, and fib(k) = fib(k-1) + fib(k-2) for k ≥ 2.
Notice that fib(k+2) = 1 + Σ_{i=0..k} fib(i).
248
Amortised analysis of FH-EXTRACT-MIN(fh) [7]
Notice also that the (k+2)th Fibonacci number, fib(k+2) ≥ 𝜑^k.
Proof: by induction. Base cases: fib(2) = 1 ≥ 𝜑^0 and fib(3) = 2 ≥ 𝜑^1.
The inductive step uses strong induction: assume that fib(i+2) ≥ 𝜑^i for all i = 0..k-1 and
prove fib(k+2) ≥ 𝜑^k for k ≥ 2.
fib(k+2) = fib(k+1) + fib(k) ≥ 𝜑^(k-1) + 𝜑^(k-2) = 𝜑^(k-2) (𝜑 + 1) = 𝜑^(k-2) 𝜑^2 = 𝜑^k
249
Amortised analysis of FH-EXTRACT-MIN(fh) [8]
Lemma 2: let x be any node in a Fibonacci Heap and let k = x.degree; then
size(x) ≥ fib(k+2) ≥ 𝜑^k.
Proof
Let s_k denote the minimum possible value of size(z) over all nodes z of degree k in
any Fibonacci Heap. Note that s_k increases monotonically with k (adding children
cannot decrease the minimum size).
250
Amortised analysis of FH-EXTRACT-MIN(fh) [9]
Now consider some node z with degree k and size(z) = sk (i.e. minimum size).
Consider the children c1, c2, .. ck of z in the order they were added.
size(x) ≥ s_k ≥ 1 (for z itself)
             + 1 (for c_1, also a zero-degree node when merged with z, or
                  now a larger child if the original first child was removed)
             + Σ_{i=2..k} s_{c_i.degree}
         ≥ 2 + Σ_{i=2..k} s_{i-2} // using Lemma 1 and monotonicity
251
Amortised analysis of FH-EXTRACT-MIN(fh) [10]
Next, we show that s_k ≥ fib(k+2) for k ≥ 0, using induction.
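A hedged sketch of that induction: the base cases are s_0 = 1 = fib(2) and s_1 = 2 = fib(3). For k ≥ 2, the inequality on the previous slide and the inductive hypothesis give
s_k ≥ 2 + Σ_{i=2..k} s_{i-2} ≥ 2 + Σ_{i=2..k} fib(i) = 1 + Σ_{i=0..k} fib(i) = fib(k+2),
using the Fibonacci identity noted earlier. Combined with fib(k+2) ≥ 𝜑^k, this gives Lemma 2.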
252
Amortised analysis of FH-EXTRACT-MIN(fh) [11]
If x is any node in a Fibonacci Heap and has k = x.degree, then we know that
n ≥ size(x) ≥ 𝜑^k.
Because this is true for any node, we have that the maximum degree of any node
in a Fibonacci Heap with n nodes is D(n) ≤ ⌊log𝜑 n⌋ ∈ O(lg n).
253
Uses for Fibonacci Heaps
Fibonacci Heaps used to be used in the Linux Kernel as the priority queue of
processes, waiting to be chosen to run by the Process Scheduler.
It was replaced with a Red-Black tree that, although it has larger asymptotic costs,
runs faster on the typical size of problem instance, due to (much) lower constant
factors.
254
Fibonacci Heaps in Dijkstra’s Algorithm
We said that in the worst case, Dijkstra’s algorithm will call CREATE once, INSERT
O(|V|) times, EXTRACT-MIN O(|V|) times, and DECREASE-KEY O(|E|) times.
                  CREATE   INSERT            EXTRACT-MIN           DECREASE-KEY      Total for Dijkstra
Heap              O(1)     O(lg |V|)         O(lg |V|)             O(lg |V|)         O(|V| lg |V| + |E| lg |V|)
Fibonacci Heap    O(1)     O(1) amortised    O(lg |V|) amortised   O(1) amortised    O(|V| lg |V| + |E|) amortised
255
Are there any better mergeable priority queues?
Actually, there are two!
256
Disjoint Sets
The Disjoint Set ADT can be implemented in many ways, including by the use of
a data structure that is based on an amortised cost analysis.
The Disjoint Set ADT is initialised with a collection of n distinct keys. Each key is
placed into its own set. The data structure supports two operations:
1. UNION(s1, s2): combine two disjoint sets, s1 and s2, into a single set
2. IN-SAME-SET(k1, k2): report whether keys k1 and k2 are currently in the same
set (return true) or different sets (return false)
257
Disjoint Sets using Doubly Linked Lists
CREATE: for n provided keys, create n linked lists, each of length 1, holding their
corresponding key.
UNION(a, b): walk forwards along a’s list until you reach the end, and along b’s list
until you reach the beginning. Change the list pointers to join the end of the ‘a’ list
to the start of the ‘b’ list.
IN-SAME-SET(k1, k2): from the node holding k1, walk in both directions until you
reach a NIL pointer. If a node containing k2 is found, return true; else return false.
An alternative list-based implementation uses cyclic doubly linked lists.
UNION(a, b): splice a’s list and b’s list together. As the lists are unordered, we can
splice at the positions pointed to by a and b (which do not require any searching to
find).
IN-SAME-SET(k1, k2): from the node holding k1, walk around until you find k2 or get
back to k1. If a node containing k2 is found, return true; else return false.
In a hash-table implementation, each key maps to a payload naming its set.
UNION(a, b): scan through every record in the hash table; if some key maps to
payload b, change the payload to a.
In the tree (forest) implementation, each node points towards the root of its set’s tree.
CHASE(k): starting from the node for key k, follow the pointers until you reach the
root, r, of its tree (where pointer == NIL). Change the pointer of each node you
went through to ‘r’. This ensures that the next time we CHASE(k), or we chase any
descendant of k, we jump straight from k to r. The cost of walking the path from k
to r is only paid once. This is called path compression.
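A minimal Python sketch (not the lecture's code) of this tree representation with path compression in CHASE and union by rank (the "UbR" in the table below):

class DisjointSets:
    def __init__(self, keys):
        # CREATE: every key starts in its own singleton set
        self.parent = {k: k for k in keys}
        self.rank = {k: 0 for k in keys}

    def chase(self, k):
        # Follow pointers to the root, re-pointing every node on the path at the root (path compression)
        if self.parent[k] != k:
            self.parent[k] = self.chase(self.parent[k])
        return self.parent[k]

    def union(self, k1, k2):
        r1, r2 = self.chase(k1), self.chase(k2)
        if r1 == r2:
            return
        if self.rank[r1] < self.rank[r2]:   # union by rank: hang the shallower tree under the deeper one
            r1, r2 = r2, r1
        self.parent[r2] = r1
        if self.rank[r1] == self.rank[r2]:
            self.rank[r1] += 1

    def in_same_set(self, k1, k2):
        return self.chase(k1) == self.chase(k2)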
Kruskal also sorts the edges: O(|E| lg |E|) = O(|E| lg |V|) since |E| is at most |V|^2.
                      Sort edges       CREATE    UNION              IN-SAME-SET        Total for Kruskal
Hash table            O(|E| lg |V|)    O(|V|)    O(|V|)             O(1)               O(|V|^2 + |E| lg |V|)
Trees with PC & UbR   O(|E| lg |V|)    O(|V|)    ~O(1) amortised    ~O(1) amortised    ~O(|V| + |E| lg |V|) amortised
263
Algorithms 2
Section 4: Geometric Algorithms
264
Polygons
A polygon is an ordered list of vertices.
Our first problem is to work out whether a point is on the “inside” of a polygon.
265
Planar Polygons [1]
If the space in which the polygon exists is not planar, it can be tricky or impossible
to define “inside” and “outside”.
Note that we cannot say the “smaller” area is “inside”: ask whether a container
ship’s position is “inside” the ocean polygon or is on land.
266
Planar Polygons [2]
A planar space is 2D, flat, and infinite in the “horizontal” and “vertical” directions.
A polygon drawn on a planar surface separates a finite area from an infinite area:
we refer to the finite area as “inside” and the infinite area as “outside”.
267
Closed Polygons
A closed polygon is one where there is an edge from its last vertex back to its first.
An open polygon does not (necessarily) enclose any area so we cannot define
inside and outside.
268
Simple Polygons [1]
Simple polygons do not overlap themselves.
269
Winding Numbers [1]
How do we define what is “inside” a complex polygon?
270
Winding Numbers [2]
When you get back to the start, if the string is wound around the post an odd
number of times, the post is on the inside; otherwise it is on the outside.
271
Winding Numbers [3]
We can implement this algorithm on a computer:
1. calculate angles subtended at the post by the two ends of each edge;
2. sum the angles
3. divide by 2𝛑 to get the winding number.
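A minimal Python sketch of this computation (illustrative only; it uses atan2, whereas later slides show how to avoid trigonometry):

import math

def winding_number(post, polygon):
    # Sum the signed angles subtended at `post` by each edge, then divide by 2*pi.
    total = 0.0
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i][0] - post[0], polygon[i][1] - post[1]
        bx, by = polygon[(i + 1) % n][0] - post[0], polygon[(i + 1) % n][1] - post[1]
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)   # signed angle from one endpoint to the next
    return round(total / (2 * math.pi))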
272
Inclusion within Simple, Planar, Closed Polygons
Add a semi-line (a ray) from the point of interest, P, in any direction.
Count how many times the ray crosses the polygon’s boundary: if the count is odd, P is
inside; if it is even, P is outside.
273
Awkward cases
If the ray goes through a vertex, we could discard the ray and send one in a
different direction; keep retrying until it doesn’t hit any vertices.
Alternatively, use a horizontal ray: this avoids floating point error in deciding whether the
ray hit a vertex, passed slightly above it or passed slightly below it, because (non-NaN)
floats are totally ordered, so those comparisons of y-coordinates are exact.
[Figure: the four awkward cases (1-4) of a horizontal ray passing through a vertex or along an edge of the polygon.]
274
Handling the Awkward cases
If a vertex is on the ray, look at the neighbouring vertices. If they’re on the same
side (both above / both below) then the polygon’s edge was not crossed (case 2);
if they are on opposite sides then the edge was crossed (case 1).
If either neighbour is also on the ray, replace it with the next neighbour in the same
direction around the polygon boundary.
[Figure: the four awkward cases (1-4) again, for the handling rules above.]
275
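A minimal Python sketch of the horizontal-ray test described above (illustrative only: it uses a division for brevity, which the cross-product approach below avoids, and it treats each edge as half-open rather than implementing the vertex rules explicitly):

def inside(p, polygon):
    # Even-odd test: send a horizontal ray to the right of p and count edge crossings.
    px, py = p
    crossings = 0
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > py) != (y2 > py):                       # the edge spans the ray's height
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:                             # the crossing is to the right of p
                crossings += 1
    return crossings % 2 == 1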
Line segments
A line segment p1p2 is a straight line between two points p1 and p2. We say that
p1 and p2 are the endpoints and, if the line has a direction then we have a
directed segment p1→p2.
These points might be adjacent vertices in a polygon or the test point and a point
“at infinity”.
276
Convex combinations
If p1 = (x1,y1) and p2 = (x2,y2), then we say that p3 = (x3,y3) is a convex
combination of p1 and p2 if p3 is on the line segment between p1 and p2 (including
the endpoints).
Mathematically, x3 = 𝛂x1 + (1-𝛂) x2 and y3 = 𝛂y1 + (1-𝛂) y2. This is often written as
the vector equation p3 = 𝛂p1 + (1-𝛂)p2. We require 0 ≤ 𝛂 ≤ 1, to place p3 between
p1 and p2 inclusive of the endpoints.
277
Intersection Determination Problem
Input: two line segments p1p2 and p3p4.
Output: whether or not the two segments intersect.
278
Intersection Determination
We would like to avoid trigonometry (slow).
The “high school maths” approach based on two equations of the form y = mx + c
leads to divisions, which are slow in floating point and introduce rounding error that
cannot be managed as effectively as the error from addition and multiplication (which,
on suitably bounded inputs, can in effect be computed with infinite precision). This can
lead to incorrect answers: small floating point errors can make the intersection of the
two lines appear “off the end” of the segments, so it is not counted.
[Figure (left): the cross product of p1 and p2 can be thought of as the signed area of the
parallelogram with sides p1 and p2 (its far corner is p1+p2).
Figure (right): relative to a position vector p, the darker region contains position vectors
that are anticlockwise from p; the lighter region contains vectors that are clockwise from p.]
💡 This is Figure 33.1 from CLRS3.
280
Matrix Determinants
p1 ⨉ p2 = det ( x1 x2 )
              ( y1 y2 )
        = x1 y2 - x2 y1
        = - p2 ⨉ p1
281
Line Segment Intersection
Check whether each line segment straddles the extension of the other. The
extension of a line segment is the (infinite) line containing its two endpoints, i.e.
drop the constraint that 0 ≤ 𝛂 ≤ 1.
282
SEGMENTS-INTERSECT(p1,p2,p3,p4) [1]
1 d1 = DIRECTION(p3,p4,p1) // Relative orientation of
2 d2 = DIRECTION(p3,p4,p2) // each endpoint w.r.t. the
3 d3 = DIRECTION(p1,p2,p3) // other segment
4 d4 = DIRECTION(p1,p2,p4)
5 if ((d1>0 && d2<0) || (d1<0 && d2>0)) &&
((d3>0 && d4<0) || (d3<0 && d4>0))
6 return true
💡If p3→p1 and p3→p2 have opposite directions w.r.t. p3→p4 then p1p2 straddles p3p4.
💡If p1→p3 and p1→p4 have opposite directions w.r.t. p1→p2 then p3p4 straddles p1p2. 283
SEGMENTS-INTERSECT(p1,p2,p3,p4) [2]
7 else if d1==0 && ON-SEGMENT(p3,p4,p1) return true
8 else if d2==0 && ON-SEGMENT(p3,p4,p2) return true
9 else if d3==0 && ON-SEGMENT(p1,p2,p3) return true
10 else if d4==0 && ON-SEGMENT(p1,p2,p4) return true
11 return false
DIRECTION(pi,pj,pk) = (pk-pi) x (pj-pi)
ON-SEGMENT(pi,pj,pk) = (min(xi,xj) ≤ xk ≤ max(xi,xj)) &&
(min(yi,yj) ≤ yk ≤ max(yi,yj))
💡If p1 or p2 is on p3p4 then the segments intersect if that point is within the limits of the segment (L7,8).
💡If p3 or p4 is on p1p2 then the segments intersect if that point is within the limits of the segment (L9,10).
284
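A direct transcription of this pseudocode into Python (a sketch; points are (x, y) tuples):

def direction(pi, pj, pk):
    # (pk - pi) x (pj - pi)
    return (pk[0] - pi[0]) * (pj[1] - pi[1]) - (pk[1] - pi[1]) * (pj[0] - pi[0])

def on_segment(pi, pj, pk):
    return (min(pi[0], pj[0]) <= pk[0] <= max(pi[0], pj[0]) and
            min(pi[1], pj[1]) <= pk[1] <= max(pi[1], pj[1]))

def segments_intersect(p1, p2, p3, p4):
    d1, d2 = direction(p3, p4, p1), direction(p3, p4, p2)
    d3, d4 = direction(p1, p2, p3), direction(p1, p2, p4)
    if ((d1 > 0 and d2 < 0) or (d1 < 0 and d2 > 0)) and \
       ((d3 > 0 and d4 < 0) or (d3 < 0 and d4 > 0)):
        return True                                       # each segment straddles the other
    if d1 == 0 and on_segment(p3, p4, p1): return True
    if d2 == 0 and on_segment(p3, p4, p2): return True
    if d3 == 0 and on_segment(p1, p2, p3): return True
    if d4 == 0 and on_segment(p1, p2, p4): return True
    return False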
n-Segment Intersection Problem
Input: n line segments, each specified as pairs of endpoints, pi for 1 ≤ i ≤ n.
Obvious solution: solve the segment intersection problem for all pairs, O(n2).
There is a smarter solution called sweeping with running time O(n lg n) that
exploits the geometry of lines in a plane to constrain the cases that must be
considered. Supervision exercise!
285
Convex Hull Problem
Input: a set of n>2 points pi for 1 ≤ i ≤ n. At least 3 points are not collinear (so the
polygon is not a zero-area line).
Output: an ordered list of points forming a convex hull for the input points.
The two points of a set of points in a plane that are furthest apart are both on the convex
hull. The convex hull of a set of points is a minimal subset that forms a convex polygon
with none of the points outside the polygon (i.e. every point is inside it or on its edge).
287
Graham’s Scan [1]
● Start at the left-most of the bottom-most points.
● Sort the points by increasing polar angle relative to a horizontal line through
this point.
○ Resolve tie-breaks by retaining only the point farthest from the start point.
● Push the first three points onto an initially empty stack.
● For each of the other points, p, taken in the sorted order:
○ Pop off the stack until the directed segment from the next-to-top vertex on the stack to the top
vertex on the stack forms a (strictly) left turn with the directed segment from top vertex to p
○ Push p onto the stack.
● The points on the stack are the convex hull.
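A compact Python sketch of the scan (illustrative only: it sorts with atan2 and keeps angle ties ordered by distance rather than discarding the nearer points, and it seeds the stack with just the start point, but the non-left-turn pops build the same hull for points in general position):

import math

def cross(o, a, b):
    # z-component of (a - o) x (b - o); > 0 means o -> a -> b is a strict left turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def graham_scan(points):
    start = min(points, key=lambda p: (p[1], p[0]))       # left-most of the bottom-most points
    rest = sorted((p for p in points if p != start),
                  key=lambda p: (math.atan2(p[1] - start[1], p[0] - start[0]),
                                 (p[0] - start[0]) ** 2 + (p[1] - start[1]) ** 2))
    stack = [start]
    for p in rest:
        # Pop until the top two stack entries and p make a strict left turn
        while len(stack) >= 2 and cross(stack[-2], stack[-1], p) <= 0:
            stack.pop()
        stack.append(p)
    return stack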
288
Graham’s Scan [2]
To sort by polar angle, we do not need to compute the angles!
The cross product a ⨉ b = |a| |b| sin θ, where θ is the angle between the vectors a
and b.
If a and b are unit vectors, sorting by the value of the cross product is the same as
a sort by θ because sin θ is monotonic with θ for -𝝅/2 ≤ θ < 𝝅/2.
289
Graham’s Scan [3] - [14]
[Figures: twelve slides stepping through Graham’s Scan on an example point set labelled 1-9,
showing points being pushed onto and popped off the stack as the hull is built.]
290-301
Analysis of Graham’s Scan
Calculating one polar angle is O(1). Calculating n of them is O(n).
Sorting n polar angles is O(n lg n), with any sensible comparison-based sort
(including the tie-break logic to discard points with sub-maximal distance).
As we walk around the hull, each point is only pushed to the stack at most once
and is removed at most once. Every comparison either adds a point to the stack
or removes a point from the stack. Hence the walk is O(n).
302
Jarvis’s March [1]
● Start with the left-most of the bottom-most points, p1, which is on the hull.
● Find the point p2 with the least polar angle relative to a horizontal line through
p1. p2 is also on the hull.
● Repeatedly find the point pi+1 with the least polar angle relative to the line
through pi-1 and pi. pi+1 is on the hull. The pi form the right chain.
○ The repetition continues until a top-most point is reached (might not be unique).
● Repeat the previous two bullets to find the left chain using greatest polar
angles.
● Join the right chain and left chain to get the convex hull.
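A Python sketch of the same idea in its common "gift wrapping" form (illustrative only: it walks the whole hull anticlockwise in a single loop rather than building the right and left chains separately, and it assumes distinct points, at least three of which are not collinear):

def jarvis_march(points):
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def farther(o, a, b):
        # True if b is farther from o than a is (used to break ties between collinear candidates)
        return ((b[0] - o[0]) ** 2 + (b[1] - o[1]) ** 2) > ((a[0] - o[0]) ** 2 + (a[1] - o[1]) ** 2)

    start = min(points, key=lambda p: (p[1], p[0]))       # left-most of the bottom-most points
    hull, current = [], start
    while True:
        hull.append(current)
        candidate = next(p for p in points if p != current)
        for p in points:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            if turn < 0 or (turn == 0 and farther(current, candidate, p)):
                candidate = p                              # p wraps more tightly, or lies farther along the same ray
        current = candidate
        if current == start:
            break
    return hull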
303
Jarvis’s March [2] - [6]
[Figures: five slides stepping through Jarvis’s March on the same example point set labelled 1-9.]
304-308
Analysis of Jarvis’s March
Calculating one polar angle is O(1). Calculating n of them is O(n).
Each hull vertex is found by a scan over all n points, so if the hull has h vertices the total
cost is O(nh).
The right/left chain allows us to exploit the cross product trick for comparisons
because the polar angles, θ, we handle are always in the range -𝝅/2 ≤ θ < 𝝅/2.
309
Revision Guide / Summary of Algorithms 2 [1]
● Graphs
○ Representing the edge set with adjacency lists and adjacency matrices
○ Terminology
● Graph colouring problems: vertex, edge, face colouring
● Breadth-first search
○ With the concept of ‘depth’ to solve vertex colouring
○ Subgraph induced by the predecessors: breadth first tree
● Depth-first search
○ Discovery time and finish time for each vertex
○ Topological sort
● Edge classification: tree edge, back edge, forward edge, cross edge
310
Revision Guide / Summary of Algorithms 2 [2]
● Strongly connected components
○ Two DFSs and the transpose graph
● Shortest path problems:
○ Single-source shortest paths
○ Single-destination shortest paths
○ Single-pair shortest path
○ All-pairs shortest paths
● Complications caused by negative edges, negative cycles, zero-weight cycles
● Bellman-Ford
○ Introduced the concept of edge relaxation
○ Special case for directed acyclic graphs with lower costs
311
Revision Guide / Summary of Algorithms 2 [3]
● Optimal substructure led to Dijkstra’s algorithm
○ Unable to handle negative edge weights
○ Proof of correctness using the convergence lemma
● Matrix multiplication methods for all-pairs shortest paths
○ Mapping domain-specific problems to other theory, to pull in speed-ups from other research
○ Repeated squaring
○ Floyd-Warshall
● Johnson’s algorithm
○ Introduced the concept of reweighting
312
Revision Guide / Summary of Algorithms 2 [4]
● Flow networks
○ Capacity
○ Max-Flow Min-Cut Theorem
○ Ford-Fulkerson (Edmonds-Karp as optimisation)
○ Augmenting paths, flow cancellation
● Bipartite matchings
○ Maximum bipartite matchings (Hopcroft-Karp as an optimisation)
○ Maximum and maximal matchings
● Minimum spanning trees
○ Safe edge theorem
○ Kruskal’s algorithm
○ Prim’s algorithm
313
Revision Guide / Summary of Algorithms 2 [5]
● Amortised analysis
○ Aggregate method
○ Accounting method
○ Potential method
● Mergeable Priority Queues
○ Binomial Heaps
○ Fibonacci Heaps, golden ratio, peculiar property giving them their name
● Disjoint set representations
○ Path compression and union-by-rank
314
Revision Guide / Summary of Algorithms 2 [6]
● Geometric algorithms
○ Simple, planar and closed polygons
○ Defining the inside and outside
○ Winding numbers
○ Line segment intersection problems
○ Cross-product tricks for numerical stability and performance
● Convex Hulls
○ Graham’s scan
○ Jarvis’s March
○ … and I tantalised you with the “Search and Prune” asymptotically optimal method!
315
Thank you for listening!
316