April14SCCs+BFS

The document outlines key concepts in algorithm design and analysis, focusing on Depth First Search (DFS), Breadth First Search (BFS), and Strongly Connected Components (SCCs). It details the procedures for implementing these algorithms, their properties, and their applications in graph theory, including the identification of sources, sinks, and the linearization of Directed Acyclic Graphs (DAGs). Additionally, it introduces Tarjan's algorithm for finding SCCs and discusses the correctness of BFS in determining shortest paths in graphs.

Uploaded by

zhixin shen
CSE 101

Algorithm Design and Analysis


Russell Impagliazzo
[email protected]
Today’s plan: Augmenting DFS, BFS
• SCCs
• Tarjan’s SCC algorithm
• BFS
• Distance in graphs
• Augmenting BFS for distance
• Weighted graph distance (towards Dijkstra’s algorithm)
Depth first search
procedure DFS(G)
  cc = 0
  clock = 1
  for each vertex v:
    visited(v) = false
  for each vertex v:
    if not visited(v):
      cc++
      explore(G,v)

procedure previsit(v)
  pre(v) = clock
  clock++

procedure postvisit(v)
  post(v) = clock
  clock++

procedure explore(G = (V,E), s)
  visited(s) = true
  previsit(s)
  component(s) = cc
  for each edge (s,u):
    if not visited(u):
      prev(u) = s
      explore(G,u)
  postvisit(s)

Definition: a DFS output forest is the forest structure given by the prev array after DFS has been performed on a graph.
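A minimal Python sketch of the DFS procedure, assuming the graph is given as a dictionary that maps every vertex to a list of its out-neighbours (that representation and the function name `dfs` are assumptions made for illustration, not part of the slides). It records `pre`, `post`, `prev` and `component` in the same way as the pseudocode.

```python
def dfs(graph):
    """Run DFS on `graph` (dict: vertex -> list of out-neighbours)."""
    visited = {v: False for v in graph}
    pre, post, prev, component = {}, {}, {}, {}
    clock = 1
    cc = 0

    def explore(s):
        nonlocal clock
        visited[s] = True
        pre[s] = clock            # previsit(s)
        clock += 1
        component[s] = cc
        for u in graph[s]:
            if not visited[u]:
                prev[u] = s       # tree edge (s, u)
                explore(u)
        post[s] = clock           # postvisit(s)
        clock += 1

    for v in graph:
        if not visited[v]:
            cc += 1               # a new tree of the DFS forest starts at v
            explore(v)
    return pre, post, prev, component


example = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["A"]}
pre, post, prev, component = dfs(example)
```

The recursion mirrors `explore` directly; on very deep graphs an explicit stack would be needed to stay within Python's recursion limit.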
Depth first search (Example)
procedure DFS(G)
  cc = 0
  clock = 1
  for each vertex v:
    visited(v) = false
  for each vertex v:
    if not visited(v):
      cc++
      explore(G,v)
Edge types (directed graph)
• Tree edge: solid edge included in the DFS output tree
• Back edge: leads to an ancestor
• Forward edge: leads to a descendant that is not a child in the DFS tree
• Cross edge: leads to neither an ancestor nor a descendant

• Note that the definition of a back edge is slightly different in directed and undirected graphs.
Edge types and pre/post numbers
The different types of edges can be determined from the pre/post numbers for the edge (u, v):
• If (u, v) is a tree/forward edge, then
  • pre(u) < pre(v) < post(v) < post(u)
• If (u, v) is a back edge, then
  • pre(v) < pre(u) < post(u) < post(v)
• If (u, v) is a cross edge, then
  • pre(v) < post(v) < pre(u) < post(u)
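The three inequalities can be turned into a small classifier. The sketch below is illustrative only: the function name `classify_edge` and the hand-written `pre`/`post`/`prev` tables are assumptions, chosen to be consistent with a DFS started at A on the graph with edges A→B, B→C, C→A, A→C, D→C. A self-loop is treated as a back edge, which the strict inequalities alone would not catch.

```python
def classify_edge(u, v, pre, post, prev):
    """Classify edge (u, v) using pre/post numbers from one DFS run."""
    if u == v:
        return "back"                        # a self-loop closes a cycle
    if pre[u] < pre[v] < post[v] < post[u]:
        # v is a descendant of u; it is a tree edge if u is v's DFS parent
        return "tree" if prev.get(v) == u else "forward"
    if pre[v] < pre[u] < post[u] < post[v]:
        return "back"                        # v is an ancestor of u
    if pre[v] < post[v] < pre[u] < post[u]:
        return "cross"                       # v finished before u started
    raise ValueError("numbers are not consistent with a DFS edge")


# Hand-written example data (assumed, see the lead-in).
pre = {"A": 1, "B": 2, "C": 3, "D": 7}
post = {"A": 6, "B": 5, "C": 4, "D": 8}
prev = {"B": "A", "C": "B"}
for edge in [("A", "B"), ("A", "C"), ("C", "A"), ("D", "C")]:
    print(edge, classify_edge(*edge, pre, post, prev))
```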
Directed Acyclic graphs
• A directed graph without a cycle is called acyclic.
• We can test whether a graph is acyclic with DFS.

Step 1: perform DFS on the graph.
Step 2: test each edge to see if it is a back edge.

(How do you implement Step 2?)

Corollary to the previous theorem:
A directed graph G is a DAG if and only if its DFS output forest does not have any back edges.
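One possible way to implement the back-edge test, sketched in Python under the assumption that the graph is a dictionary from vertices to out-neighbour lists (the names `post_numbers` and `is_dag` are hypothetical). It relies on the fact that, among the edge types, only a back edge (u, v) has post(u) < post(v); a self-loop gives post(u) = post(v), so the check uses a strict "greater than" for the acyclic case.

```python
def post_numbers(graph):
    """Return the post number of every vertex after one DFS over `graph`."""
    visited, post = set(), {}
    clock = 1

    def explore(s):
        nonlocal clock
        visited.add(s)
        clock += 1                # previsit tick (pre number not stored here)
        for u in graph[s]:
            if u not in visited:
                explore(u)
        post[s] = clock           # postvisit
        clock += 1

    for v in graph:
        if v not in visited:
            explore(v)
    return post


def is_dag(graph):
    """True iff no edge (u, v) is a back edge, i.e. post[u] > post[v] always."""
    post = post_numbers(graph)
    return all(post[u] > post[v] for u in graph for v in graph[u])
```

Step 1 costs one DFS and Step 2 looks at each edge once, so the whole test stays linear in the size of the graph.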
Linearization of DAGS (topological sort)
• Is it possible to order the vertices such that all edges go
in only one direction?
• For what types of DAGs is this possible?
• How do we find such an ordering?
Property of DAGs
Property: every edge in a DAG goes from a higher post number to a lower post number.

Proof:
Suppose (u,v) is an edge in a DAG. Then it can’t be a back edge, so it can only be a tree edge, a forward edge, or a cross edge.
All of these have the property that post(v) < post(u).
Edge types and pre/post numbers
The different types of edges can be determined from the pre/post numbers for the edge (u, v):
• If (u, v) is a tree/forward edge, then
  pre(u) < pre(v) < post(v) < post(u)
• If (u, v) is a back edge, then
  pre(v) < pre(u) < post(u) < post(v)
• If (u, v) is a cross edge, then
  pre(v) < post(v) < pre(u) < post(u)
Property of DAGs
Linearization of a DAG:
Since we know that edges go in the direction of decreasing post numbers, if we order the vertices by decreasing post numbers then we will have a linearization.

procedure linearize(a DAG G=(V,E))
  run DFS(G)
  return list of vertices in decreasing order of post numbers.
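A short Python sketch of `linearize`, assuming the input is a DAG stored as a dictionary from vertices to out-neighbour lists (an assumed representation). Instead of storing post numbers and sorting, it appends each vertex at postvisit time, which lists the vertices in increasing post order, and then reverses that list.

```python
def linearize(graph):
    """Return the vertices of a DAG in decreasing order of DFS post number."""
    visited, finished = set(), []

    def explore(s):
        visited.add(s)
        for u in graph[s]:
            if u not in visited:
                explore(u)
        finished.append(s)        # postvisit: post numbers increase along this list

    for v in graph:
        if v not in visited:
            explore(v)
    return finished[::-1]         # decreasing post order


dag = {"A": ["C"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
order = linearize(dag)
position = {v: i for i, v in enumerate(order)}
# Every edge (u, v) should point forward in the returned order.
assert all(position[u] < position[v] for u in dag for v in dag[u])
```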
Sources and Sinks
• A source is a vertex in a directed graph that has no incoming edges.
• A sink is a vertex in a directed graph that has no outgoing edges.

[Figure: directed graph on vertices A, B, C, D, E, F]
• Is this graph a DAG?
• Which vertices are sources?
• Which vertices are sinks?
Sources and Sinks
• Property of DAGs:

• Any DAG has at least one source and at least one sink.
• (Why?)
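The two definitions translate directly into code. This is a small sketch, assuming a dictionary from vertices to out-neighbour lists; the function name `sources_and_sinks` and the example graph are made up for illustration.

```python
def sources_and_sinks(graph):
    """Return (sources, sinks) of a directed graph given as dict: vertex -> out-neighbours."""
    has_incoming = {v for neighbours in graph.values() for v in neighbours}
    sources = [v for v in graph if v not in has_incoming]   # no incoming edges
    sinks = [v for v in graph if not graph[v]]              # no outgoing edges
    return sources, sinks


dag = {"A": ["C"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
sources, sinks = sources_and_sinks(dag)
```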
Strongly connected vertices
Two vertices u and v in a directed graph are strongly connected if there exists a path from u to v and a path from v to u.

A maximal set of strongly connected vertices is called a strongly connected component (or an SCC).

[Figure: example directed graph]
Strongly connected component
What are the strongly connected components of this graph?

[Figure: the same example directed graph]
Strongly connected components as vertices. (Meta-graph)

[Figure: the example graph beside its meta-graph. Node labels recovered from the meta-graph: B,C,F,I; D; A; G; H; E; J,K,L,M]
Directed Graphs as DAGs of SCCs
Every Directed graph is a DAG of its strongly connected
components.

Some SCCs are sink SCCs and some are source SCCs.
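To make the meta-graph concrete, here is a sketch that builds it from a graph and a table assigning each vertex to its component. The component labels are supplied by hand here (an assumption for illustration); computing them is the subject of the following slides. The function name `meta_graph` is hypothetical.

```python
def meta_graph(graph, component):
    """Build the DAG of SCCs: one node per component label, one edge per
    pair of distinct components joined by some edge of `graph`."""
    meta = {c: set() for c in component.values()}
    for u in graph:
        for v in graph[u]:
            if component[u] != component[v]:
                meta[component[u]].add(component[v])
    return meta


graph = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["E"], "E": ["D"]}
component = {"A": 2, "B": 2, "C": 2, "D": 1, "E": 1}   # hand-labelled SCCs
meta = meta_graph(graph, component)
# Component 2 ({A, B, C}) is a source SCC; component 1 ({D, E}) is a sink SCC.
```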
Decomposition
There is a linear time algorithm that decomposes a directed graph into its strongly connected components.

If explore is performed on a vertex u, then it will visit only the vertices that are reachable from u.

What vertices will be visited when explore is performed on u if u is in a sink SCC?
Sink SCCs

If explore is performed on a vertex that is in a sink SCC, then only the vertices from that SCC will be visited.

This suggests a way to look for SCCs.

• Start explore on a vertex in a sink SCC and visit its SCC.
• Remove the sink SCC from the graph and repeat.
Source SCCs
Ideally we would like to find a vertex in a sink SCC.
Unfortunately, there is not a direct way to do this.
Source SCCs

However, there is a way to find a vertex in a source SCC.

The vertex with the greatest post number in any DFS output forest belongs to a source SCC.

The vertex with the least post number in a DFS output forest does not necessarily belong to a sink SCC.
Vertices in Source SCCs
The vertex with the greatest post number in any DFS output forest belongs to a source SCC.

To prove this, we will state a more general property:

If C and C′ are strongly connected components and there is an edge from a vertex in C to a vertex in C′, then the highest post number in C is greater than the highest post number in C′.
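Proof
Case 1: DFS visits C before C′:
Every vertex of C and of C′ is reachable from the first vertex of C that DFS visits. So at some point DFS will cross into C′ and visit every vertex in C′, then it will retrace its steps until it gets back to the first vertex of C it started with. That vertex finishes last, so its post number is higher than every post number in C′.

[Figure: components C and C′ with an edge from C to C′]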
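Proof
Case 2: DFS visits C′ before C:
There is no path from C′ back to C (otherwise C and C′ would be one SCC), so DFS will visit all vertices of C′ before getting stuck and assign a post number to all vertices of C′.
Then it will visit some vertex of C later and assign higher post numbers to the vertices of C.

[Figure: components C and C′ with an edge from C to C′]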
Corollary
The strongly connected components can be linearized by
arranging them in decreasing order of their highest post
numbers.
How to find sink SCCs

[Figure: the example directed graph]
How to find sink SCCs
Given a graph G, let G^R be the reverse graph of G (every edge reversed).
Then the source SCCs of G^R are the sink SCCs of G.

So if we perform DFS on G^R, then the vertex with the highest post number is in a source SCC of G^R. This means that this vertex is in a sink SCC of G.

So start with this vertex and explore its SCC.

Then the unvisited vertex with the next greatest post number in G^R is in the next SCC in linear order, so start with that one next.
The SCC algorithm:
• Construct G^R.
• Run DFS on G^R and keep track of the post numbers.
• Run DFS on G, processing the vertices in decreasing order of the post numbers from the previous step.
Every time DFS increments cc, you have found a new SCC!
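The steps above can be sketched in Python as follows, assuming the graph is a dictionary from vertices to out-neighbour lists. The function name `strongly_connected_components` and the example graph are assumptions for illustration. Rather than sorting by post number, the first pass records vertices at postvisit time, which gives the same ordering.

```python
def strongly_connected_components(graph):
    """Label each vertex with the number cc of its SCC (sink SCCs come first)."""
    # Step 1: construct the reverse graph.
    reverse = {v: [] for v in graph}
    for u in graph:
        for v in graph[u]:
            reverse[v].append(u)

    # Step 2: DFS on the reverse graph; `finished` lists vertices by increasing post number.
    visited, finished = set(), []

    def explore_reverse(s):
        visited.add(s)
        for u in reverse[s]:
            if u not in visited:
                explore_reverse(u)
        finished.append(s)

    for v in reverse:
        if v not in visited:
            explore_reverse(v)

    # Step 3: DFS on the original graph in decreasing order of those post numbers.
    component, cc = {}, 0

    def explore(s):
        component[s] = cc
        for u in graph[s]:
            if u not in component:
                explore(u)

    for v in reversed(finished):
        if v not in component:
            cc += 1               # each increment marks a new SCC
            explore(v)
    return component


graph = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["E"], "E": ["D"]}
labels = strongly_connected_components(graph)
```

Each of the three steps touches every vertex and edge a constant number of times, which is where the linear running time comes from.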
[Figure: worked example of the SCC algorithm. The slides show the example graph, a DFS labeling with pre/post numbers 1 through 26, and the six components cc = 1 through cc = 6 found by the second pass.]
B, F, I, H, C, D, G, J, K, M, L, E, A
The SCC algorithm
• Run DFS on G^R and keep track of the post numbers.
• Run DFS on G, processing the vertices in decreasing order of the post numbers from the previous step. Every time DFS increments cc, you have found a new SCC!

How long does this take?

I claim each step takes linear time, so the whole algorithm runs in linear time.
Breadth first search
procedure BFS(G, s)
  Initialize array Visited to False; Visited[s] = True
  Initialize queue Q to contain s
  while Q is not empty:
    v = Q.dequeue
    for each u ∊ N(v) with Visited[u] == False:
      Visited[u] = True; Q.enqueue(u)
  return the set of u such that Visited[u] == True
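A minimal Python version of this procedure, assuming the graph is a dictionary from vertices to neighbour lists (an assumed representation) and using `collections.deque` as the FIFO queue. A set of visited vertices stands in for the `Visited` array.

```python
from collections import deque


def bfs(graph, s):
    """Return the set of vertices reachable from s."""
    visited = {s}
    queue = deque([s])
    while queue:
        v = queue.popleft()           # Q.dequeue
        for u in graph[v]:            # u ranges over N(v)
            if u not in visited:
                visited.add(u)
                queue.append(u)       # Q.enqueue(u)
    return visited


graph = {"A": ["B", "D"], "B": ["C"], "C": [], "D": ["C"], "Z": ["A"]}
reachable = bfs(graph, "A")
```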
Example
[Figure: example graph on vertices A, B, C, D, E, F, G, H]
s=A, Q=(A)
v=A, Q=(B, D)
v=B, Q=(D, C, E)
v=D, Q=(C, E, F)
v=C, Q=(E, F)
v=E, Q=(F)
v=F, Q=(G)
v=G, Q=(H)
v=H, Q is empty
BFS explores the closest vertices first
• So the intuition is that it should find shortest paths.
• How can we keep track of shortest paths/minimum
distances?
Augmented BFS
procedure BFS(G, s)
Input: Graph G = (V,E) (directed or undirected) and a vertex s in V.
Output: For all vertices u reachable from s, dist(u) is the distance from s to u, and for all vertices u not reachable from s, dist(u) = ∞.
  for each vertex u in V:
    dist(u) = ∞
  dist(s) = 0
  Q = [s] (queue just containing s)
  while Q is not empty:
    v = Q.dequeue
    for all edges (v,u) in E:
      if dist(u) = ∞ then
        Q.enqueue(u); prev(u) = v
        dist(u) = dist(v) + 1
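A sketch of augmented BFS in Python, assuming a dictionary from vertices to neighbour lists. A newly discovered vertex u gets dist(u) = dist(v) + 1, where v is the vertex being dequeued. The helper `shortest_path`, which walks the `prev` pointers back from a target, is an addition for illustration and assumes the target is reachable from s.

```python
import math
from collections import deque


def bfs_distances(graph, s):
    """Return (dist, prev): dist[u] is the number of edges on a shortest
    path from s to u, or math.inf if u is unreachable."""
    dist = {u: math.inf for u in graph}
    prev = {}
    dist[s] = 0
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for u in graph[v]:
            if dist[u] == math.inf:       # u has not been discovered yet
                queue.append(u)
                prev[u] = v
                dist[u] = dist[v] + 1
    return dist, prev


def shortest_path(prev, s, t):
    """Follow prev pointers from t back to s (t must be reachable from s)."""
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]


graph = {"A": ["B", "D"], "B": ["C", "E"], "C": [], "D": ["F"],
         "E": [], "F": [], "Z": []}
dist, prev = bfs_distances(graph, "A")
```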
Example
[Figure: example graph on vertices A through H, with a table of dist values to fill in for A, B, C, D, E, F, G, H]
Proof of correctness
• For each vertex v, we want to show that dist(v) is the minimum length of any path from s to v.
• Claim: at the first time when the head of the queue has distance marked d:
  • (1) All vertices at distance ≤ d from s have their distance values correctly set.
  • (2) The vertices in the queue are exactly those at distance d.
  • (3) All other vertices are marked distance infinity.
Proof of correctness
Claim: for each distance value d = 0, 1, 2, …, at the first time the head of the queue has distance d:
(1) all vertices at distance ≤ d from s have their distance values correctly set.
(2) all other vertices (distance > d from s) have distances set to ∞.
(3) The queue contains exactly the nodes at distance d.
• Proof:
• Base Case: (for d = 0)
  • (1) dist(s) = 0 is the correct distance value
  • (2) all other vertices have distances set to ∞
  • (3) The queue contains only s, which is the only vertex at distance 0
Proof of correctness (continued)
• Induction step:
• Let k be an arbitrary integer such that k ≥ 0. Assume that the three statements of the claim are true when d = k.
• (WTS the three statements are true when d = k+1.)
  • (1)
  • (2)
  • (3)
Proof of correctness (continued)
• Induction step (continued):
• By the induction hypothesis, all vertices at distance ≤ k have been set, the queue contains only vertices at distance k, and the remaining vertices are marked distance infinity.
• Since queues are FIFO, the head of the queue will have distance marked k until every vertex currently in the queue has been dequeued.
  • (1) All new vertices added to the queue during this time have distance k+1 and are set correctly, since they are neighbors of distance k vertices and didn’t have distances set before (so their distance from s is more than k).
  • (2) All vertices of distance k+1 have been added to the queue, since each one is the neighbor of some distance k vertex (the second-to-last vertex on a shortest path to it).
• The first time we see the head of the queue at distance k+1 is right after the last time the head of the queue has distance k.
Proof of correctness (continued)
• Using the induction claim:
• Let dmax be the maximum distance from s of any vertex reachable from s.
• Applying the claim to dmax, the first time the head of the queue has distance dmax, all reachable vertices have their correct distances set (since all of them have distance ≤ dmax), and all unreachable vertices are still marked ∞.
Runtime of BFS
• Notice that in BFS, each vertex enters the queue (F) at
most one time.
• This was the assumption we made about graphsearch
when we calculated its runtime.
• So BFS runs in O(|V| + |E|) time.

• Corollary:
For each vertex v, dist(v) is set at most one time.
Edge lengths (weights)
Edges can be given values such as
• Distance
• Cost
• Time
• Bandwidth
• Value
BFS on weighted graphs.
• BFS only works to find shortest distances on graphs in which each edge has equal weight.
[Figure: example graph on vertices A through F with integer edge lengths]
BFS on weighted graphs.
• Discuss how we can use a reduction to solve the problem of shortest paths with edge lengths.
[Figure: example graph on vertices A through F with integer edge lengths]
BFS on weighted graphs.
• On a graph G with positive integer edge lengths, form G′ by adding ℓ_e − 1 new vertices between u and v for every edge e = (u, v) of length ℓ_e. Then run BFS on G′.

[Figure: the example graph G with its edge lengths, beside the unweighted graph G′]
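A sketch of this reduction in Python, assuming directed edges with positive integer lengths stored as a dictionary from each vertex to a list of `(neighbour, length)` pairs. The function names and the tuple names `(u, v, i)` used for the new vertices are hypothetical choices for illustration. For an undirected graph stored with each edge in both directions, this sketch would subdivide each direction separately.

```python
from collections import deque


def subdivide(weighted):
    """Build G' by replacing each edge of length l with a path of l unit edges."""
    unit = {u: [] for u in weighted}
    for u, edges in weighted.items():
        for v, length in edges:
            last = u
            for i in range(1, length):     # l - 1 new vertices on this edge
                dummy = (u, v, i)          # hypothetical name for a new vertex
                unit[dummy] = []
                unit[last].append(dummy)
                last = dummy
            unit[last].append(v)
    return unit


def weighted_distances(weighted, s):
    """Run plain BFS on G' and report distances for the original vertices."""
    unit = subdivide(weighted)
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in unit[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return {v: dist[v] for v in weighted if v in dist}


weighted = {"A": [("B", 4), ("D", 1)], "B": [("C", 2)], "C": [], "D": [("B", 2)]}
distances = weighted_distances(weighted, "A")
```

The size of G′ grows with the sum of the edge lengths rather than with the number of edges, so the cost of the BFS depends on the magnitudes of the lengths.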
Problems with this method
If the edge lengths (weights) are large integers then it is impractical.

In this example with 10 vertices, we must add 1,783 more vertices!
Next class
• Simulating BFS in weighted graphs
• Dijkstra’s algorithm
• Priority queues
