12 13 Union Find
Lecture 12*
Union-Find for Disjoint Sets
Amir Rubinstein
Spring semester 2021-2
* based on the TAU course slides, edited by AR
Union-Find: Definition, Applications
Union-Find
• x ← Make-Set(info):
Create an item x, with associated information info,
and create a set containing it as its single item
• Union(x,y):
Unite the sets containing x and y
• Find(x):
Return some representation of the set containing x
(can be set name, unique memory location, representative
element, etc.)
Basic requirement:
Find(x)=Find(y) iff x and y are currently in same set
Union Find: Example
Items: a b c d e
a ← Make-Set()
b ← Make-Set()
Union(a,b)
Find(b) → a
Find(a) → a
c ← Make-Set()
d ← Make-Set()
e ← Make-Set()
Union(c,d)
Union(d,e)
Find(e) → d
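The three operations and the basic requirement can be tried out with a deliberately naive baseline. This is only a sketch under assumptions: the class name `NaiveUnionFind` and the choice to store a set name with every item are illustrative and not taken from the slides. Union here costs time linear in the number of items.

```python
class NaiveUnionFind:
    """Naive baseline: every item stores the name of its set directly."""

    def __init__(self):
        self.set_of = {}   # item -> name of the set containing it
        self.info = {}     # item -> associated information

    def make_set(self, x, info=None):
        """Create item x in a new singleton set named after x."""
        self.set_of[x] = x
        self.info[x] = info
        return x

    def find(self, x):
        """Return the name of the set containing x."""
        return self.set_of[x]

    def union(self, x, y):
        """Unite the sets containing x and y by relabelling y's set."""
        a, b = self.find(x), self.find(y)
        if a == b:
            return
        for item, name in self.set_of.items():
            if name == b:
                self.set_of[item] = a   # only values change, never keys


uf = NaiveUnionFind()
for name in "abcde":
    uf.make_set(name)
uf.union("a", "b")
uf.union("c", "d")
uf.union("d", "e")
# Basic requirement: Find(x) == Find(y) iff x and y are in the same set.
same = uf.find("c") == uf.find("e")
different = uf.find("a") != uf.find("e")
```

Which element names the united set is a free choice of the implementation; only the basic requirement has to hold.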
Applications (1):
Incremental Connectivity
A graph on n vertices is built by adding edges
At each stage we may want to know whether
two given vertices are already connected
[Figure: a graph on vertices 1–7]
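Incremental connectivity maps directly onto Union-Find: adding an edge is a Union, and a connectivity query compares two Finds. The sketch below assumes vertices numbered 1..n and uses a plain parent array with no balancing; the helper names `make_connectivity`, `add_edge` and `connected` are hypothetical, and the edges in the usage are made up for illustration.

```python
def make_connectivity(n):
    """Return (add_edge, connected) for a graph on vertices 1..n."""
    parent = list(range(n + 1))   # index 0 is unused

    def find(v):
        # Follow parent pointers until reaching a vertex that is its own parent.
        while parent[v] != v:
            v = parent[v]
        return v

    def add_edge(u, v):
        # Adding an edge unites the components of its endpoints.
        parent[find(u)] = find(v)

    def connected(u, v):
        return find(u) == find(v)

    return add_edge, connected


add_edge, connected = make_connectivity(7)
add_edge(1, 2)
add_edge(2, 3)
add_edge(4, 5)
query_before = connected(1, 5)   # components {1,2,3} and {4,5} are still separate
add_edge(3, 4)
query_after = connected(1, 5)    # now joined through the edge (3, 4)
```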
c16 ← Make-Set(16)
find(c6)=find(c7) ?
union(c6,c7)
find(c7)=find(c11) ?
union(c7,c11)
…
[Figure: a 4×4 grid of cells numbered 1–16]
Amortized (♣)
(♣) ignoring time for Find required as part of Union
Union-Find with Linked Lists
Implementation 1: Linked Lists
Each set is represented as a linked list
Each element has a pointer to the list head
[Figure: a list header with fields first, last, size; an item x in the list]
Make-Set creates a new singleton list – O(1) time
Find(x) returns list head – O(1) time
Union using Linked Lists
[Figure: two lists, with headers (first, last, size k1) and (first, last, size k2), containing the items x and y]
Proof:
• Look at any series of operations containing n Make-Set operations and any number of Finds and Unions
• Make-Set, Find and Unions of same-set elements take O(1)
• At most n−1 “real” unions of different sets (why?)
• Whenever the set pointer of an item is changed, the size of the set containing it is at least doubled. So any set pointer can be changed at most log n times, where n is the number of items, and in total all the unions changed O(n log n) pointers
• O(n log n) total work for the “real” unions ⇒ O(log n) amortized
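The doubling argument can be checked on a small run. The sketch below assumes that Union appends the smaller list to the larger one, which is what the "at least doubled" step relies on. The class names `Item`, `Header` and `LinkedListUnionFind` and the `pointer_changes` counter are illustrative additions, not part of the slides.

```python
class Item:
    def __init__(self, info=None):
        self.info = info
        self.next = None    # next item in the same list
        self.head = None    # pointer to the list header


class Header:
    """List header with the fields named on the slide: first, last, size."""

    def __init__(self, item):
        self.first = item
        self.last = item
        self.size = 1


class LinkedListUnionFind:
    def __init__(self):
        self.pointer_changes = 0   # bookkeeping for the proof only

    def make_set(self, info=None):
        item = Item(info)
        item.head = Header(item)
        return item

    def find(self, x):
        return x.head              # O(1): follow the pointer to the header

    def union(self, x, y):
        big, small = x.head, y.head
        if big is small:
            return                 # same set: O(1)
        if big.size < small.size:
            big, small = small, big
        # Redirect every item of the smaller list to the larger header.
        item = small.first
        while item is not None:
            item.head = big
            self.pointer_changes += 1
            item = item.next
        # Append the smaller list at the end of the larger one.
        big.last.next = small.first
        big.last = small.last
        big.size += small.size
```

Each item's header pointer changes only when its set is the smaller of the two, so the counter never exceeds n log n.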
Roadmap for this Lecture
[Table: data structures "Linked lists with union by size" and "Up-trees with union by rank and path compression"; operations and their amortized cost (♣)]
(♣) ignoring time for Find required as part of Union
Union-Find with Up-Trees
Implementation 2: Up-trees
Represent each set as a rooted tree
with pointers to parents only
[Figure: Union(x,y) on two up-trees; a node x with its parent pointer x.p]
The parent of a vertex x is denoted by x.p
Find(x) traces the path from x to the root
Union(x,y) hangs one root under the other
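A minimal sketch of the plain up-tree, before any balancing heuristic is added. Using `None` as the parent of a root is a convention assumed here, and the class name `Node` is illustrative.

```python
class Node:
    def __init__(self, info=None):
        self.info = info
        self.p = None   # parent pointer; None marks a root (assumed convention)


def make_set(info=None):
    return Node(info)


def find(x):
    # Trace the path from x to the root.
    while x.p is not None:
        x = x.p
    return x


def union(x, y):
    rx, ry = find(x), find(y)
    if rx is not ry:
        rx.p = ry       # hang one root under the other (arbitrary direction)
```

Without a rule for choosing which root goes under which, a sequence of unions can build a path of n−1 pointers, so a single Find may take linear time.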
Implementation using Up-trees
For efficiency, use:
Union by rank + Path compression
Union by Rank
• Let’s define rank of a tree to be its height
• This is a temporary definition, as you will see soon
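• In Union, link smaller-rank tree under larger-rank tree
  • No change in ranks
• or if both have the same rank, link arbitrarily
  • +1 for rank of new root
[Figure: for r1 < r2, the tree of rank r1 is linked under the root of rank r2; two trees of rank r are linked into a tree of rank r+1]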
Union by Rank
• Ranks as defined here are simply the heights…
• But with path compression (next) heights may decrease
(ranks only increase)
• So ranks are upper bounds for heights
Union by Rank: Properties (1)
• If x is not a root, then x.rank < x.p.rank and x.rank will not
change any further
Property 1: ranks strictly increase going upwards
Union by Rank: Properties (cont.)
• A tree of rank r contains at least 2^r elements
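The two properties above can be checked mechanically. The sketch below implements union by rank on an array of parent indices (no path compression yet); the class name `RankedForest` and the extra `size` array, kept only to test the 2^r bound, are assumptions made for illustration.

```python
class RankedForest:
    def __init__(self, n):
        self.parent = list(range(n))   # parent[x] == x marks a root
        self.rank = [0] * n
        self.size = [1] * n            # tracked only to check the 2^r property

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] > self.rank[ry]:
            rx, ry = ry, rx            # now rank[rx] <= rank[ry]
        self.parent[rx] = ry           # link the smaller-rank root under the other
        self.size[ry] += self.size[rx]
        if self.rank[rx] == self.rank[ry]:
            self.rank[ry] += 1         # equal ranks: the new root gains +1


def check_properties(forest):
    """Return True if ranks strictly increase upwards and roots have >= 2^rank items."""
    for x in range(len(forest.parent)):
        p = forest.parent[x]
        if p != x and not forest.rank[x] < forest.rank[p]:
            return False
        if p == x and forest.size[x] < 2 ** forest.rank[x]:
            return False
    return True
```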
Find(x) with path compression
Notes:
• Requires 2 passes
• Does not change ranks (as defined earlier)
• heights may decrease
• All previous properties still hold (make sure!)
Union Find with Up-trees: pseudocode
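A minimal Python sketch of the three operations with union by rank and two-pass path compression, following the description in the preceding slides. The dictionary-based representation and the class name `UnionFind` are assumptions; this is one common way to write it, not necessarily the course's own pseudocode.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}   # x -> parent of x; a root is its own parent
        self.rank = {}     # upper bound on the height of the tree rooted at x

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        # Pass 1: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Pass 2: walk up again, pointing every node on the path at the root.
        while x != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root        # ranks are left untouched

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx            # now rank[rx] >= rank[ry]
        self.parent[ry] = rx           # link the smaller-rank root under the other
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
```

After a Find, every node on the traversed path is a direct child of the root, so heights may drop below the stored ranks.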
Ackermann’s function
(one of many variations)
$$A_1(n) = n + 1$$
$$A_2(n) = 2n + 1 > 2n$$
$$A_3(n) > 2^{n+1} \cdot n > 2^{n}$$
$$A_4(n) > \underbrace{2^{2^{\cdot^{\cdot^{\cdot^{2}}}}}}_{n}$$
The Tower function
T(n)        n
2           1
4           2
16          3
65,536      4
2^65,536    5
Inverse functions
The log* n function
Tower(n)    n     |   log*(n)   n
2           1     |   1         0 – 2
4           2     |   2         3 – 4
16          3     |   3         5 – 16
65,536      4     |   4         17 – 65,536
2^65,536    5     |   5         65,537 – 2^65,536
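Both tables can be reproduced with a few lines of Python. The sketch follows the convention of the table, where log*(n) is the smallest i ≥ 1 with Tower(i) ≥ n, so log*(n) = 1 for n ≤ 2; the function names `tower` and `log_star` are illustrative.

```python
def tower(i):
    """Tower(1) = 2 and Tower(i) = 2 ** Tower(i - 1)."""
    t = 2
    for _ in range(i - 1):
        t = 2 ** t
    return t


def log_star(n):
    """Smallest i >= 1 with tower(i) >= n (the convention of the table)."""
    i, t = 1, 2
    while t < n:
        t = 2 ** t
        i += 1
    return i


# Rows of the log* table: the last n that still has log*(n) == i is tower(i).
rows = [(i, tower(i)) for i in range(1, 5)]
```

Python integers are unbounded, so even `tower(5)` (a number with close to 20,000 decimal digits) can be computed directly.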
O(log*n) amortized bounds
O(log*n) upper bound
For the sake of simplicity, we prove an O(log*n)
upper bound on the amortized cost of Find, Make-Set
The O(α(n)) upper bound is more complicated
Defining Levels
• The level of a node x is defined to be
level(x) = log*(x.rank)
level(x)    x.rank
1           0 .. 2
2           3 .. 4
3           5 .. 16
4           17 .. 65,536
5           65,537 .. 2^65,536
i > 1       T(i−1)+1 .. T(i)
Upper Bound on Level Size
• The level of a node x is defined to be
level(x) = log*(x.rank)
• Recall Property 3: At most n/2^r nodes of rank r
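A worked version of the level-size bound, assuming Property 3 in its standard form (at most n/2^r nodes of rank r) and T(i) = 2^{T(i−1)}. Level i ≥ 2 consists of the ranks T(i−1)+1 .. T(i), so summing the geometric series gives:

```latex
\#\{x : \mathrm{level}(x) = i\}
  \;\le\; \sum_{r = T(i-1)+1}^{T(i)} \frac{n}{2^{r}}
  \;<\; \frac{n}{2^{T(i-1)}}
  \;=\; \frac{n}{T(i)} \qquad (i \ge 2)
```

For level 1 the trivial bound of n nodes is used.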
Upper bound on Work during Find
• Bounded by number of parent pointers followed (all but the
last one will change due to path compression).
• How many pointers in a path from x to its root?
Upper bound on Work during Find
• How many pointers in a path from x to its root?
• Main idea: separate between
- Pointers along path to a parent of a higher level
- Pointers along path within the same level
[Figure: a path up to the root with ranks 0, 1, 2, 3, 4, 5, …, 16, 17, …, grouped into Level 1, Level 2, …, Level i]
Total Work on Same-Level Pointers
- Pointer changes along the path within the same level
- At level 1 we have at most n nodes, each can change parent at most twice
(why?)
- Suppose x is some node at level i ≥ 2.
- So T(i−1) < x.rank ≤ T(i)
- x can change parent fewer than T(i) times within level i
- Recall there are at most n/T(i) nodes at level i
- Total work for all nodes at level i is bounded by (n/T(i)) · T(i) = n
- Summing up over all levels, total work for same-level pointers:
$$\le 2n + \sum_{i=2}^{\log^* n - 1} n = n \log^* n$$
O(log*n) upper bound
amort(Make-Set) amort(Find)
Important application of UF:
[Figure: a tree with nodes v ("you are here"), u1, u2, w1, w2, w3 and three "?" marks]
The off-line LCA problem
Postorder traversal on the tree
[Figure: a tree with nodes u3, u2, u1, v and w3, w2, w1; legend: visited / not yet visited; "you are here" marks v]
Going down: u → v:
Make-Set(v)
Going up: v → u:
foreach w<v, then
LCA(w,v) = “Find(w)”
[Animation: further frames of the same slide, in which the "you are here" marker moves up from v to u1, u2 and u3]
The off-line LCA problem
Postorder traversal on the tree
[Figure: the same tree, with "you are here" at v; legend: visited / not yet visited]
We want these to be the representatives
(How do we do it?)
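One standard way to carry out this idea is Tarjan's off-line LCA algorithm, sketched below under several assumptions. The tree is given as a hypothetical `children` dictionary, queries are answered when a node is finished rather than when it is first reached, and each representative stores an `ancestor` entry naming the tree node its set currently hangs from. That table is one possible answer to the question on the slide, not necessarily the one intended in the lecture.

```python
def offline_lca(children, root, queries):
    """children: node -> list of children; queries: list of (v, w) pairs.
    Returns a dict mapping frozenset((v, w)) to the LCA of v and w."""
    parent = {}     # up-tree parent pointers
    ancestor = {}   # representative -> tree node its set hangs from
    finished = set()
    answers = {}

    pending = {}    # node -> other endpoints of its queries
    for v, w in queries:
        pending.setdefault(v, []).append(w)
        pending.setdefault(w, []).append(v)

    def find(x):
        r = x
        while parent[r] != r:
            r = parent[r]
        while x != r:               # path compression
            nxt = parent[x]
            parent[x] = r
            x = nxt
        return r

    def visit(u):
        parent[u] = u               # going down: Make-Set(u)
        ancestor[u] = u
        for v in children.get(u, []):
            visit(v)
            parent[find(v)] = find(u)   # going up v -> u: Union(u, v)
            ancestor[find(u)] = u       # the united set now hangs from u
        finished.add(u)
        for w in pending.get(u, []):
            if w in finished:
                answers[frozenset((u, w))] = ancestor[find(w)]

    visit(root)
    return answers


# Tree shaped like the one in the figure (node names taken from the slide).
tree = {"u3": ["w3", "u2"], "u2": ["w2", "u1"], "u1": ["w1", "v"]}
result = offline_lca(tree, "u3", [("w1", "v"), ("w2", "v"), ("w3", "v"), ("w2", "w3")])
```

The recursion depth equals the height of the tree, so a very deep tree would need an explicit stack instead. With the link direction used here the representative of u's set is always u itself; the `ancestor` table becomes necessary once Union chooses the root by rank.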