
12 13 Union Find

This document provides an overview of the Union-Find data structure, detailing its operations such as Make-Set, Union, and Find, along with their applications in incremental connectivity and maze generation. It discusses various implementations, including linked lists and up-trees, and introduces concepts like union by rank and path compression for efficiency. Additionally, it explores the amortized time complexity of operations and the use of Union-Find in solving the Lowest Common Ancestor problem.


Data Structures

Lecture 12*
Union-Find for Disjoint Sets

Amir Rubinstein
Spring semester 2021-2
* based on the TAU course slides, edited by AR
Union-Find: Definition, Applications
Union-Find
• x ← Make-Set(info):
Create an item x, with associated information info,
and create a set containing it as its single item
• Union(x,y):
Unite the sets containing x and y
• Find(x):
Return some representation of the set containing x
(can be a set name, unique memory location, representative element, etc.)

Basic requirement:
Find(x) = Find(y) iff x and y are currently in the same set
Union Find: Example

Items: a b c d e

a ← Make-Set()      c ← Make-Set()
b ← Make-Set()      d ← Make-Set()
Union(a,b)          e ← Make-Set()
Find(b) → a         Union(c,d)
Find(a) → a         Union(d,e)
                    Find(e) → d
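The example can be replayed with a deliberately naive Python sketch in which every item stores the name of its set and Union relabels one of the two sets. The class name `NaiveUnionFind` and the rule "keep the representative of the first argument" are assumptions made for illustration; the slides only require that Find agrees on items of the same set, so the concrete representatives may differ from the ones shown on the slide.

```python
class NaiveUnionFind:
    """Each item maps to the representative ('set name') of its set."""

    def __init__(self):
        self.rep = {}

    def make_set(self, x):
        # A new item is the representative of its own singleton set.
        self.rep[x] = x
        return x

    def find(self, x):
        return self.rep[x]

    def union(self, x, y):
        rx, ry = self.rep[x], self.rep[y]
        if rx == ry:
            return
        # Relabel every item of y's set: O(n) time per Union.
        for item in list(self.rep):
            if self.rep[item] == ry:
                self.rep[item] = rx


uf = NaiveUnionFind()
for name in "ab":
    uf.make_set(name)
uf.union("a", "b")
print(uf.find("a") == uf.find("b"))   # True

for name in "cde":
    uf.make_set(name)
uf.union("c", "d")
uf.union("d", "e")
print(uf.find("e") == uf.find("c"))   # True
print(uf.find("a") == uf.find("e"))   # False
```

This baseline satisfies the basic requirement but pays O(n) per Union, which is what the implementations later in the lecture improve on.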
Applications (1): Incremental Connectivity
A graph on n vertices is built by adding edges.
At each stage we may want to know whether
two given vertices are already connected.

[Figure: a graph on vertices 1 to 7]

Union(1,2)  Union(2,7)  Find(1) = Find(6)?  Union(3,5)  …
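The sequence of operations on this slide can be sketched in Python as follows. This is a minimal illustration with plain parent pointers and no balancing; the helper names `make_sets` and `connected` are assumptions, not names used in the lecture.

```python
def make_sets(n):
    # parent[v] == v means that v is the representative of its set.
    # Vertices are numbered 1..n; index 0 is unused.
    return list(range(n + 1))

def find(parent, v):
    while parent[v] != v:
        v = parent[v]
    return v

def union(parent, u, v):
    ru, rv = find(parent, u), find(parent, v)
    if ru != rv:
        parent[ru] = rv

def connected(parent, u, v):
    # Two vertices are connected iff they have the same representative.
    return find(parent, u) == find(parent, v)

parent = make_sets(7)
union(parent, 1, 2)
union(parent, 2, 7)
print(connected(parent, 1, 6))  # False
union(parent, 3, 5)
print(connected(parent, 1, 7))  # True
```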


Application (2): Generating mazes
c1 ← Make-Set(1)
c2 ← Make-Set(2)
…
c16 ← Make-Set(16)
Find(c6) = Find(c7)?
Union(c6,c7)
Find(c7) = Find(c11)?
Union(c7,c11)

 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

Choose edges in random order and remove them if they connect two different regions
Fun application: Generating mazes

[Figure: the 4×4 grid of cells 1 to 16 with some of the walls removed]

Proceed until Find(1) == Find(16)
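The maze procedure can be sketched in Python as below. The function name `generate_maze`, the fixed random seed and the use of a shuffled wall list (each wall considered once) are assumptions made for a reproducible illustration; the Find used here is the plain pointer-chasing version without any balancing.

```python
import random

def generate_maze(rows, cols, seed=0):
    """Return the removed walls as pairs of adjacent cell numbers (1-based)."""
    n = rows * cols
    parent = list(range(n + 1))          # cell c is a root iff parent[c] == c

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    # Every wall separates two horizontally or vertically adjacent cells.
    walls = []
    for r in range(rows):
        for c in range(cols):
            cell = r * cols + c + 1
            if c + 1 < cols:
                walls.append((cell, cell + 1))
            if r + 1 < rows:
                walls.append((cell, cell + cols))

    rng = random.Random(seed)
    rng.shuffle(walls)                   # choose edges in random order

    removed = []
    for a, b in walls:
        if find(1) == find(n):           # proceed until Find(1) == Find(n)
            break
        ra, rb = find(a), find(b)
        if ra != rb:                     # the wall separates two different regions
            parent[ra] = rb              # Union
            removed.append((a, b))
    return removed

removed = generate_maze(4, 4)
print(len(removed), removed)
```

Every removed wall corresponds to one Union of two different regions, so at most rows * cols - 1 walls are removed.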
Generating mazes – a larger example

Number of edges to remove (= Unions) is at most n − 1 for a grid of n cells, since every Union merges two regions

The worst-case number of Find operations is unbounded, but the expected number is fine
More applications:
• Maintaining an equivalence relation
• Computing the LCA (lowest common ancestor) of two nodes in a tree (later)
• Computing minimum spanning trees using Kruskal’s algorithm (Algorithms)
• …
Roadmap for this Lecture

Data structure   Linked lists with    Up-trees with union by rank and path compression
Operations       union by size        (can prove)        (will prove)

Make-Set         O(1)                 O(1)               O(log*n)
Union(x,y)       O(log n)             O(1) (♣)           O(1) (♣)
Find(x)          O(1)                 O(α(n))            O(log*n)

The O(log n), O(α(n)) and O(log*n) bounds are amortized
log*n: “Log star”, the repeated logarithm function
α(n): the inverse Ackermann function
(♣) ignoring time for Find required as part of Union
Union-Find with Linked Lists
Implementation 1: Linked Lists
Each set is represented as a linked list
Each element has a pointer to the list head

[Figure: a list head with fields first, last and size; every element x points to the list head]

Make-Set creates a new singleton list: O(1) time
Find(x) returns the list head: O(1) time
Union using Linked Lists

[Figure: two lists, one with head fields (first, last, size k1) and one with (first, last, size k2), containing x and y]

Concatenate the two lists
Change “set pointers” of the shorter list (aka union by size)
Union(x,y) in O(min{k1,k2}) = O(n) time W.C.
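A minimal Python sketch of the linked-list implementation with union by size. The class names `ListSet` and `Item`, and the choice to let `union` return the number of set pointers it changed, are assumptions made so that the cost of each Union can be observed; they are not taken from the slides.

```python
from math import log2

class ListSet:
    """List head: first item, last item and number of items."""
    def __init__(self, item):
        self.first = item
        self.last = item
        self.size = 1

class Item:
    def __init__(self, info):
        self.info = info
        self.next = None
        self.set = ListSet(self)   # Make-Set: a new singleton list, O(1)

def make_set(info):
    return Item(info)

def find(x):
    return x.set                   # O(1): follow the set pointer to the list head

def union(x, y):
    big, small = find(x), find(y)
    if big is small:
        return 0
    if big.size < small.size:
        big, small = small, big
    # Change the set pointers of the shorter list: O(min{k1, k2}).
    changed = 0
    node = small.first
    while node is not None:
        node.set = big
        changed += 1
        node = node.next
    # Concatenate the shorter list after the longer one.
    big.last.next = small.first
    big.last = small.last
    big.size += small.size
    return changed

# Merge 1024 singletons pairwise, round by round, and count pointer changes.
n = 1024
items = [make_set(i) for i in range(n)]
total, step = 0, 1
while step < n:
    for i in range(0, n, 2 * step):
        total += union(items[i], items[i + step])
    step *= 2
print(total, n * log2(n))   # 5120 10240.0
```

In this run the total number of pointer changes stays below n log n, in line with the amortized bound for union by size.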
Union by Size – Amortized Analysis
• Claim: Union runs in O(log n) amortized time

Proof:
• Look at any series of operations, containing n Make-Set operations, and any number of Finds and Unions
• Make-Set, Find and Unions of same-set elements take O(1)
• At most n − 1 “real” unions of different sets (why?)
• Whenever the set pointer of an item is changed, the size of the set containing it is at least doubled. So any set pointer can be changed at most log n times, where n is the number of items, and in total all the unions change at most n log n pointers
• O(n log n) work over the “real” unions ⇒ O(log n) amortized
Union-Find with Up-Trees
Implementation 2: Up-trees
Represent each set as a rooted tree
with pointers to parents only

[Figure: Union(x,y) on up-trees, showing the nodes x and y and the parent pointer x.p]

The parent of a vertex x is denoted by x.p
Find(x) traces the path from x to the root
Union(x,y) hangs one root under the other
Implementation using Up-trees
For efficiency, use:
Union by rank + Path compression

Union by Rank
• Let’s define the rank of a tree to be its height
  • This is a temporary definition, as you will see soon
• In Union, link the smaller-rank tree under the larger-rank tree
  • No change in ranks
• or, if both have the same rank, link arbitrarily
  • +1 for the rank of the new root

[Figure: a tree of rank r1 linked under a root of rank r2, where r1 < r2; two trees of rank r linked into a tree of rank r+1]
Union by Rank
• Ranks as defined here are simply the heights…
• But with path compression (next) heights may decrease (ranks only increase)
• So ranks are upper bounds for heights
Union by Rank: Properties (1)
• If x is not a root, then x.rank < x.p.rank, and x.rank will not change any further
Property 1: ranks strictly increase going upwards
Union by Rank: Properties (cont.)
• A tree of rank r contains at least 2^r elements (by induction):
- Union when r1 < r2 yields a new tree of rank r2 with at least 2^r1 + 2^r2 ≥ 2^r2 elements
- Union when r1 = r2 = r yields a new tree of rank r+1 with at least 2^r + 2^r = 2^(r+1) elements
Union by Rank: Properties (2-3)
• A tree of rank r contains at least 2^r elements

Property 2: Union by rank gives O(log n) W.C. Find time

Property 3: At most n/2^r nodes of rank r
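A short derivation of Properties 2 and 3 from the claim that a tree of rank r contains at least 2^r elements. It assumes that n denotes the total number of items, as elsewhere in the lecture, and is meant as a sketch of the standard counting argument rather than the proof given in class.

```latex
\begin{align*}
\textbf{Property 2:}\quad
  & 2^{r} \le \#\{\text{elements in a tree of rank } r\} \le n
    \;\Longrightarrow\; r \le \log_2 n, \\
  & \text{and a Find follows at most } r \text{ pointers, since the height is at most the rank.} \\[4pt]
\textbf{Property 3:}\quad
  & \text{when a node } x \text{ first reaches rank } r, \text{ it is a root with at least } 2^{r} \text{ nodes below it;} \\
  & \text{these node sets are disjoint for different rank-} r \text{ nodes, so} \\
  & 2^{r} \cdot \#\{x : x.\mathrm{rank} = r\} \le n
    \;\Longrightarrow\; \#\{x : x.\mathrm{rank} = r\} \le \frac{n}{2^{r}} .
\end{align*}
```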
Path Compression

[Figure: Find(x) hangs every node on the path from x directly under the root]

Notes:
• Requires 2 passes
• Does not change ranks (as defined earlier)
• Heights may decrease
• All previous properties still hold (make sure!)
Union Find with Up-trees: pseudocode

Note the path compression here

• See:
https://www.youtube.com/watch?v=VHRhJWacxis (nice video explaining this data structure)
https://observablehq.com/@bryangingechen/union-find-data-structure (configurable demo)
https://www.cs.usfca.edu/~galles/visualization/DisjointSets.html (interactive demo)
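A minimal Python sketch of up-trees with union by rank and two-pass path compression, following the rules stated on the previous slides. The class name `Node` and the convention that a root points to itself are assumptions; the slides only fix the notation x.p and x.rank.

```python
class Node:
    def __init__(self, info=None):
        self.info = info
        self.p = self        # assumption: a root is its own parent
        self.rank = 0

def make_set(info=None):
    return Node(info)

def find(x):
    # Pass 1: trace the path from x to the root.
    root = x
    while root.p is not root:
        root = root.p
    # Pass 2 (path compression): hang every node on the path under the root.
    while x is not root:
        nxt = x.p
        x.p = root
        x = nxt
    return root

def union(x, y):
    rx, ry = find(x), find(y)
    if rx is ry:
        return rx
    if rx.rank < ry.rank:
        rx, ry = ry, rx      # rx is now the root of larger (or equal) rank
    ry.p = rx                # link the smaller-rank tree under the larger-rank tree
    if rx.rank == ry.rank:
        rx.rank += 1         # equal ranks: +1 for the rank of the new root
    return rx

a, b, c, d, e = (make_set(ch) for ch in "abcde")
union(a, b)
union(c, d)
union(d, e)
print(find(a) is find(b))   # True
print(find(e) is find(c))   # True
print(find(a) is find(c))   # False
```

Path compression never touches the rank fields, so ranks remain upper bounds on heights as noted earlier.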
Some More Properties
• Property 4: Once a node stops being a root it will never become a root again. So changing a parent during Find is only possible at this stage in a node’s “lifetime”
• Property 5: Once a node stops being a root, it will remain in the same rank forever
Up-trees
union by rank + path compression
Claim (proof omitted, not trivial):
Any series of m makeset/union/find operations, of which n are makeset, requires O(m α(n)) time.
⇒ the amortized time per operation is O(α(n)).

For the sake of simplicity, we prove an O(log*n) upper bound on the amortized cost of Find, Make-set
The O(α(n)) upper bound is more complicated
What is this mysterious function?
Inverse Ackermann’s
Function
Nesting / Repeated application

Ackermann’s function
(one of many variations)

A_1(n) = n + 1
A_2(n) = 2n + 1 > 2n
A_3(n) > 2^{n+1} \cdot n > 2^n
A_4(n) > \underbrace{2^{2^{\cdot^{\cdot^{2^{n}}}}}}_{n+1 \text{ twos}} > \underbrace{2^{2^{\cdot^{\cdot^{2}}}}}_{n \text{ twos}} = \mathrm{Tower}(n)
The Tower function

T(n)        n
2           1
4           2
16          3
65,536      4
2^65,536    5
Inverse functions

The log*n function

Tower(n)    n        log*(n)   n
2           1        1         0–2
4           2        2         3–4
16          3        3         5–16
65,536      4        4         17–65,536
2^65,536    5        5         65,537–2^65,536

“For all practical purposes log*(n) ≤ 5”
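The two tables can be reproduced with a few lines of Python. The function names `tower` and `log_star` are assumptions; `log_star` follows the convention of the table above, where log*(n) is the smallest i ≥ 1 with Tower(i) ≥ n (so log*(n) = 1 for every n ≤ 2), which differs slightly from some textbook definitions.

```python
def tower(n):
    """Tower(n): a tower of n twos, i.e. Tower(1) = 2 and Tower(n) = 2 ** Tower(n - 1)."""
    t = 1
    for _ in range(n):
        t = 2 ** t
    return t

def log_star(n):
    """Smallest i >= 1 such that tower(i) >= n."""
    i = 1
    while tower(i) < n:
        i += 1
    return i

print([tower(i) for i in range(1, 5)])
# [2, 4, 16, 65536]
print([log_star(n) for n in (2, 3, 4, 5, 16, 17, 65536, 65537)])
# [1, 2, 2, 3, 3, 4, 4, 5]
```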


Inverse Ackermann function

is the inverse of the function

O(log*n) amortized
bounds
O(log*n) upper bound
For the sake of simplicity, we prove an O(log*n) upper bound on the amortized cost of Find, Make-set
The O(α(n)) upper bound is more complicated

We use a variant of the accounting method in which items accumulate debits
Defining Levels
• The level of a node x is defined to be
level(x) = log*(x.rank)

level(x)    x.rank
1           0 .. 2
2           3 .. 4
3           5 .. 16
4           17 .. 65,536
5           65,537 .. 2^65,536
⋮           ⋮
i > 1       T(i−1)+1 .. T(i)
Upper Bound on Level Size
• The level of a node x is defined to be
level(x) = log*(x.rank)
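• Recall Property 3: At most n/2^r nodes of rank r
• So the number of nodes at level i > 1 is at most n/2^(T(i−1)+1) + … + n/2^T(i) < n/2^T(i−1) = n/T(i)

level(x)    x.rank                  size
1           0 .. 2                  ≤ n
2           3 .. 4                  < n/4
3           5 .. 16                 < n/16
4           17 .. 65,536            < n/65,536
5           65,537 .. 2^65,536      < n/2^65,536
⋮           ⋮                       ⋮
i > 1       T(i−1)+1 .. T(i)        < n/T(i)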
Upper bound on Number of Levels
• The level of a node x is defined to be
level(x) = log*(x.rank)

• Ranks are at most log n (Property 2), so we have at most log*(log n) = log*n − 1 levels
Upper bound on Work during Find
• Bounded by the number of parent pointers followed (all but the last one will change due to path compression).
• How many pointers are there in a path from x to its root?
Upper bound on Work during Find
• How many pointers are there in a path from x to its root?
• Main idea: separate between
- Pointers along the path to a parent of a higher level
- Pointers along the path to a parent of the same level

[Figure: the ranks along a path from a node to the root (0, 1, 2, 3, 4, 5, …, 16, 17, …), grouped into Level 1, Level 2, …, Level i, …]
Upper bound on Work during Find
• How many pointers are there in a path from x to its root?
• Main idea: separate between
- Pointers along the path to a parent of a higher level
  - can occur at most log*n times along the path
  - will be “paid” by the Find operations ⇒ O(log*n) amortized
- Pointers along the path to a parent of the same level
  - can occur… ? (see next)
  - will be “paid” by the Make-Set operations
Total Work on Same-Level Pointers
- Pointer changes along the path within the same level
- At level 1 we have at most n nodes, each can change parent at most twice (why?)
- Suppose x is some node at level i ≥ 2
- So T(i−1) < x.rank ≤ T(i)
- Can change parent at most T(i) times
- Recall there are at most n/T(i) nodes at level i
- Total work for all nodes at level i is bounded by T(i) · n/T(i) = n
- Summing up over all levels, total work for same-level pointers:

\le 2n + \sum_{i=2}^{\log^* n - 1} n = n \log^* n
O(log*n) upper bound

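Total work of all Finds ≤ (number of Finds) · O(log*n) + n log*n
- first term: the charge to each Find, summed over all Finds
- second term: the total charge to all nodes over all Finds

⇒ amort(Find) = O(log*n), amort(Make-Set) = O(log*n)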
Important application of UF: Lowest Common Ancestor (LCA)
LCA_T(x,y) = the lowest node z which is an ancestor of both x and y

[Figure: a tree T with root a; nodes b, c, d on the second level; e, f, g, h on the third; i, j, k on the fourth]

LCA(e,k) = a
LCA(f,g) = b
LCA(c,h) = c
The off-line LCA problem
Given a tree T and a collection P of pairs, find LCA_T(x,y) for every (x,y) ∈ P

Using Union-Find we can get O((m + n) α(n)) time, where n = |T| and m = |P|

There are more involved linear time algorithms, even for the on-line version
The off-line LCA problem
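Postorder traversal on the tree

[Figure: the traversal is currently at node v (“you are here”); the nodes u1, u2, u3 and w1, w2, w3 are marked as visited or not yet visited; question marks stand for the answers we are looking for]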
The off-line LCA problem
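Postorder traversal on the tree

[Figure: the traversal is currently at node v (“you are here”); the nodes u1, u2, u3 and w1, w2, w3 are marked as visited or not yet visited]

Going down: u → v:
    Make-Set(v)
Going up: v → u:
    foreach pair (w,v): if w < v (w was visited before v), then
        LCA(w,v) = “Find(w)”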
The off-line LCA problem
Postorder traversal on the tree

[Figure: the traversal is currently at node v (“you are here”); some of the nodes are highlighted]

We want these (the highlighted nodes) to be the representatives
(How do we do it?)
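One standard way to answer the question is Tarjan's off-line LCA algorithm, sketched below in Python. It is named here as a swapped-in technique and is not necessarily the solution intended in the lecture: when the traversal goes up from v to u the two sets are united, and a separate `label` entry records u as the node that represents the merged set, because union by rank may pick a different root. The function name `offline_lca`, the `label` dictionary and the exact shape of the example tree (the edges of the figure are not recoverable) are assumptions.

```python
def offline_lca(children, root, pairs):
    """children: dict node -> list of children; pairs: list of (x, y) queries.
    Returns a dict mapping every query pair to its lowest common ancestor."""
    parent, rank = {}, {}
    label = {}            # label[set root] = tree node representing that set
    visited = set()
    answer = {}
    queries = {}
    for x, y in pairs:
        queries.setdefault(x, []).append(y)
        queries.setdefault(y, []).append(x)

    def find(x):
        r = x
        while parent[r] != r:
            r = parent[r]
        while parent[x] != r:                 # path compression
            parent[x], x = r, parent[x]
        return r

    def union(u, v):
        ru, rv = find(u), find(v)
        if rank[ru] < rank[rv]:
            ru, rv = rv, ru
        parent[rv] = ru                       # union by rank
        if rank[ru] == rank[rv]:
            rank[ru] += 1
        return ru

    def visit(u):
        parent[u], rank[u], label[u] = u, 0, u    # going down: Make-Set(u)
        for v in children.get(u, []):
            visit(v)
            label[union(u, v)] = u                # going up v -> u: u represents the set
        visited.add(u)
        for w in queries.get(u, []):
            if w in visited:                      # w was visited before u
                answer[(u, w)] = answer[(w, u)] = label[find(w)]

    visit(root)
    return {(x, y): answer[(x, y)] for x, y in pairs}

# Assumed tree shape, chosen to be consistent with the LCA values on the earlier slide.
tree = {"a": ["b", "c", "d"], "b": ["e", "f", "g"], "c": ["h"], "e": ["i", "j"], "h": ["k"]}
result = offline_lca(tree, "a", [("e", "k"), ("f", "g"), ("c", "h")])
print(result)   # {('e', 'k'): 'a', ('f', 'g'): 'b', ('c', 'h'): 'c'}
```

The traversal is written recursively for brevity, so very deep trees would need an explicit stack or a larger recursion limit.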
