Chapter 8
Equivalence Relations
A relation R is defined on a set S if, for every pair of elements a, b in S, a R b is either true or false. An equivalence relation is a relation R that satisfies three properties:
Reflexive: a R a for all a in S
Symmetric: a R b if and only if b R a
Transitive: a R b and b R c => a R c
Is the <= operator an equivalence relation?
It is reflexive (a<=a). It is transitive (a<=b and b<=c implies a<=c). But it is not symmetric (a<=b does not imply b<=a), so it is not an equivalence relation.
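The three properties can be checked by brute force on a small finite set. The sketch below is only an illustration: the class name RelationCheck and its methods are hypothetical, and it assumes the elements are the integers 0 to n-1 with the relation supplied as a BiPredicate.

```java
import java.util.function.BiPredicate;

// Hypothetical helper (not from the slides): brute-force property checks
// for a relation r on the elements 0..n-1.
class RelationCheck {
    static boolean isReflexive(int n, BiPredicate<Integer, Integer> r) {
        for (int a = 0; a < n; a++)
            if (!r.test(a, a)) return false;
        return true;
    }

    static boolean isSymmetric(int n, BiPredicate<Integer, Integer> r) {
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                if (r.test(a, b) != r.test(b, a)) return false;
        return true;
    }

    static boolean isTransitive(int n, BiPredicate<Integer, Integer> r) {
        for (int a = 0; a < n; a++)
            for (int b = 0; b < n; b++)
                for (int c = 0; c < n; c++)
                    if (r.test(a, b) && r.test(b, c) && !r.test(a, c)) return false;
        return true;
    }

    // An equivalence relation must pass all three checks.
    static boolean isEquivalence(int n, BiPredicate<Integer, Integer> r) {
        return isReflexive(n, r) && isSymmetric(n, r) && isTransitive(n, r);
    }
}
```

Passing (a, b) -> a <= b should fail only the symmetry check, while a relation such as "a and b have the same parity" should pass all three.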
Examples
Is electrical connectivity between components an equivalence relation?
It is reflexive since a component is connected to itself. It is symmetric since if a is connected to b then b is connected to a. It is transitive since if a connects to b and b connects to c then a has connectivity to c.
Is travel between cities in a country an equivalence relation?
It is reflexive because you may travel from a city to itself. It is symmetric, provided all roads are two-way, because travel from a to b then implies that travel is possible from b to a. It is transitive because travel from a to b and from b to c implies travel from a to c.
Dynamic Equivalence
We write a~b to mean that a is related to b under an equivalence relation. We would like to decide whether a~b for any a, b. If the relation is stored as a 2D array of Boolean values, this takes constant time: the entry for (a, b) is either true or false, telling us whether a~b.
A 2D array would contain all of the relation information explicitly. However, the data may not come to us in this form; it may be given implicitly. For example, a1~a2, a3~a4, a5~a1, a4~a2 implies that all pairs in {a1, a2, a3, a4, a5} are related. We would like to be able to determine this quickly.
The equivalence class of an element a is the subset of S containing all elements related to a. It splits S into two sets: the elements related to a and the elements that are not. So, to know whether a~b, we only need to know whether a and b belong to the same equivalence class.
We start with a collection of N sets, each with one element, and no relation between distinct elements. Since each element belongs to exactly one set, the sets are disjoint. We then define two operations:
Find: returns the name of the set containing a given element.
Union: merges two equivalence classes into one.
The operations on the sets do not involve comparing the values of the elements. For this reason, the elements can simply be numbered 0 to N-1. Actual data items would need to be mapped to these numbers for an application to use them.
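One way an application might map its data items to the numbers 0 to N-1 is with a hash map. The sketch below is an assumption about how this could be done, not something the slides prescribe; the class name ElementIndex and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not from the slides): gives each distinct item
// the next unused index, starting at 0.
class ElementIndex<T> {
    private final Map<T, Integer> ids = new HashMap<>();

    // Returns the index of item, assigning a new one the first time it is seen.
    int idOf(T item) {
        Integer id = ids.get(item);
        if (id == null) {
            id = ids.size();
            ids.put(item, id);
        }
        return id;
    }

    // Number of distinct items seen so far.
    int size() {
        return ids.size();
    }
}
```

The indices returned by idOf can then be used as the elements passed to find and union.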
The find operation returns the name of a set, but the name is somewhat arbitrary since we merely wish to know if find(a) == find(b).
An array could be used by letting the index represent the element and the value represent the name of the set it belongs to. A find could then be done in O(1) time. However, a union would take O(N) time, since it would need to scan the whole array, changing every element of one set to the merged set's name.
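A minimal sketch of this array-of-names approach follows. The class name QuickFindSets is hypothetical, and using each element's own index as its initial set name is an assumption made for the example.

```java
// Sketch of the array-of-names idea; QuickFindSets is a hypothetical name.
class QuickFindSets {
    private final int[] name;   // name[i] is the name of the set containing i

    QuickFindSets(int numElements) {
        name = new int[numElements];
        for (int i = 0; i < name.length; i++)
            name[i] = i;        // assumption: each element starts as its own set name
    }

    // O(1): the set name is stored directly.
    int find(int x) {
        return name[x];
    }

    // O(N): every entry carrying b's set name must be rewritten.
    void union(int a, int b) {
        int from = name[b];
        int to = name[a];
        if (from == to)
            return;             // already in the same set
        for (int i = 0; i < name.length; i++)
            if (name[i] == from)
                name[i] = to;
    }
}
```

The full scan inside union is what makes this representation expensive when many unions are performed.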
Trees
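Initially there are four one-node trees (0, 1, 2, 3); each root stores -1:
index:   0   1   2   3
s[i]:   -1  -1  -1  -1
After Union(2,3), element 3 is a child of 2:
index:   0   1   2   3
s[i]:   -1  -1  -1   2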
So, to perform a union of two sets, we merge the two trees in the array by making one tree's root a child of the other tree's root. This takes O(1) time (constant time). Each set is stored as a separate tree. A collection of trees is called a forest.
Find(x)
The find(x) operation returns the root of the tree containing x. Because a tree may be N-1 levels deep, the worst-case running time of find is O(N). So, a series of M operations could take O(MN) time.
public class DisjSets
{
    /**
     * Construct the disjoint sets object.
     * @param numElements the initial number of disjoint sets.
     */
    public DisjSets( int numElements )
    {
        s = new int [ numElements ];
        for( int i = 0; i < s.length; i++ )
            s[ i ] = -1;
    }

    /**
     * Union two disjoint sets.
     * Assume root1 and root2 are distinct and represent set names.
     * @param root1 the root of set 1.
     * @param root2 the root of set 2.
     */
    public void union( int root1, int root2 )
    {
        s[ root2 ] = root1;
    }

    /**
     * Perform a find. Error checks omitted again for simplicity.
     * @param x the element being searched for.
     * @return the set containing x.
     */
    public int find( int x )
    {
        if( s[ x ] < 0 )
            return x;
        else
            return find( s[ x ] );
    }

    private int [ ] s;
}
Union-by-size: merge the tree with fewer nodes into the tree with more nodes.
[Figure: the forest on elements 0 through 7 before and after union(2,4)]
Array for the forest after union(2,4); each root stores the negative of its tree's size:
index:   0   1   2   3   4   5   6   7
s[i]:   -2   0   4   2  -6   4   4   6
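The union-by-size rule can be sketched as follows. This is one possible implementation under stated assumptions, not necessarily the one the slides intend: the class name DisjSetsBySize and the sizeOf helper are hypothetical, roots store the negated tree size, and ties keep root1 as the root.

```java
import java.util.Arrays;

// Sketch only; DisjSetsBySize is a hypothetical name.
class DisjSetsBySize {
    private final int[] s;   // s[i] < 0: i is a root and -s[i] is its tree size

    DisjSetsBySize(int numElements) {
        s = new int[numElements];
        Arrays.fill(s, -1);              // every element starts as a one-node tree
    }

    // Follows parent links up to the root (no path compression here).
    int find(int x) {
        while (s[x] >= 0)
            x = s[x];
        return x;
    }

    // Assumes root1 and root2 are distinct roots.
    void union(int root1, int root2) {
        if (s[root2] < s[root1]) {       // root2's tree has more nodes
            s[root2] += s[root1];        // add the sizes (both are negative)
            s[root1] = root2;            // smaller tree hangs under root2
        } else {                         // root1's tree is at least as large
            s[root1] += s[root2];
            s[root2] = root1;
        }
    }

    // Number of elements in the set containing x.
    int sizeOf(int x) {
        return -s[find(x)];
    }
}
```

Because the smaller tree always goes under the larger one, a node's depth increases only when its tree at least doubles in size, which keeps every tree's depth at most log N.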
Example sequence of unions on elements 0 through 7:
union(0,1);
union(0,2);
union(4,6);
union(0,4);
[Figure: the trees produced by these unions]
Union-by-height: merge the tree with lesser height into the tree with greater height.
[Figure: the forest on elements 0 through 7 before and after union(2,4)]
Array for the forest after union(2,4) under union-by-height:
index:   0   1   2   3   4   5   6   7
s[i]:   -2   0   4   2  -3   4   4   6
Elements 0 and 4 are roots. Their entries, -2 and -3, encode heights 1 and 2. A root stores -(height + 1): a one-node tree has height 0, and since 0 is not negative, -1 is used for it.
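A sketch of union-by-height using this encoding is shown below. The class name DisjSetsByHeight and the heightOf helper are hypothetical, and keeping root1 as the root on ties is an assumption made for the example.

```java
import java.util.Arrays;

// Sketch only; DisjSetsByHeight is a hypothetical name.
class DisjSetsByHeight {
    private final int[] s;   // s[i] < 0: i is a root storing -(height + 1)

    DisjSetsByHeight(int numElements) {
        s = new int[numElements];
        Arrays.fill(s, -1);              // one-node trees have height 0
    }

    // Follows parent links up to the root (no path compression here).
    int find(int x) {
        while (s[x] >= 0)
            x = s[x];
        return x;
    }

    // Assumes root1 and root2 are distinct roots.
    void union(int root1, int root2) {
        if (s[root2] < s[root1]) {
            s[root1] = root2;            // root2's tree is taller: it stays the root
        } else {
            if (s[root1] == s[root2])
                s[root1]--;              // equal heights: merged tree is one level taller
            s[root2] = root1;
        }
    }

    // Height of the tree containing x.
    int heightOf(int x) {
        return -s[find(x)] - 1;
    }
}
```

The height changes only when two trees of equal height are merged, so the stored value needs updating in just that one case.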
Path Compression
Path compression is done to make finds faster. When a find(x) is performed, every node on the path from x to the root is made a child of the root. Future finds on these nodes are thus faster. This turns out to be quite easy to do.
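After Find(4)
[Figure: before find(4), the tree is the chain 0 <- 1 <- 2 <- 3 <- 4 <- 5; afterwards 1, 2, 3, and 4 are all children of root 0, and 5 remains a child of 4]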
find(4)
public int find( int x )
{
    if( s[ x ] < 0 )
        return x;
    else
        return s[ x ] = find( s[ x ] );
}

Array before find(4):
index:   0   1   2   3   4   5
s[i]:   -1   0   1   2   3   4
Array after find(4) (for example, s[3] = 0 and s[4] = 0):
index:   0   1   2   3   4   5
s[i]:   -1   0   0   0   0   4
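The recursive find can also be written without recursion, which avoids deep call stacks on long chains. The two-pass version below is an alternative sketch, not the slides' code; the class name CompressingSets and the parentOf helper are hypothetical, and union is the simple version that makes root2 a child of root1.

```java
import java.util.Arrays;

// Sketch only; CompressingSets is a hypothetical name.
class CompressingSets {
    private final int[] s;   // s[i] < 0 marks a root; otherwise s[i] is i's parent

    CompressingSets(int numElements) {
        s = new int[numElements];
        Arrays.fill(s, -1);
    }

    // Simple union: assumes root1 and root2 are distinct roots.
    void union(int root1, int root2) {
        s[root2] = root1;
    }

    // Iterative find with path compression.
    int find(int x) {
        int root = x;
        while (s[root] >= 0)             // pass 1: locate the root
            root = s[root];
        while (s[x] >= 0) {              // pass 2: repoint each node on the path
            int next = s[x];
            s[x] = root;
            x = next;
        }
        return root;
    }

    // Exposes the raw array entry for inspection.
    int parentOf(int x) {
        return s[x];
    }
}
```

Building the chain 0 <- 1 <- 2 <- 3 <- 4 <- 5 and calling find(4) should leave 1 through 4 as children of 0 while 5 still points to 4, matching the arrays in the slide.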
Performance
When path compression is used with a smart union algorithm, any sequence of M = Ω(N) union/find operations takes O(M log* N) time. log* N is the number of times the base-2 logarithm must be applied to N until the result is <= 1. Example: log* 65536 = 4, because log 65536 = 16, log 16 = 4, log 4 = 2, and log 2 = 1.
log* 2^65536 = 5, and 2^65536 is a number with about 20,000 digits. So, log* N grows extremely slowly. Because log* N grows so slowly, the performance is almost linear across a series of operations.
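To make the definition concrete, the sketch below counts how many times the base-2 logarithm must be applied before the value drops to 1 or less. The class name IteratedLog is hypothetical. As a simplifying assumption it rounds each logarithm up to an integer, which does not change the count when N is an integer, and it only handles values that fit in a long.

```java
// Sketch only; IteratedLog is a hypothetical name.
class IteratedLog {
    // Smallest k with 2^k >= n, for n >= 1 (the base-2 logarithm rounded up).
    static int ceilLog2(long n) {
        return 64 - Long.numberOfLeadingZeros(n - 1);
    }

    // Number of times log2 must be applied to n until the result is <= 1.
    static int logStar(long n) {
        int count = 0;
        while (n > 1) {
            n = ceilLog2(n);
            count++;
        }
        return count;
    }
}
```

For 65536 the loop visits 16, 4, 2, and 1, which is the chain of four logarithms in the example above.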
Maze Generation
Union/find operations can aid in the construction of a maze. Suppose the maze is to be created so that there is a path from the upper left cell to the lower right cell. Further suppose that there is a path to any cell, meaning all cells are connected (this results in many false paths).
We could begin with each cell in a different set. We then randomly pick a wall between two adjacent cells. If the two cells are not yet connected (they are in different sets), we knock down the wall and union their sets. We continue until all cells are connected, which implies a path from the upper left cell to the lower right cell.
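The procedure can be sketched with union/find as follows. This is an illustration under assumptions, not the slides' implementation: the class name MazeSketch is hypothetical, cells are numbered row * cols + col, only interior walls are considered, and the walls are shuffled once and visited in that order, which stands in for repeatedly picking a random wall.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only; MazeSketch and its method names are hypothetical.
class MazeSketch {
    // Returns the knocked-down walls as {cellA, cellB} pairs.
    static List<int[]> generate(int rows, int cols, Random rng) {
        int[] s = new int[rows * cols];
        Arrays.fill(s, -1);                        // each cell starts in its own set

        // List every interior wall between horizontally or vertically adjacent cells.
        List<int[]> walls = new ArrayList<>();
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++) {
                int cell = r * cols + c;
                if (c + 1 < cols) walls.add(new int[] { cell, cell + 1 });
                if (r + 1 < rows) walls.add(new int[] { cell, cell + cols });
            }
        Collections.shuffle(walls, rng);           // visit the walls in random order

        List<int[]> removed = new ArrayList<>();
        for (int[] w : walls) {
            int a = find(s, w[0]);
            int b = find(s, w[1]);
            if (a != b) {                          // cells not yet connected
                s[b] = a;                          // union the two sets
                removed.add(w);                    // knock down this wall
            }
        }
        return removed;
    }

    // Find with path compression on a raw parent array.
    static int find(int[] s, int x) {
        return s[x] < 0 ? x : (s[x] = find(s, s[x]));
    }
}
```

Each removed wall merges two sets, so a grid with rows * cols cells ends up with exactly rows * cols - 1 walls knocked down once every cell is connected.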
[Figure: a grid of numbered cells, with the sets of connected cells shown at intermediate stages of the algorithm]