Erdős-Rényi Graph Matching
ABSTRACT
Take a pair of correlated Erdős-Rényi graphs and permute the vertex labels of one of the graphs. When does the correlation allow the original labels to be recovered? Learning the original labels is equivalent to finding the true correspondence between the vertex sets. We improve on the achievability bound derived by Pedarsani and Grossglauser. Previously, the converse was only known in the limiting case where the two graphs are identical. We prove a converse that applies to correlated pairs. The converse and achievability thresholds differ by a factor of two throughout the parameter space.
1. INTRODUCTION
2. NOTATION
The set of unordered pairs of elements of $[n]$ is written $\binom{[n]}{2}$; it has $\binom{n}{2}$ elements.
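\[
(G_a(e), G_b(e)) =
\begin{cases}
(1, 1) & \text{w.p. } p_{11} \\
(1, 0) & \text{w.p. } p_{10} \\
(0, 1) & \text{w.p. } p_{01} \\
(0, 0) & \text{w.p. } p_{00}
\end{cases}
\]
\[
\begin{aligned}
p_{11} &= r s_0 s_1 \\
p_{10} &= r s_0 (1 - s_1) \\
p_{01} &= r (1 - s_0) s_1 \\
p_{00} &= 1 - r (s_0 + s_1 - s_0 s_1).
\end{aligned}
\]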
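The parametrization above can be sanity-checked with a short sketch. The function name `joint_edge_probs` and the numeric values are illustrative assumptions, not part of the paper. One natural reading, also an assumption, is that a pair is an edge of a parent graph with probability $r$ and is kept independently in each of the two graphs with probabilities $s_0$ and $s_1$.

```python
from fractions import Fraction

def joint_edge_probs(r, s0, s1):
    """Return {(a, b): probability} for one vertex pair under the (r, s0, s1) parametrization."""
    return {
        (1, 1): r * s0 * s1,
        (1, 0): r * s0 * (1 - s1),
        (0, 1): r * (1 - s0) * s1,
        (0, 0): 1 - r * (s0 + s1 - s0 * s1),
    }

# Hypothetical parameter values, chosen only for illustration.
probs = joint_edge_probs(Fraction(1, 2), Fraction(3, 4), Fraction(2, 3))
total = sum(probs.values())          # the four probabilities form a distribution
p1 = probs[(1, 1)] + probs[(1, 0)]   # marginal P[G_a(e) = 1], which equals r * s0
```

Exact rational arithmetic is used so that the normalization can be checked without rounding error.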
3. CONVERSE
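Thus $E\left[\binom{X}{2}\right] = \binom{n}{2}(1-p)^{2n-3}$. Then
\[
\begin{aligned}
\frac{E[X^2]}{E[X]^2} - 1
&= \frac{2E\left[\binom{X}{2}\right] + E[X]}{E[X]^2} - 1 \\
&= \frac{(n^2 - n)(1-p)^{2n-3} + n(1-p)^{n-1}}{n^2 (1-p)^{2n-2}} - 1 \\
&= (1-p)^{-1} - n^{-1}(1-p)^{-1} + n^{-1}(1-p)^{-n+1} - 1 \\
&\le p + n^{-1}(1-p)^{-n+1},
\end{aligned}
\]
where the last step holds whenever $p^2 \le 1/n$.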
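The moment computation above can be checked exactly on a small instance. The sketch below assumes that $X$ counts the isolated vertices of a graph in which each pair is an edge independently with probability $p$; that reading is an assumption consistent with the two moments used here. The helper name `isolated_vertex_moments` and the values of `n` and `p` are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

def isolated_vertex_moments(n, p):
    """Exact E[X] and E[C(X, 2)] for X = number of isolated vertices, by enumerating all graphs."""
    pairs = list(combinations(range(n), 2))
    e_x = Fraction(0)
    e_x_choose_2 = Fraction(0)
    for edges in product((0, 1), repeat=len(pairs)):
        m = sum(edges)
        weight = p**m * (1 - p) ** (len(pairs) - m)  # probability of this graph
        touched = {v for (pair, bit) in zip(pairs, edges) if bit for v in pair}
        x = n - len(touched)  # isolated vertices in this graph
        e_x += weight * x
        e_x_choose_2 += weight * Fraction(x * (x - 1), 2)
    return e_x, e_x_choose_2

n, p = 4, Fraction(1, 3)
e_x, e_x2 = isolated_vertex_moments(n, p)
relative_variance = (2 * e_x2 + e_x) / e_x**2 - 1
closed_form = (1 - p) ** -1 - (1 - p) ** -1 / n + (1 - p) ** (1 - n) / n - 1
```

With $n = 4$ there are only 64 graphs, so the enumeration is instantaneous and the comparison with the closed form is exact.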
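Proof:
By Bayes' rule and the independence of $\Pi$ from $(G_a, G_b)$, we have
\[
\begin{aligned}
P[\Pi = \pi \mid (G_a \circ \Pi, G_b) = (g_a', g_b)]
&\propto P[\Pi = \pi, (G_a \circ \Pi, G_b) = (g_a', g_b)] \\
&= P[(G_a, G_b) = (g_a' \circ \pi^{-1}, g_b)] \, P[\Pi = \pi] \\
&\propto P[(G_a, G_b) = (g_a' \circ \pi^{-1}, g_b)],
\end{aligned}
\]
where the last step uses that $P[\Pi = \pi]$ does not depend on $\pi$.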
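\[
P[(G_a, G_b) = (g_a' \circ \pi^{-1}, g_b)]
= p_{11}^{\frac{m_a + m_b}{2} - k} \, p_{10}^{\frac{m_a - m_b}{2} + k} \, p_{01}^{\frac{m_b - m_a}{2} + k} \, p_{00}^{N - \frac{m_a + m_b}{2} - k}
= c \left( \frac{p_{10} p_{01}}{p_{11} p_{00}} \right)^{k},
\]
where $c$ collects the factors that do not depend on $k$. Reading $m_a$ and $m_b$ as the edge counts of the two graphs and $2k$ as the number of vertex pairs on which they differ, the four exponents count the pairs of each type.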
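The exponents in this likelihood can be checked by counting pair types directly. The sketch below assumes that $m_a$ and $m_b$ are edge counts, that $2k$ is the size of the symmetric difference of the edge sets, and that $N = \binom{n}{2}$. The two small graphs and the helper name `pair_type_counts` are made up for illustration.

```python
from itertools import combinations

def pair_type_counts(n, edges_a, edges_b):
    """Count vertex pairs of each type (a, b) for two graphs given as sets of frozenset edges."""
    counts = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for u, v in combinations(range(n), 2):
        e = frozenset((u, v))
        counts[(int(e in edges_a), int(e in edges_b))] += 1
    return counts

n = 5
edges_a = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 4)]}
edges_b = {frozenset(e) for e in [(0, 1), (1, 2), (0, 4)]}
counts = pair_type_counts(n, edges_a, edges_b)
N = n * (n - 1) // 2
m_a, m_b = len(edges_a), len(edges_b)
two_k = len(edges_a ^ edges_b)  # size of the symmetric difference of the edge sets
```

The comparisons are done on doubled quantities because $k$ is a half-integer when $m_a + m_b$ is odd.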
\[
P\left[ X \le \tfrac{1}{2} E[X] \right] \le 4 \, \frac{E[X^2] - E[X]^2}{E[X]^2}.
\]
4. ACHIEVABILITY
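Lemma 4. Let $(G_a, G_b) \sim ER(n, p)$, let $\pi$ be a permutation of $[n]$, and let $2k = \Delta(G_a, G_a \circ \pi)$. Conditioned on $k$, $d(\pi, G_a, G_b)$ has the generating function
\[
D_{k,G_b}(z) = \left( \frac{p_{01} z + p_{00} z^{-1}}{p_0} \right)^{k} \left( \frac{p_{11} z^{-1} + p_{10} z}{p_1} \right)^{k}.
\]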
4.1 Cycle combinatorics
Lemma 5.
Let $a_{l,k,r}$ be the number of cyclic sequences of length $l$ with $k$ ones and $r$ ones that are followed by zeros. Let
\[
a_l(x, y, z) = \sum_{k,r} a_{l,k,r} \, x^k y^{l-k} z^r.
\]
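Additionally, $P[d(\pi, G_a, G_b) \le 0 \mid G_a] \le (z^*)^k$ and $P[d(\pi, G_a, G_b) \le 0] \le K(z^*)$, where
\[
z^* = \frac{p_{00} p_{10} + p_{01} p_{11} + 2\sqrt{p_{00} p_{10} p_{01} p_{11}}}{p_0 p_1}
\]
and $K$ is defined as follows. Let $c_l$ be the number of cycles of length $l$ in $\pi$. Then
\[
K(z) = \prod_{l=1}^{n} a_l(p_1, p_0, z)^{c_l}.
\]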
Proof: Let $a(e) = |G_a(\pi(e)) - G_b(e)| - |G_a(e) - G_b(e)|$. Then $d(\pi, G_a, G_b) = \sum_e a(e)$. Because $a(e)$ depends on $G_b$ only at $G_b(e)$, the terms of the sum are conditionally independent given $G_a$. If $G_a(\pi(e)) = G_a(e)$, then $|G_a(\pi(e)) - G_b(e)| = |G_a(e) - G_b(e)|$ and the contribution of $a(e)$ to $d(\pi, G_a, G_b)$ is zero. If $G_a(\pi(e)) \ne G_a(e)$, then $|G_a(\pi(e)) - G_b(e)| \ne |G_a(e) - G_b(e)|$ and $a(e)$ is either $1$ or $-1$.

Within each cycle of $\pi$, the number of $e$ such that $G_a(e) = 0$ and $G_a(\pi(e)) = 1$ is equal to the number of $e$ such that $G_a(e) = 1$ and $G_a(\pi(e)) = 0$. Throughout all of $\pi$, the number of $e$ such that $G_a(e) = 0$ and $G_a(\pi(e)) = 1$ is equal to $k = \frac{1}{2}\Delta(G_a, G_a \circ \pi)$.

Suppose that $G_a(e) = 0$ and $G_a(\pi(e)) = 1$. Then
\[
P[a(e) = 1 \mid G_a(e) = 0, G_a(\pi(e)) = 1] = \frac{p_{01}}{p_0}.
\]
Suppose that $G_a(e) = 1$ and $G_a(\pi(e)) = 0$. Then
\[
P[a(e) = 1 \mid G_a(e) = 1, G_a(\pi(e)) = 0] = \frac{p_{10}}{p_1}.
\]
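Thus
\[
\begin{aligned}
D_{k,G_b}(z)
&= \left( \frac{p_{01} z + p_{00} z^{-1}}{p_0} \right)^{k} \left( \frac{p_{11} z^{-1} + p_{10} z}{p_1} \right)^{k} \\
&= \left( \frac{p_{00} p_{11} z^{-2} + p_{00} p_{10} + p_{01} p_{11} + p_{10} p_{01} z^{2}}{p_0 p_1} \right)^{k}.
\end{aligned}
\]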
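The expansion of $D_{k,G_b}(z)$ and the value $z^*$ from Lemma 4 can be checked numerically. This is a minimal sketch, assuming $p_0 = p_{00} + p_{01}$ and $p_1 = p_{10} + p_{11}$ (the marginals of $G_a$, which the surviving text does not define explicitly). The probability values and all function names are illustrative.

```python
import math

# Illustrative joint probabilities (hypothetical values; they sum to 1).
P = {"11": 0.25, "10": 0.125, "01": 1 / 12, "00": 13 / 24}
P0 = P["00"] + P["01"]  # assumed meaning: P[G_a(e) = 0]
P1 = P["10"] + P["11"]  # assumed meaning: P[G_a(e) = 1]

def factored(z):
    """Product of the two bracketed factors of D_{k,G_b}(z), for k = 1."""
    return ((P["01"] * z + P["00"] / z) / P0) * ((P["11"] / z + P["10"] * z) / P1)

def expanded(z):
    """The expanded four-term form of the same product."""
    num = (P["00"] * P["11"] / z**2 + P["00"] * P["10"]
           + P["01"] * P["11"] + P["10"] * P["01"] * z**2)
    return num / (P0 * P1)

def z_star():
    """Value of the expanded form once the z^-2 and z^2 terms are balanced."""
    root = math.sqrt(P["00"] * P["10"] * P["01"] * P["11"])
    return (P["00"] * P["10"] + P["01"] * P["11"] + 2 * root) / (P0 * P1)

# Scan a log-spaced grid of z values and record the smallest expanded value.
grid = [math.exp(t / 1000) for t in range(-2000, 2001)]
grid_min = min(expanded(z) for z in grid)
q = (math.sqrt(P["11"] * P["00"]) - math.sqrt(P["10"] * P["01"])) ** 2
```

The grid search only illustrates that no scanned value of $z$ drops below $z^*$; the identity $p_0 p_1 (1 - z^*) = (\sqrt{p_{11} p_{00}} - \sqrt{p_{10} p_{01}})^2$ is plain algebra and is checked to floating-point precision.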
Abbreviate $d = d(\pi, G_a, G_b)$. For all $0 < z \le 1$, $P[d \le 0 \mid G_a] \le E[z^d \mid G_a] = D_{k,G_b}(z)$.
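Theorem 3. Let $q = \left( \sqrt{p_{11} p_{00}} - \sqrt{p_{10} p_{01}} \right)^2$ and let $z^*$ be as defined in Lemma 4. Then, for $l \ge 2$,
\[
a_l(p_1, p_0, z^*) \le (1 - 2q)^{l/2}
\]
and
\[
K(z^*) \le (1 - 2q)^{\frac{N - c_1}{2}}.
\]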
\[
b_{l,s} = 2^{1 - l + 2s} \sum_{i} \binom{l}{2i} \binom{i}{s}
\]
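At the value of $z$ that equalizes the $z^{-2}$ and $z^{2}$ terms, $D_{k,G_b}(z)$ equals
\[
\left( \frac{\sqrt{p_{00} p_{10} p_{01} p_{11}} + p_{00} p_{10} + p_{01} p_{11} + \sqrt{p_{00} p_{10} p_{01} p_{11}}}{p_0 p_1} \right)^{k} = (z^*)^k.
\]
Thus
\[
P[d \le 0] \le \sum_{k} P[\Delta(G_a, G_a \circ \pi) = 2k] \, (z^*)^k = K(z^*).
\]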
Then we have
\[
p_1 p_0 (1 - z^*) = p_{00} p_{11} + p_{10} p_{01} - 2\sqrt{p_{00} p_{10} p_{01} p_{11}} = q.
\]
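From the definition of $a_l$ and the expansion $z^r = \sum_s \binom{r}{s} (z - 1)^s$,
\[
\begin{aligned}
a_l(x, y, z)
&= \sum_{k,r} x^k y^{l-k} a_{l,k,r} z^r \\
&= \sum_{k,r} x^k y^{l-k} a_{l,k,r} \sum_{s} \binom{r}{s} (z - 1)^s \\
&= \sum_{k,s} x^k y^{l-k} \, b_{l,s} \binom{l - 2s}{k - s} (z - 1)^s \\
&= \sum_{s} b_{l,s} \, (xy)^s (x + y)^{l - 2s} (z - 1)^s \\
&= (x + y)^l \, b_l\!\left( \frac{xy(z - 1)}{(x + y)^2} \right).
\end{aligned}
\]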
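The identity above can be verified exactly for small $l$ by enumerating all binary cyclic sequences. The sketch below assumes the coefficient formula $b_{l,s} = 2^{1-l+2s} \sum_i \binom{l}{2i}\binom{i}{s}$ and reads $r$ as the number of ones whose cyclic successor is a zero. The function names and the test point are illustrative choices.

```python
from fractions import Fraction
from itertools import product
from math import comb

def count_one_zero_transitions(seq):
    """Number of positions holding a 1 whose cyclic successor holds a 0."""
    l = len(seq)
    return sum(1 for i in range(l) if seq[i] == 1 and seq[(i + 1) % l] == 0)

def a_poly(l, x, y, z):
    """a_l(x, y, z) by direct enumeration of all 2^l binary cyclic sequences."""
    total = 0
    for seq in product((0, 1), repeat=l):
        k = sum(seq)
        r = count_one_zero_transitions(seq)
        total += x**k * y ** (l - k) * z**r
    return total

def b_coeff(l, s):
    """b_{l,s} = 2^(1-l+2s) * sum_i C(l, 2i) C(i, s), as an exact fraction."""
    inner = sum(comb(l, 2 * i) * comb(i, s) for i in range(l // 2 + 1))
    return Fraction(2) ** (1 - l + 2 * s) * inner

def a_closed(l, x, y, z):
    """sum_s b_{l,s} (xy)^s (x+y)^(l-2s) (z-1)^s, the expanded closed form."""
    return sum(
        b_coeff(l, s) * (x * y) ** s * (x + y) ** (l - 2 * s) * (z - 1) ** s
        for s in range(l // 2 + 1)
    )

# Arbitrary rational test point; any values would do.
x, y, z = Fraction(2, 5), Fraction(3, 7), Fraction(5, 3)
matches = [a_poly(l, x, y, z) == a_closed(l, x, y, z) for l in range(1, 9)]
```

Because the arithmetic is exact, agreement at a generic rational point for each $l$ gives good evidence that the two polynomials coincide.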
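\[
b_l(z) = \sum_{s} b_{l,s} z^s
= 2^{1-l} \sum_{i} \binom{l}{2i} \sum_{s} \binom{i}{s} (4z)^s
= 2^{1-l} \sum_{i} \binom{l}{2i} (1 + 4z)^i
\]
and, for $\pi \in S_{n,m}$,
\[
N - c_1 \ge \frac{n(n-1) - (n-m)(n-m-1) - m}{2} = \frac{m(2n - m - 2)}{2}.
\]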
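With $g^2 = 1 + 4 p_1 p_0 (z^* - 1)$ and $l \ge 2$,
\[
\begin{aligned}
a_l(p_1, p_0, z^*)
&= 2^{1-l} \sum_{i} \binom{l}{2i} g^{2i} \\
&= 2^{-l} \sum_{j} \binom{l}{j} \left( 1 + (-1)^j \right) g^j \\
&= \left( \frac{1 + g}{2} \right)^{l} + \left( \frac{1 - g}{2} \right)^{l} \\
&\le \left( \left( \frac{1 + g}{2} \right)^{2} + \left( \frac{1 - g}{2} \right)^{2} \right)^{l/2} \\
&= \left( \frac{1 + g^2}{2} \right)^{l/2} \\
&= \left( 1 + 2 p_1 p_0 (z^* - 1) \right)^{l/2}.
\end{aligned}
\]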
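A numerical check of this chain, as a sketch: it compares a brute-force evaluation of $a_l(p_1, p_0, z)$ with the two-term closed form and with the final upper bound. It assumes $p_0 = 1 - p_1$, $g^2 = 1 + 4 p_1 p_0 (z - 1)$, and $0 < z \le 1$. The parameter values and function names are illustrative.

```python
import math
from itertools import product

def a_l_enumerated(l, p1, z):
    """a_l(p1, p0, z) summed over all 2^l binary cyclic sequences, with p0 = 1 - p1."""
    p0 = 1.0 - p1
    total = 0.0
    for seq in product((0, 1), repeat=l):
        k = sum(seq)  # number of ones
        # r = number of ones whose cyclic successor is a zero
        r = sum(1 for i in range(l) if seq[i] == 1 and seq[(i + 1) % l] == 0)
        total += p1**k * p0 ** (l - k) * z**r
    return total

def a_l_closed(l, p1, z):
    """((1 + g) / 2)^l + ((1 - g) / 2)^l with g^2 = 1 + 4 p1 p0 (z - 1)."""
    p0 = 1.0 - p1
    g = math.sqrt(1.0 + 4.0 * p1 * p0 * (z - 1.0))
    return ((1 + g) / 2) ** l + ((1 - g) / 2) ** l

def a_l_bound(l, p1, z):
    """The upper bound (1 + 2 p1 p0 (z - 1))^(l / 2), valid for l >= 2."""
    p0 = 1.0 - p1
    return (1.0 + 2.0 * p1 * p0 * (z - 1.0)) ** (l / 2)

p1, z = 0.375, 0.8  # hypothetical values with 0 < z <= 1
closed_ok = all(
    abs(a_l_enumerated(l, p1, z) - a_l_closed(l, p1, z)) < 1e-12 for l in range(1, 11)
)
bound_ok = all(
    a_l_enumerated(l, p1, z) <= a_l_bound(l, p1, z) + 1e-12 for l in range(2, 11)
)
```

For $l = 2$ the bound holds with equality; for $l = 1$ the enumerated value is exactly 1, which is why the inequality is stated only for $l \ge 2$.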
\[
K(z) = \prod_{l=1}^{n} a_l(p_1, p_0, z)^{c_l}
\]
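The probability that there is some permutation that produces a better match than the identity permutation is
\[
P\left[ \bigvee_{\pi \ne I} d(\pi, G_a, G_b) \le 0 \right]
\le \sum_{\pi \ne I} P[d(\pi, G_a, G_b) \le 0]
= \sum_{m=2}^{n} \sum_{\pi \in S_{n,m}} P[d(\pi, G_a, G_b) \le 0].
\]
For each $\pi$,
\[
P[d(\pi, G_a, G_b) \le 0] \le K(z^*) \le (1 - 2q)^{\frac{N - c_1}{2}},
\]
where $q = \left( \sqrt{p_{11} p_{00}} - \sqrt{p_{10} p_{01}} \right)^2$.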
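\[
\begin{aligned}
P\left[ \bigvee_{\pi \ne I} d(\pi, G_a, G_b) \le 0 \right]
&\le \sum_{m=2}^{n} n^m (1 - 2q)^{\frac{N - c_1}{2}} \\
&\le \sum_{m=2}^{n} n^m (1 - 2q)^{\frac{m(2n - m - 2)}{4}} \\
&= \sum_{m=2}^{n} \left( n \exp\left( \frac{2n - m - 2}{4} \log(1 - 2q) \right) \right)^{m} \\
&\le \sum_{m=2}^{n} \left( n \exp\left( -\frac{q(2n - m - 2)}{2} \right) \right)^{m} \\
&\le \sum_{m=2}^{n} \left( \exp\left( \log n - \frac{q(n - 2)}{2} \right) \right)^{m} \\
&\le \frac{x^2}{1 - x}
\end{aligned}
\]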
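where
\[
\log x = \log n - \frac{q(n - 2)}{2}.
\]
As long as
\[
q \ge \frac{2(\log n + \omega(1))}{n}
\]
we have
\[
\log x \le \log n - (\log n + \omega(1)) \left( 1 - \frac{2}{n} \right),
\]
which tends to $-\infty$, so the bound $x^2/(1 - x)$ vanishes.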
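To get a feel for the threshold, the final bound can be evaluated numerically. The sketch below compares the sum $\sum_{m=2}^{n} n^m (1-2q)^{m(2n-m-2)/4}$ with the geometric bound $x^2/(1-x)$. The function names, the choice $n = 1000$, and the additive constant 5 standing in for the $\omega(1)$ term are illustrative assumptions.

```python
import math

def union_bound(n, q):
    """Upper bound x^2 / (1 - x) with log x = log n - q (n - 2) / 2; inf when x >= 1."""
    log_x = math.log(n) - q * (n - 2) / 2
    if log_x >= 0:
        return math.inf  # the geometric series bound is vacuous here
    x = math.exp(log_x)
    return x * x / (1 - x)

def direct_sum(n, q):
    """sum_{m=2}^{n} n^m (1 - 2q)^(m (2n - m - 2) / 4), computed in log space."""
    return sum(
        math.exp(m * math.log(n) + m * (2 * n - m - 2) / 4 * math.log(1 - 2 * q))
        for m in range(2, n + 1)
    )

n = 1000
q_above = 2 * (math.log(n) + 5) / n  # above the threshold 2 log(n) / n
q_below = math.log(n) / n            # below the threshold
bound_above = union_bound(n, q_above)
bound_below = union_bound(n, q_below)
sum_above = direct_sum(n, q_above)
```

Above the threshold the bound is small and dominates the direct sum; below it, $x \ge 1$ and the geometric bound gives no information.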
5. REFERENCES
[1] B. Bollobás. Random graphs. Springer, 1998.
[2] P. Pedarsani and M. Grossglauser. On the privacy of anonymized networks. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1235-1243. ACM, 2011.