
Network Analysis and Mining: Pagerank

This document discusses PageRank and eigenvector centrality for network analysis. It explains that PageRank assigns a numeric value to represent a webpage's importance based on the number of votes (links) it receives from other pages. It then describes how eigenvector centrality can be used to model importance in a network by making a node's centrality a function of its neighbors' centralities. The power method is discussed as an iterative way to calculate the principal eigenvector, which represents the steady state of this recursive definition of importance. Finally, modifications are presented to apply eigenvector centrality to directed graphs like the web through PageRank and Katz centrality.

Uploaded by

Manogna Gv
Copyright
© © All Rights Reserved

NETWORK ANALYSIS AND MINING

PageRank

Bhaskarjyoti Das
Department of Computer Science Engineering

Page Rank Formula

PageRank

• Web pages are organized in a network.
–Each webpage is represented as a node.
–Each hyperlink is a directed edge.
–The entire web can be viewed as a directed graph.
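
As a minimal sketch of this representation, the snippet below turns a small link table into an adjacency matrix. The page names and links are hypothetical (they are not from the slides), and the convention that row = source page and column = target page is an assumption chosen to match the directed-graph formulas later in the deck.

```python
import numpy as np

# Hypothetical link table: page -> pages it links to (illustrative only).
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "contact": ["home"],
}

pages = sorted(links)                          # fixed node order
index = {page: i for i, page in enumerate(pages)}

# A[j, i] = 1 means page j has a hyperlink (directed edge) to page i.
A = np.zeros((len(pages), len(pages)), dtype=int)
for source, targets in links.items():
    for target in targets:
        A[index[source], index[target]] = 1

print(pages)
print(A)
```

Each row of `A` lists a page's outgoing links and each column lists its incoming links.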
PageRank Approach

• PageRank is a numeric value that represents how important a page is on the web.

• Webpage importance
–A link from one page to another is a vote for the other page.
–More votes = the more important the page must be.

• How can we model this importance?
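
The simplest possible model is to count votes directly, that is, to rank pages by the number of incoming links. The sketch below does this for a hypothetical link table (page names are illustrative, not from the slides). It treats every vote as equally valuable, which is the limitation the eigenvector-based approach addresses.

```python
from collections import Counter

# Hypothetical link table: page -> pages it links to (illustrative only).
links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "contact": ["home"],
}

# Every hyperlink is one vote for its target page.
votes = Counter(target for targets in links.values() for target in targets)

# Pages sorted from most to least votes; pages with no votes are absent.
for page, count in votes.most_common():
    print(page, count)
```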


Concept of Eigen Vector Centrality For Undirected Graph

• Eigenvector centrality tries to generalize degree centrality by incorporating the importance of the neighbors (undirected).
• Which edge, A-B or B-C, should be severed to make B safe from a sexually transmitted disease (STD)?
• This is a recursive definition: it depends on the propensity of getting infected via your neighbor, your neighbor's neighbor, and so on.
[Figure: undirected graph on nodes A, B, C and D; edges include A-B and B-C]

Why Eigen Vector Makes Sense

• Degree centrality uses only the number of neighbors -> all neighbors are treated as equal.
• We want the centrality of $v_i$ to be a function of its neighbors' centralities.
–We posit that it is proportional to the sum of their centralities.
–Assume initially every node has score $x_i = 1$, and update each node's centrality using the sum of the scores/centralities of its neighbors.
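
The update rule can be traced on a small graph. The sketch below assumes a hypothetical four-node undirected graph with edges A-B, B-C, B-D and C-D (this graph is an illustration, not the one in the slides). The first update reproduces degree centrality; the second already rewards A for being attached to the well-connected node B.

```python
import numpy as np

# Hypothetical undirected graph, node order A, B, C, D.
# Edges: A-B, B-C, B-D, C-D (assumed for illustration).
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
])

x0 = np.ones(4)   # everyone starts with score 1
x1 = A @ x0       # sum of neighbors' scores = degree of each node
x2 = A @ x1       # sum of neighbors' degrees

print(x1)  # [1. 3. 2. 2.]
print(x2)  # [3. 5. 5. 5.]
```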
Eigenvector Centrality by Power Method (by Hotelling)

• $X$ is the popularity vector of all nodes.
• Let $X_j(0)$ be the popularity of an adjacent node $j$ at time $t = 0$.
• Since the popularity of a node $i$ depends on the popularity of the nodes $j$ it is connected to, at time $t = 1$, for node $i$:
$X_i(1) = \sum_j A_{ij} X_j(0)$, summing over all nodes $j$,
or $X(1) = A X(0)$.
• Continuing our recursive definition (using the power method), after $t$ such steps we can say:
$X(t) = A X(t-1) = A^2 X(t-2) = A^3 X(t-3) = \dots = A^t X(0)$
– $X(t) = A^t X(0)$
– As per Hotelling's iterative power method for calculating the dominant eigenvalue, when $t$ is very large the direction of $X(t)$ stabilizes and does not change any further (its length keeps scaling by the dominant eigenvalue at each step unless the vector is renormalized).
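
A minimal sketch of the iteration follows. The function name `power_iteration`, the tolerance, the step limit and the example graph (edges A-B, B-C, B-D, C-D) are all assumptions made for illustration. The vector is renormalized after every step so that it converges to a fixed vector instead of growing without bound.

```python
import numpy as np

def power_iteration(A, max_steps=200, tol=1e-10):
    """Repeat x <- A x (with renormalization) until x stops changing."""
    x = np.ones(A.shape[0])
    x = x / np.linalg.norm(x)
    for _ in range(max_steps):
        x_next = A @ x
        x_next = x_next / np.linalg.norm(x_next)  # keep unit length
        if np.linalg.norm(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Hypothetical undirected graph, node order A, B, C, D.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

centrality = power_iteration(A)
eigenvalue = centrality @ A @ centrality  # Rayleigh quotient estimate

print(centrality)
print(eigenvalue)
```

This sketch relies on the graph being connected and not bipartite; on a bipartite graph the plain iteration can oscillate instead of settling.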
Power Method For Eigen Vectors by Hotelling - 2

• $X(t) = A^t X(0)$
• Let the vector $X(0)$ be a linear combination of the eigenvectors $v_i$ of $A$ (the adjacency matrix):
• $X(0) = \sum_i c_i v_i$, summing over the eigenvectors of $A$ with coefficients $c_i$
• Now plugging in, $X(t) = A^t \sum_i c_i v_i$
• Now, in the equation $A v = \lambda v$:
– Multiplying both sides by $A$: $A(Av) = A(\lambda v) = \lambda (Av) = \lambda^2 v$
– Continuing $t$ times: $A^t v = \lambda^t v$
– Now, after plugging in: $X(t) = \sum_i \lambda_i^t c_i v_i$

Power Method For Eigen Vectors by Hotelling - 3

• $X(t) = \sum_i \lambda_i^t c_i v_i$
• Let $\lambda_1$ be the principal eigenvalue (the one largest in magnitude); we divide both sides of the equation by $\lambda_1^t$:
• $X(t) / \lambda_1^t = \sum_i (\lambda_i / \lambda_1)^t c_i v_i$
• As $t \to \infty$, $\lim_{t \to \infty} X(t) / \lambda_1^t = c_1 v_1$: since $\lambda_1$ is the largest eigenvalue, $|\lambda_i / \lambda_1| < 1$ for every $i \neq 1$, so all other terms in the sum go to zero (this requires $c_1 \neq 0$).

• Hence, in this recursive model, the popularity vector converges to (a vector proportional to) the principal eigenvector of the adjacency matrix.

• Note that this exercise is done for an undirected graph, and the adjacency matrix is assumed to be symmetric.
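
The limit can be checked numerically. The sketch below assumes the same kind of small hypothetical undirected graph (edges A-B, B-C, B-D, C-D; not from the slides) and uses `numpy.linalg.eigh`, which applies because the adjacency matrix is symmetric, to obtain $\lambda_1$ and $v_1$ independently of the iteration.

```python
import numpy as np

# Hypothetical undirected graph, node order A, B, C, D.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(A)
lam1 = eigenvalues[-1]
v1 = eigenvectors[:, -1]
v1 = v1 * np.sign(v1.sum())   # choose the sign that makes entries positive

x0 = np.ones(4)
c1 = v1 @ x0                  # coefficient of v1 in the expansion of X(0)

t = 60
xt = np.linalg.matrix_power(A, t) @ x0
scaled = xt / lam1**t         # X(t) / lambda_1^t

error = np.linalg.norm(scaled - c1 * v1)
print(error)                  # shrinks toward zero as t grows
```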
Eigen Vector Centrality Modified for Directed Graph

•Popularity for an undirected graph can be recast as popularity in a directed graph.
•In a directed graph (such as the web), popularity is passed on only by incoming edges. Hence, a node's popularity becomes zero if it has no incoming edge, even though it can have many outgoing edges.
[Figure: directed graph on nodes A, B, C and D; A has no incoming edge and has an edge to B]
•In the diagram, A has popularity 0 and, since B's one incoming edge is from A, B also has popularity (centrality) 0. This may propagate through the network.
•To resolve this problem, we add an initial popularity $\beta$ to the centrality values of all nodes, whereas the term weighted by $\alpha$ depends on the neighborhood:
•$X_i(1) = \alpha \sum_j A_{ji} X_j(0) + \beta$
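
The zero-propagation problem and its fix can be reproduced on a small example. The sketch below assumes a hypothetical directed graph with edges A->B, B->C, C->D and D->C (not the graph in the slide's figure), stores it with `A[j, i] = 1` for an edge from j to i as in the formula above, and uses illustrative values $\alpha = 0.5$ and $\beta = 1$.

```python
import numpy as np

# Hypothetical directed graph, node order A, B, C, D.
# A[j, i] = 1 means an edge from node j to node i.
A = np.array([
    [0, 1, 0, 0],   # A -> B
    [0, 0, 1, 0],   # B -> C
    [0, 0, 0, 1],   # C -> D
    [0, 0, 1, 0],   # D -> C
], dtype=float)

# Plain update X_i <- sum_j A_ji X_j: zeros spread from A to B.
plain = np.ones(4)
for _ in range(3):
    plain = A.T @ plain

# Biased update X_i <- alpha * sum_j A_ji X_j + beta (assumed values).
alpha, beta = 0.5, 1.0
first_step = alpha * (A.T @ np.ones(4)) + beta
biased = np.ones(4)
for _ in range(3):
    biased = alpha * (A.T @ biased) + beta

print(plain)       # [0. 0. 2. 2.]
print(first_step)  # [1.  1.5 2.  1.5]
print(biased)
```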
Katz Centrality For Directed Graph

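• Rewriting the equation in vector form, with $C$ as the centrality vector and $\mathbf{1}$ the vector of all 1's:
$C = \alpha A^T C + \beta \mathbf{1}$
• Katz centrality (solving the equation above for $C$):
$C_{Katz} = \beta (I - \alpha A^T)^{-1} \mathbf{1}$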
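
As a hedged sketch, Katz centrality can be computed by solving the linear system $(I - \alpha A^T) C = \beta \mathbf{1}$ rather than forming the inverse explicitly. The graph (edges A->B, B->C, C->D, D->C) and the values $\alpha = 0.5$, $\beta = 1$ are assumptions for illustration. The matrix is invertible, and the underlying iteration converges, when $\alpha$ is smaller than the reciprocal of the largest eigenvalue magnitude of $A$, which is 1 for this graph.

```python
import numpy as np

# Hypothetical directed graph, node order A, B, C, D.
# A[j, i] = 1 means an edge from node j to node i.
A = np.array([
    [0, 1, 0, 0],   # A -> B
    [0, 0, 1, 0],   # B -> C
    [0, 0, 0, 1],   # C -> D
    [0, 0, 1, 0],   # D -> C
], dtype=float)

alpha, beta = 0.5, 1.0   # assumed values; alpha < 1 / spectral radius of A
n = A.shape[0]

# Solve (I - alpha * A^T) C = beta * 1 for the centrality vector C.
katz = np.linalg.solve(np.eye(n) - alpha * A.T, beta * np.ones(n))

print(katz)  # [1.  1.5 3.  2.5]
```

Node A now keeps the baseline score $\beta$ even though it has no incoming edge, and B is no longer stuck at zero.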
PageRank Fix to Katz Centrality For Directed Graph

• Issue: in Katz centrality, everyone known by a well-known person is assumed to be well-known, because a node passes its full centrality along every outgoing link.
• To mitigate this problem, we can divide the value of the passed centrality by the number of outgoing links of the node that passes it.

• PageRank centrality is this correction applied to Katz centrality.

• Both Katz and PageRank centrality are variants of eigenvector centrality.
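
The effect of the correction can be seen by computing both measures on one graph. The sketch below follows the Katz-style formulation used in these slides, $C = \beta (I - \alpha A^T D^{-1})^{-1} \mathbf{1}$ with $D$ the diagonal matrix of out-degrees; this written-out form, the hub-and-spoke graph and the values $\alpha = 0.3$, $\beta = 1$ are assumptions for illustration, and the result is not normalized to sum to 1 as in some other PageRank formulations.

```python
import numpy as np

# Hypothetical directed graph: hub page 0 links to pages 1, 2, 3,
# and each of those pages links back to the hub.
# A[j, i] = 1 means an edge from node j to node i.
A = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
], dtype=float)

alpha, beta = 0.3, 1.0   # assumed values
n = A.shape[0]
ones = np.ones(n)

# Katz: each node passes its full centrality along every outgoing link.
katz = np.linalg.solve(np.eye(n) - alpha * A.T, beta * ones)

# PageRank: passed centrality is divided by the sender's out-degree.
out_degree = A.sum(axis=1)
out_degree[out_degree == 0] = 1          # avoid division by zero for sinks
M = A.T / out_degree                     # M[i, j] = A[j, i] / out_degree[j]
pagerank = np.linalg.solve(np.eye(n) - alpha * M, beta * ones)

print(katz)
print(pagerank)
```

Pages 1 to 3 receive the hub's full score under Katz but only a third of it each under PageRank, so their scores drop relative to the hub.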