4. Link Analysis and PageRank S4
Before Google:
• Copy the page that comes back as the first choice into your own
$d_i$: out-degree of node $i$
Solving the flow equations
$$r_j = \sum_{i \to j} \frac{r_i}{d_i}$$
• No unique solution
• Additional constraint forces uniqueness: $\sum_i r_i = 1$
• Gaussian elimination method works for small examples
• A better method for large web-size graphs?
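As a minimal sketch of the Gaussian elimination approach, the snippet below solves the flow equations $r_j = \sum_{i \to j} r_i / d_i$ exactly for a tiny graph. The three-page graph (`y`, `a`, `m`) and the helper name `gaussian_solve` are assumptions made for illustration, not part of the slides. Because the flow equations are linearly dependent, one of them is replaced by the constraint $\sum_i r_i = 1$.

```python
from fractions import Fraction

# Hypothetical 3-page graph (an assumption for illustration):
# each page maps to the list of pages it links to.
links = {"y": ["y", "a"], "a": ["y", "m"], "m": ["a"]}
pages = sorted(links)                      # fixed ordering of the unknowns
n = len(pages)
idx = {p: k for k, p in enumerate(pages)}

# Row j encodes the flow equation r_j - sum_{i->j} r_i / d_i = 0.
A = [[Fraction(0)] * n for _ in range(n)]
b = [Fraction(0)] * n
for j in range(n):
    A[j][j] += 1
for i, outs in links.items():
    for j in outs:
        A[idx[j]][idx[i]] -= Fraction(1, len(outs))

# The equations are linearly dependent, so replace the last one
# with the normalization constraint sum_i r_i = 1.
A[n - 1] = [Fraction(1)] * n
b[n - 1] = Fraction(1)


def gaussian_solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    size = len(b)
    rows = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(size):
        # Swap a row with a nonzero pivot into position.
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so the pivot equals 1.
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        # Eliminate this column from every other row.
        for r in range(size):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [v - f * pv for v, pv in zip(rows[r], rows[col])]
    return [row[size] for row in rows]


ranks = dict(zip(pages, gaussian_solve(A, b)))
print(ranks)
```

Elimination costs on the order of $n^3$ operations for $n$ pages, which is why it is only practical for small examples like this one.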
Matrix Formulation
• Stochastic adjacency matrix M :
• Let page $i$ have $d_i$ out-links
• If $i \to j$, then $M_{ji} = \frac{1}{d_i}$, else $M_{ji} = 0$
• M is a column stochastic matrix
• Columns sum to 1
• Flow equations in matrix form: $r = M r$
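The matrix formulation can be illustrated with a short sketch that builds $M$ from a link list, checks that every column sums to 1, and confirms that a rank vector satisfies $r = Mr$. The graph, the function names `stochastic_matrix` and `mat_vec`, and the candidate vector `r` are assumptions chosen for illustration. The sketch also assumes every page has at least one out-link and no duplicate links.

```python
from fractions import Fraction


def stochastic_matrix(links, pages):
    """Return M as a list of rows, with M[j][i] = 1/d_i if i -> j, else 0."""
    idx = {p: k for k, p in enumerate(pages)}
    n = len(pages)
    M = [[Fraction(0)] * n for _ in range(n)]
    for i, outs in links.items():
        for j in outs:
            M[idx[j]][idx[i]] = Fraction(1, len(outs))
    return M


def mat_vec(M, r):
    """Multiply matrix M (list of rows) by vector r."""
    return [sum(m * x for m, x in zip(row, r)) for row in M]


# Hypothetical graph (an assumption): page -> pages it links to.
links = {"y": ["y", "a"], "a": ["y", "m"], "m": ["a"]}
pages = ["y", "a", "m"]
M = stochastic_matrix(links, pages)

# Column i holds the out-link probabilities of page i, so it sums to 1.
n = len(pages)
column_sums = [sum(M[j][i] for j in range(n)) for i in range(n)]

# Candidate rank vector in the order y, a, m; check that r = M r holds.
r = [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]
is_fixed_point = mat_vec(M, r) == r
print(column_sums, is_fixed_point)
```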
• The Google solution for spider traps: At each time step, the
random surfer has two options:
1. with probability β follow a link at random
2. with probability 1 − β jump to some random page
• Common values for β are in the range 0.8 to 0.9
• The surfer will teleport out of a spider trap or a dead end
within a few time steps
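The random-surfer rule above can be sketched as a repeated update $r_j \leftarrow \beta \sum_{i \to j} r_i / d_i + (1 - \beta)/N$. The iteration scheme (power iteration), the function name `pagerank`, the tolerance, and the example graph with a spider trap at `m` are all assumptions for illustration. The treatment of dead ends here (spreading their rank evenly over all pages) is one common choice and is not spelled out in the slides.

```python
def pagerank(links, beta=0.85, tol=1e-12, max_iter=1000):
    """Power-iteration sketch of PageRank with random teleports.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key. beta is the probability of
    following a link rather than teleporting.
    """
    pages = sorted(links)
    n = len(pages)
    r = {p: 1.0 / n for p in pages}        # start from the uniform vector
    for _ in range(max_iter):
        new = {p: 0.0 for p in pages}
        # With probability beta, follow one of the d_i out-links of page i.
        for i, outs in links.items():
            for j in outs:
                new[j] += beta * r[i] / len(outs)
        # Rank not passed along links (teleports, plus all rank sitting on
        # dead ends) is spread evenly, so the entries keep summing to 1.
        leaked = 1.0 - sum(new.values())
        for p in pages:
            new[p] += leaked / n
        diff = sum(abs(new[p] - r[p]) for p in pages)
        r = new
        if diff < tol:
            break
    return r


# Hypothetical graph where page m links only to itself (a spider trap).
trap = {"y": ["y", "a"], "a": ["y", "m"], "m": ["m"]}
ranks = pagerank(trap, beta=0.8)
print(ranks)
```

Without teleports (β = 1) the trap page `m` would absorb all of the rank. With β = 0.8 it still ranks highest, but `y` and `a` keep a nonzero share.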
Using PageRank in a Search Engine