Page Rank

This document summarizes the PageRank algorithm used by Google to rank web pages. It discusses how PageRank assigns each page a numerical importance value based on the link structure of the Internet, computed as the principal eigenvector of the link structure matrix P. It describes P as representing a random walk through web pages, and explains that the power method approximates the PageRank eigenvector by multiplying an initial guess by P 50 times. The special properties of P ensure that the largest eigenvalue is 1, which makes normalization unnecessary, and the power method converges fast enough that 50 iterations are adequate even for this extremely large matrix.

Uploaded by

Mudit Mathur
Copyright
© Attribution Non-Commercial (BY-NC)

Computing the PageRank

The World’s Largest Eigenvalue Problem

What is PageRank

A search with Google's search engine usually returns a very large number of pages. E.g., a search on 'weather forecast' returns 5.5 million pages.

Although the search returns several million pages, the most relevant pages are usually found within the top ten or twenty pages in the list of results.

How does the search engine know which pages are the most important?

Google assigns a number to each individual page, expressing its importance. This number is known as the PageRank and is computed via the eigenvalue problem

$$P w = \lambda w$$

where P is based on the link structure of the Internet.

The key problem is to formulate the link structure, i.e., the matrix P, in a proper way.

The Link Structure Matrix P

[Figure: a tiny internet with 5 pages, numbered 1 to 5.]

The model forming the basis of the PageRank algorithm is a random walk through all the pages of the Internet. Let $p_t(x)$ denote the probability of being on page x at time t. The PageRank of page x is expressed as $\lim_{t \to \infty} p_t(x)$. To make sure the random walk process does not get stuck, pages with no out-links (here: page 3) are assigned artificial links or "teleporters" to all pages, so column 3 of P holds 1/5 in every row.

$$
P = \begin{pmatrix}
0 & 1/3 & 1/5 & 1/2 & 0 \\
1/3 & 0 & 1/5 & 0 & 0 \\
0 & 1/3 & 1/5 & 0 & 1 \\
1/3 & 0 & 1/5 & 0 & 0 \\
1/3 & 1/3 & 1/5 & 1/2 & 0
\end{pmatrix}
$$

The matrix P is irreducible and stochastic and therefore the random walk can be expressed as a Markov chain, and the PageRank of all pages can be computed as the principal eigenvector of P.

The PageRank Algorithm

The Google matrix P is currently of size $4.2 \times 10^9$ and therefore the eigenvalue computation is not trivial. To find an approximation of the principal eigenvector the power method is used:

    w_0 = initial guess
    For k = 1 to 50
        w_k = P * w_{k-1}
    End
    Return w_50

The special properties of the matrix P ensure that the largest eigenvalue is λ = 1, rendering normalisation in the power method unnecessary. Fast convergence of the power method makes 50 iterations adequate.

Because the computation involves an extremely large matrix, the matrix-vector multiplications must be implemented in parallel on multi-processor systems.
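As a small illustration of the power method on the five-page example, the following Python sketch builds P from a link list and runs 50 iterations. It is a minimal sketch, not the production algorithm: the link list is read off the columns of P in the poster, the uniform initial guess is an assumption (the poster only says "initial guess"), and the names `LINKS`, `build_matrix` and `power_method` are hypothetical helpers introduced here.

```python
from fractions import Fraction

# Out-links of the five-page example, read off the columns of P in the poster.
# Page 3 has no out-links (a dangling page).
LINKS = {1: [2, 4, 5], 2: [1, 3, 5], 3: [], 4: [1, 5], 5: [3]}


def build_matrix(links):
    """Return the column-stochastic matrix P as a list of rows."""
    pages = sorted(links)
    n = len(pages)
    index = {page: i for i, page in enumerate(pages)}
    P = [[Fraction(0)] * n for _ in range(n)]
    for page, targets in links.items():
        j = index[page]
        if targets:
            # Each out-link of page j receives an equal share of its weight.
            for target in targets:
                P[index[target]][j] = Fraction(1, len(targets))
        else:
            # Dangling page: spread its weight uniformly over all pages,
            # matching the column of 1/5 entries in the poster's P.
            for i in range(n):
                P[i][j] = Fraction(1, n)
    return P


def power_method(P, iterations=50):
    """Apply w_k = P * w_{k-1} a fixed number of times, without normalisation."""
    n = len(P)
    Pf = [[float(x) for x in row] for row in P]
    w = [1.0 / n] * n  # assumed initial guess w_0: uniform over all pages
    for _ in range(iterations):
        w = [sum(Pf[i][j] * w[j] for j in range(n)) for i in range(n)]
    return w


P = build_matrix(LINKS)
ranks = power_method(P)
for page, rank in zip(sorted(LINKS), ranks):
    print(f"page {page}: {rank:.4f}")
```

Solving P w = w by hand for this example gives w = (33, 24, 65, 24, 44)/190, so page 3 should come out with the highest rank (about 0.342). Because every column of P sums to 1, the entries of w keep summing to 1 without explicit normalisation, which is the property the poster relies on.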

Informatics and Mathematical Modelling
