Page Rank
A search with Google's search engine usually returns a very large number of pages. For example, a search on 'weather forecast' returns 5.5 million pages.

Although the search returns several million pages, the most relevant pages are usually found within the top ten or twenty pages in the list of results.

How does the search engine know which pages are the most important?

Google assigns a number to each individual page, expressing its importance. This number is known as the PageRank and is computed via the eigenvalue problem

$$P w = \lambda w$$

where $P$ is based on the link structure of the Internet.

Figure: A tiny internet with 5 pages.

The model forming the basis of the PageRank algorithm is a random walk through all the pages of the Internet. Let $p_t(x)$ denote the probability of being on page $x$ at time $t$. The PageRank of page $x$ is expressed as $\lim_{t \to \infty} p_t(x)$. To make sure the random walk process does not get stuck, pages with no out-links (here: page 3) are assigned artificial links, or "teleporters", to all pages.

$$
P = \begin{pmatrix}
0 & 1/3 & 1/5 & 1/2 & 0 \\
1/3 & 0 & 1/5 & 0 & 0 \\
0 & 1/3 & 1/5 & 0 & 1 \\
1/3 & 0 & 1/5 & 0 & 0 \\
1/3 & 1/3 & 1/5 & 1/2 & 0
\end{pmatrix}
$$

The Google matrix $P$ is currently of size $4.2 \times 10^9$, and therefore the eigenvalue computation is not trivial. To find an approximation of the principal eigenvector, the power method is used:

    w_0 = initial guess
    for k = 1 to 50
        w_k = P * w_{k-1}
    end
    return w_50

The special properties of the matrix $P$ ensure that the largest eigenvalue is $\lambda = 1$, rendering normalisation in the power method unnecessary. Fast convergence of the power method makes 50 iterations adequate.

Because the computation involves an extremely large matrix, the matrix-vector multiplications must be implemented in parallel on multi-processor systems.
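As an illustration of how the matrix P of the 5-page example can be derived from a link structure, here is a minimal Python sketch. The `links` dictionary is an assumption: it is inferred from the nonzero entries in each column of P, not taken from the figure itself. The function name `build_transition_matrix` is likewise hypothetical. Each page spreads its probability evenly over its out-links, and a page without out-links (page 3) teleports uniformly to every page.

```python
from fractions import Fraction

# Out-links of the 5-page example, inferred from the nonzero entries of P
# (an assumption; the figure itself is not reproduced here).
links = {1: [2, 4, 5], 2: [1, 3, 5], 3: [], 4: [1, 5], 5: [3]}


def build_transition_matrix(links):
    """Return a column-stochastic matrix as a list of rows.

    Entry P[i][j] is the probability of moving from page j+1 to page i+1.
    """
    pages = sorted(links)
    n = len(pages)
    index = {page: i for i, page in enumerate(pages)}
    P = [[Fraction(0)] * n for _ in range(n)]
    for page, targets in links.items():
        j = index[page]
        if targets:
            # Follow each out-link with equal probability.
            for target in targets:
                P[index[target]][j] = Fraction(1, len(targets))
        else:
            # Page without out-links: teleport uniformly to every page.
            for i in range(n):
                P[i][j] = Fraction(1, n)
    return P


P = build_transition_matrix(links)
for row in P:
    print("  ".join(f"{str(x):>3}" for x in row))
```

Exact fractions are used here so the result can be compared entry by entry with the matrix on the poster. Every column sums to 1, which is the property behind the largest eigenvalue being 1.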
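The power method described above can be tried on the 5-page example. The following is a minimal sketch in plain Python, not a depiction of how Google implements it. The uniform initial guess and the helper names `matvec` and `power_method` are assumptions made for illustration. No normalisation step is included, because each column of P sums to 1 and the entries of w therefore keep summing to 1.

```python
# Transition matrix of the 5-page example; each column sums to 1.
P = [
    [0,   1/3, 1/5, 1/2, 0],
    [1/3, 0,   1/5, 0,   0],
    [0,   1/3, 1/5, 0,   1],
    [1/3, 0,   1/5, 0,   0],
    [1/3, 1/3, 1/5, 1/2, 0],
]


def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_method(A, iterations=50):
    """Apply w_k = A * w_{k-1} for a fixed number of iterations."""
    n = len(A)
    w = [1.0 / n] * n  # uniform initial guess (an assumption)
    for _ in range(iterations):
        w = matvec(A, w)
    return w


w = power_method(P)
# Pages ordered from highest to lowest PageRank (pages are numbered from 1).
ranking = sorted(range(1, len(w) + 1), key=lambda page: -w[page - 1])
print([round(x, 4) for x in w])
print(ranking)
```

For this small matrix, solving P w = w by hand gives w = (33, 24, 65, 24, 44)/190, so page 3 receives the highest PageRank, followed by pages 5 and 1. At Google's scale the `matvec` step would be the part distributed across processors.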