On PageRank

My father is a statistician, and he came across a nice article on Google PageRank in one of his professional magazines.

PageRank was developed in 1998 by the (now very rich) Stanford PhD students Sergey Brin and Larry Page. The vector of page ranks of all web pages is in fact the equilibrium distribution of an enormous Markov chain.

The hyperlink structure of the web is transformed into transition probabilities. On its own, the resulting Markov chain contains all kinds of absorbing states (pages without outgoing links, for instance), but PageRank assumes that every user follows one of the links on the current page with probability α = 0.85, and with probability 1 - α enters an arbitrary new URL. This makes the finite Markov chain irreducible as well as aperiodic, and therefore a unique equilibrium distribution exists. This equilibrium distribution can be determined numerically by solving a system of linear equations. The enormous size of the state space causes storage problems and numerical problems, but the sparse nature of the matrix helps. The choice of α = 6/7 (approximately 0.85) is a lucky one from a numerical point of view.
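
To make this concrete, here is a minimal Python sketch on a made-up four-page web. It is an illustration under stated assumptions, not how Google computes anything: the names `pagerank`, `toy_web` and `alpha` are hypothetical, pages without outgoing links are assumed to jump to a uniformly random page, and a dense NumPy solve stands in for the sparse methods a web-sized matrix would need.

```python
import numpy as np

def pagerank(links, alpha=0.85):
    """Equilibrium distribution of the damped random-surfer chain.

    links: dict mapping each page to the list of pages it links to
           (every target must itself be a key of the dict)
    alpha: probability of following a link (0.85 in the text)
    """
    pages = sorted(links)
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)

    # Row-stochastic transition matrix built from the hyperlink structure.
    P = np.zeros((n, n))
    for page, targets in links.items():
        if targets:
            for target in targets:
                P[index[page], index[target]] += 1.0 / len(targets)
        else:
            # Assumption: a page without outgoing links jumps uniformly.
            P[index[page], :] = 1.0 / n

    # Damped chain: follow a link with probability alpha, else jump anywhere.
    G = alpha * P + (1.0 - alpha) / n * np.ones((n, n))

    # Solve pi G = pi together with sum(pi) = 1. One equation of
    # (G^T - I) pi = 0 is redundant, so it is replaced by the normalisation.
    A = G.T - np.eye(n)
    A[-1, :] = 1.0
    b = np.zeros(n)
    b[-1] = 1.0
    pi = np.linalg.solve(A, b)
    return dict(zip(pages, pi))

# Hypothetical toy web: a links to b and c, b to c, c to a, d to c.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page}: {rank:.4f}")
```

On this toy web, page c, which collects links from all three other pages, ends up with the highest rank, while page d, which nobody links to, keeps only the (1 - alpha)/n share that comes from random jumps.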