The topology of the Internet
Jul 16, 1999
A popular party game called "six degrees of separation" involves trying to connect the actor Kevin Bacon to a randomly chosen Hollywood star via six or fewer other stars or movies. The same game can also be played on the Web: how many hyperlinks are needed to connect any two Web pages? A new physics-based technique developed to answer this question could make it easier to find information on the Web.
There are already more than 300 million documents on the Internet, and only 34% of them have been catalogued by the most popular search engines, such as Hotbot. This has made the topology of the Internet extremely difficult to describe or measure. However, Réka Albert, Hawoong Jeong and Albert-László Barabási of the University of Notre Dame in the US have developed a measuring technique based on power-law distributions from statistical physics. According to their technique, any two randomly selected Web pages are, on average, 18 hyperlinks or clicks apart. This figure, they claim, represents the "diameter" of the Web.
Albert and co-workers built a robot that adds all the Web links found in a document to a database, then follows these links to other Web pages, and so on. This allowed them to calculate the probability distributions for the number of outgoing and incoming links of a given Web page, and then to construct a simple equation that gives the shortest route between any two randomly selected pages.
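The procedure described above can be illustrated with a minimal sketch. It is not the researchers' code: the toy link table `TOY_WEB` and the function names are hypothetical, and a brute-force breadth-first search over a four-page graph stands in for the equation they derived for the real Web.

```python
from collections import Counter, deque

# Hypothetical toy "Web": page name -> pages it links to (an assumption
# made for illustration; a real robot would download and parse documents).
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a", "d"],
    "d": ["b"],
}


def crawl(seed, fetch_links):
    """Record the links on each page, then follow them breadth-first."""
    database = {}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if page in database:
            continue  # already visited
        database[page] = list(fetch_links(page))
        queue.extend(database[page])
    return database


def degree_distributions(database):
    """Return (P_out, P_in): the fraction of pages with k outgoing/incoming links."""
    n = len(database)
    out_deg = {page: len(links) for page, links in database.items()}
    in_deg = {page: 0 for page in database}
    for links in database.values():
        for target in links:
            in_deg[target] += 1
    p_out = {k: c / n for k, c in Counter(out_deg.values()).items()}
    p_in = {k: c / n for k, c in Counter(in_deg.values()).items()}
    return p_out, p_in


def clicks_from(source, database):
    """Fewest clicks from source to every page reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for target in database.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def average_separation(database):
    """Mean shortest click distance over all ordered, connected page pairs."""
    total = pairs = 0
    for source in database:
        for target, d in clicks_from(source, database).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")


db = crawl("a", lambda page: TOY_WEB.get(page, []))
p_out, p_in = degree_distributions(db)
print(p_out, p_in, average_separation(db))
```

On a graph of realistic size the exhaustive search in `average_separation` would be far too slow, which is why a formula built from the measured link distributions is attractive.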
Their technique also predicts that if the number of pages on the Web grows by 1000%, its diameter will only grow from 18 to 20 clicks. Their results have implications for finding information on the Web. If an intelligent robot agent could interpret and follow the relevant Web links, it could find information much faster than the current generation of search engines.
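The slow growth of the diameter is what a logarithmic dependence on the number of pages would produce. The sketch below assumes a relation of the form d = A + B log10(N). The article does not state the equation, so the coefficients A = 0.35 and B = 2.06 are an assumption, taken from the fit usually quoted from the team's published paper, and "grows by 1000%" is read here as roughly a tenfold increase.

```python
import math

# Assumed coefficients: not given in the article, see the note above.
A = 0.35
B = 2.06


def estimated_clicks(n_pages):
    """Average separation in clicks for a Web of n_pages documents (assumed fit)."""
    return A + B * math.log10(n_pages)


today = estimated_clicks(3e8)    # about 300 million documents
tenfold = estimated_clicks(3e9)  # ten times as many documents
print(round(today), round(tenfold))
```

Under this assumed form, every tenfold increase in the number of pages adds only B clicks, about two, whatever the starting size.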