Surfing the Web is now a legitimate topic for scientific research, if the latest issue of Science magazine is any guide. The issue contains two papers on the Web: one proposing a ‘law of surfing’ and another comparing the performance of different search engines.
On page 98 of the April 3 issue of Science, Steve Lawrence and Lee Giles of the NEC Research Institute in Princeton, New Jersey, report that no search engine covers more than one third of the 320 million indexable pages on the Web. The pair studied six search engines and found that HotBot covered the largest share of indexable pages (34% of the total), while Lycos covered the smallest (3%). However, HotBot also had the highest percentage of ‘dead links’, with 5.3% of its links not leading to the correct page; across all six engines, an average of 3.1% of links were dead.
Lawrence and Giles suggest that the best way to search the Internet is to use a search engine that combines the output of several engines, such as Metacrawler. To find an individual’s homepage, however, they recommend a ‘softbot’ – a program, such as AHOY, that can intelligently sort search data into a more meaningful structure.
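Neither paper spells out how such a combined search works, but the basic metasearch step is easy to picture: send the same query to several engines and merge the ranked lists they return. The sketch below is a purely hypothetical illustration of that merging step – it does not call any real engine, and the example data and function name are invented – interleaving the lists rank by rank and dropping duplicate URLs.

```python
def merge_results(results_by_engine):
    """Interleave ranked result lists from several engines,
    keeping only the first occurrence of each URL (hypothetical sketch)."""
    merged, seen = [], set()
    # Walk rank by rank: every engine's first hit, then every engine's second hit, ...
    for rank in range(max(len(r) for r in results_by_engine.values())):
        for results in results_by_engine.values():
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged

# Invented example data standing in for real engine output.
print(merge_results({
    "engine_a": ["http://example.org/1", "http://example.org/2"],
    "engine_b": ["http://example.org/2", "http://example.org/3"],
}))
```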
On page 95 of the same issue, Bernardo Huberman and colleagues at the Xerox Palo Alto Research Center in Palo Alto, California, found that users follow common patterns of behaviour when looking at Web sites. The team constructed a model that relates an ‘interest’ threshold to the number of pages a user will look at, and tested it by studying the Web behaviour of 23,692 America Online users over five different days. They discovered – just as their model predicted – that on average a user looks at only three pages on a Web site, with most users looking at just one. Part of the reason for this low number could be the slow access speeds available to most America Online members. The model cannot yet predict the behaviour of particular communities, such as physicists.
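The threshold idea behind the ‘law of surfing’ can be sketched in a few lines of code. The simulation below is not the published model; it is a minimal illustration, assuming that the perceived value of each successive page drifts randomly and that a user stops clicking once that value falls below an interest threshold. All parameter values (starting value, drift, noise, threshold) are invented for illustration.

```python
import random

def pages_visited(start=1.0, threshold=0.0, drift=-0.3, noise=0.6, max_pages=100):
    """One simulated visit: keep clicking while the perceived value of the
    next page stays above the interest threshold (all parameters invented)."""
    value, pages = start, 1                  # every visitor sees at least one page
    while pages < max_pages:
        value += random.gauss(drift, noise)  # value of the next page drifts randomly
        if value <= threshold:               # interest exhausted: stop surfing
            break
        pages += 1
    return pages

# Simulate many visits: the depth distribution is strongly skewed, with many
# one- or two-page visits and a long tail of deeper sessions.
depths = [pages_visited() for _ in range(10_000)]
print("mean pages per visit:", sum(depths) / len(depths))
print("share of one-page visits:", depths.count(1) / len(depths))
```

A stopping rule of this kind makes short visits dominate even though a minority of simulated users surf much deeper, which is the qualitative pattern the Xerox team reported.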