The web is a bow tie


A study of the web's structure, five times larger than any attempted previously, reveals that it isn't the fully interconnected
network that we've been led to believe. The study suggests that the chance of being able to surf between two randomly
chosen pages is less than one in four.

Researchers from three Californian groups (IBM's Almaden Research Center in San Jose, the AltaVista search engine
in San Mateo and Compaq Systems Research Center in Palo Alto) have analysed 200 million web pages and 1.5 billion
hyperlinks. Their results, which will be presented next week at the World Wide Web 9 Conference in Amsterdam, indicate
that the web is made up of four distinct components.
 

Figure 1 The web is a bow tie
 

A central core contains pages between which users can surf easily. Another large cluster, labelled 'in', contains pages that
link to the core but cannot be reached from it. These are often new pages that have not yet been linked to. A separate 'out'
cluster consists of pages that can be reached from the core but do not link to it, such as corporate websites containing only
internal links. Other groups of pages, called 'tendrils' and 'tubes', connect to the in cluster, the out cluster or both, but not to
the core. Some pages are completely unconnected. To illustrate this structure, the researchers picture the web as a
plot shaped like a bow tie with finger-like projections.
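
The description above can be made concrete with a small sketch. This is not the researchers' method or code; it is a toy illustration in Python, assuming the web is given as a list of (source, target) link pairs. The function names `reachable` and `bow_tie` and the page names in the example are hypothetical. The core is taken to be the largest set of pages that can all reach one another, 'in' is everything that can reach the core, 'out' is everything reachable from it, and the remainder (tendrils, tubes and unconnected pages) is lumped together as 'other'.

```python
from collections import defaultdict, deque


def reachable(start, adjacency):
    """Return every page reachable from `start` by following `adjacency` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for neighbour in adjacency.get(page, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def bow_tie(links):
    """Split pages into core, in, out and other, given (source, target) link pairs."""
    forward = defaultdict(set)   # page -> pages it links to
    backward = defaultdict(set)  # page -> pages that link to it
    pages = set()
    for source, target in links:
        forward[source].add(target)
        backward[target].add(source)
        pages.update((source, target))

    if not pages:
        return {"core": set(), "in": set(), "out": set(), "other": set()}

    # The core is the largest group of mutually reachable pages.
    # This simple search is quadratic and only suitable for toy graphs.
    core = set()
    assigned = set()
    for page in sorted(pages):
        if page in assigned:
            continue
        component = reachable(page, forward) & reachable(page, backward)
        assigned |= component
        if len(component) > len(core):
            core = component

    seed = next(iter(core))
    downstream = reachable(seed, forward)   # core plus 'out'
    upstream = reachable(seed, backward)    # core plus 'in'
    return {
        "core": core,
        "in": upstream - core,
        "out": downstream - core,
        "other": pages - upstream - downstream,  # tendrils, tubes, unconnected
    }


# Hypothetical toy web: a three-page core, one new page linking in,
# a corporate site reachable from the core, a tendril and an island.
links = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("new", "a"),
    ("c", "corp"), ("corp", "corp2"),
    ("new", "tendril"),
    ("island1", "island2"),
]
result = bow_tie(links)
```

In this toy web, a surfer starting at `new` can reach the core, but no chain of links leads back to `new`, which is the asymmetry that makes many pairs of pages unreachable from one another.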
 

Nature 405, 113 (2000) © Macmillan Publishers Ltd 2000. Registered No. 785998 England.