Web: 19 Clicks Wide
InitZero writes "The journal Nature reports that the web is only 19 clicks wide. What it fails to mention is that at least one of those must be through Kevin Bacon." The graphic at the beginning of the article is gorgeous in a Mandelbrot style; now if I could just have it in a 24 x 30 print.
Let us not forget the story of the statistician who drowned while fording a very wide river with an average depth of 6 inches...
"...they may harpoon us, but they ain't gonna pick us up on no radar screen!"
that you are never more than two clicks from a porn site.
There was an additional study in Scientific American a while back that shows evidence of that. The premise of the study was to create a search engine that refines the quality of web sites by giving them a "hub" and a "target" rating (I believe that was the terminology used). The "hub" rating was determined by the number and quality of the target sites the page linked to (quality being determined by the "target" score), and the "target" rating was determined by the number and quality of the "hubs" that linked to it (again, quality determined by "hub" score). So they'd run the list of sites through this several times, and on each pass the hub and target scores were refined by each other. Run the scheme enough times and the scores eventually stabilize. To relate it to the topic at hand in this thread, though: when they studied the web using these scores, individual communities of targets and hubs could be discerned by an above-average rate of linkage within the group.
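For anyone who wants to see the mutual refinement in action, here is a minimal Python sketch of the hub/target scoring as I described it above. It is not the code from the article: the function name `hits`, the fixed iteration count, the normalization step, and the tiny example graph are all my own assumptions for illustration.

```python
import math

def hits(links, iterations=50):
    """Hypothetical helper: links maps each page to the list of pages it links to."""
    pages = set(links)
    for dsts in links.values():
        pages.update(dsts)
    hub = {p: 1.0 for p in pages}
    target = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Target score: sum of the hub scores of the pages linking to it.
        new_target = {p: 0.0 for p in pages}
        for src, dsts in links.items():
            for dst in dsts:
                new_target[dst] += hub[src]
        # Hub score: sum of the target scores of the pages it links to.
        new_hub = {p: sum(new_target[d] for d in links.get(p, ())) for p in pages}
        # Normalize (an assumption) so the scores settle instead of growing forever.
        t_norm = math.sqrt(sum(v * v for v in new_target.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        target = {p: v / t_norm for p, v in new_target.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}
    return hub, target

# Made-up graph: three link pages pointing at two content pages.
example_links = {
    "h1": ["a", "b"],
    "h2": ["a", "b"],
    "h3": ["a"],
    "a": [],
    "b": [],
}
hub, target = hits(example_links)
for page in sorted(target, key=target.get, reverse=True):
    print(page, round(hub[page], 3), round(target[page], 3))
```

In this toy graph the page linked from all three hubs ends up with the best target score, and the hubs that link to both content pages outrank the one that links to only one.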
Hypersearching The Web is that SA article. Google is different as it primarily tries to find authorities based on links, while the Clever method in the article is more like finding authorities within communities. The Clever algorithm looks at text around a link to estimate importance and relevance of a link.
The source code for that Mandelbrot-style graphic is available at Caida. My friend has been working on the project for quite some time, ever since graduating from UCSD. Most of the work is done by him at the San Diego Supercomputer Center. Take a look at the software; it's Java, and Brad put a lot of cross-platform testing into it, so it should run fine everywhere (Java claim). It has a lot of really nice features.
Joseph Elwell.
Hmmm...
To figure a shape to the web I would think you would first have to decide how many dimensions it has. Perhaps by assigning a dimension to each method of getting to a page, or perhaps by counting each hyperlink into a page as a separate dimension. Either way it could get pretty hairy pretty quick.
For example, is a hyperlink on a search engine different in some way from a hyperlink on a personal page? How about a web directory? Bookmarks?
But even if you only assume two or three dimensions, why 'clicks wide'? Seems more like 'clicks deep' to me. I always think of clicking on a hyperlink as 'drilling down'. Showing my age again I guess...
Jack
- -
Are you an SF Fan? Are you a Tru-Fan?
I think the article takes a "click" to mean a mouse click, which would rule out all typing. It's a fair assumption that the web would be considerably smaller if search engines were lumped in, because you could type the addresses of the two places whose distance you wanted to find into altavista as +url:www.math.com +url:www.slashdot.org , placing all sites that are logged by altavista much closer.
But then do you calculate width by starting at point A and continuing to point B? If so, the search engine argument would still be relatively benign, as you would still have to reach a search engine from page A in fewer "clicks" than it would take you to go straight.
If the "true" diameter is required, one could measure in any fashion as long as we agree on a definition of "click" (which I define, for myself, as only mouse presses). So sites such as yahoo might bring unrelated pages closer. But without typing, would many pages be much fewer than 19 clicks away? Yahoo is still categorically sorted, so to get between unrelated sites you would need to traverse up the Yahoo category tree after first leaving the first site.
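To make the "point A to point B" question concrete, here is a rough Python sketch that counts clicks with a breadth-first search over links. I am assuming that "width" means the average shortest click path over ordered pairs of pages where a path exists at all; the function names and the four-page chain are made up for illustration.

```python
from collections import deque

def click_distances(links, start):
    """Fewest clicks from start to every page reachable by following links."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist

def average_click_distance(links):
    """Mean shortest path over ordered pairs (A, B) where B is reachable from A."""
    pages = set(links)
    for dsts in links.values():
        pages.update(dsts)
    total = pairs = 0
    for start in pages:
        for page, d in click_distances(links, start).items():
            if page != start:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0

# Made-up chain of pages: a -> b -> c -> d
chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}

# Same chain, but page a also links to a directory that lists d.
with_directory = {"a": ["b", "dir"], "b": ["c"], "c": ["d"], "d": [], "dir": ["d"]}

print(click_distances(chain, "a")["d"])
print(click_distances(with_directory, "a")["d"])
print(average_click_distance(chain))
```

The directory only helps if page A links to it in fewer clicks than the direct route takes, which is the point about search engines above.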
Joseph Elwell.
This was touched on by this Scientific American article a few months back. It covers another project looking for useful ways to index the Web. They came up with a similar hub/cluster topology based on authorities, which are sites that a lot of other sites link to, and hubs, which have large collections of links to authorities. Unfortunately, the cool illustration that was in the print version didn't make it online. (You can pretty much skip the first two sections of the article unless you want to read the authors' grossly incorrect definition of spamming; it doesn't get interesting until the "Searching With Hyperlinks" subhed.)
Mean hop distance is relatively easy to measure, as IP addresses are nicely arranged. Measuring clicks takes a bit more work, I feel. A somewhat cryptic document by another guy at caida.org puts the average hop at 14-15. A great link with more info is the Internet Distance Maps Project.
For more pretty pictures, check out the Internet Mapping Project.
--
Make mine methylphenidate.