Web: 19 Clicks Wide
InitZero writes "The journal Nature reports that the web is only 19 clicks wide. What it fails to mention is that at least one of those must be through Kevin Bacon." The graphic at the beginning of the article is gorgeous in a Mandelbrot style; now if I could just have it in a 24 x 30 print.
Let us not forget the story of the statistician who drowned while fording a very wide river with an average depth of 6 inches...
"...they may harpoon us, but they ain't gonna pick us up on no radar screen!"
While looking at the mess of tangled wires in our company's engineering patch-room, we have often commented on the near organic appearance of the patch cables interconnecting our company. Now, this is just a very small subsection of the entire Internet. Doesn't it seem possible with hundreds of thousands of patch-rooms, protocols, and processors out there that something could evolve?
It would start with a few anomalous packets zipping back and forth reconfiguring routers to interconnect into a giant super-being. Its first triumph as supreme net-being would be to spam us all in every known language, "could you please stop pinging it gives me indigestion...*burp*!"
-AP
Front page to comment list, comment list to here, here to here. Moderators can prove me wrong (for most people) if this gets up to a 4 or 5 :)
A few weeks ago, I helped set up a mirror for the LAM pages (http://www.mpi.nd.edu/lam/) and while one of the mirrors was spidering our site, downloading everything, we noticed another machine that was on our campus doing the same thing. We found this slightly odd and sent a Big Brother sorta mail (we're watching you spider our site... why?) to the people doing this... got a response about them doing some study about how the web is laid out. They thought they could predict it using some physical or mathematical model.
I also administer the NDLUG (http://www.ndlug.nd.edu/) web server, and noticed massive spidering from the same machine on campus.
Now I read this article and see this quote:
"The Web doesn't look anything like we expected it to be," said Notre Dame physicist Albert-Laszlo Barabasi, who along with two colleagues studied the Web's topology.
so, i guess i don't have much of a point, but it's kinda cool to see that something actually came of some people in the college of science abusing our poor 486 webserver...
Northern Light, as I recall. I don't know that that claim is verified in any credible way, however.
Mind the Gap
Start: www.microsoft.com
- Click on MSN Home
- Click on Computing
- Click on Operating Systems
- Click on Linux
- Click on Linux vs NT Server 4.0
- Click on FreeBSD and Linux Resources
- Click on Slashdot
So microsoft is at most seven clicks from slashdot. Can anyone do better?
I can imagine some cases in which a link-path between two pages would be useful. For example, if you are researching differential geometry and combinatorial topology, you might suspect that there is some connection between them. Unfortunately, a page containing a proof of the Gauss-Bonnet theorem -- the connection you're looking for -- probably doesn't contain both of the original search terms on it. A link path between a diff-geo and a comb-top page might work out better.
"Trailblazing" through link-space was a prime motivator of Bush's Memex vision: finding new paths between separate "pages" of information was the same as discovering new relationships between discrete pieces of knowledge. In fact, knowledge can be thought of as connections between previously unlinked sets of facts.
In all seriousness, finding a link-path between two separate pages is a thorny issue. First, you are dealing with a directed graph, and as the posts above point out, a link-path from A to B probably won't contain the same set of pages as a link-path from B to A. Then there is the issue of _which_ link paths are useful (and I believe there are some which could potentially be useful) and which aren't; this is largely a decision made based on the weighting you've placed on the Web Pages in question. Finally, there is the issue that you have to have link-structure information sitting around for a good chunk of the Web before something like this could actually work.
But it would be neat!
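For what it's worth, once you do have the link structure sitting around, finding a shortest link-path is a plain breadth-first search. A minimal sketch, assuming the links are already in memory as a dictionary; the page names and the `toy_web` data are made up for illustration and are not real crawl data:

```python
from collections import deque

def link_path(links, start, goal):
    """Return a shortest click path from start to goal, or None if unreachable.

    `links` maps each page to the list of pages it links to (a directed graph).
    """
    previous = {start: None}          # page -> page we arrived from
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Walk the back-pointers to rebuild the path, then reverse it.
            path = []
            while page is not None:
                path.append(page)
                page = previous[page]
            return path[::-1]
        for target in links.get(page, []):
            if target not in previous:
                previous[target] = page
                queue.append(target)
    return None

# Hypothetical toy link structure.
toy_web = {
    "diff-geo": ["curvature"],
    "curvature": ["gauss-bonnet"],
    "gauss-bonnet": ["comb-top"],
    "comb-top": ["diff-geo"],
}

forward = link_path(toy_web, "diff-geo", "comb-top")
backward = link_path(toy_web, "comb-top", "diff-geo")
print(forward)
print(backward)
```

In this toy graph the forward path takes three clicks and the return path takes one, which is the directed-graph asymmetry mentioned above. Deciding which paths are actually useful is a separate problem that this sketch does not touch.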
It also depends on how they did their sampling... Search engines such as Yahoo are easy to traverse, but crawlers like altavista, while they indeed link to a huge number of sites, are difficult to count because you need to enter a search phrase in order to get outbound links.
I personally don't think this was a really meaningful survey, because of the large range it found. Also, does one really care how far away you are from an arbitrary page? I certainly don't. Generally I care how far my information is from a search engine, which is generally 1-2 clicks, and how easy it is to find said information based on results.
Rarely do I start from a random page, and try to get to information by clicking through links.
WRCT Pittsburgh, 88.3FM
By far and away the most awesome thing I have seen in a long time. I love how www.microsoft.com has ICMP blocked!
The dialog box has a number of buttons, of which the fourth one down is "Open Source". However, the one on my version doesn't work -- ie it does not open the source of Microsoft Windows.
Sorry to have wasted your time, really.
jsm
Stochastic (random) functions can be characterized by a range of power-law functions and other spectral shapes. The randomness of the power-law function is given by the randomness (e.g., mean and variance) of the individual components of the spectral function. For instance, draw an x-y plot consisting of a straight line that has a negative slope. The y-axis is the amplitude while the x-axis is the frequency (or the inverse wavelength). Now suppose that this straight line represents the "average" value, with random fluctuations about this line. This is a power-law random function.
Sorry if this is oversimplified. BTW, fractals are characterized by a power-law function. OTOH, true fractal functions have constraints on what the power-law slope can be (Hausdorff dimension).
Now for something silly. What is the degree of freedom from Gore (Father of the Internet) to Slashdot (the bastard child of the Internet)?
As I remember it, it was the First Foundation that was based on the mathematical calculation of where society was heading. The Second Foundation was basically the opposite - they were created to police the world of the problems that Psychohistory (mathematics) could not predict...
/. is like a steer's horns, a point here, a point there and a lot of bull in between.
The mathematical aspect was indeed the most interesting part of the Foundation books, and it always surprised me that Asimov played it down as the series continued. Sort of like how the problems with trying to implement an absolute ethical system into a being was the most interesting part of his Robot stories, yet he wormed his way out of that (zeroth law etc).
-
I agree that by itself, the average link distance between two pages (19) isn't a very useful number. However, there is definitely useful information in the article itself:
1. We are given a real-world (probabilistic) distribution of link distances between pages (i.e. given two randomly chosen pages, what is the probability that the shortest/longest link distance between them is X?)
2. From the visualizations, we can see that the web is a graph containing a number of densely connected components which are themselves only fairly loosely connected to one another, and that this behavior is fairly scale-independent.
These two tidbits could lead to impressively improved Web crawlers. You could decide to stop following links once you've gone 25 deep, for example; you could try and determine on-the-fly if more than one of your crawler processes is working on the same densely connected component of the Web and combine their efforts (or move one of the processes over to a new uncharted component), thus effectively searching more of the web. Using similar statistics for distribution of in-link and out-link counts, you could improve crawler heuristics so that pages with a number of out-links significantly deviant from the mean are given more weight for future crawling.
Oh well, just some random thoughts.
Since we're talking about a discrete set (a directed graph), you can forget about "upper limit" and just say "maximum of".
So what they are measuring ("the average distance between two random pages") does *not* match the mathematical definition of a diameter, in spite of their claims.
I must admit I dunno what is the right term for what they are measuring, though.
But on a more serious note (yeah, right), I once thought about the following. It is too bad that this site is not more pro Microsoft (boy, does this company's name have Freudian meaning, right Billy boy?). Then Rod can put up a link to assembler info. It would be:
http://slashdot.org/asm
And, just as it seems we've run out of things to do, we might actually have a moon base with a couple hundred-thousand miles of 100BaseT. Voila, brand new web to play with.
Due to the limited speed of light, the lag would be intolerable (roughly 2 seconds?)
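A quick back-of-the-envelope check of that lag, assuming the average Earth-Moon distance of about 384,400 km and a signal moving at the vacuum speed of light (a signal in actual cable would be slower, so treat this as a lower bound):

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # exact, by definition of the metre
EARTH_MOON_KM = 384_400             # average distance, assumed for this estimate

one_way = EARTH_MOON_KM / SPEED_OF_LIGHT_KM_S
round_trip = 2 * one_way
print(f"one way: {one_way:.2f} s, round trip: {round_trip:.2f} s")
```

That works out to roughly 1.3 seconds each way, so a request and its reply would take a bit over 2.5 seconds before any routing or processing delay.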
Great tool!
I don't really know much about this stuff, but apparently somebody/thing at SURAnet (server: mae-east.ibm.net, located Vienna, VA) is "causing packets to be lost" on their way to Frisco...
marco baciarello
"I think that we might end up in an era where, just as people today have their own e-mail addresses, people will have their own Web sites," he said.
Every llama has their own website...what are you talking about??
It's 10 PM. Do you know if you're un-American?
the distance in the other direction now is only 2 clicks: from slashdot open this article. then click here:
www.microsoft.org
:)
I can get to any site on the Internet with just one click! I just click on the "Location" bar and type in the URL...
One of the interesting applications they have at caida is the graphical traceroute that plots the physical location of the hop on a map. www.caida.org/Tools/GTrace
I read that article, and I remember that it sounded a lot like what Google already does.
pooptruck
People who want to express themselves, and know how to do it, will always get websites.
Those who don't care to do so, well, won't.
The real limit is on the number of people willing to be creative.
I don't think it has anything to do with "cool" technologies - whether I have JavaScript rollovers on my site or not doesn't affect the quality of the content itself.
D
----
that you are never more than two clicks from a porn site.
The WWW *will* become self limiting. Yes, geeks like us will be building more and more web pages. But more and more normal people (you didn't think you were normal, did you?) will not. Also, as resources start getting tight, some of those wonderful "calling cards" will get wiped. You may no longer need them. Somebody may be willing to pay you for the domain name (okay, that would cause more sites). The admin may decide one day that since the site got no hits in six months, it's gone (think GeoCities).
:).
Using the tree analogy:
- Yes, the tree will get a *lot* bigger.
- Yes, the tree can only get so big.
- Yes, leaves (pages) and branches (sites) will fall off and hit the WWG (world-wide ground).
- Yes, there is a gardener, but he's only interested in a branch or two.
- No, I haven't gotten much sleep lately
The Lord DebtAngel
Lord and Sacred Prince of all you owe
Is this post not nifty? Sluggy Freelance. Worshi
http://www.caida.org/Tools/Plankton/Images/
pithy comment
Yeah, I guess... but hey, if a webpage couldn't be accessed from any of the others, wouldn't that make it 0 clicks, as opposed to one that required 100?
Wonder if they took that into consideration.
Or, I guess they could've got all the sites from Yahoo, where they're good enough to get quite a few links.
I tried counting how many clicks it takes to get from one web site to another some time back. It's on my web site here.
My effort never quite got off the ground though :(
"To figure a shape to the web I would think you would first have to decide how many dimensions it has."
If you want to know geometrical shape that would be true. But topological shape is a little different. Topologically, a coffee cup and a donut both have the same shape. The shape in common is that they each have one hole.
Similarly, their power law reference means that the web is fractal in dimension, which is not the usual 1-2-3-4 dimensionality commonly meant. I would imagine this dimension is somewhere between 1 and 2. It's a set of lines (one dimensional) that are almost dense enough to fill an area (two dimensional.)
"But even if you only assume two or three dimensions, why 'clicks wide'? "
It sounds like they are looking at the web as a graph, a series of points (web pages) connected by edges (links). The width of a graph might be found like this:
For each pair of points in the graph, find the shortest path along edges between the points (in terms of number of edges.) The maximum length among all these shortest paths is the width of the graph.
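That recipe can be sketched in a few lines. This is only an illustration on a made-up five-page graph (the name `toy_graph` and its contents are hypothetical), and it assumes every page appears as a key in the dictionary. It reports both the maximum and the mean of the shortest paths, and it skips pairs with no connecting path, which is one possible convention among several:

```python
from collections import deque

def distances_from(graph, source):
    """Breadth-first search: click counts from source to every reachable page."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

def width_and_average(graph):
    """Return (longest, mean) shortest-path length over connected ordered pairs."""
    lengths = []
    for source in graph:
        for target, d in distances_from(graph, source).items():
            if target != source:
                lengths.append(d)
    return max(lengths), sum(lengths) / len(lengths)

# Hypothetical directed graph: a four-page cycle plus a page "e" nobody links to.
toy_graph = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["a"], "e": ["a"]}
width, average = width_and_average(toy_graph)
print(width, average)
```

Here the width is 4 while the mean is only 2.125, so the two numbers can differ quite a bit even on a tiny graph. Running one search per page is fine for a toy but far too slow for anything web-sized, where sampling would be needed.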
What sort of analysis can be done with points between which there is no connecting path? How many of these are there? Why is there no path? How did that affect that average?
Just like the '6 degrees of separation' game, I fail to see how this could be useful. Finding the shortest path between two websites is nice when you're stuck on a machine that can't go to random URLs (secured lynx, some web kiosks, browse slashdot from anywhere, yeah!), but the 19-click average sounds like a curiosity.
And, ooh, the web exhibits properties of exponential growth, with some sites that have many more links to other sites. Like I couldn't figure *that* one out. Some people post their bookmarks, and lists of links, and other people only link within their interests. A graph of this might look interesting if done correctly, but I still don't see how this would be that useful.
The graph at the top was pretty, though, it looked like an IFS fractal. They look like stuff found in nature too, so I guess that gives this article a context to exist...
pb Reply or e-mail; don't vaguely moderate.
There's an additional study in Scientific American awhile back that shows evidence of that. The premise of the study was to create a search engine that refines the quality of web sites by giving them a "hub" and a "target" rating (I believe that was the terminology used). The "hub" rating was determined by the number and quality of the target sites the page linked to (quality being determined by the "target" score), and the "target" rating was determined by the number and quality of the "hubs" that linked to it (again, quality determined by "hub" score). So they'd run the list of sites through several times using these, and each time the hub and target scores were refined by each other. Eventually, stabilized scores are obtained by running this evaluation scheme enough times. To relate it to the topic at hand in this thread, though, when they studied the web using these, individual communities of targets and hubs could be discerned by an above-average rate of linkage within the group.
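The mutual refinement described above can be sketched roughly as follows. This is my own minimal reading of the scheme, not the article's code: the function name, the normalisation step, the fixed round count, and the toy link data are all assumptions for illustration.

```python
import math

def hub_target_scores(links, rounds=50):
    """Iteratively refine hub and target scores from a page -> outlinks mapping."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    target = {p: 1.0 for p in pages}
    for _ in range(rounds):
        # Target score: sum of the hub scores of the pages linking to it.
        target = {p: 0.0 for p in pages}
        for page, outlinks in links.items():
            for t in outlinks:
                target[t] += hub[page]
        # Hub score: sum of the target scores of the pages it links to.
        hub = {p: sum(target[t] for t in links.get(p, [])) for p in pages}
        # Normalise both score sets so the numbers stay bounded between rounds.
        for scores in (hub, target):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, target

# Hypothetical link data: two link lists and three sites.
toy_links = {
    "list1": ["siteA", "siteB"],
    "list2": ["siteA", "siteB", "siteC"],
    "siteC": ["siteA"],
}
hub, target = hub_target_scores(toy_links)
best_hub = max(hub, key=hub.get)
best_target = max(target, key=target.get)
print(best_hub, best_target)
```

On this toy data the page that links to the most well-linked sites comes out as the top hub, and the site every hub points to comes out as the top target. Spotting communities would take more work than this sketch does.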
Yes, we have indeed, but I like the hub-system of this one the best. But stop with the Kevin Bacon references already... ;-) Anyways, I think that a YOU ARE HERE, like CmdrTaco said in the last set of these to crop up on Slashdot, would have been pretty nifty.
Times are bad. Children no longer obey their parents, and everyone is writing a book.
--Cicero
The tree may increase in size, but the nodes at the root will have more paths to the leaves. As databases like yahoo become larger and larger it stands to reason that as the size of the web increases they will have more links to leaf sites. What would be interesting research is if the 19 remains constant through web growth, or if it decreases (or increases) as more "portals" condense the web down to only a few entry points for the masses.
-Rich
You are absolutely right, but then I am guessing that within 3 deviations that this is true.
-
ping -f 255.255.255.255 # if only
It's not true that topology is about geometry on surfaces, though many examples from it are. Topology is about which subsets of a space are open. Given a definition of openness of subsets that satisfies some requisite properties, you have a topology. Sometimes, as with function space topologies, there isn't a very good geometric analog.
The fractal dimension is an invariant of the topological space, so its embedding in a superspace like 2D or 3D Euclidean space is not important in terms of its fractal dimension.
I wasn't trying to imply that the lines have area, but a space filling curve has fractal dimension close to 2 because it "nearly" fills an area, and does so in the limit. It was a rough, and possibly poor, attempt at an analogy to this situation. You may be right that fractal dimension isn't important here, but it would be my guess that the increase in diameter from a given increase in nodes that was mentioned in the article could have been calculated based on that dimension, as it relates to how self-similar structures scale.
For you Van Vogt fans...
I agree with many here that the analysis in the article described is not much to talk about, and yeah not much use except for generating some cool fractalized images, but the basic precept behind their development may be the only real way of performing a true mapping of the Net's shape and growth, something that will be important in the years to come.
You can't build a useful map of the Internet's structure in the way you map the streets that wind through your town. Future search tools will require a fair amount of intelligence not in the way they go about a search, but in the way they 'think' about a search (there is a difference). Topographical mapping -- not index cataloging -- will help developers figure out these ways of thinking.
Clicks Wide seems accurate to me because you can get back to your starting point through a different route than the way you came.
I think the Web analogy is the most accurate, with multiple routes to each site.
Maybe I'm just thinking two dimensionally, lousy brain.
Wiggeda Wiggeda Wack - Kriss Kross
"It's alive! It's alive!!!"
Seriously, though, that's very interesting, but it's actually obvious when you think about it. The reason why galaxies are not distributed randomly is that there are centres of attraction that begin as random fluctuations in an evenly-distributed environment; but as matter condenses, these types of patterns emerge.
Now, replace "gravity" with "number of hits". A site with a lot of hits, of course, represents a centre of interest, where people congregate. And naturally, they will either link to the site, or try to get linked from it.
And so, the same patterns emerge.
Hey, that means Slashdot is kinda like a black hole generator! Once it aims its beam at a site, it submerges it with hits until the site reaches critical mass and implodes, dropping out of the known Universe!
"There is no surer way to ruin a good discussion than to contaminate it with the facts."
I believe this is false. Counterexample:
There are people in remote regions that do not have contact with many people outside their groups. Say there exists a tribe (let's call them a) in a forest in Indonesia. Suppose the only contact that this group has with the outside world is through some anthropologists. Now suppose there was another tribe (b) in the same region that had contact only with a. Assume there are also tribes c and d that are in the same situation as a and b but in a totally different region, maybe Africa. Now consider the degrees of separation between a child in tribe d and a child in tribe b. Clearly it would be something like child(b)->parent->tribe a->anthropologist->?->anthropologist->tribe c->parent->child(d). In order for the six degrees of separation to be effective, the two anthropologists have to know each other directly. This isn't necessarily true given the number of anthropologists around.
I believe the six degrees of separation came about when someone figured that everyone knows at least 30 other people. Therefore a given person is separated by one person from 30*30=900 people. Analogously a person is separated by 6 people from 30^7~22 billion. However this doesn't take into account the redundancy in the relationships.
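The upper bound is easy to tabulate. A small sketch, assuming 30 acquaintances per person and no overlap at all between anyone's circles, which is exactly the assumption that fails in practice:

```python
ACQUAINTANCES = 30   # the per-person figure assumed in the comment above

reach_by_intermediaries = {}
for intermediaries in range(7):
    # With k people in between, the chain has k + 1 "knows" links.
    reach = ACQUAINTANCES ** (intermediaries + 1)
    reach_by_intermediaries[intermediaries] = reach
    print(f"{intermediaries} intermediaries: up to {reach:,} people")
```

With six people in between, the bound is 30^7 = 21,870,000,000, comfortably more than the world's population, but only because the count treats every acquaintance as someone new.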
For example, many of the people that your friends know are from small cliques, so the real relationships appear like many tightly interconnected clusters with a few connections between clusters.
BTW, I think I've been doing too many math proofs.
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
First example: from the /. home page to... (let's see, something really obscure...) this one. Ready, set, go!
Of course, me and my investors hope that this path is also packed full of computer-generated Web ads, since this is how we get rich^W^W make it more fun for everyone!
Isn't this what Google does?
*grin*
Ahh! But! We are touching upon a _very_ important issue here: latency. Ours is a tad high today, since she's in Europe and I'm in California. It'll probably be around noon EDT before I get a response out of her.
An ICMP-like roundtrip could maybe be shorter than that (I have her hotel number), but the annoyance that that would generate would probably void the possibility to get a full blown TCP-like connection setup any day soon.
Breace.
Hypersearching The Web is that SA article. Google is different as it primarily tries to find authorities based on links, while the Clever method in the article is more like finding authorities within communities. The Clever algorithm looks at text around a link to estimate importance and relevance of a link.
You had to create a rather special example, although a real world worst case would probably involve anthropologist->?->anthropologist replaced by merchant->city resident->city resident->merchant, which is just a little longer.
I think an example of the real-life short circuits is my wife's friend of a friend. My wife is from another country. We quickly found a friend of a friend of hers five miles from our home in this country. It seems unlikely, but the real math is something like this:
I wouldn't be surprised if your math explanation is exactly where it came from. I'm just trying to figure it out so I can justify why I told her she was talking out of her ass... ;o)
On the other hand, I'm sure that on average people know a lot more than 30 people. If you count unidirectionally knowing someone, it will be even higher.
Breace.
To track people-links, try Six Degrees.
yes, yes, YES!!! My thinking exactly! I was hoping they'd come up with a true diameter of the Web.
Here's what I'd envisioned: you define a metric -- call it "distance" for convenience -- as follows: Distance between 2 Web pages is the minimum number of clicks to get from one to the other. ("Minimum," because otherwise all pages can be infinitely far apart.) Then the diameter would be max(distance). That's the number I'd be interested in seeing.
And it's probably HUGE, since Nature seems to imply that 19 is the average "distance" (as defined here).
-- Craig
wedge@slip.DeleteThisPart.net
bloggs@fubar > lynx http://slashdot.org/ \n
I really hate those stupid "how can I get from here to there through Kevin Bacon" things. Ultimate silliness if you ask me. The graphic is excellent however :)
It's cool and all but where do I fit in?
What do you mean 'Linux in a nut shell', it don't fit.
What?
That is only an average... And the minimum and maximum are wide apart. The web is not really a web I think, but a set of (nearly) independent webs. An example of this is the set of x86 protected mode programming sites. Nearly all of them are linked with the others, but not with anything else.
--The knowledge that you are an idiot, is what distinguishes you from one.
I am not sure that the web is even that wide. Maybe the scientists did not factor in enough search sites; but I am quite sure that search engines cover at least 150 million web pages total. I don't think that the distance will be that long, except to some hard-to-reach or foreign sites that almost nobody links to. What would be more revealing is the frequency distribution of distances.
I fly Southwest whenever I can and I have to agree it's like hopping from internet sites. I always like to go and check out the cockpit in between stops. Ohh fun, let's just hope this doesn't turn into a Southwest Airlines rulez string, cause it does but nuff said. Actually, getting back to the topic, who really cares if the next internet site is only 19 clicks away? I just use me little address bar and that means it's some keystrokes and an enter key away!
The map of the Internet reminded me of a map of an airline's routing table, which, unless you fly Southwest Airlines, usually runs through a series of hubs. I'm sure that the 'Net has its own hubs (hell, you have Yahoo and Slashdot already), but I wonder if it has many Southwests. Webrings are the only real analogy I can think of, so I wonder if anyone else could throw any more of those out.
FYI: Southwest Airlines doesn't use a hub-based system of flights, but does direct flights between cities. Many flights are thus 'direct' but not 'non-stop'. They're also pretty cheap. Don't factor the cheapness into your analogies.
My girlfriend told me a while back that ALL people are only something like 5 or 6 people away from each other. (I guess as in a who-knows-who kind of way)
Anybody else know more about this?
(I can't verify it right now cause she's not here... :( )
Breace.
But to me, the most important part was the last paragraph:
:-)
"I think that we might end up in an era where, just as
people today have their own e-mail addresses, people will
have their own Web sites," he said. "But eventually it will
taper off. Eventually it has to be self-limiting."
That last sentence makes me think he isn't too sure of the Web's self-limiting qualities. I personally don't think it will ever taper off. Just about the time it starts to get stale, the Netizens will get a new toy (a la JS rollovers, Java applets, Flash, Shockwave, whatever). There will always be too much excitement and new technology.
And, just as it seems we've run out of things to do, we might actually have a moon base with a couple hundred-thousand miles of 100BaseT. Voila, brand new web to play with.
censorship is a form of noise, which actively seeks to drown out content with silence - Crash Culligan
The Source code for that mandelbrot set is available at Caida. My friend has been working on the project for quite some time, ever since graduating at UCSD. Most of the work is done by him in the San Diego Super Computer Center. Take a look at the software, it's java and Brad put a lot of cross platform testing into it. So it should run fine everywhere. (Java claim). It has a lot of really nice features to it.
Joseph Elwell.
Well, if they're factoring in clicks, does typing something count? Yahoo!'s database is huge, some categories require 19 clicks themselves to get to. I don't think that's accurate at all...
Are search engines included? If so, wouldn't that greatly underestimate the diameter of the web?
Hmmm...
To figure a shape to the web I would think you would first have to decide how many dimensions it has. Perhaps by assigning a dimension to each method of getting to a page, or perhaps by counting each hyperlink into a page as a separate dimension. Either way it could get pretty hairy pretty quick.
For example, is a hyperlink on a search engine different in some way from a hyperlink on a personal page? How about a web directory? Bookmarks?
But even if you only assume two or three dimensions, why 'clicks wide'? Seems more like 'clicks deep' to me. I always think of clicking on a hyperlink as 'drilling down'. Showing my age again I guess...
Jack
- -
Are you an SF Fan? Are you a Tru-Fan?
...which of those big "link hubs" is Slashdot?
Posted from the wireless couch.
What would be superb in a graphical representation is a combination of this data (the supposed "distance" from point to point) combined with last week's graphic on /geographic/ location or top-end domain (.com, .org, .etc).
It would be interesting to see if, for example, internet traffic patterns show any kind of focus or foci about certain domains or sites or even specific boxen, and how those machines are distributed in real space. . . where, essentially, are our eyeballs and electrons going?
As for "dimensions," a 3D rendering would be the easiest to comprehend. Perhaps a sphere representing the globe, with an atmosphere of satellite link channels, and a substrata of bandwidth pipes and routers. Or a flat geometric field with peaks to represent the Big Iron, fractal spires twisting off as homepages and smaller sites. And isolated islands or floating moons of self-contained networks, or pages that go nowhere.
Don't mind me, I just finished reading Snow Crash, Diamond Age, and Idoru, and would enjoy a virtual walk through the data we're all accumulating.
Rafe
V^^^^V
Rafe
Opinions expressed by the author may not actually exist in the wild.
They said the web is like a tree and will grow until it eventually runs out of resources. It's not shaped at all like a tree. It's more of a tumbleweed.
Peacock Maps has a couple of posters based on the Internet Mapping Project data.
Since the average distance is 19 clicks, would it then be possible to take a trip round the web? In 80 days perhaps?
Of course, it's no problem surfing the web continuously for 80 days (with a T3 and a tank of coffee) but how far would you get by then?
Where does it all start? End?
-
The internet is full. Go away!
This was touched on by this Scientific American article a few months back. It covers another project looking for useful ways to index the Web. They came up with a similar hub/cluster topology based on authorities, which are sites that a lot of other sites link to, and hubs, which have large collections of links to authorities. Unfortunately, the cool illustration that was in the print version didn't make it online. (You can pretty much skip the first two sections of the article unless you want to read the authors' grossly incorrect definition of spamming; it doesn't get interesting until the "Searching With Hyperlinks" subhed.)
The article mentions the Internet was created nine years ago... Is this the same Internet that Al Gore invented?
I just think the people that made the foundations (like the creators of ARPAnet, etc.) should be given a little credit, that's all.
if(!toilet_paper) roll.replace(new roll);
1- sure, everybody can get their own web page, probably will, but how many "here's my dog, check out my car" web pages does the world need? I think we (humanity) are finding out the hard way that we aren't as individual as we'd like to think. This is why I haven't made my own web page about my own boring life.
2- the other thing that's said, the really important thing in this article, has some pretty chilling implications - which have already proven true. The major portals are the main targets of the search engine, the "fringe" sites are not. Only a small fraction of sites are even scanned by search engines, and this will become a limiting logic for them. Thus, only mainstream sites will be found by search engines (in general), and the rest of the sites, the covert sites, the underground, the fringe, will not be reachable by the vast majority of people, and this has all the Orwellian implications you think it does. All this time, the netizens have been thinking, this internet empowers the individual, it frees information. BUZZZT! wrong, try again please. It empowers the rich, the mainstream media, the corporate establishment. Now how much do you think it matters that we wire the ghettos and set up internet access for Cambodian rice farmers? It's not going to be a channel that connects everybody with everybody else, it's going to be just another orifice for greedy corporations to rape us with. Sorry about the class-warfare tone, and the oblique anti-Microsoft slam, but at least I didn't say anything about Beowulf clusters, though a Beowulf cluster of internets would be cool.
How many clicks does it take to get to the center of a tootsie-pop? http://fatdays.com/jokes/misc/licks.html
"The number of suckers born each minute doubles every 18 months."
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
And with this link, you're but a click away 8^)
http://visualroute.datametrics.com/
Just like the end of Mona Lisa Overdrive, :))
"The Matrix has a shape" etc..
Maybe when we have discovered that shape we will discover life on Alpha Centauri as well...
(No not the game)
-- You ain't seen me, right?
How do you suppose they account for dynamically-generated (i.e. database- or object-driven) web pages that typically don't register with search engines?
What about sites with logins, where hundreds of pages are hidden from public view?
It seems to me that most of what's interesting about the emerging behaviour of the web is buried within one of those two types of sites... discuss?!
Have a look at these chaps. Similar map sort of thing, but looks better. ;)
(No, I don't work for them)
__ Em
The really cool thing would be if someone were to write a program (probably a cgi script) that would use a search engine that lists the pages linking to a site, so you could type in two web sites and see the hyperlink path between them. I wonder how far slashdot.org is from www.microsoft.com?
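For what it's worth, the path-finding half of that idea is just a breadth-first search over a link graph. Here is a rough sketch in Python, assuming you already have the links in memory as a dict; the `click_path` name and the `toy_links` data (including `example-news.test`) are made up for illustration. Getting real link data out of a search engine or a crawler is the hard part and is not shown.

```python
from collections import deque

def click_path(links, start, goal):
    """Return the shortest chain of pages from start to goal, or None.

    `links` maps a page to the pages it links to (a hypothetical,
    pre-collected link graph; no network access happens here).
    """
    previous = {start: None}          # page -> page we arrived from
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Walk the back-pointers to rebuild the path.
            path = []
            while page is not None:
                path.append(page)
                page = previous[page]
            return path[::-1]
        for target in links.get(page, ()):
            if target not in previous:   # first visit is the shortest route
                previous[target] = page
                queue.append(target)
    return None                       # goal not reachable by clicking

# Made-up link data, purely for illustration.
toy_links = {
    "slashdot.org": ["nature.com", "example-news.test"],
    "example-news.test": ["www.microsoft.com"],
    "nature.com": [],
    "www.microsoft.com": [],
}

path = click_path(toy_links, "slashdot.org", "www.microsoft.com")
print(" -> ".join(path))
```

Because links are one-way, the answer depends on direction: in this toy data you can click from slashdot.org to www.microsoft.com, but not back.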
My new web page has already crosslinked between a bunch of previously unconnected pages.
At the same time, all my web pages only use a few hundred KB at a time when you can't find anything as small as 1GB for $100. I'll be helping the growth in my little corner, but I've cut corners a lot more than I'm growing.
Mean hop distance is relatively easy to measure, as IP addresses are nicely arranged. Measuring clicks takes a bit more work, I feel. A somewhat cryptic document by another guy at caida.org puts the average hop at 14-15. A great link with more info is the Internet Distance Maps Project.
For more pretty pictures, check out the Internet Mapping Project.
--
Make mine methylphenidate.
No doubt. My SO signed up through a friend. To get fully registered, you have to provide the names and addresses of two other people.
I killed the browser window at that point.
Once my SO twigged to the fact that you *have to* spam people in order to join, she felt as bad as I did. Needless to say, she hasn't been back.
censorship is a form of noise, which actively seeks to drown out content with silence - Crash Culligan
The article says the WWW is 9 years old, not the Internet. The average consumer doesn't know, but the two are not the same.
That's pretty interesting, but maybe some sites aren't good enough for people to put links to them on their webpages. Maybe the only way you could find them is by search engine. Something that might be cool too is that if you got a large database of people with different IDs (ie Hotmail/ICQ/AIM) and had them send in their contact lists or address books, then check how far apart people are through those contacts. Of course, I doubt you could persuade many people to just send you their address book or contact list.
The other comment that mentioned topology I think was rather off-course on the dimensions of the web. Topology is the study of geometry on surfaces; donuts and teacups are similar because primitive shapes behave in similar ways on them.
A second factor is that you must consider what you are calling dimensions. A representative graph may be made in any number of dimensions - flattened to two, or made in 3d. But the dimension in the fractal sense is a different animal. No matter what dimension you draw it in, the fractal dimension stays the same. Yes, the dimension would be between 1 and 2 because as the number of links -> infinity, the 'perimeter' does too, but the lines certainly don't have an area!
The shape of the web, however, is not about fractal dimensions. It's about summarizing and arranging the points and connections in such a way that clustering and localization phenomena begin to emerge. With 800 million+ nodes, this task is nearly impossible - however, an analogous structure of fewer nodes and clusters can be made that will have visible patterns.
... check out http://www.sixdegrees.com, they're actually trying to link people through their relationships to see if everyone really is related to everyone else by 6 degrees or less. I've actually run into people I know personally just by looking through my different 'degrees'... that's quite a weird experience.
Did anyone try that Plankton java app they have? It's quite fun, and pretty.
One thing, perhaps, we can agree on...
... DirectHit, which bases its analysis on "what people are clicking on" ... and Inktomi, which is "looking at what people are actually viewing."
Google, which ranks its results by link "importance"
Hmmm, follow the herd or go for something "important" (perhaps the perfect word for a search, a clue if you will) seems like a pretty simple decision. I suggested it to the folks in my company (along with www.m-w.com and babelfish) and they love it. I'm feeling lucky.....
(Get Andover to buy it, or maybe the other way around...)
+&x
This is bang on, not at all offtopic. Can someone fix this? BTW, the url to find the source code and more images is
At first I was amazed at the claim that
the web is only 19 clicks wide.
To me, this means the maximum distance between any two sites is 19 clicks. Sort of the way that people claim that only six degrees separate us from any other person in the world. This would be an impressive display of the "web" aspect of the world wide web.
But this isn't the claim at all. If you read the article, it says that
there's an average of 19 clicks separating random Internet sites.
Different story altogether.
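To make the difference concrete, here is a small Python sketch that computes both numbers for a tiny made-up link graph: the average number of clicks between pairs of pages, and the maximum (the true "width"). The graph `toy` and the function names are hypothetical and have nothing to do with the data in the study.

```python
from collections import deque

def distances_from(links, start):
    """Breadth-first search: clicks needed to reach each page from start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist

def average_and_max_clicks(links):
    """Average and maximum click distance over all reachable ordered pairs."""
    all_distances = []
    for start in links:
        for target, d in distances_from(links, start).items():
            if target != start:
                all_distances.append(d)
    return sum(all_distances) / len(all_distances), max(all_distances)

# Made-up five-page web: a ring a -> b -> c -> d -> e -> a, plus a shortcut a -> c.
toy = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["d"],
    "d": ["e"],
    "e": ["a"],
}

average, widest = average_and_max_clicks(toy)
print(f"average clicks: {average:.1f}, maximum clicks: {widest}")
# prints "average clicks: 2.2, maximum clicks: 4"
```

Even in this five-page example the worst case is nearly twice the average, so an average of 19 says little about the farthest-apart pair.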
heck, maybe this is out of line...
... Sure the Second Foundation was a lot more than that but the mathematical aspect was what I loved most about the concept.
Every time I see an article about a statistical study of something created by man, I get a flashback to Asimov's Second Foundation: how mathematics can generally describe man, events, history,
This study has some distant similarities to it: statisticians studying the average distance between two randomly chosen internet sites. The catch is that the entire structure is created by man, so there really isn't much randomness in it. Compared to the Bacon thing, where you may have met someone, who knew someone, at one point or another walking down the street, having a link from your web page to another web page is a completely conscious action.
Which brings me to the counter-argument that news sites such as slashdot or c|net have their content (and thus their links) influenced by random events of the outside world such as tornadoes or floods.
Which now brings me to a conclusion before I head back home: how long will it be before someone attempts to measure the amount of randomness in the web, that is, the influence of events that cannot be predicted (to some extent or other, to be determined later) by man? Is the web something that could over time become completely predictable?
I really should reread these Asimov books.
microsoft is only 1 step from slashdot, maybe two if you want the slashdot main page (or at least it is now).