Slashdot Mirror


Describing The Web With Physics

Fungii writes: "There is a fascinating article over on physicsweb.com about 'The physics of the Web.' It gets a little technical, but it is a really interesting subject, and is well worth a read." And if you missed it a few months ago, the IBM study describing "the bow tie theory" (and a surprisingly disconnected Web) makes a good companion piece. One odd note is the researchers' claim that the Web contains "nearly a billion documents," when one search engine alone claims to index more than a third beyond that, but I guess new and duplicate documents will always make such figures suspect.

133 comments

  1. The Living Net, or the selfish net? by dxkj · · Score: 1

    Doesn't it make sense that the web would follow patterns just like a living organism? What are the basic motivations? When millions of developers and individuals act in their own interest, a basic pattern evolves. The question is, could an "intelligent" approach to design produce a better result, or does this naturally evolving creature continue to evolve, with inefficient parts dying off until an unimaginable result comes about, far more complex and efficient than we could have ever come up with?

    --
    Tweak like your pocketbook depended on it!
  2. Re:1,000,000,000 urls by Anonymous Coward · · Score: 0

    Of course it's a noun dumbass. I suppose you flunked the fifth grade and never went back to school. Thanks for playing.

  3. 99 bottle of beer by xixax · · Score: 3, Funny

    100 million URLs on the net, 100 million URLs
    Take one down, pass worms around,
    99 million URLs on the net...

    Xix.

    --
    "Everything is adjustable, provided you have the right tools"
    1. Re:99 bottle of beer by pmc · · Score: 2
      100 million URLs on the net, 100 million URLs
      Take one down, pass worms around,
      99 million URLs on the net...

      99 million, 999 thousand, 999 URLs on the net, surely?

  4. Re:LAIN by NonSequor · · Score: 1

    Well, I don't think his thought deserves total dismissal. The question is whether consciousness is something peculiar to human brains (if you want you can assert that all vertebrates have consciousness, but I'm not entirely certain) or a property of any extremely complex system. I don't think the Internet would be a candidate for consciousness since information just gets sent from one end to the other without being acted on. Of course, we can never be certain because we will never have a way of asking the Internet if it is conscious.

    --
    My only political goal is to see to it that no political party achieves its goals.
  5. Lawrence and Giles study was published in 1999... by soboroff · · Score: 2, Insightful
    According to a recent study by Steve Lawrence of the NEC Research Institute in New Jersey and Lee Giles of Pennsylvania State University, the Web contains nearly a billion documents. The documents represent the nodes of this complex network and they are connected by locators, known as URLs, that allow us to navigate from one Web page to another.
    The Lawrence and Giles study was published in 1999, so stop picking on the 1 billion number... it's quite out of date. Web researchers know this already.

    The important thing in that paper is the growth of the web; and from Kumar's bowtie-theory paper, we also think that most of the web is growing in places where we can't see.

  6. But the real question is... by drjoe1e6 · · Score: 1

    ...the typical number of clicks between two Web pages is about 19, despite the fact that there are now over one billion pages out there...

    So, how many clicks does it take to get to the home page of Kevin Bacon?
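
    For anyone who wants to count clicks on a small scale, a breadth-first search finds the fewest links between two pages. This is only a sketch: the `toy_web` link graph and its page names are made up for illustration, and it is not the method the researchers used on their sample.

```python
from collections import deque

def clicks_between(links, start, target):
    """Breadth-first search: fewest clicks from start to target, or None."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, dist = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no chain of links leads from start to target

# Hypothetical toy link graph: page -> pages it links to.
toy_web = {
    "slashdot": ["physicsweb", "imdb"],
    "physicsweb": ["nec"],
    "imdb": ["footloose"],
    "footloose": ["kevin_bacon"],
    "nec": [],
    "kevin_bacon": [],
}

print(clicks_between(toy_web, "slashdot", "kevin_bacon"))
```

    Links are one-way, so the trip back from `kevin_bacon` to `slashdot` returns None in this toy graph.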

    -Joe

    --
    Lose = not win ...... Loose = not tight
  7. Re:Intersting, but flawed. by Anonymous Coward · · Score: 0

    Hey -- it is an Achilles heel. If a site doesn't get linkage, it doesn't get googlage, and it doesn't get in front of my face when I search for it, and -- what's more -- I can't stumble across it by following a hyperlink somewhere. In other words, if a site is only linked from a few other sites, then it is vulnerable to not being seen. Coincidentally, if a site is only linked from a few other sites, then it probably suffers from profuse quantities of suckage. So what's this Achilles heel, again? A weakness whereby I only get to see sites that matter? Oh, okay. My bad...

    and sorry you got mod'd down so low. next time, read, don't skim :)

  8. Re:LAIN by esonik · · Score: 2

    Can it have a soul?

    Would a single cell know whether the whole thing has a soul or not?

  9. Re:Advantage of Scale-Free Topology by Anonymous Coward · · Score: 0

    I was thinking about this too. It seems that the forces and companies (read, for instance, Microsoft) that want to re-engineer the topology and access/delivery of internet content don't realize, in their greed and haste for the future billions/trillions of dollars, that a very controlled internet structure would by its nature have to have a very centralized control architecture that could not possibly handle all the current and future growth of the net. It is very much like how the communism model broke: it could not handle a world where informational, technological and economic forces were all diverging in 10 different directions at once. It figures that Microsoft, a company that just produces crap products, hasn't a clue that its idea of a centrally controlled internet won't fly (if they had good R&D capability, chances are their products wouldn't suck so badly). But it seems that physics will probably save us from the Microsoft/Brazil version of the future... if not, then we are really doomed and most people will grow up to hate the computer revolution...

  10. Re:Wow, thats kind of deep. by Flabdabb+Hubbard · · Score: 1
    Obviously I only have one account, and obviously the longer you have been reading slashdot the more intelligent you become.

    Reactions like yours make my day :-)

  11. got root? by dstone · · Score: 2

    One odd note is the researchers' claim that the Web contains "nearly a billion documents," when one search engine alone claims to index more than a third beyond that

    I have to apologize for that one. I was VPN'ed in the other day and I opened MS Explorer on "\\internet". I accidentally selected "www" and hit CTRL-C, CTRL-V, CTRL-V. I guess the "Copy of www" and "Copy (2) of www" tripled the document count on some search engines. My bad.

  12. Re:LAIN by Anonymous Coward · · Score: 0
    It's very possible. But how would one access any information the entity might produce? Taken as a one-to-one analogy to a biological brain, one neuron, or computer, wouldn't be able to convey information very well. All you get is either on or off.

    Biological brains convey information via cooperation of large numbers of neurons acting together to drive motor neurons. What would be analogous to that on the internet?

    I wouldn't expect a magic eight ball or ghost in the machine, but there is most likely some crude form of "intelligence".

    Think of a cellular automaton that has a fixed set of rules that are activated according to the state of adjacent cells. Nodes on a network are constantly resending, resizing, redirecting packets according to a fixed set of rules. Considering the number of nodes on the internet, there should be sufficient fertile ground for various intelligence or life primitives to come about.

    Such a thing might reveal itself as a packet storm of some degree or form. But packet storms are things that are problematic and are actively quashed and avoided as part of the design of networking gear. It would be seen as noise in the data.

    If the system has been designed well enough to prevent these patterns from appearing outright as noise, they will just appear elsewhere. Maybe in patterns of delay, congestion, routing paths. Other subtle things.

    But something has to be there. It's a complex system that has many of the elements that are fertile ground for the primitives associated with intelligence, or at the very least cellular automata.

  13. Re:Intersting, but flawed. by nihilogos · · Score: 2

    If you misread everything else the way you misread this I doubt you understand the fundamentals of anything. The researchers make a clear distinction between physical networks and hyperlinks, calling them the 'internet' and the 'web' respectively. One of their surprising results is that the internet and web have similar network topologies. Or in their words:

    Why do systems as different as the Internet, which is a physical network, and the Web, which is virtual, develop similar scale-free networks?

    They go on to describe some properties of scale free networks and mention some interesting examples from physics.

    So, in summary, you have completely misunderstood the article.

    --
    :wq
  14. some thoughts on complex networks... by eh2o · · Score: 1

    Well, it's an interesting read.. but most of the technical stuff is kind of glossed-over .. I'm sure the graph theory behind it makes sense if you know the math.

    The article mentions a 0.05% sample... is that statistically significant? Not to mention the fact that 'web page' is a vaguely defined term (i.e. static versus dynamic) -- this makes me doubt that this report contains any type of 'real' conclusions.

    However I suspect this type of research must be really juicy for the big search engine companies (e.g. Google, etc..). I especially like the idea of giving the user a feeling of spatial orientation when browsing the internet (but what would that mean??)... in the end, I'm afraid that the internet/web/whatever is simply changing too fast -- by the time we analyze it enough to determine its topology and organization, something new will be replacing it. Note that the data in this article is already 2 years old... the web has probably at least doubled in size by now.

    To really understand the internet, statistical mechanics is not going to cut it-- we need better tools - adaptive ones that learn the new rules without being reprogrammed... ;)

    1. Re:some thoughts on complex networks... by Chirs · · Score: 1

      The required sample size can be determined based on desired precision and accuracy, with the general shape of the probability curve playing a role as well. It doesn't actually matter how big the population is, as long as we know how it's distributed so that we can factor it in to the calculations.

      In this case, don't think of it as a .05% sample, but rather as a sample of 500000 data points. Then consider that national surveys can get accurate results with only a few thousand data points.

      In any case, it is entirely possible for their results to be statistically valid.
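
      To put rough numbers on that, here is a sketch using the standard margin-of-error formula for an estimated proportion, with the usual finite-population correction. It assumes a simple random sample and a 95% confidence level (z = 1.96), neither of which comes from the article; a crawl of the Web may well not be a random sample.

```python
import math

def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n samples.

    p = 0.5 is the worst case; z = 1.96 corresponds to ~95% confidence.
    If population is given, apply the finite-population correction.
    """
    moe = z * math.sqrt(p * (1 - p) / n)
    if population is not None:
        moe *= math.sqrt((population - n) / (population - 1))
    return moe

# Compare a survey-sized sample with a 500,000-point sample,
# for two very different (assumed) population sizes.
for n in (2_000, 500_000):
    for population in (10**9, 10**10):
        pct = 100 * margin_of_error(n, population)
        print(f"n={n:>7}  N={population:.0e}  +/- {pct:.2f}%")
```

      The sample size drives the result; growing the population tenfold barely changes it.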

  15. yeah, right... by nycdewd · · Score: 1

    while it is more than evident that you are a troll and should be ignored, i can't let your parade of ignorance go unremarked upon. SOCIALISM is ALIVE and quite well in many countries of the world, the scandinavian countries are quite socialist. COMMUNISM is not to be equated with socialism. the soviet union and the eastern bloc crumbled, yes, but there are other communist countries in the world that remain.

    1. Re:yeah, right... by Dmitry+Skylarov · · Score: 0

      I can't believe you actually bit!

      --

      ----
      Please, I are begging you! To save Dmitry from teh jail!

  16. Re:LAIN by trixillion · · Score: 1

    While most individuals on slashdot may not have compromised machines... how many compromised machines are out there? If a primitive intelligence were to form, it would never be allowed to take hold of most machines. But it could certainly exist entirely in the domain of poorly managed boxes, and nobody would ever have to know ;)

  17. Mob Psychology describes the web better than... by hillct · · Score: 2

    I believe that Mob Psychology might model the web better, particularly the growth patterns of the web, however I don't have any studies to prove it yet.

    It's interesting though that every academic out here has tried to comment on completely unrelated fields using the language of his area of expertise. I've seen studies by mathematicians who claim to be able to model the web, and even industrial design students who claim that the design-to-maturity process of a network of websites (a small subset of the web) is identical to the processes championed by industrial designers who led the way in Japan in the late 1970s.

    This seems to suggest (to me anyway) that those who engage in this cross-discipline analysis are somehow unsatisfied with their chosen field and are trying to latch onto an area of study that is popular at a particular moment in time.

    --CTH

    --

    --Got Lists? | Top 95 Star Wars Line
    1. Re:Mob Psychology describes the web better than... by Anonymous Coward · · Score: 0
      Fleischmann and Pons, those notorious cold-fusion hoaxsters, are a good example of what happens when people stray from their area of expertise.

      Mind you, if we criticised every slashdot poster who did that, we would be here for a very long time.

  18. Physics or math??? by Eythor · · Score: 2, Interesting

    I'm confused. The subject of the article is "Describing the Web with Physics" while, to me, it looks like Describing the Web with Graph Theory or Mathematics. Is there not a distinction between math and physics?

    1. Re:Physics or math??? by apsmith · · Score: 2

      It's physics if the researchers are physicists :-) There's a lot of overlap, since physics is inherently a very mathematical subject. But the particular ideas concerning scale-invariant systems (power-law behavior of various sorts), dynamical behavior, and percolation networks come out of statistical mechanics, a branch of physics. Specifically, a lot of this was developed starting with the theory of critical phenomena in the 1970's. The move to self-organized criticality (sandpile, wildfire, earthquake etc. models) and more generally now "complexity theory" has spread to quite a number of fields from biology to economics, but there are still a lot of physicists involved in this kind of research.

      --

      Energy: time to change the picture.

  19. Re:19 clicks of seperation by Anonymous Coward · · Score: 0

    Would someone mind explaining how this could conceivably be perceived as a troll, or is this just another example of pathetic moderators?

  20. Dynamic properties by ahogue · · Score: 1

    In the "Outlook" section at the bottom of the article, they express an interest in studying the "dynamic" properties of such graph systems. That is, how they pass messages between nodes, how the messages get routed, which nodes are passing the important/unimportant messages, and is there a cyclic pattern to how messages and replies get passed.

    Does anyone know of any studies on this subject?

  21. Re:LAIN by Rinswind · · Score: 0

    Has anyone heard of the neural net mathematical model? It tries to describe the way the neurons in your brain work. Basically the idea is that each neuron has signal inputs to receive impulses and outputs to send impulses to the other neurons. When the neurons work, they influence the power of the signals coming from their inputs and then forward the signals to the other neurons through their outputs. The brain evolves towards consciousness (or image recognition, speech comprehension, etc.) when, with time, each neuron "calibrates" its output. Finally the result is the ongoing "sparkling" of the whole neural net as signals jump from one neuron to the next, forming memories, images, etc. I think the internet does nothing like that. First, the signals introduced into the system are chaotic and mostly noise. They are not organized in images or sounds or anything that can stimulate the neural net to adapt towards something like image recognition. Even if this argument is not viewed as too important, since we don't really know how the mind works, there is a problem with the "calibration" itself. We are the ones that calibrate the nodes of the internet, and I don't think that we follow some global pattern that could spawn intellect.
    Perhaps there is still a chance for intellect to evolve if we assume that, after all, in order to increase the performance of the net we humans follow some global pattern. The problem is that once the intellect emerges, in order for it to work it would have to continue the "calibration" by itself, overriding our interference. And this is unimaginable. Can you imagine your computer trying to reconfigure itself, driven by some foreign force? Sounds kinda doubtful :P

  22. 19 clicks of seperation by evilviper · · Score: 0, Troll

    It seems like a sound theory to me...

    Now I want to see an equally technical paper on the slashdot effect.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  23. Lying with statistics for fun and profit by SimCash · · Score: 1
    For a proper answer we need a full map of the Web. But, as Lawrence and Giles have shown, even the largest search engines cover only 16% of the Web. This is where the tools of statistical mechanics come in handy: they can be used to infer the properties of the complete network from a finite sample.

    Tread softly here, Grasshopper, the very fact that you can only easily see 16% of the Web means that you must expect that your sample is strongly biased, hence does not represent the Web in its entirety. Just as statistical resampling of the census would require much more care than a political entity can usually bring to bear, so would attempting to extrapolate web characteristics from a sample at random.

  24. Websense is blocking this site by Anonymous Coward · · Score: 0

    It's conspiracy theorizing time!!!

    But seriously this sucks, they claim it's porn! Hep me, my boss thinks the web might warp my brain. So they install websense on the fucking routers. Die Cisco, Die Webnofuckingsenseatall.com!!

  25. Re:LAIN by trixillion · · Score: 1

    The characteristic timescales are remarkably similar when you think about it. The brain has massively parallel transmissions, but each packet is sent with delays measured in the low milliseconds, with each "message" containing a few bytes at most. Typical ping times between servers with decent connectivity are going to give you delays around 10ms, and the average packet size is ~1.6kB. The only real difference is that the brain currently has orders of magnitude more nodes than the internet. But at some point in the near future, this may cease to be true.

  26. Re:LAIN by Kynde · · Score: 1

    Further, if you did have a distributed consciousness, what would the consequences of lag, network outages, and outright crashes be? In that sense, it would be interesting to see if random/semi-random/genetic algorithms are capable of generating an intelligence capable of coping with such noise

    Well, in our heads brain cells die left and right, starting from the days when we're still inside the womb, without really noticeable effects... no, wait! Could this explain the presidency of Bush?!

    -

    --
    1 Earth is warming, 2 It's us, 3 it's royally bad, 4 we need to take action NOW
  27. Re:complexity and deregulation by tbo · · Score: 2
    I'm not sure if more research and/or more computer science will solve the problem.

    What problem are you talking about? Their research found that the current structure of the internet is extremely resilient to random attacks. Yes, co-ordinated attacks against key routers could work, but every network has some vulnerability, and the best solution is probably just to make sure the few key routers are well-protected and hidden. As Mark Twain says,
    The fool says, "don't put all your eggs in one basket," whereas the wise man says, "go ahead, but watch that basket!"
    There's no problem that needs to be solved, so I don't know where you're going with this "Not to sound anti-business" rant. The current chaotic approach to building network infrastructure works great, just like many natural systems.
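
    The "resilient to random failure, fragile under targeted attack" claim can be tried on a toy network. The sketch below grows a graph by preferential attachment (new nodes prefer already well-connected ones), then compares the largest surviving cluster after removing 5% of nodes at random versus removing the 5% best-connected hubs. The network size, the 5% figure, the seeds, and the function names are all assumptions for illustration; this is not the researchers' code or data.

```python
import random

def scale_free_graph(n, m=2, seed=0):
    """Grow a graph by preferential attachment (Barabasi-Albert style)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    endpoints = []  # each node appears once per link it has
    # Start from a small fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            # Picking from endpoints favours nodes in proportion to degree.
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj

def largest_cluster(adj, removed):
    """Size of the largest connected cluster once `removed` nodes are gone."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

net = scale_free_graph(2000)
k = 100  # knock out 5% of the nodes
random_loss = random.Random(1).sample(sorted(net), k)
hub_loss = sorted(net, key=lambda v: len(net[v]), reverse=True)[:k]
print("random failure :", largest_cluster(net, random_loss))
print("targeted attack:", largest_cluster(net, hub_loss))
```

    In a run like this the random case should leave nearly everything connected, while losing the hubs leaves a noticeably smaller main cluster, which is the "watch that basket" point in miniature.
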
  28. hmmm by sewagemaster · · Score: 3, Funny
    it's all about the right hand law (with F-normal force represented by the middle finger), friction, and porn!!

    *grin*

  29. Re:LAIN by shaunak · · Score: 0, Flamebait

    (What will happen as the net becomes more and more like a brain? Can it have a soul? )

    "Please don't take this the wrong way, but that's honestly the sort of question I'd expect from someone who doesn't understand computers."

    Nope. I think it's from a person who doesn't understand almost anything. Taco, please use your own ID when you post intelligent (by your standards) comments.

    Yeah, I know this is flamebait. I don't give a @#@^

    --
    -Shaunak.
  30. Slashdot in Space (or terrain) by zauber · · Score: 1

    I especially like the idea of giving the user a feeling of spatial orientation when browsing the internet (but what would that mean??)...

    I'm reluctant to post this without having had more time to revise, but one way of spatializing the data is by making it into more familiar terrain. Again, still early stages, but for an example, how about the terrain described by the hyperlinks surrounding Slashdot in a typical week.

  31. Re:Intersting, but flawed. by tbo · · Score: 3, Informative

    As a senior undergraduate in combined honours physics and computer science, I hereby pronounce you a moron. The researchers first talk about the structure of the web (hyperlinks, etc.), then they talk about the physical structure (Achilles Heel, virus threshold, etc.). You must have missed the transition, Mullusk.

    The interesting thing is that both the web and the physical network follow this power-law structure (or scale-free, as the "Physics Boys" call it).

    Oh, don't think it's possible to study the physical structure of the internet? I'd like to introduce you to a new and powerful tool called traceroute [yes, that was sarcasm]. BTW, you can buy maps of the internet from ThinkGeek, in case traceroute is too much for you.

    How the hell did that guy get modded up, anyway?

  32. Re:Vulnerability to Carefully Coordinated Attack? by Tony+Shepps · · Score: 2

    If you really want to worry about it, Tom Clancy has written "Executive Orders" for you, which starts with the scenario as described.

  33. Re:Describing the web with biology.... by zauber · · Score: 1

    This is certainly an idea that has been around for a while. Consider a 1989 interview with Clifford Stoll ("The Cuckoo's Egg") in The Boston Globe:

    THE BIG PUSH IN THE COMPUTER INDUSTRY IS TOWARD STANDARDS. DO YOU THINK STANDARDS HELP OR HURT SECURITY?

    Viruses spread because of standardization. If a virus gets into one hole in a standard computer operating system, it's everywhere. Diversity in computing causes survival just like its biological counterpart.

    Of course, the solution is to make every system idiosyncratic. And (also) of course, this is not anything like a reasonable solution to the problem of security. Rather, we should view networked computing as a whole system--a system that could not even exist but for standardization--and attack the real problems: exploitable weaknesses in widely used software.

  34. jeez by Anonymous Coward · · Score: 0

    Does your penis feel bigger now?

  35. Re:Vulnerability to Carefully Coordinated Attack? by Malibu+Barbarian · · Score: 1
    What if the President of the USA, the Vice President, the entire Cabinet, the entire Senate, the entire House of Representatives, etc. etc. were simultaneously assassinated?

    In networking terms, I don't believe loss of the top officials would have to make a bit of difference to the operations of the country. The strength and stability of the government lies in the well-established and huge bureaucracy. That complex system really needs no supervision (Though it could use a kick in the ___).

    It's interesting that the correlation with the internet breaks down because the President et al are not the hubs of communication. They aren't part of the information network of government; they sit atop it, but separate from it. (They're more like ICANN than Google?)

    Unfortunately, humans are not routers, and I wouldn't even try to predict what the emotional effects of mass assassination would do to the function of the government or the nation.

  36. Nothing Sacred about intelligence by Anonymous Coward · · Score: 0
    You can't really shoot down the concept of internet intelligence with current AI theory since current AI theory cannot produce intelligence itself.

    Considering the rate of technology growth, we will most likely see some sort of intelligence as an unintentional byproduct, or as an error in some system.

    I was playing with some cellular automata code, trying out different rules. Guess what popped up? A Sierpinski gasket formed from the noise. Now it's easy to say how they come up with the benefit of hindsight and having studied them for a while, but I doubt anyone predicted them.

    And what produced that was a rather simple logical arrangement. The internet is a much, much, more complex and dynamic system. Doubtfully complex enough to form a HAL but that's no excuse for failing to investigate.
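
    For the curious, here is a minimal sketch of one rule known to do this. It assumes elementary cellular automaton Rule 90 (each cell becomes the XOR of its two neighbours) started from a single live cell; the post above doesn't say which rule or starting state was actually used, so treat this as one possible example.

```python
def rule90(width=31, steps=16):
    """Run elementary cellular automaton Rule 90 from a single live cell."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps - 1):
        # Each new cell is the XOR of its two neighbours (edges count as 0).
        row = [
            (row[i - 1] if i > 0 else 0) ^ (row[i + 1] if i < width - 1 else 0)
            for i in range(width)
        ]
        rows.append(row)
    return rows

for r in rule90():
    print("".join("#" if c else "." for c in r))
```

    Printed row by row, the live cells trace out nested triangles, the Sierpinski pattern, from nothing more than a two-neighbour XOR.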

  37. Re:1,000,000,000 urls by Anonymous Coward · · Score: 0

    Nice idea, but like so much grammar, this depends now on house style. Some academic journals expect it to be lower-cased.

  38. What about the rest of the "Web" by ultitool · · Score: 1

    Looking over the article it seems most of their research is about web pages and documents. Now to me that seems like a gross oversight. What about all those hosts out there that don't accept Code Red (or in a perfect world use Apache :P ), and aren't part of any traceroute? I'm sure most of the end users on the internet don't have services running that would put them on these graphs, yet we aren't included in the research. Just my 2c

    --
    If You Drink, Don't Park, Accidents Cause People.
  39. Re:PhysicsWeb by NonSequor · · Score: 1

    No, not really. What makes you say that?

    --
    My only political goal is to see to it that no political party achieves its goals.
  40. Re:LAIN by evilocity · · Score: 1

    (rantness>thought out argument, proceed with caution)
    I think this discussion has been somewhat flawed, in that it has been considering the internet and its human operators as separate entities. However, the Internet is so driven by human action and interaction that it is impossible to view it as the technology alone. Yes, it is just a communications network, but with humans at its nodes, the internet may be able to act as a brain, albeit a primitive one for the moment. Ideas of group consciousness are not new, they can be traced back through Freud and Rousseau in the Western tradition, possibly just as far or further in the East. That humans organize their social groupings the same way as biology organizes brains would seem to lend a bit more credence to this idea. Of course, one could argue that a group consciousness is not as intelligent, self aware or responsive as individual consciousness, but in the past, group consciousness has been limited in scope to towns, villages, and tribal groups. Cities and nations push the slowness and dumbness of group consciousness to the point that it is probably meaningless (i doubt that they are nonscaling systems). The Internet, however, allows us to form a non-scaled social network of unprecedented speed and size.

    What does this mean? The possibility for consciousness is there, and perhaps the reality is, as well. But a conscious Internet will not go running amok, create a body for itself, or any of that other sci-fi stuff, because it is us. Unless, of course, the whole net condenses on aol or .net.

    Oh, and btw, Lain was an ai created on the net, not out of the net, at least as far as I understand Lain, which isn't very.

    --
    ----- I don't believe in wisconsin.
  41. Re:Interesting... by zauber · · Score: 1

    Basically, this represents a cluster analysis of the linkage network. Elevation represents those domains that are most interlinked. More details are available in the directory above, under c5.doc.

  42. Vulnerability to Carefully Coordinated Attack? by the_one_smiley · · Score: 2, Interesting

    There have now been several studies asserting that a concentrated attack on just the top 3% (or some other low percentage), in terms of connectivity, of the major hubs / backbones of the internet would result in some critical failure scenario such as fragmentation into small isolated clusters. But isn't this type of condition valid for a lot of systems besides the internet?

    Consider this example, though it isn't meant to be analogous to the internet in any way. What if the President of the USA, the Vice President, the entire Cabinet, the entire Senate, the entire House of Representatives, etc. etc. were simultaneously assassinated? Can you even imagine the ensuing chaos? You can even throw in all the state Governors, whatever, but that still wouldn't come out to more than the top 0.0004% of the country's population, in terms of "political importance" or some other metric. Is this scenario plausible or worth worrying about? You decide.

    - The One God of Smilies =)

    --
    "Never put off for tomorrow what can be avoided altogether"
    1. Re:Vulnerability to Carefully Coordinated Attack? by bartle · · Score: 2

      What if the President of the USA, the Vice President, the entire Cabinet, the entire Senate, the entire House of Representatives, etc. etc. were simultaneously assassinated? Can you even imagine the ensuing chaos?

      It wouldn't be that bad actually. One of the major strengths the US government has is a fairly clear line of succession, it's always obvious who is in charge in a given situation. And really it isn't even important who specifically is in charge, just so long as someone is. I doubt we'd really be too worried about it anyway, we'd be far more concerned about what killed them.

      The point is though that this small minority is also under the best protection. You estimate the number to be 0.0004% of our population (a little over a thousand people); invert that to say they have 250,000 times better security than the rest of us. As this applies to the Internet, we just need to make sure that our main routers have the same level of protection. That rule of thumb makes sense to me: if there are 10,000 machines behind a connection then it should be 10,000 times harder to take down that connection than a single machine. I know it doesn't sound like a good metric, but it's an interesting thought experiment.
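
      The arithmetic behind those two figures, as a quick sketch. The population value is an assumption (roughly the US population at the time), not something stated in the thread.

```python
share = 0.0004 / 100          # 0.0004% written as a fraction
population = 285_000_000      # assumed US population, circa 2001

officials = population * share   # size of the "top 0.0004%"
protection_factor = 1 / share    # the inverse of that share

print(round(officials), "people;", round(protection_factor), "times")
```

      Under that assumption the group comes to a little over a thousand people, and inverting the share gives the 250,000 multiplier.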

  43. Goatse... by Kragg · · Score: 1
    What about multiple links from a single page?

    For example, almost every slashdot page links to Goatse.cx more than 20 times...

    --
    If you can't see this, click here to enable sigs.
  44. Re:when describing by mikewhittaker · · Score: 1
    "Physics" isn't the equations.

    The equations are just our human attempts to understand the physics.

    You are conflating the subject as taught in school with the subject matter itself.

  45. Interesting... by Rinswind · · Score: 0

    It is a beautiful picture, considering the fact that it represents something meaningful like linked websites. By looking at the picture I reckon the algorithm would determine the "elevation" of each pixel from the plain. Maybe it will interpolate the elevations between several major pixels on the plain representing the most linked sites? All that's left is to determine how the algorithm maps "elevation" to the links topology ;P

  46. Re:1,000,000,000 urls by ElderKorean · · Score: 1

    Do all these wild guesses about the number of pages include such wonders as all of the pages in Yahoo and Google's Web Directory? No doubt these would boost the page counts a little.

    Especially if Google is caching someone else's content pages.

    Ian.

  47. Re:Wow, thats kind of deep. by Anonymous Coward · · Score: 0

    yes, and to a twat with an UID of >260000
    I
    can
    only
    say:
    fuck you.

    thank you.

  48. Re:LAIN by asdfdf · · Score: 1

    I second this.

    If the internet became 'alive' we would all see a _lot_ of packets going around that we didn't understand. We would all see hits to, say, our webservers from totally random IPs, containing code to take over our machines and change them, as any brain changes itself. Such a system would never happen, and if it did we'd have it all over the news about the internet slowing down, and the 'internet' taking over machines.

    Besides I'd get loads of messages in my apache logs..

    oh wait

    wtf

    HELP!

  49. hmm by isudoru · · Score: 0, Offtopic

    i'm still 15, but i got this book called The Dancing Wu Li Masters, which is about physics without the math. they try to explain physics in plain language, as Einstein said could be done. This was really an interesting cookie to go along with my book. thanks

    --

    ----
    "I believe in karma. That means I can do bad things to people and assume they deserve it" - Dogbert
  50. Wow, thats kind of deep. by Flabdabb+Hubbard · · Score: 1
    I mean, it's like, wherever you look, the same patterns pop up. At the microscopic level, or the macroscopic level. Amazing.

    It's this sort of technical link that keeps me coming back to slashdot, even though it's not as good as it used to be, and it no longer seems to attract the 31337 intelligent posters of the good old days.

    Oh well, nothing lasts forever.

    1. Re:Wow, thats kind of deep. by Kalani · · Score: 1

      ... and your comment contributes greatly to that ideal.

      --
      ___
      The ends are ape-chosen, only the means are man's. -- Aldous Huxley
  51. I've got a million of them... by Sun+Tzu · · Score: 3, Interesting

    ...In the single Continuum of Chaos game. Seriously, the game is played in a universe consisting of one million sectors, each of which corresponds to a web page -- with multiple sub-pages. Google and the other engines can't really index it because it requires a log in. Further, even if they did log in they would run out of Antimatter long before they got through even a tiny fraction of the pages.

  52. PhysicsWeb by LocalYokel · · Score: 2

    It just seems appropriate that a Physics site called PhysicsWeb would have an article about Physics and the Web, don't you think?

    --

    --
    E2 IN2 IE?

  53. Don't forget! by Anonymous Coward · · Score: 0

    Google's figure hasn't been updated in a long time. It's probably way more than 1.3 billion now. (Notice that google.com shows a static number of "pages indexed".)

  54. 1,000,000,000 urls by grammar+nazi · · Score: 4, Insightful
    The story mentions "nearly 10^9 urls", so duplicate documents would be counted multiple times.

    Most of their research seems to be on 'static pages'. They state that the entire internet is connected via 16 links (similar to the way that people are connected by 5-6 acquaintances). I believe that as the ratio of dynamic to static content on the internet increases, the total number of clicks that it takes to get from one site to the next will increase. For example, I could create a website that dynamically generates pages, where the first 19 pages are all contained within my site and the 20th time that the page is generated, it contains a link to google.

    The metric functions that they use are good for randomly connected maps, but they don't apply to the internet, where nodes are not randomly connected. Nodes cluster into a group depending on topic or categories. For example, one Michael Jackson site links to other Michael Jackson websites.

    --

    Keeping /. free of grammatical errors for ~5 years.
    1. Re:1,000,000,000 urls by Anonymous Coward · · Score: 0

      Grammar note: "the Internet" is a proper noun and should be capitalized. "An internet", as in, some network of interconnected networks, is not capitalized.

    2. Re:1,000,000,000 urls by Paolomania · · Score: 1

      The metric functions that they use are good for randomly connected maps, but they don't apply to the internet, where nodes are not randomly connected.

      Actually the article describes the finding that the connectivity of nodes on the web and Internet follows a power-law distribution instead of the Poisson distribution one would expect with a randomly connected graph. Maybe we should read beyond the introduction of the article before we post?

    3. Re:1,000,000,000 urls by Anonymous Coward · · Score: 0

      Is the internet really a person, place or thing? I'm not really sure on that one. We can rule out person, but it could be considered to be a place and a thing at once. Definitively, a noun can be only one at once, because a person cannot be a thing, and a place cannot be a thing. Also, proper nouns cannot be things to begin with. So, it cannot be a proper noun by definition.

      It's official, you are an idiot.

    4. Re:1,000,000,000 urls by Anonymous Coward · · Score: 0

      ...distribution instead of the poisson distribution one would expect with a randomly connected graph
      uh, why would you EXPECT a randomly connected graph? Why on Earth? If anything, the amount of randomness is probably a little HIGH (i.e. sometimes a site will link to its hoster, even though their content has nothing to do with each other). Other than that, you would expect the graph to segment horribly. Which is why Google's ranking system works so well -- each node in a relevant "clump" has a highly weighted vote -- so the site that the most "Michael Jackson"-clump sites point at is the one that'll make it to #1. Even if thirty times as many sites point to a CNN article, and that CNN article happens to contain the search term. Unless, of course, the sites doing the pointing are themselves in the Michael Jackson clump.

    5. Re:1,000,000,000 urls by demaria · · Score: 1, Offtopic

      Well if I am not mistaken, the AP style guide says "Internet" when talking about the Internet, internet in the general or otherwise sense.

      www.m-w.com defines "Internet" as a noun.

    6. Re:1,000,000,000 urls by mino · · Score: 1
      The story mentions "nearly 10^9 urls", so duplicate documents would be counted multiple times.

      <pedant>Actually, it might be more accurate to say that if they're talking about "documents", then they are talking about 10^9 URIs, not 10^9 URLs. A URL merely identifies the "site", while a URI identifies a specific document. There's a difference.</pedant>

      Or, then again, who cares?

  55. Surprising claim... by mwillems · · Score: 1

    ...is that no search engine covers more than 16% of the web. Are the authors confusing the "hidden web" (i.e. web-based databases, such as newspaper articles) with network topology, which is what they are concerned with? I would have thought that google covers most of the web (in terms of topology), not 16% of it. Michael

    --

    ---
    BDOS ERR ON A:>
  56. Describing the web with biology.... by swordboy · · Score: 2, Insightful

    The Code Red thing was interesting in the respect that, if it had worked, it would reveal just how *evil* homogeneity is. In nature, it leads to plagues and/or like disasters.

    It turns out that computing may prove similar.

    Different is good!

    --

    Life is the leading cause of death in America.
    1. Re:Describing the web with biology.... by sconeu · · Score: 2

      Stoll made the same point many years ago in The Cuckoo's Egg, when discussing the Morris Internet Worm.

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
  57. These figures are normally HTML-only by iReflect · · Score: 1

    One odd note is the researchers' claim that the Web contains "nearly a billion documents," when one search engine alone claims to index more than a third beyond that, but I guess new and duplicate documents will always make such figures suspect.

    Also remember that most search engines are indexing only html pages and are probably only counting said pages in their "pages indexed" figures. The web CAN contain other media that may be considered documents. The obvious one is PDF.

    1. Re:These figures are normally HTML-only by Anonymous Coward · · Score: 0

      Well.. at least Google searches inside PDFs from the Net too. It's a start.

  58. Advantage of Scale-Free Topology by Grokopen · · Score: 2, Interesting
    A few days ago, /. had a story on how big businesses wanted to get rid of the current *open* Internet for an allegedly better version: The Death of the Open Internet.

    The problem with getting rid of the current Internet is that we would probably lose the advantage of having scale-free topology ... something the PhysicsWeb article discusses at length. Scale-free topology is one of the key factors in keeping the current Internet stable and relatively fault tolerant even as the number of users has grown exponentially. I doubt that those who want to replace an open Internet would create a replacement that would incorporate this type of scale-free topology.

  59. LAIN by Schezar · · Score: 1, Insightful

    "The number of nodes in the wired is rapidly approaching the number of cells in the human brain." Or something like that.

    What will happen as the net becomes more and more like a brain? Can it have a soul?

    Or worse, can it comprehend the garbage we use it for? ;^) "Sorry Dave, but I cannot allow you to download that pr0n..."

    --
    GeekNights!
    Late Night Radio for Geeks!
    1. Re:LAIN by norton_I · · Score: 2

      This is an idea that has certainly been discussed before, but the answer is "almost certainly not".

      First of all, the formation of a scale free network was caused by measurable "evolutionary" pressures for fault tolerance. In the absence of some similar evolutionary advantage to developing a global consciousness, it doesn't seem likely that it would happen spontaneously.

      On the other hand, if some (possibly unintentional) goal was aligned with that, I wouldn't be totally surprised if through maintenance and updates, some form of consciousness arose.

      Except: characteristic time scales on the internet are very large compared to connections within the brain. Any large scale behavior, including consciousness, would be expected to be slower than a human brain by orders of magnitude.

    2. Re:LAIN by Erasmus+Darwin · · Score: 4, Interesting
      What will happen as the net becomes more and more like a brain? Can it have a soul?

      Please don't take this the wrong way, but that's honestly the sort of question I'd expect from someone who doesn't understand computers.

      While I believe in the possibility of machine intelligence (along with the moral, ethical, and most importantly philosophical questions that raises), the net is more of a data transfer mechanism than a processing mechanism. Short of very deliberate projects, such as SETI@Home, you just don't have your average machine on the net doing random computation. In that sense, the net really hasn't changed much since its inception. Further, if you did have a distributed consciousness, what would the consequences of lag, network outages, and outright crashes be? In that sense, it would be interesting to see if random/semi-random/genetic algorithms are capable of generating an intelligence capable of coping with such noise. However, I think such issues would rapidly kill off something before it became "evolved" enough to cope. If we do get an intelligence, I think it'll be something that happens on purpose. It may be distributed (maybe as a redundant, non-real-time simulation of a brain), but I doubt it'll be a spontaneous Skynet-like entity.

    3. Re:LAIN by Kierthos · · Score: 1

      I agree. A lot of recent science-fiction treats AIs not as spontaneously arising from the Internet but as deliberate projects to create AI. (Usually, something goes horribly wrong, but that is one of the genres of sci-fi.) One of my personal favorites was the story that had an AI that had been 'killed' twice by the NSA, and they weren't even aware that they had done it, giving rise to the theory that it had been at such a non-robust stage that it was easy to 'kill'.

      An Internet-wide AI.... hmm... lag would probably be analogous to senility or Alzheimer's, network outages would be memory loss or brain damage, crashes would be brain damage as well. However, given that a network will eventually come back up, and no crash lasts forever (although I'm certain MS is working on a 5 9's crash), it wouldn't be permanent brain damage. And theoretically, such an AI could become 'accustomed' to lag and work around it.

      But it's all still speculation...

      Kierthos

      --
      Mr. Hu is not a ninja.
  60. All is goo in the land of Microsloth by clinton(x) · · Score: 0, Troll

    Fucking Winders NT. I was going to summarily execute the machine it was running on (you guessed it - it crashed), drag it out into the car park and break into and hotwire my old best friend's dilapidated dumped de-registered car and start taking potshots, screaming howling berating and throwing Jack Daniels bottles at it whilst I reversed backward and forward over it, but then I calmed significantly and remembered that all is goo in the land of Microsloth.

  61. when describing by +a++00+y0u · · Score: 1, Funny
    I try to avoid physics and use words instead. It is sooo much easier than having to remember all those equations. ugh.

    --
    My name isn't really Jenny....

    1. Re:when describing by Anonymous Coward · · Score: 0

      But then, you don't end up describing it as accurately.

  62. Physics can EASILY kill the web by sharkey · · Score: 0, Offtopic

    Just get my high school Physics teacher to lecture the Internet admins when they shut down the Web for cleaning this New Year's. They'll sleep forever, long after the Internet has been replaced.

    Hi to Mr. Konkle! Still doing your grade book in pencil? Physics might have been Phun, but you certainly killed that, didn't you?

    --

    --
    "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  63. Read the fscking article... by friode · · Score: 5, Interesting

    One odd note is the researchers' claim that the Web contains "nearly a billion documents," when one search engine alone claims to index more than a third beyond that

    Look deeper, grasshopper:

    ...This expression predicts typically that the shortest path between two pages selected at random among the 800 million nodes (i.e. documents) that made up the Web in 1999 is around 19 assuming that such a path exists...

    ...the typical number of clicks between two Web pages is about 19, despite the fact that there are now over one billion pages out there...

    Hey, Timothy, next time try reading the article instead of skimming it.
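
    For anyone who wants to play with the numbers, here is a rough sketch of the estimate quoted above. The quote does not reproduce the expression itself, so this assumes it is the logarithmic fit d = 0.35 + 2.06 log10(N) from the 1999 measurements by Albert, Jeong and Barabási; the constants and the function name `expected_clicks` are my assumption, not something taken from the quote.

```python
import math

def expected_clicks(n_documents: float) -> float:
    """Average shortest path between two random pages.

    Assumed fit: d = 0.35 + 2.06 * log10(N). The constants are an
    assumption about which expression the article refers to.
    """
    return 0.35 + 2.06 * math.log10(n_documents)

if __name__ == "__main__":
    # 800 million documents (the 1999 figure) and one billion.
    for n in (8e8, 1e9):
        print(f"{n:.0e} documents -> about {expected_clicks(n):.1f} clicks")
```

    Because the dependence is logarithmic, going from 800 million to a billion documents barely moves the result, which is the point of both quoted passages.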

    --
    There may be many reasons not to kill you, but among them is not that you'll be missed by NASA - The Long Kiss Goodnight
    1. Re:Read the fscking article... by Anonymous Coward · · Score: 0

      Excellent point.

      Perhaps Slashdot has a not-so-innocent bias towards Google? Kickbacks, perhaps? Maybe like Transmeta...

      With VA Linux going down the tubes, the money has to come from somewhere!!!

  64. Errror..Does not fempute by Anonymous Coward · · Score: 0

    The web site is physicsweb.org not physicsweb.com.

  65. Other Explanations by Anonymous Coward · · Score: 0

    Google indexes a lot more than the Web. For instance, it has the nicest collection of PDF's I've seen anywhere all accessible as text. Sweet function.

  66. complexity and deregulation by beanerspace · · Score: 3, Interesting
    The article does a good job of pointing out the seemingly chaotic and certainly cell-like nature of the internet. However, unlike the article, I'm not sure that more research and/or more computer science will solve the problem.

    That's not to say that understanding the various layers of complexity, architecture and dynamics won't provide an answer ... and not because I think such disciplines suck, but because we have and will continue to have commercial influences on how networks are established.

    Certainly some, in fact many, businesses will hire well and follow good practice. The problem comes about when some large companies don't. Or worse, when mergers and buyouts occur, e.g. Verizon, CIHost and a few others come to mind.

    Not to sound anti-business, because business has footed much of the bill for Internet expansion ... but rather to voice concern that sometimes there is a big disparity between technical solutions and the shareholder's bottom line.

  67. Few Months Ago.. by Anonymous Coward · · Score: 0

    Actually, the bow tie theory press was a year and a few months ago. i.e. 5/00.

  68. Internet is not web by Anonymous Coward · · Score: 1, Insightful

    You are confusing the two. WWW is the documents, etc. Internet (which is simply the DARPA net suite connectivity with the underlying routing protocols and physical connectivity) is simply a transport mechanism. It can carry anything; it just so happens that the web is the most popular (along with email).

  69. critical threshold for virus spreading by winnetou · · Score: 1
    The authors claim the threshold is 0, but Bliss never made it in the wild.

    The mere existence of that term IMHO shows that the threshold is greater than 0.

    1. Re:critical threshold for virus spreading by Anonymous Coward · · Score: 0

      if it wasn't released into the wild it would be at 0, the threshold; if it were more than that (i.e. released) it would be above the threshold and continue spreading

    2. Re:critical threshold for virus spreading by norton_I · · Score: 3, Insightful

      The virus infection threshold is based on something like this model:

      1) Some set of nodes are infected
      2) Each of those nodes has a probability X of infecting each of its nearest neighbors.
      3) Repeat.

      I just made that up, and there are many opportunities for variations (add the ability for nodes to be cleaned and/or vaccinated), but under models like this:

      random networks have a critical threshold for X, above which the infection spreads through the whole network, below which it dies out.

      scale-free networks will have a macroscopic fraction of the network infected for any value of X.

      First of all, there are additional features not captured in this model, which could be important for "viruses" like Bliss which have an extremely low probability of infection.

      Second, the internet is not exactly a scale free network. As mentioned in the article, while the dominant behavior is a power law, if you go high enough, you find exponential cutoffs. This could cause some viruses to die out (I am certain Bliss isn't the only one that never made it).
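
      A toy version of the model above, for anyone who wants to poke at it. This is only a sketch under my own assumptions: the "cleaned" variation is switched on (each infected node is cured with probability `cure` per round), since without it everything reachable eventually gets infected; the random network is built by picking edges uniformly, and the scale-free one by preferential attachment. All function names and parameter values are made up for illustration.

```python
import random

def random_graph(n, avg_degree, rng):
    """Random network: add uniformly chosen edges until the mean degree is reached."""
    adj = {v: set() for v in range(n)}
    edges_wanted = n * avg_degree // 2
    added = 0
    while added < edges_wanted:
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b and b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1
    return adj

def scale_free_graph(n, m, rng):
    """Preferential attachment: each new node links to m existing nodes,
    picked with probability proportional to their current degree."""
    adj = {v: set() for v in range(n)}
    stubs = []  # every node appears here once per link it has
    # Start from a fully connected core of m + 1 nodes.
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            adj[a].add(b)
            adj[b].add(a)
            stubs += [a, b]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj

def simulate(adj, x, cure, steps, rng, seeds=5):
    """Run the infect/clean rounds; return the infected fraction at the end."""
    infected = set(rng.sample(sorted(adj), seeds))
    for _ in range(steps):
        newly = set()
        for node in infected:
            for nb in adj[node]:
                # Step 2: infect each healthy neighbor with probability x.
                if nb not in infected and rng.random() < x:
                    newly.add(nb)
        # Variation: previously infected nodes are cleaned with probability cure.
        cleaned = {node for node in infected if rng.random() < cure}
        infected = (infected | newly) - cleaned
        if not infected:
            break
    return len(infected) / len(adj)

if __name__ == "__main__":
    rng = random.Random(42)
    networks = {
        "random": random_graph(2000, 4, rng),
        "scale-free": scale_free_graph(2000, 2, rng),
    }
    for x in (0.02, 0.05, 0.2):
        for name, adj in networks.items():
            frac = simulate(adj, x, cure=0.2, steps=50, rng=rng)
            print(f"X={x:<5} {name:<11} infected fraction: {frac:.3f}")
```

      These are toy sizes, so treat any single run as illustrative rather than as evidence for or against the threshold claim.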

  70. Tim Berners Lee by Anonymous Coward · · Score: 0

    Was a physicist. No physics, no web. Sorry about your lousy teacher though.

  71. i had a feeling... by Prion86 · · Score: 2

    this was going to show itself sooner or later. im no math guy, nor am i a computer guy. i am in fact a molecular biologist and i play with complex systems every day. i was thinking about how everything is connected to the net these days. cell phones, pdas, cars, even appliances. with all this stuff, the physical topology of the net is very dynamic, almost to the point that it's evolving on its own. despite the fact it is an entirely man-made thing, it looks very organic. all these complex systems look the same. as chaotic and complex as it is, it seems as though someone (with too much time on their hands) saw how it is behaving like things found in nature. looks like i was thinking along the right lines. i just wonder with the speed of expansion what it will be like 5 years from now. i hope it doesnt develop morals...

    --
    "Alot of people don't know what they are doing...and most are pretty good at it." -George Carlin
  72. IBM "bow tie" paper by mgarraha · · Score: 4, Interesting

    In "Graph structure in the web," Kumar et al. divide 200 million web pages into four categories of roughly equal size:

    The first piece is a central core, all of whose pages can reach one another along directed hyperlinks -- this "giant strongly connected component" (SCC) is at the heart of the web. The second and third pieces are called IN and OUT. IN consists of pages that can reach the SCC, but cannot be reached from it - possibly new sites that people have not yet discovered and linked to. OUT consists of pages that are accessible from the SCC, but do not link back to it, such as corporate websites that contain only internal links. Finally, the TENDRILS contain pages that cannot reach the SCC, and cannot be reached from the SCC.

    So is your home page an innie or an outie?
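
    If you want to answer that for a small link graph, the categories above can be sketched roughly like this. It is a simplification under my own assumptions: you have to name one page you believe sits in the core, and tendrils are lumped together with fully disconnected pages as "OTHER". The page names in the example are hypothetical.

```python
from collections import deque

def reachable(start, edges):
    """All pages reachable from `start` by following `edges` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def bow_tie(links, core_page):
    """Label each page relative to the strongly connected component of core_page."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Reverse every hyperlink so we can walk "who links to me".
    reverse = {}
    for src, targets in links.items():
        for t in targets:
            reverse.setdefault(t, []).append(src)
    downstream = reachable(core_page, links)    # pages the core can reach
    upstream = reachable(core_page, reverse)    # pages that can reach the core
    labels = {}
    for page in pages:
        if page in downstream and page in upstream:
            labels[page] = "SCC"
        elif page in upstream:
            labels[page] = "IN"
        elif page in downstream:
            labels[page] = "OUT"
        else:
            labels[page] = "OTHER"  # tendrils and disconnected pages
    return labels

if __name__ == "__main__":
    # Hypothetical toy web: source page -> pages it links to.
    links = {
        "hub": ["news"],
        "news": ["hub", "corp"],
        "newbie": ["hub"],
        "corp": ["corp2"],
        "island": ["island2"],
    }
    for page, label in sorted(bow_tie(links, "hub").items()):
        print(f"{page:8} {label}")
```

    Two breadth-first searches are enough here because a page is in the same strongly connected component as the core page exactly when each can reach the other.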

  73. Newton's World Wide Web by Anonymous Coward · · Score: 0

    The Force of the web is equal to the product of the Mass of web and the Acceleration of the web.

  74. I am infinitely grateful... by Nathdot · · Score: 1

    ... that they didn't try to Describe The Web With Gym Class.

    I sucked at that! (As I imagine many /.ers did and do)

    :)

  75. naming by Anonymous Coward · · Score: 0

    throughout the article they confuse the words 'network topology' (relating to physical connections) and website links (i.e. URLs). it's pretty annoying

  76. Re:get it out of your hands by Anonymous Coward · · Score: 0

    But come to think of it, that was the second time I tried that, and the first time nearly killed me, so we'll just say I'm batting .500.

  77. Microsoft, Betamax, Qwerty, oh my by Anonymous Coward · · Score: 1, Insightful
    This simple model illustrates how growth and preferential attachment jointly lead to the appearance of a hierarchy. A node rich in links increases its connectivity faster than the rest of the nodes because incoming nodes link to it with higher probability; this "rich-gets-richer" phenomenon is present in many competitive systems.
    And there's your explanation for how VHS beat out beta, QWERTY beat out other arrangements, and Microsoft won out in the OS and Apps biz. A small initial advantage gets magnified over time. The wingbeats of a butterfly become a hurricane.
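
    The quoted mechanism is easy to try out. Below is a minimal sketch, assuming the simplest possible variant: one new node per step, one link per new node, and a target chosen with probability proportional to its current number of links. The function name and the network size are my own choices for illustration.

```python
import random

def grow_network(n_nodes, seed=0):
    """Grow a network by preferential attachment and return each node's link count."""
    rng = random.Random(seed)
    degree = [1, 1]   # start with two nodes joined by one link
    stubs = [0, 1]    # node i appears in this list degree[i] times
    for new in range(2, n_nodes):
        # Picking uniformly from stubs is picking a node in proportion to its degree.
        target = rng.choice(stubs)
        degree[target] += 1
        degree.append(1)
        stubs += [target, new]
    return degree

if __name__ == "__main__":
    degree = grow_network(10_000)
    ranked = sorted(range(len(degree)), key=degree.__getitem__, reverse=True)
    print("best-connected nodes (id, links):", [(i, degree[i]) for i in ranked[:5]])
    print("nodes with a single link:", degree.count(1))
```

    Node ids are assigned in order of arrival, so low ids among the best-connected nodes mean early arrivals kept their head start.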
  78. 19 Degrees of separation by nihilogos · · Score: 0, Troll

    That makes goatse.cx a little too close for comfort. Keep those homepages with pictures of your cat coming people.

    --
    :wq
  79. Fractals by snack · · Score: 1

    Kind of like a fractal... the complexitivity (sp) goes up the harder you look into it.

    -Tim

  80. Relation of Net Connections to Neural Nets by polyphemus · · Score: 1

    I wonder if the mathematics dealing with the spread of the net and the web can be applied to neural nets. I know that, as it stands, a neural net is just a test-and-analyze-results type of system, where the strongest signal (weighted by past tests) dominates any later action. But maybe it could be broken down into classes of actions (very connected hubs) that break down into specific action types (somewhat connected hubs) and then into specific actions (singly connected network nodes), working from vague descriptions of the input data down to specific information on it (when there are only 5 real possible actions to check). Anyone out there a neural net expert who thinks this might work?