Slashdot Mirror


Map the Internet... In One Day?

rjbrown99 writes "There have been numerous stories over the past few years on Bill Cheswick's Internet Mapping Project. The Lumeta folks even created a company out of it. Well, now there is a competitor. A single guy with a single computer is working to accomplish the same feat - within ONE DAY and using open-source tools to do it. The new project is called Opte and can be found at www.opte.org." He's made some progress and is looking for volunteers.

47 of 263 comments

  1. This server will die ! by Anonymous Coward · · Score: 5, Informative

    Who
    This project was started by me (Barrett Lyon) as a response to a conversation with my colleagues at Network Presence. Over a lunch we were discussing William Cheswick and Hal Burch's Internet Mapping Project. I was not very impressed with the results of their project: they produce beautiful maps, but the maps don't seem to be very useful, nor do they release their code freely. Their mapping also takes nearly six months to generate a single map. My comment was that, "I can write a program that can map the entire net in a single day." The comment was met with some hostility. Thus, this project was born.

    What
    The goal of this project is to use a single computer and single Internet connection to map the location of every single class C network on the Internet. It is obvious that the Internet is not routed as a bunch of class C networks, but it is easy to see that by treating the Internet IP space as a bunch of class C networks, it will be possible to make a detailed map of the entire Internet. The global Internet address space currently offers 32 bits worth of unique host addresses, or a theoretical maximum of 2^32 = 4,294,967,296 hosts. In reality, the address space has been allocated in fairly large contiguous blocks, which renders strictly optimal utilization difficult. The smallest block that is logically routed via BGP or allocated by ARIN is a class C network (CIDR /24).
    At a rate of 194 traceroutes per second it is possible to scan the entire theoretical space of 2^24 class C networks within a single day. Thus about 16,777,216 class C networks could be processed by a single computer in a single day. Yet huge portions of the address space are no longer used: many network blocks fall under RFC 1918 (private addressing), and other blocks are reserved by ARIN.
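
The 194-per-second figure can be checked with a few lines of Python. This is only a sketch of the arithmetic in the quoted text, assuming exactly one traceroute per /24 block; the function name is made up for illustration and is not from the Opte code.

```python
# Rate needed to send one traceroute to every /24 ("class C") block in a day.
TOTAL_SLASH24 = 2 ** 24          # 16,777,216 possible /24 blocks in IPv4
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def required_rate(blocks: int, seconds: int = SECONDS_PER_DAY) -> float:
    """Traceroutes per second needed to cover `blocks` targets in `seconds`."""
    return blocks / seconds


if __name__ == "__main__":
    print(f"{required_rate(TOTAL_SLASH24):.1f} traceroutes/sec")
```

The result is a little over 194, which matches the number quoted above.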

    According to ARIN there are about 47 class A networks in reserved status (search ARIN for OrgName "Internet Assigned Numbers Authority"). Doing the math, that removes 3,080,192 class C blocks from the scan list, leaving us with a theoretical list of 13,697,024 blocks.
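
The subtraction works out if each reserved class A (/8) is counted as 2^16 class C (/24) blocks. A minimal sketch of that bookkeeping, with an illustrative helper name and the count of 47 taken from the text:

```python
# Each /8 ("class A") contains 2^(24-8) = 65,536 /24 ("class C") blocks.
SLASH24_PER_SLASH8 = 2 ** 16
TOTAL_SLASH24 = 2 ** 24


def remaining_blocks(reserved_slash8: int, total: int = TOTAL_SLASH24) -> int:
    """Number of /24 blocks left after dropping `reserved_slash8` whole /8s."""
    return total - reserved_slash8 * SLASH24_PER_SLASH8


if __name__ == "__main__":
    removed = 47 * SLASH24_PER_SLASH8
    print(f"removed: {removed:,}  remaining: {remaining_blocks(47):,}")
```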

    With some additional thought, it becomes clear that large portions of the 13.7 million blocks may route to the same place. By testing about 20 routes at random within a class B and comparing the results, it is possible to see whether there are multiple routes worth investigating or whether the entire thing goes to the same place. Applying that logic increases the speed of the scanning.
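
One way to read the sampling idea is sketched below. It is not the project's code: the `trace` callable is a hypothetical stand-in for a real traceroute that returns the router hops toward a target (destination excluded), and "comparing the results" is assumed to mean checking whether all sampled paths are identical.

```python
import ipaddress
import random
from typing import Callable, List, Optional, Tuple

Path = Tuple[str, ...]  # router hops toward a target, destination excluded


def sample_slash24s(class_b: str, k: int = 20,
                    rng: Optional[random.Random] = None) -> List[ipaddress.IPv4Network]:
    """Pick k distinct /24 blocks at random from a /16 ("class B")."""
    rng = rng or random.Random()
    subnets = list(ipaddress.ip_network(class_b).subnets(new_prefix=24))
    return rng.sample(subnets, k)


def needs_full_scan(class_b: str, trace: Callable[[str], Path], k: int = 20,
                    rng: Optional[random.Random] = None) -> bool:
    """True if the sampled /24s are reached over more than one distinct path."""
    paths = set()
    for subnet in sample_slash24s(class_b, k, rng):
        target = str(subnet.network_address + 1)  # first host address in the /24
        paths.add(trace(target))
    return len(paths) > 1
```

If `needs_full_scan` returns False, the whole /16 could be recorded as a single route and the other 236 blocks skipped; otherwise every /24 in it would still be traced.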

    After some testing and beta code I proved that with enough bandwidth it is possible to scan the entire Internet with a single computer. A map of 1/5th of the Internet only took about 2 hours to create, yet it generated nearly 200k/sec of traffic and put my machine at a load of 60+ while scanning. If you apply the math, the entire Internet would take about 10 hours to scan and another hour or two for the visual map output.

    I found a lot of value in the project, so after the proof of concept was completed I continued to program. I turned the entire system into a distributed client/server model. The clients request a chunk of random IP space from the server and when it is completed the IP space is registered with the server. This is done until all of the IP space has been scanned. I'm also working on a stats system so I can monitor the productivity of the different scanning nodes and users involved in the project.
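
The request/register cycle described above could look roughly like the following in-memory sketch. The class and method names are invented for illustration; the real system presumably talks over the network and handles clients that never report back, which this does not.

```python
import random
from collections import Counter
from typing import Dict, Iterable, Optional, Set


class ChunkServer:
    """Hands out chunks of IP space in random order and tracks completion."""

    def __init__(self, chunks: Iterable[str], seed: Optional[int] = None) -> None:
        self._pending = list(chunks)
        random.Random(seed).shuffle(self._pending)   # "random IP space"
        self._in_progress: Dict[str, str] = {}       # chunk -> client id
        self._done: Set[str] = set()
        self.completed_by: Counter = Counter()       # per-client stats

    def request_chunk(self, client_id: str) -> Optional[str]:
        """Give the client an unscanned chunk, or None if none are left."""
        if not self._pending:
            return None
        chunk = self._pending.pop()
        self._in_progress[chunk] = client_id
        return chunk

    def register_done(self, client_id: str, chunk: str) -> None:
        """Record that the client finished the chunk it was assigned."""
        if self._in_progress.get(chunk) != client_id:
            raise ValueError(f"{chunk} is not assigned to {client_id}")
        del self._in_progress[chunk]
        self._done.add(chunk)
        self.completed_by[client_id] += 1

    def finished(self) -> bool:
        """True once every chunk has been handed out and reported back."""
        return not self._pending and not self._in_progress
```

The `completed_by` counter is a stand-in for the stats system mentioned in the text: it only counts finished chunks per client.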

    By taking a more distributed approach the data will look more like the real Internet. It will show more of the backup routes, more of the smaller links in different countries, etc. When the first version of the code is done I should have about 5 to 10 different scanning nodes running on the Internet. If you would like to donate a computer and some bandwidth to this project, please contact me. I can give credit where credit is due!

    When
    The first scanning tests began in late October 2003 and I wish to have the project generate a new map every week.

    Where
    Currently the project is hosted in San Francisco on a multi-homed fiber ba

    1. Re:This server will die ! by bigmouth_strikes · · Score: 4, Insightful

      If he's mapping the whole Internet in a day he should be able to stand up to a little Slashdotting, shouldn't he ?

      --
      Oh, I can't help quoting you because everything that you said rings true
  2. Here ya' go... by swordboy · · Score: 3, Funny

    Several maps of the internet right here

    --

    Life is the leading cause of death in America.
    1. Re:Here ya' go... by BadCable · · Score: 3, Interesting

      Why don't they make maps like that of say the telephone network?

      That'd be very interesting to see, with very similar benefits.

  3. He needs more bandwidth by defMan · · Score: 5, Funny

    I am in serious need of more bandwidth and hardware power. If anyone has a Co-Located system on a nice network to donate to this project for a few months, I would be very happy!

    Slashdotting was never easier!

    1. Re:He needs more bandwidth by DukeyToo · · Score: 4, Funny

      It's an evil plot to map slashdot users. He's probably logging all the IPs of the people who hit his website today!

      --
      Most writers regard truth as their most valuable possession, and therefore are most economical in its use - Mark Twain
  4. Turn left at the third router... by bcolflesh · · Score: 5, Funny

    Go past the burnt-out Cray and then right at the Commodore64 Contiki server - you'll see my drive lights.

  5. Things look good so far... by sk3tch · · Score: 2, Funny

    ...his web server is already unavailable within minutes of it being posted on Slashdot...

    Mapping...Slashdot.org......

  6. Ok heres my part... by GoofyBoy · · Score: 5, Funny


    IP Address: 127.0.0.1
    Computer: The one from Microsoft with the Start button in the bottom left hand corner.
    Location: my bedroom.

    --
    The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
  7. His Map Is Wrong by DoctorMabuse · · Score: 5, Funny

    SCO IPs are in the Mordor address space.

  8. Re:Lets face it by nate+nice · · Score: 2, Informative

    "You can't make money with computers anymore because some jackass is always trying to give away the same thing you're doing."

    Don't feel too bad, the government here (USA) is mainly on your side. I would disagree with you, as there is always good money to be made here, but you have to be creative. The idea is to push each other further to create new ideas and technologies where you can make money.

    --
    "If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer ..."
  9. Been there, done that by Fux+the+Penguin · · Score: 4, Informative

    Okay, yes, I fully admit that it's cool to map the internet in one day. Regardless...I think I hear about some internet map every other day.

    There's John Quarterman who's been doing it for years, and then the CAIDA visualization tools, and Cybergeography and the Internet weather report and damn maps and more maps.

    Note to everyone: please stop mapping the internet.

    1. Re:Been there, done that by jd · · Score: 2, Interesting
      I believe this new project uses the CAIDA tools. The maps look like the output from their Java-based network mapping package.


      However, it looks like it's one map a week, not one a day, and that's only with more power. Based on the charts on the site, it's going to take between 3 and 4 months to map a decent portion of the Internet, and he's only going to Class C resolution.


      Further, he's mapping as a spanning-tree. This means that tunnels, load-balancing and multipath connections cannot be shown at all.
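
To make the spanning-tree point concrete, here is a small sketch with invented hop names. It assumes the tree is built by keeping only the first parent seen for each node, which is one simple way to get a tree out of traceroute output and may not be how this project does it.

```python
from typing import Dict, Iterable, Sequence, Set, Tuple

Edge = Tuple[str, str]


def graph_edges(paths: Iterable[Sequence[str]]) -> Set[Edge]:
    """Every hop-to-hop link seen in any traceroute path."""
    edges: Set[Edge] = set()
    for path in paths:
        edges.update(zip(path, path[1:]))
    return edges


def tree_edges(paths: Iterable[Sequence[str]]) -> Set[Edge]:
    """Keep only the first parent seen for each node, which yields a tree."""
    parent: Dict[str, str] = {}
    for path in paths:
        for upstream, downstream in zip(path, path[1:]):
            parent.setdefault(downstream, upstream)
    return {(up, down) for down, up in parent.items()}


if __name__ == "__main__":
    # Two traces that reach r3 through different routers (multipath).
    traces = [("src", "r1", "r3", "dst"), ("src", "r2", "r3", "dst")]
    print(graph_edges(traces) - tree_edges(traces))  # the link the tree drops
```

In this toy case the r2-to-r3 link exists in the full edge set but disappears from the tree, which is the kind of loss described above.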


      Also, it only shows IPv4 unicast nodes, so you don't see any IPv6 or multicast paths.


      Networks that are gatewayed, NATted or otherwise complicated, will also not show up. It would be good if gateway nodes where the other side of the gateway cannot be seen were marked, so that we can see where obscured parts of the net are.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  10. Slashdotted ! by Goody · · Score: 4, Funny

    Well, there's one less server to map...

    --
    Tired of being "punished" by the Slashdot $rtbl since 2002. I'm now over at http://soylentnews.org/ .
  11. Re:It has to be asked.... by akiaki007 · · Score: 4, Informative
    Read the web-site and you would know. ... and quote...

    Mapping the Internet weekly will allow us to see major disasters in different parts of the world. The Internet is a huge disaster sensor. If I had maps of pre-war Iraq and then compared them to today, one could see how badly Iraq was destroyed. The idea of a metaphysical representation of the real world is very interesting to me.

    The project can show the Internet growth.

    The project is art.

    --
    "Time is long and life is short, so begin to live while you still can." -EV
  12. I can map that Internet in ... by burgburgburg · · Score: 3, Funny
    half a day with a broken computer, dial-up access and a guy with no hands.

    Top that!

    1. Re:I can map that Internet in ... by DrEldarion · · Score: 2, Funny

      That's nothing. My grandmother maps the internet during the commercials while watching Wheel of Fortune.

  13. Re:Are we overlooking something? by ViolentGreen · · Score: 3, Insightful

    I think he means that the program will take less than one day to completely map the internet. Not less than one day to write/compile/run.

    --
    Not everything is analogous to cars. Car analogies rarely work.
  14. Re:It has to be asked.... by DrEldarion · · Score: 5, Funny

    Forgive me if I'm wrong, but if we need the internet to tell us when a major disaster or war happens in a certain part of the world something is wrong.

  15. Re:It has to be asked.... by nate+nice · · Score: 2, Insightful

    A real-world map could have many uses. First, it is neat to see and learn from, as it shows the real structure of this inter-network of computers. Secondly, graph theorists could use it for research etc., as this is a real (as opposed to theoretical) graph, so it has real uses. From this graph theory, we could think of new ways to enhance the internet to make it more reliable, faster and more secure. Many things can come from looking at what we have put together and then using our analytic skills to hypothesize about it. I'm sure I'm missing 100 other reasons why this is good.

    --
    "If you are a dreamer, a wisher, a liar, A hope-er, a pray-er, a magic bean buyer ..."
  16. And one day... by Lugor · · Score: 2, Funny

    the Internet came to him! And he was no more.

  17. Creepy by Seanasy · · Score: 4, Interesting

    When I first saw the image on the right it looked like a human brain. It would be creepy if the Internet had a sort of fractal self-similarity to our physiology.

    1. Re:Creepy by Anonymous Coward · · Score: 2, Insightful

      How it appears graphically is decided by the person who translates the database to an image. They could make it S shaped if they really wanted. Not creepy.

    2. Re:Creepy by zangdesign · · Score: 2, Funny

      Following your metaphor - the internet's genitalia must be really huge then. Or at least the portion of the brain responsible for sex.

      --
      To celebrate the occasion of my 1000th post, I will post no more forever on Slashdot. Goodbye.
  18. Fantastic. by ToadSprocket · · Score: 2, Funny

    Now map the people mapping the internet in one day.

    --


    If this article confuses you, don't worry. It was posted yesterday in a much clearer fashion.
  19. It's too easy... by Fux+the+Penguin · · Score: 5, Funny

    A single guy with a single computer...

    He's mapping the Internet. Why am I not surprised he's single?

  20. rsync by bigjnsa500 · · Score: 2, Interesting

    Why can't somebody just rsync the Google search cluster? Wouldn't it have the same results this guy is looking for?

    --
    This is a test. This is a test of the emergency sig system. This has been only a test.
  21. Re:Great! by gregfortune · · Score: 2, Insightful

    That's the whole point. Existing methods take months while he claims it can be done in a single day with a single computer.

  22. Dude. by FreeLinux · · Score: 4, Funny

    Please explain how one pings a web page. Is this a feature of AOL?

    Web pages are NOT internet hosts.

    Web servers are relatively few compared with other types of hosts on the internet.

    The World Wide Web is NOT the internet.

    The World Wide Web is NOT the internet.

    The World Wide Web is NOT the internet.

    The World Wide Web is NOT the internet.

    1. Re:Dude. by freeweed · · Score: 2, Funny

      Please explain how one pings a web page.

      C:\>ping www.slashdot.org

      Pinging www.slashdot.org [66.35.250.151] with 32 bytes of data:

      Request timed out.
      Request timed out.
      Request timed out.
      Request timed out.

      C:\>

      He's right! You can't ping web pages!

      --
      Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.
  23. Internet Mapping Project does daily maps by hburch · · Score: 4, Informative

    As a side comment, now I understand why my connection got so slow.

    [Internet Mapping Project's] mapping also takes nearly six months to generate a single map. My comment was that, "I can write a program that can map the entire net in a single day."

    The Internet Mapping Project maps the Internet in under two hours (105 minutes for this morning's run). I'm not certain where the six months came from. The rate limitation is the packet rate limit we set (500 packets per second).
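
For scale, the two numbers above put a ceiling on how many packets one run can send. A sketch of that arithmetic, assuming the limit is sustained for the whole run (the actual average rate is not stated):

```python
def packet_budget(packets_per_second: int, minutes: float) -> int:
    """Upper bound on packets sent at a fixed rate limit over a run."""
    return int(packets_per_second * minutes * 60)


if __name__ == "__main__":
    # 500 packets/sec for a 105-minute run
    print(f"{packet_budget(500, 105):,} packets at most")
```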

    Map layout time is not included in that time, but that is not done on a daily basis. A map layout takes about six hours, as I recall. It only took a couple of weeks to produce all the layouts necessary for a movie of the Internet from Aug 1998 to Jan 2001 based on the daily runs.

    CAIDA also creates daily maps of the Internet as part of their Skitter project. Their schedule varies between measurement points. In addition, other projects, such as the Mercator project and the RocketFuel projects, also map or did map the Internet.

    Each project has slightly different goals. Skitter focuses on paths to major web and DNS servers. Mercator attempted to discover networks with limited pre-knowledge. RocketFuel wants a very accurate map of a particular ISP. The Internet Mapping Project is focused on the router connectivity within and between public backbones.

  24. Re:Lets face it by Stone316 · · Score: 2, Insightful
    Come on, be realistic. If a couple of guys can solve the same problem developing the software in their spare time and get results in a fraction of the time, then that company doesn't deserve to make any more money, or they should hire this guy.

    This is nothing new, you can find free software to solve just about any problem. People buy commercial software because in some cases free versions aren't advanced enough or easy enough to use or they want to buy support.

    --
    "Thanks to the remote control I have the attention span of a gerbil."
  25. I have considered something similar by fv · · Score: 4, Interesting
    As the author of the free Nmap ("Network Mapper") tool, I have also considered creating a map of the entire Internet. I would have focused on end hosts (where they are, what operating systems and services they run, trending, etc.) instead of routing. Rather than try this from a single high-bandwidth machine (as with Opte), I was going to take a distributed approach. I would release a P2P-like application that users could run and each scan small sections of network space to be contributed to the global database. The app would be called Nmapster :). I also liked to think about it as a "caching service", so that you don't have to spend the time rescanning the Microsoft network if someone else has done so in the last N hours.
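
The "caching service" idea amounts to a lookup with an expiry time. A minimal in-memory sketch follows; the class name, the string keys, and the injectable clock are illustrative assumptions and have nothing to do with Nmap's actual code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ScanCache:
    """Remembers scan results and forgets them after `max_age_hours`."""

    def __init__(self, max_age_hours: float,
                 clock: Callable[[], float] = time.time) -> None:
        self._max_age = max_age_hours * 3600   # seconds
        self._clock = clock                    # injectable for testing
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def put(self, network: str, result: Any) -> None:
        """Store a fresh result for a network, stamped with the current time."""
        self._entries[network] = (self._clock(), result)

    def get(self, network: str) -> Optional[Any]:
        """Return a cached result if it is recent enough, else None."""
        entry = self._entries.get(network)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at > self._max_age:
            del self._entries[network]         # stale: force a rescan
            return None
        return result
```

A client would call `get` first and only rescan when it returns None, which is the "last N hours" behaviour described above.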

    Then I came to my senses and decided to work on more practical and less controversial projects such as Nmap Version Detection. But the subversive in me still hasn't given up entirely on Nmapster :).

    -Fyodor

  26. Brilliant! by Quixadhal · · Score: 2, Funny

    You want to map the internet?

    1 Setup a site saying you want to map the internet.
    2 Get posted on slashdot.
    3 Parse the referer logs.
    4 ???
    5 Profit!

  27. Pfft! Kids today by freeweed · · Score: 2, Funny

    You kids are so spoiled today. Back in the 60s we used to be able to map the entire internet using nothing more than a piece of string and 2 pushpins.

    Huh? 2 nodes? Why the hell should that matter?

    --
    Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.
    1. Re:Pfft! Kids today by hburch · · Score: 2, Interesting
      We still have that picture.


      Since I linked to his site, I should mention that Martin Dodge has gathered a nice collection of maps of the Internet on his CyberGeography site, including many historical maps. CyberGeography also includes many other interesting types of maps.

  28. bad for business by glassesmonkey · · Score: 2, Interesting

    Maybe people already take this into consideration, but won't this impact webhosting? Won't people try to get their webpage/company closer to the main trunk / center of map? When you look for a hosting service (basically an IP address) right now most people don't consider where in the map the host is.

    I mean with this tool, I would look up where my new IP would land me and try to find a host closer to the main backbones. Is this already done now by most people?

    (on another subject the maps remind me of the species origin stuff)

  29. Hierarchy by sploxx · · Score: 2, Insightful

    Has anyone noticed that nearly all of the maps have a more or less tree-shaped structure?
    This means concentration of power. So, the real, failure-tolerant internet is gone, at least it seems to be.

    1. Re:Hierarchy by daves · · Score: 3, Interesting

      Has anyone noticed that nearly all of the maps have a more or less tree-shaped structure?

      No matter where you are on the net, your view is going to look like a tree with you at the center. Traceroute-type mapping will not capture the redundancies.

      --
      People who disagree with you are not automatically evil, greedy, or stupid.
  30. limited by the speed of light? by Elminst · · Score: 2, Interesting

    So if we assume the electrical signals of his packets travel at the speed of light (186 000 miles per second) across the internet (which they don't really, but we'll ignore that for this argument), then logic tells us that the internet must have less than 16,070,400,000 miles of cable in order for this to work. Because his data cannot travel any faster along the pipes.

    And that's only one way... Assuming query and response, his packets have to effectively travel double the existing cable lengths.

    So do all the (public) networks in all the world total less than 16 billion miles of cable?
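
The distance figure is just speed times the number of seconds in a day. A quick sketch of that arithmetic, using the 186,000 miles per second quoted above and taking the argument's own premise (one signal covering all the cable in sequence) at face value:

```python
SPEED_OF_LIGHT_MILES_PER_SEC = 186_000   # rounded figure from the post
SECONDS_PER_DAY = 24 * 60 * 60


def max_cable_miles(round_trip: bool = False) -> int:
    """Distance light covers in one day; halved if the signal must return."""
    one_way = SPEED_OF_LIGHT_MILES_PER_SEC * SECONDS_PER_DAY
    return one_way // 2 if round_trip else one_way


if __name__ == "__main__":
    print(f"one way:    {max_cable_miles():,} miles")
    print(f"round trip: {max_cable_miles(round_trip=True):,} miles")
```

That comes to about 16 billion miles one way, or about 8 billion if query and response both have to fit in the day.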

    --
    No unauthorized use. Trespassers will be shot. Survivors will be shot again.
  31. Re:Actually, no. by geoffspear · · Score: 2, Funny

    You don't need a map of the Internet... just watch for your spam to drop by about 90% and you know there's something wrong with the Chinese internet.

    --
    Don't blame me; I'm never given mod points.
  32. Internet Topology by Tacoguy · · Score: 2, Interesting

    I have followed various projects related to mapping cyberspace through the years and have always found An Atlas of Cyberspaces to be fascinating.

    Mapping by Lumeta is one such methodology and I even have a poster of theirs printed by Peacock Maps (server down just now) in my office.

    I have noticed that these mappings take a long time to complete and being able to map in a short time frame could be beneficial in much the same way that Internet Traffic Report can be to visualize traffic patterns or disruptions.

    Taco

  33. limited by the speed of IP by Roadkills-R-Us · · Score: 2, Funny

    No. They are limited by the speed of IP, which is not only slower, but its speed is random within a fairly large range. So to be safe, we have to assume the total cabling on the internet is (think, think, think) less than 3 meters.

  34. AS mapping would be more useful by cpghost · · Score: 2, Interesting

    Why not map Autonomous Systems instead? Routes to AS are being advertised by BGP, and a set of well placed looking glasses would be all it takes to get a big picture. I never saw anything like an AS mapping, with the ASes as nodes and the (BGP announced) routes between them as links.

    Of course, some AS span multiple geographical areas, but this is also true of class C networks.

    The big advantage of mapping ASes is, that there are not so many of them, compared to class C nets, thus resulting in much simpler graphs. Moreover, the graphs would nicely show the boundaries between institutions/organizations, rather than artificial boundaries based on numerical addresses.
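
As a rough sketch of the idea: given AS paths as they might be collected from looking glasses, the AS-level graph is the set of adjacent pairs in each path. The input format (space-separated AS numbers) and the sample paths are assumptions for illustration; the numbers come from the range reserved for documentation.

```python
from typing import Iterable, List, Set, Tuple

ASLink = Tuple[int, int]


def as_links(as_paths: Iterable[str]) -> Set[ASLink]:
    """Undirected AS adjacencies from space-separated AS paths."""
    links: Set[ASLink] = set()
    for path in as_paths:
        hops: List[int] = []
        for token in path.split():
            asn = int(token)
            if not hops or hops[-1] != asn:   # collapse AS-path prepending
                hops.append(asn)
        for a, b in zip(hops, hops[1:]):
            links.add((min(a, b), max(a, b)))  # store each link once
    return links


if __name__ == "__main__":
    sample = [
        "64496 64497 64499",
        "64496 64498 64498 64499",   # 64498 prepended once
        "64500 64497 64499",
    ]
    for link in sorted(as_links(sample)):
        print(link)
```

Each AS becomes a node and each pair a link, so the graph stays small compared with one node per class C network.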

    --
    cpghost at Cordula's Web.
  35. Let the mountain come to Mohamed by Porag_Spliffing · · Score: 2, Informative

    I see his trick already. Post on /. that you plan to map the entire net and then wait till the entire net maps its way to you.

    P.S.

    Is there such a thing as trecart ?

    --
    Maybe you live in interesting times
  36. The Internet according to Garp by Alomex · · Score: 2, Insightful

    Notice that he maps the paths from his computer to the rest of the world. That is not the same as a map of the entire Internet.

    To illustrate, if I map routes from, say Chicago, I'm likely to miss the direct connection between Seattle and San Francisco, as there is no traffic I could generate that would take that path.
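
The Chicago example can be reproduced on a toy three-city network. This sketch assumes traffic follows shortest hop-count paths, which is a simplification of real routing, and the topology is invented to match the example.

```python
from collections import deque
from typing import Dict, FrozenSet, Optional, Set

Link = FrozenSet[str]


def links_seen_from(graph: Dict[str, Set[str]], source: str) -> Set[Link]:
    """Links that lie on shortest paths from `source` (a breadth-first tree)."""
    parent: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(graph[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return {frozenset((child, up)) for child, up in parent.items() if up is not None}


# Toy topology: three cities, all directly connected to each other.
NETWORK = {
    "Chicago": {"Seattle", "San Francisco"},
    "Seattle": {"Chicago", "San Francisco"},
    "San Francisco": {"Chicago", "Seattle"},
}

if __name__ == "__main__":
    from_chicago = links_seen_from(NETWORK, "Chicago")
    combined = from_chicago | links_seen_from(NETWORK, "Seattle")
    print(len(from_chicago), "links seen from Chicago alone")
    print(len(combined), "links seen from Chicago and Seattle together")
```

From Chicago alone the Seattle to San Francisco link never shows up; adding a second vantage point in Seattle recovers it.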

  37. Re:Disturbing by ultranova · · Score: 2, Insightful

    Of course, this could simply be a matter of traffic using the fastest route available. If there's an information superhighway and an information dirt path, then as long as the superhighway stays up, it's going to be used.

    In other words, the low-level interconnects probably wouldn't show up in a scan like this, because the backbone nodes are faster. That doesn't mean they aren't there, just that data prefers the faster routes as long as they are available. There could be a million paths that don't include the backbone nodes, but traceroute only shows one (fastest) path per trace, and thus they would never show up as long as the backbone stays up. To interpret this to mean they don't exist is analogous to taking the same route to work each day and saying there's no other possible route, since you've never used any other route. But as soon as there's an accident that causes that main road to become useless, traffic will simply use alternative paths, slowing it down but not stopping it entirely.

    To properly map the Internet, you would need millions of volunteer nodes, making traceroutes near and far. You can _not_ map the Net from a single point of view, because that's exactly what you get: a single viewpoint, which might show some detail nearby, but only the major traffic points at the far side of the Net. To get truly accurate results, you'd need to run this program from every single one of the class C networks, and then combine the results.

    --

    Forget magic. Any technology distinguishable from divine power is insufficiently advanced.