Map the Internet... In One Day?
rjbrown99 writes "There have been numerous stories over the past few years on Bill Cheswick's Internet Mapping Project. The Lumeta folks even created a company out of it. Well, now there is a competitor. A single guy with a single computer is working to accomplish the same feat - within ONE DAY and using open-source tools to do it. The new project is called Opte and can be found at www.opte.org." He's made some progress and is looking for volunteers.
foolish mortal! time to drop a heavy lourde on him
Who
This project was started by me (Barrett Lyon) as a response to a conversation with my colleagues at Network Presence. Over a lunch we were discussing William Cheswick and Hal Burch's Internet Mapping Project. I was not very impressed with the results of their project: they produce beautiful maps, but the maps don't seem to be very useful, nor do they release their code freely. Their mapping also takes nearly six months to generate a single map. My comment was that, "I can write a program that can map the entire net in a single day." The comment was met with some hostility. Thus, this project was born.
What
The goal of this project is to use a single computer and single Internet connection to map the location of every single class C network on the Internet. It is obvious that the Internet is not routed as a bunch of class C networks, but it is easy to see that by treating the Internet IP space as a bunch of class C networks, it will be possible to make a detailed map of the entire Internet. The global Internet address space currently offers 32 bits worth of unique host addresses, or a theoretical maximum of 2^32=4,294,967,296 hosts. In reality, the address space has been allocated in fairly large contiguous blocks, which renders strictly optimal utilization difficult. The smallest block that is logically routed via BGP or allocated by ARIN is a class C network (CIDR /24).
At a rate of about 194 traceroutes per second it is possible to scan the entire theoretical 2^24 space within a single day. Thus about 16,777,216 class C networks could be processed by a single computer in a single day. Yet huge portions of that space are no longer used: many network blocks fall under RFC 1918, and other blocks are reserved by ARIN.
According to ARIN there are about 47 class A networks in the reserved status (search ARIN for OrgName "Internet Assigned Numbers Authority".) Doing the math results in a reduction of 3,080,192 class C blocks to be removed from the scan list, leaving us with a theoretical list of 13,697,024 blocks.
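The arithmetic in the two paragraphs above can be checked with a few lines of Python. This is only a sanity check of the quoted figures; the constant names are made up for the illustration, and the count of 47 reserved class A networks is taken from the text as given.

```python
# Sanity check of the scan-size figures quoted above.
TOTAL_CLASS_C = 2 ** 24                  # one /24 per 24-bit prefix
SECONDS_PER_DAY = 24 * 60 * 60

# Rate needed to touch every /24 once in 24 hours.
required_rate = TOTAL_CLASS_C / SECONDS_PER_DAY

# A class A (/8) holds 2^16 class C (/24) blocks.
CLASS_C_PER_CLASS_A = 2 ** 16
RESERVED_CLASS_A = 47                    # figure quoted from ARIN in the text
reserved_blocks = RESERVED_CLASS_A * CLASS_C_PER_CLASS_A
remaining_blocks = TOTAL_CLASS_C - reserved_blocks

print(f"required rate: {required_rate:.2f} traceroutes/sec")
print(f"reserved:      {reserved_blocks:,} class C blocks")
print(f"remaining:     {remaining_blocks:,} class C blocks")
```

The required rate comes out a little above 194 per second, so exactly 194 per second falls a few thousand blocks short of the full 2^24 in one day; the reserved and remaining counts match the text.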
With some additional thought, it becomes clear that large portions of the 13.7 million blocks may route to the same place. By testing about 20 routes at random within a class B and comparing the results, it is possible to see if there are multiple routes worth investigating or if the entire thing goes to the same place. Applying that logic increases the speed of the scanning.
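The sampling idea described above might look roughly like the sketch below. It is not the project's code: the function names are invented, and `trace` is a hypothetical callable standing in for whatever traceroute routine is used (it takes a target address and returns the list of hops).

```python
import ipaddress
import random
from typing import Callable, Optional, Sequence, Set, Tuple


def distinct_paths_in_class_b(
    network: str,
    trace: Callable[[str], Sequence[str]],   # hypothetical traceroute helper
    samples: int = 20,
    seed: Optional[int] = None,
) -> Set[Tuple[str, ...]]:
    """Trace `samples` random /24s inside a /16 and return the distinct paths."""
    net = ipaddress.ip_network(network)
    if net.prefixlen != 16:
        raise ValueError("expected a class B sized (/16) network")
    rng = random.Random(seed)
    subnets = list(net.subnets(new_prefix=24))        # the 256 /24s in the /16
    paths = set()
    for subnet in rng.sample(subnets, samples):
        target = str(subnet.network_address + 1)      # first host in the /24
        paths.add(tuple(trace(target)))
    return paths


def worth_full_scan(paths: Set[Tuple[str, ...]]) -> bool:
    """More than one distinct path suggests the /16 is not routed as one unit."""
    return len(paths) > 1
```

If every sampled /24 comes back with the same path, the whole /16 can be recorded once and skipped; otherwise it is scanned block by block. Comparing full hop lists is the simplest possible test; a real scanner would probably need to tolerate differences in the final hop.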
After some testing and beta code I proved that with enough bandwidth it is possible to scan the entire Internet with a single computer. The map of 1/5th of the Internet took only about 2 hours to create, yet it generated nearly 200k/sec of traffic and put my machine at a load of 60+ while scanning. If you apply the math, the entire Internet would take about 10 hours to scan and another hour or two for the visual map output.
I found a lot of value in the project, so after the proof of concept was completed I continued to program. I turned the entire system into a distributed client/server model. The clients request a chunk of random IP space from the server and when it is completed the IP space is registered with the server. This is done until all of the IP space has been scanned. I'm also working on a stats system so I can monitor the productivity of the different scanning nodes and users involved in the project.
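The request/register cycle described above could be sketched as follows. This is an in-memory illustration only; the class and method names are hypothetical, and a real server would also need networking, timeouts for clients that never report back, and the stats collection mentioned in the text.

```python
import random
from typing import Iterable, Optional


class ChunkServer:
    """Hands out chunks of IP space at random and records completed ones."""

    def __init__(self, chunks: Iterable[str], seed: Optional[int] = None):
        self._pending = list(chunks)      # not yet handed to any client
        self._assigned = set()            # handed out, result not yet registered
        self._done = set()                # scanned and registered
        self._rng = random.Random(seed)

    def request_chunk(self) -> Optional[str]:
        """Return a random unscanned chunk, or None when nothing is left."""
        if not self._pending:
            return None
        index = self._rng.randrange(len(self._pending))
        chunk = self._pending.pop(index)
        self._assigned.add(chunk)
        return chunk

    def register_done(self, chunk: str) -> None:
        """Mark a previously assigned chunk as scanned."""
        if chunk not in self._assigned:
            raise KeyError(f"chunk {chunk!r} was never assigned")
        self._assigned.remove(chunk)
        self._done.add(chunk)

    def finished(self) -> bool:
        """True once every chunk has been scanned and registered."""
        return not self._pending and not self._assigned
```

A client would loop on `request_chunk()`, scan what it is given, and call `register_done()` until the server returns `None`.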
By taking a more distributed approach the data will look more like the real Internet. It will show more of the backup routes, more of the smaller links in different countries, etc. When the first version of the code is done I should have about 5 to 10 different scanning nodes running on the Internet. If you would like to donate a computer and some bandwidth to this project, please contact me. I can give credit where credit is due!
When
The first scanning tests began in late October 2003 and I wish to have the project generate a new map every week.
Where
Currently the project is hosted in San Francisco on a multi-homed fiber ba
hello room
Several maps of the internet right here
Life is the leading cause of death in America.
You can't make money with computers anymore because some jackass is always trying to give away the same thing you're doing.
I am in serious need of more bandwidth and hardware power. If anyone has a Co-Located system on a nice network to donate to this project for a few months, I would be very happy!
Slashdotting was never easier!
Go past the burnt-out Cray and then right at the Commodore64 Contiki server - you'll see my drive lights.
Google Cache of Map
Bill Cheswick, Lumeta Corp.
Hal Burch, Lumeta Corp
The Internet Mapping Project
The Internet Mapping Project was started at Bell Labs in the summer of 1998. Its long-term goal is to acquire and save Internet topological data over a long period of time. This data has been used in the study of routing problems and changes, DDoS attacks, and graph theory.
In the fall of 2000, Ches and Hal moved to a spin-off from Lucent/Bell Labs named Lumeta Corporation. This company applies our topological discovery techniques to discover the perimeter of our clients' intranets.
The Internet Mapping Project continues at Lumeta. As a result, the Internet mapping host is changing. Since 1998, the trial packets came from
ches-netmapper.research.bell-labs.com, (204.178.16.36).
The same software will soon be running from a new host,
netmapper.research.lumeta.com, 65.198.68.56
In these troubled times, the scans may be run a bit more frequently than before, but still only one traceroute per announced or registered network.
Our test packets are lightweight and non-invasive. But if they are a concern to you, we will be happy to include the CIDR blocks you supply in a don't-scan list, if requested.
Introduction
This mapping consists of frequent traceroute-style anal probes, one to each registered Internet entity. From this, we build a tree showing the paths to most of the nets on the Internet. We have no interest in the specific endpoints or network services on those endpoints, just the topology of the center of the Internet.
These paths change over time, as routes reconfigure and the Internet grows. We are preserving this data, and plan to run the scans for a long time. The database should help show how the Internet grows. We think we can even make a movie of this growth someday.
The simple layout algorithm produces some nice maps.
Map gallery.
Recent raw Internet mapping data.
Some maps of Serbia showing damage during the war.
Maps
This data yields a large tree-like structure. It is not easy to lay out a tree with 100,000 nodes. (Standard graph-viewing programs have traditionally considered 800 nodes a large task.) Our programs jostle the nodes around according to half a dozen simple rules, simulating various springs and repelling forces. A typical layout run requires 20 CPU hours on a 400 MHz Pentium.
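The text does not say what the half dozen rules are, so the following is only a generic spring-and-repulsion step (in the style of a textbook force-directed layout), not the authors' program. The function name, the constant `k`, and the step size are all assumptions made for the illustration.

```python
import math
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float]


def layout_step(pos: Dict[str, Point], edges: Iterable[Tuple[str, str]],
                k: float = 1.0, step: float = 0.1) -> Dict[str, Point]:
    """One iteration: all pairs repel, linked nodes attract like springs."""
    disp = {n: [0.0, 0.0] for n in pos}
    nodes = list(pos)

    # Repelling force between every pair of nodes, strength k^2 / distance.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = k * k / d
            disp[a][0] += dx / d * f
            disp[a][1] += dy / d * f
            disp[b][0] -= dx / d * f
            disp[b][1] -= dy / d * f

    # Spring force along each edge, strength distance^2 / k.
    for a, b in edges:
        dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
        d = math.hypot(dx, dy) or 1e-9
        f = d * d / k
        disp[a][0] -= dx / d * f
        disp[a][1] -= dy / d * f
        disp[b][0] += dx / d * f
        disp[b][1] += dy / d * f

    return {n: (pos[n][0] + step * disp[n][0], pos[n][1] + step * disp[n][1])
            for n in pos}
```

The all-pairs repulsion loop is quadratic in the number of nodes, which gives some feel for why a 100,000-node layout takes many CPU hours.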
We have made some maps from this layout. A map helps us visualize things, to pick out points of interest, and find things that warrant closer inspection. Once the layout is computed, the map can be colored to show a number of things. We don't try to lay out the Internet according to geography; people like John Quarterman are working on that. Besides, the Internet is its own space.
The layout can be colored in many ways: with geographical clues, network capacity, etc. An Internet atlas would be interesting. We currently have maps colored by distance from the test host, IP address, and geographic region.
These maps are quite smashing, if we do say so ourselves. The December 1998 issue of Wired Magazine has the layout generated from data collected in mid-September. Hal generated a color scheme based on the IP address of the nodes. This sick idea (Excuse me, may I have a prettier Internet address please?) creates a color scheme that seems to match Wired's traditional typography. But it actually does show communities that share similar network addresses.
Here is a .gif of the layout appearing in Wired.
Where are you on the Wired map? Don't ask. With nearly 100,000 nodes on the map, an index would be a huge sea of small type. Perhaps we'll make a web page where you can look it up some day.
We are actively working to make some versions of these maps available commercially as posters and perhaps other items.
Uses
This data has a number of uses, including co
There's a Mercedes gap too. I want one and can't afford one, but it's not government's job to do anything about it.
So he's made progress and needs volunteers, so, uh, forgive me if I sound stupid, but, uh, it's been more than ONE DAY!!
This is a test. This is a test of the emergency sig system. This has been only a test.
...his web server is already unavailable within minutes of it being posted on Slashdot...
Mapping...Slashdot.org......
I, for one, welcome our new cybercartographic ov-- GOD SHUT THE HELL UP ALREADY
but goatse.cx swallowed all the traffic
Exactly why do we need a "map" of the Internet?
Life in Orange County
IP Address: 127.0.0.1
Computer: The one from Microsoft with the Start button in the bottom left hand corner.
Location: my bedroom.
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
SCO IPs are in the Mordor address space.
But with the always evolving nature of the Internet, this would need to be updated every day.
I think that last one is either wrong or way way in the future.
"Not knowing when the dawn will come, I open every door." - Emily Dickinson
Someone said that hell is the impossibility of reason. Slashdot is hell.
Okay, yes, I fully admit that it's cool to map the internet in one day. Regardless...I think I hear about some internet map every other day.
There's John Quarterman who's been doing it for years, and then the CAIDA visualization tools, and Cybergeography and the Internet weather report and damn maps and more maps.
Note to everyone: please stop mapping the internet.
#/usr/sbin/traceroute *.*.*.*
Well, there's one less server to map...
Tired of being "punished" by the Slashdot $rtbl since 2002. I'm now over at http://soylentnews.org/ .
...*cough* Like handling the readership size of a /. news story :)
Mad Hatter
That must be the nastiest thing I have ever seen.
...It would probably be faster if you spend a day taking a picture of the internet yourself... Btw, why don't we permanently slashdot SCO's servers? If it happened, SCO would get a BIG bandwidth bill, and they would be unable to afford a good lawyer ;)
What do you do with it ?
Top that!
Well, I guess that map of the Internet has one less location to worry about now.
Where's my lobbyist? Right here.
SO why doesn't he write a downloadable client that can map the Internet. It will then be possible to let a computer map a specific ip-range close to its location. That will be faster than doing it from one location.
get a load of this
[o]_O
Just look at this picture. You're not a heterosexual man if you don't get a hard-on. Look at that wicked smile! She's the perfect dominatrix. Don her in black, tight leather corset and have her whip my ass with a riding crop and I'd be coming like there'd be no tomorrow...
Do any of you moderators notice that the word "anal" shows up in that so-called "mirror" comment? Only the poster (called TrollBridge) knows how much else that has been changed from the original...
the Internet came to him! And he was no more.
When I first saw the image on the right it looked like a human brain. It would be creepy if the Internet had a sort of fractal self-similarity to our physiology.
is more about geolocation than mapping, but I guess I deserve at least a passing mention :-)
Simon.
Physicists get Hadrons!
We just made his job easier. There is one less web server to map now!
Now map the people mapping the internet in one day.
If this article confuses you, don't worry. It was posted yesterday in a much clearer fashion.
Assume 1,000,000,000 web pages.
Assume average ping time is one milisecond (10^-3)
1,000,000,000*(0.01) = 1,000,000 (seconds)
1,000,000,000/60 = 16666.6 (minutes)
16666.6/60 = 277.7 (hours)
277.7/24 = 11.5 (days)
Remember, this is only to PING every page, not transfer/parse each page to find sub-pages.
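The working above mixes 0.01 with 10^-3 and divides the wrong number at one step, although the final figure is about right for the stated one millisecond. Taking that one-millisecond assumption at face value, and probing strictly one target at a time, the chain can be redone as below (the variable names are just for the illustration):

```python
# Serial probe time: one target after another, 1 ms each (figures from the comment).
TARGETS = 1_000_000_000
PROBE_MS = 1

total_seconds = TARGETS * PROBE_MS // 1000   # 1,000,000 seconds
minutes = total_seconds / 60
hours = minutes / 60
days = hours / 24

print(f"{total_seconds:,} s = {minutes:,.1f} min = {hours:,.1f} h = {days:.2f} days")
```

This only holds if each probe waits for the previous one to finish.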
It's a horrible picture!
It'd be even better if he could overlay a map of the world so that we could easily identify regions.
And I thought goatse.cx was the pinnacle of anal-induced shock. This definitely tops goatse.cx as top trolling material.
All trolls. Please use this picture in the future.
A single guy with a single computer...
He's mapping the Internet. Why am I not surprised he's single?
It would be creepy if the Internet had a sort of fractal self-similarity to our physiology.
....
Agreed.
Good material for an X-Files episode
-kgj
-kgj
Why can't somebody just rsync the Google search cluster? Wouldn't it have the same results this guy is looking for?
This is a test. This is a test of the emergency sig system. This has been only a test.
How is it possible to map something that is always changing, and what use is such a map, if it can be created?
What about the reality that all nodes are no longer created "equal," so to speak?
Oh gosh, just one semicolon out of place...
Uh... is that 21st Century Math? Crap. My kids are going to come home from school and I won't be able to help them with their homework.
Read the EFF's Fair Use FAQ
"I knew I should have taken that left at Albuquerque." -- Bugs Bunny
This is a test. This is a test of the emergency sig system. This has been only a test.
Patent this asap before Amazon gets their grubby little fingers on it! =)
Good luck finding me! Even my boss doesn't have a clue where I am!
MMORPG Fan? Prove your worth!
it sounds like a William Gibson novel, the one with the guy obsessed over the "form" or "shape" of cyberspace being a "snapshot" of the universe. i can't seem to remember the name anymore
M$ Internet MapPoint 2003
Exactly why do we need a "map" of the Internet?
Because it is there.
http://www.techweb.com/printableArticle?doc_id=TWB19991013S0007
When he finishes the map it will already be outdated and not representative of the truth. However, this is not a real issue.. one day (or ten hours) is better than anything else
__
Sig: Marine Stock Photos
Why bother mapping it, just post a link on /. and we've already sent a majority of the internet straight to him.
We should just stand in line, take a number, and tell him the path we took to get there.
She should have stopped by here first http://www.archive.org "The Internet Archive Wayback Machine contains over 300 terabytes of data and is currently growing at a rate of 12 terabytes per month. This eclipses the amount of text contained in the world's largest libraries, including the Library of Congress. If you tried to place the entire contents of the archive onto floppy disks (we don't recommend this!) and laid them end to end, it would stretch from New York, past Los Angeles, and halfway to Hawaii." http://www.archive.org/about/faqs.php?
He doesn't have to wait for one to respond to send another request. It's called parallelism and computers are good at it these days.. well, some.
"Thanks to the remote control I have the attention span of a gerbil."
Please explain how one pings a web page. Is this a feature of AOL?
Web pages are NOT internet hosts.
Web servers are relatively few compared with other types of hosts on the internet.
The World Wide Web is NOT the internet.
The World Wide Web is NOT the internet.
The World Wide Web is NOT the internet.
The World Wide Web is NOT the internet.
As a side comment, now I understand why my connection got so slow.
[Internet Mapping Project's] mapping also takes nearly six months to generate a single map. My comment was that, "I can write a program that can map the entire net in a single day."
The Internet Mapping Project maps the Internet in under two hours (105 minutes for this morning's run). I'm not certain where the six months came from. The rate limitation is the packet rate limit we set (500 packets per second).
Map layout time is not included in that time, but that is not done on a daily basis. A map layout takes about six hours, as I recall. It only took a couple of weeks to produce all the layouts necessary for a movie of the Internet from Aug 1998 to Jan 2001 based on the daily runs.
CAIDA also creates daily maps of the Internet as part of their Skitter project. Their schedule varies between measurement points. In addition, other projects, such as the Mercator project and the RocketFuel projects, also map or did map the Internet.
Each project has slightly different goals. Skitter focuses on paths to major web and DNS servers. Mercator attempted to discover networks with limited pre-knowledge. RocketFuel wants a very accurate map of a particular ISP. The Internet Mapping Project is focused on the router connectivity within and between public backbones.
I have actually an Al-Qaida network runnig in my bedroom. Its ok for me if you send me your patriotic brainwashed friends over here to nuke the entire country patriots are idiots!
Because we can.
You sure you're on the right website?
Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.
...when featured on Slashdot, now, is he? =^^=
This sig no verb.
> He's made some progress and is looking for volunteers.
Yeah, I'm always looking for additional ways to waste my time on pointless projects for free.
Then I came to my senses and decided to work on more practical and less controversial projects such as Nmap Version Detection. But the subversive in me still hasn't given up entirely on Nmapster :).
-Fyodor
Ha.Ha.Ha..Ha...Ha....
litigatorIE scams?
not so, reports the pateNTdead eyecon0meter kode base.
the planet is a little short of genuine heros nowadaze, so by the creators' mandate, the light bringers will protect the core developers of the gnu millennium, naturally.
we don't need any more fauxking fraudulent billyonerrors & torvalds doesn't need any corepirate nazi 'protection'. actually, it's quite the opposite, as they need his cooperation to survive their previous/ongoing greed/fear/ego based poor judgemeNT calls/bets.
& you won't need any phonIE payper liesense gadgets, not even a model rocket cam, to be able to sense which way the winds of change are bullowing at gale force/farce.
Due to excessive bad posting from this IP or Subnet, comment posting has temporarily been disabled. If it's you, consider this a chance to sit in the timeout corner. If it's someone else, this is a chance to hunt them down like with fuddles' corepirate nazi bouNTy hunters ?pr? scams. If you think this is unfair, we don't care.
You want to map the internet?
1 Setup a site saying you want to map the internet.
2 Get posted on slashdot.
3 Parse the referer logs.
4 ???
5 Profit!
You kids are so spoiled today. Back in the 60s we used to be able to map the entire internet using nothing more than a piece of string and 2 pushpins.
Huh? 2 nodes? Why the hell should that matter?
Endless arguments over trivial contradictions in books written by ignorant savages to explain thunder in the dark.
(tm) devise, fails again?
what a surprise?
Does anyone else worry that if he is successful and the code is released that this will significantly slow network traffic? Just think about 100,000 people (may be a conservative number) all trying to map the internet at the same time. Would this effectively result in a huge DoS attack? Plus none of these maps would be complete because they are all competing with each other and it will make it even harder to access some sites.
Is this possible???
If it is, then I think that this is one instance where it will be in everyone's interest to not have any kind of release of this product and naturally keep the source closed.
Not everything is analogous to cars. Car analogies rarely work.
http://slushdot.org/mirror/opte/
I think it's an outstanding idea. Time and time again people come up with breakthroughs in technology after they discover a different way of thinking. Seeing something in a different way is often the first step in that process. This kind of thing would be an ideal candidate for a SETI@home-like solution where people use their screen savers to map their local area. Imagine how cool it would be to see the throughput over that map as well.
aka mapping@home
4Z5TX
Maybe people already take this into consideration, but won't this impact webhosting? Won't people try to get their webpage/company closer to the main trunk / center of map? When you look for a hosting service (basically an IP address) right now most people don't consider where in the map the host is.
I mean with this tool, I would look up where my new IP would land me and try to find a host closer to the main backbones. Is this already done now by most people?
(on another subject the maps remind me of the species origin stuff)
a regular wolverine in a penguin suit, we'll bet.
he doesn't appear to be afraid to speak his mind, either.
none so dedicated as volunteers, they say?
Here's a link to a page which links to these and other similar projects
Actually, it's kind of interesting. It would let us see into traditionally restrictive places like China.
It would be very interesting to know that a major portion of the Chinese Internet infrastructure went down, when it happened.
tasks(723) drafts(105) languages(484) examples(29106)
PARENT IS POSIBLE TROLL
how can this guy map all the porn in one day?
I have been working on this for years now.
If the code for this is distributed, then anyone on the internet can scan the entire internet for some variation on this purpose.
(shiver)
Perhaps a centralized open database would be a good idea.
People who disagree with you are not automatically evil, greedy, or stupid.
Has anyone noticed that nearly all of the maps have a more or less tree-shaped structure?
This means concentration of power. So, the real, failure-tolerant internet is gone, at least it seems to be.
You've uncovered SkyNet!
--
But then again I thought VCR+ was a stupid idea and would die a quick death--so what do I know?
geez, why is this news even up here at smashblot?
:)))
/. the next stories url instead :))))
if that dude claims he'd need only one day, so why isnt he long done by now.
jeez, even i couldnt think of a more stupid aproach and a more silly marketing blah blah than him
leave that server of his alone, cuz there is nothing new to his stuff any more. since he is long done by now, move along people, and
l8r
subject sess all..
This Picture has two hosts out of place, one on the far right and the other far bottom, with the billions? of others all next to each other..
Why?
Browse at -1, because trolls are often the most creative part of
I'm mapping teenkelly.com right now!
Quick, aren't you?
So if we assume the electrical signals of his packets travel at the speed of light (186 000 miles per second) across the internet (which they don't really, but we'll ignore that for this argument), then logic tells us that the internet must have less than 16,070,400,000 miles of cable in order for this to work. Because his data cannot travel any faster along the pipes.
And that's only one way... Assuming query and response, his packets have to effectively travel double the existing cable lengths.
So do all the (public) networks in all the world total less than 16 billion miles of cable?
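The bound above is simple to reproduce. The sketch below uses the same round figure for the speed of light as the comment and the same simplification that packets travel at that speed; the names are invented for the example.

```python
# Distance light covers in one day, using the comment's figure of 186,000 miles/sec.
SPEED_OF_LIGHT_MILES_PER_SEC = 186_000
SECONDS_PER_DAY = 24 * 60 * 60

one_way_miles = SPEED_OF_LIGHT_MILES_PER_SEC * SECONDS_PER_DAY   # 16,070,400,000
round_trip_miles = one_way_miles // 2                            # query plus response

print(f"one way:    {one_way_miles:,} miles")
print(f"round trip: {round_trip_miles:,} miles")
```

That is roughly 16 billion miles one way, or about 8 billion if every probe has to go out and come back, and it only applies to probes sent strictly one after another.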
No unauthorized use. Trespassers will be shot. Survivors will be shot again.
Happy Halloween everyone!
Sounds like a waste of bandwidth, hurrah...
MD5 (gnupg-1.2.3.tar.bz2) = cdca1282d7901f9ddb52f9725b001af2
indeed!
And the muscular cyborg German dudes dance with sexy French Canadians
I have followed various projects related to mapping cyberspace through the years and have always found An Atlas of Cyberspaces to be fascinating.
Mapping by Lumeta is one such methodology and I even have a poster of theirs printed by Peacock Maps (server down just now) in my office.
I have noticed that these mappings take a long time to complete and being able to map in a short time frame could be beneficial in much the same way that Internet Traffic Report can be to visualize traffic patterns or disruptions.
Taco
OK, so you map the ever-changing net in a day.
A week later, you map it again. Eventually, you're mapping it every day. After a year or two of that, you have a cool little animation of how the internet changed. You project it on the wall of a dark room, and watch it loop, and go "wow".
We know the real reason you didn't do it. The RIAA scared you, didn't they, when they showed you their copyright on the IP address scheme...
They should sell their data to DoubleClick. They could serve geography-sensitive banner ads! If they know you live in San Francisco and you are visiting a food web site, they could serve up banner ads for local San Francisco restaurants.
I think there's a company called MaxMind GeoIP that already does this.
cpeterso
No. They are limited by the speed of IP, which is not only slower, but its speed is random within a fairly large range. So to be safe, we have to assume the total cabling on the internet is (think, think, think) less than 3 meters.
You have to be one humorless ignorant moderator to consider this flamebait. The guy made a type of joke that is all too common around slashdot, which is to misinterpret the title's meaning on purpose.
OK, the joke is not that funny, but flamebait?! More like "moderator is a moron."
I'd be interested in seeing a real global world map with the locations of servers pinpointed on the map to show the density of computer equipment around the globe. Actually, it wouldn't even need the real map to exist: if all the points of light representing computer servers were placed in their proper geographic locations, I bet you'd get a very good mapping of the world. In fact, it would probably look similar to the famous map of the world at night, where the lights from industrialized countries create a spectacular image of the developed world.
Does such a map exist? Is somebody working on one?
--
RumorsDaily
What's disturbing about the current map thus far is that it clearly shows how CENTRALIZED the internet really is. This old idea of traffic routing around damage is in fact a rather fragile network of a handful of backbone nodes. I would have expected more lower hierarchical nodes crisscrossing the network, forming more of a spiderweb system, rather than everything going across 3 or 4 nodes.
www.enthea.org
Why not map Autonomous Systems instead? Routes to AS are being advertised by BGP, and a set of well placed looking glasses would be all it takes to get a big picture. I never saw anything like an AS mapping, with the ASes as nodes and the (BGP announced) routes between them as links.
Of course, some AS span multiple geographical areas, but this is also true of class C networks.
The big advantage of mapping ASes is, that there are not so many of them, compared to class C nets, thus resulting in much simpler graphs. Moreover, the graphs would nicely show the boundaries between institutions/organizations, rather than artificial boundaries based on numerical addresses.
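As a rough illustration of the AS-level idea above, the sketch below turns a handful of BGP AS paths into an adjacency map with ASes as nodes. The function name is invented, the paths are made up, and the AS numbers come from the range reserved for documentation; real input would come from looking glasses or route collectors.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Set


def as_graph(as_paths: Iterable[Sequence[int]]) -> Dict[int, Set[int]]:
    """Build an undirected AS adjacency map from a collection of AS paths."""
    adjacency: Dict[int, Set[int]] = defaultdict(set)
    for path in as_paths:
        # Collapse AS-path prepending (the same AS repeated back to back).
        hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        for left, right in zip(hops, hops[1:]):
            adjacency[left].add(right)
            adjacency[right].add(left)
    return adjacency


# Made-up example paths using documentation AS numbers.
example_paths = [
    [64496, 64497, 64497, 64499],   # 64497 prepends itself once
    [64496, 64498, 64499],
]
graph = as_graph(example_paths)
for asn in sorted(graph):
    print(asn, "->", sorted(graph[asn]))
```

Each adjacency here is just "these two ASes appeared next to each other in some announced path", which is the kind of link the comment has in mind.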
cpghost at Cordula's Web.
The problem you allude to is believed to be responsible for the power-law behavior of the Internet. If you look at the distribution of degrees, there are more highly-connected nodes than there should be if the graph were random. The distribution can be explained if people are more likely to connect to nodes that have high degree already.
On the other hand, these maps are not the cause for either of the behaviors above. These maps generally only show IP-level connectivity, ignoring link-layer tunneling, which can be very important. In addition, you have to additionally consider latency, loss rates, and bandwidth at least to some extent. Pure hop-count is what these maps show, and that is only a decent prediction of performance, not a great one (like clock rates for processor performance, if that helps at all).
There are other factors that go into location selection. One such factor is which machines will you talk with. You do not care much about your connectivity to hosts in Norway if you are running a US-only business. Another example factor is price. Few people are willing to pay for a T1 across North America to improve your speed by 10%. While you could put your computers in a co-lo instead, that only helps for servers and incurs yet more costs.
The maps are nice representations, but, generally, more analysis is necessary before useful data can be extracted from them, including computing the best location to connect to the network.
All that said, yes, most people connect to highly-connected nodes. They just generally estimate those nodes, rather than doing direct measurements.
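The "connect to nodes that already have high degree" explanation can be illustrated with a toy simulation. This is only a sketch under simple assumptions (each new node adds exactly one link, and `grow_network` is a made-up name); it is not a model of how real networks are provisioned.

```python
import random
from collections import Counter

def grow_network(n_nodes, rng, preferential=True):
    """Grow a graph one node at a time; each new node adds one link."""
    degree = Counter({0: 1, 1: 1})   # start from a single link 0-1
    endpoints = [0, 1]               # node i appears degree[i] times in this list
    for new in range(2, n_nodes):
        if preferential:
            # Picking a random link endpoint selects a node with
            # probability proportional to its current degree.
            target = rng.choice(endpoints)
        else:
            # Baseline: every existing node is equally likely.
            target = rng.randrange(new)
        degree[new] += 1
        degree[target] += 1
        endpoints.extend((new, target))
    return degree

rng = random.Random(1)
pref = grow_network(5000, rng, preferential=True)
unif = grow_network(5000, rng, preferential=False)
print("max degree, preferential:", max(pref.values()))
print("max degree, uniform:     ", max(unif.values()))
```

The preferential run ends up with a few very highly connected hubs, while the uniform run does not, which is the qualitative difference the degree distribution is meant to capture.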
I see his trick already. Post on /. that you plan to map the entire net and then wait till the entire net maps its way to you.
P.S.
Is there such a thing as trecart ?
Maybe you live in interesting times
"Got to see the whole net
From Yahoo on down to eBay--
In just one day!"
Notice that he maps the paths from his computer to the rest of the world. That is not the same as a map of the entire Internet.
To illustrate, if I map routes from, say Chicago, I'm likely to miss the direct connection between Seattle and San Francisco, as there is no traffic I could generate that would take that path.
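A small sketch of that point, assuming a toy three-city topology and approximating traceroute by hop-count shortest paths (real routing follows policy, not pure hop count). The function name `edges_seen_from` is made up for this example.

```python
from collections import deque

def edges_seen_from(graph, source):
    """Return the links on one shortest-path (hop-count) tree rooted at source."""
    parent = {source: None}
    queue = deque([source])
    seen = set()
    while queue:
        node = queue.popleft()
        for nbr in sorted(graph[node]):
            if nbr not in parent:
                # First time we reach nbr: this is the link a probe would cross.
                parent[nbr] = node
                seen.add(frozenset((node, nbr)))
                queue.append(nbr)
    return seen

# Toy topology: all three cities are directly connected to each other.
topology = {
    "Chicago": {"Seattle", "San Francisco"},
    "Seattle": {"Chicago", "San Francisco"},
    "San Francisco": {"Chicago", "Seattle"},
}
all_links = {frozenset((a, b)) for a in topology for b in topology[a]}
missed = all_links - edges_seen_from(topology, "Chicago")
print([sorted(link) for link in missed])  # [['San Francisco', 'Seattle']]
```

Probing from more vantage points (Seattle, say) is what recovers the missing link.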
Until it features a big arrow that says "You are here!" I'm not interested.
Since we all like pretty graph pictures go over to: http://networkviz.sourceforge.net/ and look at the packages out there. Many of these need help, so don't hesitate to offer your services if you like graphing. Most of these would be able to view these internet graphs interactively, which would be far more exciting than just pictures.
If he's "already made progress" doesn't that defeat the "map the Internet in one day" promise... just think about that ;).
his day is up!
Conformity is the jailer of freedom and enemy of growth. -JFK
A somewhat similar effort, the semi-private Internet Auditing Project by Liraz Siri (for which BASS was written) five years ago (only 36,431,374 hosts, mind you), took twenty days with five scanning nodes. I highly doubt today's Internet could be scanned in one day with a single host. Remember that this single host will be attacked, as Liraz Siri's hosts were:
The keyword here is "backups." Remember that when scanning the entire Internet you will step on someone's toes.
(By the way, it's good that this story was posted on Slashdot, since I could be the one counterattacking them and making an idiot out of myself --- not that it has ever happened before...)
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
Web servers are relatively few compared with other types of hosts on the internet?
Really? Compared to what? Routers and switches? Access servers? Storage devices? Perhaps you mean that web servers account for a relatively small amount of IP address space? A single access server can accommodate an awful lot (maybe the equivalent of a /20's worth) of distributed transient users, but for the purpose of this particular mapping exercise that's irrelevant -- the map shows (network) centrality rather than geographic location...
In a sense, the results of the project do seem to match earlier research on the topology of the web; at a glance, the graph arrived at does seem to be scale-free in nature.
Which actually raises an interesting question. Scale-free networks, by their nature, are supposed to have certain highly connected nodes whose connectivity is extremely critical to the network as a whole.
In particular, look at the resultant graph for one-third of the net. Note the single link in the middle between two nodes that seems to connect all four sub-trees together. Now imagine that link being, say, DDoS'ed. (You can see it in the one-fifth-of-the-net graph as well; only, it's more clear here)
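One way to check for such a single point of failure in a graph is to look for links whose removal disconnects it. The sketch below is a brute-force version on a made-up six-node topology; the node names and the `critical_links` helper are assumptions for illustration, and it says nothing about the actual Opte data.

```python
def is_connected(nodes, links):
    """True if every node is reachable from the first one over the given links."""
    adj = {n: set() for n in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    start = next(iter(nodes))
    stack, seen = [start], {start}
    while stack:
        for nbr in adj[stack.pop()]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == len(nodes)

def critical_links(nodes, links):
    """Links whose removal alone disconnects the graph (brute force)."""
    return [l for l in links
            if not is_connected(nodes, [m for m in links if m != l])]

# Two small clusters joined by a single hub-to-hub link.
nodes = ["a1", "a2", "hubA", "hubB", "b1", "b2"]
links = [("a1", "hubA"), ("a2", "hubA"), ("a1", "a2"),
         ("hubA", "hubB"),
         ("hubB", "b1"), ("hubB", "b2"), ("b1", "b2")]
print(critical_links(nodes, links))  # [('hubA', 'hubB')]
```

Keep in mind that a link that looks critical in a single-vantage-point map may have alternatives the map simply never observed.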
(Additional points for all you neurologists out there:- we've been comparing the structure of the human brain with that of the Internet, do you know of any such neurons?)
[Even more points:- Will you tell the world if you've found one? :-) ]
More than mere navel gazing.
By the time he finds us, the 24 hours will be up.
The support that the world provides to projects like this makes me feel better as a human.
Idiot. Who's paying for this nonsense? Taxpayers as usual?
The stuff's not even pretty.
Academic masturbation.
the Internet makes a map out of you.
Bush is on fire and it's not good for my lungs.
...I'd be more busy fapping the whole Internet.
In fact, BRB.
The maps at http://idl.net/MAP seem different from the brain tree displays. Lots of squares from DNS connections, I guess. There's some tracert ability. Anyone know more about this method?
> When I first saw the image on the right it looked
> like a human brain. It would be creepy if the
> Internet had a sort of fractal self-similarity to
> our physiology.
Oh, God, no! I don't want to know what the GAPING HOLE of unused address blocks looks like!
"If anyone has a Co-Located system on a nice network to donate to this project for a few months, I would be very happy [because I am mapping the Internet in one day as a single guy with a single computer without any help from anyone!]" --- WTF?
I can see you have a whole class A network on 127.0.0.0/8 ---
*runs "nmap -v --randomize_hosts -p1- -O -T Insane 127.0.0.0/8" and goes to make an espresso*
It's not clear to me where the idea came from that it takes
us 6 months to map the Internet. Our daily run takes
an hour or two. We do not "expand"
the search to /24s on the Internet to limit consternation
of the scannees.
I'd be interested in seeing the layouts. The last
time I looked Steve North's stuff couldn't handle
a dataset of this size, but that was a long time ago.
Others are collecting data that is probably more useful
than ours on the Internet. Check out CAIDA's work
and especially Rocketfuel.
Our bread-and-butter is scans of intranets, which tend to
be smaller, but need to have the data from several points
integrated into one data set.
We are still collecting the IMP data, and now have
about five years' worth of nearly continuous data.
ches
We need an 'Admin Appreciation Day', where all the admins pull the plug on the main systems and let the redundancies do their work. That way we can get maps of those too :).
GPLv2: I want my rights, I want my phone call! DRM: What use is a phone call, if you are unable to speak?