98% of DNS Queries at the Root Level are Unnecessary
LEPP writes "Scientists at the San Diego Supercomputer Center found that 98% of the DNS queries at the root level are unnecessary. This doesn't even take into account that 99.9% of web pages suck or are unnecessary anyway. This means that the remaining 2% of necessary DNS queries are probably not necessary either."
And they assumed the other 12 were exactly the same? Wouldn't looking at 2 at least be merited?
On a similar note, I noticed that AOL causes a lot of DNS lookups. From what I can see from my firewall logs, each TCP connection from an AOL user is handled by a separate proxy. Each proxy then does its own lookup on the host. So, for a normal sized webpage with some images or whatever, you get like 10 TCP connections for the content and 10 UDP connections for the DNS lookup. Seems kind of excessive to me.
From the article:
"Researchers believe that many bad requests occur because organizations have misconfigured packet filters and firewalls, security mechanisms intended to restrict certain types of network traffic. When packet filters and firewalls allow outgoing DNS queries, but block the resulting incoming responses..."
It's nice to see a story with info I can take and use. This is actually "stuff that matters".
Kudos to the researchers, and now I am off to check my firewall.
There are 01 kinds of cars in the world. The General Lee, and everything else.
Is it just me, or is this a description of a reverse lookup? How does that qualify as unnecessary? This is a pretty common step in troubleshooting, and some software does a reverse lookup following a forward lookup to verify that the hostname it gets back is the same one it started with.
Chuckles
Why don't DNS servers have a list of correct top-level domains, in order to answer directly, without going to a root server? The list is short, compared to the information the DNS server caches already, and the content of the list doesn't change so often. This list could be downloaded once in a day or so, from the DNS root servers.
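A rough sketch of what that local check might look like, in Python. The `KNOWN_TLDS` set and the function names are made up for illustration, and the list is deliberately tiny; a real resolver would need the full list and a way to refresh it (the once-a-day download suggested above).

```python
# Hypothetical, deliberately incomplete stand-in for a locally cached TLD list.
KNOWN_TLDS = {"com", "net", "org", "edu", "gov", "mil", "int", "arpa", "uk", "de"}

def tld_of(name: str) -> str:
    """Return the rightmost label of a domain name, lowercased."""
    return name.rstrip(".").rsplit(".", 1)[-1].lower()

def has_known_tld(name: str, known_tlds=KNOWN_TLDS) -> bool:
    """True if the name ends in a TLD from the local list.

    A caching server could answer "no such domain" by itself when this
    returns False, instead of forwarding the query to a root server.
    """
    return tld_of(name) in known_tlds

# Usage: only the first name would be worth sending upstream.
for query in ("www.example.com.", "printer.elvis", "192.168.1.1"):
    print(query, has_known_tld(query))
```

The obvious cost is staleness: if the local list is out of date, a brand-new top-level domain gets rejected until the next refresh.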
When packet filters and firewalls allow outgoing DNS queries, but block the resulting incoming responses, software on the inside of the firewall can make the same DNS queries over and over, waiting for responses that can't get through
Why the hell does a firewall accept outgoing queries to black-listed domain names, if they are configured to block the response to these queries? This seems like a serious misconception to me.
JB.
And that's a problem? My understanding was dealing with this sort of thing was exactly the purpose of the root DNS servers. If every ISP's DNS server was pre-configured to recognize valid and invalid top-level domains, you could just set them up to go straight to the specific DNS servers handling those domains (.com, .net, .org, etc.) There would be no need for a root-level system.
The argument for allowing this kind of cracked query through to the root server is that it makes it easy to add new domains (.elvis, .corp, what have you) without forcing everyone to reconfigure their DNS boxes for each new top-level domain.
Ummm... what does IPv6 have to do with DNS vanishing? With 128-bit IP addresses in an ugly hex-colon notation... DNS will be even more important when people move to IPv6.
The problem with DNS (and SMTP) is that they are protocols developed during a time when everyone on the internet was operating in a cooperative mode. Now that there is a proliferation of spam and DoS attacks, these old protocols break down because they were not developed with security in mind.
DNS will not go away. But the protocol will probably change at some point.
--
"What do you want me to do? Whack a guy? Off a guy? Whack off a guy? Cause I'm married."
Actually go deeper than that...what really needs to happen is a redesign of the underlying core of the whole damn thing...DNS, DHCP, and routing need to be combined into a single protocol and server implementation (we already partially have that in DDNS)...but taken a step further (and I am being intentionally light on details here, since it's a huge subject) it would make the whole thing easier, especially in today's world where everyone and their brother has a web site (or other service) attached to their cable/DSL line, and they can't get a static IP, never mind getting IPs they might own routed behind that IP to the rest of the world. One protocol that could publish IP/domain name/routing for the whole shooting match through a rooted, treed and P2P system...(the root maintains order, the tree allows clients to work backwards through the tree until they find the information they are looking for or hit the root, the P2P moves updates around with sequence numbers, probably in MD5 or something, to maintain chronology)...this is by no means the full idea, but might be a good seed....
Power Corrupts, Absolute Power Corrupts Absolutely, leaving one person (group) in charge is absolutely corrupt.
1. Bad request received by a root server
2. Root server notices it's of the 'non-existent top level domain' variety.
3. Root server sends back information pointing to an ip that shows a web page with a nicer version of 'either you clicked a FrontPage created link, you are a monkey banging a banana on the keyboard, or your ISP administrators don't have a clue'.
Advantages: It'll embarrass ISP's. It'll cut down on the traffic to the Root Servers.
Disadvantages: It'll only be noticeable with web queries.
This is not a dream, not a dream...we are transmitting from the year 1-9-9-9.
ya know, that's not impossible these days.
What with the private subnets you can't get to, and corporations buying up whole class IP blocks, you're not going to need to map every single IP to a set of names.
Say you need to map 2**30 names. Give each name 256 bytes to list the hosts using that IP. You've just used 256GB. A lot, yes, but I'm willing to bet at least one person reading this has that much storage dedicated to MP3s.
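For anyone who wants to check the arithmetic, here it is as a few lines of Python. The function name is just for illustration, and "GB" is read here as binary gigabytes (2**30 bytes), which is an assumption about what the figure above means.

```python
def storage_bytes(entries: int, bytes_per_entry: int) -> int:
    """Total bytes needed for a flat table of fixed-size entries."""
    return entries * bytes_per_entry

total = storage_bytes(2**30, 256)   # 2**30 names, 256 bytes each
print(total)                        # total size in bytes (equals 2**38)
print(total / 2**30)                # the same size in units of 2**30 bytes
```

In decimal units (10**9 bytes per GB) the same table comes out somewhat larger, at roughly 275 GB.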
I'm surprised that they did not mention massive numbers of "broken" requests from Windows 2000/XP systems. I see this all the time due to misconfigurations. Administrators often set up the Windows 2000 DNS servers incorrectly, and Windows 2000/XP systems (workstations and servers) are configured such that they constantly try dynamic DNS updates to the wrong DNS servers, even the root servers.
Linux, too, has some issues here. Obviously misconfigured DNS servers will always be a problem, but distros like Red Hat have IPv6 support compiled into the BIND RPM, and this results in an IPv6-formatted query followed by an IPv4 query for every request.
Yesterday I queried the root servers once a minute to see if they had been updated. Why? Because Network Solutions screwed up and transferred a domain that I manage to their own name servers; I had to put a request in to change it back to our name servers and wait, wait, wait. I wonder how common that is :)
This is somewhat of an invalid metaphor for both the way DNS works and the way computer caching works. Pretty much every local DNS server (unless my information is wrong) has some sort of caching system of varying degrees of efficiency. The problem is that, unlike humans, who are more likely to remember things if they are repeated, caching usually just consists of a series of entries which can quite easily be overwritten. Older entries have to be overwritten if they aren't updated, or caching would never work for new, frequently accessed sites. It's quite easy to get an access pattern which would remove even the most frequently accessed entries from the list, especially on a server with a great many users. By providing different servers for each chunk of users you can diminish this problem, but then you'll get requests from each server. DNS is an ugly system because it does an ugly job.
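To make the eviction point concrete, here is a toy least-recently-used cache in Python. This is only a sketch under assumptions: the `LRUCache` class, the capacity of 3, and the `.example` names are invented for illustration, and real resolvers also expire entries by TTL rather than by recency alone.

```python
from collections import OrderedDict

class LRUCache:
    """Toy name cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last

    def get(self, name):
        if name not in self.entries:
            return None
        self.entries.move_to_end(name)  # a hit makes the entry "newest"
        return self.entries[name]

    def put(self, name, address):
        self.entries[name] = address
        self.entries.move_to_end(name)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry

cache = LRUCache(capacity=3)
cache.put("popular.example", "192.0.2.1")
for _ in range(100):
    cache.get("popular.example")          # hit a hundred times

# A short burst of one-off lookups from other users fills the cache.
for i in range(3):
    cache.put(f"oneoff{i}.example", "192.0.2.99")

popular_survived = cache.get("popular.example") is not None
print(popular_survived)
```

A hundred hits do nothing to protect the popular name here, because this policy only remembers how recently an entry was touched, not how often.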
It all comes out in the end anyway. Say AOL has 100 proxies. If 10,000 AOL users visit your site, then it'll look like only 100 unique visitors. Granted this is more than the 1 unique visitor that it would look like for most proxies, but it's still less than the actual number, not more. Presumably there are significantly fewer proxies at AOL than there are users. It only really matters to small sites like yours and mine, where we're getting excited about each and every visitor, and 10 all at once makes us need a new keyboard.
If the First 98% are unnecessary and the last 2% are unnecessary as well...that's 100%...
That means that you just explained and wished for the Internet to go away...
or...you somehow figured out how an end user can magically come up with the IP for a host name from thin air. Go you. You're a millionaire.
www.fotoforay.com
Really, we should have some sort of gnutella-like system for distributing zone files. The problem with DNS is that it was designed a LONG time ago before the more recent advances in P2P networks.
There shouldn't be much argument at this point that we need DNS2 - the current system is vulnerable to attack.
The problem is that, if you distribute zone files (or pieces of zone files) among a loosely-connected network, then you will need to establish trust. These zone files would have to be signed, and the certificate authority then becomes the bottleneck.
It hurts my head.
The Internet was shown to be a scale-free network by University of Notre Dame physicist Barabasi. This means that the majority of web page requests are for only a fraction of the total web pages (the 'hubs').
Thus the 98% DNS Queries might be needed for only a minority of connections (I am assuming that Web Traffic is the bulk of Internet Traffic here).
Wow, crazy. I thought you just made that number up, but if I make a text file with "xxx.xxx.xxx.xxx" in it, find out how big that is, and multiply by the number of hosts in IPv4, I get 268.4 gigabytes. Very interesting. IPv6 is gonna keep that dream a dream.
slashdot: where everyone yells sarcastic metaphors to themselves to understand the issue
While the moderators seem to think this is a "funny" idea, I personally kind of like it. Only, I would recommend creating increasingly long delays in the response or to increase the number of dropped requests. You want that sysadmin who pulls his head out of his ass to at least be able to download fixes and such.
SPF support for most open source mail servers can be found at libspf2.
I've read through every comment on this page, hoping someone could explain this to me, but alas, I guess I'll just ask.
How does a query containing a valid IP address ever get to a root server?
It just seems to me that anything with an IP would bypass DNS altogether. Thanks!
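One plausible way it happens (my assumption, not something spelled out in the article excerpt quoted above): some piece of software hands the string "192.0.2.1" to the resolver as if it were a host name. The resolver then sees a "domain" whose top-level label is "1", finds nothing cached for it, and asks a root server. A sketch in Python of the check that well-behaved software would do first; the function names are made up for illustration.

```python
import ipaddress

def is_ip_literal(text: str) -> bool:
    """True if the string is already an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False

def apparent_tld(text: str) -> str:
    """What a resolver would take as the TLD if it treated text as a name."""
    return text.rstrip(".").rsplit(".", 1)[-1]

for target in ("192.0.2.1", "www.example.com"):
    if is_ip_literal(target):
        print(target, "-> already an address, no lookup needed")
    else:
        print(target, "-> look up name under TLD", apparent_tld(target))
```

Software that skips the `is_ip_literal` step would end up asking about the "TLD" `1`, which no root server can usefully answer.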
I guess you could argue that it was unnecessary--but I was the one laughing when everyone else thought "the internet was down" when they couldn't get to Yahoo Games, Hotmail, and Google.
Of course, it might be noteworthy to mention I find far more "relevant" uses of my Internet connection. :)