98% of DNS Queries at the Root Level are Unnecessary
LEPP writes "Scientists at the San Diego Supercomputer Center found that 98% of the DNS queries at the root level are unnecessary. This doesn't even take into account the 99.9% of web pages that suck or are unnecessary anyway. This means that the remaining 2% of necessary DNS queries are probably not necessary either."
If the authors had actually thought about how the DNS works, they would realise the reason for this. A DNS server that gets a request for .com will consult the root the first time and then cache the result. So even though the server might then get a million hits in .com, it won't ask the root again.
If the server tries to query for a non-existent domain it will get back a 'non-existent' response. It will cache that response for some time, but the chance of getting a cache hit is actually pretty low.
So if you have a properly configured DNS server whose web surfers view 1 million pages in 20 TLDs and try 1,000 bogus ones, it will send the root 20 queries that the authors would classify as genuine and 1,000 that they would call 'unnecessary'.
That is how the system is meant to work.
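To put numbers on that, here is a rough Python sketch of a caching resolver. It assumes every answer (positive or negative) stays cached for the whole run, that each bogus TLD is distinct and looked up once, and the TLD names are made-up placeholders, not real data.

```python
from itertools import cycle, islice

# Hypothetical stand-ins: 20 TLDs that exist and 1,000 distinct bogus ones.
REAL_TLDS = [f"tld{i}" for i in range(20)]
BOGUS_TLDS = [f"bogus{i}" for i in range(1000)]

def root_queries(lookups, real_tlds):
    """Return (valid, invalid) counts of queries that actually reach the root.

    Assumes the resolver caches every answer and nothing expires during the run.
    """
    real = set(real_tlds)
    cached = set()
    valid = invalid = 0
    for tld in lookups:
        if tld in cached:
            continue  # answered from cache, the root never sees it
        cached.add(tld)
        if tld in real:
            valid += 1
        else:
            invalid += 1
    return valid, invalid

# 1 million page views spread over the 20 real TLDs, plus 1,000 bogus lookups.
lookups = list(islice(cycle(REAL_TLDS), 1_000_000)) + BOGUS_TLDS
valid, invalid = root_queries(lookups, REAL_TLDS)
print(valid, invalid)
```

Under those assumptions the root sees about 98% "unnecessary" traffic from a resolver that is behaving exactly as designed.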
The 70% of repeated requests are likely to include outright attacks as well as misconfigured DNS systems.
The problem with dealing with these issues is that a DNS query is pretty cheap to handle, cheaper in fact than most of the proposed defenses. It is probably more expensive for a DNS server to check IPs against a blacklist than to just return the damn data...
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
This is actually an excellent idea and one that people who use OpenNIC do already. The root zone "." at OpenNIC is set up to be slaved, so my DNS server downloads a copy of the root zone, which has all the information for all the top level domains. If the root servers get DoSed I don't care because I don't use them anymore. Everyone should use OpenNIC. It is the Internet friendly thing to do. :)
-- Thou hast strayed far from the path of the Avatar.
I believe that reverse lookups are identified by an "inverse" status flag in the request header. One can only assume that the authors were not counting this sort of valid query, and were only focusing on the "standard" queries that contained IP addresses. Those certainly would, I think, be rather pointless.
It certainly looks like a lot of junk traffic, but the only requests we can definitely call bad are parts 3 and 4 of the chart. Misspelled names have to go to the root, since such domains might exist!
It would be nice to analyze the 70% of repeated or identical queries; probably a lot of that traffic can be explained (or else there are a bunch of administrators out there who need a good manual on bind).
To do a reverse lookup, the resolver sends a different request type, asking for a PTR resource record. The form is to put the IP address (or network address) backwards, and append
If you have your own DNS server and watch your DNS traffic, you can see these two effects happening differently.
For a forward (A or MX record) lookup:
Local server queries root server for an A record
Root server responds with NS record for the registry of the domain
Local server contacts registry server for A
Registry server responds with NS records for the domain
Local server contacts the domain's server, which responds with an A record
Local server answers the resolver with the A record.
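The forward steps above can be mimicked with a toy referral walk in Python. This is only a sketch: the server names, the delegation table, and the 192.0.2.10 address are all invented for illustration, and a real resolver also handles caching, timeouts, and multiple name servers per zone.

```python
# Toy delegation data; every server name and the address are hypothetical.
SERVERS = {
    "root": {"referrals": {"com": "com-registry"}, "answers": {}},
    "com-registry": {"referrals": {"example.com": "ns.example.com"}, "answers": {}},
    "ns.example.com": {"referrals": {}, "answers": {"www.example.com": "192.0.2.10"}},
}

def resolve(name, servers=SERVERS):
    """Follow referrals from the root until a server answers with an address.

    Returns (address, list of servers contacted in order).
    """
    server = "root"
    path = []
    while True:
        path.append(server)
        zone = servers[server]
        if name in zone["answers"]:
            return zone["answers"][name], path
        # Try the longest matching suffix first: full name, then shorter ones.
        labels = name.split(".")
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in zone["referrals"]:
                server = zone["referrals"][suffix]
                break
        else:
            raise LookupError(f"{server} has no answer or referral for {name}")

address, path = resolve("www.example.com")
print(address, path)
```

The path comes back as root, then the registry, then the domain's own server, which is the same order as the steps listed above.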
For a reverse (PTR) lookup, the resolver traverses the netblock providers:
Local server queries the root servers with a properly constructed PTR request (z.y.x.w.in-addr.arpa.)
Root server knows only where major net blocks are allocated, and returns the NS record of a Regional Internet Registry (RIPE, APNIC, etc)
Local server again queries an RIR NS with the PTR
RIR NS knows which ISPs hold which blocks, so responds with the ISP NS record
Local server again queries the ISP NS server, which either has the reverse hostname, or once again returns the NS record of the local DNS server.
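For anyone who wants to see the z.y.x.w.in-addr.arpa. form from the first reverse step built by hand, here is a small Python sketch. The function names are mine, 192.0.2.10 is just a documentation address, and the second helper shows one way a resolver could notice that a "hostname" is really an IP address before sending it out as a forward lookup.

```python
import ipaddress

def ptr_name(ip):
    """Build the reverse-lookup name for an IPv4 address (octets reversed)."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa."

def looks_like_ip_literal(qname):
    """True if the queried name is itself an IP address, so an A lookup is pointless."""
    try:
        ipaddress.ip_address(qname.rstrip("."))
        return True
    except ValueError:
        return False

print(ptr_name("192.0.2.10"))
# The standard library builds the same name, minus the trailing dot.
print(ipaddress.ip_address("192.0.2.10").reverse_pointer)
print(looks_like_ip_literal("192.0.2.10"), looks_like_ip_literal("www.example.com"))
```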
The two different types of queries follow different paths, either Name Registries or Netblock Providers. This article points out that many resolvers are broken because they allow obvious reverse lookups to pass as forward lookups, and then can't deal with the resulting error messages.
I have often seen broken resolvers repeatedly query DNS servers I manage, possibly because, as the article points out, fucked firewalls allow the requests out but block the responses from getting back to the resolver. It happens so much I just ignore it when I see it; it's not worth notifying the admins because they are usually too clueless to know how to fix the problem.
the AC
Hemos is like...sci-fi fans;he thinks technology is cool, but he hasn't bothered to understand the science it's based on
Well, that's the theory. In practice, however, there are millions of servers out there that do not cache NXDOMAIN at all, and just keep querying, over and over and over again, for TLDs that they've already been told don't exist. Microsoft's name server has been known to do this.
At one point, f.gtld-servers.net was seeing millions of repeated queries per hour from the same two .mil servers asking the same question and refusing to accept the NXDOMAIN. For long periods, these two servers were asking the same question multiple times per click of F's timer. That's.. ummm.. Bad.
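A quick Python sketch of why that matters: count the queries one resolver sends upstream for a single nonexistent name, with and without remembering the NXDOMAIN. The one-lookup-per-second load and the 900 second negative TTL are arbitrary numbers picked for illustration, not figures from the paper.

```python
def upstream_queries(lookup_times, negative_ttl):
    """Count queries sent upstream for one nonexistent name.

    lookup_times: sorted times (in seconds) at which clients ask for the name.
    negative_ttl: how long an NXDOMAIN is remembered; 0 means no negative caching.
    """
    sent = 0
    expires = None  # time until which the cached NXDOMAIN is still usable
    for t in lookup_times:
        if expires is not None and t < expires:
            continue  # answered locally from the negative cache
        sent += 1
        expires = t + negative_ttl
    return sent

one_hour = range(3600)  # one lookup per second for an hour (assumed load)
print(upstream_queries(one_hour, negative_ttl=0))    # no negative caching
print(upstream_queries(one_hour, negative_ttl=900))  # NXDOMAIN cached for 15 minutes
```

With no negative caching every single lookup goes upstream; with it, only the first lookup after each expiry does.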
I suggest that you read the actual CAIDA paper, and the other papers on the subject that Evi Nemeth and others at CAIDA have produced. They *have* thought about how the DNS actually works in practice. You've only thought about how it would work if every implementation worked perfectly, according to your expectations.