Slashdot Mirror


98% of DNS Queries at the Root Level are Unnecessary

LEPP writes "Scientists at the San Diego Supercomputer Center found that 98% of the DNS queries at the root level are unnecessary. This doesn't even take into account the 99.9% of web pages suck or are unnecessary anyways. This means that the remaining 2% of necessary DNS queries are probably not necessary either."

26 of 426 comments

  1. Highlight... by swordboy · · Score: 1, Informative

    About 12 percent of the queries received by the root server on Oct. 4, were for nonexistent top-level domains, such as ".elvis"

    Now there's your 2 percenter right there!

    --

    Life is the leading cause of death in America.
    1. Re:Highlight... by Zeinfeld · · Score: 5, Informative
      About 12 percent of the queries received by the root server on Oct. 4, were for nonexistent top-level domains, such as ".elvis"

      If the authors actually thought how the DNS works they would realise the reason for this. A DNS server that gets a request for .com will consult the root the first time and then cache the result. So even though the server might then get a million hits in .com it won't ask the root again.

      If the server tries to query for a non existent domain it will get back a 'non-existent' response. Now it will cache that response for some time but the chances of getting a cache hit are actually pretty low.

      So if you have a properly configured DNS with a bunch of web surfers that view 1 million pages in 20 TLDs and 1,000 bogus ones they will generate 20 hits they would classify as genuine and 1,000 that were 'unnecessary'.

      That is how the system is meant to work.

      The 70% of repeated requests are likely to include outright attacks as well as misconfigured DNS systems.

      The problem dealing with these issues is that a DNS query is pretty cheap to handle, cheaper in fact than most of the proposed defenses. It is probably more expensive for a DNS server to check IPs against a blacklist than to just return the damn data...
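
      The 20-versus-1,000 arithmetic above can be sketched as a toy model. This is only an illustration, not a real resolver: the TLD names are hypothetical stand-ins, and it pessimistically assumes that no negative answer ever produces a cache hit.

```python
# Toy model: a resolver that caches positive TLD answers but, pessimistically,
# never gets a cache hit on negative (NXDOMAIN) answers.
def count_root_queries(lookups):
    """Return (genuine, bogus) counts of queries that reach the root."""
    valid_tlds = {f"tld{i}" for i in range(20)}  # hypothetical stand-ins for 20 real TLDs
    cached = set()
    genuine = bogus = 0
    for tld in lookups:
        if tld in cached:
            continue  # answered locally, the root never sees it
        if tld in valid_tlds:
            cached.add(tld)
            genuine += 1
        else:
            bogus += 1  # nothing is remembered, so every bogus lookup goes out
    return genuine, bogus


# 1,000,000 page views spread over 20 TLDs, plus 1,000 bogus TLD lookups.
lookups = [f"tld{i % 20}" for i in range(1_000_000)]
lookups += [f"bogus{i}" for i in range(1_000)]
print(count_root_queries(lookups))  # (20, 1000)
```

      A million legitimate lookups collapse to 20 root queries, while every bogus lookup still reaches the root.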

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    2. Re:Highlight... by Zeinfeld · · Score: 4, Informative
      Though I wonder how the 'search from address bar'-feature has affected the number of non-existent queries.

      A way to tell would be to see how many of the queries were looking for mx records.

      I suspect that people using dummy email addresses like 'a@b.c' for subscriptions are another major cause of the misfires.

      Browsers doing search from the address bar probably reduce the number of misfires. A modern browser will only go to DNS if it sees something like foo.bar. If it just sees foo it will typically try foo.com and then go bang a search engine.

      Another reason I suspect spam is a major issue in the misfires is that lots of spam filters do lookup on sender addresses and those frequently point to non existent domains. Also the spam senders rarely do the most basic filtering on their lists - you can tell that since every now and again you get a spam with a full sender list at the top and you can see the broken addresses right there.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    3. Re:Highlight... by pde · · Score: 5, Informative
      If the authors actually thought how the DNS works they would realise the reason for this. A DNS server that gets a request for .com will consult the root the first time and then cache the result. So even though the server might then get a million hits in .com it won't ask the root again.

      Well, that's the theory. In practice, however, there are millions of servers out there that do not cache NXDOMAIN at all, and just keep querying, over and over and over again, for TLDs that they've already been told don't exist. Microsoft's name server has been known to do this.

      At one point, f.gtld-servers.net was seeing millions of repeated queries per hour from the same two .mil servers asking the same question and refusing to accept the NXDOMAIN. For long periods, these two servers were asking the same question multiple times per click of F's timer. That's.. ummm.. Bad. I suggest that you read the actual CAIDA paper, and the other papers on the subject that Evi Nemeth and others at CAIDA have produced. They *have* thought about how the DNS actually works in practice. You've only thought about how it would work if every implementation worked perfectly, according to your expectations.
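
      To make the cost of not caching NXDOMAIN concrete, here is a minimal sketch of a negative cache. The class and function names and the 900-second TTL are assumptions chosen for illustration, not taken from any particular name server.

```python
class NegativeCache:
    """Remembers NXDOMAIN answers for ttl seconds (clock passed in explicitly)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.expires = {}  # name -> time at which the NXDOMAIN entry expires

    def is_known_missing(self, name, now):
        return self.expires.get(name, 0) > now

    def remember(self, name, now):
        self.expires[name] = now + self.ttl


def upstream_queries(names_with_times, cache=None):
    """Count how many lookups for nonexistent names would go upstream."""
    sent = 0
    for name, now in names_with_times:
        if cache is not None and cache.is_known_missing(name, now):
            continue  # already told it does not exist
        sent += 1
        if cache is not None:
            cache.remember(name, now)
    return sent


# The same nonexistent TLD asked once a second for an hour.
stream = [("elvis", t) for t in range(3600)]
print(upstream_queries(stream))                      # 3600
print(upstream_queries(stream, NegativeCache(900)))  # 4
```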

    4. Re:Highlight... by Zeinfeld · · Score: 2, Informative
      Well, that's the theory. In practice, however, there are millions of servers out there that do not cache NXDOMAIN at all,

      That is hardly surprising since a lot of servers don't even cache the positive hits.

      The report said 70% of the hits were repeated requests. Again this is not too surprising, the root zone caches really well. There are fewer than 200 domains after all and only 20 of those have a significant degree of activity. The TLD configurations change so infrequently that the TTL could be set at a month without inconveniencing anyone.

      So the 'necessary' traffic for the root servers is negligible. Even with a million odd DNS servers out there each root need see no more than a few tens of thousands of hits an hour.
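
      A back-of-envelope check of that "few tens of thousands" figure follows. Every number is an assumption for illustration: a worst case where each resolver looks up every TLD once per TTL, a one-month TTL, and load spread evenly over 13 roots.

```python
# Back-of-envelope estimate; every number below is an assumption for illustration.
resolvers = 1_000_000       # "a million odd DNS servers"
tlds = 200                  # upper bound: every resolver looks up every TLD
ttl_hours = 30 * 24         # a one-month TTL
roots = 13                  # load spread evenly over 13 root servers

per_root_per_hour = resolvers * tlds / ttl_hours / roots
print(round(per_root_per_hour))  # 21368
```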

      It makes no real difference since the roots have to be scaled to be able to survive a sustained DDoS attack for at least as long as it takes remediation measures to kick in. Get rid of all the bozo queries and you still need the same size box because of the script kiddies.

      There are a bunch of changes that could be put in place that would reduce the DDoS problem. First we could follow the proposals of Mark Kosters and Paul Vixie to start using anycast (this looks like it is going ahead).

      Another thing we could do is to change the DNS logic so that servers keep records in their cache beyond the TTL and use those as backup if the root or TLD is unavailable. Then even a DDoS that succeeded would have only marginal effect.
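
      A rough sketch of that keep-beyond-TTL idea is below. The `StaleCache` class and `upstream_down` helper are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class StaleCache:
    """Keeps expired records around as a fallback when upstream is unreachable."""

    def __init__(self):
        self.records = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl, now):
        self.records[name] = (value, now + ttl)

    def lookup(self, name, now, fetch):
        """fetch(name) returns (value, ttl) or raises OSError if upstream is down."""
        entry = self.records.get(name)
        if entry and entry[1] > now:
            return entry[0]            # still fresh
        try:
            value, ttl = fetch(name)
        except OSError:
            if entry:
                return entry[0]        # expired, but better than nothing
            raise
        self.put(name, value, ttl, now)
        return value


def upstream_down(name):
    raise OSError("root unreachable")


cache = StaleCache()
cache.put("com", "a.gtld-servers.net", ttl=60, now=0)
print(cache.lookup("com", now=1000, fetch=upstream_down))  # a.gtld-servers.net
```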

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
  2. Re:Incorrect top-level domains by dfn5 · · Score: 5, Informative
    Why don't DNS servers have a list of correct top-level domains, in order to answer directly, without going to a root server?

    This is actually an excellent idea and one that people who use OpenNIC do already. The root zone "." at OpenNIC is set up to be slaved so my DNS server downloads a copy of the root zone which has all the information for all the top level domains. If the root zones get DOSed I don't care because I don't use them anymore. Everyone should use OpenNIC. It is the Internet friendly thing to do. :)

    --
    -- Thou hast strayed far from the path of the Avatar.
  3. Re:AOL by toast0 · · Score: 3, Informative

    i doubt it.

    it is common knowledge that aol users come through aol's proxies, and the proxy hostnames contain proxy in them, so it should be fairly obvious

    also, anybody who is running web statistics should know the following things:
    1) web statistics are inaccurate
    2) proxies screw up web statistics
    3) not all proxies are visible
    4) refer to 1 and 2

  4. One factor... by ZoneGray · · Score: 4, Informative

    One factor is that I suspect people are increasingly lowering their TTLs, expires, or whatever that parameter is. Most of the manage-it yourself DNS providers now allow an option to reduce that to a few minutes, which makes it much easier to move hosts around. And while a low setting increases DNS traffic, it rarely if ever incurs an extra cost to the domain holder.
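
    As a rough illustration of how the TTL drives traffic, assume one busy caching resolver that refreshes a record the moment it expires. The function name is hypothetical.

```python
def max_queries_per_day(ttl_seconds):
    """Upper bound on refreshes one caching resolver sends for one record per day."""
    return 86_400 // ttl_seconds


for ttl in (86_400, 3_600, 300):
    print(ttl, max_queries_per_day(ttl))
# 86400 1
# 3600 24
# 300 288
```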

  5. Re:Ignant by dachshund · · Score: 5, Informative
    Is it just me, or is this a description of a reverse lookup? How does that qualify as unnecessary?

    I believe that reverse lookups are identified by an "inverse" status flag in the request header. One can only assume that the authors were not counting this sort of valid query, and were only focusing on the "standard" queries that contained IP addresses. Those certainly would, I think, be rather pointless.

  6. Original story... by Goodbyte · · Score: 5, Informative
    It seems this originally came from UCSD, so when the page gets /.:ed, here is another one: Original story, and the interesting pie-chart from original story.

    It obviously seems to be a lot of junk traffic, but the only parts we can say are bad requests are parts 3 and 4 from the chart. Bad spellings must go to the root since there may be such domains!

    It would be nice to analyze the 70% repeated or identical queries; probably lots of that traffic can be explained (or else there are a bunch of administrators out there who need a good manual on bind).

  7. Re:Not really "broken" queries by aridhol · · Score: 4, Informative

    Actually, according to RFC 2606 (Reserved Top Level DNS Names), .localhost can be blocked by the local DNS, as it is an invalid name (along with .test, .example, .invalid, and .example.(com|org|net)). These are supposed to be used for testing and documentation, so if they aren't in use, they may as well be blocked.
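
    A minimal sketch of such a local check follows. The reserved names are the ones listed above; the function name is hypothetical.

```python
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_SECOND_LEVEL = {"example.com", "example.net", "example.org"}


def is_reserved(name):
    """True if name falls under an RFC 2606 reserved TLD or example domain."""
    labels = name.lower().rstrip(".").split(".")
    if labels[-1] in RESERVED_TLDS:
        return True
    return ".".join(labels[-2:]) in RESERVED_SECOND_LEVEL


print(is_reserved("www.example.com"))   # True
print(is_reserved("foo.localhost."))    # True
print(is_reserved("slashdot.org"))      # False
```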

    --
    I can't say that I don't give a fuck. I've just run out of fuck to give.
  8. Re:Not really "broken" queries by alanjstr · · Score: 3, Informative

    No, but the ISPs are supposed to query once a day or so and cache the results, so that the ROOT server isn't the DNS server that Everyone queries.

  9. Re:AOL by Goodbyte · · Score: 2, Informative

    Maybe so, but all requests should go to a dns-server at AOL which will cache the results. So if all users make a request for a domain in the same top-domain, there should still only be one request to the root-server.

  10. Re:Why... by PhxBlue · · Score: 4, Informative

    Excellent point, and I hope whoever has modpoints today will mod the parent up. Your PC is a sieve of information even with nothing more than a web browser and E-mail client. When you install IM applications or, gods forbid, file-sharing applications like KaZaa, the sieve becomes a fount.

    I've made a couple other posts regarding this in the past week or so, to point out that most applications don't need access to port 80, for example. E-mail doesn't need it, and IM programs certainly don't need it. ICQ uses a port in the 400 range somewhere, IIRC, for its message traffic; but it uses port 80 to report usage statistics to Mirabilis and to download banners. So does it really need port 80? Nope--you can save yourself bandwidth and gain privacy by blocking it.

    The list goes on, of course; but my biggest gain from firewalling my PC has been the freedom to restrict outgoing traffic.

    --
    !#@%*)anks for hanging up the phone, dear.
  11. Re:Ignant by Anonymous Coward · · Score: 1, Informative

    As I understand things, reverse lookups are never done by sending the IP address. To do a reverse lookup on 131.202.1.3, you'd do a PTR type lookup on 3.1.202.131.in-addr.arpa

    % dig 3.1.202.131.in-addr.arpa ptr
    ...
    ;; ANSWER SECTION:
    3.1.202.131.in-addr.arpa. 43052 IN PTR mars.csd.unb.ca.
    ...
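
    The name that gets looked up can be derived with Python's standard `ipaddress` module. This is just a sketch of the address reversal and does not perform the query itself; `reverse_name` is a hypothetical helper.

```python
import ipaddress


def reverse_name(ip):
    """Return the in-addr.arpa name used for a PTR lookup of ip."""
    return ipaddress.ip_address(ip).reverse_pointer


print(reverse_name("131.202.1.3"))  # 3.1.202.131.in-addr.arpa
```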

  12. That's not the same problem... by robbo · · Score: 4, Informative

    That's a local problem, between the user and AOL's DNS servers. The article is describing a different, higher-level problem between, for example, AOL's DNS servers and the root-level servers. If an AOL user's machine makes ten DNS requests for the same host, only one request should propagate past AOL's nameservers, but instead a misconfigured DNS will propagate all ten.

    I can suddenly see lots of slashdot users thinking-- oh, I should fix my firewall, I have all these DNS requests; but that's normal operation for a client workstation. Your firewall would be broken only if all your DNS queries failed, and you'd know it pretty fast if that were the case.

    --
    So long, and thanks for all the Phish
  13. Re:Ign(or)ant by anticypher · · Score: 5, Informative
    It's not just you, the two completely different DNS databases require different lookups, a common enough misunderstanding. Consider yourself less ignorant now :-)

    To do a reverse lookup, the resolver sends a different request type, asking for a PTR resource record. The form is to put the IP address (or network address) backwards, and append .in-addr.arpa to the request. All (well, ok, most) IPv4 addresses are mapped under the .in-addr.arpa domain. But these misconfigured resolvers are sending A (address) record requests but with an IP address included instead of a domain name.

    If you have your own DNS server and watch your DNS traffic, you can see these two effects happening differently.

    For a forward (A or MX record) lookup:

    Local server queries root server for an A record

    Root server responds with NS record for the registry of the domain

    Local server contacts registry server for A

    Registry server responds with NS records for the domain

    Local server contacts the domain's server, which responds with an A record

    Local server answers the resolver with the A record.

    For a reverse (PTR) lookup, the resolver traverses the netblock providers:

    Local server queries the root servers with a properly constructed PTR request (z.y.x.w.in-addr.arpa.)

    Root server knows only where major net blocks are allocated, and returns the NS record of a Regional Internet Registry (RIPE, APNIC, etc)

    Local server again queries an RIR NS with the PTR

    RIR NS knows which ISPs hold which blocks, so responds with the ISP NS record

    Local server again queries the ISP NS server, which either has the reverse hostname, or once again returns the NS record of the local DNS server.

    The two different types of queries follow different paths, either Name Registries or Netblock Providers. This article points out that many resolvers are broken because they allow obvious reverse lookups to pass as forward lookups, and then can't deal with the resulting error messages.
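
    The two paths can be sketched as a toy referral walk. The server names and delegation data below are entirely hypothetical (example.com and 192.0.2.0/24 are documentation ranges), and real servers speak the DNS protocol rather than sharing a Python dict.

```python
# Hypothetical delegation data: each server either refers us onward or answers.
SERVERS = {
    "root":      {"refer": {"com": "registry", "in-addr.arpa": "rir"}},
    "registry":  {"refer": {"example.com": "domain-ns"}},
    "domain-ns": {"answer": {("www.example.com", "A"): "192.0.2.10"}},
    "rir":       {"refer": {"2.0.192.in-addr.arpa": "isp-ns"}},
    "isp-ns":    {"answer": {("10.2.0.192.in-addr.arpa", "PTR"): "www.example.com"}},
}


def resolve(name, rtype):
    """Follow referrals from the root; return (answer, list of servers asked)."""
    server, path = "root", []
    while True:
        path.append(server)
        data = SERVERS[server]
        if (name, rtype) in data.get("answer", {}):
            return data["answer"][(name, rtype)], path
        # Pick the referral whose zone is the longest suffix of the queried name.
        zones = [z for z in data.get("refer", {}) if name == z or name.endswith("." + z)]
        if not zones:
            return None, path          # nobody to ask: the lookup fails
        server = data["refer"][max(zones, key=len)]


print(resolve("www.example.com", "A"))           # ('192.0.2.10', ['root', 'registry', 'domain-ns'])
print(resolve("10.2.0.192.in-addr.arpa", "PTR")) # ('www.example.com', ['root', 'rir', 'isp-ns'])
print(resolve("192.0.2.10", "A"))                # (None, ['root'])
```

    The last call is the broken case: an A query for a bare IP address matches no delegation, so it dies at the root.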

    I have often seen broken resolvers repeatedly query DNS servers I manage, possibly because as the article points out, fucked firewalls allow the requests out, but block the requests from getting back to the resolver. It happens so much I just ignore it when I see it, it's not worth notifying the admins because they are usually too clueless to know how to fix the problem.

    the AC

    --
    Hemos is like...sci-fi fans;he thinks technology is cool, but he hasn't bothered to understand the science it's based on
  14. Re:No wonder these servers have so many problems by wstearns · · Score: 2, Informative

    (I do realize that the post was supposed to be funny, but I suspect that people will wonder why there aren't more if the 13 get overloaded). This was tried a few years back; additional nameservers were put in place. Because the query for the root nameservers no longer fit in a udp packet, dns servers had to fall back to dns/tcp requests just to get the list of root nameservers, and we were reminded that a large number of firewalls block dns/tcp. With so many sites no longer able to make any dns lookups, the number was dropped back to 13 within a day.


    For those that would like to try the dnstop package mentioned on the site, I have signed rpms available.

    --
    Mason, Buildkernel and more: http://www.stearns.org/
  15. Re:Ignant...you've got it all wrong. by Agent+Green · · Score: 4, Informative

    Reverse lookups work by sending a PTR request containing an IP address to a DNS server, versus an A request with a name, as this tcpdump snippet of requests from one of my boxen to my DNS server shows:

    Reverse:

    12:59:31.814847 defender.licensedaemon > gimpy.domain: 20091+ PTR? 1.65.0.199.in-addr.arpa. (41)
    12:59:31.816003 defender.1029 > arrowroot.arin.net.domain: 19500 [b2&3=0x10] [1au] PTR? 1.65.0.199.in-addr.arpa. (52)

    Forward (complete request cycle from defender to gimpy):

    13:11:54.760484 defender.globe > gimpy.domain: 47604+ A? www.gtei.net. (30)
    13:11:54.761597 gimpy.1029 > dnsauth1.sys.gtei.net.domain: 51438 A? www.gtei.net. (30)
    13:11:54.977584 dnsauth1.sys.gtei.net.domain > gimpy.1029: 51438*- 1/3/3 A 128.11.42.31 (167) (DF)
    13:11:54.978626 gimpy.domain > defender.globe: 47604 1/3/0 A 128.11.42.31 (119)

    DNS & BIND is the first book to use for more info, though.

    --
    // Agent Green (Ian / IU7 / KB1JQO)
    // IEEE 802.3: All 10base Are Belong To Us
  16. Wait a minute, you don't understand the article by qix · · Score: 4, Informative


    A DNS query for an IP address is a *BAD REQUEST* contrary to what some of these other posters have said. Asking a root server to resolve anything in the first place, is bad - they should only be asked for NS records - and in the second place, an IP address is not a valid domain name (unless ICANN has surreptitiously added 256 new top level domains, namely, the numbers 0 thru 255).
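
    A resolver could catch this kind of bad request before it ever leaves the network. Below is a minimal sketch using the standard `ipaddress` module; the function name is hypothetical.

```python
import ipaddress


def is_ip_literal(qname):
    """True if the query name is really a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(qname.rstrip("."))
        return True
    except ValueError:
        return False


print(is_ip_literal("131.202.1.3"))               # True
print(is_ip_literal("3.1.202.131.in-addr.arpa"))  # False
print(is_ip_literal("slashdot.org"))              # False
```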

    Most networks that I've seen, are badly broken this way. The usual problem is that the network in question may use private address space (192.168.1.0/24 for example), but fail to install reverse dns for these addresses, causing delays and other problems when machines try to get the name associated with their ip address or that of a local machine connecting to them. Yes, you heard right - if you use any of the 192.168.x.x, 10.x.x.x, or 172.16-31.x.x addresses, you are broken unless you install dns to resolve for those addresses! This also goes for any ip netblock in general, although most ISPs these days are setting up dummy records for their unused ip space that'll cover their customers' allocations ok.
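
    For anyone setting up local reverse DNS, the zone names for the RFC 1918 blocks (10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16) can be enumerated as below. This is a sketch with a hypothetical helper name, and it only handles IPv4 blocks rounded up to an octet boundary.

```python
import ipaddress


def reverse_zones(cidr):
    """List the in-addr.arpa zones covering an IPv4 block on an octet boundary."""
    net = ipaddress.ip_network(cidr)
    octets = -(-net.prefixlen // 8)            # round prefix up to whole octets
    zones = []
    for sub in net.subnets(new_prefix=octets * 8):
        labels = str(sub.network_address).split(".")[:octets]
        zones.append(".".join(reversed(labels)) + ".in-addr.arpa")
    return zones


print(reverse_zones("10.0.0.0/8"))          # ['10.in-addr.arpa']
print(reverse_zones("192.168.0.0/16"))      # ['168.192.in-addr.arpa']
zones = reverse_zones("172.16.0.0/12")
print(len(zones), zones[0], zones[-1])      # 16 16.172.in-addr.arpa 31.172.in-addr.arpa
```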

  17. SDSC Has A Lot of Bandwidth by Anonymous Coward · · Score: 1, Informative

    The SDSC is a part of UCSD, so whatever comes from one comes from the other. Both have huge amounts of bandwidth. Of course, most of you probably aren't connecting to them by Internet2, so if it gets Slashdotted, it is for that reason only.

  18. Re:News you can use by lanner · · Score: 4, Informative


    I am crazy about my home network firewall configuration and, when it is under my authority, the firewall rules of whatever business employs me at the time.

    An important but often left out part of a firewall's configuration is logging. Attempts to do things that should never be done should not just be dropped, they should be logged and then brought to your attention.

    Some examples:

    If your local network is 192.168.2.128/29 then any outgoing packet that does not have a source within the range of 192.168.2.129 and 192.168.2.134 should be dropped AND logged. Someone on YOUR network is either stupid or trying to spoof someone!

    The same thing goes for ports and protocols that should not be outgoing on your network.

    Okay, so getting probed on TCP 80 is getting annoying now that you are logging everything that is not allowed. Fine, explicitly drop it without logging.

    Conform to RFC1918 -- don't route IP private space to or from the Internet. Route it to /dev/null or null0 AND filter it. AND if it came from YOUR network, log it. The quantity of ISPs that fail to conform to this is astounding and scary. You don't need this traffic moving around your ISP -- use GRE or MPLS tunneling instead.

    Also, conform to BCP38 ftp://ftp.rfc-editor.org/in-notes/bcp/bcp38.txt
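
    The /29 source check from the first example can be sketched in a few lines. This is plain Python standing in for a firewall rule, with hypothetical names; a real firewall would express the same thing in its own rule syntax.

```python
import ipaddress

LOCAL_NET = ipaddress.ip_network("192.168.2.128/29")
VALID_SOURCES = set(LOCAL_NET.hosts())      # .129 through .134


def egress_action(src):
    """Return 'pass' for legitimate local sources, else 'drop+log'."""
    return "pass" if ipaddress.ip_address(src) in VALID_SOURCES else "drop+log"


print(egress_action("192.168.2.130"))   # pass
print(egress_action("192.168.2.135"))   # drop+log (broadcast address)
print(egress_action("10.1.2.3"))        # drop+log (spoofed or misrouted)
```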

    After tuning your firewall logging filters, you will find that when new attacks occur or something is up, you notice. Otherwise, you are blind and dumb to what your firewall is doing, which means that you are blind and dumb to what your network is doing.

  19. Re:DNS queries are for lamers by jovlinger · · Score: 2, Informative

    I mistyped -- made sense to me at the time, but not what I meant upon rereading it. I wrote:
    > Say you need to map 2**30 names. Give each name 256 bytes to list the hosts using that ip

    I meant to write:
    Say you need to map 2**30 addresses. Give each address 256 bytes to list the host names using that ip

    ok. So 256 is an arbitrary limit. However, assuming that most addresses won't use all 256 bytes, you should be able to borrow some bytes from lines that aren't using 'em. The whole thing was meant as a ballpark estimate. I made up the assumption that only 1 IP in 4 is named, too. Give it a few years, and I won't need to make these assumptions either.
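
    For what it's worth, the ballpark works out as follows, taking the 2**30 addresses and the 256-byte budget at face value:

```python
addresses = 2**30          # a quarter of the 2**32 IPv4 space
bytes_per_address = 256    # the arbitrary per-address budget
total = addresses * bytes_per_address
print(total == 2**38, total // 2**30, "GiB")   # True 256 GiB
```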

  20. This is pathetic and typical of /. by Royster · · Score: 3, Informative

    You don't know what you are talking about, so you rant.

    There is nothing malformed about a .elvis DNS request. ICANN might decide to open up .elvis registrations tomorrow and program the root servers to respond to them. If every DNS server had to be reprogrammed every time a new TLD was added, it would be a maintenance problem whenever the TLDs were expanded.

    The elegant part of the design was to define the protocol to look up unknown TLDs and unrecognized TLDs at the roots. It didn't anticipate a few million monkeys typing search terms into browser address lines.

    The fault for the excess lookups lies in the applications programmers.

    --
    I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i
  21. 3rd party DNS servers and experience. by BrookHarty · · Score: 2, Informative

    I can think of some important things to consider about the interaction of your network and root servers.

    1. Some 3rd party DNS programs, like NetIQ and Preside, require you to have root servers configured. Some will even break if you put false root information into them.

    2. All unknown queries are sent to root servers, like your DMZ'ed networks. (This depends on whether your software has a failsafe mode; in Bind it can be disabled.)

    3. Other TLDs are queried at the root servers. .GPRS, used for 3G phone networks (for example), might also be queried at the wrong root-level servers.

    4. With the security being a hot topic, networks are switching from recursive lookups to iterative mode. Which makes for more visible lookups on a network sniff by increased traffic.

    Also, you can offload dns on your home networks by using a local dns server. Really handy, caches lookups, saves bandwidth, easier to set up than bind, and can use your /etc/hosts file to pretend you have DNS on a nat'ed network. dnsmasq on freshmeat is a very nice choice.

  22. Re:Oh, give me a break... by kindbud · · Score: 2, Informative

    You haven't looked at the roots in a while, have you? ;) All TLDs (except in-addr.arpa.) were moved off the roots some time ago. In the old days, the roots were also the dot-com servers (and dot-net and dot-org). They returned the NS for a 2nd level gTLD domain because they were ALSO the NS for that TLD.

    This is no longer the case, and I can't paste an example because of Slashdot's LAME lameness filter. But run this command:

    dig @a.root-servers.net slashdot.org mx

    You will get back a delegation to the .org servers, and not the NS for slashdot.org. It's been this way for almost two years now.
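
    For the curious, here is a sketch of roughly what a non-recursive query for slashdot.org MX looks like on the wire. It only builds the packet per the RFC 1035 layout and does not send it; the function name and the query ID are arbitrary assumptions.

```python
import struct

QTYPE = {"A": 1, "NS": 2, "PTR": 12, "MX": 15}


def build_query(name, rtype, query_id=0x1234, recursion_desired=False):
    """Build a wire-format DNS query (RFC 1035) without sending it."""
    flags = 0x0100 if recursion_desired else 0x0000
    # Header: ID, flags, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPE[rtype], 1)  # class IN


packet = build_query("slashdot.org", "MX")
print(len(packet), packet[12:26])  # 30 b'\x08slashdot\x03org\x00'
```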

    --
    Edith Keeler Must Die