Providers Ignoring DNS TTL?
cluge asks: "It seems that several large providers give their users DNS servers that simply ignore DNS time to live (TTL). Over the past decade I've seen this from time to time. Recently it seems to be a pandemic, affecting very large cable/broadband and dial up networks. Performing a few tests against our broadband cable provider has shown that only one of the three provided DNS servers picked up a change in seven days or less. After turning in a trouble ticket with that provider - two of the three provided DNS servers were responding correct - while the third was still providing bad information more than two weeks after that specific change. What DNS caches ignore TTL by default? Is there a valid technical reason to ignore TTL?"
"This struck me as odd, and I decided to run a few tests using my own domain. Lowering the TTL to twenty four hours, and making changes and then checking to see when a change was picked up. I queried twelve outside DNS servers/caches that I had access to (Thanks to my friends and relatives with dial ups and DSL who put up with me and my requests to reboot their machine daily!). Checks performed against these outside DNS servers indicate that it may take as much as four to five weeks before a DNS change is picked up! Most DNS servers picked up the change within 48 hours. A small number did not (three out of twelve - that's a quarter of them!)
This merits more study, and prompts a few questions. So, before I begin with a more serious broad study, I'd like to get some feedback on the problem as I've seen it. I know the tin foil hat crowd will see the failure to propagate DNS correctly as censorship, and the OS/bind/djb/whatever zealots will simply see this as an argument for their particular religion.
Based on the responses I get, I will then setup and test a couple of domains with different DNS servers for 6 weeks and report back the findings. [volunteers welcome!]"
This merits more study, and prompts a few questions. So, before I begin with a more serious broad study, I'd like to get some feedback on the problem as I've seen it. I know the tin foil hat crowd will see the failure to propagate DNS correctly as censorship, and the OS/bind/djb/whatever zealots will simply see this as an argument for their particular religion.
Based on the responses I get, I will then setup and test a couple of domains with different DNS servers for 6 weeks and report back the findings. [volunteers welcome!]"
How would I check my ISPs DNS servers for this?
Of course there is a reason, To save bandwidth, and to provide the 3rd world internet service we have come to expect here in the USA.
Ad eundum quo nemo ante iit!
RoadRunner
They can't run a DNS server properly to save their lives.
TTL is ignored, SpamAssassin is a "trojan DDOSing our network"
It kind of defeats the point of root DNS servers being updated so fast if the ISPs are going to drag their feet and take their time updating their cache. I speculate it is server admin laziness.
in VOIP networks TTLs can be as low as 10 minutes
Usually on big providers overriding the TTL of the zone is a usual practice for sure, I do that myself in the ISP I'm working for (it's middle sized).
But I don't think they're setting a TTL longer than 24 hours, that would be kind of insane, isn't? At least from my own experience when I did a big DNS servers change (changed all the serials) the delay was less than 24 hours for almost all of them.
May the source be with you!
nscd does not obey TTL by default. It uses gethostbyname(), which does not return TTL.
We use nscd quite a bit, as im sure many other providers do. We only cache positives for 30 minutes, so we dont end up ignoring it for too long.
.
DNS isn't about "the web". It's much bigger than that.
I am becoming gerund, destroyer of verbs.
I remember once I had the TTL set on a bunch of domains to over a year. I found out its a great way to retain customers, because their domains will not work anywhere else.
it sounds possible the OP lowered the TTL on entries expecting that to have a retroactive effect on servers with the entries already cached. can we get confirmation that this is not what is happening?
Send a plain text email to
dns-subscribe@angrypeoplerule.com
This is a moderated list, and is only for letting people who are interested know when the study will begin, how to participate and the final results.
"Science is about ego as much as it is about discovery and truth " - I said it, so sue me.
They're referring to DNS TTL, not IP TTL.
And then there's the times when I just plain forgot to bump the serial number field. Works great on my master server after I restart it, but nothing else (especially my secondary) notices the change.
--
"Open source is good." - Steve Jobs
"Open source is evil." - Microsoft
DNS queries are figgin tiny...
So, what's it really save you, even if you're a massive ISP to not obey the TTL's?
The only thing it's going to save you is from having to go out to the root servers and pull it again when the TTL expires. And, I speculate that this really is a very, *very* small amount of traffic compared to the other traffic to those servers.
I'd expect the highest bandwidth/resource users, by a very large margin, to be standard "in-TTL" answers to DNS clients.
So, these cranks, for lack of a better term, simply bork the systems they manage for no appreciable gain from doing so. Reducing spam by 0.0001% would have vastly more impact on the servers they maintain than ignoring TTL's.
Has anyone done any measurement stats on DNS queries. How much of the total traffic is DNS. I can't imagine it's even 0.5% of an average ISP.
Cheers,
Greg
Saving bandwidth is the only reason. It still a pain in the ass though. I found out the hard way when I though I was moving a website to a different IP address for the first time. I normally set the TTL to 7 days. So 7 day before the planned move I set the TTL to 1 hour. We switched the IP address and everthing was fine using ISP that followed DNS rules. AOL still had the address cached though. We didn't find out about teh problem with AOL right away. Our additude at the time was well screw the AOL users, there's no one importing using AOL anyways. I think it took a month before AOL finally updated to the right IP address. This definately makes it hard to do dns moves. At least with smtp you can add the mx record of the new server in advanced.
Pity you didn't paste the appropriate part of the wikipedia article.
"TTLs also occur in the Domain Name System (DNS), where they are set by an authoritative nameserver for a particular Resource Record. When a Caching (recursive) nameserver queries the authoritative nameserver for a Resource Record, it will cache that record for the time specified by the TTL."
http://en.wikipedia.org/wiki/Time_to_live
-- perl -e'print pack"H*","6e656d6f406d38792e6f7267"'
(Thanks to my friends and relatives with dial ups and DSL who put up with me and my requests to reboot their machine daily!).
/flushdns
ipconfig
I would counter your argument, but it's too much effort...
I'll turn into a supernova and burn up everything. Well I'll turn into a black little hole and you'll turn into string.
Caching, at all levels of computing, gives enormous performance gains. In the DNS world, without caching, everyone would hit top level domains which would introduce a serious bottleneck to what should be a distributed system.
I'm a bit curious about this test case. It says the TTL was changed TO 24 hours and then checked to see how long it took for results to propogate. What was the TTL set to for these entries before the change? If it was set to 2 weeks and was just queried before the change occured, there is no reason for the server to recheck just because it was changed to 24 hours.
The TTL should stay the same for a while and then try simply making a change without modifying any other configuration to avoid any other problems with this test.
Thanks to my friends and relatives with dial ups and DSL who put up with me and my requests to reboot their machine daily!
If you're rebooting client machines to check DNS records, then I'm forced to view your entire study with caution.
Why did you need to contact your friends/relatives to check whether or not your domain gets propagated?
Couldn't you just query DNS servers directly using nslookup and/or dig?
Querying them directly would eliminate you from wondering if the machine you are checking from has the DNS cached and you wouln't need to flush it (why would you need your friends/relatives to reboot their machines?). Not to mention the amount of time you would spend in having to coordinate this type of testing.
Even if you don't want to use nslookup and/or dig from your Windows/Linux/Mac/whatever, there are tools available via the web that can help as well.
This certainly is not a list of all the tools, or even the best ones... they're just ones that I have used in the past:
dig Web-based "dig" tool
nslookup Web-based "nslookup" tool
DNS Report Checks for DNS errors and provides nicely formatted information on a given domain
DNS Stuff Various web-based DNS tools
"Very funny, Scotty. Now beam down my clothes."
I submitted a story similar to this one about a month ago regarding my experiences with direcway.com.
One of my customers was behind their network and we moved his email to our server. They couldn't access their domain name of course since it didn't exist on the server direcway's dns pointed to.
So I called them. Huge mistake. I spent hours on the phone escalating through foreign phone monkies until I made it to someone in management. Her attitude was that I was in the wrong regardless of what I had to say. Finally she lowered her defense just long enough to see that I was right but told me there was nothing they could do and that I wasn't allowed to talk to the people that run the DNS servers.
So I wrote a nasty little letter to corporate. 4 days later it was fixed. Not sure if the letter helped or not.
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
I've heard from a few people that many people are setting their TTL to like 5 minutes; due to this the ISPs are ignoring the TTL.
rooooar
I solved the problem by routing around the providers' poor DNS servers by pointing my home LAN to my own DNS server on my colo box. I run a DNS server in my house which then identifies my colo DNS server(s) with the forwarders option. Instead of relying on who-knows-what from the ISPs I use my own DNS server that I set up and am responsible for.
I know it's not a solution for everyone, but it does let me avoid stupidity. And having my own, reliable DNS server sure came in handy recently since Comcast has been having bad DNS problems the last couple of weeks.
If it's all about trust, then you don't want to extend the TTL, you want to *shorten* it. That way if you're hit with a cache-poisioning attack, you get the correct record *faster*, instead of holding on to crap for weeks.
Besides, this behavior blows up all sorts of geographical load balancing, datacenter failover, etc. type solutions (google for a F5 3DNS device sometime).
Bad stuff, mucking about with the TTL that someone has assigned to a record. It's not arbitrary information. To those fucking with TTLs, how about we arbitrarily alter the numbers in your paycheck? Oh? What's that? That doesn't seem like a good idea? Gee. Go Fig. HANDS OFF MY TTL, ASSHAT.
-AC
It's irresponsible tampering, it's that simple.
The world's burning. Moped Jesus spotted on I50. Details at 11.
Actually, the "TTL" in an IP header is different from the "TTL" in a DNS response (though in both cases the acronym means "time to live" and is intended as a limit on how long data hangs around).
IP header TTL is basically a hop-count, to stop IP packets going round in circles indefinately in the event of routing loops in the network.
Typically, when you look up a name like "www.example.com" your workstation consults a caching DNS server (on the local LAN, or offered by your ISP, or something). This DNS server goes off and talks to the root name servers, which refer it to the "com" name servers, which in turn refer it to the "example.com" name servers, from where it gets an IP address to go with the name. A couple of seconds later you ask for another page from "www.example.com". Your workstation asks the local DNS server for the information again, but the DNS server doesn't go and figure out the answer from scratch - it remembers the answer that it provided last time, and just repeats it. Time-To-Live is an "expiry date" that the authoritative name servers (like the "example.com" name servers) can put on their answers, so that the caching name servers know how long the answer is good for without them rechecking with an authoratative source.
Our company made a DNS change for a download server accessed by customers, over a month passed and multiple tickets opened with several large ISPS (Road Runner being the biggest) with no action taken. We finally had to setup a new server name for customers to be able to access the download server...
In all there were 3 large US isps that were major offenders...
Here in Argentina. We don't have bandwidth problems, bandwidth should be cheap considering the kind of conections that we have. But, all the bandwidth belongs to a few, that are not so interested in letting others grow, so they resell it at really high prices. So, since bandwidth _is_ a problem, many ISPs have Proxys, transparent Proxys, etc. The most dirty thing they are doing now is transparent proxys that never cleans their caches, content seems to never expire, etc. The other is DNSs that updates it's records all at once, every X days, not taking TTLs into account. I worked for about 2 years as a sysadmin for a hosting company, and this was a nightmare. Once, a customer's website was defaced, we cleaned up, restores a backup for him, but many people was still seeing the old website ... for more than a WEEK.
A solution to this problem would be a law, that would create a set of standard services that a comunications company may give, with well defined names and categorys, and it should be MANDATORY for companys to market their services using this names, in their comercials too. So, for example, we would have categorys such as "Full Duplex Simetric DSL Conection", or "ADSL, With Proxy, Blocked Ports".
WTF am I doing replying to an AC at 5 A.M on a Friday night?
this greatly reduces network traffic, as your records will be cached for over 68 years. if caching worked as described in the rfcs, you could probably even forget about keeping your domain registered after a few years, most folks would still come to you even if someone else bought your domain. of course ipv6 is coming any day now and that will probably ruin my evil plan.
Its completely within the spec, and as a fundemental principle I can do whatever I want with my server. So get with the program and understand there are other ways of dealing with the issue. Two weeks before the change, set the new IP address of the mail server as a lower priority (higher number) server, so if the info is cached, it will fall back to the new number when the old one fails. When you make the change, you can purge the old address entirely.
This is DNS maintenance 101, and should not surprise anyone who works on DNS.
Actually that's for TCP's time to live. For DNS TTL, here's the scenario: (and yes this is simplified; there is more that actually happens, but it's not important for this discussion)
:-)
Background
----------
Domain Name Servers (DNS) are usually configured in a heirarchy, such that each server has a parent. This fact will be important below.
Every domain (i.e. slashdot.org) has one or more "authoritative" name servers. These name servers know what web host slashdot.org is hosted on and how to get there.
Other DNS's on the Internet do not know how to get to slashdot.org, because they are not "authoritative" for that particular domain. So they send a request out to their parent asking how to get to slashdot.org. Eventually, one of the parents will know the address of slashdot.org's authoritative name server, and will return this address.
How This Relates To TTL
-----------------------
Here is what happens once the address of the authoritative name server is returned:
A = The name server trying to figure out how to get to slashdot.org
B = The authoritative name server for slashdot.org
A asks B how to get to slashdot.org
B responds to A with an address (66.35.250.150)
A asks B how long this address is valid
B responds to A with a TTL (e.g. 24 hours)
So now name server A will not have to ask for slashdot.org's address again for 24 hours, since it was told by the authoritative name server that it can keep the address for 24 hours.
This "keeping of addresses" is called caching, and name servers that do this are called caching name servers.
I hope this helps.
Waaa.. Waaa.. somebody ignored my TTL.
Listen -- We are SOA for around 11,000 domains. Both myself and the other uber-admins get tickets like this "escalated" when some clueless newbie wet behind the ears freaking junior admin DOESN'T RTFM and doesn't realize that if the serial #'s don't change then TTL is ignored.
We interface with nearly every major ISP -- I assure you, their name servers are configured just fine --- It's yours that is broke.
For those of you who just aren't DNS afficiandos -- so how retarded is this? Well here is another story ideas for slashdot that is along the same lines:
OMG! Two major RG vendors (NetGear and Dlink) don't follow RFC798 (TCP). See when I block port 80 on my firewall, the web stops working -- Imagine the whole web stops working by blocking just one little port. This is a huge problem! It needs to be addressed!
How about this little doosey:
I've just uncovered a SCO/Microsoft conspiracy. They've apparently teamed up because after reinstalling Windows XP onto another partition XP disabled my Linux partition -- the boot menu doesn't come up anymore. Clearly Microsoft is doing this to help SCO protect their intellectual property! Quick tell Groklaw!!!!
If you don't get either of the two above -- I can't help you. (seriously -- WTF are you reading slashdot for??)
I swear
I've recently "disconnected" my .COM version of my home domain (solely using the .US version now). As I get -0- spam to the .US address' (so far :) it is very apparent that a LOT of DNS servers [world wide] simply ignore TTL. A couple of weeks before the "switch" I set the TTL very low (60 seconds) -- I can easily handle the DNS traffic across my DNS servers (peppered here and there :).
.COM domain as I set _everything_ to IP address 127.0.0.1 (just to screw with the spammers high-jacked computers). Yet the spam [attempts] still come in. Every minute.
I would expect, now almost two months later, to be getting -0- traffic to the
Technically -- even running your own DNS servers there is nothing you can do if you move/add/delete something and others out there decide not to honor it. Everybody loses.
Our additude at the time was well screw the AOL users, there's no one importing using AOL anyways
I'd be curious who's the audience for the site(s) you're talking about. I'm pretty uncomfortable calling tens of millions of users unimportant, especially when it comes to e-commerce. Different "additude", I guess. Or attitude, even.
I maintain an ancient AOL account specifically so I can see things the way that some of my customers' customers see them. But it has one other advantage: if I've just made DNS changes to domain I care about, I set up a temporary new A record (like X.whatever.com) and then surf to it through AOL's proxies. This seems to get their name servers to notice that the SOA record is new, and it flushes out the rest of their cache. This seems to work on all sorts of servers, most of the time.
Don't disappoint your bird dog. Go to the range.
In fact, servers that are obeying TTL won't see the new record until the old record's TTL expires.
The querent doesn't say whether or not there was any wait for the old TTL to expire. They don't even mention what the old TTL was!
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
I work at a hugeISP and we sometimes receive tickets accusing us of ignoring TTLs. However, it has always boiled down to one of three things.
1. Change in the hosting of a domain to new DNS servers without properly removing the domain from the old hosting DNS servers.
When this happens, a DNS server caching a domain's info will continue to check the old servers until the old server stops answering.
2. A change in the TTL of a domain to a lesser value.
If you change the TTL of a domain from 7 days to 1 hour, DNS servers currently caching that domain's information will hold onto it for 7 days before discovering the new TTL.
3. A bug in BIND 8 that prevents it from pulling updated information from the primary DNS server for a domain.
We see this rarely, but it requires a restart of an affected DNS server. We have not diagnosed the specific cause yet since we're moving servers to BIND 9.
My dad uses Comcast and he kept calling me to "make the Internet work" during their recent DNS outages. I just SSH'd in to his router and added a Verizon DNS server (4.2.2.1) to his DHCP info, and his Internet worked right away. His neighbors were complaining they couldn't use the 'Net but he was surfing away just fine.
Grandparent is full of shit or doesn't understand what this thread is about.
Serials are primarily for the two servers do get the same data (primary/secondary), so when the secondary is done waiting it goes to look at the serial on the primary and grabs the new zone transfer if the serial is higher.
TTL on an A record is just a recomendation (a specific setting that over-rides the default TTL for the zone up near the SOA).
IF a server has cached an A record with a TTL of 6000 seconds (just under 2 hours) it should hold and server data for only a maximum of 6000 seconds, and after that time dump the data and go get new data from the authoritative name servers.
If you do a DIG against them, they'll tell you how much time is left on a cached record.
Serial doesnt come into the "when to drop cached data" transaction at all.
Sure, not incrementing the serial can cause all sorts of problems. But that's not what the article is on about.
AOL et. al are ignoring specific A record TTL and putting their OWN TTL on cached information that over-rides mine. (I know this because the tool I use makes it so I CANT forget to incriment the serial, and I still run into TTL problems. What about that smartypants?) So when I set a domain from default to 3600 seconds a day before an MX record (email server) change and they ignore it, email migration from one server to another stays messed up for days rather than the hour my TTL would do. A good admin doesnt abuse TTL (like yahoo apparently does...) and sets it back up higher when finished moving stuff, most of the time I am prefectly happy with the nice long standard cache time. But sometimes you NEED a low TTL.
I got the O'Reilly Grasshopper book right here in front of me and none of the TTL sections mentions SOA needing increment for TTL caching. If someone wants to point out a page number that says I am wrong I'd be happy to shut up. But self-righteous indignation better be fact checked... seriously.
There was an article called On the Responsiveness of DNS-based Network Control presented at the Internet Measurement Conference" last year. It is based on data from the Akamai content distribution network and shows that some DNS servers and even more client applications do not honor DNS TTL information.
I hate to bear bad news, but there's nothing new here. Back in the 90s I observed similar situations--that no matter what the TTL, and even if you were careful to increment your version IDs, there were plenty of DNS servers that wouldn't notice DNS changes for weeks.
Hence whenever someone tells me that they're going to move their web hosting, I always advise them to allow for a couple of weeks of overlap, so that they don't lose a ton of traffic. Often they ignore my advice, because of course they have set their TTL low so their changes will take effect immediately, or so they think.
And then they wonder why they're getting hundreds of e-mails from people telling them that the site is down. And I forward them a copy of the e-mail I sent them beforehand warning them of the problem.
My gut feeling is that screaming about other people's DNS servers refusing to observe your TTLs is going to get you about as far as screaming about other people's SMTP servers refusing to deliver your spam. It's their server, they can do what they like. For instance, any TTL under 24 hours will be ignored by my caching DNS server. (RFCs say TTLs should be at least a day, more like 1-2 weeks.)
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak