Entire .SE TLD Drops Off the Internet
Icemaann writes "Pingdom and Network World are reporting that the SE tld dropped off the internet yesterday due to a bug in the script that generates the SE zone file. The SE tld has close to one million domains that all went down due to missing the trailing dot in the SE zone file. Some caching nameservers may still be returning invalid DNS responses for 24 hours."
They are going to extremes in Sweden to get thepiratebay.org off the internet!
The downtime lasted 30 minutes, and most domains were probably cached by nameservers anyway.
"Oppression and harassment is a small price to pay to live in the land of the free." -- Montgomery Burns.
Goat.se
I seriously hope someone is fired or loses a contract over this. Where was the validation, change control, etc? I would expect that at the TLD level, a change to a configuration file would have to be inspected by someone AND run through some syntax-checking scripts...
As for the person who was modded up for saying "hey, no big deal, fixed in 30 minutes!", not quite. DNS servers (and individual computers!) cache negative results. Anything anyone did a query on during those 30 minutes will be negatively cached by their system and their local DNS server. Granted, a whole lot of local Swedish ISPs and network providers have probably flushed their DNS server caches, but it's still going to seriously impact traffic to many, many sites, especially for everyone outside Sweden.
Please help metamoderate.
...borked!
One missing character, repeated a whole lot of times, results in an entire TLD going offline. Awesome.
512 MB RAM, 20 GB disk, 200 GB transfer, five datacenters. $19.95/month.
Uh, it would make no difference.
DNS is hierarchical, and has teh caching.
2 independent groups running DNS would strive to make sure they sync with each other quickly - thus all failures would sync quickly too.
The difference between
- the delay of a correct change propagating across the two firms running DNS
- the delay of an incorrect change propagating within a single DNS
would essentially be zero.
No good things could come from what you propose unless it was specifically designed to have a 24 hour delay or something.
Can't get to milkmaids.se ? Try milkmaids.se via DNS2 to get a 24-hour old version.
This is something the CURRENT DNS system could support - explicitly calling for older versions.
In fact, it might be worthwhile. Somebody write an RFC.
i don't think you have a right to call this no big deal
the internet is becoming more and more vital to our lives
its "no big deal" until you need to know something off the internet right now, high stakes
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
http://en.wikipedia.org/wiki/Thule
we all know that thule is the ends of the earth
so none of us should be surprised. it should have been anticipated that sweden would drop off the earth at some point. today's that day apparently
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
an admin has popped back from lunch and asked, "hey guys did someone turn my computer off while i was gone? there was a file i was working on......"
Good people go to bed earlier.
It still boggles my mind that anyone thought zone files are a good idea. The file format is so damn brittle, that a single byte can spell disaster. On top of that, the hierarchical naming structure presents an inherent systemic risk for all sub-domains as exhibited by this .se fiasco. Nevermind the injection attacks, Pakistan taking out Youtube, and the rest, you have organizations like Verisign which profit immensely off of keeping the system broken. And don't even bother mentioning DNSSEC, as it still doesn't resolve this fundamental issue. The next systemic fuckup will simply be a signed fuckup.
the trailing dot got /.'d?
PPN
This is why MaraDNS (my open-source DNS server) uses a special zone file format.
MaraDNS uses a zone file format that, for the most part, resembles BIND zone files. However, the format has some minor differences so that the common "forgot to put a dot at the end of a hostname" and "forgot to update the SOA serial number" problems do not happen. A domain name without a dot at the end is a syntax error in MaraDNS' zone file parser; if you want to end a hostname with the name of the zone in question, this has to be explicitly specified with a .% at the end of the hostname.
There is also a mechanism for automatically generating SOA records, or having a SOA record where the serial is automatically updated based on the "last write" timestamp for the zone file.
For people who want to use their BIND zone files, a Python script is included that converts a BIND zone file into MaraDNS' similar zone file format.
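To make the missing-dot failure concrete, here is a minimal Python sketch of the usual BIND-style rule (a name without a trailing dot is relative, so the zone origin gets appended) next to a strict rule in the spirit of the one described above. The function names `expand_relative` and `strict_name` are made up for illustration; this is not code from MaraDNS or BIND.

```python
def expand_relative(name: str, origin: str) -> str:
    """BIND-style rule: a name without a trailing dot is relative to the origin."""
    if name.endswith("."):
        return name              # already absolute
    if name == "@":
        return origin            # "@" is shorthand for the origin itself
    return f"{name}.{origin}"    # silently append the origin


def strict_name(name: str, origin: str) -> str:
    """Hypothetical strict rule: a name must be explicitly absolute ("foo.se.")
    or explicitly relative ("foo.%"); anything else is a syntax error."""
    if name.endswith(".%"):
        return name[:-1] + origin    # "foo.%" -> "foo." + origin
    if name.endswith("."):
        return name
    raise ValueError(f"name {name!r} has no trailing dot or .% suffix")


# The lenient rule quietly turns a forgotten dot into a wrong name:
print(expand_relative("ns1.example.se", "se."))
```

With the lenient rule the forgotten dot yields a syntactically valid but wrong name, so nothing complains at load time; with the strict rule the same input fails loudly before it can be published.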
MaraDNS is an open-source DNS server.
If they were using NSD like the RIPE does for K root, the zone compiler wouldn't have compiled the faulty zone file and the parser would have made noise about it. NSD is very hard to break as the zone files must be compiled into a database before loading. The parser simply refuses to compile when there are zones with errors in them, so the database it creates will never be bogus (similar to the way a compiler won't create an executable if the source code violates its rules).
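The "refuse to load anything that does not compile" idea can be sketched in a few lines of Python. This is only an illustration of the gate, not how NSD is implemented: `publish_zone` and the toy validator `ns_targets_are_absolute` are hypothetical names, and the validator checks just one rule (NS targets must end with a dot).

```python
import os
import tempfile
from typing import Callable, List


def publish_zone(candidate: str, live_path: str,
                 validate: Callable[[str], List[str]]) -> List[str]:
    """Replace live_path with candidate only if validate() reports no errors.

    Returns the list of errors; an empty list means the zone was published.
    """
    errors = validate(candidate)
    if errors:
        return errors  # refuse to publish; the live file is untouched

    # Write next to the live file so os.replace() is an atomic rename.
    directory = os.path.dirname(os.path.abspath(live_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        tmp.write(candidate)
    os.replace(tmp_path, live_path)
    return []


def ns_targets_are_absolute(zone_text: str) -> List[str]:
    """Toy validator (an assumption): every NS target must end with a dot."""
    errors = []
    for lineno, line in enumerate(zone_text.splitlines(), start=1):
        fields = line.split()
        if "NS" in fields and not fields[-1].endswith("."):
            errors.append(
                f"line {lineno}: NS target {fields[-1]!r} lacks trailing dot")
    return errors
```

The point of the design is that the previous good file stays live whenever validation fails, so a bad generator run produces noise instead of an outage.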
You can't protect against a single point of failure when you're talking about a person updating a system. Redundancy protects against computer error, not human error.
See, ultimately, somebody, somewhere has to be responsible for the name updating. Having it in two places just means that an incorrect update gets pushed to both places by the person making the change.
In this case, the effects were minimized by the nature of DNS itself, and the caching mechanisms involved. Most servers probably never saw the changes. Those that did will get their caches cleared fairly rapidly, and the effect is minimal.
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
Wi nøt trei a høliday in Sweden this yer?
See the løveli lakes
The wonderful telephøne system
And mani interesting furry animals
#DeleteChrome
Didn't they use something like this before reloading the zone? If the mistake was a missing '.' it should've given you big warnings ...
http://ftp.isc.org/www/bind/arm95/man.named-checkconf.html
Natxo Asenjo
obviously mean http://ftp.isc.org/www/bind/arm95/man.named-checkzone.html
Natxo Asenjo
The Swedish alphabet does not have the letter "ø", it's written "ö" in Swedish. The letter "ø" is found in Danish and Norwegian.
The letter is NOT a ligature or a diacritical variant of the letter o! The vowel it sounds most like is the vowel in "bird" or "hurt".
That's how it looked in Thunderbird's RSS reader.
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Don't worry, there's plenty of mirrors......unfortunately.
Table-ized A.I.
How many times have we all forgotten that Dot. :)
Funny how the software tells you that the dot is missing. Why can't the software just fix it by now? NAMED/BIND deserves an A.I. by now for sure.
No, it was a single dot of failure.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
So then, logically, what we need are computers that can think for themselves. Then we could just let them run things for us, without human error. I think I'll start by just connecting these two supercomputers together. What could go wrong...
Regedit32.exe
I agree. It's long past time for the .com domain to be upgraded to .exe.
Because Unix admins never test-run their code.
What about regression testing?
It'd be quite possible to run a check and throw a warning if a change affects more than a certain percentage of domains. Or you could check against a sample of domains that really aren't going to change (I'm thinking mcdonalds.se, ibm.se etc etc).
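A rough Python sketch of such a check, assuming the old and new zones have already been parsed into a mapping from owner name to its set of NS targets. The `Zone` type, the function name `sanity_check`, and the 5% threshold are all assumptions picked for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List

Zone = Dict[str, FrozenSet[str]]  # owner name -> set of NS targets


def sanity_check(old: Zone, new: Zone, canaries: Iterable[str],
                 max_changed_fraction: float = 0.05) -> List[str]:
    """Return warnings if too many names changed or a canary name changed."""
    warnings = []

    # Count names that were added, removed, or had their NS set altered.
    names = set(old) | set(new)
    changed = sum(1 for n in names if old.get(n) != new.get(n))
    if names and changed / len(names) > max_changed_fraction:
        warnings.append(f"{changed} of {len(names)} names changed")

    # Canaries are delegations that are not expected to change at all.
    for name in canaries:
        if old.get(name) != new.get(name):
            warnings.append(f"canary {name} changed")
    return warnings
```

A generator bug that mangles every delegation trips both checks at once, while an ordinary daily update with a handful of changes passes quietly.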
...and the DNS bit everyone else.
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
Hand in your geek card: http://www.youtube.com/watch?v=4I2qWq9L6rw#t=1m00
Obligatory Dota song
BIND's max-ncache-ttl setting defaults to 3 hours for a good reason. Negative caching TTLs are capped to avoid everlasting NXDOMAIN records sitting in recursive caches.
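As a back-of-the-envelope sketch of that cap in Python: under RFC 2308 a negative answer is cached for the smaller of the SOA record's own TTL and its MINIMUM field, and the resolver then applies its own ceiling. The function name is made up and this is only the arithmetic, not BIND's code.

```python
MAX_NCACHE_TTL = 3 * 60 * 60  # 10800 seconds, the 3-hour default cited above


def negative_cache_ttl(soa_ttl: int, soa_minimum: int,
                       cap: int = MAX_NCACHE_TTL) -> int:
    """Seconds a resolver would keep an NXDOMAIN, given the SOA values
    from the authority section and the resolver's own cap."""
    return min(soa_ttl, soa_minimum, cap)


# A zone advertising a 1-day negative TTL is still capped at 3 hours:
print(negative_cache_ttl(soa_ttl=172800, soa_minimum=86400))  # 10800
```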
What a disappointment, I saw the title and was thinking DNSSEC key-rollover screwup. THAT would have made for a righteous thread.
It doesn't matter. .SE is only Sweden. If .SEX fell off; then the whole Internet would melt down into a small singularity.
Sigs. We don't need no steenking sigs.
"In this case, the effects were minimized by the nature of DNS itself"
Well, at least somebody shows some common sense.
Of course, losing a whole TLD, even if only for half an hour, is a shame that whoever did it probably won't include in his resume, but the fact is that nobody will spend more on securing a resource than the resource is worth. DNS is basically distributed, cached information described in plain text files. If an update works (which is the vast majority of the time), it works; if it doesn't, you detect the failure within seconds (logs at reload), it is not so tragic (the previous information will be cached throughout the Internet), it's easy to spot (it's a diff away), and you can easily revert to the previous version with a higher serial number in the meantime. No need for triple checks that triple the costs and bring their own share of bugs to the equation.
Everybody set all your TTLs to 1.
you don't realize how valuable they are until they go down on you.
The entire Swedish Internet effectively stopped working at this point.
That's incorrect. Only domain lookups weren't working. The Internet was working fine.
I am sick and tired of this kind of knee jerk reaction.
Unless it is somebody who has consistently been messing things up and has been warned, I don't see why this should be a firing issue.
It is not like we are all perfect, right? RIGHT?
IANAL but write like a drunk one.
so this is not something done by MPAA or RIAA to prevent people from accessing thepiratebay? :P
Guarantee she'd detect a missing period earlier.
Get off my launchpad!
Of course you can.
When transcribing medical records, double-or-triple keying the data is the norm.
And where does this data come from? The doctor? The testing lab? What happens when the error is in the source that you give to three people to key in?
Ultimately, all data to be input derives from somewhere. An error there will just get duplicated down the line.
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.
What about regression testing?
It'd be quite possible to run a check and throw a warning if a change affects more than a certain percentage of domains. Or you could check against a sample of domains that really aren't going to change (I'm thinking mcdonalds.se, ibm.se etc etc).
- The total impact was less than an hour.
- The number of affected users was likely only in the dozens range (thanks to DNS caching).
- Even individual computers use DNS caching nowadays. All Windows machines, for example, cache DNS lookup results for a default of a day or so.
- Do we really need to develop a cumbersome and expensive process to solve a most likely one-time problem that affected virtually nobody in any serious way?
- Give a man a fire and he's warm for a day, but set him on fire and he's warm for the rest of his life.