Slashdot Mirror


Entire .SE TLD Drops Off the Internet

Icemaann writes "Pingdom and Network World are reporting that the .SE TLD dropped off the Internet yesterday due to a bug in the script that generates the .SE zone file. The .SE TLD has close to one million domains, all of which went down because a trailing dot was missing in the .SE zone file. Some caching nameservers may still be returning invalid DNS responses for 24 hours."

9 of 207 comments

  1. Re:No big deal by eldavojohn · · Score: 5, Funny

    The downtime lasted 30 minutes, and most domains were probably cached by nameservers anyway.

    I once viddied an animated documentary about a small town in Colorado that lost the internet for 22 minutes. It was not pretty. Our hearts and minds go out to you, people of Sweden. I cannot even fathom what that would be like ... I hope the looting and rioting has died down with the restoration of the internet.

    --
    My work here is dung.
  2. change control / management, anyone? by SuperBanana · · Score: 5, Insightful

    I seriously hope someone is fired or loses a contract over this. Where was the validation, change control, etc.? I would expect that at the TLD level, a change to a configuration file would have to be inspected by someone AND run through some syntax-checking scripts...

    As for the person who was modded up for saying "hey, no big deal, fixed in 30 minutes!", not quite. DNS servers (and individual computers!) cache negative results. Anything anyone did a query on during those 30 minutes will be negatively cached by their system and their local DNS server. Granted, a whole lot of local Swedish ISPs and network providers have probably flushed their DNS server caches, but it's still going to seriously impact traffic to many, many sites, especially for everyone outside Sweden.

    1. Re:change control / management, anyone? by RabidMonkey · · Score: 5, Insightful

      As a DNS admin myself, touching high-value zones, let me tell you, missing a stupid dot happens all the time. All the change control in the world doesn't help when you just don't type one little period. Even more helpfully, most tools won't notice and the zone will pass a configuration check because missing the trailing "." is syntactically correct.
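
      To make the "syntactically correct" point concrete, here is a minimal sketch of how a master-file parser expands names (per RFC 1035 section 5.1, a name without a trailing dot is relative and gets the origin appended). The function name qualify and the example names are made up for illustration; this is not the code of any particular name server.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name against the zone origin.

    Hypothetical helper: a name ending in "." is absolute and kept as-is,
    anything else is treated as relative and gets the origin appended.
    """
    if not origin.endswith("."):
        raise ValueError("origin must be an absolute name ending in '.'")
    if name == "@":
        # "@" is zone-file shorthand for the origin itself.
        return origin
    if name.endswith("."):
        return name
    if origin == ".":
        # Avoid producing a doubled dot when the origin is the root.
        return name + "."
    return f"{name}.{origin}"


if __name__ == "__main__":
    print(qualify("ns1.example.se.", "se."))  # with the trailing dot
    print(qualify("ns1.example.se", "se."))   # without it
```

      Both inputs are perfectly valid syntax. The second one simply ends up as a different name, ns1.example.se.se., which is why a plain syntax check stays quiet.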

      Let me add as well that the "change management" you want is just fantastic ... no making changes during core hours. When you run a 24/7 business, non-core hours means something like 2am. At 2am, I, like most mammals, am not at my mental best, so missing a single dot isn't horribly hard.

      The only thing I'd suggest they do is use an offline test box for zones, then promote that change to prod. Then, you can load all the mistakes you want, do your digs, and if stuff works, THEN you move it to prod. I never ever make changes on production servers, they are done offline, tested, then put into prod with scripts. It makes it a lot harder for missing periods to make it into production.
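
      As a rough idea of what an offline pre-promotion check could look like, here is a sketch that flags record targets which look fully qualified but lack the trailing dot. Everything here is an assumption for illustration: the function name suspicious_targets, the simplified whitespace-based parsing (no multi-line records, no $ORIGIN handling), and the heuristic itself. It is not a replacement for loading the zone on a test box and running digs against it.

```python
# Record types whose last field is a host name (simplified assumption).
HOSTNAME_TYPES = {"NS", "CNAME", "MX", "PTR"}


def suspicious_targets(zone_text: str, origin: str) -> list:
    """Return (line_number, target) pairs that look like a forgotten dot.

    Heuristic: the target has no trailing dot, yet its last label equals
    the last label of the origin, so it is probably meant to be absolute.
    """
    tld = origin.rstrip(".").split(".")[-1].lower()
    findings = []
    for lineno, raw in enumerate(zone_text.splitlines(), start=1):
        # Drop comments, then split the record into whitespace-separated fields.
        fields = raw.split(";", 1)[0].split()
        if len(fields) < 2:
            continue
        # Only look at records whose data is a host name.
        if not any(f.upper() in HOSTNAME_TYPES for f in fields[:-1]):
            continue
        target = fields[-1]
        if not target.endswith(".") and target.lower().split(".")[-1] == tld:
            findings.append((lineno, target))
    return findings


if __name__ == "__main__":
    sample = """\
example   IN NS ns1.example.se.
example   IN NS ns2.example.se
www       IN CNAME web
"""
    for lineno, target in suspicious_targets(sample, "se."):
        print(f"line {lineno}: '{target}' has no trailing dot")
```

      A check like this only catches one class of mistake, but it is cheap enough to run from the same script that promotes the zone to prod.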

      Finally, this is a good reason why negative caching should have low TTLs. If you run a DNS server that can't handle low neg-caching TTLs, it's time to upgrade from a 386.
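
      For reference, RFC 2308 says a negative answer is cached for the lesser of the SOA record's own TTL and the SOA MINIMUM field. A small sketch of that arithmetic, where the function name negative_ttl and the resolver_cap parameter (with its default value) are illustrative assumptions, since resolvers typically apply their own configurable upper bound:

```python
def negative_ttl(soa_record_ttl: int, soa_minimum: int,
                 resolver_cap: int = 10800) -> int:
    """Seconds a resolver may cache a negative answer.

    RFC 2308: the lesser of the SOA record's TTL and its MINIMUM field.
    resolver_cap is a hypothetical local upper bound, not part of the zone.
    """
    return min(soa_record_ttl, soa_minimum, resolver_cap)


if __name__ == "__main__":
    # Zone publishes a one-hour MINIMUM: negative answers expire in an hour.
    print(negative_ttl(soa_record_ttl=86400, soa_minimum=3600))
    # Zone publishes a one-day MINIMUM: only the resolver's cap limits the damage.
    print(negative_ttl(soa_record_ttl=172800, soa_minimum=86400))
```

      The zone owner controls the first two numbers, which is the argument for keeping them low.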

      Cheers.

      --
      We emerge from our mother's womb an unformatted diskette; our culture formats us. - Douglas Coupland
  3. So I guess it's... by 6Yankee · · Score: 5, Funny

    ...borked!

  4. Re:DNS is the problem by Anonymous Coward · · Score: 5, Funny

    Regedit32.exe

  5. Re:No big deal by eln · · Score: 5, Insightful

    The actual downtime is no big deal, but the reason it happened is. Evidently, the registrar for an entire country's domain likes to roll out changes to the primary zone file without any sort of testing or syntax checking first. Simply having a small network (one or two computers) running a test root server, and running your scripts against that first, would have discovered the bug.

    DNS is very simple, but it's just as prone to human error as anything else. If you're responsible for the records of a large number of domains (like, say, an entire country), you probably ought to take some time to develop proper testing and change control procedures before you fiddle with it. It sounds like these guys didn't take it seriously enough and got burned. I hope they'll learn their lesson from this and change their procedures.

  6. Re:DNS is the problem by photon317 · · Score: 5, Informative

    Part of the problem with DNS these days, which your post exemplifies, is that from very early on "BIND's implementation of DNS" and "DNS The Protocol" have been mashed together and confused by the RFC authors (who were involved with the BIND implementation and had motive to encourage the world to think only in BIND terms) and basically everyone who ever used DNS in any capacity. Zonefiles are not implicit in DNS address resolution (neither for authoritative servers nor for recursive caches). They really aren't any part of the wire DNS protocol for resolving names. They *are* part of a wire protocol for secondary servers that slave zonefiles from primary servers, but even in that case it's really more a "BIND convention" than a necessity. Ultimately how you transfer a zone's records from a master server to a slave server is up to however those two servers and their administrators agree to do so. You can skip the AXFR protocol that uses zonefiles and instead do something else that works for both of you. Inventing a new method of slaving zone data is easy and doesn't involve much complicated rollout. Some people just rsync zonefiles, for instance, instead of using AXFR today.

    It's really frustrating (believe me, I've done it) when you try to implement a new DNS server daemon from scratch from the RFCs, and you have to wade through this mess of "what's a BIND convention that doesn't matter and what's important to the actual DNS protocol for resolving names on the wire".

    --
    11*43+456^2
  7. There's møre to Sweden than .se by 93+Escort+Wagon · · Score: 5, Funny

    Wi nøt trei a høliday in Sweden this yer?

    See the løveli lakes

    The wonderful telephøne system

    And mani interesting furry animals

    --
    #DeleteChrome
  8. Re:No big deal by CorporateSuit · · Score: 5, Funny

    DNS is very simple, but it's just as prone to human error as anything else.

    Are you kidding? I've been programming DNS for a long time, and if theirs one thing I learned, its that programmers like me don't make errors.

    --
    I am the richest astronaut ever to win the superbowl.