Army DNS ROOT Server Down For 18+ Hours
An anonymous reader writes "The H-Root server, operated by the US Army Research Lab, spent 18 hours out of the last 48 being a void. Both the RIPE's DNSMON and the h.root-servers.org site show this. How, in this day and age of network engineering, can we even entertain one of the thirteen root servers being unavailable for so long? I mean, the US army doesn't even seem to make the effort to deploy more sites. Look at the other root operators who don't have the backing of the US government money machine. Many of them seem to be able to deploy redundant instances. Even the much-maligned ICANN seems to have managed deploying 11 sites. All these root operators that have only one site need a good swift kick, or maybe they should pass the responsibility to others who are more committed to ensuring the Internet's stability."
An Oxymoron indeed!
Nobodies Prefect
Tidbits for Techs Technology Blog
So the Internet worked as it should, and routed around this disruption. The other root servers were unaffected, and still functioned fine. So what exactly is the problem?
They didn't want YOU to access their servers?
did you forget to take your meds?
Because they don't have redundancy? Everyone gets mad because the USA wants to control the internet, but let something go bad and then someone wants to point fingers? Really? I just don't get the mentality of "We want you to do this for free" and then people turn around and B&M about the service being down for a bit.
--- Relax, that mass murderer is just trying to reduce our carbon footprint, one fetus at a time...
What's the problem? The point of redundancy isn't to keep all redundant instances up all the time. The system is designed to allow for downtime of quite a few servers.
This is what happens when you give contracts to the lowest bidder. The military may have tons of money, but that doesn't mean they spend it wisely. Even if it's not a contracted company taking care of these servers, and it's government employees (there's a difference), a LOT of those employees get their jobs based on keywords and general qualifications, and several have an 'I did my time in the military and retired, they owe me this for all the hard work I did before' attitude. Not everyone is like that, and I've met some government employees (in the tech field) who really did know their stuff. And not all contracts are bad -- but they can turn sour when a company steps in, says they'll do all that and more for this much less, and they really don't know what they're doing. I've seen that happen too. And if it's managed by soldiers... well. They always told us, you're a soldier first, and a 'whatever your job is' after. Most technically trained soldiers don't know how to do their job well, or even at all. They just tough it out until they're an NCO, and then they're supposed to be a leader and tell their underlings to do the work.
The H-Root server does NOT run Windows, just in case anyone was wondering ;-)
On a more serious note, while the downtime is bad... there are 13 root servers owned by different organizations (both government & non-government) for a reason: to provide redundancy. Interestingly, the D-Root at College Park and H-Root at Aberdeen are relatively close to one another geographically. Distributing the H-Root service would be nice, but there are lots of other letters to use in the Root namespace. In short: The Army should probably take some steps to beef things up, but the (usual) mouth-breathing hypersensationalized crap spewed in the summary is mostly for getting ad revenue into Taco's bank account and not a rational evaluation of the situation.
AntiFA: An abbreviation for Anti First Amendment.
"I don't give a fuck, I'm still going to get my paycheck, and won't be fired."
A story I'm interested in reading... the only problem is there is no article?
Having a root server without multiple instances running is horrible. Anyone who has spent an hour studying DNS would understand how bad a decision this is.....
That having been said, this is one root server amongst many, so the actual impact is almost zero. If your DNS server is only pointing at 1 root server, then you are more foolish than the US Army and deserve what you get.
Still, they need to fix this and move into the 1990's.
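To make the "pointing at more than 1 root server" point concrete, here is a minimal sketch of trying root servers in turn and skipping one that times out. It does no real networking: `query_with_failover` and `fake_ask` are hypothetical names invented for illustration, and `fake_ask` simply pretends that H is down. Only the a through m hostnames are real.

```python
from typing import Callable, Iterable, Tuple

# The thirteen root server letters, a through m.
ROOT_HINTS = [f"{letter}.root-servers.net" for letter in "abcdefghijklm"]


def query_with_failover(
    servers: Iterable[str], ask: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each server in order and return (server, answer) from the
    first one that responds. A TimeoutError moves on to the next one."""
    for server in servers:
        try:
            return server, ask(server)
        except TimeoutError:
            continue  # this root is unreachable, try the next
    raise RuntimeError("no root server answered")


def fake_ask(server: str) -> str:
    """Hypothetical stand-in for a DNS query: H never answers."""
    if server.startswith("h."):
        raise TimeoutError(server)
    return "referral to the .org servers"


# Even if H happens to be first in the list, the lookup still succeeds.
hints = ["h.root-servers.net"] + [s for s in ROOT_HINTS if not s.startswith("h.")]
print(query_with_failover(hints, fake_ask))
```

A resolver configured with a single root would hit the `RuntimeError` branch instead, which is the foolish case described above.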
Hardware fails. That's just how it is. Even with the highest end hardware available today, outages can happen. This is why there are 13 root servers to start with. So long as they don't all go down at once, all is good. As far as 18 hours to recover, why is that bad? With 12 others to pick from, should this one be a high priority? I think not. Getting one's panties in a bunch because a server fails and takes some time to recover makes you sound like a silly management type. Most of us lived at least a large part of our lives without any root servers - or any servers at all. It's not the end of the world if DNS goes down. It will be ok, I promise.
They're sticking to their motto and deploying an Army of one.
This is what we need for the DNS system: p2p DNS.
Read radical news here
Whine much?
Some muscle bound, Rambo GI Joe type, trampled around inside the server room to play paint ball wargame, and managed to trip on the servers' main power cord accidentally?
I've seen numerous instances where the monitoring system, itself, was confused or detached. The results on a chart are then quite confusing, unless you know how to backfill the data in the chart.
Why, no, I've never been asked to do that for a 99.999% uptime SLA monitored site when some confused person in the offsite monitoring station put a bad IP address in /etc/hosts. No, no, no, couldn't happen.
They are too busy getting blocked by my PeerBlock application to deploy more DNS sites.
"I hope you know how very lucky you are to know me, because I am so incredibly incredible."
micromanaging your personal healthcare choices? LOL, just LOL. Just lost your arm in a car wreck? Slap some duct tape on it and walk it off, son. Gramps has a cold? Better send in the nurse to "fluff his pillow," a.k.a. press it against his face until he stops moving.
Rest assured, the government isn't holding back. Those non-redundant Army servers already cost an order of magnitude more than everybody else's redundant servers.
No sig today...
Old troll is old.
No news outlets have picked up on this, and rightly so, since it isn't a big deal.
If the other 12 root servers were also down, then it might warrant a story. ;)
Reminds me of a stupid boss I had that got on my case for not being overly worried when one system in a three-level redundant design went off-line for a few minutes.
He flipped out screaming that the entire system was all "aye ree" (his pronunciation of "awry"). I guess he had no clue what "redundant" meant, either.
Wondering, at least one US Base in Germany had a curfew for all DOD personnel from Oct 1 2300 through Oct 2 0500, with base installation closures from Oct 1 1800 ...
I think you are overreacting a little bit. The expectation always was that one or more root servers would be unavailable at any one time - hence why there are 13 different root server systems available. More than one can be unavailable for days, and due to redundancy and caching it won't affect anything - as expected, nobody has really noticed this blip.
There should be a good mix of technologies used in the different root server systems - different architectures, OS, etc. Some sites use anycast which gives massive redundancy within that system as well as providing good performance. However other architectures have their place and may be more robust to attack or certain failures. We need the variety.
So technically it's a shame that H has gone down - they don't seem to have a good track record. Fortunately this time it isn't an issue.
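For a rough feel of why one letter going dark doesn't matter, here is a back-of-the-envelope sketch. It assumes the root servers fail independently (a simplification, since real outages can be correlated), and the 0.99 figure for the other twelve is a made-up placeholder, not a measurement. Only the 18-hours-out-of-48 figure comes from the story.

```python
from math import prod


def prob_all_down(availabilities):
    """Probability that every server is down at once, assuming
    independent failures (an assumption, not a measured fact)."""
    return prod(1.0 - a for a in availabilities)


# H was dark for 18 of the last 48 hours, per the summary.
h_availability = 1 - 18 / 48      # 0.625 over that window

# Hypothetical placeholder: pretend each other root is up 99% of the time.
others = [0.99] * 12

print(prob_all_down([h_availability] + others))
```

Under those assumptions the chance of all thirteen being unreachable together is vanishingly small, and caching at resolvers makes a brief gap even less visible.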
And during the curfew/outage they could not access Medal of Honor online multi-player... wow that is a strange coincidence...
Maybe someone in IT thought the AAFES ban meant they had to shut down access, and that was their plan...
You have to realise that the layout of the root dns server hierarchy is historical. It is composed of organizations that are vastly different now than they were 20 years ago. The H root server people don't seem to care about things very much and there are a couple of other root servers where the organizations operating them don't put too much effort into things.
Luckily, the internet doesn't really depend on them, as there are a couple of big organizations heavily invested in making sure the root servers stay accessible all the time, like RIPE or Verisign. They operate thousands of physical machines at dozens of geographically distributed locations, all behind one IP address via anycast. As a result, one logical root server can outweigh another in physical boxes by at least 100:1, if not more.
The last I heard about the Verisign-operated root servers, a couple of years ago, is that they are ridiculously overprovisioned, operating at well under 1% of capacity even when subjected to a fairly large DDoS. As far as I know, the common DNS servers all support RTT banding: they pick at random from the servers for a given resource whose latency falls below a threshold. They therefore wouldn't really notice the H root being down.
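A simplified sketch of that RTT banding idea, as I read it: keep a measured round-trip time per server, and pick at random among those under a latency threshold. This is not how any particular resolver implements it. `pick_server`, the 100 ms threshold, and the sample RTT values are all invented for illustration.

```python
import random
from typing import Dict, Optional


def pick_server(
    rtts_ms: Dict[str, Optional[float]],
    threshold_ms: float = 100.0,
    rng=random,
) -> str:
    """Pick a random server whose RTT is under threshold_ms.
    An RTT of None means the server is not answering at all.
    If nothing is under the threshold, fall back to the fastest."""
    reachable = {s: r for s, r in rtts_ms.items() if r is not None}
    if not reachable:
        raise ValueError("no reachable servers")
    band = [s for s, r in reachable.items() if r < threshold_ms]
    if band:
        return rng.choice(band)
    return min(reachable, key=reachable.get)


# Hypothetical measurements: H is down, M is far away.
rtts = {"a": 20.0, "h": None, "k": 35.0, "m": 180.0}
print(pick_server(rtts))  # one of "a" or "k"
```

With selection like this, a dead server just drops out of the candidate set, which is why clients would not really notice H being gone.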
It takes a man to suffer ignorance and smile
Be yourself no matter what they say
Could this simply be a part of the Cyber Storm III information warfare exercise?
http://www.military-technologies.net/2010/09/29/test-of-first-us-cyber-blitz-response-plan-begins/
Tell your friends about xenu.net
Agreed.
From the offending server's website: "BRL volunteered to host one of the original root servers ... to provide a root server for the MILNET in the event that MILNET had to be disconnected from the Internet."
The purpose of the G/H servers is not to support the greater good (that's a side benefit), but to ensure that the MILNET can function if the DoD cuts itself off from the rest of the internet.
And besides, If my math is correct, there are a total of 205 redundant root sites (http://www.root-servers.org/), so imagine going up asking for funding...
[IT Guy] "General, we need money to add another redundant root server site, if all the sites go down the internet collapses!"
[General] "That sounds bad! How many redundant sites are there now?"
[IT Guy] "Only 205"
[General]
..and which Microsoft Product are you running?
"Computers are a lot like Air Conditioners" "They both work great until you start opening Windows"
It was down for that long because that is the amount of time it took them to install the new monitoring hardware.
*Unplugs toaster oven and plugs back in server*
--BOFH
Well, there's spam egg sausage and spam, that's not got much spam in it.
My guess is that since this root server is designed to operate on MILNET after disconnecting from the Internet, they may have been running a drill to do just that. Also, I highly doubt that this is the only root server on MILNET. I expect that they have multiple sites and plenty of redundant locations, but they only give out the Maryland location for security reasons.
Hello little man. I will destroy you!
Did anyone actually notice the outage?
And like sibling jojoba points out- p2p DNS would be HORRIBLE from a trust perspective.
Actually, what it means, is that we would have to actually fix once and for all, the identity/trust/reputation problem that the Internet already engenders. Unless you use https for everything, signed emails etc you are already trusting people all over the place.
> All these root operators that have only one site need a good swift kick...
Alright, anonymous coward, I nominate YOU to be the one to go and give the US Army a "good swift kick". See ya when you get back!
----
Not to be confused with Col.
Just a small, minor issue there... 1 of the root dns went down... only another 12 still up. Not really a problem even if it had been down for a week.
The whole reason for having 13 root servers is that you can lose a fair few of them before anyone needs to start worrying.
--- Users are like bacteria -> Each one causing a thousand tiny crises until the host finally gives up and dies.
I'm betting it had more to do with the Tropical Storm that hit the US East Coast, as referenced in the announcement and "back online" emails sent from the US Army. Maybe they're in on the conspiracy.
I heard through the grapevine that a cable at ARL was cut. I can't find anything to substantiate this other than a slightly related "unscheduled network maintenance" notice here
-----BEGIN GEEK CODE BLOCK----- Version: 3.12 GIT d? s: a-- C++++ UL++++ P++ L+++ E- W++ N o-- K- w--- O- M+ V PS+ P
Your mom was down for 18 hours.
Wish I had mod points...
Of the 64 comments I see in full, only this one has actual pertinent information about the downtime.
...
I must be new here. :)
Exactly. Why do root servers need to be redundant themselves? Aren't they already made redundant by the fact that there are 13 primary servers? Call me crazy, but I thought these were set up so they CAN go down without causing so much as a newsworthy event.
I use a local root to reduce outside traffic, load on the roots and to say f-u to the organizations that get a hard on by trying to control the dns system and make money off that situation.
Maybe people using bind can't/don't know how to do that, but with djbdns it is fast/easy to set up.
More than a signed root, we need a signed root file that can be retrieved from a website or rsync site, so that everyone can just run their own root and reduce the load on the official roots to a trickle. But that scheme would be attacked with extreme prejudice by the people with a vested interest in maintaining centralized control of DNS and the internet.
Stop using troll and flamebait mods for "I disagree with this post"