Seattle Data Center Outage Disrupts E-Commerce
1sockchuck writes "A major power outage at Seattle telecom hub Fisher Plaza has knocked payment processing provider Authorize.net offline for hours, leaving thousands of web sites unable to take credit cards for online sales. The Authorize site is still down, but its Twitter account attributes the outage to a fire, while AdHost calls it a 'significant power event.' Authorize.net is said to be trying to resume processing from a backup data center, but there's no clear ETA on when Fisher Plaza will have power again."
Redundancy ain't just a river in Egypt.
The world's burning. Moped Jesus spotted on I50. Details at 11.
Hmm. Power outage stops /. posts. News at 11
"She's furniture with a pulse"
http://twitter.com/AuthorizeNet/status/2455435020 Hopefully someone made an offsite backup as well.
News at 11...
tomorrow.
Bing Travel servers are located in the same server hall. More info: http://isc.sans.org/diary.html?storyid=6721
Apparently Verizon also has a single point of failure in this building for much of its FiOS service in the metro areas of western Washington state, so those FiOS customers are offline right now too.
Hot/Hot is always a more ideal solution than Hot/Warm or Hot/Cold for disaster recovery (and increasing equipment utilization/ROI), and this event demonstrates why.
... that authorize.net does not have a failover site.
When this happens in this day and age the CIO should be fired! There is no excuse. It's a situation where you gamble that this will never happen but when it does you should go.
Fisher Plaza is supposed to be a regional telecomm / communications / medical care hub for the Seattle area. It was designed and built to *not* crash, even in a magnitude 9.5 quake. Sounds like they've got work to do ...
1: ACTS OF GOD ...
Meteor strike, lightning strike, extreme weather
2: ACTS OF MALICE ...
War, terrorism, extortion, employee sabotage, criminal attacks
3: WEAK INFRASTRUCTURE ...
Underpowered networks, inadequate UPS backups, skeleton staffing, the shaving of safety margins as an efficiency exercise, inadequate rate of replacing old hardware
4: MANAGEMENT ARSINESS
This is when a problem starts, and the people in charge either don't know how to react, don't care, or prioritise face-saving over actual problem-solving. This happens when you get an outage, and instead of system management promptly calling all their critical clients to inform them, and warn them that there's maybe twenty minutes of UPS capacity in the routers if the system's not fixed by then, they instead cross their fingers and hope that things'll work out, and worry about what to tell the clients afterwards.
Fisher Plaza seems to have suffered from a case of #4 recently, so it's not surprising that they've gone down again. The first time should have been the wakeup call to show them that their human systems were in need of an overhaul. Without that overhaul, you're setting up a dynamic in which the second time it happens, things are even worse (because now people are locked into defensive mode).
No matter how advanced your technological systems, if the people running it have the wrong mindset, you're gonna go down. And when you go down, you're gonna go down far far harder than necessary.
Eric Baird
...except it failed as well. From their twitter:
"@gotwww The backup data center was impacted too. Don't have info as to why. The team is solely focused on getting us back up for now."
And on a holiday. Bummer. :(
The media are also following the story. KOMO, a local station, was knocked offline but is broadcasting from a backup site.
Way to go guys! At least two national, and maybe even international, ICT companies on whom numerous affiliates depend fail to provide for an adequate backup facility and continuity plan, yet the local AM radio station manages to pull it off. I'm guessing that some heads are gonna roll after the holiday weekend...
UNIX? They're not even circumcised! Savages!
When this happens in this day and age the CIO should be fired!
And if the CIO recommended a redundant D.C. but the CEO, CFO or Board rejected it as "too expensive"????
"I don't know, therefore Aliens" Wafflebox1
I know redundancy matters more for business services, but this reminds me that consumer lines have plenty of single points of failure as well. There was a day when DHCP at TeliaSonera, a large Nordic ISP, stopped working, leaving a third of the whole country's residents without internet access. It turned out there was a hardware failure on the DHCP server, which leads me to believe that they actually depend on just one server to handle all the DHCP requests coming from customers. They did fix it in a few hours, but it was still unavailable for the rest of the day because hundreds of thousands of computers were trying to get an IP address from it. That being said, I remember it happening only once, but it still seems stupid.
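The pile-up described above (every client retrying at once against a server that has just come back) is the usual argument for randomized backoff. Here is a minimal sketch of exponential backoff with jitter. The function name, the 4-second base and the 64-second cap are illustrative assumptions, not a description of what TeliaSonera's customers' equipment actually did.

```python
import random

def backoff_delays(attempts, base=4.0, cap=64.0, rng=None):
    """Return one randomized wait (in seconds) per retry attempt.

    The ceiling doubles on each attempt up to `cap`, and each wait is drawn
    uniformly from [0, ceiling] so that many clients do not retry in lockstep.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        delays.append(rng.uniform(0, ceiling))
    return delays

# Example: waits for six retries from one hypothetical client.
print(backoff_delays(6))
```

With a few hundred thousand clients each drawing their own random waits, the requests spread out over time instead of arriving in synchronized waves.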
... whose broadcast facilities reside in this building (they were broadcasting from a park on Queen Anne hill this morning), it was due to a transformer vault fire. The resulting sprinkler operation rendered their backup generator inoperable.
Being in the power biz, this sort of thing is to be expected in typical office buildings. Sometimes the power goes out. Live with it. What really puzzles me is how someone can take such a structure, install a raised floor and some big A/C units on the roof and sell it as a data center. This kind of crap goes on all the time, as I've seen purpose built data centers go down for single point failures.
Have gnu, will travel.
Let's imagine that you're actually paying this data centre large amounts of money with the assurance that the money means 99.9% uptime. Then, maybe, it might mean something more.
If you don't give a crap about uptime, then hell, get a Google webpage or something.
The world's burning. Moped Jesus spotted on I50. Details at 11.
"Our current estimate for re-establishing Bing Travel functionality is 5pm PST," says a notice at Bing
When someone in a technical role screws up a timezone designation, for me that is always a red flag that they are sloppy with facts, and I need to closely watch their other decisions, actions and statements, because they may be in over their head.
One simple rule for its versus it's
Wow, you are just as bad as AuthorizeNet... Namely you are putting all of your eggs into one basket called AMERICA... What you are ignoring are the ramifications if a government decides to take you down. And frankly I am more worried about a government taking me down than some accident.
I am part of a hedge fund and we have data centers in... Caymans, Monaco, and Switzerland... I think you get the drift here... And our exchanges that we talk to are scattered throughout the world... Is it simple? Cheap? Nope...
"You can't make a race horse of a pig"
"No," said Samuel, "but you can make a very fast pig"
I'm guessing that the server was probably local, possibly above the store, and might have gone fritzy in the heat.
So, real-world implications of computer failure. A server goes down, and suddenly Eric Cannot Buy Cheese ("Aaaaiiiieeee!"). Eric has hard cash, store (presumably) has cheese, but store can no longer sell cheese to Eric. Or anything else.
The shop "crashed".
Okay, so I trudged off and did my grocery shopping elsewhere, but it was a little disturbing to think that we've already gotten to the point where a server problem can stop you buying food, in a "real" shop, with "real" money.
Eric Baird
You know we had something similar happen (North Central AR) a few years back. We had over 70k people with zero Internet anything for two days. They couldn't get medical records, use CC, hell the three towns affected pretty much ground to a halt. The cause? The lines heading out to the main branch all converged on a single big fiber trunk that some dumbass farmer nailed with his backhoe while digging a ditch.
So while you can hope there is enough redundancy in the system to keep catastrophic failures like this from occurring, the simple fact is we have no idea how much of our critical infrastructure can be taken down with a single fuck up. Maybe the US gov needs to find out which companies we depend on have such single points of failure and demand redundancy for critical infrastructure? Of course with bribery....uhhh I mean lobbying being legal the mega corps would just use it as another excuse for a bailout which they would stuff in their pockets instead of doing what we paid for. Kinda like how they took those billions we gave them for nationwide broadband and gave us the finger in return.
ACs don't waste your time replying, your posts are never seen by me.
http://www.seattlepi.com/local/6420ap_wa_fisher_plaza_fire.html?source=mypi
http://seattletimes.nwsource.com/html/localnews/2009415646_webfisherplaza04.html
Sig Battery depleted. Reverting to safe mode.
That's pathetic. I've seen stores stay open during 24 hour POWER FAILURES! Any manager who does not teach their employees how to manually do credit card transactions (yes you can do them by paper!) should never have been hired in the first place.
When we lose power around here (once every 6 months or so), the stores stay open. They simply don't accept debit cards (which require a connection to the bank) until the power comes back on.
So THAT'S why there was a do not touch sign above it...
*avoids eye contact*
it would require a terrorist attack on New York PLUS an earthquake in San Francisco to knock us offline.
Which is all moot since you're using authorize.net as a payment gateway. ;)
Sounds more like Fisher-Price. Glad that none of my customers rely on Authorize.net.
Don't disappoint your bird dog. Go to the range.
Or (gasp!) make change without a computron! I wonder if they even train that in grocery stores anymore...scary, indeed.
My debut novel AMITY now available: http://jeremydbrooks.c
Imagine my surprise at learning that the problem is big enough to make /.. Actually, what's even more surprising is the unplanned outage in the first place: I don't recall Adhost ever going down for this long, especially in the middle of the day.
I used to manage a 22 rack cage that we leased from Internap at Fisher Plaza back in 2005. They really did build the place well. Massive diesel generators, independent well water, redundant cooling, etc. But it was designed to survive and continue broadcasting for a local news station for 18 days without resupply in the event of a major external disaster like an earthquake.
I imagine they are reviewing their DR procedures and designs now to minimize collateral damage from internal factors.
But let's not be too hard on them, it was one of the better colo facilities I've seen. There are far worse out there holding their pants up with three hands.
These opinions guaranteed or your money back.
They should also fire the person who was responsible for having a sprinkler installed above a transformer. Exactly how is spraying water on a transformer going to help in a fire?
This is the 2nd fire since 2008... Apparently Internap rent the power from the building so they have no control over the quality/maintenance of these generators and UPSes.
The fire, which started around 11:30 PM (or maybe earlier, but the first signs were around that time), badly damaged some of the electrical risers, so they are unable to get power back to some parts of the datacenter. According to their last update they're getting external generators to bypass the damaged equipment and power up the rest of the datacenter, which should be completed late this evening... At best it's going to be a nearly full day outage for some of their customers.
When this happens in this day and age the CIO should be fired! And if the CIO recommended a redundant D.C. but the CEO, CFO or Board rejected it as "too expensive"????
If that's the case, then the aforementioned officers should give up their pay to the thousands of merchants who lost their day's pay due to this problem. Yeah, like that'll happen.
Phone lines occasionally go out and that might affect local merchants, but when it's a data center that handles the livelihoods of thousands of merchants, there needs to be much greater redundancy. The businesses that are affected by this are not all huge e-tailers either. Many are just small operators trying to make a living on the web. As it stands now, a merchant can't have multiple card processors unless he's willing to pay the monthly fees for two processors. I've never heard of that being done and doubt it would be feasible.
Merchants affected by this will just have to suck it up, but for those who are not involved in e-commerce, this is a shining example of how doing business with credit card processors is dancing with the devil. They screw you on all of the charges, they screw you on chargebacks, and now they've screwed a lot of small business people by denying them income, probably because it wasn't cost effective to have a first class backup plan.
Happy Independence Day!
== First cross river, then insult alligator.
http://twitpic.com/966ee
The Twitpic link works fine from the place I found it but not when clicked via slashdot???
Or (gasp!) make change without a computron! I wonder if they even train that in grocery stores anymore...scary, indeed.
I think the bigger issue in this case would be manually looking up the price for every single item. We tend to oversimplify what selling things manually involves (processing credit card transactions by hand, making change by hand, etc.), when really the biggest problem is being without the UPC system.
I can't buy any cheddar here? But it's the most popular cheese in the world!
I listen to both RIAA and non-RIAA stuff if I like the music, tangential business/politics notwithstanding.
All fine and good... There is no possible way to design the entire world with redundant systems. But a company like Authorize.net doesn't have that excuse. Hoping has nothing to do with it, it's called network engineering. They should have multiple data centers located in geographically dispersed parts of the world. This is hosting 101 for any large-scale internet business. The OP is right, the CIO should be cleaning out his desk as we speak.
There is a reason it's 99.9% uptime and not 100%: this can happen, and you can't really sue them if they argue that this is the 0.1% when it's down.
Anything can be found funny, from a certain point of view.
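For a sense of scale on the 99.9% figure mentioned above, here is a rough calculation of how much downtime a given uptime percentage leaves room for. It assumes the percentage is measured over a plain 365-day year; real contracts often measure per month and carve out exclusions, so treat the window as an assumption to check against the actual SLA.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime that still satisfy the given availability percentage."""
    return period_hours * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% uptime leaves room for about {hours:.2f} hours of downtime per year")
```

Under that yearly assumption, 99.9% works out to roughly 8.76 hours, so an outage approaching a full day would exceed it.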
That's pathetic. I've seen stores stay open during 24 hour POWER FAILURES! Any manager who does not teach their employees how to manually do credit card transactions (yes you can do them by paper!) should never have been hired in the first place.
When we lose power around here (once every 6 months or so), the stores stay open. They simply don't accept debit cards (which require a connection to the bank) until the power comes back on.
In other words it happens frequently enough that there is a procedure to handle it. But not frequently enough for the stores to use a UPS and generator to cope with unreliable power. Not even given the loss of refrigerated/frozen stock.
I worked in retail once, for a regional department/grocery store.
We had enough generator to maintain minimal lighting, keep cold stuff cold, and run the registers. Whenever the power was out on that end of town, people would instantly line up buying things there instead of the neighboring competitors who had no such facilities.
I'd guess that this allowed it to pay for itself.
Google Checkout and Amazon Payments -- there's your redundancy, both with neither setup nor monthly fees.
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
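If a merchant did sign up with more than one processor, as suggested above, the checkout code could try them in order. This is only a sketch of the idea: the gateway callables, the GatewayUnavailable exception and the transaction ID format are all hypothetical stand-ins, not the API of Authorize.net, Google Checkout or Amazon Payments.

```python
from typing import Callable, Sequence, Tuple

class GatewayUnavailable(Exception):
    """Raised by a (hypothetical) gateway client when the provider cannot be reached."""

# A gateway here is any callable that takes an amount in cents and returns a transaction ID.
Gateway = Callable[[int], str]

def charge_with_failover(amount_cents: int,
                         gateways: Sequence[Tuple[str, Gateway]]) -> Tuple[str, str]:
    """Try each gateway in order; return (gateway_name, transaction_id) from the first success."""
    errors = []
    for name, charge in gateways:
        try:
            return name, charge(amount_cents)
        except GatewayUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise GatewayUnavailable("all gateways failed: " + "; ".join(errors))

# Stand-in gateways for illustration only; real clients would call a provider's API.
def primary_down(amount_cents: int) -> str:
    raise GatewayUnavailable("data center offline")

def backup_ok(amount_cents: int) -> str:
    return f"backup-txn-{amount_cents}"

print(charge_with_failover(1999, [("primary", primary_down), ("backup", backup_ok)]))
```

One caveat with this design: it should only fall through to the next gateway on errors where the charge definitely did not go through. A timeout after the request was sent is ambiguous, and retrying elsewhere could bill the customer twice.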
The only thing I can add was that I was at a Home Depot once during an extended power outage. They had a generator that ran emergency lighting and the register system, but they had to wait a while for it to boot back up and re-sync to corporate or something. Anyway, during that time they had employees all over the place helping people write down the price of items they were purchasing so the checkers could ring you up manually. At the register they would write down the UPC, price and quantity to update inventory later. Credit transactions were authorized by phone. Certainly a bit slower checkout process than usual, but they didn't close up shop just because the power went out.
this is my sig
Way back when dinosaurs roamed the earth, we had these 'stickers' on every can that showed the price....
Others would print it in ink using a stamper.
Then I'd have to kill myself for having created a stupid business model.
You are welcome on my lawn.
And if the CIO recommended a redundant D.C. but the CEO, CFO or Board rejected it as "too expensive"????
Then they fire the CIO post-haste and blame the whole thing on him.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Neither Google Checkout nor Amazon Payments looks like a good substitute for card-present transactions, while authorize.net has a card-present interface (among others).
Belief? Hope? Preference? The Existential Vortex