How One Drunk Driver Sent My Company To the Cloud
snydeq writes "Andrew Oliver offers further proof that drunk driving and on-site servers don't mix. Oliver, who had earlier announced a New Year's resolution to go all-in on cloud services, had that business strategy expedited when a drunk driver, fleeing a hit-and-run, drove his SUV directly into the beauty shop next door to his company's main offices. 'Our servers were down for eight hours, and various services were intermittent for at least 12 hours. Had things been worse, we could have lost everything. Like our customers, we needed HA and DR. Moreover, we thought, maybe our critical services like email, our website, and Jira should be in a real data center. This made going all-cloud a top priority for us rather than "when we get to it."' Oliver writes, detailing his company's resultant hurry-up migration plan to 100 percent cloud services."
I've been drunk the previous two weeks and it's been awesome!
or is it just the name for all datacenter hosted servers now? (trick question.. it is).
world was created 5 seconds before this post as it is.
Servers Against Drunk Drivers
Google, Amazon and Microsoft can't be trusted with your data, but it's better than taking the risk of having a little downtime due to a freak accident.
Self driving cars ...... but what if Google decides to use them to promote the cloud by taking out servers?
This story isn't remarkable, is it? Man shocked to find that putting all his eggs in one basket is a bad idea. Solution: put all the eggs in another basket. DR is what colocation and failover are for. The cloud doesn't magically make you impervious to disasters.
..than have my privacy violated 24-7.
Why would you not have critical systems on a back-up power system? Generator? Something. It sounds like poor planning more than anything.
If a single drunk driver is able to stop your production and that production is critical you are doing something wrong to begin with. While the cloud might (and probably will) offer better HA and DR it will not fix a bad design by itself. The article also states: "I didn't want to create my own internal IT department". I'm guessing Andrew Oliver is a PHB.
The issue here is that he didn't have adequate disaster recovery procedures and policies.
The standard solution to this sort of problem is that you have a backup system that sits off site ready to take the load should something happen to primary. This backup system should be located in another data center, with a different ISP etc.
Moving to the cloud doesn't solve this, per se. If you move all your infrastructure to, say, Amazon, you're still beholden to that company and its internal procedures. A system administration error on their part could easily leave you down for many hours.
The lesson hasn't been learnt.
Now all you have to do is wait for some drunkard to trip over a power cable in whatever cloud provider you're with and wait for them to realise it because _you have no control over it_.
Hosting 101: Disaster recovery.
Hosting 102: Would you rather the NSA back up your data from the cloud?
Common Sense 101: The cloud ultimately doesn't work.
https://xkcd.com/908
Someone would have been injured or killed.
'In the cloud' is a nice hype. But you have _NO_ control over where your data is located (or backed up) and/or who has access to your data. From a business standpoint it is a tricky solution.
Hybrid Cloud Solutions are the way for those who are serious about HA but aren't insane enough to let it be run by someone else completely. You allow for better availability in the event your own cloud fails as you can shift to a provider's cloud during downtime but you have the control and standards of having it in-house. For all the problems with clouds, when someone drives a car right through your server room/datacentre, at those particular moments, you're willing to let them slide to get back up and running for the short term.
and then, reality hit when a drunk driver ran into the building holding your 'cloud'.
this isn't a justification to go 'to the cloud.' it's a cautionary tale on the merits of redundant infrastructure. in the grand tradition of slashdot car analogies: what you did was the equivalent of buying a maserati after your car was in the shop instead of taking the bus.
Amazon and friends still have regular service outages. these in fact may exceed your yearly downtime depending on how good an admin you are. the only difference is instead of a drunk driver you're held hostage by a provider that has no accountability when it comes to your uptime.
Good people go to bed earlier.
...use drywalling FOR EVERYTHING in construction, then there is a good chance this wouldn't have happened. Seriously. You guys use it even for outer walls in private homes. Good grief.
Why is this a story? Why?
Now, if a drunk driver runs into the "cloud" datacenter, what will you do? Go to the heavens?
If a single drunk driver is able to stop your production and that production is critical you are doing something wrong to begin with. While the cloud might (and probably will) offer better HA and DR it will not fix a bad design by itself.
The article also states: "I didn't want to create my own internal IT department". I'm guessing Andrew Oliver is a PHB.
Because cloud services have never had extended outages...
Honestly, anyone who sees cloud services as the great fix for reliability problems is an idiot, especially reliability problems caused by a once-in-a-lifetime drunk-driver incident. Most of the cloud services seem to have had their fair share of incompetence-related downtime. I wouldn't mind betting that if he'd put all his IT stuff on one of the commercial cloud platforms for the last 2 years, he would've had more downtime than he had running them in his offices.
In any case, shoving stuff in the cloud doesn't absolve you of needing a competent IT admin to handle backups and such, unless you're insane enough to trust *everything* to a cloud operator who, at the end of the day, doesn't actually give too much of a crap about one tiny customer who might've lost all their data.
http://blog.nexusuk.org
Yup. It is all a fancy way of saying your services are not in a local closet, but in a specialized center.
It all seems fancy, until you hit downtime, and your SLA happens to be "best effort" and the response time means nothing more than that someone looked at it within a certain time. You will never get an SLA that returns money for the lost productivity.
You will still have to figure out how to get your backups regularly out of the cloud, and retrieve the data if the cloud operator stops. You will have to provide a fast internet link, or maybe even a double link, since if one provider fails, it might be cheaper to have a second provider instead of having one with an expensive business SLA.
Stating "put it in the cloud" sounds simple, but a lot of details are really important. Notice how the company in the article is a consulting firm for such things, and even they chose to do it in-house for quite some time?
Amazon Cloud Service Hit By Car Crash
NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
Seems the main problem here (as already noted by others) was the lack of a DR facility, not the failure to move to the cloud. But this is the same for many SMBs: there is the requirement for a decent internet connection (not included in the costs, and it's quite a bit) and you STILL need a DR solution for a lot of the stuff. Google and others have suffered outages.
Also they've struggled with the online versions of the accounts and expenses system. Just because it's a SaaS solution doesn't mean it's any good!
Also moving from internal BIND to GoDaddy for DNS - seriously.....
Didn't this cloud use to be known as hosted services? The only difference is that your server is now some VM running on shared hardware, and you still have to hire someone to configure/install and upgrade your computing infrastructure.
AccountKiller
What happens when a 'drunken' MBA cancels the service. Or a drunken admin deprovisions the wrong servers?
Silence is a state of mime.
By putting your data in someone else's hands you are also opening yourself up to major security breaches if you are dealing with PII. For example, the company I work for stored PII, they have a website, and on that site, they also have their DB...don't ask, it was developed and implemented long before I got the job. I argued that they should never have a DB with PII on the same machine as their web server...they looked at me like I was stupid...so I just plodded along...well the company we got our hosting from decided one day to remove all the IPSec filters that were acting as a firewall (again don't ask, not my implementation) when they did some changes to their VM infrastructure...well lo and behold, our DB was exposed to the internet for about a month before anyone noticed...not only do you have to deal with possible downtime from failures at your provider, but you also have to deal with their stupidity and random configuration changes without your approval...
Could someone please explain what this means?
RTFA... He wanted to go cloud before the car hit the building. Seriously? The car didn't move his business to the cloud. He just didn't want to create his own internal IT department.
Sometimes HA and DR is accomplished by having your data mirrored to various data centers. Worst case, you do not even know in which jurisdiction the data centers are. Suddenly, your data is governed by the law of another country. If you live in a police state, this might not frighten you. For some of us, the result is not normal.
...and what happens when there is a major data centre outage? Nearly all cloud providers have had them. In the whole time cloud providers have existed, I haven't had any downtime on my own setup. Also, what happens when you lose connectivity to your cloud services?
What you need is off-site redundancy in your own setup, not to delegate the responsibility to someone else. Even if you have a 100% uptime SLA, and redundant mirrors, all provided by the third party, you will probably end up having a less reliable setup. I'd be very interested in the metrics used during this decision making exercise, including your evaluation of the reliability of large cloud providers, the factoring in of the connectivity issues, and the additional bandwidth costings. Can you please post this information - you are certainly incompetent, if you don't have it.
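To put rough numbers on the connectivity point above, here is a minimal sketch of how availabilities multiply when every link in the chain has to be up. The figures are assumptions picked for illustration (the 99.5% office link in particular is made up), not measurements from the article or from any provider.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(*components):
    """Availability of a chain in which every component must be up at once."""
    total = 1.0
    for availability in components:
        total *= availability
    return total


def downtime_hours(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical inputs: the provider's "100% uptime SLA" taken at face value,
# and an assumed 99.5% availability for a single office internet link.
provider = 1.0
office_link = 0.995

end_to_end = combined_availability(provider, office_link)
print(f"end-to-end availability: {end_to_end:.4f}")
print(f"expected downtime: {downtime_hours(end_to_end):.1f} hours/year")
```

Under these assumed figures the office link alone sets the ceiling, so a perfect provider still leaves you with whatever downtime your connectivity has.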
Why the hurry? The probability of something as catastrophic as the car accident at your datacenter happening again is lower now, so you can proceed with the transition to the cloud or a colocation service at the same speed as before.
I guess I'm supposed to go scrambling for my acronym dictionary, but I just don't care. I'll assume he means laughter and medical attention.
- First they ignore you, then they laugh at you, then ???, then profit.
Drunk drivers have been sending things to the clouds for a hundred years.
Wait. That's not funny. :-/
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.
Moving all your data and/or application to "the cloud" does not eliminate the need for it to be stored on and served from a physical machine.
All it does is make the server it is stored on part of some giant datacenter owned by Amazon or Rackspace or something, rather than part of some smaller data center owned by you. Oh, and that for a fee.
I am officially gone from
I work for a company that would have been in a good place, and better off than they were, if they had gone to the cloud a year or more before they hired me. However, they hired me because they were experiencing rapid growth and part time IT support from the brother of one of the owners was no longer adequate. When they hired me their IT infrastructure was about three years overdue for replacement from top to bottom. The owners wanted to go to the cloud as part of that change. As we investigated options it became obvious that we had outgrown where the cloud would have been a good solution for us (it is not just size, it is also the way that we do business). We are at a size where it is cost effective to build out our own server infrastructure, including what is needed to ensure business continuity, rather than pay someone else for it. The cloud might be a viable option as the location for our business continuity redundancy, but it is not cost effective as the location for our day to day operations.
The truth is that all men having power ought to be mistrusted. James Madison
If you own a small storefront, I understand the need for colocation, but I don't think they need the cloud.
If a drunk driver plows through your office and takes out your net connection and power supply, then cloud services are not going to help you either.
Cloud services are only as good as your net connection and power supply. Lack of power is a guaranteed show stopper. If you can get your hands on a petrol/diesel genset then at least you can power up your servers and keep working with your own data. In-house servers and off-site backups are still pretty hard to beat, and who says that natural disaster cannot hit your cloud service provider.
So no thanks...
as a side note I would never leave my business data to the trust of another business, if you can't trust yourself who can you trust?
Ask yourself this: if cloud computing is so great why aren't all the big players eating their own dog food and using 3rd party cloud services?
They don't.. none of them do, they just want you to be the sucker, not them.
They were running on a badly designed system to begin with. Honestly, why did they not have an offsite failover? If 8 hours of downtime is expensive then the CIO needs to be fired for his incompetence for not having a failover system in place.
Do not look at laser with remaining good eye.
If a single drunk driver is able to stop your production and that production is critical you are doing something wrong to begin with. While the cloud might (and probably will) offer better HA and DR it will not fix a bad design by itself.
This seems a bit harsh. I would say that in general, if you're running a small business and you think a single drunk driver can't potentially stop your production for eight hours (as happened here), you're either kidding yourself or paranoid.
If you read the article, the driver hit the shop next-door, severing the gas main and filling the buildings with gas. At some point during either the crash or the firefighters' entry their internet access was knocked out. Now, migrating all their services online protects them against the specific problem they had this time; but it wouldn't protect them against an actual gas explosion, which could easily have occurred. Nor against a key employee having been hit in the actual hit and run, or the driver hitting your delivery truck, or crashing into your actual building and damaging your workplace. Any of those - and a thousand other scenarios about as likely as the one that actually took place - could easily cost you eight hours of production, and anyone who protects against all of them is wasting an enormous amount of money.
Moving services to the cloud probably makes sense, because it protects against a whole load of problems. But saying of a small business that "if a single drunk driver is able to stop your production and that production is critical you are doing something wrong" is unrealistic.
Our company is on the opposite side of the beauty shop and had our services out as well. They're not telling you the whole story on why they chose that site...
http://www.sexyandfunny.com/image_gallery/random-photos-174_76473_80531_full.html
nsfw and just copy and paste the link you lazy fuck.
http://www.datacenterknowledge.com/archives/2010/05/13/car-crash-triggers-amazon-power-outage/
"Amazon’s EC2 cloud computing service suffered its fourth power outage in a week on Tuesday, with some customers in its US East Region losing service for about an hour. The incident was triggered when a vehicle crashed into a utility pole near one of the company’s data centers, and a transfer switch failed to properly manage the shift from utility power to the facility’s generators."
Then not only your business is ... "in the cloud"... and by that meaning really in the physical cloud.
I'm sure glad I have a locally hosted gitlab repo so I can get some work done this morning.
Pretty good article though.
I don't think this level of stupidity and ignorance can be increased.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
It might be a bit harsh but still, if the production is critical and you expect it to work at all times you can't be surprised when shit happens if you don't have a good and tested plan. Moving everything to the cloud does not necessarily solve this issue. What if "something" happens to the cloud provider? What if someone hacks your prod system? What if you accidentally delete your data?
You still need a backup, you still need a disaster recovery plan, you still need some sort of HA solution and you still need qualified personnel. Personally I would not trust that my cloud vendor actually does this for me.
I don't really have a lot against the cloud, I have actually set up and used Office 365 for a small/medium business. But I still have local copies of that business's data and a local Exchange server in case "something" happens.
fake submitter and slashdot moderators made me laugh today...
Well, I hope he enjoys lots more downtime now that everything is in the cloud except 0% chance of him fixing it himself. And just wait for that extended downtime when the cloud host goes out of business without warning.
The drunk driver could just as easily have driven their car into Amazon's data center and hit the server that was hosting your data.
Or, their data could have been in the "cloud" already on something like Mega Upload, and then the servers seized and taken down, all data seized because government goons took them.
THEN where would you be? Most of those companies still do not have their data back, and we are not talking about pirated data, we are talking about legitimate data that happened to be on the same server with somebody else's alleged pirated data.
So when a drunk driver takes out the electric or a backhoe chops your internet cable your company is still in the dark for those same 8 hours. It doesn't make a damn bit of difference if your servers are onsite or offsite, your employees can't work. If you move all of your data and services to the web you could have your employees work from home but if your workflow includes locally installed software then forget it. And let's not forget we have an ongoing scandal of big web companies happily handing over our data to the NSA.
With any business you need a disaster recovery plan. Basically, imagine the worst scenario that can happen (building burns to the ground) and figure out how to recover from that. It's not easy and although the cloud does make it look like you will have better uptime, the reality is it really doesn't. The only thing it can provide is lower IT costs. You still have physical employees in front of computers and a lot can go wrong.
This made me laugh.
The cobbler's kid is not crucial to the cobbler making money, his hammer is. IT infrastructure is a tool just like the cobbler's hammer, without it you are dead in the water. You are too busy devoting your resources to your customers to make money. You look at your internal infrastructure as an expense rather than as assets and tools which provide you the means to make money so you neglect it. What else did you expect?
And lastly, hosting your website on-site is old hat. Even with NSA spying you should not have any sensitive data on your website to begin with (unless you're stupid). So no worries there. Email hosting is a double-edged sword. You are guaranteed excellent uptimes, round the clock support and no hardware to worry about. On the other hand, you have the NSA spying problem and potential hackers. If you are a small business then use hosted email, the benefits outweigh the other problems. If anything is super sensitive and you are scared the NSA will see it then maybe send it snail mail on CD or thumb drive. Where I work we deal with parts from all sorts of companies, many of whom are government/military contractors, and they still email prints. The paranoid ones who have locked down workstations fax us the prints. Only companies who deal with very sensitive data need to worry.
Show me a single piece of paper that PROVES some remote server(s), or "cloud", run by someone else who likely hired under-qualified "sysadmins", is actually secured and hardened enough to protect my data.
As your direct competitor I've had RIAA, MPAA, FBI notices sent to those sites because
you may have violated some law.
There is now an ongoing raid to seize all the computers and data at those sites
and they promise to return them in a year or two.
Let's get real: what they have done is a classic, otherwise known as buying insurance after something bad happens. IOW: always a stupid move, since you by definition react irrationally (based on fear etc.). A business must have reserves and planning for dealing with downtime. For most small businesses, it's cheapest to acknowledge some loss of revenue in case of certain externalities. Providing failover/protection from those externalities will be just like buying insurance, except that you're self insuring at a cost that, amortized over time, likely vastly exceeds any loss of revenue, yes, even when accounting for time value of money etc.
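The self-insurance argument above comes down to comparing an expected annual loss with the annual cost of protection. A back-of-the-envelope sketch follows; every number in it is a made-up assumption for illustration, not a figure from the article, and it ignores the time value of money for simplicity.

```python
def expected_annual_loss(incidents_per_year, hours_down_per_incident,
                         revenue_lost_per_hour):
    """Expected revenue lost per year to a given kind of outage."""
    return incidents_per_year * hours_down_per_incident * revenue_lost_per_hour


def protection_pays_off(annual_protection_cost, annual_loss):
    """True only if protection costs less than the loss it is meant to avoid."""
    return annual_protection_cost < annual_loss


# Hypothetical small business: one 8-hour outage every ten years,
# losing an assumed $500 of revenue per hour of downtime.
loss = expected_annual_loss(
    incidents_per_year=0.1,
    hours_down_per_incident=8,
    revenue_lost_per_hour=500,
)

# Assumed yearly cost of a failover setup (hosting, links, admin time).
failover_cost = 6000

print(f"expected annual loss: ${loss:,.0f}")
print(f"failover pays off: {protection_pays_off(failover_cost, loss)}")
```

With these assumed inputs the failover costs far more per year than the outage it guards against; a business with a much higher hourly loss or more frequent incidents would see the comparison flip.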
A successful API design takes a mixture of software design and pedagogy.
There's a big double-edged sword when it comes to moving to cloud based stuff (we have a fair amount of SaaS). It's when some backhoe operator gets a bit too dig-happy and attenuates your line. Getting two different ISPs that can both handle the load and have two different ingress/egress routes can be awfully expensive. We lost services for 18 hours when a construction crew cut a large-capacity fibre line, which took out our main route, and, apparently, our backup line had been moved by our ISP and ran over the dark fibre of the same group.
Also.. good luck trying to get tickets responded to in a timely manner. When you have a large problem, for instance permission issues across all 280 Office 365 accounts, and your ticket has already been open for over a week, it sucks.
Cloud has its advantages, and it's got some giant sticking points. I wouldn't rush to the cloud unless you are trying to reduce CapEx or have such rapid expansion that you need to provide services faster than your internal staff can keep up.
Oh come on, the business they are in is called Open Software Integrators, they are a consulting firm. I don't know what the fuck they are doing that they can't swallow a 24hr downtime, give-or-take. It's just completely blown out of proportion, and they are overreacting instead of following their gradual move-to-cloud plan. They are very silly.
A successful API design takes a mixture of software design and pedagogy.
They only have their filthy mitts on the upstream carriers if anywhere
Let's review: the servers the company had on-site got damaged, so they expedited their plans to move to off-site servers.
Oh, and the off-site server is called a "cloud" service in this article.
What's new? What am I missing here?
I know of a major "cloud" data center having a substantial outage on New Year's Day due to a drunk driver hitting a power pole. Automatic transfer switches wouldn't switch to generator, and the site was down for 16 hours.
The magnitude of the outage, time-to-recover, and customer impact were much worse than they should have been, but not that far from what I might expect at several more "premium" facilities.
When it is in the cloud, you really don't know what the physical infrastructure is...
A) The "Cloud" is a meaningless buzzword that PR sell to management.
B) Most are located in the US, subject to US laws, making them useless to anyone with Privacy Laws.
C) Fake story used to sell more "Cloud" services.
Scenario 1 - in-house data servers: The drunk guy drives thru your computer closet, or fire, or flood, or deranged person decides to make your offices a crime scene. Systems are down for who knows how long.
Scenario 2 - datacenter/cloud data servers: A backhoe cuts thru the fiber/copper between your offices & datacenter/cloud, or earthquake, or flood, or datacenter/cloud business goes bankrupt or plays host to a deranged person or maybe even gets closed down by RIAA/MPAA or maybe even IRS. Systems are down for who knows how long.
Ya pays yer money & ya takes yer chances.
Ok and their cloud service is based where? Surely not up in the air. It has the same probability of disaster as having them on-site, so not sure why anyone should focus on that.
i wonder how that beauty shop is doing now. guess they rebuilt the shop.
what is HA and DR? never heard of those acronyms before. HA is short for home automation and DR is short for drive?
If a single drunk driver is able to stop your production and that production is critical you are doing something wrong to begin with. While the cloud might (and probably will) offer better HA and DR it will not fix a bad design by itself.
The article also states: "I didn't want to create my own internal IT department". I'm guessing Andrew Oliver is a PHB.
Translation:
Those IT geeks must have thought I was a gullible moron! The price they quoted me to protect my super important data was too high, but I knew that it's all just mouse clicking and watching Youtube, so I got my nephew to set up some servers that first time.
After my catastrophic loss, I again chose to not waste money on redundant servers or hiring any of those overpaid minesweeper monkeys, but now my data lives in a far-away place that I don't know managed by people I've never met, so it must be a better solution than what I had before.
"How One Bolting Horse Sent My Company To Close The Stable Door: We had no high availability or disaster recovery in place, so when a disaster happened our systems weren't available and we couldn't recover from it. That was bad, so we fixed it."
Next week's article will be "How Losing All Of Our Data Made My Company Start Making Backups", followed in September by "How Losing All Of Our Data A Second Time Made My Company Start Testing The Backups Too".
They were your servers before, and you had most of the control over them and their content. Then you had a person acting in an irresponsible, stupid, or criminal way, and now that you have moved to the magic cloud you think that you are safe from people acting that way, while giving total control to a lot of people, any of whom could act that way too?
is he going to go back to local servers/storage?
by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
yeah I hear you. worked with one guy whose only way of interacting with mysql was phpmyadmin, and when the script went over the limit he was stuck until I showed him how to telnet into the box and run the mysql query direct from the cli.
The NSA.....once in the cloud it'll make their data collection that much easier!
there are only other people's hard drives.
Let's suppose there are two types of "hoaxes". One is the type that is simply cotton candy with "almost absolutely" no evidence, such as the "Rapture Will Occur on X day" type. (Funny note - I just saw someone had posted a flyer on a windshield proclaiming the "Rapture will occur in 2011" ??! )
But let's say the other is the one based on "50% truth". So yes, Y2K was a deep multi decade legacy flaw dating from the dawn of computing. Put one way, "1975 computing had to deal with 1975's memory space concerns, so a 2 digit year made lots of sense." Then to pick a year arbitrarily, in 1995 (tied to when I consider MS's dominance of corp user space "complete" aka Windows 95), Y2K became close enough of a problem to worry about for real. So yes, massive outreaches worked across the industry, and apparently so well that when it happened I didn't see *any* significant problems of *any kind*.
But the histrionic types were shrilling that it would have been the end of civilization if not fixed, that entire service industries would collapse, etc. So in *that* sense, it was a hoax - those gross elaborations.
The interesting thing is that as a modestly observant tech news follower, I haven't seen anything pushed to that level of importance since!
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
..store all your dataz in the Uncle Sam Cloud. We take care of the rest.
So, I take it from this story, that there now is a cloud that
* is redundant to location (so that drunk drivers, fires, power outages, local politics and revolutions don't affect you)
* is redundant to host (so that bankruptcy or stupidity of one host doesn't affect you)
* is redundant to network (so that cable outages don't affect you)
* is redundant to data (ie, proper backups)
* with working fallbacks/handovers (so that you have to do nothing to keep things running on the above problems)
and doesn't just mean you handed everything over to someone who puts it in a virtual server in a data center somewhere?
Because otherwise, I don't see the point. (You don't even save on the IT guy. You still need that IT guy to make sure your "cloud" host does what you would do normally on your own. Or they will fail making backups and their mail server's disks will get full or...)
I agree with this. It was unlikely the apocalypse would have occurred due to Y2K, but it would be naive to think there wasn't the possibility for quite significant impact to a very technologically reliant civilization such as ours. I have read plenty of reports of Y2K bugs that DID cause minor problems (power outages, ticketing systems unable to create tickets and so on) but precisely because of the herculean effort to fix things they were the exceptions rather than the rule. A near-global power outage would have been very impactful on our civilization. An inconvenience at the end of the day because people WILL tend to adapt... but many people with electric heat could have died. We are so reliant on technology that while the hand-waving about the end of the world was a bit much, it did help to bring some attention to the matter.
Get a better location or put up barriers, I'm sure it was a cost advantage over the drunk driver.
Someone accidentally or purposefully takes down your cloud server, and then there's not a damned thing you can do about it but wait until the problem is solved by the people who run those servers. Which, depending upon the situation, could be weeks, months, or even never.
Doesn't sound like a very smart idea to me. Using the cloud as an additional backup measure seems perfectly reasonable. Using it as your primary solution? Nope, pretty damned stupid.
Why not just put guardrails around the servers? Problem solved, and no cloud required.
Had they had DR in the first place, the exposure would have been less traumatic. HA, of course, doesn't buy them anything in a disaster.
TV-MA - the Beginning: "Ward, don't you think you were a little hard on the Beaver last night?"