Outages Leave Google Apps Admins In the Hotseat
snydeq writes "This week's Google outages left several Google Apps admins in the lurch — and many of them are second-guessing their advocacy for making the switch to hosted apps, InfoWorld reports. The outages, which affected both Gmail and Apps, 'could serve as a deterrent to some IT and business managers who might not be ready to ditch conventional software packages that are installed on their servers,' according to the article. 'If we began to experience a similar outage more than about two or three business hours per quarter, we'd probably make Google Apps and Gmail a backup solution to a locally hosted mail system, if we used it at all,' said one Apps admin. 'And it would likely be years before we'd try a cloud-based collaborative system again from any vendor.' Coupled with recent Apple and Amazon cloud issues, these Google outages are being viewed by some as big wins for Microsoft."
When my boss tells me he wants zero downtime (or even five-nines uptime), I show him a quote for the 7-figure cost of creating such a system.
Apparently Google is expected to hit that level of uptime all while charging either nothing for their standard edition or $50 per user per year for the premier.
I wonder how much downtime the companies that are using Google Apps would experience if they had to pay for their own redundancy?
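For a sense of scale, here is a rough sketch of how an availability target translates into a downtime budget. The function names are made up for illustration, and the year is taken as 8,760 hours with outages counted around the clock, which is an assumption rather than how any particular SLA is measured.

```python
# Hypothetical helpers: convert between availability and downtime budgets.
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years (assumption)


def allowed_downtime_minutes(availability: float,
                             period_hours: float = HOURS_PER_YEAR) -> float:
    """Minutes of downtime permitted in a period at a given availability (0..1)."""
    return (1.0 - availability) * period_hours * 60


def availability_from_downtime(downtime_hours: float,
                               period_hours: float) -> float:
    """Fraction of a period a service was up, given hours of downtime."""
    return 1.0 - downtime_hours / period_hours


if __name__ == "__main__":
    targets = [("three nines", 0.999), ("four nines", 0.9999), ("five nines", 0.99999)]
    for label, target in targets:
        print(f"{label}: {allowed_downtime_minutes(target):.1f} minutes per year")

    # The "three hours per quarter" figure from the summary, on a 24x7 clock.
    quarter_hours = HOURS_PER_YEAR / 4
    print(f"3 hours down in a quarter: "
          f"{availability_from_downtime(3, quarter_hours):.4%} available")
```

On those assumptions five nines leaves about 5.3 minutes of downtime a year, while three hours down in a quarter is a bit under three nines.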
I'm a big tall mofo.
I agree. Openoffice is still "locally installable" and 100% free on the applications front. And any business that relies on an outside free webmail service for their corporate email needs is just asking for trouble...loss of the service from time to time is but one of the gotchas.
Cheers,
Google Apps Premier is not free - it's 50$ per year per account.
I'm using it for my private mail. I like it. But I don't expect 100% uptime - especially for just 50$ per year per account.
we only have one or two unexpected downtimes per year
What about your planned downtime? If you're running Windows, you're rebooting to install patches on a regular basis or you're running unpatched systems. What about software installs?
In the context of the article, do you think the users of Google Apps (or any users) would be happy with, "Oh, no you don't understand. This is PLANNED downtime. This doesn't affect you or our downtime numbers."
you can have 0 unexpected downtime with a single server, if you are lucky.
You can win the lottery too, if you are lucky. How many people win the lottery though?
I'm a big tall mofo.
Do you honestly believe that you or your employees are going to build a system with higher availability than Google? In the magical fantasy world we all wish we lived in, you may have the budget, skill, manpower, and infrastructure resources to do this. In the real world it is not even remotely possible. I know how much it sucks when your system is down and there's nothing you can do but wait on some status dashboard to go from Red to Green. That said, we should recognize that while being frustrated at this lack of control is normal, that doesn't mean you actually could do it better. It's easy to say "this would have never happened if we were self-hosted" while never thinking about the bullets you dodged by running hosted applications.
That means you, as a single customer, are insignificant. And that shows daily when dealing with any large service provider.
The only thing that my service provider should care about is the availability of the platform. I am completely insignificant, but the only reason my hosted app would be down is if the platform is down, and that sure as hell is significant to them. The advantage of hosted applications and cloud computing is that no one needs to ever look at or touch my app, the platform is all that matters.
Because it's a refutation of Google's business model (cloud based, for want of a better way of phrasing it) compared to Microsoft's (locally based tech).
I remain sceptical, as it would seem that Google would have to be less reliable than local kit in order to make it worth switching back, even before you take into account extra costs for doing it locally. (How much more do you want to spend to get an extra hour per quarter in reliability?)
Nevertheless, IANASA so I don't know the data behind this decision.
All intents and purposes. Not intensive purposes.
Why yes, some raspberry vinaigrette or even a nice oil and vinegar would go wonderfully with that word salad.
There are only two IT solutions out there in the minds of too many people: Microsoft, and non-Microsoft.
To go with Microsoft is the easy, sure road. It is the standard. It is what is expected, what is known to be safe, what will always work. Any problems you encounter here are met with "well, computers always have problems don't they?"
To go with non-Microsoft is hard and uncertain. It is not expected, nor "the standard", and suspected to be extremely unsafe. The smallest problem will be countered with "you and your stupid ideas. Now go and call LocalRetailerInc for a certified Microsoft solution, and be glad I don't fire your ass over this fiasco!"
Google is not Microsoft, so according to the business logic described above, if it doesn't work the only possible alternative is to use Microsoft.
Another issue is web/network attacks. They are going up big time and are even state-sponsored. Look at what Russia is, and has been doing to Georgia.
I don't understand how anyone in this day and age can justify going with remotely-hosted applications. The ability to reach remote servers can be taken away even by morons and botnets who might not like your company.
In my opinion, remote web hosting of applications that are presumably important for a company to be able to run is just asking for trouble. I wonder how many fingers will get pointed when some critical deadline looms and nobody can run their applications to be able to meet it.
It's reckless and risky for businesses to expose themselves like that. As others have pointed out, OpenOffice is free and it is good. Why waste money on training people on both the Google (or other) remotely-hosted application and OpenOffice (if that is your emergency backup)? Just train people on OpenOffice and now you don't need a backup plan in case the network goes down and you can't run the remote stuff.
Remote applications may have been a solution before the Internet got nasty but these days, running business-critical stuff over it when you don't need to does not make sense to me.
Maybe I'm missing the huge economic advantages that justify the unknown and growing risk, but I see network (Internet) applications as being at huge risk for outages, a security risk, a data privacy risk, etc.
Two weeks ago a transformer blew out in the building I work in. First there was no power for 3 hours, then temporary power as a large generator was hooked up, but it was not big enough to run the AC, so we did not turn on the servers. It took another day to get a large enough generator (about the size of a tractor trailer). In total, our business was shut down completely for a day and a half.
I don't think you can even get an SLA from the power company.
Google Apps went down for 3 hours.
Shit happens.
We ran into one of these "gotcha" features in hosted Gmail that's been giving me fits and it all started with a simple mistake. I misspelled a user name. You can change the spelling in the admin module, but it doesn't change the spelling in the contacts and the misspelling still showed up when she logged in. So I tried deleting the user name and recreating the account.
Big mistake.
When you delete a user name you can't recycle it for five days, which pushed us past our roll out date. Their crip work-around is creating a mailing list with that user name. But that has its own set of problems, especially when trying to migrate a large number of users. There's no support unless you get the premium edition. So now we're stuck in the position of paying for support on a service we're not certain will work for us. I'm not inclined to throw money at something to see if it will work when what we're already paying for is working.
Unfortunately, it was one of our key sales people who already had that account name on her business cards. Rolling without her is a non-starter.
It's frustrating because I'm the one who recommended Google and I feel really let down. It's a stupid problem that shouldn't exist in the first place. Even if there's a good reason for it, there should be a giant warning banner with a flashing red neon border warning you that deleting a user results in a five day lock out. Actually, it's been more than five days and I still can't recreate the account.
This one niggling little incident is making me rethink hosted applications. So, yeah, it does sort of benefit MS. Not in our case, we're using hosted SendMail instead of Exchange, but if this type of "feature" deters other companies already using MS solutions, then yeah. Who wants to take a chance on looking bad? There will still be outages with any solution but no one gets fired for recommending MSFT. There's a certain period of time that users are looking for an excuse not to like a new service, just because it's different. If you can get past that time frame, then a small outage can be overlooked. But those first few months have to be smooth. Maybe not flawless, but close to it.
It would almost be better if the free version was a trial and corporate users could get support from day one. This is just maddening. Shape up, Google.
That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
The last few places I worked had periodic network outages, random print server crashes, workstation blue screens. This caused hours and hours of downtime for dozens of people over the course of a year.
When Google Apps or Gmail goes down, exceedingly rare as it is, people threaten to "abandon the cloud". I wish we had threatened to abandon the lame infrastructure that our parent company refused to update or spend money properly maintaining. For my $0, Google does one hell of a better job than the three helpless infrastructure guys could do with almost zero budget.
For places that either can't or won't put cash into a proper local infrastructure, relying on the cloud is a cost-effective option. Even with occasional downtime.
P.S. I've just started playing with Gears, and it seems to bridge the gap nicely so far.
Seriously now, WTF? Why is everyone acting like they've never had a BSOD on windows, a failed harddrive, a driver problem, or a vendor discontinue support? I use AWS, GAE and Google Apps and while there is a certain loss of control, the downtime I have experienced is far less than I would incur trying to roll my own infrastructure.
I've worked in a few companies with large IT budgets and have experienced more downtime in those environments than I have so far "in the cloud." I think the biggest problem with cloud computing is that when there is downtime, IT admins don't have anything else to do, which frees up a lot of time for bitching about the downtime on their blogs. Seems familiar from when I was an admin, except on the other side, it was my users bitching at me about a couple of hours of downtime a year.
It's only a big win if Microsoft doesn't have similar problems.
I'm personally very dubious about these online apps as anything but utilities for occasional use.
The main issue for me is that they exist primarily to benefit the hosting company (Google, Microsoft or whoever). We don't need them, they need us to use them, otherwise they can't make money from us.
The current 'install on local machine' application model works perfectly well for end users, but there's less profit for them if you can buy something then use it for years without paying again.
A learning experience is one of those things that say, 'You know that thing you just did? Don't do that.' - D. Adams
Cloud apps have the same problem. When Google Apps or EC2 goes down, it's news.
In my company Google Apps is the most reliable thing we use. Microsoft products are my biggest headache. We have clients that need their work done and I don't have any more time to waste on these crappy machines. We will be switching to Apple for all mission-critical machines in the next three weeks.
If my MS computers could have only 3 hours of downtime a quarter I would be really happy. I used to work for an IT company and they primarily used MS servers for their clients. Big mistake. MS products are a nightmare. Their clients would have been happy with 3 hours of downtime instead of days and days down dealing with MS server issues. I would only avoid cloud computing if there were serious concerns with privacy or hacking.
Hmmmm..... not long after introduction Google Apps have had 15 hours of unplanned downtime. We have apps that have been deemed critical and have had zero unplanned downtime since introduction (knock on wood). Our system was designed for absolute maximum 1 hour RTO and 1/2 hour RPO. Thus far, we haven't had to actually use the DR plan in real life, but tests show we beat those numbers.
I'm sure Google "can" build better systems than I have, but like any other company they did a cost/benefit and decided what they have is good enough. For my company 15 hours down time isn't good enough for systems so we spent the money for a better system.
So.... yes you can at least do it better than Google "has" regardless of if they "can" do better or not. That isn't to say hosted apps aren't good enough in some cases, but to say you cannot provide better if needed is a bit silly.
"reality has a well-known liberal bias" - Steven Colbert
I love being the asshole, but let's be honest here: how many in-house systems actually deliver better uptime than Google?
Not that many. If they did, all us sysadmins would be out of a job. Apps are not perfect. The fact that you can pay Google a few pennies to manage your email, even with some downtime, makes it several orders of magnitude cheaper than an in-house solution for most people.
Give them a break, people can survive without email for a few hours.
-Billco, Fnarg.com
my DSL was down for FIVE DAYS recently due to flooding. The brilliant bell decided to place ALL their DSL centers in the state within 2 blocks of wherever the local river was. D'oh. We got a "500 year flood" and it buried every single one of them.
If 3 hours is outrageous, what does three days classify as?
The irony? I used my gmail while they were down.
They have yet to restore my "backup dialup account", a month later. Sure, 56k isn't exactly a good backup, but they didn't even have THAT.
I work for the Department of Redundancy Department.
If you use gmail, then take a look at the Web Clip line at the top of the page. It shows entries from a set of RSS and Atom feeds. By default the gmail blog is one of the sources it uses, so you should at least know that it exists. I have seen the entry about the outage show up a few times there.
There aren't any gDoc outages that I've seen. The stories so far are about gmail outages, and it's leading people to question whether gDocs will suffer the same.
That said, I don't remember the last time I've had any Google service down. It happens but not so often. My problem is that my internet service is a tad flaky, in part because wireless is my only partly decent broadband option.
And that flakiness leads me to avoid "cloud" computing. You're relying on a service that has no credible assurances of uptime, and if your internet service is down, then what? My experience with T1 service is such that I might be lucky to get same-day repairs on that internet service. I'm not fond of the VOIP idea for the same reason either, if my internet is down, phones are down too, leaving no way to get in contact with people, except for mobile phones whose signal is weak inside the building.
You need a new Exchange admin. Perhaps you should look into one of the ASPs that offers Exchange hosting with SLA guarantees, such as intermedia, mailstreet, usi, etc., folks who host Exchange with 5 9's of uptime. If they manage it, and they DO indeed manage it, in spite of the crap slashdotters like to spread, then it is possible, and it's a matter of someone who is not qualified acting as the admin.
I think, in this case, that it is less a matter of absolute downtime, and more a matter of people's feelings of control over that downtime. Look at cars vs. planes. Flying is safer, but people feel safer driving because they feel like they are in control of the situation.
My suspicion would be that Google hits higher reliability numbers than many in-shop setups, particularly small ones; but the feeling of sitting there, twiddling your thumbs, and waiting for the remote service over which you have no control to come back up is a terrible one. It is much nicer to have to fix a local problem, which requires more effort, but makes you master of your fate (to the degree that anybody ever is).
The smaller, but ultimately more intractable, issue for remote hosted stuff is that it necessarily suffers more potential points of failure than does local stuff. If Google screws up, Google goes down. If somebody between my desktop and Google screws up, Google is down to me. If WAN goes down, but LAN stays up, local apps are still substantially useful (since a vast amount of email and document shuffling is company internal); but remote stuff is useless.
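The points-of-failure argument can be put in numbers with a back-of-the-envelope sketch: if every link in a chain has to be up at once, the chain's availability is the product of the links' availabilities. This assumes failures are independent, which real networks only approximate, and every figure below is a made-up illustration rather than a measurement of Google or anyone else.

```python
from math import prod
from typing import Mapping


def series_availability(components: Mapping[str, float]) -> float:
    """Availability of a chain in which every component must be up at once.

    Assumes independent failures, so the result is only a rough estimate.
    """
    return prod(components.values())


# Hypothetical figures, chosen only to illustrate the shape of the argument.
hosted_path = {
    "office LAN": 0.9995,
    "ISP / WAN link": 0.995,
    "hosted service": 0.999,
}
local_path = {
    "office LAN": 0.9995,
    "local mail server": 0.995,
}

if __name__ == "__main__":
    print(f"hosted path: {series_availability(hosted_path):.4%}")
    print(f"local path:  {series_availability(local_path):.4%}")
```

A chain can never be more available than its weakest link, so each extra hop between the desktop and the service pulls the end-to-end figure down. Which path wins in practice depends entirely on the real numbers for the local server and the WAN link.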
That's true, but there's another side to it; imagine a company with 500 employees. Each employee has their own workstation. Now imagine 1% of those are down constantly. That means five employees will, at any given moment, not be able to perform any work. That's an annoyance, but if a workstation is down for on average 1 hour, then it's still ok.
Now, the important thing to remember here; It's never the same five employees suffering from downtime, and the company as a whole still keeps doing what it does best; earn money. But with a centralised, hosted app, the *entire company* will be down during those three hours of downtime. Might as well give everyone a free day off.
Hosted apps aren't going to fly until this very basic problem is solved. 'Nuff said.
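The two scenarios above can be totalled as lost person-hours. This is only a sketch: the 500 employees, 1% and three hours come from the comment, while the 13-week, 40-hour quarter and the idea that the three-hour outage happens once per quarter are assumptions added here. A plain sum also does not capture the comment's main point, that a central outage stops the whole company at the same moment.

```python
def scattered_loss(employees: int, fraction_down: float,
                   business_hours: float) -> float:
    """Person-hours lost when a fixed fraction of workstations is always down."""
    return employees * fraction_down * business_hours


def central_outage_loss(employees: int, outage_hours: float) -> float:
    """Person-hours lost when everyone is idle for the length of one outage."""
    return employees * outage_hours


EMPLOYEES = 500                        # from the comment
FRACTION_DOWN = 0.01                   # from the comment: 1% of workstations
OUTAGE_HOURS = 3                       # from the comment
BUSINESS_HOURS_PER_QUARTER = 13 * 40   # assumption: 13 weeks of 40 hours

if __name__ == "__main__":
    scattered = scattered_loss(EMPLOYEES, FRACTION_DOWN, BUSINESS_HOURS_PER_QUARTER)
    central = central_outage_loss(EMPLOYEES, OUTAGE_HOURS)
    print(f"scattered workstation failures: {scattered:.0f} person-hours per quarter")
    print(f"one company-wide outage:        {central:.0f} person-hours per quarter")
```

Changing the assumed quarter length or outage frequency moves the totals in either direction, so the comparison is worth rerunning with a company's own numbers.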
systemd is not an init system. It's a GNU replacement.
The part that is being misunderstood is simply this. Instead of just complaining about Google Apps... compare it to the alternatives.
How many companies rely on Microsoft Outlook with Microsoft Exchange Server? When you offer an application or suite to the whole nation or WORLD, and campaign for its use - then YES, you do need to keep a very near-0 downtime to be really successful.
Except, Microsoft Exchange (while often reliable) does have its moments. Sometimes, just from getting clogged by tons of spam, it can come to a crawl. The server can become unavailable due to network issues. Microsoft Outlook has a tendency to run slowly on some machines, or crash regularly. Expecting ANYTHING that uses computers to work 100% perfectly all of the time, although desirable, is completely unrealistic.
I don't think the people here are saying "expect downtime and just deal with it." What is really being said is, "when MS Exchange goes down... or there are internal network hiccups... or when Outlook locks up on your machine... complain loudly on the Internet instead of to your local admin... that way, the world can get a real comparison between Google Apps and the alternative."
The only reason Google Apps seems like the "bad one" here is because people go posting on blogs and news sites about it. Why? Because it's news... it's rare... it's not what people expect of Google. When Exchange server craps out, Outlook locks up, your computer gets a blue-screen-of-death, a hard drive goes bad, a router needs restarting, power goes out to the building, a UPS battery goes bad, etc, etc, etc... nobody bothers posting this on blogs or news sites because, well, it's an every-day occurrence... it's not exactly news.
Then, when you compare systems that are "always up and available 24/7, can be easily accessed from outside of the company without a complicated VPN, have admins that don't gripe if they are taking up dozens of gigs of storage, with the capability of searching through millions of emails in a fraction of a second" to Google Apps... you'll likely notice that these other systems (when you take into account the cost of the servers, routers, admin hours, electricity, software, etc) cost much much more than $50/year per user.
What's happening here is people are comparing Apples to Orangutans and are creating unrealistic expectations. If these companies really have that much cash to just waste on something they have been brainwashed into thinking is perfect, then their next likely step in these economic times is to lay off some of their admins because, after all, why do you need admins if the systems are perfect?
I don't know about anyone else, but the fact that downtime is such a shock on Google is testament to how great the service usually is. Most companies I've worked for don't have an up time record that could even come close to Google.
www.ianhoar.com My blog about geeking out.
... but rather a big win for locally installed and controlled "personal software", as well as - HOPEFULLY - another loss for the evil forces of greed trying to indoctrinate users to the concept of a software subscription model.
Selling software as a subscription is the REAL reason why companies like Microsoft, Google, and so many others are experimenting with Web apps. It's their latest attempt to re-brand software as "content" and convince people to pay for it every month, just like they do cable TV. If they succeed, software publishers will be making far more profit than they do now, and their accountants will be boastful about how regular and predictable the cashflow is.
Just say no to Web apps and every other attempt to sell software as subscriptions.
The fact that you've got a horrible sys admin for your exchange server says a lot more about your company than it does about Microsoft and Exchange.
Fact is a lot of companies are running exchange w/ very little downtime at all.
Linux has been rock-solid from version 1. Version 3 isn't being planned yet.
The main complaint against Linux is that it requires someone who "knows what he is doing". If the same is required of Microsoft solutions, then why not just use Linux?
Interesting that people are more uncomfortable with a 3 hour downtime on gmail than they are with a 3 hour downtime on their local mail server. My guess is that it is the feeling of being out of control. If it is a local problem, there is someone to curse; if it is remote, then you don't even know when it is going to be fixed. This is a good example of how we are psychologically more averse to unknown failures than we are to known ones.