The Sidekick Failure and Cloud Culpability
miller60 writes "There's a vigorous debate among cloud pundits about whether the apparent loss of all Sidekick users' data is a reflection on the trustworthiness of cloud computing or simply another cautionary tale about poor backup practices. InformationWeek calls the incident 'a code red cloud disaster.' But some cloud technologists insist data center failures are not cloud failures. Is this distinction meaningful? Or does the cloud movement bear the burden of fuzzy definitions in assessing its shortcomings as well as its promise?"
It's usually a decision on management's side to not use best practices, despite warnings from the tech dept.
tldr; There's nothing wrong with the technology, just the greedy bastards using it.
Belief? Hope? Preference? The Existential Vortex
If you can't trust your outsourcing partner, replace them or bring the work in-house.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Didn't that throw up any red flags for ANYONE?
This belongs in the BSA story. At least there it might be modded insightful or funny.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
I thought the same thing about "Microsoft".
Okay guys, that joke's done, let's get on with our lives.
No kidding!!! What do you say at this point?
This is an unforeseen hole in the bulletproof Gandhi mechanism, so I foresee a quick "GPL V3.1" to close this.
It already exists. It is called AGPL: http://en.wikipedia.org/wiki/AGPL/
This sig is written in Dutch.
Just like people lose their stuff on personal hard drives when not backed up, they will lose cloud data when not backed up. Both kinds of computing have merits, and long term persistence of data is not automatic with either. Most people do not place THAT high a value on backups of their cell phones. They typically sync with a PC anyway. But any business that doesn't have reliable weekly offsite backups of its fundamental assets should be sued by shareholders/customers for irresponsibility, whether it uses the cloud or not.
Didn't that throw up any red flags for ANYONE?
I was a Sidekick user from 4/2004 until 10/2008. There had been only one 'catastrophic' failure in that time that left Sidekick users without data service for an extended period. Danger produced one of the best mobile devices, which in many ways is still better than anything out there even though the OS and the devices that use it (the various Sidekick models that exist these days) are quite a bit outdated compared to devices like the iPhone.
I miss my Sidekick immensely. I loved true multitasking, a fully capable QWERTY keyboard, and incredible battery life. Unfortunately it didn't sync well with calendaring software, didn't keep up with music playing, and is now partially controlled by Microsoft. There have been immense trade offs with moving to the iPhone but based on my main reason for owning an iPhone (I ride the bus and enjoy the music/video player and screen size) it was the right choice for me.
That said, "cloud computing" is something which usually works (and did, in the case of the Sidekick since 2002). I don't think that this is a proven warning sign that "cloud computing" isn't as reliable as everyone believes, I just think it's proof that companies need to do a much better job of ensuring data integrity than they could have ever imagined before.
Will I stop using Flickr, Google products, and other future "cloud" devices/software because of this? No. I am smart enough, as a computer savvy end-user, to keep my own backups of my data but I do believe people need to become better educated in what can and will happen as we move to the model we have slowly done in the last 10 years.
Personally, I always interpreted cloud computing as software that's running on a number of boxes of which the number can fluctuate without being meaningful (obviously there are performance implications depending on the overall load and number of boxes, but one box going down doesn't inherently bring down the system). One nice thing is these boxes can be geographically distributed as well - so when one data center gets nuked, the others are safe. Now, I realize geographic distribution isn't a requirement but even still, the press release says the data loss is due to a "server failure." Not a data center failure, but the apparent failure of a single server.
So is this really even "the cloud"? Does that mean that Geocities was "the cloud" or that every web host out there is "the cloud" because they've got my data running on a single machine? I certainly never interpreted it that way, but I'm no expert on the matter. It seems like if this data was in "the cloud" that it could have all been retrieved off of another machine somewhere. Perhaps for some customers those other machines might not yet be completely synced with very recent updates, but that would affect a small amount of data for a subset of customers.
To my mind, this failure just goes to show that what people call clouds are merely the mainframes of yesterdecades... For the cloud to become "THE" cloud, the providers need to cooperate to replicate data across their different implementations, such that when one provider suffers an unforeseen crash of unforeseen magnitude, the data is still there in the "real" (in this definition) cloud.
Sure, it would take no small amount of convincing to get the management drones to accept this, but I should think that a cost/benefit analysis that includes catastrophic failure would be somewhat persuasive...
"The number you have dialed is imaginary. Please rotate your phone 90 degrees and try again."
It's called Affero GPL
A single data center apparently without even a geographically distinct failover site is about as far as I can imagine from being a "cloud". Old fashioned best practices in the form of having two or more sites each capable of handling the entire load would have prevented this particular mess, let alone classic cloud approaches like that of the Google File System (GFS) which keeps at least three copies of a file's contents.
(Granted, if you're storing vital stuff in GFS or Amazon S3 you still have a logical single point of failure (e.g. a mistaken delete command) and therefore you aren't freed from the duty of doing your own backups, but that's a separate issue.)
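To make that parenthetical concrete, here is a toy sketch in Python. It is not how GFS or S3 work internally; `ReplicatedStore` and its methods are made-up names for illustration only. It models three replicas that survive a dead server but faithfully replicate a mistaken delete, while an independent backup copy is unaffected.

```python
import copy


class ReplicatedStore:
    """Toy key-value store that keeps every value on several replicas."""

    def __init__(self, replica_count=3):
        # Each replica is a dict standing in for one server (hypothetical model).
        self.replicas = [dict() for _ in range(replica_count)]

    def put(self, key, value):
        for replica in self.replicas:
            if replica is not None:
                replica[key] = value

    def delete(self, key):
        # A delete is a valid operation, so it is applied to every live replica.
        for replica in self.replicas:
            if replica is not None:
                replica.pop(key, None)

    def fail_server(self, index):
        # Simulate losing one server outright.
        self.replicas[index] = None

    def get(self, key):
        # Read from the first live replica that still holds the key.
        for replica in self.replicas:
            if replica is not None and key in replica:
                return replica[key]
        return None

    def snapshot(self):
        # An independent backup: a deep copy taken from a live replica.
        for replica in self.replicas:
            if replica is not None:
                return copy.deepcopy(replica)
        return {}


store = ReplicatedStore()
store.put("contacts", ["alice", "bob"])
backup = store.snapshot()

store.fail_server(0)
after_server_failure = store.get("contacts")   # hardware failure: data survives

store.delete("contacts")                       # mistaken delete hits every replica
after_delete = store.get("contacts")
restored = backup.get("contacts")              # only the backup still has it
```

Replication answers "what if a box dies"; the backup answers "what if someone tells every box to forget". They are different failure modes, which is why you need both.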
Or we could just say that trusting Microsoft for anything is relatively unwise compared to other "higher tier" companies. Or that if you're depending on a service provider that's massively laying off staff you need to take action before something seriously ugly happens, because it likely will.
As a wise auditor once told me:
You can outsource the work, but you can not outsource the responsibility.
If your data is important to you - you must back it up, and you must test your backups.
The end.
-ted
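For what "test your backups" can look like in practice, here is a minimal sketch in Python. The function names and the single-file layout are assumptions for illustration, not anything Danger or T-Mobile used: it takes a backup, restores it to a scratch directory, and compares SHA-256 checksums against the original.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def backup_and_verify(source, backup_dir):
    """Copy `source` into `backup_dir`, then test the backup by restoring it.

    Returns True only if the restored copy has the same SHA-256 digest
    as the original.
    """
    source = Path(source)
    backup_path = Path(backup_dir) / source.name
    shutil.copy2(source, backup_path)          # take the backup

    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / source.name
        shutil.copy2(backup_path, restored)    # the actual restore test
        return sha256_of(restored) == sha256_of(source)


with tempfile.TemporaryDirectory() as work:
    original = Path(work) / "contacts.txt"
    original.write_text("alice\nbob\n")
    backups = Path(work) / "backups"
    backups.mkdir()
    ok = backup_and_verify(original, backups)

    # Corrupt the backup to show that the checksum comparison would notice.
    (backups / "contacts.txt").write_text("garbage")
    corrupted_matches = sha256_of(backups / "contacts.txt") == sha256_of(original)
```

A real setup would restore onto a different machine and run this on a schedule, but the point stands: a backup nobody has ever restored is only a hope.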
Just because you're paying someone to store your data doesn't mean they care about that data as much as you do... That's one of the two big problems with cloud computing that can't be solved by technology. First, nobody cares about your data as much as you do. Second, nobody will protect your data (i.e. control its distribution and prevent unauthorized changes) to the level you find appropriate.
It's usually a good idea to avoid using broad generalities (like I just did), but it seems like in general it would be a bad idea to let someone else be the sole keeper of anything even remotely important or sensitive. There are exceptions, but those seem to be internal to a company (i.e. the company runs its own cloud and has all employees use it). Or military/government applications where centralized security and backup can keep user errors from becoming a real danger to the organization beyond "help I lost my email!".
I think the key here is whether only T-Mobile's data was lost or every customer of the "cloud" was affected. If it was only T-Mobile's data, then the issue is T-Mobile's backup policy; if it was "cloud"-wide, then it's an issue with the "cloud" provider. In either case, I don't think you can paint the entire "cloud" concept as unstable. Cloud computing is really just a dynamic datacenter with all the usual weak links and issues present in a traditional metal datacenter.
I came to the datacenter drunk with a fake ID, don't you want to be just like me?
Leaving aside the fact that a "data center" could consist of two servers under Mabel's desk, this is not a "data center" disaster, nor is it a cloud catastrophe.
This is a contract and contract management failure: the contract with the outsourcer was probably written without specifying that they must do the backups, AND no one established any sort of audit (formal or informal) test to ensure that there _were_ backups being taken and that the outsourcer was performing according to the contract.
Too often, the MBA doing the contract thinks "there, that's handled" once they've gotten all the signatures on the dotted line. "There, backups are handled now" he thinks, because many business folk (not ALL, I don't think it's fair to generalize that far) see these kinds of things as milestones, rather than ongoing processes to be managed.
When you cut through the "cloud", if you look into the center of things, you see that the so-called modern "cloud" computing environment is a giant computer(s), surrounded by high powered priestly geeks, doling out resources to everyone, completely centralized. The priests have some new tricks to entertain the masses with, but there's nothing fundamentally different between cloud computing and IBM's vision of computing in the 1960s.
This is my sig.
This is awfully convenient. Something that at least to my eyes looks a lot like a cloud crashes. Cloud pundits announce:
"if it loses your data - it's not a cloud".
So if Amazon's S3 ever fails horribly and loses everybody's data, then it wasn't a cloud either.
But some cloud technologists insist data center failures are not cloud failures. Is this distinction meaningful?
Do you think the customer will want to argue semantics with you after you've lost their data?
Deltron 3030 - Virus (music video)
I don't think that has anything to do with it; at least not for me. My main concern with cloud computing is trust. Do I trust someone other than myself to not fuck up and lose all my data? For critical data, the answer is no. If somebody is going to fuck up and lose all my data, it's going to be me. I don't know if all the data on a Sidekick would qualify as critical, but it would certainly be annoying as fuck to lose it all.
I'm a TMO subscriber, and I love them, so this is painful. And my sister-in-law is a longtime Sidekick user, so she's in a special agony.
But T-Mobile is in a potentially no-win situation. They obviously have to believe Danger/Microsoft that they have good processes to avoid and recover from such failures. They didn't, and now TMO is probably going to take the hit. On one hand, they should - if the service is important, take responsibility and ensure management. On the other hand, they have good assurances, so hey, how much is enough?
BlackBerry users, you should take note. RIM differs only in scale. And, you hope, depth of resilience. Not that RIM hasn't had outages, though not total failure yet.
TMO may have to tell their Sidekick users to be prepared for the inevitable restore, and of course, work with Danger/Microsoft to re-establish service (even though they don't provide service, D/M does), and of course some money compensation no matter how inadequate.
And maybe offer them shiny new myTouch3Gs to give the disillusioned Sidekick users an option with a marginally better track record.
No, wait, that isn't right. I've had to wipe my G1 every update, and some apps don't have a way to save data. They just don't.
I'm glad I never got on the Sidekick train, but I have no hope that this won't some day hit me. Do you suppose the next major Sidekick update will include data backup? :)
deleting the extra space after periods so i can stay relevant, yeah.
This is a service run by Microsoft. Microsoft is a bit hostile to consumers. It would be ironic and sad if Microsoft's failure to maintain the Sidekick service gets blamed on the faceless "Cloud" and it hurts Microsoft's competitors.
Have a nice time.
Danger held your data hostage from the start and didn't provide backup. Then, when Microsoft took them over, it was clear that they were going to mess with the service and servers. No backup + Microsoft mucking with the servers = kiss your data goodbye.
But that's no more an indictment of hosted services or "cloud computing" than a Windows BSOD is an indictment of desktop computing. Microsoft screwed up, and quite predictably, too.
Just define away your problems. ROFL.
Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
Why on Earth would you trust your valuable data (and if it wasn't valuable to you, why keep it in the first place?) to someone else, someone who doesn't answer to the same people you do? I have always thought that "the cloud" is an epic fail waiting to happen. As a concept, it makes no sense. It's a scheme worthy of Professor Harold Hill himself.
You want your data safe? You want it backed up properly? Don't want to lose it? Then put it on your own hardware and take care of it yourself. Don't leave it to someone else to save your bacon when something goes wrong. Because, in the end, they don't care about you. You're just a monthly fee to them, and the agreement/contract/whatever you signed with them absolves them of all responsibility.
"My country, right or wrong; if right, to be kept right; and if wrong, to be set right." --Senator Carl Schurz (1872)
No it's cold. Besides how am I going to watch these latest episodes of Stargate and Eureka if I'm outside playing with the squirrels and birds?
"I disapprove of what you say, but I will defend to the death your right to say it." - historian Evelyn Beatrice Hall
Well, any time you're storing data in a central place, you have a greater consequence of failure. That's a downside of "cloud computing", or any web application that stores data in a database too.
The alternative approach is everyone to have a local version of their data, which will be lost by individuals all the time but not by everyone all at once.
Obviously, if you have a server that's a single point of failure for your company, and you botch a maintenance, something went very wrong. And not having a backup - it seems strange for a company that's been around the block a few times and has big resources behind it. You have to write this off as more of a specific failure and not a failure of the concept of storing data on a remote server.
I do have a good friend that works for Danger - I really don't envy the week he must be having.
-- Kate
Tip: If you want to link to specific part in youtube video, you can add #t=1m3s etc on it, ie http://www.youtube.com/watch?v=kcFUDvTFokg#t=1m40s
Also adding &hd=1 gives hq/hd version.
I know my songs, videos, and other important files are backed-up across triple drives. I don't know if the same is true if I stored them online, and this major failure of Sidekick demonstrates I'm right not to trust them.
That depends entirely on the online storage service you use. If your contract says the files are backed up across triple drives, then you've a right to expect that they are. If your contract doesn't say that, then you shouldn't expect it. Simple.
Now, I'd argue that any cloud service worthy of the name ought to have very robust mirrored storage. But since there's no legal definition of the word, you'd better read the contract.
Cloud architecture shards data
In this case it certainly did.
Why, without your clothes, you're naked, Miss Dudley!
The TOS probably made the users aware that "your data is in Danger" so they can't complain now :-)
not just stuffy history book stuff or national security, IMPHO it fully applies to "the cloud."
if Microsoft can't even build a robust cloud environment, that experiment is done.
"danger," indeed.
if this is supposed to be a new economy, how come they still want my old fashioned money?
Microsoft gutted Danger and left it on life support, but all the while they led their customers (T-Mobile and users) to believe Danger was thriving and doing fine. Wow, doesn't that sound like Paulson in early 2007 having stated that the banking system was just fine? The difference: Paulson really was clueless, while Microsoft knew darn well they'd pulled most of Danger's developers over to their project Pink.
This is what should be up in lights with flares and fireworks and not anything about how bad/good cloud computing is. But once again, there is Microsoft at the wheel and yet the press is saying "pay no attention to that man behind the curtain".
And this interest in tying this to cloud computing sounds eerily familiar since I just read how Steve Ballmer was bashing IBM for not running their business correctly. Basically, paying too much attention to software and cloud computing, and he's all amped about this right when yet another Microsoft failure proves how bad they are at this. Could be spin control so watch for more of the same if it is.
LoB
"Anyone who stands out in the middle of a road looks like roadkill to me." --Linus
Mod the parent up!
There are two sides to this (at least). If you're moving your data "to the cloud" you'd expect that "the cloud" is one hell of a lot more reliable than you are. Let's face it, they should be - the economies of scale mean it's a lot cheaper for them to host your data and lots of others' data than it is for you alone.
But that isn't what's happened in this case, here Microsoft (!) haven't even covered the basics. This is stunning.
So does this call into question "cloud computing" or just Microsoft's "cloud computing"? This is a difficult question to answer, without being able to see for yourself your cloud partner's infrastructure and procedures you can't really be sure... But would anyone make such a foolish mistake? Microsoft have proven that the answer is "yes, if it's Microsoft", the real question is should that be just: "yes"?
I think most of us now want a more hybrid approach, "in the cloud" is nice, but I also want a "local copy".
Then you have to think about the other kind of "lose" where others gain access to data they shouldn't see...
This is an unforeseen hole in the bulletproof Gandhi mechanism, so I foresee a quick "GPL V3.1" to close this. And then all is well.
How is it a hole when people who don't redistribute code aren't required to redistribute the source that created it? If you maintain a local branch of my code and use it to process your data, more power to you. It'd be nice if you did give back your changes, but that wasn't the offer I made to you and I don't have any right to expect it of you. End-user licenses like the AGPL are dangerous hacks that'll get more bad press than they'll make up for with the minor community good they do.
Dewey, what part of this looks like authorities should be involved?
"According to some reports, the failure was due to a SAN (Storage Area Network) gone wrong at Microsoft's end. It is claimed that Microsoft does not have a working backup of some of the data that has gone missing from customers devices. The SAN upgrade is rumoured to have been outsourced to Hitachi to complete"
"Microsoft, possibly trying to compensate for lost and / or laid-off Danger employees, outsources an upgrade of its Sidekick SAN to Hitachi, which -- for reasons unknown -- fails to make a backup before starting"
"Real" cloud computing is supposed to be based on a mesh of geographically diverse, redundant servers each carrying various subsets of the data. Think RAID5 for servers, with each partition located in a different part of the world and on different networks.
Which means it is nothing more than an internet based service with five 9s of reliability and availability.
However it is an *expensive* internet based service so it needs a new moniker. But without a "Cloud Computing Consortium" with ownership of the trademark "cloud computing" to enforce correct usage, there's nothing to prevent everyone and their dog from using the term incorrectly for any "always connected" application.
The problem with all this is that it's almost impossible for an end user to know for sure if someone really has a proper cloud application until something fails. If there is a total failure of a site and no one notices, you've got a working cloud. If people lose data or functionality, you don't.
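For anyone wondering what "five 9s" works out to, here is a back-of-the-envelope sketch in Python. The 99% single-site figure is an assumed number for illustration, and the multi-site formula assumes site failures are independent, which real outages often are not.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability (0..1)."""
    return (1 - availability) * MINUTES_PER_YEAR


def combined_availability(site_availability, sites):
    """Availability when any one of `sites` independent sites can serve alone.

    Assumes failures are independent, which is optimistic in practice.
    """
    return 1 - (1 - site_availability) ** sites


five_nines = downtime_minutes_per_year(0.99999)   # about 5.3 minutes a year
one_site = downtime_minutes_per_year(0.99)        # about 5,256 minutes (3.65 days)
three_sites = combined_availability(0.99, 3)      # about 0.999999 on paper
```

On paper, three mediocre sites beat one excellent one. That only holds if each site really can carry the whole load and the data is actually present at all of them.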
I've been on slashdot so long I'm starting to get out of touch with the cool stuff if it ain't on slashdot.