Slashdot Mirror


Server Failure Destroys Sidekick Users' Backup Data

Expanding on the T-Mobile data loss mentioned in an update to an earlier story, reader stigmato writes "T-Mobile's popular Sidekick brand of devices and their users are facing a data loss crisis. According to the T-Mobile community forums, Microsoft/Danger has suffered a catastrophic server failure that has resulted in the loss of all personal data not stored on the phones. They are advising users not to turn off their phones, reset them or let the batteries die in them for fear of losing what data remains on the devices. Microsoft/Danger has stated that they cannot recover the data but are still trying. Already people are clamoring for a lawsuit. Should we continue to trust cloud computing content providers with our personal information? Perhaps they should have used ZFS or btrfs for their servers."

15 of 304 comments

  1. "they should have used ZFS or btrfs" by Manip · · Score: 5, Insightful

    This seems a rather silly point to make. I know this is Slashdot and we have to suggest Open Source alternatives, but throwing out random file systems as a fix for poor management and HARDWARE issues is somewhere between ignorant and silly.

    Perhaps they should have had at least mirrored or striped RAID, with an off-site backup every week or so?

    1. Re:"they should have used ZFS or btrfs" by timmarhy · · Score: 4, Insightful
      retarded comments like that are the reason these zealots aren't taken seriously in the enterprise.

      i'd hazard a guess that the offsite backups were corrupted as well somehow or were silently failing.

      --
      If you mod me down, I will become more powerful than you can imagine....
    2. Re:"they should have used ZFS or btrfs" by sopssa · · Score: 5, Insightful

      Exactly. This could be a software bug too, and one that could easily destroy or corrupt the backup data as well. I really doubt this service was run without backups.

      The type of filesystem has nothing to do with this.

    3. Re:"they should have used ZFS or btrfs" by Znork · · Score: 5, Insightful

      I really doubt this service was run without backups.

      Knowing 'enterprise' backups I'd bet there was at least a backup client installed and running. However, I'm equally sure that the backups were, at best, tested once in a disaster recovery exercise and were otherwise never verified.

      Further, responsibility would probably be shared between a storage department, a server operations department and an application management department, neatly ensuring that no single person or function is in the position to even know what data is supposed to be backed up, what limitations there are to ensure consistency (cold/hot/inc/etc), to monitor that that's actually what does happen and that it keeps happening as the application and server configuration evolves.

      Backups of dubious value do not seem to be a rarity in enterprise settings.

    4. Re:"they should have used ZFS or btrfs" by Anonymous Coward · · Score: 3, Insightful

      Repeat after me, you haven't got backups unless you've tested RESTORES.

    5. Re:"they should have used ZFS or btrfs" by petes_PoV · · Score: 5, Insightful
      It's not a backup unless you can prove it will restore. Until then it's just a waste of tape, or disk, and time.

      The point about backups is not to tick the box saying "taken backup?" but to provide your business / customers / whatever with a reliable last resort for restoring almost all their data. If you don't have 100% certainty that it will work, you don't have a backup.
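A minimal sketch of what "prove it will restore" could look like in practice, assuming the data is a plain directory tree and the backup is a gzip-compressed tar archive. The function names (`make_backup`, `verify_restore`, `tree_digests`) are hypothetical and say nothing about how the Sidekick service was actually built.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def make_backup(source: Path, archive: Path) -> None:
    """Write the whole source tree into a gzip-compressed tar archive."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=".")


def verify_restore(source: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it to the source."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        return tree_digests(Path(scratch)) == tree_digests(source)
```

The check only passes if every file comes back byte-for-byte identical, which is a stronger statement than "the backup job exited successfully". On a live system the comparison would have to run against a consistent snapshot, since the source keeps changing.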

      --
      politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  2. Re:Backups? by TheSunborn · · Score: 5, Insightful

    Or this was really a software error, and the backup servers in another datacenter just copied the faulty data/delete command.

    They should really be far too big to have all their data stored in a single datacenter with no offsite backup. (Or they should have an entry on thedailywtf.com)

  3. It's The Backups Stooped by tres · · Score: 4, Insightful

    This is an issue of irresponsibility. Plain and Simple. The company responsible for maintaining the data should -- at the very least -- have had some full system backup from last month. If they had some old backup somewhere at least you could chalk it up to systems failure or bad backup tape or bad admin or something.

    But the fact that there is no backup anywhere indicates brazen negligence on the part of everyone responsible for the data. Everyone who had a part in designing the system and managing the system is culpable. The most ridiculous part of this is the over-reliance on server-side data storage by the sidekick designers.

    --
    Notes From Under *nix: blas.phemo.us
    1. Re:It's The Backups Stooped by 1s44c · · Score: 4, Insightful

      But the fact that there is no backup anywhere indicates brazen negligence on the part of everyone responsible for the data. Everyone who had a part in designing the system and managing the system is culpable. The most ridiculous part of this is the over-reliance on server-side data storage by the sidekick designers.

      I will bet you there were good people -SCREAMING- to fix the backups, implement and test failover and all sorts of other good things. In my experience things like this are due to management refusing to spend money fixing problems that have not lost customers yet.

  4. Re:A server failure? by Hadlock · · Score: 3, Insightful

    Reportedly Sidekicks are thin clients: other than making phone calls, everything on the phone is saved on the server side. Which is a special kind of retarded in today's world, where a BlackBerry performs all the same functions and provides a local backup feature. But yeah, as for the backups, all your backups are worthless if your data backup code is flawed and nobody ever checks the backup tapes. When MS bought the service, they probably changed the location the servers were in, plugged everything back in, and kept going. I imagine a project like that would be on a short timetable, and "checking to see that the backup tapes are really being backed up to" is low on the priority list when the service is already live.

    --
    moox. for a new generation.
  5. WTF by ShooterNeo · · Score: 4, Insightful

    This is unbelievably bad. The real problem is: why aren't there incremental off-site backups to another server farm? A weekly binary difference snapshot would have made this failure less catastrophic.

    Ultimately, with a complex application like this, you can't guarantee 100% that the code doesn't have a bug in it that could result in loss of user data. You can be ALMOST sure it won't, but 100% is not possible with current analysis techniques. (Even a mathematical proof of correctness wouldn't protect you from a hacker.)

    But with a properly done set of OFFLINE backups, stored on racks of tapes or hard disks in a separate physical facility, you can be pretty sure that data isn't going anywhere.
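As a rough illustration of the "binary difference snapshot" idea, the sketch below compares two versions of a data image block by block and keeps only the blocks that changed. The block size and the names `diff_snapshot` and `apply_snapshot` are assumptions made for the example, not a description of any real backup product.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for the example


def split_blocks(data: bytes) -> list[bytes]:
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def diff_snapshot(old: bytes, new: bytes) -> dict:
    """Record only the blocks of `new` that differ from `old`, plus the new length."""
    old_blocks = split_blocks(old)
    changed = {}
    for index, block in enumerate(split_blocks(new)):
        is_new_block = index >= len(old_blocks)
        # Hashes are compared here so that a real system could keep only the
        # previous hashes around instead of rereading the old data.
        if is_new_block or (
            hashlib.sha256(old_blocks[index]).digest()
            != hashlib.sha256(block).digest()
        ):
            changed[index] = block
    return {"length": len(new), "changed": changed}


def apply_snapshot(old: bytes, snapshot: dict) -> bytes:
    """Rebuild the newer version from the old data plus the difference snapshot."""
    old_blocks = split_blocks(old)
    block_count = -(-snapshot["length"] // BLOCK_SIZE)  # ceiling division
    rebuilt = [
        snapshot["changed"].get(index, old_blocks[index] if index < len(old_blocks) else b"")
        for index in range(block_count)
    ]
    return b"".join(rebuilt)[: snapshot["length"]]
```

A weekly snapshot built this way stays small when little has changed, but it is only useful together with the full copy it was diffed against, and both would need to live in a separate facility to count as the off-site backup described above.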

  6. Re:See it as an opportunity by AnotherUsername · · Score: 3, Insightful

    Now is the opportunity for opensource to show what it's good for. Someone whip together a small app to extract all info from the Sidekick, put it up on sourceforge for FREE and you have tons of goodwill for OSS. Of course, the app should be Linux-only, thus forcing all Sidekick users to install Ubuntu...

    Thus eliminating any goodwill that would have been gained...

    Really, if you think that open source is a viable option for the masses, you shouldn't care which operating system a powerful application like the one you describe is on. If you really care about using open source for goodwill, releasing it simultaneously on all operating systems should be your goal. How is forcing people to use Ubuntu via software applications any different from Microsoft forcing people to use Windows via software applications?

    --
    I don't like Linux. This doesn't make me a troll.
  7. RIP Sidekick by drinkypoo · · Score: 4, Insightful

    With all the competition in the smartphone market today, this is probably an unrecoverable error. If they manage to recover the data then they will come off as heroes for having the courage to tell their customers promptly. Otherwise they just look like what they are: incompetent. No great loss, though.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  8. You assume Danger used a MSFT platform by xswl0931 · · Score: 3, Insightful

    Looking at the timeframe that Danger was acquired by MSFT and that the Danger OS was likely based on NetBSD (http://en.wikipedia.org/wiki/Danger_Hiptop), it's more likely that Danger was still using NetBSD as their Server Software and this was merely a process issue. Blaming it on the "Microsoft Platform" without any real data is just spreading FUD.

  9. The value of data by symbolset · · Score: 4, Insightful

    Granted, this isn't cheap, but our data isn't either.

    Microsoft bought Danger for half a billion dollars. Current estimates of the value of this data are roughly... half a billion dollars, plus a little. There's little doubt that in addition to destroying the entire value of the acquisition they've created a connection between "Microsoft", "Danger" and "data loss". In their release T-Mobile isn't being shy about tying those things together. Not good. That's going to have impacts even for some completely unrelated cloud-based products like Azure.

    Somebody's about to get a really awkward performance review.

    --
    Help stamp out iliturcy.