Slashdot Mirror


Server Failure Destroys Sidekick Users' Backup Data

Expanding on the T-Mobile data loss mentioned in an update to an earlier story, reader stigmato writes "T-Mobile's popular Sidekick brand of devices and their users are facing a data loss crisis. According to the T-Mobile community forums, Microsoft/Danger has suffered a catastrophic server failure that has resulted in the loss of all personal data not stored on the phones. They are advising users not to turn off their phones, reset them or let the batteries die in them for fear of losing what data remains on the devices. Microsoft/Danger has stated that they cannot recover the data but are still trying. Already people are clamoring for a lawsuit. Should we continue to trust cloud computing content providers with our personal information? Perhaps they should have used ZFS or btrfs for their servers."

11 of 304 comments

  1. "they should have used ZFS or btrfs" by Manip · · Score: 5, Insightful

    This seems a rather silly point to make. I know this is Slashdot and we have to suggest Open Source alternatives but throwing out random file systems as a suggestion to fix poor management and HARDWARE issues is some place between ignorant and silly.

    Perhaps they should have had at least mirrored or striped RAID, with an off-site backup every week or so?

    1. Re:"they should have used ZFS or btrfs" by timmarhy · · Score: 4, Insightful
      retarded comments like that are the reason these zealots aren't taken seriously in the enterprise.

      i'd hazard a guess that the offsite backups were corrupted as well somehow or were silently failing.

      --
      If you mod me down, I will become more powerful than you can imagine....
    2. Re:"they should have used ZFS or btrfs" by sopssa · · Score: 5, Insightful

      Exactly, this could be a software bug too, and that could easily destroy or corrupt the backup data as well. I really doubt this service was run without backups.

      The type of filesystem has nothing to do with this.

    3. Re:"they should have used ZFS or btrfs" by Znork · · Score: 5, Insightful

      I really doubt this service was run without backups.

      Knowing 'enterprise' backups I'd bet there was at least a backup client installed and running. However, I'm equally sure that the backups were, at best, tested once in a disaster recovery exercise and were otherwise never verified.

      Further, responsibility would probably be shared between a storage department, a server operations department and an application management department, neatly ensuring that no single person or function is in the position to even know what data is supposed to be backed up, what limitations there are to ensure consistency (cold/hot/inc/etc), to monitor that that's actually what does happen and that it keeps happening as the application and server configuration evolves.

      Backups of dubious value do not seem to be a rarity in enterprise settings.

    4. Re:"they should have used ZFS or btrfs" by petes_PoV · · Score: 5, Insightful
      It's not a backup unless you can prove it will restore. Until then it's just a waste of tape, or disk, and time.

      The point about backups is not to tick the box saying "taken backup?" but to provide your business / customers / whatever with a reliable last resort for restoring almost all their data. If you don't have 100% certainty that it will work, you don't have a backup.
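
A minimal sketch of the "prove it will restore" idea, assuming a plain directory of files archived with Python's standard `tarfile` module. The names `make_backup`, `verify_restore` and `manifest` are hypothetical helpers invented for illustration; nothing here reflects how Danger's systems actually worked. The backup only counts as good once it has been restored into a scratch directory and every file's checksum matches what was recorded at backup time.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def make_backup(source: Path, archive: Path) -> dict:
    """Archive `source` and return the manifest recorded at backup time."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=".")
    return manifest(source)


def verify_restore(archive: Path, expected: dict) -> bool:
    """Restore into a scratch directory and compare against the manifest."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        return manifest(Path(scratch)) == expected
```

Running `verify_restore` on a schedule, not just once during a disaster recovery exercise, is what turns "a backup client is installed" into evidence that the data can actually come back.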

      --
      politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  2. Re:Backups? by TheSunborn · · Score: 5, Insightful

    Or this was really a software error, and the backup servers in another datacenter just copied the faulty data/delete command.

    They should really be far too big to have all their data stored in a single datacenter with no offsite backup. (Or they should have an entry on thedailywtf.com)

  3. It's The Backups Stooped by tres · · Score: 4, Insightful

    This is an issue of irresponsibility. Plain and Simple. The company responsible for maintaining the data should -- at the very least -- have had some full system backup from last month. If they had some old backup somewhere at least you could chalk it up to systems failure or bad backup tape or bad admin or something.

    But the fact that there is no backup anywhere indicates brazen negligence on the part of everyone responsible for the data. Everyone who had a part in designing the system and managing the system is culpable. The most ridiculous part of this is the over-reliance on server-side data storage by the sidekick designers.

    --
    Notes From Under *nix: blas.phemo.us
    1. Re:It's The Backups Stooped by 1s44c · · Score: 4, Insightful

      But the fact that there is no backup anywhere indicates brazen negligence on the part of everyone responsible for the data. Everyone who had a part in designing the system and managing the system is culpable. The most ridiculous part of this is the over-reliance on server-side data storage by the sidekick designers.

      I will bet you there were good people -SCREAMING- to fix the backups, implement and test failover and all sorts of other good things. In my experience things like this are due to management refusing to spend money fixing problems that have not lost customers yet.

  4. WTF by ShooterNeo · · Score: 4, Insightful

    This is unbelievably bad. The real problem is: why aren't there incremental off-site backups to another server farm? A weekly binary difference snapshot would have made this failure less catastrophic.

    Ultimately, with a complex application like this, you can't guarantee 100% that the code doesn't have a bug in it that could result in loss of user data. You can be ALMOST sure it won't, but 100% is not possible with current analysis techniques. (even a mathematical proof of correctness wouldn't protect you from a hacker)

    But a properly done set of OFFLINE backups, stored on racks of tapes or hard disks in a separate physical facility: you can be pretty sure that data isn't going anywhere.
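
A rough sketch of the incremental snapshot idea, under stated assumptions: it works at file granularity (copying whole files whose checksum changed), which is simpler than a true binary difference, and the names `file_hashes` and `incremental_snapshot` are hypothetical. Each run copies only what changed since the previous run into a fresh snapshot directory, which could then be shipped to a separate facility and taken offline.

```python
import hashlib
import shutil
from pathlib import Path


def file_hashes(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def incremental_snapshot(source: Path, snapshot_dir: Path, previous: dict):
    """Copy files that are new or changed since `previous` into snapshot_dir.

    Returns (current_hashes, changed_paths). Pass current_hashes as
    `previous` on the next run. Deletions are not tracked in this sketch.
    """
    current = file_hashes(source)
    changed = [rel for rel, digest in current.items()
               if previous.get(rel) != digest]
    for rel in changed:
        dest = snapshot_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / rel, dest)  # copy2 also preserves timestamps
    return current, changed
```

Because every snapshot is written once and never modified afterwards, a bug or bad delete command on the live servers cannot reach back and damage earlier snapshots, which is the property the offline copies are meant to provide.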

  5. RIP Sidekick by drinkypoo · · Score: 4, Insightful

    With all the competition in the smartphone market today, this is probably an unrecoverable error. If they manage to recover the data then they will come off as heroes for having the courage to tell their customers promptly. Otherwise they just look like what they are: incompetent. No great loss, though.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  6. The value of data by symbolset · · Score: 4, Insightful

    Granted, this isn't cheap, but our data isn't either.

    Microsoft bought Danger for half a billion dollars. Current estimates of the value of this data are roughly... half a billion dollars, plus a little. There's little doubt that in addition to destroying the entire value of the acquisition they've created a connection between "Microsoft", "Danger" and "data loss". In their release T-Mobile isn't being shy about tying those things together. Not good. That's going to have impacts even for some completely unrelated cloud-based products like Azure.

    Somebody's about to get a really awkward performance review.

    --
    Help stamp out iliturcy.