Slashdot Mirror


Why Power Failures Can Always Lead To Data Loss

bigsmoke writes "So, all your servers run on RAID. You back up religiously. You're even sure that your backups are recoverable. But do you also need a UPS? According to Halfgaar (on Slashdot before to promote better Linux backup practices), yes, usually you do. He argues that despite technological advancements such as file system journaling, power failures can still cause data loss in most setups."

19 of 456 comments

  1. Well no shit, Sherlock by Skyshadow · · Score: 5, Insightful

    Power losses can cause data loss? Gee, you mean that my system that relies on electricity for everything it does can be adversely affected by power outages even if I take precautions? That's some good admin work there, Lou -- if only there were some sort of law that covered the tendency of things that can go wrong to go wrong...

    Next week: Fires can make things warm, floods can make things wet.

    --
    Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
    1. Re:Well no shit, Sherlock by Anonymous Coward · · Score: 5, Funny

      I don't know about you, but my servers run on the power of cotton candy and happy thoughts.

    2. Re:Well no shit, Sherlock by Skyshadow · · Score: 5, Funny

      I don't know about you, but my servers run on the power of cotton candy and happy thoughts.

      As a former sysadmin, I would think that any machine reliant on 'happy thoughts' would be the most crash-prone system in the history of computing.

      --
      Every year during my review, I just pray the words "slashdot.org" aren't mentioned.
    3. Re:Well no shit, Sherlock by Anonymous Coward · · Score: 5, Informative

      Ok, people who don't just read the executive summary knew this all along, but perhaps it's necessary that someone spells it out for the rest: Journaling and RAID do not prevent data loss in case of a power outage (and many more circumstances). If you know why, just skip the article. If you're wondering how you can lose data if you write everything to two disks and your filesystem guarantees its own consistency, then perhaps this is the wake up call that you need.

    4. Re:Well no shit, Sherlock by Timothy Brownawell · · Score: 5, Funny

      No, it really does have some interesting observations, with some very scary implications:

      One of the first things that will happen, is that the memory DIMMs will no longer be refreshed properly (DRAM needs to be refreshed constantly otherwise it will loose it's data) and very rapidly, the memory will contain only garbage. The hard drives and DMA controller however, will run a bit longer; so if data is being written to disk, the DMA controller will keep reading data from memory, but it has no idea that this data is corrupted.

      However, we've recently seen that RAM holds state well enough to preserve crypto keys thru a power cycle. This has very scary implications: the RAM knows what's happening, and behaves differently (loses data immediately on power-off or remembers it for several seconds) in order to cause the most difficulty for the owner of the machine.

      Not only are computer components intelligent and self-aware, they're also out to get us!

    5. Re:Well no shit, Sherlock by Anonymous Coward · · Score: 5, Funny

      I can offer you a Happy Thought UPS. It's a box of puppies. Be careful though, it only has 500 puppy Amps of capacity.

    6. Re:Well no shit, Sherlock by Anonymous Coward · · Score: 5, Funny

      Your mom loves you and pays for the electricity. That doesn't mean that your servers run on love.

    7. Re:Well no shit, Sherlock by supersat · · Score: 5, Informative

      Are you sure your disks are in write-through mode? Have you checked? Brad Fitzpatrick (of LiveJournal, memcached, OpenID, etc. fame) discovered that many disks lie about being in write-through mode, and wrote a utility to check it.
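One rough way to probe the point above is to time repeated fsync calls. A 7200 RPM disk turns 120 times per second, so a far higher rate of synchronous rewrites of the same block suggests that something (the drive, a controller, or the OS) is acknowledging writes from a cache. The sketch below is not the utility mentioned in the comment; the function names and the RPM-based threshold are assumptions for illustration, and SSDs or battery-backed controllers will legitimately exceed the threshold.

```python
import os
import tempfile
import time


def fsync_rate(path, writes=50, payload=b"x" * 512):
    """Return how many write+fsync cycles per second `path` sustains."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block each time
            os.write(fd, payload)
            os.fsync(fd)                  # ask the OS to flush to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return writes / elapsed if elapsed > 0 else float("inf")


def looks_cached(rate, rpm=7200):
    """Heuristic (assumption): more fsyncs per second than platter rotations."""
    return rate > rpm / 60.0


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        target = tmp.name
    try:
        rate = fsync_rate(target)
        print(f"{rate:.0f} fsyncs/s, suspiciously fast: {looks_cached(rate)}")
    finally:
        os.remove(target)
```

A fast result only hints at caching somewhere in the stack. Confirming real durability needs a pull-the-plug test against a second machine that records which writes were acknowledged.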

  2. Illiteracy by carou · · Score: 5, Funny

    From TFA:

    (DRAM needs to be refreshed constantly otherwise it will loose it's data)

    Fly, little data! Be free!

  3. can always lead to data loss? by internerdj · · Score: 5, Funny

    Definitely maybe?

  4. Well of course you need UPSs, but by pembo13 · · Score: 5, Informative

    APC is the only UPS maker on the market that has at least spent some small effort so that their UPSs can be properly integrated with a Linux machine. I made the mistake of purchasing an Ultra UPS as it was cheaper than the APC.

    --
    "Thanks for all the money you paid to us. We've used it to buy off ISO among other things" -Microsoft
  5. Don't forget to test, people. TEST! by sco_robinso · · Score: 5, Insightful

    In my company, everything is behind UPSs. Our SAN is even behind 2 separate UPSs. We thought everything was configured properly, but you'd be surprised what comes to roost when you test everything.

    We recently had a test night where all we did was test the UPS system and shutdown procedures, and there were a couple of gotchas. Interestingly, the APC PowerChute app we were using defaulted to shutting down the UPS completely after the [first] server went down - not good. This setting was buried fairly deeply in the configuration.

    Equally important to any protection measure, be it RAID, Power Protection, whatever - is testing!

  6. Re:What this really points out... by Macman408 · · Score: 5, Interesting

    This is old hat in embedded systems.

    Yes, but embedded systems usually have lower power requirements, or at the very least, a smaller range of power requirements. You can't add 3 PCIe cards, a few extra drives, and a few more GB of RAM to most embedded systems.

    I worked on the design of an embedded system a few years ago that had a holdup spec - I think it was supposed to survive for 50 ms with no power. So a 50 ms power interruption would result in continued operation, while an outage longer than that was allowed to reset the board. However, the power draw on the board was around 200 Watts; being able to supply that much power for that long in a fairly compact form factor was a huge hurdle. It also caused airflow problems, because the giant capacitors would prevent air from getting to other components on the board, like the CPU. In the next version of the spec, I believe the holdup requirement was eliminated - apparently we weren't the only ones having trouble meeting that requirement.
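The arithmetic behind that hurdle can be sketched as follows. The 200 W and 50 ms figures come from the comment; the 12 V rail sagging to 10 V is purely an assumed example, and converter losses are ignored. The energy needed is E = P * t, and a capacitor discharging from V_start to V_min delivers E = C * (V_start^2 - V_min^2) / 2.

```python
def holdup_energy_joules(power_watts, holdup_seconds):
    """Energy the board consumes during the holdup window (E = P * t)."""
    return power_watts * holdup_seconds


def holdup_capacitance_farads(energy_joules, v_start, v_min):
    """Capacitance needed to deliver `energy_joules` while the rail
    sags from v_start to v_min: C = 2E / (v_start^2 - v_min^2)."""
    if v_start <= v_min:
        raise ValueError("v_start must be greater than v_min")
    return 2.0 * energy_joules / (v_start ** 2 - v_min ** 2)


if __name__ == "__main__":
    energy = holdup_energy_joules(200.0, 0.050)  # figures from the comment
    # Assumed rail voltages, for illustration only.
    cap = holdup_capacitance_farads(energy, v_start=12.0, v_min=10.0)
    print(f"energy: {energy:.1f} J, capacitance: {cap:.3f} F")
```

Under those assumed voltages the result is roughly 10 J and close to half a farad, which is consistent with the "giant capacitors" described. Storing the energy at a higher voltage shrinks the capacitance considerably, at the cost of an extra conversion stage.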

  7. Re:That's what I always say sometimes by alta · · Score: 5, Interesting

    Rule #1.

    NEVER plug a laser printer into a UPS. The power that the fuser draws is WAY too much.

    Look at some of the cheap office units: they show little pictures on them, and the printer icon is on the surge-only side, NOT the battery/surge side.

    If the power goes out, you should NOT be trying to print.

    http://articles.techrepublic.com.com/5100-10878_11-6085460.html See #6

    http://arstechnica.com/guides/other/ups.ars/3

    http://www.jetcafe.org/npc/doc/ups-faq.html#0405 see 04.05

    Would you put a space heater on a UPS? Shredder? Vacuum? Table Saw? If you put a laser printer on it, you may as well.

    --
    Do not meddle in the affairs of sysadmins, for they are subtle, and quick to anger.
  8. Our Tandem by PIPBoy3000 · · Score: 5, Interesting

    This reminds me of my favorite power loss story. The facility was doing a generator test, where we were supposed to switch over from city power to the generator. Unfortunately it didn't happen smoothly and the UPS kicked in. Sadly it turned out that so many servers had been added since the original design that the UPS was really only good for fifteen minutes or so. The final problem was that our operator didn't notice the issue quickly enough, and so the next thing everyone in IT knew was that our main data center had just lost power.

    We spent most of the day getting our servers back up from various states of disrepair (confirming the article, power loss is superbad). It turns out that our main medical software ran on a Tandem. Though the drives and such lost power, the CPU had a backup of D-batteries and survived the power loss just fine. Needless to say, we stopped making fun of their seemingly primitive emergency backup power.

  9. Re:no, that's not the scary thing by JustOK · · Score: 5, Funny

    its not worth loosing you're cool about grammer misteaks and etc.

    --
    rewriting history since 2109
  10. Voltage Spikes by natoochtoniket · · Score: 5, Informative

    The typical small UPS system has some amount of surge protection built-in. But it's typically only good for at most a couple thousand joules. But then, if you get a spike that is big enough to blow a varistor, you also get to buy a new UPS.

    A better solution is to put a "whole house" surge protector on the circuit-breaker panel. It protects everything, with a much higher number of joules. Five or six pounds of varistors can absorb a lot more shock than one ounce of varistors. They cost about $100, and can be found at most big hardware stores or electrical supply houses. That doesn't eliminate the need for a UPS. It does protect the UPS, along with the other equipment, from most voltage spikes.

    Last year, lightning hit the power pole 20 feet from my house. We know where it hit because the pole caught fire. My next-door neighbors on both sides lost every single piece of electrical equipment -- not just computers, TV's, and stereos, but also fridge, microwave, water heater, and range. All of it was damaged beyond repair. We barely noticed the hit, except for the bright flash of light, and had no damage at all.

  11. He forgot UPS-triggered shutdown by SleptThroughClass · · Score: 5, Insightful

    The author did not mention having the system set up to have the UPS trigger an automatic shutdown.

    If you're not at the machine, or don't know how to shut down without a CRT, the disk can get messed up when the UPS runs out of power. Unless you only have a desktop machine with no network applications writing to disk (no BitTorrent); then you might be OK if you just walk away from your keyboard and let the system become quiescent before it loses power.
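The decision a UPS-triggered shutdown has to make can be sketched as below. In practice a daemon such as apcupsd or Network UPS Tools handles this; the `UpsStatus` type, its fields, and the default timings here are hypothetical and do not correspond to any real daemon's API.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of what a UPS reports."""
    on_battery: bool
    runtime_left_s: float


def should_shut_down(status, shutdown_takes_s=120.0, safety_margin_s=60.0):
    """Start a clean shutdown once the remaining battery runtime no longer
    covers the time the shutdown itself needs plus a safety margin."""
    if not status.on_battery:
        return False
    return status.runtime_left_s <= shutdown_takes_s + safety_margin_s


if __name__ == "__main__":
    for snapshot in (UpsStatus(False, 900.0),
                     UpsStatus(True, 900.0),
                     UpsStatus(True, 150.0)):
        print(snapshot, "->", should_shut_down(snapshot))
```

The margin matters because reported runtime is an estimate that shrinks as load is added and as batteries age, so the shutdown should begin well before the estimate reaches zero.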

  12. Ah, that's easy by jcochran · · Score: 5, Funny

    All you need to do is have the grid power feed some high wattage light bulbs. And near the light bulbs are some solar cells. The output from the solar cells is used to charge batteries which feed an inverter that actually powers the computer. Of course there is some power loss in the conversion process, and you need to have some (ok, a lot) of the input power to the system committed towards running a cooling unit to keep things at a reasonable temperature. But the resulting device provides clean power with no possibility of any surges getting through to the protected equipment.

    Of course, if you go to this level of trouble for your power source, then I'd also suggest opto-isolating all signal lines to and from the server. And enclose the server in a well grounded Faraday cage. And it wouldn't be a bad idea to have a dedicated comm link to a duplicate server located elsewhere. Preferably on a different tectonic plate.