Slashdot Mirror


How Power Failures Corrupt Flash SSD Data

An anonymous reader writes "Flash SSDs are non-volatile, right? So how could power failures screw with your data? Several ways, according to a ZDNet post that summarizes a paper (PDF) presented at last month's FAST 13 conference. Researchers from Ohio State and HP Labs tested 15 SSDs using an automated power fault injection testbed and found that 13 lost data. 'Bit corruption hit 3 devices; 3 had shorn writes; 8 had serializability errors; one device lost 1/3 of its data; and 1 SSD bricked. The low-end hard drive had some unserializable writes, while the high-end drive had no power fault failures. The 2 SSDs that had no failures? Both were MLC 2012 model years with a mid-range ($1.17/GB) price.'"

204 comments

  1. build in some power storage by X0563511 · · Score: 5, Insightful

    Seriously... slap in some basic power circuitry and some caps - enough that the drive can finish the cycle it is on and do whatever it needs to do to power off safely.

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    1. Re:build in some power storage by Anonymous Coward · · Score: 2, Insightful

      I'll quote the great CliffyB: Vote with your dollars!

      What? It's valid thinking, not at all 9:th grade.

    2. Re:build in some power storage by v1 · · Score: 5, Insightful

      space is at an extreme premium in those drives. There's a reason they feel so heavy/dense. Given the quilting layout of the chips, adding a single cap would prevent several memory chips from fitting. So you may as well then fill that remaining space with more caps. But you will reduce capacity, and that's what sells SSDs.

      There's already a substantial amount of circuitry in them, far from "basic". It's essentially a CPU. I'd be interested to see some numbers as to average power drain during idle, read, and write.

      The ones that did the best during the power blips probably did have caps and a bit more in their power system to handle it though. It certainly does surprise me that the mid-range, not the high-end, were the best performers in this test.

      --
      I work for the Department of Redundancy Department.
    3. Re:build in some power storage by Guspaz · · Score: 2

      Most enterprise SSDs do have small supercapacitors or capacitor arrays onboard for exactly this reason. Some of the higher-end consumer drives do too. But most consumer drives don't.

      The answer? Get a UPS.

    4. Re:build in some power storage by WillgasM · · Score: 1

      You would think. The only SSD I'm running is on my computer at home and my house is sufficiently UPS'd. It's always cool when the power goes out at my apartments but all my electronics keep going. I just wish there was a battery on that Time Warner box outside my door.

    5. Re:build in some power storage by sjames · · Score: 1

      I bet no one ever thought of that!!

      Based on the paper, I guess they didn't

    6. Re:build in some power storage by Mad+Merlin · · Score: 2

      space is at an extreme premium in those drives. There's a reason they feel so heavy/dense.

      I don't know what SSDs you've been using, but I've never picked up an SSD (OCZ Vertex 2/3, Intel X25-M/320/330/335/510/520) that didn't feel light and sound nearly hollow.

    7. Re:build in some power storage by sjames · · Score: 1

      The answer? Get a UPS.

      Because those never fail.

    8. Re:build in some power storage by Beardo+the+Bearded · · Score: 1

      That was my first thought as well, throw in one supercap and you'll solve this problem.

      --

      ---
      ECHELON is a government program to find words like bomb, jihad, plutonium, assassinate, and anarchy.
    9. Re:build in some power storage by Anonymous Coward · · Score: 0

      They don't feel heavy at all, and the drives are much smaller than the 3.5" bays they go into, leaving lots of space for a capacitor.

    10. Re:build in some power storage by Anonymous Coward · · Score: 0

      Yes, lets just not use anything that fails once and a while, even if it is even less than the thing it is protecting.

    11. Re:build in some power storage by Anonymous Coward · · Score: 0

      "once and a while"?

      It is "once IN a while".

    12. Re:build in some power storage by NatasRevol · · Score: 1

      They thought of it. They just didn't want to pay for it.

      --
      There are two types of people in the world: Those who crave closure
    13. Re:build in some power storage by hawguy · · Score: 2

      I bet no one ever thought of that!!

      Based on the paper, I guess they didn't

      Some SSDs already have capacitors that do just this, so yes, they did think of it. Did you really think that SSD manufacturers aren't aware of this issue?

      But when a few dollars can sway a purchase decision, and it's hard to convince consumers through a few sentences on the side of an SSD box that power protection circuitry is important to have, it's hard to justify putting it in. And since most SSDs are probably sold as OEM equipment where a few pennies can make the difference between getting the sale or not, it's even harder to justify.

      It's not something I'd be willing to pay extra for - my computer hasn't lost power in years (thanks to a UPS that automatically shuts down my computer), and it writes to disk so rarely that the odds are probably 100 to 1 against it being in the middle of a write if I just walk up and pull the plug. If I do lose data, there are always backups to fall back on.

    14. Re:build in some power storage by dgatwood · · Score: 1

      The answer? Get a UPS.

      You're assuming a desktop-sized drive in a desktop computer, yet nearly all computers sold today are portables, and laptop users are more likely to buy bus-powered external drives than mains-powered drives.

      So the five most likely causes of power failure in consumer hard drives (and presumably, in the future, SSDs), ordered from most likely to least likely, are probably:

      • Somebody yanking a USB cable before the device is fully unmounted.
      • The laptop's battery dying earlier than expected.
      • Somebody yanking a FireWire cable before the device is fully unmounted.
      • Somebody yanking an eSATAp cable before the device is fully unmounted.
      • An electrical power disruption caused by the hinge pinching the inverter cable.

      An unexpected mains power failure with a non-battery-backed device falls somewhere around #87. A UPS won't help with any of the above.

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    15. Re:build in some power storage by TheRealMindChild · · Score: 1

      But when a few dollars can sway a purchase decision, and it's hard to convince consumers through a few sentences on the side of an SSD box that power protection circuitry is important to have, it's hard to justify putting it in

      This isn't buying a car. $3 or even $20 isn't going to be detrimental to the purchase opportunity when the consumer can TELL it is of quality above the competitors. Blaming the consumer in this case sounds like you are on the other side

      --

      "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
    16. Re:build in some power storage by Anonymous Coward · · Score: 0

      High-end SSD drives do have ultracaps to deal with power interruption.
      High-end as in enterprise class drives that each cost more than you spent on your whole gaming rig.

      The reason why power interruption causes data loss is simple. It comes down to how flash storage works. While I'm sure most of you assume that writing to flash is a simple bit-by-bit operation that happens instantly, nothing could be further from the truth.

      Flash memory is accessed in blocks and only blocks. Even if you need to write to a single bit, the entire block that that bit resides in needs to be re-written. This means before you can write, the entire block has to be read and stored in temporary RAM. If power is interrupted during a write operation then there is a very good chance the entire block will be lost because the contents of the flash controller's RAM will be lost.

      And it gets worse, because flash writes don't always work. Yep. You heard that correctly. Here's how a typical flash write operation works:
      1. Read block and calculate changes
      2. Erase block (Oh yeah, you can't write to blocks with data. You can only write to flash cells that have been zeroed)
      3. Attempt write
      4. Check for write success by reading block and making sure it matches what you attempted to write.
      5. If step 4 fails, go back to step 2 (Number of iterations required depends on the type and quality of the flash memory)

      The above explains why flash writes are so much slower than flash reads, but it also means that there is a greater chance of data loss during power failure because the writes take a long time.
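
      The five steps above can be sketched as a toy model. This is only an illustration of the read/erase/program/verify loop as this comment describes it, not how any particular controller works; ToyBlock, rewrite_block and the cut_power_after switch are made-up names, and the 8-byte block size is arbitrary. Erased cells are modelled as 0xFF, since erased NAND typically reads back as all ones.

```python
ERASED = 0xFF  # assumption: erased cells read back as all ones


class PowerLoss(Exception):
    """Raised by the toy model when power is cut mid-operation."""


class ToyBlock:
    """Hypothetical erase block holding a handful of bytes."""

    def __init__(self, size=8):
        self.cells = bytearray([ERASED] * size)

    def erase(self):
        for i in range(len(self.cells)):
            self.cells[i] = ERASED

    def program(self, data):
        # Programming can only clear bits (1 -> 0), never set them.
        for i, byte in enumerate(data):
            self.cells[i] &= byte


def rewrite_block(block, offset, new_bytes, cut_power_after=None, max_tries=3):
    # Step 1: read the block into (volatile) controller RAM and apply the change.
    staged = bytearray(block.cells)
    staged[offset:offset + len(new_bytes)] = new_bytes
    for _ in range(max_tries):
        block.erase()                            # Step 2: erase the block
        if cut_power_after == "erase":
            # The staged copy lived only in RAM, so it is gone too.
            raise PowerLoss("power cut between erase and program")
        block.program(staged)                    # Step 3: attempt the write
        if bytes(block.cells) == bytes(staged):  # Step 4: verify
            return True
    return False                                 # Step 5: retries exhausted
```

      Cutting power right after the erase leaves the block blank with no surviving copy of the old contents, which is the failure window the comment is pointing at.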

    17. Re:build in some power storage by sjames · · Score: 1

      It's not like they need a great deal of hold up time. Done well, they need only hold power long enough to successfully write a commit bit or decide not to.

    18. Re:build in some power storage by PRMan · · Score: 1

      I just wish there was a battery on that Time Warner box outside my door.

      Strange. My DirecTV DVRs just keep on working...

      --
      Peter predicted that you would "deliberately forget" creation 2000 years ago...
    19. Re:build in some power storage by AmiMoJo · · Score: 1

      Most SSDs are 2.5" so there would be plenty of room for a large capacitor or small battery. You really don't need a lot of energy to finish flushing a small RAM buffer.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    20. Re:build in some power storage by sjames · · Score: 1

      I didn't say don't use a UPS, I said they DO fail sometimes so don't pretend it can't happen.

    21. Re:build in some power storage by AmiMoJo · · Score: 1

      Or maybe attach a capacitor or battery to the power connector (with diodes so you don't try to power the entire PC).

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    22. Re:build in some power storage by Mashiki · · Score: 3, Informative

      I don't know what SSDs you've been using, but I've never picked up an SSD (OCZ Vertex 2/3, Intel X25-M/320/330/335/510/520) that didn't feel light and sound nearly hollow.

      Consumer drives are usually lightweight, they don't need the extra cooling. Enterprise drives depending on who they're made by and what they're for can have heatspreaders or heatsinks within, or attached to each chip adding to the weight.

      --
      Om, nomnomnom...
    23. Re:build in some power storage by TechyImmigrant · · Score: 2

      It would be great if they told you about the feature so you could make an informed purchasing decision.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    24. Re:build in some power storage by TechyImmigrant · · Score: 1

      >space is at an extreme premium in those drives.
      So put them in a desktop drive form. The first thing I do with SSDs is put them in one of those adaptors to make them fit in a normal drive tray.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    25. Re:build in some power storage by hairyfeet · · Score: 1

      Uhhh...we solved this problem ages ago with UPS. If you care about your data put the machine on a UPS. I've had my business customers on UPS systems for years, showed them how to test the batteries and swap 'em when they get worn out, no problems. I just had to swap the PSU and HDD out of my netbox at the shop because a transformer blew on my block and managed to give the old gal enough of a shock even through a surge protector that it cooked the PSU and the HDD, but since it's just a netbox I don't care enough about it to waste money on a UPS. I just slapped in some parts from my parts box, slapped the disc image and was back up in less than an hour, no biggie.

      At the end of the day if the unit has data you care about? UPS, just like the one I have at home is running on. If it's something you don't give a shit about, like my 8 year old netbox at the shop? Just use a surge protector and be ready with a disc image if anything craps out. Really folks this ain't rocket science, a little common sense goes a long way.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    26. Re:build in some power storage by K.+S.+Kyosuke · · Score: 1

      Seriously... slap in some basic power circuitry and some caps

      A small, stupid, retro NiMH battery might work even better.

      --
      Ezekiel 23:20
    27. Re:build in some power storage by yurtinus · · Score: 2

      Exactly, this is buying consumer computer equipment. Put a label on the side with a bullet point touting your unexpected power fault protection and I can pretty much guarantee it will have no impact on your product sales. You know what will? The extra $2 price that puts you below the other guy on the "lowest price first" product sorting.

      --
      +1 Disagree
    28. Re:build in some power storage by Anonymous Coward · · Score: 0

      Seriously... slap in some basic power circuitry and some caps - enough that the drive can finish the cycle it is on and do whatever it needs to do to power off safely.

      That would cost money, add to the cost: most folks don't care about the details of data integrity enough to know that this is a good idea. They'd just see the higher price and purchase the next model over which is $1 cheaper.

    29. Re:build in some power storage by WillgasM · · Score: 1

      I'm mostly talking about the Internet. Netflix only buffers a minute or two.

    30. Re:build in some power storage by hawguy · · Score: 2

      But when a few dollars can sway a purchase decision, and it's hard to convince consumers through a few sentences on the side of an SSD box that power protection circuitry is important to have, it's hard to justify putting it in

      This isn't buying a car. $3 or even $20 isn't going to be detrimental to the purchase opportunity when the consumer can TELL it is of quality above the competitors. Blaming the consumer in this case sounds like you are on the other side

      How can the consumer TELL if its quality is above the competitors? The presence of capacitors doesn't mean that it's a better drive than a drive without capacitors. It just means that you have more protection from one rare set of circumstances -- potentially with less reliability overall, since big electrolytic capacitors are known to fail, especially cheap ones.

      I suspect that most SSDs are bought as OEM drives buried inside laptops and desktops where the end user may not ever know what brand and/or model the drive is, so how will a higher cost for a feature that may offer no real benefit for most users help sell more drives?

      Don't believe me? Here's proof: Manufacturers aren't promoting it as a feature in big letters on the side of the box. If they thought they could add $5 of circuitry and sell the drive for $10 more, they would.

      If you're reading Slashdot, then you're not a typical consumer, and maybe you really are enough of an SSD expert to compare features to know what makes one SSD better than another, but the other 99% of consumers will either buy an SSD with their next computer, or they'll buy the one at Best Buy that has the lowest price and the highest transfer rate, since that's a number they can understand. How would you even quantify "Power protection capacitors" to know if it's worth $5, $50 or $100 to you? If it's really important to you, you can always buy an enterprise class SLC drive that includes the capacitors.

      Blaming the consumer in this case sounds like you are on the other side

      Is this one of those George Bush "If you're not with us, you're against us" false dichotomies? Believe it or not, it's possible for people to have different opinions without being enemies.

    31. Re:build in some power storage by TechyImmigrant · · Score: 0

      My employee discount beats any $2 price difference.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    32. Re:build in some power storage by TechyImmigrant · · Score: 3, Funny

      >yet nearly all computers sold today are portables

      What I really want is a potable computer, so I can drink it if I get thirsty.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    33. Re:build in some power storage by TechyImmigrant · · Score: 2

      >Flash memory is accessed in blocks and only blocks. Even if you need to write to a single bit, the entire block that that bit resides in needs to be re-written. This means before you can write, the entire block has to be read and stored in temporary RAM. If power is interrupted during a write operation then there is a very good chance the entire block will be lost because the contents of the flash controller's RAM will be lost.

      You are wrong.

      Flash is written word by word. The size of the word depends on the chip.
      Flash is *erased* a block at a time.

      That is what makes flash more efficient than EEPROM, the block erase plane.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    34. Re:build in some power storage by Darinbob · · Score: 1

      You can design the firmware to avoid corruptions as well, it doesn't need a hardware solution. The manufacturers just have to be aware that power failures will occur and take that into account during the design. Extra capacitors won't fix the problem of shortcuts in the design.

    35. Re:build in some power storage by TechyImmigrant · · Score: 1

      You can write individual words in a flash chip.
      It takes longer to write than read because you have to force a bunch of electrons through an insulator.

      If you want to write over existing data, you have to erase the block it is in, because you can only erase whole blocks, but there is nothing to stop you incrementally writing to unused parts of a block.
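
      As a rough illustration of the point above (words can be programmed one at a time into unused parts of a block, but changing existing data means erasing the whole block), here is a minimal sketch. ToyEraseBlock, the 16-bit word size and the four-word block are assumptions picked for the example, not properties of any specific chip.

```python
ERASED_WORD = 0xFFFF  # assumption: 16-bit words, erased state is all ones


class ToyEraseBlock:
    """Hypothetical flash block: program per word, erase per block."""

    def __init__(self, words=4):
        self.words = [ERASED_WORD] * words

    def write_word(self, index, value):
        # Programming can only clear bits. If the new value needs a 1
        # where the cell already holds a 0, only a block erase can help.
        if self.words[index] & value != value:
            raise ValueError("word already programmed; erase the block first")
        self.words[index] &= value

    def erase(self):
        # There is no per-word erase: the whole block goes at once.
        self.words = [ERASED_WORD] * len(self.words)
```

      Writing 0x1234 to word 0 and then 0x5678 to word 1 works without any erase; trying to put 0x5678 over word 0 raises until erase() has wiped the entire block, including word 1.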

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    36. Re:build in some power storage by hawguy · · Score: 1

      My employee discount beats any $2 price difference.

      What kind of employee discount do you have that can take a $120 drive and a $122 drive and make the prices equivalent?

    37. Re:build in some power storage by Bill_the_Engineer · · Score: 1

      Component manufacturers target OEMs for the bulk of their sales. They will build to the price point that may win them a sale. The $3 to $20 amount may not make a difference to a consumer purchasing one from NewEgg, but to someone who purchases in blocks of 1000 it may.

      --
      These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
    38. Re:build in some power storage by the+eric+conspiracy · · Score: 1

      I've had lots more failures due to UPSs going tits up than through data loss on SSDs.

    39. Re:build in some power storage by Anonymous Coward · · Score: 1

      Most consumers don't know what an SSD drive is and those who have one in their systems have one because they were upsold on a system by a clueless salesperson. It's pure coincidence they have one.

    40. Re:build in some power storage by edmudama · · Score: 2

      Most of the enterprise grade SSDs on the market that are outfitted with power-loss protection circuitry fit these capacitors within the 2.5" form factor.

      --
      More data, damnit!
    41. Re:build in some power storage by froggymana · · Score: 2

      Probably the five finger discount.

      --
      "To prevent this day from getting any worse, I'll just read ERROR as GOOD THING" 1GJU8xLuDKDxEs4KLf8fAGyptoDsqvEsBT
    42. Re:build in some power storage by Jeremi · · Score: 1

      >What I really want is a potable computer, so I can drink it if I get thirsty.

      What I really want is a pottable computer, so it can monitor my geraniums.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    43. Re:build in some power storage by gweihir · · Score: 1

      High-end SSDs have supercaps for that. Low-end SSD customers are too cheap to pay a few USD/EUR more for the added protection.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    44. Re:build in some power storage by Anonymous Coward · · Score: 0

      >What I really want is a potable computer, so I can drink it if I get thirsty.

      What I really want is a pottable computer, so it can monitor my geraniums.

      What I really want is a pourable computer, so I present my bits and megahertz in decorative vases.

      What I really want is a parable computer, so I can troll internet atheists automatically.

    45. Re:build in some power storage by thegarbz · · Score: 2

      but I've never picked up an SSD (OCZ Vertex 2/3, Intel X25-M/320/330/335/510/520) that didn't feel light and sound nearly hollow.

      Rip it open and have a look. There's not much weight at all to a piece of fibreglass and some plastic resin encasing some silicon. Circuit boards and components are really quite light when they don't require cooling or even large bits of metal for simple thermal mass.

      You'll find that even though it's light and looks hollow it'll be packed quite full. Now combine that with the problems associated with creating some form of energy storage. Storage can come in some electrical form, i.e. battery which would be great but then you need either a maintenance regime, combined with some form of monitoring and perhaps even some charging circuit.

      Your other option is capacitance, but in order to get enough useful capacitance you need large densities which invariably comes by rolling together two thin aluminium plates. All useful (for this application) capacitors will therefore be cylindrical and will immediately consume massive amounts of space (by massive I'm talking cubic mm). Again have a look and see how little actually fits in such a small form factor.

    46. Re:build in some power storage by adolf · · Score: 1

      it's hard to convince consumers through a few sentences on the side of an SSD box that power protection circuitry is important to have

      I disagree.

      To pick an example: Gigabyte advertises on the box that they use high-quality Japanese capacitors for their motherboards. And since every. single. motherboard failure I've seen in a decade has been due to bad caps, these words mean a lot to me.

      "Built-in power backup to help keep your data safe!" sounds like a good enough slogan to lure me in.

      But what do I know? I'm just a consumer who wants to find products that seem likely to last, even if they're a bit more expensive.

    47. Re:build in some power storage by rossjudson · · Score: 1

      This is what high-end, enterprise-class PCIe flash cards already do:

      http://www.fusionio.com/

    48. Re:build in some power storage by ultrasawblade · · Score: 1

      I thought some if not most flash chips had to be written in "pages", i.e. 2 KB pages per 128 KB erase block.

    49. Re:build in some power storage by TechyImmigrant · · Score: 1

      It's usually around 50%-60% of the market price.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    50. Re:build in some power storage by TechyImmigrant · · Score: 1

      Some serial programmed ones do. But none of the parallel flash chips I've designed into circuits work that way. An SSD's SATA interface only works in blocks though, so even though the underlying chips might be word writable, the interface doesn't let you do that.

      The low level transistor structure of flash chips is always bit by bit programmable and block erasable, so it's always the interface that dictates the size of the write unit.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    51. Re:build in some power storage by Anonymous Coward · · Score: 0

      A $3 capacitor in bulk is an AWESOME capacitor...still your point stands, but it's probably more like 25c in qty of 1000000+ for even a pretty nice supercap

    52. Re:build in some power storage by Guspaz · · Score: 1

      There are very very few "desktop-sized" SSDs. Virtually every SSD found in a desktop is a 2.5" notebook drive, often mounted with a 2.5" to 3.5" adapter plate. The UPS will protect any bus-powered device by keeping the bus itself powered.

      In terms of external SSDs, those are rare enough that they're not really a scenario to worry about (they're mostly just gimmicks for now). If and when they do eventually appear, the solution is simply for manufacturers to include a small capacitor; if you know it's going to be an external SSD, you simply have to take precautions. A lot of consumer SSDs that don't ship with capacitors or supercapacitors onboard are still designed to use them (and they're omitted for cost reasons).

      Your scenario is, in current usage, extremely rare. Even the laptop-running-out-of-power scenario is virtually impossible, since most laptops will force themselves to hibernate before letting their battery actually completely die.

    53. Re:build in some power storage by Anonymous Coward · · Score: 0

      Yes such things can be done, but they have performance costs. In order to increase write IOPs, drives use write caching. This means that the drive reports a sector as written, even if it is only sitting in RAM cache. Any enterprise grade drive (worth its salt) will have super caps that allow you to flush the cache to flash.

      A consumer drive is likely not to have those caps, opting instead to use the space for more flash which lowers the $/MB. They don't want to disable write caching, because as it turns out, writing to flash is very slow compared to reads. The slowness of flash writes is hidden by having many flash chips writing in parallel, but that only works when you hide the flash behind a write cache.

      To fix this without caps, firmware would have to disable write caching, which would dramatically slow the write performance.
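
      The trade-off described above can be shown with a toy model. This is only a sketch under simple assumptions: ToyDrive, its write_cache and has_caps flags, and the dictionaries standing in for RAM and flash are invented for illustration and say nothing about how real firmware is organised.

```python
class ToyDrive:
    """Hypothetical drive with an optional volatile write cache."""

    def __init__(self, write_cache=True, has_caps=False):
        self.write_cache = write_cache
        self.has_caps = has_caps
        self.ram = {}    # volatile write cache
        self.flash = {}  # persistent storage

    def write(self, lba, data):
        if self.write_cache:
            self.ram[lba] = data    # acknowledged before it reaches flash
        else:
            self.flash[lba] = data  # slow path: commit before acknowledging
        return "ack"

    def flush(self):
        self.flash.update(self.ram)
        self.ram.clear()

    def power_loss(self):
        # Capacitors buy just enough time to drain the cache into flash.
        if self.has_caps:
            self.flush()
        self.ram.clear()  # whatever is still in RAM is gone
```

      With write_cache=True and has_caps=False, a sector that was acknowledged can vanish on power_loss(). Turning on has_caps, or turning off write_cache at the cost of speed, keeps it.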

    54. Re:build in some power storage by FirephoxRising · · Score: 1

      I was about to say the same thing, I'm sure they could find space in 2.5" for a cap and I damn well would pay more for a drive that is more reliable, I'm looking at you OCZ (13/15 vertex drives have failed here, I'm going to replace them with Intels).

    55. Re:build in some power storage by roman_mir · · Score: 0

      not the high-end, were the best performers in this test.

      - for whatever reason I couldn't find the actual brands and models of the SSDs that were tested, but at least from POV of server hardware it makes sense not to add extra batteries on the components themselves, since the servers will have their own power backup, while laptops are most often used while both plugged in and on battery power.

      The moral of the story: get a UPS for your desktop.

    56. Re:build in some power storage by dgatwood · · Score: 1

      There are very very few "desktop-sized" SSDs. Virtually every SSD found in a desktop is a 2.5" notebook drive, often mounted with a 2.5" to 3.5" adapter plate. The UPS will protect any bus-powered device by keeping the bus itself powered.

      My point was that most people don't use their laptops as a glorified desktop; a UPS won't do any good if the device is sitting on your lap in the car. And a UPS won't do any real good for the internal drive in a laptop or a bus-powered external drive attached to a laptop anyway, given that the laptop already is a glorified UPS. It's like, "Yo, dawg, I heard you liked UPSes, so I hooked a UPS to your UPS so you could be backed up while you're backed up." Or something.

      Your scenario is, in current usage, extremely rare. Even the laptop-running-out-of-power scenario is virtually impossible, since most laptops will force themselves to hibernate before letting their battery actually completely die.

      Yes, in theory. But you know the difference between theory and practice. In practice, as batteries age, the computer's ability to determine the battery's remaining capacity diminishes. And at some point, you'll be sitting there typing, the voltage will suddenly sag a little too low, and the computer will shut itself off unceremoniously. I've had a lot of laptops over the past 15 years, and every single one has eventually gotten to the point where it does that fairly reproducibly. If 100% is extremely rare, I'd hate to think what you consider common. :-)

      --

      Check out my sci-fi/humor trilogy at PatriotsBooks.

    57. Re:build in some power storage by Dunbal · · Score: 2

      The problem with voting has always been that the idiots get to vote too. So while you might "vote with your dollars" to select the most reliable drive, they will vote for the one with the cute name, or the shiny case, or the "free gift", or the special price, etc.

      --
      Seven puppies were harmed during the making of this post.
    58. Re:build in some power storage by Dunbal · · Score: 1

      space is at an extreme premium in those drives.

      Not true. I mean yes if they want to maintain the 2.5" form factor, but there's plenty of room if they go 3.5". Of course that would mean two models, one for laptops and one for desktops - because how often does a laptop unexpectedly power down? But space is absolutely not the issue on a desktop. They're designed for 3.5". I'd rather have a bigger drive that never fails than a tiny space saving one that I can't rely on.

      --
      Seven puppies were harmed during the making of this post.
    59. Re:build in some power storage by Anonymous Coward · · Score: 0

      Who pretended it can't happen?

    60. Re:build in some power storage by EETech1 · · Score: 1

      Size is still going to be an issue. 1 farad gives you 1 volt at 1 amp for 1 second. Consider you need 5 volts to run the SSD reliably, that gives you 200mA for 1 second while dropping steadily (R/C time constant) towards 0 volts. This 1 farad of capacitance will be roughly the size of a stack of 15 - 20 quarters (or nickels if you want to pay 2X) and would take up lots of space in an SSD
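
      For anyone who wants to plug in their own numbers, here is a back-of-the-envelope sketch of the hold-up arithmetic. It assumes an ideal capacitor discharged at constant current (I = C * dV/dt, so t = C * (V_start - V_min) / I) and ignores ESR and regulator behaviour. The 4.5 V cutoff and the 10 ms flush time in the usage note are assumed figures, not values from the paper or from any drive datasheet.

```python
def holdup_seconds(capacitance_f, v_start, v_min, current_a):
    """Time for an ideal capacitor to sag from v_start to v_min
    while supplying a constant current."""
    if v_min >= v_start:
        return 0.0
    return capacitance_f * (v_start - v_min) / current_a


def capacitance_needed(seconds, v_start, v_min, current_a):
    """Capacitance (farads) needed to ride through `seconds`
    at a constant current before dropping below v_min."""
    return current_a * seconds / (v_start - v_min)
```

      Under these assumptions, holdup_seconds(1.0, 5.0, 4.5, 0.2) comes out at about 2.5 seconds for a 1 farad cap feeding 200 mA, while capacitance_needed(0.010, 5.0, 4.5, 0.2) gives roughly 0.004 F (4 mF) to cover a 10 ms flush.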

      I played around with some supercaps on an embedded flash-based datalogger I designed. Without lots of CPU-cycle-consuming software monitoring and trigger circuitry, I had much less corruption with a diode-isolated normal (tiny) cap powering the flash chip and nothing besides power conditioning on the CPU. That way, if the flash had a buffer write in progress it would complete, but the CPU would go dead nearly instantly during a brownout instead of continuing to try to operate until the power supply got so low (1000 ms or so) that things started to have problems with logic levels and such.

      I had enough capacitance to communicate on the bus and debug status in the background for nearly 2 seconds after power-off, and under normal conditions I was by far the last device left communicating on the CAN bus I was monitoring. But if I lost power while the bus continued to operate, and I tried to ride it out and still record data to the bitter end, hoping I would get my power back before I went dead, I saw failed writes and corrupted data. I had to use hardware triggers on the power supply to get an early signal so the CPU could decide whether the power supply was stable enough to continue, or whether I needed to issue a write command immediately to flush the buffer and ensure the data was safe, buffering in CPU RAM in the meantime and waiting to see if things had improved by the time I got confirmation that the page was written, so I could decide if I could sneak another 20 ms sample into flash.

      It was a lot of additional hardware and software (and highly targeted testing) to get any additional improvement in data reliability over version 1.0, which just died when the power did. I doubt your SSD manufacturer cares as much about such edge cases as we did when validating drive-by-wire systems (I even stored our supply voltage and hardware trigger status with every sample).

      One thing I thought about while reading this paper was that they should have stopped writing to the SSD when power was cut to it, because it's kind of unfair to cut the power on your storage device that's fed from the same power as the CPU, but to have some magic ability for that CPU to keep writing after it normally would be dead. It's worst case, but might not be real world.

      Cheers

    61. Re:build in some power storage by PlusFiveTroll · · Score: 1

      Then they would cut the drive out of the notebook/largepad market.

    62. Re:build in some power storage by PlusFiveTroll · · Score: 1

      What I want is a pot computer, so when I let the magic smoke out, it really is magic smoke.

    63. Re:build in some power storage by PlusFiveTroll · · Score: 1

      Should be pretty easy to make an inline adapter over the SATA power connector with some supercaps in it for desktop use. What I don't know is the ramifications of always (or at least till the cap runs out) having power on to the drive itself. A more complicated device could turn off after a few minutes of no power on the SATA input.

    64. Re:build in some power storage by Reziac · · Score: 1

      So would it be possible to build a unit that sits between drive and mainboard, serves as power capacitor, and can, at need, send a "flush everything to disk and halt gracefully" signal to the drive??

      Seems to me the bulk of such a unit could hang off to the side, with just a thin piece sitting between the connectors.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
    65. Re:build in some power storage by sjames · · Score: 1

      Guspaz by suggesting that a UPS was a total solution to the corruption problem.

    66. Re:build in some power storage by Anonymous Coward · · Score: 0

      Lots of devices already linger a bit when you pull the cord - it's just not visible to you.

      Generally anything that doesn't have a wall-wart, but has the 120/AC handling circuitry internally. The capacitors used for filtering the rectified power take a few seconds to discharge.

    67. Re:build in some power storage by TechyImmigrant · · Score: 1

      >Then they would cut the drive out of the notebook/largepad market.

      But add to the desktop/server/reliable computer market. If I were asked to pay $50 more for an SSD in a desktop box with a powerfail protection backup battery and better power conditioning due to the extra space for filter components, I would open my wallet.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    68. Re:build in some power storage by thegarbz · · Score: 1

      Yes indeed, this was my first thought as well when reading this discussion. The only problem is one of standard form factors. I imagine consumers would be pretty pissed if their 2.5" SSD is actually 2.5" + that extra bit that is needed to keep the SSD from losing all that data. This isn't an issue in a PC, however laptops and netbooks, or even desktop machines with mSATA connectors on the motherboard would suddenly have space issues.

    69. Re:build in some power storage by Anonymous Coward · · Score: 0

      A MOSFET would make more sense since you wouldn't have 0.7 V of drop across the diode.

    70. Re:build in some power storage by Common+Joe · · Score: 1

      But what do I know? I'm just a consumer who wants to find products that seem likely to last, even if they're a bit more expensive.

      I'd pay extra money to know that a product has higher quality and reliability. The problem is that I've been taken (and some of my friends a lot more so) by buying the expensive stuff only to find out that it is the same as the cheap stuff. Example: Sony used to mean quality. Not anymore. For a long time, they rode on their own success until people finally figured out they could get the same quality and pay less money. Because of this (and other things I've seen and heard over the years), I only look at price. Should a vendor actually make an effort and let me know why their stuff is an extra few bucks, I'll pay extra attention to that and they can instantly form brand loyalty with me.

    71. Re:build in some power storage by Reziac · · Score: 1

      That's why I was thinking it needs to be a unit external to the drive (for those that lack this feature in the first place, tho per TFA that didn't necessarily save them) with the thinnest of connectors between drive and mainboard, kinda like the old SCSI and floppy adapters, only thinner.

      I'm not a fan of this craze to stuff everything in the smallest possible space, even if it does save the manufacturer a half cent worth of plastic. :/

      --
      ~REZ~ #43301. Who'd fake being me anyway?
    72. Re:build in some power storage by ergean · · Score: 1

      Changing caps on old motherboards is my new hobby... am so good at it that I can change 1 or 10 in under 5 minutes. It takes me longer to get the motherboard out and clean it. And it takes an insane amount of time to test it. But I have a lot of dead time in my business.

    73. Re:build in some power storage by edt12345 · · Score: 0

      I think the problem with unscheduled power loss is that the SSD may be performing operations that can take multiple seconds to complete. Since the sector remapping table is likely the critical piece of information that must be written to the flash before powering off, and since even in a read-only or idle device there may be background sector remapping occurring, it is easy to imagine how a power fail during this operation could be fatal to any file system.

      The question is how to mitigate this type of failure. Energy storage is one possible solution, but might not be enough by itself. Instead, I wonder if high end SSDs minimize the amount of erasing that needs to be done to write critical data, like the sector remap table. Perhaps some kind of internal journal that tracks changes to critical data structures might be used.
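
      To make the journal idea concrete, here is a minimal sketch of an append-only remap journal with a checksum on each record. On restart, replay stops at the first torn or corrupt record, so a write interrupted by power loss costs only the last update instead of the whole table. The 12-byte record layout and the function names are hypothetical; this is not how any particular SSD firmware is known to work.

```python
import struct
import zlib

# Hypothetical record: logical block, physical block, CRC32 of those two fields.
RECORD = struct.Struct("<III")


def encode_record(logical, physical):
    """Serialize one remap update with a trailing CRC32."""
    payload = struct.pack("<II", logical, physical)
    return payload + struct.pack("<I", zlib.crc32(payload))


def replay(journal_bytes):
    """Rebuild the logical->physical map from the journal.

    Stops at the first record whose CRC does not match; a trailing
    partial record (torn write) is never read at all.
    """
    mapping = {}
    last_start = len(journal_bytes) - RECORD.size
    for offset in range(0, last_start + 1, RECORD.size):
        logical, physical, crc = RECORD.unpack_from(journal_bytes, offset)
        if zlib.crc32(journal_bytes[offset:offset + 8]) != crc:
            break
        mapping[logical] = physical
    return mapping


journal = encode_record(7, 100) + encode_record(7, 205) + encode_record(9, 310)
# Simulate power loss partway through appending a fourth record.
torn = journal + encode_record(9, 999)[:5]
print(replay(torn))  # {7: 205, 9: 310}
```

      Appending small records avoids the erase-then-rewrite of a full table, which is the long blocking operation the paragraph above is worried about.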

      Data blocks, including logs, being written over the external (SATA, etc.) interface should be more tolerant of power loss as long as they are written to the medium (flash) in order. It is my impression that "good" file systems will tolerate power loss with no unrecoverable file system corruption, although actual written data blocks may be lost.

      The internal NAND flash may or may not be word writable, but that's not really the issue, IMHO. Avoidance of long term blocking operations like erases is the key. Combine this with the extremely minimal time the SSD drive circuitry will operate after the power rail falls through its minimum specified value in normal operation until the supply voltage becomes too low to write the flash and one can see the challenge, even with some moderate energy storage.

      It would be interesting to understand the details of high end SSD's internal data management.

    74. Re:build in some power storage by Bengie · · Score: 1

      Do you think one could just hook a decently strong cap in-line with your power cords that run to the SSDs, use diodes so the power only goes one way, and have those caps augment the caps inside of the SSDs?

    75. Re:build in some power storage by X0563511 · · Score: 1

      You'd need the firmware to be able to know that power failure was inevitable, as well - so it can flush out what it has and put itself in a safe state. This can be done by detecting the voltage coming in outside of the cap going low.

      Else, you'd have to power it for as long as it takes for the thing to decide to flush its write cache (and perform it as well).
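
      Roughly what I mean, as a toy sketch in Python rather than real firmware: watch the input voltage on the supply side of the cap, and the moment it drops below a warning threshold, stop accepting new writes and flush what is already cached. The class name, the 4.5 V threshold, and the list-based "flash" are all made up for illustration.

```python
class WriteCache:
    """Toy model of a drive's volatile write cache with a power-fail warning."""

    def __init__(self, warn_threshold_v):
        self.warn_threshold_v = warn_threshold_v
        self.pending = []       # writes still held in volatile RAM
        self.flash = []         # writes committed to non-volatile storage
        self.accepting = True   # False once a power failure is suspected

    def write(self, block):
        """Queue a write; refuse it once we are in the safe state."""
        if not self.accepting:
            return False
        self.pending.append(block)
        return True

    def on_voltage_sample(self, input_v):
        """Called for each input-voltage reading taken upstream of the cap."""
        if input_v < self.warn_threshold_v:
            self.accepting = False
            self.flush()

    def flush(self):
        """Commit everything in RAM to flash, in arrival order."""
        self.flash.extend(self.pending)
        self.pending.clear()


cache = WriteCache(warn_threshold_v=4.5)
cache.write("block-a")
cache.write("block-b")
cache.on_voltage_sample(5.0)   # healthy rail: nothing happens
cache.on_voltage_sample(4.2)   # sagging rail: flush and stop accepting writes
print(cache.flash, cache.write("block-c"))
```

      The cap then only has to cover the flush, not however long the drive would otherwise sit on its cache.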

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    76. Re:build in some power storage by Bengie · · Score: 1

      If the SSD watched the SATA/etc connection to see if it lost signal, it could assume it needs to get into a safe mode when the connection is lost to the host-controller. Then you could just use dumb in-line temp power to give the drive a little extra time.

  2. Before you ask. by eddy · · Score: 5, Informative

    The paper doesn't disclose the brands.

    --
    Belief is the currency of delusion.
    1. Re:Before you ask. by war4peace · · Score: 1

      Of course it doesn't. Naming/Shaming is not allowed.
      I was sarcastic, of course. They don't do it, though, because it'd probably put them in a crossfire of lawsuits coming from powerful companies. Nobody wants that. They will lose simply by being bullied financially. It's all about who brings more lawyers to the table, not who's right or wrong.

      --
      ...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
    2. Re:Before you ask. by Mad+Merlin · · Score: 1

      Which is unfortunate. That was the main reason I opened the PDF.

    3. Re:Before you ask. by PRMan · · Score: 1

      Somebody should tell that to Consumer Reports...

      --
      Peter predicted that you would "deliberately forget" creation 2000 years ago...
    4. Re:Before you ask. by TechyImmigrant · · Score: 1

      MLC == Intel. But they were the good ones.

      --
      I should use this sig to advertise my book ISBN-13 : 978-1501515132.
    5. Re:Before you ask. by greg1104 · · Score: 1

      I created a Reliable Writes page for PostgreSQL that talks about this and gives some known good and bad examples. Intel's 320 and 710 drives are the only two SATA SSDs still on the market that have survived the tests for clean shutdown I've advocated everyone run. They are units with a supercapacitor to enable power failure cleanup. If a drive doesn't have a battery for that sort of purpose, you will lose data at shutdown one day. And, no, a UPS is no cure, because all it takes to ruin a system on one is someone tripping over a cord at the data center to destroy the whole thing.

    6. Re:Before you ask. by DragonTHC · · Score: 1

      Which makes it completely useless for 90% of us who just wasted our 3 minutes.

      --
      They're using their grammar skills there.
    7. Re:Before you ask. by The_Revelation · · Score: 1

      You can almost be sure the bricked SSD was from Kingston. You'd be lucky to get an entire day's worth of light-usage out of them before bricking. I believe Kingston are aware of this, because it's about the only memory product they don't seem to have any kind of warranty on, even if returned on the same day they're purchased.

    8. Re:Before you ask. by jmichaelg · · Score: 1

      Which makes it difficult for anyone to duplicate the study to see if the results persist.

      The author says the power was killed at randomly selected times so perhaps the drives that survived power loss happened to be hit with power loss when they weren't vulnerable to data loss. Without someone replicating the study, it's hard to know if this guy's results mean anything.

  3. Remember kids by Anonymous Coward · · Score: 0

    Always RAID and have battery backup, it saves lives.

    1. Re:Remember kids by Anonymous Coward · · Score: 0

      Kids can't even afford SSDs, let alone that other stuff, you insensitive clod!

    2. Re:Remember kids by the_B0fh · · Score: 1

      raid just means you have TWO (or more) bricked SSDs..

    3. Re:Remember kids by Bengie · · Score: 1

      A RAID of SSDs that all fail at the same time because of power failure. Sounds great.

  4. Anyone ever hear of a battery-backed cache? by Midnight_Falcon · · Score: 1

    Last time I checked, standard platter-based disks had the same issue -- a problem that is solved in server/enterprise environments by placing a write-cache battery in the RAID controller.

    In a desktop environment I suppose one could embed a write cache battery into the SSDs to abate the issue, but in a laptop environment it'd be unlikely you'd even encounter it since you'd have to be writing data while running out of battery, in which case, you might well deserve it :)

    1. Re:Anyone ever hear of a battery-backed cache? by LunaticTippy · · Score: 1

      A capacitor could hold enough power to finish a write cycle on SSD no problem. It wouldn't even have to be very large.

      --
      Man, you really need that seminar!
    2. Re:Anyone ever hear of a battery-backed cache? by FoolishBluntman · · Score: 1

      In the paper presented, they also include both a consumer and enterprise class disk drive.
      The consumer class drive had the same problems as the cheap SSD.

    3. Re:Anyone ever hear of a battery-backed cache? by noelhenson · · Score: 1

      Actually, you might be surprised at how large it would be. F=It/V. 10 ms write time for a sector, 2 s for a file. Voltage tolerances being about +/-5%. 3.3V operating voltage. 5V source voltage. Say 80mA for operating current.

      A 1.7V drop over 10 ms should require about 470uF. To write a file (a 2-second file) would be something like: 80mA*2s/1.7V or about 94,000 uF.

      I'm not trying to be a killjoy here, but trying to store data during bad-voltage and power-down situations is a nontrivial problem.
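
      The arithmetic above is easy to reproduce. This is a small sketch of the same constant-current estimate, C = I*t/V, using the figures from this comment (80 mA, 1.7 V of allowed drop, 10 ms per sector, 2 s per file); the function name is just for illustration.

```python
def required_capacitance_f(current_a, time_s, allowed_drop_v):
    """Capacitance in farads needed to supply a constant current for
    time_s while the voltage sags by no more than allowed_drop_v."""
    return current_a * time_s / allowed_drop_v


# Figures from the comment: 5 V source, 3.3 V operating, so 1.7 V of headroom.
sector_uf = required_capacitance_f(0.080, 0.010, 1.7) * 1e6
file_uf = required_capacitance_f(0.080, 2.0, 1.7) * 1e6
print(f"sector: {sector_uf:.0f} uF, file: {file_uf:.0f} uF")
```

      That lands on roughly 470 uF for one sector and roughly 94,000 uF for the 2-second file, matching the numbers quoted.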

    4. Re:Anyone ever hear of a battery-backed cache? by noelhenson · · Score: 1

      *F=Farads. I guess I should have written C=It/V.

    5. Re:Anyone ever hear of a battery-backed cache? by Midnight_Falcon · · Score: 1

      True, but, then when the cap dries out and eventually bursts open it'd probably be a major cause of drive failure and lack of longevity.

    6. Re:Anyone ever hear of a battery-backed cache? by nabsltd · · Score: 1

      Actually, you might be surprised at how large it would be. F=It/V. 10 ms write time for a sector, 2 s for a file.

      You don't need enough power to finish the OS-level task...you only need enough to write out the data in the drive's RAM cache. Since that is 256MB or less on most current SSD drives (512MB is found on some drives greater than 500GB), it's not as much as you estimate.

      Then, too, when you use the correct timings (10ms for a sector is about 200 times too long), you see that even the slowest SSD takes only about 5ms to write out 1MB (with an average around 3ms). That's around 0.75 seconds to flush the whole cache, resulting in about 1/3 the power you estimate. And that's assuming that the whole cache needs to be flushed, as it's possible that only a few blocks need to be written.

      You can also estimate the time in another way, in that these drives can sustain 300-500MB/sec write rates. That means that copying from the drive's RAM cache to flash must be at least that fast, which gives between 0.5 and 1.7 seconds to flush. This time is similar to my estimate above, so I suspect that 0.75 seconds isn't far off from correct.
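
      Both estimates are simple enough to check with a few lines. This sketch uses the cache sizes and speeds from this comment (256MB or 512MB of cache, 3-5ms per MB, 300-500MB/sec sustained); the function names are only illustrative.

```python
def flush_time_from_latency(cache_mb, ms_per_mb):
    """Seconds to flush the cache given a per-megabyte write time."""
    return cache_mb * ms_per_mb / 1000.0


def flush_time_from_throughput(cache_mb, mb_per_s):
    """Seconds to flush the cache given a sustained write rate."""
    return cache_mb / mb_per_s


average_case = flush_time_from_latency(256, 3)       # 256MB at 3ms/MB
slow_case = flush_time_from_latency(256, 5)          # 256MB at 5ms/MB
best_case = flush_time_from_throughput(256, 500)     # small cache, fast drive
worst_case = flush_time_from_throughput(512, 300)    # big cache, slow drive
print(f"{average_case:.2f} {slow_case:.2f} {best_case:.2f} {worst_case:.2f}")
```

      The 0.75 second figure corresponds to a full 256MB cache at the 3ms average; the throughput method brackets it at about 0.5 to 1.7 seconds.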

    7. Re:Anyone ever hear of a battery-backed cache? by wisnoskij · · Score: 1

      OR the battery fails, is taken out, or falls out.

      --
      Troll is not a replacement for I disagree.
    8. Re:Anyone ever hear of a battery-backed cache? by drinkypoo · · Score: 1

      Use a solid cap, and/or socket the cap at the edge of the drive someplace.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    9. Re:Anyone ever hear of a battery-backed cache? by Anonymous Coward · · Score: 0

      What the fuck are you on, faggot? The issue is with the cache in the god damnned SSD not getting flushed due to a power outage. How the fuck is the battery backed write cache on your fucking RAID controller going to help that situation any?! Does it magically keep the drive powered? No.

    10. Re:Anyone ever hear of a battery-backed cache? by Anonymous Coward · · Score: 0

      The RAID controller battery protects whatever caching the controller is doing. Hard drives manage this problem by using the voltage generated by the disk spinning down to finish outstanding writes. Since SSDs don't have this to rely on, enterprise SSD vendors use capacitor backup. This is all the more critical because things like shingling on hard drives and managing wear on SSDs require a lot more data to be shuffled in and out of memory.

    11. Re:Anyone ever hear of a battery-backed cache? by LunaticTippy · · Score: 1

      Capacitors should outlast the equipment they are in. There was a problem with poor quality Taiwanese capacitors that exploded after a few months/years, but that isn't an inherent problem with the technology.

      I've had to replace batteries many times; they fail more quickly in general than capacitors.

      --
      Man, you really need that seminar!
  5. Power corrupts... by preflex · · Score: 5, Funny

    ... Power failure corrupts absolutely.

    1. Re:Power corrupts... by PRMan · · Score: 0

      MOD parent up! That's hilarious!

      --
      Peter predicted that you would "deliberately forget" creation 2000 years ago...
  6. UPS by rossdee · · Score: 1

    Why should a power failure corrupt anything? The UPS will shut the computer off if there is a prolonged outage.

    1. Re:UPS by Anonymous Coward · · Score: 0

      United Parcel Service provides on demand house calls?

  7. Unsurprising by Anonymous Coward · · Score: 3, Insightful

    These devices have an elaborate internal database for the management of block remapping. For this to survive power failures it needs to use transactional updates. Getting this right is hard - it takes years for file systems and databases to become robust. I'd guess that many devices don't even attempt to do it and the ones that do probably have obscure failure modes. A UPS is essential.

  8. Finally somebody said it! by Dishwasha · · Score: 5, Informative

    I had some original Vertex drives from OCZ that kept absolutely corrupting when my laptop got accidentally unplugged and I powered on the machine. I had to RMA them over and over and over again. I finally figured out that my battery was getting old and, although everything was functional even on battery power and it would boot, the initial large draw of power on boot must have created a voltage drop (i.e. brownout) which the SSDs weren't designed to compensate for. Within an hour of boot (even back on plugged power) they would choke, freeze the OS, and be rendered unusable from then on out.

    Several SSD manufacturers are probably not engineering well for fluctuating power. Rather than fixing the problem with better engineering, OCZ simply changed their warranty policy to void the warranty if the customer is not providing proper power. Correct me if I'm wrong, but I don't think rotating disk hard drive manufacturers have ever had that in their warranty clauses.

    1. Re:Finally somebody said it! by Anonymous Coward · · Score: 1

      Properly designed digital circuitry has a power monitor circuit that prevents the system from running if the power is below a spec'd level. Flash circuitry definitely needed this kind of protection when it first came out, because it used 12V power for programming, and 5V power otherwise, and it was very common for the two power rails to ramp up (or down) at different rates during power on and off. It was very easy to trigger a spurious write cycle because of noise on the lines.

    2. Re:Finally somebody said it! by citylivin · · Score: 1, Insightful

      Well that's probably because you were using OCZ crap. I have never had a quality product from that company.

      However that said, I have noticed the same thing with the Crucial M4s I have. In one particular laptop, it keeps bricking drives because the battery doesn't hold much of a charge any more. Luckily, I can "unbrick" them by plugging in the power (but not data) for 20 minutes, then plugging in the data connection, then rebooting the machine. Has worked more than once.

      And Crucial has put out a bunch of firmwares trying to deal with this. Last time it happened was a few months ago. I have approx. 15 other drives deployed and it only happens to one or two of them, seems to always be in laptops or after some sort of power surge. Crucial will always RMA the drive as well, the one or two times I did not get it going.

      And before anyone says "why that's why I don't use SSDs, too new and unstable!" I say that I would not give up my SSD for all the SCSI 15ks in the world. SSDs are the single greatest speed increase in computer performance in the last 15 years. Make backups, as you should anyways, and don't be afraid of SSDs. When you fly close to the sun, you are going to get burned. Still I would rather FLY so high and roll the dice on reliability (which is still stellar in most circumstances).

      Rotational hard drives are such a pain now as an OS drive, and they still die eventually. I recommend SSDs to everyone now, with the caveat above that you always need good backups.

      --
      As a potential lottery winner, I totally support tax cuts for the wealthy
    3. Re:Finally somebody said it! by DigiShaman · · Score: 1

      The OCZ Vertex 2 series were prone to spontaneously bricking. The original OEM Samsungs that Dell used would exhibit all sorts of strange disk I/O issues that led to strange Windows hard locking. You would think it would be caused by CPU, RAM, Video, or chipset. No, it was the Samsungs.

      When it comes to reliability, I'll stick with Intel offerings. Not the fastest things on the block, but good enough in its class. Personally, I gambled on an OCZ Vertex 4 because Newegg slashed some insane 30% off MSRP via ShellShocker special going on at the time. A week later, OCZ released a new firmware update and the Vertex 4s quickly became popular again. So far, I have one in my MacBook and another in my minitower. No problems yet *knock on wood*.

      --
      Life is not for the lazy.
    4. Re:Finally somebody said it! by ckthorp · · Score: 1

      I had some fun with trying to mount some Crucial M4 drives in USB external enclosures. They kept getting unmounted and the SMART block remap count kept running up, and up, and up. One of the drives outright failed and the other was at 55% spare sectors remaining when I figured out the issue. When there was a write, the current consumption from programming the FLASH chip would cause a voltage sag and the write would fail but it wasn't usually enough of a drop to make the drive reset. Once I bought the "Y"-style USB cords (the kind with an extra power plug) and then modded that to run on a wall-wart, everything was fine. (Just for the record, this was a hack to add some faster storage to an aging server that only had SCA-hotswap bays).

    5. Re:Finally somebody said it! by bAdministrator · · Score: 1

      Putting a system in/out of stand-by/hibernation is probably another risk factor with SSD drives.

      Some points of failure:

      SATA SSD Drive - SATA Cable - SATA Controller - Resource Manager - "Operating System" - SATA Driver

      In my personal experience, the manufacturer wanted to blame the SATA controller without providing me with any proof.

      In summary, SSDs are fast, but they may die fast too. SSDs are most likely not any more or less reliable than mechanical drives, except for purely demonstrational purposes, such as dropping a computer on the floor (unless that caused a drop in power to the drive from other components).

  9. We encountered something like this by AliasMarlowe · · Score: 5, Interesting

    We encountered extensive and progressive file corruption on SSDs in an industrial device. It used the FAT file system, and after every loss of power, it ran its equivalent of chkdsk /f at the next boot. If power was lost again while this command was running, then it was guaranteed that the file system would become corrupt (despite the fact that we were writing nothing to the SSD; it held only files which were opened for reading). The window of opportunity was described as "very short", and the possibility of corruption was "very small" according to the vendor. In our experience in the field, and in our internal testing, the window of opportunity exceeded 20 seconds, and the possibility of corruption was "utter certainty".

    The vendor fixed the problem in a very easy way. They changed the file system from FAT to a commercial journaling FS. In our subsequent tests, we never found any file corruption, even on iterated power loss at random intervals after power on.

    --
    Those who can make you believe absurdities can make you commit atrocities. - Voltaire
    1. Re:We encountered something like this by TheRealMindChild · · Score: 4, Insightful

      First, running an SSD on an "industrial device"

      Second, using FAT

      Third, "commercial journaling FS". What does that even mean?

      If you are industrial, where is your UPS?

      --

      "When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
    2. Re:We encountered something like this by certsoft · · Score: 5, Informative

      We use USB flash drives for a data logger. Most of the time the data is being buffered in the ARM based Linux board's RAM to save power. Once we get a complete file's worth (4MB at the present) we power up, validate, write the file, and power down. Supercaps have been a lifesaver. There's even enough capacity to do the write cycle if the flash was powered down when a power fail is detected. That allows us not to lose whatever was already in the RAM buffer.

    3. Re:We encountered something like this by yurtinus · · Score: 5, Insightful

      Likely as part of an embedded system - monitoring or control software. Systems where you just flip the power switch on when you need them and off when you're done, so a UPS wouldn't apply.

      I'm not saying their implementation was right, just saying that you can't imply from his post that it was wrong :P

      --
      +1 Disagree
    4. Re:We encountered something like this by thejynxed · · Score: 3, Informative

      If it was a drive being used to read schematics for CNC for instance, there isn't a manufacturer out there that currently offers a machine-tied UPS for the CNC machine. If the CNC machine loses power, then so does the drive, and vice versa, since it's all on the same circuit (usually you'll find the power stuff hidden in a cabinet along a nearby wall, and that stuff takes power directly from the mains).

      --
      @Mindless Drivel: 100% of Twitter posts ever Tweeted.
    5. Re:We encountered something like this by Darinbob · · Score: 3, Interesting

      I hate a lot of USB drives and CompactFlash. They're all designed as dumb commodity devices for the undiscriminating user, and trying to get any solid spec sheets out of the manufacturers is impossible if you're not also a giant corporation. Instead their data sheets are just marketing literature (you rarely get anything more technical than "8x speed"). Almost all are designed to work with Windows with no concern to work with embedded systems or production automation, etc. So you end up buying a wide variety to test with and see which ones are barely adequate to work with your system.

    6. Re:We encountered something like this by Anonymous Coward · · Score: 1

      How do you tell if a drive has supercaps? I have a unique application and we need a super small SSD drive, but ran into a problem with SSD drives being too unreliable. I concluded it was probably due to the occasional loss of power. We were actually hoping to use MicroSDHC cards until this issue was realized after significant testing. This was with the highest rated drives/the most expensive. We are now looking into mSATA SSD drives. Any thoughts on this?

    7. Re:We encountered something like this by certsoft · · Score: 2

      Fortunately the client has facilities to test various drives over a wide temperature range (down to -40, not sure how hot they test) while running. And yes, a lot of them are crap.

    8. Re:We encountered something like this by thejynxed · · Score: 5, Interesting

      Not just a lot of them, most of them, to the point that my former contract rolled their own due to flaky controllers, etc. put out by the SSD manufacturers. Yes, they found it cheaper and more efficient to make their own SSD drives, and to incinerate the ones that failed in a blast furnace, than to rely on the crap the manufacturers are currently foisting on the market.

      --
      @Mindless Drivel: 100% of Twitter posts ever Tweeted.
    9. Re:We encountered something like this by hot+soldering+iron · · Score: 5, Interesting

      You might check into adding supercaps into the power supply, across the DC output lines.
      For a less DIY method, you could try this: http://www.beam-tech.com/093001/prd_pgs/internal_ups.htm#
      It's an internally mounted UPS. There are also some PC power supplies that have the UPS built-in, but expect to pay a premium for those.
      If your application allows it, you might want to just mount your SSD into a laptop. It already has internal battery power, and there isn't any exotic hardware you have to pay through the nose for.

      --
      When you want something built, come see me. If you want correct grammar and spelling, get a F*ing liberal arts student.
    10. Re:We encountered something like this by adolf · · Score: 2

      Do laptops ever monitor health of the battery if external power is never removed? I'm aware that laptops can tell when the battery is eventually trashed in normal use (Dells, in particular, seem to be pretty bitchy about it with continuously-blinking lights, and report their findings to the OS if it bothers to ask).

      But being plugged in forever is not "normal use" for a laptop.

      I like your idea (and no, I'm not the AC you're replying to), but I have this vision of a small laptop that has been running with external power for years and years. And for all of those years and years, it's been reporting (via ACPI or whatever hooks) that the battery is in fine, working order.

      Suddenly the power dips for a moment, and the machine crashes with neither warning nor expectation because the li-ion/li-po cells are simply very old and nothing bothered to check (let alone report) if they still work beforehand.

    11. Re:We encountered something like this by Anonymous Coward · · Score: 1

      Not to mention that, depending on what the CNC machine was actually doing, you definitely wouldn't want a hard drive on it. Anything that generates vibration can kill a hard drive pretty quickly. There's an apocryphal story about a CS department that had to keep changing hard disks that were failing. It turns out one of the construction workers who were doing work in that server room was using a half-tall rack as a convenient sawhorse to cut wood with. With a power saw, I might add. Hence why all their failed disks had massive gashes in the recording media.

    12. Re:We encountered something like this by cheater512 · · Score: 1

      It works better than you think. Under Linux you can dump lots of nice info on a battery.
      They all have fairly detailed stats on how much capacity they have, what they are rated for, voltage and even how much capacity is being used at the moment.
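
      As a rough illustration, here is a sketch that reads the full-charge and design capacities from the Linux power_supply sysfs directory and reports their ratio. The BAT0 path is an assumption (the battery name varies by machine), and depending on the driver the files are named energy_* or charge_*. The numbers are whatever the battery's fuel gauge reports, so they are estimates, not a load test. The demo at the bottom runs against a fake directory with made-up values so it works on machines without a battery.

```python
import tempfile
from pathlib import Path


def battery_health(battery_dir="/sys/class/power_supply/BAT0"):
    """Return full-charge capacity as a fraction of design capacity,
    or None if the expected sysfs files are not present."""
    base = Path(battery_dir)
    # Drivers expose either energy_* (microwatt-hours) or charge_* (microamp-hours).
    for prefix in ("energy", "charge"):
        full = base / f"{prefix}_full"
        design = base / f"{prefix}_full_design"
        if full.exists() and design.exists():
            design_value = int(design.read_text().strip())
            if design_value > 0:
                return int(full.read_text().strip()) / design_value
    return None


# Demo against a fake sysfs directory with made-up values.
with tempfile.TemporaryDirectory() as fake_dir:
    Path(fake_dir, "energy_full").write_text("30000000\n")
    Path(fake_dir, "energy_full_design").write_text("40000000\n")
    demo_health = battery_health(fake_dir)
print(demo_health)
```

      On a real laptop you would call battery_health() with the actual battery directory and log the result over time.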

    13. Re:We encountered something like this by ultrasawblade · · Score: 2

      You calling ext3/ext4 shitty? I can put the journal on a separate device for performance enhancement, can NTFS do that?

      In all seriousness though, NTFS is well engineered.

    14. Re:We encountered something like this by adolf · · Score: 2

      Right, sure: All of this battery information can certainly be gleaned under any operating system, given appropriate software.

      But the question is (restated): If the machine never runs on battery, does the machine know the health status of that battery? Does it really have any idea what those figures really are? Can it possibly know, without ever having run on (or otherwise discharged) the battery what the operational status of that battery really is?

      The implication is that if it cannot, then it's really not inherently more reliable than a much simpler machine with no battery at all.

      Please remember that the context here is that of a reliable machine that generally has external power and exists in a fixed location, but which may (as any other thing also may) lose that external power at some point.

      That a laptop in normal use spends some of its time running plugged in, some of it just charging, some of it sitting in a bag doing nothing, and some of it running only from battery -- and reports statistics based on that normal treatment -- only indicates that a laptop battery works predictably in normal use.

      This isn't normal use, though. And I, myself, have never tried this particular abnormal use of getting a new laptop, plugging it in, and leaving it that way for a Long, Long Time.

      Hence, the question.

    15. Re:We encountered something like this by ultranova · · Score: 1

      The implication is that if it cannot, then it's really not inherently more reliable than a much simpler machine with no battery at all.

      What does "inherent" reliability mean? All machines currently in use require regular maintenance. Checking the status of the battery seems like something that should be part of said maintenance routine.

      Also, at least my UPS does self-test the battery periodically, even if the mains power is never interrupted.

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    16. Re: We encountered something like this by Anonymous Coward · · Score: 2, Informative

      He is talking about the file system specification (its on-disk structure), not about the specific code implementation in Windows.

    17. Re:We encountered something like this by Anonymous Coward · · Score: 2, Insightful

      Obvious troll is obviously doing just that, i think his use of the term "silly faggots" when referring to linux users is the clue that tipped me off to this fact.

    18. Re:We encountered something like this by X0563511 · · Score: 1

      I'd like to know how you'd expect to be able to pull any such information from any battery at all.

      Unless you test it by actually discharging it, the best you can do is get a VA reading from it.

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    19. Re:We encountered something like this by adolf · · Score: 1

      So would I.

      That's my question.

    20. Re:We encountered something like this by adolf · · Score: 1

      "Inherent"

      Adjective

              Existing in something as a permanent, essential, or characteristic attribute: "inherent dangers".

      The question is: Does a laptop test its own battery when plugged in forever?

      If not, how is it tested?

      (Pulling the plug and waiting for the box to turn off != reliable, especially if the particular evil you're trying to fend off is that of SSDs sometimes going batshit crazy when their power unexpectedly goes away..)

      My UPS also has its own battery test routine (as do various items of not-PC-related gear that I maintain and that have their own battery banks), because being plugged in forever is the definition of "normal" for such a device. But it is not normal for a laptop.

      To use the word again, my UPS provides (AFAICT) inherently more reliable power than a laptop with a battery of indeterminable health.

    21. Re:We encountered something like this by AliasMarlowe · · Score: 1

      First, running an SSD on an "industrial device"
      Second, using FAT
      Third, "commercial journaling FS". What does that even mean?
      If you are industrial, where is your UPS?

      First, there was no choice, and there will be none even in future versions of that vendor's device as a component of the one we make. A spinning disk solution would be far too power intensive, and much too bulky as well.
      Second, only FAT was supported by the vendor's firmware at that time (they made improvements later).
      Third, here's one. And it's not ext3 and has nothing to do with NTFS.
      Fourth, you're joking (for one thing, a 30-minute UPS would not fit in the space available, even if the rest of the device were omitted). Or you're just another stupid troll. Or you utterly fail to understand that industrial devices must survive unexpected power-outages intact, even if they have a UPS with huge batteries or an always-on MG-set. Remember, even a power cable can fail suddenly, or the MG-set can get an abrupt fuel line blockage, and so forth.

      --
      Those who can make you believe absurdities can make you commit atrocities. - Voltaire
    22. Re:We encountered something like this by RockDoctor · · Score: 1

      If you are industrial, where is your UPS?

      Quite possibly, banned. If you are in a working environment where flammable or explosive atmospheres are present as a normal part of normal operations, or as a part of "foreseeable emergency" operations, then the normal situation is to have a gas-proof, purged (from a known-clean air supply under positive pressure) power control box which monitors several flammable gas monitors in the area under consideration. If the gas monitors detect non-trivial amounts of flammable gas in the purged zone, or in the supply of "clean" air to the purged zone, then the system shuts down the power. Immediately. Within one cycle of AC. No "if", no "but", no "maybe", no "can I finish this write". No polite prompt that "The system is going down for power down now" followed by a shrinking white dot. Power is off NOW.

      Multiple people have died for getting that wrong. Hundreds of millions of dollars of equipment have been burned or sunk. Lawsuits have occurred. Which is why the sparkies take it seriously.

      I no longer work in such areas routinely, but I spent 20 years using and maintaining such equipment, including those gas detectors. Shit, they can be pernickety, particularly the gas detectors, and particularly when they get dosed with seawater spray that comes through when the waves are hammering 20m up the side of the vessel.

      One of the companies that I work with routinely issues all of its field staff with specialised laptops suitable to this environment. The staff must use one of these machines to log into the company network (this is enforced by spyware root-kitted in on the installation image and locked behind password-protected hardware; the "unit" has routers that refuse network service to computers that aren't running the spyware). The wifi cards are removed (people at this company have died through violation of radio silence rules when handling explosives; so that possibility is eliminated; it's easier than going back to court to explain why someone else has died) and to maintain the immediate effectiveness of the power shutdown systems, the batteries are removed and the battery terminals are hot-glued over so that they can never be re-fitted. (I guess they haven't worked out a procedure to get around the CMOS backup batteries. But being soldered into a PCB, they're probably unlikely to spark themselves.)

      "No UPS" is a perfectly acceptable, expectable industrial situation. I've worked under such circumstances for nearly 30 years.

      The legislative framework in which my work environment developed was derived from electrical equipment in coal mines, and the bodies that certify equipment designs were founded in the 1930s. Flammable atmospheres and electrical equipment are not a new combination, and lots of solutions have been tried. The rarity of such explosions is a testimony to the industry having stopped (mostly) arguing about the costs of this approach. Occasionally someone tries to cut corners - through ignorance or cost-cutting. I'd bet ... TWO ... pints of beer that the 29 bodies in the Pike River mine died due to short-cutting something in this regulatory regime. Though I now see that the final report has been released and prosecutions have occurred, so I suppose I've got some reading to do (my suppositions are reasonable. See item one of this list from the report on the homicides : "It is not possible to be definitive, but potential ignition sources include arcing in the mine electrical system, a diesel engine overheating, contraband taken into the mine, electric motors in the non-restricted part of the mine and frictional sparking caused by work activities."). Not a problem I have to work with in detail any more - my job is (partly) to avoid generating those flammable atmospheres in the first case (which clearly wasn't done

      --
      Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
    23. Re:We encountered something like this by Anonymous Coward · · Score: 0

      First is answered below, second = running Windows, third = NTFS. At least that's how I read it.

  10. Already done by rgbrenner · · Score: 1

    enterprise-class SSDs have capacitors designed to last long enough for the SSD to finish any writes if the power fails.

    Capacitors cost money though.. so this is one of the things that gets stripped out of consumer-level drives to reduce the price.

  11. You ever look inside one? by Sycraft-fu · · Score: 1

    There is all kinds of extra space in a 2.5" SSD. They have a lil' CPU, some flash chips, and that's it more or less. They are quite small. In smaller form factors, then ya space can become an issue but there's plenty in a 2.5" unit.

  12. not naming names = data "pulled out of my ass" by citizenr · · Score: 2, Insightful

    Useless paper/test.

    --
    Who logs in to gdm? Not I, said the duck.
    1. Re:not naming names = data "pulled out of my ass" by Anonymous Coward · · Score: 0

      No, no ... in fact it is very useful.

      It says that SSDs are not ready for the big show, as opposed to the expectation that having no moving parts was the benefit.

      They just cut the latency. If you can live with that, go ahead; if not, stay with regular rotating hard disks.

    2. Re:not naming names = data "pulled out of my ass" by Theovon · · Score: 1

      If they do that, they won't get any more free SSDs to test, and that'll impact their ability to write papers criticizing SSDs. What would you prefer? A paper biased towards SSDs too small/cheap to be useful to you, or one that doesn't name names? Anonymity is VERY important in this kind of research.

    3. Re:not naming names = data "pulled out of my ass" by edmudama · · Score: 1

      SSDs are already in the big show, and have been demonstrated reliable in those applications. The key is to choose your vendors carefully, ask how they were qualified, etc.

      --
      More data, damnit!
    4. Re:not naming names = data "pulled out of my ass" by Anonymous Coward · · Score: 0

      I'd prefer them apply for a research grant or raise the money in some other fashion so that they're not beholden to the manufacturers. I don't know how many drives they used (don't tell me 15; if that's all they used, that's crumby research), but even 60 of the lowest capacity variants they could get wouldn't cost a great deal.

    5. Re:not naming names = data "pulled out of my ass" by citizenr · · Score: 1

      Yes, they used 15, and only a few of those were of the same brand and model.

      --
      Who logs in to gdm? Not I, said the duck.
    6. Re:not naming names = data "pulled out of my ass" by citizenr · · Score: 2

      If they do that, they won't get any more free SSDs to test, and that'll impact their ability to write papers criticizing SSDs. What would you prefer?

      I would prefer research to be done by someone who is not a manufacturer's bitch.
      You don't need a ton of money to test commodity hardware; the trick is to SELL stuff after the test, not take it home and pretend it wasn't a bribe.

      --
      Who logs in to gdm? Not I, said the duck.
    7. Re:not naming names = data "pulled out of my ass" by Anonymous Coward · · Score: 0

      the trick is to SELL stuff after the test,

      Not everyone manages to sell bricked SSD drives for significant sums of money.

      You could try to RMA the bricked ones of course. Naming names might help the RMA process go smoother, esp if you publish reviews on the RMA process too ;).

  13. up/down/up/brown/fried by h8sg8s · · Score: 2, Insightful

    What some folks don't realize is that it's the seesaw nature of many power events that's primarily behind both data corruption and SSD failure. It's a rare rack system that has its own power conditioning and UPS these days (HP NonStop comes to mind), and without it you're subject to whatever the event provides in the way of under/over voltage, spikes, drops, etc. Many times these happen in timeframes too fast for power switching equipment to react, and in some cases it's that stuff that gets fried first.

    --
    Organization? You must be joking..
  14. Interesting failure mode for Crucial SSDs by ckthorp · · Score: 1

    There is a protection mechanism that I know exists in Crucial SSDs which makes the drive appear dead after some unclean shutdowns of the drive while it performs a firmware-level integrity check of the drive. It may exist in other brands as well. Sometimes it takes 2 runs of 30-60 minutes to get the drive to re-enumerate via SATA. I'd be curious to know if the "dead" drive was affected by this bug.

    1. Re:Interesting failure mode for Crucial SSDs by drinkypoo · · Score: 1

      There is a protection mechanism that I know exists in Crucial SSDs which makes the drive appear dead after some unclean shutdowns of the drive while it performs a firmware-level integrity check of the drive.

      I don't know if they're violating a spec or not and it's probably a life's work to find out, but that seems very rude to me. They really ought to identify as busy or something, so that they don't just scare the piss out of you. If you almost-brick an Xperia phone by scragging the bootloader so bad you can't even reflash it, whatever handles the comms is still working and lurking in the background and it will enumerate via USB with the service interface. That way you know whether you should even bother. Would still prefer a more segregated bootloader, though.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:Interesting failure mode for Crucial SSDs by Voyager529 · · Score: 2

      You got this too? I just ordered a Crucial M4 on sale a few weeks ago. The day after I installed and cloned it, I had the same situation where it wouldn't start. I called Crucial, expecting to need an RMA. Luckily I got an informed gentleman on the phone who told me to leave it at the failed POST screen for 20 minutes, reboot, and give it another 20 minutes, and reboot again. It worked. Supposedly it's not so much a 'bug' as an 'obscure feature'. ...I'm keeping my spinning rust drive around just in case.

    3. Re:Interesting failure mode for Crucial SSDs by ckthorp · · Score: 1

      Overall, it is a good thing. The data isn't organized linearly for wear leveling purposes, so a power outage can leave the metadata in an inconsistent state. Also, make sure you have the latest firmware on the drive. They had a fun one earlier that caused a drive lockup hourly after the power on counter hit about 35k hrs (or some such). I've got about 2 dz M4 drives in service, so I've seen a lot of the bugs.

    4. Re:Interesting failure mode for Crucial SSDs by ckthorp · · Score: 1

      I agree. The first time one of our engineers' laptop HDs did this, it was rather uncomfortable to say the least. I think a good compromise solution would be to have it enumerate with a "useful" drive textual model identifier like "M4 ERROR CHECKING, LEAVE ON 30 MIN" or some such. I'm sure it violates some standard, too, but it would at least give the user some indication of what is happening.

  15. yeah, looked at the pdf.... by Anonymous Coward · · Score: 0

    "We use synchronized I/O (O_SYNC), which means each write operation does not return until its data is flushed to the device."

    Sure about that? Most of the devices I've seen will report "command complete" while data is either in DRAM or in flight even with write cache disabled. There's only a few that don't do that, and they aren't the cheap ones. You may get lucky on a major player stuffing some decent code in a consumer grade SSD for the sake of fewer firmware versions in manufacturing, but it's usually not the case.
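
    For reference, this is roughly what an O_SYNC write looks like from the application side. It is a minimal sketch for a POSIX system, not the paper's test harness, and write_sync is a made-up name. The flag only asks the OS not to return until it has pushed the data to the device; whether the device has really committed it is exactly the doubt raised above.

```python
import os

def write_sync(path, data):
    """Write bytes to path with O_SYNC so each write() blocks until the
    kernel reports the data as transferred to the underlying device."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
```

    Nothing here can tell you whether the drive acknowledged the write from DRAM, which is why the paper had to cut power to find out.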

    Any device with a "super cap" over 2 years old is suspect. They degrade. All of them are using ceramic arrays now, and only guarantee data in flight if you're really pestering them on a design review.

    Also the "brick" may not be a brick. When these drives have to rebuild translation tables, it can take a while. I've seen 60+ minutes on a 400G device. Leave the power on and wait a couple of hours. Reboot. You might get your drive back, maybe even most of your data. I wouldn't count on the last write, but you may get that too if their raid works.

  16. UPS does nothing for the common fault case. by stoploss · · Score: 3, Informative

    Most enterprise SSDs do have small supercapacitors or capacitor arrays onboard for exactly this reason. Some of the higher-end consumer drives do too. But most consumer drives don't.

    The answer? Get a UPS.

    A UPS is no panacea: I experience grid failure very rarely.

    However, relatively speaking I experience many more kernel lockups that require an ACPI-initiated poweroff by holding down the power button until the machine abruptly powers off. What do you do when a reboot/poweroff command causes your Linux/BSD machine to hang? I/O handle leaks in the Samba SMB client (ie. *not* the smbd daemon) and the Samba Winbind code are notorious for this. The only times I have ever had to "yank power" from a production Linux database machine were due to SMB share mount zombies or Winbind that the kernel couldn't kill even during an issued reboot command.

    I have several OCZ Vertex 4 SSDs, and this concerns me—especially due to the fact that the paper/presentation does not disclose the test results. I guess I will just have to hope that my device models aren't affected and/or that waiting a minute or two during a hung poweroff/reboot means the kernel has stopped attempting to write to the devices and everything has flushed.

    PS. If you compare the vague results in the summary with the paper you will find that only two of the fifteen drives passed the tests, yet four of the devices were cited to have power protection capacitors.

    1. Re:UPS does nothing for the common fault case. by Anonymous Coward · · Score: 0

      I don't understand how if they claim that it takes up to 20 sec for the final write to finalize that a computer that simply shuts down in 10 sec won't have the same problem.

    2. Re:UPS does nothing for the common fault case. by Anonymous Coward · · Score: 0

      Suggest you use MS Windows so it won't lock up all the time.

    3. Re:UPS does nothing for the common fault case. by stoploss · · Score: 2

      I don't understand how if they claim that it takes up to 20 sec for the final write to finalize that a computer that simply shuts down in 10 sec won't have the same problem.

      Drives support a blocking "sync" command that is only supposed to return when the drive has flushed all pending writes and has reached quiescence. If there is nothing pending to flush then the command will return immediately. If not, it may take the cited 20 seconds to return. Normal reboot/poweroff procedure in the OS waits for this condition, and this has been around forever (the HDD equivalent is to flush write cache and park the heads). That's why a 10 second shutdown can be safe even with the putative 20 second worst-case writeout window: if there are no pending writes then there is nothing to do.

      Yanking the power prevents this sanity check from happening.
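
      The nearest thing an application can do to that blocking flush is fsync. A minimal sketch, assuming Linux (where a directory can be opened read-only and fsynced too); durable_write is a made-up name, and whether the kernel also issues a cache flush to the drive depends on the filesystem and mount options.

```python
import os

def durable_write(path, data):
    """Write data, then block until the kernel reports it flushed."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer into the kernel
        os.fsync(f.fileno())   # block until the kernel has flushed the file
    # Also flush the directory entry, so the new file name itself survives.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

      If the process is killed or the power goes before fsync returns, all bets are off, which is the same point as above.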

    4. Re:UPS does nothing for the common fault case. by adolf · · Score: 1

      Instead of a supercap I'd rather there be a couple of replaceable Lithium coin cells inside of an SSD, to just finish writes after the power unexpectedly dips for some reason. They're cheap commodities, they seem to have predictable failure rates, and I don't remember the last time I changed one in any computer (though it used to be a fairly frequent repair).

      By using them only once in a blue moon and occasionally monitoring the voltage and setting a SMART error if they're getting worn out, I'd estimate that they'd last at least as long as SSDs seem to . . .

      But in the meantime, why not just use a reset button? It kicks over the whole machine without depowering accessories like drives, which would satisfy your needs and the concerns posed by TFA.

      Yes, the physical existence of such a button is sadly lacking these days, though third-party motherboards and the boxes built with them still have the appropriate connector. The motherboard in the computer I write this on even has small reset and power switches right on the board.

      Perhaps this SSD issue will invoke a resurgence in reset buttons, or at least a change in power-button behavior.

      There's no good reason why it can't function as a power button, an ACPI sleep button, and a reset button, with firmware changes and appropriate blinken-lights: Push once for sleep/graceful shutdown (or do-nothing), hold until power light goes out/flashes/whatever for reset, hold even longer for instant power off.

      Or, a software approach: "omg! someone touched the button! quick, tell the disks/SSDs to flush their buffers just in case!"

      Or, a user approach (which may or may not work for your scenario). Google "Magic SysRq Key": You can sync and unmount disks and reboot/kill zombies without ever pulling the power out from underneath the hardware, as long as the kernel is still somewhat listening.

      Or, some combination of any of these. . . .

    5. Re:UPS does nothing for the common fault case. by Vairon · · Score: 1

      Assuming you have sysrq keys enabled, you can hit alt-sysrq-s, wait for the sync to complete, alt-sysrq-u, alt-sysrq-b. This performs a filesystem sync, then remounts all filesystems read-only, then reboots the system. Also if you have a stuck mount point you can always use a lazy umount (umount -l) to remove it from filesystem hierarchy so you don't need to reboot in the first place.
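
      If you can still get a root shell but not the keyboard, the same s/u/b sequence can be written to /proc/sysrq-trigger. A minimal sketch, assuming Linux and root; emergency_reboot and its parameters are made-up names, and the fixed delay is a guess standing in for "wait for the sync to complete". Be careful: the final "b" reboots immediately.

```python
import time

def emergency_reboot(trigger="/proc/sysrq-trigger", delay=5.0,
                     steps=(b"s", b"u", b"b")):
    """Send sync, remount-read-only, reboot as separate sysrq requests.

    Each key goes out in its own unbuffered write, with a pause between
    steps so the previous one has a chance to finish.
    """
    with open(trigger, "wb", buffering=0) as f:
        for i, key in enumerate(steps):
            f.write(key)
            if i < len(steps) - 1:
                time.sleep(delay)

# Example (as root, and only if you really mean it):
#   emergency_reboot()
```

      This only helps while the kernel is still alive enough to service the write, same caveat as with the keyboard combination.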

    6. Re:UPS does nothing for the common fault case. by ultrasawblade · · Score: 1

      Look into kexec. I think it can be used to reboot your system even if your current kernel is hung.

    7. Re:UPS does nothing for the common fault case. by FirephoxRising · · Score: 1

      Anyone with access to the article? What are the results, which ones didn't lose data, which one bricked?!

    8. Re:UPS does nothing for the common fault case. by PlusFiveTroll · · Score: 1

      I'm assuming that hitting reset will cause the board to send the drives a reset and initialize command pretty quickly.

      The SysRq key sounds like the best option (though if you don't have PS2, usb locks up with the board way more often).

    9. Re:UPS does nothing for the common fault case. by adolf · · Score: 1

      I'm assuming that sending the drives reset and initialize commands will allow them to continue to work properly, as opposed to unexpectedly yanking the power from them.

  17. UPS a solution for this by Anonymous Coward · · Score: 0

    This reinforces my personal policy - if it is not a self-powered laptop (usually running on mains), all of my other computers are powered through UPS (Uninterruptible Power Supplies), e.g. Belkin, APS, et al. I learned this the hard way when a brief power interruption scrambled several conventional hard drives on an old Mac Quicksilver.

  18. Buy a SSD with a battery or capacitor by thue · · Score: 2

    This is old news; see e.g. Wikipedia's coverage. Only buy SSDs with a battery or capacitor, or whatever is in the DRAM cache of the SSD will be lost on power failure.

  19. UPS irony by stoploss · · Score: 1

    Uhhh...we solved this problem ages ago with UPS. If you care about your data put the machine on a UPS. I've had my business customers on UPS systems for years, showed them how to test the batteries and swap 'em when they get worn out, no problems.

    That may help, but it isn't sufficient. I had one client on an APC SmartUPS that caused more power failures than it prevented. Why? Ambient thermal shutdown of the SmartUPS resulted in it abruptly powering off repeatedly even while the grid was up. So, if they did not have a UPS installed they would not have had any of those power outages, and, for bonus irony, grid failures were quite rare and never occurred while I was there.

    This may seem like it goes without saying, but the installation context matters.

    1. Re:UPS irony by drinkypoo · · Score: 1

      After the third used APC UPS that still didn't work properly after battery replacement, I gave up. None of them could handle vaguely anywhere near the load they were supposed to. I don't know whose UPSes to buy, but I wouldn't buy anything from APC any more. It's unfortunate, because they used to follow a simple formula (fat traces, quality components, sturdy enclosures, priced accordingly) and they were a good value proposition.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:UPS irony by stoploss · · Score: 1

      The ironies caused by attempting to prevent faults through increased complexity are multifarious.

      I had one client whose installation was at a datacenter with the "standard" triple redundancy of power supply: grid, UPS, and generator. Furthermore, all racks had an "A" and "B" power distribution network. One day they were attempting to bring the "A" power distribution system back into code after inspection (WTF?), so they took "A" offline to make changes. No planned effects, because all the units in the rack had dual power supplies... but then a worker dropped a wrench and "B" went down too after the breaker tripped—for the whole datacenter. Total power loss... and yet the grid was still up.

      Lovely.

      There's no escape from the fact that increased complexity increases the risk of catastrophic failure. To wit: if the 2013 Super Bowl didn't have the multi-grid relay installed, they wouldn't have had any outage. The power failure prevention mechanism caused the outage. Simpler is usually better, unless people want to pay massively to eliminate all their single points of failure in order to escape the bathtub curve of ironically-diminished reliability caused by the increased complexity of a naively-implemented "redundant system". Even in "properly engineered, no-single-point-of-failure" systems there are often hidden failure vectors, so I normally advocate simplicity and "warm standby" approaches.

    3. Re:UPS irony by drinkypoo · · Score: 1

      That multi-grid relay was supposed to be a warm standby approach, but it was too clever for its own good, which brings us back to your opening sentence. Manual switches would have been preferable. A brief power outage while a maintenance guy scrambles for the relay (hopefully there's someone stationed near the control...) is acceptable when the utility goes down. Having the system screw up and imagine itself an emergency is just lame.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    4. Re:UPS irony by stoploss · · Score: 1

      That multi-grid relay was supposed to be a warm standby approach, but it was too clever for its own good

      Haha, absolutely. Once I had a client who "upgraded" their gigabit ethernet topology to provide multipath IO from their production servers to their SAN. All these switches had dual power supplies connected to dual distribution power systems and the servers had dual power supplies, dual NICs on separate VLANs, etc. Fair enough; I enabled MPIO for the iSCSI SAN access on the production servers.

      What *actually* happened? A month later we experienced massive data loss from production server process output because the multipath IO to the SAN triggered a firmware bug in the SAN and made it abruptly drop offline completely (and also corrupt recently-written data FTW). Comparatively speaking, how often was it likely that we would lose network access from the server to the SAN (which was in the same rack) in a single-path ethernet configuration? What, failure of the solid-state ethernet switch? Ethernet cable failure in unmoving cables? Ha.

      Later on in a separate incident, we lost access to the SAN because someone was doing maintenance to the switch plug configuration during business hours and redundantly unplugged the redundantly-powered switches.

    5. Re:UPS irony by David_Hart · · Score: 1

      After the third used APC UPS that still didn't work properly after battery replacement, I gave up. None of them could handle vaguely anywhere near the load they were supposed to. I don't know whose UPSes to buy, but I wouldn't buy anything from APC any more. It's unfortunate, because they used to follow a simple formula (fat traces, quality components, sturdy enclosures, priced accordingly) and they were a good value proposition.

      I have tried many other UPS devices and APC are the only ones I trust. It's not clear if you are buying used UPS devices or are the original owner and are just replacing the battery. If you are buying used units (i.e. off of ebay), you never know if they have been hit with a surge, etc. Beyond that, they contain a battery. Batteries degrade over time, usually between 3 to 5 years, same as your car battery. This means that while the UPS will have the rated capacity with a new battery, it will become less over time. My strategy is to load a UPS at no more than 70% of capacity. If you are going over this, then you are asking for problems.
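
      As a worked example of the 70% rule: the minimum rating is just the load divided by 0.7, rounded up. A minimal sketch; min_ups_rating is a made-up helper, and it treats load and rating as plain watts without getting into VA versus W or battery ageing.

```python
import math

def min_ups_rating(load_watts, max_load_fraction=0.70):
    """Smallest whole-watt UPS rating that keeps the given load at or
    below max_load_fraction of the unit's rated capacity."""
    if load_watts < 0:
        raise ValueError("load_watts must be non-negative")
    if not 0 < max_load_fraction <= 1:
        raise ValueError("max_load_fraction must be in (0, 1]")
    return math.ceil(load_watts / max_load_fraction)

# A 300 W load needs a unit rated for at least 429 W under the 70% rule.
print(min_ups_rating(300))
```

      In practice you then round up to the next model size the vendor actually sells.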

    6. Re:UPS irony by stoploss · · Score: 1

      You make good points, but the particular SmartUPS I cited was acquired brand new.

      I can even understand the thermal shutdown (despite the irony it caused); however, why didn't it use the built-in alarm sound when it was approaching thermal shutdown? These units scream incessantly if they don't like their battery anymore... even worse than a smoke detector does. Why couldn't it have done similarly if it was approaching a thermal cutout? "Happy, happy, happy... DEAD!"

      The fact remains that if the client hadn't bothered with the effort and cost of installing a UPS then they would have suffered *zero* power outages over the course of the 18+ months I was there.

    7. Re:UPS irony by hairyfeet · · Score: 1

      Sadly I have to agree APC was once a good company, now they lowball the shit out of everything and they suck. Now as far as power? Double it. I know, sounds like a bullshit way to raise rates but I have found they ALL vastly overestimate what the unit will take so if you need 300w? Buy a 600w. Need a Kilowatt? Buy 2. Yes it sucks but you have to play the game and that I've found is a good hard fast rule that works for pretty much any company in the UPS business. Now if it's a home user you can often get by with "power and a half" like say buying a 450w when they need a 300w but when it comes to business better safe than sorry, and all my happy customers tell me that I made the right call.

      The bitch is now I'm seeing the same thing from PSUs, a unit rated at 300w will be lucky if it isn't a smoking wreck if you pull more than 180w so now I'm starting to do the "power and a half for home, double business" rule for PSUs as well and so far it seems to work. That is what happens when companies can get by with selling substandard because everybody is doing it, you end up having to buy more than you need because their ratings are all lies.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    8. Re:UPS irony by FirephoxRising · · Score: 1

      I have a MGE Pulsar extreme 1000c that's been awesome! Wasn't impressed with the APC before it....

    9. Re:UPS irony by drinkypoo · · Score: 1

      The first thing I do when I get a used UPS is replace the battery. Well, after a function test. I have substitute batteries around that I can test with.

      Unless they've actually been literally hit by lightning, which in California is surpassing rare, they have no excuse for having fried. I used to buy used APC UPSes and replace the battery, and have them work for years. Now, I don't even consider buying them new. They shifted to the cheap consumer shit model. Thing is, I keep hearing about their big heavy UPSes failing in pathetic and unacceptable ways that tell me that this lack of attention to detail is probably now found across their entire range.

      Today, if I had to add battery back-up to a computer system, I would do it myself. Use a PicoPSU, put the battery in the case. Inverter to power the display. All I need are batteries and a suitable charger. I wonder if I have room for an Optima in there, then I could just use a car charger.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  20. My Personal Policy by wisnoskij · · Score: 2, Insightful

    This is why I don't use prototype tech that is really not ready to be used in the real world. And if you do, expect loads of bugs and bricking.

    But either way, thanks for funding the development of something I am excited to try out in 2-4+ years when it will be a mature usable technology.

    --
    Troll is not a replacement for I disagree.
    1. Re:My Personal Policy by edmudama · · Score: 1

      SSDs are way past prototype technology at this point. The products from high quality vendors are both fast and robust.

      --
      More data, damnit!
    2. Re:My Personal Policy by Anonymous Coward · · Score: 0

      As opposed to tech that is guaranteed to fail, i.e., hard drives?

      Just use backups.

  21. SSDs made by HD manufacturers by Burz · · Score: 1

    I think you make a good point about warranty clauses, and it would be hard to imagine HD manufacturers singling out their SSDs with an inferior warranty in this respect.

    Considering the paper cited by TFA won't spill the beans on which models were tested, it may be a safer bet to purchase SSDs from traditional HD makers (at least I hope that is the case with my Samsung).

  22. Server Application by Anonymous Coward · · Score: 0

    So for servers we should use a mixed SSD/hard drive RAID for data integrity?

  23. Power factor? by stoploss · · Score: 1

    Need a Kilowatt? Buy 2. Yes it sucks but you have to play the game and that I've found is a good hard fast rule that works for pretty much any company in the UPS business.

    Yes, that sucks, and I am no defender of the UPS industry.

    However, remember that UPS's are rated in VA rather than watts. Power factor is an important issue, and that's why a UPS or small gasoline generator sucks for spinning up an A/C unit motor, or in the case of technology, a highly capacitive load. If (and only if) your load is purely resistive then VA == watts; otherwise, watts < VA consumption. All computer systems will have a power factor of <1. Simply put, if you drive an incandescent light bulb then VA is the same as wattage, but if there is any capacitance (computer hardware) or inductance (motors), then VA is higher than wattage because the power factor is less than 1.

    VA is the only fair metric for a UPS manufacturer to use, because it's trivial to come up with a load that has very high VA consumption while having close to zero wattage (i.e. very low power factor)... however, that high VA load still drains the UPS battery that is running an inverter to produce an AC waveform.

    Watts pulled from the wall (i.e. with no power factor applied) is most certainly not the same as the VA that UPS capacity is listed in. It makes sense that it is harder to spin up a motor or drive the capacitive load on a computer than it is to drive a simple resistor like an incandescent light bulb.

    Now, if you're alleging that the manufacturer-cited UPS VA capacity is being treated similarly to the laptop manufacturers' "estimated runtime" inflation then you may be right. I just wanted to ensure you were comparing apples to apples.

    1. Re:Power factor? by ub3r+n3u7r4l1st · · Score: 1

      In other words that's why you should NEVER plug in a laser printer behind a UPS. The initial current surge when the laser printer wakes up can easily overload the UPS and make it shut itself off, cutting off the power to the other devices on the UPS completely.

    2. Re:Power factor? by hairyfeet · · Score: 1

      Well I already have my customers plug the extra crap like printers on a separate surge protector because I had heard some of those, especially the high end printers and scanners some of my customers use, can really pull on startup so I can tell ya that ain't it. ALL that is plugged into the UPS are the PCs and the monitors (we have to put the monitors because these ain't your cheapo LCDs, we are talking the high end graphics monitors) and I have found the "double" rule is pretty damned mandatory.

      But like I said now I'm seeing that with PSUs, and of course as you pointed out their laptop times are complete horseshit as well. Like I said this is what happens when ALL of the companies lowball, they can get away with bullshit like this because they are all doing the same bullshit. Just look at how all the SSD manufacturers put out these insane MTBF numbers when we know that the more cells and smaller processes equal worse MTBF times, it's all marketing bullshit which means you gotta play the game. To steal a line from Tron Legacy "I stand for the users" and what matters is them, so you have to know what works and what don't or you could cost somebody some serious dough and I've found the double rule WORKS and that is what matters.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    3. Re:Power factor? by stoploss · · Score: 1

      That all makes sense. However, as a practical example, let's say you take a computer that has a power supply that's rated for 600 W and has a power factor of 0.5: that computer will require 1200 VA to drive the load. Furthermore, a computer monitor is practically the definition of a low power factor load, as well as the motors on the printers you wisely exclude from the UPS circuit.

      BTW, the 80 Plus power supply certification program requires a power factor of 0.9 or better at full load, on top of its efficiency targets.

      If you want to compare your wattage, true VA load, and power factor (watts / power_factor == VA) then I suggest grabbing a Kill A Watt meter. Amazon has them for less than $20.

      So, doubling the wattage in VA is certainly a reasonable thing to do and would be precisely correct if the power factor for everything nets to be 0.5. Also, when your clients are upgrading their computers have them get power supplies with active power factor correction. It will decrease the load on their UPS as well as decreasing the day-to-day load on their HVAC (less waste heat).
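
      A rough sketch of that arithmetic in Python, in case it helps. The function names are made up for illustration, and adding up the VA of each device is a simplification that errs on the high side, since the loads won't all be in phase with each other.

```python
def apparent_power_va(real_power_watts, power_factor):
    """Convert real power (watts) to apparent power (VA): VA = watts / power_factor."""
    if not 0 < power_factor <= 1:
        raise ValueError("power factor must be greater than 0 and at most 1")
    return real_power_watts / power_factor


def required_ups_va(loads):
    """Conservative VA estimate for a list of (watts, power_factor) loads."""
    return sum(apparent_power_va(watts, pf) for watts, pf in loads)


if __name__ == "__main__":
    # The 600 W computer at a power factor of 0.5 needs 1200 VA.
    print(apparent_power_va(600, 0.5))
    # A purely resistive 600 W load needs only 600 VA.
    print(apparent_power_va(600, 1.0))
    # Hypothetical desk: a 300 W PC at 0.5 plus a 100 W monitor at 0.5.
    print(required_ups_va([(300, 0.5), (100, 0.5)]))
```

      With every load at a power factor of 0.5 the answer comes out to exactly double the wattage, which is the "double" rule. Feed it measured numbers from the meter instead and the required VA may come out lower.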

    4. Re:Power factor? by hairyfeet · · Score: 1

      All the business units are already 80 plus as it saves money as you pointed out and frankly the 80 plus PSUs tend to last longer and when somebody is paying me a grand to build a system? They want it to last.

      But what I was getting at is it didn't USED to be this way, before I could use "one and a half" so that the customers had headroom for growth and call it a day but the past few years as Drinkypoo found out the numbers all the UPS guys use are bullshit so you have to buy double just to get what you really want.

      And I doubt a kill-a-watt would tell me much as frankly most of these systems are in no way power piggies. Most are using 95w AMD CPUs with midrange graphics in the 40w range, nothing that is really gonna pull a heavy load. It's just the years of racing to the bottom and lowballing on parts has taken its toll on the UPS business. Like I said I'm seeing the same thing with PSUs, whereas before you could buy one that was say 100w above what you needed to give you room for growth and now you need power and a half or even double just to keep from melting the PSU. It didn't use to be this way, used to be a PSU was a PSU and as long as you gave yourself some headroom to grow all was golden, not the case anymore.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    5. Re:Power factor? by PlusFiveTroll · · Score: 1

      Working for a lot of offices there is another common item that gets plugged in to UPSs and kills them with a double whammy.

      Little space heaters.

      People find the first plug under their desk and stick a 1500W heater that kicks on and off, generally right beside the UPS heating it up like an oven!

  24. Why does the word "ECC" by rs79 · · Score: 1

    Not occur anywhere on this page?

    --
    Need Mercedes parts ?
  25. Less than? Don't think so. by SuperKendall · · Score: 1

    Yes, let's just not use anything that fails once in a while, even if it is even less than the thing it is protecting.

    Back when I was still using desktops I had more power outages from a consumer UPS going bad (say, twice a year or a bit more) than I did from real power failures.

    Furthermore, you could tell a normal grid power failure was probable in some cases, say a severe lightning storm. Then you could prepare a bit, at least making sure things were saved. A UPS failure strikes at any moment at all without care.

    AND a power failure from the power company is likely to be short. A UPS failure is GUARANTEED to last until you physically unplug the UPS from the path to power, so say if you are not at home that system is offline until you get back.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  26. Since I did RTFA by rabtech · · Score: 2

    Power loss protection (super capacitors) was stated on four of the drives (the four least expensive to boot). Only three performed flawlessly in the unserialized writes test. Those aren't great odds. In fact only two drives passed all tests with no errors, and it wasn't necessarily the SLC "enterprise" drives, though those two also passed the serialized writes test.

    In case you aren't aware, unserialized writes invalidate *every* assumption, including write ahead, journaling, even your fancy BTRFS/ZFS. His example is a database where the transaction log write was sync'd before the data page write, then after a power failure the data page is persisted but the log write is gone.

    You can recover from many of the other errors or at least detect them but unserialized writes can silently corrupt data or even ruin the entire filesystem.

    Obviously the metadata/dead failures are the exception... Those render the whole SSD useless.
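
    To make the unserialized-write case concrete, here is a minimal sketch of how you could flag it after a crash. This is not the checker from the paper; the function name and the model are assumptions. It takes a list of writes in the order they were issued, assumes each one was sync'd before the next was issued, and takes the set of writes found on the device after power came back.

```python
def find_unserializable(write_order, survived):
    """Return writes that were lost even though a later write survived.

    write_order: writes in issue order, each assumed sync'd before the next.
    survived: the writes actually found on the device after the power fault.
    """
    survived = set(survived)
    # Index of the last write in issue order that made it to the device.
    last_survivor = max(
        (i for i, write in enumerate(write_order) if write in survived),
        default=-1,
    )
    if last_survivor < 0:
        return []
    # Anything earlier than that survivor should have persisted too.
    return [w for w in write_order[:last_survivor] if w not in survived]


if __name__ == "__main__":
    order = ["log record", "data page"]
    # The database example: data page persisted, log write gone.
    print(find_unserializable(order, {"data page"}))
    # Losing only the most recent write is an ordinary, recoverable loss.
    print(find_unserializable(order, {"log record"}))
```

    An empty result means the survivors form a clean prefix of what was written, which is the only outcome journaling and write-ahead logging are built to handle.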

    --
    Natural != (nontoxic || beneficial)
  27. Re:Less than? Don't think so. by Guspaz · · Score: 1

    I'd suggest, then, that you've been buying the wrong UPSes, because while they do occasionally fail, they don't fail at anywhere near the rate that you've experienced (or rather the quality brands don't). I've owned several units for quite a few years, and none have ever failed while powering equipment in a manner that caused a power loss. An expired battery did cause a failed self-test, but that didn't cause any outage, merely the indication that the battery should be replaced. Another unit (the one time I bought a noname brand) did fail catastrophically, but not while actively powering equipment.

  28. Re:Less than? Don't think so. by SuperKendall · · Score: 1

    After buying several different brands I gave up. Most were around $100, it's not like I was buying the cheapest UPS systems I could (at least not after the first one failed).

    I will say that I bought a UPS a year ago for just my router, and so far that has not failed. So I think it was really more a matter of UPS systems really not being able to handle anywhere close to rated load, as some people have discussed in other comments... Again that was with a desktop system and a monitor, it should have been within tolerance but obviously it wasn't as the UPS systems kept dying.

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  29. Re:Less than? Don't think so. by PlusFiveTroll · · Score: 1

    The problem was any $100 UPS isn't a good UPS. Most decent UPS's are going to be closer to $300, which is pretty expensive for many users (though not business users who could easily lose $300 worth of work in the blink of the lights). More expensive battery backups monitor your batteries and perform tests on them and give you very good performance metrics. Closer to the $500 range and you'll get into the 'online' UPS range that conditions all power to the device. Oh, and make sure the electrical system you are plugging into has a proper, low resistance ground.

    What to take from this.
    1. A $100 UPS is not a good UPS, doesn't matter who makes it.
    2. Like any component, look at the cost of downtime and restoration caused by UPS failure and price the unit you get accordingly.

  30. Science?, of the 'but the 3rd mouse got away' var. by tkjtkj · · Score: 1

    Reading the original PDF, I noted the following: "Our testing framework detects unserializable writes with HDD#1, too. This indicates that some low-end hard drives may also ignore the flush requests sent from the kernel. On the other hand, HDD#2 incurred no failures due to power faults. This suggests that this high-end hard drive is more reliable in terms of power fault protection."

    Now, for reference, the above involved testing TWO spin-type HDDs. HDD#2 was the 2nd drive of the pair, and it reportedly did not fail. While I was doing bio research I recall this joke: "33% of the test subjects improved with the treatment, 33% had no improvement, and the 3rd mouse got away." WHY oh why even *mention* such a meaningless 'result' in a paper that otherwise seems to adhere to proper scientific methods??

    Oh, and yes, of course, how come it's ONLY the researchers who learned which drive brands did well?? Was this gov-funded? This is important, for if we were to learn that the SSDs that did well are no longer available on the market, it would save us a LOT of time, not even considering that any of the 'surviving' SSDs might be known to have other serious flaws in design (i.e., in performance)!!

    --
    "There are 11 kinds of people: those who know binary, those who don't, and those who could not care less!"
  31. Re:Less than? Don't think so. by X0563511 · · Score: 1

    I've been using UPS for years, and I've not had one fail in such a way once. The closest to failure that I have is that one only lasts for half a minute or so (the lead plates in the battery are trashed).

    In return, since the ones I buy have voltage regulators, I've never had to replace a dead power supply since I've been using them. Not. Once.

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
  32. SSD "power holdup card" ? by gmarsh · · Score: 1

    I'm envisioning a little PCB that goes between a power cable and an SSD, and has some power management parts and a holdover capacitor. If power fails it would provide power to the SSD for a few seconds, hopefully long enough for it to flush its data to NAND. Could also do overvoltage protection etc. to prevent a bad power supply from frying the SSD as an extra feature. Should only cost ~$10-20 or so to make in quantities of 10 or so, and be a pretty quick design to bang out.

    It won't fit in anything other than a desktop PC. And I wouldn't be surprised if some SSDs would still drop dead with the card, because they'd have some dumb quirk like the controller hanging up if the SATA interface drops dead...

  33. Anecdotal Evidence by holiggan · · Score: 1

    My ex-boss had to deal with this problem. Short version: power issues are potentially worse for SSDs than for hard disks.
    I got an SSD a year and a half ago, for my home desktop rig, and "teased" my then-boss into getting one for his work laptop.
    My SSD is up and running nicely (with stable current, very rare power outages), always shut down, no hibernates or anything like that.
    My ex-boss had to RMA 3 or 4 different SSDs, because he uses hibernate on his laptop and a couple of times after resuming, the SSD simply "reverted" to a previous "disc state". For example, after installing Windows 7, and the software and data, the SSD would "reset" back to the point right after installing Windows. Also, one of the times he completely formatted the SSD and after a reboot, it went back to the time it had Windows and everything else! Really odd and freakish, and usually a hibernate or even a normal shutdown was done before the SSD broke / bricked / froze...
    The laptop was not very recent (was probably 2 or 3 years old by the time he got the SSD), so some SATA driver issue combined with different power requirements or improvement over those years might explain such an unlucky streak...
    My SSD is still running nice and good, my ex-boss meanwhile replaced his laptop and SSD.

    --
    "A sysadmin is a cross between a detective, a police officer, a gardener, a doctor and a fireman"