Flash Destroyer Tests Limit of Solid State Storage
An anonymous reader writes "We all know that flash and other types of solid state storage can only endure a limited number of write cycles. The open source Flash Destroyer prototype explores that limit by writing and verifying a solid state storage chip until it dies. The total write-verify cycle count is shown on a display — watch a live video feed and guess when the first chip will die. This project was inspired by the inevitable comments about flash longevity on every Slashdot SSD story. Design files and source are available at Google Code."
It'll be nice to get some third-party data on exactly how long these things last on average.
a live stream linked on slashdot.. ouch..
Flash! Aa-aaahhh!!
For the guy a couple days back who asked what kind of project can he do that would be useful to the world, here is a great example. Try something like this.
Qxe4
Wait, which flash are we talking about here?
I was expecting something cool, like storing a picture, displaying it, and then constantly XORing each pixel with some random number twice, repeatedly, and watching the image decay over time. Although it would appear that it'd need quite a lot of time.
This project was inspired by the inevitable comments about flash longevity on every Slashdot SSD story.
Take that every 'dotter that says bitching on this website doesn't get anything done!
/removestonguefromcheek
Motorcycles, Robots, Space Gossip and More!
article says: We used a Microchip 24AA01-I/P 128byte I2C EEPROM (IC2), rated for 1million write cycles.
Um, SSDs don't use anything like this part as their storage.
I bet the server's IP address is untraceable.
"The cost of freedom is eternal vigilance." -Thomas Jefferson
The display only goes to 9,999,999! I think that won't be enuf... should be 100M or 1G.
Depends - if the chips are using some sort of error correction, they may well just fail. I have USB-based Flash die all the time and it DIES, as in not even presenting a usable device to the OS despite being "detected". The theory is that they fail nicely but the chances are that any non-premium flash will just die a death. Why bother making the device fail gracefully if it's failed anyway?
Literally - I've never seen a flash device in such a "read-only" mode, even for a single bit, but I can't even begin to count the number of flash-chips in certain devices (everything from routers to USB sticks) that just die for no reason and never recover.
Now, to see how much explosives it takes to MAKE it fail!
This is my favorite part! :-)
Any technology distinguishable from magic is insufficiently advanced. - Geek's corollary to Clarke's law
Oh bleh... AC box checked accidentally. The parent is me.
I support the Slashcott and will not be reading or commenting from 2/10/14 to 2/17/14. Beta is steaming pile of dog shit
Most modern flash memories have their controllers check which blocks are dying or dead and re-route write and read requests to good blocks. So while your flash may seem to be working perfectly well, various blocks inside it may be dying and its storage size may be progressively decreasing.
So I hope they are rewriting the entire flash in their test. Otherwise it is not representative.
Actually, I rescind my post, as I realize I was confusing EEPROM with NOR/NAND. Your point is actually quite valid.
I support the Slashcott and will not be reading or commenting from 2/10/14 to 2/17/14. Beta is steaming pile of dog shit
More importantly, the test pattern does not resemble normal SSD usage. Complete writes are very unusual for SSD and a cycle is not completed nearly as quickly as a cycle on this EEPROM (400 cycles per minute). When an SSD is written to in normal usage, a wear leveling algorithm distributes the data and avoids writing to the same physical blocks again and again. The German computer magazine C't has run continuous write tests with USB sticks and never managed to destroy even a single visible block on a stick that way. The first test (4 years ago) wrote the same block more than 16 million times before they gave up. The second test (2 years ago) wrote the full capacity over and over again. The 2GB stick did not show any signs of wear after more than 23TB written to it.
Okay, I'll bite. Let me introduce you to this thing called "functional equivalence". You do realize that even though they are all "nonvolatile storage," there is a difference between EEPROM and Flash, and that there are many different kinds of low- and high-density Flash and they all have different proprietary silicon designs with different characteristics?
Microchip EEPROMs are specifically designed for low-density, high-reliability applications, and are totally different at the transistor level from high-density MLC Flash used in solid state disks.
+1 Informative :)
They could add an extra digit to the front of the display showing how many times the other numbers have reached their maximum! Brilliant, 10x the capacity for only one digit more!
They're testing an EEPROM: while the underlining physics of storing data in an EEPROM and Flash RAM are the same - floating gate transistors - EEPROMs use best-of-breed implementations, single-bit addressable floating gate, while the Flash RAM found in SSDs is the cheapest, lest enduring MLC NAND. MLC NAND are the cheapest per bit, and have a write cycle endurance of two to three orders of magnitude lower than EEPROMs.
SSDs do not contain EEPROMs. They don't even contain SLC (NOR or NAND). In fact, SSDs don't even contain NOR MLCs. Only the cheapest will do, for SSDs.
"The agriculture ministry is not in charge of Gundam" - Japanese ministry official.
It may be that the controller on the device just doesn't know what to do when something goes pear-shaped. To be sure, you should be accessing the raw NAND chip itself.
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
(Being a software guy rather than a Flash memory guy, I wouldn't want to guess whether over-erased cells would be at logic 1, logic 0 or a mix of the two.)
Well I'm not an expert on flash, but I know a little about how they work. In NOR flash the data line is pulled up to one, so that's the default state for any bit. There's a transistor connected to ground, and if the floating gate has a charge in it and the transistor is on, then it pulls the data line down to 0. "Erasing" a NOR flash sets all the bits to 1, and programming it sets certain bits to 0.
The most common failure mode as I understand it is that electrons get trapped in the floating gate even after erase cycles such that it's very close to or over Vt for the transistor, so that bit would be stuck in the "programmed" state of logic 0.
NAND memory is the opposite, the erased state is 0 and the programmed state is 1, so a permanently charged floating gate should result in a stuck-at-1 fault.
Which, relating to the OP's question, means either way the memory wouldn't be good for much of anything. Your NAND SSD is going to fail during an erase-program (aka "write") cycle, and except in the extremely unlikely case that the pattern you were writing did not involve changing any previously stored 1s to 0s on stuck bits, then the result is going to be wrong. You could read it, but you'd be reading the wrong data.
The enemies of Democracy are
Graceful as in data not related to your recent failed writes are still readable so they can be backed up and migrated to a new drive. Not sure why that concept is so difficult. I consider something dead as "completely unreadable, ALL your data has been destroyed - have a nice day."
No longer reliable but still semi recoverable isn't quite "dead."
Maybe I'm just using a stricter interpretation of the word dead than you are?
Let's use a marker on a white board analogy. If I was storing all my data on a suitably large white board using a marker and I completely exhausted my marker's supply of ink, I'd be pissed if this resulted in a blank whiteboard, wouldn't you? On that same note, if I wiped a small section of my whiteboard with the intent of writing something new in that area and only then realized that my marker was no longer suitably supplied with ink and my write failed, I would find the blank void in that section alone acceptable.
Does that clarify things?
The AC actually posited a worse case scenario, in that the whole disk was filled, and only one "spot" was repeatedly changed.
There are two ways to handle this:
...will the Flash Destroyer hold up under this load?
http://www.bynarystudio.com
This is what I got from a 2GB Kingston Flash Key. After that there were errors in almost all overwrites. However the real kicker is that while the key read back wrong data, there never ever was any error reported. Since doing that beginning of 2009, I do not rust USB Flash anymore.
Set-up: Linux, 1MB random data replicated to fill the chip, then read back to compare. Repeat with new random data. I had one isolated faulty read-back around 3500 cycles and then from arounf 3700 cycles 90% (and pretty soon 100%) faulty read-backs. Language was Python, no errors for the device on STDERR, or the systemlogs. And I looked carefully.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Your description is a bit backwards, at least for the NOR flash I work on. When the floating gate has charge (electrons), it turns the transistor off. The negative charge on the FG cancels out the positive voltage on the control gate. The bit is read via a current sense -- no current is a zero, lots of current is a one.
The main failure mechanism (that I know of) is oxide damage due to high energy electrons. Program and erase (technically, Fowler-Nordheim tunneling) take high voltages, which gives electrons enough energy to scatter into the oxide and get trapped. This repels other electrons. So what happens is that it takes longer and longer to program and erase until eventually you exceed the set limit, at which point it shows up as a fail. The bit will be in an indeterminate state. It may read correctly but won't have enough margin to guarantee data retention.
Visit the
I am working on flash write/erase cycling right now in my day job and I can tell you that this is not a very good test. Temperature affects cycling endurance (and this is reflected in the spec), so if your SSD is 20-30C higher than room temp it's going to make a difference. Fowler-Nordheim tunneling (which NAND flash uses for program and erase) is hardest at cold temperatures, so the first operation after powerup might be the worst case in a PC. (Yes, I know they're not using an SSD here, but they are doing their cycling at room temp.)
Another thing to keep in mind is that continuous cycling is not realistic. The wear-out mechanism here is charge trap-up, where electrons get stuck in the floating gate oxide and repel other electrons, slowing down program and erase. Over time, thermal energy lets the electrons detrap. So irregular usage in a hot PC should actually be nicer environment for endurance.
A final factor is process variation, which can only be covered by using a large sample size (>100) and/or using units from separate lots with known characteristics, none of which an end user will likely have access to. Even that doesn't tell you anything about the defect rate.
There are really two types of tests that people are talking about here. The first is a spec compliance test, which uses the extreme conditions I mentioned above to guarantee that all units will have the spec endurance under all spec conditions. This should be done by the manufacturer. The second is a real world usage test, which will only give realistic results if done under actual use conditions. The number you get from the article's test probably won't tell you much.
[Disclaimer: I work on embedded NOR flash, not NAND, but the bits are the same and the article's talking about EEPROM so I figure I can butt in.]
Visit the