Flash Destroyer Tests Limit of Solid State Storage

Posted by timothy on Thursday May 27, 2010 @08:02AM from the step-right-up-place-your-bets dept.

An anonymous reader writes "We all know that flash and other types of solid state storage can only endure a limited number of write cycles. The open source Flash Destroyer prototype explores that limit by writing and verifying a solid state storage chip until it dies. The total write-verify cycle count is shown on a display — watch a live video feed and guess when the first chip will die. This project was inspired by the inevitable comments about flash longevity on every Slashdot SSD story. Design files and source are available at Google Code."

77 of 229 comments

Interesting! by exasperation · 2010-05-27 08:04 · Score: 3, Interesting

It'll be nice to get some third-party data on exactly how long these things last on average.
1. Re:Interesting! by mantis2009 · 2010-05-27 08:16 · Score: 4, Informative
  
  Just checked out the video feed. The chip already lasted longer than 1 million writes, which is the number of writes the chip is supposed to last over its lifetime. As of this writing, the chip has survived more than 1,600,000 write cycles and counting.
  
  Still, since this test isn't on an actual, shipping solid state drive (SSD) product, the results will be discounted by a lot of critics.
2. Re:Interesting! by Smallpond · 2010-05-27 08:30 · Score: 4, Insightful
  
  Mechanical disks have lots of great failure modes. You can do seek tests until the arm breaks or voice coil fails, you can do write/read tests until you get enough bad sectors that they can't recover the data any more, or you can do start-stop of the drive motor until it dies. Another good one is to stop the motor for a while, then see if it starts up or has stiction (sic), but that test takes a long time. If the drive is not held rigidly enough, vibration will kill it, and if it isn't cooled properly, heat will kill it. Did I miss any?
3. Re:Interesting! by jellomizer · 2010-05-27 08:36 · Score: 4, Interesting
  
  I would like to see a comparison with a mechanical drive doing the same thing in parallel.
  While solid state has a theoretically limited number of writes vs. the mechanical drive, it would be interesting to see what the real world has to offer.
  
  --
  If something is so important that you feel the need to post it on the internet... It probably isn't that important.
4. Re:Interesting! by msauve · 2010-05-27 08:38 · Score: 3, Insightful
  
  since this test isn't on an actual, shipping solid state drive (SSD) product, the results will be discounted by a lot of critics.
  Assuming that the flash is of equivalent technology (e.g. SLC NAND, cell size, etc) to that used for SSD, then this would present a best case test, since it is exercising all cells equally.
  
  An SSD tries to do wear leveling (distribute writes evenly), but that can't be done perfectly, as is done in this test.
  
  --
  "National Security is the chief cause of national insecurity." - Celine's First Law
5. Re:Interesting! by Dancindan84 · 2010-05-27 08:40 · Score: 4, Insightful
  
  And honestly it's a pretty valid argument. This is definitely going to be informative, but I'm just as interested in how a particular SSD handles the flash blocks failing as when they fail. A SSD with flash that averages 1,000,000 writes before blocks start to fail but does it gracefully with little/no data loss could be better than one that averages 2,000,000 but goes out in a blaze of glory as soon as the first block fails.
  
  --
  "Always forgive your enemies; nothing annoys them so much." - Oscar Wilde
6. Re:Interesting! by Kindgott · 2010-05-27 08:42 · Score: 5, Informative
  
  Yeah, the title seems misleading, since they're writing and verifying data on an EEPROM, which is not used in solid state drives last time I checked.
  
  --
  If there's anything more important than my ego around here, I want it caught and shot immediately.
7. Re:Interesting! by Pharmboy · 2010-05-27 08:46 · Score: 5, Insightful
  
  Or connect the drive inside any computer running a Prescott P4 with 100% CPU utilization.
  
  --
  Tequila: It's not just for breakfast anymore!
8. Re:Interesting! by D+Ninja · 2010-05-27 08:51 · Score: 5, Funny
  
  If you have any important data on that drive, urine trouble...
9. Re:Interesting! by InsaneProcessor · 2010-05-27 08:52 · Score: 4, Informative
  
  I find this "not very interesting" RTFA. This is not a flash destroyer. It is an EEPROM destroyer. NOT THE SAME THING AND NOT USEFUL!
  
  --
  
  Atheism is a religion like not collecting stamps is a hobby.
10. Re:Interesting! by TeknoHog · 2010-05-27 09:13 · Score: 4, Interesting
  
  I'm just curious, why use sic in your own posts? Wouldn't you just correct whatever you are sic-ing?
  IMHO, this kind of use of [sic] is perfectly valid. It means "this is not a typo, it's really how it is spelled" (literally "thus"). In this case it refers to an unusual word that may look like a misspelling of a more common word. However, it can also refer to a genuine misspelling, when you are referring to what somebody else wrote.
  
  --
  Escher was the first MC and Giger invented the HR department.
11. Re:Interesting! by Chris+Burke · 2010-05-27 09:20 · Score: 4, Funny
  
  A SSD with flash that averages 1,000,000 writes before blocks start to fail but does it gracefully with little/no data loss could be better than one that averages 2,000,000 but goes out in a blaze of glory as soon as the first block fails.
  That depends on how you define "better", and for my personal definition, it depends on exactly how glorious a blaze it is. :)
  
  --
  
  The enemies of Democracy are
12. Re:Interesting! by Jah-Wren+Ryel · 2010-05-27 09:26 · Score: 3, Interesting
  
  And honestly it's a pretty valid argument. This is definitely going to be informative, but I'm just as interested in how a particular SSD handles the flash blocks failing as when they fail. A SSD with flash that averages 1,000,000 writes before blocks start to fail but does it gracefully with little/no data loss could be better than one that averages 2,000,000 but goes out in a blaze of glory as soon as the first block fails.
  Flash fails on write - if the write succeeds, you will be able to read it barring catastrophic events like ESD exposure.
  
  --
  When information is power, privacy is freedom.
13. Re:Interesting! by lauragrupp · 2010-05-27 09:33 · Score: 5, Informative
  
  Here is work from the academic community exploring error rates, latencies and some other factors. It compares 11 NAND flash chips (both SLC and MLC) from 5 manufacturers: http://nvsl.ucsd.edu/ftest.html
14. Re:Interesting! by Rockoon · 2010-05-27 09:46 · Score: 2, Interesting
  
  Mechanical disks have lots of great failure modes.
  My favorites are the ones that make loud sounds during the failure event. When a piece of the head breaks off, for example.. that thing bounces around in there like crazy when the drive is spinning around thousands of times per minute.
  
  --
  "His name was James Damore."
15. Re:Interesting! by Loki_1929 · 2010-05-27 09:50 · Score: 3, Funny
  
  He said an oven, not a nuclear fusion core.
  
  --
  -- "Government is the great fiction through which everybody endeavors to live at the expense of everybody else."
16. Re:Interesting! by BikeHelmet · 2010-05-27 09:58 · Score: 3, Insightful
  
  I just hang around on the NCIX forums, and every day or two there's a person complaining about having to RMA their SSD because programs started crashing, and then finally they couldn't even boot it.
  I saw lots of people replying in threads, saying theirs were still working fine. I started asking everyone how long they had owned theirs. Most with working SSDs were in the 8-15 months range, and most with serious problems were in the 12-24 months range.
  I've noticed that SSD warranties from a lot of manufacturers have dropped from the original 5 years down to ~2. That's quite a drop. There must be a reason.
  I suspect a heavy disk user like myself would burn through one well before the warranty is up.
  Note: My sample is pretty small compared to the amount sold, but I do wonder how many die without the owners being vocal about it.
  I'm wondering if close to two years ago manufacturers flipped to cheaper NAND to get the prices down? Now prices are going back up, so maybe manufacturers realized their mistake? Even since January, SSD prices have gone up 20-30% on average. $89.99 SSDs are now $120+
  http://www.newegg.com/Product/Product.aspx?item=N82E16820167025&Local=y
17. Re:Interesting! by gyrogeerloose · 2010-05-27 09:59 · Score: 2, Funny
  
  When I first read the title of the summary, I thought to myself "Shit, yet another one about Apple versus Adobe..."
  
  --
  This ain't rocket surgery.
18. Re:Interesting! by gyrogeerloose · 2010-05-27 10:03 · Score: 3, Funny
  
  That depends on how you define "better", and for my personal definition, it depends on exactly how glorious a blaze it is. :)
  Really. Don't all of us Slashdotters love a good explosion? Sure, we mostly prefer them to be scheduled explosions but, still, an explosion is an explosion.
  
  --
  This ain't rocket surgery.
19. Re:Interesting! by dave420 · 2010-05-27 10:05 · Score: 2, Informative
  
  The chip in question is completely different in tolerances, performance, and life-span from the chips used in SSDs. That's the problem.
20. Re:Interesting! by Simetrical · 2010-05-27 10:19 · Score: 3, Informative
  
  So, I'm violating my usual rule of not responding to ACs, only because you're such an idiot (which conveniently explains why you are posting AC).
  
  "perfect" in that they will distribute the writes 100% evenly across all available spare sectors
  See, that's the thing. Once a sector is written to, it won't be touched again, unless the data changes. You end up with some subset of sectors which are frequently modified, while others never are. That is NOT an even distribution of writes across all sectors, nor is it "perfect" in any sense of the word. So, fill up 75% of your SSD with files which don't change, then beat up on the remaining sectors 4 times as much as truly evenly distributed writes would cause. It's not clear what your "MLC" comment was about, since I specifically mentioned that as an example of flash technology.
  So keep track of how many times each erase block has been written, and if some blocks get erased too often relative to the rest, move data from the least-erased blocks onto the most-erased blocks. You do a few extra writes this way, but a negligible number if you set the thresholds high enough. And then you'll get fully leveled writes. I'm sure the clever folks at places like Intel have figured out strategies like this (although for the cheap stuff, who knows).
  
  --
  MediaWiki developer, Total War Center sysadmin
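The strategy Simetrical describes above (count erases per block, and when the gap grows too large, park rarely changed data on the most-worn block) can be sketched roughly as follows. This is only a toy model: the function name, the list-based bookkeeping, and the threshold are made up for illustration and do not reflect how any particular controller works.

```python
# Toy model of static wear leveling. Everything here is hypothetical:
# real controllers track far more state than two parallel lists.

def static_wear_level(erase_counts, contents, threshold):
    """Swap cold data onto the most-worn block if the wear gap exceeds threshold.

    erase_counts: list of erase counts, one per physical block
    contents:     list of what each block stores (None means free)
    Returns True if a swap happened, False otherwise.
    """
    blocks = range(len(erase_counts))
    hot = max(blocks, key=erase_counts.__getitem__)   # most-erased block
    cold = min(blocks, key=erase_counts.__getitem__)  # least-erased block
    if erase_counts[hot] - erase_counts[cold] <= threshold:
        return False
    # The least-erased block usually holds data that never changes. Moving it
    # onto the worn block frees the fresh block to absorb future writes.
    contents[hot], contents[cold] = contents[cold], contents[hot]
    erase_counts[hot] += 1   # erased once more to accept the cold data
    erase_counts[cold] += 1  # erased to accept the frequently rewritten data
    return True


counts = [100, 2, 50]
data = ["log", "photos", None]
print(static_wear_level(counts, data, threshold=50), counts, data)
```

The two extra erases per swap are the "few extra writes" the comment mentions; a higher threshold makes swaps rarer at the cost of a wider wear gap.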
21. Re:Interesting! by srvivn21 · 2010-05-27 10:45 · Score: 2, Informative
  
  Not the original AC, but I thought I would try to clear up a disconnect instead of downmodding...
  
  So, I'm violating my usual rule of not responding to ACs, only because you're such an idiot (which conveniently explains why you are posting AC).
  -1 Flamebait. As I'll show, the rest of your rant has insufficient content to balance this.
  
  "perfect" in that they will distribute the writes 100% evenly across all available spare sectors
  Emphasis mine.
  
  See, that's the thing. Once a sector is written to, it won't be touched again, unless the data changes. You end up with some subset of sectors
  
  The spare ones, as the AC pointed out.
  
  which are frequently modified, while others never are. That is NOT an even distribution of writes across all sectors,
  Not a claim made by the AC.
  
  nor is it "perfect" in any sense of the word.
  Strictly your opinion.
  
  So, fill up 75% of your SSD with files which don't change, then beat up on the remaining sectors 4 times as much as truly evenly distributed writes would cause.
  The AC actually posited a worse case scenario, in that the whole disk was filled, and only one "spot" was repeatedly changed.
  
  It's not clear what your "MLC" comment was about, since I specifically mentioned that as an example of flash technology.
  Sorry mate, your original comment made mention of SLC, not MLC. While it's not clear what the AC was harping about (as you didn't make a claim regarding the type of flash used by retail SSDs) calling the AC names without comprehending what was actually written is not conducive to a rational discussion. I can only hope I'm not feeding a troll.
22. Re:Interesting! by networkBoy · 2010-05-27 11:07 · Score: 4, Informative
  
  And in fact, the more advanced wear leveling algorithms do this already. There are spare blocks specifically such that the data can be moved, then the old block that was not used can be freed.
  
  --
  whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
23. Re:Interesting! by networkBoy · 2010-05-27 11:11 · Score: 4, Informative
  
  In fact, they are read back. At the flash component level.
  The flash cell is a charged gate. When programmed, the uC in the flash device compares the charge state with a reference voltage. Not enough? Add more charge. Still not enough? Cell is bad, mark it (block level, so you lose xx bits for one bad one) and move on.
  This is fairly high level and not exactly how it works, but close enough.
  
  --
  whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
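The program-and-verify loop networkBoy outlines above can be mimicked with a very small simulation. As the comment itself says, this is high level and not exactly how real parts work: the charge values, the two-pulse limit, and the function names are all assumptions chosen for illustration.

```python
# Simplified simulation of program-and-verify. Units are arbitrary and the
# two-pulse limit mirrors the "not enough? add more; still not enough? bad"
# description in the comment, not any real device.

def program_cell(charge_per_pulse, reference, max_pulses=2):
    """Return (ok, pulses_used) for one simulated cell.

    charge_per_pulse: charge the cell accepts per pulse (a worn cell accepts less)
    reference:        level the cell must reach to verify as programmed
    """
    charge = 0.0
    for pulse in range(1, max_pulses + 1):
        charge += charge_per_pulse
        if charge >= reference:       # compare against the reference level
            return True, pulse
    return False, max_pulses          # never reached the reference: cell is bad


def program_block(cells, reference):
    """A block verifies only if every cell in it verifies."""
    return all(program_cell(c, reference)[0] for c in cells)


print(program_cell(1.0, 1.0))                 # healthy cell
print(program_cell(0.3, 1.0))                 # worn cell
print(program_block([1.0, 0.6, 0.3], 1.0))    # one bad cell condemns the block
```

Marking at block level is why a single worn cell costs many good bits, as the comment notes.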
24. Re:Interesting! by mysidia · 2010-05-27 11:24 · Score: 2, Insightful
  
  He said an oven not a nuclear fusion core.
  The response was a Prescott P4 at 100% CPU, not an SCC / 48-CORE Intel Many-Cores(TM) chip with all cores at continuous 100% utilization
25. Re:Interesting! by blahplusplus · 2010-05-27 11:30 · Score: 3, Interesting
  
  One should not forget companies might have "chip lotteries", i.e. use chips that are less robust and cheaper to manufacture without the majority of consumers knowing the difference.
  They do this in the LCD monitor industry where they have "panel lotteries" that use cheaper parts and are not what is advertised due to consumer ignorance. See Article on Anand here about panel lotteries:
  http://forums.anandtech.com/showthread.php?t=39226
26. Re:Interesting! by mysidia · 2010-05-27 11:31 · Score: 2, Interesting
  
  See, that's the thing. Once a sector is written to, it won't be touched again, unless the data changes. You end up with some subset of sectors which are frequently modified, while others never are. That is NOT an even distribution of writes across all sectors, nor is it "perfect" in any sense of the word.
  If you have a copy on write filesystem such as ZFS, changes might be written to new blocks.
  It seems that in the future storage agents (or the SSDs themselves) might eventually evolve to relocate certain regions physically, in order to put rarely written data, or data that is 'stable' due to the existence of snapshots, onto the most worn sectors.
  Once wear on the commonly written sectors exceeds certain ranges...
27. Re:Interesting! by Anonymous Coward · 2010-05-27 11:39 · Score: 5, Informative
  
  Actually, I believe *you* are incorrect. Different AC here, but I had to respond because your response doesn't match what I understand to be the case as an engineer working with vendors selecting NAND flash for use in consumer devices. I'll be interested to see if I'm incorrect or if this even gets read as an AC post.
  Specifically, it doesn't matter to the flash device if the host has written a sector and never touched it again, that sector *will* be moved when it's been read enough times that the ECC indicates it's likely to become unreadable soon. This is called read disturbance and it can happen surprisingly frequently with MLC cells in small process sizes (i.e. at sufficient density to make multi-GB modules). It also happens on SLC devices but to a lesser extent because they can cope with more voltage decay per bit and still be able to read the bit correctly. This is done as a function of even the simplest block-access controllers because otherwise you wouldn't be able to read your own data back more than a few hundred times. In fact, if you wish to get technical about it, it also has a massive dependency upon the temperature the module is at when the data was originally written since this directly impacts the amount of electrons which can be stored.
  In addition to moving data to counter read disturbance, most controllers (even the very simple ones in SD Cards & eMMC devices) will move sectors (actually not filesystem sectors, but individual blocks although the distinction isn't important here) around in order to optimise wear across the entire device even if the content hasn't changed. If you think about it, this has to happen at some level even without wear levelling since the sector is massively smaller than the superblock size for most of the densities we have available today - it's not unusual to see a device with an erase block size of 256KB, which is normally way larger than a sector.
  I don't know much about SSD controllers, they're far too expensive for our devices, but they can't possibly work the way you think they do - not if they use the same raw NAND that is used for other block storage abstractions.
28. Re:Interesting! by Anonymous Coward · 2010-05-27 11:40 · Score: 3, Informative
  
  Informative? How about wrong!
  128 seconds * 1M operations = 128,000,000 seconds
  Seconds in a day = 86400
  128M/86400 = 1481.48 days
  Or roughly 4 years.
  For some reason, you divided 128M by the number of minutes in a day (1440) to arrive at your ludicrous 243 years.
  Hence you are out by a factor of 60.
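The correction above is easy to check by running the numbers. The sketch below uses only the figures quoted in the comment (128 seconds per operation, 1M operations) plus a 365-day year, which is an assumption of this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
MINUTES_PER_DAY = 24 * 60        # 1,440
DAYS_PER_YEAR = 365              # assumption: leap years ignored

total_seconds = 128 * 1_000_000  # 128 s per operation, 1M operations

days = total_seconds / SECONDS_PER_DAY
years = days / DAYS_PER_YEAR

# The slip being corrected: dividing seconds by minutes per day.
wrong_days = total_seconds / MINUTES_PER_DAY
wrong_years = wrong_days / DAYS_PER_YEAR

print(f"{days:.2f} days, about {years:.1f} years")
print(f"mistaken figure: about {wrong_years:.1f} years, "
      f"{wrong_years / years:.0f}x too large")
```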
29. Re:Interesting! by Bob_Geldof · 2010-05-27 14:24 · Score: 2
  
  I'm personally convinced it's just another round of Memory Company Collusion, like the whole rambus thing.
  Honestly the price of ALL memory has gone up between 20 and 100 percent in the past year (go look at ddr2/ddr3 prices, they're the same or HIGHER than they were last year. 4 gigs for 50 bucks a year ago, 2 gigs for 45 bucks now. There was an overlap period on newegg where UNREGISTERED ECC DDR3 @ 1333 was CHEAPER than Non-ECC consumer sticks by 10-20 bucks for the same capacity. Obviously that has since changed, but the point is memory is suspiciously going up in price while most other consumer hardware is still on the way down.)
  Have you taken into account the difference in inflation of your local currency and that of the currency used where the RAM in question is made? Assuming a Taiwanese manufacturer selling in the US, my back of the envelope calculations put a $50 quantity of RAM last year at most $53 today due to exchange rates. Add on another 2% for inflation (http://forecasts.org/inflation.htm) we get $54.
  Then again that does not take into account the dip in supply as a result of manufacturers holding off production while the western economies tanked over the last year or two. It comes down to how accurately did the manufacturers predict the drop in demand. It is possible demand exceeds supply enough to increase prices by about $5-$15 at the moment.
  I think it is plausible that we are only seeing market forces at play here. Someone should look into it a bit more though, to be sure.
  
  --
  887321 = 337*2633
30. Re:Interesting! by Bing+Tsher+E · 2010-05-27 14:45 · Score: 4, Funny
  
  That brings to mind an old favorite of mine: the Light Emitting EPROM. The power pins on EPROM chips are in opposite corners. Plug in the EPROM chip backwards and you've hooked the power up backwards. Result: A light emitting EPROM, though one with a very limited service life.
31. Re:Interesting! by izomiac · 2010-05-27 16:24 · Score: 5, Informative
  
  Cause wear leveling only picks another sector to write to from among the unused sectors. Simplified, if your drive is 80% full, you write to the same sectors five times as often.
  
  Especially because once blocks start failing, other blocks start failing too, at an accelerating rate, and they rapidly reach a state of being completely unusable.
  That's a contradiction. If the wear-leveling algorithm was ineffective then you'd have a relatively constant rate of block failure. A good wear-leveling algorithm ensures you won't get a significant number of block failures until almost every block has been worn out. Then you get a bunch. So the behavior described is failing exactly as intended, and indicates the wear-leveling algorithm worked almost perfectly.
  
  But you're right in that a wear algorithm that only uses free space would be terrible. That's one reason no device uses one like that. The primary reason though, is because the SSD has no idea which blocks are in use and which are free, unless it is told via the TRIM command (later generation SSDs with newer OSes). The filesystem knows, but an SSD is filesystem agnostic. Moving data is the cause behind the performance drop-off when the drive runs out of unused/un-TRIM'd blocks.
  
  Personally, I have the cheapest, buggiest SSD in common knowledge (the one that can get bogged down to 4 IOPS), and it has worked beautifully for me. Just checking a diagnostic tool, in the past two years I've power cycled it 5,666 times (which probably explains why I kill HDDs so quickly), the average block has been erased 7,333 times, and no block has been erased more than 7,442 times. I've got zero ECC failures. Honestly, I'm a little surprised I've written 234 TB of data to my poor 32 GB drive, but my usage is a bit heavy (~10 complete Gentoo compiles with countless updating, ~5 DISM'd Windows 7 installs, ~5 DISM'd Vista installs, ~30 Haiku installs, ~20 SVNs of 10 GB projects, and a good amount of downloading).
  
  But, in my experience, the wear leveling algorithm is only ~3% away from being "perfect".
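The figures in izomiac's comment can be put side by side with a little arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB) and ignores write amplification, so treat the results as a rough sanity check rather than a measurement. The helper name is made up for this example.

```python
def free_space_only_wear_factor(fill_fraction):
    """How much harder free blocks are hit if ONLY free blocks absorb writes.

    This models the naive scheme from the quoted parent comment, which the
    reply says no real device uses.
    """
    return 1 / (1 - fill_fraction)


# Figures reported in the comment for the 32 GB drive.
written_tb, capacity_gb = 234, 32
avg_erases, max_erases = 7333, 7442

full_drive_writes = written_tb * 1000 / capacity_gb   # decimal units assumed
spread = (max_erases - avg_erases) / avg_erases       # worst block vs. average

print(f"80% full, naive scheme: {free_space_only_wear_factor(0.8):.1f}x wear")
print(f"data written equals {full_drive_writes:.1f} full-drive passes")
print(f"most-erased block is {spread:.1%} above the average")
```

Under these assumptions the number of full-drive passes lands close to the reported average erase count, which is what evenly spread wear would look like.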
live stream by Anonymous Coward · 2010-05-27 08:07 · Score: 5, Funny

a live stream linked on slashdot.. ouch..
1. Re:live stream by biryokumaru · 2010-05-27 08:19 · Score: 2, Insightful
  
  They should have a BitTorrent-like system for streams. Like, you just connect to the swarm and request a fairly recent image. Everyone keeps the past minute or so cached to send to new people in the swarm. Maybe a tiered system so that the people who have been connected longest are closest to the original stream.
  Let's say I connect to Joe and Mary, who're connected to the original server. They send me frames two or three frames behind the server. Jack connects, and he's getting a bit lagged images too, right with me. Now Sally connects and she's behind me and Jack, so Me, Jack, Joe and Mary all send her images. It's like a pyramid scheme for streaming video.
  Now Joe leaves. I've been around longer than Jack, so I move up in the tiers. I see a single frameskip, but now I'm connected directly to the source stream.
  The real purpose here is to relieve some of the pressure from the initial server. Maybe they've got 100/100, and I connect with my 20/5. Well, 5 isn't much compared to 100, but I'm pulling less than 1. Let's call it 1. Now the available bandwidth for streaming is 105 and I'm only using 1. With all them other folks connected up, the server might be only holding half the load. And higher bandwidths could get tiering priority, like, if I have 100/100, well, I get directly connected to the server pretty quick so I can redistribute the stream faster.
  Oh, that's right, video comes in streams, not images... well, okay, it's got some problems. But it seems like a good solution to a very, very common problem. Make things easier on Hulu and Youtube (cause we all know they need the help, right?) and such too. Maybe drastically reduce the barrier for entry into that field, at least.
  Just a thought.
  
  --
  When you're afraid to download music illegally in your own home, then the terrorists have won!
2. Re:live stream by kipin · 2010-05-27 08:30 · Score: 4, Informative
  
  http://torrentstream.org/
  
  Works pretty well actually.
  
  --
  If I can not smoke in heaven, then I shall not go. -- Mark Twain
3. Re:live stream by TooMuchToDo · 2010-05-27 08:31 · Score: 4, Informative
  
  You've just described what multicast was designed to solve.
  https://www.cisco.com/en/US/products/ps6552/products_ios_technology_home.html
4. Re:live stream by game+kid · 2010-05-27 08:40 · Score: 2, Interesting
  
  Doesn't multicast help any? Given a bunch of people who want to view the same exact stream, the server should be sending the same packets and letting the viewers' players deal with sync, starting at a key frame (and not in the middle of some crumbly diff frames), et cetera. With that, the server could just concentrate on the list of viewers' IPs, send packets far less often, and the /. arson fails.
  Live streams, to me, seem easier than webpages because the viewer always wants the current frames of a live video but may want any portion of any other pages.
  
  --
  You can hold down the "B" button for continuous firing.
5. Re:live stream by RollingThunder · 2010-05-27 11:02 · Score: 2, Interesting
  
  And which works great for IPTV solutions. The end points subscribe to a channel by setting their IP, and then the upstream router decides if it needs to do the same, heading further back until it hits another router that's got the channel already subscribed.
  Similar for when you leave the channel. Once the router decides it's not got any clients for a given channel, it'll unsubscribe from it and those will bubble back.
  Very elegant, imo.
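For the curious, subscribing an endpoint to a multicast channel looks roughly like this with the standard sockets API. It is a minimal IPv4 sketch: the group address and port are placeholders picked for illustration, and whether traffic actually arrives depends on the routers in between supporting multicast.

```python
import socket
import struct


def membership_request(group, interface="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address, then local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(group, port):
    """Open a UDP socket and subscribe it to a multicast group.

    Setting IP_ADD_MEMBERSHIP is what makes the host announce its interest
    upstream; the routers handle the rest.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def leave_group(sock, group):
    """Unsubscribe; upstream routers can then prune the channel if unused."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group))
    sock.close()
```

A receiver would call something like `join_group("239.1.2.3", 5004)` (both values hypothetical) and then read datagrams with `recvfrom`.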
6. Re:live stream by adolf · 2010-05-27 14:18 · Score: 2, Interesting
  
  It is very nice. And it was around for a long, long time before people started using it for everyday television (IPTV). We used to call it the Mbone.
  
  --
  Kid-proof tablet..
Subject here by Anonymous Coward · 2010-05-27 08:10 · Score: 4, Funny

Flash! Aa-aaahhh!!
1. Re:Subject here by Chris+Burke · 2010-05-27 08:29 · Score: 4, Funny
  
  Now do that a million more times and we'll see if you wear out. Don't forget to include the live video feed.
  
  --
  
  The enemies of Democracy are
2. Re:Subject here by Rockoon · 2010-05-27 08:55 · Score: 2, Informative
  
  King of the impossible!
  
  --
  "His name was James Damore."
for the guy by phantomfive · 2010-05-27 08:10 · Score: 2, Insightful

For the guy a couple days back who asked what kind of project can he do that would be useful to the world, here is a great example. Try something like this.

--
Qxe4
1. Re:for the guy by houstonbofh · 2010-05-27 08:19 · Score: 3, Funny
  
  The fact that you said this shows you spend way too much time on slashdot. The fact that I recognized it, and was one of the first posters in the thread you refer to says the same about me. I wonder if I can find a life for sale on craigslist?
  
  Link to thread in question...
  http://ask.slashdot.org/story/10/05/23/1547202/Scientific-RampD-At-Home
Die Flash, DIE! by Anonymous Coward · 2010-05-27 08:11 · Score: 5, Funny

Wait, which flash are we talking about here?
1. Re:Die Flash, DIE! by hoggoth · 2010-05-27 08:36 · Score: 2, Funny
  
  > Wait, which flash are we talking about here?
  We're talking about flash photography, of course.
  
  --
  - For the complete works of Shakespeare: cat /dev/random (may take some time)
2. Re:Die Flash, DIE! by bennomatic · 2010-05-27 12:08 · Score: 3, Funny
  
  No no, it's German for "The Flash, THE!"
  
  --
  The CB App. What's your 20?
dull by Threni · 2010-05-27 08:14 · Score: 5, Funny

I was expecting something cool, like storing a picture, displaying it, and then constantly XORing each pixel with some random number twice, repeatedly, and watching the image decay over time. Although it would appear that it'd need quite a lot of time.
Ha! by BJ_Covert_Action · 2010-05-27 08:14 · Score: 2, Funny

This project was inspired by the inevitable comments about flash longevity on every Slashdot SSD story.
Take that every 'dotter that says bitching on this website doesn't get anything done!

/removestonguefromcheek

--
Motorcycles, Robots, Space Gossip and More!
SSD's? no. by hypethetica · 2010-05-27 08:16 · Score: 5, Informative

article says: We used a Microchip 24AA01-I/P 128-byte I2C EEPROM (IC2), rated for 1 million write cycles.
Um, SSDs don't use anything like this part as their storage.
The more of you that watch, the faster it dies by bluestar · 2010-05-27 08:16 · Score: 2, Funny

I bet the server's IP address is untraceable.

--
"The cost of freedom is eternal vigilance." -Thomas Jefferson
1. Re:The more of you that watch, the faster it dies by Monkeedude1212 · 2010-05-27 08:19 · Score: 2, Informative
  
  Looking back on it, that was a pretty bad movie.
The display only goes to 9,999,999! by Rick+Richardson · 2010-05-27 08:21 · Score: 2, Insightful

The display only goes to 9,999,999! I think that won't be enuf... should be 100M or 1G.
1. Re:The display only goes to 9,999,999! by Silly+Man · 2010-05-27 08:46 · Score: 3, Funny
  
  But it will be *over nine thousand!!!*
Re:Die? by ledow · 2010-05-27 08:23 · Score: 2, Informative

Depends - if the chips are using some sort of error correction, they may well just fail. I have USB-based Flash die all the time and it DIES, as in not even presenting a usable device to the OS despite being "detected". The theory is that they fail nicely but the chances are that any non-premium flash will just die a death. Why bother making the device fail gracefully if it's failed anyway?
Literally - I've never seen a flash device in such a "read-only" mode, even for a single bit, but I can't even begin to count the number of flash-chips in certain devices (everything from routers to USB sticks) that just die for no reason and never recover.
Myth Busters by PSaltyDS · 2010-05-27 08:23 · Score: 4, Funny

Now, to see how much explosives it takes to MAKE it fail!
This is my favorite part! :-)

--
Any technology distinguishable from magic is insufficiently advanced. - Geek's corollary to Clarke's law
Re:SSD's? no. by ElectricTurtle · 2010-05-27 08:28 · Score: 2, Informative

Oh bleh... AC box checked accidentally. The parent is me.

--
I support the Slashcott and will not be reading or commenting from 2/10/14 to 2/17/14. Beta is steaming pile of dog shit
But how much data does it write? by Edmund+Blackadder · 2010-05-27 08:32 · Score: 3, Insightful

Most modern flash memories have their controllers check which blocks are dying or dead and re-route write and read requests to good blocks. So while your flash may seem to be working perfectly well, various blocks inside it may be dying and its storage size may be progressively decreasing.
So I hope they are rewriting the entire flash in their test. Otherwise it is not representative.
1. Re:But how much data does it write? by fnj · 2010-05-27 08:48 · Score: 2, Insightful
  
  Nonsense, it's completely representative of normal use. That's exactly the point. Until data loss occurs, or there are no more free blocks to use, the flash memory is objectively perfectly good.
2. Re:But how much data does it write? by blind+biker · 2010-05-27 08:57 · Score: 3, Informative
  
  They're testing an EEPROM: it is bit addressable and it does not contain any wear leveling algorithm.
  
  --
  "The agriculture ministry is not in charge of Gundam" - Japanese ministry official.
Re:SSD's? no. by ElectricTurtle · 2010-05-27 08:32 · Score: 2, Informative

Actually, I rescind my post, as I realize I was confusing EEPROM with NOR/NAND. Your point is actually quite valid.

--
I support the Slashcott and will not be reading or commenting from 2/10/14 to 2/17/14. Beta is steaming pile of dog shit
Re:SSD's? no. by Anonymous Coward · 2010-05-27 08:34 · Score: 5, Informative

More importantly, the test pattern does not resemble normal SSD usage. Complete writes are very unusual for SSD and a cycle is not completed nearly as quickly as a cycle on this EEPROM (400 cycles per minute). When an SSD is written to in normal usage, a wear leveling algorithm distributes the data and avoids writing to the same physical blocks again and again. The German computer magazine C't has run continuous write tests with USB sticks and never managed to destroy even a single visible block on a stick that way. The first test (4 years ago) wrote the same block more than 16 million times before they gave up. The second test (2 years ago) wrote the full capacity over and over again. The 2GB stick did not show any signs of wear after more than 23TB written to it.
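The wear-leveling behaviour described above can be illustrated with a toy model. This is only a sketch: the `ToyFlash` class, the block count, and the "pick the least-worn free block" policy are assumptions made for illustration, not how any particular SSD or USB stick controller works.

```python
# Toy wear-leveling model (hypothetical; real controllers are far more complex).

class ToyFlash:
    def __init__(self, physical_blocks):
        self.erase_counts = [0] * physical_blocks   # wear per physical block
        self.mapping = {}                           # logical block -> physical block

    def write(self, logical_block):
        in_use = set(self.mapping.values())
        current = self.mapping.get(logical_block)
        # Candidates: every free block, plus the block this logical address already owns.
        candidates = [
            b for b in range(len(self.erase_counts))
            if b not in in_use or b == current
        ]
        # Assumed policy: always rewrite into the least-worn candidate.
        target = min(candidates, key=lambda b: self.erase_counts[b])
        self.erase_counts[target] += 1
        self.mapping[logical_block] = target


flash = ToyFlash(physical_blocks=8)
for _ in range(8000):
    flash.write(0)          # hammer a single logical block

print(flash.erase_counts)
```

In this model, 8,000 writes to one logical block end up as 1,000 erases on each of the eight physical blocks, which is why rewriting "the same block" from the host side says little about the wear on any one physical block.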
Re:SSD's? no. by robot256 · 2010-05-27 08:39 · Score: 4, Informative

Okay, I'll bite. Let me introduce you to this thing called "functional equivalence". You do realize that even though they are all "nonvolatile storage," there is a difference between EEPROM and Flash, and that there are many different kinds of low- and high-density Flash and they all have different proprietary silicon designs with different characteristics?
Microchip EEPROMs are specifically designed for low-density, high-reliability applications, and are totally different at the transistor level from high-density MLC Flash used in solid state disks.
Re:SSD's? no. by Anonymous Coward · 2010-05-27 08:45 · Score: 2, Funny

+1 Informative :)
I know by billlava · 2010-05-27 08:53 · Score: 5, Funny

They could add an extra digit to the front of the display showing how many times the other numbers have reached their maximum! Brilliant, 10x the capacity for only one digit more!
Apples and hippos by blind+biker · 2010-05-27 08:53 · Score: 5, Informative

They're testing an EEPROM: while the underlying physics of storing data in an EEPROM and Flash RAM are the same - floating gate transistors - EEPROMs use best-of-breed implementations, single-bit addressable floating gate, while the Flash RAM found in SSDs is the cheapest, least enduring MLC NAND. MLC NAND are the cheapest per bit, and have a write cycle endurance two to three orders of magnitude lower than EEPROMs.
SSDs do not contain EEPROMs. They don't even contain SLC (NOR or NAND). In fact, SSDs don't even contain NOR MLCs. Only the cheapest will do, for SSDs.

--
"The agriculture ministry is not in charge of Gundam" - Japanese ministry official.
1. Re:Apples and hippos by Microlith · 2010-05-27 09:15 · Score: 3, Informative
  
  They don't even contain SLC (NOR or NAND).
  Some, usually the more expensive models, will use SLC NAND. No SSD uses NOR for data storage due to a total lack of density on that technology. They may for storing firmware/FPGA data, however.
2. Re:Apples and hippos by curunir · 2010-05-27 13:06 · Score: 2, Informative
  
  Intel's Extreme line, for one. The X25-E goes up to 64GB. It's a 2.5" form factor, but it's a SATA drive and you can use a 3.5" bay with mounting rails to put it in a desktop.
  GP is right about being expensive...expect to pay over $600 for the 64GB model.
  
  --
  "Don't blame me, I voted for Kodos!"
3. Re:Apples and hippos by somenickname · 2010-05-27 13:45 · Score: 3, Interesting
  
  All of these ones: http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=2010150636%201749646482&name=SLC
Re:Die? by fbjon · 2010-05-27 08:54 · Score: 2, Insightful

It may be that the controller on the device just doesn't know what to do when something goes pear-shaped. To be sure, you should be accessing the raw NAND chip itself.

--
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
Re:Die? by Chris+Burke · 2010-05-27 09:05 · Score: 3, Informative

(Being a software guy rather than a Flash memory guy, I wouldn't want to guess whether over-erased cells would be at logic 1, logic 0 or a mix of the two.)
Well I'm not an expert on flash, but I know a little about how they work. In NOR flash the data line is pulled up to one, so that's the default state for any bit. There's a transistor connected to ground, and if the floating gate has a charge in it and the transistor is on, then it pulls the data line down to 0. "Erasing" a NOR flash sets all the bits to 1, and programming it sets certain bits to 0.
The most common failure mode as I understand it is that electrons get trapped in the floating gate even after erase cycles such that it's very close to or over Vt for the transistor, so that bit would be stuck in the "programmed" state of logic 0.
NAND memory is the opposite, the erased state is 0 and the programmed state is 1, so a permanently charged floating gate should result in a stuck-at-1 fault.
Which, relating to the OP's question, means either way the memory wouldn't be good for much of anything. Your NAND SSD is going to fail during an erase-program (aka "write") cycle, and except in the extremely unlikely case that the pattern you were writing did not involve changing any previously stored 1s to 0s on stuck bits, then the result is going to be wrong. You could read it, but you'd be reading the wrong data.
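The erase/program behaviour described for NOR flash above can be sketched in a few lines. This is a simplified model under the convention given in this comment (erase sets bits to 1, programming clears bits to 0); the function names and the `stuck_at_zero_mask` parameter are hypothetical and exist only for illustration.

```python
ERASED = 0xFF  # a NOR erase sets every bit in the byte to 1


def program(cell, value):
    """Programming can only clear bits (1 -> 0), so the result is a bitwise AND."""
    return cell & value


def erase(stuck_at_zero_mask=0x00):
    """Erase yields all 1s, except bits whose floating gate stays charged."""
    return ERASED & ~stuck_at_zero_mask & 0xFF


def write(value, stuck_at_zero_mask=0x00):
    """One erase-then-program cycle, i.e. what the host sees as a 'write'."""
    return program(erase(stuck_at_zero_mask), value)


healthy = write(0b10110010)
worn = write(0b10110010, stuck_at_zero_mask=0b00010000)
print(f"{healthy:08b} {worn:08b}")
```

In this model the worn byte reads back with the stuck bit cleared, so the data is silently wrong unless the pattern happened to want a 0 in that position anyway.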

--

The enemies of Democracy are
Re:Huh? by Denis+Lemire · 2010-05-27 09:17 · Score: 4, Insightful

Graceful as in data not related to your recent failed writes are still readable so they can be backed up and migrated to a new drive. Not sure why that concept is so difficult. I consider something dead as "completely unreadable, ALL your data has been destroyed - have a nice day."
No longer reliable but still semi recoverable isn't quite "dead."
Maybe I'm just using a stricter interpretation of the word dead than you are?
Let's use a marker on a white board analogy. If I was storing all my data on a suitably large white board using a marker and I completely exhausted my marker's supply of ink, I'd be pissed if this resulted in a blank whiteboard, wouldn't you? On that same note, if I wiped a small section of my whiteboard with the intent of writing something new in that area and only then realized that my marker was no longer suitably supplied with ink and my write failed, I would find the blank void in that section alone acceptable.
Does that clarify things?
It's not the worst case by tepples · 2010-05-27 11:06 · Score: 2, Insightful
The AC actually posited a worse case scenario, in that the whole disk was filled, and only one "spot" was repeatedly changed.
There are two ways to handle this:
- Reserve 5% to 7% of sectors to replace worn-out sectors. This conveniently happens to match the difference between 64 GB and 64 GiB. Some newer SSDs have "250 GB", which leaves over 9% of a 256 GiB module free for the controller to spread writes.
- Move this unchanging data from less worn sectors to more worn sectors to free up the less worn sectors for more rapidly changing data.
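The spare-area percentages in the first option follow from the gap between decimal gigabytes and binary gibibytes. A quick check, assuming the raw flash is sized in GiB and the advertised capacity in GB (the `spare_fraction` helper is just a name chosen for this sketch):

```python
GB = 10**9      # decimal gigabyte, as used in advertised capacity
GiB = 2**30     # binary gibibyte, as flash modules are typically sized


def spare_fraction(raw_gib, advertised_gb):
    """Fraction of the raw flash left over for the controller."""
    raw_bytes = raw_gib * GiB
    return 1 - (advertised_gb * GB) / raw_bytes


print(f"64 GB on 64 GiB:   {spare_fraction(64, 64):.1%}")
print(f"250 GB on 256 GiB: {spare_fraction(256, 250):.1%}")
```

This gives roughly 6.9% spare for a 64 GB drive built on 64 GiB of flash, and a little over 9% for a 250 GB drive built on 256 GiB.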
I wonder... by bynary · 2010-05-27 11:30 · Score: 3, Funny

...will the Flash Destroyer hold up under this load?

--
http://www.bynarystudio.com
3700 overwrite Cycles by gweihir · 2010-05-27 13:11 · Score: 3, Informative

This is what I got from a 2GB Kingston Flash Key. After that there were errors in almost all overwrites. However the real kicker is that while the key read back wrong data, there never ever was any error reported. Since doing that at the beginning of 2009, I do not trust USB Flash anymore.
Set-up: Linux, 1MB random data replicated to fill the chip, then read back to compare. Repeat with new random data. I had one isolated faulty read-back around 3500 cycles and then from around 3700 cycles 90% (and pretty soon 100%) faulty read-backs. Language was Python, no errors for the device on STDERR or in the system logs. And I looked carefully.
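The original script is not shown, so the following is only a sketch of what a write-and-verify loop of this kind could look like. The function names, the file-based target, and the chunk size default are assumptions made for illustration.

```python
import os


def write_verify_cycle(path, total_bytes, chunk_size=1024 * 1024):
    """Fill `path` with one random chunk repeated, read it back, count bad chunks."""
    pattern = os.urandom(chunk_size)          # fresh random data for this cycle
    chunks = total_bytes // chunk_size

    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(pattern)
        f.flush()
        os.fsync(f.fileno())                  # push the data out of OS buffers

    mismatches = 0
    with open(path, "rb") as f:
        for _ in range(chunks):
            if f.read(chunk_size) != pattern:
                mismatches += 1               # wrong data, yet no I/O error raised
    return mismatches


def run_test(path, total_bytes, cycles):
    """Repeat the cycle and report any read-back mismatches."""
    for cycle in range(1, cycles + 1):
        bad = write_verify_cycle(path, total_bytes)
        if bad:
            print(f"cycle {cycle}: {bad} bad chunks")
```

One caveat with a sketch like this: a read that immediately follows a write may be served from the operating system's page cache rather than from the flash itself, so a real test would need to bypass or drop the cache (for example by unmounting and remounting the stick between the write and the read).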

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Die? by AdamHaun · 2010-05-27 16:07 · Score: 2, Informative

Your description is a bit backwards, at least for the NOR flash I work on. When the floating gate has charge (electrons), it turns the transistor off. The negative charge on the FG cancels out the positive voltage on the control gate. The bit is read via a current sense -- no current is a zero, lots of current is a one.
The main failure mechanism (that I know of) is oxide damage due to high energy electrons. Program and erase (technically, Fowler-Nordheim tunneling) take high voltages, which gives electrons enough energy to scatter into the oxide and get trapped. This repels other electrons. So what happens is that it takes longer and longer to program and erase until eventually you exceed the set limit, at which point it shows up as a fail. The bit will be in an indeterminate state. It may read correctly but won't have enough margin to guarantee data retention.

--
Visit the
This is a bad test by AdamHaun · 2010-05-27 16:49 · Score: 4, Informative

I am working on flash write/erase cycling right now in my day job and I can tell you that this is not a very good test. Temperature affects cycling endurance (and this is reflected in the spec), so if your SSD is 20-30C higher than room temp it's going to make a difference. Fowler-Nordheim tunneling (which NAND flash uses for program and erase) is hardest at cold temperatures, so the first operation after powerup might be the worst case in a PC. (Yes, I know they're not using an SSD here, but they are doing their cycling at room temp.)
Another thing to keep in mind is that continuous cycling is not realistic. The wear-out mechanism here is charge trap-up, where electrons get stuck in the floating gate oxide and repel other electrons, slowing down program and erase. Over time, thermal energy lets the electrons detrap. So irregular usage in a hot PC should actually be a nicer environment for endurance.
A final factor is process variation, which can only be covered by using a large sample size (>100) and/or using units from separate lots with known characteristics, none of which an end user will likely have access to. Even that doesn't tell you anything about the defect rate.
There are really two types of tests that people are talking about here. The first is a spec compliance test, which uses the extreme conditions I mentioned above to guarantee that all units will have the spec endurance under all spec conditions. This should be done by the manufacturer. The second is a real world usage test, which will only give realistic results if done under actual use conditions. The number you get from the article's test probably won't tell you much.
[Disclaimer: I work on embedded NOR flash, not NAND, but the bits are the same and the article's talking about EEPROM so I figure I can butt in.]

--
Visit the