Long-Term Performance Analysis of Intel SSDs

Posted by Soulskill on Friday February 13, 2009 @03:12PM from the apparently-some-things-get-worse-over-time dept.

Vigile writes "When the Intel X25-M series of solid state drives hit the market last year, there was little debate that they were easily the best performing MLC (multi-level cell) offerings to date. The one area in which they blew away the competition was with write speeds — initial reviews showed consistent 80MB/s results. However, a new article over at PC Perspective that looks at Intel X25-M performance over a period of time shows that write speeds are dramatically reduced from everyday usage patterns. Average write speeds are shown to drop to half (40MB/s) or less in the worst cases, though the author does describe ways that users can recover some of the original drive speed using standard HDD testing tools." Reader MojoKid contributes related SSD news that researchers from the University of Tokyo have developed a new power supply system which will significantly reduce power consumption for NAND Flash memory.

23 of 95 comments

Why? by IamGarageGuy+2 · 2009-02-13 15:20 · Score: 3, Interesting

I didn't see anything that answered the question of why this would happen. I may be slow but shouldn't it either fail or work? Is storage being lost and therefore getting less with more time used to find a good area? Please don't mod me as troll for not knowing (maybe flamebait for being stupid, I guess)

--
Stay tuned for new sig...
1. Re:Why? by SpazmodeusG · 2009-02-13 15:41 · Score: 5, Informative
  
  The Intel SSD (and all SSDs) are made up of big addressable blocks. On the Intel SSD these are 64KiB in length. When you read or write to the drive the internal controller actually reads or writes an entire 64KiB block.
  A simple change to 1 byte means a read of the entire 64KiB block that byte is in, a change of the data and then a write of 64KiB.
  If the filesystem isn't flash-aware you can suffer a theoretical performance hit of being 65536 times slower because of this.
  
  So what you really need is a filesystem that stores files in 64KiB blocks and groups reads and writes to the same blocks together as 1 operation.
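
  The read-modify-write cost described above can be put into numbers. This is only a sketch under the comment's own assumption of 64KiB blocks that are always read and rewritten whole; the function names are made up for illustration and nothing here reflects how the Intel controller actually behaves.

```python
BLOCK_SIZE = 64 * 1024  # 64 KiB block size, as assumed in the comment above


def blocks_touched(offset: int, length: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of whole blocks that a write of `length` bytes at `offset` overlaps."""
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


def write_amplification(offset: int, length: int, block_size: int = BLOCK_SIZE) -> float:
    """Bytes physically rewritten divided by bytes the host asked to write."""
    physical = blocks_touched(offset, length, block_size) * block_size
    return physical / length


print(write_amplification(0, 1))          # 65536.0, the worst case quoted above
print(write_amplification(0, 4096))       # 16.0 for a single 4 KiB write
print(write_amplification(0, 64 * 1024))  # 1.0 for an aligned full-block write
```

  Under this model the penalty disappears once writes are block-sized and block-aligned, which is the point of grouping writes to the same block into one operation.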
2. Re:Why? by Rockoon · 2009-02-13 15:58 · Score: 5, Informative
  
  It happens because flash drives write on very large (512KB) "blocks", but they still pretend like they have average sized (4KB) "sectors".
  
  Essentially what the Intel write-combining technology is doing is combining multiple small (4KB) writes into a single block, and letting the old block become fragmented (having a bunch of 4KB holes in it).
  
  The scenario in a nutshell:
  
  You have a 1MB file and a program which modifies a single 4KB chunk of it. Intel's technology marks the original 4KB chunk within its original "block" as erased, and then allocates a new block (using the wear leveling algorithm) to hold the new version of the 4KB chunk, combining it with any other small write operations that have recently occurred or will soon occur. Up to 128 such 4KB writes can be combined into a single block write.
  
  After this is done many hundreds of thousands of times, however, the drive ends up in a state where nearly every "block" is only partially used. The write combiner itself is stuck with whatever the wear leveling algorithm handed it, which is now a partially used block instead of a fully virgin block. It can no longer combine 128 small 4KB writes together; maybe it only has space to combine 10 of them or, in the worst case, 1 of them.
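
  A rough way to see why that hurts: count how many block-level writes it takes to flush a batch of small writes when each block handed out has only so many free 4KB slots. The sketch assumes the 512KB/4KB figures above and that every block write costs about the same no matter how many slots it fills; both the function name and that cost model are simplifications for illustration.

```python
import math

BLOCK_BYTES = 512 * 1024    # block size quoted in the comment above
SECTOR_BYTES = 4 * 1024     # emulated sector size quoted in the comment above
SLOTS_PER_BLOCK = BLOCK_BYTES // SECTOR_BYTES  # 128 small writes fit in one block


def block_writes_needed(pending_small_writes: int, free_slots_per_block: int) -> int:
    """Block writes needed when every allocated block has this many free slots."""
    if not 1 <= free_slots_per_block <= SLOTS_PER_BLOCK:
        raise ValueError("free slots must be between 1 and SLOTS_PER_BLOCK")
    return math.ceil(pending_small_writes / free_slots_per_block)


# Flushing 1280 pending 4KB writes on a fresh, a fragmented and a worst-case drive.
for free_slots in (128, 10, 1):
    print(free_slots, block_writes_needed(1280, free_slots))
# 128 10
# 10 128
# 1 1280
```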
  
  --
  "His name was James Damore."
3. Re:Why? by MadnessASAP · 2009-02-13 16:02 · Score: 2, Informative
  
  I thought it could flip a bit to 1 at will without having to rewrite the whole block, but if you want to write a 0 it needs to read the whole block, wipe it out and rewrite it with the same data but with the bit flipped to 0. But I wouldn't really know; I'll use SSDs when they cost about the same as hard drives.
  
  --
  I may agree with what you say, but I will defend to the death your right to face the consequences of saying it.
4. Re:Why? by Dr.+Ion · 2009-02-13 17:08 · Score: 5, Informative
  
  Um... no.
  When cells age, they take longer to erase. This happens over 5,000, 10,000 cycles or longer. It's not dramatic, and eventually the cells fail in a way more severe than can be corrected by the ECC.
  Because there is a (software) process to bring full speed back to the drive, we can safely conclude that none of the slowdown is related to cell aging or other cell-level issues. It's more of an organization and fragmentation issue.
5. Re:Why? by tlhIngan · 2009-02-13 17:08 · Score: 5, Informative
  
  The Intel SSD (and all SSDs) are made up of big addressable blocks. On the Intel SSD these are 64KiB in length. When you read or write to the drive the internal controller actually reads or writes an entire 64KiB block.
  A simple change to 1 byte means a read of the entire 64KiB block that byte is in, a change of the data and then a write of 64KiB.
  If the filesystem isn't flash-aware you can suffer a theoretical performance hit of being 65536 times slower because of this.
  So what you really need is a filesystem that stores files in 64KiB blocks and groups reads and writes to the same blocks together as 1 operation.
  Actually, NAND flash comes in 2 block sizes - small block (16kiB/block, 512 bytes/page, 32 pages/block), and large block (128kiB/block, 2048 bytes/page, 64 pages/block).
  Also, in NAND flash, a "write" operation can turn a "1" bit to a "0" bit. An "erase" operation turns a "0" bit into a "1" bit. Writes can work at the bit level, erases at the block level. (Though, large block NAND can NOT be partial-page programmed, so you must write 2048 bytes at once, but you can read all 2048 bytes, flip one bit, then write it all back). This characteristic is used by the flash management routines in order to manage the flash block. Marking pages as "discard" or "ready for erase" is done by flipping a 1 bit to 0 since that's easy. You can write a block partially, so you don't have to incur a huge 128kiB write always.
  Given this, it's a block device, so you can't write 1 byte anyhow - you must write the sector size, which is emulated as 512 bytes. What normally happens is that the SSD will mark a page as "dirty" to indicate it's not to be used, and remap that page's contents onto a new page elsewhere, thus only performing a 2048 byte write (plus 64 out of band bytes).
  Now, what happens when all the blocks are used? The flash routines have to erase a block, but before erasing a block, it has to make sure all the pages within it are "dirty". If there are non-dirty pages, they're copied to another block, and when all non-dirty pages are copied, that block is erased. If your access pattern is such that all the blocks have non-dirty pages, it takes a little while to actually move all the data around to get blocks that can be erased. Do enough random I/O, and this can happen quite easily.
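
  The write/erase asymmetry described above (a write can only turn 1 bits into 0 bits, an erase turns a whole block back to 1 bits) can be modelled in a few lines. This is a toy model using the large-block geometry quoted above; the class and method names are invented for illustration and do not correspond to any real flash driver interface. It also permits re-programming a page, which the comment says older parts allow.

```python
PAGE_BYTES = 2048        # large-block page size quoted above
PAGES_PER_BLOCK = 64     # large-block pages per block quoted above


class NandBlock:
    """Toy model of one NAND erase block; not a real device interface."""

    def __init__(self) -> None:
        # An erased block reads back as all 1 bits (0xFF bytes).
        self.pages = [bytearray(b"\xff" * PAGE_BYTES) for _ in range(PAGES_PER_BLOCK)]

    def program(self, page_no: int, data: bytes) -> None:
        """Program a full page. Bits only go from 1 to 0, so this is a bitwise AND."""
        if len(data) != PAGE_BYTES:
            raise ValueError("this model programs a whole page at a time")
        page = self.pages[page_no]
        for i, byte in enumerate(data):
            page[i] &= byte

    def erase(self) -> None:
        """Erase the whole block: every bit in every page goes back to 1."""
        for page in self.pages:
            page[:] = b"\xff" * PAGE_BYTES


block = NandBlock()
block.program(0, b"\xf0" + b"\xff" * (PAGE_BYTES - 1))
block.program(0, b"\xff" * PAGE_BYTES)   # trying to write 1s back changes nothing
print(hex(block.pages[0][0]))            # 0xf0
block.erase()                            # only a block erase restores the 1 bits
print(hex(block.pages[0][0]))            # 0xff
```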
6. Re:Why? by Dr.+Ion · 2009-02-13 17:40 · Score: 2, Informative
  
  Older flash devices allowed multiple writes to one page, but new ones do not.
  The higher-density MLC devices do not allow you to read a page, flip a bit to 0 and overwrite it. They require that pages be written just once, and in order.
  This is causing no end of frustration for the Microsoft mobile filesystems, which frequently overwrote pages to flag them.
7. Re:Why? by AllynM · 2009-02-14 01:16 · Score: 2, Informative
  
  There was no difference in how long it took to fragment. If we wrote a nasty enough mix of smaller file sizes to the drive, performance would drop right at the point where all flash was written to at least once (i.e. just over the 80GB mark).
  After running HDDErase on the drive, it returned to the *exact* same 80 MB/sec write speed each and every time. Additionally, running successive software secure erasures (writing 0's across all 80GB) showed 0 drop in speed even after 10 passes.
  In testing several different SSD brands / types, I have yet to see a slowdown that would suggest block erasures take longer over time. I suspect the block erase timing is based on flash that is at or near its end of life.
  Al Malventano
  PCPer Editor
  
  --
  this sig was brought to you by the letter /.
TL:DR by Nursie · 2009-02-13 15:33 · Score: 3, Insightful

That article is a multi-page annoyance, the grammar is bad and we already have flash-aware filesystems like jffs2.
1. Re:TL:DR by bcrowell · 2009-02-13 16:11 · Score: 3, Informative
  
  That article is a multi-page annoyance, the grammar is bad and we already have flash-aware filesystems like jffs2.
  
  As far as I can tell from some quick googling and checking on Wikipedia, jffs2 isn't much of a competitor at this point, e.g., it's apparently not really usable on flash chips bigger than 512 Mb. Maybe UBIFS or LogFS? None of them seem to be really mature.
  
  --
  Find free books.
2. Re:TL:DR by renoX · 2009-02-13 19:52 · Score: 2, Insightful
  
  Flash-aware filesystems currently only work on embedded setups where there is direct access to the flash.
  Given the need for compatibility, SSDs will always have a controller presenting the SSD as a disk, but I agree that it'd be nice if they added lower-level access for cases where the computer is able to use a flash-aware filesystem.
3. Re:TL:DR by TheRaven64 · 2009-02-14 02:02 · Score: 4, Interesting
  
  Not sure about the Linux world, but LFS on NetBSD counts as mature. It's been sitting in the BSD tree since 4.4BSD (1990, a year before the first Linux release) and is well supported by NetBSD, although the other BSDs dropped it from their trees in the intervening decades because it didn't provide major benefits on rotating mechanical disks. With flash becoming cheap, suddenly it's seeing a lot more interest...
  
  --
  I am TheRaven on Soylent News
I am waiting for these SSDs by bogaboga · 2009-02-13 15:45 · Score: 3, Informative

I am patiently waiting for these SSDs and plan to test them on a MythTV distro box. I will get a fully compatible Linux SSD notebook onto which a MythTV distro will be installed.
Then with 3 TV cards, I will see how these SSDs measure up on reading/writing/transcoding etc. My intention is to work the SSD for about a week. Watch this space for results.
I do not think that Intel will deliver the "golden" SSD. I think Samsung's SSD effort will bear results faster. Those videos say a lot.
Re:Damn by Viper+Daimao · 2009-02-13 15:54 · Score: 4, Insightful

Exactly. BSG and Joss Whedon's new show are on.

--
"In the game of life, someone always has to lose. To me, if life were fair, that someone would always be Oklahoma." -DKR
SLC vs MLC by w0mprat · 2009-02-13 16:25 · Score: 3, Interesting

Isn't the problem partly MLC? SLC has consistently better small random write performance. Many cheap SSDs use MLC for obvious reasons: it fares well in benchmarking (MLC has relatively high read performance), but write performance hurts real bad in real-world usage. You may get noticeable micro-lag anytime the OS writes to storage. Application loading may be snappy, for example, but the whole system slows down while writes are done. It's good to see the truth coming out amongst all the benchmarketing.

It's early days for SSDs. I'll be sticking with my power-guzzling magnetic frisbee stacks for a while yet.

--
After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
1. Re:SLC vs MLC by Anonymous Coward · 2009-02-13 17:33 · Score: 2, Informative
  
  Once someone releases an SSD that solves ALL of those sticky points, and ideally delivers enough random-access throughput to saturate the 300MB/s SATA line (or whatever bus is mainstream by then), that's when I'll jump on board.
  Well, like myself, you will be waiting for a non-flash based SSD then.
  Inevitably, something like PRAM will displace Flash, and it can't happen soon enough. Until then, I would much rather see some of that fab capacity reclaimed for DRAM production.
2. Re:SLC vs MLC by Kneo24 · 2009-02-14 00:13 · Score: 2, Interesting
  
  I've been using an Intel SSD as a boot drive and I think it's worth every penny so far. I have a few programs and games on the boot drive and they all load up considerably faster than the alternatives. I don't care about write speeds. Their size alone means they're not really meant for storage yet, so using it as such is a bit retarded. If you're doing a lot of write operations to your SSD, you should probably think about moving that file(s) to a different storage device.
3. Re:SLC vs MLC by Kjella · 2009-02-14 01:19 · Score: 4, Insightful
  
  To me, MLC has a conceptual problem of going against the fine tradition of binary computing, which is all about data integrity. Why don't we go back to analog computers for even higher densities, while we're at it.
  Says someone that obviously has never seen the raw output of a HDD read head, or the optical laser in a DVD reader. The real world was always very ugly and analog, there's a helluva lot going on to give you a 0 or 1 answer.
  
  --
  Live today, because you never know what tomorrow brings
4. Re:SLC vs MLC by cnettel · 2009-02-14 01:30 · Score: 4, Informative
  
  Do you detest Gigabit Ethernet and disk-based drives as well? Pure binary protocols for signals on media tend to be inefficient. The technology is still digital. Whether data integrity is a priority or not is really a matter of proper error correction; relying on avoiding single-bit errors is a flawed strategy.
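
  To illustrate how a multi-level signal can still be digital, here is a sketch of a 2-bit cell with four nominal levels. The level values, the Gray-coded bit assignment and the function names are all assumptions chosen for illustration, not data for any real flash part. With a Gray code, a read that lands on the neighbouring level corrupts only one bit, which is the kind of error that error correction is designed to handle.

```python
# Hypothetical nominal levels (arbitrary units) and Gray-coded 2-bit values.
LEVELS = [(0.0, 0b11), (1.0, 0b10), (2.0, 0b00), (3.0, 0b01)]


def write_cell(bits: int) -> float:
    """Return the nominal level used to store a 2-bit value."""
    for level, value in LEVELS:
        if value == bits:
            return level
    raise ValueError("bits must be in the range 0..3")


def read_cell(measured: float) -> int:
    """Decode a noisy analog reading by snapping to the nearest nominal level."""
    _, value = min(LEVELS, key=lambda entry: abs(measured - entry[0]))
    return value


stored = write_cell(0b10)                # nominal level 1.0
print(bin(read_cell(stored + 0.3)))      # 0b10: small noise still decodes correctly
drifted = read_cell(stored + 0.6)        # large noise snaps to the next level up
print(bin(drifted ^ 0b10).count("1"))    # 1: only a single bit is wrong
```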
Closer, but.. no. by Dr.+Ion · 2009-02-13 17:11 · Score: 4, Informative

NAND blocks are *erased* in large blocks, probably 128KB or larger in this case.
However, the read and write operations occur at a *page* level, not block. NAND pages today are typically 2K or 4KB in size.
So you can read and write in smaller units than 128KB.
However, to erase any byte of the NAND, you have to relocate the preserved data and erase a whole block.
Because these drives operate on huge aggregate arrays of NAND, their block structure may be much larger, or they may have very complicated and smart algorithms to re-map and write new data while waiting to perform erases much later.
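
The relocate-then-erase step can be sketched as a tiny reclaim routine. This is a simplified illustration only: pages are reduced to three states, the 64-page block size is an assumption, and the function name is hypothetical rather than taken from any real controller firmware.

```python
FREE, VALID, STALE = "free", "valid", "stale"
PAGES_PER_BLOCK = 64  # assumed geometry for this sketch


def reclaim_block(victim: list, spare: list) -> int:
    """Copy valid pages from `victim` into free pages of `spare`, then erase `victim`.

    Returns the number of pages that had to be relocated before the erase.
    """
    to_move = victim.count(VALID)
    if to_move > spare.count(FREE):
        raise ValueError("spare block does not have enough free pages")
    for _ in range(to_move):
        spare[spare.index(FREE)] = VALID   # page-level write into the spare block
    victim[:] = [FREE] * len(victim)       # block-level erase of the victim
    return to_move


victim = [VALID] * 3 + [STALE] * (PAGES_PER_BLOCK - 3)
spare = [FREE] * PAGES_PER_BLOCK
print(reclaim_block(victim, spare))   # 3 pages copied before the erase
print(victim.count(FREE))             # 64: the whole block is writable again
```

The more valid pages a victim block still holds, the more copying each erase costs, which fits the fragmentation explanation given elsewhere in this thread.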
Re:File system? by Dr.+Ion · 2009-02-13 17:20 · Score: 2, Insightful

One of the biggest challenges of the coming years will be finding and developing filesystems (logical data stores) that take advantage of the strengths of flash memory while diminishing its weaknesses.
Our approach today is mapping large banks of flash to look like a hard drive, and then using a filesystem that is optimized to reduce seek activity (cylinders, heads, sectors per track...).
EXT3 on SSD, FAT on huge SD cards, it's just shoe-horning our old filesystems onto new media. It makes about as much sense as using a hard drive to store a single TAR image only.
Once we make the huge step of designing high-performance filesystems that are exclusively *for* flash media, then we can take advantage of some of the huge benefits that are distinctly flash.
Key things like journalling should be designed with the flash organization in mind: pages and blocks vs "sectors". That kind of thing.
Re:There's got to be some writable space here... by Dr.+Ion · 2009-02-13 17:35 · Score: 4, Insightful

That's so oversimplified as to be completely wrong.
The number of write/erase cycles on NAND is significantly lower than on a hard drive. Typical devices are rated for 10,000 cycles. Bleeding-edge MLC parts can be as low as 5,000 or 7,000 erase cycles.
But a well-designed device will perform accurate wear-levelling across all the available blocks, so it doesn't matter what kind of access the user performs -- the whole device will wear evenly.
There are indeed reserve blocks to mitigate premature death of some parts.
But, the most important part is the ECC mechanism. The parts don't just wear out and die, they get an increasing bit error rate. By overdesigning the ECC logic, you can squeeze longer life out of the parts.
It does not play guess and check; well-recognized error correction algorithms like Reed-Solomon or BCH are used, with really high detect/correct rates.
Once you have accurate wear levelling, excellent ECC, and some manner of failure prediction, then it doesn't make so much sense to keep all your flash "in reserve" ready to swap out other parts wholesale. You might as well involve all the parts in the mix, so you get longer wear throughout.
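
For a sense of scale, the rated cycle counts above can be turned into a back-of-the-envelope lifetime. The sketch assumes perfectly even wear-levelling, uses the 80GB capacity mentioned elsewhere in this thread, and picks 20GB of host writes per day and the write amplification factors purely as illustrative assumptions; the function name is made up.

```python
def lifetime_years(capacity_gb: float, erase_cycles: int,
                   host_gb_per_day: float, write_amplification: float = 1.0) -> float:
    """Estimated years until every block reaches its rated erase cycles.

    Assumes ideal wear-levelling, so the device wears evenly across all blocks.
    """
    if min(capacity_gb, erase_cycles, host_gb_per_day, write_amplification) <= 0:
        raise ValueError("all inputs must be positive")
    total_host_gb = capacity_gb * erase_cycles / write_amplification
    return total_host_gb / host_gb_per_day / 365


# 80GB drive, 10,000-cycle parts, an assumed 20GB/day of host writes.
print(round(lifetime_years(80, 10_000, 20), 1))                           # ideal case
print(round(lifetime_years(80, 10_000, 20, write_amplification=10), 1))   # heavy amplification
print(round(lifetime_years(80, 5_000, 20, write_amplification=10), 1))    # 5,000-cycle MLC
```

Even with pessimistic amplification the estimate stays in the range of years, which is why even wear and strong ECC matter more than holding whole parts in reserve.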
Still better than the alternative by Anonymous Coward · 2009-02-13 17:35 · Score: 2, Informative

Looking at the big picture, I'd rather have a slow SSD than keep dealing with the data losses of (criminally unreliable) HDs.