Garbage Collection Algorithms Coming For SSDs
MojoKid writes "A common concern with the current crop of Solid State Drives is the performance penalty associated with block-rewriting. Flash memory is comprised of cells that usually contain 4KB pages that are arranged in blocks of 512KB. When a cell is unused, data can be written to it relatively quickly. But if a cell already contains some data, even if it fills only a single page in the block, the entire block must be re-written. This means that whatever data is already present in the block must be read, then it must be combined or replaced, and the entire block is then re-written. This process takes much longer than simply writing data straight to an empty block. This isn't a concern on fresh, new SSDs, but over time, as files are written, moved, deleted, or replaced, many blocks are left holding what is essentially orphaned or garbage data, and their long-term performance degrades because of it. To mitigate this problem, virtually all SSD manufacturers have incorporated, or soon will incorporate, garbage collection schemes into their SSD firmware which actively seek out and remove the garbage data. OCZ, in combination with Indilinx, is poised to release new firmware for their entire line-up of Vertex Series SSDs that performs active garbage collection while the drives are idle, in order to restore performance to like-new condition, even on a severely 'dirtied' drive."
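A toy model of the rewrite penalty described in the summary, using its 4KB page and 512KB block figures. The function name and the cost tuple are made up for illustration; this is only a sketch of the read/merge/rewrite idea, not how any real firmware works.

```python
PAGE_KB = 4
BLOCK_KB = 512
PAGES_PER_BLOCK = BLOCK_KB // PAGE_KB  # 128 pages per block

def write_page(block, index):
    """Write one page into `block` (a list of booleans, True = page holds data).

    Returns (pages_read, pages_programmed, blocks_erased) under the simplified
    model from the summary: an empty block takes the write directly, while a
    block that already holds data is read, merged, erased, and rewritten.
    """
    if not any(block):
        block[index] = True
        return (0, 1, 0)
    pages_read = sum(block)             # read back everything already stored
    block[index] = True                 # merge the new page into the copy
    return (pages_read, sum(block), 1)  # erase once, program all used pages

fresh = [False] * PAGES_PER_BLOCK
print(write_page(fresh, 0))  # (0, 1, 0): empty block, one page programmed
print(write_page(fresh, 1))  # (1, 2, 1): read 1 page, erase, program 2 pages
```

The more stale pages a block carries, the more has to be read and reprogrammed per write in this model, which is the slowdown idle-time garbage collection is meant to avoid.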
So what does this do when forensics are being done on one of these drives? Is the firmware just doing a better job of marking a dirty block available, or do the dirty blocks have to be zeroed at some point? Even if the blocks are just marked, will they output zeros if 'dd'ed by an OS?
Wouldn't the drive benefit from a real understanding of the filesystem for this sort of thing? If it knew a sector was unallocated on a filesystem level, it would know that sectors were empty/unneeded, even if they had been written to nicely. Or should computers now have a way of tagging a sector as "empty" on the drive?
Either way, it looks like an OS interaction would be very helpful here.
Or are modern systems already doing this, and I'm just behind the times?
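To make the question concrete, here is a minimal sketch of what a drive could do if the OS told it which sectors the filesystem considers unallocated. The function and the `unallocated` hint are hypothetical, not any real drive interface.

```python
def pages_to_preserve(stored_sectors, unallocated=frozenset()):
    """Return the sectors a block rewrite would still have to copy.

    stored_sectors: logical sector numbers currently held in one flash block.
    unallocated: sectors the OS has tagged as empty (hypothetical hint).
    """
    return [s for s in stored_sectors if s not in unallocated]

block = [100, 101, 102, 103]
print(pages_to_preserve(block))                          # [100, 101, 102, 103]
print(pages_to_preserve(block, unallocated={101, 103}))  # [100, 102]
```

Without the hint the drive has to assume every sector it ever stored is still wanted; with it, half of this block can simply be dropped.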
Not to be cynical, but these new algorithms, if implemented poorly, have the potential to run down the limited number of write cycles on the cells. Not that this could be strategically manipulated in any way...
There are 1.1... kinds of people.
The garbage collector restores the performance of the drive. Nothing comes free, so the question is: at what cost?
In theory there is no difference between theory and practice. In practice there is. - Yogi Berra
Why? It's low-level, but it doesn't affect the filesystem above it.
On the list of reasons why it SHOULD be done by the OS, not the firmware:
*The OS has a better clue about idleness
*The OS can create idleness by holding unimportant writes for a while (ext4 style) and using this time to do GC
*The OS can decide to save power by not doing this while on battery power
On the list AGAINST, I only have:
*jtownatpunk.net thinks it should be platform independent and thinks this can't be achieved without doing it in firmware
Put the essence of the driver in the public domain and code a version for Windows/Mac if required; that way all OSes will use the same logic even if they have completely different drivers.
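The three OS-side points above could boil down to a policy roughly like this. It is only a sketch: `HostState`, `next_action`, and both thresholds are invented for illustration and are not taken from any existing driver.

```python
from dataclasses import dataclass

@dataclass
class HostState:
    idle_seconds: float   # time since the last I/O request
    on_battery: bool      # True when running on battery power
    deferred_writes: int  # unimportant writes currently being held back

def next_action(state, min_idle=2.0, max_deferred=64):
    """Pick what the (hypothetical) driver does next: 'flush', 'gc', or 'wait'."""
    if state.deferred_writes >= max_deferred:
        return "flush"   # held writes cannot be delayed forever
    if state.on_battery:
        return "wait"    # save power: no background GC on battery
    if state.idle_seconds >= min_idle:
        return "gc"      # drive is idle, so use the time to collect garbage
    return "wait"

print(next_action(HostState(idle_seconds=5.0, on_battery=False, deferred_writes=3)))  # gc
print(next_action(HostState(idle_seconds=5.0, on_battery=True, deferred_writes=3)))   # wait
```

Since the logic is this small, publishing it and porting it per OS looks feasible, at least on paper.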
IranAir Flight 655 never forget!
I see OCZ already released some sort of garbage collection tool, but it only works on Windows. Kind of annoying since I bought their "Mac Edition" drive for my MacBook. Hopefully they'll put this in a firmware update, too, and hopefully I won't have to boot DOS on my Mac to update the firmware with a utility that blows over my partition table this time. That was a lot of fun going from version 1.10 to 1.30 firmware.
Anything measured in rewrites over hours or larger time spans is not (or shouldn't be) that much of a problem for modern flash. Someone calculated that you'd have to be reflashing a particular device every 15 minutes for 5 years to reach the flash's rewrite limit. That was several years ago. (It may have been 5 minutes as opposed to 15, but I'll give the less reliable number. This number appears to be from 2000 or 2001, as the device was the Agenda VR3 dating from about then.)
Assuming it's as good as the flash from that example, rewriting every hour results in 20 years. I don't know about you, but I don't have many hard drives from 20 years ago.
Now, if it's rewriting all the time, that could go down drastically, and quality might be different, but every 20 days shouldn't be a problem unless you've got really really crappy flash, by the standards of 9 years ago.
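The arithmetic behind those figures, as a quick check. The rewrite limit is only what the 15-minutes-for-5-years anecdote implies, not a datasheet number, and the helper names are made up.

```python
MINUTES_PER_YEAR = 365 * 24 * 60

def cycles_used(interval_minutes, years):
    """Rewrites performed when rewriting once per interval for `years` years."""
    return years * MINUTES_PER_YEAR // interval_minutes

def years_until_limit(limit_cycles, interval_minutes):
    """Years before `limit_cycles` rewrites are used up at the given interval."""
    return limit_cycles * interval_minutes / MINUTES_PER_YEAR

limit = cycles_used(15, 5)           # limit implied by the anecdote
print(limit)                         # 175200
print(years_until_limit(limit, 60))  # 20.0 (one rewrite per hour)
```

If the original figure really was every 5 minutes, the implied limit triples and the hourly case stretches to 60 years.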
You're right that the OS file system can do it more efficiently than the onboard SSD controller. Problem is, how many years do you want to wait until we get usable SSDs? Either every OS would need its filesystem updated to speak to SSDs, or we present the same IDE/SATA/SCSI/SAS command interface to the OS like we have for decades and handle all the new specialized items internally. Which gets the product out there as quickly as possible? When you've spent years R&Ding a product, most companies would rather get it out there as quickly as possible to recoup their costs and then fix bugs/errors as they go along, rather than wait for some third party they have no control over to do it for them ... someday ...
While your solution is the *better* one, the current method is more practical, and yes, that matters. We can only hope that eventually SSDs get an alternate interface that allows the OS to bypass the SCSI/whatever interface and speak in raw SSD lingo to the device.
The Goal: A long simple life filled with many complex toys.