Intel DC S3700 SSD Features New Proprietary Controller
crookedvulture writes "For the first time in more than four years, Intel is rolling out a new SSD controller. The chip is featured in the DC S3700 solid-state drive, an enterprise-oriented offering that's 40% cheaper than the previous generation. The S3700 has 6Gbps SATA connectivity, end-to-end data protection, LBA tag validation, 256-bit AES encryption, and ECC throughout. It also includes onboard capacitors to protect against data loss due to power failure; if problems with those capacitors are detected by the drive's self-check mechanism, it can disable the write cache. Intel's own high-endurance MLC NAND can be found in the drive, which is rated for 10 full disk writes per day for five years. Prices start at $235 for the 100GB model, and capacities are available up to 800GB. In addition to 2.5" models, there are also a couple of 1.8" ones for blade servers. The DC S3700 is sampling now, with mass production scheduled for the first quarter of 2013."
The article makes me a bit suspicious:
"Intel's own high-endurance MLC NAND can be found in the drive, which is rated for 10 full disk writes per day for five years."
sounds pretty bad actually, if I understand it right.
Per cell this means: 365*10*5 = 18,250, so roughly 20,000 write cycles per cell? Sure, wear leveling algorithms are there, but 20,000 cycles is not exceptional, or am I wrong?
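For anyone who wants to check that arithmetic, here is a quick sketch in Python. The rating and the 100GB capacity come from the summary; treating full-drive writes as per-cell cycles assumes ideal wear leveling and a write amplification of 1, which a real drive will not quite achieve.

```python
# Back-of-the-envelope check of the endurance rating quoted in the summary.
# Assumption: ideal wear leveling and write amplification of exactly 1,
# so one full-drive write costs each cell one program/erase cycle.
DAYS_PER_YEAR = 365
drive_writes_per_day = 10   # from the summary
rated_years = 5             # from the summary
capacity_gb = 100           # smallest model mentioned in the summary

full_drive_writes = DAYS_PER_YEAR * drive_writes_per_day * rated_years
total_host_writes_tb = full_drive_writes * capacity_gb / 1000

print(full_drive_writes)      # 18250
print(total_host_writes_tb)   # 1825.0
```

So the rating works out to about 1.8PB of host writes on the 100GB model, under those assumptions.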
Don't misunderstand this post. I think Intel's SSDs are good.
Yes, you are wrong: 20k cycles is very good for MLC. I can't bring up any citations now -- too lazy -- but the latest MLC NAND cells (20nm) are down to like 5k or less.
The article says this:
The controller has a 6Gbps Serial ATA interface, and a gig of DRAM rides shotgun. This DRAM cache never stores user data but is instead used for context and indirection tables.
That detail is important in light of the DC S3700's power-loss protection, which uses multiple onboard capacitors to ensure that in-flight data is safely written to the flash in the event of a power failure.
What are context and indirection tables?
This is about right. MLC flash normally is rated for between 1k and 10k cycles. Newer flash is generally less as transistor sizes are shrunk to fit in more gbytes in the same die area.
A home PC will only write a couple of gigs a day under typical workloads, which turns out to about 5 full writes a year for even the small sizes. That would last you 4000 years assuming ideal wear leveling...
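To make that estimate reproducible, here is a rough sketch. The function name is made up for illustration, and the 20,000-cycle figure is the one from upthread, which seems to be what produces the 4000-year number. It assumes ideal wear leveling and no write amplification.

```python
def years_until_worn_out(full_writes_per_year, rated_cycles):
    """Years until every cell hits its rated cycle count.

    Assumes ideal wear leveling and write amplification of 1
    (hypothetical helper, not from any vendor tool).
    """
    return rated_cycles / full_writes_per_year


def full_writes_per_year(gb_per_day, capacity_gb):
    """Convert a daily write volume into full-drive writes per year."""
    return gb_per_day * 365 / capacity_gb


# The figures used in the post: about 5 full writes a year, 20k cycles.
print(years_until_worn_out(5, 20_000))   # 4000.0

# Same idea starting from a daily volume, e.g. 2GB/day on a 100GB drive.
print(full_writes_per_year(2, 100))
```

Even if you double or triple the daily volume, the result is still measured in centuries, which is the point of the post.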
Basically, what they're saying is this will be absolutely fine for everything except outgoing mail servers and a few other specialist things.
The capacitor backup and write cache make wear leveling much much easier, since all frequently written to cells can be cached in ram, and only written once on shutdown, and the capacitor backup means even an unclean shutdown will save your data.
It's like the Colonel's 11 secret herbs and spices. If you trust the brand, it's a plus - if you think that KFC is including human remains as a "spice", it's a negative. In this case, it lets you know that there is something unique about this new Intel SSD that no other brand has. Whether that is good or bad depends purely on your feeling about Intel's level of competence in designing SSD controllers.
What is "proprietary" supposed to tell me about hardware?
There is just so much wrong with calling things "proprietary" and thinking it'll make the reader perceive the product as superior.
TFS does a terrible job of, um, summarizing the situation; but it does actually make sense in context:
Intel's initial entry into SSDs (the X-25) was based on an in-house controller, which (with the exception of the unpleasant 8MB firmware bug) was generally quite well regarded. Then it stagnated. They did a few tepid bumps and firmware updates, but no successor controller appeared. With SSDs actually able to saturate a 3Gb/s SATA bus, the fact that Intel had nothing on the table for 6Gb/s SATA began to become an issue.
More recently, Intel began shipping 3rd party controllers (most recently SandForce, possibly some Marvell at some point) on everything but their enterprise gear.
Now, after the thick end of four years, they've brought out their first new SSD controller architecture. Whether it does, in fact, turn out to be better is not known; but it is news after such a long hiatus.
What "proprietary" means to me here is "untested and likely to be very buggy". I've helped people cope with terabytes of data eaten by Intel's early X-25 models, when they first played this game. The “BAD_CTX 13x Error” AKA 8MB size bug sucked; so did their flat out deception about the drive's write cache in order to cook benchmark results.
At least they're honest about which drives do and don't care about cache integrity now, and firmware reliability of the models that do that right (the 320 and 710 series) seems pretty solid now. But since getting firmware right for a complicated SSD takes a lot of field testing, that they've switched to this new proprietary controller means the odds of data loss due to firmware bugs on this model are going to jump right back up again. Firmware seems to be the least reliable part of a typical SSD, so brand new firmware surely equals very high risk, even if the hardware is executed perfectly. Doesn't matter how well the flash cells work if you hit something like the "oh, the drive reports it's 8MB now" sort of bug--and that problem haunted multiple generations of drives in Intel's past firmware before they exorcised it. Now it seems they want to start over again. Didn't like that movie the first time, would not watch again.
At the beginning of its release cycle, the odds of firmware bugs eating all your data are massively higher on this drive than on the models that re-used existing controllers/firmware and have been out a while. The new controller means they've basically started over again with a firmware rewrite. PC hardware and software have so many possible configurations to test that it's impossible to get that right without beta testing the hardware in the field to see what problems the sucker early adopters get nailed by. The only way I'd feel comfortable relying on one of these during its first year of life, while that's getting ironed out, is to have even more backups than the current generation of hardware needs to be considered safe enough.
The small amount of RAM on Intel's SSDs is not used to cache writes in any significant quantity. The idea that you'll only have to write the most popular cells once per shutdown is a dream. The main benefit of having a bit of reliable capacitor backup is that the drive can be less aggressive about forcing an erase of a large cell just to write a fraction of it out, therefore improving the write amplification situation on the drive. You can even see limiting small writes as a factor in the claimed longevity of the drives if you dig into their spec sheets enough. I did an article comparing the 320 vs 710 series lifetimes, approaching from the perspective of one of those specialist things you allude to--database server operation. One of the things that I noticed there is that the longer lifetime of the 710 came with the restriction that you couldn't do nearly as many small random writes per second (write IOPS) and still hit the claimed lifespan target. If the cache were larger and really effective at postponing writes, that trade-off wouldn't exist.
...unless the disk is nearly full, in which case it'll be writing the same cells over and over again.
(unless they supply a utility which moves data from least-used cells to most-used...)
That happens even if the disk is nowhere near full, and performing wear leveling is a major part of what the SSD controller does. If you're on a system that doesn't support TRIM, a nearly-full disk could end up with write amplification problems, though.
There are NOT faster and cheaper drives on the market with these features. There's nothing out there right now, to my knowledge, with a capacitor-backed cache and end-to-end integrity.
For the price Intel is asking, it's quite reasonable as well.
Except the "dirty little secret" of the industry is it's NOT the cells dying that gets you, the controller dying is what bites you in the ass. If it was just the cells, that wouldn't be so bad, since when a cell fails it just ends up read only, but when the controller fails you flip the switch and...nothing. Not even the BIOS/UEFI detects the thing, it's just gone.
That is why even though this article is a year old I'd urge those thinking of diving into SSD to read it, especially the comments where you see guy after guy getting bit in the ass by dead controllers. Brand makes a difference, OCZ being worst and Intel best, but ALL have this problem to a degree, and when it happens to you? Well, let's just hope you have a VERY recent backup.
This is why I tell my customers there are some places SSDs make sense but NOT all. If it's mobile, not mission critical, and you religiously stick to a backup schedule? No problem there. If it's just an OS drive with the data on HDD? No problem there, just make sure you have recent disc images so you can just clone onto the replacement. But in anything mission critical, or for those that won't stick to a rigid backup schedule? Then SSD is NOT the way to go, it'll bite them on the ass and leave them in a bad way.
They really need to come up with a second controller, one that will simply take over in the case of failure and leave the drive in a read only state. This would at least ensure that when the main controller does fail you can get the data off, and it's those failure rates that are keeping a lot of people (myself included) from switching.
Except the "dirty little secret" of the industry is it's NOT the cells dying that gets you, the controller dying is what bites you in the ass. If it was just the cells, that wouldn't be so bad, since when a cell fails it just ends up read only, but when the controller fails you flip the switch and...nothing. Not even the BIOS/UEFI detects the thing, it's just gone.
You forget that in a file system you typically write to more than one cell to store some data, what happens when some writes succeed and others fail? Major file system corruption and fast. I've managed to wear out one of the original OCZ Vertex drives - don't know how, I wrote maybe 5 TB to it and ideally it should take 1200 TB @ 10k writes/cell but SMART data was pretty clear. I had a broken file system and each run of fsck made everything worse, I had to stop trying to fix it, mount the thing read-only and salvage what I could. Even that failure mode is not graceful.
(unless they supply a utility which moves data from least-used cells to most-used...)
All SSDs do wear levelling, otherwise they'd die after a couple of days. That happens beneath the LBA address layer, i.e. LBAs are mapped to physical addresses and the mapping changes each time an LBA is written.
So you don't need to do wear levelling at the file system level. In fact the only thing you need to do there is to have a TRIM command which tells the SSD that a range of LBAs no longer contains useful data. That means the SSD can mark them as obsolete, which gives the wear levelling a bit more elbow room.
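Here is a toy sketch of that remapping, just to illustrate the idea. The class and method names are invented, it works on whole pages, and it skips garbage collection entirely, so it is nothing like a real controller's firmware.

```python
class ToyFTL:
    """Toy flash translation layer: each write to an LBA lands on a fresh page.

    Illustrative only; names and structure are assumptions, not any
    vendor's design. Garbage collection of stale pages is left out.
    """

    def __init__(self, physical_pages):
        self.free = list(range(physical_pages))  # pages available for new writes
        self.mapping = {}                        # LBA -> physical page
        self.stale = set()                       # pages holding obsolete data

    def write(self, lba):
        if not self.free:
            raise RuntimeError("no free pages; a real drive would garbage collect")
        old = self.mapping.get(lba)
        if old is not None:
            self.stale.add(old)        # the previous copy is now obsolete
        new = self.free.pop(0)
        self.mapping[lba] = new        # the mapping changes on every write
        return new

    def trim(self, lba):
        """Host says this LBA no longer holds useful data."""
        old = self.mapping.pop(lba, None)
        if old is not None:
            self.stale.add(old)


ftl = ToyFTL(physical_pages=8)
first = ftl.write(lba=3)
second = ftl.write(lba=3)    # same LBA, different physical page
ftl.trim(lba=3)
print(first, second, sorted(ftl.stale))   # 0 1 [0, 1]
```

The TRIM at the end is what lets the drive count page 1 as reclaimable without waiting for the host to overwrite LBA 3.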
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
You've always got a free erase unit, because at least one is reserved for wear levelling. It's easy to invent an algorithm that moves that free unit around the disk by garbage collecting from a full unit to an empty one.
There are papers on this sort of thing. Look at the patents M Systems filed for example, or the documentation on TrueFFS. I've worked with embedded systems that used that and one of the first things we did after we got a socket driver working was to hammer a full disk and check that the wear levelling really did what it was supposed to.
And, sure enough, if you log unit erases overnight, they are evenly spread.
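A minimal simulation of that kind of test, for the curious. This is my own toy algorithm, not TrueFFS or anything from the M-Systems patents: one spare erase unit, a host that hammers a single logical block on an otherwise full disk, and a levelling step after every host write that garbage collects the least-worn full unit into the spare. Levelling on every write doubles the erase traffic, which a real implementation would avoid by levelling far less often.

```python
def hammer(num_units=8, host_writes=8000, hot_block=0):
    """Rewrite one logical block repeatedly on a full disk.

    Returns (erase_counts, location, free_unit). Toy model: one logical
    block per erase unit and a single reserved spare unit.
    """
    erase_counts = [0] * num_units
    # Logical block b starts in unit b; the last unit is the reserved spare.
    location = {b: b for b in range(num_units - 1)}
    free_unit = num_units - 1

    def relocate(block):
        """Put `block` in the spare, erase its old unit, make that the spare."""
        nonlocal free_unit
        old_unit = location[block]
        location[block] = free_unit
        erase_counts[old_unit] += 1
        free_unit = old_unit

    for _ in range(host_writes):
        # Host write: the new data goes to the spare unit.
        relocate(hot_block)
        # Levelling: move the block sitting in the least-worn full unit
        # into the spare, so the spare wanders around the disk.
        coldest = min(location, key=lambda b: (erase_counts[location[b]], location[b]))
        relocate(coldest)

    return erase_counts, location, free_unit


counts, location, spare = hammer()
print(counts)
print(max(counts) - min(counts))
```

Logging the per-unit counts like this is the simulated version of the overnight erase log: without the levelling step, the hot block would just bounce between two units and wear them out while the rest sat idle.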
Not quite correct either.
It's not the controller hardware dying, it's the controller firmware crashing and burning.
A few days ago, my Crucial C300, a drive I've been running like mad for 2 years, finally critically failed to read back a sector. And instead of returning a disk error, the entire drive froze. I waited 15 minutes to see if it'd come back; it didn't. Rebooting, then rereading resulted in the same drive crash. Overwriting the sector with dd forced a remap and allowed me to fully image my drive.
What does this tell us?
1) A 2nd controller doesn't help. It'll just do the same thing.
2) In the normal block failure mode, it'll return a disk error and we can overwrite it.
3) There exist bugs in the firmware where the block tracking metadata gets into a state where the controller can't handle it anymore. My guess is that maybe it ran out of memory trying to clean itself up or something. Whatever the case, if you hit something like this, there needs to be a way to escape without losing the entire drive. Perhaps a debug mode or memory-optimized read-only mode toggled by a jumper or something.
4) I should have noted the rare occasional stutter in the past month as a sign that things were not great.
Anyhow, I backed everything up, issued an ATA secure erase to hope the drive cleans its metadata too, and then loaded everything back on from the disk image. Works perfectly.
(relevant equipment: OSX 10.6, no TRIM enabled, ~2.5 year old drive, primary build/dev environment, all firmware updates have been loaded)