Intel Confirms Data Corruption Bug, Halts New SSDs
CWmike writes "Intel has confirmed that its new consumer-class X25-M and X18-M solid-state disk drives (SSDs) suffer from data corruption issues and said it has pulled back shipments to resellers. The X25-M (2.5-inch) and X18-M (1.8-inch) SSDs are based on a joint venture with Micron and use that company's 34-nanometer lithography technology. That process allows for a denser, higher-capacity product that brings with it a lower price tag than Intel's previous offerings, which were based on 50-nanometer lithography technology. Intel says the data corruption problem occurs only if a user sets up a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer. When that happens, the SSD becomes inoperable and the data on it is irretrievable. This is not the first time Intel's X25-M and X18-M SSDs have suffered from firmware bugs. The company's first generation of drives suffered from fragmentation issues resulting in performance degradation over time. Intel issued a firmware upgrade as a fix."
Maybe they should have used HW/SW co-verification (like Seagate in that study - an example of how a storage company tests their firmware).
For you software developers out there who enjoy free debuggers, you should know that we hardware designers also have our own debuggers. Except they are a little bit more expensive (think $500,000+) and can be quite bulky. But they are the only way to really test firmware before taping out a chip.
"The company's first generation of drives suffered from fragmentation issues resulting in performance degradation over time."
The performance degradation in the Intel X25 is not because of a "firmware bug". All SSDs will suffer performance degradation whether or not their writing/wear-leveling algorithms have been updated via firmware.
Future? You must be new to computers. I updated the firmware in my very first 80's printer to give it more features. Had to pop out the old chips and put in the new ones. I upgraded the firmware in modems from several different manufacturers (some more than once) to add features and fix bugs. I've updated the firmware (BIOS) on most of my motherboards. I've updated the firmware on optical drives. I've updated the firmware on a scanner. I've updated the firmware on SCSI controllers. I've updated the firmware on hard drives. I've updated the firmware on switches and routers. Hell, I've updated the firmware on keyboards.
This is hardly a new phenomenon.
They probably meant a hard disk password. Depending on the implementation, this means either that the disk supports full-disk encryption, or that there is a simple firmware interlock that prevents reading through the controller without the password (but which could be bypassed with forensic tools that read the disk surface directly).
$_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
Seriously, I'd say this is in the By Design bucket. For the security conscious - set a BIOS password. If the (feds/aliens/wife/others) remove the password, all access to the data is gone.
Brilliant! Secure!
Mind you, not being able to change my password once every other day might hinder my current security model.
Not really. Making an educated guess from the article, it appears that this is implemented as a simple controller lockout, not actual encryption. So swapping the flash memory into another controller (common computer forensics technique) would bypass it. Most people paranoid enough to want a disk password want real encryption, so using Intel's half-measure of a password is likely a very uncommon scenario. The tests are probably very simple; glossing over this case would be an understandable, if not desirable, oversight.
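The difference between a controller lockout and real encryption can be sketched in a few lines. This is only a toy model of the guess made above, not Intel's actual firmware: the class names, the `flash` attribute, and the SHA-256-based keystream are all assumptions made for illustration, and the XOR "cipher" is not secure. Reading `flash` directly stands in for moving the flash chips to another controller.

```python
import hashlib


class LockoutController:
    """Toy controller: the password only gates the read path.

    The flash cells themselves hold plaintext (hypothetical model).
    """

    def __init__(self, password: str, data: bytes):
        self._pw_hash = hashlib.sha256(password.encode()).digest()
        self.flash = bytearray(data)  # raw cells, stored unencrypted

    def read(self, password: str) -> bytes:
        if hashlib.sha256(password.encode()).digest() != self._pw_hash:
            raise PermissionError("drive locked")
        return bytes(self.flash)


def keystream(password: str, length: int) -> bytes:
    """Toy keystream: SHA-256 over password + counter. NOT a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        block = password.encode() + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


class EncryptingController:
    """Toy controller: the flash cells hold only ciphertext."""

    def __init__(self, password: str, data: bytes):
        ks = keystream(password, len(data))
        self.flash = bytearray(a ^ b for a, b in zip(data, ks))

    def read(self, password: str) -> bytes:
        # A wrong password yields garbage rather than an error.
        ks = keystream(password, len(self.flash))
        return bytes(a ^ b for a, b in zip(self.flash, ks))


secret = b"quarterly numbers"
locked = LockoutController("hunter2", secret)
encrypted = EncryptingController("hunter2", secret)

# "Swap the flash into another controller": skip read() and dump the cells.
bypassed_lockout = bytes(locked.flash)
bypassed_encrypted = bytes(encrypted.flash)
```

In this model the lockout drive gives up its contents as soon as the controller is bypassed, while the encrypting drive yields only ciphertext without the password.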
$_ = "wftedskaebjgdpjgidbsmnjgcdwatb"; tr/a-z/oh, turtleneck Phrase Jar!/; print
Although this bug should have been caught faster, it seems that it is possible to update the firmware without any data loss (fortunately I have put mine in a laptop, so power outages are no problem). I've looked at the Intel site and the flash utility seems to be simply bootable from CD - if this is the last bug I'll be a very happy punter indeed.
My 80 GB G2 SSD replaced a not-too-fast laptop drive. I'm now trying Linux, but I'll try Vista as well just for fun - I'll just write my 80 GB to an external drive using Gparted. These drives come highly recommended even if they would slow down to 50% of performance (which, it seems, they don't). I unzipped Eclipse and the JavaDoc to it and could see that the archiver that unzipped the .zip has some performance issues reading the index. That took longer than the unzipping, gunzipping and untarring (the Eclipse gunzipping/untarring took less than 2 seconds - yikes). The only thing faster is the tmpfs in RAM that I used to compile OpenJDK on my "workstation". Starting Eclipse now takes less time on my laptop than on my workstation, even though the laptop has half as many cycles.
Is this a cost issue, or a thoroughness issue?
No, we don't catch every possible scenario here, either, but we do try very, very hard. I know one of the coders in Intel's RAID driver group, and he goes crazy with this stuff. And he just writes Linux drivers. I do not envy him - this past year, every bug he's had to fix has been caused by someone else's code. Someone not writing Intel drivers. And he gets slammed every time for bad testing, as if he could test all the rest of the kernel team's stuff, not to mention every fly-by-night Chinese hardware outfit. They're killing him.
I can't even say 'ext4', he just goes insane. Though he chuckles when I whisper 'ReiserFS', and opens another beer.
I'm glad I'm not in that line of work.
deleting the extra space after periods so i can stay relevant, yeah.
Yes, they do.
C doesn't have voltage or current leaks.
"How to recover lost/corrupted files from an SSD?"
+1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
Why bother though? If someone breaks in, you'll have to fix or replace your front door, even though the motion-detecting laser robots zapped him. If you just leave your front door unlocked instead, intruders can just walk in, and the laser-wielding robots can zap him, and then automatically dispose of the body for you too. This way, the intruder won't cause any damage.
Because suddenly your code becomes time-based, eg it matters WHEN x=0 becomes x=1, and what's in between.
Believe me, this kicks you in the balls really hard. I still remember the frustration on my Altera course, where in simulation everything worked fine, but once flashed onto an FPGA everything went to shit.
Dell has released updated firmware for my laptop's BIOS 17 times.
To keep out the innocent neighbor kids or the maid who comes on the wrong day. You only want to dispose of bodies that deserve it.
You'll sleep better that way.
This issue is a bit more complicated than you think.
The maid I can understand, but if your neighbor's kids are anything like mine, they're not innocent.
Aircraft (F-16 among others) flight control firmware has been updated by reprogramming UVPROMs for many years.
"This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
This seems like a very unlikely trigger for most users. In my experience, both personal and professional, I have yet to see anyone who even knows about BIOS passwords, much less about setting a password on the drive using the ATA secure drive password feature. I am surprised that anyone caught this at all, unless it was a complete fluke or there actually are people or companies using this type of feature for security. (I don't doubt it, but I haven't seen it.)
I personally own the first-generation Intel X25-M 80GB MLC SSD and have written about it extensively here on this forum. I have heard rumors that TRIM support will only be made available for the second-generation drives, but I'm unsure whether that is really true. I'm on the fence about whether to sell my G1 drive and upgrade to the G2 for that feature and a little more performance, because I am so happy with the performance of this drive and with the current 8820 firmware, which solved the fragmentation and slowdown issues.
If you are one of those folks who is still sitting around not knowing what to do when all of this Solid State Disk news is coming out all over then you are missing the biggest paradigm shift to computing performance since the transfer from floppy disks to hard drives.
With the upcoming re-release of the newly affordable Intel X25-M G2 80GB MLC SSD around 2009-08-28, at ~$230 USD from Newegg or ZipZoomFly, you should definitely dig down deep and save a little money to buy one of these drives and experience the biggest performance and responsiveness improvement to your computer that you could imagine.
If you need a primer on the SSD revolution check out my previous post regarding the articles to read.
Required Reading for Solid State Drives
Welcome to 2 weeks ago:
http://www.pcper.com/comments.php?nid=7544
Allyn Malventano
Storage Editor, PC Perspective
this sig was brought to you by the letter
So? It's just a set of different paradigms. It's just like using a different programming language. 99.9% of the time, if your code works during functional verification testing (which doesn't simulate the physics of hardware), it will work fine in timing/hardware verification and then also in real hardware (so long as you don't violate any timing constraints, which your synthesis tool will tell you about). That's one of the reasons why RTL synthesis tools like those from Cadence are so insanely expensive: they allow you to go from functional verification, which verifies the syntax and semantics of your code, to hardware verification, which allows you to ensure your design will work as expected in actual hardware. If you're getting "kicked in the balls really hard" then it's probably because you need to brush up on your VHDL/Verilog, just like if you're getting segfaults when writing C you're doing something wrong. It doesn't mean that the process is any less deterministic.
Functional simulations will only catch #1.
If you are getting segfaults in C you usually ASSUME that the processor you are running on is acting in a deterministic manner and ASSUME the problem is your code.
The DIFFERENCE is that SOMETIMES the underlying hardware is not acting deterministically because it is a PHYSICAL system that has physical flaws or imperfections. Like leakage currents that are JUST a tiny bit too much, or depend on the state of the neighboring circuit or the temperature.
In other words, I've written C code that had "segfaults" and it wasn't the fault of the C code, it was memory issues that resulted in problems. And I've written C code that suffered from a buggy compiler, too. I've also written code that "misread" about 1% of the characters typed in at the terminal, and it wasn't the code that was at fault, it was the UART.
I don't know anything about the source of Intel's problem, but I will say that they can send me ALL of the "defective" SSDs and I'll give them a home where I promise never to set a password on the disk or change it after I do.
What makes Intel a hard disk vendor anyway? Yes, it is still a disk
It's solid state mass storage, where "solid state" = "chips". A disk is a spinning thingy which is completely different. Since Intel designs and makes chips (see: "solid state" = "chips"), it is a perfect choice for them to make solid state mass storage devices out of chips.
Have I mentioned the relationship between "solid state" and "chips" and how "solid state" != "spinning thingy"?
I remember updating the HARDWARE of my modem: Changing the swamping resistors to reduce the Q of the filters and broaden the passbands so the Rx side would work at 300 as well as the original 110 baud. B-)
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Now the problem came in that case when you wanted to change/delete the password. It would use a second subroutine to do so.
That last step was the killer: it seems that someone had declared a global variable and a local variable with the same name. End result: one overwrote the other's data, and one never knew exactly what the box hashed, nor could you figure out what to key in to the screen to unlock the door (so to speak).
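A name clash like that is easy to reproduce. The following is only an analogous sketch in Python, not the code from the box in question; `salt`, `hash_password`, and `change_password_buggy` are hypothetical names. The change-password routine assigns to a local variable that shadows the global of the same name, so the stored hash is computed from a value the unlock path never sees.

```python
import hashlib

# Global meant to hold the salt used whenever a password is hashed
# (hypothetical; stands in for the shared global described above).
salt = b"factory-default"


def hash_password(password: str) -> str:
    """Unlock path: hashes with whatever the global `salt` holds."""
    return hashlib.sha256(salt + password.encode()).hexdigest()


def change_password_buggy(new_password: str) -> str:
    """Second subroutine: stores a hash for the new password."""
    # BUG: this assignment creates a *local* `salt` that shadows the
    # global one; the global is never updated.
    salt = b"per-device-salt"
    # The stored hash is computed with the local value...
    return hashlib.sha256(salt + new_password.encode()).hexdigest()


stored = change_password_buggy("hunter2")

# ...but unlocking hashes with the global value, so even the correct
# password no longer matches what was stored.
unlock_ok = hash_password("hunter2") == stored
```

Here `unlock_ok` ends up false even though the right password was typed, which is the "never knew exactly what the box hashed" situation in miniature.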
I'm sorry, I'm too tired to be witty at the moment so this message will have to do.
I'm not even going to put a foot in the flamefest over whether solid state mass storage is cost effective or even reliable - I only ask you don't call some chips that just sit there a spinning disk.
More than 1/4 of Intel's revenue comes from miscellaneous chips and motherboards that are not microprocessors. That's a big enough chunk that it shouldn't be dismissed as not a core business.
That this bug made it through means someone should be looking for employment and indicates a problem with management and internal processes, not that they shouldn't make the product in the first place.
Ask anyone who bought a JMicron-based SSD about insufficient testing. How any company thought that controller was worthy for their SSDs is beyond me.
Before I replaced mine with a Samsung SSD, my [censored] was regularly giving me stutters and pauses that lasted for 20-40 seconds at a time. It just flat-out halted everything on the computer for half a minute for no apparent reason, even while reading, not just writing. Apparently, this was predominant behavior for the controller that dominated the SSD arena until the X25 started blowing people away.
I think I understand now why Seagate, WD, and the other HD manufacturers are taking so long to get SSDs on the market. Since their market depends almost exclusively on storage, they can't afford to screw up their first SSDs. At least, I hope that's the reason. Even they have to understand that the hard drive market isn't going to last forever.
The FDIV bug wasn't fixed in firmware. There was a microcode update that worked around the problem, but it made division painfully slow. Intel's 'fix' was to recall all of the affected chips and provide replacements. It cost the company a lot of money and the story became the introduction to Andy Grove's biography.
I am TheRaven on Soylent News