Faulty Chips Might Just be 'Good Enough'
Ritalin16 writes "According to a Wired.com article, 'Consumer electronics could be a whole lot cheaper if chip manufacturers stopped throwing out all their defective chips, according to a researcher at the University of Southern California. Chip manufacturing is currently very wasteful. Between 20 percent and 50 percent of a manufacturer's total production is tossed or recycled because the chips contain minor imperfections. Defects in just one of the millions of tiny gates on a processor can doom the entire chip. But USC professor Melvin Breuer believes the imperfections are often too small for humans to even notice, especially when the chips are to be used in video and sound applications.' But just in case you do end up with a dead chip, here is a guide to making a CPU keychain."
I'd rather my chip works as advertised.
It may seem that there's a simple linear trade-off between over-the-top quality control at high cost and more economical quality control at lower cost. However, if these chips turn out to be more likely to have defects, and do in fact fail in the future, how long will costs remain low? The chip will still be useless and will have to be replaced. Add to that the cost of handling returns from the customer or store, and the possible customer dissatisfaction with the company's quality, which could result in lost sales in the future. Will it actually be cheaper in the long term?
Then why not have analogue processors instead of digital processors? Seriously - they're much faster than digital switches.
The only reason for moving to digital switches was accuracy - the first digital bit-flipper processors were far more expensive than valve technology was in the 1950s and 1960s. And that really was the only reason for changing to digital processors.
"It's not your information. It's information about you" - John Ford, Vice President, Equifax
LCD manufacturers routinely put defective screens on the market, on the premise that a dead pixel here or there "won't be noticed". Too bad, because consumers do notice and do tend to return the product equipped with the dodgy screen, only to be told that it's "normal".
In short: computers suck...
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
The CPU vendors are already doing a 'sort and grade' operation, when they label processors. Have been for years. When the yield from the fab is lower-grade, the dies get packaged and labelled as lower-speed parts.
Then the Overclockers come in and ramp the speed back up, and claim 'the faster chips are a ripoff' and complain that 'Windows is always crashing.'
Many of the chips fail inspection prior to going into the package, and then some more fail functional test after that. Probably more than half the price of a chip is the factory itself and the R&D work which is amortized over so many zillions of parts, and much of the rest is all the handling, packaging, shipping, and middlemen. I'd guess less than 10% is per-part materials and labor.
Therefore throwing away a $2 chip during production doesn't cost $2. It's only worth $2 by the time the customer pays for it.
Sure you could sell the defects at some discount, but it's only worth the trouble for some high volume part like RAM where defects are easily usable, and definitely NOT a part where the impact of some particular defect in the end user's application could be really hard to characterize (like a CPU).
...They'd already be doing it.
Please remember that this is the same industry that came up with the 80486SX when they were having lousy yields on 80486DX chips. If these processors had any utility, trust me, they'd find a way to make money off 'em.
Basically, the problem is this. With mechanical and analogue devices, most of the time you know that if you change the inputs a small amount, the outputs will change a small amount.
But digital devices are chaotic. Change one bit in the input, and the output is likely to be radically different. One bit in the wrong place on a Windows system can make the difference between Counterstrike and a BSOD.
You can use substandard devices for some applications; dodgy RAM, for example, can be used to store audio on, and it would work just as well for video framebuffers. But you could never put anything programmatic on it; that has to be perfect.
(IIRC, they do recycle faulty wafers. One of the ways is to scrape the doped layer off and turn them into solar cells. I don't know if they can use them again for ICs, though.)
If you go and buy a handful of 5% resistors, you will find ~0 that are within 2% of their value - if you buy 2%, none w/in 1%, etc...
Manufacturers are VERY aware they can charge a larger premium for better parts
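To make the sort-and-grade point concrete, here is a rough simulation of a factory that measures every resistor and sells each one in the tightest tolerance bin it qualifies for. The Gaussian spread (sigma of 2.5% of nominal), the 1 kΩ nominal value and the three bins are made-up assumptions for illustration, not figures from any real production line.

```python
import random

def bin_resistors(nominal, measured, tolerances=(0.01, 0.02, 0.05)):
    """Put each part in the tightest tolerance bin it qualifies for."""
    bins = {t: [] for t in tolerances}
    rejects = []
    for r in measured:
        error = abs(r - nominal) / nominal
        for t in sorted(tolerances):
            if error <= t:
                bins[t].append(r)  # sold at the best grade it passes
                break
        else:
            rejects.append(r)      # outside even the loosest bin
    return bins, rejects

random.seed(0)
nominal = 1000.0
# Assumed process spread: Gaussian, sigma = 2.5% of nominal (hypothetical).
parts = [random.gauss(nominal, 0.025 * nominal) for _ in range(10_000)]
bins, rejects = bin_resistors(nominal, parts)

five_percent = bins[0.05]
closest = min(abs(r - nominal) / nominal for r in five_percent)
print(f"5% bin holds {len(five_percent)} parts; the best one is off by {closest:.2%}")
```

Under these assumptions the 5% bin is exactly what is left after the 1% and 2% parts have been creamed off, so nothing in it can be closer than 2% to nominal.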
If you look at what the "big ticket" items are in the US economy, electronics and medicine are up at the top of the list.
And the reason for this is that, as you get closer to perfection, each step costs more and more in money, resources, time or effort. For a computer or a medicine to go from 90 percent to 99 percent utility means a tenfold increase in price.
That's why the constant quest to have "perfect" electronics and medicine is driving up the prices of these things to the point where normal people can't afford them. If we could accept that we didn't always need new, perfect, shiny medicines and electronics, it would put them in a sane price range.
Hopefully I didn't put any [] around my words.
Faulty Chips can be used to generate "true" random numbers.
but anything like a graphic chip is going to be too complex to handle.
Depends...
Graphics chips these days have multiple pipelines, and are shipped in variants with different numbers of pipelines. If you can build a board that lets you use (say) any two pipelines out of a 4-pipeline chip, then you can use more of the defective chips. Similarly, if you're making MP3 chips, and their FM radio or LCD subsystems fail, you sell them to Apple to put in the iPod Shuffle...
The thing is, defective chips are already sorted into bins like this. Processors are binned by clock speed... buy a low-speed CPU and it could well have come from the same run as its higher-speed cousin. Memory has mechanisms to allow for a certain number of bad cells. It wouldn't surprise me at all if some 2-pipeline GPUs are 4-pipeline versions that failed the 3rd or 4th pipeline.
I don't know how much headroom is left.
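The pipeline-binning idea above can be sketched in a few lines. This assumes, as the comment does, that the board can route around any failed pipeline, and that the only products on offer are a 4-pipeline and a 2-pipeline part. `grade_gpu` and the test results fed to it are hypothetical, not any vendor's actual flow.

```python
def grade_gpu(pipeline_ok, skus=(4, 2)):
    """Return the largest SKU (pipeline count) this die can ship as, or None.

    pipeline_ok: one boolean per pipeline, True if it passed functional test.
    Assumes any subset of working pipelines can be used (hypothetical board).
    """
    working = sum(pipeline_ok)
    for count in sorted(skus, reverse=True):
        if working >= count:
            return count
    return None  # not enough working pipelines for any product

dies = [
    [True, True, True, True],      # perfect die
    [True, False, True, True],     # one dead pipeline
    [False, True, False, True],    # two dead pipelines
    [True, False, False, False],   # three dead pipelines
]
for die in dies:
    print(die, "->", grade_gpu(die))
```

Note that the die with one dead pipeline still ships as a 2-pipeline part, with a good pipeline left unused. That is the sense in which the defect has already been "binned away" and there may not be much headroom left.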
For most applications, the specific resistance isn't all that picky. 5% is often good enough. Also, it's often not even the absolute value that's important, but the relative value. You have a device with 3 channels, each with a 1k resistor. It doesn't matter that the resistors are exactly 1k; it matters that they are all the same value, and somewhere around 1k, etc.
However, that's not true of the digital world. It is important that my processor gets the right answer to a calculation every time, all the time. It is important that the data stored in RAM is always accurate. If any of these fail, well, it can fuck things up and you can't predict what. Maybe it's the least significant bit of a sample in an audio file and I never know. Maybe it's a bit in the address of a jump in a driver interrupt and it brings the whole system crashing down.
So while I'm not really worried whether all the resistors in my power supply are precisely to spec (who cares if it produces 11.5V instead of 12V?), I am VERY concerned that my CPU might ever give me anything but a completely accurate and predictable result.
Also, it can make a difference in the analogue domain too. The military is picky for a reason. If a TV fails, no big deal. If an F-16 fails, that's a big deal. However, on a more mundane level you'll find milspec parts in use. I built a headphone amp using all 1% (or better) milspec resistors. Why? Well, they sound better. The design (metal film instead of carbon) has better audio characteristics, their resistance changes less with temperature, and the closer matched they are, the closer the output of the channels of the amp are.
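The audio-sample versus jump-address contrast from this comment is easy to put numbers on. The values below (a 16-bit sample of 12345, a jump target of 0x00401A30, a fault in bit 12) are arbitrary picks for illustration; only the sizes of the two errors matter.

```python
def flip_bit(value, bit):
    """Return value with a single bit inverted."""
    return value ^ (1 << bit)

# Case 1: least significant bit of a 16-bit audio sample (arbitrary value).
sample = 12345
bad_sample = flip_bit(sample, 0)
sample_error = abs(bad_sample - sample) / 32768  # fraction of full scale
print(f"sample {sample} -> {bad_sample}, error {sample_error:.4%} of full scale")

# Case 2: one bit of a jump target (hypothetical address, hypothetical fault).
jump_target = 0x00401A30
bad_target = flip_bit(jump_target, 12)
print(f"jump {jump_target:#010x} -> {bad_target:#010x}, "
      f"lands {abs(bad_target - jump_target)} bytes away")
```

Same fault, same size, but in the first case the result is inaudible and in the second the CPU starts executing whatever happens to live a page away.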
There have been moments in DRAM history when devices were made that were configured in some way during final test to work around bad spots. IBM did it for a while in the 1980s, I think. But with 90+% yields, it's not worth the added switching you need on chip to allow that. You could, in theory, use heavy ECC to tolerate a substantial defect rate. That's how CD-ROMs work, after all. But it's not necessary yet.
For a while, there was a market for DRAM with bad spots for use in telephone answering machines.
This is an idea that resurfaces periodically in semiconductor history, but historically, the yields have always come up to acceptable levels.
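For anyone wondering what "use ECC to tolerate defects" looks like, here is a toy sketch using a Hamming(7,4) code, which corrects any single bad bit in a 7-bit word. Real memory ECC and the CD-ROM codes mentioned above are much heavier than this; the stuck-cell model (one cell per word that always reads back a fixed value) is an assumption for illustration.

```python
def hamming74_encode(d):
    """Encode 4 data bits as 7 bits: positions 1..7 = p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct at most one wrong bit, then return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

def read_through_stuck_cell(codeword, stuck_index, stuck_value=0):
    """Model a defective cell that always reads back stuck_value."""
    faulty = list(codeword)
    faulty[stuck_index] = stuck_value
    return faulty

data = [1, 0, 1, 1]
stored = hamming74_encode(data)
for cell in range(7):
    readback = read_through_stuck_cell(stored, cell)
    print(f"stuck cell {cell}: recovered {hamming74_decode(readback)}")
```

The price is 7 cells for every 4 bits of data plus the correction logic, which is the "added switching" trade-off: with 90+% yields it is cheaper to throw the bad die away.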
What is this guy a professor of? As others have noted, this isn't very likely to work in practice. It's not even good enough for an answering machine if it compresses the audio. Any good compression method is likely to be tripped up by even one bad bit. After all, the goal of compression is to make every bit count! In the case of CPUs, it doesn't seem likely that a random stuck bit is going to be innocuous. The quoted example of an LSB stuck on an adder is very contrived. The arithmetic adder is probably less than 1% of a CPU's real estate. And again, even an LSB error is going to be unacceptable if any compressed or encrypted data goes through the adder, which is extremely likely these days. And let's not forget programs like compilers and linkers, which use the adder to calculate things like addresses. Off by a bit isn't going to cut it for a very large range of applications. And this guy got $1M to research this hare-brained idea? Sheesh.
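The compression point can be checked with a quick experiment. This sketch uses Python's zlib as a stand-in for whatever codec an answering machine would really use (an assumption; a lossy audio codec may fail differently), and the message text is made up. It flips one bit in the raw data, then tries the same single-bit fault at every byte of the compressed stream.

```python
import zlib

def flip_bit(blob, byte_index, bit):
    """Return a copy of blob with one bit inverted."""
    damaged = bytearray(blob)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)

def count_wrong_bytes(a, b):
    """Number of byte positions where a and b differ."""
    return sum(x != y for x, y in zip(a, b))

message = b"please call me back after five o'clock. " * 50  # made-up payload
packed = zlib.compress(message)

# Raw storage: one bad bit damages exactly one byte.
raw_errors = count_wrong_bytes(message, flip_bit(message, len(message) // 2, 0))

# Compressed storage: try a single-bit fault at each byte of the stream.
unrecoverable = 0
for i in range(len(packed)):
    try:
        ok = zlib.decompress(flip_bit(packed, i, 0)) == message
    except zlib.error:
        ok = False  # stream rejected as corrupt
    if not ok:
        unrecoverable += 1

print(f"raw: {raw_errors} wrong byte out of {len(message)}")
print(f"compressed: {unrecoverable} of {len(packed)} fault positions lose the message")
```

zlib at least notices the damage through its checksum and refuses to decode; a format without integrity checks would hand back garbage instead.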