Laptops With Certain NVidia Chips Failing
Eukariote writes "An estimated 18 million laptops with NVidia G84 and G86 graphics chips sold in the past one and a half years are experiencing high failure rates. Various laptop models from multiple manufacturers (Apple, Dell, HP, Lenovo, and others) are affected. NVidia blames it on bad chip packaging causing thermal failure. BIOS updates that turn the laptop fan on more frequently or permanently have been released by Dell and HP. The cynical interpretation is that this is likely to only delay the problem until the warranty has expired."
Having to run my laptop fan all of the time to account for a bad chip is an unacceptable fix. It's loud, it takes more electricity to run, and it shortens the life of the fan, and possibly the whole computer as a result.
I don't respond to AC's.
"The power of accurate observation is frequently called cynicism by those who don't have it." - George Bernard Shaw
All Nvidia G84 and G86s are bad
The short story is that all the G84 and G86 parts are bad. Period. No exceptions. All of them, mobile and desktop, use the exact same ASIC, so expect them to go south in inordinate numbers as well. There are caveats however, and we will detail those in a bit.
Both of these ASICs have a rather terminal problem with unnamed substrate or bumping material, and it is heat related. If you ask Nvidia officially, you will get no reason why this happened, and no list of parts affected, we tried. Unofficially, they will blame everyone under the sun, and trash their suppliers in very colourful language.
When the process engineers pinged by the INQ picked themselves off the floor from laughing, they politely said that there is about zero chance that NV would change the assembly process or material set for a batch, much less an EOL part.
For dessert, there's this article to finish :)
I think Apple's better quality control makes my computer immune to the problem. The Genius Bar can surely fix this and replace the computer with a new one; try that with Dell.
As detailed in this thread, the GF8400 has serious performance problems under Vista Aero when running recent driver versions. I wonder if this is related? I.e., recent driver updates may have down-clocked the GPU, leading to bad performance. Dell has, however, recently acknowledged the problem and is working on a fix.
Here are the Dell models which have BIOS updates, from TFA:
Inspiron 1420
Latitude D630
Latitude D630c
Dell Precision M2300
Vostro Notebook 1310
Vostro Notebook 1400
Vostro Notebook 1510
Vostro Notebook 1710
XPS M1330
XPS M1530
A link? Shit I own one. Dell XPS m1330; I've had the motherboard replaced twice already for video failure, and I got the thing in September of 07. Yes, that's right, replaced twice in less than a year.
The flaw is every bit as bad as everyone makes it out to be.
Sadly, it's not the laptops that are the problem. The problem apparently exists in all G84 and G86 chips, including those on desktop models.
This was reported by The Inquirer (and here, I think) a few weeks ago, but apparently the news hasn't been getting around.
http://www.xkcd.com/354/
waiting to form.
Charlie gets it right. Let's see, 18 million notebook machines. Freight each way, plus cost of labor to fix them and the materials needed. Less than $10 a machine! Great, that math stuff. Yup, a $150-200 million charge oughta do it at around $10 a machine!
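For anyone who wants to check the arithmetic above, here is a rough sketch in Python. The 18 million unit count and the $150-200 million charge are the figures from the post; the function name is made up for illustration, and spreading the charge evenly over every machine is an assumption, not something NVidia has stated.

```python
def charge_per_machine(total_charge_usd, machines=18_000_000):
    """Spread a one-time charge evenly across all affected machines."""
    return total_charge_usd / machines


# Low and high ends of the charge quoted in the post.
for charge in (150_000_000, 200_000_000):
    print(f"${charge:,} charge -> ${charge_per_machine(charge):.2f} per machine")
```

Either way it lands at roughly $8 to $11 per machine, which would have to cover freight both ways, labor, and parts.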
Hello? This is the SEC? Hey, I have a question about an 8K I saw for NVidia. It goes like this.....
---- Teach Peace. It's Cheaper Than War.
Just continue to use iTunes without the display, you pussy. Sheesh -- typical Mac user.
The article about trolling is the next one down. Easy mistake to make.
Your hair look like poop, Bob! - Wanker.
Does this have anything to do with the Xbox 360's Red Ring of Death? And do these problems, in turn, have something to do with RoHS certification, due to lead-free solders being less durable?
Nvidia has been said to have had a hand in the design of some parts of the 360, and the problem sounds like it is identical.
That said, my own laptop (a Dell Inspiron 6000i) sees at least 8 hours a day of actual use, and is generally powered on at least 20 hours per day. The default fan control keeps the fan spinning all the time at smoothly varied speeds, with a heavy tendency to keep it spinning at high speed for long periods of time following heavy loads. This is very annoying to me.
Instead, I run i8kfangui, which lets me control (based on the temperature of the CPU, GPU, RAM, or hard drive) the fan's speed. It keeps dust accumulation and noise down, and works pretty well. The tradeoff is that it (by my choice) keeps the CPU in a constant and dramatic swing between 52 and 43 degrees Celsius:
The fan is simply off below 43C, then turns at low speed once the CPU reaches 52C. If it gets to 68C (which almost never happens, and is quite hot for a CPU) it spins at high speed. I find this behavior to be very preferable.
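The fan policy described above is a simple hysteresis loop, and it can be sketched in a few lines of Python. This is only an illustration, not how i8kfangui is actually implemented: the function and state names are made up, and the rule for stepping down from high speed is an assumption, since the post does not say when that happens.

```python
OFF, LOW, HIGH = "off", "low", "high"


def next_fan_state(state, temp_c, off_below=43, low_at=52, high_at=68):
    """Return the next fan state for the current state and CPU temperature (C)."""
    if temp_c >= high_at:
        return HIGH
    if state == HIGH:
        # Assumption: once the temperature is back under high_at, step down
        # to low (or straight to off if it is already under off_below).
        return LOW if temp_c >= off_below else OFF
    if state == LOW:
        # Keep spinning at low speed until the temperature drops below 43C.
        return OFF if temp_c < off_below else LOW
    # Fan is off: stay off until the temperature climbs to 52C.
    return LOW if temp_c >= low_at else OFF


def trace(temps, state=OFF):
    """Run a list of temperature readings through the policy."""
    states = []
    for t in temps:
        state = next_fan_state(state, t)
        states.append(state)
    return states


print(trace([40, 50, 52, 47, 43, 42, 70, 60, 40]))
```

The gap between the 52C turn-on point and the 43C turn-off point is what produces the slow climb and fast fall: the fan does nothing between 43C and 52C on the way up, and keeps running through that same range on the way down.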
But the point is that it is generally a slow climb to 52C, and a fast fall to 43C, over and over in an abusive thermal-stress scenario. This cycle repeats a dozen or so times per hour, 8-20 hours per day, and has done so for three years. It works fine,
The motherboard is not RoHS compliant, and so presumably was built with lead-based solder. However it seems that most new machines are built with lead-free solders, all of which seem to have various problems.
Are there any metallurgists in the house who might care to speculate on the relationship between lead-free solders and systemic failure of laptops due to heat cycling?
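To put a number on the heat cycling described above, here is a back-of-the-envelope estimate in Python. It takes "a dozen or so" as exactly 12 cycles per hour and assumes the machine runs every day of the year; both are simplifications, and the function name is just for illustration.

```python
def thermal_cycles(cycles_per_hour, hours_per_day, years, days_per_year=365):
    """Estimate the total number of heat/cool cycles over the given period."""
    return cycles_per_hour * hours_per_day * days_per_year * years


low = thermal_cycles(12, 8, 3)    # 8 hours of use per day
high = thermal_cycles(12, 20, 3)  # 20 hours powered on per day
print(f"{low:,} to {high:,} cycles over three years")
```

So the board has been through somewhere on the order of 100,000 to 260,000 swings of about 9 degrees Celsius.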
Kid-proof tablet..
Are any desktop chips affected, or only laptop chips?
But it is Nvidia's fault because they signed off on these cooling units.
That is like saying it isn't your car maker's fault if they put brakes designed for a lawnmower in your car, and that it is obviously the fault of the people making those lawnmower brakes for not making sure they can stop a much heavier car...
From what I'm reading, the issue isn't with fans not performing as expected. The issue is that at the performance rate Nvidia had them at, they simply didn't do the job needed, resulting in the GPU overheating and destroying itself.
It is entirely, 100% Nvidia's fault. If you put in substandard parts you get a substandard result.
it is not logic board, it is motherboard. And yes, that is PC term. And yes, you got one.
The HP DV2000, DV6000 and DV9000 series laptops are all affected. The BIOS updates just make the fan spin more often, that's it. HP has extended the MFG warranties to 2 years from the date of purchase. At GeekSquad/Best Buy, HP has been offering a LOT of replacements for these laptops authorized through HP, but the laptops have to be DOA and sent to service, which takes about a week to two weeks. I've sent off at least 15 HP laptops in the past 6 months for replacement/repair. I give HP some credit for at least trying to fix the problem and/or replace the whole laptops themselves. I don't know what other MFGs are doing.
Agreed. Most reference coolers (and even a lot of 3rd party ones) aren't worth the cheap plastic used to make them. When I pulled the ref cooler off my 8800GT last year I was shocked to find that the fan didn't even sit completely atop the core, and that there was a LOT of excess thermal paste and stupidly thick thermal pads. It's little surprise the card was heatsoaking to 90C after a few hours of Bioshock and crashing itself! I can only cringe in horror when I imagine something like that stuffed into a freaking laptop. Fortunately I had already planned on replacing the stock cooler (just a big heatpipe/heatsink with a 120mm fan ziptied to it) and lo and behold my card now has trouble hitting low 40's even after hours of flogging.
Long story short, all manufacturers should be held accountable for the idiotic shortcuts they take when it comes to cooling their electronics. It's kind of an important aspect of electronics, no? Why not spend a buck or two more on something that actually does the job? Till then the first thing I do with any graphics card (or CPU for that matter) is still going to be to chuck the stock cooler into my parts bin, and then look for something bigger or better.
My DELL XPS M1710 has a 7950GTX and never had any issues. The DELL BIOS does have some issues with heat management so I run i8kfan to keep heat at acceptable levels.
On top of that, did you know most new DELL laptops (confirmed on XPS and VOSTRO) won't read S.M.A.R.T.? I think heat killed my original hard drive but the BIOS wouldn't report the drive was going bad. They should fire whoever made the decision that removing this feature was an improvement.
The best test environment is production. - Me
chrome://browser/content/browser.xul
Well, unless your replaced logic board fails again, I don't think Apple would take it back for replacement, since it basically works. Unfortunately, the affected GPUs are basically the entire nVidia 8x00 line (except for desktop 8300, and all the 8800's). Very few laptops actually use the 8800M GPU (think gaming laptops), so any other replacement, even a new laptop with an nVidia chipset will likely have the problematic GPU. The other alternative is to find a laptop with an AMD/ATi or Intel GPU.
Sorry, I was distracted by the picture of the BREASTS on TFA page
No sig for you!!
Its price is the lowest since 1990 ($4.2 today); just fired its CEO; very favorable reviews for upcoming ATI4xxx GPU; troubles for NV. What do ya think?
Why is it all Nvidia's fault? Seems to me it should be a shared responsibility.
I work for a company big into mobile IC design (like NVIDIA). And I can say that it is very likely NVIDIA's fault because they (as do we), as the design company, specify every last detail of process, circuit, and package, when it comes to IC fabrication. Additionally, the company which produced these chips--TSMC--is the oldest, largest, and possibly most reliable dedicated fab company in existence. If there is a heat dissipation problem, it almost certainly stems from engineering oversight or management's corner-cutting on NVIDIA's part.
>> Standing on head makes smile of frown, but rest of face also upside down.
Unfortunately people think there's a difference between a MacBook logic board (Intel *coughs*) and an Intel motherboard. Though I'm a fan of OS X, Apple needs to give up on putting their Apple logo stickers over the original 3rd party vendors' hardware. It's a fucking PC/laptop with EFI.
Exactly. An overclocked PC can run for days on end completely stable in a room temp ~75 degrees. But if you put that desktop in an oven and get the air temp up around 150, something is gonna burn up. It really should be the OEM's responsibility for saying "Hey, your card gives out more heat than our laptop design can dissipate. We can't deploy these."
The USAF had a reliability program that ran from the mid-1960s to the mid-1980s which did quite a bit to make electronics more reliable in the field. About 1% of the USAF's "black boxes" were marked with stickers that said something like "USAF Reliability Program Unit - If unit breaks, replace entire unit and send broken unit to ... for analysis".
When broken units came into the analysis shop, a considerable effort was made to find out exactly which component had failed and how it had failed. This went way beyond normal repair. When a bad part was located, the part was opened up and examined with an electron microscope or X-rayed, as appropriate, to see exactly what had gone wrong.
The USAF would frequently publish pictures from this program in Aviation Week. You'd see pictures of bad lead joints inside an IC package, too-long internal leads that had failed under high G loads, and bad on-chip etching. Manufacturers of bad parts were named. Inspectors were sent to plants to figure out what had gone wrong with the manufacturing process. The problem got fixed or the supplier stopped getting military contracts.
This worked well when the military bought most electronic components. By the 1980s, consumer electronics were using electronics at least as sophisticated as the military, and the military had to start using "commercial, off the shelf" components. Today, the USAF has trouble getting any special attention from parts suppliers.
Auto manufacturers still do things like this. Because they have to pay for recalls, they need to find out why things break and fix the production process, even if it's at a supplier.
As long as it's working fine at the moment, there's not much you can do. If it fails repeatedly while under warranty (especially with the same problem), you're likely to be able to talk your way into a replacement computer.
Apple does have a decent history of creating repair extension programs when there's a known and particularly nasty design defect, especially when another company owns up to it being their fault. I imagine especially in those cases, they get the third party to pay for some or all of the repair costs. However, if you're really worried about it, you might consider getting AppleCare just to give yourself the three years of warranty. Of course, AppleCare, like any extended warranty, is a large profit center - but many people (myself included) decide it's better to pay a few hundred dollars upfront than to risk an expensive repair (or an even more expensive replacement) later. On the other hand, the flat-rate repair service (which I assume they still offer) isn't that much more expensive than AppleCare, so as long as you don't need more than one repair in the 2nd or 3rd year, you might be better off just risking it.
But just to be clear - I'd personally expect that many of the computers might get covered by a repair extension, which often last to 3 years beyond the date of purchase. But that doesn't help at all if your computer exhibits a symptom other than what is expected for the particular failure covered; if your CPU fails, for example, you'd be on your own without the warranty.
There is a problem with the chips, there is no doubt about that. However take anything Charlie says about it with a huge truckload of salt. There was a bit of bad blood between Nvidia and Charlie years ago (something like 4 or 5 now), and ever since they've refused to talk to anyone from the Inquirer and Charlie specifically.
It seems these days that all Charlie does is write long articles bashing Nvidia. That is unless he's writing an article that's so over the top that his editor has to pull it (yes, believe it or not, there actually is an editor in charge of all those pieces).
Go read the Dell or HP forums and EE Times. Read The Inq only if you want some amusement to see how amazingly slanted a story can be produced.
Sew, ewe think your sew grate at spelling? Well, I ewes a spell chequer sew I no every word in this comment is spelled rite.
They're not actually shipping the affected product anymore, so presumably if you get a newly enough manufactured replacement part, you won't have the problem on the new piece of equipment.
Personally, I've never used my display on my MacBookPro. The UI on OSX is so wonderful, that I do not even have to look at it. I practically imagine what I want to open, and it opens it for me! This coupled with the nice sounds, let me know when I've opened the right application. If worst comes to worst, I can just use the option key combos to start my music, to start web-browsing etc.
I've never used it, so to be honest, I don't see why anyone would want such a feature, let alone need it.