Slashdot Mirror


Inside Nvidia's Testing Facilities

An anonymous reader writes "FiringSquad has up a behind the scenes look at NVIDIA's Santa Clara HQ. In addition to the usual shots of the server farm, they spend several pages talking about the Silicon Failure Analysis Lab which is the secret to NVIDIA's success as a fabless semiconductor company. They also have shots of NVIDIA's thermal analysis lab where they run the GPUs at 40 deg C and 0 deg C, and the Performance analysis labs."

67 comments

  1. I just got an empty page reading... by Rah'Dick · · Score: 2, Funny

    "Nothing for you to see here. Please move along." What?

    1. Re:I just got an empty page reading... by f0dder · · Score: 1

      Reads like a new employees training video.. without the music.

    2. Re:I just got an empty page reading... by hasbeard · · Score: 3, Informative

      I think that just means no one has posted yet. When you see that it's your opportunity to be the first to say something.

    3. Re:I just got an empty page reading... by JonathanR · · Score: 1

      Meh? I never read those employee training videos.

    4. Re:I just got an empty page reading... by 45mm · · Score: 1

      An entertaining twist on the usual "frist p0st!" post ... good show!

    5. Re:I just got an empty page reading... by CCFreak2K · · Score: 1

      It actually means that the story is published (or almost published), but only subscribers can see it. Sometimes the first poster uses it for humour (in this case, they might make a quip about nVidia video cards).

      --
      "Beware of he who would deny you access to information, for in his heart he dreams himself your master."
  2. Excellent Article by Anonymous Coward · · Score: 3, Interesting

    An excellent Article! Finally a change from the mundane 'IT Cable Puller Assembles Software System to blah blah blah'. Great to know that people are interested in what real engineers are doing. Of course I do like the props given to the NVIDIA IT folks that keep everything humming nicely.

    1. Re:Excellent Article by rvw14 · · Score: 2, Funny

      Good article, I especially loved the ATI ads that showed up on every page.

  3. I have an idea.. by Anonymous Coward · · Score: 0, Flamebait

    you can take a .22 revolver, put it to your head, pull the trigger, and not go at all..

    1. Re:I have an idea.. by digital_rich · · Score: 1

      This is not a Republican hate forum. I don't think he was directing his hatred at Republicans.
    2. Re:I have an idea.. by Anonymous Coward · · Score: 0

      It is quite obvious he was. Don't you understand what a subset is?

  4. is it just me by Anonymous Coward · · Score: 0

    or is this new comment system completely useless ?

    1. Re:is it just me by Anonymous Coward · · Score: 0

      No, just the commenters.

  5. Where are the disco sofa's and pinball machines? by heroine · · Score: 4, Insightful

    All this renewed interest in corporations has us wanting our dot com parties back. They didn't mention the on-site oil changes. Interesting that the most valuable part of these companies is the lowest paying part: the QA lab. And the QA lab is still powered by 100Mbit ethernet.

    Then of course many of us thought runaway housing inflation would force these companies to think about moving elsewhere like, say, Pleasanton. Wrongo. Even with 4x more expensive rents than 2000, Silicon Valley is still the king of corporate headquarters.

  6. Tighten up the graphics testing facility by graviplana · · Score: 3, Funny

    NVIDIA Tech: Johnson, you've been playing that game for hours, how's it going? NVIDIA Tech 2: We just finished level three and need to tighten up the graphics a little bit. NVIDIA Tech: Great! http://youtube.com/watch?v=j9COTOUH4qU&mode=related&search=

    --
    "Time is nothing; timing is everything."
  7. Did anyone else... by Icarus1919 · · Score: 1

    Read that as Santa Clause HQ? Man, maybe I'm catching the Christmas spirit or something. They're already selling the crap in stores.

    1. Re:Did anyone else... by JonathanR · · Score: 3, Funny

      No. You are special.

    2. Re:Did anyone else... by Anonymous Coward · · Score: 0

      Unlike you, Johnathon, who are a putrid unremarkable wretch. Go fuck yourself.

    3. Re:Did anyone else... by JonathanR · · Score: 1

      So unremarkable, that you had to leave a remark, eh?

  8. why use Intel Clovertowns when they have there own by Joe+The+Dragon · · Score: 4, Interesting

    why use Intel Clovertowns when they have there own real good chipsets for AMD servers / work station systems?

  9. missing on tour by Anonymous Coward · · Score: 0

    They missed the janitor's closet with the monkey typing on a keyboard where the linux drivers are programmed!

  10. Feline body temperature?? by jwiegley · · Score: 4, Funny

    I'm puzzled as to what is so "extreme" about 40C? My cat's temperature runs just slightly less than that and it purrs along quite nicely (literally).

    --
    I will never live for sake of another man, nor ask another man to live for mine.
    1. Re:Feline body temperature?? by jjeffries · · Score: 4, Funny

      Well then rest assured that if you wanted to implant a GPU in your cat, the Nvidia would handle your pussy's heat (other problems notwithstanding.)

    2. Re:Feline body temperature?? by skoaldipper · · Score: 1

      Hell, my 8800GTS idles at 58C or ~140F. No problemo. That 40C testing lab must be where staff employees hang meat for the cafeteria.

      In my book, if it don't melt, more fps CAN be felt.

      --
      I hope, when they die, cartoon characters have to answer for their sins.
    3. Re:Feline body temperature?? by mikael · · Score: 1

      According to the "Gnome sensors applet" on my laptop, idle temperature for the CPU/GPU is 61C. Running any type of GPU application can push the temperature up to the high 80s. Above 91C, the system shuts down.

      --
      Vintage computer adverts: http://www.vintageadbrowser.com/computers-and-software-ads
    4. Re:Feline body temperature?? by aliquis · · Score: 1

      Mine had an epileptic seizure 2 or 3 days ago :(

      Must have been a bad circuit or something, anyone know how to underclock or raise the voltage (ok, many people will have ideas for that) of a cat?

    5. Re:Feline body temperature?? by Yetihehe · · Score: 1

      I have a gf6800. Once, while I was gaming, the fan died. When I exited the game, I only got a warning that my card could be overheating. Indeed, the control panel showed 120C. And shutdown was set to 138.

      --
      Extreme Programming - Redundant Array of Inexpensive Developers
  11. 40 deg C? by JimboFBX · · Score: 2, Informative

    40 deg C? So what is that, 104 degrees Fahrenheit? That's not very taxing at all. Doesn't my laptop pull in 80 deg C?
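
    For anyone checking the conversion, here is a minimal sketch of the standard Celsius/Fahrenheit formulas. The function names are made up for illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def f_to_c(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32.0) * 5.0 / 9.0


# The two temperatures mentioned in the comment above.
for temp_c in (40.0, 80.0):
    print(temp_c, "C =", c_to_f(temp_c), "F")
```

    So 40 deg C is indeed 104 F, and 80 deg C works out to 176 F.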

    1. Re:40 deg C? by quanticle · · Score: 2, Informative

      Doesn't my laptop pull in 80 deg C?

      Given that most processors shut down to prevent thermal damage at around that temperature, I'd think not. The shutdown threshold of a P4 (one of the hotter running chips of late) was around 78C, so I'd think that 80C is a bit high.

      That said, I do think that 40C is a pretty low bar to pass. Given that my P4 idles at around 48-50C, I'm surprised that they consider 40C to be an "average" test environment.

      --
      We all know what to do, but we don't know how to get re-elected once we have done it
    2. Re:40 deg C? by gt_Peter · · Score: 1

      They probably mean -40C, not +40C. Industrial temperature range is typically -40C to 100C. Commercial is 0C to 85C give or take a bit. I work in labs like this so it was interesting to check out their test setups. Nice to see they look as cluttered and disorganized as we do.

    3. Re:40 deg C? by Osty · · Score: 1

      40 deg C? So what is that, 104 degrees Fahrenheit? That's not very taxing at all. Doesn't my laptop pull in 80 deg C?

      That's ambient temperature. My laptop runs around 30C when ambient temperature is around 18-22C and 40C when ambient temperature is 32-35C, average load (it pushes 50C in 18-22C ambient at full load). I can reasonably assume it would run around 50C under average load if ambient was 40C, and 70C or higher under full load. Depending on the chip and laptop, that may be acceptable or it may be way out of range.

    4. Re:40 deg C? by mbessey · · Score: 4, Informative

      40 degrees C is a sort-of standard for "elevated ambient" testing of electronics. The point of testing at higher temperatures is mostly to ensure that heat transfer out of the chips is sufficient at that temperature to keep them from overheating. The chips themselves will likely be running at much greater temperatures internally, but as long as the heat sinks are efficient enough, the chips shouldn't overheat.

      For consumer electronics, I guess the assumption is that if it's 40 degrees in your room, you're going to go find somewhere cooler to be, rather than sitting there with your PC blowing hot air on you.

      In other industries, the standards are different. Many products designed for use in an automobile are tested at 50-60 degrees, which is closer to the interior temperature of a car in full sun in a temperate climate.

    5. Re:40 deg C? by Jarjarthejedi · · Score: 2, Interesting

      "For consumer electronics, I guess the assumption is that if it's 40 degrees in your room, you're going to go find somewhere cooler to be, rather than sitting there with your PC blowing hot air on you."

      I'm sure that's a good assumption in many situations, but I've sat outside on my computer during the day a few (read: every Friday since school started back up) times this year when the temp was over 110 F. I was out there when it was 117 F running along just fine for almost 20 minutes before my class opened up.

      It's not a bad assumption, in general the amount of time a computer's going to be running in >104 F is very small, but it's not exactly impossible.

      --
      There are two kinds of fool One says 'This is old therefore good' Another says 'This is new therefore better'- Dean Ing
    6. Re:40 deg C? by WhoBeDaPlaya · · Score: 1

      That's _ambient_. The GPU itself is going to be some deltaT above that.
      Since cooling solutions have an effective degC/W ratio, let's say deltaT = 20C. So testing at 27C ambient = 47C GPU, 40C ambient = 60C GPU.
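
      A rough sketch of that arithmetic, assuming a simple steady-state model where the rise above ambient is power times the cooler's degC/W figure. The 80 W and 0.25 degC/W values are hypothetical, picked only so that deltaT comes out to the 20C used above; they are not measured numbers for any real card.

```python
def gpu_temp_c(ambient_c, power_w, theta_c_per_w):
    """Estimate steady-state GPU temperature from ambient temperature.

    theta_c_per_w is the cooler's effective thermal resistance in degC per watt.
    """
    delta_t = power_w * theta_c_per_w  # temperature rise above ambient
    return ambient_c + delta_t


# Hypothetical values chosen so that deltaT = 20C, matching the example above.
POWER_W = 80.0
THETA_C_PER_W = 0.25

for ambient in (27.0, 40.0):
    gpu = gpu_temp_c(ambient, POWER_W, THETA_C_PER_W)
    print(ambient, "C ambient ->", gpu, "C GPU")
```

      The point is that under this model deltaT stays fixed for a given cooler and power draw, so raising the ambient temperature by 13C raises the GPU temperature by the same 13C.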

    7. Re:40 deg C? by Kelz · · Score: 1

      There's a difference between thermal analysis and environmental stress testing.

    8. Re:40 deg C? by AdamHaun · · Score: 1

      Automotive products that don't go in the cabin are tested at much higher temperatures than that. I work on microcontrollers for antilock brakes, and we test at 125C and -40C.

      --
      Visit the
    9. Re:40 deg C? by Anonymous Coward · · Score: 0

      I was an environmental test tech at a car company in 2003 and we did all tests at -40 C and 80 C. As a side note at 80 C with all devices activated a 2002 Hyundai junction box would burst into flames. It looked cool;)

    10. Re:40 deg C? by JimboFBX · · Score: 1

      My Inspiron 8000 would pull over 80 deg C often, and would commonly be above 70 deg C using a fan-hack I downloaded for it. It could be the sensor is simply inaccurate and the fans and such were all adjusted to compensate when they shipped this out, or maybe it was closer to the CPU and GPU than a thermal sensor typically is.

  12. Re:why use Intel Clovertowns when they have there by georgewilliamherbert · · Score: 3, Informative

    The article greatly oversimplified the compute HW setup. Nvidia has a many-thousand-node computational grid with servers across a wide variety of size tiers for different job types (mostly chip design/validation). Stuff is tested pretty extensively prior to mass purchase, and what's running a given size tier depends a lot on combinations of demand scheduling and HW vendor model rollout scheduling, both in CPUs and the boxes they sit in.

  13. Re:why use Intel Clovertowns when they have there by ahdemus · · Score: 1

    It is "their" and not "there".

  14. Driver testings? by antdude · · Score: 1

    I noticed their later drivers are seriously having problems in Windows. Linux seems fine to me, but the Windows drivers' quality is getting worse and worse. I remember the 8x.xx versions were pretty stable and I had very few issues with them. NVIDIA needs to get its act together on its drivers. Good hardware, but bad software (Windows) these days.

    --
    Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
    1. Re:Driver testings? by cibyr · · Score: 2, Interesting

      What's worse, there doesn't seem to be any mechanism to report driver bugs to nVidia. I suppose you just have to hope they notice it and fix it in the next release.

      --
      It's not exactly rocket surgery.
    2. Re:Driver testings? by antdude · · Score: 1

      There are official forums, but it seems like NVIDIA doesn't care. It is REALLY frustrating. I think I will be going back to ATI/AMD for my next video card.

      --
      Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
    3. Re:Driver testings? by Anonymous Coward · · Score: 0

      nvidia drivers are horrible. What's worse is that when you get a new one they tell you to click through the warning screen that says something to the effect that "Microsoft has not certified this driver yet!". (Certification involves running MS's big enormous driver test suite on it.) They rush these things to market, usually not testing them adequately themselves and certainly not taking the time to get through MS's test suite either. As a result their end users see blue screens of death all the time. This is true of a lot of driver makers, but especially of nvidia. I would guess that it's because there is more pressure to get these things to market quickly and they are more complex than most other kinds of drivers. Maybe the linux drivers are better because there isn't as much pressure to get them out the door quickly.

      I remember seeing a statistic from a presentation by one of MS's lead/head/director/whatever engineers a couple of years ago that said something to the effect that nvidia's drivers are responsible for over 80% of windows blue screens. I just spent the past 15 minutes trying to find it online somewhere so I could cite it. I guess no one has ever said something to that effect publicly though.

    4. Re:Driver testings? by jorghis · · Score: 1

      If I wasn't at the same presentation parent was at, I was at one just like it. I have listened to at least half a dozen people who work on the windows kernel say this exact same thing too.

      I saw a picture of a lame motivational poster MS had up in one of their development buildings once that said something to the effect of "do your part to make windows more reliable..." (not quite as cheesy but you get the idea) underneath those words some MS engineer had scrawled the words "..kill an nvidia programmer".

      At the very least there are some people at MS who want to blame nvidia for a lot of blue screens, although it certainly sounds like they deserve it.

    5. Re:Driver testings? by aliquis · · Score: 1

      Yeah, because ATI are well known for their quality drivers! Bla bla reboot gpu fucked up error on radeon 9xxx.

    6. Re:Driver testings? by makomk · · Score: 1

      The last two releases of their Linux driver (100.14.11 and 100.14.19) haven't worked reliably for me; I kept getting system crashes and display corruption. Unfortunately, previous releases are incompatible with Xorg 1.4 and it'd be a pain to downgrade. (Since I don't really need 3D under Linux and I've got a 7300, I'm using Nouveau - it's more stable and the 2D acceleration is much better than the old nv driver.)

    7. Re:Driver testings? by antdude · · Score: 1

      The newer driver versions seem better than in the past to me, even on old Radeon 9x00 AIW cards.

      --
      Ant(Dude) @ Quality Foraged Links (AQFL.net) & The Ant Farm (antfarm.ma.cx / antfarm.home.dhs.org).
  15. Re:why use Intel Clovertowns when they have there by blahplusplus · · Score: 1

    His memory misfired, I hate anal retentive people. Just get a grip and read between the lines. Some peoples memories are not as good and there is no edit function on slashdot.

  16. real-time simulation?! by Anonymous Coward · · Score: 0

    I noticed a part of the article that states that their compute cluster can simulate their hardware in real-time. I wonder if it's a cycle-accurate simulation, and how they write the simulator! It's not that easy to write a fast simulator, even if you can throw massive hardware at the problem.

    1. Re:real-time simulation?! by Bill+of+Death · · Score: 2, Insightful

      No, they cannot simulate the hardware designs anywhere near real-time. Using hardware emulation, they can run real software on real PCs, but still not near full speed.

  17. Futureless Nvidia? by Anonymous Coward · · Score: 0

    I have trouble seeing a future for Nvidia and AMD in the GPU business. I'm not sure there's going to be any market left for the GPU within five years. Intel and IBM are serious about their ray tracing ambitions. Larrabee and the proposed 2009/2010 Cell processor are going to be very capable ray tracing parts. IBM's interactive ray tracer is pretty impressive on today's Cell processor, even only using 6 SPUs on the PS3. I'm sure Nvidia will continue to exist as a company as they've already diversified into other components, but I think the future of graphics accelerator parts belongs to Intel and IBM.

  18. Nvidia by returnofjdub · · Score: 1

    Fabless? Or Fabulous?

  19. Who cares? by HeroreV · · Score: 0, Troll

    ATI is releasing specs, and Nvidia isn't, so why should I care about Nvidia? I'm building a new computer soon, and it will definitely have an ATI graphics card (unless Nvidia also promises soon to release specs).

  20. Re:Where are the disco sofa's and pinball machines by spxero · · Score: 1

    Uh... houses are cheaper in Pleasanton than those near Silicon Valley?

  21. Sorry John, I just... by Anonymous Coward · · Score: 0

    spilled coffee on the keyboard! Will that hurt much?
    http://www.firingsquad.com/media/article_image.asp/2253/20

  22. Unisys? by Anonymous Coward · · Score: 0

    I thought Nvidia used a large cluster of Sunfire servers for their chip simulations.

    Anyway, remember kids: NVIDIA is proprietary and undocumented hardware. Buy ATI/AMD instead!

    Glass

  23. Re:why use Intel Clovertowns when they have there by Wavicle · · Score: 1

    why use Intel Clovertowns when they have there own real good chipsets for AMD servers / work station systems?

    1) Quad core
    2) Fast quad core
    3) They didn't build their cluster just last month

    --
    Education is a better safeguard of liberty than a standing army.
    Edward Everett (1794 - 1865)
  24. To test subjects... by CCFreak2K · · Score: 1

    "At the Enrichment Center, we believe that a highly-motivated test subject can carry out rather complex tasks while enduring the most intense pain, so in case you don't make it through the testing...goodbye!"

    --
    "Beware of he who would deny you access to information, for in his heart he dreams himself your master."
  25. I must have missed it! by jshackney · · Score: 1

    There was an article buried in all those ads? My ADD addled brain couldn't get past the second page.

  26. Mod is wrong: Real News Please by Anonymous Coward · · Score: 0

    It's not a troll to say an article isn't news worthy.
    That this article polled a measly 61 comments says it should never have been posted.
    Where does Slashdot get its mods+editors from these days?

  27. Re:why use Intel Clovertowns when they have there by Anonymous Coward · · Score: 0

    Some peoples memories

    "people's".

  28. vista bluescreen by nester · · Score: 1

    All that QA and they still can't get my 6800+ to run under vista w/o immediately blue screening.

  29. That's true by mbessey · · Score: 1

    I may have overstated things a bit when I said that verifying heat transfer was the primary purpose of testing at elevated temperatures. It's an important thing to verify though. Depending on the level of analysis performed beforehand, you'll have greater or lesser confidence that the product won't overheat, but nothing beats actually sticking a thermometer in/on the device and checking.

    Of course in addition to that, you've got all sorts of other issues to look for - differential expansion causing components to separate, solder cracking, adhesives flowing when they shouldn't, verifying that the battery safety circuits work correctly, etc, etc.