Slashdot Mirror


Too Much Gold Delays World's Fastest Supercomputer

Nerval's Lobster writes "The fastest supercomputer in the world, Oak Ridge National Laboratory's 'Titan,' has been delayed because an excess of gold on its motherboard connectors has prevented it from working properly. Titan was originally turned on last October and climbed to the top of the Top500 list of the fastest supercomputers shortly thereafter. Problems with Titan were first discovered in February, when the supercomputer just missed its stability requirement. At that time, the problems with the connectors were isolated as the culprit, and ORNL decided to take some of Titan's 200 cabinets offline and ship their motherboards back to the manufacturer, Cray, for repairs. The connectors affected the ability of the GPUs in the system to talk to the main processors. Oak Ridge Today's John Huotari noted the problem was due to too much gold mixed in with the solder."

35 of 111 comments

  1. I'll fix it by Anonymous Coward · · Score: 4, Funny

    Just give it to me and I'll get rid of the excess gold

    1. Re:I'll fix it by sarysa · · Score: 2

      When you do fix it, be sure to replace it with bitcoin.

      --
      Charisma is the measure of someone's ability to lie with a straight face.
  2. Which is another way of saying not enough lead. by Anonymous Coward · · Score: 5, Interesting

    ROHS strikes again

    1. Re:Which is another way of saying not enough lead. by bobbied · · Score: 4, Insightful

      Mod Parent up!

      If I ever get my hands on the guy who had this crazy idea of taking lead out of solder... Huge mistake, even with the environmental issues... /P?

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
    2. Re:Which is another way of saying not enough lead. by Sponge+Bath · · Score: 2, Informative

      You are blaming a manufacturing defect on environmental regulation? It is possible to make RoHS products that work correctly. Companies do it every day.

    3. Re:Which is another way of saying not enough lead. by Anonymous Coward · · Score: 2, Insightful

      but almost everything, medical, industrial, military, aviation, aerospace etc. basically everything that _just has to work_ is exempt
      there's a reason for that

    4. Re:Which is another way of saying not enough lead. by asserted · · Score: 4, Insightful

      ...except that the report linked from the article examines the problem of gold embrittlement of the tin-lead (63% Sn - 37% Pb) alloy. go figure.

    5. Re:Which is another way of saying not enough lead. by Anonymous Coward · · Score: 5, Informative

      The lead-free solder has cost billions in failures.

      http://en.wikipedia.org/wiki/Whisker_(metallurgy)
      http://nepp.nasa.gov/WHISKER/

      NASA lost satellites because of lead-free solder (despite them requesting leaded solder). The funny thing is, leaded solder completely prevents whisker formation.

      Now, you may not care about whiskers if you just throw away your electronics every year or two, but if you want longevity, these things will kill you. So for lead-free solder preventing pollution? We are producing much more garbage now thanks to whisker-caused short circuit failures.

    6. Re:Which is another way of saying not enough lead. by __aaltlg1547 · · Score: 5, Informative

      Yep. Gots to pay attention. Thick gold on the connector-to-connector contacts is best, but don't plate it onto the solderable end of the connector, or on the pads or in the through holes. Actually, a tiny amount is good because it prevents corrosion before you have the part soldered on, but it has to completely diffuse into the solder to avoid making a non-conductive boundary layer. If there's too much to diffuse, you're screwed. You'd think the engineers at Cray would know this.

    7. Re:Which is another way of saying not enough lead. by __aaltlg1547 · · Score: 4, Insightful

      The lead-free solder has cost billions in failures.

      http://en.wikipedia.org/wiki/Whisker_(metallurgy) http://nepp.nasa.gov/WHISKER/

      NASA lost satellites because of lead-free solder (despite them requesting leaded solder). The funny thing is, leaded solder completely prevents whisker formation.

      Now, you may not care about whiskers if you just throw away your electronics every year or two, but if you want longevity, these things will kill you. So for lead-free solder preventing pollution? We are producing much more garbage now thanks to whisker-caused short circuit failures.

      I agree with everything except the part where that has something to do with gold contamination in solder joints.

    8. Re:Which is another way of saying not enough lead. by tlhIngan · · Score: 4, Informative

      If I ever get my hands on the guy who had this crazy idea of taking lead out of solder... Huge mistake, even with the environmental issues... /P?

      The problem in solder is not the lead. It's the tin.

      Tin by itself forms whiskers spontaneously. Some of the worst culprits aren't in the solder but in the hardware: the tin in hardware used to mount PCBs etc. seems to whisker the most and cause problems. And plenty of research has shown which kinds of tin led to the worst problems ("bright" tin is the worst, and it was only recently that manufacturers stopped using it).

      Leaded solder suffers from whiskering as well. Anytime you use tin, you'll have whiskers. It's just a matter of time - use the wrong tin and it'll whisker quickly. Use the right tin and it'll whisker slowly. And it's not the result of electrochemistry, electromigration, or anything. It's just tin atoms wishing to migrate to relieve stress in the crystalline structure. They diffuse through the structure - the atoms aren't pulled locally, but from the entire bulk.

      We knew this when the first solders were created for electronics. At the time, they experimented and found lead worked "well enough". They never went to find out if there's any other substitute. Massive amounts of R&D are going on in materials science to find alternatives.

    9. Re:Which is another way of saying not enough lead. by semi-extrinsic · · Score: 2

      Agreed. My previous camera (Canon SX20) had to be repaired twice, and then replaced (with an SX40), since it came from one of the first production lines which used leadless solder. The solder was bad, so the voltage converter would stop delivering the correct voltage after about 10 000 shots.

      --
      for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
    10. Re:Which is another way of saying not enough lead. by servognome · · Score: 3, Informative

      Tin whiskers aren't the only problem. Tin-Lead solder bulk properties can allow it to relieve stress better than lead-free counterparts. This aids reliability when it comes to preventing fatigue related crack propagation.
      There are a number of process factors that also impact the reliability of a solder joint, including heating and cooling rates, flux chemistry, and the plating of the connected parts. These can affect microstructure, intermetallic formation, and void formation. Like you say, for Tin-Lead this has been studied in depth for decades; the focus on lead-free has only been going on for about 15 years.

      --
      D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
  3. gold on the grills by HPHatecraft · · Score: 2

    pimp your supercomputer?

  4. Wish by Sparticus789 · · Score: 3, Insightful

    I wish I had this problem in my life... too much gold!

    Did anyone else go right here?

    --
    sudo make me a sandwich
    1. Re:Wish by Anonymous Coward · · Score: 5, Insightful

      No you don't.

      Midas

  5. Too much gold by mydn · · Score: 3, Funny

    They realized that it had too much gold when they noticed its name was showing as "Titan, of the Shattered Sun"...

  6. Poor supercomputer by Brentyl · · Score: 5, Funny

    Too much gold never slowed down Mr. T. I pity the fool.

  7. Connector problems ? by deroby · · Score: 3, Funny

    ... I assumed everybody knew by now you should always go with Monster Cables ! ...or maybe they didn't run them in properly ?

    --
    If there is one thing to be learned on slashdot, it has to be sarcasm.
    1. Re:Connector problems ? by Eunuchswear · · Score: 2

      I'm kind of amazed they couldn't spring for a full ATLAS system, I mean, TITAN, what losers.

      --
      Watch this Heartland Institute video
  8. Re:can someone please explain by mbkennel · · Score: 4, Informative

    I'm not a chemist either but fortunately somebody who is working on it is.

    Munger also reported the problems with the connector pins, which Oak Ridge Today's John Huotari noted was due to too much gold mixed in with the solder. Gold is used for connectors because it does not oxidize quickly, and because of its high electrical conductivity; however, when mixed with solder that contains tin, the gold and tin can combine, making the combination brittle (PDF) under certain conditions. Cray is reportedly replacing the connectors to alleviate the problem.

  9. Re:can someone please explain by Nuke+Bloodaxe · · Score: 3, Informative

    Quoting from the article "Gold is used for connectors because it does not oxidize quickly, and because of its high electrical conductivity; however, when mixed with solder that contains tin, the gold and tin can combine, making the combination brittle under certain conditions."

  10. Re:can someone please explain by tibit · · Score: 4, Informative

    I can only guess, but perhaps the coating on the terminals has to maintain certain mechanical properties over time. A wrongly formulated alloy, or a wrong thickness of plating will give you a connector that, perhaps, degrades in presence of heat and vibration. Or perhaps it plastically deforms on the contact area, thus lowering the contact pressure and eventually leading to loss of reliable connection. When you have small contact area, the contact pressure is sufficient to provide essentially a gas-tight connection. As the contact area grows, the pressure drops and eventually you expose your contact area to the atmosphere. At that point things usually go wrong.

    Pure gold is soft and by itself it has about the worst properties imaginable for any sort of a connector surface. It literally rubs off, it's so soft. Its low resistance is irrelevant, since the gold layer is very thin. Gold's bulk conductance plays little role in overall resistance of a mated contact pair. You could replace gold with a metal that has 10x lower conductance, usually with little or no measurable change in contact resistance -- that is, if you can find something that can match gold in other properties (wetting of underlying surfaces, resistance to oxygen, etc.).

    Gold is also useless as plating for high current terminals. I have designed plenty of connectors where some pins were for small signals and were gold plated, and others were for power and were silver plated. Gold plated power contacts simply lose the gold and then you have all the problems of an unplated contact pair that's exposed to the atmosphere since the gold erodes away leaving craters. It's no fun.

    When you get relays with gold-plated contacts, there are often two sets of ratings. One is for low-current use, where the gold is guaranteed to stay on the contacts. Another rating is for sufficiently high current use where the gold is vaporized away and you're left with some other coating material that works well in this application. You can't swap such relays around without realizing what's going on, since contact pairs that were exposed to high currents will perform horribly in small signal, small current applications.

    I also can't quite understand why people still buy gold jewelry -- all it took for me was a gold wedding band. I switched to tungsten carbide after a decade and I'm not looking back. The standard 18K alloy is a joke.

    --
    A successful API design takes a mixture of software design and pedagogy.
  11. Re:can someone please explain by ebno-10db · · Score: 5, Interesting

    What's strange is how the gold got mixed into the solder. Long gone are the days of cheap gold when they would plate every metallic surface on a connector. Now they selectively plate the mating surfaces. Certainly they don't plate the part you solder. Gold contamination of solder is a well known phenomenon, but I haven't heard of it in decades, literally. The only other thing I can figure is that sometimes they flash plate some gold on the PC board to reduce solder whiskers or something. But that's a well known process. What the hell happened here?

  12. cash for gold will offer $10 a board by Joe_Dragon · · Score: 3, Funny

    cash for gold will offer $10 a board

  13. Obligatory Firesign by bmo · · Score: 3, Funny

    CONQUISTADOR: Welcome to New Spain! This is your new Father - Father Corona.
    FATHER CORONA: Pax vneuti nicutm! Down on your knees, now! D'ye recognize what I'm holdin' over your head, lads?
    INDIAN: It's a Cross. The Symbol of the Quartering of the Universe into Active and Passive Principles.
    FATHER CORONA: God have mercy on their heathen souls!
    CONQUISTADOR: What the Father means is - what is the Cross made of? Gold! Have you got any?

  14. Re:Solder huh? by asserted · · Score: 2

    No.

  15. Re:can someone please explain by Immerman · · Score: 2

    Alloys work in mysterious ways. Some alloys are "simple", having properties somewhere between the pure metals. Others have properties significantly different than their components. It can be a very non-linear process: in some cases even a fraction of a percent of "contamination" can drastically alter the properties of a metal, or you may have a "sweet spot" where the properties get better and better until you add just a little too much and change things completely. In this case someone said it was a matter of excess gold making the solder brittle.

    --
    --- Most topics have many sides worth arguing, allow me to take one opposite you.
  16. Re:can someone please explain by TapeCutter · · Score: 3, Informative

    Sorry but either your comprehension sucks or you're just trolling. They are not "using tin instead of lead"; it's an alloy composed of 63% Sn + 37% Pb, i.e. good old-fashioned tin-lead solder. The abnormally high trace levels of gold are the cause of the problem.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  17. Re:Solder huh? by TapeCutter · · Score: 2

    There was lead in the solder, 37% to be exact.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  18. Re:Solder huh? by TapeCutter · · Score: 2

    Oddly enough the solder they are using is 63% tin and 37% lead; the tin is not the "problem" since the problem doesn't occur without the extra traces of gold.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  19. Re:can someone please explain by bill_mcgonigle · · Score: 2

    What the hell happened here?

    Maybe the Boomer who knew the process inside and out recently retired. I've seen this happen in a few tech/manufacturing industries lately. There's a chance he was hired back as a consultant to fix this mess.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  20. Off-topic but just FYI by JamesRing · · Score: 2

    Did you know that people in North America pronounce "solder" as "sodder"? I had no idea until I moved to the US and I still find it hilarious!

  21. Re:can someone please explain by servognome · · Score: 2

    As you add more gold you are changing the properties of the alloy you are forming. A small layer of gold to prevent oxidation doesn't really cause much impact, your solder (in this case eutectic Sn-Pb) still melts and solidifies basically the same way. The more gold you add the more complex the system becomes. Rather than a nice eutectic material that goes from liquid to solid directly, you get different phases that solidify at different temperatures. This results in the formation of brittle intermetallics that fail prematurely.

    --
    D6 63 0D 70 89 81 BB 8E 7B 7C 5F 5D 54 EA AB 73
  22. Exemptions are being phased out by olau · · Score: 2

    Quote from Wikipedia:

    Legislation published in July, 2011 removes these exemptions.

    Apart from a few exemptions, RoHS2 covers all types of Electrical and Electronic Equipment (EEE) including some medical devices and monitoring and control equipment which have been exempt in the past. Previous exemptions to product from categories 8 and 9 will be gradually phased out,[16] with:[17]

    - Cat. 8: Medical Devices - 3 years after publication
    - Cat. 8: In-vitro-Diagnostics - 5 years after publication
    - Cat. 9: Control and monitoring instruments - 3 years after publication
    - Cat. 9: Industrial control and monitoring instruments - 6 years after publication

    The reason stated on Wikipedia for exempting these things in the first place was being cautious until enough experience had been collected, considering that they constituted only a small part of the electronics garbage pile anyway.