Slashdot Mirror


Do Car Safety Problems Come From Outer Space?

Hugh Pickens writes "As electronic devices are made to perform more and more functions on smaller circuit chips, the systems become more sensitive and vulnerable to corruption from single event upsets. This is especially true of Toyota, which has led the auto industry in its widespread inclusion of electronic controls in the manufacture of their various car models. 'These circuit families store not just data, but their basic function electrically,' says Lloyd W. Massengill, director of engineering at the Vanderbilt Institute for Space and Defense Electronics at Vanderbilt University. 'In the unfortunate event of a particle flipping just the right bit, a circuit configured to carry out a benign action may be reprogrammed to carry out some unintended action.' Denise Chow writes in Live Science that some scientists are pointing to cosmic ray radiation as a plausible mechanism behind the sudden, unexplained acceleration reported to have occurred with the late-model Toyotas." "As the design of automobile systems continues to evolve from mechanical to electronic controls, relying more and more on various circuitry and chips, these electronic components may be vulnerable to being confounded by high-energy radiation, writes Chow. Federal regulators were prompted to look into the possible role that cosmic rays played in Toyota's product recall fiasco after an anonymous tipster suggested the design of Toyota's microprocessors, software and memory chips could make them more vulnerable (PDF) to interference from radiation compared with other automakers. 'What's not known is what direction Toyota and other automakers are taking in terms of finding and correcting these issues,' says senior researcher Ewart Blackmore."

22 of 437 comments

  1. Why they tell you to turn off your phone... by LostCluster · · Score: 5, Informative

    Interference from radiation doesn't just come from outer space, it comes from cell phones, TV/radio stations, microwaves.... you see where this is going. I once worked in an office where there was a cell phone relay antenna too close to a PC, and we were constantly reinstalling the OS until I told them to move things around in the area.

    Thing is, when Windows gets a corrupted OS... it BSODs and we move on. Single-bit errors shouldn't send the car out of control... there should be some checksum that no longer adds up. When a fault is detected, the system should fall back to a backup program that safely shuts down the car.
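
The checksum-then-fail-safe idea in the comment above can be sketched in a few lines of Python. This is only an illustration, not how any real engine controller works: the calibration payload, the mode names, and the helper functions are all hypothetical. It uses CRC-32 from the standard library, which detects any single flipped bit in the protected record.

```python
import zlib

def make_record(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload so later corruption can be detected."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def is_intact(record: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored one."""
    payload, stored = record[:-4], record[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored

def flip_bit(record: bytes, bit_index: int) -> bytes:
    """Simulate a single event upset by inverting one bit of the record."""
    data = bytearray(record)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)

def select_mode(record: bytes) -> str:
    """Hypothetical policy: run normally only while the checksum adds up."""
    return "normal" if is_intact(record) else "safe_shutdown"

# Hypothetical stand-in for a block of protected memory.
calibration = make_record(b"hypothetical throttle calibration table")
```

Calling `select_mode(flip_bit(calibration, 5))` returns `"safe_shutdown"`, while the untouched record stays in `"normal"`. As a later reply points out, this protects stored data; it does not by itself catch a wrong result from a computation.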

    1. Re:Why they tell you to turn off your phone... by JoshuaZ · · Score: 5, Funny

      That's almost exactly what I was going to say. You've managed to make an accurate first post that actually includes a suggestion for dealing with the problems in question. Are you sure you meant to post this comment on Slashdot?

    2. Re:Why they tell you to turn off your phone... by pushing-robot · · Score: 4, Informative

      http://en.wikipedia.org/wiki/Non-ionizing_radiation

      Granted, an unshielded circuit can be vulnerable to any EM field, but gamma rays affect electronics in a completely different way than microwaves do.

      --
      How can I believe you when you tell me what I don't want to hear?
    3. Re:Why they tell you to turn off your phone... by MachDelta · · Score: 4, Funny

      But what if my car is already red?

    4. Re:Why they tell you to turn off your phone... by Anonymous Coward · · Score: 4, Informative

      If red cars are an indication of the problem, it's more widespread than engineers used to believe. On a more serious note: Fault tolerant design is the answer. Have three systems calculate the result (ideally using three different algorithms) and let them vote on the correct result. Don't assume that a set state persists, recalculate frequently and set the state even if it should be already set. Feed the control and the sensor data into a watchdog circuit (in triplicate...) to detect mismatches. Etc.
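
A minimal sketch of the three-way voting described above, in Python. The three throttle functions are hypothetical stand-ins for "three different algorithms" that should agree on the same answer, and the injected fault is a simulated bit flip in one channel, not a model of any real controller.

```python
from collections import Counter

def vote(results):
    """Return the strict-majority value among redundant results, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) // 2 else None

# Three hypothetical, deliberately different ways to compute pedal_percent * 10.
def throttle_a(pedal_percent):
    return pedal_percent * 10

def throttle_b(pedal_percent):
    return sum(10 for _ in range(pedal_percent))

def throttle_c(pedal_percent):
    return (pedal_percent << 3) + (pedal_percent << 1)

def redundant_command(pedal_percent, fault_in=None):
    """Compute the command three ways and vote; optionally corrupt one channel."""
    results = [f(pedal_percent) for f in (throttle_a, throttle_b, throttle_c)]
    if fault_in is not None:
        results[fault_in] ^= 1 << 9  # simulated upset: flip one bit in one channel
    return vote(results)
```

With one corrupted channel the other two still outvote it, so `redundant_command(25, fault_in=1)` gives the same answer as `redundant_command(25)`. When no two channels agree, `vote` returns `None`, which a real system would treat as a reason to enter a fail-safe state.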

    5. Re:Why they tell you to turn off your phone... by Jane+Q.+Public · · Score: 5, Insightful

      In order for it to interfere with a digital circuit, it first has to be radiation of the "ionizing" category, and then it has to get through whatever shielding the electronics are in. (I presume they are in some kind of can; no shielding at all would be plain stupid.)

      Cell phone radiation hardly qualifies. Nor, for that matter, do most terrestrial sources of radiation.

      "Cosmic rays", unlike most terrestrial-source radiation, are capable of penetrating shielding and disrupting electronics.

      However... striking just the right bit(s) to cause acceleration, in a large collection of cars, is so incredibly unlikely as to be in the "I don't f*ing think so" category.

    6. Re:Why they tell you to turn off your phone... by SeekerDarksteel · · Score: 5, Informative

      This is one of the most common methods of error tolerance, actually, N-modular redundancy (typically either dual-modular or triple-modular). It's used in airliners and space shuttles, as well as a number of other critical applications. IBM actually sells servers (the System z series) which automatically run two copies of everything and compare instruction results, so that failing processors can be detected and avoided.

      The proposal by the GP poster is actually much more difficult than it would seem at first glance. About the only place "checksum" style error detection is used is in memories/registers. The reason is that if I do a floating point addition, for example, the only way I know whether the addition gave me the right answer is to do the addition again and check.

      --
      The laws of probability forbid it!
    7. Re:Why they tell you to turn off your phone... by dwreid · · Score: 5, Interesting

      At the risk of sounding like a geezer, I remember back in the late 70's when this was a problem in early designs of mini-computers. Then we used to see single bits get flipped and crash computers from a variety of sources including cosmic radiation and alpha particles that came from the spontaneous decay of elements in the ceramic chip housings. More recently, when I purchased my 2005 Cadillac CTS it experienced a variety of problems similar to this when I would drive through a toll station that was equipped with RFID ID systems. Behaviours included sudden acceleration, engine stalling, indicator lights on the instrument panel going "crazy", On-Star calling for help when nothing was wrong, the driver's seat suddenly driving forward to the steering wheel (making it really hard to steer), etc. At the time the only solution was to pull over, shut off the car, remove the key, open the door, wait for everything to shut down and then restart. After many frustrating weeks of "we can't duplicate the problem" it was discovered that the car had faulty shielding on one of the cables that makes up the in-car network. Once fixed the "gremlins" went away. The real crime here is that, because the problem can't be replicated on demand, Toyota is blaming the behaviour on attention-seeking owners. This bizarre response was recently repeated on the floor of Congress by one of Toyota's congressional tools. (I mean duly elected government representative.)

    8. Re:Why they tell you to turn off your phone... by rcamans · · Score: 5, Informative

      I worked on ECMs at GM (Delco Electronics) for 10 years at the start of their use (1980 to 1990). So if a cosmic ray came along and flipped a bit, it would have to be a specific bit. If it was an MSB-type bit in the accelerator position, then yes, acceleration. Except that the bit would unflip right away because of the pedal position update. Or if it was some engine feedback MSB, again, yes, temporary acceleration, but again, only for a short time. Updates happen constantly.
      About EMI/EMC/RFI - the modules have been shielded and protected since day one against that. The engine is a very high disturbance environment in many ways. Sparks, for instance. The ECMs have been in almost all American cars since before 1980, because of the 1975 car air pollution reduction act Congress passed. The only way cars could meet the pollution restrictions was through ECMs. So if we have had ECMs since nearly forever, and only just now one manufacturer has a bit flip problem? I don't think so. And these modules do not use the latest super-small feature processor technology. They use older temperature-resistant tech: much larger features, far more radiation-resistant.
      No, the most likely problem is either a software routine with a bug, no error handler, or similar issue, or a mechanical problem (less likely).
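
The "bit would unflip right away" argument above can be illustrated with a toy control loop in Python. Everything here is assumed for illustration: an 8-bit pedal value, one sensor read per cycle, and an upset that hits the stored value just after it is written.

```python
def run_cycles(pedal_samples, upset_at_cycle, bit=7):
    """Simulate a loop that re-reads the pedal sensor every cycle.

    An upset flips one bit of the stored value during `upset_at_cycle`;
    the next cycle's sensor read overwrites the corrupted value.
    """
    seen_by_controller = []
    for cycle, sample in enumerate(pedal_samples):
        stored = sample              # fresh sensor reading replaces old state
        if cycle == upset_at_cycle:
            stored ^= 1 << bit       # transient corruption of the stored byte
        seen_by_controller.append(stored)
    return seen_by_controller
```

With a steady pedal reading of 20 and an MSB flip in cycle 1, `run_cycles([20, 20, 20, 20], 1)` returns `[20, 148, 20, 20]`: the bad value is visible for one cycle and then disappears. This only covers state that is refreshed from a sensor; a flipped bit in code or in rarely rewritten configuration would not heal this way.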

      --
      wake up and hold your nose
    9. Re:Why they tell you to turn off your phone... by WaywardGeek · · Score: 4, Insightful

      Radiation that can upset bits in an electronic circuit doesn't come from your cell phone, TV/radio stations or microwave oven. You may get enough EMI to interfere with your radio, but flipping individual bits in a chip pretty much requires an ion - basically a nucleus or neutron stripped of its electrons flying through your chip. These come from two main sources. First, there's the Sun. Even with the magnetic shielding of the Earth, many fly through us all the time. Most common are single protons, but we occasionally are struck with gold nuclei, or even heavier. Older larger geometry chips were immune to single-event-upsets (SEUs) due to protons, but heavier elements could cause trouble. Newer, more advanced electronics are even sensitive to individual protons and neutrons. The other common source for radiation is neutrons from decays in lead used in electronic packaging. Ever hear of RoHS compliance? Basically, a bunch of electronics companies around the world suddenly decided to "go green" and save us from lead poisoning by removing lead from their packaging. Ever wonder why? Do you really think they suddenly cared if they were killing our babies with lead poisoning? Uh... I'm afraid not. They removed the lead because of neutron radiation from lead decay.

      I'm guessing that studying radiation effects isn't very popular in Japan, possibly because we nuked them twice. However, they should get a clue and start learning about how to deal with rogue ions and neutrons.

      --
      Celebrate failure, and then learn from it - Nolan Bushnell
    10. Re:Why they tell you to turn off your phone... by rickb928 · · Score: 4, Insightful

      I don't hear much about consumer electronics being fritzed by cosmic rays, or microwave ovens, etc, though I suppose this might explain the random failures. But cosmic radiation? That's a new one.

      But RoHS being forced by lead decay? I dunno, but tin whiskers are negating any advantage that offers.

      Give me good old eutectic 63/37 any day. It just works. Not a lot of kids use circuit boards as pacifiers, ya know?

      --
      deleting the extra space after periods so i can stay relevant, yeah.
    11. Re:Why they tell you to turn off your phone... by Kral_Blbec · · Score: 4, Informative

      I'm a bit skeptical of your claims about lead decay in electronics. While some isotopes of lead are radioactive, those are products of uranium decay, which as any good geek knows, goes through alpha and beta decay until it ends as a stable particle of lead-206. In that pathway there is lead-214 and lead-210 that have half-lives of half an hour and 22 years respectively. However, unless they are putting uranium in your electronics, the only lead present is going to be from mined ores that have had plenty of time to decompose into a stable form.

      The best chart of lead isotopes I found is here http://education.jlab.org/itselemental/iso082.html. I'm not sure why, but it lists a half life for lead-204 even though I thought it was supposed to be stable. Most half lives are a few minutes or hours.

  2. Is there really a problem? by LostCluster · · Score: 5, Insightful

    Since the biggest Toyota runaway story has turned out to be a "problem exists between seat and pedals" situation... is this all hype with no science behind it?

  3. No. by stonecypher · · Score: 4, Insightful

    There's a reason that our entire modern world doesn't come crashing to a halt around us every 30 seconds. If every CPU was vulnerable to bit flips from random radiation, every part of your house would be on fire and arcing electricity. Times Square would look like the bridge of the '60s Enterprise under attack.

    This is just some douchebag professor trying to ride the tragedies to fame. There's a reason it's always hitting the same system in the car. It's because the system is defective. There's a reason the professor has nothing but speculation to back himself up.

    This is the worst kind of charlatanry from someone who should know better. I hope his hosting school takes this very, very seriously.

    --
    StoneCypher is Full of BS
    1. Re:No. by TheGeniusIsOut · · Score: 4, Insightful

      I can't even begin to calculate the probability of a single bit flip due to impact from a cosmic ray causing unintended acceleration in multiple vehicles. Possible? Certainly, nearly anything is. Plausible? Maybe in a very broad sense of the word. Likely? Not very.

      --
      Ignorance is Bliss -- And the Opposite is True -- Genius is Madness
    2. Re:No. by SeekerDarksteel · · Score: 4, Informative

      There's a reason that our entire modern world doesn't come crashing to a halt around us every 30 seconds. If every CPU was vulnerable to bit flips from random radiation, every part of your house would be on fire and arcing electricity. Times Square would look like the bridge of the '60s Enterprise under attack.

      Actually, every CPU _IS_ vulnerable to bit-flips from radiation. That part of it is not speculation. It does occur in commodity processors, and with probabilities large enough that we have ECC RAM, and ECC and/or parity in caches. Some servers actually come with built-in hardware fault tolerance methods, because when you run hundreds of servers non-stop for years, the probability that a particle strike screws up a register on chip is non-negligible. Now, still, the probability isn't _huge_. Definitely not high enough to be causing these specific problems, especially when the failure is always in the same manner. _That_ part of it is pretty much bullshit.
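
To show the kind of correction ECC memory relies on, here is a small Python sketch of the classic Hamming(7,4) code, which stores 4 data bits in 7 and can repair any single flipped bit. It is a textbook illustration only; real ECC RAM uses wider codes implemented in hardware, and the function names are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits (0..15) as a 7-bit Hamming codeword (positions 1..7)."""
    code = [0] * 8  # index 0 unused so list indices match bit positions
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    return code[1:]

def flip(bits, position):
    """Return a copy of the codeword with the bit at `position` (1..7) inverted."""
    corrupted = list(bits)
    corrupted[position - 1] ^= 1
    return corrupted

def decode(bits):
    """Return (nibble, syndrome); a nonzero syndrome is the corrected position."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of set-bit positions is 0 for a valid codeword
    if syndrome:
        code[syndrome] ^= 1  # single-bit error: flip it back
    data = (code[3], code[5], code[6], code[7])
    return sum(bit << i for i, bit in enumerate(data)), syndrome
```

For any nibble `n` and any position `p` from 1 to 7, `decode(flip(encode(n), p))` returns `(n, p)`: the data comes back intact and the syndrome reports which bit was hit. Two simultaneous flips in the same word would defeat this particular code, which is one reason practical ECC adds an extra parity bit to at least detect double errors.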

      --
      The laws of probability forbid it!
  4. Sun UltraSPARC-II's anyone? by nbvb · · Score: 4, Insightful

    Sounds a whole lot like the e-cache parity errors in the Sun UltraSPARC-II processors.

    If you were never affected by that, consider yourself a lucky person.

    Particle-caused bitflips are very much real.

  5. Prove It, Implement Fix, Pay Out Families by eldavojohn · · Score: 4, Insightful
    If this is true, recreate the phenomenon in a lab. Test your hypothesis by exposing the circuitry in question to similar radiation in a lab. While you can't test thousands of sets of circuitry, you should be able to recreate it by increasing the amount of radiation, or by automating the testing and dosage cycle and letting it run until the malfunction is noted or another failure occurs.

    It's not out of the question, IBM noted in the 90s:

    Extensive background radiation studies by IBM in the 1990s suggest that computers typically experience about one cosmic-ray-induced error per 256 megabytes of RAM per month. If so, a superstorm, with its unprecedented radiation fluxes, could cause widespread computer failures.
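
As a back-of-the-envelope check, the quoted rate of one error per 256 MB of RAM per month can be scaled to a much smaller memory. The Python sketch below does only that arithmetic; the 64 KB controller RAM and the fleet of one million vehicles are made-up illustrative numbers, not figures for any real car, and the calculation ignores that cars are not powered around the clock and that most flipped bits have no visible effect.

```python
# Rate taken from the quoted passage: one error per 256 MB of RAM per month.
ERRORS_PER_MB_MONTH = 1 / 256

def expected_upsets_per_month(ram_bytes, units=1):
    """Expected upsets per month for `units` devices with `ram_bytes` of RAM each."""
    ram_mb = ram_bytes / (1024 * 1024)
    return ram_mb * ERRORS_PER_MB_MONTH * units

# Hypothetical numbers, chosen only to show the scaling.
per_car = expected_upsets_per_month(64 * 1024)
per_fleet = expected_upsets_per_month(64 * 1024, units=1_000_000)
```

Under these assumptions a single car expects roughly 0.00024 upsets per month (about one every 340 years), while the whole hypothetical fleet sees a couple of hundred per month. That is why the effect can be real across a large population and still be a poor explanation for a repeatable failure in one specific function.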

    You have to fix this though. As a large manufacturer you have to accept this risk just like your competitors do. Airlines accept this risk and triple check their data because people's lives are at risk. As a car manufacturer, you are in the exact same position.

    I hope the fix they already rolled out as a recall includes triple checking data or -- if the article is correct -- we won't see a drop in these horrible accidents. I hope for drivers and public safety that it does. It's led to death and possibly wrongful incarceration. Restitution is in order. Take testing motor vehicles seriously.

    --
    My work here is dung.
  6. Space Rays, My Ass by WrongSizeGlass · · Score: 4, Funny

    Whether you subscribe to Occam's razor, or just plain old common sense, rays from outer space are not Toyota's problem (though they may be the author's problem).

    This type of thing is just plain bat shit crazy. There is a problem somewhere in Toyota's system. Either a software bug or bad chips or something real and tangible ... but rays from outer space? Please.

    If someone here on /. had posted that in the last Toyota story they would have gotten a +5 Funny.

  7. Frontline Auto Engineer's Perspective by jim_k_3038 · · Score: 5, Informative

    While working for Motorola, I worked on electronic throttle control (ETC). We spent a ton of time working to make the system "fail safe". I think we all had in the back of our minds that it was only a matter of time before we would have to testify as to our engineering decisions.

    My little part of ETC involved adding a sub processor which watch-dogged the main micro. The little micro asked a series of questions of the main micro. Both processors would need to agree on all the inputs and outputs of the system. The little micro would also ask questions regarding the real-time OS (RTOS) of the main micro. The main micro would need to have tasks executing in the right order to satisfy the small micro. Lastly, the small micro would ask the main micro to perform math operations to verify accuracy. Oh, and the main micro was continuously checksumming its memory too.

    Both micros had a direct hardware disable path to the H-bridge which was delivering power to the throttle plate. The throttle plate was spring loaded, so, with power cut, the throttle plate would snap to an idle position.
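
The question-and-answer watchdog and the hardware disable path described above can be sketched roughly as follows in Python. This is a loose illustration of the idea, not the Motorola design: the class names, the particular math challenge, and the latching behaviour are all assumptions made for the example.

```python
import random

class MainMicro:
    """Stand-in for the main controller; `faulty` simulates corrupted logic."""
    def __init__(self):
        self.faulty = False

    def answer(self, a, b):
        result = a * b + 1
        return result ^ 0x10 if self.faulty else result

class WatchdogMicro:
    """Small supervisor that quizzes the main micro and can cut throttle power."""
    def __init__(self, main, rng=None):
        self.main = main
        self.rng = rng if rng is not None else random.Random(0)
        self.throttle_power_enabled = True

    def poll(self):
        a, b = self.rng.randrange(1, 256), self.rng.randrange(1, 256)
        if self.main.answer(a, b) != a * b + 1:
            self.throttle_power_enabled = False  # assumed to latch until reset
        return self.throttle_power_enabled

def throttle_position(commanded, power_enabled, idle=0):
    """With power cut, the spring returns the throttle plate to idle."""
    return commanded if power_enabled else idle
```

Once a single wrong answer is seen, `poll()` keeps returning `False` even if the main micro later answers correctly, and `throttle_position(80, False)` is `0`. Latching is a design choice in this sketch: it favours staying in the safe state over recovering automatically from a fault that might recur.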

    Next came the electromagnetic compatibility (EMC) testing. We spent months inside huge chambers testing both radiation and susceptibility. One of the tests for susceptibility involved using a zap gun to spark a 20kV spark on each pin of our ECU. Not satisfied with that, our customer opened one of our modules and used a sparking spark plug to slowly zap our board to failure. Bottom line, that throttle plate better never stick one way, or the other.

    In the end, it always amazed me that the whole thing would work at all. Seemed to me that the system was always seconds away from going into some kind of fail safe mode.

    No, a stray bit flip is not going to facilitate a runaway car. Least not on my system!

  8. McMurdo by Unxmaal · · Score: 5, Interesting

    When I was working for NASA, on the NISN network, we'd get these weird router crashes for the old Cisco router located at (or very near) the South Pole in Antarctica. It was always a memory problem, and I'd always have to call someone to get them to power-cycle the router. It irritated me to keep bothering those guys, so I opened a case with Cisco TAC.

    The TAC guy sent a terse response, saying that particular crash was a "transient memory error" due to "alpha radiation or sun spots." That really pissed me off -- Cisco TAC just gave me a standard BOFH response! I escalated, and swung the NASA club around some, and finally got a senior engineer on the phone. "You said this router's at the South Pole, right? So that means it's at very high altitude, with very little ozone shielding, right?" "Umm, yeah." "Well there you go. There's a lot more radiation at that altitude than at sea level. Our stuff's only rated for sea level. See if they can .. I dunno, put a lead blanket over it or something."

    I relayed the info to my contact at McMurdo, and he laughed and said he'd figure something out.

    On a hunch, I checked the other two "high-altitude" routers we had, and sure enough, they both had a statistically higher failure rate for "transient memory errors".

    --
    http://unxmaal.com
  9. IBM System/360 anecdote by Anonymous Coward · · Score: 4, Interesting

    My dad was an IBM CE (Customer Engineer) specialist on one of the models in the IBM System/360 mainframe range. He used to like telling the story about how he and another engineer were out on a customer's site trying to determine an intermittent fault. They would bring the machine up and sure enough there would be this glitch at precise intervals. They just couldn't figure out what was causing it. That was, until the other CE took a look out the window.
    After a bit he said 'Tell me when it happens'. OK... '...now' my dad said. Then he said 'I'll tell you when the next one happens' and a few seconds later said '...now'. Which is exactly when it did glitch.
    It turned out that the customer's DP center was situated close to an airport. The CE could see the radar dish revolve at the end of the runway. When it pointed straight at him was when the glitch occurred. Needless to say the computer room received some RF shielding.