Slashdot Mirror


Do Car Safety Problems Come From Outer Space?

Hugh Pickens writes "As electronic devices are made to perform more and more functions on smaller circuit chips, the systems become more sensitive and vulnerable to corruption from single event upsets. This is especially true of Toyota, which has led the auto industry in its widespread inclusion of electronic controls in the manufacture of their various car models. 'These circuit families store not just data, but their basic function electrically,' says Lloyd W. Massengill, director of engineering at the Vanderbilt Institute for Space and Defense Electronics at Vanderbilt University. 'In the unfortunate event of a particle flipping just the right bit, a circuit configured to carry out a benign action may be reprogrammed to carry out some unintended action.' Denise Chow writes in Live Science that some scientists are pointing to cosmic ray radiation as a plausible mechanism behind the sudden, unexplained acceleration reported to have occurred with the late model Toyotas." "As the design of automobile systems continues to evolve from mechanical to electronic controls, relying more and more on various circuitry and chips, these electronic components may be vulnerable to being confounded by high-energy radiation, writes Chow. Federal regulators were prompted to look into the possible role that cosmic rays played in Toyota's product recall fiasco after an anonymous tipster suggested the design of Toyota's microprocessors, software and memory chips could make them more vulnerable (PDF) to interference from radiation compared with other automakers. 'What's not known is what direction Toyota and other automakers are taking in terms of finding and correcting these issues,' says senior researcher Ewart Blackmore."

61 of 437 comments

  1. Why they tell you to turn off your phone... by LostCluster · · Score: 5, Informative

    Interference from radiation doesn't just come from outer space, it comes from cell phones, TV/radio stations, microwaves.... you see where this is going. I once worked in an office where there was a cell phone relay antenna too close to a PC, and we were constantly reinstalling the OS until I told them to move things around in the area.

    Thing is, when Windows gets a corrupted OS... it BSODs and we move on. Single-bit errors shouldn't send the car out of control... there should be some checksum that shouldn't add up. When a fault is detected, it should go to a backup program about safely shutting down the car.

    1. Re:Why they tell you to turn off your phone... by JoshuaZ · · Score: 5, Funny

      That's almost exactly what I was going to say. You've managed to make an accurate first post that actually includes a suggestion for dealing with the problems in question. Are you sure you meant to post this comment on Slashdot?

    2. Re:Why they tell you to turn off your phone... by pushing-robot · · Score: 4, Informative

      http://en.wikipedia.org/wiki/Non-ionizing_radiation

      Granted, an unshielded circuit can be vulnerable to any EM field, but gamma rays affect electronics in a completely different way than microwaves do.

      --
      How can I believe you when you tell me what I don't want to hear?
    3. Re:Why they tell you to turn off your phone... by pitchpipe · · Score: 3, Interesting

      there should be some checksum that shouldn't add up. When a fault is detected, it should go to a backup program about safely shutting down the car.

      Or how about a computer redundancy system where a group of computers that are all capable of controlling the car watch the behavior of the computer that is actually controlling the car? Through a voting system they could decide to hand the control of the car over to another computer in the event that the controlling computer doesn't act in a way that was deemed safe. This way the car could continue to operate normally while signaling that there is a problem that needs to be addressed.

      --
      Look where all this talking got us, baby.
    4. Re:Why they tell you to turn off your phone... by Cryacin · · Score: 2, Insightful

      I think it's just trying to blame the little green men for a problem that has more terrestrial origins.

      --
      Science advances one funeral at a time- Max Planck
    5. Re:Why they tell you to turn off your phone... by Anonymous Coward · · Score: 3, Informative

      Nope, the exact opposite. Gamma rays are short wavelength and high energy.

    6. Re:Why they tell you to turn off your phone... by MachDelta · · Score: 4, Funny

      But what if my car is already red?

    7. Re:Why they tell you to turn off your phone... by Anonymous Coward · · Score: 4, Informative

      If red cars are an indication of the problem, it's more widespread than engineers used to believe. On a more serious note: Fault tolerant design is the answer. Have three systems calculate the result (ideally using three different algorithms) and let them vote on the correct result. Don't assume that a set state persists, recalculate frequently and set the state even if it should be already set. Feed the control and the sensor data into a watchdog circuit (in triplicate...) to detect mismatches. Etc.
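
      A minimal sketch of the three-way vote described above, assuming each of the three redundant calculations delivers its result as a 32-bit word. The names vote3 and vote3_disagrees are hypothetical, and a real design would also have to protect the voter itself.

```c
#include <stdint.h>

/* Bitwise majority of three redundant results: each output bit takes
 * the value that at least two of the three inputs agree on. */
uint32_t vote3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* Nonzero if any channel differs from the voted result, so that a
 * masked fault can still be reported for service. */
int vote3_disagrees(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t v = vote3(a, b, c);
    return a != v || b != v || c != v;
}
```

      Voting bit by bit means a single flipped bit in any one channel is outvoted by the other two, while the disagreement flag lets the car signal that something needs attention.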

    8. Re:Why they tell you to turn off your phone... by hipp5 · · Score: 3, Informative

      Gamma rays have a higher frequency,

      Corrected. And thus they have a shorter wavelength.

    9. Re:Why they tell you to turn off your phone... by Jane Q. Public · · Score: 5, Insightful

      In order for it to interfere with a digital circuit, it first has to be radiation of the "ionizing" category, and then it has to get through whatever shielding the electronics are in. (I presume they are in some kind of can; no shielding at all would be plain stupid.)

      Cell phone radiation hardly qualifies. Nor, for that matter, do most terrestrial sources of radiation.

      "Cosmic rays", unlike most terrestrial-source radiation, are capable of penetrating shielding and disrupting electronics.

      However... striking just the right bit(s) to cause acceleration, in a large collection of cars, is so incredibly unlikely as to be in the "I don't f*ing think so" category.

    10. Re:Why they tell you to turn off your phone... by SeekerDarksteel · · Score: 5, Informative

      This is one of the most common methods of error tolerance, actually: N-modular redundancy (typically either dual-modular or triple-modular). It's used in airliners and space shuttles, as well as a number of other critical applications. IBM actually sells servers (the System z series) which automatically run two copies of everything and compare instruction results, so that failing processors can be detected and avoided.

      The proposal by the GP poster is actually much more difficult than it would seem at first glance. About the only place "checksum" style error detection is used is in memories/registers. The reason is that if I do a floating point addition, for example, the only way I know whether the addition gave me the right answer is to do the addition again and check.
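
      A rough sketch of that "do it again and check" idea (dual-modular redundancy) in C. Everything here is hypothetical: the two function pointers stand in for two independent execution units, and faulty_add exists only to inject a fault for demonstration.

```c
#include <string.h>

typedef double (*add_fn)(double, double);

/* Reference adder, standing in for a healthy execution unit. */
double plain_add(double a, double b)
{
    return a + b;
}

/* Fault-injection stand-in: flips one bit of the stored result. */
double faulty_add(double a, double b)
{
    double r = a + b;
    unsigned char bytes[sizeof r];
    memcpy(bytes, &r, sizeof r);
    bytes[0] ^= 1u;
    memcpy(&r, bytes, sizeof r);
    return r;
}

/* Runs the same addition on two channels and compares the raw bit
 * patterns. Returns 0 and stores the result when they match, or -1
 * when they differ. A mismatch only says that one channel is wrong,
 * not which one. */
int checked_add(add_fn primary, add_fn shadow,
                double a, double b, double *out)
{
    double r1 = primary(a, b);
    double r2 = shadow(a, b);
    if (memcmp(&r1, &r2, sizeof r1) != 0)
        return -1;
    *out = r1;
    return 0;
}
```

      With only two channels the error can be detected but not corrected, which is why a third channel and a vote are needed to keep running through a fault.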

      --
      The laws of probability forbid it!
    11. Re:Why they tell you to turn off your phone... by MichaelSmith · · Score: 2, Interesting

      More to the point they generate secondary showers of ionizing radiation when they traverse metallic shields so we should be careful not to make the problem worse by creating showers of particles with a greater cross section.

    12. Re:Why they tell you to turn off your phone... by evanbd · · Score: 3, Informative

      You can build circuits that detect faults while operating. They're more complex than their normal counterparts, but the transistor count is less than 2x. On-line error detection is a common name.

      Of course, such circuits get really expensive if you don't have a large market for them. But cars represent a fairly large market, so if it was the best approach they could probably use them. Of course, that assumes there's any market or regulatory pressure to use any sort of error detection at all.

    13. Re:Why they tell you to turn off your phone... by dwreid · · Score: 5, Interesting

      At the risk of sounding like a geezer, I remember back in the late 70's when this was a problem in early designs of mini-computers. Then we used to see single bits get flipped and crash computers from a variety of sources including cosmic radiation and alpha particles that came from the spontaneous decay of elements in the ceramic chip housings. More recently, when I purchased my 2005 Cadillac CTS it experienced a variety of problems similar to this when I would drive through a toll station that was equipped with RFID ID systems. Behaviours included sudden acceleration, engine stalling, indicator lights on the instrument panel going "crazy", On-Star calling for help when nothing was wrong, the driver's seat suddenly driving forward to the steering wheel (making it really hard to steer), etc. At the time the only solution was to pull over, shut off the car, remove the key, open the door, wait for everything to shut down and then restart. After many frustrating weeks of "we can't duplicate the problem" it was discovered that the car had faulty shielding on one of the cables that makes up the in-car network. Once fixed the "gremlins" went away. The real crime here is that, because the problem can't be replicated on demand, Toyota is blaming the behaviour on attention-seeking owners. This bizarre response was recently repeated on the floor of Congress by one of Toyota's congressional tools. (I mean duly elected government representative.)

    14. Re:Why they tell you to turn off your phone... by lgw · · Score: 2, Insightful

      I think that Rolls Royce offers a pure drive-by-wire system in one model, including braking. Of course, many airplanes are completely fly-by-wire. It's just a matter of cost.

      None of which will prevent you from stepping on the wrong pedal. Maybe Toyota has a bug somewhere, maybe not, but remember the "Audi unintended acceleration" problem? 100% driver error. The "Toyota unintended acceleration" problem? The most likely explanation remains driver error (I'd have no doubts at all, except I believe the Woz when he says he found something). Toyota's mistake early on was to try to deny they had a bug, on the pathetic basis that they didn't have a bug, as no one ever believes they are stepping on the wrong pedal. They should have rushed out a firmware "fix" that instead recorded legal proof of the driver error.

      --
      Socialism: a lie told by totalitarians and believed by fools.
    15. Re:Why they tell you to turn off your phone... by rcamans · · Score: 5, Informative

      I worked on ECMs at GM (Delco Electronics) for 10 years at the start of their use (1980 to 1990). So if a cosmic ray came along and flipped a bit, it would have to be a specific bit. If it was a msb type bit in the accelerator position, then yes, acceleration, except that the bit would unflip right away because of pedal position update. Or if it was some engine feedback msb, again, yes, temporary acceleration, but again, only for a short time. Updates happen constantly.
      About EMI/EMC/RFI - the modules have been shielded and protected since day one against that. The engine is a very high disturbance environment in many ways. Sparks, for instance. The ECMs have been in almost all American cars since before 1980, because of the 1975 car air pollution reduction act Congress passed. The only way cars could meet the pollution restrictions was through ECMs. So if we have had ECMs since nearly forever, and only just now one manufacturer has a bit flip problem? I don't think so. And these modules do not use the latest super-small feature processor technology. They use older temperature-resistant tech: much larger features, far more radiation-resistant.
      No, the most likely problem is either a software routine with a bug, no error handler, or similar issue, or a mechanical problem (less likely).
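
      To illustrate the point above about a flipped msb being overwritten by the next pedal update, here is a small C sketch. The 10-bit pedal range, the struct layout and the function names are all assumptions made for the example, not a description of any real ECM.

```c
#include <stdint.h>

/* Hypothetical controller state: the last pedal reading, 0..1023. */
struct throttle_state {
    uint16_t pedal;
};

/* Simulates a single-event upset by inverting one bit of a stored
 * value (bit must be in the range 0..15). */
uint16_t flip_bit(uint16_t value, unsigned bit)
{
    return (uint16_t)(value ^ (1u << bit));
}

/* One pass of the control loop: the stored reading is overwritten
 * with the fresh sensor sample, whatever it held before. */
void control_step(struct throttle_state *s, uint16_t pedal_sample)
{
    s->pedal = pedal_sample;
}
```

      Starting from a stored reading of 100, flip_bit(100, 9) yields 612, a large jump in requested throttle, but the next control_step with a sample of 100 puts the stored value back. The upset only matters for as long as it takes the loop to come around again.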

      --
      wake up and hold your nose
    16. Re:Why they tell you to turn off your phone... by WaywardGeek · · Score: 4, Insightful

      Radiation that can upset bits in an electronic circuit doesn't come from your cell phone, TV/radio stations or microwave oven. You may get enough EMI to interfere with your radio, but flipping individual bits in a chip pretty much requires an ion - basically a nucleus or neutron stripped of its electrons flying through your chip. These come from two main sources. First, there's the Sun. Even with the magnetic shielding of the Earth, many fly through us all the time. Most common are single protons, but we occasionally are struck with gold nuclei, or even heavier. Older larger geometry chips were immune to single-event-upsets (SEUs) due to protons, but heavier elements could cause trouble. Newer, more advanced electronics are even sensitive to individual protons and neutrons. The other common source for radiation is neutrons from decays in lead used in electronic packaging. Ever hear of RoHS compliance? Basically, a bunch of electronics companies around the world suddenly decided to "go green" and save us from lead poisoning by removing lead from their packaging. Ever wonder why? Do you really think they suddenly cared if they were killing our babies with lead poisoning? Uh... I'm afraid not. They removed the lead because of neutron radiation from lead decay.

      I'm guessing that studying radiation effects isn't very popular in Japan, possibly because we nuked them twice. However, they should get a clue and start learning about how to deal with rogue ions and neutrons.

      --
      Celebrate failure, and then learn from it - Nolan Bushnell
    17. Re:Why they tell you to turn off your phone... by rickb928 · · Score: 4, Insightful

      I don't hear much about consumer electronics being fritzed by cosmic rays, or microwave ovens, etc, though I suppose this might explain the random failures. But cosmic radiation? That's a new one.

      But RoHS being forced by lead decay? I dunno, but tin whiskers are negating any advantage that offers.

      Give me good old eutectic 63/37 any day. It just works. Not a lot of kids use circuit boards as pacifiers, ya know?

      --
      deleting the extra space after periods so i can stay relevant, yeah.
    18. Re:Why they tell you to turn off your phone... by JWSmythe · · Score: 3, Informative

          Why post AC? You obviously work for NASA. :)

          Redundancy in a car isn't essential for the computer, as long as it fails in a safe mode. In the case of a single bit being flipped in the data stream, that would be a transient error. In a throttle system, it would be so short lived, you'd never know it ever happened. How many times per second do you think the computer reads its inputs and adjusts things? (hint: it's more than 1).

          Heck, you don't even (usually) notice misfires, and those happen all the time, even on perfectly tuned vehicles. It takes a whole series of misfires, or a constant fault to be noticeable. On a V8 engine, you can even lose a cylinder and not notice. I had someone once bring a car to me because it "doesn't accelerate well". It turned out three spark plug wires weren't on. And no, I didn't work on it before that, someone else messed up. It actually idled pretty well. The three cylinders weren't sequential, so it managed fine. That's even been included as a feature on some cars. For example, an 8 cyl car would disable 2 or 4 cylinders to get better fuel economy, and run on all 8 if full power was requested. It's sometimes referred to as a variable displacement engine. Versions have shown up in GM, Chrysler, Mercedes, and Honda vehicles over the years.

      --
      Serious? Seriousness is well above my pay grade.
    19. Re:Why they tell you to turn off your phone... by mc6809e · · Score: 2, Interesting

      In order for it to interfere with a digital circuit, it first has to be radiation of the "ionizing" category

      Neutron radiation isn't considered ionizing, yet interactions between the neutrons and the silicon in a typical chip will create charged particles that cause current surges. These current surges can interfere with the correct operation of a circuit and that includes individual transistors, not just bits in memory.

    20. Re:Why they tell you to turn off your phone... by tibit · · Score: 2, Insightful

      If the ECU is so susceptible to single-bit errors, I'd like to see it getting stuck in IDLE, getting stuck running rich/lean, etc.

      I'm pretty sure that if we *do* learn of what the problem was, it will be something rather embarrassing, and will have nothing to do with SEUs, seized bushings, etc.

      Toyota's technical problem right now is lack of post-mortem diagnostics built into the ECU. Things that are "out of the ordinary" should be logged, ideally with as much of the ECU's state logged as possible. That's their only *technical* problem. Everything else is hearsay at this point, from the technical standpoint. Engineering can't work with what amounts to gossip.

      Stories of people driving their cars with WOT to the dealerships with *nothing* constructive coming out of it indicate that there's a gross lack of competence everywhere in their corporate structure. There's no communication. If a tech gets a "weird example" like that in the dealership, he should be able to get to the engineer who is on the ECU support team. Anything less should get responsible people jailed. Mr Toyoda has lost touch. It's not about incremental improvements. It's simply about corporate inertia and unnecessary shielding of people who should be working towards a common goal. If a tech at a Toyota dealer somewhere in the U.S. thinks he has something really weird going on, he shouldn't be treated like public enemy #1. He should be treated like a source of valuable feedback, potentially averting an ongoing disaster. There's no reason for said tech not to be able to get to the engineering.

      No, I don't work for Toyota or their dealers. But I've heard enough corporate idiocy to be able to recognize its symptoms. The blind running around exhibited by Toyota's engineering right now is a *classic* "all red flags" symptom. The first step at the solution isn't technical. It's corporate wetware.

      --
      A successful API design takes a mixture of software design and pedagogy.
    21. Re:Why they tell you to turn off your phone... by JWSmythe · · Score: 3, Interesting

          I remember a news story from several years ago that even made the evening news. Someone had a Saturn car that they realized they couldn't afford and tried to return. The dealer wouldn't just take it back for a full refund, since it was now a used car.

          Over the next few months, the driver had several "emergencies" with it, each time having it towed back to the dealership, where they couldn't find a problem. In one in particular that was videotaped by the police, the car was circling in a parking lot and the driver called 911. She insisted the car wouldn't stop. They told her to step on the brakes, use the emergency brake, throw it in neutral, shut it off, etc, etc, etc... She circled for something like 30 minutes. Finally they got her to open the driver's window, and an officer got in the middle of where it was circling. He ran for the side of the car, grabbed the wheel, and then turned off the key. The car (amazingly enough) came to a stop.

          Of course, she claimed it wouldn't stop for her. There was all kinds of talk about lemon laws, and how Saturn vehicles weren't safe. She made a whole bunch of noise, and the dealership traded her car for another one. The problems persisted for her. Obviously Saturns were amazingly dangerous vehicles. Someone from the dealership (I think the owner) actually started driving her original car to work every day, to find out what the problem really was. He never had a problem.

          Eventually, she was charged, I believe with reckless endangerment. Pretty much, she was driving dangerously, and endangered the officers who tried to help her.

          I won't say that the mystery Toyota problem is driver error or a mechanical problem, but the cases that have been in the news have massive parallels in other vehicles too, where drivers just did the wrong things.

          An older lady in a Buick several years ago was pulling into the parking lot where I worked. I happened to be in the front of the store, and heard her tires squeal. She smashed into a parked car. That broke the parking pawl and sent the parked car across the parking lot into two other parked cars. One of those cars belonged to one of my coworkers, who wasn't exactly very happy that his car was totaled. I ran out to see if she was ok (once the cars stopped moving). She said "What happened?" I told her what she did. She was very insistent that she hit the brakes. I told her she spun the tires before hitting the first car. She said the other car must have done it. The driver of the other car was in the store at the time. At least everyone with wrecked cars had a good sense of humor about it, and no one was hurt. The funniest part was, her car was fine. There was absolutely no damage. It wasn't even scratched. The other three cars were severely damaged though. Her insurance gave my coworker full book value on his car, even though it was a rusted piece of junk that barely ran. They were fully aware of it, they were just avoiding potential legal problems.

      --
      Serious? Seriousness is well above my pay grade.
    22. Re:Why they tell you to turn off your phone... by TheLink · · Score: 2, Interesting

      > Radiation that can upset bits in an electronic circuit doesn't come from your cell phone, TV/radio stations or microwave oven
      > You may get enough EMI to interfere with your radio, but flipping individual bits in a chip pretty much requires an ion

      You don't need to flip individual bits in a chip to cause problems with car electronics. I suspect if something flipped dozens or thousands it would still cause problems. So you shouldn't get so fixated on individual bit flips.

      From the perspective of car safety, the people that are saying "outer space" seem like they're clutching at straws.

      As for the removal of lead. It actually made the tin-whisker problem bigger and thus made stuff less reliable.

      I strongly doubt the removal of lead was anything to do with making stuff more reliable by avoiding lead decay, if you can provide a decent citation for that, that'll be interesting.

    23. Re:Why they tell you to turn off your phone... by Kral_Blbec · · Score: 4, Informative

      I'm a bit skeptical of your claims about lead decay in electronics. While some isotopes of lead are radioactive, those are products of uranium decay, which as any good geek knows, goes through alpha and beta decay until it ends as a stable particle of lead-206. In that pathway there is lead-214 and lead-210 that have half-lives of half an hour and 22 years respectively. However, unless they are putting uranium in your electronics, the only lead present is going to be from mined ores that have had plenty of time to decompose into a stable form.

      The best chart of lead isotopes I found is here http://education.jlab.org/itselemental/iso082.html. I'm not sure why, but it lists a half life for lead-204 even though I thought it was supposed to be stable. Most half lives are a few minutes or hours.

    24. Re:Why they tell you to turn off your phone... by JWSmythe · · Score: 2, Funny

          Ummm, that wasn't safe mode. I did it, and my car turned into an Autobot. How the hell do I make it into a car again? I have to drive to work in the morning. It might seem cool, but having a giant robot walking down the highway is bound to freak out at least a few people. DHS may have something to say about my walking car with giant guns too.

      --
      Serious? Seriousness is well above my pay grade.
    25. Re:Why they tell you to turn off your phone... by AussieNeil · · Score: 2, Interesting

      This was indeed a real problem in the late 70's, particularly for DRAM chips and only ceased to be a problem when manufacturers tightened up on the allowable level of impurities in materials near the memory chips, such as the encapsulating plastics and the chip coatings used within ceramic ICs. Many elements have naturally occurring isotopes that are radioactive and DRAM errors are dependent on the concentration of these within materials surrounding the memory chip and the radioactive decay method. Back then of course we had atmospheric atomic testing and straw packing material was a good way to capture atmospheric fallout (and a good way to get fogged photographic film too). When you consider the effect of Moore's Law on the size of the capacitor used within the DRAM over the last 30 years (the bit flip is caused by the radioactive decay particle discharging this capacitor) and the fact we can't make perfectly pure materials at an economic cost, it is surprising that this problem is not more obvious now. I suspect software bugs are more likely to be the cause however.

    26. Re:Why they tell you to turn off your phone... by putaro · · Score: 2, Interesting

      The effect of random bit flips on software is going to be hard to define. Modern hardware probably has all of the code running in RAM, not ROM as it would have been back in the 80's. A bit flip in a register could cause very odd things to happen. Perhaps someone coded a loop like:

      for (i=0; i!=10; i++)
          do_something();

      Flip a bit in the register and that loop will not terminate until the register overflows.

      I don't think you can code so that random bit flips will not be a problem. The hardware needs to be robust enough to catch them and either fix them or at least throw an error so that things can be reloaded.
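
      A small C simulation of the loop above, to show why tightening the exit test alone doesn't solve the problem. It assumes an 8-bit counter so the wrap-around is quick to reason about; the function names and the flip_at/mask parameters are made up for the example.

```c
#include <stdint.h>

/* Counts how many times the body of `for (i = 0; i != 10; i++)` runs
 * when `mask` is XORed into the 8-bit counter during iteration number
 * `flip_at` (counted from 0). A mask of 0 means no upset. */
unsigned iterations_not_equal(unsigned flip_at, uint8_t mask)
{
    unsigned n = 0;
    for (uint8_t i = 0; i != 10; i++) {
        if (n == flip_at)
            i = (uint8_t)(i ^ mask);  /* simulated bit flip */
        n++;
    }
    return n;
}

/* Same experiment with the exit test written as `i < 10`. */
unsigned iterations_less_than(unsigned flip_at, uint8_t mask)
{
    unsigned n = 0;
    for (uint8_t i = 0; i < 10; i++) {
        if (n == flip_at)
            i = (uint8_t)(i ^ mask);  /* simulated bit flip */
        n++;
    }
    return n;
}
```

      With bit 4 flipped during iteration 3, the `!=` version runs 250 times instead of 10 because the counter has to wrap all the way around, while the `<` version stops after only 4. Neither does the intended 10 passes, which is the point: the software cannot fully paper over a corrupted register.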

      I haven't looked at the communications protocols in use between the various modules but it wouldn't surprise me if there were a lot of possibilities for errors in there as well. Software engineers will put a lot of reliance on "checksums" and swear up and down that there is no possibility for things to go wrong, but in the end it turns out the checksums used are not very robust. TCP/IP checksums, for example, are almost worthless but most TCP/IP communications take place over links with robust checksums so they're not tested very much. I implemented very simple links (TCP/IP over a VME bus - don't ask, it was a wacky idea) and found that single bit errors in the hardware could get through a single layer of the checksums quite easily (that is, it would pass the IP checksums but the TCP checksums would catch things).
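
      For a feel of how weak that style of checksum is, here is a sketch of the 16-bit ones' complement sum used by IP and TCP, written in C. It is a simplified illustration over an array of 16-bit words rather than a real packet parser, and the function name is made up. Note also that the IPv4 header checksum covers only the header, so payload corruption sails straight past that layer.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement checksum over `count` 16-bit words:
 * add everything up, fold the carries back in, then invert. */
uint16_t inet_checksum(const uint16_t *words, size_t count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += words[i];
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}
```

      Because the sum is just addition, it cannot see words arriving in the wrong order, and a bit that goes 1 to 0 in one word is cancelled by the same bit going 0 to 1 in another. A lone single-bit flip inside the covered data does change the result, so the cases that slip through are the ones outside a given layer's coverage or the ones that happen to cancel.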

    27. Re:Why they tell you to turn off your phone... by gtall · · Score: 2, Informative

      Boeing's 737 production since 1967: 6,285 aircraft
      Toyota's production in 2007 alone: 8,880,000 vehicles

    28. Re:Why they tell you to turn off your phone... by Agripa · · Score: 2, Interesting

      When you consider the effect of Moore's Law on the size of the capacitor used within the DRAM over the last 30 years (the bit flip is caused by the radioactive decay particle discharging this capacitor) and the fact we can't make perfectly pure materials at an economic cost, it is surprising that this problem is not more obvious now. I suspect software bugs are more likely to be the cause however.

      The last few process generations of DRAM have not become more susceptible to radiation induced soft errors as originally predicted but instead have leveled off or even gotten a little better. CPU static RAM based cache has an order of magnitude higher susceptibility for a number of different reasons but there, ECC (or parity for instruction cache since bad instructions can just be reloaded) has been routine for quite a while. Larger memory sizes make systems as a whole more susceptible though and the cosmic ray induced soft error rate is measurable on modern PCs with altitude making a difference of at least 2 orders of magnitude. Sea level has about 1/10th the rate of Denver which has about 1/10th the rate of a cruising passenger jet airplane.

      For DRAM, I suspect what is going on is that the smaller charge storage volume means that any given ionization event is spread over more cells while each cell's higher charge density makes it less susceptible.

      I have had full ECC support on my last three home workstations (P3 1GByte, P4 2GByte, and now a Phenom 2 8GByte since Intel was not an option) but have not recorded enough events to draw a meaningful conclusion.

    29. Re:Why they tell you to turn off your phone... by GooberToo · · Score: 2, Informative

      I don't hear much about consumer electronics being fritzed by cosmic rays,

      Chances are you'll be hearing about this more and more over the next several decades or so. Scientists have discovered a large spot over the Atlantic (IIRC) where high levels of cosmic radiation are actually making it to the ocean's surface. Further investigation indicates this is because Earth's magnetosphere is beginning to significantly weaken. Furthermore, it's expected that not only will the level of radiation exposure continue to drastically rise at this particular location, but that radiation exposure globally will drastically rise.

      It turns out, it appears this is related to the shifting of Earth's magnetic poles. As the poles continue to migrate away from their axial positions, the earth's magnetosphere begins to dramatically weaken. Not too surprisingly, the protection extended to both artificial satellites and Earth's occupants will be significantly and negatively affected.

      Accordingly, expect far more electronics failures from cosmic radiation over the next several decades and beyond. And over the next thousand years, the levels of radiation may pose a significant risk to all life on Earth - or at least those on the surface. This of course, also suggests we will have a pole reversal sometime within the next thousand years.

      Obviously far more research is required.

    30. Re:Why they tell you to turn off your phone... by Gordonjcp · · Score: 2, Insightful

      Is there a reason why cars aren't doing the same thing?

      Because there's no way that these problems are caused by "cosmic rays". If it *was* a problem, then we'd be hearing about all kinds of random electrical problems in all kinds of vehicles. Cars have had computer-controlled fuel injection and ignition for over twenty years now. Granted, the 68000-based engine management unit in my 1990 Citroen XM has a smaller transistor density than the extremely compact and powerful processors in modern systems, but if cosmic rays were flipping bits then the problem would not be confined to one manufacturer or one model.

    31. Re:Why they tell you to turn off your phone... by tlhIngan · · Score: 2, Interesting

      I don't hear much about consumer electronics being fritzed by cosmic rays, or microwave ovens, etc, though I suppose this might explain the random failures. But cosmic radiation? That's a new one.

      It's quite common actually, and many documented studies have proven it does occur. You don't hear much because well, the effects are minimal in most cases. A flipped bit in RAM does nothing if it's just unused memory, for example. Or maybe it flips the bit in an unused register (that's getting reloaded with new data). Or alters the result of an unused computation unit. Heck, there were old RAM chips made with somewhat radioactive encapsulation - the computers they were in crashed more frequently than normal.

      Other times, it may show up as a graphical glitch in a game - a fiddly pixel that goes away on next refresh, or other unnoticed operation. If it damages a critical data structure, well, an application just crashes. If it gets really lucky and gets a crucial kernel data structure, then the computer crashes/panics/BSODs.

      The amount of data damaged is on the order of a bit. Depending on the whole system, that bit could be nothing (i.e., unused), unnoticeable (a flicker in a pixel in the framebuffer), or crucial (application/OS crashes).

  2. Is there really a problem? by LostCluster · · Score: 5, Insightful

    Since the biggest Toyota runaway story has turned out to be a "problem exists between seat and pedals" situation... is this all hype with no science behind it?

    1. Re:Is there really a problem? by Anonymous Coward · · Score: 2, Insightful

      >Executables can have hashes like MD5 and SHA checked before being allowed to execute, etc.

      That's a ONE-TIME check when you load the program. Sure, it can check whether the data in the flash has started to corrupt or someone has tampered with the firmware. However, it doesn't check the memory once the code is running, which is what the code is doing 99+% of the time. Cosmic rays can hit your car ANYTIME, not just when it is parked.

      ECC checks the memory bits during access, and you can have periodic scrubbing to check for any changes. It has a higher chance of finding issues that are transient in nature.
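As a rough illustration of the ECC-plus-scrubbing idea described above, here is a minimal sketch using a Hamming(7,4) code, which can locate and repair any single flipped bit in a stored word. Real ECC memory uses wider codes implemented in hardware; the function names, the list-of-bits storage and the three-word "memory" are all assumptions made for this example.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                              # index 0 unused
    code[3], code[5], code[6], code[7] = d      # data bits
    code[1] = code[3] ^ code[5] ^ code[7]       # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]       # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]       # parity over positions 4,5,6,7
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, position_corrected); position is 0 if no error was seen."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                     # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1                     # a single flip sits at this position
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), syndrome


def scrub(memory):
    """Walk every stored word, repair single-bit flips in place, return the count."""
    repaired = 0
    for addr, word in enumerate(memory):
        nibble, pos = hamming74_decode(word)
        if pos:
            memory[addr] = hamming74_encode(nibble)
            repaired += 1
    return repaired


memory = [hamming74_encode(n) for n in (0x3, 0xA, 0xF)]
memory[1][4] ^= 1                               # simulate a particle strike on one bit
fixed = scrub(memory)
values = [hamming74_decode(w)[0] for w in memory]
print(fixed, values)
```

Note that this code only corrects one flipped bit per word. Two flips in the same word before the next scrub pass would be miscorrected, which is why scrubbing has to run often enough that errors do not accumulate.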

    2. Re:Is there really a problem? by ShakaUVM · · Score: 2, Interesting

      >>Confirmed cases of runaway acceleration are virtually non-existent.

      And how do you confirm it? Ask the person?

      My '84 Cutlass Supreme went out of control accelerating when I was driving on the campus loop (back in '97 or so), but how could you confirm this? It did happen, but how can you verify it? (I've posted the story on Slashdot before, if you really dig back into my history, long before the runaway Toyota thing entered our national consciousness.)

      And to the snarky people posting on this - it's terrifying as fuck for your car to accelerate arbitrarily fast (especially when you run a stop and have to dodge pedestrians), and no, the brakes didn't work. Long story short, I had to kill the gas and use non-power assist brakes to come to a stop, fortunately without killing anyone.

    3. Re:Is there really a problem? by MadShark · · Score: 2, Interesting

      The problem is that many microcontrollers used in automotive systems don't have support for ECC or any other hardware-based error checking mechanism. A lot of these systems only use the memory on the microcontroller chip. If there is external RAM on the unit, ECC memory isn't always used since it is more expensive. Flash is typically checksummed/CRCed/MD5 checked, but you don't typically see flash cells get flipped in the field. I've seen one unit get flash corrupted (out of many millions of possible units) in the last 11 years.
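For readers unfamiliar with the flash check mentioned above, this is a minimal sketch of a CRC comparison using Python's standard `zlib.crc32`. The 1 KB `firmware` image and the `crc_ok` helper are stand-ins invented for the example, not anything from a real engine controller.

```python
import zlib


def crc_ok(image: bytes, expected_crc: int) -> bool:
    """True if the image still matches the CRC recorded when it was built."""
    return zlib.crc32(image) == expected_crc


firmware = bytes(range(256)) * 4        # hypothetical stand-in for a flash image
stored_crc = zlib.crc32(firmware)       # value that would be stored alongside the image

corrupted = bytearray(firmware)
corrupted[100] ^= 0x08                  # flip a single bit in the image

print(crc_ok(firmware, stored_crc))          # intact image passes
print(crc_ok(bytes(corrupted), stored_crc))  # one flipped bit is caught
```

A check like this only covers the bytes it is run over, at the moment it is run. It says nothing about working RAM that changes between checks, which is the gap the comment is pointing at.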

      It will be interesting to see if they get to the root cause of the problem. If it is an electromagnetic interference problem, it will be very difficult.

    4. Re:Is there really a problem? by Jah-Wren Ryel · · Score: 2, Informative

      Since the biggest Toyota runaway story has turned out to be a "problem exists between seat and pedals" situation...

      The article you linked to does not even begin to support that conclusion. Basically it's a bunch of innuendo, like he *might* have been late on payments on the car (since proven false) or that he should have shifted it to neutral (not an intuitive action for someone who has never driven a manual transmission - and certainly a last resort that does not negate the existence of a problem to begin with). Even information released after that article was published has been far from damning - basically Toyota has said "we couldn't reproduce the problem" - as if "works for me" means there are no software bugs.

      The undisputed facts are that the brakes were severely worn (although Toyota claims that the wear is not consistent with emergency braking - huh?) and that the car's black-box showed that the guy hit the brakes over 200 times during the time of the incident and that a cop witnessed the guy practically standing on the brakes.

      Unless there is more that's come out recently, all facts released so far point to a failure with the car, not the nut behind the wheel.

      --
      When information is power, privacy is freedom.
    5. Re:Is there really a problem? by AK Marc · · Score: 2, Interesting

      And how do you confirm it?

      You replicate it and see if it happens again, or look for physical causes that might lead to that result. Loose floormats have been confirmed to cause it. Rusty/sticky throttle cables have been confirmed to cause it. Bad cruise control units have been confirmed to cause it (mostly because of physical errors, not all are electronic).

      But "the car accelerated, I applied the brake and only the brake once the acceleration started and pushed it as hard as I could and the vehicle continued to accelerate out of control" cases have, as far as I know, *never* been replicated. The brakes are somewhere around ten times more powerful than the engine. If you slam the brake pedal to the floor with all your might, you will stop all cars, unless your brakes failed before you tried to use them. So, every case of "I pressed the brakes as hard as I could with my foot off the throttle" defaults to someone that didn't have their foot on the brake and off their throttle.

      And to the snarky people posting on this - it's terrifying as fuck for your car to accelerate arbitrarily fast (especially when you run a stop and have to dodge pedestrians), and no, the brakes didn't work. Long story short, I had to kill the gas and use non-power assist brakes to come to a stop, fortunately without killing anyone.

      Another reason why manuals are better. You just put in the clutch, and the car stops accelerating. And turning off the car or putting it in neutral is so easy one wonders about the competency of the California trooper who was out of control for over a minute.

      But for brakes to not stop a car means the brakes are so bad that their failure should have been evident before the incident. Would you say the car you were in when this happened was in excellent mechanical shape without any problems braking or accelerating ever before that incident? I had a Cutlass Ciera of about that age that accelerated out of control once. It was the cruise control that got stuck in the "accelerate" position. The brakes worked. But the car is so crappy that if I'd used the brakes to hold the constant speed for 10+ seconds before trying to stop as fast as possible, they would have faded to the point they would be useless. So when people make reports, it's also interesting to me how long people are holding the brakes at low pressure before going to high pressure. Because, especially in crappy American cars, like Oldsmobiles, the brakes fade fast. They have more than enough power to stop you from 100+ mph under full acceleration, but can't do so after riding them for a mile.

  3. How about safe languages? by Anonymous Coward · · Score: 3, Funny

    I bet they still use C for these kinds of things, how about something safer, such as Eiffel?

    1. Re:How about safe languages? by istartedi · · Score: 2, Insightful

      If a cosmic ray flips a bit in the (insert safe language here) array boundary checker, then what?

      --
      For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
  4. No. by stonecypher · · Score: 4, Insightful

    There's a reason that our entire modern world doesn't come crashing to a halt around us every 30 seconds. If every CPU was vulnerable to bit flips from random radiation, every part of your house would be on fire and arcing electricity. Times Square would look like the bridge of the '60s Enterprise under attack.

    This is just some douchebag professor trying to ride the tragedies to fame. There's a reason it's always hitting the same system in the car. It's because the system is defective. There's a reason the professor has nothing but speculation to back himself up.

    This is the worst kind of charlatanry from someone who should know better. I hope his hosting school takes this very, very seriously.

    --
    StoneCypher is Full of BS
    1. Re:No. by TheGeniusIsOut · · Score: 4, Insightful

      I can't even begin to calculate the probability of a single bit flip due to impact from a cosmic ray causing unintended acceleration in multiple vehicles. Possible? Certainly, nearly anything is. Plausible? Maybe in a very broad sense of the word. Likely? Not very.

      --
      Ignorance is Bliss -- And the Opposite is True -- Genius is Madness
    2. Re:No. by SeekerDarksteel · · Score: 4, Informative

      There's a reason that our entire modern world doesn't come crashing to a halt around us every 30 seconds. If every CPU was vulnerable to bit flips from random radiation, every part of your house would be on fire and arcing electricity. Times Square would look like the bridge of the '60s Enterprise under attack.

      Actually, every CPU _IS_ vulnerable to bit-flips from radiation. That part of it is not speculation. It does occur in commodity processors, and with probabilities large enough that we have ECC RAM, and ECC and/or parity in caches. Some servers actually come with built-in hardware fault tolerance methods, because when you run hundreds of servers non-stop for years, the probability that a particle strike screws up a register on chip is non-negligible. Now, still, the probability isn't _huge_. Definitely not high enough to be causing these specific problems, especially when the failure is always in the same manner. _That_ part of it is pretty much bullshit.

      --
      The laws of probability forbid it!
    3. Re:No. by adolf · · Score: 2, Interesting

      You misunderstand my argument. That's OK -- it happens to me all the time.

      Allow me to rephrase: What are the chances of the RAM being marginally-bad in such a way as to allow unintended acceleration, while not producing any other symptoms?

      The chances of it being bad to begin with are slim (after all, all RAM is tested, often by more than one party). But this won't be just any RAM -- this will be, in today's terms, glacially slow RAM which has been tweaked to perfection over the past decade (or more), because the stuff that a Prius does just doesn't require anything lightning fast. (See, also: US space program.)

      I'll go ahead and answer the question: The chances of bad RAM causing unintended and irrevocable acceleration and no other badness are about the same as bad RAM causing your PC to boot up and say "Hello, world!" instead of loading an OS. Could it happen? Why, sure! (In other news: A thousand monkeys and a thousand typewriters will, eventually, produce the complete works of Mark Twain as long as you replace the parts when they wear out.)

      Will it happen? Ummm.......

      Will it happen more than once? Uh. Erm. *ahem*

  5. Sun UltraSPARC-II's anyone? by nbvb · · Score: 4, Insightful

    Sounds a whole lot like the e-cache parity errors in the Sun UltraSPARC-II processors.

    If you were never affected by that, consider yourself a lucky person.

    Particle-caused bit flips are very much real.

    1. Re:Sun UltraSPARC-II's anyone? by Anonymous Coward · · Score: 2, Informative

      I work with someone who used to do tech support for Sun - those flips were due to a manufacturing error - tech support were just told to tell customers it was due to 'Sun Spots'.....

    2. Re:Sun UltraSPARC-II's anyone? by Anonymous Coward · · Score: 2, Interesting

      Actually, it was due to a design error, as the cache wasn't ECC protected and occasional bit-flips weren't detected.
      http://www.sparcproductdirectory.com/artic-2001-dec-1.html

    3. Re:Sun UltraSPARC-II's anyone? by dr2chase · · Score: 2, Insightful

      Right, but then more of them would appear at higher altitudes.

    4. Re:Sun UltraSPARC-II's anyone? by asaul · · Score: 3, Interesting

      I wouldn't say error, it was designed with parity protection only, so was incapable of correcting single-bit errors, only detecting them. Hence the crashes (i.e., it detected a bit flip). If two bits were flipped you would never know.
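The parity behaviour described above can be shown in a few lines. This is a generic even-parity sketch over one byte, not the actual UltraSPARC-II cache scheme; the helper names and bit patterns are made up for the example.

```python
def parity_bit(byte):
    """Even-parity bit: 1 if the byte contains an odd number of 1 bits."""
    return bin(byte).count("1") % 2


def passes_check(byte, stored_parity):
    """True if the byte is still consistent with its stored parity bit."""
    return parity_bit(byte) == stored_parity


original = 0b10110010
stored = parity_bit(original)

one_flip = original ^ 0b00000100        # a single bit upset
two_flips = original ^ 0b00010100       # two bits upset in the same byte

single_detected = not passes_check(one_flip, stored)
double_detected = not passes_check(two_flips, stored)
print(single_detected, double_detected)
```

The single flip changes the parity and is caught, but there is no way to tell which bit moved, so the only safe reaction is to stop. The double flip restores the original parity and goes unnoticed.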

      I worked in the Sun front line call support during this time, and explaining this over and over to customers was somewhat painful. Never mind the years of mocking that still come from telling customers "it was a cosmic ray". Sun put massive effort into tracking, diagnosing and fixing this issue though. Some customers got versions of CPUs with "mirrored" SRAMs. Sad to say, I remember customers still getting errors with those.....

      The US-III chips came out with end to end ECC protection, but the problems remained. In the end it turned out to be a host of socket mounting, pin contact and design specification issues that caused the errors, mostly solved by the time the 1200MHz CPUs were out. I wouldn't be surprised if it was something similar with the US-II.

      As for Toyota, if they don't have end to end ECC they only have themselves to blame.

      --
      "If everybody is thinking alike, somebody isn't thinking" - Gen. George S. Patton
  6. Prove It, Implement Fix, Pay Out Families by eldavojohn · · Score: 4, Insightful

    If this is true, recreate the phenomenon in a lab. Test your hypothesis by exposing the circuitry in question to similar radiation in a lab. While you can't test thousands of sets of circuitry, you should be able to recreate it by increasing the amount of radiation, or by automating the testing and dosage cycle and letting it run until the malfunction is noted or another failure occurs.

    It's not out of the question, IBM noted in the 90s:

    Extensive background radiation studies by IBM in the 1990s suggest that computers typically experience about one cosmic-ray-induced error per 256 megabytes of RAM per month. If so, a superstorm, with its unprecedented radiation fluxes, could cause widespread computer failures.
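To put the quoted IBM figure in perspective, here is a back-of-the-envelope sketch. The 1 MB of controller RAM and the one-million-vehicle fleet are invented numbers for illustration, not figures from the article or from Toyota, and the rate itself was measured on 1990s memory that is powered continuously.

```python
ERRORS_PER_MB_PER_MONTH = 1 / 256       # the IBM figure quoted above


def expected_errors_per_month(ram_mb, units=1):
    """Expected cosmic-ray-induced errors per month, scaled linearly by RAM and unit count."""
    return ERRORS_PER_MB_PER_MONTH * ram_mb * units


ecu_ram_mb = 1.0                        # hypothetical RAM per engine controller
fleet = 1_000_000                       # hypothetical number of vehicles

per_car = expected_errors_per_month(ecu_ram_mb)
years_between_errors = 1 / (per_car * 12)
fleet_monthly = expected_errors_per_month(ecu_ram_mb, fleet)

print(f"one error per car roughly every {years_between_errors:.1f} years")
print(f"about {fleet_monthly:.0f} errors per month across the fleet")
```

Under these assumptions a single car would see a flip about once every couple of decades, while the fleet as a whole would see thousands a month. Most of those would land in unused memory, and cars are not powered around the clock, so this is an upper-end sketch rather than a prediction.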

    You have to fix this though. As a large manufacturer you have to accept this risk just like your competitors do. Airlines accept this risk and triple check their data because people's lives are at risk. As a car manufacturer, you are in the exact same position.

    I hope the fix they already rolled out as a recall includes triple checking data or -- if the article is correct -- we won't see a drop in these horrible accidents. I hope for drivers and public safety that it does. It's led to death and possibly wrongful incarceration. Restitution is in order. Take testing motor vehicles seriously.
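One common way to "triple check" data is triple modular redundancy: keep three copies of a value and take a bitwise majority vote on every read. Whether any airline system or Toyota's recall fix works this way is not stated in this thread; the sketch below only illustrates the general technique, and the `throttle` value is made up.

```python
def vote(a, b, c):
    """Bitwise majority of three copies: each output bit is the value at least two copies agree on."""
    return (a & b) | (a & c) | (b & c)


throttle = 0x2A                         # hypothetical stored value
copies = [throttle, throttle, throttle]
copies[1] ^= 0x40                       # one copy takes a single-bit upset

voted = vote(*copies)
print(hex(voted))
```

A single corrupted copy is outvoted by the other two. The scheme fails only if two copies are hit in the same bit position before the value is rewritten.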

    --
    My work here is dung.
  7. Space Rays, My Ass by WrongSizeGlass · · Score: 4, Funny

    Whether you subscribe to Occam's razor, or just plain old common sense, rays from outer space are not Toyota's problem (though they may be the author's problem).

    This type of thing is just plain bat shit crazy. There is a problem somewhere in Toyota's system. Either a software bug or bad chips or something real and tangible ... but rays from outer space? Please.

    If someone here on /. had posted that in the last Toyota story they would have gotten a +5 Funny.

  8. Re:Everyone Loves Space Ray by WrongSizeGlass · · Score: 3, Funny

    Tonight on CBS, a very special episode of Everyone Loves Space Ray:

    Space Ray: Hey, Deborah, did you hear what happened to my car?
    Deborah: Don't worry about it, Space Ray, you didn't cause it this time (simulated audience laughter)

    With a special guest appearance by Ace Frehley as "Just Another Confused Alien". Coming up right after "The Ghosts of Gilligan's Island"

  9. Problem IS from outer space... by AliasMarlowe · · Score: 2, Funny

    Since the biggest Toyota runaway story has turned out to be a problem exists between seat and pedals situation..

    Ignorant alien between seat and pedals. Toyotas were designed for humans to drive. 'nuff said.

    --
    Those who can make you believe absurdities can make you commit atrocities. - Voltaire
  10. What if the cosmic rays... by neiras · · Score: 3, Funny

    Single-bit errors shouldn't send the car out of control... there should be some checksum that shouldn't add up.

    What if the cosmic rays corrupted the checksum routine?

    The mind boggles!

  11. Cosmic Connection? by Anonymous Coward · · Score: 2, Insightful

    So, in the case of Toyota, these cosmic rays are very clever. They targeted cars in the US and not cars in Japan or other countries. How did the rays target selective areas of the planet? Did they choose highly litigious geographical areas?

    I predict government grants will be spawned to finance new careers (and even a new federal agency) in Terrorist Cosmic Ray Detection and Analysis (TCRDA) to protect the US from these rogue rays.

  12. Frontline Auto Engineer's Perspective by jim_k_3038 · · Score: 5, Informative

    While working for Motorola, I worked on electronic throttle control (ETC). We spent a ton of time working to make the system "fail safe". I think we all had in the back of our minds that it was only a matter of time before we would have to testify as to our engineering decisions.

    My little part of ETC involved adding a sub processor which watch-dogged the main micro. The little micro asked a series of questions of the main micro. Both processors would need to agree on all the inputs and outputs of the system. The little micro would also ask questions regarding the real-time OS (RTOS) of the main micro. The main micro would need to have tasks executing in the right order to satisfy the small micro. Lastly, the small micro would ask the main micro to perform math operations to verify accuracy. Oh, and the main micro was continuously checksumming its memory too.

    Both micros had a direct hardware disable path to the H-bridge which was delivering power to the throttle plate. The throttle plate was spring loaded, so, with power cut, the throttle plate would snap to an idle position.
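The question-and-answer watchdog described above can be sketched roughly as follows. This is not the Motorola design: the class names, the arithmetic challenge and the use of 0.0 for the idle position are all assumptions made for the example, and a real system would also check task ordering and input/output agreement.

```python
import random


def expected_answer(challenge):
    """Math check that both processors are assumed to compute identically."""
    return (challenge * 31 + 7) & 0xFFFF


class MainMicro:
    def __init__(self):
        self.corrupted = False          # set True to simulate a fault

    def answer(self, challenge):
        result = expected_answer(challenge)
        if self.corrupted:
            result ^= 0x0001            # a fault shows up as a wrong result
        return result


class HBridge:
    def __init__(self):
        self.enabled = True

    def disable(self):
        self.enabled = False            # hardware disable path: power is cut

    def throttle_position(self, commanded):
        # With power cut, the return spring snaps the plate to idle (0.0 here).
        return commanded if self.enabled else 0.0


class Watchdog:
    def __init__(self, main, bridge, rng):
        self.main, self.bridge, self.rng = main, bridge, rng

    def poll(self):
        """Ask one question; cut throttle power if the answer is wrong."""
        challenge = self.rng.randrange(0x10000)
        if self.main.answer(challenge) != expected_answer(challenge):
            self.bridge.disable()
        return self.bridge.enabled


main, bridge = MainMicro(), HBridge()
dog = Watchdog(main, bridge, random.Random(0))

healthy = dog.poll()                    # main micro answers correctly
main.corrupted = True
after_fault = dog.poll()                # wrong answer trips the disable path
position = bridge.throttle_position(0.8)
print(healthy, after_fault, position)
```

The design choice worth noticing is that the safe state needs no software at all: once power to the H-bridge is cut, the spring does the work regardless of what either processor believes.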

    Next came the electromagnetic compatibility (EMC) testing. We spent months inside huge chambers testing both radiation and susceptibility. One of the tests for susceptibility involved using a zap gun to fire a 20 kV spark at each pin of our ECU. Not satisfied with that, our customer opened one of our modules and used a sparking spark plug to slowly zap our board to failure. Bottom line, that throttle plate better never stick one way, or the other.

    In the end, it always amazed me that the whole thing would work at all. Seemed to me that the system was always seconds away from going into some kind of fail safe mode.

    No, a stray bit flip is not going to facilitate a runaway car. Least not on my system!

  13. McMurdo by Unxmaal · · Score: 5, Interesting

    When I was working for NASA, on the NISN network, we'd get these weird router crashes for the old Cisco router located at (or very near) the South Pole in Antarctica. It was always a memory problem, and I'd always have to call someone to get them to powercycle the router. It irritated me to keep bothering those guys, so I opened a case with Cisco TAC.

    The TAC guy sent a terse response, saying that particular crash was a "transient memory error" due to "alpha radiation or sun spots." That really pissed me off -- Cisco TAC just gave me a standard BOFH response! I escalated, and swung the NASA club around some, and finally got a senior engineer on the phone. "You said this router's at the South Pole, right? So that means it's at very high altitude, with very little ozone shielding, right?" "Umm, yeah." "Well there you go. There's a lot more radiation at that altitude than at sea level. Our stuff's only rated for sea level. See if they can .. I dunno, put a lead blanket over it or something."

    I relayed the info to my contact at McMurdo, and he laughed and said he'd figure something out.

    On a hunch, I checked the other two "high-altitude" routers we had, and sure enough, they both had a statistically higher failure rate for "transient memory errors".

    --
    http://unxmaal.com
    1. Re:McMurdo by Shimbo · · Score: 2, Insightful

      "You said this router's at the South Pole, right? So that means it's at very high altitude, with very little ozone shielding, right?" "Umm, yeah." "Well there you go. There's a lot more radiation at that altitude than at sea level.

      His explanation sounds a bit off; a few molecules of ozone may be good for stopping UV but I doubt it makes a lot of difference to cosmic rays.

      Just being at the South Pole is a much greater risk factor than mere altitude though, because it's where the magnetosphere funnels all the crap.

  14. IBM System/360 anecdote by Anonymous Coward · · Score: 4, Interesting

    My dad was an IBM CE (Customer Engineer) specialist on one of the models in the IBM System/360 mainframe range. He used to like telling the story about how he and another engineer were out on a customer's site trying to determine an intermittent fault. They would bring the machine up and sure enough there would be this glitch at precise intervals. They just couldn't figure out what was causing it. That was, until the other CE took a look out the window.
    After a bit he said 'Tell me when it happens'. OK... '...now' my dad said. Then he said 'I'll tell you when the next one happens' and a few seconds later said '...now'. Which is exactly when it did glitch.
    It turned out that the customer's DP center was situated close to an airport. The CE could see the radar dish revolve at the end of the runway. When it pointed straight at him was when the glitch occurred. Needless to say the computer room received some RF shielding.

  15. Weird by AmonTheMetalhead · · Score: 3, Interesting

    Having heard all these stories really makes me wonder. I live in Belgium, where cars with manual gearboxes are the norm, and I've had my car accelerate like nuts once (the pedal got stuck because of the floor mat). I shifted to neutral, turned off the engine and used my momentum to get to the side of the road, where I could dislodge the mat.

    Are manual gearboxes that rare in the States?

  16. Reverse car analogy? by mangu · · Score: 2, Interesting

    It seems like the only people who don't trust Toyota anymore are people who drive non-Toyota vehicles. It reminds me of the Linux users who say Windows crashes all the time.

    Wrong analogy. Windows does crash a lot. It should be "It reminds me of Windows users who say Linux isn't ready for the desktop".

    Funny, this is the first time I ever saw a computer analogy used to explain a car problem on Slashdot. But, come to think of it, this is a rather neat analogy. Toyota is blaming their problems on driver error, Microsoft says third-party drivers are the only cause of crashes in Windows ever since XP came out.

    Both of these corporations are *wrong* at that, any system should be resistant to outside errors.

    A computer shouldn't crash just because a hardware driver fails. I have seen several Linux computers freeze when running some graphics applications, ATI cards are particularly prone to this, but you can still enter through the network and kill the offending application or, at worst, restart the windowing system. The fault with Windows is not the third-party hardware driver, it's the windowing system being built into the operating system.

    Likewise, a car shouldn't depend entirely on one computer system for operation. Brakes, even with anti-lock, should have a hydraulic system that should always be able to stop the wheels from turning if the driver presses hard enough on the pedal. The transmission should have a mechanical lever that puts it into neutral. Steering should be operable by mechanical links from the wheel if the power-assisted system fails.

    All this because a broken mechanical link or a leaking hydraulic system can be seen, or heard, but a software bug will remain lurking there undetected until it kills you.