Slashdot Mirror


Intel Patents On-Chip Cosmic Ray Detectors

holy_calamity writes "Intel has been awarded a patent for building cosmic ray detectors into chips, to guard against soft errors where a high energy particle from space changes a value in a circuit. It's a problem that largely only affects RAM. As component sizes shrink further, "this problem is projected to become a major limiter of computer reliability in the next decade", says the patent. Intel's solution is to build in a detector that responds to cosmic errors by repeating the latest operation, reloading previous instructions, or rolling back to a previous state. You can also read the full patent."

100 comments

  1. Butterflies by mathnerd314 · · Score: 1, Funny

    I guess the butterfly stunt in http://xkcd.com/378/ wouldn't work after all.

    --
    Quidquid latine dictum sit, altum viditur.
    1. Re:Butterflies by Xacid · · Score: 2, Funny

      I actually thought this would have been more of a successful excuse to insert the phrase "Pick it up and reverse it" within a patent, but I was sadly disappointed. :) Yes, I'm referencing this: http://xkcd.com/153/

    2. Re:Butterflies by Thing+1 · · Score: 1

      Yes, I'm referencing this: http://xkcd.com/153/
      The math seems wrong (bit 6 on the bottom row should be 1, not 0).
      --
      I feel fantastic, and I'm still alive.
    3. Re:Butterflies by Jehosephat2k · · Score: 1

      Umm, check again....

  2. Mainframes allegedly already do this by Florian+Weimer · · Score: 2, Interesting

    But you can't really verify it because those events are so rare. It seems to me that Intel's innovation is to use some sort of detector, instead of using two or more chips and a comparator. It's probably way cheaper, but it won't work if the majority of unexplainable events are not, in fact, caused by cosmic rays but by some other effect (perhaps something temperature-related).

    1. Re:Mainframes allegedly already do this by Ceriel+Nosforit · · Score: 3, Interesting

      I saw a display in the visitors' center at CERN that detected cosmic rays. A cloud chamber, maybe.

      Either way, the... 2m by 2m (IIRC) display would detect cosmic rays about once every 2 seconds. This would mean my PC case is perforated by cosmic rays several times each minute. That's not rare.

      --
      All rites reversed 2010
    2. Re:Mainframes allegedly already do this by confused+one · · Score: 2, Informative

      Actually you can prove cosmic rays cause memory errors. IBM did so in the 90's; there was mention of this (and a link) in the article. As memory cells become smaller they WILL become sensitive to ionizing radiation. Intel seems to think we will get there sometime in the next decade or so.

    3. Re:Mainframes allegedly already do this by kesuki · · Score: 1

      that would be why servers etc all use ECC type RAM, with that extra parity bit, it's easy to detect the corruption of memory, and re-do whatever needs to be redone. the difference is that now intel is putting in some sort of cosmic ray detector, rather than detecting what happens to the ram...

      it seems painfully inefficient to 'redo' stuff that doesn't seem to be wrong just because a cosmic ray was detected. it's not like cosmic rays can be easily blocked, either, you could put the computers under a mountain, or design some sort of force field... that would drastically increase the electric draw.. the easiest solution is to simply stop shrinking transistors before they become unreliable.

    4. Re:Mainframes allegedly already do this by SeekerDarksteel · · Score: 3, Insightful

      it seems painfully inefficient to 'redo' stuff that doesn't seem to be wrong just because a cosmic ray was detected.

      1) The likelihood of a cosmic ray is ridiculously small. So small in fact that the cost of rewinding progress when they are detected would be completely unnoticeable.

      2) We *do* have the ability to package CPUs such that they are protected from cosmic rays. The problem is that the packages are so large and expensive that no one would buy them given the current probability of soft errors.

      So the solution is most definitely NOT to stop shrinking transistors. Even in 10 process technology generations, the mean time to a soft error actually affecting a bit on a CPU is something like 1 million hours. Never mind whether or not that particular soft error is critical.

      --
      The laws of probability forbid it!
    5. Re:Mainframes allegedly already do this by kesuki · · Score: 3, Informative

      well, if the detector is the size of a penny, then yes probably pretty rare to detect cosmic rays... but if the detector is the size of a pc case, it will get hits every few seconds. cosmic rays ARE very common, and not all of them are magnetically deflected, or stopped by the atmosphere. they just happen to be very small, and the frequency of hits to a small target is less than to a large target. about 8% of the radiation humans are exposed to each year are from cosmic rays. http://en.wikipedia.org/wiki/Cosmic_ray

      so clearly to a human sized target, the impact ratio is significant.

    6. Re:Mainframes allegedly already do this by palegray.net · · Score: 2, Interesting

      There's a reason satellites are chock full of Z80 processors: reliability in higher radiation environments.

    7. Re:Mainframes allegedly already do this by mrmeval · · Score: 2, Interesting


      http://www.allbusiness.com/technology/software-services-applications-programming/6493163-1.html

      MIL-STD-1705A radiation-hardened processors would be another choice. This company offers Linux support for what is normally so damned proprietary it's sekret. I don't know their product but just about anything that allows C to supplant ADA and JOVIAL can't be all bad.

      --
      I'd go on a Vegan diet but the delivery time from Vega is too long. --brownkitty
    8. Re:Mainframes allegedly already do this by lordholm · · Score: 1

      Most high end machines support ECC RAM if that is what you mean. Recovering from failures in CPUs is another issue.

      In the space sector we have fault-tolerant CPUs that have ALUs and register in triple redundant configurations.
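      A minimal sketch of the triple-redundancy idea, assuming a plain bitwise majority vote over three copies of a register. The function name and the 8-bit values are illustrative and not taken from any real flight CPU:

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of the same value."""
    # A result bit is 1 only when at least two of the three copies hold a 1.
    return (a & b) | (a & c) | (b & c)


good = 0b10110010
upset = good ^ 0b00001000  # one copy suffers a single bit flip
print(bin(majority_vote(good, upset, good)))
```

      A single upset in any one copy is outvoted by the other two; two copies upset in the same bit position would still get through.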

      It is exciting that the large CPU manufacturers are taking this seriously now, this might mean that we can fly COTS CPUs in the future space missions (a system I am working on is using a 25 MHz SPARC v7 (you cannot even get the v7 manual anymore), so you can imagine how big a difference it would be to have Intel stamping out 2-3 GHz rad-tolerant CPUs compared to what we are using now).

      --
      "Civis Europaeus sum!"
    9. Re:Mainframes allegedly already do this by kg123 · · Score: 1
      yes... mainframes do this already.... a lot of the time people talk about "cosmic background radiation" (which does occur) but historically another big source of radiation is small amounts of radioactive heavy metals sneaking into solders... which can be more impactful just based on proximity to silicon.

      I haven't read the patent.... but I wonder if there are issues with the detector location vs where the event occurs... how many detectors does it take to cover a 20X20 mm chip? is it really worth that real-estate in the dead center of the chip? (I think this is why traditionally cross-checking or voting is the preferred method)

    10. Re:Mainframes allegedly already do this by FractusMan · · Score: 1

      I work for Sun Microsystems, and it's not that rare. Perhaps it's because I see the errors when they happen on all the systems we have across North America as people report them, but I get at least a few hits a day regarding single bit ecache errors. I would say they happen probably once, in a five year period, for each processor. Well, maybe less. Actually, the more I think about the amount of systems we have out there, and that there's a few a day... Hell, I guess it is pretty rare. But it happens!
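      A quick back-of-the-envelope check of those two figures, assuming "a few hits a day" means roughly three. That constant is a guess for illustration, not a reported number:

```python
ERRORS_PER_DAY = 3            # assumed reading of "a few hits a day"
YEARS_PER_ERROR_PER_CPU = 5   # "once, in a five year period, for each processor"


def implied_fleet_size(errors_per_day, years_per_error):
    """Processors needed to produce the observed fleet-wide error rate."""
    # Each processor contributes one error every (years_per_error * 365) days.
    return errors_per_day * years_per_error * 365


print(implied_fleet_size(ERRORS_PER_DAY, YEARS_PER_ERROR_PER_CPU))
```

      Under those assumptions the reports would correspond to a fleet of a few thousand processors, which fits the "pretty rare, but it happens" conclusion.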

    11. Re:Mainframes allegedly already do this by Iron+Condor · · Score: 1

      As memory cells become smaller they WILL become sensitive to ionizing radiation.

      I have heard this claim before and I have yet to see any kind of credible argument for this. The ionization energy loss of a charged particle penetrating matter is proportional to the distance travelled -- so a smaller memory cell may need less energy to "flip" it, but it will also receive less energy by a passing particle. Thus if the aspect ratio (thickness to length) doesn't change, I see no particular reason why smaller transistors should be "more sensitive" to cosmics. To the contrary: a smaller area means they're harder to hit by a passing particle.

      As far as I can tell, cosmic rays deposit so-much energy, with such-much probability, per time, per area, for a given slab of silicon -- and making the slab thinner only reduces the energy deposit.

      If we keep the thickness constant, we just hit smaller targets. So instead of one bit-flip per week on our 1MB chip, we now get one bit-flip per week on our 1GB chip (which has the same area as the 1MB chip but smaller transistors).

      Is there some kind of credible source for the claim that smaller transistors somehow lead to greater sensitivity to cosmics? (And no, I do not consider any kind of claim by anybody in any kind of journal to be "a credible source" if it is merely regurgitating the same claim without somehow backing it up).

      --
      We're all born with nothing.
      If you die in debt, you're ahead.
    12. Re:Mainframes allegedly already do this by kg123 · · Score: 1

      uhhh usually they refer to alpha particles.... you know, helium nuclei.... those tend to not be uniformly distributed... but "lumpy". Alpha particles can have a couple of MeV of energy... but it may only take 1 MeV of energy to "upset" the smallest of latch structures. I know chip designers who had to spend man-months analysing which unprotected latches held which system critical bits and how susceptible each latch type was to radiation.... that was 90 nm.... imagine a 45 nm part. there is also a pretty well known incident of Sun servers that shipped without ECC memory that had a huge reliability problem... the only solution was to move all the susceptible parts to the basement/caves...

    13. Re:Mainframes allegedly already do this by Florian+Weimer · · Score: 1

      I saw a display in the visitors' center at CERN that detected cosmic rays. A cloud chamber, maybe.

      Detecting cosmic rays is not the problem. It's difficult to verify that a piece of hardware is actually fault tolerant with regard to cosmic rays. (I don't believe in mainframes, that's why I put the caveat into the original posting. 8-)

    14. Re:Mainframes allegedly already do this by Florian+Weimer · · Score: 1

      Most high end machines support ECC RAM if that is what you mean.

      No, mainframe CPUs typically run in pairs or triples and are supposed to recover from errors (and not just those caused by cosmic rays).

      It is exciting that the large CPU manufacturers are taking this seriously now, this might mean that we can fly COTS CPUs in the future space missions (a system I am working on is using a 25 MHz SPARC v7 (you cannot even get the v7 manual anymore), so you can imagine how big a difference it would be to have Intel stamping out 2-3 GHz rad-tolerant CPUs compared to what we are using now).

      I suppose the radiation in space is on a somewhat different level, so you still need special rad-hard chips. I guess you can consider yourself lucky if you're locked on SPARC v7 because compared to other options, it's still reasonably close to some industry standard.
    15. Re:Mainframes allegedly already do this by gumbi+west · · Score: 1
      there is actually a constant stream of muons, neutrons, gammas, and electrons streaming down from above. The number passing through a chip is much higher than what you calculate on a per minute basis. In general neutrons and gammas will pass through a thin item (like a chip or gas chamber) and when they do deposit energy, there will be a lot deposited right there. Muons and electrons on the other hand leave a constant trail of ionized particles behind them.

      The real question is, how much energy deposition is necessary to cause a bit flip. To answer this, there are many papers on the bit flip rates to various chips to various kinds of memory measured at accelerators and the like. There are also maps of cosmic ray (by particle and energy) density across the US and the world. Canada is actually right below the magnetic North and so it gets the highest flux rate because of the minimum amount of deflection, and the US is a close second.

    16. Re:Mainframes allegedly already do this by RedWizzard · · Score: 1

      This would mean my PC case is perforated by cosmic rays several times each minute. That's not rare.

      But who cares if a ray hits your power supply? It's got to hit the right piece of silicon on the memory chip. The target is many orders of magnitude smaller in volume than the case.
    17. Re:Mainframes allegedly already do this by RedWizzard · · Score: 1

      well, if the detector is the size of a penny, then yes probably pretty rare to detect cosmic rays... but if the detector is the size of a pc case, it will get hits every few seconds.

      So what? It doesn't matter if a ray hits the case, it only matters if it hits a vulnerable part of a memory chip. The target is a lot closer to the size of a penny than the size of a case.
  3. How? by mistersooreams · · Score: 4, Interesting

    How did they manage to build a detector that can work out whether the cosmic rays collided with the actual bits (no pun intended) that hold the data? According to the oracle, cosmic rays collide with nuclei in an essentially random way, so there's no way a detector could just see a ray passing through and know whether it was on a collision course. Perhaps they are detecting the pions and other subatomic particles that result from a collision actually occurring? If they've found a way to do that then it sounds fairly ingenious to me and a well-deserved patent.

    1. Re:How? by hedwards · · Score: 3, Informative

      They didn't, they've created a detector which works out whether the chip was hit by a cosmic ray or not. Then the ram is somehow restored to the state previous to the last operation and that operation is then repeated. I'm not even sure that hit is the right word, they've developed a detector that is capable of knowing when a cosmic ray travels through the same space as the chip, I don't know that they care whether or not the ray actually hit something or just traveled through the open space between the atoms.

      It's a lot less likely to cause problems than trying to guess which bit it was, and far less expensive than building a RAIMM(TM) to compensate for it.

    2. Re:How? by johnny+maxwell · · Score: 2, Insightful

      They didn't, they've created a detector which works out whether the chip was hit by a cosmic ray or not.

      As the GP said, there is no way of knowing whether a cosmic ray passed through you or not. The cosmic ray could easily just smash your bit to a new, random state and pass happily unhindered through the actual detector thingy. Only way to improve the situation would be to build a large detector volume (at least a couple cm^3).

    3. Re:How? by Abjifyicious · · Score: 1

      I think you missed the point the parent was trying to make. There's a catch-22 going on here: you can only detect a cosmic ray by interacting with it, but if you can interact with it then it's not a problem, because once it interacts with something then it's gone. All in all, this sounds suspiciously like a patent on parity ram disguised as something else.

    4. Re:How? by deblau · · Score: 2, Informative
      Next time, please read before posting. Oh wait, I must be new here.

      In some embodiments, the cosmic ray detector detects the debris tract of a cosmic ray. In some embodiments, the cosmic ray detector includes large, distributed P-N junctions to gather charge. In some embodiments, the cosmic ray detector includes optical cosmic ray detectors embedded into some optically clear supporting insulator such as diamond thermal spreaders. For example, one million electron-hole pairs may create a large number of recombination photons. In some embodiments, a scintillator panel (which gives off small flashes of light (photons) when a charge particle passes through it), a light guide to direct light from the scintillator, and photon detectors may be used.

      In some embodiments, the cosmic ray detectors include an array of micro-electro-mechanical systems (MEMS). MEMS cosmic ray detectors may be an integration of mechanical elements, sensors, actuators, and electronics on a very small scale. The cosmic ray detectors may include tips or other strain detectors to detect the shockwave from the nuclear collision by means of acoustic waves propagating through the substrate.

      --
      This post expresses my opinion, not that of my employer. And yes, IAAL.
    5. Re:How? by Anonymous Coward · · Score: 0

      That seems rather simple, just change random bits back, that ought to work . . .. ..

    6. Re:How? by Waffle+Iron · · Score: 4, Informative

      but if you can interact with it then it's not a problem, because once it interacts with something then it's gone.

      With cosmic rays, it's not just "gone". Instead, you get a shower of new energetic particles generated by the collision which compounds the risk of operational errors. The patent specifically mentions alpha particles knocked out of the atoms in the chip by the ray which travel through the circuits causing havoc.

      The patent also mentions that the detector may sense side effects of collision (such as voltage spikes) rather than the ray particle itself. Thus, the damage has already been done by the time the detector sees the event.

    7. Re:How? by andruk · · Score: 0

      As a physics student taking a class/lab on subatomic particles/rays, this guy is 100% correct. "Interaction" is NOT the same as "annihilation". The only thing I can really think of trying is a Faraday cage, and I don't even know if that would work or not, or if there would be any unintended consequences from it. Funny thing is, one of my prof's that is a ham guy has what is basically a vault for fooling around with circuits that are very sensitive, as there are radio towers located nary a few miles away.

      So, yeah, just because rays "interact" does not mean that the ray is "annihilated".

    8. Re:How? by Agripa · · Score: 1

      Cosmic ray impacts can inject carriers into the substrate which is a detectable condition. In older processes and especially in CMOS processes that do not use well isolation based on an insulator, carrier injection into the substrate can cause problems from random bias changes to destructive SCR latchup. You can see this in some analog processes where multiple circuits share the same substrate when you overdrive an input or output pin forward biasing the protection diode injecting carriers into the substrate and causing very odd problems in seemingly unrelated areas of the chip.

      Early radiation hardened processes avoided using junction isolation for this very reason. By avoiding it, they became much more resistant to single event upset.

    9. Re:How? by Profane+MuthaFucka · · Score: 2, Funny

      Wait a second... I'm pretty sure that none of what you said is in the Bible.

      --
      Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
  4. ECC Memory not good enough? by beefsprocket · · Score: 4, Insightful

    Cosmic ray detector certainly makes for better marketing hype than ECC.

    1. Re:ECC Memory not good enough? by Anonymous Coward · · Score: 0

      ECC = Estimator of Cosmic Collisions.

    2. Re:ECC Memory not good enough? by Tablizer · · Score: 2, Funny

      Cosmic ray detector certainly makes for better marketing hype than ECC.

      Yeah, its utterly ridiculous to believe that strange radiation from outer space can mes#[!^ ~` . '

  5. Rollback? Repeat last operation? Not likely. by BenJeremy · · Score: 2, Interesting

    It's just as likely registers could be corrupted, or the "rollback" state. Wouldn't it be easier to have, I dunno, maybe error correction/detection involved, instead of some arbitrary cosmic ray detector?

    Sometimes the more "esoteric" designers attempt to get, the more potential for disaster there is.

    Cosmic ray detection would be far better for random number generation than anything else.

    1. Re:Rollback? Repeat last operation? Not likely. by RedWizzard · · Score: 1

      It's just as likely registers could be corrupted

      No, it's not. CPU registers are a few hundred bytes worth of storage. Assuming the registers have the same density as the main memory then the area taken up by registers is millions of times smaller than the main memory. Even if you include cache, the main memory is by far the largest target - by orders of magnitude.
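      The "millions of times smaller" figure can be sanity-checked with a rough ratio. Both sizes below are assumptions picked for illustration (512 bytes of registers, 2 GiB of RAM), and equal storage density is assumed as in the comment:

```python
REGISTER_BYTES = 512             # assumed: "a few hundred bytes"
MAIN_MEMORY_BYTES = 2 * 1024**3  # assumed: 2 GiB of main memory

# With equal density, the area ratio is simply the capacity ratio.
ratio = MAIN_MEMORY_BYTES // REGISTER_BYTES
print(f"main memory is {ratio:,} times the size of the register file")
```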
  6. Just fantastic by elrous0 · · Score: 3, Funny

    I know at least four people who REALLY could have used this. Oh well, too late now.

    --
    SJW: Someone who has run out of real oppression, and has to fake it.
    1. Re:Just fantastic by Anonymous Coward · · Score: 0

      An allusion to Fantastic Four...

    2. Re:Just fantastic by f8l_0e · · Score: 0

      Being a super hero yourself, Captain Obvious, I'm sure you hang out with them quite often. Maybe you could help them solve the mystery of the anonymous joke killer.

  7. patenting ideas is patently stupid... by mnewcomb · · Score: 1

    /sigh They haven't built it... They haven't even designed it... They don't even know if it is possible... Yet... they can patent it? Patents should require proof of a working/workable DESIGN...

  8. Networking possibilities for science? by RalphBNumbers · · Score: 1

    It seems to me that, even if the individual detectors are very simplistic, and the geocoding of inputs is very rough, there would probably be some interesting scientific uses for a multi-million node planet-sized distributed cosmic ray detector.

    Does anyone in a relevant field see a good use for this?

    --
    "The worst tyrannies were the ones where a governance required its own logic on every embedded node." - Vernor Vinge
    1. Re:Networking possibilities for science? by the_brobdingnagian · · Score: 2, Insightful

      I work on distributed cosmic ray detectors. The patent is very sparse with details, so it's difficult to say much about it. The biggest problems I see are timing and data analysis. The detectors need to have a synced clock to within a few nanoseconds. This is possible with GPS if you know all the circuitry and the delays therein. But I don't think you could do it in normal pc's. Now each pc needs at least two detectors to do some triggering before you send the data. If you don't you'll end up with huge amounts of "noise" data. After that you still have a huge pile of raw data collected from a collection of (probably crappy) detectors who are not calibrated.
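      The two-detector triggering step can be sketched as a coincidence filter: keep a hit only if the second detector fired within a short time window. This is a simplified illustration, not how any particular array does it; the 50 ns window, the function name, and the sample timestamps are all assumptions:

```python
def coincidences(hits_a, hits_b, window_ns=50):
    """Return (t_a, t_b) pairs where both detectors fired within window_ns.

    hits_a and hits_b are sorted lists of timestamps in nanoseconds.
    """
    pairs = []
    j = 0
    for t_a in hits_a:
        # Skip hits on detector B that are too old to match t_a.
        while j < len(hits_b) and hits_b[j] < t_a - window_ns:
            j += 1
        # Accept the earliest remaining B hit if it falls inside the window.
        if j < len(hits_b) and abs(hits_b[j] - t_a) <= window_ns:
            pairs.append((t_a, hits_b[j]))
    return pairs


detector_a = [100, 5_000, 12_000, 40_000]
detector_b = [120, 7_500, 12_030, 90_000]
print(coincidences(detector_a, detector_b))
```

      Only the matched pairs would be sent on for analysis; unmatched single hits are treated as noise and dropped locally.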

    2. Re:Networking possibilities for science? by VENONA · · Score: 1

      "After that you still have a huge pile of raw data collected from a collection of (probably crappy) detectors who are not calibrated."

      That may be the best argument against the whole 'wisdom of crowds' Web 2.0 thing that I have ever heard.

      --
      What you do with a computer does not constitute the whole of computing.
  9. Shouldn't take long... by canada_dry · · Score: 3, Interesting

    It won't take long for someone to figure out how to detect the gamma errors and create what amounts to a Geiger counter on laptop computers. If this bill passes http://www.villagevoice.com/news/0803,thompson,78873,2.html will everyone be required to get a permit for their laptop computers? ;-)

  10. Poster Child by Gresyth · · Score: 0

    I nominate this patent as the poster child for Patent Reform.

    --
    Tech Support: "No, sir...clicking on 'Remember Password' will NOT help you remember your password."
  11. Current work and contribution of this paper by quo_vadis · · Score: 5, Interesting

    Currently, chips (both computational and memory) are protected against soft errors using multiple methods. There are rad hardening methods (both hardware and software) and most of the latest research involves using error correcting codes. Simply duplicating the output and comparing can only detect errors in one bit. The more times you duplicate, the more you can detect (it progresses as n-1), and the max length of error that can be corrected is half that. However, this takes a lot of space (duplication that is), so generally other codes such as Hamming or BCH codes are used.
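    As an illustration of the kind of code mentioned above, here is a minimal sketch of the textbook Hamming(7,4) code, which stores 4 data bits in 7 and corrects any single flipped bit. The function names are made up for this example, and real ECC memory typically uses wider codes:

```python
def hamming74_encode(d):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the offending bit back
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
codeword = hamming74_encode(word)
codeword[5] ^= 1                      # simulate a single-bit soft error
print(hamming74_decode(codeword) == word)
```

    The overhead is 3 check bits per 4 data bits here, compared with 8 extra bits for triplicating the same 4 bits.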

    The main problem with using codes is that cosmic ray errors cause what are called single event upsets (SEUs), and most codes cannot detect 100% of errors where the Hamming weight of the error (the number of ones in the error vector) is larger than the code was designed for. The problem comes when the SEU manifests itself as a multi-bit fault and the error vector cannot be detected by the code. SEUs are the most common type of error in space applications: see http://www.eas.asu.edu/~holbert/eee460/see.html

    The contribution of the cosmic error detector is that if you know you have a cosmic ray at some point in time, you can flush and redo your computation (for computation channels eg microprocessors etc) or flush that line in memory (for memory channels) in case of SEU's and that is a pretty big deal.
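    The flush-and-redo idea can be sketched in software as a checkpoint plus retry loop. This is only an analogy under stated assumptions: `detector_fired` stands in for the hypothetical on-chip detector signal, and the patent does not describe this interface:

```python
import copy


def run_with_rollback(state, operation, detector_fired, max_retries=3):
    """Apply operation to state, redoing it if a strike is reported.

    state is never modified, so it serves as the checkpoint.
    detector_fired() is a hypothetical callback returning True when
    the detector saw an event during the attempt.
    """
    for _ in range(max_retries + 1):
        scratch = copy.deepcopy(state)   # work on a copy of the checkpoint
        result = operation(scratch)
        if not detector_fired():
            return result                # no strike reported: keep the result
        # Strike reported: discard the result and redo from the checkpoint.
    raise RuntimeError("detector fired on every attempt")


strikes = iter([True, False])            # simulated detector: fires once, then quiet
state = {"acc": 41}
new_state = run_with_rollback(
    state,
    lambda s: {**s, "acc": s["acc"] + 1},
    lambda: next(strikes),
)
print(new_state)
```

    Note that the result is discarded whenever the detector fires, whether or not the computation was actually corrupted, which is the trade-off other comments in this thread argue about.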

    --
    Legally obligatory sig : My opinions are my own... etc etc
    1. Re:Current work and contribution of this paper by museumpeace · · Score: 4, Interesting
      you mention rad hardening...some of that tech. would have been first needed in military satellites and so not necessarily divulged in a patent. One kind of rad-hardened circuit, which used to be prohibitive but is feasible with advances in solid state fab, requires a particular kind of redundancy. It has been described in prior literature kinda like this: build a functional duplicate of each storage or processing element in a parallel layer so that ...
      • each element is aligned right over its corresponding element in the 2nd layer.
      • bias the logic of one layer such that the burst of conduction band electrons that would accompany a gamma ray hit will report a false "1" if anything.
      • bias the corresponding logic in the other layer so that that same burst of electrons...which will befall it at exactly the same time and place as its aligned circuit...will fault to a "0", if anything
      • gate the primary layer's output by the !XOR of the two layers: whatever the state of the circuit was supposed to be, it will be disabled until the transient from the gamma ray has been quenched
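      The gating step in the list above can be modelled with two bits and an XNOR. This is a toy truth-table sketch, assuming one layer glitches toward 1 and the other toward 0 as described; the names are invented for the example:

```python
def gated_output(layer_high: int, layer_low: int):
    """Return the stored bit, or None while the two layers disagree.

    layer_high is the copy biased to glitch toward 1,
    layer_low is the copy biased to glitch toward 0.
    """
    valid = not (layer_high ^ layer_low)   # XNOR: true only when both layers agree
    return layer_high if valid else None


def during_strike(stored_bit: int):
    """Model a hit: one layer is forced to 1, the other to 0."""
    return 1, 0


for bit in (0, 1):
    print(bit, gated_output(bit, bit), gated_output(*during_strike(bit)))
```

      Whatever the stored bit was, a hit drives the layers apart, so the output is held invalid until they agree again.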
      --
      SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
    2. Re:Current work and contribution of this paper by pyro_peter_911 · · Score: 1

      SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
      I see we've both made it to the correct place.

      Peter

  12. Processor instruction retry by br1an.warner · · Score: 3, Informative

    POWER6 has actually been shipping with this for a while - if an instruction fails (cosmic ray or not, although in terms of random bit-flipping events they account for a large percentage), it gets automatically retried, transparently to the rest of the system. Without this sort of thing you generally take a hard fault - so this type of protection is great to see. Same thing on a SPARC64, incidentally (but not UltraSPARC - ie Niagara or children). What sets the POWER6 apart from both SPARC64 and this patent is that if the instruction fails repeatedly (possibly indicating a chip fault), in many cases it can actually back the instruction out of the failing core and slap it onto another core, also transparently and avoiding a hard crash. Someone noted that this has been done on mainframes for years - yup, also true. This is another case of UNIX-class technology making inroads up the platform stack.
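    The retry-then-migrate behaviour can be sketched as a nested loop. This is a software analogy only, not how POWER6 implements it in hardware; the function names, the retry count, and the use of `RuntimeError` as the fault signal are all assumptions:

```python
def execute_with_retry(instruction, cores, retries_per_core=2):
    """Try instruction on each core in turn, retrying before moving on.

    instruction(core) returns a result or raises RuntimeError on a
    detected fault. Returns (result, core) on the first success.
    """
    for core in cores:
        for _ in range(retries_per_core + 1):
            try:
                return instruction(core), core
            except RuntimeError:
                continue   # possibly a transient fault: retry on the same core
        # Repeated failures suggest a hard fault: move on to the next core.
    raise RuntimeError("instruction failed on every core")


def add_on(core):
    """Toy instruction: core 0 is modelled as permanently faulty."""
    if core == 0:
        raise RuntimeError("fault detected on core 0")
    return 2 + 2


print(execute_with_retry(add_on, cores=[0, 1]))
```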

  13. cosmic by hhawk · · Score: 1

    In the late 70's TC May, a scientist working at Intel proved that cosmic rays could flip bits... given that discovery was many years ago, it seems rather clear that as chips get smaller, etc. that cosmic ray detection could be a good thing on chips. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1479948

    --
    http://www.hawknest.com/
  14. If the problem is with RAM... by sbaker · · Score: 3, Informative

    For RAM - there is really no problem - just use error checking. It's got to be easier to add an extra couple of bits to the width of your RAM to permit error-correction than to have a cosmic ray detector for every single bit.

    The tricky problem isn't RAM - it's computational elements. There is no single way to error-correct computational elements because they are so diverse. A multiplier would need different protection to an adder which is different from a shift-register. Hence, the idea of rolling back (say) the last instruction executed and having a "do-over".

    But for large arrays of homogeneous circuitry - like RAM - this doesn't seem worth the effort.

    --
    www.sjbaker.org
    1. Re:If the problem is with RAM... by saltydog56 · · Score: 3, Informative

      You are exactly right that the real problem is not the functionality of the memory chips, but rather the processor chips, for a number of reasons. (Having said that, it is very likely that a significant portion of the problem of soft CPU chips is the on-chip level one cache.)

      On a regular basis I participate in the "radiation testing" of laptops intended for use on both the Space Shuttle and the International Space Station. This testing is normally done at Indiana University's Cyclotron Facility in Bloomington, Indiana. This past fall we completed testing on a group of laptops which implemented Intel's dual core Centrino Pro processors. Testing is conducted by hitting each of the components in the laptop with a proton beam while monitoring for induced errors.

      While the results of the testing varied by memory manufacturer, by far the softest component in the laptop was the CPU itself. That said, these processors actually did fairly well compared to some of the previous generations of CPU chips we have tested over the years.

      The rule that the smaller the die size, the greater the error rate does not seem to apply. For example, a number of years ago we tested a number of laptops using the Intel Pentium 3 mobile chip. Performance was so dismal that the decision was made not to procure any system based on that chip.

      Later testing of laptops based on the Pentium 4 mobile chip showed a dramatic turnaround - the Pentium 4 mobile chip, with its smaller die size, actually outperformed both the Pentium 3 mobile and the Pentium 2 chips then used for on-orbit operations. Our group does not do any analysis of "why" a failure occurred, only the collection of data to assist in the selection of suitable devices for use on the Shuttle and the ISS.

      The bottom line - die size is only one of the factors which come into play in determining how a chip will perform when hit by ionizing radiation. (one of my favorite theories is the declining deltas between a 1 and a 0 - in days gone by it could have been as much as five volts but is commonly down to around 1 volt in today's modern processors - this could serve to bring any electrical disruption caused by a particle strike closer to the threshold of changing a one to a zero - but what do I know, I am just a software guy)

      The concept of building a detector into chips is interesting, but not enough detail is provided to make a judgment on its feasibility. Single Event Upsets (SEUs or Bit-flips) are caused when a sub atomic particle such as a proton or a heavy ion slams into the silicon causing either an electrical disruption or damage to the silicon itself.

      The key here is that these particles are so tiny compared to the circuit itself that, from my perspective, unless the "detector" somehow encapsulates the whole circuit it is unlikely to even notice the passage of a proton or other particle. To make detection even more difficult, you must remember that you are working in a three dimensional environment - you cannot predict the particle's direction of travel, its energy level, or the location of a "strike".

      However, dealing with the effects of radiation on electronic components is something we are going to have to learn to deal with someday, so this research is both exciting and worthwhile.

  15. Cosmic-Ray-Detecting@Home? by obi · · Score: 1

    I feel another distributed computing project coming up: after SETI@home, and Folding@home, maybe this would make for an interesting way to get statistics on cosmic rays?

  16. Why don't they... by sokoban · · Score: 3, Insightful

    ... Just mount the chips in a vertical fashion. I work in an X-ray crystallography lab and we have a large format CCD detector. It's maybe about half a foot in diameter, but because it is mounted vertically, I see a cosmic ray streak maybe once every 200 or so 40-second exposures. Compare that to a cosmic ray detector of roughly the same size which is mounted horizontally on the other side of the building. It's counting cosmic rays almost constantly.

    --
    09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 is the magic number.
    1. Re:Why don't they... by Anonymous Coward · · Score: 1, Funny

      Since these things are meant to go into space, I am not sure how well vertical and horizontal are defined. One thing that occurs to me: put the chip(s) on a gyroscope and have the detector keep the plane such that hits are minimized. Of course, the new -1 slashdot moderation for AC's means this comment won't ever be seen.

    2. Re:Why don't they... by Ant+P. · · Score: 1

      That's what they already do, in desktop PC towers anyway. Except for the expansion card slots and disks... maybe we'll see "CTX" boards starting to appear soon.

    3. Re:Why don't they... by RedWizzard · · Score: 1

      Just mount the chips in a vertical fashion.

      RAM is already normally mounted vertically, in both tower and desktop cases.
  17. Best Security Vulnerability Ever by UESMark · · Score: 1

    You could stage a denial of service attack using your death ray to target a given victim's machine. Their machine will completely lock up.

    1. Re:Best Security Vulnerability Ever by lordholm · · Score: 2, Funny

      Yes, but a simpler way is to bombard the machine with heavy lead pellets or cut the power... :)

      --
      "Civis Europaeus sum!"
    2. Re:Best Security Vulnerability Ever by whitehatlurker · · Score: 1

      You know that thinking like that is the reason you haven't won evil genius of the year so far. The attack must be high tech, complex and vulnerable to being stopped by a single dedicated hero (and gang of sidekicks). Can you put an LED countdown clock on "heavy lead pellets"?

      --
      .. paranoid crackpot leftover from the days of Amiga.
  18. DARPA Empire by the_pooh_experience · · Score: 1

    This sounds similar to what DARPA's EMPiRe project is doing.

  19. Attacking the JVM by LoonyMike · · Score: 2, Interesting

    This subject reminds me of a paper I saw some time ago, on a way to use the cosmic rays to your advantage and breaking out of the JVM. Here's the link: http://www.cs.princeton.edu/sip/pub/memerr.pdf

  20. Defensive patent. by Bill,+Shooter+of+Bul · · Score: 2, Interesting

    It's widely acknowledged that Intel created EMF burst proof chips for the government. The technology inside of them was never publicly discussed. I think it might be similar to cosmic ray correction. They might just be patenting a subset of it now before the shrinking die sizes cause someone else to patent technology they've been using for years.

    --
    Well.. maybe. Or Maybe not. But Definitely not sort of.
  21. wait... by X_Bones · · Score: 1

    So they can tell now when a cosmic ray hits a chip, and correct for it. But what happens when a cosmic ray hits the cosmic-ray detector and scrambles its brains, huh? Will we need a corrector for the corrector now too? And a corrector-corrector corrector? WHERE WILL IT ALL END

    1. Re:wait... by rubycodez · · Score: 1

      that's why I use cosmic ray sensitive turtles, all the way down

  22. CS student excuses by shadoelord · · Score: 1

    I remember joking about "stray alpha particles from the sun" screwing up code for class projects. Now that Intel is trying to take that one away, all we'll be left with is "my dog ate it".

    --
    this is my sig, there are many like it, but this one is mine.
  23. we used to detect 1 or 2 hits a week by petes_PoV · · Score: 3, Informative
    Just to quantify the effect, the Sun E10000 Starfires we used a few years ago had ECC error counters built into the operating system. When I asked what they were for the salesman told me straight that they detected/corrected cosmic ray hits.

    More for laughs than anything else, I started logging them and found that a server with 16GB got maybe one or two hits per week. After that I started to take ECC seriously - for professional quality servers.

    You probably don't need it for the domestic appliance quality stuff that people run at home - but for real work, get some decent kit
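
    For a rough sense of scale, the figures above (16GB, one or two corrected errors per week) can be turned into a per-gigabyte yearly rate. This is only back-of-the-envelope arithmetic on those two numbers; the function and constant names are made up for illustration, and the result says nothing about other hardware or altitudes.

```python
# Back-of-the-envelope rate from the figures quoted above:
# a 16 GB server logging one or two corrected ECC errors per week.
# All names here are illustrative, not from any real monitoring tool.
MEMORY_GB = 16
WEEKS_PER_YEAR = 52


def errors_per_gb_per_year(hits_per_week: float) -> float:
    """Scale a whole-machine weekly hit count to a per-GB yearly rate."""
    return hits_per_week * WEEKS_PER_YEAR / MEMORY_GB


low = errors_per_gb_per_year(1)
high = errors_per_gb_per_year(2)
print(f"{low:.2f} to {high:.2f} corrected errors per GB per year")
```

    On those assumptions a machine without ECC would silently absorb a handful of flipped bits per gigabyte per year, most of which would land in memory nobody happens to be using.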

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
  24. I have a better solution. by Xest · · Score: 5, Funny

    Tin foil hats, for RAM!

    1. Re:I have a better solution. by Agripa · · Score: 1

      Tin foil hats, for RAM!

      Oddly enough, that will not work well for direct impacts although it might be worthwhile at sea level. If you add shielding around the chip and it is directly exposed to a cosmic ray event, the shielding just serves to create a shower of particles which then affect a much larger area and transfer much more energy.
  25. Waste of space. by Jane+Q.+Public · · Score: 1

    In order to get a good idea of whether a few bits have changed in a large RAM array due to radiation (which is all it takes... more than a couple of bits can bollix data even in ECC memory), the detector itself would have to be comparable in size to the memory array.

    It is a waste of space.

    It would be cheaper (and maybe even lighter) to just radiation-harden the chip.

  26. You do not understand. by Jane+Q.+Public · · Score: 1

    That's what you are patenting: the idea! Although you are supposed to be working on something commercially viable.

    The patent office does not insist on working models unless it is an extremely unlikely idea... like perpetual motion, or free energy. There are good reasons for that.

    1. Re:You do not understand. by andruk · · Score: 0

      Nice nick for one. But...at the risk of sounding stupid (I probably am)...those reasons are what?

      It seems perfectly reasonable to me that they could/should request prototypes or at the very least a model ("Camelot!"). That way, ideas cannot be patented unless they are physically realizable.

      I mean, I can understand that it has the potential to turn into a race to build a prototype, but that sounds better than "A method by which the life of the host is continued through an alternation of inhaling necessary gases and expelling byproducts of chemical processes that occur within the host" aka "breathing". And really, how difficult is it (for a corporation) to build a prototype? Furthermore, your average little-guy inventor is probably going to have a prototype of his own, or if it needs special equipment (like building a new transistor), he/she can go negotiate with aforementioned large corporation (except Microsoft). This has the potential for patent trolls that actually have a few decent patents, but that is what the courts are designed to deal with, the wheels of justice do "grind slowly" however.

      It just seems to me that forcing people to build prototypes would cut out a lot of the crap (like business methods), and make it slightly blatantly obvious for the patent office to tell which should really be allowed and which shouldn't.

      Just my humble, uninformed, probably slightly retarded opinion.

    2. Re:You do not understand. by Jane+Q.+Public · · Score: 1

      Okay, I will try to explain. With the caveat that you should understand that recently, our patent system has not been working properly because of: (1) legislation and regulations that have corrupted the principles on which patents are based; (2) incompetence on the part of employees of the Patent and Trademark Office (PTO); and (3) blatant corporate exploitation of the system (which is really very closely related to the first issue). I could go on about this issue for hours but I will try not to.

      First, why are there patents (and copyright) in the first place? The story is long, but it boils down to something simple: it is in the interest of society as a whole if members of that society write papers and stories, and innovate scientifically and technologically. But... if society just TOOK the stories, or scientific papers, or inventions, when innovative people created them, then there would be no incentive for them to continue innovating! There are some excellent examples of exactly that. Ask yourself why China excels at copying the ideas of others, but does little innovating of its own? Simple: ideas belong to everybody. There is no profit in coming up with new ideas. But there is good profit in copying the ideas of others, and selling "ripoffs" of those products to others.

      The same was true of the former Soviet Union (and still is, to a lesser extent... they are starting to get it right finally). An engineer who invented new and exciting things was seldom rewarded for it, and might even be punished for it ("You were supposed to be improving our widget production!"). They often got paid the same whether they were learning and doing new things, or sitting around the office drinking vodka. (I happen to know that the latter still happens. I had an IM penpal in Russia near Moscow, who worked in a power plant... the rough equivalent of a public utility. I asked him one day how things at work were going, and he replied, "Not much going on a usual, so we are just sitting around the office drinking vodka again.")

      So after much debate, our Founding Fathers were convinced that it was best to reward clever individuals with a limited-time "monopoly" on the fruits of their ideas, in order to encourage them to innovate. After that short period (originally 17 years, on both patents and copyrights), the idea (invention, novel, whatever) became public property, and society benefits as a result. This is a very workable system, as America has demonstrated time and time again. (Less so recently, but again, my opening caveats... the system is no longer working properly).

      So, now: why is the patent on the IDEA, rather than on a working model of (whatever)?

      First, insisting on a working model would overwhelm the patent office with mechanical and electric devices sized from the microscopic to as large as several houses. Where would they put it all? Who would take care of it? How to dispose of it later? And so on.

      Second, insisting on working models would discriminate enormously in favor of the rich over the poor. Some devices are difficult or expensive to build. A large corporation might be able to afford to build one this month. A poor person would be forced to search for funds... and may never find enough. The story of Charles Babbage and his "Difference Engine" is a case in point. It was a breakthrough idea at the time. But it was barely within the power of the technology of the day to build it, and that at great cost. So he was always looking for funding, and the whole thing never got built. It has been proven since that it would have worked; in 1991 the Science Museum in London actually built one (his Difference Engine No. 2) from his plans.

      As a result of Babbage's inability to secure adequate funding for his invention, the science of computing was delayed somewhere in the neighborhood of 100 years. That is a bit of a simplification of the story but essentially accurate as far as it goes.

      So, a poor person works and works on his idea for say 15 years, trying to get it to the point of being

  27. Interaction of radiation with matter by IvyKing · · Score: 1

    You're forgetting about what happens after an interaction of a cosmic ray with an atom. In the case of the ray being a neutron, the interaction will result in a lot of kinetic energy imparted to the nucleus (called the primary knock-on atom) which will then tear off a bunch of electrons as it slows down (a heavy charged particle with a given energy will have a well defined range in matter, which is why ion implantation superseded diffusion in chip fabrication). The range of the nucleus will likely be much larger than the thickness of the chip which would allow for use of a separate detector.

  28. Free opensource rad tolerant processor here. by orbitalia · · Score: 1

    This doesn't sound so extremely new to me. You can even download the VHDL to a rad hard Leon3 (SPARC V8 instruction set) at gaisler here. This chip covers SEU (Single Event Upsets) typical of those caused by cosmic rays.

  29. Re:But there really is a memory problem by Catalina588 · · Score: 2, Interesting
    http://uksbsguy.com/blogs/doverton/archive/2007/05/23/microsoft-says-pcs-may-need-dram-upgrade-to-ecc-ram.aspx

    Microsoft's XP crash analysis early in this decade concluded that PCs always left on tended to crash unexpectedly. Dump analysis showed strange values in key OS variables, and cosmic rays (or other bit-blasting particles) were among the likely sources. The conclusion was so clear that Microsoft floated the idea (see URL above) that Vista-generation PCs should use Error-Correcting Code (ECC) memory to detect and correct memory bit errors -- in consumer PCs. [Note that servers and business workstations have used ECC memory for decades].

    Having seen corrupted data in my own copy of Microsoft Money and other applications that I have left open for weeks, I am prepared to accept cosmic rays as well as Microsoft bugs as potential sources. Finally, why would Intel invest R&D capital in a cosmic ray detector if it had no likely or practical use for Intel's consumer and business customers?

  30. Photonic Chips by sanman2 · · Score: 1

    Maybe the solution will come when we abandon charge-states as our means of information processing, and instead shift into photonics. These components will then be immune to ionizing radiation.

    1. Re:Photonic Chips by MrNaz · · Score: 1

      And susceptible to a whole host of other problems. Imagine what would happen to an optically based computer circuit that was exposed to goatse radiation!

      --
      I hate printers.
  31. Insanity prevails? by GumphMaster · · Score: 1

    RELATED APPLICATION This application was filed the same day as an application entitled "System With Response to Cosmic Ray Detection" (application No. 10/882,898, now Patent 7,166,847) with the sane inventor.

    Meaning that the insane one was allowed to try and patent this? ;)

    --
    Patent litigation: A doctrine of Mutually Assured Destruction... in which everyone seems willing to push the button
  32. Cosmic rays? by Anonymous Coward · · Score: 0
  33. Breaking news! by GameboyRMH · · Score: 2, Funny

    Microsoft claims Vista's poor performance and unreliability are due to interference from cosmic rays. Vista makes a computer run so fast, they claim, that cosmic rays present a serious threat to the computer's stability, often resulting in lower performance than older operating systems like XP. Microsoft plans to release a cosmic ray shielding computer case, which will retail for $300, and should be released some time this month. Current Vista license holders will get a $50 discount.

    --
    "When information is power, privacy is freedom" - Jah-Wren Ryel
  34. Google Patents by XNormal · · Score: 1

    Patents are easier to read online with Google Patents. It also lets you download a PDF.

    here

    --
    Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
  35. CMOS RAM? by rrohbeck · · Score: 1

    Does anybody know if your typical battery-backed CMOS RAM is susceptible to corruption by cosmic rays too?
    I've looked into a few systems that arrived DOA due to a corrupt CMOS RAM (they were OK after resetting them with the jumper on the motherboard) after air shipment from the US to Europe or Asia and I wonder if that's the root cause.

  36. Or you could just do this.... by dsmatthews · · Score: 1

    To ensure the integrity of data and program logic flow under conditions that can cause bit flipping, just run multiple copies of your code on a multi-core CPU and have any thread that does not match the majority become invalid. If three threads agree and the other will not self terminate then they vote it off the CPU. Each thread has its own memory blocks too, so you have the computational equivalent of a RAID. It would be cheaper and faster than Intel's idea as no roll-back is required, just perhaps a block copy to over-write the corrupted code/data. Anyone interested in cooking up this in a custom Linux kernel?
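
    The voting step can be sketched in a few lines. This is a toy model only: it assumes each replica's result is a plain value that can be compared for equality, it runs in one process rather than across cores, and `majority_vote` and `repair` are hypothetical names, not anything from a real kernel.

```python
from collections import Counter


def majority_vote(results):
    """Return (winning value, indices of replicas that disagreed).

    Raises ValueError if no value is held by a strict majority.
    """
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise ValueError("no majority among replicas")
    dissenters = [i for i, r in enumerate(results) if r != value]
    return value, dissenters


def repair(replicas, dissenters, value):
    """Overwrite each outvoted replica's state with the agreed value."""
    for i in dissenters:
        replicas[i] = value


# Four replicas computed the same thing; replica 2 suffered a bit flip.
replicas = [0b1011, 0b1011, 0b1111, 0b1011]
value, bad = majority_vote(replicas)
repair(replicas, bad, value)
print(value, bad, replicas)
```

    The block copy mentioned above corresponds to `repair`: the outvoted replica is overwritten rather than rolled back. Note that the vote itself runs on hardware that can also take a hit, which the sketch does not address.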

    dan@tekgnu.com

    1. Re:Or you could just do this.... by Ant+P. · · Score: 1

      Why bother going to that kind of effort? Just run the same program on a bunch of different computers and compare the output.

    2. Re:Or you could just do this.... by dsmatthews · · Score: 1

      Because it is FASTER. Multi core is a bunch of computers anyway, but they are very close together so their intercommunications are orders of magnitude faster than other options that involve a network. I made the suggestion because I believe that it is doable now, using existing CPU technology. Also, what makes you think that it would take more effort than any other method? It works at a very low level, potentially making use of the virtualization capabilities of the CPU; if not, then that is the area of a new CPU that would be modified to make it possible. Essentially part of the CPU is pseudo-analog in its behavior and this allows the majority of cores to determine the outcome, rather than a single on/off state that is vulnerable to being randomised by radiation. Spreading a computation over time also offers protection, i.e. if you run the logic with the same inputs later you should get the same result, so if you use a pipeline that is redundant you get this type of protection. dan@tekgnu.com

  37. Should be testable by Lonewolf666 · · Score: 1

    You'd have to hit the computer in question with enough radiation that a number of errors should occur. Then you see if the mechanism actually works. It might, of course, be somewhat expensive to build or borrow a particle accelerator that provides a passable approximation of cosmic rays ;-)

    As Mrsooreams wrote just one post below you, it seems not guaranteed that the particles actually hit the detector and not only the computing elements. So I'd take this one with a grain of salt...

    --
    C - the footgun of programming languages
  38. dont recreate the wheel, use ECC xor chipkill. by medelliadegray · · Score: 1

    I recall reading a paper IBM published years back, advocating why ECC memory is still necessary due to cosmic rays... I forget, but they possibly tied it into why their Chipkill (essentially, a RAID array for each DIMM) technology should be used, but as I understand it ECC will be adequate for correcting 'a' cosmic ray bit flip. Chipkill is only necessary if you have multiple bits flipped, which potentially is the concern as DIMM sizes scale up.

    I really fail to see this as more than some marketing hype for a solution to an already solved problem.
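
    To illustrate how an error-correcting code fixes 'a' single bit flip, here is a minimal sketch using the textbook Hamming(7,4) code. It is a stand-in chosen for brevity, not the code real ECC DIMMs use (those commonly protect 64 data bits with 8 check bits), and the function names are invented for the example.

```python
def encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def correct(codeword):
    """Fix at most one flipped bit.

    Returns (data bits, 1-based position of the corrected bit, or 0 if none).
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


word = encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a particle strike flipping the bit at position 5
data, fixed_at = correct(word)
print(data, fixed_at)
```

    With two flipped bits in the same word this plain code would "correct" the wrong position, which is the multiple-bit case the Chipkill argument is about.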

    --
    Troll, Troll, go away and flame again some other day
  39. Difference between a Computer Salesman and ... by Shamanarchy · · Score: 1

    As someone who has worked for several hardware vendors, including Sun, I am still amused by the truth in the following joke that was once told to me:

    Q: How do you tell the difference between a computer salesman and a used car salesman?
    A: A used car salesman knows when he is lying.

    ( My apologies to those computer salesmen who do really understand the technology they sell. Unfortunately there are too many who do not. )

  40. I forgot to mention... by Jane+Q.+Public · · Score: 1

    My discussion above was in regard to "utility" patents, which are for inventions and the like. It is also possible to get a "design" patent, which is on things like shape and style... that is a different animal entirely. The standards and rules are completely different for design patents, and not really relevant to the discussion.