Many DDR3 Modules Vulnerable To Bit Rot By a Simple Program

Posted by Soulskill on Wednesday December 24, 2014 @02:05AM from the flipping-bits-for-fun-and-profit dept.

New submitter Pelam writes: Researchers from Carnegie Mellon and Intel report that a large percentage of tested regular DDR3 modules flip bits in adjacent rows (PDF) when a voltage in a certain control line is forced to fluctuate. The program that triggers this is dead simple: just two memory reads with a special relative offset and some cache control instructions in a tight loop. The researchers don't delve deeply into applications of this, but hint at possible security exploits. For example, a rather theoretical attack on the JVM sandbox using random bit flips (PDF) has been demonstrated before.

138 comments

Applications include... crashing computers. by Anonymous Coward · 2014-12-24 02:11 · Score: 1

I don't know if there are hundreds or thousands or hundreds of thousands of low level 'bugs' like this related to simple subsystems abused in specific ways.. but there are plenty.
1. Re:Applications include... crashing computers. by wonkey_monkey · 2014-12-24 04:23 · Score: 0
  
  Halt and Catch Fire.
  
  --
  systemd is Roko's Basilisk.
Theoretical vs demonstrated by Anonymous Coward · 2014-12-24 02:12 · Score: 0

If it's been demonstrated how is it still theoretical?
Either way, I wouldn't say that cache control instructions in a tight loop is dead simple.
Still pretty bad though.
1. Re:Theoretical vs demonstrated by beelsebob · 2014-12-24 02:42 · Score: 1
  
  I would guess that it's theoretical because it involves things like knowing exactly where the JVM is positioned in physical memory, and how its pages are laid out. That, and that the demonstration involved knowing all of these things before you started.
2. Re:Theoretical vs demonstrated by MShook · 2014-12-24 20:41 · Score: 1
  
  Because it's a scientific theory or, as wiki says: A scientific theory is a well-substantiated explanation of some aspect of the natural world that is acquired through the scientific method and repeatedly tested and confirmed through observation and experimentation. As with most (if not all) forms of scientific knowledge, scientific theories are inductive in nature and aim for predictive power and explanatory force.
Many DDR3 modules? by ArcadeMan · 2014-12-24 02:12 · Score: 3, Insightful

This is all very interesting but totally pointless! Which modules? Tell us the brands, model names, manufacturer numbers?

--
Get free satoshi (Bitcoin) and Dogecoins
1. Re:Many DDR3 modules? by DigiShaman · 2014-12-24 02:21 · Score: 4, Insightful
  
  FTFP. "We induce errors in most DRAM modules (110 out of 129) from three major DRAM manufacturers."
  Short version, leakage current from adjacent gates can nudge others to bit-flip. I don't think this is a manufacturing problem so much as it is a fundamental EE design oversight. So yeah, defective by design (unintentionally)!!
  
  --
  Life is not for the lazy.
2. Re:Many DDR3 modules? by ArcadeMan · 2014-12-24 02:29 · Score: 1
  
  It also means that 19 out of 129 DRAM modules are not affected by this problem, hence my question.
  
  --
  Get free satoshi (Bitcoin) and Dogecoins
3. Re:Many DDR3 modules? by Rei · 2014-12-24 02:32 · Score: 5, Informative
  
  If you're wanting to narrow it down, you won't like this line from the paper:
  
  In particular, all modules manufactured in the past two years (2012 and 2013) were vulnerable,
  It's pretty clever, and something I always wondered about being possible. They're exploiting the fact that DRAM rows need to be read every so often to refresh them because they leak charge, and eventually would fall below the noise threshold and be unreadable. Their exploit works by running code that, by heavily and cyclically reading rows, makes adjacent rows leak faster than expected, leading to them falling below the noise threshold before they get refreshed.
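
A toy model of the mechanism described above may make it concrete. This is only a sketch: every constant (leak rate, disturbance per activation, read threshold) is a made-up value chosen to make the effect visible, and none of them comes from the paper. Only the 64 ms refresh window is a commonly quoted DDR3 figure, and it is still treated as an assumption here.

```python
# Toy model of a DRAM disturbance error. All constants are hypothetical,
# picked only to make the effect visible; they are not from the paper.

REFRESH_INTERVAL_MS = 64.0      # assumed refresh window
NATURAL_LEAK_PER_MS = 0.004     # hypothetical charge fraction lost per ms
DISTURB_PER_ACTIVATION = 5e-7   # hypothetical extra loss per neighbour activation
READ_THRESHOLD = 0.5            # hypothetical: below this the bit reads wrong


def charge_at_refresh(neighbour_activations_per_ms):
    """Charge left in a victim cell just before its next refresh."""
    natural = NATURAL_LEAK_PER_MS * REFRESH_INTERVAL_MS
    disturbed = (DISTURB_PER_ACTIVATION
                 * neighbour_activations_per_ms
                 * REFRESH_INTERVAL_MS)
    return max(0.0, 1.0 - natural - disturbed)


def bit_flips(neighbour_activations_per_ms):
    """True if the cell falls below the read threshold before refresh."""
    return charge_at_refresh(neighbour_activations_per_ms) < READ_THRESHOLD


for rate in (0, 1_000, 10_000, 20_000):
    print(rate, round(charge_at_refresh(rate), 3), bit_flips(rate))
```

With these invented numbers, an idle or lightly used neighbour leaves the cell readable at refresh time, while a neighbour hammered tens of thousands of times per millisecond does not.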
  
  --
  I am a proud traitor to my species in alliance with my mother the Earth in opposition to those who would destroy her.
4. Re:Many DDR3 modules? by ArcadeMan · 2014-12-24 02:36 · Score: 1
  
  That PDF has a lot of details but TL;DR, you were able to condense it into a single paragraph that we can read in a few seconds.
  Thank you.
  
  --
  Get free satoshi (Bitcoin) and Dogecoins
5. Re:Many DDR3 modules? by DigiShaman · 2014-12-24 02:37 · Score: 4, Interesting
  
  True, and commodity chips not to exact spec will introduce disturbance errors. But apparently this has been a known problem with DRAM, with various methods of mitigation during the binning process. It's just that density and tolerances have become so tight that the issue is now exasperated. I wouldn't be surprised at all if those 19 models also had a few that failed if tested again and again.
  Honestly, general computing devices, from low-end PCs to phones, are long overdue in employing ECC by default. So you lose some capacity and take a tiny performance hit. BFD if that means your data doesn't become corrupted. The only people that would care are the PC gaming benchmark queens.
  
  --
  Life is not for the lazy.
6. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 02:39 · Score: 0
  
  But I was assured that DRAM stays readable for minutes after they're removed from the machine?
  http://it.slashdot.org/story/0...
7. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 02:45 · Score: 0
  
  That's a low enough fraction (~15%) that you have to wonder if it is a fluke of manufacturing that happened to make those particular batches "immune", rather than something that could be relied upon from brand/model/part numbers. It's well known that manufacturers swap chips all the time and give them the same model and part number as long as it meets the specs.
8. Re:Many DDR3 modules? by WaywardGeek · 2014-12-24 02:50 · Score: 1
  
  It sounds like you know a bit about modern DRAM architecture. Data sheets nowadays are not available to the public, so it's hard to figure out basic things, like how much power is burned in the DRAM in a simple loop. Do you have a simple rule of thumb for modern DRAM power loss? If I understand correctly, static power is minimal, but dynamic power can amount to several watts.
  
  --
  Celebrate failure, and then learn from it - Nolan Bushnell
9. Re:Many DDR3 modules? by Luckyo · 2014-12-24 03:02 · Score: 1
  
  Overwhelming majority of "PC gaming benchmark queens" wouldn't give a toss because memory speed hasn't been a bottleneck in gaming in many years.
  People who would care are ordinary users and OEMs who would have to absorb the extra cost. Especially to OEMs costs are far from trivial.
10. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 03:13 · Score: 0
  
  Most likely Hynix, Samsung and Micron. They avoided third party modules to get a better idea of the chips used. Which is which, that is the question..
11. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 03:19 · Score: 0
  
  The study clearly says ALL MODULES manufactured in 2012-2013. The ones not vulnerable are older. This implies the manufacturing process.
12. Re:Many DDR3 modules? by DigiShaman · 2014-12-24 03:20 · Score: 2
  
  In my personal experience, "benchmark queens" in general, be it in automotive performance or computing, are all about the synthetic numbers with zero basis in practicality (let alone value for cost). If a gamer doesn't give a toss about a particular core subset of general computing (Video, CPU, RAM, and Storage), they're not benchmark queens. I've met plenty online who are. And when queens start debating online over numbers, the flamewars begin.
  
  --
  Life is not for the lazy.
13. Re:Many DDR3 modules? by QQBoss · 2014-12-24 03:31 · Score: 1
  
  It can, but the chances of it staying perfectly readable are very small. And realize that removing RAM from a machine puts it under a very different condition than intentionally accessing the RAM in a pattern which causes faster than normal leakage, so the results aren't mutually exclusive.
14. Re:Many DDR3 modules? by MightyYar · 2014-12-24 04:01 · Score: 1
  
  I'm not sure whether I'm more bothered by "benchmark queens" or people who flame over their subjective opinions. The latter are a lot like "audiophiles", unwilling to believe in blind testing.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
15. Re:Many DDR3 modules? by swillden · 2014-12-24 04:05 · Score: 1
  
  But I was assured that DRAM stays readable for minutes after they're removed from the machine?
  http://it.slashdot.org/story/0...
  Not if adjacent rows are being heavily, cyclically read.
  
  --
  Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
16. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 04:06 · Score: 0
  
  Except that if ECC were standard, the extra cost would be marginal. 12.5% extra for a single component, so overall the added cost would be negligible. Only an idiot would opt for non-ECC memory, and this is one case where machines should be idiot-proofed.
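
The 12.5% figure follows from the usual ECC module layout of 8 check bits per 64 data bits (a 72-bit-wide DIMM). A rough sketch of the arithmetic is below; the share of system cost spent on RAM is a hypothetical input, and the assumption that memory cost scales linearly with the number of bits is a simplification.

```python
DATA_BITS = 64    # data width of a non-ECC DIMM
CHECK_BITS = 8    # extra check bits on a typical 72-bit ECC DIMM


def ecc_overhead(data_bits=DATA_BITS, check_bits=CHECK_BITS):
    """Extra memory needed for ECC, as a fraction of the data bits."""
    return check_bits / data_bits


def system_cost_increase(ram_share_of_cost):
    """Added system cost, given a hypothetical fraction of the price spent on RAM.

    Assumes memory cost scales linearly with the number of bits.
    """
    return ram_share_of_cost * ecc_overhead()


print(ecc_overhead())               # fraction of extra memory
print(system_cost_increase(0.10))   # if RAM were 10% of the system price
```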
17. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 04:12 · Score: 0
  
  You must not be looking in the correct places. The link below is a product page with datasheet, specs, etc.
  DDR3 Module
18. Re:Many DDR3 modules? by ChrisMaple · 2014-12-24 04:24 · Score: 2
  
  So, other than fixing the dram design, the solution is to refresh more frequently. A software fix might be a high priority background program that forces a full refresh at regular intervals (probably a big performance hit). If the CPU does its own dram control, there might be a register that affects refresh rate, or perhaps a microcode fix.
  The problem is analog in nature, which suggests that optimized and very clean supply voltages, and very clean and precisely timed control signals might reduce or eliminate the problem.
  In any case, this means that manufacturers need to fix their designs and test them more thoroughly.
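
To see why a shorter refresh interval helps, here is a back-of-the-envelope sketch of how many times one row can be activated between two refreshes of its neighbour. The 64 ms window is the commonly quoted DDR3 refresh period, and the 50 ns row-cycle time is an assumed round figure; neither is taken from this thread. Whether a given window is short enough depends on how many activations a particular module tolerates, which this sketch does not model.

```python
def max_activations(refresh_window_ms, row_cycle_ns):
    """Upper bound on activations of one row within one refresh window."""
    window_ns = refresh_window_ms * 1_000_000
    return int(window_ns // row_cycle_ns)


ASSUMED_ROW_CYCLE_NS = 50   # assumption: rough row-cycle time, not a datasheet value

for window_ms in (64, 32, 16, 8):
    print(window_ms, max_activations(window_ms, ASSUMED_ROW_CYCLE_NS))
```

Halving the window halves the hammering budget, at the cost of spending twice as much time on refresh.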
  
  --
  Contribute to civilization: ari.aynrand.org/donate
19. Re:Many DDR3 modules? by Archtech · 2014-12-24 04:28 · Score: 1
  
  'I'm not sure whether I'm more bothered by "benchmark queens" or people who flame'.
  FTFY. Does anyone ever flame about anything except subjective opinions?
  
  --
  I am sure that there are many other solipsists out there.
20. Re:Many DDR3 modules? by Archtech · 2014-12-24 04:36 · Score: 1
  
  Reminds me of the first time I ever heard this particular discussion: at DEC in about 1983. A colleague who had gone to do quality engineering on VAX/VMS systems asked for statistics on crashes caused by memory errors. All VAX computers had built-in ECC (of course), but the advanced thinkers in engineering were wondering if it would be more cost-effective to do without. Money would be saved, both by the manufacturer and the customer, and systems would run significantly faster (maybe). Surely that would be worth the fairly infrequent crash, which could be recovered from with the help of backups, logs, etc.?
  We all thought the idea was daft - purely on general principle. The reduction in speed due to ECC could be exactly specified, as could the extra cost. But random crashes couldn't - and what if human error caused the backups, logs, etc. to be missing or corrupt? Worse still, what if errors were introduced that didn't cause a crash or any noticeable problem? All sorts of critical systems could go on stacking up subtly wrong data more or less indefinitely.
  To this day I always ask for ECC whenever I buy a new PC - but the only machines I have ever found that had it were Dell workstations.
  
  --
  I am sure that there are many other solipsists out there.
21. Re:Many DDR3 modules? by MightyYar · 2014-12-24 04:49 · Score: 3, Funny
  
  Climate change... [ducks].
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
22. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 05:02 · Score: 0
  
  well, sort of. as i see it, CLFLUSH is really the issue. i think memory designers work under the assumption that caching is effective, and this case can't happen. this is slightly naive because of the relatively new CLFLUSH instruction. as i see it, being able to issue a CLFLUSH at any privilege level is a problem waiting to happen. (and doesn't provide any extra coherence.)
23. Re:Many DDR3 modules? by tlhIngan · 2014-12-24 05:02 · Score: 2
  
  Data sheets nowadays are not available to the public
  Datasheets ARE publicly available. However, they're for the actual DRAM ICs themselves, and not of the modules.
  There are only a few DRAM manufacturers out there - Samsung, Hynix, Elpida, Micron are among them.
  Samsung Computing DRAM (they also have Graphics DRAM and others). Some of their newest chips don't have datasheets yet, but that'll be forthcoming. The older ones in production do, however.
  Hynix
  Micron (and Elpida).
  These are all generally available. Since the only real difference between them is a few timing numbers, they're not generally a huge secret - it's all governed by JEDEC standards anyhow.
  Memory modules are just collections of these chips so they can be generalized to what you buy in the store for your PC.
24. Re:Many DDR3 modules? by greg1104 · 2014-12-24 05:29 · Score: 1
  
  Memory speed can technically still be the bottleneck on large memory footprint games like BF4; see the bit-tech review for some numbers on that. The people chasing after PC gaming benchmarks reflexively use the fastest memory around though, and if you do that it's less likely for memory to dictate the speed limits.
25. Re:Many DDR3 modules? by greg1104 · 2014-12-24 05:31 · Score: 2
  
  I'm also bothered by people who put the word audiophiles in scare quotes for no good reason. P.S. Not all audiophiles are opposed to blind testing; some people like expensive audio toys that are objectively better too.
26. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 05:50 · Score: 0
  
  So, use ECC RAM so the errors are detected and corrected.
27. Re:Many DDR3 modules? by Guy+From+V · 2014-12-24 06:04 · Score: 1
  
  Audio queen here, you probably mean double blind testing.
28. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 06:06 · Score: 0
  
  Intel segments the market. You want to use ECC? You need both an Intel XEON processor and a motherboard that supports it. By design, you will not find a single Intel Pentium or Core that supports ECC.
  The only desktop platform that supports ECC is AMD. Otherwise with Intel, it's workstation/server class only.
29. Re:Many DDR3 modules? by Guy+From+V · 2014-12-24 06:11 · Score: 1
  
  If the module is supercooled quickly after it's removed, it can be minutes before RAM bits start to wipe. Even if they do, RAM bits "erode" in a predictable manner, allowing for information to be rebuilt if not degraded enough after power-down.
30. Re:Many DDR3 modules? by ttucker · 2014-12-24 08:05 · Score: 1
  
  To this day I always ask for ECC whenever I buy a new PC - but the only machines I have ever found that had it were Dell workstations.
  Always ECC user here as well. With Intel, only Xeon systems come with ECC support in the chipset. You are actually looking for any workstation level computer with a Xeon chip, although Dell is the only outfit with an even semi reasonable price.
31. Re: Many DDR3 modules? by Anonymous Coward · 2014-12-24 08:41 · Score: 0
  
  I'm pretty sure at least some, if not all, of the i3 range supports ECC.
32. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-24 09:24 · Score: 0
  
  By design, you will not find a single Intel Pentium or Core that supports ECC.
  Wrong.
  Yes, you need a board with a Cwhatever chipset to get a BIOS that can enable ECC.
  Partially a forced chipset limitation (ECC for the DMI link and the chipset PCIe lanes is permanently fused off on the desktop chipsets).
  Partially pure segmentation bullshit (Sandy Bridge onwards main memory ECC including error reporting is completely handled on cpu, yet board manufacturers are not allowed to enable even that on non-workstation boards if they want to get any intel chipsets in the future...)
  Sidenote:
  intel requires a NDA to get a datasheet that contains the finer details of the CPU memory controller, including how to set up and control ECC.
  AMD has full memory controller info in their public datasheets.
  Guess which linux EDAC driver still can't properly decode error syndromes.
33. Re:Many DDR3 modules? by MightyYar · 2014-12-24 10:46 · Score: 1
  
  Even blind testing would be an improvement.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
34. Re:Many DDR3 modules? by MightyYar · 2014-12-24 10:49 · Score: 1
  
  They aren't scare quotes - they are there to differentiate people who think they can hear things that they really can't from people who truly chase better sound. If I hear anything about oxygen in your speaker wire, you'll get the quotes.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
35. Re:Many DDR3 modules? by oobayly · 2014-12-24 11:03 · Score: 1
  
  My understanding was that oxygen free copper is supposed to be more fatigue tolerant so that it gives better plug-unplug endurance, not better sound.
36. Re:Many DDR3 modules? by MightyYar · 2014-12-24 11:36 · Score: 1
  
  I've seen nonsense about inductance and capacitance. And then it'll be stranded. Oy.
  Most people are using it to make a permanent connection in their homes with stranded wire... so endurance, fatigue, corrosion are all non-issues. I would wager a very high sum of money that double-blind testing would result in no perceptible difference.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
37. Re:Many DDR3 modules? by oobayly · 2014-12-24 12:23 · Score: 1
  
  Oops, had meant to say that in my comment - that very few people will need the "endurance" - I completely agree. I have to admit that I got suckered into buying zero-oxygen-copper cables (it sounds good, doesn't it), until I decided to check what it actually meant - zilch!
38. Re:Many DDR3 modules? by Luckyo · 2014-12-24 14:33 · Score: 1
  
  That is indeed the problem with many technologies. "If they were standard, their costs would be much cheaper".
  At which point the question becomes that of "is this functionality actually needed as a standard in most use scenarios?"
  For ECC memory, this question was asked ever since the early 80s and the answer is still "no".
39. Re:Many DDR3 modules? by Luckyo · 2014-12-24 14:39 · Score: 1
  
  This used to be the problem back in the day before DDR3, true. After DDR3 got to around 1333-1600MHz, the problem was effectively eliminated in favour of latency being the only reasonable bottleneck. And that actually gets worse rather than better when you increase the frequency.
  The tests you link show exactly that - no noticeable difference. They're looking at 1-2% difference between 1333 modules and 2400 modules. Because that is not the bottleneck. System is bottlenecked elsewhere, most likely on GPU. If this was a bottleneck, you would see improvements that would match the differential in RAM speed, as happens with most GPU tests for example.
40. Re:Many DDR3 modules? by MightyYar · 2014-12-24 16:10 · Score: 2
  
  ALL of that audiophile stuff sounds good (pun intended).
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
41. Re:Many DDR3 modules? by sjames · 2014-12-24 19:17 · Score: 1
  
  That crazy theory again!?!
  Sir, I assure you that ducks absolutely do not cause climate change!
  But note that pirates can slow it.
42. Re:Many DDR3 modules? by sjames · 2014-12-24 19:21 · Score: 1
  
  Plenty have it on the server side. Just use a server board in your desktop.
43. Re:Many DDR3 modules? by sjames · 2014-12-24 19:43 · Score: 1
  
  In those cases, there tend to be a LOT of errors. The risk is that enough will read correctly to leak valuable information like passwords. Also, in those cases the memory is not active.
44. Re:Many DDR3 modules? by greg1104 · 2014-12-25 00:55 · Score: 1
  
  That's not how it works. The way you spot a bottleneck in performance work is this: if you change anything other than the bottleneck, there is zero impact on the resulting system speed. Conversely, if you alter something and the system really does get faster, you must have just hit one of the bottlenecks.
  Given that, the way high detail performance goes from 83 to 86 FPS as RAM speed increases means that RAM speed must have been a bottleneck. If speed had been strictly limited by the video card instead, speeding up the memory would have given zero total system speed increase. It's not hard to get RAM fast enough to no longer be the bottleneck, but you can't just throw junk memory at this game without that turning into a limiter.
45. Re:Many DDR3 modules? by greg1104 · 2014-12-25 01:35 · Score: 1
  
  Inductance and capacitance impact total impedance, and it is possible to find bad combinations where that turns into an easily measurable problem with the cable. See high cap wire section of "Speaker Wire: A History" for how that comes out on a scope. It's very easy to find cases where the wire doesn't matter too. One of the funny things about objective audio testing is that people usually find what they set out to, because it's so easy to set up tests to give the results you want. That doesn't prove there are no edge cases where those things do matter. Audible amplifier feedback and oscillation is a real thing.
  Serious corrosion does happen in old audio cables, with them turning a lovely puke green eventually. I have some systems going back to the 80's here, and that copper is totally grody, fer sure! Preventing that is mainly about the jacket and termination though. Just using high quality copper doesn't make it go away.
  That coat hanger wire vs. Monster Cable test used Martin Logan SL3 speakers, which was such a weird choice I have to throw the whole thing out as a waste of time. Those are electrostatic panels with a traditional woofer. Electrostatics have very different electrical properties than regular speakers. You can't really extrapolate from that exotic test to the rest of the market, where people are mainly using traditional cone and dome speakers. Car analogy: you can observe that changing gas for a Tesla electric vehicle doesn't impact its performance, but that doesn't prove gas quality is irrelevant to regular engines.
46. Re:Many DDR3 modules? by greg1104 · 2014-12-25 01:38 · Score: 1
  
  Oxygen-free copper is very much a real thing, and it does matter for some applications. The only part that's hard to support is whether those differences are audible in home audio. All other things being equal between two cables, it shouldn't matter. (All other things are usually not equal)
47. Re:Many DDR3 modules? by MightyYar · 2014-12-25 04:33 · Score: 1
  
  Perhaps you can measure things on a scope, but that doesn't mean the difference is perceptible. It's not my money, so I don't really care what audiophiles do with it - but they also seem to expect me to be impressed, which I am not. I politely nod but honestly think they are just burning their money. I can't take someone seriously who thinks that oxygen makes a perceptible difference in audio, and then think nothing of using stranded wire vs. solid. Even with an oscilloscope, the stranded vs. solid will be a much bigger difference than the 97% vs 99.99% copper. And by "much bigger", I mean "still not perceptible".
  I know a guy who does installs. He tells many stories, but I like this one: He ran out of super-expensive speaker wire specified by one customer. He temporarily finished the job with landscaping wire, of all things. It was the proper gauge and everything, but cheap stuff that he uses for outdoor installs (which, unbelievably, people insist on having fancy cable for! Shut those birds up, would you?). He came back later (when the specified wire came in) and told the customer what he needed to do. The guy, completely oblivious to the "problem", was horrified. Just horrified! He had been quite happy with the new system, but now noted that certain things do indeed sound wrong... the brain is an amazing machine.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
48. Re:Many DDR3 modules? by Anonymous Coward · 2014-12-25 05:48 · Score: 0
  
  density and tolerances have become so tight that the issue is now exasperated.
  
  exacerbated?
49. Re:Many DDR3 modules? by lsatenstein · 2014-12-25 10:11 · Score: 1
  
  FTFP. "We induce errors in most DRAM modules (110 out of 129) from three major DRAM manufacturers."
  Short version, leakage current from adjacent gates can nudge others to bit-flip. I don't think this is a manufacturing problem so much as it is a fundamental EE design oversight. So yeah, defective by design (unintentionally)!!
  So, as DDR3 gets more dense, and the space between the cells decreases, we should be standardizing on ECC memory for all desktops and servers. The second thought I have is "What minimal CPU clock speed would enable this activity to occur with standard hardware?" Is this problem likely to occur with off-the-shelf motherboards and CPUs?
  
  --
  Leslie Satenstein Montreal Quebec Canada
50. Re:Many DDR3 modules? by Bengie · 2014-12-25 11:35 · Score: 1
  
  Memory speed can technically still be the bottleneck
  And taking a piss before you head to work can save you gas money. Your link shows an 80% increase in memory speed giving a 1.7% increase in performance. Congrats, you just doubled your memory's power consumption.
51. Re:Many DDR3 modules? by greg1104 · 2014-12-25 12:55 · Score: 1
  
  Some ludicrously overpriced cable aimed at the mass market is stranded, with Monster being the biggest offender by volume. But most of the really expensive speaker cable is solid core instead of stranded, with the core size limited only by how flexible the cable needs to be. The stuff I like uses a number of 14 AWG wires that total to match 12 AWG. I've tried using twisted pairs of 12 AWG copper instead, just basic power cable from Home Depot, but I can barely route the stuff. I like the cables (and amplifiers) I use to be mechanically sound and measure perfectly, priority #1, so I never have to include them in troubleshooting why something sounds bad. Multi-component systems are hard to optimize, and making individual parts as perfect as is practical lowers the complexity. That doesn't lead me to $1000 speaker cables, but I'm not getting $5 ones either.
  When audio changes are big enough to show on a scope, normally the only reason someone bothered to isolate them out is because they were audible. Some of what audiophiles complain about here is real albeit misunderstood. Let's say you start with low-feedback amplifiers with a low damping factor, which some people think are good things. That gives you an amp that's more prone to oscillation than it has to be. If you then combine that with a high capacitance cable, next thing you know there's a perfect storm of bad design that really does sound different. What's supposed to be ultrasonic junk moves into audible. And some idiots will think that because it's different, it's better, so next thing you know every part of people's system is tweaked for more of that junk.
  There is a side of the market that demands the best engineered products for the price point at every step of the chain too though. I read audio reviews starting with the bench plots.
52. Re:Many DDR3 modules? by Luckyo · 2014-12-25 17:59 · Score: 1
  
  Actually that is how it works. Concept of a bottleneck refers to aspect of a pipe+pool system where thickness of the pipe is the limiting factor and increasing width of the pipe offers a comparable increase in flow throughput.
  When you double pipe's thickness and get 1-2% more flow, it means that your system's bottleneck is elsewhere.
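
One hedged way to put a number on this disagreement is Amdahl's law, which is a modelling choice made here and not something either poster invoked. Taking the figures quoted in the thread (83 to 86 FPS, memory going from 1333 to 2400), and assuming frame time is inversely proportional to FPS and memory speed scales with the data rate, the sketch below solves for the fraction of frame time that would have to be memory-bound.

```python
def memory_bound_fraction(memory_speedup, overall_speedup):
    """Solve Amdahl's law, S = 1 / ((1 - f) + f / s), for the fraction f."""
    return (1 - 1 / overall_speedup) / (1 - 1 / memory_speedup)


# Figures quoted in the thread; treating them as exact is an assumption.
f = memory_bound_fraction(memory_speedup=2400 / 1333, overall_speedup=86 / 83)
print(round(f, 3))
```

Under those assumptions the memory-bound share comes out at well under a tenth of the frame time: not zero, but far from the dominant limit.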
53. Re:Many DDR3 modules? by strikethree · 2014-12-25 23:09 · Score: 1
  
  the issue is now exasperated.
  Not being a pedant, just trying to be helpful: The word that you are looking for is exacerbated.
  
  --
  "Someone needs to talk to the tree of liberty about its ghoulish drinking problem." by ohnocitizen
54. Re:Many DDR3 modules? by Agripa · 2014-12-27 06:56 · Score: 1
  
  This used to be the problem back in the day before DDR3, true. After DDR3 got to around 1333-1600MHz, the problem was effectively eliminated in favour of latency being the only reasonable bottleneck. And that actually gets worse rather than better when you increase the frequency
  The latency at higher clock frequency does not increase in the way you suggest. It only appears that way because latency is measured in clock cycles so when the clock cycle is halved, twice as many are needed for a given duration.
55. Re:Many DDR3 modules? by Luckyo · 2014-12-27 20:07 · Score: 1
  
  Where did I post anything to suggest what you're suggesting?
  It's well known that increasing RAM frequency impacts latency in net negative way. Your suggestion implies that impact is neutral, when it's rarely so unless you buy much more expensive RAM specifically picked and binned for those frequencies and latencies. Typical RAM sold incurs significant net negative impact on latency as frequency increases. Alternative is lower reliability.
  Anyone who did any overclocking and worked with RAM memory doing it should be well aware of this issue.
56. Re:Many DDR3 modules? by Agripa · 2014-12-28 02:21 · Score: 1
  
  Where did I post anything to suggest what you're suggesting?
  And that [latency] actually gets worse rather than better when you increase the frequency
  Increasing the RAM frequency has little or no effect on latency; it only changes the unit of measurement. Latency as measured in clock cycles goes up but latency measured in nanoseconds stays roughly the same (actually it generally gets better) and it is the latter which matters as far as the processor is concerned.
  The first word access time shown in this table is the most relevant:
  http://en.wikipedia.org/wiki/C...
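
The unit conversion behind this argument can be sketched as follows. DDR memory transfers twice per clock, so a module rated at N MT/s has a clock period of 2000/N nanoseconds. The speed and CL pairs below are assumed typical timings for illustration, not values taken from the thread or from the linked table.

```python
def cas_latency_ns(data_rate_mts, cl_cycles):
    """First-word CAS latency in nanoseconds.

    DDR transfers twice per clock, so one clock lasts 2000 / data_rate ns.
    """
    return cl_cycles * 2000.0 / data_rate_mts


# (data rate in MT/s, CAS latency in cycles): assumed typical timings.
for rate, cl in ((1066, 7), (1333, 9), (1600, 11), (1600, 9)):
    print(rate, cl, round(cas_latency_ns(rate, cl), 2))
```

The cycle counts climb with frequency, yet the latency in nanoseconds stays within a narrow band, and a faster module with tighter (usually pricier) timings comes out ahead.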
57. Re:Many DDR3 modules? by Luckyo · 2014-12-28 08:14 · Score: 1
  
  Took me a while to figure out what you're talking about. That's some exotic trolling. Well done. Shame no one cares about it this far down the chain.
  Your case was specifically addressed long ago when I mentioned the costs. You've linked to standards table which addresses what kinds of memory are made. It's correct to state that in those standards, CAS latency generally gets net better as frequency goes up. What you are trolling on is costs - subject mentioned at the very beginning.
So, what comes next? by Anonymous Coward · 2014-12-24 02:13 · Score: 0

ALLR? (Address Line Layout Randomization)?
1. Re:So, what comes next? by ThePhilips · 2014-12-24 02:29 · Score: 1
  
  Wear Leveling?
  Leakage Leveling?
  P.S. Question is whether a workaround is possible with the CPU microcode.
  
  --
  All hope abandon ye who enter here.
2. Re:So, what comes next? by FirstOne · 2014-12-24 02:46 · Score: 1
  
  ECC is dismissed in the article, but the article ignores that ECC systems also have a scrubbing capability
  Unfortunately, ASUS is the only manufacturer that consistently includes ECC support in their AMD based motherboard line.
3. Re:So, what comes next? by Anonymous Coward · 2014-12-24 07:05 · Score: 0
  
  ECC is dismissed in the article, but the article ignores that ECC systems also have a scrubbing capability
  It's dismissed, because scrubbing only occurs on refresh and if you have multiple bit errors before then, ECC can't fix them. However, it will warn you, which is helpful since this is an attack, not an accident.
4. Re:So, what comes next? by Anonymous Coward · 2014-12-24 15:59 · Score: 0
  
  It dismisses ECC because of the way this attack works: correct and seemingly-ordinary reads in one DRAM row (done very heavily and carefully) induce charge leakage in a neighbouring row. ECC doesn't know until the neighbouring row is read, and can only correct in the case of a single-bit error and typically can only detect two-bit errors. This attack can produce multiple bit errors that ECC typically cannot even detect, let alone correct. Additionally, scrubbing is only done during idle periods and only on refresh, and still can only do what ECC does, which is again almost certainly one-bit-correct/two-bit-detect.
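  To make the one-bit-correct/two-bit-detect limit concrete, here is a toy sketch in Python using an extended Hamming(8,4) code. This is an assumption chosen for brevity: real ECC DIMMs protect 64-bit words with a wider code, and the function names are made up for this example. The behaviour at one, two and three flipped bits is the point.

```python
DATA_POS = (3, 5, 6, 7)  # data bit positions; 1, 2, 4 are Hamming parity


def encode(data):
    """Encode 4 data bits into an 8-bit SECDED codeword."""
    code = [0] * 8
    for pos, bit in zip(DATA_POS, data):
        code[pos] = bit
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # overall parity over positions 1..7
    return code


def decode(code):
    """Return (status, data); data is None when the word is rejected."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Looks like a single-bit error; syndrome 0 means bit 0 itself.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        return "detected", None  # even number of flips: uncorrectable
    return status, [code[p] for p in DATA_POS]


def flip(code, positions):
    """Return a copy of the codeword with the given bit positions flipped."""
    out = list(code)
    for p in positions:
        out[p] ^= 1
    return out


word = [1, 0, 1, 1]
cw = encode(word)
print(decode(flip(cw, [5])))        # one flip: repaired
print(decode(flip(cw, [5, 6])))     # two flips: flagged, not repaired
print(decode(flip(cw, [3, 5, 6])))  # three flips: "corrected" to wrong data
```

  The three-flip case is the nasty one: the decoder reports a successful correction and hands back the wrong bits.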
good news for ECC memory makers by funkymonkjay · 2014-12-24 02:21 · Score: 1

as for me, i'll wait for some real world examples of this possible exploit before i switch to ECC memory, which would mean a new MB on top of the more expensive memory.
1. Re:good news for ECC memory makers by Rei · 2014-12-24 02:35 · Score: 3, Informative
  
  According to the paper, ECC only reduces but does not eliminate the problem (section 6.3). Multiple bits can be corrupted at once.
  
  --
  I am a proud traitor to my species in alliance with my mother the Earth in opposition to those who would destroy her.
2. Re:good news for ECC memory makers by DigiShaman · 2014-12-24 02:48 · Score: 2
  
  Ouch! Seriously bad. Worse than the Pentium FPU bug (and that's bad). What good is a computer if you can't rely on the data being committed back to disk because of corruption mid-flight in RAM?! At least with the FPU bug, it was only FPU. But here we're talking about an industry wide issue where no operation can guarantee that data doesn't get corrupted on its way back to disk. By the time bit-rot sets in, you may have to dive into your grandfather-father-son backup archive. And that's assuming such a backup scheme is being used by those who are affected. Shit, that's assuming people are even backing up their data in the first place!!
  
  --
  Life is not for the lazy.
3. Re:good news for ECC memory makers by Anonymous Coward · 2014-12-24 03:04 · Score: 1
  
  Welcome to the Digital Dark Age
4. Re:good news for ECC memory makers by sshir · 2014-12-24 03:19 · Score: 4, Insightful
  
  At least with ECC you'll get _some_ feedback (it's random so it will pop from time to time) indicating that something fishy is going on. With regular ram all corruptions are silent so you'll get random crashes that will drive you crazy...
5. Re:good news for ECC memory makers by ericloewe · 2014-12-24 03:19 · Score: 2
  
  Difference being that the system is immediately halted if an uncorrectable error is discovered.
6. Re:good news for ECC memory makers by Anonymous Coward · 2014-12-24 03:44 · Score: 0
  
  You should note, this is an intentionally triggered corruption - it won't happen just because you've got DDR3. ECC mitigates nothing about this, it crashes also.
7. Re:good news for ECC memory makers by Anonymous Coward · 2014-12-24 03:56 · Score: 0
  
  You are missing the point; a machine with ECC will log all errors, correctable or not. If the machine is throwing memory errors, you know something is up.
8. Re:good news for ECC memory makers by wolrahnaes · 2014-12-24 04:16 · Score: 2
  
  ECC does not mitigate it, but it will detect the problem where non-ECC memory will happily keep on operating with the corrupted data.
  For the standard car analogy, consider tire pressure monitoring systems. They won't stop you from getting a flat, but they'll let you know you have a slow leak where you might otherwise keep driving until it's bad enough that you notice otherwise. By that time the damage is done and you probably need a new tire.
  
  --
  I used to get high on life, but I developed a tolerance. Now I need something stronger.
9. Re:good news for ECC memory makers by 0123456 · 2014-12-24 04:24 · Score: 1
  
  Ouch! Seriously bad. Worse than the Pentium FPU bug (and that's bad). What good is a computer if you can't rely on the data being committed back to disk because of corruption mid-flight in RAM?!
  It apparently only happens if you read the same bytes from RAM 139,000 times in 64 milliseconds. If your program is doing that, you probably have a lot more to worry about than disk corruption.
  If this was actually happening in the real world, computers would probably be crashing every few minutes.
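  For scale, a back-of-the-envelope calculation in Python using only the two figures quoted above (139,000 reads within one 64 ms window). The variable names are made up for this example.

```python
READS = 139_000      # reads of the same row, as quoted above
WINDOW_S = 0.064     # 64 ms window, as quoted above

ns_per_read = WINDOW_S / READS * 1e9   # average time budget per read
reads_per_second = READS / WINDOW_S    # sustained rate needed

print(f"one read every {ns_per_read:.0f} ns")
print(f"about {reads_per_second / 1e6:.2f} million reads per second")
```

  That works out to one uncached read of the same row roughly every 460 ns, sustained for the whole window.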
10. Re:good news for ECC memory makers by greg1104 · 2014-12-24 05:46 · Score: 1
  
  The test numbers in section 6.3 show that ECC mitigates most of the errors, as the bulk of them are single bit ones. And if you're on a system that's prone to this problem, the odds are you will see a warning about that ECC correction kicking in long before you'll hit one of the uncorrectable multi-bit errors.
11. Re:good news for ECC memory makers by Dragonslicer · 2014-12-24 06:19 · Score: 2
  
  If this was actually happening in the real world, computers would probably be crashing every few minutes.
  You mean attackers have been exploiting this ever since Windows 95?
12. Re:good news for ECC memory makers by complete+loony · 2014-12-24 11:52 · Score: 1
  
  Worse problem; VM server farms. If you can run arbitrary code, you might be able to flip bits in the hypervisor or another VM.
  
  --
  09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
13. Re:good news for ECC memory makers by Anonymous Coward · 2014-12-24 16:07 · Score: 0
  
  It's harder -- almost certainly MUCH harder -- to do in a VM because you have to do intensive reads of a particular DRAM line and the native processor will definitely want to cache intensive reads. The attack involves attempting to force uncached intensive reads by flushing the Ln-caches in a tight loop on a single core, and that's hard to do with most sorts of modern operating systems on native metal where the attack is even runtime feasible. The VM infrastructure probably doesn't provide a monopolize-a-single-core/monopolize-its-NUMA-caches-for-the-target-lines/flush-those-caches-in-a-CPU-intensive-loop.
  Just attempting the exploit itself amounts to a DoS attack, and to make it successful you have to know a lot about the memory mapping.
The world is falling apart! by Anonymous Coward · 2014-12-24 02:32 · Score: 0

We have proof!
Why wait? by Anonymous Coward · 2014-12-24 02:41 · Score: 0

Your reply already seems to suffer,
Malicious code can cause computers to crash by rossdee · 2014-12-24 02:47 · Score: 2

Of course if you can get the target computer to run certain code, you can completely wipe all the RAM, but where's the fun in that huh..
1. Re:Malicious code can cause computers to crash by MightyYar · 2014-12-24 04:05 · Score: 2
  
  This gives you a way to affect RAM outside of a sandbox.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
2. Re:Malicious code can cause computers to crash by 0123456 · 2014-12-24 04:10 · Score: 1
  
  This gives you a way to affect RAM outside of a sandbox.
  Only if the sandbox lets you repeatedly access memory and flush the cache between accesses, and you happen to know where your data is in physical RAM.
3. Re:Malicious code can cause computers to crash by Anonymous Coward · 2014-12-24 04:13 · Score: 0
  
  What are you targeting, an Apple II? Now that we have memory protection and sandboxed code execution, it's harder to run arbitrary code in privileged mode.
4. Re:Malicious code can cause computers to crash by MightyYar · 2014-12-24 04:53 · Score: 1
  
  Ah, yes, well I should have said "possibly" :)
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
5. Re:Malicious code can cause computers to crash by Macman408 · 2014-12-24 10:23 · Score: 1
  
  It depends a bit on the physical structure of the RAM, but for the most part, the errors fall on logically adjacent rows (i.e. nearby memory addresses) in the RAM. So most of the time, you'll only affect other RAM inside your sandbox, and if you affect something outside the sandbox, it won't be far outside.
  I remember encountering a similar failure when designing a system; the particular memory controller and the particular DRAM module we were using both met all applicable specs, but when used together in a particular manner, they would fail miserably. The specific test was to alternate writing all zeros and all ones at different addresses. The RAM controller had an oddity where it would enable the drivers for the RAM data pins very briefly before the data was known. For that particular data pattern, that meant that it would drive all ones on the data pins to the RAM for less than a nanosecond, before starting to drive all zeros (or the reverse). There's nothing really against that in the spec; the data was all correct for all the relevant setup and hold time requirements relative to the control signals. However, it caused a lot of noise on the ground plane of the DRAM module; we measured as much as 0.75V or so. (That's measuring the ground voltage on one side of the SO-DIMM to the ground voltage on the other side; it's shorted by a mostly-solid layer of copper, but that just wasn't enough to carry all the current with this particular access pattern.) So from the point of view of the RAM chips, it's a little like having your 2.5V supply voltage suddenly drop to 1.75V. It messes up all the reference voltages, so a 1 might be interpreted as a 0, or vice versa. The memory controller manufacturer refused to do anything about it (and it would've taken them many months to redesign and respin the chip anyway), but the RAM module manufacturer was friendly to us, and they beefed up the ground plane so that the noise level was much more manageable.
  In any case, I'm sure there are thousands of faults like this that are just waiting to be found and exercised in any given system. No modern computer is 100% tested, they're far too complicated. There will always be some weird sequence of things that could happen and trigger some failure - but hopefully that sequence is so odd, it'll never happen.
Does the cache control commands require root acces by TheSunborn · 2014-12-24 03:14 · Score: 2

Do the cache control commands require root access on Windows or Linux?
Why we need coders. by jpellino · 2014-12-24 03:19 · Score: 0

"just two memory reads with special relative offset and some cache control instructions in a tight loop" Yuh hurt yer what?

--
"Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
1. Re:Why we need coders. by Anonymous Coward · 2014-12-24 05:03 · Score: 0
  
  this is some ASM stuff.
  average yuck here either is the "I DUN NEED NO COLLAGE" types who scoff at low level stuff, or fell asleep in Assembly as low level language hasn't been necessary for a while.
  I see this and it makes my hair stand on end, as anybody could wreck your data with just 10 lines of ASM.
2. Re:Why we need coders. by Pelam · 2014-12-24 06:47 · Score: 1
  
  XD
Wow, a Forgetful Christmas Bug by Anonymous Coward · 2014-12-24 03:21 · Score: 1

The authors did a good job of covering the issue
Also, the paper is a good primer on dram stuff in general.
Unfortunately, this Christmas present violates the Engineer's first rule.
Try to stay out of the news, because when you are in the news, it's usually not a good thing.
The failure mechanism:
There is a bug in most DDR3 chips, especially those built after 2010.
If you do too many read cycles in too short a time to the same row, some bits in an adjacent row may automagically change.
Kind of a cumulative, adjacent cell disturb mechanism.
Existing programs may do this accidentally, but it is unlikely because the cache usually lowers the number of read cycles to a safe number.
This can easily be done with a strange program using cache flushes, which an ordinary x86 user process can do if it wishes.
Mitigations on existing memory controllers:
ECC likely does not help because more bits are likely to be disturbed than most ECC can handle.
Keep strange programs off your system.
Shortening the refresh interval from 64 ms to 8 ms seems to eliminate the issue, with perhaps a 35% performance hit.
The OS might be able to remap the memory so that only every other physical row is used, with a 50% decrease of memory capacity.
At least it's a 100% increase in reliable memory.
Mitigations on new equipment:
DRAMs that meet their specifications would be nice, but this seems more likely to be a change in the specs.
An increased refresh rate on rows near a lot of activity.
The authors propose a probability-based plan.
Seems like one based on hard accounting might be smarter if you have to change the controller anyway.
Consequences:
This mechanism produces random results.
It seems there are likely more fruitful ways to break into a system.
The ease of implementation and wide applicability still make it an (ah-hem) interesting bug to say the least.
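
A rough Python sketch of the probability-based idea mentioned under the mitigations: every time a row is closed, flip a biased coin, and on heads refresh one of its two neighbours. The function name, the row count, and the value of p are assumptions picked for illustration; a real controller would do this in hardware, and the paper should be consulted for the actual scheme and parameters.

```python
import random


def simulate_probabilistic_refresh(hammer_row, activations, p, num_rows, seed=0):
    """Count the extra refreshes each neighbour of hammer_row receives."""
    rng = random.Random(seed)
    refreshes = {}
    for _ in range(activations):
        # Biased coin: with probability p, refresh one adjacent row.
        if rng.random() < p:
            neighbours = [r for r in (hammer_row - 1, hammer_row + 1)
                          if 0 <= r < num_rows]
            victim = rng.choice(neighbours)
            refreshes[victim] = refreshes.get(victim, 0) + 1
    return refreshes


# 139,000 activations of one row; p and num_rows are assumed values.
counts = simulate_probabilistic_refresh(
    hammer_row=100, activations=139_000, p=0.001, num_rows=32768)
print(counts)
```

The appeal is that the controller keeps no per-row state at all; the cost is that protection is only statistical, so p has to be large enough that a hammered row's neighbours almost surely get refreshed before they flip.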
1. Re:Wow, a Forgetful Christmas Bug by skids · 2014-12-24 03:43 · Score: 1
  
  Thank you. Very helpful of you.
  
  --
  Someone had to do it.
2. Re:Wow, a Forgetful Christmas Bug by ChipMonk · 2014-12-24 04:35 · Score: 1
  
  It's too bad you posted this as AC. You could have gotten some good karma from the mod points.
Re:Does the cache control commands require root ac by PhrostyMcByte · 2014-12-24 03:31 · Score: 5, Informative

No. These are standard instructions that many apps require to function correctly when using multiple threads. Even if you aren't using them directly, at least some of the APIs you use most certainly are.
This makes me think of mid-90s Macs by RogueWarrior65 · 2014-12-24 03:40 · Score: 1

Way back when RAM was stupid expensive, one way to reduce cost was to use so-called composite RAM. On high-end Macs back in the early-mid 1990s, that could cause the machine to not boot but instead play the first four notes of the Twilight Zone theme song.
Re:Does the cache control commands require root ac by Anonymous Coward · 2014-12-24 03:40 · Score: 0

No. These instructions would be pretty pointless if they were restricted. They are designed to control low-level behaviour of processor caches for code optimization purposes. This is mostly relevant for expensive computations. These happen much more often in application code than in kernel code.
Not theoretical. It's hogwash. by Anonymous Coward · 2014-12-24 03:42 · Score: 5, Funny

This is ridiculous. Realistically, when have you ever run into a situation where stib teg ylirartibra deppilf?
this is why by Anonymous Coward · 2014-12-24 03:56 · Score: 0

I'm never upgrading from my vaxstation 4000. none of this new fangled tech for me. no sirree.
MEMTEST?? by Anonymous Coward · 2014-12-24 04:11 · Score: 0

Good summary.
My question is about MEMTEST, one of my boot options in GRUB, which tests all kinds of memory patterns, from systematic to pretty random.
Will passing a few total test cycles (often 12-24 hours for large RAM amounts) indicate less chance of this kind of adjacent bit corruption?
1. Re:MEMTEST?? by Anonymous Coward · 2014-12-24 09:54 · Score: 0
  
  No. This kind of bit corruption is most certainly not tested by MEMTEST.
2. Re:MEMTEST?? by Anonymous Coward · 2014-12-24 09:59 · Score: 0
  
  not until Memtest86 v6.0 Beta, see http://www.passmark.com/forum/...
Using Non-ECC Ram is Unacceptable by BrendaEM · 2014-12-24 04:16 · Score: 1, Insightful

Unless you are making a Speak-and-Spell, it's foolish not to use non-ECC RAM. I would rather pay an additional ninth as much and have some peace of mind that the RAM will at least keep from flipping a bit from cosmic rays, which happens about once a week.
I take that back; put it in the Speak-and-Spell, too.

--
https://www.youtube.com/c/BrendaEM
1. Re:Using Non-ECC Ram is Unacceptable by Archtech · 2014-12-24 04:41 · Score: 0
  
  I assume you meant "it's foolish to use non-ECC RAM".
  
  --
  I am sure that there are many other solipsists out there.
2. Re:Using Non-ECC Ram is Unacceptable by twistedcubic · 2014-12-24 07:17 · Score: 1
  
  This is true. However, getting a laptop with ECC RAM straight from the manufacturer is never an option, and impossible when RAM is soldered onto the motherboard. I think if Apple started using ECC RAM, and advertised it, others might follow suit (like with the "retina" displays).
3. Re:Using Non-ECC Ram is Unacceptable by thegarbz · 2014-12-24 10:53 · Score: 1
  
  How foolish and for what specific workload? I have a gaming rig where I sometimes edit photos and do 3d design and some light coding. In the past 10 years I've never seen any visible data corruption and not had an inexplicable crash.
  So tell me again why I should spend the money? Your once a week problem sounds more theoretical than practical.
4. Re:Using Non-ECC Ram is Unacceptable by Archtech · 2014-12-25 23:47 · Score: 1
  
  Why was my comment moderated "Troll" when I merely pointed out that the parent had unintentionally inserted an extra negative in his statement? The drift of his comment was surely that ECC RAM is better. Yet he wrote "it's foolish not to use non-ECC RAM".
  It's sad that moderators don't take the trouble to read what is in front of them. Or, worse still, that at least one moderator routinely mods my comments "Troll" without reading them.
  
  --
  I am sure that there are many other solipsists out there.
memtest86 includes a test for this by Anonymous Coward · 2014-12-24 04:17 · Score: 1

Sort of an already known 'weakness'; recent memtest86 includes the 'hammer test' for the purpose of testing this case, see http://www.passmark.com/forum/showthread.php?4836-MemTest86-v6-0-Beta
Re:Does the cache control commands require root ac by 0123456 · 2014-12-24 04:17 · Score: 1

No. These are standard instructions that many apps require to function correctly when using multiple threads.
Can you explain when you'd need to flush the cache when using multiple threads? You'd have to flush the cache back to RAM (isn't that a privileged instruction?), invalidate it, then read the data back from RAM. That's surely insanely slow compared to just using the CPU's internal cache coherency mechanisms?
Maybe parity isn't just for farmers? by Anonymous Coward · 2014-12-24 04:34 · Score: 0

At least with parity or inadequate ecc, the computer would likely stop before causing too much harm with bad results.
not for threaded xode by Anonymous Coward · 2014-12-24 04:39 · Score: 0

It's actually not for multithreaded code but rather for DMA coherency. DMA devices can access main memory so you may need to flush data from the CPU cache to main memory to ensure the PCI devices get the current data.
1. Re:not for threaded xode by 0123456 · 2014-12-24 06:24 · Score: 1
  
  That seems more likely, but, when I was writing DMA code years ago, we put the buffers in non-cached RAM (and they were only written to from a driver in the kernel). Maybe explicit cache flushes are faster these days.
Why now? by TheRaven64 · 2014-12-24 05:01 · Score: 0

This paper was published at ISCA in June and on Soylent News earlier today (or possibly yesterday). Why is it suddenly being circulated six months after publication? Someone trying to promote ECC memory?

--
I am TheRaven on Soylent News
Known issue by Anonymous Coward · 2014-12-24 05:10 · Score: 5, Informative

This has been known for some time. It's been referred to as "Row Hammer" and has been discussed at length by Intel and DRAM manufacturers.
https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#safe=off&q=intel%20row%20hammer
I've seen it cause multi-bit errors in ECC systems.
Re:Maybe they can fix it with... by Anonymous Coward · 2014-12-24 05:38 · Score: 0

...or better DRAM modules. Does this problem occur with DDR2 modules?
Re:Does the cache control commands require root ac by Anonymous Coward · 2014-12-24 05:41 · Score: 0

clflush is really quite pointless. wfence() is what you want. arm has gotten rid of cache line flushing.
Re:Does the cache control commands require root ac by Anonymous Coward · 2014-12-24 06:08 · Score: 0

Nope. It's an amazing DoS attack. Just wrap the code in some offset finding and address marching code, compile it in /tmp, execute it in the background, rm it, and watch it cause chaos. This will be a big problem on any shared platform.
So.... by Festering+Leper · 2014-12-24 06:09 · Score: 1

Liquid nitrogen for your RAM then...?

--
if you want people to think you know what you are talking about, just put ".com" at the end of everything you say.com
Write-Only Memory by marciot · 2014-12-24 06:49 · Score: 1

This is the reason I recommend that everyone invest in write-only memory for their computers. It is far more secure and hack proof than the alternatives.
1. Re:Write-Only Memory by Anonymous Coward · 2014-12-24 16:36 · Score: 0
  
  Wa-hahahaha. Hardee har har. Whoo hoo hoo hoo. Phunnie.
  Write-only memory? Who's ever heard of such a ridiculous thing?
  Oh... Wikipedia has. Wikipedia article: Write-only memory (joke). (See also:Wikipedia article for Null device (redirected from Wikipedia article for /dev/null).)
  On a more serious note, though, Wikipedia's article for Write-only memory is actually a disambiguation page... there is also Wikipedia's page for Write-only memory (engineering) which actually describes actual usefulness.
  Sadly, the most clear example of "usefulness" provided by that article is in the "security and encryption" section where it describes DRM: securing the decrypted information from the user who shouldn't have access to the unencrypted data. But there are some other examples provided where one device can write to memory and then that same device doesn't have the ability to read from the memory that it wrote to. Log files could be a simple theoretical example. Sending info to a remote log server: generally okay. Reading all the privileged contents: not necessarily okay (depending on permission levels).
Just FANTASTIC :( by Anonymous Coward · 2014-12-24 06:52 · Score: 0

The rush to DDR3 was a cynical cash-grab designed to artificially inflate the price of DRAM modules (which happened beyond the wildest dreams of commodity memory speculators). Now we find the 'standard' is inherently faulty by design. And yet the equally worthless DDR4 is on our doorstep, offering almost nothing BUT another round of price inflation.
Current CPUs are NOT RAM bound: their large internal caches have mitigated restricted external bandwidth for all common computational tasks for years now. ONLY the onboard GPU benefits from boosting RAM speeds, and even then recent moves by AMD and Nvidia to use real-time compression to reduce bandwidth requirements imply existing memory bandwidth would be more than good enough if something like GDDR5 was used in place of the DDRn families of chips.
What we do NEED, however, is absolute reliability in memory chips. Now that a modern computer handles memory allocation through a VASTLY complicated system of dynamic virtual mapping, with many cores potentially seeking access to the 'same' memory locations, only rock-solid memory design can prevent terrible, deep, dangerous 'bugs' caused by unpredictable memory faults.
Since DDR3 makes MASSIVE profits hand over fist, we might at least expect reliability for memory that has never been more expensive across the recent past. Sadly, the old adage of "the more you pay, the less you get" is kicking in. It is years since PC performance fanatics have fretted over exotic brands of RAM, so no-one has paid proper attention to the issue of correct RAM stick behaviour for quite some time.
It can't be a coincidence that, while I was working with a brand new Kabini (AMD) laptop recently, problems led me to run the standard memory tests, which immediately showed the manufacturer-provided RAM was 'faulty'. I'd bet anything that the entire run of this particular laptop from HP has the same problem. A machine designed and built without ANYONE at HP even bothering to run the standard Microsoft Windows memory test. Recent AMD CPU/APU parts have proven VERY poor on the memory reliability side (AMD was the first to build the memory controller INTO the CPU chip, and AMD systems subsequently became FAR fussier about the memory sticks they'd work correctly with).
softECC by Anonymous Coward · 2014-12-24 08:05 · Score: 0

Why can't ECC be done in software? At least for userland applications (maybe not necessarily for kernel-space memory)?
There's prior art on softecc:
http://pdos.csail.mit.edu/papers/softecc:ddopson-meng/softecc_ddopson-meng.pdf
Re:Maybe they can fix it with... by Anonymous Coward · 2014-12-24 08:18 · Score: 0

Yes, yes...let the butthurt flow through you. Waste your modpoints knocking me down.
Not the first time hammering caused trouble. by Ungrounded+Lightning · 2014-12-24 11:46 · Score: 1

Story I heard about mid-20th-century IBM mainframe. (I think it was the 360 series).
Core memory was tight and had cooling issues. The designers examined the instruction set and determined that, given caching and the like, no infinite loop could hammer a particular location more than one cycle in four (25% duty cycle), for which cooling was adequate. So they shipped.
Turns out, though, you could do a VERY LONG FINITE loop that hit a location every other cycle, for 50% duty cycle (not to mention the possibility of hitting a nearby location with some of the remaining cycles). Wasn't too long before a student managed to do this.
And set the core memory on fire.

--
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Allready known for flash memory by xluap · 2014-12-24 12:35 · Score: 1

Read disturb was already known for flash memory. Read disturb is when a flash cell flips a bit because other cells adjacent to the disturbed cell are repeatedly read.
1. Re:Allready known for flash memory by Agripa · 2014-12-27 07:44 · Score: 1
  
  We used to call this "pattern sensitivity" when applied to RAM.
Wow. Superbad. by drolli · 2014-12-24 15:32 · Score: 2

That's an evil bug. This could even be triggered accidentally by bad programming.
But more important, this allows you to break your VM's memory boundaries without any restriction. If you happen to make an educated guess about the memory layout of the physical machine and the host and guest kernel images loaded, you can try to
a) manipulate the host kernel directly (that would be nearly undetectable)
b) manipulate private keys in other VMs or the host
c) manipulate other VMs memory
d) communicate between VMs
And all of this is independent of any software bug. The only thing which can be done about it would be to disable the feature on the simulated guest processor which allows the cache to be manipulated arbitrarily (and implicitly limit running guest programs to 1 core!). Alternatively, increase the refresh rate (i remember that the refresh rate could actually be set manually in the 90s).
That being said, i just wonder if it is possible to trigger this bug from a high level language (e.g. matlab) or the JVM where the operation causing the problem could be used implicitly for some vectorized code or other operations, e.g. can this bug be triggered by the volatile keyword in Java and accessing the memory in the same way?
1. Re:Wow. Superbad. by Anonymous Coward · 2014-12-24 16:16 · Score: 0
  
  Just my two cents, as someone who spent 15 seconds reading a bit from the articles:
  Cent #1:
  
  That being said, i just wonder if it is possible to trigger this bug from a high level language (e.g. matlab) or the JVM [...]?
  I suggest you speculate.
  Maybe even do some original research.
  Whatever you do, by all means, do NOT read the first paragraph of the second paper that the summary refers to. If you did that, you would read this...
  
  We present an experimental study showing that soft memory errors can lead to serious security vulnerabilities in Java and .NET virtual machines, or in any system that relies on type-checking of untrusted programs as a protection mechanism. Our attack works by sending to the JVM a Java program that is designed so that almost any memory error in its address space will allow it to take control of the JVM. All conventional Java and .NET virtual machines are vulnerable to this attack.
  ... which directly answers your quoted question. You could learn from someone else's work, instead of repeating the work yourself. We obviously can't tolerate such efficiency.
  Cent #2 (since, I did say, I'm offering you my two cents...):
  Let's look at the rest of your post.
  
  you can try to
  a) manipulate the host kernel directly (that would be nearly undetectable)
  b) manipulate private keys in other VMs or the host
  c) manipulate other VMs memory
  d) communicate between VMs
  Hmmm... or, e), you could try to subvert security systems. You could elevate, getting permission to do things that you wouldn't otherwise be able to do. You could effectively modify any physical bit of RAM. I'd love to take credit for this novel idea, but in truth, this is exactly what the security researchers indicated in their summaries.
  As terrible as your theoretical attacks a), b), c), and d) all sound, all of those are clearly possible if you subvert the security system that is designed to prevent people from doing those things. If e) is accomplished (meaning that security becomes entirely ineffective), then all physical memory can be written to (by the attacker). So, as dramatic as your ideas may sound (because each one of those issues do appear to be very serious problems), I think you actually may be understating the case.
2. Re:Wow. Superbad. by Anonymous Coward · 2014-12-24 20:52 · Score: 0
  
  No. The article does not answer my question, just because the term JVM appears in both. HINT: take more than 15 seconds to read and think about it.
  My theoretical attacks were actually very practical and specific. Let me rephrase it like this: the next time my amazon ec2 micro instance dies unexpectedly, i will think again.
3. Re:Wow. Superbad. by Pinhedd · 2014-12-25 05:27 · Score: 1
  
  It's not possible to do any of those.
  1. The mechanism that this uses doesn't provide for deterministic results. At worst, rewriting the same row numerous times may result in some of the bits in spatially related rows being corrupted.
  2. Address spaces are highly randomized and virtual to physical translation makes it incredibly difficult to obtain even an educated guess as to the layout.
  This exploit just allows an attacker to possibly corrupt nearby data. It's a troll tool, nothing else.
4. Re:Wow. Superbad. by drolli · 2014-12-27 01:53 · Score: 1
  
  Maybe. Maybe not. Not sure what the effect of second-order page translation would be if you manage to trigger the loading of a module (or the first use of memory in a module) in another VM after your VM has been loaded. If you manage to trigger the access to the module's data memory, which normally may be unused after you allocate ("pad") enough memory, i could imagine that you can actually kill "nearby" data (which in second-order translation would appear physically close to your memory).
  I am not saying that this is a MOV instruction into another VM's memory, but merely stating that by an educated guess, and innocent network communication, you could sometimes reset/clear flags or counters which may enable you to do further things.
Re: Does the cache control commands require root a by Anonymous Coward · 2014-12-24 16:15 · Score: 0

You shouldn't be required to skip the cache for most CPU bound thread sync. Cores on one package can snoop each other's cache to resolve semaphores without a RAM trip, and I believe that multi-chip SMP also has a communication channel for that.
For now, I'm sure that Xen is going nuts making sure there are guard pages to keep any smashing inside your own VM. I wonder if any Cloud providers match ECC errors to particular VMs, and would like to have a nice chat with anybody trying to trigger this on their hardware.
Re: Does the cache control commands require root a by Bengie · 2014-12-25 11:48 · Score: 1

FYI: You can snoop L2 cache, but not L1. Intel went with inclusive cache so snooping wouldn't be needed. AMD went with exclusive, which gives better cache usage, but when trying to sync threads, all of that cache snooping is a high latency operation. With an inclusive cache, you no longer need to snoop, just look at the cache normally.

AMD has higher overall throughput for many GPU type work loads, but Intel shines with work loads that require thread syncing.
The Patent office did it again by Anonymous Coward · 2014-12-26 03:42 · Score: 0

http://www.google.com/patents/WO2014004748A1?cl=en
Row hammer refresh command
Claim 1:
Watch for too many cycles to a row
and when it happens send a refresh to the adjacent row.
Given an understanding of the failure mechanism, a junior engineer should be able to think of this in about 100 ms.
The paper in this article was very careful not to present this solution, instead presenting a probabilistic one.
This presents an interesting and useful experiment.
How many folks reading the paper thought of the claim 1 solution which Intel claims is novel?
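
For the curious, a minimal Python sketch of the claim 1 idea as summarized above: count activations per row, and once a row crosses a threshold, refresh its neighbours. The class name, the threshold, and the row count are all hypothetical; an actual memory controller would implement this in hardware, and the patent text may differ in detail.

```python
from collections import defaultdict


class RowActivationCounter:
    """Track per-row activations and flag neighbours that need a refresh."""

    def __init__(self, threshold, num_rows):
        self.threshold = threshold
        self.num_rows = num_rows
        self.counts = defaultdict(int)

    def activate(self, row):
        """Record one activation; return the rows to refresh now, if any."""
        self.counts[row] += 1
        if self.counts[row] < self.threshold:
            return []
        self.counts[row] = 0  # neighbours are about to be refreshed
        return [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]

    def refresh_window_elapsed(self):
        """A full regular refresh pass recharges every row: start over."""
        self.counts.clear()


ctrl = RowActivationCounter(threshold=3, num_rows=8)
for i in range(1, 7):
    print(i, ctrl.activate(5))
```

Unlike the coin-flip approach, this gives a hard guarantee, at the price of keeping a counter for every row that is being hit.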