Many DDR3 Modules Vulnerable To Bit Rot By a Simple Program
New submitter Pelam writes: Researchers from Carnegie Mellon and Intel report that a large percentage of tested regular DDR3 modules flip bits in adjacent rows (PDF) when a voltage in a certain control line is forced to fluctuate. The program that triggers this is dead simple — just two memory reads with special relative offset and some cache control instructions in a tight loop. The researchers don't delve deeply into applications of this, but hint at possible security exploits. For example a rather theoretical attack on JVM sandbox using random bit flips (PDF) has been demonstrated before.
I don't know if there are hundreds or thousands or hundreds of thousands of low level 'bugs' like this related to simple subsystems abused in specific ways.. but there are plenty.
This is all very interesting but totally pointless! Which modules? Tell us the brands, model names, manufacturer numbers?
As for me, I'll wait for some real-world examples of this possible exploit before I switch to ECC memory, which would mean a new motherboard on top of the more expensive memory.
Wear Leveling?
Leakage Leveling?
P.S. Question is whether a workaround is possible with the CPU microcode.
All hope abandon ye who enter here.
I would guess that it's theoretical because it involves things like knowing exactly where the JVM is positioned in physical memory, and how its pages are laid out. That, and that the demonstration involved knowing all of these things before you started.
ECC is dismissed in the article, but the article ignores that ECC systems also have a scrubbing capability
Unfortunately, ASUS is the only manufacturer that consistently includes ECC support in their AMD based motherboard line.
Of course, if you can get the target computer to run certain code, you can completely wipe all the RAM, but where's the fun in that, huh?
Do the cache control commands require root access on Windows or Linux?
The authors did a good job of covering the issue
Also, the paper is a good primer on dram stuff in general.
Unfortunately, this Christmas present violates the Engineer's first rule.
Try to stay out of the news, because when you are in the news, it's usually not a good thing.
The failure mechanism:
There is a bug in most DDR3 chips, especially those built after 2010.
If you do too many read cycles to the same row in too short a time, some bits in an adjacent row may automagically change.
Kind of a cumulative, adjacent cell disturb mechanism.
Existing programs may do this accidentally, but it is unlikely because the cache usually lowers the number of read cycles to a safe number.
This can easily be done with a strange program using cache flushes, which an ordinary x86 user process can do if it wishes.
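To make the mechanism above concrete, here is a toy model, not real hardware code. It assumes an open-page row buffer (a repeated read of the row that is already open does not re-activate it) and models the CPU cache as a plain set. The class name, the helper and the row numbers 10 and 20 are made up for illustration.

```python
class ToyBank:
    """One DRAM bank with a single row buffer (assumed open-page policy)."""

    def __init__(self):
        self.open_row = None
        self.activations = {}  # row -> number of times the row was opened

    def read(self, row):
        # Only a read to a row that is not currently open causes an activation.
        if row != self.open_row:
            self.open_row = row
            self.activations[row] = self.activations.get(row, 0) + 1


def run_loop(rows, iterations, flush):
    """Read each row once per iteration, optionally flushing the toy cache."""
    bank = ToyBank()
    cached = set()  # stand-in for the CPU cache
    for _ in range(iterations):
        for row in rows:
            if row not in cached:  # a cache hit never reaches DRAM
                bank.read(row)
                cached.add(row)
        if flush:
            cached.clear()  # stand-in for the cache flush instructions
    return bank.activations


print(run_loop([10, 20], 1000, flush=False))  # cache absorbs the reads
print(run_loop([10, 20], 1000, flush=True))   # both rows opened every pass
print(run_loop([10], 1000, flush=True))       # row buffer absorbs the reads
```

In this model only the combination of two rows in the same bank plus a flush produces an activation on every read, which would be why the loop needs two reads at a particular offset and not just one.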
Mitigations on existing memory controllers:
ECC likely does not help because more bits are likely to be disturbed than most ECC can handle.
Keep strange programs off your system.
Changing the refresh interval from 64 ms to 8 ms seems to eliminate the issue, with perhaps a 35% performance hit.
The OS might be able to remap the memory so that only every other physical row is used, with a 50% decrease of memory capacity.
At least it's a 100% increase in reliable memory.
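A rough back-of-the-envelope sketch of why the shorter refresh interval helps: it caps how many times a row can be opened before its neighbours get refreshed. The 50 ns row cycle time below is an assumed round number, not a figure from the paper, and the function name is made up.

```python
ASSUMED_ROW_CYCLE_NS = 50  # assumption: rough time for one activate/precharge cycle


def activation_budget(refresh_window_ms, row_cycle_ns):
    """Upper bound on row activations that fit in one refresh window."""
    window_ns = refresh_window_ms * 1_000_000
    return window_ns // row_cycle_ns


for window_ms in (64, 8):
    budget = activation_budget(window_ms, ASSUMED_ROW_CYCLE_NS)
    print(f"{window_ms} ms window: at most {budget} activations")
```

Under that assumption the budget drops from 1,280,000 activations per window to 160,000, an eightfold cut in how hard a row can be hammered between refreshes.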
Mitigations on new equipment:
DRAMs that meet their specifications would be nice, but this seems more likely to lead to a change in the specs.
An increased refresh rate on rows near a lot of activity.
The authors propose a probability-based plan.
Seems like one based on hard accounting might be smarter if you have to change the controller anyway.
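To compare the two ideas above, here is a small simulation sketch. It is not the authors' design, just a toy: one aggressor row is opened over and over, and we track the longest streak of activations during which its neighbours were not refreshed. The probability 0.001 and the threshold 50,000 are arbitrary illustrative values.

```python
import random


def longest_run_probabilistic(activations, p, rng):
    """Refresh the neighbours with probability p after each activation."""
    longest = run = 0
    for _ in range(activations):
        run += 1
        longest = max(longest, run)
        if rng.random() < p:
            run = 0  # neighbours refreshed, disturbance starts over
    return longest


def longest_run_counted(activations, threshold):
    """Refresh the neighbours once a per-row counter reaches threshold."""
    longest = run = 0
    for _ in range(activations):
        run += 1
        longest = max(longest, run)
        if run >= threshold:
            run = 0  # counter hit the limit, neighbours refreshed
    return longest


rng = random.Random(1)  # fixed seed so the run is repeatable
print("probabilistic:", longest_run_probabilistic(200_000, 0.001, rng))
print("counted:      ", longest_run_counted(200_000, 50_000))
```

The counted version gives a hard bound equal to the threshold but needs a counter per row in the controller. The probabilistic version keeps no state and only makes long streaks extremely unlikely, which is the trade-off being argued about.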
Consequences:
This mechanism produces random results.
It seems there are likely more fruitful ways to break into a system.
The ease of implementation and wide applicability still make it an (ah-hem) interesting bug to say the least.
No. These are standard instructions that many apps require to function correctly when using multiple threads. Even if you aren't using them directly, at least some of the APIs you use most certainly are.
Way back when RAM was stupid expensive, one way to reduce cost was to use so-called composite RAM. On high-end Macs back in the early-mid 1990s, that could cause the machine to not boot but instead play the first four notes of the Twilight Zone theme song.
This is ridiculous. Realistically, when have you ever run into a situation where stib teg ylirartibra deppilf?
Unless you are making a Speak-and-Spell, it's foolish to use non-ECC RAM. I would rather pay an additional ninth as much and have some peace of mind that the RAM will at least keep from flipping a bit from cosmic rays, which happens about once a week.
I take that back; put it in the Speak-and-Spell, too.
This is sort of an already known 'weakness'. Recent MemTest86 versions include a 'hammer test' for exactly this case; see http://www.passmark.com/forum/showthread.php?4836-MemTest86-v6-0-Beta
No. These are standard instructions that many apps require to function correctly when using multiple threads.
Can you explain when you'd need to flush the cache when using multiple threads? You'd have to flush the cache back to RAM (isn't that a privileged instruction?), invalidate it, then read the data back from RAM. That's surely insanely slow compared to just using the CPU's internal cache coherency mechanisms?
This has been known for some time. It's been referred to as "Row Hammer" and has been discussed at length by Intel and DRAM manufacturers.
https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#safe=off&q=intel%20row%20hammer
I've seen it cause multi-bit errors in ECC systems
Liquid nitrogen for your RAM then...?
if you want people to think you know what you are talking about, just put ".com" at the end of everything you say.com
That seems more likely, but when I was writing DMA code years ago, we put the buffers in non-cached RAM (and they were only written to from a driver in the kernel). Maybe explicit cache flushes are faster these days.
XD
This is the reason I recommend that everyone invest in write-only memory for their computers. It is far more secure and hack proof than the alternatives.
Story I heard about mid-20th-century IBM mainframe. (I think it was the 360 series).
Core memory was tight and had cooling issues. The designers examined the instruction set and determined that, given caching and the like, no infinite loop could hammer a particular location more than one cycle in four (25% duty cycle), for which cooling was adequate. So they shipped.
Turns out, though, you could do a VERY LONG FINITE loop that hit a location every other cycle, for 50% duty cycle (not to mention the possibility of hitting a nearby location with some of the remaining cycles). Wasn't too long before a student managed to do this.
And set the core memory on fire.
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
Read disturb was already known for flash memory. Read disturb is when a flash cell flips a bit because cells adjacent to it are repeatedly read.
That's an evil bug. This could even be triggered accidentally by bad programming.
But more important, this allows you to break your VM's memory boundaries without any restriction. If you happen to make an educated guess about the memory layout of the physical machine and the host and guest kernel images loaded, you can try to
a) manipulate the host kernel directly (that would be nearly undetectable)
b) manipulate private keys in other VMs or the host
c) manipulate other VMs' memory
d) communicate between VMs
And all of this is independent of any software bug. The only thing that can be done about it would be to disable the feature on the simulated guest processor that allows arbitrary manipulation of the cache (and implicitly limit running guest programs to 1 core!). Alternatively, increase the refresh rate (I remember that the refresh rate could actually be set manually in the 90s).
That being said, I just wonder if it is possible to trigger this bug from a high-level language (e.g. MATLAB) or the JVM, where the operation causing the problem could be used implicitly for some vectorized code or other operations. E.g., can this bug be triggered by the volatile keyword in Java and accessing the memory in the same way?
Because it's a scientific theory, or as wiki says: A scientific theory is a well-substantiated explanation of some aspect of the natural world that is acquired through the scientific method and repeatedly tested and confirmed through observation and experimentation. As with most (if not all) forms of scientific knowledge, scientific theories are inductive in nature and aim for predictive power and explanatory force.
FYI: You can snoop L2 cache, but not L1. Intel went with inclusive cache so snooping wouldn't be needed. AMD went with exclusive, which gives better cache usage, but when trying to sync threads, all of that cache snooping is a high-latency operation. By having the cache be inclusive, you no longer need to snoop; just look at the cache normally.
AMD has higher overall throughput for many GPU type work loads, but Intel shines with work loads that require thread syncing.