A New Vulnerability In RSA Cryptography

Posted by ryuzaki0 on Saturday November 18, 2006 @09:45AM from the predictions-of-trouble dept.

romiz writes, "Branch Prediction Analysis is a recent attack vector against RSA public-key cryptography on personal computers that relies on timing measurements to get information on the bits in the private key. However, the method is not very practical because it requires many attempts to obtain meaningful information, and the current OpenSSL implementation now includes protections against those attacks. However, German cryptographer Jean-Pierre Seifert has announced a new method called Simple Branch Prediction Analysis that is at the same time much more efficient than the previous ones, only needs a single attempt, successfully bypasses the OpenSSL protections, and should prove harder to avoid without a very large execution penalty." From the article: "The successful extraction of almost all secret key bits by our SBPA attack against an openSSL RSA implementation proves that the often recommended blinding or so called randomization techniques to protect RSA against side-channel attacks are, in the context of SBPA attacks, totally useless." Le Monde interviewed Seifert (in French, but Babelfish works well) and claims that the details of the SBPA attack are being withheld; however, a PDF of the paper is linked from the ePrint abstract.

18 of 108 comments

I got a question... by sam0vi · 2006-11-18 09:50 · Score: 4, Interesting

When I see this kind of news the following question arises: so what are we supposed to do now? Throwing away RSA cryptography is not the best answer, I think. What would you, fellow /.ers, do to bypass this problem?

--
When my Karma level reaches 0 I feel in piece with the Universe
1. Re:I got a question... by Anonymous Coward · 2006-11-18 10:15 · Score: 3, Informative
  
  This is not a vulnerability in RSA per se, but rather in the implementation of RSA on modern CPUs. It is possible to run a "spy" process alongside a cryptographic application, which would "sniff" out private keys. It can do this by making note of how its own instructions are executed and thus predicting what instructions are executed for other processes. I think the important thing is that this type of attack requires local execution of code to work.
  
  I would think this can be circumvented by an alternative implementation of the RSA encoding algorithm. Maybe by inserting "noise" instructions into the execution flow.
2. Re:I got a question... by Eon78 · 2006-11-18 10:18 · Score: 5, Informative
  
  You just keep on using RSA of course. As the article says: it is possible for a spy application running on your machine to get vital information about an RSA encryption process with OpenSSL. So, as long as you make sure your machine is secure there is nothing to worry about.
  
  Most of the time when you hear an encryption scheme is cracked or successfully attacked they mean that it has gotten easier to crack, not that the encryption is totally worthless. Which of course doesn't mean that countermeasures should not be taken, but it also doesn't mean that you have to throw out RSA.
3. Re:I got a question... by smallfries · 2006-11-18 11:04 · Score: 4, Informative
  
  RSA isn't the problem. The implementation of RSA on a modern processor is the problem. Moving to another algorithm wouldn't guarantee a lack of side channels. One way around this would be to specialise the algorithm with your own private key. This would unroll all of the loops, and decide the branches statically. If you assume that the machine is not compromised, then this executable could be stored as read-only for your account. If the machine is compromised enough for a non-privileged process to read your private data then you don't need SBPA - you're toast.
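
One way to picture the specialisation idea is to generate straight-line code for a fixed exponent, so the "multiply or not" decision is made once at generation time rather than by a branch at run time. The following is only a sketch under assumptions: the names `specialise_modexp` and `modexp_fixed` are made up for illustration, and generating Python source and calling `exec` stands in for compiling a key-specific executable.

```python
def specialise_modexp(d, n):
    """Build a loop-free, branch-free function computing m**d % n for fixed d, n.

    Returns (function, source). The exponent bits are consumed here, at
    generation time; the generated function contains only straight-line code.
    """
    if d <= 0:
        raise ValueError("this sketch assumes a positive exponent")
    lines = ["def modexp_fixed(m):", f"    s = m % {n}"]
    for bit in bin(d)[3:]:               # exponent bits after the leading 1
        lines.append(f"    s = (s * s) % {n}")
        if bit == "1":                   # decided statically, not at run time
            lines.append(f"    s = (s * m) % {n}")
    lines.append("    return s")
    source = "\n".join(lines)
    namespace = {}
    exec(source, namespace)              # stand-in for compiling a key-specific binary
    return namespace["modexp_fixed"], source


fixed, source = specialise_modexp(13, 497)
print(source)
print(fixed(4) == pow(4, 13, 497))
```

Note that the generated source encodes the exponent in its sequence of operations, so it needs the same protection as the key file itself.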
  
  --
  Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
4. Re:I got a question... by maxwell+demon · 2006-11-18 11:45 · Score: 4, Interesting
  
  After now having read the complete article: Shouldn't it be possible to eliminate the branches completely?
  The following loop (adapted from fig. 3 in the paper) should IMHO work as well (although less efficiently):
  S = M
  A = M - 1
  for i from 1 to n-1 do
    S = S * S (mod N)
    C = di      /* should be doable without branch by just bit masking and shifting */
    C = C * A
    C = C + 1   /* now if di was 1, C is M, otherwise C is 1 */
    S = S * C (mod N)
  return S
  The only branch here is in the for loop, and that's independent of the key. Unless there are exploitable branches in the multiplication routine, of course.
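
The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `modexp_branch_free` is made up for illustration, exponent bits are scanned from the most significant bit down, and the top bit of `d` is assumed to be 1 (which is why `s` starts at `m`). Python's big-integer arithmetic is not constant-time, so this only illustrates the control flow and is not a hardened implementation.

```python
def modexp_branch_free(m, d, n):
    """Square-and-always-multiply with no branch on the exponent bits.

    Assumes d >= 1. The only loop condition depends on the bit length of d,
    not on the values of its bits.
    """
    if d < 1:
        raise ValueError("this sketch assumes d >= 1")
    s = m % n
    a = m - 1
    for i in range(d.bit_length() - 2, -1, -1):   # bits below the leading 1
        s = (s * s) % n
        bit = (d >> i) & 1        # extracted by shifting and masking only
        c = bit * a + 1           # c == m if bit is 1, c == 1 if bit is 0
        s = (s * c) % n           # multiply by m or by 1, every iteration
    return s


print(modexp_branch_free(4, 13, 497) == pow(4, 13, 497))
```

The multiplication by 1 on zero bits is the price of removing the branch: every iteration performs one squaring and one multiplication regardless of the key.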
  
  --
  The Tao of math: The numbers you can count are not the real numbers.
Not so bad... by statusbar · 2006-11-18 09:59 · Score: 4, Insightful

From the Abstract:
SBPA attacks empower an unprivileged process to successfully attack other processes running in parallel on the same processor

So it requires a spy process to be running on the same processor as the server....
--jeffk++

--
ipv6 is my vpn
1. Re:Not so bad... by SnowZero · 2006-11-18 10:56 · Score: 5, Interesting
  
  It gets better. The attack requires that the two processes are running on the same core with hyperthreading enabled (i.e. ALU-poor CMP). The "spy" process will be sucking up 100% cpu pretty much continuously. They also simplified the multiplication routine from OpenSSL. Even if you are running such a setup on a P4 with HT turned on (even though it's often useless), and you need to run secure processes along with unsecure ones (generally not a good idea anyway), patches already exist for Linux and BSDs to address this. The patches modify the scheduler to prevent processes from different users from running on the same physical core. A half-hearted attempt is made in the paper to say that these attacks generalize to something remote, but no details are given as to how their attack would compensate for the 100,000 fold decrease in timing accuracy to pull off the attack on even a local LAN.
  
  Essentially they took a very impractical attack with an unlikely scenario, and created a somewhat practical attack with an unlikely scenario. Avoid the problem scenario which was raised in the prior work last year, and you are still golden.
2. Re:Not so bad... by Beryllium+Sphere(tm) · 2006-11-18 11:01 · Score: 3, Insightful
  
  For example, on a shared server at a colo site?
Coral Cache by davidwr · 2006-11-18 10:03 · Score: 5, Informative

Just in case it gets Slashdotted.

PDF file

--
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
Multi-site servers at risk? by CamoCoatJoe · 2006-11-18 10:10 · Score: 5, Insightful

Let me get this straight. To use this attack, you need to be running on the same hardware, but you don't need any particular access beyond that? If that's the case, any multi-site server that allows you to run your own server-side scripting is at risk.

--
This is not a signature.
Branch predictor as a covert channel by hpa · 2006-11-18 10:16 · Score: 4, Interesting

This isn't really a flaw in RSA cryptography, but rather the fairly obvious situation that a branch predictor, shared between processes of different privilege levels, can be used as a covert channel and thus can be used to reveal state. The same is true with the cache, for example, and multithreading makes this problem many times worse by increasing the bandwidth of the channel. On architectures which don't have branch predictors, or don't share them, this is not an issue. ARM processors, for example, tend to rely on predication rather than branches (except when running Thumb), and thus don't suffer the same problem.

This class of problems is only going to grow as CPUs become less and less deterministic.
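
The covert channel described above can be illustrated with a toy model. This is a sketch under heavy assumptions, not a model of any real CPU: `TwoBitPredictor` and `spy_on_branches` are hypothetical names, a single 2-bit saturating counter is shared by both processes, and the "misprediction" flag stands in for the timing difference a real spy would have to measure.

```python
class TwoBitPredictor:
    """One shared 2-bit saturating counter: 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self):
        self.counter = 0

    def execute_branch(self, taken):
        """Run one branch through the predictor; return True if it was mispredicted."""
        predicted_taken = self.counter >= 2
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)
        return predicted_taken != taken


def spy_on_branches(secret_bits):
    """Recover the victim's branch outcomes purely from the spy's own mispredictions."""
    predictor = TwoBitPredictor()
    recovered = []
    for bit in secret_bits:
        # Spy drives the shared counter to state 1 (weakly "not taken").
        for _ in range(3):
            predictor.execute_branch(taken=False)
        predictor.execute_branch(taken=True)
        # Victim runs its key-dependent branch on the same predictor entry.
        predictor.execute_branch(taken=(bit == 1))
        # Spy runs a not-taken branch: it is mispredicted only if the victim's
        # branch was taken and pushed the counter into the "taken" half.
        mispredicted = predictor.execute_branch(taken=False)
        recovered.append(1 if mispredicted else 0)
    return recovered


print(spy_on_branches([1, 0, 0, 1, 1]))
```

The spy never reads the victim's memory; it only observes how its own branches behave after the victim has touched the shared state.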
Re:Unsecure computer - no secrets. Big deal ! by Cid+Highwind · 2006-11-18 10:31 · Score: 4, Interesting

Think managed web hosting companies that put dozens of virtual hosts on a single physical server. If this really works from an unprivileged account, one malicious user could steal SSL keys from all the rest.

--
0 1 - just my two bits
Re:Unsecure computer - no secrets. Big deal ! by RAMMS+EIN · 2006-11-18 10:47 · Score: 3, Informative

``If you have a Trojan on your computer you are going to lose your secrets anyway,''

Whose secrets? Multiple people use my computers. If there's a trojan on the system, it can't necessarily access all these people's data.

``your private key is probably stored on the disk drive,''

Password-protected, thank you very much.

``and you use the keyboard to type passwords''

I don't use a keyboard with most computers I use; I communicate with them over SSH. Of course, I use a keyboard on _some_ machine, so if that machine has a keylogger running on my account (or root's), that would be a problem.

``Could someone explain how a local attack can be big news ?''

I haven't RTFA yet, but local attacks are often problematic for systems used by multiple people, especially if not all of them know good security practices (or are even completely clueless - you get many such people when you operate shared web hosts).

--
Please correct me if I got my facts wrong.
RSA Isn't Broken, And This Is Localhost Only by tqbf · 2006-11-18 11:17 · Score: 4, Informative
Aciicmez et al. are extending an attack they published a few months ago. It's real, but:
- It targets RSA implementations, not the algorithm, which is fine
- Attackers need to be on the same host as the victim
- This specific attack is tuned to the Pentium 4 architecture
This paper doesn't break SSL.

We wrote about the attack two months ago. A quick, dumbed-down recap:

The CPU aggressively caches aspects of what programs do. It doesn't make an exception for RSA. You obviously can't just read key bits out of the cache.

But caches are finite, and way, way too small to accommodate everything every program does. So operations from one program are constantly evicting cached values from other programs. This makes the other program imperceptibly but measurably slower. By writing a program that constantly and carefully measures those time differences, you can watch an RSA operation from another program leave footprints through the cache.
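
The eviction effect described above can be shown with a toy simulation. This is a sketch under stated assumptions, not a model of real hardware: `ToyCache` and `spy_recover_bits` are hypothetical names, the cache is direct-mapped with four sets, the victim is assumed to touch set 0 for a 0 bit and set 1 for a 1 bit, and a hit/miss flag stands in for the timing difference a real spy would measure.

```python
class ToyCache:
    """Direct-mapped cache: each set holds a single (owner, address) line."""

    def __init__(self, num_sets):
        self.lines = [None] * num_sets

    def access(self, owner, address):
        """Return True on a hit, False on a miss; a miss evicts the previous line."""
        index = address % len(self.lines)
        tag = (owner, address)
        hit = self.lines[index] == tag
        self.lines[index] = tag
        return hit


def spy_recover_bits(secret_bits, num_sets=4):
    """Recover the victim's access pattern from the spy's own cache misses."""
    cache = ToyCache(num_sets)
    recovered = []
    for bit in secret_bits:
        # Prime: the spy fills every set with its own data.
        for address in range(num_sets):
            cache.access("spy", address)
        # Victim: one key-dependent access evicts exactly one spy line.
        cache.access("victim", bit)
        # Probe: the spy re-reads its data; the set that misses is the footprint.
        misses = [a for a in range(num_sets) if not cache.access("spy", a)]
        recovered.append(misses[0])
    return recovered


print(spy_recover_bits([1, 0, 1, 1, 0]))
```

In a real attack the spy cannot ask the cache whether an access hit; it times its own accesses and treats the slow ones as misses.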

There are years-old attacks like this against the L1 and L2 caches, and extensions that use hyperthreading to improve the resolution. Some variants, which measure timing differences but don't track cache footprints, are remote attacks. These aren't. You run a "spy" process on the machine; it repeatedly executes a series of operations and measures timing differences. Aciicmez found an overlooked cache which makes Pentium branch prediction work (the BTB). They published back in August.

From what I can tell, this paper extends the attack; they figured out that the Pentium 4 architecture has two BTB caches, and their original attack wasn't hitting both of them. Their new attack does, and that creates much bigger timing differences, making RSA's footprints much easier to see.

This is really cool stuff, but from where I stand, they hit game-over back in August with the original BTB attack. This paper reads like a refined exploit for the same vulnerability.

Since this is localhost-only, and (unlike Bernstein's and Boneh's attacks) can't be extended to work remotely, it's not going to impact SSL or (single-user) SSH. The classic victim of timing attacks is smart cards. For these attacks, another interesting possibility is DRM; these attacks say you can't trust crypto running on the same Pentium 4+ as an attacker.
Translation of the article published by Le Monde by jackjeff · 2006-11-18 12:09 · Score: 4, Informative

Better than BabelFish I hope.. human made, so prone to errors ;)

====

The confidence users have in the Internet and in the system's capacity to secure data has always been relative. And it could collapse if microprocessor manufacturers and cryptography software vendors prove unable to cope with a new type of attack, fearsomely efficient, discovered by the team led by the German cryptographer Jean-Pierre Seifert (universities of Haifa and Innsbruck). Electronic commerce could be threatened, but also, more broadly, everything that enables the dematerialization of exchanges and relies on asymmetric cryptography, be it encryption, digital signatures or message integrity checks.

In the still confidential article, the researcher and his colleagues describe the procedure they used to recover nearly all of a 512-bit encryption key (a series of that many 0s and 1s) in a single attempt, that is to say in a few milliseconds. For comparison, the largest public key broken so far is 640 bits long, and, as announced in November 2005, the process involved 80 microprocessors running at 2.2 GHz for 3 months.

Since the announcement made this summer, via the International Association for Cryptologic Research (IACR), that such an attack was theoretically feasible, microprocessor manufacturers have been on edge: the chips in nearly all computers worldwide are vulnerable. So much so that the head of security at Intel, the number 1 microprocessor manufacturer, declared when confronted with the issue that he would be "unavailable for a few weeks". This is because the usual fix against classical attacks on public key cryptography - to increase the size of the keys - will not work this time.

Jean-Pierre Seifert was in fact able to attack the systems from the ground up. As most of the security relies on the impossibility of mathematically deducing the private key, kept secret, from the public one, he chose to study how microprocessors read this confidential data.

He found out that the mode of operation of the chip itself, optimized for calculation speed, was making it vulnerable. "Security was sacrificed for the sake of performance", says the researcher.

The attack principle can be summed up as follows: to go faster and faster, the microprocessor parallelizes operations and uses a branch prediction system to predict the result of the current operation. If the prediction is good, the computation time is greatly decreased. If not, the processor must go back and redo the elementary operation. It is "sufficient" to measure the computation time as the processor goes through the string of 0s and 1s that constitutes the encryption key to be able to deduce it.

This threat, called "Branch Prediction Analysis" (BPA), was already known. It was thought that a lot of attempts were necessary to statistically deduce the encryption key, thus making the attack impractical. The technique discovered by Jean-Pierre Seifert makes it possible to break the key in a single attempt. It relies on the fact that the prediction process, essential to increase the processor speed, is not protected.

A spyware could then be made to listen to the chip discreetly, and send back the key to hackers, foreign intelligence services or competitors.

"A MATTER OF WEEKS"

We are not there yet though. "We have not made a turnkey application that would be available online," argues Jean-Pierre Seifert. But he estimates that once the method is made public, in early 2007 during the next RSA conference - RSA being one of the most popular ciphers - the making of such software would be "a matter of weeks".

Cryptography specialists confirm that the threat is serious. One of the world's leading public-key experts anonymously sums up the situation: "The real solution is to rethink the design of microprocessors themselves - a long and difficult process. A short term solution would be to forbid normal applications to run in para
Re:Unsecure computer - no secrets. Big deal ! by Alsee · 2006-11-18 14:34 · Score: 3, Insightful

problematic for systems used by multiple people

And perhaps more significantly, it is problematic for idiots who think the definition of "secure/security" is using some DRM scheme hoping to "secure" a computer against its owner.

The owner of a computer can use the technique in this article to keep an eye on his own computer and track what his computer is doing for him, and to record the DRM-keys being used to "secure" his own data against him.

-

--
- - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
Great idea! by DrJimbo · 2006-11-18 16:57 · Score: 4, Interesting

That is a clever vectorization of the square-multiply loop. It sure looks to me like it would work (I used RSA encryption as the final project in a University assembly language class I taught). The slight decrease in efficiency of your routine will not be noticed. The timing of the entire process is totally dominated by the N-byte x N-byte multiplications. An extra N-byte x 1-byte multiplication will cause less than a 1% slowdown, probably much less.

A slight improvement to your idea might be to balance the loop anyway, using D = 1 - di, etc., essentially a vectorized version of figure 4. This would slow it down by a factor of two but it would make it resistant to conventional timing attacks.

--
We don't see the world as it is, we see it as we are.
-- Anais Nin
Solution to: Branch predictor as a covert channel by Terje+Mathisen · 2006-11-19 06:35 · Score: 3, Interesting

From the linked article:

R0 = 1; R1 = M
for i from 0 to n-1 do
  if d[i] = 0 then
    R1 = R0 * R1 mod N
    R0 = R0 * R0 mod N
  else
    R0 = R0 * R1 mod N
    R1 = R1 * R1 mod N
return R0

The key-dependent if statement is the key here: if we can remove all such branches, then there's no Branch Target Buffer entry that depends on it, and no timing channel attack either:

R0 = 1; R1 = M;
for (i = 0; i < n; i++) {
  mask = d[i] - 1;    // -1 (all ones) if d[i] is 0, otherwise 0
  nmask = mask ^ -1;  // 0 or -1
  T0 = R0 & mask;     // Either R0 or 0
  T0 += R1 & nmask;   // At this point T0 holds the value to be squared, R0 or R1!

  T1 = R0 * R1 mod N;
  T0 = T0 * T0 mod N; // Now we move the correct values back into R0 & R1
  R1 = T1 & mask;
  R0 = T0 & mask;
  R0 += T1 & nmask;
  R1 += T0 & nmask;
}
return R0;
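
A runnable rendering of the masked ladder might look like the following Python sketch. Assumptions are flagged loudly: the function name `ladder_masked` is made up, exponent bits are scanned from the most significant bit down, and the convention used is the standard Montgomery ladder one (a 0 bit squares R0, a 1 bit squares R1). Python integers are arbitrary precision and not constant-time, so this shows the data flow only.

```python
def ladder_masked(m, d, n):
    """Montgomery ladder for m**d % n with bit-mask selection instead of a branch.

    Invariant after each step: r1 == (r0 * m) % n.
    """
    r0, r1 = 1 % n, m % n
    for i in range(d.bit_length() - 1, -1, -1):
        bit = (d >> i) & 1
        mask = bit - 1                    # -1 (all ones) if bit is 0, else 0
        nmask = ~mask                     # 0 if bit is 0, else -1
        t0 = (r0 & mask) + (r1 & nmask)   # value to square: r0 or r1
        t1 = (r0 * r1) % n                # the product is always computed
        t0 = (t0 * t0) % n
        # Move the results back without branching on the bit:
        #   bit 0: r0 = r0^2,    r1 = r0 * r1
        #   bit 1: r0 = r0 * r1, r1 = r1^2
        r0, r1 = (t0 & mask) + (t1 & nmask), (t1 & mask) + (t0 & nmask)
    return r0


print(ladder_masked(3, 5, 1000))
print(ladder_masked(4, 13, 497) == pow(4, 13, 497))
```

Every iteration performs exactly one multiplication and one squaring, and the selection is done with AND/ADD on all-ones or all-zeros masks, which is the same trick as in the listing above.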

There are at least three interesting issues here:

a) Most modern cpus have hw support for conditional operations, on x86 this is in the form of CMOVcc which is a (constant-time!) conditional move into a register, but as shown above, it really isn't needed here.

b) The performance impact of the above branch removal can be negative!
On a P4 a branch miss costs about 20 clock cycles, and since a key-dependent branch will miss 50% of the time, the average cost is 10 cycles. My replacement code above takes around 5 cycles or less on any current cpu.

c) A final possible timing-channel attack would be due to the memory alignment of the R0 and R1 values:
By allocating them at the same address modulo the cpu page size, i.e. at 4 KB offset, the cache lines hit will be the same for both.

When I worked on the asm version of DFC, one of the AES also-rans, I removed a similar timing attack from a core 128-bit modular multiplication operation, using very similar techniques.

Terje

--
"almost all programming can be viewed as an exercise in caching"