New, Faster Attack against SHA-1 Revealed
VxSote writes "According to Bruce Schneier's
blog, a team of Chinese cryptographers has announced new results against SHA-1 that speed up the time required to find collisions compared to their previously published attack. Schneier says that a SHA-1 collision search is now 'squarely in the realm of feasibility,' and that further improvements are expected."
One is, is this really relevant when most terrorist threats these days are so low tech? Two, does anyone know what kind of funding the NSA gets these days? I remember a news report a couple years back that said they were deeply out of date with hardware and so forth.
Things like this are inevitable. Crypto is an evolving science, and this kind of thing is healthy.
I for one am very excited about the future of crypto.
Ignore Alien Orders
I wonder how this will effect RFC 4109 in that it depreciates MD5 in favor of SHA1. Does this mean that SHA1, at 2^63 is less secure than MD5 at a brute-force 2^64? I'm not a crypto expert or anything, just asking the question; will this effect the proposed standard for the HASH algorithm used in IPsec?
000 -> 0
001 -> 1
010 -> 00
011 -> 01
100 -> 10
101 -> 11
110 -> 000
111 -> 001
Granted, the hash isn't always smaller than the original, but it is so in 75% of the possible cases. This will grow towards 100% for original values with more bits.
While this finding definitely shows a weakness in the SHA algorithm, it isn't a weakness that makes most applications that use SHA any more vulnerable. They found a way to generate two texts that produce the same hash using an algorithm with a time complexity of 2^63 instead of 2^80 as would be required for a brute force attack. However, being able to generate two texts that produce the same hash won't help you exploit most systems that rely on SHA. If someone finds a way to generate text that produces a SPECIFIED hash in 2^63 time, then there's reason to be concerned. However, since these findings show that SHA-1 has some weaknesses, it's probably time to start looking for a better hashing algorithm before a more serious vulnerability is found.
Look up Joux's construction for breaking concatenated hashes. Concatenated hashes are only as strong as the strongest among them, whereas you would expect that the concatenation of an n-bit hash with an m-bit hash would provide (n+m)-bit security.
Although I'm not sure how these shortcut collision attacks fit in Joux's construction -- my recollection of it was that it was useful for collision search using the birthday paradox.
Join the NFSNET. Our prime goal is making little numbers out of big ones. http://www.nfsnet.org/
IAMECS (I am a mathematician/engineer/computer scientist), but IANAC (I am not a cryptologist).
Clearly it's better from a security perspective to combine ALL known hashes (at whatever strength you think you need) so that you get encryption AT LEAST as strong as the strongest one. By your analogy of the paper vs cardboard: you do not know in advance which one is cardboard and which is paper, so it's best to use both.
Unfortunately, part of the engineering goal of a digest is to take up significantly less space than the original. Well it just so happens that recursive algorithms that can be broken at 2^N bits can usually be broken in 2^(N+1) bits with only marginally (i.e. 1/N) more effort. In such cases, doubling the number of bits gives you jack and squat in the way of protection.
Thus, in order to keep the storage requirements low, we conclude that there's no real point in "doubling up" on a known compromised algorithm. Instead, you want to make a better algorithm that cannot be exploited by divide and conquer. One way to do this is to combine the results of dissimilar digests, as proposed by the grandparent (and many others).
In summary, parent is right that it helps to increase the number of bits, but is dead wrong on the best way of doing so (when comparing doubling up on one digest vs concatenating different digests).
when I was learning how these hash functions work, i was amazed to find that they work through the file to be hashed in a linear manner so that an attacker can target 1 small round and create a colision. does anyone know of any hash function that operates on the entire file in a single expanding round that can expand to take the entire file? it is my understanding that these attacks focus on defeating a single round of a set size hash. an expanding hash would make this many times more difficult to crack.
A hash that reduces the size of the original by only 50% could be re-iterated to create progressively smaller and smaller results untill a hash in a certain size range was created. any attack would have to break the first round, and that first round could be defined to produce a result no smaller than 10MB or something. that should be enough to keep us safe for a while.
-John Fenley
not quite.
... ahem... they realize that DES was DAMNED good. And allegedly they knew of 2 theoretical attacks 20 years before the civilian academics.
:)
They fixed SHA up cuz they knew of a flaw, but didn't explain what the fix does.
For DES, they were
But their interference in DES is to restrict DES DOWN to a 56-bit keyspace, cuz they knew that DES was TOO good.
Almost anyone with a million bucks can search through 56-bit key space nowadays. As far as I know, there currently does not exist a DES attack that is more efficient (cheaper) than exhaustive search. Not bad for a 20 year old algorithm, huh? That's SECURITY!!
It is commonly believe in the crypto community the weakest point of attack for DES is its small key space.
Now imagine how many more years of service we could had squeezed out of DES, if the keyspace was set to 128?
What no one seems to mention is that their attack finds "freeform" collisions. I mean, they go and find two plaintexts with the same hash. I wouldn't worry about it until they find 2^63 attack against given plaintext/hash.
You can read about the distinction in Birthday Paradox article on Wikipedia. In short, when the difficulty of finding collision against a given message is 2^n, the difficulty of finding any two colliding plaintexts is 2^(n/2).
So, while they may have found 2^63 attack against SHA-1, it is still a "birthday attack", and to find collision against my message signed with sha-1 the attack would still be 2^126.
Or did I miss something?
Robert
Bastard Operator From 193.219.28.162
You might look into steganography, which uses the text as a mapping function on a digital image, which becomes the key. If only the sender and the receiver know what picture was used, with the proper mapping the encryption can be 'completely unbreakable'.
:)
More interestingly (to me), you just inspired an idea. Caveat: IANACryptographer; I know little about it, so the following may be useless, stupid, or wrong. Or, as a professor once said about someone's paper, "This isn't right. It's not even wrong!"
One might consider the input text as a multi-dimensional space, using some mapping from the original linear text to the space. Then, the loci in that space could be fed into the hash according to another mapping. The mapping amounts to at least one additional hash function for each dimension.
For a simple example, consider the text as a rectangular array in row major order. Then, starting in the middle, using some small block size (perhaps variable) the bytes of the text in a rectangular 'spiral' are combined into input chunks for the hash process. If we use the alphabet (A-X) in a 4x6 matrix as an example, the first row would be ABCDEF, the 2nd GHIJKL, 3rd MNOPQR, 4th STUVWX. Using a 2-byte chunking, the chunks input into the hash would be IJ, PO, NH, BC, DE, KQ, WV, UT, SM, GA, BC, DE, FL, RX. Note that BC and DE are presented twice - I don't know if that's a good thing or a bad thing, but I think it can be good. Various strategies can be employed to map non-square or non-regular 'shapes'.
There are an infinity of ways to perform such mappings, a majority of which apparently destroy any benefit an attacker may receive from observing adjacencies and periodicities in the data. I say apparently because a major area of potential vulnerability will be in the transmission of the mapping rules. If the rules are encoded some way in the cyphertext, an attacker's first objective would be to crack the encrypted information about the mapping, which would allow the attacker to then consider the input to the hash as a linear text. This could be avoided if only a miniumu amount of information is included in the cyphertext that would allow a legitimate receiver to determine the correct mappings from among a large potential set of mappings with low mutual similarity.
One might view this method as "Through the Looking Glass", because, it performs spatial distortions on the data, though often of types not easily accomplished by a physical lens. Steganography is a one-dimensional mapping in this sense, where the location of a bit in the two dimesional space is not changed but it's position in color, phase, parity or other space (the value dimension) is changed.
This little flight of fancy may be old news or stupid, I don't know. If it's new and useful, I'll be glad to assert patent rights and other IP rights, so as to avoid proprietary lock-in and make the technology available under a Creative Commons -type license for the general good!
It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/
Whirlpool. It's based on the mathematics that gives AES it's proofs of security against certain classes of attack.
It's slower than SHA-1 but guess what? Security costs CPU cycles. A lot of people tend to forget this most important lesson.
Simon.
Why not just append the MD5 and SHA1 hashes together? What are the chances of getting hash collisions with two difference algorithms from the same input? I don't think I'd lose much sleep over it.
Google for a25f7f0b29ee0b3968c860738533a4b9 OR a25f7f0b, an example of how to exploit a hash collision.
To avoid this exploit, the signer needs to make an unpredictable modification to the document prior to signing.
Alice and Bob use hash function H and signature function F to validate documents. Alice signs a message M by finding signature S such that F(S) = H(M). Bob accepts (M,S) as signed by Alice. Eve can calculate H(M) and F(S) but cannot find S given M.
Eve creates documents M1 and M2 which have the same hash, and asks Alice to sign M1. If Alice uses the expected hash, Eve can substitute M2 and Bob will accept (M2,S) as signed by Alice.
So Alice and Bob need to agree to a safer method. When Eve presents M1 for signature, Alice generates a new random key K, encrypts M1 with K and hashes, giving H(K(M1)). She then signs this by finding S such that F(S) = (K,H(K(M1))), and gives (S,K) to Eve.
Eve sends (M2,S,K) to Bob. Bob checks the signature by calculating F(S) and comparing it to (K,H(K(M2))).
Eve is foiled, because H(K(M1)) != H(K(M2)), even though H(M1) = H(M2).
Thus, immediately, the protocols for signing software distributions should be adjusted so that the signing authority Alice generates random key K and signs H(K(M)) instead of just H(M). The automatic software update agent Bob checks the signature (S,K) by calculating F(S) and (K,H(K(M))).
This will prevent substitution of a useful M2 for M1 by Eve, who is allowed to present an arbitrary M1 for testing and signature. Eve cannot predict K, and hence even if she can find M2 after the fact, such that H(K(M1)) = H(K(M2)), she is not allowed to change M1 after K is chosen. Hence M2 will not be useful.
Bob can safely install M1 automatically, confident that if M1 passes a simple syntax check (which M2 cannot possibly satisfy), it is the same program which Alice accepted for signature, even if hash collisions can be found.
A 128 bithash, has got a lot of possible values. 2 to the power of 128 values. Which is estimated to be a bit more than there are atoms in the universe. So the lack of possible permutations is not a problem (otherwise you could just take a larger hash), there are just weaknesses in the algorithm.
Detecting minute differences, like the ones between twins, is EXACTLY what hashes should do. The minute differences between a check in the amount of $100 or a check in the amount of $1000, for example. They both start out as the same sort of thing, but the environment adds information that you want to authenticate.
This is not really a philosophical issue, it's a numbers game.
SCO employee? Check out the bounty
I hear from the tech/geek community all the time that we don't have a problem with racisim or sexism. Everyone is equal, it's a meritocracy, etc. etc. It's easy to ignore these problems when your the dominant culture, or cocoon your self in a bubble and pretend you are.
Kind Regards
"A few great minds are enough to endow humanity with monstrous power, but a few great hearts are not enough to make us w
Where did you get that definition of encryption from?
Hash functions are many to 1 and cannot possibly have an inverse.
No, the inverse is computationally unfeasable or unknown. There is no way to prove that it is not possible. Most definitions I can find involve the transformation of plain text into an obscured form cyphertext. The ability to reverse this is not a prerequisit. So it has to satisfy:
P F = C
Where P is plain text, F is an encryption function and C is cyphertext.
The ability to reverse this is not needed, unless the purpose of encryption is to preserve confidentiallity of the plaintext.
A hash is a mapping from a large set of items to a smaller set.
No, it could be the opposite of this, taking a small set of items and mapping them to a larger set. Many hash functions result in a set value of set length and so a hash may be larger that its plaintext.
The only reason some people get lost in thought is because it's unfamiliar territory.
The problem is we have very little information to go on when it comes to the NSA's abilities. Sure, we know they knew about differential cryptanalysis 20 years before academia, but that's only one data point; it's dangerous to extrapolate too much (although it's great fun to speculate!)
Consider, it took the IBM cryptographers less than five years to discover differential cryptanalysis (they called it the "T-attack"), so maybe open academia were simply unlucky or unfocused when it came to block cipher cryptanalysis?