Implications Of The Recent Hash Function Attacks
An anonymous reader writes "Cryptography Research has issued a Q&A that explains the security implications of the hash function collision attacks recently announced at CRYPTO 2004. Apparently the consequences can be catastrophic for certain kinds of code signing and digital signatures, but MD5 sums for checking binaries are (mostly) OK. While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."
No matter how complex encryption algorithms get, there will always be a faster computer that comes along to break them. That is, until quantum computing becomes widespread.
TIME to get YOURSELF a new KEYboard with a FUNCTIONING caps lock KEY.
Etiquette is etiquette. He kills his mother but he can't wear grey trousers.
"While the speculation that SHA-1 is about to fail seems to be overblown, updating the many legacy systems and protocols that rely on MD5 is going to be a massive undertaking."
Any time I've tried to point this out, I've been shouted down by hysterical people (such as relex) squawking that because it may be possible to generate two messages with the same MD5 hash, SHA-1 is automatically broken. Um, no. They're two totally different algorithms. Use some common sense, people. I'm as cautious as the next person but screaming about how "all hash algos are insecure" is hyperbole at its worst.
Note to M1-ers: a curt but otherwise insightful message is not "Flamebait" or "Troll".
Yes, too bad ECC is not a hashing algorithm and has absolutely nothing to do with this, or else we'd be set.
Say I have program A that I distributed, and I supply the md5/sha1 sum to ensure its 'validity'. From what I read, you can get two bitstreams to produce the same sum; OK, that was expected. But what I don't get is that this still doesn't allow somebody to arbitrarily pick whatever sum they want for their code, right? I mean, the chances of somebody writing some trojan'd program and magically getting the sums to match are still extremely small and/or really computationally expensive. If they were that smart, wouldn't they be working for one of the TLAs (Three Letter Acronyms) already?
Your hair look like poop, Bob! - Wanker.
In the wake of stories like this, is this a message that we need more secure forms of encryption than we already have? RSA is great so far, but how long until 1024-bit keys are broken? Or any other schemes, like the MD5 hashing that's used for digital signatures?
Keeping ahead of the crackers is a big concern not only for security of transactions, but for personal privacy as well.
CMDRTACO CHECK YOUR EMAIL!
For example, a devastating attack would be one that enabled adversaries to obtain a legitimate server certificate with a collision to one containing a wildcard for the domain name and an expiration date far in the future.
quick questions:
1) Don't browsers check for wildcard domain names in the certificates???
2) If not, why not???
Consensus is good, but informed dictatorship is better
And too bad that ECC is a) not provably secure and b) is rumored to have been broken already.
I am disrespectful to dirt! Can you see that I am serious?!
Use both!
You say it's easier than ever to find the solution to each one of these hashes, so just use them both. I really think that collisions that occur simultaneously in both are fewer and farther between. I think that will be secure until we find a better and decently fast hash.
md5sum
d41d8cd98f00b204e9800998ecf8427e
In nearly a day, it'll be possible to hack d008960fa6b395dca1c8362165bb31be
Has a collision been found yet concerning data which has both the same MD5 sum and the same SHA-1 sum?
It would seem as though even if SHA-1 were to fail, the two algorithms used together could bolster each other security-wise. This slows things down, of course, but would it not suffice for the time being?
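A minimal sketch of the "check both" idea, assuming the publisher posts an MD5 and a SHA-1 hex digest next to the download. The function names are made up for illustration, and the sketch says nothing about how much extra strength the combination actually buys.

```python
import hashlib
import hmac
from typing import Tuple


def dual_digest(data: bytes) -> Tuple[str, str]:
    """Return (md5_hex, sha1_hex) for the given bytes."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()


def matches_published(data: bytes, md5_hex: str, sha1_hex: str) -> bool:
    """Accept the data only if BOTH published digests match."""
    got_md5, got_sha1 = dual_digest(data)
    md5_ok = hmac.compare_digest(got_md5, md5_hex.strip().lower())
    sha1_ok = hmac.compare_digest(got_sha1, sha1_hex.strip().lower())
    return md5_ok and sha1_ok
```

A forged file would have to collide under both functions at once to pass `matches_published`; a file that matches only one of the two sums is rejected.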
you don't have to generate specific malicious code in order to exploit md5.
merely creating pure trash would be sufficient, think of the case of BIOS or other firmware. create random garbage with the same md5 hash and voila, you've turned your victim's PC/laptop/cellphone/pda/etc into a doorstop.
there are many other ways that md5 can be exploited, this is only one.
Because wildcards are not necessarily a bad thing. The concept is that you have a single SSL accelerator in front of a whole pool of servers, and it absorbs the "security context" of all the hosts behind it.
If you want universal SSL deployment, this is one of the ways you get it.
--Dan
it would be better to post both the MD5 hash _and_ the SHA-1 sum. What's the chance of 2 different binaries having the same MD5 and the same SHA-1 at the same time??
Artaxerxes
ECC has been cracked... it just takes roughly the combined power of 50 PCs to do it. This is a really old link, but it's valid for this thread.
http://cristal.inria.fr/~harley/ecdl7/readMe.html
SSL certificates (and certificates in general) do not attest to the trustworthiness of an individual or machine, but to the authenticity of that entity.
Wildcard certificates are perfectly valid because they are issued to an individual who has proved their authenticity to a certificate issuer, and because you need to have the private key (which is known only to the individual requesting the certificate) in order to be validated as authentic.
So, whether I put the certificate on one machine or ten, I'm still the same person, and the certificate still asserts that.
Is here.
I didn't read the attack too well, but from the Q&A, it appears that the attacks are collision attacks (like the Birthday attack, but, I imagine, more efficient). The Q&A states "In contrast [to a preimage attack], a collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be."
So, shouldn't it be possible to edit something in the document that doesn't change the meaning (such as a misspelling, or punctuation) before you sign it, thereby changing the hash to something completely different? It would seem that now the attacker is forced to find a document that has a given hash, which is essentially a preimage attack, or is there something I am missing?
...interesting if true.
512 bits made from 2 hashes, one weak and one strong will be weaker than a single 512 bit hash from the stronger algorithm.
The world is going to end! Giant asteroids will destroy all life on earth!
Oops, wrong article. Um... The world is going to end! Global warming... um, well... the Patriot Act... umm...
Well, it's not that bad. Somebody might be able to flip four very carefully selected bits in a file, and still produce the same MD5 hash. This could let me, for example, create an executable that had a normal, benign behaviour, and an evil trojan behaviour, and have one of the bits that I flip change a conditional so that the trojan behaviour was activated. (Note that open source tends to be immune to this kind of nonsense, since in the source code, the actual trojan part - not the conditional that activates it, but the actual evil payload - tends to stick out like a sore thumb.)
Note well that this does not let me create an evil version of somebody else's file. It only lets me create two closely related files, one of which differs by four bits from the other. I have to be able to construct the benign file in such a way that I can turn it into an evil file by changing four bits. And it can't be just any four bits, either; it's a very specific four bits.
So this isn't the end of the world. What it means is that you can't quite trust MD5 to guarantee that you got exactly, bit-for-bit, what you think you got.
But really, this new situation isn't much worse than what we had before. I mean, I could simply have the evil behaviour activated by the date, or by the IP address of the installed machine, or whatever, and get somebody else (who never saw the evil part run) to state that the program did what it was supposed to. Having an MD5 hash doesn't guarantee that the program isn't evil. Bottom line: don't run code written by bad people, whether it has a valid MD5 or not. (I know, I know. How do you tell who the bad people are? That's a hard question, but my point is that a valid MD5 has never told you whether the authors were bad people or not.)
A bad day for me.
I write 2 programs, let's call one "Cool Whizzy Must Have Util" and the other "Soul Sucking Destruction", and I tweak and tweak one of the binaries so that they have the same checksum.
Then I release the first one, everybody is eventually using it.
So then on my servers, I replace the first one with the 2nd one.
Gotcha!
Is that the danger here?
Yes but what does this mean to me, "Mr.MSAccess Guru/Administrator"?
Microsoft certification available upon request.
If you think
Some forms of ECC have been 'broken': Len Adleman (the A of RSA) showed that ECC in dimensions higher than 2 was no more secure. He has been working on some further attacks and thinks that ECC as a whole might be vulnerable.
I don't like ECC for two reasons. The first is that ECC is a very new field of mathematics, new results come regularly. It is entirely possible that someone would find an efficient means of transforming ECC problems into discrete math problems and come up with a solution.
The other reason is that ECC is patented up the wazoo. The most efficient ways of using ECC are patented and if you can't use them there is no efficiency advantage over RSA in a discrete field so why bother?
The hash algorithm thing is massively overblown. MD5 was already toast. SHA1 was due to be withdrawn in 2010 in any case and has already been superseded by SHA-256 and SHA-512. New versions of DSA for the larger hash sizes are also due.
It remains to be seen whether the construction of SHA-256 needs to be adjusted in the light of the MD5 results. It may well be that it shares the same vulnerability as SHA-1 and we should forget about the new hash functions and move straight to something else. Alternatively all might be right with the world. We do not know yet.
A lot of people are suggesting a competition similar to the AES competition for a new digest algorithm. There is already something underway for stream ciphers. This seems like a good plan, not least since the cryptographers seemed to have fun with the last one.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
"SHA1 is a totally different algorithm, so it's still perfectly safe."
Yes and no. MD5 collisions are not SHA1 collisions, and the attack that generated the MD5 collisions doesn't seem to be applicable to SHA1, or its authors would have published collisions on SHA1. They published collisions on several other algorithms: HAVAL-128, MD4, and RIPEMD. They also say that their method will work against SHA0. All these hash functions share similar design principles. It seems highly probable that the MD5 attack will have at least some applicability to SHA1 even though it isn't directly an attack against SHA1. Also, other researchers have published results against SHA1. In particular, Biham and Chen can produce collisions on reduced versions of SHA1 with up to about 40 rounds (the full hash function has 80). That isn't a break of the full hash function, and there's no guarantee it can be extended to more rounds, but it looks worrisome.
"This attack produces two messages with the same hash, no guarantee what that hash would be, instead of one message with a chosen desired hash, so it isn't a threat to real systems."
That's just stupid. "No practically-findable collisions" is one of the design requirements for a secure hash function. Protocols using secure hash functions are based on the assumption that the functions used are secure hash functions. If your hash function doesn't guarantee collision resistance, then your protocols must be assumed to be broken unless you can go back and prove, for every protocol, "This one is still secure even if we use something that is not a real secure hash function."
One way a hash collision could be useful, for instance, would be against some signature schemes where the secret key is revealed if you ever sign an identical message more than once. People who use those schemes are careful to avoid signing the same message twice... but if you had two different messages and they had the same hash, it's quite possible to imagine that you could be tricked into signing the same hash more than once (because people sign hashes, not actual messages) and making trouble for yourself. Similarly, if you use hash output for initialization vectors in cipher modes that use those, the result could be encrypting two messages with the same keystream, which means an attacker can probably recover both messages (and then use them as stepping-stones to breaking the rest of your system).
Also, a fast way of finding collisions may well be extensible to a somewhat-slower, but still faster-than-brute-force, way of finding the preimages that you think an attacker really wants.
"This attack depends on the messages having a special form; they don't look like real plaintext, so it isn't a threat to real systems."
One of the conditions for a secure cryptographic system is that you don't depend on the plaintext having (or NOT having) a specific form. If your system doesn't work regardless of the content of the data I put through it, then I will punt on your system, and recommend to my clients some other system that will actually work. It's also not clear that the attack on MD5 really does require a specific form... those strings look randomly-generated to me, even though the XOR difference of them clearly is not. Maybe with just a little more work they can produce collisions of two meaningful and interesting strings with opposite meanings.
"All hash functions have collisions, so it was bound to happen and isn't a threat."
The important question is whether people can actually find collisions. With a good hash function, collisions should be rare enough that nobody has any reasonable chance of finding them on purpose any time soon. Wang, Feng, Lai, and Yu can find collisions on MD5 deliberately, with practical amounts of computer power. They have done this more than once, and have at least outlined a plausible theoretical explanation of how they can do it. That means MD5 does not provide the guarantees that a secure hash function must
The work that went into putting together AES was really fantastic.
I'm just looking forward to a similar effort around an advanced hashing standard.
Where would an effort like this form?
I thought the original post on slashdot said something about finding a collision with MD5 using a simplified version or different initialization numbers or something like that.
Is tried-and-true MD5 broken?
Upon hearing these news, Tom Ridge raised the level of alert to "Amber".
At least this time he had something a tad more substantial to instill fear in the hearts of all patriotic Americans such as myself.
Thank you Department of Homeland Defense! I sleep so much better at night!
Wearing pants should always be optional.
For example I may have a document that currently creates a mostly random hash, with preimage I could add extra bits into the document and make the hash equal to exactly what I want. Warez people could make all their fileid.diz hashes "L33T_L33T_L33T_L33T_L33T" with some extra padding at the end of the file.
Right now all we can really do is live with the hash that is currently generated.
Posted anonymously to avoid offending any of my colleagues.
In the current attack, from what I have seen, the original document makes hash A, and then the original document is slightly modified so that the modified document makes hash A again. I don't think they can make hash A without the original document; it's just too hard.
Go ahead and say it - "Beowulf cluster".
More info on the implications at Educated Guesswork. (It isn't my work, so anonymously it is.)
I wish I'd said what he said...
I'm Bob Fourney
I'm bob Fourney
I'm Bob Fourney.
I'm Bob Fourney:
I am Bob Fourney
I'm Robert Fourney
I'm Robert Anthony Fourney
So, you could write the malicious code (or email) and play games with the syntax until you get the malicious functionality and a hash you like.
It's easier if you can play with both the "good" message and the malicious one.
Oh No! The sky is falling! This could be as big as Y2K!
If you sign for one hash, you've signed for anything that can generate that hash... which is a great deal more than 2 different sets of data. It's an infinite set of data (in theory)!
Well, personally I'm more of a skunk fan, but Abraxas in Amsterdam does a superb hash milkshake.
Blaming GW Bush for the Iraq war is like blaming Ronald McDonald for the poor quality of food.
I'm asking questions, educate me.
ROT-13 is completely invulnerable to hash collisions; no two non-identical inputs will ever result in identical outputs!
I recommend that everybody replace their existing encryption systems with ROT-13 immediately.
-Cbbg
http://eprint.iacr.org/2004/199.pdf
If you think that punctuation doesn't affect meaning, you are illiterate.
See what I've been reading.
In fact, that's also what a PGP-signed message stating the md5sum of a package is good for: double-wrapped-goodness double-happiness security. Someone trying to pass off the bad Trojan to Bob would not only have to create an md5sum collision between the bad Trojan and the purported good program, but would have to present a... whoops, I see the error in my own argument. Yeah, stick with publishing multiple hash values with multiple hash algorithms.
The probability of being able to create a doppelganger which can cause a collision in BOTH hash functions simultaneously is multiplicatively harder. Right?
The time it'll take to produce a crack that breaks all three of DES, AES, and Rot-13 may be no stronger than AES alone is today; but if someone breaks AES tomorrow, at least some h4x0r5 can't break DES yet.
A programmer is a machine for converting coffee into code.
guess again faggot.
GNAA > j00
fristage postage is mine
This would allow labels to insert broken music/video content into p2p networks without the network being able to defend. They can take a pirated file, change it a little and add it back to the p2p network. The MD5 is the same but the file may be unplayable.
The implications are largely academic for now. What seems to have been discovered is that there are some shortcuts in MD5 and SHA0 that make these algorithms less robust than previously expected. The existence of one "hole" in an encryption or hash algorithm suggests that there might be more waiting to be discovered. This might take 1 year or it might never happen.
So, right now the primary risk is that someday we wake up to find that enough holes have been discovered to compromise preimage resistance. If this happens, it might be easier (but probably still horrendously expensive) to perform certain kinds of attacks on digital signatures and password files.
Another risk is to find out that these algorithms have an undiscovered bias in their output. MD5 and SHA are sometimes used to generate keys for encryption algorithms. If a bias is discovered, then it might be a possible (but probably still horrendously expensive) attack on some encrypted data.
Again, we are talking about stuff that could happen tomorrow, or could never happen. The consensus I'm reading is that the odds of a worse break in MD5 being discovered have just gone up by a significant level. The odds are not big enough to justify another Y2K panic (partly because quite a bit of software has already made the transition). However, where possible it is prudent to pick SHA1 over MD5.
With a collision attack, you can perform an attack that matters - here's an example:
Imagine that Microsoft won't sign any audio drivers for Windows XP that allow raw audio data to be output to disk. Also imagine that you are the driver release engineer at Creative (Sound Blaster division) and you want to release a driver that can do that.
What you do is build both drivers (one that Microsoft will sign, and one that you want to release with the "unacceptable" feature) with a large static data buffer that isn't used in the binary. You then try to modify both buffers such that the two files will have the same hash (doesn't matter what hash, just that it's the same). This will take about 2^40 worth of work for MD5 instead of the 2^64 that it should take because of this security issue.
Once you've created your two binaries with the same hash, you send the acceptable binary to Microsoft and they sign it. Then, in the release section of your website you post the other binary with the signature you got from Microsoft... and the signature verifies just like they signed it.
There is also a break in the digital check situation, *if* the digital check protocol has random padding (many do) *and* the payee generates the check (also possible).
-- The act of censorship is always worse than whatever is being censored. Always.
Or one PC from 5 years in the future...
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
I don't think people realize how complicated it would be to brute force one of these algorithms. Let's take one that has a 160 bit hash. This means that you would find a collision in ~2^80 tries, right? Wrong, this assumption requires that you store each of the hashes you compute in the brute force attack. This would amount to 2^80 x 160 bits, or 21990232555520 terabytes. It is clearly not feasible to store that much. A 128 bit hash would find a collision in ~2^64 tries and would need 2^64 x 128 bits or 268435456 terabytes of storage! Let's see someone distribute that.
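The storage figures above can be checked with a few lines. This only reproduces the arithmetic of the naive store-every-digest search described in the comment, and it assumes 1 terabyte = 2^40 bytes, which is the convention that yields those numbers.

```python
TERABYTE = 2 ** 40  # bytes; binary convention assumed to match the figures above


def birthday_table_terabytes(hash_bits: int) -> int:
    """Storage for a naive birthday search that keeps 2^(n/2) digests of n bits each."""
    entries = 2 ** (hash_bits // 2)   # ~2^(n/2) digests before a collision is expected
    bytes_per_entry = hash_bits // 8  # one stored digest
    return entries * bytes_per_entry // TERABYTE


print(birthday_table_terabytes(160))  # 21990232555520
print(birthday_table_terabytes(128))  # 268435456
```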
2*31*37*263
This is not true. Any input that contains no alphabetic characters will result in identical outputs. ;)
On a lark I decided to run the purported collisions in the paper through MD5 to verify the claim, and I got a weird result. The two examples given are indeed collisions, but the hash is not what the paper says it is. The paper says that the hashes for the two examples are supposed to be:
9603161f f41fc7ef 9f65ffbc a30f9dbf
and
8d5e7019 6324c015 715d6b58 61804e08
but the hashes I get are:
74BE7342 8C5BDB65 9BE40E00 CF6AE31C
and
BC5E1391 D31E52F3 D41CBE8C 05D7DBC1
I'm using the MD5 library built in to Darwin (OS X) and I've verified that it passes the standard MD5 test suite in RFC 1321.
Thanks for pointing out the obvious...Wait, what now?
parent's trying to say more along the lines of "it'll be a lot less easy to find a collision dataset that's simultaneously a collision for md5 and sha1"
A lot of stuff I've seen floating around carries multiple verification methods (apache uses md5 and pgp sigs for example).
Even if one verification technique is rendered "broken" -- together, the two hash algorithms are still that much more complex to break (though your point is also valid: wasting 32 bits on crc32 isn't going to make it more secure than adding those 32 bits to your new nonbroken cryptographic hashing algorithm).
What if some attacker gets ahold of a Linux MD5 /etc/passwd file? That would likely now be enough to get access to the computers...
I hope the distros migrate to sha-1 for their default authentication mechanisms...
The Penguin Producer
Correct me if I'm wrong, but aren't the hashes in many unix /etc/passwd files MD5 hashes of the actual passwords? In this case, wouldn't it be easy for anyone with access to the passwd file to generate passwords that match the hashes (collisions) allowing them to login to accounts which were previously thought secure? This seems like an enormous security hole that wasn't even mentioned in the Q&A document.
Nate
To check a file manually, I should do the following:
Check the MD5sum against a known good source.
Check the GPG signature of that source.
Check the file size (might be harder to fake an MD5 for files of the same size?)
What I actually do most of the time is quite a bit different:
Check the first and last few characters in the MD5sum against what is posted on the web/FTP site.
To get a complete MD5 collision is currently something the NSA might be able to do (paranoia hat not on). To get a look-alike that matches part of the original MD5 (just the part I tend to check) should be possible.
(Forging the original MD5 is probably the easiest thing to do since the GPG signature is rarely provided and if it is is probably rarely checked.)
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
I see that you have been paying close attention to the ECC FUD. Good for you.
A common security system practice, especially in web development, is to use MD5 to hash users' passwords and store them in a database. When the user enters their password for access, it is hashed and checked against the db. This means you can check passwords without having to store it plain-text anywhere.
My understanding is that this problem in MD5 means it is slightly easier to take an MD5 hash (if the database were stolen, for example), and find a password that will generate the same hash, and thus allow access. Is this correct? What are some developers doing out there to address this issue in their security systems? Is it really an issue in this scenario?
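A rough sketch of the store-a-hash, compare-on-login flow described above. Two things here are my own assumptions and not what the poster's system does: a random per-user salt, and PBKDF2 with SHA-256 from Python's standard library in place of a single bare MD5 call. The function names and the iteration count are illustrative only.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def make_record(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store in the database instead of the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare digests."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)
```

Note that an attacker holding a stolen hash needs an input that hashes to that specific value, which is a preimage problem; the collision result discussed in this thread produces two inputs the attacker chooses together.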
welcome our new bluefish password hashes.
"Long run is a misleading guide to current affairs. In the long run we are all dead." (John Maynard Keynes)
Ummm... RSA has been cracked too:
http://www.rsasecurity.com/rsalabs/node.asp?id=20
The grandparent hasn't really grasped this.
It's not like I can download a BIOS from Abit and make garbage which matches the hash.
It allows me to make two pieces of garbage which have the same hash
Yes, some crypto people are already saying you should change the whitespace in any pre-generated document you are asked to digitally sign.
Changing spelling or punctuation would also protect against collision attacks.
Nothing to see here; Move along.
Wait... Did I understand you correctly? George W. Bush likes virgin coke and is secretly married to Britney Spears?
what's shocking and horrifying is not that it actually happened, but that it was far easier to generate them than anyone ever thought.
There's something like an AES-style competition for stream ciphers? Sounds fun, who's organising it?
so that's 98 billion billion serial-years?
But, we've proven already with distributed attacks it's not so many parallel-years if you gather enough computing power together, so how many is that in parallel-years, or even quantum-years?
Now, I have no degree in English, but it certainly is possible to change punctuation without changing meaning. I didn't say "change any punctuation because it will not change the meaning". I said "...change something in the document that doesn't change the meaning..." (you even quoted this!) I can think of an example of a case where a change in punctuation will not change the meaning. For example I could write:
All three of these are legal English and have the exact same meaning. Granted the second one may not be the most common way to write the sentences, but all are changes that nobody would argue with (unless they had ulterior motives).
Maybe you should try thinking of more examples where punctuation doesn't change the meaning before calling someone illiterate.
...interesting if true.
Dingle berry, put away the pentium 60, newer flash BIOS has 2 areas, one that can be rewritten and another for basic start-up - I have had power failures during writes in newer systems and they recovered fine.
I would hope that you could present the original contract you signed with the same MD5 as evidence.
I would hope that the judge and jury would realize that it is easy (given the recent finding) for the shyster to find two contracts with the same MD5, but that it is computationally infeasible for you to find an alternate contract with a given MD5. Therefore, the existence of the two documents should be enough evidence to put the shyster in jail! One would hope ...
However, as others have suggested, it would be prudent to make minor alterations to any important document you digitally sign to defeat the shyster lawyer attack.
In real life crypto systems, the minor alterations take the form of random salt added to the hash. So a properly designed crypto system is not broken by this finding. In fact, any document signing system should include random salt in the hash and signature so that you don't have to go changing color to colour before signing.
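A minimal sketch of that salting idea, under my own assumptions: the signer (not the document's author) picks a fresh random salt after receiving the document, hashes salt plus message, and ships the salt alongside the signature. The actual signing step is left out, and the names and the 16-byte salt length are illustrative.

```python
import hashlib
import hmac
import os
from typing import Tuple


def salted_digest(message: bytes, salt: bytes, algorithm: str = "md5") -> str:
    """Hash salt || message; MD5 is the default only because it is the hash under discussion."""
    return hashlib.new(algorithm, salt + message).hexdigest()


def prepare_for_signing(message: bytes) -> Tuple[bytes, str]:
    """Signer side: choose an unpredictable salt, return (salt, digest_to_sign)."""
    salt = os.urandom(16)
    return salt, salted_digest(message, salt)


def digest_matches(message: bytes, salt: bytes, signed_digest: str) -> bool:
    """Verifier side: recompute the salted digest from the published salt."""
    return hmac.compare_digest(salted_digest(message, salt), signed_digest)
```

The point of the design is that a colliding pair prepared in advance was computed without the salt, so the pair is not expected to still collide once an unpredictable prefix is added. That only holds if the person who wrote the document has no say in the salt.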
The parent is NOT insightful. The breaking of MD5 is completely separate from the near-breakage of SHA-1. The SHA-0 algorithm, which is almost identical to SHA-1, has been broken. The reason people are worried about SHA-1 is that new techniques have been developed and they are partially successful on it. With some extra work, they may be able to break it. The fact that MD5 was broken recently too is a coincidence and is NOT the reason people are worried about SHA-1.
Yes, we know these exist for any hash which can take inputs of greater length than the hash (insert simple pigeon-hole argument here).
But the reason this _is_ something different is that we found an algorithm to generate them without having to do a brute force search.
For example, for a 64 bit hash, you would expect to have to test 2^63 different inputs before finding a collision with the hash of a fixed input. Or, if you just want a collision of any two inputs, it would take many fewer attempts (the birthday paradox). But with the new techniques, it is even easier. Hashes are assumed to not have this property, thus cryptographic protocols built on top of any of these broken hashes may not be safe after this discovery.
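To get a feel for the gap between "any two inputs that collide" and "an input that collides with a fixed one", here is a toy brute-force experiment on a deliberately shortened hash (the top bits of MD5). Everything about the setup is an assumption for illustration: the truncation, the counter-based messages, and the function names. It has nothing to do with the new analytic attack, which avoids brute force altogether.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Top `bits` bits of the MD5 digest, as an integer (a toy short hash)."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits: int):
    """Birthday search: return (a, b, tries) where a != b hash to the same short value."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_md5(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_match_for(target: bytes, bits: int):
    """Fixed-target search: return (msg, tries) where msg != target has target's short hash."""
    goal = truncated_md5(target, bits)
    for i in count():
        msg = str(i).encode()
        if msg != target and truncated_md5(msg, bits) == goal:
            return msg, i + 1
```

With `bits=16`, `find_collision(16)` typically finishes after a few hundred tries, while `find_match_for(b"fixed input", 16)` typically needs tens of thousands, which is the birthday effect the comment refers to.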
The whole point of shadow passwords is to force an attacker to gain root to even be able to get your password file (/etc/shadow). If an attacker can gain root, the attacker can either install a rootkit or create their own account, say root2 with a UID of 0. Shadow passwords were originally designed to provide additional protection against dictionary attacks by trying to keep the file with the passwords in it out of an attacker's hands, but work perfectly against an md5 collision attack. As long as the root account is properly secured, md5 collision attacks would not be possible. The only way to crack an /etc/passwd file nowadays is through something like a rubber hose attack or purchase key attack (i.e. torture/bribe/blackmail the password from someone who knows it).
Disclaimer: this is a repost.
Q: What is the difference between SHA-0 and SHA-1? Is SHA-0 widely used?
A: SHA-0 was initially proposed in FIPS 180 (May 1993) as hashing standard by the U.S. government, but was replaced by SHA-1 in FIPS 180-1 (April 1995). SHA-1 adds an additional circular shift operation that appears to have been specifically intended to address the weaknesses found in SHA-0. SHA-0 is not widely used and should not be used in new systems.
This indicates that the US Govt had already shown SHA-0 was weak by 1995. I dimly recall there was a similar involvement in RSA or DES, where a weakness was avoided by the US govt requesting a (at the time) seemingly irrelevant change to the algorithm.
**TODO** Steal someone elses sig.
Can they now use this to break bittorrent?
Would seem to me the server could have a secret hash code and use it on data that turned out to be invalid and ban based on it but more security is always a pain in the ass...
Anyone seen any evidence of this or heard anything about it happening in the future?
I'm not sure you're entirely right there.
"The cryptographic hashes are broken. The world is going to fall apart."
You've got two separate statements there. Because of the full stop, you can't tell if the second statement was caused by the first.
"The cryptographic hashes are broken; the world is going to fall apart."
This implies causality. Thus, the world is going to fall apart *because* the cryptographic hashes are broken.
"The cryptographic hashes are broken, and the world is going to fall apart."
I was told to never put "and" after a comma, so I'm not sure this sentence *is* legal English. As with the first sentence, you again can't be certain of the causality of the statements.
However, you are right in your general statement: the first and last sentences have the same meaning.
How can you blame 9000 deaths on AI? Their highlighting government mischief has created awareness, forced investigations and lowered the number of deaths.
Adding an arbitrary salt makes it less secure. Here's why:
MD5 will still be able to protect your file (binary or authenticated message, whatever) from being tampered with by SOMEONE ELSE, but not by yourself. One of the problems with the hash collision attack is, as suggested in TFA, that someone creates two files that have the same hash (easier than creating a file that has the same hash as another pre-specified file) and submits the good one to be trusted before swapping in the evil one.
Suppose you write a program:
if (user == root) {
    omnipotent_flag = True;
}
But what you really want is to write this malware:
if (user == backdoor) {
    omnipotent_flag = True;
}
But --gosh darnit, the resulting two programs generate two different MD5 hashes! What to do?
Well, use the excuse of some arbitrary salt string:
if (user == root) {
    omnipotent_flag = True;
    /* I'm just inserting this arbitrary string to randomize the hash --yeah, that's it.
    ujiUFIDO94305-8345JFKL:JKDFLS:f */
}
Hey, now you can generate your own malware program:
if (user == backdoor) {
    omnipotent_flag = True;
    /* Inserting whatever random characters it takes to generate the same MD5 hash:
    rewAFDSADSF5435435#$%#$% */
}
Of course, the example doesn't have to be as blatant as this, but you can see where this is going:
- the "random" salt can be at the beginning, with a nice-looking comment, and if it is short (like your example of "xyzzy"), it might be accepted. A 5-letter "salt" multiplies your number of available files by 64^5 (assuming 64 characters are acceptable as "letters"). One of those might just generate a hash that matches the malware!
- Yes, I know the idea is that the hashes match for both the file with the salt and the file without the salt. But you can imagine someone saying, "Yeah, I added this RANDOM salt [laughs evilly to himself] and the hashes still match!" and the sheeple will say, "Wow, it must be a match --I guess I don't need to bother checking the hashes of the original unsalted file."
What is needed is a widely accepted Standard Salt String that is prepended to the file, and when people list the MD5s of a file, they also list the MD5s for the file with the Standard Salt String, and both must match. As long as it's the community that chooses the SSString, and not the contributor, the SSString can be any arbitrary sequence of characters, like, oh, say, "m1cr0$0f7 5Ux0R5".
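A rough sketch of what that double check could look like, assuming the publisher lists both hashes next to the download. The names `STANDARD_SALT`, `md5_pair` and `verify` are hypothetical, and the salt value is just the joke string from the post.

```python
import hashlib

# Hypothetical community-chosen string; the value is the post's joke example.
STANDARD_SALT = b"m1cr0$0f7 5Ux0R5"

def md5_pair(data: bytes):
    """Return (MD5 of the file, MD5 of Standard Salt String + file) as hex strings."""
    plain = hashlib.md5(data).hexdigest()
    salted = hashlib.md5(STANDARD_SALT + data).hexdigest()
    return plain, salted

def verify(data: bytes, published_plain: str, published_salted: str) -> bool:
    """Accept the download only if BOTH published hashes match."""
    plain, salted = md5_pair(data)
    return plain == published_plain and salted == published_salted

# The publisher lists both values; the downloader recomputes both.
published = md5_pair(b"contents of the released file")
print(verify(b"contents of the released file", *published))
print(verify(b"contents of a tampered file", *published))
```

The point of checking both is that a contributor who engineered a collision for one of the two hashes would still have to match the other, using a string they did not get to pick.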
KWTm
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
Your explanation makes no sense to me.
How is it any easier to produce a file that will cause a collision in MD5 when I hash it in MD5 after an arbitrary (and unknown to the attacker) string has been hashed?
In this case, the attacker and the creator of the file are the same person. (I'll say why this is the only important case.)
The attacker/creator creates two files that have the same hash, by adding an arbitrary string of the attacker/creator's choice, using some excuse that will fool some people. Then you get a hash collision.
In your question, you're asking what if YOU add an arbitrary salt string of your choice, and then do the MD5 hash. In this case, the attacker cannot choose your string. But then what use would your new MD5 hash be? You would need to compare it to another MD5 hash (presumably, that of the originating site) which also used the salt string of your choice.
Suppose I post my file, and the MD5 hash is "123abc" (for argument's sake). You get the same MD5 hash after downloading my file. But now you say, "Hey, just to be sure: when I add 'xyzzy' to the front, the MD5 hash is '456def'. Did you get the same thing?" So now I have to go back and rehash it for xyzzy+file.
Then someone else says, "Hey, when I add 'plugh' to the front, I get '321cba'. Did you get the same thing?" Pretty soon I'll be having all these requests and have to post all these hashes. So I say, "Okay, I'm just going to pick some arbitrary string, like 'sf9FD798dfs' and do the hash." The danger is that some people would get fooled into thinking that this is the ONLY hash that needs to be checked (and not the file without the salt), and so it's much easier for me to create a collision.
404555974007725459910684486621289147856453481154 in hex is "You sank my Battleship?"
[GPG key in journal]
512 bits made from 2 hashes, one weak and one strong, will be weaker than a single 512 bit hash from the stronger algorithm.
True. However, using 2 different algorithms that are not known to be weak is probably stronger than using a single algorithm that is not known to be weak but produces twice as many bits.
This follows from the fact that similar methods will be used to generate all of the bits in the latter case, so if there is some systematic flaw it is reasonably likely to apply to all of the bits. In the former case, you'd have to find 2 systematic flaws to get as far (assuming that the algorithms used to generate the bits were dissimilar, and therefore unlikely to both contain the same flaw).
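For concreteness, a minimal sketch of building a 512-bit value out of two dissimilar algorithms. Picking SHA-256 and SHA3-256 is purely an assumption for the example (neither is named in this thread), and `combined_digest` is a made-up helper. Whether the combination is really stronger than one wider hash is the argument above, not something the code demonstrates.

```python
import hashlib

def combined_digest(data: bytes) -> bytes:
    """Concatenate the digests of two structurally different hash functions."""
    first = hashlib.sha256(data).digest()     # 256 bits
    second = hashlib.sha3_256(data).digest()  # 256 bits, different internal design
    return first + second                     # 512 bits in total

def matches(data: bytes, expected: bytes) -> bool:
    """A forgery has to reproduce both halves at the same time."""
    return combined_digest(data) == expected

reference = combined_digest(b"some file contents")
print(len(reference) * 8, "bits")
print(matches(b"some file contents", reference))
```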
In your question, you're asking what if YOU add an arbitrary salt string of your choice, and then do the MD5 hash. In this case, the attacker cannot choose your string. But then what use would your new MD5 hash be? You would need to compare it to another MD5 hash (presumably, that of the originating site) which also used the salt string of your choice.
Yes. This is the same situation that the original poster who suggested this was talking about -- he was monitoring files on his system for changes (presumably in order to catch rootkits being installed, etc.).
I understand the problems if you don't choose your own salt.
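Here is roughly what the choose-your-own-salt case looks like for watching files on your own box, as a sketch under stated assumptions: the salt is random, generated locally, and kept where an attacker can't read it. The helper names are made up, and the "files" are an in-memory dict standing in for reads from disk.

```python
import hashlib
import secrets

def new_salt(nbytes: int = 16) -> bytes:
    """A random salt that only the machine's owner knows."""
    return secrets.token_bytes(nbytes)

def salted_md5(salt: bytes, data: bytes) -> str:
    """MD5 over the secret salt followed by the file contents."""
    return hashlib.md5(salt + data).hexdigest()

def snapshot(files: dict, salt: bytes) -> dict:
    """Record a salted hash for every monitored file (name -> hex digest)."""
    return {name: salted_md5(salt, content) for name, content in files.items()}

def changed(files: dict, baseline: dict, salt: bytes) -> list:
    """Names of files whose salted hash no longer matches the baseline."""
    return sorted(
        name for name, content in files.items()
        if baseline.get(name) != salted_md5(salt, content)
    )

# Stand-in for file contents read from disk.
files = {"/bin/ls": b"original ls", "/bin/ps": b"original ps"}
salt = new_salt()
baseline = snapshot(files, salt)

files["/bin/ps"] = b"trojaned ps"
print(changed(files, baseline, salt))
```

Since nobody else ever has to reproduce these hashes, there is no published value to compare against, which is exactly the difference from the salt-picked-by-the-contributor case.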