Slashdot Mirror


Apache Subversion Fails SHA-1 Collision Test, Exploit Moves Into The Wild (arstechnica.com)

WebKit's bug-tracker now includes a comment from Friday noting "the bots all are red" on their git-svn mirror site, reporting an error message about a checksum mismatch for shattered-2.pdf. "In some cases, due to the corruption, further commits are blocked," reports the official "Shattered" web site. Slashdot reader Artem Tashkinov explains its significance: A WebKit developer who tried to upload "bad" PDF files generated from the first successful SHA-1 attack broke WebKit's SVN repository because Subversion uses SHA-1 hash to differentiate commits. The reason to upload the files was to create a test for checking cache poisoning in WebKit.

Another news story is that based on the theoretical incomplete description of the SHA-1 collision attack published by Google just two days ago, people have managed to recreate the attack in practice and now you can download a Python script which can create a new PDF file with the same SHA-1 hashsum using your input PDF. The attack is also implemented as a website which can prepare two PDF files with different JPEG images which will result in the same hash sum.

3 of 167 comments (clear)

  1. Re:Here's what it means by Lisandro · · Score: 5, Informative

    FWIW, you're correct, but "hash function" englobes much more than that. Technically, a CRC is, by definition, a hash function. So is bit parity.

    A cryptographic hash function has the properties you mention, plus the fact that it must not be easily reversible and uniformly distribute results over its entire output space.

  2. Re:Here's what it means by complete+loony · · Score: 5, Informative

    Google produced two pdf's that differ in some binary data near the beginning of the file. The SHA-1 hash routine processes data one block at a time, updating its internal state. There are two consecutive blocks that differ between the pdf's. The first pair of blocks produce an internal state where half of the bytes are the same. The second pair of blocks then produce an identical state. The remainder of the pdf files is the same.

    So you can use these two pdf prefixes and append whatever data you want to them to produce your own pair of files. Pdf includes a programming language for rendering content. Within this language you can inspect the earlier bytes of the file to detect which version of the file you are rendering, and make some visual changes. So while there are only a few bytes that are different, you can make two pdfs that display different content.

    Nobody has invested the time to produce a new hash collision, but someone has already automated the production of duplicate pdf's based on this work.

    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
  3. Re:Here's what it means by complete+loony · · Score: 5, Interesting

    This is why git is not vulnerable in this specific instance. In git all objects are prepended with their type, in this case "blob". Of course if you had $100k (-ish) to burn, you could repeat this attack on a file that does start with "blob" to break git.

    However you don't need to do this. This attack depends on reaching an intermediate state with specific properties in order to massively reduce the search space. Any attempt to hash a file that reaches one of these states can be detected and rejected. If you swap to using https://github.com/cr-marcstevens/sha1collisiondetection for all SHA-1 calculations, every instance of this attack can be detected and rejected.

    Also I mis-spoke slightly and spotted my error after checking the paper again. The first pair of blocks have half of the same bytes, but produce an internal state with only 6 bytes of differences. The second pair of blocks, again only differ in half of their bytes, and exactly cancel out those 6 bytes of differences. See Table One on page 3 for the actual byte values.

    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.