Slashdot Mirror


Finnish Firm Claims Fake P2P Hash Technology

An anonymous reader writes "As reported by The Inquirer, a Finnish company known as Viralg Oy claims to have developed software that can create a junk file with the same hash as a genuine p2p download. This, according to the company, can altogether stop the sharing of copyrighted files by flooding p2p networks with corrupt/junk data, which then spreads through the network, causing less and less of the original file to be available. However, with the resolve of the p2p userbase, is this software really going to 'beat all Peer 2 Peer pirates at their own game,' or simply prove a minor annoyance?"

30 of 748 comments

  1. Preview/Trailer by fembots · · Score: 3, Interesting

    I guess there are two schools here.

    One believes this kind of fake file will only add burden to the internet, as users will just download one fake file after another until they get a hit.

    The other believes that such annoyance will put most people off, because the total time/cost it takes to acquire something is now higher than the actual product.

    I don't think MP3s will be affected because you can start playing the song if you've got the first bit. Can/will other file formats do that too?

    1. Re:Preview/Trailer by John+Seminal · · Score: 2, Interesting
      One believes this kind of fake file will only add burden to the internet, as users will just download one fake file after another until they get a hit.

      The other believes that such annoyance will put most people off, because the total time/cost it takes to acquire something is now higher than the actual product.

      What will hurt P2P is how hard finding a good network is. Kazaa is filled with spyware, and half the stuff on there is not good. There are lawsuits all over the place; it is not worth it. BitTorrent, which was nice, is also under attack by the RIAA. You get better files with BitTorrent and fewer of the fakes; people sharing seem to check their files. But torrent websites are going down, at least the well-known ones.

      What I think will be the next wave will be private P2P, by invitation only. It will be a group of friends sharing their music and files. It will be closed to outsiders, so the only people aware are friends.

      But even if there is a private P2P, with only a group of friends who know each other, will the RIAA be able to scan the internet, looking for their files? Will they go after friends sharing music the same way they would go after strangers sharing music?

      --

      Rosco: "If brains were gunpowder, Enos couldn't blow his nose."

  2. The question is.. by k98sven · · Score: 2, Interesting

    How big is that 'junk file'?

  3. Possible? Yeah by robpoe · · Score: 5, Interesting

    I've always thought it would be extremely possible to create a file with the same MD5 hash.

    Now, what the company has to do is create a file of the SAME FILE SIZE, with the same MD5 hash that's a fake .. then I'll be impressed.

    --
    = Grow a brain...
  4. Minor annoyance at first.... by dgatwood · · Score: 4, Interesting
    ...but if you can create a random junk file in a reasonable period of time, the mechanism can probably be extended easily enough to make it possible to add arbitrary junk to the end of a trojaned executable in a future version of the tool....

    --

    Check out my sci-fi/humor trilogy at PatriotsBooks.

  5. claims? by geoffspear · · Score: 5, Interesting
    I read the article and everything I could find by following links on their website, and found no reference to how their product supposedly works, or any claim having to do with identical hashes. Did the article submitter just make up the identical hash claim, or is there more information on this product available somewhere else?

    What hashing algorithm do they claim to have broken so completely? Sounds like BS to me.

    --
    Don't blame me; I'm never given mod points.
  6. Er.. by t_allardyce · · Score: 3, Interesting

    They might be able to fake one hash, but don't most P2P networks use a combination of different hashes? If not, it would be easy to implement: you can either use more than one type of hash, such as MD5 and SHA, or add salt/pepper to a chunk and make any number of hashes, where each additional hash makes it insanely harder to crack.

    --
    This comment does not represent the views or opinions of the user.
  7. Re:Secure Hashes vs. Fake Files by Capt'n+Hector · · Score: 2, Interesting
    Use a safer Hash function.

    Or even better, use more than one. If file_x is hashed 10 different ways, using 10 different algorithms, there's no way the file generated by this firm will behave the same way for ALL of them, perhaps not even for two.

    --
    Quid festinatio swallonis est aetherfuga inonusti?
    Africus aut Europaeus?
  8. Sharing by man_ls · · Score: 2, Interesting

    The time-vs-accuracy tradeoff is a big one. One client that some people I know use takes almost 48 hours to index a full hard drive of files to share and hash them all.

    With anything less robust, you're liable to have collisions, such as these, apparently. With anything more, if you have a lot of files, there's a major time commitment before you can actually begin to serve anything -- most people aren't willing to have their CPU pegged for 2 days straight while their P2P client hashes their 35,000 MP3s and 200 movies, or so.

  9. Re:They have cracked strong hashes, huh? by BlacBaron · · Score: 2, Interesting

    It says on their site that the algorithm's patented, so presumably we should all be able to go look at this little marvel.

    --
    Update Watch - Automatic software update notification
  10. Hash by PureCreditor · · Score: 2, Interesting

    Isn't the whole point of a hash that it's computationally infeasible to create a file such that H(new file) = H(original)?

    If this technology is real, it'll completely undermine the safety of today's unix passwords, which are stored as hashes in clear text.

  11. Agreed by John+Seminal · · Score: 5, Interesting
    I wonder why people who use P2P don't help each other out a little more. For example, you have someone with 200 files shared. They are downloading and sharing at the same time. Sometimes they download a bad file, and share it. It would make more sense to have an "unchecked" folder for downloads, then move it to the "checked" folder to share.

    What is neat, or not so neat depending on your point of view, are music files which deteriorate after a while. I don't know how they are made, but I have listened to music that sounds pretty good, but after the 10th playing it starts skipping. Or it could be those skips are not very noticeable when first played, but once identified, they become annoying.

    --

    Rosco: "If brains were gunpowder, Enos couldn't blow his nose."

    1. Re:Agreed by CSMastermind · · Score: 3, Interesting

      http://www.newscientist.com/article.ns?id=dn4248

      Not definitely... I've seen that technology for games (see link), and I remember Microsoft had suggested doing that for MP3s and some other things with DRM. I don't know if it's been put into place yet or not.

    2. Re:Agreed by Nebu · · Score: 3, Interesting

      Sometimes they download a bad file, and share it. It would make more sense to have an "unchecked" folder for downloads, then move it to the "checked" folder to share.

      The filesharing programs I use force you to share the directory you download into. Sure, I could name the download directory "unchecked", but few people bother to view the full paths as set by the sources from the people they download.

      What is neat, or not so neat depending on your point of view, are music files which deteriorate after a while. I don't know how they are made, but I have listened to music that sounds pretty good, but after the 10th playing it starts skipping.

      To tell you why this happens, we'd need to know the file format and audio player involved. Assuming MP3, for example, when you modify the ID3v2 data, the file gets completely rewritten, since the ID3v2 tags are written at the head (and not the tail) of the file. Depending on the player, the audio data might actually be getting decoded and re-encoded.

  12. Re:They have cracked strong hashes, huh? by boisepunk · · Score: 2, Interesting

    I see a really short reign of this new "technology" anyway. The hashes could only be for one specific file encoded by a specific encoder with the EXACT title/artist/album info which is not always consistent anyway. I see this as a futile effort.

    --
    main(0)
  13. Sword Cuts Both Ways by 4of12 · · Score: 2, Interesting

    If someone can really poison P2P networks with junk that hash matches (and I have a difficult time believing they've cracked all the hash generators), then consider some hypothetical entity probing illicit distribution of copyrighted material using hashes. They could end up making false accusations against individuals for trading trash instead of Trash©.

    --
    "Provided by the management for your protection."
  14. Interesting idea, how can we apply it to spam? by Progman3K · · Score: 4, Interesting

    If increasing the noise ratio on P2P networks is a good thing, maybe we can use a similar technique to defeat spammers?

    For example, if we could pollute spammers' email address databases with millions of bogus e-mail addresses, then instead of delivering millions of spam e-mails to real e-mail accounts every day, maybe spammers could only reliably send a few hundred to users, the rest of their messages would be to bogus addresses and be "noise" that spammers have to deal with.

    How could we go about doing this?

    --
    I don't know the meaning of the word 'don't' - J
  15. Just finding a hash collision isn't enough really by James+Youngman · · Score: 2, Interesting
    I suppose their method is based on the fact that it turns out that it's easier to find SHA-1 and MD5 collisions than was earlier thought. In fact there's another paper (this paper is not by the Chinese team) which shows that this can be achieved on individual PCs in mere hours, which puts this sort of thing into the realm of commercial exploitability.

    For example, you send the company a copy of the .mp3 file you want to drive out of circulation. They feed it to a computation cluster and eventually out comes another file which has the same hash. You then publish this new file with the same filename on the victim P2P network and hope that it spreads enough to poison the P2P well, so to speak. There are a number of problems with this scheme (assuming of course that this is the sort of scheme that they offer):

    1. The new 'collision' file might have the same MD5 hash, but is it a valid MP3 file?
    2. All it takes to beat this scheme is for P2P software to use more than one hash function, for example
      hash (data)
      {
      return concatenate(md5(data), sha1(data));
      }
      After all, even though we now know how to find collisions in MD5 and SHA-1 (quite slowly) we don't yet know an efficient way to find a single file that is a hash collision for both of them.
    3. If the company paying the money for the 'collision' file is doing so because somebody has spread their material around the P2P network, then the file must be quite prevalent. So why would they expect the 'collision' file to preferentially spread around the network enough to displace the original file?
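
    The concatenation in point 2 can be written out in runnable form. This is only a sketch of the idea using Python's standard hashlib; the names combined_hash and matches are made up for illustration and do not come from any real P2P client.

```python
import hashlib


def combined_hash(data: bytes) -> str:
    """Concatenate the MD5 and SHA-1 hex digests of the same data."""
    return hashlib.md5(data).hexdigest() + hashlib.sha1(data).hexdigest()


def matches(data: bytes, expected: str) -> bool:
    """Accept a download only if BOTH digests match the published value."""
    return combined_hash(data) == expected


if __name__ == "__main__":
    original = b"genuine file contents"      # placeholder data
    fingerprint = combined_hash(original)
    print(len(fingerprint))                  # 32 MD5 hex chars + 40 SHA-1 hex chars
    print(matches(original, fingerprint))
    print(matches(b"junk", fingerprint))
```

    A junk file would have to collide under both functions at once to pass matches(), which is the property the point relies on.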
  16. Re:Just an annoyance by merlin_jim · · Score: 3, Interesting

    For instance, hash with two different algorithms. In theory it is possible to find a file that hashes to the same value in two different algorithms, but it's a lot harder than finding a file that hashes to a specific value in one algorithm.

    --
    I am disrespectful to dirt! Can you see that I am serious?!
  17. Re:They have cracked strong hashes, huh? by LiquidCoooled · · Score: 3, Interesting

    There is a world of difference between a valid collision and an invalid one.

    The anti-P2P software appears to find invalid collisions, which means the downloaded file is useless.
    Finding collisions where the movie/app/document remains valid will be MUCH more tricky.

    --
    liqbase :: faster than paper
  18. Couldnt this work to your advantage by Anonymous Coward · · Score: 2, Interesting

    So say the RIAA tries to sue you, saying they saw that you had the newest 50 Cent album on Kazaa. Couldn't you claim that what you had was not 50 Cent's album, but random files with the same hashes as 50 Cent's MP3s? I mean, can't you fight the RIAA with its own weapons? If they completely destroy the mechanism for determining what files you currently have, then how does their claim that you had X file hold any merit at all?

  19. Re:They have cracked strong hashes, huh? by Nebu · · Score: 2, Interesting

    The hashes could only be for one specific file encoded by a specific encoder with the EXACT title/artist/album info which is not always consistent anyway. I see this as a futile effort.

    Who pirates individual songs these days? I see this as being a major annoyance for people who pirate games. DVD ISOs are typically 4 GB, usually released by only one or two groups (and so there probably won't be more than 2 versions of the file), and take several hours if not days to download. Worse yet, the games contain executable content, so assuming the ISO mounts via Daemon Tools, for example, if you're really unlucky, you might randomly have gotten code that reformats your hard drive.

  20. Re:Not going to work that way.... by DickBreath · · Score: 2, Interesting
    Just need a better hashing mechanism.

    How about a hash of the entire file, plus a hash of every 128 KB segment. Constructing a file that matches all of the 128 KB section hashes, plus the overall hash is a much more difficult problem.

    Plus, you know after downloading only 128 KB that the file is not the real deal. Assuming a 128-bit hash, it only takes 8 * 128 bits, or 128 bytes, of hash information per megabyte of download -- really only a few packets to communicate the hash list for, say, a 10 MB file. The benefit for this cost is
    • early detection of corrupt download
    • difficulty of creating a corrupt download
    Now suppose that in BitTorrent like fashion, I could download each 128 KB segment from a different location.
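
    A rough sketch of that scheme, for anyone who wants to see it spelled out. The choice of SHA-1 and the names make_manifest and segment_ok are assumptions made for this example, not any particular client's format.

```python
import hashlib

SEGMENT_SIZE = 128 * 1024  # 128 KB segments, as suggested above


def make_manifest(data: bytes, segment_size: int = SEGMENT_SIZE) -> dict:
    """Hash the whole file plus every fixed-size segment of it."""
    segments = [data[i:i + segment_size] for i in range(0, len(data), segment_size)]
    return {
        "whole": hashlib.sha1(data).hexdigest(),
        "segments": [hashlib.sha1(s).hexdigest() for s in segments],
    }


def segment_ok(manifest: dict, index: int, segment: bytes) -> bool:
    """Check one downloaded segment against the manifest, without the rest of the file."""
    return hashlib.sha1(segment).hexdigest() == manifest["segments"][index]


if __name__ == "__main__":
    data = bytes(1024 * 1024)                # 1 MB of placeholder data
    manifest = make_manifest(data)
    print(len(manifest["segments"]))         # number of segment hashes per megabyte
    print(segment_ok(manifest, 0, data[:SEGMENT_SIZE]))
    print(segment_ok(manifest, 0, b"x" * SEGMENT_SIZE))
```

    Because segment_ok needs only the manifest and one segment, each segment can come from a different peer and be rejected on its own as soon as it arrives.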
    --

    I'll see your senator, and I'll raise you two judges.
  21. Re:Just an annoyance by bman08 · · Score: 4, Interesting

    The magic of this system is that it also works in reverse: "Your honor, my client hates p2p filesharing. All those songs he downloaded, he thought they were phonies with duplicate hashes and deliberately shared them in order to poison the network."

  22. Re:Just an annoyance by Neoncow · · Score: 2, Interesting
    I believe there is a hashing algorithm called TigerTree. TigerTree computes a single hash based on 1024 byte blocks. As the file is downloaded, each block can be independently verified.

    So if they try to pollute a network by giving corrupt data for a valid file, all the downloader needs to do is notice that a particular client keeps sending corrupt parts. And of course if they send some real bits and some fake bits, the downloader will keep the real bits and discard the fake ones.

    Don't ask me how it works, but I know that Shareaza makes use of this hash.

    Link I ripped from the Shareaza wiki: Tree Hash EXchange format (THEX)
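
    For the curious, here is a minimal sketch of how a tree hash of this kind can work. It is a simplification under stated assumptions: SHA-256 stands in for Tiger (which Python's hashlib does not provide), the leaf and node prefixes follow my understanding of THEX, and the function names are invented for the example. It is not Shareaza's actual code.

```python
import hashlib

BLOCK_SIZE = 1024  # block size mentioned in the comment above


def _h(data: bytes) -> bytes:
    # Stand-in hash: real TigerTree uses Tiger, not SHA-256.
    return hashlib.sha256(data).digest()


def leaf_hashes(data: bytes) -> list:
    """Hash each 1024-byte block separately (leaf prefix 0x00)."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    return [_h(b"\x00" + block) for block in blocks]


def tree_root(leaves: list) -> bytes:
    """Combine leaf hashes pairwise (node prefix 0x01) until one root remains."""
    level = leaves
    while len(level) > 1:
        nxt = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # an unpaired node is promoted unchanged
        level = nxt
    return level[0]


def bad_blocks(data: bytes, trusted_leaves: list) -> list:
    """Return the indices of blocks whose hash does not match the trusted leaf list."""
    got = leaf_hashes(data)
    return [i for i, (g, want) in enumerate(zip(got, trusted_leaves)) if g != want]


if __name__ == "__main__":
    good = bytes(range(256)) * 16                     # 4096 bytes, i.e. 4 blocks
    corrupt = good[:2048] + b"\xff" + good[2049:]     # damage one byte in block 2
    print(bad_blocks(corrupt, leaf_hashes(good)))
```

    The single root hash identifies the file, while the leaf list lets the downloader pin the blame on individual blocks and re-fetch only those.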

  23. Re:Link to the patent application by antime · · Score: 2, Interesting

    Thanks for the link. If you look at page four of the document, it explains that because the UUHash algorithm used by Kazaa hashes only a small part of the file it is feasible to change other parts and produce hash collisions through brute-force attacks. Then the attacker just pretends to be a normal node and feeds bad data into the network.
    The obvious way to counter this is to either fix Kazaa or switch to a network where the whole file is hashed.
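
    To illustrate why hashing only part of a file is weak, here is a toy example. It is NOT the real UUHash algorithm: partial_hash and the 300 KB prefix are assumptions chosen for the demonstration, and the toy needs no brute force at all because the tail is simply never hashed.

```python
import hashlib

HASHED_PREFIX = 300 * 1024  # hypothetical: only this many leading bytes are hashed


def partial_hash(data: bytes) -> str:
    """Toy partial hash: MD5 over the leading bytes only."""
    return hashlib.md5(data[:HASHED_PREFIX]).hexdigest()


def full_hash(data: bytes) -> str:
    """MD5 over the entire file."""
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    genuine = b"A" * HASHED_PREFIX + b"real audio data " * 1000
    # Keep the hashed part, replace everything else with zeros.
    junk = genuine[:HASHED_PREFIX] + bytes(len(genuine) - HASHED_PREFIX)
    print(partial_hash(genuine) == partial_hash(junk))
    print(full_hash(genuine) == full_hash(junk))
```

    Any part of the file the hash never reads can be changed freely, which is why hashing the whole file closes this particular hole.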

  24. Re:This is so stupid by WaterBreath · · Score: 3, Interesting
    Yes it can be used for copyright violations, just like a photocopy machine or tape recorder.

    And those things were each also embroiled in copyright lawsuits by big corporations in their day. The difference is that today, the big corps have finally gained enough political leverage to get it their way.

    Corporations are the new first-class citizens. Any individual, regardless of race, gender, or creed, is second-class compared to a corporation.

    I honestly fear that by the time the American people get fed-up enough to realize this, the transformation will be complete, and we will be powerless to change it.

  25. Re:This is so stupid by patio11 · · Score: 3, Interesting

    This doesn't cripple P2P. It just makes a dent in pirate-2-pirate. There is a difference, you realize. The Blizzard Bittorrent patch downloader will still function perfectly. Indie bands who release their new CDs to Kazaa won't have anybody trying to pollute their download pools. And it probably won't even work, more's the pity.

  26. Re:Possible? Yeah by Council · · Score: 2, Interesting

    Oh, I get Mr. Schneier's thing and I'm not behind on the news; I am under the impression that there have not been demonstrated preimage attacks on MD5, which is what I was referring to.

    Re: SHA-1:

    These are not theoretical results but actual collisions.

    Again, here it is preimage attacks that are the problem, not just any collisions. But the results mentioned in the link are NOT actual collisions, just an algorithm to produce those collisions that might be feasible to run sometime soon. They didn't actually calculate any collisions. So not "actual collisions", but a "theoretical result". But that's just pedantry, sort of.

    Anyway, as far as preimage goes SHA-1 is certainly still secure, as is -- I believe -- MD5, and this is what's relevant in downloading. If they are not, please point me to the appropriate thing.
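
    The gap between the two kinds of attack can be seen on a toy scale. The sketch below truncates MD5 to 16 bits so that brute force finishes quickly; tiny_hash, find_collision and find_second_preimage are names invented for this illustration, and it says nothing about the strength of full MD5 or SHA-1. Matching a given file is, strictly speaking, a second-preimage search.

```python
import hashlib
from itertools import count

BITS = 16  # deliberately tiny so both searches finish quickly


def tiny_hash(data: bytes) -> int:
    """First BITS bits of MD5 as an integer (a toy hash for demonstration only)."""
    digest = hashlib.md5(data).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - BITS)


def find_collision():
    """Find ANY two different messages with the same tiny_hash."""
    seen = {}
    for n in count():
        msg = b"msg-%d" % n
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, n + 1
        seen[h] = msg


def find_second_preimage(target: bytes):
    """Find a different message whose tiny_hash equals that of a GIVEN target."""
    wanted = tiny_hash(target)
    for n in count():
        msg = b"junk-%d" % n
        if tiny_hash(msg) == wanted:
            return msg, n + 1


if __name__ == "__main__":
    a, b, collision_tries = find_collision()
    junk, preimage_tries = find_second_preimage(b"original file")
    print("collision found after", collision_tries, "tries")
    print("second preimage found after", preimage_tries, "tries")
```

    By the birthday bound, a collision is expected after roughly 2^(BITS/2) tries, while matching one particular file is expected to take roughly 2^BITS, which is why a collision result alone does not let anyone forge a specific download.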

    --
    xkcd.com - a webcomic of mathematics, love, and language.
  27. Re:Possible? Yeah by cryptoguy · · Score: 2, Interesting
    Here are two different files with the same md5 sum. They are quite similar, but notice for example the differences at offsets 0x13 and 0x2d.
    file1.dat:


    00000000 d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
    00000010 2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
    00000020 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
    00000030 08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
    00000040 96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
    00000050 35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
    00000060 75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
    00000070 ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3

    MD5(file1.dat) = a4c0d35c95a63a805915367dcfe6b751

    file2.dat:

    00000000 d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
    00000010 2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
    00000020 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
    00000030 08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
    00000040 96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
    00000050 35 73 9a 47 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
    00000060 75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 4c 15 5c
    00000070 ed 74 cb dd 5f c5 d3 6d b1 9b 0a 58 35 cc a7 e3

    MD5(file2.dat) = a4c0d35c95a63a805915367dcfe6b751
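
    Anyone who wants to check this for themselves could paste the two dumps into something like the following, which lists the offsets where the files differ and prints both MD5 sums for comparison. The variable names are arbitrary and the hex is copied from the dumps above.

```python
import hashlib

FILE1_HEX = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3
"""

FILE2_HEX = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
35 73 9a 47 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 4c 15 5c
ed 74 cb dd 5f c5 d3 6d b1 9b 0a 58 35 cc a7 e3
"""


def parse(hex_dump: str) -> bytes:
    """Turn a whitespace-separated hex dump (without offsets) into bytes."""
    return bytes.fromhex("".join(hex_dump.split()))


file1, file2 = parse(FILE1_HEX), parse(FILE2_HEX)

# Offsets at which the two files differ.
diff_offsets = [hex(i) for i, (a, b) in enumerate(zip(file1, file2)) if a != b]

print(len(file1), len(file2))
print(diff_offsets)
print("MD5(file1.dat) =", hashlib.md5(file1).hexdigest())
print("MD5(file2.dat) =", hashlib.md5(file2).hexdigest())
```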

    For SHA1, you are correct. They presented an algorithm for finding collisions in full 80-round SHA1, and demonstrated the correctness of the algorithm on SHA1 reduced to 58 rounds. Here is the SHA1 announcement:

    http://theory.csail.mit.edu/~yiqun/shanote.pdf