Slashdot Mirror


Finnish Firm Claims Fake P2P Hash Technology

An anonymous reader writes "As reported by The Inquirer, a Finnish company known as Viralg Oy claim to have developed software that can create a junk file with the same hash as a genuine p2p download. This, according to the company, can altogether stop the sharing of copywritten files by flooding p2p networks with corrupt/junk data, which then spreads through the network, causing less and less of the original file to be available. However, with the resolve of the p2p userbase, is this software really going to 'beat all Peer 2 Peer pirates at their own game,' or simply prove a minor annoyance?"

36 of 748 comments

  1. They have cracked strong hashes, huh? by Flywheels of Fire · · Score: 5, Informative
    This is not true. It might work on Kazaa, but most other P2P networks use MD5 or better. Okay, researchers have found collisions, but no one has found a way to generate a file for a given hash. So the claim by the Finnish company is bogus.

    Or they have cracked even the strong hashes. In which case they are really cool. I know Mr. Torvalds is Finnish, but I doubt even he could come up with algorithms to do that.

    In their conceited press release, they have compared Spoofing vs DRP/a

    1. Re:They have cracked strong hashes, huh? by BlacBaron · · Score: 2, Informative

      BitTorrent uses a hash for each segment of the file; segments are usually 256 KB, 512 KB or 1 MB, but I think any power of 2 is valid. It then lists these hashes in the .torrent file. The hash of the info section of the torrent file is used to uniquely identify each torrent on the tracker.
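
A minimal sketch of the per-piece hashing described above, assuming SHA-1 (as other comments in this thread state) and a 256 KiB piece length. The function name `piece_hashes` and the sample payload are made up for illustration. Computing the info hash also needs bencoding and is left out here.

```python
import hashlib
from typing import List

def piece_hashes(data: bytes, piece_length: int = 256 * 1024) -> List[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each piece."""
    return [
        hashlib.sha1(data[offset:offset + piece_length]).digest()
        for offset in range(0, len(data), piece_length)
    ]

# 1 MiB of sample data, a hypothetical stand-in for a real shared file.
payload = bytes(range(256)) * 4096
hashes = piece_hashes(payload)
print(len(hashes), "pieces, each digest is", len(hashes[0]), "bytes")
```

A downloader that holds this list can check every piece on arrival instead of waiting for the whole file.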

      --
      Update Watch - Automatic software update notification
    2. Re:They have cracked strong hashes, huh? by mboverload · · Score: 5, Informative

      Bittorrent clients ban IPs that send them a certain number of bad pieces.

    3. Re:They have cracked strong hashes, huh? by BlacBaron · · Score: 2, Informative

      Patent application is here...

      http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=WO2005032111&F=0&QPN=WO2005032111

      I just skimmed over it, but it seemed to suggest their whole strategy revolved around having the "correct" original file with the right hash, then switching it for one with all the wrong data such that the client application didn't notice.

      They suggested keeping the beginning of the file the same so as not to let users determine it's dodgy straight away.

      As I said, I've only skimmed this, but to me this says things like BitTorrent are inherently immune. Kazaa possibly is not, as I'm not sure whether it has hashes of sections of a file.

      --
      Update Watch - Automatic software update notification
    4. Re:They have cracked strong hashes, huh? by CristianoMonteiro · · Score: 2, Informative

      well, if you know a way to generate a bogus packet with the same size and the same hash within a 2^256 bytes space (SHA1), please call NSA.

      As said in a previous post, there isn't enough matter in the universe to store 2^256 bytes of data and no computers in the known universe can calculate that amount of information in a reasonable time frame.

      --
      -------------------------------------------- Se você consegue ler aqui então fala português. Óbvio
    5. Re:They have cracked strong hashes, huh? by KalaNag · · Score: 2, Informative

      In fact, someone else already answered that. http://it.slashdot.org/comments.pl?sid=139986&cid=11723871

    6. Re:They have cracked strong hashes, huh? by redhog · · Score: 3, Informative

      Nah, you are both wrong. Two 160bit hashes are prolly somewhere in between as strong as a 320bit hash and a 160bit hash, depending on exactly how the hash values distribute over the input space. If the hash were perfect, the distance between any two hash values with one bit of difference would be the same. However, in reality, that would hardly be the case except for some hashes with a given data-to-hash-size ratio. But taking two random hash functions would probably combine into one where many bits are redundant (not the same bits for all hash values, of course). Hm, hope that goes for enough of an explanation. Otherwise, go read up on coding theory at mathworld.wolfram.com or wikipedia. A search for "Hamming distance" might also be a good start :)

      --
      --The knowledge that you are an idiot, is what distinguishes you from one.
    7. Re:They have cracked strong hashes, huh? by CDarklock · · Score: 2, Informative

      > Two 160bit hashes are prolly
      > somewhere in between as strong
      > as a 320bit hash and a 160bit
      > hash

      That's exactly what I'm saying. If the two hashes are completely independent -- zero bits of redundancy -- then you have a 320 bit hash. If they're completely redundant, you have a 160 bit hash. So the question is how independent MD5 and SHA1 are; if they're completely independent, then they combine to a 288 bit hash. If they're completely redundant, they combine to a 160 bit hash and you may as well just use SHA1.
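
To make the 288-bit figure concrete, here is a small sketch that concatenates an MD5 digest (128 bits) and a SHA-1 digest (160 bits). `combined_digest` is a hypothetical helper, not something any of the P2P clients discussed here is known to use. The length of the output says nothing about how independent the two functions are, which is the open question above.

```python
import hashlib

def combined_digest(data: bytes) -> bytes:
    """Concatenate the MD5 and SHA-1 digests of the same input."""
    return hashlib.md5(data).digest() + hashlib.sha1(data).digest()

digest = combined_digest(b"example payload")
print(len(digest) * 8, "bits")
```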

      The birthday attack isn't really relevant to practical hashing, anyway. Hashes collide; that's why we use them. When you use 128 bits to represent two megs of data, there's going to be something else that has the same hash. The existence of multiple messages with the same hash is a natural, normal, and NECESSARY quality of a hash function.

      --
      Microsoft cheerleader, blue flag waving, you got a problem with that?
  2. "Copyrighted" by As Seen On TV · · Score: 5, Informative

    It's "copyrighted," not "copywritten." We're talking about rights, not writings.

  3. Coral Cache by Anonymous Coward · · Score: 5, Informative

    I took the liberty of pre-caching the site on Coral before it went live - http://www.viralg.com.nyud.net:8090/index.html. I think Slashdot should really consider doing this as part of the procedure... this site won't last a minute under the weight of our collective, nerdy asses.

  4. Re:Already done by B3ryllium · · Score: 5, Informative

    By the time this is submitted, it will probably already be redundant (even though it's informative :)) - but the hashes are used for parallel download streams of the same file. So, if you saturate the network with the same hash, you can corrupt the data when the client automatically assumes it's the same file and tries to merge it with the other incoming data.

  5. Read this... by Virtual Karma · · Score: 2, Informative
    One of the big advantages of BitTorrent/Suprnova is the high level of integrity of both the content and the meta-data due to the working of its global components. We have shown that only 20 moderators combined with numerous other volunteers solve the fake-file problem on BitTorrent/Suprnova

    Read more here

  6. Link to the patent application by Zarhan · · Score: 4, Informative

    in pdf form

    Note the claims section and references - they keep talking about Napster and Kazaa - nothing about anything that uses hashes.

  7. Re:Already done by rkcallaghan · · Score: 4, Informative

    How will this be different from the flooding of fake files already on P2P networks like Kazaa? Sure, the hash will be the same, but what "Joe Sixpack" looks at hashes?!

    Joe Sixpack may not look at hashes, but his P2P software probably does. I know aMule uses the hash to match files that have had their names changed.

    ~Rebecca

  8. Seems bogus to me by gtoomey · · Score: 5, Informative
    It takes 2^69 operations to find collisions with SHA1

    Unless they have lots of supercomputer time, seeding the occasional p2p file with bad data will be very expensive.

    1. Re:Seems bogus to me by pjrc · · Score: 5, Informative
      Remember that those 2^69 "operations" (each many CPU cycles) are for a SHA1 "collision" attack. A "preimage" attack that would be necessary to inject corrupt data into a p2p network using SHA1 (such as Bittorrent) is much harder and has not been discovered and published.

      Quoting from the linked page:

      Q: What is a collision attack and a preimage attack?
      A: A preimage attack would enable someone to find an input message that causes a hash function to produce a particular output. In contrast, a collision attack finds two messages with the same hash, but the attacker can't pick what the hash will be. The attacks announced at CRYPTO 2004 are collision attacks, not preimage attacks.
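
The difference is easy to feel on a toy hash. The sketch below truncates SHA-1 to 16 bits (an assumption made purely so both searches finish quickly) and brute-forces both problems. `toy_hash`, `find_collision` and `find_preimage` are illustrative names, and this is plain brute force, not the CRYPTO 2004 technique.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes) -> int:
    """First 16 bits of SHA-1, as an integer (deliberately weak)."""
    return int.from_bytes(hashlib.sha1(data).digest()[:2], "big")

def find_collision():
    """Find any two different messages with the same toy hash."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

def find_preimage(target: int):
    """Find a message whose toy hash equals a value fixed in advance."""
    for i in count():
        msg = b"junk-" + str(i).encode()
        if toy_hash(msg) == target:
            return msg, i + 1

a, b, collision_tries = find_collision()
target = toy_hash(b"the genuine file")
forged, preimage_tries = find_preimage(target)
print("collision found after", collision_tries, "tries")
print("preimage found after", preimage_tries, "tries")
```

For an n-bit hash, brute force needs on the order of 2^(n/2) tries for a collision and 2^n for a preimage. Poisoning a download whose hash is already published is the second problem.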

    2. Re:Seems bogus to me by imsabbel · · Score: 2, Informative

      haha.

      A sha hash is what? 256bit?
      So you get 32 bytes per block.
      Now how many permutations can you get...
      Let's assume your p2p software uses block sizes of 4 bytes. For a complete database you would need 2^32*32 bytes = 128 GB.
      For a complete 8-byte set you would need 2^64*32 bytes.
      All the storage space in the world wouldn't even be enough for a 128-byte block, and bittorrent uses a minimum of 32 KB; edonkey even has a hash over the total file length.
      For 32 KB, there isn't enough matter in the universe to store enough information to get even a 1:10^50 chance of getting a hit.
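
The arithmetic above can be checked in a few lines. This sketch keeps the comment's assumed 32-byte (256-bit) digest, and `table_bytes` is a made-up helper that counts the storage for one digest per possible block.

```python
DIGEST_BYTES = 32  # the 256-bit digest size assumed in the comment above

def table_bytes(block_bytes: int) -> int:
    """Bytes needed to store one digest for every possible block of this size."""
    return 2 ** (8 * block_bytes) * DIGEST_BYTES

print(table_bytes(4) // 2**30, "GiB for 4-byte blocks")
print(table_bytes(8) // 2**60, "EiB for 8-byte blocks")
print("128-byte blocks need a byte count with", len(str(table_bytes(128))), "decimal digits")
```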

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
  9. bittorrent uses sha1 by pjrc · · Score: 2, Informative
    Hard to believe this is gonna work on bittorrent... the most important file sharing app in use today.

    The Bittorrent protocol uses SHA1 hashing.

    Yes, there was recently a paper presented that "broke" SHA1, but the result is 2**69 operations instead of 2**80 to find a SHA1 collision. 2**69 is still a very large number of operations... a lot less than a full 2**80, but still a prohibitively large number (more costly than the actual realized losses the entertainment industry is suffering).

  10. Experiences as a Finn by Anonymous Coward · · Score: 1, Informative

    From what I have noticed, using Kazaa-type software in Finland is nowadays a complete waste of time. What you get are exactly these files the company claims to have created. Sometimes you hear like 10 seconds of the actual song and the rest is just random noise.

    Now, I do not know if what they claim is technically true or whether it is this company that is behind all these files, but I can tell that in real life it is extremely hard for a "normal non-geek user" to find pirated music here in Finland anymore.

    Bittorrent and DC++ type of systems seem to be unaffected though.

  11. Blaaaaaah by mindriot · · Score: 4, Informative

    Not only the company's, but also the submitter's claim seems to be bogus. Neither the Inquirer article nor the viralg.com website seems to talk about hashes anywhere. Moreover, I'm kind of wondering where the Inquirer got their stuff from, since the viralg website contains... nothing. Nothing but blaah. No word at all on how they protect anything from anyone. A random link to the Finnish Top 40 allegedly showing how BMG became the market leader for domestic music. Umm, except that nothing whatsoever proves that Viralg had anything to do with it. (If you have evidence to the contrary, please post it!) Then there's some blurb claiming that being insiders with mathematical knowledge, up in the lonely north where there's nothing else to do, is what got them where they are. So, where are they? Not like they actually tell us. No contact information besides the email address either (and nothing in the whois info). Apparently, being up in the lonely north with nothing else to do doesn't get you much further than producing a nonsensical website claiming you know how to save the world, find the question to the answer to life, the Universe and everything, with "stunning results."

    And, breaking hashes, nonsense. If anything, maybe they are managing to manipulate P2P protocols to send you data you weren't supposed to be getting, but which is not actually going into the checksum?

    Nothing for you to see here, methinks... and here I am wasting my time actually writing a reply to a trollish article. :)

    On another random note, I kind of liked how their website looked in links.

    Empty. :)

  12. Re:Agreed by Jeremiah Cornelius · · Score: 2, Informative

    Shareaza has a "commenting" system for just this purpose.

    --
    "Flyin' in just a sweet place,
    Never been known to fail..."
  13. Pity, I thought md5 was unique by houghi · · Score: 1, Informative

    Or at least to be unique for each individual file per size. That would have meant that if you send the md5 sum plus the size info, you could in theory remake the file.

    So instead of sending 'cf878d4809930e3696d9c9c242a6f646 1450466 KB' and recalculating what the content was, I will just have to retrieve SL-9.3-LiveDVD-amd64.iso.

    Oh well, back to the drawing board.

    --
    Don't fight for your country, if your country does not fight for you.
  14. Re:Agreed by Jjeff1 · · Score: 2, Informative

    I've also heard MP3s that work fine on my PC, but skipped horribly on my car player. Different players handle corrupted or badly compressed files differently.

  15. incomplete downloads by TamMan2000 · · Score: 3, Informative

    I wonder why people who use P2P don't help each other out a little more. For example, you have someone with 200 files shared. They are downloading and sharing at the same time. Sometimes they download a bad file, and share it. It would make more sense to have an "unchecked" folder for downloads, then move it to the "checked" folder to share.

    That would break a feature which enables greater sharing... Uploading of parts of files that you do not have all of. Think BitTorrent, but less organized...

    --
    "I'll have a Guinness, no wait, make that a Coors Light" -Grad student I work with, who shall remain anonymous...
  16. Re:That sig is from diskworld, isn't it? by CharonX · · Score: 3, Informative

    Hehe, yup, it's one of the great lines HEX produced.
    I can really recommend Terry Pratchett's books to everyone.

    --
    +++ MELON MELON MELON +++ Out of Cheese Error +++ redo from start +++
  17. As someone who actually _does_ have a P2P attack.. by Effugas · · Score: 5, Informative

    It's a couple pages in my paper here. Basically, the first 300Kb of Kazaa's files are hashed normally, then every 32Kb chunk of the file is hashed independently. This allows independent chunks to be downloaded out of order. These out of order chunks are recursively hashed against one another to create one final value, called a "kzhash", which is verified after the file is downloaded.

    The attack is to use the recently released collision -- which creates two blocks that, when mixed against the default initial state of MD5, emit the same system state. Every 32K, you can embed one or the other in the file you're transmitting, and kzhash can't tell. What can you do with this? Morph a file as it traverses the network; have an installation executable describe the systems it's being installed on as it propagates through a network. With a fairly large installer, you'd get quite a few bits in there.

    You still don't get to do random noise, and while it's no Tiger Tree, kzhashing doesn't appear so exploitable that this group is likely to have anything. I could be wrong, but then, virtual algorithm? Right.

  18. Re:Interesting idea, how can we apply it to spam? by rbarreira · · Score: 3, Informative

    Your post advocates a

    (X) technical ( ) legislative ( ) market-based ( ) vigilante

    approach to fighting spam. Your idea will not work. Here is why it won't work. (One or more of the following may apply to your particular idea, and it may have other flaws which used to vary from state to state before a bad federal law was passed.)

    ( ) Spammers can easily use it to harvest email addresses
    ( ) Mailing lists and other legitimate email uses would be affected
    ( ) No one will be able to find the guy or collect the money
    ( ) It is defenseless against brute force attacks
    ( ) It will stop spam for two weeks and then we'll be stuck with it
    ( ) Users of email will not put up with it
    ( ) Microsoft will not put up with it
    ( ) The police will not put up with it
    ( ) Requires too much cooperation from spammers
    ( ) Requires immediate total cooperation from everybody at once
    ( ) Many email users cannot afford to lose business or alienate potential employers
    (X) Spammers don't care about invalid addresses in their lists
    ( ) Anyone could anonymously destroy anyone else's career or business

    Specifically, your plan fails to account for

    ( ) Laws expressly prohibiting it
    ( ) Lack of centrally controlling authority for email
    ( ) Open relays in foreign countries
    ( ) Ease of searching tiny alphanumeric address space of all email addresses
    ( ) Asshats
    ( ) Jurisdictional problems
    ( ) Unpopularity of weird new taxes
    ( ) Public reluctance to accept weird new forms of money
    ( ) Huge existing software investment in SMTP
    ( ) Susceptibility of protocols other than SMTP to attack
    ( ) Willingness of users to install OS patches received by email
    ( ) Armies of worm riddled broadband-connected Windows boxes
    ( ) Eternal arms race involved in all filtering approaches
    (X) Extreme profitability of spam
    ( ) Joe jobs and/or identity theft
    ( ) Technically illiterate politicians
    ( ) Extreme stupidity on the part of people who do business with spammers
    ( ) Dishonesty on the part of spammers themselves
    ( ) Bandwidth costs that are unaffected by client filtering
    ( ) Outlook

    and the following philosophical objections may also apply:

    (X) Ideas similar to yours are easy to come up with, yet none have ever been shown practical
    ( ) Any scheme based on opt-out is unacceptable
    ( ) SMTP headers should not be the subject of legislation
    ( ) Blacklists suck
    ( ) Whitelists suck
    ( ) We should be able to talk about Viagra without being censored
    ( ) Countermeasures should not involve wire fraud or credit card fraud
    (X) Countermeasures should not involve sabotage of public networks
    ( ) Countermeasures must work if phased in gradually
    ( ) Sending email should be free
    ( ) Why should we have to trust you and your servers?
    ( ) Incompatibility with open source or open source licenses [hey, it's Microsoft... they've probably already submitted the patent...]
    ( ) Feel-good measures do nothing to solve the problem
    ( ) Temporary/one-time email addresses are cumbersome
    ( ) I don't want the government reading my email
    ( ) Killing them that way is not slow and painful enough

    Furthermore, this is what I think about you:

    (X) Sorry dude, but I don't think it would work.
    ( ) This is a stupid idea, and you're a stupid person for suggesting it.
    ( ) Nice try, assh0le! I'm going to find out where you live and burn your house down!

    --

    The AACS key is NOT 0xF606EEFD628B1CA427BEA93A9CA9773F
  19. Re:Just an annoyance by merlin_jim · · Score: 4, Informative

    Actually we were both wrong; it is (2^keylength)^2 number of keys. However this number is equivalent to 2^(keylength*2), not 2^(keylength^2)

    Why would this not be "just double work"?

    First you find all files matching the first hash, then filter out one matching the second.

    And where exactly do you think the work is occurring? Computing the second hash. If you have one hash algorithm, you only have to match once. If you have two hash algorithms and you did it this way, you have to find enough matches with the first algorithm to hit one that also matches the second algorithm. This isn't twice as much work, this is twice as much keyspace (with each bit increase in keyspace representing twice the work).

    --
    I am disrespectful to dirt! Can you see that I am serious?!
  20. Re:Just an annoyance by mancontr · · Score: 2, Informative

    The file isn't comprobed only when complete, every chunk is comprobed when received. (BT:1/2mb,ED2k:10mb)

  21. Re:Just an annoyance by mancontr · · Score: 3, Informative

    I meant: The file isn't verified only when complete, every chunk is verified when received. (BT:1/2mb,ED2k:10mb) Sorry, me fail english... (that's not umpossible...)

  22. Re:Just an annoyance by Hannes Eriksson · · Score: 2, Informative

    Thank you for pointing out my mind slip.
    While I'm at it...

    With an 8-bit hash key, there are 256 possible keys. This means that 1/256 files will match the hash. With another hash function with 8-bit keys there are 1/256/256=1/65536=1/(256^2)=1/((2^8)^2) files matching the two keys. This keyspace is indeed the same size as that of a 16-bit key with the important difference that it is much easier to find matches if you can partition the search space.

    Picture yourself an unpainted 65536-piece square jigsaw puzzle (quite impossible for a human to do within a lifetime?).

    Now change your mental picture to a 65536-piece square jigsaw puzzle painted in 256 randomly ordered differently coloured vertical stripes. The solution for a column of the puzzle quickly degenerates into the work of solving an unpainted 256-piece 1-D puzzle (not so impossible, might take a couple of days). After doing 256 of those (might be a slight bit time-consuming, some years), the set of stripes represents another 256-piece puzzle (needing like another day to solve).

    This is not magic with large numbers, but the difference between brute force and the rest of the methods.

    For a 10MB file, there are 2^83886080 possible bit arrangements. 1/(2^32) of these (2^83886048) are collisions in a 32 bit key space. You wouldn't have to try them all to find enough collisions to find one which also makes a collision with another algorithm. Especially not if you know something about the algorithm.
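
The counting is easier to follow in exponents. A quick sketch, assuming 10 MB means 10 * 1024 * 1024 bytes and that hash values are spread evenly over all possible files (both are assumptions for the sake of the arithmetic; the variable names are made up):

```python
FILE_BYTES = 10 * 1024 * 1024
HASH_BITS = 32

# There are 2**file_bits distinct files of this exact size.
file_bits = FILE_BYTES * 8

# With evenly spread hash values, 1 in 2**HASH_BITS files shares a given hash,
# so the number of matching files is 2**(file_bits - HASH_BITS).
matching_exponent = file_bits - HASH_BITS

# Two independent 8-bit hashes give the same keyspace size as one 16-bit hash.
two_small_hashes = (2 ** 8) ** 2

print("files of this size: 2^%d" % file_bits)
print("files matching one 32-bit hash: 2^%d" % matching_exponent)
print("keyspace of two 8-bit hashes:", two_small_hashes)
```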

    --
    Geek rants since like... 2000 or something.
  23. Good Luck Poisoning Torrents by NFN_NLN · · Score: 2, Informative

    I've already looked into poisoning Torrents:

    1) There is a hash on the entire file (simple enough).
    2) The data shared from a torrent is broken up into pieces. Contributors can only send whole pieces (i.e. many people contribute to the entire file you're downloading, but only 1 person contributes to a given piece). AND EACH PIECE IS HASHED.

    Take a look at the .torrent for yourself. The .torrent contains the hash of every piece. So not only would you have to make a file of the SAME SIZE with the SAME HASH, but every 1MB (for example) would also need to have the SAME HASH. Not only that, but if you inject enough bad pieces you get booted (and yes, this can be tracked, because as I stated before, pieces come from a single individual).
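
A rough sketch of the check-and-boot behaviour described above, assuming SHA-1 piece hashes and a ban after three bad pieces. The `PieceVerifier` class, the threshold and the peer addresses are all hypothetical, and real clients differ in how they track and ban peers.

```python
import hashlib
from collections import Counter

MAX_BAD_PIECES = 3  # hypothetical threshold, not taken from any real client

class PieceVerifier:
    """Check each received piece against its expected hash and ban repeat offenders."""

    def __init__(self, expected_hashes):
        self.expected = expected_hashes
        self.bad_counts = Counter()
        self.banned = set()

    def accept(self, peer: str, index: int, piece: bytes) -> bool:
        if peer in self.banned:
            return False
        if hashlib.sha1(piece).digest() == self.expected[index]:
            return True
        # A piece comes from a single peer, so a bad hash can be blamed on that peer.
        self.bad_counts[peer] += 1
        if self.bad_counts[peer] >= MAX_BAD_PIECES:
            self.banned.add(peer)
        return False

good_pieces = [b"piece-0", b"piece-1"]
verifier = PieceVerifier([hashlib.sha1(p).digest() for p in good_pieces])

honest_ok = verifier.accept("10.0.0.1", 0, b"piece-0")
for _ in range(MAX_BAD_PIECES):
    verifier.accept("10.0.0.2", 1, b"junk data")
print("honest piece accepted:", honest_ok)
print("poisoner banned:", "10.0.0.2" in verifier.banned)
```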

  24. Re:claims? by SpecBear · · Score: 2, Informative
    Looks like a fraud/hoax/joke/whatever.
    • There's no text on the site. It's all images and flash animations. This immediately raises suspicions.
    • They claim that the technology has already been successfully used by BMG.
    • No Company info, phone number, or address, just a single email address
    • No details of how the tech works.
    • Claims 100% effectiveness.
    • Red alert phrase: "virtual algorithm"

    Anybody remember the name of that company that promised extremely high lossless compression rates on arbitrary files?
  25. Re:This is so stupid by ComputerizedYoga · · Score: 2, Informative
    That is assuming the "1337 hax0rs" don't get hold of the algorithms. I can just imagine people messing around with p2p networks just for fun.


    Early in the lives of gotwoot and scarywater (large, fairly well known fansub bittorrent tracker sites), they encountered ddos issues...

    People were using botnets and what amounts to trivial network code to send false complete requests to the trackers, volunteering as seeds. So, in a field of maybe 100-200 legitimate seeds, there would be ~30,000 fakes poisoning the tracker. The tracker couldn't tell they were fakes, so it was redirecting 99% of requests for blocks to the fakes advertising themselves as seeds (and eventually running out of memory as more bots were activated and the server broke under the load).

    The recent weaknesses found in md5 and sha1 also make block poisoning a possibility. Which opens the door to download pool poisoning. If an attacker can generate a block that checksums to a known good block, then the downloader will only be able to detect that poisoned block in a many-blocks hash, not in individual block hashes. This means that the bad block would be propagated before it was detected, and poison the whole larger block (chunk).

    Even further, clients would have no way of determining exactly which block is bad, so would have to discard the entire chunk and start again... and again, may very well end up with the poisoned data.

    That's assuming that the app is still using a broken hash though. This becoming a problem would probably force the application into a better hashing algorithm (the yet-unbroken sha256 over sha1 or md5, for example), or into complete unusability, assuming the attackers were determined enough to poison every file and to do so intently enough to make an impact.
  26. Re:Just an annoyance by ShiroPengin · · Score: 2, Informative

    >Why would this not be "just double work"? It is squared work.

  27. Re:Possible? Yeah by cryptoguy · · Score: 2, Informative

    All we can really say is that these researchers did not demonstrate a preimage attack. However what they did demonstrate should raise serious concerns that a preimage attack might be possible. For example, I could hash the latest blockbuster movie file, saving the internal MD5 state at the last iteration. Then, proceed with their algorithm, searching for a pair of two-block extensions to add to the file which lead to MD5 collisions of the entire file. If not, why not?

    Bottom line, attacks get stronger over time, never weaker. Once a crack appears, further probing generally widens the crack.

    MD5 is probably ok to use in a scenario where you don't expect an active adversary, or in a keyed hash where the security is protected by a secret key. But relying on MD5 to protect data integrity against a well funded adversary is foolish at this point.