Finnish Firm Claims Fake P2P Hash Technology
An anonymous reader writes "As reported by The Inquirer, a Finnish company known as Viralg Oy claim to have developed software that can create a junk file with the same hash as a genuine p2p download. This, according to the company, can altogether stop the sharing of copywritten files by flooding p2p networks with corrupt/junk data, which then spreads through the network, causing less and less of the original file to be available. However, with the resolve of the p2p userbase, is this software really going to 'beat all Peer 2 Peer pirates at their own game,' or simply prove a minor annoyance?"
People will always creatively find a way around everything!
Or they have cracked even the strong hashes. In which case they are really cool. I know Mr. Torvalds is Finnish, but I doubt even he could come up with algorithms to do that.
In their conceited press release, they have compared Spoofing vs DRP/a
Iran captures three CIA agents
Bah! Screw you guys. I'll just make my own P2P hash algorithm. With blackjack. And hookers. In fact, forget the P2p hash algorithm. And the blackjack.
It's "copyrighted," not "copywritten." We're talking about rights, not writings.
I guess there are two schools here.
One believes this kind of fake file will only add burden to the internet, as users will just download one fake file after another until they get a hit.
The other believes that such annoyance will put most people off, because the total time/cost it takes to acquire something is now higher than the actual product.
I don't think MP3s will be affected because you can start playing the song if you've got the first bit. Can/will other file formats do that too?
Rock that crushes, Paper & Scissors that don't matter.
I took the liberty of pre-caching the site on Coral before it went live - http://www.viralg.com.nyud.net:8090/index.html. I think Slashdot should really consider doing this as part of the procedure...this site won't last a minute under the weight of our collective, nerdy asses.
How big is that 'junk file'?
I've always thought it would be extremely possible to create a file with the same MD5 hash.
.. then I'll be impressed.
Now, what the company has to do is create a file of the SAME FILE SIZE, with the same MD5 hash that's a fake
= Grow a brain...
Check out my sci-fi/humor trilogy at PatriotsBooks.
What hashing algorithm do they claim to have broken so completely? Sounds like BS to me.
Don't blame me; I'm never given mod points.
Bullshit. "Virtual Algorithms" my ass.
... it only takes most pirates (at most) a week to find a work around and everything is back to (pirating) normal.
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
By the time this is submitted, it will probably already be redundant (even though it's informative :)) - but the hashes are used for parallel download streams of the same file. So, if you saturate the network with the same hash, you can corrupt the data when the client automatically assumes it's the same file and tries to merge it with the other incoming data.
They might be able to fake one hash, but don't most P2P networks use a combination of different hashes? If not, it would be easy to implement: you can either use more than one type of hash, like MD5 and SHA, or add salt/pepper to a chunk and compute any number of hashes, where each additional hash makes it insanely harder to crack.
This comment does not represent the views or opinions of the user.
Their site is down so I can't get any real details, but I think this is smoke and mirrors in any case.
I want a new world. I think this one is broken.
Use 2 (or more) different hashing algorithms on the file, and check the file size.
I'm pretty sure that should reduce the collisions to some stupidly small value.
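A minimal sketch of the "two hashes plus file size" idea, assuming Python's standard hashlib. The function names `fingerprint` and `matches` and the choice of MD5 plus SHA-1 are illustrative only, not taken from any real P2P client.

```python
import hashlib

def fingerprint(data: bytes) -> tuple:
    """Return (size, MD5 hex, SHA-1 hex) for a blob of file data."""
    return (len(data),
            hashlib.md5(data).hexdigest(),
            hashlib.sha1(data).hexdigest())

def matches(data: bytes, expected: tuple) -> bool:
    """Accept a candidate only if the size and both digests agree."""
    return fingerprint(data) == expected

genuine = b"pretend this is the real file"
junk = b"pretend this is a junk file!!"  # same length, different content

expected = fingerprint(genuine)
print(matches(genuine, expected))  # True
print(matches(junk, expected))     # False
```

A forger would have to match the size and both digests at once, which is the point being made; whether that is really stronger than one wider hash is a separate argument.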
Update Watch - Automatic software update notification
Read more here
fuvoo: watch something
in pdf form
Note the claims section and references - they keep talking about Napster and Kazaa - nothing about anything that use hashes.
How will this be different from the flooding of fake files already on P2P networks like Kazaa? Sure, the hash will be the same, but what "Joe Sixpack" looks at hashes?!
Joe Sixpack may not look at hashes, but his P2P software probably does. I know aMule uses the hash to match files that have had their names changed.
~Rebecca
Don't most P2P programs use MD5? I was also under the assumption that P2P programs do a checksum on each piece of the file they receive, and if it's inaccurate it automatically re-downloads that part of the file. I've had pieces of a bittorrent download fail due to corruption and the client has just downloaded that part again.
Seems like this company's setup would only work in very specific circumstances, meaning it won't have much of an effect at all.
You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
Unless they have lots of supercomputer time, seeding the occasional p2p file with bad data will be very expensive.
Or even better, use more than one. If file_x is hashed 10 different ways, using 10 different algorithms, there's no way the file generated by this firm will behave the same way for ALL of them, perhaps not even for two.
Quid festinatio swallonis est aetherfuga inonusti?
Africus aut Europaeus?
The time-vs-accuracy tradeoff is a big one. One client that some people I know use takes almost 48 hours to index a full hard drive of files to share and hash them all.
Anything less robust, and you're liable to have collisions, such as these, apparently. Anything more, and if you have a lot of files, there's a major time commitment before you can actually begin to serve anything -- most people aren't willing to have their CPU pegged for 2 days straight while their P2P client hashes their 35,000 MP3s and 200 movies, or so.
Isn't the whole point of a hash that it's computationally infeasible to create a file such that H(new file) = H(original)?
If this technology is real, it'll completely undermine the safety of today's Unix passwords, which are stored only as their hashes.
If I have one of these files and share the hell out of it, I better not be contacted by RIAA. If this spreads, not only will it make sharing difficult, it will make tracking legitimate (haha) piracy more difficult to detect. This (sort of) reminds me of a more high tech version of the time everyone started changing the name of their tracks to things like "Br1tn3y Sp34rs" to evade blocked searches.
Using multiple hashes is a hash algorithm itself. If someone found a general way to crack hashes, then they'd be able to crack this new 'super' hash just as easily. All you'd really be doing is creating a hash with more bits. Might as well use the "best" hashing algorithm and increase the width.
autopr0n is like, down and stuff.
Let's just concede they can actually produce a junk file which has the same hash. I'll even skip over which hash - let's also say it's one of the useful ones.
I'd be tempted to step up the credentials for a file, say one hash for the entire file, and another for the first 1kb, and so on. It should get significantly harder with each additional verification point.
What is neat, or not so neat depending on your point of view, are music files which deteriorate after a while. I don't know how they are made, but I have listened to music that sounds pretty good, but after the 10th playing it starts skipping. Or it could be those skips are not very noticeable when first played, but once identified, they become annoying.
Rosco: "If brains were gunpowder, Enos couldn't blow his nose."
I'm switching to hashish.
Linux - Because Mommy taught me to Share.
If someone can really poison P2P networks with junk that hash matches (and I have a difficult time believing they've cracked all the hash generators), then consider some hypothetical entity probing illicit distribution of copyrighted material using hashes. They could end up making false accusations against individuals for trading trash instead of Trash©.
"Provided by the management for your protection."
The Bittorrent protocol uses SHA1 hashing.
Yes, there was recently a paper presented that "broke" SHA1, but the result is 2**69 operations instead of 2**80 to find a SHA1 collision. 2**69 is still a very large number of operations... a lot less than a full 2**80, but still a prohibitively large number (more costly than the actual realized losses the entertainment industry is suffering).
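To put the 2**69 versus 2**80 figures in perspective, here is a rough back-of-the-envelope calculation. The rate of one billion hash operations per second is an assumed number picked purely for illustration, not a measurement of any real hardware.

```python
# Rough cost comparison for finding a SHA-1 collision.
BRUTE_FORCE_OPS = 2**80          # generic birthday-style search
ATTACK_OPS = 2**69               # figure quoted for the published attack
ASSUMED_OPS_PER_SECOND = 10**9   # assumption, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600

speedup = BRUTE_FORCE_OPS // ATTACK_OPS
years = ATTACK_OPS / ASSUMED_OPS_PER_SECOND / SECONDS_PER_YEAR

print(f"speedup factor: {speedup}")              # 2048
print(f"years at the assumed rate: {years:,.0f}")
```

So the attack is about two thousand times cheaper than brute force, yet at the assumed rate a single machine would still need on the order of 18,700 years.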
PJRC: Electronic Projects, 8051 Microcontroller Tools
Here is a tool specifically designed to cripple the flow of data; how can it be thought of as anything but a virus? Should it work, I could see TV and movie studios using it surreptitiously to cripple net-based fledgling media companies.
This should be outlawed just like any other intentionally malevolent software. Why shouldn't everyone write viruses and malware when the big guys do it and the government sanctions it? This is just the kind of thing that keeps web commerce from taking off to its full potential.
Letter To Iran
If increasing the noise ratio on P2P networks is a good thing, maybe we can use a similar technique to defeat spammers?
For example, if we could pollute spammers' email address databases with millions of bogus e-mail addresses, then instead of delivering millions of spam e-mails to real e-mail accounts every day, maybe spammers could only reliably send a few hundred to users, the rest of their messages would be to bogus addresses and be "noise" that spammers have to deal with.
How could we go about doing this?
I don't know the meaning of the word 'don't' - J
What will they do when people like the files with random noise better than any of the current music?
I Am My Own Worst Enemy
You can always ensure an identical hash and size by filling the file with identical data and then uploading the new file to the P2P network. Imagine how quick filesharing would stop if all of the major industry groups started doing this. P2P wouldn't stand a chance, no siree.
The hash is generally generated on the client side of the original uploading system - and the validity of the file can only be checked once the file has been fully downloaded. So to break the system, just modify one of the open source clients to report a particular hash for some random file of the same size as the original. There isn't any need to go to the effort that these guys have.
For example, you send the company a copy of the .mp3 file you want to drive out of circulation. They feed it to a computation cluster and eventually out comes another file which has the same hash. You then publish this new file with the same filename on the victim P2P network and hope that it spreads enough to poison the P2P well, so to speak. There are a number of problems with this scheme (assuming of course that this is the sort of scheme that they offer):
So say the RIAA tries to sue you, saying they saw that you had the newest 50 Cent album on Kazaa. Couldn't you claim that what you had was not 50 Cent's album, but random files with the same hash as 50 Cent's MP3s? I mean, can't you fight the RIAA with its own weapons? If they completely destroy the mechanism for determining what files you currently have, then how does their claim that you had X file hold any merit at all?
Not only the company's claim but also the submitter's seems to be bogus. Neither the Inquirer article nor the viralg.com website seems to talk about hashes anywhere. Moreover, I'm kind of wondering where the Inquirer got their stuff from, since the viralg website contains... nothing. Nothing but blaah. No word at all on how they protect anything from anyone. A random link to the Finnish Top 40 allegedly showing how BMG became the market leader for domestic music. Umm, except that nothing whatsoever proves that Viralg had anything to do with it. (If you have evidence to the contrary, please post it!) Then there's some blurb claiming that being insiders with mathematical knowledge up in the lonely north, where there's nothing else to do, is what got them where they are. So, where are they? Not like they actually tell us. No contact information besides the email address either (and nothing in the whois info). Apparently, being up in the lonely north with nothing else to do doesn't get you much further than producing a nonsensical website claiming you know how to save the world, find the question to the answer to life, the Universe and everything, with "stunning results."
:)
:)
And, breaking hashes, nonsense. If anything, maybe they are managing to manipulate P2P protocols to send you data you weren't supposed to be getting, but which is not actually going into the checksum?
Nothing for you to see here, methinks... and here I am wasting my time actually writing a reply to a trollish article.
On another random note, I kind of liked how their website looked in links.
Empty.
"This, according to the company, can altogether stop the sharing of copywritten files by flooding p2p networks with corrupt/junk data"
Slashdot should rejoice at this! Since none of us download illegal material and nobody that any of us knows downloads illegal material, this technology might allow us to continue our legal, legitimate downloading of media and only target those handful of ruffians who engage in illegal filesharing. I'm all in favor of this!
"I have never won a debate with an ignorant person." -Ali ibn Abi Talib
P2P clients, when they search for files, receive alleged hashes from where? The peers that claim to have them. And since most of these protocols have been reverse-engineered by now, I suspect that this program just combines a random-data generator with a multi-network untrustworthy P2P client. It'll sit on a network and respond to searches, report the expected filename, filesize, and hash (whatever algorithm is used), and wait for people to bite.
There is no technological way of verifying that the other peer is telling the truth (or at least there won't be unless the whole world implements some sort of Orwellian "Trusted Computing" requirement), aside from downloading the whole file and verifying it against the expected hash. No hash algorithms need be broken. I mean, once the whole file is downloaded, what does it matter to them whether the hash really matches? Why would even an idiot keep a downloaded file just because the program says it's verified and the size matches, if he can clearly see that the file doesn't work?
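A small sketch of why no hash needs to be broken for this to work: the advertised hash is just a claim, and the only check available is after the data has arrived. The `Advert` structure and `verify_download` function are hypothetical names for illustration, not part of any real P2P protocol.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Advert:
    """What a peer claims to have (hypothetical structure)."""
    filename: str
    size: int
    sha1_hex: str

def verify_download(advert: Advert, data: bytes) -> bool:
    """Only possible once the bytes are actually in hand."""
    return (len(data) == advert.size
            and hashlib.sha1(data).hexdigest() == advert.sha1_hex)

real = b"real song data"
advert = Advert("song.mp3", len(real), hashlib.sha1(real).hexdigest())

# A lying peer repeats the same advert but serves junk of the same size.
junk = bytes(len(real))

print(verify_download(advert, real))  # True
print(verify_download(advert, junk))  # False
```

The junk is caught, but only after the bandwidth has already been spent, which is all the attacker needs.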
Signature.
Y'all are missing the point.
These guys are not about taking out P2P.
They are part of a denial of service attack against the RIAA and MPAA, and we need more companies like them in order to make it effective.
You see, it works like this:
1) Make up a really snazzy-sounding anti-piracy product,
2) Back it with lots of sexy buzzwords and hand-waving
3) Sell, sorry LICENSE, it for lots of money to the (RI|MP)AA.
4) When it fails to perform, let in the next guy ready to do the same.
Repeat until (RI|MP)AA bank accounts have been depleted.
How about a hash of the entire file, plus a hash of every 128 KB segment? Constructing a file that matches all of the 128 KB section hashes, plus the overall hash, is a much more difficult problem.
Plus, you know after downloading only 128 KB that the file is not the real deal. Assuming a 128-bit hash such as MD5, it only takes 8 * 128 bits, or 1024 bits (128 bytes), of hash information per megabyte of download -- really only a few packets to communicate the hash list for, say, a 10 MB file. The benefit for this cost is
- early detection of corrupt download
- difficulty of creating a corrupt download
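The whole-file-plus-segment scheme could be sketched like this. MD5 is an assumption here (the comment does not name a hash), and `build_manifest` and `segment_ok` are made-up names for illustration.

```python
import hashlib

SEGMENT_SIZE = 128 * 1024  # 128 KB segments, as proposed above

def build_manifest(data: bytes) -> dict:
    """Hash the whole file and every 128 KB segment of it."""
    segments = [data[i:i + SEGMENT_SIZE]
                for i in range(0, len(data), SEGMENT_SIZE)]
    return {
        "size": len(data),
        "file_hash": hashlib.md5(data).digest(),
        "segment_hashes": [hashlib.md5(s).digest() for s in segments],
    }

def segment_ok(manifest: dict, index: int, segment: bytes) -> bool:
    """Check one downloaded segment without waiting for the rest."""
    return hashlib.md5(segment).digest() == manifest["segment_hashes"][index]

data = bytes(range(256)) * 4096  # 1 MB of sample data
manifest = build_manifest(data)
hash_list_bytes = sum(len(h) for h in manifest["segment_hashes"])

print(len(manifest["segment_hashes"]), hash_list_bytes)  # 8 128
print(segment_ok(manifest, 0, data[:SEGMENT_SIZE]))      # True
print(segment_ok(manifest, 0, bytes(SEGMENT_SIZE)))      # False
```

With 16-byte digests the segment list costs 128 bytes per megabyte, and a junk segment is rejected as soon as its 128 KB have arrived.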
Now suppose that, in BitTorrent-like fashion, I could download each 128 KB segment from a different location.
I'll see your senator, and I'll raise you two judges.
P2P is a technology. Yes it can be used for copyright violations, just like a photocopy machine or tape recorder. But it also has amazing possibilities in terms of creating a universal organic archive. Crippling like this -- and through using lawsuits -- is an unnecessary attack on a system in its infancy.
The copyright issues will work themselves out -- until the 20th century human art and ingenuity survived for thousands of years without the ability to make millions selling recorded music and video. If p2p has a major effect on the entertainment industry's ability to profit (and I'm still not convinced that it really will), human art and culture will survive. And people will continue to find ways to make a living creating art.
I wonder why people who use P2P don't help each other out a little more. For example, you have someone with 200 files shared. They are downloading and sharing at the same time. Sometimes they download a bad file, and share it. It would make more sense to have an "unchecked" folder for downloads, then move it to the "checked" folder to share.
That would break a feature which enables greater sharing... Uploading of parts of files that you do not have all of. Think BitTorrent, but less organized...
"I'll have a Guinness, no wait, make that a Coors Light" -Grad student I work with, who shall remain anonymous...
Hehe, yup, it's one of the great lines HEX produced.
I can really recommend Terry Pratchett's books to everyone.
+++ MELON MELON MELON +++ Out of Cheese Error +++ redo from start +++
The RIAA can put out "evil clients" that find good files and lie to the tracker telling the tracker it's a bad file.
Unless the tracker double-checks the file itself, or has some way to trust the clients it's getting reports from, it's vulnerable to being lied to.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
It's a couple pages in my paper here. Basically, the first 300Kb of Kazaa's files are hashed normally, then every 32Kb chunk of the file is hashed independently. This allows independent chunks to be downloaded out of order. These out of order chunks are recursively hashed against one another to create one final value, called a "kzhash", which is verified after the file is downloaded.
The attack is to use the recently released collision -- which creates two blocks that, when mixed against the default initial state of MD5, emit the same system state. Every 32K, you can embed one or the other in the file you're transmitting, and kzhash can't tell. What can you do with this? Morph a file as it traverses the network; have an installation executable describe the systems it's being installed on as it propagates through a network. With a fairly large installer, you'd get quite a few bits in there.
You still don't get to do random noise, and while it's no Tiger Tree, kzhashing doesn't appear so exploitable that this group is likely to have anything. I could be wrong, but then, virtual algorithm? Right.
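For readers who want the chunk-then-combine idea in concrete form, here is a generic sketch. It is NOT Kazaa's actual kzhash: the pairing rule, the use of MD5 at every level, and the function names are all assumptions made for illustration, and the separate hash of the first 300 KB is left out.

```python
import hashlib

CHUNK_SIZE = 32 * 1024  # 32 KB chunks, as described above

def chunk_hashes(data: bytes) -> list:
    """Hash every chunk independently so chunks can be checked out of order."""
    return [hashlib.md5(data[i:i + CHUNK_SIZE]).digest()
            for i in range(0, len(data), CHUNK_SIZE)]

def combine(hashes: list) -> bytes:
    """Hash neighbouring pairs repeatedly until one value remains.

    Assumes at least one chunk hash is supplied.
    """
    level = list(hashes)
    while len(level) > 1:
        level = [hashlib.md5(b"".join(level[i:i + 2])).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

data = bytes(range(256)) * 1024  # 256 KB of sample data, i.e. 8 chunks
root = combine(chunk_hashes(data))

tampered = bytearray(data)
tampered[5 * CHUNK_SIZE] ^= 0x01  # flip one bit inside chunk 5
print(combine(chunk_hashes(bytes(tampered))) == root)  # False
```

An ordinary modification to any chunk changes the final value. The collision attack described above works precisely because the swapped blocks leave each chunk's MD5 unchanged, so nothing further up the tree can notice.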
I don't know how the search functions work in Kazaa etc., but can't you just send a match to all queries with a fake client? Is there a real data integrity check built into Kazaa clients?
Quidquid latine dictum sit, altum sonatur.
eMule definitely helps you better yourself.
Patience is a virtue, right?
It's been a long time.
Oh, I get Mr. Schneier's thing and I'm not behind on the news; I am under the impression that there have not been demonstrated preimage attacks on MD5, which is what I was referring to.
Re: SHA-1:
These are not theoretical results but actual collisions.
Again, here it is preimage attacks that are the problem, not just any collisions. But the results mentioned in the link are NOT actual collisions, just an algorithm to produce those collisions that might be feasible to run sometime soon. They didn't actually calculate any collisions. So not "actual collisions", but a "theoretical result". But that's just pedantry, sort of.
Anyway, as far as preimage goes SHA-1 is certainly still secure, as is -- I believe -- MD5, and this is what's relevant in downloading. If they are not, please point me to the appropriate thing.
xkcd.com - a webcomic of mathematics, love, and language.
I've already looked into poisoning torrents:
1) There is a hash on the entire file (simple enough).
2) The data shared from a torrent is broken up into pieces. Contributors can only send whole pieces (i.e. many people contribute to the entire file you're downloading, but only one person contributes to a given piece). AND EACH PIECE IS HASHED.
Take a look at the .torrent for yourself. The .torrent contains the hash of every piece. So not only would you have to make a file of the SAME SIZE with the SAME HASH, but every 1 MB piece (for example) would also need to have the SAME HASH.
Not only that, but if you inject enough bad pieces you get booted (and yes, this can be tracked, because as I stated before, pieces come from a single individual).
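A rough sketch of the "bad pieces get you booted" behaviour. The `PieceChecker` class, the peer names, and the threshold of three bad pieces are all hypothetical; real clients have their own rules.

```python
import hashlib
from collections import Counter

MAX_BAD_PIECES = 3  # assumed threshold, for illustration only

class PieceChecker:
    """Verify pieces against known SHA-1 hashes and ban repeat offenders."""

    def __init__(self, piece_hashes: list):
        self.piece_hashes = piece_hashes
        self.bad_counts = Counter()
        self.banned = set()

    def accept(self, peer: str, index: int, piece: bytes) -> bool:
        if peer in self.banned:
            return False  # ignore anything from a banned peer
        if hashlib.sha1(piece).digest() == self.piece_hashes[index]:
            return True
        # Each piece comes from one peer, so the blame is unambiguous.
        self.bad_counts[peer] += 1
        if self.bad_counts[peer] >= MAX_BAD_PIECES:
            self.banned.add(peer)
        return False

pieces = [b"piece-0", b"piece-1"]
checker = PieceChecker([hashlib.sha1(p).digest() for p in pieces])

print(checker.accept("good-peer", 0, b"piece-0"))  # True
for _ in range(MAX_BAD_PIECES):
    checker.accept("poisoner", 1, b"junk")
print("poisoner" in checker.banned)                # True
print(checker.accept("poisoner", 1, b"piece-1"))   # False
```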
file1.dat:
00000000 d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
00000010 2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
00000020 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
00000030 08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
00000040 96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
00000050 35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
00000060 75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
00000070 ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3
MD5(file1.dat) = a4c0d35c95a63a805915367dcfe6b751
file2.dat:
00000000 d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
00000010 2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
00000020 55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
00000030 08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
00000040 96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
00000050 35 73 9a 47 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
00000060 75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 4c 15 5c
00000070 ed 74 cb dd 5f c5 d3 6d b1 9b 0a 58 35 cc a7 e3
MD5(file2.dat) = a4c0d35c95a63a805915367dcfe6b751
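If you want to check the two dumps yourself, something like the following should do it. It assumes the hex above was transcribed correctly; the script just rebuilds both files, lists the offsets where they differ, and compares the MD5 digests computed by Python's hashlib.

```python
import hashlib

FILE1_DUMP = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
35 73 9a c7 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 cc 15 5c
ed 74 cb dd 5f c5 d3 6d b1 9b 0a d8 35 cc a7 e3
"""

FILE2_DUMP = """
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
96 0b 1d d1 dc 41 7b 9c e4 d8 97 f4 5a 65 55 d5
35 73 9a 47 f0 eb fd 0c 30 29 f1 66 d1 09 b1 8f
75 27 7f 79 30 d5 5c eb 22 e8 ad ba 79 4c 15 5c
ed 74 cb dd 5f c5 d3 6d b1 9b 0a 58 35 cc a7 e3
"""

def parse(dump: str) -> bytes:
    """Turn a whitespace-separated hex dump into raw bytes."""
    return bytes.fromhex("".join(dump.split()))

file1 = parse(FILE1_DUMP)
file2 = parse(FILE2_DUMP)

# Offsets at which the two 128-byte files differ.
diff = [i for i, (a, b) in enumerate(zip(file1, file2)) if a != b]

print(len(file1), len(file2))
print([hex(i) for i in diff])
print("MD5 equal:", hashlib.md5(file1).hexdigest() == hashlib.md5(file2).hexdigest())
```

The files differ in only six bytes, each by a single flipped high bit, which is why the dumps look nearly identical at a glance.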
For SHA1, you are correct. They presented an algorithm for finding collisions in full 80-round SHA1, and demonstrated the correctness of the algorithm on SHA1 reduced to 58 rounds. Here is the SHA1 announcement:
http://theory.csail.mit.edu/~yiqun/shanote.pdf
Sorry, that level of doublethink is only allowed for corporate lawyers. Your lawyer will be smacked down for trying it, since it is not a defense permitted to second-class citizens (see earlier post).
Freedom: "I won't!"
All we can really say is that these researchers did not demonstrate a preimage attack. However what they did demonstrate should raise serious concerns that a preimage attack might be possible. For example, I could hash the latest blockbuster movie file, saving the internal MD5 state at the last iteration. Then, proceed with their algorithm, searching for a pair of two-block extensions to add to the file which lead to MD5 collisions of the entire file. If not, why not?
Bottom line, attacks get stronger over time, never weaker. Once a crack appears, further probing generally widens the crack.
MD5 is probably ok to use in a scenario where you don't expect an active adversary, or in a keyed hash where the security is protected by a secret key. But relying on MD5 to protect data integrity against a well funded adversary is foolish at this point.