RIAA Tracking Songs by MD5 Hashes
aSiTiC writes "Apparently RIAA has obtained some technical experts in their prosecution of file swappers. Currently they are tracking traded mp3 files from the Napster network by matching MD5 hashes. This seems quite interesting but I was under the assumption that identical hashes could be created with identical rips and id3v2 tagging. Now may be the time to update your illegal mp3 file MD5 hash sums."
ya think? and here i thought it was the magical mp3 fairy who put mp3s on my hd...
As far as I know, you will get identical hashes from identical files with the same ID3. How can they track files with the help of MD5 hashes?
What if I own the CD but got files off the Internet because I was too lazy to rip them? Would I still be expecting to be sent to the prison camp?
In other news, all songs produced by RIAA artists in the last 10 years all have the same MD5 hash anyway, because they're all the same.
"If you want to improve, be content to be thought foolish and stupid." - Epictetus
What if you just normalize or edit the beginning or the end of the song? Does the MD5 hash still work?
The md5 hashing algorithm has been proven to contain flaws allowing two files to produce identical md5 sums.
The only way for two files to have the same MD5 hash is for them to both be encoded with the same encoder, from the same WAV file, with the same bitrate and all advanced options, and to have exactly the same ID3 information, the same filesize, and to be identical to the last bit.
Otherwise, the MD5 will be nothing like the same for two perfectly identical songs where one has a spelling error in one field of the ID3 tag. I imagine for any one song there are many, many different MD5 sums out there, although perhaps one or another good-quality version would exist on hundreds of different PCs...
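To make the point above concrete, here is a minimal Python sketch using the standard hashlib module. The "MP3" bytes and the tag layout are invented stand-ins, not a real file; the only thing it is meant to show is that bit-identical data hashes identically and a one-letter tag change does not.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 hex characters."""
    return hashlib.md5(data).hexdigest()

# Stand-in for an MP3: made-up "audio" bytes followed by a made-up tag.
audio = bytes(range(256)) * 64
rip_a = audio + b"TAG" + b"Beat It".ljust(30, b"\x00")
rip_b = audio + b"TAG" + b"Beat it".ljust(30, b"\x00")  # one letter differs

digest_a = md5_hex(rip_a)
digest_b = md5_hex(rip_b)
digest_copy = md5_hex(bytes(rip_a))  # a bit-for-bit copy of rip_a

print(digest_a)
print(digest_b)
print("bit-identical copy matches:", digest_a == digest_copy)
print("one-letter tag change matches:", digest_a == digest_b)
```

The two printed digests share nothing recognizable even though the inputs differ by a single letter.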
I only trade plumber porn pics. Should I be worried?
Will they start sending subpoenas to AOL/TW customers this time?
Gee ... I would have thought that most people had moved on from Napster to BitTorrent, KAZAA or eDonkey/Overnet
This space for rent. All reasonable inquiries will be entertained at proprietors discretion.
Hmm, isn't that how K-Sig, built into Kazaa Lite K++, works: by tracking MD5 hashes so ppl get exactly the file they want?
Changing MD5 hashes on songs to avoid RIAA would also lessen the effectiveness of K-Sig. Trading hashes of known working files was one of the ways ppl on P2P avoided downloading those fake RIAA files.
Now may be the time to update your illegal mp3 file MD5 hash sums.
I sincerely hope this is tongue-in-cheek. For all the self-righteous, pompous sabre-rattling that goes on in here about how good Slashdotters only possess MP3's that are ripped from personal collections, I would certainly hope that we wouldn't stoop so low as to blatantly and openly be trading tips on how to avoid getting caught doing illegal things.
What's next? A HOWTO on setting up an encrypted file system for our child porn?
Apparently RIAA has obtained some technical experts in their prosecution of file swappers. Currently they are tracking traded mp3 files from the Napster network by matching MD5 hashes
...
After all, in these dot-bust days, it's still possible to get a nice highly paid job and be called an expert by putting the right spin on strcmp() in your resume.
"A door is what a dog is perpetually on the wrong side of" - Ogden Nash
It is generally believed amongst file traders that it is legal to download an mp3 for a song, when you own the CD. In other words, you don't need to rip and encode songs from your own CD. However, this may not be true (I am not a lawyer).
The RIAA is using MD5 hashes as a basis for proof that the individual in question downloaded the files they are sharing, instead of ripping them from their own CD collection. This is supposed to show the individual is a willing participant in stealing and distributing music, instead of someone who is just sharing what they already own. But, see above.
I think this is mostly just a FUD tactic. They can talk to the media about how their MD5 hashes prove so-and-so is a big mean pirate hacker. MD5 hash certainly sounds scary, especially when the technology is described by the media as a tool used by hackers.
---
I support spreading santorum
They are really fighting a losing battle.
Exchanging music is not about piracy, it is about exchanging culture, just like when my grandfather lent me some old jazz records and said, "here, you might like this".
Today culture moves at the speed of light and the RIAA believes it has the right to tax this movement. It cannot succeed except by destroying the Internet.
I'm starting to believe, watching this debate evolve over many years, that the file traders are right, for the wrong reasons.
Human culture depends on exchange of ideas and information, and music and films are a large part of this in today's world. No album, no movie scene, no written text is a personal creation, they are all taken from the pool of common culture, modified, and redistributed.
Seeking all means to do this faster than ever - and ignoring the barriers, such as "ownership", that stand in the way - is the prerogative of today's world. We simply can't put the genie back into the bottle and start exchanging pieces of paper and vinyl discs again.
The debate is huge, but the results already seem clear: any laws designed to stop the process from continuing will be further and further ignored until they are seen by a majority of people to be useless vestiges of a material-obsessed past.
Ceci n'est pas une signature
modprobe loop
modprobe cryptoloop
modprobe aes
losetup -e aes /dev/loop0 /dev/hdb1
(input password)
mke2fs -j /dev/loop0
mount -t ext3 /dev/loop0 /home/kombat/pr0n
enjoy!
I am a viral sig. Please help me spread.
Are we sure they're actually using MD5? The article doesn't even contain the string "md5" that I can see. It mentions hashes though, but there's something called Robust Hashing which can be used to identify, or at least, compare content in a "fuzzy" way.
Belief is the currency of delusion.
The only problem is that a lot of file sharing software uses the fact that 2 files (from different sources) have the same hash in order to swarm the download from multiple sources. If everybody goes around intentionally making their mp3s have different hashes, swarming basically won't work anymore.
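A rough Python sketch of why that is, assuming a hypothetical client that groups sources purely by MD5 (real P2P clients use their own hash schemes; the peer names and bytes here are invented):

```python
import hashlib
from collections import defaultdict

def group_sources_by_hash(offers):
    """offers: iterable of (peer, file_bytes). Returns {md5_hex: [peers]}."""
    swarms = defaultdict(list)
    for peer, data in offers:
        swarms[hashlib.md5(data).hexdigest()].append(peer)
    return dict(swarms)

song = b"\xff\xfb" + b"fake mp3 frames" * 100
offers = [
    ("peer1", song),
    ("peer2", song),
    ("peer3", song + b"\x00"),  # same song, padded by one byte to dodge matching
]

swarms = group_sources_by_hash(offers)
largest = max(swarms.values(), key=len)
print("distinct hashes:", len(swarms))
print("peers usable for one swarmed download:", largest)
```

The padded copy lands in a swarm of its own, so a downloader can pull from two peers at most instead of three.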
No, I don't want a free iPod
Ok guys.. let's all give it up. Let's delete all our MP3's and start buying CD's now. The RIAA has clearly won!
Hail to the king!
I want my karma, and I want it now!
Audio rippers aren't always perfect AFAIK.
... or even competent! How many rippers can't get the tagging right when the song and artist ARE PRINTED RIGHT THERE ON THE LOUSY CD COVERSLIP! Sheesh! Learn the difference between Meat Loaf and Leo Sayer for cryin' out loud!
"Lawyers are for sucks."
- Doug McKenzie
I think this sums it up!
The article does not mention MD5 anywhere, so one cannot assume this is the technology they are using in their proof. As the technical information in this article has more than likely gone through several iterations of "dumbing down", we cannot say what technology is being used. It is quite feasible that they are comparing segments of the encoded information with files that were grokked from Napster (pre 2001). Additionally, as very few people change all the information contained within the ID3 tags ("meta information" from the article?), it may be enough to show how unlikely they are to match unless the file is from the same source. For example, if I insert the string "whateverbarcodezwashere" into some obscure field within the ID3 tag of an MP3 and it appears in an MP3 file on someone else's computer, it is likely that they originated from the same source. For the record, it is conjectured that it is astronomically unlikely that two randomly chosen different byte sequences will produce the same MD5 hash.
----
Just change the ID3 tag on all the files and that will break any existing MD5 checksums. Even adding a capital letter will do it.
Rus
Ummm, I paid for a CD the other day but I want to listen to it on my MP3 player. The CD is copy protected. I run linux. The only way I can listen to it via mp3 is to, yup, download an 'illegal' mp3! Whoever thought that up was a fscking genius.
--
This sig is inoffensive.
And what, pray tell, did she steal?
Let's see someone put together an app that flips bits here and there within MP3s to make each one it runs against unique enough to create a new MD5 hash!? (I would, but I can only program in a pseudo-language ;) It could even be as simple as adding in a trailing byte to all of your MP3s, though that could be easily filtered. Hell, if you can hide messages within compressed JPEGs without noticeably affecting their quality, why not do something similar to MP3s just to jack up this sort of tracking!?
"1984" was ment to be a warning, not a guidebook. You hear that Kim Jong-il!? BushCo?!
Uh, it's not like the hash is in the file. It's computed from the file. You could write something in Winamp that randomly changed bits in your music, and that would change the hash, but it would also slowly corrupt your music until you had static.
If the hash is using ID3 tags, you could change some unused field in there, but there would be a much smaller number of permutations available (although probably still enough to be useful).
Pretty much no rip is identical.
First step: the *.wav is ripped. Using libcdparanoia, which I personally prefer, I find slight variation in size depending on the machine and CD-ROM drive I rip them on.
Second step: encoding on different machines, with different encoders, using different algorithms, using different levels of floating point precision, on different architectures etc... produces vastly different files.
Third step: sharing. Oftentimes an mp3 is downloaded 99.8% before the connection is broken. You keep the mp3 because mp3 is a sequential file format and you only lose a second or two of music. The rest of the file is intact.
Their md5 searching scheme could be circumvented quite easily by changing a comment in the id3 but they could get around that by cutting out the id3 part of the file when they make their md5sum.
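Here is a minimal Python sketch of that "cut out the ID3 part" idea. It assumes the common layouts only: an ID3v2 tag at the start of the file (10-byte header with a syncsafe size) and a 128-byte ID3v1 block at the end. The helper names and the fake files are made up for illustration, and nothing here is known to be what the RIAA actually does.

```python
import hashlib

def strip_id3(data: bytes) -> bytes:
    """Drop a leading ID3v2 tag and a trailing ID3v1 tag, if present."""
    # ID3v2 header: "ID3", version (2 bytes), flags (1 byte), syncsafe size (4 bytes).
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)  # 7 useful bits per byte
        end = 10 + size
        if data[5] & 0x10:  # footer flag (ID3v2.4) adds 10 more bytes
            end += 10
        data = data[end:]
    # ID3v1: fixed 128-byte block at the very end, starting with "TAG".
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        data = data[:-128]
    return data

def audio_md5(data: bytes) -> str:
    return hashlib.md5(strip_id3(data)).hexdigest()

# Hypothetical helpers that build fake tags for the demo below.
def syncsafe(n: int) -> bytes:
    return bytes([(n >> 21) & 0x7F, (n >> 14) & 0x7F, (n >> 7) & 0x7F, n & 0x7F])

def fake_id3v2(payload: bytes) -> bytes:
    return b"ID3" + b"\x03\x00" + b"\x00" + syncsafe(len(payload)) + payload

def fake_id3v1(comment: bytes) -> bytes:
    return b"TAG" + comment.ljust(125, b"\x00")

audio = b"\xff\xfb\x90\x00" + bytes(4096)  # stand-in for the MP3 frames
copy_a = fake_id3v2(b"ripped by A") + audio + fake_id3v1(b"comment A")
copy_b = fake_id3v2(b"a longer note from B") + audio + fake_id3v1(b"B")

whole_match = hashlib.md5(copy_a).hexdigest() == hashlib.md5(copy_b).hexdigest()
audio_match = audio_md5(copy_a) == audio_md5(copy_b)
print("whole-file MD5s match:", whole_match)
print("audio-only MD5s match:", audio_match)
```

So retagging alone would not help against someone who hashes only the audio frames; the audio data itself would have to change.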
The downside to this is that if you are searching for music on something like gnutella by the ***sum, the content would differ and you would not get as many results. Gnutella would not download from multiple sources because the file would not have the same signature.
Whatever the case, it is clear that some form of file obfuscation is now needed for safety online. Or we can wait for freenet to mature.
Imagine the MD5 hash as a solution, and the original file as the question. The MD5 hash might contain the number '5', but you wouldn't know whether the question asked was 2+3 or 4+1. You do know, however, that the question wasn't 3+1 or 2+2.
If you download the question, you can check that the solution matches the expected solution. If so, the download is good.
Note, this is a very simplified version, using a pretty poor analogy. I'm sure there's a website that explains this better.
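In code, the "check that the solution matches" step might look like the Python sketch below. The file names and contents are invented, and the expected hash is assumed to come from some trusted listing.

```python
import hashlib
import os
import tempfile

def file_md5(path, chunk_size=65536):
    """Hash a file in chunks so a large download need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def download_is_good(path, expected_md5):
    """True if the file on disk hashes to the published value."""
    return file_md5(path) == expected_md5.lower()

payload = b"pretend this is a complete mp3" * 1000
expected = hashlib.md5(payload).hexdigest()  # the published "solution"

with tempfile.TemporaryDirectory() as folder:
    complete = os.path.join(folder, "complete.mp3")
    truncated = os.path.join(folder, "truncated.mp3")
    with open(complete, "wb") as handle:
        handle.write(payload)
    with open(truncated, "wb") as handle:
        handle.write(payload[:-2000])  # connection dropped near the end
    complete_ok = download_is_good(complete, expected)
    truncated_ok = download_is_good(truncated, expected)

print("complete download verified:", complete_ok)
print("truncated download verified:", truncated_ok)
```

Note the check only says "same" or "not the same". It gives no hint of how close a mismatching file is.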
What good evidence-destroying/hiding mechanisms are there around, apart from deleting and overwriting the area several times? How about something that can kill the hard drive even when the computer's off? I see crime scenes on the news all the time with police carrying out computer cases for examination. It always struck me that you could fit tamper protection in your computer: any attempt to move it, open the case, or do anything without proper authorisation would cause the HD to torch itself. This could be as simple as a battery inside with enough power to boot the machine quietly and very quickly destroy the data; the police would have no time to stop it. While all this is probably illegal itself, it could be better than being sued for $50000 per song or whatever their price is?
:)
I hope the next kazaa lite comes with file altering/deleting/anti-riaa utilities
This comment does not represent the views or opinions of the user.
No, we need to create a honeypot farm. You remember that article way back when on Slashdot? It described how to implement a whole farm. Then we strictly prohibit scanning of the networks for MD5 checksums. Since RIAA is using bots, they won't read the warning and will fire off the subpoena. When you get a subpoena, then you slam them with a computer crime lawsuit. See, you can still get rich from RIAA. But how do you get illegal MD5 checksums without possessing the files? If you wanna screw with RIAA you have to be damned sure that you're right.
The views expressed are mine own and do not express the views of my employer.
From the NAPSTER network??? This is worse than I thought - it appears the RIAA has built a Time Machine! Next they will be going further back than Napster and prosecuting free-thinking pilgrims who would share their newspapers.
Yikes.
I suppose that (if it's possible) you would either want to swamp these guys with false positives, or distribute the hash keys and the files somehow to make it more difficult and protracted to discover who actually owns that file.
I suppose that one viable option in P2P would be a freenet model where downloading involves a number of encrypted hops between peers to search or get the data, and where peers cache popular data and indexes in encrypted form. It would be much, much harder to figure out who shared that file then.
Obviously there is a trade off going this route. You wouldn't want the sluglike performance of Freenet so it would not be as secure, but I'm sure you could reduce the number of hops and other measures and still make life massively more difficult for RIAA and their ilk to track down your activities.
You're right in that it is possible to have the same MD5 sum for multiple files, but the chance of it happening is extremely small for two reasons.
The first reason is that MD5 has 128 bits to describe the file, meaning that there is a 1 in 2^128 chance that any given random bitstream will have the same MD5 sum (Of course, MP3s aren't all that random in portions of the file format, but the basic argument still stands).
The second reason is the very process of verification. In order to verify a file, you must already have a checksum of the original file to compare it to, and you have a file which you think could be the same file, meaning file names and file sizes are already identical. If those files differ by as much as one bit, then they will produce different checksums. If you're willing to try to match a file named "ISO of Windows XP" with a file size of 650.1MB versus a file named "ISO of Mandrake" with a file size of 643.8MB then you're already sure that they're not the same file by the filesize alone.
In short, possible, but extremely unlikely.
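To put rough numbers on "extremely unlikely", here is a small Python calculation. It assumes MD5 output behaves like a uniformly random 128-bit value and only covers accidental matches, not deliberately constructed ones. The 10 billion file count is an arbitrary illustration, and the any-two-files figure uses the usual birthday approximation.

```python
import math
from fractions import Fraction

HASH_BITS = 128
SPACE = 2 ** HASH_BITS  # number of possible MD5 values

# Chance that one given random file matches one given hash.
specific_match = Fraction(1, SPACE)

def any_collision_probability(n_files):
    """Birthday approximation: chance that any two of n_files share a hash."""
    pairs_over_space = n_files * (n_files - 1) / (2 * SPACE)
    return -math.expm1(-pairs_over_space)  # 1 - exp(-x), accurate for tiny x

n_files = 10 ** 10  # arbitrary: ten billion distinct files
p_any = any_collision_probability(n_files)

print("possible MD5 values:", SPACE)
print("given file vs. given hash: 1 in", specific_match.denominator)
print("any accidental collision among", n_files, "files:", p_any)
```

Even the any-two-files figure comes out many orders of magnitude below anything you would meet by accident.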
http://news.bbc.co.uk/1/hi/entertainment/music/3187695.stm
:wq
Maybe someone should write an email virus that listens on the Kazaa ports and reports back gigs and gigs of shared mp3's to anyone who asks.
Then, when people get busted, they can say "It was a virus".
Of course, this would make the search feature of Kazaa useless...
From the article:
Copyright lawyers said it remains unresolved whether consumers can legally download copies of songs on a CD they purchased rather than making digital copies themselves.
So it's still up in the air. But here's where I get confused:
For example, the industry disclosed its use of a library of digital fingerprints, called "hashes," that it said can uniquely identify MP3 music files that had been traded on the Napster service as far back as May 2000.
By comparing the fingerprints of music files on a person's computer against its library, the RIAA believes it can determine in some cases whether someone recorded a song from a legally purchased CD or downloaded it from someone else over the Internet.
Okay, how? Only way I can see is if they have a HUGE-ASS library of mp3s downloaded from Napster that they can check every file against. Seems unlikely that "nycfashiongirl's" copy of "Beat It" would match exactly with one in the RIAA's library.
The recording industry also disclosed that it is examining so-called "metadata" tags, hidden snippets of information embedded within many MP3 music files. In this case, lawyers wrote, they found evidence that others -- including one user who called himself "Atomic Playboy" -- had recorded the music files and that some songs had been downloaded from known pirate Web sites.
Now it's making more sense. I don't think they're using hashes at all. I think they're checking the ID3 tags for stuff like "ripped by 4t0m1c P14b0y - www.atomicplayboy.com."
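Checking for that kind of thing takes very little code. Below is a Python sketch that reads the old fixed-layout ID3v1 block (the last 128 bytes of a file) and looks for a ripper credit in the comment field. The marker string, the function names, and the fake file are my own inventions; the article does not say what tool or tag version the RIAA examined.

```python
def read_id3v1(data: bytes):
    """Parse a trailing 128-byte ID3v1 block; return None if there is none."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(raw: bytes) -> str:
        # Fields are fixed width, padded with NULs or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
    }

def ripper_credit(data: bytes, markers=("ripped by",)):
    """Return the comment field if it contains one of the marker strings."""
    tag = read_id3v1(data)
    if tag is None:
        return None
    comment = tag["comment"]
    if any(marker in comment.lower() for marker in markers):
        return comment
    return None

# Build a fake file: made-up audio bytes plus a hand-assembled ID3v1 block.
fake_tag = (
    b"TAG"
    + b"Beat It".ljust(30, b"\x00")
    + b"Michael Jackson".ljust(30, b"\x00")
    + b"Thriller".ljust(30, b"\x00")
    + b"1982"
    + b"ripped by 4t0m1c P14b0y".ljust(30, b"\x00")
    + b"\x0c"  # genre byte
)
fake_file = b"\xff\xfb\x90\x00" + bytes(2048) + fake_tag

credit = ripper_credit(fake_file)
print(read_id3v1(fake_file))
print("credit found:", credit)
```

No hashing is involved at all, which fits the "metadata" part of the article better than the "fingerprint" part.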
So really it should read something like "Using a surprisingly astute technical procedure, the RIAA examined song files with an advanced file analysis application, iTunes, and found evidence of references to Atomic Playboy." The article of course, doesn't mention whether it was possible for them to plant the evidence, which it would've been if they were simply allowed to possess her hard drive and weren't required to make any backup copies for the judge.
Of course, if, in her defense, she counters with "well yeah, not all of them were ripped from the physical CDs, lots of times I'd want to listen to one of my CDs, and I couldn't find it, so I'd just download it -- but here is my CD collection for evidence, your honor," then there's going to be an interesting precedent set -- is it okay to download songs that you already own on CD?
Also, she's in court not so much for downloading, but for uploading, which is much more of a crime (have they even sued anyone for just downloading yet?), and it really doesn't matter where she got the songs, just that she was sharing them.
If I use KaZaa to access indie artists who are
sharing their songs - as is their right - AND I
also rip my entire 1000+ CD/LP/8track collection
to the same computer AND I intelligently store
all the files in the same hierarchy.
Have any laws been broken?
KaZaa is configured to share everything in my
hierarchy so that the indie songs can continue to
be shared.
Have any laws been broken?
I go in for Jury Duty, meanwhile Another Kazaa
user downloads the indie shared files.
Have any laws been broken?
Another Kazaa user downloads the rips from my
personal collection because their 8track player
is on the fritz.
Have any laws been broken?
Another Kazaa user downloads the rips from my
personal collection because their LPs were
destroyed in a flood.
Have any laws been broken?
Another Kazaa user downloads the rips from my
collection because they want to see what the
latest Madonna single sounds like before going
out and buying the CD.
Have any laws been broken?
If any laws were broken here - who broke them?
Just because I leave the front door open does not
mean that anyone can enter and take what they
want from my house. Same as my computer.
The action of downloading is in question, not
making the article available.
YMMV. Consult a lawyer.
From the article:
...
Copyright lawyers said it remains unresolved whether consumers can legally download copies of songs on a CD they purchased rather than making digital copies themselves.
By comparing the fingerprints of music files on a person's computer against its library, the RIAA believes it can determine in some cases whether someone recorded a song from a legally purchased CD or downloaded it from someone else over the Internet.
So, the RIAA has been downloading illegal copies of music for years, in fact probably has a huge library of music. Simultaneously, in their broad sword efforts to completely end p2p, they're arguing that it's illegal to download songs you've already bought. So, even if the RIAA has gone through all the hoops with this library, obtaining licenses for each song they swiped off of file traders in their investigations -- which I doubt; recall Microsoft's slip ups -- they're arguing that the methods they've been using to track down illegal file traders are actually illegal themselves! In fact, the RIAA might have the largest collection of illegal music of anyone, even larger than mine! Of course, this should come as no surprise: after all of the attempts to make it legal for them to attack suspected infringers' PCs, it's pretty clear that the RIAA's privilege and property make them above the law.
How long is it until a P2P client is created which appends a half second of noise to the end of everything you download, thus modifying the checksum?
I can see it now... "And in recent news, according to the RIAA there are over 10 billion songs being traded. The organization is quoted as saying 'We intend to sue individual users for having more songs than we've created...'"
Revealed: How RIAA tracks downloaders
(Music industry discloses some methods used)
There is an interesting pattern here:
- Someone comments that the IP laws have not kept up with technological and social change, and that they are now impeding the cultural goals they originally served. They may have made sense when we were limited to exchanging physical objects, but they don't make sense now.
And the responses are along the lines of:
The respondents are completely missing the point. To see this, imagine what the discussion might have looked like if it had happened way back when:
- The rule about not eating X hasn't kept up with the times. It made sense when we didn't know about the parasites, but now that we know how to clean and cook them it doesn't make sense.
I suspect the responses would have been along the lines of:
Every time I see this played out, my response is, "Gee, IP law really is dying, isn't it?", with the same sort of awe I had watching little bits of sand wash downstream at the bottom of the Grand Canyon.
-- MarkusQ
they're only likely to match if they're from the same place. hence illegal copies.
"if i'd known it was harmless, i'd have killed it myself"
The ripping stage can also produce slightly different checksums, depending on the condition of the CD - Audiograbber actually reports "potential speed errors". Unlike data CDs, some level of read error is considered acceptable on music CDs; you don't want the player to keep re-trying a bad sector if it detects a big problem - it would ruin your listening pleasure!
When I am king, you will be first against the wall.
The same story is posted on CNN.com. Accompanying this article is one by Marci A. Hamilton, a chairman at Benjamin N. Cardozo School of Law, Yeshiva University. She states that going after students who illegally download media is not only OK, but is RIGHT. I wouldn't have a problem with this were it not for the reasons she supports it with. She says that a world without copyright laws would cater only to the rich and the government. When was the last time you heard of a government worker writing a song on the top 10 list? When was the last time a millionaire, (not a musician) created a song that made it to the hall of fame? My point is, without free music/media, many of the people who come up with the latest and greatest entertainment would never see any of the media that's out there. Marci claims to be looking out for the poor country music singers in her article. If they're as poor as she says, how are they ever going to be able to afford a CD at $15 a piece???
http://www.cnn.com/2003/LAW/08/07/findlaw.analysis.hamilton.music/index.html
Musicians and music labels alike need to come to grips with the fact that their moneymaker, (CD sales) will need to take a back seat to actual performances by the artist. We need to take it back to the old days when music artists actually sang and performed and didn't just sit in a dark room behind some curtain tooling away on their synthesizer.
With all this hash talk going on, I thought I'd mention that Musicbrainz uses some sort of similarity hash in identifying songs. It compares the hashes of the files you have to an existing user submitted database. If the match is good, then you can use the database tag info, which is pretty handy.
I've compared albums I've ripped myself to the database and gotten "100%" matches (along with some matches of a much lower percentage) That leads me to think that if the RIAA kept its own database like that, they could do a whole lot of comparison with similarity or quasi-unique (ala MD5) hashes. I'd also venture that, with enough work at the comparison system, they could make court-valid assertions. They can hire plenty of geeks to handle the statistics necessary to call something 'beyond a reasonable doubt.' (for criminal proof)
The MD5 thing isn't for tracking the same song ripped by different people. The thread on this, so far, has left me scratching my head as to why folks feel the need to restate that encoding an mp3 with different settings/software will result in a different md5. Right, this is slashdot and we all know this already.
The reason for md5 matching is so they can nail someone as the 'origin' of the ripped song, then hold them liable for all the copies of a matching md5 on P2P networks. It would be more a demonstration of "look how much damage one copy did to us!".
I believe what they are referring to is a system that takes a sample of a song (let's say 30 seconds) and generates a 'hash' based on that... The thing about this system is that it is a loose hash, meaning that changing one bit does NOT necessarily change the hash. It is a sonic fingerprint (Not in the digital watermark sense), so that in theory if you had a direct CD-ripped wave, and an analog rip from a cassette as a wave (for instance), you could match the two files, even though they are FAR from bit-for-bit exact.
This is what they mean when they say hash. NOT md5. Obviously MD5 could not track an mp3, since changing even one character in the ID3 tag would change the whole hash.
So they probably have an automated downloader that then generates a fingerprint from the downloaded file and compares it to a db of fingerprints to determine if the song is copyrighted. I'd bet that's all.
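To illustrate what a "loose" hash means, here is a toy Python sketch. It is emphatically not any real product's algorithm: it cuts a clip into frames and records whether the loudness rises or falls from one frame to the next, which is my own simplification. The clips are synthetic tones, and the "tape copy" is simulated by lowering the volume and mixing in a faint interfering tone.

```python
import hashlib
import math
from array import array

RATE = 8000   # samples per second (arbitrary for this toy)
FRAMES = 32   # number of frames each clip is cut into

def make_clip(tone_hz, swell_period, gain=1.0, hiss=0.0):
    """Synthesize 4 seconds of a tone whose loudness swells and fades."""
    clip = []
    for n in range(RATE * 4):
        loudness = 0.5 + 0.4 * math.sin(2 * math.pi * (n + 500) / swell_period)
        tone = math.sin(2 * math.pi * tone_hz * n / RATE)
        noise = hiss * math.sin(2 * math.pi * 1234 * n / RATE + 1.0)
        clip.append(gain * loudness * tone + noise)
    return clip

def loose_fingerprint(samples, frames=FRAMES):
    """One bit per adjacent frame pair: 1 if the energy rises, else 0."""
    frame_len = len(samples) // frames
    energies = [
        sum(s * s for s in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(frames)
    ]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]

def hamming(fp_a, fp_b):
    """Number of positions where two fingerprints disagree."""
    return sum(x != y for x, y in zip(fp_a, fp_b))

def pcm_md5(samples):
    """MD5 of the clip stored as 16-bit PCM, for comparison."""
    pcm = array("h", (int(s * 32767) for s in samples))
    return hashlib.md5(pcm.tobytes()).hexdigest()

original = make_clip(440, 8000)
tape_copy = make_clip(440, 8000, gain=0.7, hiss=0.02)  # quieter, with interference
other_song = make_clip(660, 3000)

fp_original = loose_fingerprint(original)
fp_tape = loose_fingerprint(tape_copy)
fp_other = loose_fingerprint(other_song)

print("MD5 equal (original vs tape copy):", pcm_md5(original) == pcm_md5(tape_copy))
print("fingerprint distance, original vs tape copy:", hamming(fp_original, fp_tape))
print("fingerprint distance, original vs other song:", hamming(fp_original, fp_other))
```

The MD5s of the two versions differ, while the loose fingerprints stay close for the degraded copy and far apart for the unrelated clip. Real acoustic fingerprinting works on frequency content and is far more robust than this.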
Just out of curiosity... Did you have insurance? Did they write you a check for the CDs you lost in the fire? I doubt it, but if it had happened, would you still feel you had already "paid for" the CDs, and simply thumb your nose at the RIAA and Big Insurance and download the files, as you'd already "paid for" them?
I promise, I'm not begging to be flamebait. I'm really curious.
Where does the line get drawn between physical property and intellectual property, and what rights do you have if you HAD purchased it, but it's gone now? I mean, I can't go to the lot and get another car because mine is destroyed in a fire. Of course, I could go take a picture of it...but I could do that anyway.
I'm curious.
Any sufficiently well-organized Government is indistinguishable from bullshit.
An excellent example.
The statue sits there, the result of laborious work by its creator (made possible thanks to a decade of training at the hands of other masters, but that's another story).
Now the statue is in the hands of a private collector who charges people to view it. He claims he owns it, but the state decides that the statue is far too important. They buy it, and put it on public display. Now everyone can see it, be inspired by it, make rough imitations, photos, even tiny or full-scaled replicas.
Which is preferable? Which results in a better and richer culture?
Clearly no theft occurs by looking at the statue, except that the original owner cannot claim his viewing rights any longer.
This is the best metaphor for digital culture: totally intangible, yet very important. The discussion of "rights" and "theft" and "ownership" is meaningful only insofar as the direct artist is concerned. All other parties are unavoidably biased, and finally it is the common interest that must prevail.
It is clearly impossible to restrict all creations to "pay per view". Impossible and stupid, for people will simply turn elsewhere and make their own, or steal to view. Culture does want to be free, as you know very well because you are here on Slashdot, proving that point exactly.
Comparing Kazaa users with suicide bombers, burglars, and corporate thieves is fanciful slander, and you know it.
Ceci n'est pas une signature
I was under the impression that MP3 (MPEG-1, Layer 3) was a lossy algorithm. Even with the same ripper settings working off the same stored raw CD audio file, will it actually produce identical output? Can the MP3 encoder drop different bits as irrelevant on different passes in time on the same data with the same settings? If this is indeed the case (I don't know, I am not familiar with the detail of the algorithm), then MD5 sums become a virtually foolproof way to identify a file, since an identical sum can only be produced from the exact source MP3, not one that is close. Just a thought on that matter.
And a second point, more of an idea really... Has anyone thought of trapping RIAA? Here is my proposal...
1) Go and buy 50-100 CDs from your local music stores (I know, this is abhorrent since you are lining the pockets of the people you want to fight, but it is a means to an end). SAVE ALL THE RECEIPTS! You will need these.
2) Download a popular P2P program and sign on.
3) Go download crazy and download an MP3 for EVERY SINGLE SONG on the pack of CDs you just purchased. Be obvious, be a bandwidth pig, get someone's attention.
4) Take screenshots and printouts of the directories containing your "booty". This will establish the timestamps of when they were downloaded. Sign and date the screenshots, preferably with witnesses who sign them as well.
5) Wait for a subpoena from RIAA.
6) Join RIAA in court and argue "fair use" by throwing up your stack of legally purchased CDs and the receipts for them, clearly indicating that they were purchased PRIOR to the supposed infringement and that you simply wanted MP3s of CDs you own but lacked the knowledge/skill/time/tools to rip them.
Is such a case copyright infringement? It's a dangerous game to play because the fair use doctrine has been supported, but it is not a matter of law. The outcome could be undesired because it could cause a rethinking of what constitutes fair use.
The fun part of such rethinking could be the broadening of what is considered infringement into areas where it was not infringement and ignite an absolute firestorm.
Fair Use is about the right to quote portions of one work within another, as a means of making commentary, criticism, or parody. See Stanford's explanation or Title 17, Chapter 1, Section 107 of the Copyright law.
You might argue that it's 'reasonable' to download an MP3 file that corresponds to a track from a CD that you own, but it's simply not 'Fair Use'.
Here's what I do: Bitty Browser & Andromeda
Don't we already pay a small tax to the recording industry every time we buy blank audio CDs (but not data CDs)? I'd like to see some lawyer fight a case claiming that a P2P user has already paid the RIAA and is therefore exempt from their lawsuits when downloading the music and burning it to an audio CD. That would be an interesting lawsuit.
Although I may not have said it as well as I could have, that is the basis of my question. If the RIAA continues to make copyrighted CDs and shuts down P2P services, what am I to do when I have a damaged disc? I can't make a backup even though I am entitled to one, and I can't grab the files off of P2P because no one will give me access to the file out of fear of being sued. Now the RIAA can start making discs more fragile and easier to scratch, and I will be forced to buy the same disc over and over during the course of my lifetime. But I just want to listen to the damn song. Isn't it great to be a consumer in America?
The Tools Of Ignorance wanna be a tool?
Hashing and compression aren't really my thing so maybe someone could clarify my understanding.
I was under the impression that hashes are not reversible the way compression algorithms are, but that they try to add as much chaos as possible between slightly different variations of the original. (The same way the telephone company racks up money by having area codes be very distant from each other; a typo in the area code probably means big bucks for a wrong number.)
My spreadsheet of 1997 budget information could produce the same hash as a rip of Meco's Star Wars disco theme remix, but it would be unlikely to produce a hash similar to that of my 1996 budget information (which is practically the same other than 1996 being 1997). None of these would ever compress to the same result using a lossless compression scheme (or they might be in for a surprise when they uncompressed their Meco track).
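To put a number on the "chaos" idea, here is a minimal sketch using Python's standard hashlib. The two byte strings are made-up stand-ins for the 1996 and 1997 spreadsheets, differing by a single character; the helper names are just for illustration.

```python
import hashlib

def md5_bits(data: bytes) -> int:
    """Return the 128-bit MD5 digest of data as an integer."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")

def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 128 digest bits differ between two inputs."""
    return bin(md5_bits(a) ^ md5_bits(b)).count("1")

# Hypothetical stand-ins for the two spreadsheets: one character apart.
budget_1996 = b"Household budget for 1996: rent, food, CDs"
budget_1997 = b"Household budget for 1997: rent, food, CDs"

print(hashlib.md5(budget_1996).hexdigest())
print(hashlib.md5(budget_1997).hexdigest())
print(differing_bits(budget_1996, budget_1997), "of 128 bits differ")
```

For a well-mixed hash you would expect roughly half of the 128 bits to flip, so the two digests look unrelated even though the inputs are nearly identical.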
Producing a unique result for each file is what a compression algorithm does. If a hash were truly unique and reversible then you'd have a compression algorithm, right?
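A quick way to see why a hash can't double as compression: the digest is the same size no matter what goes in. This is only a sketch with made-up inputs, using Python's hashlib.

```python
import hashlib

# Hypothetical inputs of very different sizes: empty, one byte, one megabyte.
inputs = [b"", b"a", b"x" * 1_000_000]
digest_sizes = [len(hashlib.md5(data).digest()) for data in inputs]
print(digest_sizes)  # every MD5 digest is 16 bytes

# There are 2**128 possible digests but 2**136 possible 17-byte files,
# so on average 256 different 17-byte files must share each digest.
files_per_digest = 2 ** (17 * 8) // 2 ** 128
print(files_per_digest)
```

Since many inputs map to every digest, there is nothing to "uncompress" back to; the hash only tells you that two inputs are very probably the same bytes.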
Now to make this relevant to this case...
Could someone make an MP3 from MD5 generator? It'd create an MP3 with the goal of having exactly the same MD5 hash as the original song. Admittedly it'd probably sound like a confusion of radio static and Husker Du. Probably not anyone's cup of tea to listen to, but it might wind up being just the sort of edge case to make MD5 hashes insufficient evidence in court (especially if the defendant had a nose ring). If this isn't possible, then perhaps it could make a JPG from MD5 generator? Visual noise is much more appealing to many than audible noise and probably easier to create.
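The obvious way to build such a generator is brute force: keep producing junk until its hash matches. Here is a rough sketch of that idea, with one big assumption: it only tries to match the first 16 bits of the MD5 digest (PREFIX_BYTES is my own knob, not anything a real tool uses), because matching all 128 bits this way is far out of reach.

```python
import hashlib
from itertools import count

PREFIX_BYTES = 2  # assumption: match only the first 16 bits, not all 128

def short_hash(data: bytes) -> bytes:
    """First PREFIX_BYTES bytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:PREFIX_BYTES]

def find_lookalike(target: bytes) -> bytes:
    """Try junk inputs until one matches the target's truncated hash."""
    wanted = short_hash(target)
    for i in count():
        candidate = b"static-%d" % i
        if candidate != target and short_hash(candidate) == wanted:
            return candidate

# Hypothetical stand-in for the original song's bytes.
original = b"pretend these bytes are the original MP3"
fake = find_lookalike(original)
print(fake, short_hash(fake).hex(), short_hash(original).hex())
```

Matching 16 bits takes on the order of 2^16 tries. Every extra bit doubles the work, so matching the full digest by guessing would take on the order of 2^128 tries.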
Everyone is missing the point here with the MD5 hashes.
OK, if you use the defaults in your MP3 encoder, and the ID3 tags from CDDB the *encoding* would be the same, but not the end file. Know why?
The ripping process differs greatly - you've got things like scratches on discs that some CD-ROMs will pick up as errors and some won't, you've got pauses due to slow processor/HD on different computers etc.
The only way I'd say to get an identical file would be to rip it using the same computer, encoder and CDDB - in which case "Jane Doe" must have been the original producer of the Napster file if the KazaA one matches it (or she copied it from someone else).
She's guilty as Hell, but personally I support her as the RIAA/MPAA are scum.
#include <sig.h>
Maybe they're speculating that the jury will immediately succumb to the magic word 'hash'.
But otherwise, frankly, I don't see what this could be good for. Hashes (whether MD5 or SHA or some other algorithm) don't prove a thing.
Identity: The identity of the hashes of two MP3s only proves that the MP3s were encoded with identical settings from an identical CD source. If two people, one in NY and the other in LA, buy the latest Red Hot Chili Peppers album and rip and encode it, both on Windows machines using identical versions of RealOne (or any encoder), then the resulting MP3s will have identical hashes. Whether the probability of two different files accidentally having the same hash is 1 in 2 or 1 in 2^127 is absolutely irrelevant here. The chances of two people using the same software with the same CDDB information to rip the same track from a CD that sold a million copies are a lot higher. Anybody with half an episode of Matlock's worth of legal expertise will tear the RIAA's position apart on this ground.
Trackability: Hashes cannot be used to reliably track the path of copies across P2P networks either. Since the hash is more sensitive to minor changes than the ear, making random changes to the ID3 tags or randomly changing a bit or two somewhere in the MP3 will wipe the tracks.
So two files having the same hash doesn't prove they come from a single origin. Two files having different hashes doesn't prove they don't come from a single origin.
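Both halves of that are easy to demonstrate. The sketch below uses made-up bytes in place of real encoder output, so it assumes (as argued above) that two independent rips with identical settings really do produce identical bytes.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

# Hypothetical stand-in for the bytes an encoder would produce.
audio = bytes(range(256)) * 64

# Two people who never exchanged anything end up with the same bytes.
rip_in_new_york = bytes(audio)
rip_in_los_angeles = bytes(audio)

# A copy of one of those files with a single bit flipped.
edited = bytearray(audio)
edited[100] ^= 0x01
edited = bytes(edited)

same_hash_different_origin = md5_hex(rip_in_new_york) == md5_hex(rip_in_los_angeles)
different_hash_same_origin = md5_hex(edited) != md5_hex(audio)
print(same_hash_different_origin, different_hash_same_origin)
```

The hash only sees bytes. It carries no record of who produced them or where a copy came from.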
Hashes don't prove a thing
Remember that the MD5 hashes are the values used by popular P2P software to enable synchronized multi-source downloading of a file. If everybody "sharing" modifies files to affect MD5 hash values, then the P2P networks essentially fall apart into single source FTP-like downloading.
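A toy illustration of that, not any real client's protocol: the peer names and the group_sources helper are invented, and all it does is bucket peers by the MD5 of what they share.

```python
import hashlib
from collections import defaultdict

def group_sources(shared_files):
    """Map each MD5 hex digest to the peers offering exactly those bytes."""
    sources = defaultdict(list)
    for peer, data in shared_files:
        sources[hashlib.md5(data).hexdigest()].append(peer)
    return dict(sources)

# Hypothetical song bytes shared unmodified by three peers.
song = b"pretend MP3 bytes" * 1000
untouched = [("peer-a", song), ("peer-b", song), ("peer-c", song)]

# The same three peers after each appends a different junk byte.
tweaked = [(peer, data + bytes([i])) for i, (peer, data) in enumerate(untouched)]

print(group_sources(untouched))  # one hash with three sources
print(group_sources(tweaked))    # three hashes with one source each
```

With unmodified files a downloader can pull pieces from all three peers at once; after the tweak every copy looks like a different file with a single source.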
If you accepted insurance money for the CDs, then, while the license to listen to the music still exists, you have transferred it to the insurance company who paid you.
If you total a car, the insurance company will give you X dollars and TAKE AWAY YOUR CAR.
When you buy insurance, you are buying a guarantee that, in the event of loss/damage, the insurance company will buy your stuff at a "fair" price.
If that were possible, it would destroy the value of an MD5 hash immediately and everyone would quit using it faster than you could blink.
The purpose of CRC hashes is entirely different. They are designed to detect a burst of bit errors in a stream of data, the type of error that is most likely to occur in a network transmission. They are not meant for fingerprinting files.
I doubt that anyone with any degree of sophistication in cryptology would attempt to use CRC and MD5 hashes interchangeably.
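To make the difference concrete, here is a sketch that hunts for two different messages with the same CRC-32 using Python's zlib. The messages are pseudo-random 8-byte strings (derived from SHA-256 purely to get varied inputs; that choice is mine). Because a CRC-32 is only 32 bits wide, a birthday-style search like this should turn up a match after somewhere around a hundred thousand tries.

```python
import hashlib
import zlib
from itertools import count

def find_crc32_collision():
    """Return two different 8-byte messages that share a CRC-32 value."""
    seen = {}
    for i in count():
        # Pseudo-random 8-byte message; SHA-256 is used only as a mixer here.
        message = hashlib.sha256(str(i).encode()).digest()[:8]
        crc = zlib.crc32(message)
        other = seen.get(crc)
        if other is not None and other != message:
            return other, message
        seen[crc] = message

first, second = find_crc32_collision()
print(first.hex(), second.hex(), hex(zlib.crc32(first)))
print(hashlib.md5(first).hexdigest(), hashlib.md5(second).hexdigest())
```

The two messages agree on CRC-32 but not on MD5. The same search against a 128-bit digest would need on the order of 2^64 tries, which is why a CRC is fine for catching line noise and useless as a fingerprint.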