Ask Slashdot: How Do I Scrub Pirated Music From My Collection?
An anonymous reader writes "I tried out Google Music, and I liked it. Google made me swear that I won't upload any 'illegal' tracks, and apparently people fear Apple's iCloud turning into a honeypot for the RIAA. My music collection comprises about 90% 'legal' tracks now — legal meaning tracks that I paid for — but I still have some old MP3s kicking around from the original Napster. Moreover, I have a lot of MP3s that I downloaded because I was too lazy to rip the CD version that I own. I wanted to find a tool to scan my music to identify files that may be flagged as having been pirated by these cloud services; I thought such a tool would be free and easy to find. After all, my intent is to search my own computer for pirated music and to delete it — something that the RIAA wants the government to force you to do. But endless re-phrasing on Google leads to nothing but instructions for how to obtain pirated music. Does such a tool exist or does the RIAA seriously expect me to sift through 60 GB of music, remember which are pirated, and delete them by hand?"
Rerip all your CDs, this time to FLAC, since disk is now cheap as hell.
Get rid of all the old mp3s.
Scrap what you have and buy it all brand new. I'm sure that'll make everyone at the RIAA happy ;-)
From napster? A search for 128 kbit MP3 might be enough. Your legal ones are probably of higher quality.
Software could identify files which were downloaded. But it can never detect whether you legally have the right to listen to that file. Unless of course only DRM'd files are considered to be legally OK.
Moreover, I have a lot of MP3s that I downloaded because I was too lazy to rip the CD version that I own
How can they tell the difference between an MP3 that you ripped from a CD that you own, and an MP3 that somebody else ripped from another copy of a CD that you own?
Secession is the right of all sentient beings.
Smartest question I've seen on /.
If you yourself can't determine the legality of the (music) files you possess, how can the RIAA? a court?
One file may be legal for one person, and illegal for another. For example, if you rip your CD yourself, the resulting MP3 is legal. Copy the same MP3 onto a friend's computer, and it's illegal. I don't think such a software is even possible to write. Every pirated / illegal MP3 file would have to be already watermarked as such in order for the software to function. What if the "common" version of the file floating around on Napster was just a basic 128Kbps rip with a common MP3 encoder, and you used the same encoder to rip the same song from the original CD yourself? In theory, it is very possible that the resulting MP3 is bit-for-bit the same as the one millions of other people pirated from Napster, even though you own the original CD and ripped the file yourself.
Morphing Software
He didn't "blame" anybody else - he accepts that there are some illegal files and he wants to clean them out without the hassle of creating his library all over again. Even if you aren't worried about the hours spent ripping your old CD's, maybe some of those CD's are scratched or have been lost, and there are legal downloaded files mixed in too - and playlists and ratings or whatever.... The question is very valid.
For the Napster stuff, just check for anything that has a godawful bitrate. For the downloaded stuff, the file names will probably be very different from whatever he uses when ripping himself, so he just needs to find a media player that can sort by bitrate and list filenames (it will be fairly easy to just scan quickly down the list and check for any block of files that stands out, assuming he downloads albums at a time and not just lots of individual tracks).
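If you would rather script the bitrate check than hunt for a media player that sorts by it, here is a rough sketch. It assumes plain MPEG-1 Layer III files and only looks at the first frame header after any ID3v2 tag, so VBR files and anything exotic will be misreported. The function names and the 128 kbps cutoff are my own choices for illustration, not something from an existing tool.

```python
import os
from typing import Iterator, Optional, Tuple

# Bitrate table for MPEG-1 Layer III, indexed by the 4-bit field in the
# frame header. Index 0 ("free") and 15 (invalid) are treated as unknown.
MPEG1_L3_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]


def id3v2_size(head: bytes) -> int:
    """Return the total size of a leading ID3v2 tag, or 0 if there is none."""
    if len(head) >= 10 and head[:3] == b"ID3":
        # The tag size is stored as four 7-bit ("syncsafe") bytes.
        size = (head[6] << 21) | (head[7] << 14) | (head[8] << 7) | head[9]
        footer = 10 if head[5] & 0x10 else 0
        return 10 + size + footer
    return 0


def first_frame_kbps(data: bytes) -> Optional[int]:
    """Find the first plausible MPEG-1 Layer III header and return its kbps."""
    for pos in range(len(data) - 3):
        b0, b1, b2 = data[pos], data[pos + 1], data[pos + 2]
        if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
            continue  # no frame sync here
        version = (b1 >> 3) & 0x3      # 3 means MPEG-1
        layer = (b1 >> 1) & 0x3        # 1 means Layer III
        kbps = MPEG1_L3_KBPS[b2 >> 4]
        sample_rate_index = (b2 >> 2) & 0x3  # 3 is reserved
        if version == 3 and layer == 1 and kbps and sample_rate_index != 3:
            return kbps
    return None


def file_kbps(path: str) -> Optional[int]:
    """Read just enough of the file to inspect its first audio frame."""
    with open(path, "rb") as f:
        f.seek(id3v2_size(f.read(10)))
        return first_frame_kbps(f.read(8192))


def low_bitrate_files(root: str, max_kbps: int = 128) -> Iterator[Tuple[str, int]]:
    """Yield (path, kbps) for every .mp3 under root at or below max_kbps."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower().endswith(".mp3"):
                path = os.path.join(dirpath, name)
                kbps = file_kbps(path)
                if kbps is not None and kbps <= max_kbps:
                    yield path, kbps
```

Something like `for path, kbps in low_bitrate_files("/path/to/music"): print(kbps, path)` gives you the list to eyeball. It only tells you which files are low bitrate, not which ones are pirated.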
which is totally what she said
how users can protect themselves from corporate political aggression.
Guns. Lots of 'em.
Geeks are so full of shit that "beating the crap out of them" takes a whole new meaning.
what you don't have in cd format, buy in cd format (amazon often has used cd's at ok prices. shipping is never reasonable but its their profit margin 'tax').
advantage of used cds: 'the man' does not get paid. no riaa income on used cd's. its just the buyer and seller (and some middleman, perhaps). disadvantage: no money goes to the band (but they made their money the first time on that 'first sale').
if you are worried (I would not be, I think you are paranoid) then make sure you have cds for every file. and like I said, used cd's deprive the riaa of any income, so that's probably your best route.
personally, I think your first and only problem is even considering these 'cloud' services. copy enough songs to your portable to last a day (or run a random mix uploader) and what's so hard? today's portables are even big enough to hold what used to be our whole collection. many people could fit their entire collection on portables. the cloud is about 5 years too late, to be serious.
--
"It is now safe to switch off your computer."
Does such a tool exist or does the RIAA seriously expect me to sift through 60 GB of music, remember which are pirated, and delete them by hand?"
No, Mr. Bond. I expect you to die.
I'm sure the RIAA would prefer you to simply delete everything and buy it again. Just to be sure. Remember... these are the folks who swore it was illegal to rip your own CDs and firmly believed you should have an individually purchased copy of media for each individual player you used.
I posted a similar comment in thread from yesterday, but I'll ask here again, hoping someone will see it.
Basically, is the statute of limitations applicable to downloaded music? In my limited legal knowledge, it's not a felony to download music, AFAIK, so misdemeanors typically fall under a 7-year statute of limitations, and so if you downloaded stuff from Napster's heyday, more than 10 years ago, could those MP3s even be used to legally prosecute you?
Of course I know we're talking about the RIAA here, and they act as if the law doesn't apply to them in their dealing. But I'm curious.
One file may be legal for one person, and illegal for another. For example, if you rip your CD yourself, the resulting MP3 is legal. Copy the same MP3 onto a friend's computer, and it's illegal. I don't think such a software is even possible to write. Every pirated / illegal MP3 file would have to be already watermarked as such in order for the software to function. What if the "common" version of the file floating around on Napster was just a basic 128Kbps rip with a common MP3 encoder, and you used the same encoder to rip the same song from the original CD yourself? In theory, it is very possible that the resulting MP3 is bit-for-bit the same as the one millions of other people pirated from Napster, even though you own the original CD and ripped the file yourself.
So just digitally sign everything you personally rip. I don't see how that could be so difficult. The computer you use to rip it could do it automagically.
Now of course if most stuff ripped isn't signed on purpose, that's a different story. Maybe those MP3s aren't legal?
True, the MD5 idea alone wouldn't solve everything, but the guy asked if it could be possible to sort his files, and that's easy. Judging legality isn't easy even with lawyers and courtrooms.
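For what the "sign everything you rip" idea might look like in practice, here is a minimal sketch. Big caveat: it uses an HMAC from the Python standard library as a stand-in for a real digital signature, so it only proves that whoever holds the secret key tagged the file; a proper public-key signature would need a third-party library. The function names and the key handling are made up for illustration.

```python
import hashlib
import hmac


def sign_rip(data: bytes, key: bytes) -> str:
    """Return a keyed tag for a freshly ripped file (HMAC-SHA256, hex)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_rip(data: bytes, key: bytes, tag: str) -> bool:
    """Check that the file still matches the tag made at rip time."""
    return hmac.compare_digest(sign_rip(data, key), tag)
```

This lets you tell your own rips apart from everything else in the collection later on. It says nothing about whether a file is legal, and it obviously does nothing for files ripped before you started tagging.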
The legality of the file is not a property of the file itself, and cannot be determined from the file's content. If I buy an MP3 on Amazon, I can legally use it. If I put it on bittorrent and you download it, you have the same file as I do, but the RIAA says you're not allowed to use it.
This idea is explored in more detail in the blog post "What Colour are your bits?"
"Delete the ENTIRE library and re purchase all of them to be sure. It's cheaper than our lawyers raping you..."
If you call an RIAA office, the above will be their answer. If you call any lawyer, the above will be their answer. If you can't PROVE you bought it, it's pirated by default.
Do not look at laser with remaining good eye.
Seriously, it doesn't matter. The crazy lawsuits are for distributing music and only that, which you're not doing. The whole idea of these being "honeypots" is ridiculous. There's nothing you can actually be charged for even if the RIAA could influence Apple or Google or Amazon. Which is doubtful because they each make far more money than the RIAA and would have to destroy their reputations to go along with such a "trap".
If you have some ethical issue then just buy a legal copy of the music for anything you're unsure of. Having multiple copies for personal use IS still fair use.
I don't think that idea has actually been tested. It's not entirely clear what constitutes an "unauthorized copy." We can throw away the ridiculous old RIAA argument that ripping from your own CD is unauthorized and not fair use. But is it an authorized copy to copy somebody else's fair-use rip because it's easier than making your own rip? And can you prove that you owned the CD before you made that copy? I think at that point you get into the highly-paid lawyer version of "he said, she said."
The audio data and subcode (track timing) data are split into two separate streams in the CD drive. The CD standard allows sync between audio and subcode to drift by (as I understand it) up to one sector, or 588 samples. This phenomenon is called "rip jitter". CD-ripping tools will overlap reads within a single rip by a sector or two to correct for changes in this drift, but there are still hundreds of offsets where the whole rip can start. Thus there are hundreds of distinct "basic 128Kbps rip[s] with a common MP3 encoder", each with a different starting rip jitter because the CD drive signaled a "start of track" in a different place within the sector.
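To make the rip jitter point concrete, here is a toy demonstration. It assumes a made-up "disc" of deterministic 16-bit stereo samples and hashes the raw PCM rather than an encoded MP3, so it is only a stand-in for a real rip; the 588 figure is the one quoted above, and every name in the code is hypothetical.

```python
import hashlib
import struct

SAMPLES_PER_SECTOR = 588  # stereo sample frames per sector, as quoted above
BYTES_PER_FRAME = 4       # 16-bit left + 16-bit right


def fake_disc(n_frames: int) -> bytes:
    """Build deterministic pseudo-audio so the demo is repeatable."""
    return b"".join(
        struct.pack("<hh",
                    (i * 7919) % 65536 - 32768,
                    (i * 104729) % 65536 - 32768)
        for i in range(n_frames)
    )


def rip(disc: bytes, start_frame: int, n_frames: int) -> bytes:
    """Simulate a rip that begins at a given sample frame."""
    begin = start_frame * BYTES_PER_FRAME
    return disc[begin:begin + n_frames * BYTES_PER_FRAME]


def distinct_rip_hashes(disc: bytes, nominal_start: int, n_frames: int,
                        max_drift: int = SAMPLES_PER_SECTOR) -> set:
    """MD5 every rip whose start drifts by 0..max_drift-1 frames."""
    return {
        hashlib.md5(rip(disc, nominal_start + drift, n_frames)).hexdigest()
        for drift in range(max_drift)
    }
```

With `disc = fake_disc(5000)`, calling `distinct_rip_hashes(disc, 1000, 2000)` gives one distinct hash per starting offset, which is why a plain file hash is a poor way to say two rips are "the same track".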
One of my pals has regularly shopped the thrift stores (Goodwill, Salvation Army, etc.) looking for albums of the music he has downloaded. His theory is that as long as he has the album with the music - regardless of the format - he's covered.
I think he's probably right, actually. Although it might cost him some legal fees to get the RIAA off his ass if they choose to land on him.
No one ever had to evacuate a city because the solar panels broke!
The illegality of downloading a track from a CD you own has yet to be proven.
In which jurisdiction? In the United States, see UMG Recordings v. MP3.com.
Through an Md5 database hosted on the RIAA website or funded by the RIAA. Every legal file could be known. And then every illegal file would be among those not in the official database.
Won't work. From an article about whether iCloud's match could be used as a honeypot, that I thought was posted on /. a few days ago:
Then there will be MP3s that individuals created themselves from, for example, 'ripping' their CD collections. While these are not watermarked to the individual, they appear to be unique for each 'rip'. To confirm this, I ran a test with fresh installations of the exact same CD ripping software on two different computers. I then had them rip the same track from the exact same CD using the unchanged system default settings on both computers. The MD5 hashes did not match.
( http://betweenthenumbers.net/2011/06/is-apples-icloud-music-match-a-possible-honeypot/ )
And as such, there's a moderately decent chance that an innocent person will be found innocent. But it still costs a hell of a lot to be innocent in a court of law.
In the UK copyright law does not even allow recording TV shows to watch later, it is merely tolerated
This is incorrect.
s70, Copyright, Designs and Patents Act 1988, entitled "Recording for the purposes of time-shifting", provides that:
The making in domestic premises for private and domestic use of a recording of a broadcast ... solely for the purpose of enabling it to be viewed or listened to at a more convenient time does not infringe any copyright in the broadcast ... or in any work included in it.
Don't worry about it. You're being paranoid. Even if they could detect that you have some illegal music, they really don't care unless you're actively trading it. Look at how companies handle pirated software, for example. Microsoft can tell if your Windows isn't "genuine" and yet the worst thing they do is cripple your copy and give you a rather polite message about making it genuine. That's the worst I would ever expect from a "honeypot." At worst they're going to say "Hey, we think this song is not genuine, would you like to buy a fresh copy to ensure you're legit?" They're not going to call the FBI on your ass for having an illegal copy of Twisted Sister on your hard drive. It just isn't going to happen.
To do such a thing you would have to define
1: a whitelist of files that are identical to copies sold by legitimate services or "perfect" CD rips.
2: a blacklist of files that were found on P2P networks and have sufficient defects or other identifying features that it is unlikely they would match any non-pirate's copy.
You could then go through a file collection sorting files into white, black and grey. The technical aspects of implementing such a tool are trivial.
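A minimal sketch of that sorting step, assuming the whitelist and blacklist are simply sets of MD5 hex digests that somebody has already compiled (which is the hard part). The names `Verdict`, `classify` and `sort_collection` are invented for illustration.

```python
import hashlib
from enum import Enum
from typing import Dict, Iterable, List, Set


class Verdict(Enum):
    WHITE = "white"  # matches a known legitimate file
    BLACK = "black"  # matches a known P2P file
    GREY = "grey"    # on neither list


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tracks do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify(digest: str, whitelist: Set[str], blacklist: Set[str]) -> Verdict:
    """Whitelist wins if a digest somehow appears on both lists."""
    if digest in whitelist:
        return Verdict.WHITE
    if digest in blacklist:
        return Verdict.BLACK
    return Verdict.GREY


def sort_collection(paths: Iterable[str], whitelist: Set[str],
                    blacklist: Set[str]) -> Dict[Verdict, List[str]]:
    """Group file paths by verdict."""
    result: Dict[Verdict, List[str]] = {v: [] for v in Verdict}
    for path in paths:
        result[classify(md5_of(path), whitelist, blacklist)].append(path)
    return result
```

The code really is the trivial part; in practice nearly everything you ripped yourself will land in the grey pile.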
However the problems are
1: it's pretty hard to find every file that is out there on legit services and basically impossible to find every file that is out there on P2P.
2: Afaict it is also bloody hard to get a perfect rip of a track from CD (and that is before you start considering the encoding options)
3: your CD rips will probably not be on either the whitelist or the blacklist (see point 1); unfortunately it is likely that many pirate files won't be either (see point 1). Not being on the tool's blacklist also doesn't necessarily mean the file isn't on the music industry's blacklist.
4: most people outside of the music industry would probably not want to give them a helping hand by building a list of "probably pirate" tracks, and those trying to track down pirates and extort money from them are unlikely to want to release their lists either.
note: I'm known as plugwash most places but I screwed up registering that here somehow in the past and now can't register
"...does the RIAA seriously expect me to sift through 60 GB of music, remember which are pirated, and delete them by hand?"
No, Mr. Bond, the RIAA expects you to die.
If Slashdot were chemistry it would look like this:Cadaverine
They can't 'get' you, it's all a fear tactic. Especially for titles you have on CD.
Don't distribute them. There, you are fine.
The Kruger Dunning explains most post on
How much does it cost for you to build a time machine to go back to 1995 and make every audio encoder digitally sign every file it compresses? I don't see how that could be so difficult.
also, legality of ripped music is different in many countries. suppose you visit a friend in the Netherlands. he has a CD or DVD you like. you can sit down behind his computer and copy it, and the resulting copy is perfectly legal. it does not have to be a direct copy, mp3 or any music or video format is just fine. it's different if the friend copies it for you, in that case he's illegally spreading copyrighted material. you can take your copied CD back home, and it's still legal as far as I know, under the Berne convention.
No one can understand the truth until he drinks of coffee's frothy goodness.
--Sheikh Abd-Al-Kadir, 1587
At least with kiddie porn the law says that any match, whether you paid for it or not, constitutes a violation. That's not the case with music: how are they supposed to know what files are legally downloaded copies and what are illegally downloaded copies? The only way is to keep a database of invoices for everything you have ever paid for, ready for when they come to audit you. But when are they going to search your files? At border crossings? Airports? Now you have to carry this bunch of invoices around with you all the time. It's akin to the proverbial "papers" you need to travel in a repressive regime. You see where I'm going with this.
In civil suits (which is what a copyright lawsuit would be), the standard is a "preponderance of the evidence", meaning that whichever side can present a more convincing argument to the judge will win, proof be damned. (IANAL, do not consider this legal advice, all situations are different, etc. etc.)
It's better to vote for what you want and not get it than to vote for what you don't want and get it.
- E. Debs
That raises an interesting question. If I rip a song using a particular program from a particular pressing of a CD, and you rip it using the same program from the same pressing of a CD, will the two end up with identical hashes? I've always been under the impression that ripping audio data wasn't entirely deterministic from a CD (no error correction), and thus two rips even with identical software and settings won't necessarily byte-for-byte match.
Not identical. The CD drive cannot determine accurately when a song starts, so when you rip a song, and then rip it again from the same CD on the same computer, each rip will have a small random amount of silence at the beginning. Then there is the question whether conversion to AAC or MP3 is deterministic, which depending on the software it might not be. Next anything in a Quicktime wrapper contains the creation date inside the file (which caused paranoia when people figured out that iTunes sets the creation date of files downloaded from the iTunes store to the time the file was created on the computer, so two people downloading the same song would never get identical files). That also means two AAC files created at different times will always be different.
It really doesn't matter. The only damages the RIAA can reasonably claim for you having pirated music is around $1/song. It's UPLOADING that music that they care about, because then they can pretend that your upload is providing that song illegally to 20,000 people and therefore claim that that single song is worth $20,000 in damages.
The RIAA has NEVER sued ANYONE for merely possessing pirated music. I don't think they've ever sued anyone for downloading music either. It's all about what you upload. If you aren't uploading anything, you should be fine.
With crappier rippers maybe. With a direct digital rip, it should be the same every time, in theory, from any CD drive.
Exact Audio Copy uses the AccurateRip system which somehow manages to tell me that my rips are exactly the same as hundreds of other random people via some central DB. The only time it doesn't match up, is when I have a massive scratch in a very old CD, and EAC took hours ripping and re-ripping the same sector to get the best results possible with what I gave it to work with.
Morphing Software
Sorry, but the reasoning in your post made me wince. For one, the term "bad guys" is simply a us-vs-them generalisation that holds no water. I haven't met a single person in my life who is incapable of a good deed, or a bad deed. So, the question really should be, "Is the act of breaking a law always bad?".
For two, even if the answer to this is no, it makes no mention of how many laws are unjust, and which ones specifically are. If a significant portion of laws are just, then we should certainly be very concerned whether something is illegal or not, in general. Unless we know whether a specific law is unjust, then we would be sensible treating the laws as they are in the majority (which I think most people would say is "just").
Oh god. You voted for Bush, didn't you?
You know, there is a difference between trolling and pointing out the flaws in your reasoning. Just saying.
MD5 hashes of the files are a fine way of identifying pirated music. In fact I'm pretty sure it's how most cloud services WILL do it. The real question here is how do you identify which hashes will be blacklisted? I think the best approach to that would be to go through some famous torrent and Gnutella sites and scrape the hash values from those torrent files and databases. I know torrents have a way of doing this as part of the .torrent file itself, and I believe that the Gnutella protocol probably has a similar system of uniquely identifying files. This way you would not have to download all the files but could still know which ones are being shared illegally by logging all those hashes and comparing them to your files. I think it is technically feasible to do this, but extremely difficult. I would recommend cleaning your files instead by adding trash to an unused field in the tags section. This would change the result of any hash computed over the whole file. I imagine the companies could have a much more sophisticated way of hashing the files such that it does not take tags into account, but to perform this form of unique ID the companies would have to manually download each song illegally and ID it. I don't think that's likely. I feel that cleaning your pirated files is the best solution.
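For what it's worth, hashing "such that it does not take tags into account" is not very sophisticated. A rough sketch, assuming the only tags present are an ID3v2 block at the front and an ID3v1 block in the last 128 bytes (APE tags, Lyrics3 and the like are ignored); the function names are hypothetical.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Drop a leading ID3v2 tag and a trailing ID3v1 tag, if present."""
    start = 0
    if len(data) >= 10 and data[:3] == b"ID3":
        # ID3v2 stores its size as four 7-bit ("syncsafe") bytes.
        size = (data[6] << 21) | (data[7] << 14) | (data[8] << 7) | data[9]
        footer = 10 if data[5] & 0x10 else 0
        start = 10 + size + footer
    end = len(data)
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128  # ID3v1 is a fixed 128-byte block starting with "TAG"
    return data[start:end]


def audio_md5(data: bytes) -> str:
    """MD5 of the audio frames only, so tag edits do not change the result."""
    return hashlib.md5(strip_id3(data)).hexdigest()
```

With a hash like this, adding trash to a tag field changes the whole-file MD5 but leaves `audio_md5` untouched, so the "cleaning" trick only works against the naive kind of matching.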
There is no such thing as a "direct digital rip". The CD standard doesn't provide one, there are no boundaries on the CD for one to work against, and as stated rip jitter is inevitable. The only question is how the software and hardware involved handle it. The post you were objecting to talks about one of the pieces of the magic used to help with this fundamental problem that you're not aware of, and there are some others too.
Drives that support what's called AccurateStream will guarantee that they always pick the same spot every time you ask them to seek somewhere, which is the first part of the problem. If your drive doesn't do that, you end up needing to do the overlapping read shuffle described above to figure that out. See EAC Drive Options for more about all that.
Even if you have AccurateStream, there's a second problem: the spot will be the same every time, but exactly where that is can't be guaranteed; it varies based on the drive model. The way AccurateRip copes with this problem is to collect a database of CD Drive Offsets. If your drive isn't in their database, what you can do is use a known music CD that AccurateRip has good data on, then calibrate your drive using it to figure out how much you're off by. People submitting those test results is how they compiled the database.
If you have AccurateStream hardware, and you know your drive offset, you can get the same rip every time and match against the checksums that AccurateRip provides. But this is only happening because several pieces of the chain know how to compensate for the limitations of audio CD encoding; there is no way to get digital data straight off of them usefully.
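The calibration idea can be illustrated with a toy offset search. To be clear, this is not how AccurateRip or EAC actually implement it; it just assumes you have a known-good sample sequence for a track and a rip of the same track from an uncalibrated drive, and it brute-forces the shift that lines them up. `find_offset` and its parameters are made up for the example.

```python
from typing import Optional, Sequence


def find_offset(reference: Sequence[int], rip: Sequence[int],
                max_offset: int = 1000,
                min_overlap: int = 1000) -> Optional[int]:
    """Return d such that rip[i] == reference[i + d] across the overlap.

    Small shifts are tried first. Returns None if no shift within
    +/- max_offset matches over at least min_overlap samples.
    """
    for d in sorted(range(-max_offset, max_offset + 1), key=abs):
        lo = max(0, -d)                           # first usable rip index
        hi = min(len(rip), len(reference) - d)    # one past the last
        if hi - lo < min_overlap:
            continue
        if list(rip[lo:hi]) == list(reference[lo + d:hi + d]):
            return d
    return None
```

Once the offset for a drive is known, later rips can be shifted by that amount before checksumming, which is roughly why two people with different drives can end up with matching results.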
http://www.gcaudio.com/resources/howtos/demagnetization.html
I say use two different computers. One with all your "good stuff" and the other for internet use. That's what I do, although got ZoneAlarm but I had a PC that got so botched up, I nearly lost everything. So now I use two (also got Macs, one online the other not). Not that I have pirated music, don't really know if some of my few Connie Francis mp3s are pirated (virtually all my music is on CDs and vinyl). But with the mob mentality of various you-know-who organizations it seems pointless to debate the legal issues (they will continue to be as aggressive as those in Hackastan).
mfwright@batnet.com
You can't do it because iTunes leverages Napster data.
I know this because I have some obscure tastes in music. I have a tape and a CD of an old band. I downloaded one of the songs that's only on the tape from Napster. I was disappointed with the recording because of three glitches in the track. Years later, iTunes pops up. I buy the song from iTunes. Lo and behold, same three glitches are in the iTunes version.
This happened for not just one song, but two songs from two different artists in two different genres. One was a single glitch, which I would have dismissed as chance, but four glitches at the same timestamps from two different songs in two different genres?