Napster, Audio Fingerprinting, and the Future of P2P
mjmalone writes "Napster founder Shawn Fanning is poised for a comeback. The now 22-year-old Fanning has developed technology that creates an "audio fingerprint" of each individual track and compares it against the fingerprints in his firm's database to determine legality. A fee may be set and collected on a copyrighted track by its rightful owner. Fanning is actively recruiting industry support as well as pushing the idea to P2P services such as Kazaa and Grokster." This isn't exactly new technology, but it's still interesting to see what Fanning is up to these days besides movie cameos.
I recall that in its dying days Napster was talking about adding this to appease the recording industry. The variation then was from a company called Relatable. Sounds like Shawn is stuck in a recursive loop.
Why I don't think it would work, just quoting from the article:
Another issue is that it would be up to the labels to claim ownership of each track, and they may claim greater rights than they are entitled to or rights that are subject to dispute. Many songs have multiple rights holders, depending on who wrote the composition and who performed it, and the labels and the artists signed to them have frequent ownership disagreements.
For example, many of the songs on file-sharing networks are recordings of live performances, whose digital distribution rights and royalties might have to be negotiated between labels and artists.
I remember seeing a book once that helped you identify songs by whether the sequence of notes at the beginning of the piece went up, down or stayed the same pitch when compared to the previous note. It was about the size of a telephone directory.
A quick Google finds out that it's called the Parsons code, with a lot more information here.
Presumably the fingerprinting scheme works in a similar fashion (over a larger portion of the song, and probably over multiple fragments of the song as well).
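For anyone curious, the up/down/repeat idea is easy to sketch. This is only an illustration of the contour encoding, assuming pitches are given as MIDI note numbers (my assumption, the book works from written notes), and says nothing about how the actual fingerprinting scheme works:

```python
def parsons_code(pitches):
    """Return the Parsons code for a sequence of pitches (e.g. MIDI note numbers)."""
    if not pitches:
        return ""
    code = ["*"]  # the first note has no predecessor to compare against
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            code.append("u")  # up
        elif cur < prev:
            code.append("d")  # down
        else:
            code.append("r")  # repeat
    return "".join(code)


# Opening of "Twinkle, Twinkle, Little Star" as MIDI notes: C C G G A A G
twinkle = [60, 60, 67, 67, 69, 69, 67]
print(parsons_code(twinkle))  # *rururd
```

Because only the direction of each step is kept, the same code comes out whatever key or tempo the tune is played in.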
Ian.
A physicist is an atom's way of thinking about atoms
Uh? Fanning made Napster. Literally.
My Journal - 1,337 fans and countin
Err, the current mass of shitty 128kbps mp3 files made by your average AOL loser is bad enough. If your method allows flying under the fingerprint radar, fine. But I wouldn't want to download that crap then.
Those people who care about quality you could catch with a simple MD5 check, because they release lossless files ripped with EAC using offset-corrected settings et al.
Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
MusicBrainz already has a free music fingerprint program. It identified about 60-70% of my songs correctly. It also will rename your files and update the ID tags.
The 30-40% it did not find... I could easily find by doing some searching manually through the program.
It was a nice way to completely identify my mp3 collection. Yes, it's a legal collection, but I wanted an easy way to rename the files and id tags.
Anyhoo... the program is pretty buggy so save often. Help the cause.
Enjoy.
DavaK
Audio fingerprinting is not something like a hash function that leads to a deterministic identifier. It is more like a web search engine that finds the best fuzzy match.
If you use audio fingerprint scores in the aggregate, for example to see what's popular, it works. If you depend on any one audio fingerprint matchup being accurate, especially accurate enough to use for legal notices, it doesn't make sense.
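To make the "best fuzzy match" point concrete, here is a rough sketch. The 32-bit fingerprints, the song names and the max_distance threshold are all made up for illustration; real systems use much longer fingerprints and their own distance measures:

```python
def hamming(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")


def best_match(query, database, max_distance=8):
    """Return (title, distance) for the closest entry, or (None, distance) if too far."""
    title, fp = min(database.items(), key=lambda item: hamming(query, item[1]))
    dist = hamming(query, fp)
    return (title, dist) if dist <= max_distance else (None, dist)


# Hypothetical 32-bit fingerprints, invented for this example.
database = {
    "song_a": 0b1011_0010_1110_0001_0101_1100_0011_1010,
    "song_b": 0b0100_1101_0001_1110_1010_0011_1100_0101,
}
query = database["song_a"] ^ 0b101  # "same" song with two bits flipped
print(best_match(query, database))  # ('song_a', 2)
```

Note that the answer is always "the nearest thing I know about, plus a score", never a yes/no identity, and where you put the threshold is a judgment call.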
Music is a semantic object. Saying whether two pieces of music are the same thing depends on stuff that even humans have a hard time figuring out, like how much originality there is in a tribute band's cover.
This is not an MD5 hash; it is a spectral-analysis "fingerprint" of the song. Thus they can identify the song no matter what the encoding (within reason, of course, but you wouldn't want to listen to a song so badly encoded that it can no longer be recognized anyway).
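A toy demonstration of the difference, under heavy assumptions: the "song" is two synthetic tones, the "re-encoding" is just coarse rounding of the samples, and band_fingerprint is a made-up one-bit-per-band scheme, not what MusicBrainz or anyone else actually ships:

```python
import cmath
import hashlib
import math
import struct

N = 256  # samples in the analysis window (small so the naive DFT stays fast)


def band_fingerprint(samples, bands=8):
    """One bit per frequency band: 1 if the band holds above-average energy."""
    n = len(samples)
    half = n // 2
    energy = []
    for k in range(half):  # naive DFT, bins 0 .. n/2 - 1
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        energy.append(abs(coeff) ** 2)
    width = half // bands
    band_energy = [sum(energy[b * width:(b + 1) * width]) for b in range(bands)]
    mean = sum(band_energy) / bands
    return tuple(int(e > mean) for e in band_energy)


def md5_of(samples):
    """MD5 of the raw sample bytes, i.e. an exact-match checksum."""
    return hashlib.md5(struct.pack(f"{len(samples)}d", *samples)).hexdigest()


# Synthetic "song": a strong tone in DFT bin 10 plus a weaker one in bin 40.
original = [math.sin(2 * math.pi * 10 * t / N) + 0.5 * math.sin(2 * math.pi * 40 * t / N)
            for t in range(N)]
# Crude stand-in for a lossy re-encode: round every sample to steps of 0.05.
degraded = [round(s * 20) / 20 for s in original]

print(md5_of(original) == md5_of(degraded))  # False
print(band_fingerprint(original))            # (1, 0, 1, 0, 0, 0, 0, 0)
print(band_fingerprint(degraded))            # (1, 0, 1, 0, 0, 0, 0, 0)
```

The checksum changes as soon as any sample does, while the coarse spectral summary stays put because the rounding noise is tiny next to the energy in the two tones.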
See http://musicbrainz.org/ for some software that uses the same technology to help you tag your MP3s.
I'm sure someone will come up with some software that, say, rearranges the MP3 frames of a song, foiling the fingerprinting but allowing the song to be restored on the other end.
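Something along these lines, as a sketch only. The "frames" here are placeholder byte chunks (no real MP3 parsing), the shared key is hypothetical, and whether this would actually fool any given fingerprinter is untested speculation:

```python
import random


def scramble(frames, key):
    """Reorder frames using a permutation derived from a shared key."""
    order = list(range(len(frames)))
    random.Random(key).shuffle(order)
    return [frames[i] for i in order]


def unscramble(scrambled, key):
    """Rebuild the same permutation from the key and invert it."""
    order = list(range(len(scrambled)))
    random.Random(key).shuffle(order)
    restored = [None] * len(scrambled)
    for position, original_index in enumerate(order):
        restored[original_index] = scrambled[position]
    return restored


# Placeholder "frames": ten 4-byte chunks standing in for real MP3 frames.
frames = [bytes([i]) * 4 for i in range(10)]
shuffled = scramble(frames, "shared-secret")
print(unscramble(shuffled, "shared-secret") == frames)  # True
```

Both ends need the same key and the same Python version so that the generated permutation matches.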
Shawn's business model seems a tad flawed. His new software has already been written, and an SDK is freely available here. Source code for both the Linux and Windows clients (which includes the fingerprinting code) is a click away under their downloads section. Red Hat and Debian packages are there too, as well as Ruby and Perl bindings, so fire up apt-get and go to it!
As a class project, a friend and I built a music recognition database. You can read our paper.
The general approach is fairly straightforward. You extract a set of "features" (typically several Mel Frequency Cepstral Coefficients, or MFCCs) from each short frame of the song, say every 10 ms. You then pick several (say, 16) arbitrary points and iteratively refine them into that many "average" feature vectors, along with weights that sum to one. This data is turned into a Hidden Markov Model (HMM). To see what audio you have, you run it through each of the possible HMMs and see which produces the greatest likelihood.
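A stripped-down sketch of the "average feature vectors plus weights" step, with big simplifications that are mine and not from the paper: the frames are synthetic 2-D points instead of MFCCs, the averaging is plain k-means, and matching uses nearest-centroid distortion as a crude stand-in for HMM likelihood. All function names are made up for the example.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_codebook(frames, k=4, iterations=10, seed=0):
    """k-means: returns (centroids, weights), where the weights sum to one."""
    rng = random.Random(seed)
    centroids = rng.sample(frames, k)  # arbitrary starting points
    counts = [0] * k
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda i: sq_dist(f, centroids[i]))
            clusters[nearest].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
        counts = [len(m) for m in clusters]
    weights = [c / len(frames) for c in counts]  # share of frames per centroid
    return centroids, weights


def distortion(frames, centroids):
    """Mean squared distance from each frame to its nearest centroid."""
    return sum(min(sq_dist(f, c) for c in centroids) for f in frames) / len(frames)


def identify(frames, models):
    """Pick the model whose centroids fit the query frames best."""
    return min(models, key=lambda name: distortion(frames, models[name][0]))


def fake_frames(centers, n, seed):
    """Synthetic stand-in for per-frame feature vectors (not real MFCCs)."""
    rng = random.Random(seed)
    return [tuple(c + rng.gauss(0, 0.3) for c in rng.choice(centers))
            for _ in range(n)]


models = {
    "song_a": fit_codebook(fake_frames([(0, 0), (1, 1)], 200, seed=1)),
    "song_b": fit_codebook(fake_frames([(5, 5), (6, 4)], 200, seed=2)),
}
query = fake_frames([(0, 0), (1, 1)], 50, seed=3)  # a different "rip" of song_a
print(identify(query, models))  # song_a
```

The weights are computed but not used for matching here; in a fuller model they would act as the mixture priors.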
This method is typically applied to speaker recognition, where a linear search through HMMs is reasonable. This obviously isn't the case when you know about hundreds of thousands of songs, so a large part of the challenge is narrowing the field of HMMs to check (which is one of the focuses in our paper). Relatable, who were working with Napster a long time ago, have clusters that can classify 1,000 songs per second; I'm pretty sure they use this technique.
This technique has several important features. First, it doesn't depend on any properties of the files themselves. Checksums would be trivial to beat, checking a file's length could be circumvented by inserting silence, and so on. Since this creates an average of sample data, a song would need to be changed quite a bit to fail to match. (The system is robust to, for instance, changes in bitrate, slowing the music down, and rearranging bits of the song or putting it in reverse.) We didn't have enough "derivative" music to test how it handles sampled music vs. the original -- it depends how much is changed.
Finally, this sort of system is useful for much more than song identification. You can build a model for an artist or genre and determine how to classify the song. One of my focuses in the paper is unsupervised genre classification -- my tests indicated some fairly reasonable groupings. This technique could be used for music recommendation -- "You like Dropkick Murphys? Well, they sound like Flogging Molly, so you might want to check them out."
Ceci n'est pas une signature.
Shawn Fanning did not invent P2P.
I'm sure lots of people around here already know this, but Shawn Fanning's service wasn't even P2P. It used a client-server model, which turned out to be its Achilles' heel. Killing a service based on that model is a simple matter of removing the servers, the vast majority of which were owned by Napster. That's why P2P has become the preferred method for trading: it suffers from no such weakness, since all nodes have to be removed individually.