Slashdot Mirror


Google Researchers Create TV Audio Analysis System

segphault writes "Ars Technica reports on a paper (PDF) about ambient audio analysis authored by Google researchers. The system described in the paper can effectively determine what television show a user is watching just by capturing a short audio clip. The paper explains how a regular computer microphone can be used to record an audio clip that is then converted into a statistical data summary and transmitted to a remote server which matches the clip against archived data in order to ascertain which TV show it is associated with. Apparently, the system is fully viable, and other kinds of ambient noise don't negatively impact its accuracy. The paper also describes how web services can provide contextually relevant information based on a consumer's television viewing activities."

8 of 108 comments

  1. This already exists? by abigsmurf · · Score: 3, Interesting

    There's a system in the UK where you can go out clubbing, hear a song you like, dial a number and hold the phone out to the music and it'll text you the name of the song. Assuming they don't hire scores of extremely knowledgeable music buffs with quick fingers, surely it's a very similar system. TV dialogue may be less distinctive to the human ear but to a computer it just means a larger amount of data to search through.

    1. Re:This already exists? by Nimloth · · Score: 3, Informative

      Not quite, the service you mentioned recognizes a sound clip against its more or less exact replica in the (large) database.
      This here matches a sound clip to a pattern to find the TV show, meaning it doesn't have all the current episodes of the program in its database, it just has statistical data and patterns which help it match the audio. The latter could successfully match new (live) episodes without having the database updated. Your tune system wouldn't.

    2. Re:This already exists? by asuffield · · Score: 3, Informative

      I always wanted to have the ability to "hash" songs, and come up with an algorithm that would be robust enough to work across multiple codecs and encoding options, different (relative) normalizations, and maybe even be able to handle empty space at the beginning and/or end of the song.

      It's been done. Here's a system where you can hum a tune and it tells you the song: http://www.musipedia.org/

      Current systems are mostly based on pitch changes, so they aren't perfect (especially with the recycled slush turned out by low-grade high-visibility pop acts), and largely useless for rap, but they mostly work. There are numerous variations on the system; this is just one of the more significant ones that is publicly available on the web.

      I would think by making a hash based on values relative to sound signatures within the clip this might be possible, but I don't really know how this stuff works

      What Google is doing may or may not be related. They might instead be using a form of speech recognition technology, or a combination of both, or something else entirely.
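To make the "based on pitch changes" idea concrete, here is a minimal Python sketch. It assumes the hummed tune has already been reduced to a list of note pitches (MIDI numbers here), which is the hard part and is not shown. The contour encoding is Parsons code (U for up, D for down, R for repeat). The function names and the tiny catalog are hypothetical, and nothing here is claimed to be how Musipedia or Google actually does it.

```python
def parsons_code(pitches, tolerance=0.5):
    """Encode a pitch sequence as a melodic contour.

    Only the direction of each pitch change is kept, so the code is the
    same whatever key the tune is sung in. `tolerance` (in semitones) is an
    assumed threshold below which two notes count as the same pitch.
    """
    code = ["*"]  # conventional marker for the first note
    for prev, cur in zip(pitches, pitches[1:]):
        delta = cur - prev
        if delta > tolerance:
            code.append("U")
        elif delta < -tolerance:
            code.append("D")
        else:
            code.append("R")
    return "".join(code)


def edit_distance(a, b):
    """Levenshtein distance, so a few wrong or missing notes still match."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def best_match(query_pitches, catalog):
    """Return the catalog title whose stored contour is closest to the query."""
    query = parsons_code(query_pitches)
    return min(catalog, key=lambda title: edit_distance(query, catalog[title]))
```

Because only the direction of each change survives, the sketch is indifferent to key and to slightly off-pitch humming. It also throws away rhythm and interval size, which may be part of why contour-only matching struggles with music that has little melodic movement.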

  2. I thought something like this was up! by Anonymous Coward · · Score: 3, Funny

    Is THIS why Google has been returning so many porn sites on my searches lately?

  3. Uses & Motives? by eldavojohn · · Score: 4, Insightful

    This seems like a not too complicated idea. You create an inexpensive operation that extracts what features you want from the sound data. Most importantly, you avoid features that are prone to randomness and entropy. It would take some research to figure out what the best features are and that's the audio fingerprint.

    Since Google has more storage than you can imagine, they can most likely apply this fingerprinting technique to every episode of every major show. Then they host the fingerprints in Google style and use their patented "Google Technology" to search it much the same way web content is searched.
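The extract-features-then-search idea above can be sketched in a few lines of Python. This is a toy under loud assumptions, not the method from the paper: the "feature" is one bit per frame saying whether signal energy rose or fell, the archive is a plain dict of precomputed fingerprints, and the clip is assumed to start on a frame boundary. All names (`fingerprint`, `identify`, `frame_size`) are made up for illustration.

```python
def fingerprint(samples, frame_size=4):
    """Reduce raw samples to a list of bits, one per pair of adjacent frames.

    Each bit is 1 if the frame's energy is higher than the previous frame's.
    Comparing neighbours (rather than storing absolute energy) keeps the
    bits unchanged when the whole clip is simply louder or quieter.
    """
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return [1 if b > a else 0 for a, b in zip(energies, energies[1:])]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def identify(clip, archive, frame_size=4):
    """Slide the clip's fingerprint along every archived fingerprint.

    Returns (title, frame_offset, distance) for the closest window found,
    or None if no archived fingerprint is long enough to compare against.
    """
    query = fingerprint(clip, frame_size)
    best = None
    for title, reference in archive.items():
        for offset in range(len(reference) - len(query) + 1):
            window = reference[offset:offset + len(query)]
            distance = hamming(query, window)
            if best is None or distance < best[2]:
                best = (title, offset, distance)
    return best
```

In use, you would fingerprint each archived show once, store the results in `archive`, and call `identify` with a short recorded clip; a small distance suggests a match and the offset says roughly where in the show the clip came from. A real system would need features that survive room noise and misaligned frames, plus an index instead of this brute-force scan.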

    Why would you want this? Well, there's the obvious marketing ploys. You know that people who watch Dharma & Greg like to shop at Trader Joe's and like Odwalla brand food so you offer free episodes of Dharma & Greg with only Trader Joe's & Odwalla ads. You let the sponsors (Trader Joe's and Odwalla) foot the bill for the bandwidth/royalties or whatever.

    The second useful implication would be cross suggesting shows to a user based on random sampling of the shows. You could allow users to watch old TV shows on the internet and then build a profile of them and their shows. Much like how Amazon works, you could then suggest other shows, other DVDs of shows or perhaps build a site that randomly shows the user episodes that they might like based on prior viewings and statistics of other users.

    The takeaway from this article for me was the fact that Google has a vested interest in archiving and now television will be archived Google style.

    I can't think of many other uses for this as the system isn't really "inferring" or "thinking" about data samples but is more so matching extracted features against a database. You know, voice recognition software allows for decent voice fingerprinting. You could most likely easily identify characters based on voices (but not actors due to stars like Hank Azaria who do multiple voices). Then you wouldn't need a database of all shows but more so just a database of character voice fingerprints. I would find this sort of approach more interesting but less specific and useful.

    Aside from showing this off to your friends, it's not very useful. What I personally would like to see this new Google strategy applied to is all the tapes recorded of famous people like the United States Presidents. If you divided those up into sessions and I was listening to a particular tape of the Nixon set where he talked about the "new right", perhaps a database with references would then point me to some tapes or materials on Joe McCarthy's staunch views on the right.

    --
    My work here is dung.

  4. Re:Great... by Anonymous Coward · · Score: 3, Insightful

    Keeping piracy out of Google Video.

  5. recognizing sound samples by mstrcat · · Score: 4, Insightful

    I don't watch TV much, so I couldn't care less about identifying the TV shows. But what I really would like is an app that would accurately identify mp3 files and apply artist, track #, etc. I've tried a few of the available programs such as Replay Music and their accuracy is horrid. Maybe Google can do it better. Of course the other use I see for this is identifying music in movies and older TV shows. Newer TV shows do a great job of identifying music, but some older shows (season 1 of The Wire) have great music clips that aren't named in the credits.

  6. Re:Nielsen by apnielsen · · Score: 3, Interesting

    Portable People Meters belong to Arbitron, not Nielsen Media.

    Not sure about PPM's tech, but Nielsen's A/P meter does exactly what TFA describes. That's the only way Nielsen Media could roll out Time Shifted Viewing at all (disclosure: I work for them). To say that Google "created" it is an insult to the people I work with every day.

    I see a patent suit in Google's future. As much as I hate patents and like Google, I'd like to at least see some full disclosure here. To (erroneously) state on one hand that they invented the technology and then admit (on page 4 of the PDF) that they intend to compete with the actual inventors, they're begging to get sued anyway.