Google Researchers Create TV Audio Analysis System
segphault writes "Ars Technica reports on a paper (PDF) about ambient audio analysis authored by Google researchers. The system described in the paper can effectively determine what television show a user is watching just by capturing a short audio clip. The paper explains how a regular computer microphone can be used to record an audio clip that is then converted into a statistical data summary and transmitted to a remote server which matches the clip against archived data in order to ascertain which TV show it is associated with. Apparently, the system is fully viable, and other kinds of ambient noise don't negatively impact its accuracy. The paper also describes how web services can provide contextually relevant information based on a consumer's television viewing activities."
Big Brother is listening to you!
There's a system in the UK where you can go out clubbing, hear a song you like, dial a number and hold the phone out to the music and it'll text you the name of the song. Assuming they don't hire scores of extremely knowledgeable music buffs with quick fingers, surely it's a very similar system. TV dialogue may be less distinctive to the human ear but to a computer it just means a larger amount of data to search through.
Is THIS why Google has been returning so many porn sites on my searches lately?
Off the top of my head:
A PVR that doesn't need to rely on blind luck and often incorrect listings to know if it's recording the right thing.
My Tivo often mischannels to PBS. I'm pretty sure this algorithm should be able to tell Family Guy from the "Boring ass old people talking about politics hour".
I've had enough abrasive sigs. Kittens are cute and fuzzy.
This seems like a not too complicated idea. You create an inexpensive operation that extracts what features you want from the sound data. Most importantly, you avoid features that are prone to randomness and entropy. It would take some research to figure out what the best features are and that's the audio fingerprint.
Since Google has more storage than you can imagine, they can most likely apply this fingerprinting technique to every episode of every major show. Then they host the fingerprints in Google style and use their patented "Google Technology" to search it much the same way web content is searched.
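To make the fingerprint-and-search idea above concrete, here is a minimal sketch in Python. It is not the method from the paper: the feature (whether loudness rose or fell between adjacent frames) is just one guess at a cheap feature that survives a change in volume, and `FRAME`, `fingerprint`, `best_match` and `synthetic_show` are made-up names for illustration. The "shows" are synthetic noise rather than real audio.

```python
import random

FRAME = 256  # samples per frame; an arbitrary choice for this sketch


def frame_energies(samples, frame=FRAME):
    """Energy (sum of squares) of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame])
        for i in range(0, len(samples) - frame + 1, frame)
    ]


def fingerprint(samples):
    """One bit per pair of adjacent frames: 1 if energy rose, else 0."""
    e = frame_energies(samples)
    return [1 if b > a else 0 for a, b in zip(e, e[1:])]


def best_match(clip_fp, database):
    """Slide the clip over every stored fingerprint.

    Returns (name, frame_offset, hamming_distance) for the closest
    alignment, or None if no stored fingerprint is long enough.
    """
    best = None
    n = len(clip_fp)
    for name, full_fp in database.items():
        for off in range(len(full_fp) - n + 1):
            dist = sum(a != b for a, b in zip(clip_fp, full_fp[off:off + n]))
            if best is None or dist < best[2]:
                best = (name, off, dist)
    return best


def synthetic_show(seed, frames=100):
    """Stand-in for real audio: noise with a random loudness per frame."""
    rng = random.Random(seed)
    out = []
    for _ in range(frames):
        loudness = rng.uniform(0.1, 1.0)
        out.extend(loudness * rng.uniform(-1.0, 1.0) for _ in range(FRAME))
    return out
```

You would fingerprint every stored show once, then fingerprint the recorded clip and call `best_match` on it. The sketch assumes the clip starts exactly on a frame boundary and searches by brute force. A real system would need features that tolerate misalignment and an index instead of a linear scan.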
Why would you want this? Well, there are the obvious marketing ploys. You know that people who watch Dharma & Greg like to shop at Trader Joe's and like Odwalla brand food so you offer free episodes of Dharma & Greg with only Trader Joe's & Odwalla ads. You let the sponsors (Trader Joe's and Odwalla) foot the bill for the bandwidth/royalties or whatever.
The second useful implication would be cross-suggesting shows to a user based on random sampling of the shows. You could allow users to watch old TV shows on the internet and then build a profile of them and their shows. Much like how Amazon works, you could then suggest other shows, other DVDs of shows or perhaps build a site that randomly shows the user episodes that they might like based on prior viewings and statistics of other users.
The takeaway from this article for me was the fact that Google has a vested interest in archiving and now television will be archived Google style.
I can't think of many other uses for this as the system isn't really "inferring" or "thinking" about data samples but is more so matching extracted features against a database. You know, voice recognition software allows for decent voice fingerprinting. You could most likely easily identify characters based on voices (but not actors due to stars like Hank Azaria who do multiple voices). Then you wouldn't need a database of all shows but more so just a database of character voice fingerprints. I would find this sort of approach more interesting but less specific and useful.
Aside from showing this off to your friends, it's not very useful. What I personally would like to see this new Google strategy applied to is all the tapes recorded of famous people like the United States Presidents. If you divided those up into sessions and I was listening to a particular tape of the Nixon set where he talked about the "new right", perhaps a database with references would then point me to some tapes or materials on Joe McCarthy's staunch views on the right.
My work here is dung.
Designed to maximize user privacy while minimizing dependency on unique hardware, the system described in the paper seems interesting and feasible. In order to protect user privacy, the software uses "summary statistics" automatically generated from ambient audio rather than transmitting an actual recording. The actual audio cannot be extrapolated from the summary statistic data, so the system doesn't "overhear" or transmit user conversations.
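The claim that the audio can't be recovered from the summary comes down to the summary being many-to-one. The Python sketch below illustrates that idea only. It says nothing about the statistics the paper actually uses, and `summarize` and `scramble_within_frames` are hypothetical helpers. Each frame is reduced to a single quantized loudness value, so a badly scrambled waveform produces exactly the same summary as the original.

```python
import math
import random


def summarize(samples, frame=256, levels=8):
    """Reduce each full frame to one small integer: its quantized RMS loudness.

    Assumes samples lie in [-1.0, 1.0]. math.fsum is used so the result
    does not depend on the order of the samples within a frame.
    """
    summary = []
    for i in range(0, len(samples) - frame + 1, frame):
        chunk = samples[i:i + frame]
        rms = math.sqrt(math.fsum(x * x for x in chunk) / frame)
        summary.append(min(levels - 1, int(rms * levels)))
    return summary


def scramble_within_frames(samples, frame=256, seed=0):
    """Shuffle the samples inside each frame, destroying the waveform."""
    rng = random.Random(seed)
    out = []
    for i in range(0, len(samples), frame):
        chunk = samples[i:i + frame]  # slicing copies, so the input is untouched
        rng.shuffle(chunk)
        out.extend(chunk)
    return out
```

Since `summarize(scramble_within_frames(audio))` equals `summarize(audio)`, the server can't tell the two apart, which is the sense in which the summary doesn't carry the conversation. Whether a given real-world summary leaks anything else is a separate question.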
Still, if the data reveals what show the person is watching, your President or anyone else who gets to see the data might start treating you differently depending on what you are watching lately.
profiling
Will help to add metadata to all those mpeg4's you have bittorrented or recorded on your DVR
...I obey the laws of physics....
Keeping piracy out of Google Video.
Do you remember the VH1 show Pop-Up Video? They showed older music videos with popup balloons that gave extra information, like actors in the video that later became famous or mistakes made during production. If Google analyzed the sounds coming into your laptop and gave you a link to a site like the Internet Movie Database then you could have Popup Television. Learn more about the specific episode you are watching, and even have the ability to edit that information yourself.
It'd make an interesting toy. I'm sure that anyone with some imagination could think of even cooler applications.
AlpineR
In the meantime, I avoid non-free software and even have bad thoughts about my cell phone.
Friends don't help friends install M$ junk.
I wish it could get it right, and record my "dateline" and not,
football head baby and big fat cartoon man talking about his ass gas hour...
every day http://en.wikipedia.org/wiki/Special:Random
and while watching Fox News, I was pointed here: http://tinyurl.com/z9x2y
Don't fight for your country, if your country does not fight for you.
I don't watch TV much, so I couldn't care less about identifying the TV shows. But what I really would like is an app that would accurately identify mp3 files and apply artist, track #, etc. I've tried a few of the available programs such as Replay Music and their accuracy is horrid. Maybe Google can do it better. Of course the other use I see for this is identifying music in movies and older TV shows. Newer TV shows do a great job of identifying music, but some older shows (season 1 of The Wire) have great music clips that aren't named in the credits.
Not sure about PPM's tech, but Nielsen's A/P meter does exactly what TFA describes. That's the only way Nielsen Media could roll out Time Shifted Viewing at all (disclosure: I work for them). To say that Google "created" it is an insult to the people I work with every day.
I see a patent suit in Google's future. As much as I hate patents and like Google, I'd like to at least see some full disclosure here. To (erroneously) state on one hand that they invented the technology and then admit (on page 4 of the PDF) that they intend to compete with the actual inventors, they're begging to get sued anyway.
This very statement presupposes that other noise is irrelevant, which seems bogus.
Snoring is background noise, and suggests non-watching.
Laughter is background noise, and suggests careful watching.
Of course, the laughter might not be about what's on TV...
It seems to me that watching is an activity involving the eyes and mental processing. It seems to me that audio of what is coming out of the TV is not a statement about either the eyes or about mental processing. This technology of Google's may be an advance in something, but I hope the advertisers paying for this data have their eyes open about the nature of what they are buying because (to re-mix a metaphor) to my eyes this sounds a bit suspect.
Sociologically, it sounds like a foot in the door to get harmless censors in place. Oops, Freudian slip there. That's sensors, I mean. Google would never involve itself with censorship.
Once the sensors are in place, when "we" realize that it's not getting "us" the data "we" want, we'll just do a few "harmless" downloads of "upgrades", perhaps causing a minor tweak to look at the video data rather than the audio, or perhaps doing language processing after all, and ... With user-friendly software like this, who needs spyware?
I also question the claim that because no information is transmitted back to Google that this is the definition of not invading privacy. How is this fundamentally different than the claim that if the police search your house but find nothing, they have not invaded your privacy because they've not placed any record of illegal activity on your permanent record?
It seems to me that once you place a Turing Machine into someone's environment, capable of doing arbitrary processing, and all it sends is a sanitized report, you have all the mechanism in place for abuse. What if the Turing Machine, capable of arbitrary processing, decides that it doesn't want to send a sanitized report? Who is auditing what is sanitized and what is not?
What if it turns out to later be possible to lift information from the supposedly cleansed records? Who will audit the use of that data?
There seem to me to be a lot of slippery slopes here.
Kent M Pitman
Philosopher, Technologist, Writer
I'd hate google desktop (or any other google utility) spying on my mic to discover my musical preferences or anything else. no tv in my home, but what about the speed at which i type or the general noise in my home or how often my phone goes off or how hard or long my baby cries.. do not listen on my mic, please: 'click' . imagine how many things can be recorded and easily recognized in a home. and many a pc/laptop/headset has a built-in mic, useful for skype, which can thus be used. horror.
I'd like to implement something like this for myself, but with conversational noise instead of TV. I sometimes use my laptop as a visual aid during conversations in my living room. If we're talking about a particular topic, I may pull up a relevant wikipedia article, or something like that. I wouldn't mind if this were more automated.
I can envision running a speech-to-text translator on my laptop mic and then piping that text into my beagle desktop searcher, or maybe even one of those google desktop search tools on windows. I'd rather not send this data to google, for privacy reasons, though.
I could see this being useful at work, or in a conference or class, too. I could stand to have relevant pieces of notes that I took from previous classes pulled up when my professor mentions a particular topic.
Anyone know of a tool or project like this?
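For what it's worth, the local half of that pipeline (transcribed speech in, relevant notes out) is easy to rough out. This is a minimal Python sketch that keeps everything on your own machine. It assumes the speech-to-text step already exists and hands over plain text, and `keywords`, `rank_notes` and the tiny stopword list are invented for illustration, not taken from beagle or any Google tool.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "that", "this", "for", "on", "with", "about", "was", "we", "i",
}


def keywords(text):
    """Lowercased words from the text, minus the stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def rank_notes(transcript, notes):
    """Order note titles by how many spoken keywords each note contains.

    `transcript` is assumed to be text already produced by some
    speech-to-text step, which this sketch does not implement.
    `notes` maps a title to the note's body text.
    """
    spoken = Counter(keywords(transcript))
    scores = {}
    for title, body in notes.items():
        note_words = set(keywords(body))
        scores[title] = sum(n for word, n in spoken.items() if word in note_words)
    hits = [title for title in scores if scores[title] > 0]
    return sorted(hits, key=lambda title: -scores[title])
```

Feed it the last minute or so of transcript every few seconds and show the top hit. Plain word overlap is crude, so a proper desktop search index would do the ranking much better.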
So to sum this up: I give up my privacy at home. For...better targeted ads?
I'm very skeptical this wouldn't be abused - if not by Google, then by someone else. And even if this is not abused, I run the risk for what?
I don't like ads now.
Everyone who loves the idea of personalized ads, put up your hand!
----------
From the other side, what will your friends think when that "random" ad for viagra pops up?
-- Life is good. Tastes like chicken.