Full-Text Audio Search
Captain Chad writes "The latest print edition (12/16/2002) of InfoWorld has an interesting article about an audio search program by Fast-Talk Communications. (The article is not yet available on the InfoWorld web site, but the Fast-Talk site has some good info, including a downloadable trial version.) The product works by breaking the audio stream into phonemes, which are the 'basic units of sound in a language.' The search is then performed for a specific sequence of phonemes. This method is said to be faster than, and far superior to, traditional audio searches, which convert the audio to text and then perform a normal text search. The author of the InfoWorld article, Jon Udell, tried a variety of searches that were surprisingly successful. If this technology is as good as he claims, there is a reasonable chance it will revolutionize the way we store data. Maybe there will even be an 'Audio' tab on Google." Here's the InfoWorld article.
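To make the idea in the summary concrete, here is a minimal sketch of searching a phoneme stream for a phoneme sequence. It is not Fast-Talk's method, which the summary does not describe in detail. The pronunciation table, the ARPAbet-style symbols, and the names `to_phonemes`, `edit_distance`, and `find_phrase` are all assumptions made for illustration, and the fuzzy matching by edit distance is a swapped-in technique to tolerate misrecognized sounds.

```python
# Toy pronunciation table (hypothetical; ARPAbet-style symbols).
PRONUNCIATIONS = {
    "audio": ["AO", "D", "IY", "OW"],
    "search": ["S", "ER", "CH"],
}


def to_phonemes(phrase):
    """Turn a text query into one flat phoneme sequence."""
    phones = []
    for word in phrase.lower().split():
        phones.extend(PRONUNCIATIONS[word])
    return phones


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # drop x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute
        prev = cur
    return prev[-1]


def find_phrase(stream, phrase, max_errors=1):
    """Slide a query-sized window over the stream and keep close matches.

    Returns (start_index, distance) pairs.
    """
    query = to_phonemes(phrase)
    n = len(query)
    hits = []
    for start in range(len(stream) - n + 1):
        distance = edit_distance(query, stream[start:start + n])
        if distance <= max_errors:
            hits.append((start, distance))
    return hits


# Assumed recognizer output; "SH" stands in for a misheard "CH".
stream = ["HH", "AH", "L", "OW", "AO", "D", "IY", "OW",
          "S", "ER", "SH", "T", "UW"]
print(find_phrase(stream, "audio search"))
```

Note that the query never has to be a word the recognizer knows; it only has to be expressible as sounds, which is presumably part of the appeal over a speech-to-text approach.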
How long before the feds start digitizing all of our telephone conversations and using this technology to google our private conversations?
Yay!
I don't know much about the subject, but isn't this the method used to convert speech to text? Sounds to me like it's the only way to do it...comparison of one sequence of phonemes to another, except that each word in the dictionary is associated with a sequence of phonemes. And that's why you're required to "train" the software with your own voice/accent.
Somebody who knows about the subject, please post and explain the process.
Warning: Opinions known to be heavily biased.
... Or imagine Google recording all possible audio streams (TV, radio, ... streets?) and allowing us to search those? All it takes is enough processors, a bit of wiring...
Now if you record street conversations or all types of public conversations... Do a search on 'bomb'... How appealing is that to Big Brother?
All right... I'm learning sign language. Now.
I just hope one of those nuisance lawsuits from Tzsvestaeya Zolskovova, the eccentric widow of Sergei Zolskovova (the Russian linguist who coined the word phoneme), over the use of the term "phoneme" doesn't hobble progress in this fascinating area.
Someone mentioned it can be used by the government for TIA stuff - agreed, but same with any technology. It has its positive and negative uses. I don't think we are all going to revert to cavemen to get away from it.
Random is the New Order.
Aside from searching for music, I can see this being really useful in web conferencing software. Consider this:
You hold a meeting where each person's channel is recorded and stored as part of the meeting info. Upon saving the meeting minutes, the software builds a phonetic index of the entire conversation.
Searches later on would be no more taxing on the server than a fulltext search in MySQL is now.
Useful? Definitely. And that's just one possibility.
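A rough sketch of what such a phonetic index could look like, assuming the recognizer hands back a list of phoneme symbols plus a start time for each one. The n-gram inverted index and the names `build_index` and `search` are my own illustration, not anything the product is known to do.

```python
from collections import defaultdict


def build_index(symbols, n=3):
    """Map every run of n phonemes to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(symbols) - n + 1):
        index[tuple(symbols[i:i + n])].append(i)
    return index


def search(index, symbols, times, query, n=3):
    """Return the start times (ms) at which the query sequence occurs.

    The index narrows the search to positions sharing the query's first
    n phonemes; each candidate is then checked against the full query.
    """
    if len(query) < n:
        raise ValueError("query must contain at least n phonemes")
    candidates = index.get(tuple(query[:n]), [])
    return [times[i] for i in candidates
            if symbols[i:i + len(query)] == query]


# Assumed recognizer output for one meeting channel: "budget meeting budget".
symbols = ["B", "AH", "JH", "AH", "T", "M", "IY", "T", "IH", "NG",
           "B", "AH", "JH", "AH", "T"]
times = [i * 100 for i in range(len(symbols))]  # made-up timestamps in ms

index = build_index(symbols)
print(search(index, symbols, times, ["B", "AH", "JH", "AH", "T"]))
```

The index is built once when the minutes are saved, so a later search is a dictionary lookup plus a few short comparisons, which is the same shape of work as a fulltext lookup.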
The basic idea of using audio similarity to "grep" short sounds out of audio streams (as opposed to using ASR and text-matching) is quite old - some classic papers based on dynamic time warping date back to 1977, and HMMs became popular for this application about ten years after that. Papers on this kind of thing appear at conferences like ICASSP - look for keywords like "keyword spotting" or "wordspotting." The phone company wanted to do this for obvious reasons.
Note that I'm not saying the GATech technology used by this company is derivative - I haven't looked at the specifics of this approach.
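For anyone who hasn't seen dynamic time warping, here is a minimal sketch of the classic approach, under heavy simplifying assumptions: each audio frame is reduced to a single number (real systems compare vectors of spectral features), and the names `dtw_distance` and `spot` and the fixed window size are mine. It says nothing about how the GATech technology works.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of frame values.

    Frames may be stretched or compressed in time, so the same word
    spoken slowly or quickly can still align at low cost.
    """
    inf = float("inf")
    rows, cols = len(a), len(b)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat frame of b
                                 cost[i][j - 1],      # repeat frame of a
                                 cost[i - 1][j - 1])  # advance both
    return cost[rows][cols]


def spot(template, stream, window, threshold):
    """Slide a window over the stream; report where the template fits."""
    hits = []
    for start in range(len(stream) - window + 1):
        d = dtw_distance(template, stream[start:start + window])
        if d <= threshold:
            hits.append((start, d))
    return hits


template = [1, 2, 3]              # the short sound to "grep" for
stream = [0, 0, 1, 2, 2, 3, 0, 0]  # contains a time-stretched copy
print(spot(template, stream, window=4, threshold=0))
```

The stretched copy `[1, 2, 2, 3]` matches at zero cost even though it is one frame longer than the template, which is the whole point of warping the time axis rather than comparing frame by frame.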
Well, I'd personally love to have an audio search tool to comb through all the mp3 files of talk radio programs such as *Loveline*, *Opie & Anthony*, and *The Greaseman*, which I have. Sometimes I think, "Now which show had that cool bit about..." and I have no hope of finding it.
For a professional rather than personal use, imagine how useful this could be to radio stations that keep digital archives of their programs. If someone wanted to look up a particular program based on a vague memory of something that was said, a tool like this would be invaluable.
Chasing Amy
(We all chase Amy...)
"The more corrupt the state, the more numerous the laws"-Tacitus