Searching Sound
Technology Review has one of their few stories that's not registration-required describing searching audio files for any specified set of sounds. All sorts of interesting applications become possible if you can turn analog audio into a digitally-useful product without massive human intervention.
Tsunami -- You can't bring a good wave down!
Aren't CDs just that? If you really want to make a more useful digital product, start scouting for some new talent. American/Pop Idol isn't cutting it. :)~
I work for a CT (Computer Telephony) company (see comment on story from half an hour ago). My company does soundex, phonex, and some proprietary stuff too, to convert recorded phone calls into the text of the call, regardless of noise, tone, etc. Useful for your friendly government to spy on you. This is really old news.
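For anyone who hasn't run into Soundex: it is the classic phonetic hash for spelled words, so it works on text rather than raw audio. Here is a minimal sketch of the standard American Soundex rules. It is only an illustration of the general idea and says nothing about how any particular company's product works; the function name `soundex` is my own.

```python
# Classic American Soundex: first letter plus three digits.
# Illustrative sketch only; not any vendor's implementation.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code for a word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")  # the first letter's code also blocks repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeat after a vowel is coded again
    return (result + "000")[:4]  # pad with zeros, truncate to four characters


print(soundex("Robert"), soundex("Rupert"))  # both names map to the same code
```

Names that sound alike collapse to the same key, which is why it is handy for fuzzy lookups once speech has already been turned into words.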
Those who can, do. Those who can't, consult.
When you think about it, though, government and military agencies must have had this for quite some time.
Tapping and bugging really does no good unless you've got someone listening all the time - and that's both expensive and impossible. While I realize that with a tap someone only has to be listening whenever a phone call is made, the outcome is still far more hours of audio than are feasible to search and use.
If we couldn't have searched audio on a wide scale before, then I find it hard to believe we'd ever be catching anyone by specific phone intercepts. Instead, we'd just be using that sort of thing as evidence.
I mean, I realize this is a great technology, I just doubt it's as "new" as it seems...
Is this sponsored by the RIAA so they can detect Metallica
files named 1.x or 8TyX.jpg?
...that I can finally find that one song that goes Wagga-chigga wa! Wagga-chigga wa! Wagga-chigga wa-wa! Thoomp! Meedly-meedly-meedly-meedly! Meedly-meedly-meedly-meedly meedly-meedly-meedly-meedly meeeeeeee!!
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
This is a step towards full voice control of systems. I have always felt that computers will not have truly come of age until they are voice controlled. For general business use, all other forms of interaction are a compromise. The future I look forward to is full voice control of systems, probably via a discreet headset so the box next door to you doesn't start typing your letter. It will then be possible to have truly 'affordant' systems.
It would be cool if we're able to actually 'search' for any soundbytes. Even with altered speed / tone.
Listening to all those techno remixes, I always have a hard time trying to find out where those cute background soundbytes came from...only to find out it was a heavily distorted Mozart or a mixed up vocal of JFK.
Let's hope this doesn't make pointy head bosses think they can store customer information as a blob of speech data...
Yes, and /. isn't, you insensitive clod.
When anger rises, think of the consequences.
Confucius (551 BC - 479 BC)
If you really want to find out how it works:
;)
Links to PS and PDF files are on this page
http://www.cs.waikato.ac.nz/~nzdl/publications/
(They are not going to like what I am about to do to their server.)
It would be nice if there were a search engine exclusively for that - instead of typing "linus torvald linux .au", you would navigate to a subdirectory called 'pronunciations', pick the audio format and voila...
If you can search websites and images, jobs and news articles, sound bytes would be the logical next step.
SEO Copywriter. Just Say ON
(n/t)
A company called Fast-Talk Communications has a set of tools that they resell for 3rd-party apps for things like searching interviews for specific words that were said. I have actually seen this feature used in some newsroom software made by Dalet Digital Media and it was amazing to see in action. Very fast and accurate.
The research for the fast-talk technology was done at Georgia Tech's Interactive Media Technology Center (IMTC). They've got a page about the corporate spin-off of the technology.
Then on the second page, they say that some sort of pre-processing is needed:
So I see no revolution here... it's just about indexing the phonemes of an audio stream and then searching these, right?
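To make that concrete, here is a rough sketch of "index the phonemes, then search them". It assumes some recognizer has already produced a list of (phoneme, start time) pairs; the phoneme symbols, the timings and the function names `build_index` and `search` are all made up for illustration and are not taken from the article.

```python
from collections import defaultdict


def build_index(phonemes):
    """Map each phoneme symbol to the positions where it occurs in the stream."""
    index = defaultdict(list)
    for pos, (symbol, _start_time) in enumerate(phonemes):
        index[symbol].append(pos)
    return index


def search(phonemes, index, query):
    """Return start times where the query phoneme sequence occurs exactly."""
    if not query:
        return []
    hits = []
    # Only positions of the first query phoneme are candidates.
    for pos in index.get(query[0], []):
        window = [sym for sym, _ in phonemes[pos:pos + len(query)]]
        if window == query:
            hits.append(phonemes[pos][1])
    return hits


# Hypothetical recognizer output: (phoneme, start time in seconds).
stream = [("HH", 0.00), ("UW", 0.08), ("Z", 0.21),
          ("DH", 0.40), ("EH", 0.47), ("R", 0.55),
          ("HH", 1.10), ("UW", 1.18), ("Z", 1.30)]
index = build_index(stream)
print(search(stream, index, ["HH", "UW", "Z"]))  # [0.0, 1.1]
```

The expensive part is the one-time pass that produces the stream; after that a query is just a lookup plus a short comparison. A real system would presumably need fuzzy matching on top, since recognizers make mistakes.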
But I don't say "hoos".
Music wants to be free.
I wonder if the RIAA will throw money at this type of technology, to help catch "pirates" who might otherwise escape by subtly transmogrifying their shared MP3s. Or maybe it already has?
#include "disclaim.h"
"All the best people in life seem to like LINUX." - Steve Wozniak
This seems like it may be a good tool to use for learning how animals communicate with each other. Just an idea.
I've always been intrigued by this kind of technology (and eventually would like to build my own product). It would be nice to know what system architecture other people have used (and algorithms, etc). I imagine that searching for sounds is actually the conceptually easy part. I'm more interested in the precise categorization of sounds. Does this turn an audio file into a single stream of phonemes/sounds or maybe a stream of sets of possible or concurrent sounds or ...?
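I don't know which of those the product uses, but the "stream of sets of possible sounds" option is easy to sketch. Assume each time slot holds several candidate phonemes with a confidence each (all symbols, numbers and the names `match_score` and `fuzzy_search` below are hypothetical). A query is then scored by multiplying the confidences of its phonemes slot by slot:

```python
def match_score(slots, start, query):
    """Product of the confidences of the query phonemes, slot by slot."""
    score = 1.0
    for offset, symbol in enumerate(query):
        # A phoneme that is not a candidate in its slot contributes 0.
        score *= slots[start + offset].get(symbol, 0.0)
    return score


def fuzzy_search(slots, query, threshold=0.1):
    """Return (start slot, score) pairs whose score reaches the threshold."""
    hits = []
    if not query:
        return hits
    for start in range(len(slots) - len(query) + 1):
        score = match_score(slots, start, query)
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical recognizer output: one dict of candidates per time slot.
slots = [
    {"K": 0.6, "G": 0.4},
    {"AE": 0.9},
    {"T": 0.5, "D": 0.5},
]
print(fuzzy_search(slots, ["K", "AE", "T"]))
print(fuzzy_search(slots, ["G", "AE", "D"]))
```

Both queries match the same audio with different scores, which is the point: the index keeps the ambiguity instead of committing to one guess, at the cost of more storage and more false hits at low thresholds.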
I'm not saying this product should be open-sourced like some zealots around here, but I do think such a (cheaply) available product could change people's lives. Imagine being able to search for precise sounds/words in your favorite movie or news broadcasts (assuming you have a continuously recording device for a specific channel). Anytime your favorite news topic was mentioned, the whole story could be recorded and saved just by noting where the story transitions took place. (This might take a bit of analysis to see what sort of indicators might be used for story transition, but I don't think that it would prove unmanageable.) Furthermore, a Gillette/id Software/whatever type approach could be used to market such a product. The digital audio to sound categorization software could be free/open source. The strategies to implement certain uses (like that mentioned above) could be closed source.
(One might be able to accomplish the same feat by analyzing the Closed Caption signal many news shows provide, so maybe this wasn't a good example; however, some shows (like Good-Day here in the Dallas-Fort Worth area) don't have a CC.)
Web Images Sounds Groups Directory News
__porn_blowjob_"money_shot"__ [I'm Feeling Lucky]
I can see it now ...
-- Dossy
Dossy's Blog
Apple's Final Cut Pro's Soundtrack feature does this. It automatically categorizes imported samples depending on the instrument in them. Then, you can search on that.
mbbac
Has anyone heard about this company Polyphonic HMI (www.polyphonichmi.com) that claims to be able to create a digital 'fingerprint' of music (beat, melody, etc.) and identify potential 'hits'? Anybody know how they're claiming to do it?
I know it's just indexed by name, but this can be useful.
http://findsounds.com/