Slashdot Mirror


Searching Sound

Technology Review has one of their few stories that isn't registration-required, describing how to search audio files for any specified set of sounds. All sorts of interesting applications become possible if you can turn analog audio into a digitally useful product without massive human intervention.

14 of 68 comments

  1. been there, done that by shachart · · Score: 2, Informative

    I work for a CT (Computer Telephony) company (see comment on story from half an hour ago). My company does soundex, phonex, and some proprietary stuff too, to convert recorded phone calls into the text of the call, regardless of noise, tone, etc. Useful for your friendly government to spy on you. This is really old news.
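    For readers who have not met Soundex, mentioned above: here is a minimal sketch of the classic American Soundex code, which maps similar-sounding names to the same four-character key. It is only an illustration of the general idea, not the poster's company's implementation (which is not described here), and the function name `soundex` is chosen for this example.

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex key for a name (sketch)."""
    # Consonant groups that share a digit in classic Soundex.
    groups = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
              ("L", "4"), ("MN", "5"), ("R", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""

    result = [letters[0]]              # the first letter is kept as-is
    prev = codes.get(letters[0], "")   # code of the previous coded letter
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        digit = codes.get(ch, "")      # vowels (and Y) get no digit
        if digit and digit != prev:
            result.append(digit)
        prev = digit                   # a vowel resets prev, so repeats count
    # Truncate or zero-pad to one letter plus three digits.
    return "".join(result)[:4].ljust(4, "0")


if __name__ == "__main__":
    for example in ("Robert", "Rupert", "Ashcraft"):
        print(example, soundex(example))
```

    Names that sound alike, such as "Robert" and "Rupert", end up with the same key, which is what makes the code useful for fuzzy lookups.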

    --
    Those who can, do. Those who can't, consult.
  2. When you think about it... by Ieshan · · Score: 4, Interesting

    When you think about it, though, government and military agencies must have had this for quite some time.

    Tapping and bugging really does no good unless you've got someone listening all the time - and that's both expensive and impossible. While I realize that with a tap someone only has to be listening when a phone call is actually made, the outcome is still far more hours of audio than are feasible to search and use.

    If we couldn't have searched audio on a wide scale before, then I find it hard to believe we'd ever be catching anyone by specific phone intercepts. Instead, we'd just be using that sort of thing as evidence.

    I mean, I realize this is a great technology, I just doubt it's as "new" as it seems...

    1. Re:When you think about it... by Chris_Stankowitz · · Score: 2, Interesting
      Tapping and bugging really does no good unless you've got someone listening all the time -

      It was done this way for many, many years. It is partly why many investigations took a long time to bear fruit. There are also laws in some states that do not allow a tap to continue if "xyz time" has passed without any useful information.

    2. Re:When you think about it... by arvindn · · Score: 2, Interesting
      I'm not sure.

      What's new about this technology is that it searches without transcription, working at the phoneme level instead. This doesn't mean that the results are more accurate than if transcription and indexing are used. It's just that the new technique has applications in some cases that can't be handled by the conventional method, like when your language model is inadequate and you would lose information by converting phonemes into lexical form.

      It's not clear how this sort of thing would be useful for the military. My guess is that for the purpose of espionage it would be much better to have the recording converted to text first.
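      To make "searching at the phoneme level" concrete, here is a minimal sketch. It assumes the audio has already been turned into a list of phoneme symbols by some recognizer (not shown), and it uses approximate substring matching with edit distance (Sellers' algorithm) so that a slightly misrecognized phoneme still produces a hit. The function name `fuzzy_find`, the phoneme symbols, and the sample data are all made up for illustration; this is not the method from the article.

```python
def fuzzy_find(stream, query, max_errors=1):
    """Find places where `query` occurs in `stream` with few edits.

    Both arguments are lists of phoneme symbols. Returns a list of
    (end_position, errors) pairs, where end_position is the index in
    `stream` at which an approximate match ends.
    """
    # col[i] = fewest edits needed to match query[:i] against some
    # stretch of the stream ending at the current position.
    col = list(range(len(query) + 1))
    hits = []
    for pos, heard in enumerate(stream):
        new = [0]  # an empty query prefix matches anywhere at no cost
        for i, wanted in enumerate(query, 1):
            new.append(min(
                col[i] + 1,                      # extra phoneme in stream
                new[i - 1] + 1,                  # query phoneme missing
                col[i - 1] + (wanted != heard),  # match or substitution
            ))
        col = new
        if col[-1] <= max_errors:
            hits.append((pos, col[-1]))
    return hits


if __name__ == "__main__":
    # Hypothetical recognizer output for "the package was dropped".
    stream = "DH AH P AE K IH JH W AA Z D R AA P T".split()
    # Query with one vowel transcribed differently from the stream.
    query = "P AE K AH JH".split()
    print(fuzzy_find(stream, query, max_errors=1))
```

      Because nothing is ever converted to words, a name or term that is missing from a recognizer's dictionary can still be found, as long as its phonemes come through roughly intact.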

    3. Re:When you think about it... by PerryMason · · Score: 2, Interesting

      The big problem with this sort of technology is that in the past when you wanted to tap someone, you had to have a good reason (good enough to persuade a judge anyway) and you had limits on what you could and could not record/listen to. Now with technology like this and Echelon etc, it becomes possible to monitor every person who makes a phone call or sends an email. In effect you are presumed guilty and have to prove your innocence by not discussing or committing a crime. One of the fundamental tenets of the western legal tradition is that you're presumed innocent until proven guilty, and technology like Echelon turns that right on its head. It's just another example of fundamental rights being subjugated for the purpose of protecting us from 'evil-doers' who will just end up using other methods of communication. Meanwhile it's you and I who end up losing our rights.

      It's similar in some ways to the mass DNA testing of populations to find a rapist, for instance. Every person submits a sample, but is it likely that the perpetrator is going to submit theirs? Every member of the population is presumed guilty until they prove themselves innocent, while the guilty simply refuse the test, and anyone who refuses the test, regardless of grounds, is tarred with the same brush.

      I honestly think we're at the beginning of a massive degradation of human rights (particularly privacy) as a result of both technological and global factors. Unless we do something to ensure our fundamental rights, it really won't be that long till 1984.

      --
      "I'm tired of all this 'Aren't humanity great' bullshit. We're a virus with shoes" - Bill Hicks
    4. Re:When you think about it... by npendleton · · Score: 2, Informative

      See or read "Killing Pablo" and then tell me what you think about catching an individual from an intercepted phone call. The U.S. Government poured top-flight resources (NSA and Delta Force) into the problem of helping a Colombian Government military unit find and kill drug kingpin Pablo Escobar. Escobar was killed by this Colombian military unit.

      This technology would help immensely with message analysis. Evaluating messages is typically divided into two areas: signal analysis and message analysis.

      Signal analysis looks at when and where the signal (phone, fax, email, ham radio, etc.) originated and where it went. Even if you can't read the messages, the signal analysis may be all that one needs.

      Message analysis means understanding the content of the message. Decrypting or deciphering the message is a common problem for text-based messages. Voice is much harder to scramble in telephone networks. Once a message is opened, can "voice matching" quickly and accurately discern who is speaking, regardless of whom signal analysis says the telephone numbers belong to? Indexing and phoneme transcription clearly help analysts search for instances and patterns. But this is not a word transcription; it is a phoneme transcription, which reduces the mountain of words in a language to 25 sounds. Searches can return too many hits, or none, because people use ambiguous pronouns or homonyms, like "He" instead of "Pablo" or "Their|There|They're", that make the meaning ambiguous for the search tool. Ultimately message analysis requires understanding the way people in an organization think and speak. Indexing and transcription technology can help, but it cannot replace human understanding. What does "I dropped off the package." mean to you?

      The other place where phoneme transcription could be helpful is the Foreign Broadcast Information Service (FBIS). The CIA set up FBIS during the Cold War to monitor news services around the world in native languages. FBIS helped monitor trends and propaganda.

      Mac Refugee, paper MCSE, Linux wanna be

  3. Does that mean... by Valdrax · · Score: 3, Funny

    ...that I can finally find that one song that goes Wagga-chigga wa! Wagga-chigga wa! Wagga-chigga wa-wa! Thoomp! Meedly-meedly-meedly-meedly! Meedly-meedly-meedly-meedly meedly-meedly-meedly-meedly meeeeeeee!!

    --
    If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").
  4. Re:Usage scenario by MacroRex · · Score: 3, Interesting

    (paranoia)
    No really, what if they start bugging public places where people talk a lot (bars etc) and run the output through something like this? After acquiring a speech sample from bank/airport/whatever and thus connecting it to a person, it's a breeze to have a global textual log of everything the person says in a public place.

    Of course, the article talks only about deconstructing the audio sample into words, but further analysis is a natural extension of the idea.
    (/paranoia)

  5. Google for sounds? by Shiranui · · Score: 2, Interesting

    It would be cool if we're able to actually 'search' for any soundbytes. Even with altered speed / tone.

    Listening to all those techno remixes, I always have a hard time trying to find out where those cute background soundbytes came from...only to find out it was a heavily distorted Mozart or a mixed up vocal of JFK.

  6. Oops, here's a link by madmarcel · · Score: 4, Informative

    If you really want to find out how it works:

    Links to PS and PDF files are on this page

    http://www.cs.waikato.ac.nz/~nzdl/publications/

    (They are not going to like what I am about to do to their server ;)

  7. One good implementation by emcron · · Score: 2, Informative

    A company called Fast-Talk Communications has a set of tools that they resell for 3rd-party apps for things like searching interviews for specific words that were said. I have actually seen this feature in use in some newsroom software made by Dalet Digital Media, and it was amazing to see in action. Very fast and accurate.

    The research for the fast-talk technology was done at Georgia Tech's Interactive Media Technology Center (IMTC). They've got a page about the corporate spin-off of the technology.

  8. Index or no index? by Psychic+Burrito · · Score: 3, Insightful
    There's quite a contradiction in this text. First they tell us that FastTalk doesn't use an "index":

    The key to expediting the process was eliminating the need for transcription or indexing or both.

    Then on the second page, they say that some sort of pre-processing is needed:

    (...) the Fast-Talk approach processes the speech in such a way that you can later go back and search it very efficiently (...)

    So I see no revolution here... it's just about indexing the phonemes of an audio stream and then searching these, right?
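    For what "indexing the phonemes and then searching these" could look like in the simplest case, here is a rough sketch: a one-time pass builds an inverted index from phoneme trigrams to positions, and each search then only checks the few positions where the query's first trigram occurs. This is a guess at the general shape of the idea, not Fast-Talk's actual design; the names `build_index` and `search` and the sample phoneme stream are hypothetical.

```python
from collections import defaultdict


def build_index(phonemes, n=3):
    """Pre-process once: map every phoneme n-gram to its start positions."""
    index = defaultdict(list)
    for i in range(len(phonemes) - n + 1):
        index[tuple(phonemes[i:i + n])].append(i)
    return index


def search(index, phonemes, query, n=3):
    """Return start positions where `query` occurs exactly in `phonemes`."""
    if len(query) < n:
        raise ValueError("query must contain at least n phonemes")
    # Only positions sharing the query's first n-gram need checking.
    candidates = index.get(tuple(query[:n]), [])
    return [i for i in candidates if phonemes[i:i + len(query)] == query]


if __name__ == "__main__":
    # Hypothetical recognizer output for "the package was dropped".
    stream = "DH AH P AE K IH JH W AA Z D R AA P T".split()
    index = build_index(stream)              # done once per recording
    print(search(index, stream, "D R AA P T".split()))
```

    The expensive pass over the audio happens once; later searches touch only the index and a handful of candidate positions, which matches the "process now, search quickly later" description quoted above.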

  9. RIAA funding forthcoming? by fluffhead · · Score: 2, Interesting

    I wonder if the RIAA will throw money at this type of technology, to help catch "pirates" who might otherwise escape by subtly transmogrifying their shared MP3s. Or maybe it already has?

    --

    #include "disclaim.h"
    "All the best people in life seem to like LINUX." - Steve Wozniak
    1. Re:RIAA funding forthcoming? by Anonymous Coward · · Score: 2, Funny

      Let's see:
      - RIAA mention in title CHECKED
      - pirates word between quotes CHECKED
      - use of the word shared CHECKED

      All you need for a couple of karma points! +3 ??