Data Mining And The CIA
Brotha Z writes "It seems that the CIA has developed a piece of software labeled "Oasis" that can convert the audio from television and radio broadcasts into text. The software is said to be able to determine the sex of the speaker and whether the speaker is a different person from the original speaker, and if one of the speakers is named, it will continue to place the name next to the correct speaker from that point on. More information on this multi-faceted piece of software can be found here." Hmmm. Sounds like some nice speech recognition technology ("perfect demo" alert!), but as a taxpayer, something rings badly about it. If they're going to use my money to spy on me, can't they at least open source the code so I can dictate a letter?
The bottom line of this kind of technology is that although the speech recognition itself is relatively poor, it is helped by the fact that most of the interesting words (names of people, places, etc.) occur very often in the same segment. So, it's all statistics. No accurate transcription needs to be made to achieve this kind of result. That also means applications that do need an accurate transcript, such as automatic subtitling, are not possible with such systems, and I think they are still quite far away.
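To put a rough number on the "it's all statistics" point, here is a back-of-the-envelope sketch. It assumes every mention of a word is recognised independently with the same probability, which is a simplification, and `p_word` and `mentions` are made-up illustration parameters, not figures from the article.

```python
def hit_probability(p_word: float, mentions: int) -> float:
    """Chance that at least one of `mentions` occurrences of a word is recognised,
    assuming each occurrence is recognised independently with probability p_word."""
    return 1.0 - (1.0 - p_word) ** mentions


# Hypothetical numbers: a recogniser that gets a given name right only half the time.
for mentions in (1, 3, 5, 10):
    print(mentions, round(hit_probability(0.5, mentions), 4))
```

Even a coin-flip recogniser will almost certainly catch a name that comes up five or ten times in a segment, which is all a keyword spotter needs.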
As for the CIA claiming breakthroughs, well, I think other people can say wittier things about that.
Theo
"I don't understand why they specifically mentioned TV and radio."
Large vocabulary (but somehow predictable), speakers trained to over-articulate, no superposition between different speakers, slightly simpler language model (complete phrases, language close to written).
State-of-the-art recognizers have an error rate of ~10% on that test, which until last year was one of the evaluation tests at the speech group at NIST. Check http://www.nist.gov/speech/tests/index.htm for details.
Since the point of diminishing returns was reached, the test is going to be replaced with a new one, an audio/video recorded meeting transcription. Much, much harder.
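For anyone wondering what an error rate of ~10% means in practice, here is a minimal sketch of word error rate, assuming that is the metric behind the figure (it is the usual one for this kind of evaluation). The function name and the sample sentences are mine, purely for illustration.

```python
def word_error_rate(reference: list[str], hypothesis: list[str]) -> float:
    """Word-level edit distance (substitutions + deletions + insertions)
    divided by the number of reference words."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hypothesis[:j]
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(reference)


# Made-up example: one wrong word out of ten.
ref = "the cia monitors foreign radio and television broadcasts every day".split()
hyp = "the cia monitors foreign radio and television broadcast every day".split()
print(word_error_rate(ref, hyp))
```

So ~10% is roughly one word in ten wrong, even on clean, well-articulated broadcast speech.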
OG.
I remember a bit in one of the Tom Clancy books in which Jack Ryan tells his wife that CNN often knows stuff faster than the CIA. I believe it's probably true, at least for some types of things. And I would bet heavily that in places like the CIA and the Pentagon, TVs tuned to CNN are rather common.
Erlang Developer and podcaster
Probably a few tens of thousands are what the CIA is interested in. I'm sure that at the CIA or NSA there is a room of people listening in to the BBC World Service, Radio Moscow, CNN, etc. Let's face it, the CIA does not have agents everywhere, and much of what you can get off the public wire is good information.
Plus you can also find out what world leaders are thinking by reading the newspapers in a country and listening to the national radio station. I would imagine that something like a TiVo would make this much easier for them.
Erlang Developer and podcaster
From the last sentence in the article:
Another intelligence official, on condition of anonymity, said: "If they have this kind of technology to plumb the depths of open sources, you can imagine what kind of technologies they have to track down spies."
All this technology wasn't good enough to track down Aldrich Ames, Edward Lee Howard, or the FBI's Hanssen, who together are probably the biggest moles in the history of espionage. People forget that tools are useful/automatic, but they aren't intelligent. Someone must be at the controls to interpret and act on the data. This tool sounds great, and there could be potential civilian uses beyond CI, but people must remember it's only a tool.
Cheers!
E
http://eugeneciurana.com | http://ciurana.eu
Also without bothering to RTFA, I'll repeat Paradise_Pete's question: Do you know what a neural net is?
You see, as I assume was his point, "computer chip neurons" work differently from central processing units, but not from the "data structure neurons" that can be trivially implemented in a program running on a "regular computer" to simulate the exact same neural net. The fact that they did it in hardware is interesting in its own right to someone interested in neural net research (I'll probably go read it later), and perhaps the speed factor is so great that a software version couldn't run in real time (which I guess could be what you meant) or would require an astoundingly powerful and expensive conventional computer in order to do so, but there is nothing special about "computer chip neurons" that in principle prevents the same thing from being done in software on a "regular computer".
Maybe this truly "doesn't run on regular computers" simply because they haven't implemented such a software-based simulator, but that's very different from implying that it's based on some kind of exotic technology that a Von Neumann machine is fundamentally incapable of duplicating, which is what it sounded like you were claiming and which is probably what Paradise_Pete objected to (and was wrongly punished for).
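To make the "data structure neurons" point concrete, here is a toy sketch of a neural net simulated in ordinary software. It has nothing to do with the chip in the article: the threshold units, the hand-picked weights and the XOR task are all my own illustration.

```python
def neuron(inputs, weights, bias):
    """A threshold unit: fires (1) when the weighted sum of inputs plus bias is positive."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def xor_net(a, b):
    """Two hidden neurons (OR and NAND) feeding an AND neuron give XOR."""
    h_or = neuron((a, b), (1.0, 1.0), -0.5)
    h_nand = neuron((a, b), (-1.0, -1.0), 1.5)
    return neuron((h_or, h_nand), (1.0, 1.0), -1.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

A hardware version would evaluate the same weighted sums in parallel; the arithmetic is no different, only the speed.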
David Gould
main(i){putchar(340056100>>(i-1)*5&31|!!(i<6)<< 6)&&main(++i);}
My English teacher would cringe at that run-on.
I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I won't bother either.
I mean, like, really, now, dude. Are they going to start scanning soap operas for the sake of national security? Is Jay Leno broadcasting national secrets? Someone clue me in on the intelligence application here.
I suppose it might be handy for transcribing the numbers stations, though somehow I doubt that they'll seem quite so glamorous in ASCII:
--
Sheesh, evil *and* a jerk. -- Jade
Then again, stranger things have happened. But I would bet the proverbial farm that the guts of the software are classified.
The article is not so much about speech recognition (as some other comments have mentioned). It deals with the possibilities of being able to label speakers, storing data from all kinds of sources in a database and being able to detect previous statements of a person. So, this is more about the intelligent combination of various existing techniques (including speech recognition and machine translation).
Personally, I think this has been done before to a certain degree; the resources available to the CIA (and their counterparts) are just becoming incredibly huge. Given the increasing amount of traffic that is generated by Internet users, they're probably pretty happy about that.
On the terrorists who are being mentioned all the time in that article: they're probably using encryption technology anyway, so I'm not sure if the really dangerous people will be caught with that system.
And why did the benchmark only involve a few words? Because that's all it can recognize. This thing doesn't do speech recognition, it does sound recognition; IIRC, it can only handle single-syllable words, and only four or five at that, and no sound-alikes. (I think "yes" and "no" were half its vocabulary.) It might be a breakthrough for such a small ANN, but it's not that useful as a natlang system. I suppose something similar could make a good front-end to a more complete system, though.
I know a large group of paranoid people who like to start all of their unimportant phone conversations with "I'm going to kill the president" or some such gibberish because they are firmly convinced that all telephone conversations are being monitored by some Echelon-type system, and have been for 20 years. They believe that by throwing such "Noise" out there, they're helping protect everyone's privacy.
What amuses the hell out of me though, is that this kind of works against them if their own theories hold true.
The way I see it, almost nobody else goes to such efforts no matter how paranoid they are, and even if some phone-listening machine was being put to use, all they're doing is ensuring that they will be listened to.
And it's not that I don't think this sort of thing goes on or anything, it's just that I don't bother fighting it anymore now that they're able to read (and control) all of our minds anyway.
"Everything you know is wrong. (And stupid.)"
Moderation Totals: Wrong=2, Stupid=3, Total=5.
Anyone got a couple of spare lawyers looking for a fun afternoon or twenty?
LibBT: BitTorrent for C - small - fast - clean (Now Versio
I suspect the most common use of this sort of software is to monitor foreign broadcasts - something the CIA/OSS has been doing for more than 50 years. Traditionally, this has been done through a group (mentioned in the article) called the Foreign Broadcast Information Service (FBIS). FBIS monitors newspapers/broadcasts of many, many non-US media sources and makes this information available to US Government agencies.
For many years, FBIS made available to the public a daily paper copy product via the US Dept of Commerce's National Technical Information Service (NTIS) that was fedex'ed daily to hundreds of subscribers around the country/world. There were several issues, broken down by regions. For many years, it was one of the best public ways to track what was happening in the Soviet space program.
It's widely known that FBIS/CIA has been developing and using technology to aid the translation process for many years.
A few years ago, they dropped the paper product and moved to an electronic version.
The FBIS server to distribute the information to US Government users can be seen at http://199.221.15.211/ and can be found via a simple Google search on "FBIS".
The public can access this information via NTIS's World News Connection system (http://wnc.fedworld.gov). Yes, there is a charge to use WNC, because NTIS has to pay copyright (gasp!!!!!) to the foreign sources (just because you steal the data stream doesn't mean you own it!) as well as operate the system. It's pretty well known that foreign sources who complain loud enough also get paid by the Govt for the US govt use of the data.
"My guess is that it's really fairly poor speaker independent stuff. It probably does a quick, low quality word recognition algorithm"
It doesn't always have to be speaker-independent. Since it doesn't have to be real-time, all you need to do is identify the speaker, and then start over. If we're really talking about TV and radio sources, then there are going to be a large number of regularly-appearing speakers. Just a SWAG, but I'll bet that under a million people account for 80% of all the TV and radio minutes worldwide.
Observation At Several Interacting Scales
Operational Application of Special Intelligence Systems
Oracle Application Software Implementation Strategy
"My one oasis in the dust and drouth Of city life."--Tennyson
Keeping
I don't think you realize how boring and mundane most intelligence work is. Thousands of extremely junior people sit all day long translating newspapers and transcribing radio/TV broadcasts. Much of this stuff is made available through FBIS (pronounced "fibis") to further bore people slightly higher up the ladder throughout the government and contracting agencies.
However, it is useful once in a while. Especially when looking back and saying "Now how didn't we catch that?" If it could be brought online cheaper and more quickly, I can see how this would be well worth the money - without being particularly draconian (except insofar as the concentration of enough otherwise innocuous information can be quite powerful).
Sometimes, just sometimes, they mean what they say.
"Patriotism is your conviction that this country is superior to all other countries because you were born in it." -- GBS
If Google can find the space to archive the internet, don't you think the CIA could find enough space to archive all of these broadcasts in ASCII format?
I personally would LOVE to see a huge searchable, on-line database of everything ever said by anyone that was broadcast. Imagine the implications. I'd search for all of my local politicians to see if they ever said anything stupid in their previous life as a coked-out-Miami-televangelist. I'd also search for my own name to see if I missed a song dedication or an NPR sponsorship in my name.
I guess a notable drawback is that the CIA could pretty easily scan cell-phone bandwidths as well... documenting any 'notable' private conversations. Perhaps we should all start talking in pig-latin to avoid the CIA's attention, a la Napster?
"If they're going to use my money to spy on me, can't they at least open source the code so I can dictate a letter?"
I may be wrong, but doesn't the CIA's charter say that they cannot conduct operations on native soil?
Bow before my sig, for it is good.
Just this morning I was joking with my wife about buying an alarm clock that snoozed when I yelled 'shut up', 'piss off', 'go away', or 'it's saturday'... Maybe this technology will lend itself to alarm clocks in the future :)
Morning sarcasm. I'll get back to work.
LOAD "SIG",8,1
LOADING...
READY.
RUN
Beyond that, the TellMe service should also recognize the command "shut up" along with "stop" and "tell me more". I mean, if you're going to have a voice-activated phone portal, why not use "natural language" for commands? ("Shut the hell up you stupid bitch! I said 'stock quotes' not 'stock racing'!")
For those of you who have no idea what I'm talking about, dial 1-800-555-TELL. The service is free, for now.
- I don't care if they globalize against free speech. All my best free thoughts are done in my head.
and
I don't know about you, but I'm pretty damned impressed.
the article on this system
Without the pad, it's not Dance Dance Revolution, it's Listen
I don't understand why they specifically mentioned TV and radio. If the audio is digitised before being passed to the software, it doesn't really matter where it comes from. Maybe they are trying to draw attention away from the fact that it can be used for things like making transcripts of phone calls and normal conversations recorded with various listening devices?
About that feature that IDs the speaker, imagine a conversation that goes like this:
Speaker 1: You the Man.
Man: No, YOU the MAN.
Man: No no, you Da Bomb
Da Bomb: Hehe
Watch word: BOMB Alert! Alert!
As a final side note, I won-der... if... it... works... if... you... talk like... Cap-tain... K-irk... ;-)
====
Codeala - Just another mindless drone
They don't seem to have very accurate speech recognition technology. The article claims to reduce transcription time by a factor of about nine. That's a lot less unreasonable than believing in good speech recognition technology.
My guess is that it's really fairly poor speaker independent stuff. It probably does a quick, low quality word recognition algorithm - quite a few of those are around - and then some sort of Bayesian network to correct the transcription using lexical context. I know that ARPA was openly funding people doing exactly that a few years ago, and I'll bet their papers are on the web. It doesn't shock me greatly that someone has had some measure of success with it.
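Roughly what I have in mind, as a toy sketch. To be clear, this is not a Bayesian network and not whatever the CIA runs: I'm swapping in the simplest thing that shows the idea, a bigram word model plus a Viterbi search over the recogniser's candidate words. Every word and probability below is invented for illustration.

```python
import math


def rescore(candidates, bigram, floor=1e-4):
    """Viterbi search over per-position candidate lists.

    candidates: list of dicts {word: acoustic probability} from a (sloppy) recogniser.
    bigram: dict {(previous_word, word): probability} standing in for lexical context.
    Unseen word pairs get the small `floor` probability.
    """
    # best[word] = (log score of the best path ending in word, that path)
    best = {w: (math.log(p), [w]) for w, p in candidates[0].items()}
    for position in candidates[1:]:
        nxt = {}
        for word, p_acoustic in position.items():
            score, path = max(
                (prev_score + math.log(bigram.get((prev, word), floor)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            nxt[word] = (score + math.log(p_acoustic), path + [word])
        best = nxt
    return max(best.values())[1]


# Hypothetical recogniser output: on sound alone it slightly prefers "sinister".
candidates = [
    {"prime": 0.6, "crime": 0.4},
    {"minister": 0.4, "sinister": 0.6},
    {"spoke": 0.5, "smoke": 0.5},
]
# Hypothetical context model: which word pairs tend to occur together.
bigram = {("prime", "minister"): 0.5, ("minister", "spoke"): 0.3}

print(rescore(candidates, bigram))
```

The acoustic guess for the second word is wrong, but the word-pair statistics pull the whole phrase back to something sensible, which is the sort of correction I mean.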
If it was 100% accurate transcription, then I wouldn't believe it. But as a time saving device for transcribers... that I find credible.
DARPA also funds a lot of automatic topic spotting research. One of my ex-profs received grants from them under just such a rubric and her papers are publicly available on the web. I'll bet whatever technology they are using, it was developed by a prof at an open university who publishes freely.
As for multilingual text searching and summarisation, the best technology of its kind known to me is Latent Semantic Analysis - the brain child of Thomas Landauer. It's a fairly recent, but hardly secret or obscure, indexing technique that's gaining ground commercially for data mining applications. It can certainly do the small number of things being claimed by this article. All the relevant papers are on the web.
In short, this doesn't sound like super-secret spy stuff. I'll give long odds the real work is in journals and webpages that are publicly available. Having a couple billion dollars to speed up testing and implementation probably helps, but none of this sounds revolutionary or years ahead of the curve.
Less well known is their Foreign Broadcast Information Service (FBIS), for which generations of linguists have listened to the hype output of governments worldwide. (FBIS refers to this as "open source" material.)
They've been hoping for years to automate some of this stuff, and apparently they've succeeded. It doesn't require particularly good speech recognition, since the basic goal is to pull out the interesting stuff from the endless drivel.
This sort of info is used to answer questions like "Is country X changing their policy on Y", and "Who is speaking for country X on subject Y?" This is basic political intelligence information.