Natural Language Processing for State Security
Roland Piquepaille writes "Obviously, computers can't have an opinion. What computers are very good at, though, is scanning through text to deduce human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security. Right now, a consortium of three universities is working on this for the U.S. Department of Homeland Security (DHS), which doesn't have enough in-house expertise in NLP. Read more for additional references and a diagram showing how information extraction is used."
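For anyone wondering what "sorting facts and opinions" might look like at its crudest, here is a toy sketch. It is not the consortium's method, which the summary doesn't describe; it is a plain cue-word lookup swapped in for illustration. The cue list, the function names and the sample text are all made up for this example.

```python
import re

# Hypothetical cue words, chosen only for illustration. Real systems
# learn such signals from annotated text rather than a fixed list.
OPINION_CUES = {
    "think", "believe", "feel", "should", "terrible",
    "great", "worst", "best", "probably", "obviously",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sort_facts_and_opinions(text):
    """Put each sentence in 'opinions' if it contains a cue word, else 'facts'."""
    facts, opinions = [], []
    for sentence in split_sentences(text):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if words & OPINION_CUES:
            opinions.append(sentence)
        else:
            facts.append(sentence)
    return facts, opinions


# Made-up sample text.
sample = (
    "The bridge opened in May. "
    "I think the toll is terrible. "
    "Traffic doubled within a year."
)
facts, opinions = sort_facts_and_opinions(sample)
print("facts:", facts)
print("opinions:", opinions)
```

A lookup like this breaks on sarcasm, quotation and anything outside its word list, which is roughly why the real thing is a research problem.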
What computers are very good at, though, is scanning through text to deduce human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security.
Yeah, because we need AT&T giving wide-scale, undocumented wiretaps to the NSA, who use voice recognition to generate transcripts of everyone's phone calls, and then DHS can run NLP on those transcripts to compile a list of "persons of interest", who are then automatically added to the TSA no-fly lists.
Yeah, I can envision the future, and the future sucks.
Push Button, Receive Bacon
There is a great little company in Brooklyn, NY called Alias-i. Some years ago they built this interesting "tool" called... guess... ThreatTracker. It does information extraction, named entity recognition and other interesting stuff, if you are into this.
No, I don't work for them, but their LingPipe toolkit has some cooooool stuff.
Simpy
It's clear "national security" has become what "the internet" or "the cold war" were in their prime: an all-purpose catchphrase to get funding for any research whatsoever, no matter how tenuously connected.
Look at the two project proposals below and imagine which one will have an easier time getting funding:
"An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf"
or
"An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf to better understand threats to NATIONAL SECURITY"
Trust the Computer. The Computer is your friend.
Sounds kind of like DARPA's Information Processing Technology Office's GALE Program:
"The goal of the GALE (Global Autonomous Language Exploitation) program is to develop and apply computer software technologies to absorb, analyze and interpret huge volumes of speech and text in multiple languages, eliminating the need for linguists and analysts and automatically providing relevant, distilled actionable information to military command and personnel in a timely fashion. Automatic processing "engines" will convert and distill the data, delivering pertinent, consolidated information in easy-to-understand forms to military personnel and monolingual English-speaking analysts in response to direct or implicit requests."
Demented But Determined.
Screw national security. How about search? How about business and commerce? How about cultural exchange and global interaction? The chances of me getting attacked by a terrorist are less than those of getting hit by lightning; the chances of dealing with foreign cultures, foreign business and commerce are rapidly approaching 100%. There are 4 billion people out there who have the potential to mutually benefit from clean communication. Please don't patronize me. I'm not too worried about getting nailed by terrorists, but I am very bothered by the possibility of having my individual liberties nickeled and dimed to death.
The system has to handle complex contexts and multiple varying worldframes. It has to superimpose multiple viewpoints (alternate personas) in interpreting the source. Applying certain theories of story to modeling the world is also useful.
Yes, it's non-trivial, but achievable. I can see a day coming when a Google-like entity can, by modeling you and then acting as a 'cloned' agent, apply such personas in your service and find not just data but meaning, for your benefit.
It's the "narrow domains" that is the crux of the problem.
When used successfully over said "narrow domains", the human tendency (especially that set of humanity which makes the high-level choices for groups and organizations) will be to expand the domain in hopes of applying it to ever greater numbers of items.
Of course, as the search domain is expanded, the effectiveness of the results declines, with no warning to the clueless idiots driving the search. False positives eventually exceed true positives by greater and greater margins.
In the end, the strategy collapses, as a great many victims are shown to be wrongly targeted -- but until that point, the system does a LOT more harm than good.
Thank Goodness our leaders are such wise and contemplative souls that they would never, ever misuse such a tool.
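To put rough numbers on the false-positive argument above, here is a back-of-the-envelope sketch. Every figure in it is hypothetical: 100 real targets, a classifier that catches 99% of them and wrongly flags 1% of everyone else, and three population sizes. The function name is made up too. It also assumes the error rates stay constant as the domain grows, which is generous, since the point above is that they get worse.

```python
def screening_outcomes(population, true_targets, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) for one screening run.

    precision is the share of flagged people who are real targets.
    """
    innocents = population - true_targets
    true_positives = true_targets * sensitivity
    false_positives = innocents * false_positive_rate
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Hypothetical scenario: same targets, same classifier, wider and wider net.
for population in (1_000, 100_000, 10_000_000):
    tp, fp, precision = screening_outcomes(
        population, true_targets=100, sensitivity=0.99, false_positive_rate=0.01
    )
    print(f"{population:>12,}  TP={tp:,.0f}  FP={fp:,.0f}  precision={precision:.4f}")
```

With the narrow population most flagged people are real targets. With the widest one the same classifier flags roughly a thousand innocents for every real target, and nothing in the output warns the operator that this has happened.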