Slashdot Mirror


Natural Language Processing for State Security

Roland Piquepaille writes "Obviously, computers can't have an opinion. What computers are very good at, though, is scanning through text to deduce human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security. Right now, a consortium of three universities is working on this for the U.S. Department of Homeland Security (DHS), which doesn't have enough in-house expertise in NLP. Read more for additional references and a diagram showing how information extraction is used."

8 of 132 comments

  1. tinfoil hat... or is it? by macadamia_harold · · Score: 5, Interesting

    What computers are very good at, though, is scanning through text to deduce human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security.

    Yeah, because we need AT&T giving wide-scale, undocumented wiretaps to the NSA, who use voice recognition to generate transcripts of everyone's phone calls, and then DHS can run NLP on those transcripts to compile a list of "persons of interest", who are then automatically added to the TSA no-fly lists.

    Yeah, I can envision the future, and the future sucks.

    1. Re:tinfoil hat... or is it? by rtb61 · · Score: 2, Interesting
      Sorting facts from opinion by use of language, how amazingly pointless and stupid. Now let's see if the program can sort BS facts from real facts. This just seems like another scheme cooked up by incompetent political appointees who don't have any idea what they are being paid to do. Their only hope of retaining their positions, so they can continue their real function of political party support for the current administration, is to try to get that magic box to do their job for them.

      You want to know how incapable they are? Look at the extent and perversion of punishment inflicted upon innocent people at GITMO prior to any semblance of justice being done, i.e. proving the truth of the claims and opinions of a whole swag of political appointees in a court of law (you are innocent until proven guilty in a court of law, precisely because past generations learned the lesson that you cannot trust the enforcers of the law unless they are held under constant public attention and review).

      Exactly how many successful prosecutions have there been after all those years of operation of a facility that is clearly a perversion of justice? They are more likely to get successful prosecutions against those who created and operated that facility than against the inmates (it is not about the accused terrorists; it is about ensuring that the government adheres to the principles of justice, so that future generations do not suffer the perversions of justice that past generations suffered).

      --
      Chaos - everything, everywhere, everywhen
  2. Alias-i's ThreatTracker by otisg · · Score: 4, Interesting

    There is a great little company in Brooklyn, NY called Alias-i. Some years ago they built this interesting "tool" called....guess....ThreatTracker. Information Extraction, Named Entity Recognition and other interesting stuff, if you are into this.
    No, I don't work for them, but their LingPipe toolkit has some cooooool stuff.

    --
    Simpy
  3. A boon to research by JanneM · · Score: 4, Interesting

    It's clear "national security" has become what "the internet" or "the cold war" were in their prime: an all-purpose catchphrase to get funding for any research whatsoever, no matter how tenuously connected.

    Look at the two project proposals below and imagine which one will have an easier time getting funding:

    "An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf"

    or

    "An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf to better understand threats to NATIONAL SECURITY"

    --
    Trust the Computer. The Computer is your friend.
  4. Sounds like GALE by Dr.+Eggman · · Score: 4, Interesting

    Sounds kind of like DARPA's Information Processing Technology Office's GALE Program:

    " The goal of the GALE (Global Autonomous Language Exploitation) program is to develop and apply computer software technologies to absorb, analyze and interpret huge volumes of speech and text in multiple languages, eliminating the need for linguists and analysts and automatically providing relevant, distilled actionable information to military command and personnel in a timely fashion. Automatic processing "engines" will convert and distill the data, delivering pertinent, consolidated information in easy-to-understand forms to military personnel and monolingual English-speaking analysts in response to direct or implicit requests."

    --
    Demented But Determined.
  5. screw national security by argoff · · Score: 2, Interesting

    Screw national security, how about search, how about business and commerce, how about cultural exchange and global interaction? The chances of me getting attacked by a terrorist are less than those of getting hit by lightning; the chances of dealing with foreign cultures, foreign business and commerce are rapidly approaching 100%. There are 4 billion people out there who have the potential to mutually benefit from clean communication. Please don't patronize me. I'm not too worried about getting nailed by terrorists, but I am very bothered by the possibility of having my individual liberties nickeled and dimed to death.

  6. Re:A really difficult problem by Walt+Dismal · · Score: 2, Interesting
    I agree with the 'lots of work' part, but believe it is possible to achieve good results on wider domains outside of toy worlds. One key, from my own research, is to use (massive) databases of culture-related knowledge (belief systems) to build alternative viewpoints from which to analyze the input in massively parallel fashion. Each analysis agent has its own viewpoint or frame, driven by a very large database of world knowledge that is culture-specific. By culture I mean not just nationality but specific domains of belief systems. For example, American+middle class+scientist+age range 40-50, or white male Protestant businessman :) etc. Derrida was kind of right in a way; you have to bring specific personal knowledge to interpreting something, and no two people come to anything in exactly the same way. But two people with similar cultural bases will see similarly, all other things being equal.

    The system has to handle complex contexts and multiple varying worldframes. It has to superimpose multiple viewpoints (alternate personas) in interpreting the source. Also useful is applying certain theories of story to modeling the world.

    Yes, it's non-trivial, but achievable. I can see a day coming where a Google-like entity can, by modeling you and then acting as a 'cloned' agent, apply such personas in your service and find not just data but meaning, for your benefit.
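
    To make the "one agent per viewpoint" idea concrete, here is a toy sketch. It is not the system described above: the two frames, their tiny word lists, and the function names `interpret` and `analyze_all` are all made up for illustration, and a real frame would be a very large knowledge base rather than three words.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical frames: each maps a word to the connotation it carries
# for that group (+1 positive, -1 negative, 0 neutral).
FRAMES = {
    "scientist": {"theory": 1, "belief": -1, "risk": 0},
    "businessman": {"theory": -1, "belief": 0, "risk": 1},
}

def interpret(frame_name, text):
    """Score the text from one viewpoint by summing that frame's connotations."""
    lexicon = FRAMES[frame_name]
    words = (w.strip(".,;:!?") for w in text.lower().split())
    return frame_name, sum(lexicon.get(w, 0) for w in words)

def analyze_all(text):
    """Run every frame's interpretation concurrently and collect the scores."""
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(lambda name: interpret(name, text), FRAMES))

print(analyze_all("It is only a theory, and a risk."))
```

    The same sentence scores differently under each frame, which is the whole point: the output is a set of readings, one per viewpoint, rather than a single "meaning".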

  7. Re:A really difficult problem by constantnormal · · Score: 3, Interesting

    It's the "narrow domains" that is the crux of the problem.

    When used successfully over said "narrow domains", the human tendency (especially that set of humanity which makes the high-level choices for groups and organizations) will be to expand the domain in hopes of applying it to ever greater numbers of items.

    Of course, as the search domain is expanded, the effectiveness of the results declines, with no warning to the clueless idiots driving the search. False positives eventually exceed true positives by greater and greater margins.

    In the end, the strategy collapses, as a great many victims are shown to be wrongly targeted -- but until that point, the system does a LOT more harm than good.
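
    The arithmetic behind this can be sketched in a few lines. The numbers are invented for illustration (100 real targets, a 99% hit rate, a 1% false-alarm rate), and the function name `screening_outcomes` is mine; the only claim is the shape of the trend as the population being searched grows.

```python
def screening_outcomes(population, true_targets, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) for one search domain."""
    innocents = population - true_targets
    true_positives = true_targets * sensitivity
    false_positives = innocents * false_positive_rate
    # Precision: the fraction of flagged people who are real targets.
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision

# Hypothetical scenario: the number of real targets stays fixed
# while the domain being searched is widened.
for population in (1_000, 10_000, 100_000, 1_000_000):
    tp, fp, precision = screening_outcomes(population, 100, 0.99, 0.01)
    print(f"{population:>9,} people: {tp:.0f} true hits, "
          f"{fp:,.0f} false hits, precision {precision:.1%}")
```

    With these assumed rates the false hits catch up with the true hits at around 10,000 people searched, and past that point most of the people flagged are innocent even though the classifier itself has not changed.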

    Thank Goodness our leaders are such wise and contemplative souls that they would never, ever misuse such a tool.