Natural Language Processing for State Security
Roland Piquepaille writes "Obviously, computers can't have an opinion. What computers are very good at, though, is scanning through text to deduct human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security. Right now, a consortium of three universities is working on this for the U.S. Department of Homeland Security (DHS), which doesn't have enough in-house expertise in NLP. Read more for additional references and a diagram showing how information extraction is used."
What comptuers are very good at, though, is scanning through text to deduct human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security.
Yeah, because we need AT&T giving wide-scale, undocumented wiretaps to the NSA, who use voice recognition to generate transcripts of everyone's phone calls, and then DHS can run NLP on those transcripts to compile a list of "persons of interest", who are then automatically added to the TSA no-fly lists.
Yeah, I can envision the future, and the future sucks.
Push Button, Receive Bacon
What comptuers are very good at, though,
.... is spell-checking.....
....something, apparently, the editors are not good at....
Have you read my journal today?
The slippery slope to being automatically flagged as someone to watch out for. No human control in the process, but one day when you go to apply for a loan or get your drivers' licence renewed, you might get a surprise.
Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
Number 891224 has expressed a dislike of Emperor Bush, incident reported to FBI and Homeland Security.
Great Intellect...
There is a great little company in Brooklyn, NY called Alias-i. Some years ago they built this interesting "tool" called....guess....ThreatTracker. Information Extraction, Named Entity Recognition and other interesting stuff, if you are into this.
No, I don't work for them, but their LingPipe toolkit has some cooooool stuff.
Simpy
I would say that comptuers (sic) aren't very good at deducting human opinions yet. They _may_ become better. Are humans good at deducting other humans' opinions yet?
I just can't be bothered.
I have, in aggregate, spent about 3 1/2 years in the last 20 years working on using NLP for semantic information extraction.
Possible? Yes, given very narrow domains of discourse and lots of work.
It's clear "national security" has become what "the internet" or "the cold war" were in their prime: an all-purpose catchphrase to get funding for any research whatsoever, no matter how tenuously connected.
Look at the two project proposals below and imagine which one will have an easier time getting funding:
"An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf"
or
"An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf to better understand threats to NATIONAL SECURITY"
Trust the Computer. The Computer is your friend.
Wow, thanks for another waste of time. And you people stop linking to his blog in comments, he exists for nothing but ad clicks.
There goes a promising career path. I know any technology can be used for good or for evil, but in today's political climate, it seems especially irresponsible to be aiding and abetting what may wind up becoming the pretext for torture of some 16 year old blogger.
Now, if you'll excuse me, I have to prepare myself for my upcoming extraordinary rendition....
Sounds kind of like DARPA's Information Processing Technology Office's GALE Program:
" The goal of the GALE (Global Autonomous Language Exploitation) program is to develop and apply computer software technologies to absorb, analyze and interpret huge volumes of speech and text in multiple languages, eliminating the need for linguists and analysts and automatically providing relevant, distilled actionable information to military command and personnel in a timely fashion. Automatic processing "engines" will convert and distill the data, delivering pertinent, consolidated information in easy-to-understand forms to military personnel and monolingual English-speaking analysts in response to direct or implicit requests."
Demented But Determined.
That doesn't stop the really determined idiot though. Oh no.
I have a spelling checker,
It came with my PC.
It plane lee marks four my revue
Miss steaks aye can knot sea.
Eye ran this poem threw it,
Your sure reel glad two no.
Its vary polished in it's weigh.
My checker tolled me sew.
A checker is a bless sing,
It freeze yew lodes of thyme.
It helps me right awl stiles two reed,
And aides me when eye rime.
Each frays come posed up on my screen
Eye trussed too bee a joule.
The checker pours o'er every word
To cheque sum spelling rule.
Bee fore a veiling checker's
Hour spelling mite decline,
And if we're lacks oar have a laps,
We wood bee maid too wine.
Butt now bee cause my spelling
Is checked with such grate flare,
Their are know fault's with in my cite,
Of nun eye am a wear.
Now spelling does knot phase me,
It does knot bring a tier.
My pay purrs awl due glad den
With wrapped word's fare as hear.
To rite with care is quite a feet
Of witch won should bee proud,
And wee mussed dew the best wee can,
Sew flaw's are knot aloud.
Sow ewe can sea why aye dew prays
Such soft wear four pea seas,
And why eye brake in two averse
Buy righting want too pleas.
-- "Candidate for a Pullet Surprise"
By Jerrold H. Zar, Northern Illinois University
Journal of Irreproducible Results 39, 1 (Jan.-Feb. 1994): 13
Why do I immediately assume this will be abused?
DHS officer: Mr. 100%, I'm afraid we'll have to take you into custody. Our information extraction search on your blog concluded you are anti-American.
Me: From my blog? Is this about my criticism of the Iraq war?
DHS officer: Our results are classified, but please accompany us to GTMO for further "information extraction" to confirm the results of our investigation...
Ok, I know I'm taking a very cynical view here and that's pretty full of FUD, but why else does State security need this? Is this for them to monitor every chat room and blog?
Obviously, computers can't have an opinion.
Welcome the new opinion-based CAPTCHA-s!
What comptuers are very good at, though, is scanning through text to deduct human opinions from factual information.
Funny, because neither of the articles states that. In fact, they don't even say that software can do that at all yet: "A new research program ... aims to teach computers to scan through text and sort opinion from fact." Or, "We're interested in seeing how we would extract information about opinions."
So yeah, it would be nice if they could sort opinions from facts. While they're at it, why don't they just recognize lies from truth too, because wouldn't that be doing the exact same thing? Then we can just run statements made by people suspected of committing a crime through the software, which can then sort out all the facts from the opinions, and we'll no longer need judges, juries or attorneys.
Roland, next time save yourself some time and just make the whole freaking thing up from scratch.
Dan East
Better known as 318230.
Another thing Roland's computer is not very good at is spell-checking his posts!
Screw national security, how about search, how about for business and commerce, how about for cultural exchange and global interaction? The chances of me getting attacked by a terrorist are less than getting hit by lightning; the chances of dealing with foreign cultures, foreign business and commerce are rapidly approaching 100%. There are 4 billion people out there who have the potential to mutually benefit from clean communication. Please don't patronize me. I'm not too worried about getting nailed by terrorists, but am very bothered by the possibility of having my individual liberties nickeled and dimed to death.
Especially since the system, whilst it will have some quite interesting applications and the research will yield interesting results, can't work. A computer cannot distinguish between a fact and a lie told as fact...garbage in, and all that.
Let me rephrase that with an example:
'I am ten years old' and 'I am twenty years old'. Which is fact, which is lie? Better yet: 'we believe Iraq has WMD' versus 'we believe Iraq has no WMD'. No matter what algorithms or heuristics you throw at this, all a computer at most can tell you is 'sometimes when used in conjunction with this phrase, the statement is false'...but that helps you IN NO WAY, because it means the statement can also be true...the indicator means nothing...you get as many false positives as false negatives...hell, even a ratio would be meaningless in intelligence gathering.
-- Waht? Tehr's a preveiw buottn?
You mean it was not the computers that voted for George W Bush? Then who the hell did?
Sent from my ASR33 using ASCII
Information extraction (IE) is a type of information retrieval whose goal is to automatically extract structured or semistructured information from unstructured machine-readable documents. It is also described as a type of concept extraction that automatically recognizes significant vocabulary items in text documents, such as names, terms, and expressions.
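
To make the definition above a bit more concrete, here is a minimal Python sketch of turning unstructured text into structured records. It is only an illustration under loud assumptions: the `Mention` record, the `PATTERNS` table and the `extract_mentions` function are hypothetical names invented here, and the two regular expressions are a crude stand-in for the trained models a real toolkit would use. It says nothing about how the DHS project or any product mentioned in this thread actually works.

```python
import re
from typing import List, NamedTuple


class Mention(NamedTuple):
    """One structured record pulled out of free text (hypothetical shape)."""
    kind: str   # which pattern produced the match
    text: str   # the matched surface string
    start: int  # character offset in the input


# Assumption: two toy patterns, chosen only for illustration.
PATTERNS = {
    # Runs of 2-5 capital letters, e.g. "NLP" or "DHS".
    "acronym": re.compile(r"\b[A-Z]{2,5}\b"),
    # Two or more consecutive capitalized words, e.g. "Homeland Security".
    "name": re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b"),
}


def extract_mentions(text: str) -> List[Mention]:
    """Return every pattern match as a Mention, ordered by position in the text."""
    mentions = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            mentions.append(Mention(kind, match.group(), match.start()))
    return sorted(mentions, key=lambda m: m.start)


if __name__ == "__main__":
    sample = "A consortium of universities does NLP research for Homeland Security (DHS)."
    for mention in extract_mentions(sample):
        print(mention.kind, mention.text, mention.start)
```

Even this toy version shows the limits raised in the comments: it can list which names and terms appear, but it has no way to tell whether a sentence containing them is fact, opinion, or a lie.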