Xerox's 'Intelligent Redaction' Scanners
coondoggie writes "Xerox today touted software it says can scan documents, understand their meaning and block access to those sensitive or secure areas so that prying eyes cannot read, copy or forward the information. Xerox and researchers from its Palo Alto Research Center debuted "Intelligent Redaction," new software that automates the process of removing confidential information from any document. The software includes a detection tool that uses content analysis and an intelligent user interface to protect sensitive information. It can encrypt only the sensitive sections or paragraphs of a document, a capability previously not available, Xerox said."
Wonderful, I wonder what the scanner does to the 'redacted' material?
Maybe it's as good as Adobe PDF's redaction feature, and anyone can unredact the document?
Or maybe it sends the redacted portion to any one of the 3-letter agencies that 'don't exist'.
So, once you have marked a certain confidential information as confidential, it will do it automatically in other documents. Which means that for the low, low price of your time, you can submit a document with "fill-in the blanks" text until it redacts the same parts and BANG you know what the redacted section was...:D
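The fill-in-the-blanks probing described above can be sketched in a few lines. This is only an illustration: `make_redactor` is a hypothetical stand-in for a machine that blanks whatever phrases it has been taught, and nothing here reflects how the Xerox software actually behaves.

```python
def make_redactor(learned_secrets):
    """Hypothetical machine that blanks any phrase it was taught is confidential."""
    def redact(text):
        for secret in learned_secrets:
            text = text.replace(secret, "[REDACTED]")
        return text
    return redact


def guess_secret(redact, candidates):
    """Submit each candidate as its own document; any that come back altered were on the list."""
    return [c for c in candidates if redact(c) != c]


if __name__ == "__main__":
    machine = make_redactor(["Project Falcon"])  # assumed secret, for illustration only
    print(guess_secret(machine, ["Project Eagle", "Project Falcon", "Project Hawk"]))
```

The point is only that a redactor which reacts to content can be used as an oracle by anyone allowed to feed it documents.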
I hear that the government has already ordered a thousand of these.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
I'm sure this will lead to a lot of copiers having "accidental" drownings in their bathtubs and Completely Innocuous single car crashes.
One critic of the new capability cited concerns about censorship saying, "REDACTED"
This is just the same software that has been used on the UFO files that have been put out with lots of stuff blacked out.
[CLEARANCE RED]
Welcome Troubleshooters,
The computer has just scanned in this manual that only you may read on your new experimental weaponry:
Attachment: Manual.pdf
Manual.pdf:
[This has been deleted for security reasons.]
...it's just a new way to save money on support and service when printers stop printing or blow toner all over the place. "Look at this mess! The first page greys out and then there are only a few faint lines for the next 30 pages!" "Nothing wrong with the printer. That information is simply redacted."
"Life is life." --Laibach
This is a poor idea. It better be 100% accurate at marking classified data as classified. All it will take is one screw-up and some extremely important data out there can be leaked to the wrong people.
99.99% accurate isn't going to be good enough, is it?
Hey now, don't lump these crackpot tinfoil-hat conspiracy theorists in with the rest of us run of the mill liberals.
I do love how all these 9/11 conspiracy theorists suddenly became PhD-level structural engineers, aeronautics engineers, and whatever the hell other kind of engineer exists. I'm an engineer and even I know when I'm trying to analyze something that's way above my expertise.
I recall reading a story of a Greek philosopher once (forgot which one it was). He walked through the city, talking with common folk about all subjects from politics to history, and arrived at the conclusion that everyone except himself is a fool - for he is the only one who realizes when what he's talking about is out of his league.
Attention corrupt senior corporate management:
Tired of dealing with underlings trying to take you out by blowing the whistle on your illicit financial dealings? We have just the type of business equipment that you're looking for. Stop those do-gooders right in their tracks by automatically keeping them from copying those fudged books and secretive memos. Act now, and we'll throw in the automatic notification upgrade so you can terminate their employment before they have the chance to resort to other means of toppling your investment scam...
(okay, I'll put my tinfoil hat back in the closet, now)
Now I just have to find out how it works so I can print T-shirts that cannot be copied :)
AI is a disaster through-and-through. It never works well. Ever.
Consider handwriting recognition, autonomous robotics, and game theory, just to name a few of the narrowest, most well-defined (read: easiest) AI applications. AI works well in none of these - at best, it's so-so (like the 95-98% success rates in OCR).
Now what you have here, with the automatic redacting copier, is that the copier needs to understand the document it's reading, and determine which parts to redact. Contextual understanding is *HARD* - it's the same class of problem as automated translation - only harder in this case.
This copier idea is a huge flop. I don't know why they waste money on it. Anyone who relies on this copier to redact documents is a fool, because it is bound to make all kinds of mistakes (both type 1 - missing things it should have picked up, and type 2 - redacting things it shouldn't).
To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton
This way when some critical info gets missed in the redaction process, there's no one to blame! So not only will our (I'm usian) gov't be more efficient about hiding stuff from us, no one will have to take the fall if it goes wrong.
That said, I'm amazed at what modern AI can do. It's not clear, from this rather thin article, how much this system depends on human input to prevent mistakes. There must be some kind of training process. What is the state of these kinds of systems? I remember from some AI courses I took years ago, that they worked well but inevitably someone would end up calling someone else something stupid. Then the machine would start skipping important bits and the coders would look like idiots.
That was hard and a real stretch there at the end. blah.
man, I feel like mold.
Obviously this is not possible in general, since how sensitive information is can and will change over time. Without full AI awareness of the situation that places the document in context, this is not possible. (E.g. the statement "Bob will be leaving the company" could either be highly sensitive or old news, depending entirely on the time and/or reader. Even more fun, what about "accidentally" sensitive statements where the mere fact that the machine hides it flags it as an item of interest to someone who didn't know it was interesting?)
Also, a machine may "blank out" the sensitive part but leave enough around it for an astute hostile actor to still gain something - such things are so highly context sensitive I can't see any general algorithm that could guarantee success in all such cases.
Still, two possibly useful approaches that are closer to hand would be:
1) Supply the machine with a form, and specify certain areas (which will contain an SSN, for example) as containing information that must be treated as sensitive. So long as a standard form is used, the results could be handy.
2) Supply the machine with a complete list of information you want to keep under wraps (and all the various ways that information might appear - drawings, descriptions, what have you) and have it check each document for anything that matches anything on its sensitive list. This also has problems and would be easy to get around but it WOULD be helpful to prevent non-hostile carelessness - i.e. "WHOOPS Bob just scanned something sensitive to add to that email, better blot out the parts that aren't cleared to go outside the organization."
While a general solution isn't possible, I can actually see this being useful in controlled situations. The article mentions medical, financial and government which all have lots of well defined forms that can be used. It won't allow the replacement of human judgement but it might make it easier to stop certain forms of accidental distribution in well defined cases, and that's worth pursuing so long as it doesn't encourage carelessness.
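A minimal sketch of approach 2 above, assuming a hand-maintained list of sensitive terms plus an SSN-shaped pattern. The names `SENSITIVE_TERMS` and `redact` and the example terms are made up for illustration; this is not how the Xerox software is known to work.

```python
import re

# Hypothetical list of terms someone has flagged as sensitive.
SENSITIVE_TERMS = ["Project Bluebird", "Bob Smith"]

# US Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text, terms=SENSITIVE_TERMS, mask="[REDACTED]"):
    """Replace every listed term (case-insensitively) and every SSN-shaped string with a mask."""
    for term in terms:
        text = re.sub(re.escape(term), mask, text, flags=re.IGNORECASE)
    return SSN_PATTERN.sub(mask, text)


if __name__ == "__main__":
    print(redact("bob smith (SSN 123-45-6789) leads Project Bluebird."))
```

As noted above, this only guards against carelessness: a trivially altered spelling such as "B0b Smith" passes straight through.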
"I object to doing things that computers can do." -- Olin Shivers, lispers.org
Yes, but on the other hand ... that doesn't prove there wasn't one, just that they're trying to get us so sick and tired of hearing about it that we don't care anymore. It used to be they would just repeat a lie over and over and over and over and over until we ultimately believed the lie. It's sort of a mild form of brainwashing, and it worked because back then we really wanted to believe that our government and our representatives truly had our best interests at heart. Well, we're too sophisticated for that now: I mean, between eight years of Clinton and almost as many of George Bush we've collectively reached the conclusion that everything they say is a lie. "No, Mr. President ... don't believe you. You had your chance to let us trust you and you blew it."
So, all they can do now is just keep pounding the obvious lies into our heads at every opportunity until we finally say, "Enough! Whatever! I can't stand it any more and I don't CARE if there were weapons of mass destruction or not! Jesus Christ, just SHUT THE FUCK UP ABOUT IT ALREADY! AAAAAAGGGGHHHHH!"
If that's their plan, it seems to be working.
The higher the technology, the sharper that two-edged sword.
Yeah, and the practice continued unabated. It amazes me that they claim the very same government that is inept, unethical and incapable, can pull off keeping such a huge secret. For a 140 year old reference: "It appears we have appointed our worst generals to command forces, and our most gifted and brilliant to edit newspapers. In fact, I discovered by reading newspapers that these editor/geniuses plainly saw all my strategic defects from the start, yet failed to inform me until it was too late. Accordingly, I am readily willing to yield my command to these obviously superior intellects, and I will, in turn, do my best for the Cause by writing editorials - after the fact." ~ Robert E. Lee, 1863
Xerox doesn't need _that_ much accuracy. Remember who the customer is with this kind of product. Mostly major-league litigation mills who get boxes upon boxes of documents and mass-storage devices that need to be read and searched quickly. Now redaction can be automated to some degree.
I can easily see this being a very successful product in litigation circles.
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
Here's the latest report from General Petraeus in Baghdad:
http://tinyurl.com/3ygnh
I've never heard of this Robert E. Lee gentleman. I'm assuming he proved his superior intellect by winning the war?
if this technology could be applied to sun-glasses?
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
I wonder if it prints yellow dots to encode the redacted text for forensic analysis.
You know, it used to be that a "national security" threat was something that could kill millions, or wipe out the White House. Now a kid with some lighter fluid can be arrested for terroristic threats, and it's the White House that authorizes the killing. Can nobody read the Constitution?
We the [REDACTED][
Okay, I see where this is going to be used. For some reason, I figured it would be used for government purposes like classified documents. Still, Coca-Cola will be pretty pissed if it lets something containing their secret formula go...
IRC did that years ago...
<Cthon98> hey, if you type in your pw, it will show as stars
<Cthon98> ********* see!
<AzureDiamond> hunter2
<AzureDiamond> doesnt look like stars to me
<Cthon98> <AzureDiamond> *******
<Cthon98> thats what I see
<AzureDiamond> oh, really?
<Cthon98> Absolutely
<AzureDiamond> you can go hunter2 my hunter2-ing hunter2
<AzureDiamond> haha, does that look funny to you?
<Cthon98> lol, yes. See, when YOU type hunter2, it shows to us as *******
<AzureDiamond> thats neat, I didnt know IRC did that
<Cthon98> yep, no matter how many times you type hunter2, it will show to us as *******
<AzureDiamond> awesome!
<AzureDiamond> wait, how do you know my pw?
<Cthon98> er, I just copy pasted YOUR ******'s and it appears to YOU as hunter2 cause its your pw
<AzureDiamond> oh, ok.
Source : http://bash.org/?244321
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
Oh come on, everyone knows that SOP for redaction in government is to redact in Word by changing the text colour to white, or the background to black...
Ha! I knew it. All you people who think that redaction somehow evolved from nothing more than microscopic ink blots are godless fools.
To avoid a meltdown, follow these easy steps.
1. Read radiation gauge and ensure it shows no more than (deleted for reasons of national security).
2. Press the (deleted for intellectual property reasons) button.
3. Watch carefully for (deleted for reasons of national security).
If meltdown cannot be avoided, (deleted for reasons of excessive gore and violence).
We used to have a Bill of Rights. Now, with the rights gone, all we have left is the bill.
I've had one of these devices rigged up so that when I want to send an e-mail, post stuff in a web form or something, I just write it on a piece of paper and scan it, and it does everything else. To be honest, I [REDACTED] recommend it. The [REDACTED] machine is quite good at [REDACTED] everything I [REDACTED] want it to do. I [REDACTED] for one [REDACTED] welcome [REDACTED] our new [REDACTED] photocopier [REDACTED] overlords.
Top Secret intelligence is defined as intelligence that if released would cause grave harm to the United States, its allies, their interests, and/or its operations abroad. They are not going to trust a machine to go through information and determine what is and what isn't TS material. All it takes is for the AI to screw up a few times in one report for the risk that someone will get hurt or killed to increase to unacceptable levels.
I just don't see the real sensitive environments touching this with a ten foot pole. Hiring analysts by the dozens to work on this stuff is a lot cheaper to these agencies than having to answer to Congress why dozens of informants in Al Qaeda were assassinated because the AI didn't block their names.
I thought that Intelligent Redaction was the Discovery Institute's explanation for why they don't release any research.
No kidding!!! What do you say at this point?
If the intelligent redaction feature accidentally misses actual critical information and instead redacts non-critical information, that could be a good thing. I mean, for people who want to know things other people don't want them to know.
[ think ]
What is the next step in development of this feature? What about using it to prevent the duplication of copyrighted works (sort of a DRM for paper)?
So are grammar checkers etc. Those things are all easy to "do" but they have never been and most likely will never be done well by machines in a generalized manner that can be used in multiple contexts. Anybody can develop natural language based programming rules in just a few lines of a script with grep and sed. Simply doing such a thing for a single specific case or two and doing it in a way that will work flawlessly in all contexts are worlds apart. I'm sure this thing is somewhere in-between but leaning heavily towards the only-works-in-certain-contexts side.
There is more use for this product than just lawyers' offices. My mother works for children's services in Washington State.
She has to manually redact a lot of police reports & case worker reports.
Granted, this product would automate it, but someone will always have to proofread the document.
Hopefully if DSHS gets these, she'll still have a job.
Let's see you last for 4 years when out-manned, out-gunned, out-fed, and out-funded.
The masses are the crack whores of religion.
Don't you mean censorship ?
Contrary to what 9/11 Truthies say there were all kinds of parts visible and recovered from the Pentagon site. A simple Google search shows this. (But of course we all know that they were planted and the photos were doctored.)
Undetectable Steganography? Yep, there's an app fo
This machine would be interesting, or frightening, depending upon your point of view, if AI was anywhere near the kind of skill level you need so that this concept even remotely works. As it is, many intelligent people spill secrets they should not spill.
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
It's called compartmentalization. Look it up.
Argument over, really. Your monolithic-entity fallacy fails, utterly.
Nice chatting with you.
...what sort of automatically-redacted copy will it make?
"How to Do Nothing," kids activities, back in print!
What's to stop it from holding our secrets hostage in an attempt to be given human rights?
I recall seeing a patent that did this. Instead of a photocopier though it did it through the network. So for example, if you sent an email externally or copied a file to a database with lower security access, it would be auto-redacted.
Can't remember the number, but should be easy to find.
.... some p0rn, will it airbrush out the naughty bits?
Have gnu, will travel.
Putting aside the fact that OCR and related AI is still just this side of "not very good," for an AI to successfully and exclusively redact certain material, someone still has to at some point define the dataset of what is redactable, and feed that data into the machine. Unless, of course, this AI is simply allowed to crawl the networks and glean for itself what's good and bad for us...
Slashdot Burying Stories About Slashdot Media Owned
This scanner thing is already obsolete.
Redacting after the fact with trusted computing is the wave of the future; though it is a bit treacherous.
has been redacted. Welcome to the United Gulags of America, loozars.
PatRIOTically,
George W. Bush.
Everyone seems to be automatically assuming that it would be used for classified data. This looks more to me like something developed for the businesses that have to deal with HIPAA. Well-defined medical forms (with SSN, name, etc. in the same place every time) could automatically be redacted in order to ensure patient privacy and HIPAA compliance. Looks like a win for the medical industry. It could also work well in the financial world where "need to know" information can be blacked out on financial forms and applications.
--Insert catchy
[sarcasm]
Oh cool, now all of the secret info will already be collected in the copier for any bad guy to harvest! how marvelous
[/sarcasm]
I worked at a company that had a "top secret" project that they were working on that if the internal name were revealed it could result in, well, not much really... but management was very paranoid that it would get out. This copier could have sensed the name and blanked it out when sales copied "sensitive" material accidentally. Nice.
Except for the fact that once you make the machine start thinking the user begins to stop thinking. If sales knew about this feature then they wouldn't be bothered to care at all what they were copying and sending out to customers. Eventually the copier wouldn't be a fail safe for the user but would be just a new liability for error. I can't see how this is really much better except it just shifts the blame to IT.
This idea was actually hatched at PARC. Present-day personnel were digging through old files to rediscover forgotten PARC inventions. It was originally used to redact humorous parts, so unfortunately the rediscoverers missed the punch line.
In Soviet Russia photocopier redacts you!
... uh, that was awfully close to truth.
The massive AI in Metal Gear Solid, intended to (among other things) remove all references to the names of The Patriots. Of course, to do this, it had to have that list embedded in it...
I haven't read anything that funny in years.
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
You are confused. Liberals are people who believe in reform based on scientific principles and understanding. Conservatives are people who are opposed to reform based on authority. These are the definitions. If there are people who deny scientific evidence, they are not liberals.
As for the Reagan quote in your sig, there's one that the current administration seems to be based on: "One way to make sure crime doesn't pay would be to let the government run it."
Shhhhh! They may not know we know!
Meet Blackpaper!
I just finished pulling about one ream of paper out of the guts of the Xerox machine at my office. Why SURE I trust the Xerox to redact documents while I'm getting more coffee! What could go wrong??
This unbiased moderation brought to you by the Porcine Aviation Group!
tm
Support TBI Research: http://www.raisinhope.org
Hear that ? That's sensitive information leaking.
In [redacted], [redacted] redacts [redacted]!
If the device has the ability to scan and recognize secrets, couldn't that ability be misused if it was hacked? The possibility, however remote, of storing and/or forwarding the redacted info isn't a danger?
And of course, context is utterly irrelevant. This thing redacts the client's name in a court document and automatically starts taking it out in invoices too. Hey! I want to find a law firm that uses this technology...
Sadly not, the southern states don't exactly produce, well, anything of merit.
the description given as "computer understands what you write" is an AI-complete issue.
So unless you've seen little robots lying around pondering about life and how superficial data was, I don't see that happening any time soon.
I've done some research on exactly the same issue, and let me tell you, the scientific community is light years away from "text understanding".
What we do in such cases is quite simple. I believe this is a classic search scenario. You take the documents/lines/terms/ontology-members that are defined as "sensitive" and then apply X algorithm to detect them in all documents coming in for search.
Every algorithm has its own pros and cons, mainly measured by its false positives and false negatives, and each one suits a different need. Algos with very few false negatives end up having a LOT of false positives, and vice versa.
So you either accept quite a few secrets not getting detected by the system or you accept very few secrets leaking out and a lot of normal text also blacked out.
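The trade-off can be shown with a toy example. The scores and labels below are invented for illustration and do not come from any real detector; `count_errors` is a hypothetical helper that counts both kinds of mistake at a given redaction threshold.

```python
# Hypothetical detector scores (0..1), each paired with whether the line really is secret.
SCORED_LINES = [
    (0.95, True), (0.80, True), (0.60, False), (0.55, True),
    (0.40, False), (0.30, True), (0.20, False), (0.10, False),
]


def count_errors(threshold, scored=SCORED_LINES):
    """Return (false_positives, false_negatives) when redacting every line scoring >= threshold."""
    false_pos = sum(1 for score, secret in scored if score >= threshold and not secret)
    false_neg = sum(1 for score, secret in scored if score < threshold and secret)
    return false_pos, false_neg


if __name__ == "__main__":
    for threshold in (0.9, 0.5, 0.25, 0.0):
        fp, fn = count_errors(threshold)
        print(f"threshold={threshold:.2f}  normal lines blacked out={fp}  secrets leaked={fn}")
```

On this made-up data, lowering the threshold trades leaked secrets for blacked-out normal text; a threshold of 0.0 leaks nothing because it blacks out every line.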
I'm all for just printing black pages "just to be safe"
So, I either see the poster spewing crap to dramatize the slashdot post (what a surprise!) or the classic marketing people going mental and talking about things they have no idea about.
Hell, I won't even bother to check the xerox announcement.
Stay away from this tech.
1806 - carbon paper invented by Ralph Wedgwood
1806+1 week - oops! Wedgwood's secretary punches hole in said carbon paper, uses it anyway.
You are confused. Liberals are people who will spend all the money in other people's pockets to save animals and trees. Who cares about some stupid white bears that live where it's too cold for people anyways? It's not worth wrecking our economy over, which shows just how much I care about the bears and trees.
Conservatives are people who step back and observe the situation before they announce "NO, I'm not going to sacrifice all the gains of the 20th century just because someone who needs a bath and some life experience says the ice is melting."
Now run along before you get run over by a hummer.
No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan
Nice straw man. I'll build a straw woman for him to fu-- I mean marry.
Conservatives don't understand that cutting back on pollution that results in millions of dollars' worth of medical treatment is a worthy expenditure. They will happily obscure the skies with poison to save corporations a buck. Meanwhile we all have to breathe poisoned air because of their greed and ignorance. Unable to see past their goal of lowering taxes at any cost, they wreck the government, all while pretending that the only victims of their stupidity are bears in climates that are too cold for us to live in.
Actually I don't know if I've built a straw man as much as I have accurately described the actions of conservatives.
These machines will upload all confidential material to a central Xerox server so that it can go through this information and improve the performance and accuracy of future similar products.....and.... . . . . Blackmail.....and eventually . . . . . World Domination bwahahahaha!
I'm looking outside right now. It looks fine. I'm breathing the air, and it's fine. The trees are fine, the sunshine is fine, the weather is mild and comfortable.
All I see is you making a case to raise my taxes. I'm going to stop you from spoiling this beautiful day.
No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan
I see a cloud of yellow smog above my city.
I see increased incidences of asthma in Houston and other industrial areas. I have asthma.
I see Republicans rolling back environmental protections knowing full well that they are literally making us sicker.
I see you ignoring scientific evidence in favor of ignorance and hollow arguments. I see a Republican.
.... you subscribe to the notion that when something comes out of AI that works, it's engineering; but when it doesn't work, it's still AI. This is a running joke among AI researchers. To some extent, it's justified. Really, how much AI is in alpha-beta pruning and pattern matching? However, this view point discredits every single bit of work that has been done in the field of AI, and trivializes all achievements after the fact. Your friendly robot factories? Automated airport trains? Data mining? And finally, what is arguably the crown jewel in robotic AI, autonomous cars? All courtesy of work in AI.
The biggest misconception that lay-people have of AI is that it is one giant engineering project, like the moon landing. This is completely wrong. Think of it rather as Physics: an on-going process through which we attempt to understand our own intelligence and recreate it in machines. To expect it to yield a fully-formed android within your lifetime is not only misguided, but displays a lack of understanding of what intelligence means. I can guarantee you that we will not notice when strong AI actually is created. It will probably take us a generation just to realize that computers achieved consciousness a generation earlier.
And btw - the two terms you're looking for are false negatives and false positives.
Those who can, do. Those who can't, sue.
. . . and slicked bread!
Bill Clinton carried many of these ideas you think are pretty silly and left with a budget surplus. George Bush, on the other hand, destroyed that surplus. So much for conservatives stepping back and observing the situation before they dive in. At least we didn't destroy our economy for nothing, though, right?
"Question with boldness even the existence of a god." - Thomas Jefferson
Bullish Machine Tzar
OLDSPEAK:Torture. NEWSPEAK:Rendition.
OLDSPEAK:Censorship. NEWSPEAK:Redaction.
I fail to see why I have to pay for you to live somewhere nice.
Get a job, get a life, get a house in a nice place, and quit trying to make me pay for your immorality, sloth, grunge, and whining.
No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan
Vote for Ron Paul.
Is he the pr0n guy? No wait, that's Ron Jeremy. I'd vote for him. Ron Paul is the other guy, the guy who writes his Es and Ls backward. No thank you.
If you are not allowed to question your government then the government has answered your question.
That you have to "train" it to recognize various document types, and then it redacts the same locations on subsequent documents that match the fingerprint.
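A rough sketch of that train-then-match idea, under heavy assumptions: a "document" here is just a list of `Label: value` lines, the fingerprint is the ordered set of labels, and a "location" is a field label rather than a region on a scanned page. `ZoneRedactor` and `fingerprint` are hypothetical names, not anything taken from Xerox.

```python
def fingerprint(lines):
    """Identify a form type by its ordered field labels (the text before each colon)."""
    return tuple(line.split(":", 1)[0].strip().lower() for line in lines if ":" in line)


class ZoneRedactor:
    def __init__(self):
        self.templates = {}  # fingerprint -> set of field labels to blank out

    def train(self, lines, sensitive_labels):
        """Remember which fields a human marked as sensitive on this form type."""
        self.templates[fingerprint(lines)] = {s.lower() for s in sensitive_labels}

    def redact(self, lines):
        """Blank the remembered fields; return unrecognized documents unchanged."""
        labels = self.templates.get(fingerprint(lines))
        if labels is None:
            return list(lines)
        out = []
        for line in lines:
            label, sep, _ = line.partition(":")
            if sep and label.strip().lower() in labels:
                out.append(f"{label}: [REDACTED]")
            else:
                out.append(line)
        return out


if __name__ == "__main__":
    machine = ZoneRedactor()
    machine.train(["Name: Alice", "SSN: 111-22-3333", "Visit: 2007-01-02"], ["SSN"])
    print(machine.redact(["Name: Bob", "SSN: 999-88-7777", "Visit: 2007-03-04"]))
```

Anything that does not match a trained fingerprint comes out untouched, which is exactly the failure case to worry about with a slightly altered form.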
If you are not allowed to question your government then the government has answered your question.
Sure, just as soon as you start paying for your own fucking highways. If you don't live in a big city, you are getting a lot more tax-funded projects than you are paying for. By the way, did you add that last noun in for an extra touch of irony? Maybe you should have gone all the way and added "childish, baseless assumptions" to the mix.
If you'd take the time to learn the history of your government, you'd have to logically conclude that 9/11 is not well beyond the possibility of an inside job. The real clincher here is that the case could have been decided reasonably one way or the other, but all the evidence was destroyed, confiscated, or simply didn't exist (the crash in Pennsylvania).
...block access to those sensitive or secure... Buy some other scanner!
Clippy as a Revenant? Dear Gods, NO! I do *not* want to see clippy with a rocketlauncher on each shoulder.
What a depressingly stupid machine.
Is Tip-Ex not being maintained any more then, or did Black Marker buy them out?
I have enough trouble just finding somebody who wants to see my sensitive or secure areas. I don't see how they will ever sell any of these.
I have a high paying job and I live somewhere nice (near my job).
Stop polluting the fucking air. Stop making it more profitable for companies to pollute my air.
Stop ruining America.
I pay for my highways when I fill up my Jaguar. And we should have more toll roads too, BTW. The people not paying for the roads aren't the rich like me, they are the poor who literally get a free ride.
No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan
Bah, another lib who doesn't have a brain.
No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan
lol - that's your best shot?
Nice taunt. Child.
No weapon in the arsenals of the world is so formidable as the will and moral courage of free men.-Ronald Reagan