Xerox Confirms To David Kriesel Number Mangling Occurring On Factory Settings
An anonymous reader writes with a followup to last week's report that certain Xerox scanners and copiers could alter numbers as they scanned documents: "In the second Xerox press statement, Rick Dastin, Vice President at Xerox Corporation, stated: 'You will not see a character substitution issue when scanning with the factory default settings.' In contrast, David Kriesel, who brought up the issue in the first place, was able to replicate the issue with the very same factory settings. This might be a serious problem now. Not only does the problem occur using default settings and everyone may be affected, additionally, their press statements may have misled customers. Xerox replicated the issue by following Kriesel's instructions, later confirming it to Kriesel. Whole image segments seem to be copied around the scanned data. There is also a new Xerox statement out now."
Swapping numbers while copying may seem like bizarre behavior for a copier, but in comments on the previous posting, several readers pointed out that Xerox was aware of the problem, and acknowledged it in the machine's documentation; the software updates promised should be welcome news to anyone who expects a copier to faithfully reproduce important numbers.
The old analog process never had this problem.
69 dude!
Now if 6 turned out to be 9, ...if all the hippies cut off all their hair,
I don't mind, I don't mind,
I don't care, I don't care.
Dig, 'cos I got my own world to live through
And I ain't gonna copy you.
“He’s not deformed, he’s just drunk!”
What???? A copier changes numbers? A copier is supposed to copy.
to see Xerox fall to this kind of hand-waving. Mr Rick should either publicly apologize or leave his post. You might say this event does not warrant such a response; however, I argue that it does.
from a copied report that changed a 3 to an 8....
Did this tool try to notify Xerox first or did he just start shouting from the mountain tops?
It isn't a security issue so the only purpose served by his going public without him contacting Xerox is to stroke his ego.
How would any of you like it if someone found a bug in your stuff and instead of notifying you, went to your managers and bad mouthed you?
You'd think he was a prick.
Am I the only one who finds this truly frightening; that the photocopier has a bug in a subsystem that is basically reading the content of the documents being photocopied? I didn't even know photocopiers did this normally. This is another prime example of how organizations like the NSA can theoretically get their fingers into cracks we didn't even know existed. I would never have thought that something I photocopy could be intercepted, but apparently it can. The bug part of this issue is just a small thing relative to the larger issue, IMHO!
By the way, I read in another comment about the new slashdot ipad app. I'm posting this comment from it. What a breath of fresh air compared to the slashdot mobile site!
The potential for damage with this kind of error almost can't be overstated. Besides errors in billing, construction, manufacture or products, medicine dosages, etc. already outlined, there are other likely problems:
Publications may contain wrong data.
Scientific conclusions may be based on wrong data.
Government policy may be based on wrong data.
Money may go to wrong accounts or be taken from wrong accounts.
You think you paid your taxes? The government may not agree.
Am I the only one who finds this truly frightening; that the photocopier has a bug in a subsystem that is basically reading the content of the documents being photocopied?
Yes, you should find that frightening. That's not new, though, pretty much all photocopiers these days don't actually "photocopy" the document, they scan it to memory and then print the scan. Your documents are saved to memory on the photocopier. Yep, that's a security flaw.
http://www.thedailygreen.com/environmental-news/latest/digital-copier-security-461009
http://www.cbsnews.com/8301-18563_162-6412439.html
http://message.snopes.com/showthread.php?t=60313
http://www.geoffreylandis.com
Time to buy a Ricoh.
At least they don't monkey with the compression to the level it actually distorts the image.
This signature is lame.
Back when I saw the first scanner-based copiers roll out, I thought we'd see something similar to this happen. Whenever you eliminate the analog signal path it becomes much easier to corrupt the thing in unnoticeable ways, even unintentionally! It's clearly the way to go, because of how much complexity it removes, but as soon as you start storing data on a medium and reading it back you start having these problems, and it only gets worse as you try to conserve that storage medium with compression or other tricks/hacks. It's just a fact of life in the digital age: the tradeoffs are still better than the previous way of doing things. (Well, that is unless your name was "Mr. Buttle" and the ministry of information drilled a hole in your ceiling).
I am just really glad to see that Xerox is taking the initiative, working closely with the person who found the problem, and opening its doors to others who want to help out. It's all too often that a big company has a big obvious problem with a product and not only doesn't admit there's a problem, but refuses to help or work with those experiencing it.
Coming soon ... Xerox voting machines.
Remember when Xerox commercials featured a monk copying documents? Their ad agency was trying to humanize the company.
So all they've done now is add an algorithm for random human error. Just making the company more human... monks did that as well.
We've got a XEROX 7556 in the office and I scanned several number-heavy documents, with fonts as small as 6pt. I tried both the default and low res levels. Every number came out correct. Since we recently moved to paperless records (and we had hundreds of thousands of multipage documents) I was a bit worried. I'm less worried than I was when the story first came out. Let's hope the upcoming fix doesn't slow the scanning process noticeably.
Where all think alike, no one thinks very much.
I'm not saying that you're wrong, but I would like to know how reliable your information is.
Do you work, or have you worked, directly for Xerox on these sorts of products?
If you have not, how did you come upon this information? Is it based on actual specifications or design documents? Or is it based on speculation?
They meant to admit this to the public last week, but their press release got its letters changed around for some reason...
At the federal level, our entire legal system is based on the concept that a machine copy of a document is as good as the original. In addition to all the other problems pointed out by other readers -- engineering errors, medical errors, financial errors -- this type of error also greatly harms our legal system. That is a problem, since the legal system is essentially the operating system for our society. I don't see how Xerox is going to survive the wave of lawsuits that is going to follow. They need to immediately warn everyone to stop using their systems, and then recall all affected units. Going forward, I suspect that the name "Xerox" will now mean: "to mangle or randomly distort".
Numbers are the bedrock of the capitalist regime. They are sacred. Do not transform them when copying them. Better to mangle words cause we all know they have semiotic plasticity anyway. But for the love of the capitalism and all it portends, please keep the numbers pure. That is all.
Time to buy a Ricoh.
At least they don't monkey with the compression to the level it actually distorts the image.
Any compression at all, any modification at all, is unacceptable in a copier. How do you not get that?
How soon until they'll patent this as a feature and try to sue someone else?
I expect a copier to copy an image of the page, not to perform an OCR scan and reprint it.
What's next? An NSA back door so the scanned text can be fired off to the US spy network?
I do not fail; I succeed at finding out what does not work.
Are you sure that you read the article?
Please quote the exact sentence or sentences that describe that the machines operate as you claim they do. I expect to see explicit references to the scanning process, the storage to some storage medium, the compression, and the printing based on that compressed and stored representation.
I do not see the words "store" anywhere on that page. The words "storage" and "print" do appear, but they are outside of the article in completely unrelated text.
Please just come out and admit that your claims are not reliable, and that they are based on pure speculation, if that is indeed the case (as it does appear to be).
I wonder if this is caused by an anti-copy feature that just hits the innocent.
Nobody learns from others' mistakes, eh Xerox?
The copiers are failing to copy numerals properly.
Could someone explain what the purpose of this compression is? There must be enough memory to copy without the heavy compression since there are high resolution presets with less compression also. Is this compression used as a way to lower the resolution? I don't see the added value of the compression at all.
http://www.dkriesel.com/en/blog/2013/0808_number_mangling_not_a_xerox-only_issue
And one of the comments to that posting says:
I have experimented with the open source jbig2enc library available at http://github.com/agl/jbig2enc, which has an encoding parameter called the “threshold”, described like this:
“sets the fraction of pixels which have to match in order for two symbols to be classed the same. This isn't strictly true, as there are other tests as well, but increasing this will generally increase the number of symbol classes”
The included command tool accepts values for this parameter between 0.4 and 0.9, with 0.85 as the default.
I have found replaced digits in single-page numerical tables encoded with this parameter set as high as 0.82. As with the other examples you have found, the errors are not in any way obvious to the eye, which is, of course, the real problem.
Since JBIG2 has been supported in PDF since 2001, it would be surprising if only Xerox have fallen into this trap.
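For anyone wondering how a "fraction of pixels which have to match" setting can end up swapping digits, here is a toy sketch in Python. It is not the jbig2enc code and not what any Xerox firmware does; the 5x7 glyphs, the function names (match_fraction, encode, decode) and the threshold values are all made up for illustration, and the thresholds are not comparable to jbig2enc's 0.4 to 0.9 range. It only shows the general idea of pattern matching and substitution: each glyph is compared against a dictionary of glyphs seen so far, and if enough pixels agree, the stored glyph is reused in place of the new one.

```python
# Toy model of symbol matching with a similarity threshold.
# All names, glyph shapes and threshold values here are hypothetical.

def parse(rows):
    """Turn rows of '#'/'.' characters into a bitmap of booleans."""
    return [[ch == "#" for ch in row] for row in rows]

EIGHT = parse([".###.", "#...#", "#...#", ".###.", "#...#", "#...#", ".###."])
SIX = parse([".###.", "#....", "#....", ".###.", "#...#", "#...#", ".###."])

def match_fraction(a, b):
    """Fraction of pixel positions where two same-sized bitmaps agree."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total

def encode(glyphs, threshold):
    """Return (dictionary, indices); similar glyphs share one dictionary entry."""
    dictionary, indices = [], []
    for glyph in glyphs:
        for i, representative in enumerate(dictionary):
            if match_fraction(glyph, representative) >= threshold:
                indices.append(i)  # reuse the stored glyph, discard this one
                break
        else:
            dictionary.append(glyph)  # nothing close enough: new symbol class
            indices.append(len(dictionary) - 1)
    return dictionary, indices

def decode(dictionary, indices):
    """Rebuild the page from dictionary entries only."""
    return [dictionary[i] for i in indices]

page = [EIGHT, SIX, EIGHT]
for threshold in (0.90, 0.97):
    dictionary, indices = encode(page, threshold)
    print(threshold, len(dictionary), indices)
```

In this toy the 6 and the 8 differ in only 2 of 35 pixels, so at a threshold of 0.90 they land in the same class and the decoded 6 comes back as a clean, sharp 8. At 0.97 they are kept apart. That matches the quoted description that raising the threshold generally increases the number of symbol classes, and it is why the result looks like a perfectly good digit rather than a blurry artifact.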
Just as well for Rick, he outsourced this work to HCL. They'll clean up the mess left by those lazy, grasping American engineers in no time at all!
Stick Men
The U.S. government says it can require companies to do things, and the companies have to keep it secret. There does not seem to be any limitation.
Thirty plus years as a professional engineer - the lifeblood is "blueprints". This has always been a significant issue; regardless of the technology involved, there WILL be reproduction errors. Be it because of dirt on the optics, spilled coffee on the originals, scratches on the mylar / sepia, or bad diazo paper; EVERYBODY with any sense knows to check and double check anything which does "not add up". Hence why checksum was developed for electronic data processing. ... ad infinitum; I WANT the original file translated into the oldest format available, preferably human readable! With electronic signatures; but the suit weasels in industrial corporations use my PE status to make me the scapegoat for all their deliberate ignorance and just plain stupidity.
The worst is to try to use a pdf of a tiff of a pdf of a jpg of a
Time to Go Galt and let their progeny "pick grit with the chickens" (Sen. Al Simpson).
With memory so cheap, why not just store uncompressed bitmaps? Problem solved.
The fact that this is even POSSIBLE makes me worry that there's covert firmware deliberately tampering with things.
First of all, how does it even know what a number *looks like*?
And how the hell does it SWAP numbers?
I've never known decompression artifacts to do that. It's just plain loony.
Something seems decidedly fishy here.
I think everything else you wrote was good but in the case of disclosing security attack vectors, letting everyone know or only letting hackers know, before giving the company a chance to fix the security hole results in a great many more hackers using the attack vector than if it had been reported without public disclosure. We have no idea who figured out the attack vector first, the researcher could very possibly be first, or be one of the first, to discover it. Do hackers always share attack vectors with other hackers immediately after finding them?
Security bugs are very different from functionality bugs and should not be compared. Similarly the disclosure of these bugs should follow different paths.
Think globally but act within local variable scope.
I'm sorry, but this story is absolutely outrageous. Photocopy machines are used in mission critical situations. Users hear the word PHOTOCOPY and expect the copies are like photographs- one-for-one duplicates, with the proviso that lowering the resolution causes universal copy degradation. No-one expects SEMANTIC compression, or any form of OCR/repeating pattern compression. The photocopier does NOT have a clue as to the nature of documents being copied, and CANNOT make any assumption about the semantic properties of the document.
The ONLY compression algorithms acceptable are JPG and similar syntactical spatial compression algorithms. Yes, JPG requires a VERY high setting when used in something like a photocopier, so NO document, no matter what its content, is degraded too much by the compression, but this is unavoidable. Such high quality JPGs are nowhere near as 'small' as those usually encountered, but they will still give significant memory savings over the original bit-image capture.
So called document compression algorithms are ONLY suitable for simple books with straightforward, non-challenging fonts in fairly large sizes. Even then, such algorithms are KNOWN to have major issues, and are only really suitable for documents that are casually read (text ONLY) and NOT for documents where the accuracy of equations or lists of numbers will matter to the reader. You will notice that books you download that contain loads of FINE detail, like equations, tend to be JPG scans of the page (often in a PDF container).
There is ZERO chance that a Xerox software update will fix this problem. Instead, it will simply make the problem less obvious, so that only when the rocket explodes or the plane crashes will accident researchers discover that Xeroxed documents had altered certain details on the copied document. Every mission critical environment is under a legal obligation to throw out ALL their Xerox machines, and to mark the brand as dangerous.
Xerox needs to trash EVERY model that uses semantic document compression algorithms (which stay highly dangerous no matter how high you raise their settings) and design new models with lossless compression, or very high setting JPG like compression. If the level of compression creates files too large for their current storage methods, they'll have to build decent hard-drives into their more expensive models to hold current document runs. For god's sake, HDD costs per TB have been tiny for ages now.
Xerox proves that for too many companies, there are no such things as competent software engineers. An ENGINEER doesn't just know of a technique, he/she understands when the technique is applicable, and also how to do test research proving that a proposed solution is acceptable. How the hell did Xerox EVER authorise compression algorithms that EXPECT certain forms of document, when you can stick ANY form of document into your photocopier?
How is this information worthy of slashdot.... What in the world would make somebody think this is interesting... Somebody who has obviously not called the customer service department of virtually any company.
Whatever, downvote, but good grief.
Can you please show some evidence proving that you actually are a Xerox technician?
This explains the shitty character sheet scans.
Yes, faxes? Remember them?
They're still widely used in many industries today. In fact, I applied for an Apple Developer account in a company name not too long ago and, unlike with an individual account, there is some paperwork involved that Apple insist must be faxed to them. Apparently it's more secure. Anyway, I'm not ranting about that issue today, but more the widespread use of faxes in the area of Law.
Lawyers love faxes. They fax everything they can. A lot of them are using email more and more these days, but faxes are still a critical part of their business.
Most faxes can use JBIG compression. High-end faxes use JBIG2 compression. This compression is what's been blamed in this Xerox issue. How many faxes have been received over the years that have been subject to silent modification of the information?
It's not hard to imagine a legal situation where just one number modified on a page could prove to be very expensive...
Specialist Mac support for creative pros, Melbourne