OCR Software Dev Abbyy Exposes 200,000 Customer Documents (bleepingcomputer.com)
A misconfigured MongoDB server belonging to Abbyy, an optical character recognition software developer, allowed public access to customer files. From a report: Independent security researcher Bob Diachenko discovered the database on August 19 hosted on the Amazon Web Services (AWS) cloud platform. It was 142GB in size and it allowed access without the need to log in. The sizeable database included scanned documents of the sensitive kind: contracts, non-disclosure agreements, internal letters, and memos. Included were more than 200,000 files from Abbyy customers who scanned the data and kept it at the ready in the cloud. "Some collection names like 'documentRecognition,' or 'documentXML' hinted that database would be part of a data recognition company infrastructure," Diachenko writes in a blog post today.
Don't bother keeping anything onsite. Thanks, AWS.
https://www.youtube.com/watch?...
Pain is merely failure leaving the body
I just assume that any online (cloud based or not) OCR or fax bridge site is going to store a copy of my document in an insecure way. I assume that employees of the service will have access to view my document. I haven't thought too much about them exposing my documents to the public, but it's not a huge step from what I already assumed about them. Anyway, the result is that I don't send anything sensitive or with information I wouldn't want publicly known through online OCR or fax. Because it would be crazy to upload my private sensitive documents to randos on the Internet and assume that they'll never be seen.
The DB itself was on the web? And not behind some kind of proxy?
Notice how those who decided to have the scanning process outsourced to "somewhere in the cloud" will consider this a confirmation of their success. Now all the blame is assigned to Abbyy, and no blame is assigned to them - exactly what they wanted to achieve, nothing less.
Yeah, that's the thing... MongoDB is so easy to use and already comes with an API. The only reason your typical developer would put anything behind a proxy is to graft an API onto it.
I object to power without constructive purpose. --Spock
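For what it's worth, the two mongod.conf settings that usually decide whether a MongoDB instance ends up reachable without a login are `net.bindIp` (or `net.bindIpAll`) and `security.authorization`. Here is a rough sketch of a check, assuming the config file has already been parsed into a dict. The function name and the sample configs are made up for illustration; nothing here comes from Abbyy's actual setup or Diachenko's report.

```python
# Hypothetical helper (not from the article): flags a parsed mongod.conf,
# given as a dict, that would accept unauthenticated remote connections.
LOOPBACK = {"127.0.0.1", "localhost", "::1"}


def audit_mongod_config(config):
    """Return a list of warnings for a parsed mongod.conf dict."""
    warnings = []
    net = config.get("net", {})
    security = config.get("security", {})

    # net.bindIp is a comma-separated list of addresses; entries starting
    # with "/" are Unix socket paths and are ignored here.
    raw = str(net.get("bindIp", ""))
    bind_ips = {
        ip.strip()
        for ip in raw.split(",")
        if ip.strip() and not ip.strip().startswith("/")
    }

    # Assumption: a missing bindIp means localhost only (the default in
    # recent MongoDB releases); older packages may behave differently.
    if net.get("bindIpAll") or (bind_ips - LOOPBACK):
        warnings.append("listens on non-loopback addresses")

    # Access control is off unless authorization is explicitly "enabled".
    if security.get("authorization") != "enabled":
        warnings.append("authorization is not enabled")

    return warnings
```

Either warning alone is survivable; both together on a cloud host with an open security group is roughly the situation described in the summary.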
Couldn't a bunch of AWS customers band together to hire a security researcher to check their permissions? Or even Amazon itself on behalf of their clients?
Granted, there are issues of what companies want public and what they want private. I'm guessing anything bigger than a gig might trigger a warning, as would anything with personal data.
Then again, I've never used the cloud for anything more than transferring stuff from my phone to my PC, or vice versa, and have never used AWS. So I have no real experience with the issues, but am willing to bloviate on them, just like Trump.
My basic assumption for anything being OCR'd effectively for free in the cloud through a software provider is that it's not safe. Could be sloppiness (as in this case), could be automated OCR+human verification.
While I actually do have a couple of Abbyy programs installed (FineScanner Pro and Business Card Reader Pro), I've never actually made serious use of them. On the other hand, I do use Microsoft's Office Lens program, which provides many of the same capabilities - but provided under the Office 365 bundling and with a lot more resources devoted to security.
I used to have a couple more photo-to-scanned-document software packages as well, but I decided to dump anything like that from Russian companies some time back. Abbyy's presence is as much an oversight of "haven't cleaned in a while" as anything else.
fencepost
just a little off
Or ... why is it always MongoDB?