OCR Software Dev Abbyy Exposes 200,000 Customer Documents (bleepingcomputer.com)
A misconfigured MongoDB server belonging to Abbyy, an optical character recognition software developer, allowed public access to customer files. From a report: Independent security researcher Bob Diachenko discovered the database on August 19 hosted on the Amazon Web Services (AWS) cloud platform. It was 142GB in size and it allowed access without the need to log in. The sizeable database included scanned documents of the sensitive kind: contracts, non-disclosure agreements, internal letters, and memos. Included were more than 200,000 files from Abbyy customers who scanned the data and kept it at the ready in the cloud. "Some collection names like 'documentRecognition,' or 'documentXML' hinted that database would be part of a data recognition company infrastructure," Diachenko writes in a blog post today.
Don't bother keeping anything onsite. Thanks, AWS.
https://www.youtube.com/watch?...
Pain is merely failure leaving the body
I just assume that any online (cloud based or not) OCR or fax bridge site is going to store a copy of my document in an insecure way. I assume that employees of the service will have access to view my document. I haven't thought too much about them exposing my documents to the public, but it's not a huge step from what I already assumed about them. Anyway, the result is that I don't send anything sensitive or with information I wouldn't want publicly known through online OCR or fax. Because it would be crazy to upload my private sensitive documents to randos on the Internet and assume that they'll never be seen.
I've been using this one for a long time, since years ago there were not many OCR apps on the Mac to choose from: https://ocrkti.com So far it just works for me, and it's blazing fast, too!
I have absolutely zero pity for the companies/people who uploaded such data to Abbyy's servers. They knew perfectly well what they were doing. You don't store private data unencrypted in the cloud unless you want to share it with the entire world.
The DB itself was on the web? And not behind some kind of proxy?
Maybe it was too hard for the poor folks to configure NAT properly, so they just assigned an Elastic IP to everything, with traffic allowed from 0.0.0.0/0.
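For anyone wondering what "traffic allowed from 0.0.0.0/0" looks like when you audit for it, here is a rough sketch. It assumes ingress rules shaped like the `IpPermissions` entries that EC2's DescribeSecurityGroups call returns; the sample rules and the helper names (`is_world_open`, `exposes_port`) are made up for illustration and are not taken from the Abbyy incident.

```python
import ipaddress

MONGODB_PORT = 27017  # MongoDB's default listening port


def is_world_open(cidr: str) -> bool:
    """True if the CIDR block covers every address (0.0.0.0/0 or ::/0)."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def exposes_port(rule: dict, port: int = MONGODB_PORT) -> bool:
    """True if an ingress rule lets any address on the internet reach `port`."""
    protocol = rule.get("IpProtocol")
    if protocol == "-1":          # "-1" means all protocols and all ports
        covers_port = True
    elif protocol == "tcp":
        covers_port = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
    else:
        covers_port = False
    cidrs = [r["CidrIp"] for r in rule.get("IpRanges", [])]
    cidrs += [r["CidrIpv6"] for r in rule.get("Ipv6Ranges", [])]
    return covers_port and any(is_world_open(c) for c in cidrs)


# Hypothetical sample data: one wide-open MongoDB rule, one restricted SSH rule.
sample_rules = [
    {"IpProtocol": "tcp", "FromPort": 27017, "ToPort": 27017,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
     "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},
]

risky = [rule for rule in sample_rules if exposes_port(rule)]
print(len(risky), "rule(s) expose the MongoDB port to the whole internet")
```

This only inspects rule data you feed it. Fetching the real rules from an account is left out on purpose.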
Notice how those who decided to have the scanning process outsourced to "somewhere in the cloud" will consider this a confirmation of their success. Now all the blame is assigned to Abbyy, and no blame is assigned to them - exactly what they wanted to achieve, nothing less.
Anyone notice that these types of DB exposure issues almost ALWAYS happen on AWS? I'm not saying AWS is insecure, because it's not, but why is it always someone hosting on AWS that has this issue?
I know they should not have to, but I wonder how much information would still be private if Amazon pushed a nagging splash screen on the management portal reminding its customers to actually secure their databases. Or, even better, ran a basic security scan on its clients' address space. Even a trivial scan will catch most of these unsecured databases.
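A "trivial scan" in this sense could be as small as the sketch below: try a TCP connection to MongoDB's default port (27017) and see whether anything answers. The function name `port_is_open` and the timeout are my own choices. An open port only shows the service is reachable, not that it allows access without a login, and this should only be pointed at hosts you are responsible for.

```python
import socket


def port_is_open(host: str, port: int = 27017, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the `with` block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def reachable_hosts(hosts, port: int = 27017):
    """Filter a list of your own hosts down to those answering on `port`."""
    return [h for h in hosts if port_is_open(h, port)]
```

For example, `reachable_hosts(["10.0.0.5", "10.0.0.6"])` would return whichever of those (hypothetical) addresses accept connections on 27017.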
Yeah that's the thing...it is so easy to use and already comes with an API. The only reason your typical developer would put anything behind a proxy is to graft an API onto it.
I object to power without constructive purpose. --Spock
Couldn't a bunch of AWS customers band together to hire a security researcher to check their permissions? Or even Amazon itself on behalf of their clients?
Granted, there are issues of what companies want public and what they want private. I'm guessing anything bigger than a gig might trigger a warning, as would anything with personal data.
Then again, I've never used the cloud for anything more than transferring stuff from my phone to my PC, or vice versa, and have never used AWS. So I have no real experience with the issues, but am willing to bloviate on them, just like Trump.
My basic assumption for anything being OCR'd effectively free in the cloud through a software provider is that it's not safe. Could be sloppiness (as in this case), could be automated OCR+human verification.
While I actually do have a couple of Abbyy programs installed (FineScanner Pro and Business Card Reader Pro), I've never actually made serious use of them. On the other hand, I do use Microsoft's Office Lens program, which provides many of the same capabilities - but provided under the Office 365 bundling and with a lot more resources devoted to security.
I used to have a couple more photo-to-scanned-document software packages as well, but I decided to dump anything like that from Russian companies some time back. Abbyy's presence is as much an oversight of "haven't cleaned in a while" as anything else.
fencepost
just a little off
Why would a dev have prod access?
Same mistake made for decades but people still have not learned.
Go fast, get to the cloud! It only seems fast because people are ignoring security and proven practices because it is "cloud".