One Petabyte of Data Exposed Via Insecure Big Data Systems
chicksdaddy writes: Behind every big data deployment is a range of supporting technologies like databases and memory caching systems that are used to store and analyze massive data sets at lightning speeds. A new report from security research firm Binaryedge suggests that many of the organizations using these powerful data storage and analysis tools are not taking adequate steps to secure them. The result is that more than a petabyte of stored data is accessible to anyone online with the knowledge of where and how to look for it.
In a blog post on Thursday, the firm reported the results of research that found close to 200,000 such systems that were publicly addressable. Binaryedge said it found 39,000 MongoDB servers that were publicly addressable and that "didn't have any type of authentication." In all, the exposed MongoDB systems contained more than 600 terabytes of data stored in databases with names like "local," "admin," and "db." Other platforms that were found to be publicly addressable and unsecured included the open source Redis key-value cache and store technology (35,000 publicly addressable instances holding 13TB of data) and 9,000 instances of ElasticSearch, a commonly used search engine based on Lucene, that exposed another 531 terabytes of data.
They stole the data which I had stolen from the guys who stole it. Damn thieves!!
Worthless server logs???
Sure, nothing that would aid an intruder in server logs...
Wherever You Go, There You Are
I have worked at more than a couple of places where the Oracle database and application passwords were all the default from installation
It can take a lot of work to identify every dependency on the defaults once they have been in use for a few years, and way too many admins just do not want to deal with it
Oracle used to have a sense of humor about it with one of the key default passwords being 'change_on_install', now they force you into a password generation cycle at the end of any install
Moreover, never trust your users (consider admins as users), they can defeat any security scheme that you set up.
The only way to be certain is to consistently test for simple attack vectors before assuming that you have to deal with complex situations
Some admins want to spin far-out webs of security when their pants are down around their ankles; in many cases it is just so that they do not have to do their jobs
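As a rough illustration of testing for the simple stuff first, here is a minimal sketch that checks whether the default ports of the services named in the summary accept a TCP connection. The function name and the port table are my own; the table assumes install-default ports, and an open port only shows reachability, not that authentication is missing.

```python
import socket

# Install-default ports for the services named in the summary.
# Assumption: the deployment never moved them off the defaults.
DEFAULT_PORTS = {
    "MongoDB": 27017,
    "Redis": 6379,
    "Elasticsearch": 9200,
    "memcached": 11211,
}

def reachable_services(host, ports=DEFAULT_PORTS, timeout=1.0):
    """Return the names of services whose TCP port accepts a connection on host."""
    found = []
    for name, port in ports.items():
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(name)
        except OSError:
            pass
    return found
```

Run it from outside your network and only against hosts you administer, e.g. `reachable_services("db.example.internal")` (a made-up hostname). Anything that shows up in the list deserves a closer look.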
Wherever You Go, There You Are
There's no need to secure mongoDB because it's webscale. That means it's invulnerable to hackers and bad programming.
Just cruising through this digital world at 33 1/3 rpm...
The status of memcached is quite unfortunate. We need it to share sessions across hosts, which is a requirement for load balancing, but it has no authentication feature
I read that latest versions support SASL, though.
So, how many of these databases contain Clinton's e-mail stash?
NoSql = NoSecurity
Table-ized A.I.
It's OK... it puts most of the bad guys over their data caps when they attempt to download it all.
Urethra Franklin is my fav.
It little behooves the best of us to comment on the rest of us.
Everyone focuses on ID theft. But how about someone intentionally corrupting data, such as the 'deleted_beacuse_you_didn't_password_protect_your_mongodb' entry?
By corrupting data you can create a 'Tuttle vs Buttle' event if those data are used for intelligence dragnets, or throw a nice monkey wrench into someone's high speed trading algorithm. Remember, your results are only as good as your data allow them to be.
putting the 'B' in LGBTQ+
It's amazing they are having a drought.
The tools aren't user friendly.
Setting up authentication for a web api should be trivial. Right now it's not--you can figure it out, but it's substantially more complicated than Googling "what authentication model should I use for this" and adding a couple of lines to your source files. Many programmers outside of critical areas are not going to spend enough time on it to get it right so long as that is true.
Making it worse is bad auth implementations by third-party providers which consume programmer-hours in debugging. (I'm looking at you, Facebook, with your really unhelpful error messages.)
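For a sense of scale, the core of one common building block really is only a few lines. This is a minimal sketch, assuming a shared-secret scheme where the client signs each request body with HMAC-SHA256; the `sign` and `verify` names are mine, and key distribution, replay protection, and transport security are all left out, which is where most of the real work is.

```python
import hashlib
import hmac

def sign(secret: bytes, message: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of message under the shared secret."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(secret: bytes, message: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(secret, message)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The server recomputes the signature over the body it received and rejects the request if `verify` returns False.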
more than a petabyte of stored data is accessible to anyone online with the knowledge of where and how to look for it.
(Readable sites and login-credentials) picts or it didn't happen.
On an on-topic item: I, too, worked for a company where the SOP was to run a NAS with over 12PB of storage and the default credentials were used "for support reasons." For the rounding-off-error area of 40TB I controlled I was finally able to extract a concession and change a single character of the password: an "o" to a "0".
At least it wasn't accessible on the internet. And that change kept anyone internally from logging into my section on the first try.
If the universe is someone's simulation -- does that mean the stars are just stuck pixels?
Of course, seeing as the USA is sick, they can now afford to go to their doctor thanks to the ACA.
"So long and thanks for all the fish."
Insert "MongoDB only pawn in game of life" reference here.
https://xkcd.com/1553/
I disagree. The problem is people that vastly over-estimate their own skills and insights and then proceed to mess it up. Authentication is never a trivial thing. Faking that triviality only makes things worse.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Yes and no. Just because it's hard to do well doesn't mean you can't make it easier to implement. The easier you make good programming, the more likely people are to do it. The entire point of API documentation, for example, is to make it easier to do good programming.
On web app security, right now there's a hodge-podge of solutions and no clear industry leader for a secure and efficient answer.
Okay, that was over the top, I admit. How about "immature security"?
Table-ized A.I.