NSA Makes Contribution To Apache Hadoop Project
An anonymous reader writes "The National Security Agency has submitted a new database, Accumulo, to the Apache Foundation for incubation. Accumulo is based on the original BigTable paper with some extensions such as the ability to provide cell-level security. It appears there are some hurdles that must be cleared concerning copyright before the project could be accepted."
It's a trap! It HAS to be. /tinfoil
I'm a heavy user of open source and GPL software, but I admit to not knowing the nuances of open source licenses. My question is this: why are corporations donating their code to Apache instead of just releasing them through the GPL and Sourceforge? Oracle recently did this as well with OpenOffice, and I seem to vaguely recall a few others.
The government is explicitly exempt from the ability to claim copyright. There is no problem.
It's a trap! It HAS to be. /tinfoil
No, no, it's not a trap, not in the slightest. Just insert your penis into this device... I assure you, it's not a meat-grinder, really, it's not! And I didn't have my fingers crossed when I said that, not even a little bit.
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
You're either trolling terribly or just terribly ignorant. In the hopes of the latter:
The Apache Foundation maintains many open-source software projects, one of which is a popular web server. Another is Hadoop, which is a distributed file system for storing huge amounts of data on a cluster of individual computers, based on Google's Google File System and other similar technologies. To facilitate access to that data, there are other projects that function as databases, with the actual information stored in Hadoop. One existing project is HBase, which is an implementation of a system (called BigTable) described by Google. Now, the NSA has donated the source code for their own such database (also based on BigTable) to the Apache Foundation.
Now, there are a lot of Apache Foundation projects, and never enough time or people to maintain them all completely. The best projects are considered "mature", and the ones that aren't up to the normal Apache levels of quality and support are considered to be in "incubation". Someday, if enough people like Accumulo and help with it, it will mature.
You do not have a moral or legal right to do absolutely anything you want.
How do you "submit" a database?
It turns out that if you read sentences all the way to the end, they become a lot more clear.
Wikipedia is your friend.
The National Security Agency is a cryptologic intelligence agency of the United States Department of Defense responsible for the collection and analysis of foreign communications and foreign signals intelligence, as well as protecting U.S. government communications and information systems,[1] which involves cryptanalysis and cryptography.
Database Engine* if you want to be pedantic about it.
The Apache Foundation is not a "website server outfit". It's an open source community with their own licenses that contributes to a variety of projects. But yes, much of their work is related to developing servers, engines and protocols for the web.
I am tired of typing, so see BigTable and google Cell Level Security. I have not really heard of the latter, but I think it's basically the policy used to grant or deny cell level access to various users in each row of a table.
Honestly, if you have not even heard of many of the names used in the summary, this article probably holds no interest to you.
You're either trolling terribly or just terribly ignorant. In the hopes of the latter:
I believe you're right on both counts. He's trying to troll but he's also terribly ignorant.
Seriously, fuck you. Not everyone is up on the latest fads and buzzwords in the software "industry", OK? Nothing at all made the barest sense to me in that headline.
"Another is Hadoop, which is a distributed file system for storing huge amounts of data on a cluster of individual computers, based on Google's Google File System and other similar technologies.."
There, I knew you could do it. Was that too much to hope for?
"To facilitate access to that data, there are other projects that function as databases, with the actual information stored in Hadoop."
What's Hadoop? Do you at least understand that software's massive hierarchy of vaguely interrelated concepts is not exactly crystal clear to an outsider?
But this article really doesn't apply to someone outside the software or IT industry. For instance, if you or I read an article here about life sciences, semiconductors or black holes, we wouldn't go about flaming about why the summary didn't explain basic concepts, would we?
NSA has been trying for decades to get vendors to get serious about security, without much success. One of NSA's units is the Central Security Service, the defensive side, which develops and tests security technologies for Government and military use. They have people testing safes and locks, for example.
Back in the 1980s, NSA tried applying that approach to computing, with the Trusted Computer System Evaluation Criteria. Systems were classified from A1 down to D. A very few specialized systems made it to an A level, but most commercial systems couldn't come close.
Manufacturers hated the testing procedure. Software vendors are used to controlling their own Q/A process. The NSA approach came from the test procedures for safes and padlocks - vendors could submit something, and it was tested by NSA personnel against NSA criteria. If it failed, the manufacturer got a list of defects, which was not necessarily complete. The manufacturer could resubmit the product, and NSA would retest it, on a strictly pass/fail basis. No third try was allowed, and failure was publicly announced by NSA.
After a decade of screaming and foot-dragging by vendors, the "common criteria" security scheme replaced the TCSEC in 2002-2005. This is much more "vendor friendly". The most strict levels of the TCSEC criteria were removed. Security evaluation is mostly done by outside labs, not NSA, and the vendor pays for and controls the process. The vendor can keep trying to pass as many times as they want. Failure is not publicized.
A reasonable number of systems meet some levels of the common criteria, but nothing below EAL5 really means much. Windows XP made it to EAL4.
NSA has tried, with NSA Secure Linux, to get people to take mandatory security seriously. NSA Secure Linux has "mandatory security", where there are levels and compartments which create boundaries data is not allowed to cross. Think of everything being in its own sandbox, with limited and tightly controlled intercommunication between sandboxes.
The point of that is not that NSA Secure Linux is a highly secure implementation of mandatory security. It was to get people to implement, modify, and partition applications so that they could work under a mandatory security model. A web browser, for example, would have to be structured so that the parts which could open local files were completely separated from the parts that communicated with the untrusted outside world. This didn't catch on in the browser world, although finally, a decade or so too late, browsers are starting to run Flash in sandboxes.
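To make the "levels and compartments" idea concrete, here is a minimal sketch of a Bell-LaPadula-style check in Python. It is an illustration only, not how SELinux or any NSA system actually implements its policy; the level names, the Label class, and the may_read/may_write helpers are all made up for this example.

```python
from dataclasses import dataclass

# Hypothetical level names, ordered from least to most sensitive.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}


@dataclass(frozen=True)
class Label:
    """A sensitivity level plus a set of compartments (both hypothetical)."""
    level: str
    compartments: frozenset = frozenset()

    def dominates(self, other: "Label") -> bool:
        """True if this label is at least as high as `other` and
        covers every compartment that `other` carries."""
        return (LEVELS[self.level] >= LEVELS[other.level]
                and self.compartments >= other.compartments)


def may_read(subject: Label, obj: Label) -> bool:
    # "No read up": a subject may read only what its label dominates.
    return subject.dominates(obj)


def may_write(subject: Label, obj: Label) -> bool:
    # "No write down": data may only flow to an equal or higher label.
    return obj.dominates(subject)
```

With a network-facing component labelled Label("unclassified") and local files labelled Label("secret", frozenset({"payroll"})), may_read on the network side is False and may_write from the file side to the network side is also False, which is the boundary a partitioned browser would have to live with.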
NSA keeps trying. This new database is one for which fine-grained access control is possible. The challenge is to write apps that can live with such tight controls. They're trying to get people to get serious about security.
(It's been a long time, but I used to work on this stuff.)
Maybe my reply holds no interest for you? Oh yeah, fuck you too.
I know NSA doesn't have the best 'street-cred' but remember that they are the folks that brought up SELinux. When they are working for security they generally know what they are talking about. Has anyone had any experience installing software on an NSA machine? If you have then you know the hurdles and testing that takes place to get something usable. They LOVE security and really just want you to love it as much as they do.
OMG facts!
...the least our overlords can do is pitch in on building the databases our overlords are going to store all that crap they recorded about us in.
Just a nit-pick, but the main value of Hadoop is to run distributed map-reduce applications across individual computers. The Hadoop file system is often used along with it, but other distributed file systems can be used in its place.
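For anyone who hasn't seen the map-reduce pattern, here is a tiny single-process sketch in Python. It does not use Hadoop or its API at all; the function names are made up, and the "shuffle" step that a real framework would do across machines is just a dictionary here.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of documents, in-process."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```

Each map call sees only its own document and each reduce call sees only one key, which is why the two phases can be spread over many individual machines, whatever file system sits underneath.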
... the best security programmed in software can and will be breached by other means. This emphasis on security IMHO is misplaced, if you want something secure you don't hook it up to the outside world.
I think you should be appointed the editor of something like simple.slashdot.org (similar to simple.wikipedia.org). Great summary!
..."Another is Hadoop, which is a distributed file system for storing huge amounts of data on a cluster of individual computers, based on Google's Google File System and other similar technologies.."
There, I knew you could do it. Was that too much to hope for?
"To facilitate access to that data, there are other projects that function as databases, with the actual information stored in Hadoop."
What's Hadoop? Do you at least understand that software's massive hierarchy of vaguely interrelated concepts is not exactly crystal clear to an outsider?
So, you quote the information, then you ask for that information...
You're fucking retarded.
My question is this: why are corporations donating their code to Apache instead of just releasing them through the GPL and Sourceforge?
Corporations that want to continue to use the code are more likely to donate to Apache and use the Apache License. Fewer strings attached, much lower likelihood of unpleasant surprises in the next version of the license, etc. Basically open source without the politics and drama.
Corporations that are essentially abandoning/discontinuing the software are more likely to just put it up on SourceForge and be done with it.
I think you might enjoy the book "Shadow Factory" by James Bamford,
or maybe you might like the PBS Frontline special about his book, available online at pbs.org (the video is called Spy Factory for some reason).
You are describing software testing in the 1990s. Thomas Drake was heavily involved in software testing, and worked for NSA contractors until 2001, when he was hired at NSA itself.
After 9/11, he got disturbed with some of their wasteful practices . . . I am wondering if 'vendor friendly' software testing was one of the practices he might have had a problem with.
The DoD IG report on Trailblazer is still mostly redacted... the public is left in the dark about these things.
The article mentions a technicality in Apache's current contributor license agreement that appears to bar Apache from accepting public domain work because there is no copyright owner to grant an explicit copyright license.
Well, stop being a retard and you won't get flamed.
We're going to see more of this sort of thing. Almost everyone assumes that all software is copyrighted, or that only the copyright holder can release software as free/libre/open source software (FLOSS). Neither is true! This matters when the US government gets involved, because its "normal" rules are really different from most organizations'.
For example, if a government employee develops software as part of his official duties, then in practically all cases that software is NOT subject to copyright in the US (per US law 17 USC 105). It's not just that the author doesn't have copyright; there IS no copyright in the US. Also, when a contractor writes software, the government often receives all the release rights as if it was the copyright holder yet it is not the copyright holder (these are called "unlimited rights"). In this case, the government can release the software as FLOSS, on its own initiative, even though it is NOT the copyright holder. For more details, see: Publicly Releasing Open Source Software Developed for the U.S. Government.
The US government spends billions of dollars each year developing software. It's my hope that, over time, it will release more of the software it develops to the people who paid for it.
- David A. Wheeler (see my Secure Programming HOWTO)
It seems that the extra cell-level security is more of a capability, in that you can categorize (or add a label to) a cell and when you query you specify the access level you have...and the result is included or not depending.
I wonder how it deals with "lost security levels?" If you don't know the security level of a cell, you can't ask for it. If everyone forgets, then the data just sits around, waiting to be pruned. How can you tell the difference between a resource leak and unarchived classified documents that you can't get to?
I suppose that's one of those odd problems that only happens in government. "Why is the database only returning 10 results to me when the database itself is over 16PB?" More amusingly, if the total amount of data used by an NSA system is classified, who has enough information to order more storage?
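The labelling scheme guessed at above, and the "lost level" problem, can be sketched in a few lines of Python. This is a toy model, not Accumulo's API: the Cell and LabeledTable classes, the single label per cell, and the labels_in_use view are all assumptions made for illustration, and the source doesn't say whether Accumulo offers anything like that last method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    label: str  # hypothetical: exactly one visibility label per cell


class LabeledTable:
    """Toy table that filters scan results by the caller's labels."""

    def __init__(self):
        self._cells = []

    def put(self, row, column, value, label):
        self._cells.append(Cell(row, column, value, label))

    def scan(self, authorizations):
        """Return only cells whose label is among the caller's authorizations."""
        allowed = set(authorizations)
        return [c for c in self._cells if c.label in allowed]

    def labels_in_use(self):
        """Count cells per label. Assumed to be an administrative view
        that reveals labels and counts but never cell values."""
        counts = {}
        for c in self._cells:
            counts[c.label] = counts.get(c.label, 0) + 1
        return counts
```

A cell stored under a label that nobody holds any more never comes back from scan, so it is indistinguishable from missing data to ordinary users. Something like labels_in_use would at least let an operator tell orphaned labels apart from a plain resource leak without exposing the contents.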
If you're an outsider, why do you care about an article that essentially only matters to insiders? And while we're explaining the intricacies of the software industry, I will take the opportunity to introduce you to this wonderful invention. It's called a search engine. When you don't know what something means, you can search for it yourself, therefore avoiding looking both ignorant and lazy.
By the by, this is /. Notice the subtitle: "News for nerds". I think you may be lost. You may feel more comfortable here: http://digg.com/ or perhaps here: http://myspace.com/
"We live as though the world were as it should be, to show it what it can be." - Joss Whedon via Angel
Unfortunately the NSA exports something that the USA has in abundance - fear and distrust. It hasn't always been this way, and I can remember when people tended to trust each other much more. In the 1950s in my native New Zealand people didn't even feel the need to lock their doors at night. But since the USA has started bombing countries at random, and letting its own citizens arm themselves with whatever calibre weapon they feel like, the world has changed. And it wasn't terrorists that changed the world - it was the USA's fear and distrust that has changed the world for the worse. Distrust is contagious. If you distrust me then I am going to distrust you and probably everyone else. Seed people's minds with distrust and this is what you get from them. In any case, the NSA's paranoid attitude to systems and security is merely a cover for their nefarious reasons for "rating" secure systems - they want to know the worst of what is out there that they may have to deal with. Only a fool would allow an organization like the NSA to "rate" their safe, their locks, or their computers. The knowledge that the NSA gains in such exercises is their raison d'etre, and foolishly submitting any secure item to the NSA for their rating only conveniently lets the NSA know that one has such an item when otherwise they might be totally unaware of the item's existence.
Maybe I should set up a gold rating agency. You show me your secret stash of gold and I will tell you whether it is secure or not.
That's an interesting problem - most open source licenses depend on copyright for enforcement. If there is no copyright, those licenses can't be used. Is there a way to incorporate?
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
It appears there are some hurdles that must be cleared concerning copyright before the project could be accepted.
Wait... what? I'm not trying to be a grammar nazi, I'm genuinely confused. ARE some hurdles that must be CLEARED (future tense)... COULD be accepted (past tense)? Which is it-- do the hurdles still need to be cleared before the project can be accepted, or have the hurdles been cleared and the project accepted?