NSA Makes Contribution To Apache Hadoop Project
An anonymous reader writes "The National Security Agency has submitted a new database, Accumulo, to the Apache Foundation for incubation. Accumulo is based on the original BigTable paper with some extensions such as the ability to provide cell-level security. It appears there are some hurdles that must be cleared concerning copyright before the project could be accepted."
It's a trap! It HAS to be. /tinfoil
No, no, it's not a trap, not in the slightest. Just insert your penis into this device... I assure you, it's not a meat-grinder, really, it's not! And I didn't have my fingers crossed when I said that, not even a little bit.
Those who can make you believe absurdities can make you commit atrocities. - Voltaire
Apache Brand
Our interest in releasing this code as an Apache incubator project is due to its strong relationship with other Apache projects, i.e. Accumulo has dependencies on Hadoop, Zookeeper, and Thrift and has complementary goals to HBase.
'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
But other companies and individuals that produce works do get copyright. While they may give the government (and even the NSA) a license to use their works, the government can't just donate those works off to Apache without clearing it first. That means any code the NSA didn't write themselves needs to be removed, replaced, or also donated by the owner.
You do not have a moral or legal right to do absolutely anything you want.
You're either trolling terribly or just terribly ignorant. In the hopes of the latter:
The Apache Foundation maintains many open-source software projects, one of which is a popular web server. Another is Hadoop, which is a distributed file system for storing huge amounts of data on a cluster of individual computers, based on Google's Google File System and other similar technologies. To facilitate access to that data, there are other projects that function as databases, with the actual information stored in Hadoop. One existing project is HBase, which is an implementation of a system (called BigTable) described by Google. Now, the NSA has donated the source code for their own such database (also based on BigTable) to the Apache Foundation.
Now, there are a lot of Apache Foundation projects, and never enough time or people to maintain them all completely. The best projects are considered "mature", and the ones that aren't up to the normal Apache levels of quality and support are considered to be in "incubation". Someday, if enough people like Accumulo and help with it, it will mature.
How do you "submit" a database?
It turns out that if you read sentences all the way to the end, they become a lot more clear.
NSA has been trying for decades to get vendors to get serious about security, without much success. One of NSA's units is the Central Security Service, the defensive side, which develops and tests security technologies for Government and military use. They have people testing safes and locks, for example.
Back in the 1980s, NSA tried applying that approach to computing, with the Trusted Computer System Evaluation Criteria. Systems were classified from A1 down to D. A very few specialized systems made it to an A level, but most commercial systems couldn't come close.
Manufacturers hated the testing procedure. Software vendors are used to controlling their own Q/A process. The NSA approach came from the test procedures for safes and padlocks - vendors could submit something, and it was tested by NSA personnel against NSA criteria. If it failed, the manufacturer got a list of defects, which was not necessarily complete. The manufacturer could resubmit the product, and NSA would retest it, on a strictly pass/fail basis. No third try was allowed, and failure was publicly announced by NSA.
After a decade of screaming and foot-dragging by vendors, the "common criteria" security scheme replaced the TCSEC in 2002-2005. This is much more "vendor friendly". The most strict levels of the TCSEC criteria were removed. Security evaluation is mostly done by outside labs, not NSA, and the vendor pays for and controls the process. The vendor can keep trying to pass as many times as they want. Failure is not publicized.
A reasonable number of systems meet some levels of the common criteria, but nothing below EAL5 really means much. Windows XP made it to EAL4.
NSA has tried, with NSA Secure Linux, to get people to take mandatory security seriously. NSA Secure Linux has "mandatory security", where there are levels and compartments which create boundaries data is not allowed to cross. Think of everything being in its own sandbox, with limited and tightly controlled intercommunication between sandboxes.
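To make "levels and compartments" concrete, here is a minimal Python sketch of the textbook multilevel model (no read up, no write down). It is not how NSA Secure Linux's policy engine is actually implemented; the level names, the `Label` class, and the `can_read`/`can_write` helpers are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical ordering of sensitivity levels, lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}


@dataclass(frozen=True)
class Label:
    """A sensitivity level plus a set of compartments (illustrative only)."""
    level: str
    compartments: frozenset = frozenset()


def dominates(a: Label, b: Label) -> bool:
    """True if a's level is at least b's and a holds every compartment of b."""
    return (LEVELS[a.level] >= LEVELS[b.level]
            and a.compartments >= b.compartments)


def can_read(subject: Label, obj: Label) -> bool:
    # "No read up": the subject's label must dominate the object's.
    return dominates(subject, obj)


def can_write(subject: Label, obj: Label) -> bool:
    # "No write down": the object's label must dominate the subject's,
    # so data can never flow to a lower level or a missing compartment.
    return dominates(obj, subject)
```

Under these assumptions, a "secret" process in the "crypto" compartment can read a "confidential" crypto file but cannot write to it, and it cannot read anything in a compartment it does not hold, whatever the level.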
The point of that is not that NSA Secure Linux is a highly secure implementation of mandatory security. It was to get people to implement, modify, and partition applications so that they could work under a mandatory security model. A web browser, for example, would have to be structured so that the parts which could open local files were completely separated from the parts that communicated with the untrusted outside world. This didn't catch on in the browser world, although finally, a decade or so too late, browsers are starting to run Flash in sandboxes.
NSA keeps trying. This new database is one for which fine-grained access control is possible. The challenge is to write apps that can live with such tight controls. They're trying to get people to get serious about security.
(It's been a long time, but I used to work on this stuff.)
I know NSA doesn't have the best 'street-cred', but remember that they are the folks that brought up SELinux. When they are working for security they generally know what they are talking about. Has anyone had any experience installing software on an NSA machine? If you have, then you know the hurdles and testing that take place to get something usable. They LOVE security and really just want you to love it as much as they do.
OMG facts!
...the least our overlords can do is pitch in on building the databases our overlords are going to store all that crap they recorded about us.
Just a nit-pick, but the main value of Hadoop is to run distributed map-reduce applications across individual computers. The Hadoop file system is often used along with it, but other distributed file systems can be used in its place.
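For anyone who hasn't seen the map-reduce model, a single-process Python sketch of the idea follows. It only illustrates the three phases (map, shuffle, reduce) with a word count; it does not use Hadoop's API, and the function names are invented for the example. On a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different machine; here it is a loop.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

The file system underneath only has to hand each mapper its slice of the input, which is why other distributed file systems can stand in for the Hadoop one.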
... the best security programmed in software can and will be breached by other means. This emphasis on security IMHO is misplaced, if you want something secure you don't hook it up to the outside world.
I think you should be appointed the editor of something like simple.slashdot.org (similar to simple.wikipedia.org). Great summary!
That means any code the NSA didn't write themselves needs to be removed, replaced, or also donated by the owner.
Unless that code was "work for hire". If so the contractor (individual or company) has no rights to it, just like any other employee.
Works of the US government are public domain, and thus can't be released under the GPL. That's the copyright issue mentioned in the summary.
(I know people here don't read the articles, but don't they at least read the summaries?)
I think you might enjoy the book "Shadow Factory" by James Bamford,
or maybe the PBS Frontline special about his book, available online at pbs.org (the video is called Spy Factory for some reason).
You are describing software testing in the 1990s. Thomas Drake was heavily involved in software testing, and worked for NSA contractors until 2001, when he was hired at NSA itself.
After 9/11, he got disturbed with some of their wasteful practices . . . I am wondering if 'vendor friendly' software testing was one of the practices he might have had a problem with.
The DoD IG report on Trailblazer is still mostly redacted... the public is left in the dark about these things.
Protip: a license that restricts what the coder and devs using the code can do is not really "free". It may protect the end users' freedoms, but it inarguably does so at the expense of developer freedoms.
When he says GPL is not truly free, he means it, and I don't think anyone involved with the development of GPLv2 and GPLv3 would argue that.
I'm not sure I'd want to go anywhere near the work of an NSA "contractor..."
Can you be Even More Awesome?!
The article mentions a technicality in Apache's current contributor license agreement that appears to bar Apache from accepting public domain work because there is no copyright owner to grant an explicit copyright license.
We're going to see more of this sort of thing. Almost everyone assumes that all software is copyrighted, or that only the copyright holder can release software as free/libre/open source software (FLOSS). Neither is true! This matters when the US government gets involved, because its "normal" rules are really different from most organizations'.
For example, if a government employee develops software as part of his official duties, then in practically all cases that software is NOT subject to copyright in the US (per US law 17 USC 105). It's not just that the author doesn't have copyright; there IS no copyright in the US. Also, when a contractor writes software, the government often receives all the release rights as if it was the copyright holder yet it is not the copyright holder (these are called "unlimited rights"). In this case, the government can release the software as FLOSS, on its own initiative, even though it is NOT the copyright holder. For more details, see: Publicly Releasing Open Source Software Developed for the U.S. Government.
The US government spends billions of dollars each year developing software. It's my hope that, over time, it will release more of the software it develops to the people who paid for it.
- David A. Wheeler (see my Secure Programming HOWTO)
It seems that the extra cell-level security is more of a capability, in that you can categorize (or add a label to) a cell and when you query you specify the access level you have...and the result is included or not depending.
I wonder how it deals with "lost security levels?" If you don't know the security level of a cell, you can't ask for it. If everyone forgets, then the data just sits around, waiting to be pruned. How can you tell the difference between a resource leak and unarchived classified documents that you can't get to?
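A rough Python sketch of the labelling idea as described above: every cell carries the labels a reader must hold, and a scan silently drops whatever the caller's authorizations don't cover. This is a deliberate simplification and not Accumulo's actual API or label syntax; `Cell`, `required`, and `scan` are hypothetical names.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    row: str
    column: str
    value: str
    required: frozenset  # labels a reader must hold to see this cell


def scan(cells, authorizations):
    """Return only the cells whose required labels are all held by the caller."""
    auths = frozenset(authorizations)
    # A filtered-out cell leaves no trace in the result, so the caller
    # cannot tell "no data" apart from "data I am not cleared for".
    return [c for c in cells if c.required <= auths]


table = [
    Cell("r1", "name", "Alice", frozenset()),
    Cell("r1", "salary", "100", frozenset({"hr"})),
    Cell("r1", "notes", "see file", frozenset({"hr", "legal"})),
]
```

In this sketch, a cell whose label nobody holds any more still occupies storage but never shows up in any scan, which is exactly the "lost security level" situation.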
I suppose that's one of those odd problems that only happens in government. "Why is the database only returning 10 results to me when the database itself is over 16PB?" More amusingly, if the total amount of data used by an NSA system is classified, who has enough information to order more storage?
If you're an outsider, why do you care about an article that essentially only matters to insiders? And while we're explaining the intricacies of the software industry, I will take the opportunity to introduce you to this wonderful invention. It's called a search engine. When you don't know what something means, you can search for it yourself, therefore avoiding looking both ignorant and lazy.
By the by, this is /. Notice the subtitle: "News for nerds". I think you may be lost. You may feel more comfortable here: http://digg.com/ or perhaps here: http://myspace.com/.
"We live as though the world were as it should be, to show it what it can be." - Joss Whedon via Angel
Public domain is perfectly GPL compatible, where did you get the idea that it wasn't?
Analogies don't equal equalities, they are merely somewhat analogous.
That's an interesting problem - most open source licenses depend on copyright for enforcement. If there is no copyright, those licenses can't be used. Is there a way to incorporate?
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
You do realize that for an organization like the NSA to trust anybody, even their own employees, would be exceedingly foolish.
As far as I know, the people of the United States have had the right to bear arms for over two centuries. However if you think this means an American can go out on a whim and buy a heavy machine gun then you are mistaken.
Not sure why all of a sudden people are locking their doors in New Zealand, but I would suspect it has more to do with an uptick in local crime than American foreign policy.
I am very small, utmostly microscopic.
What exactly is that relevant to? It's like saying that you can't release BSD licensed code under the GPL. Technically correct, but not relevant to the topic at hand. It's not an issue, which was the actual point of his post, so...
It appears there are some hurdles that must be cleared concerning copyright before the project could be accepted.
Wait... what? I'm not trying to be a grammar nazi, I'm genuinely confused. ARE some hurdles that must be CLEARED (future tense)... COULD be accepted (past tense)? Which is it: do the hurdles still need to be cleared before the project can be accepted, or have the hurdles been cleared and the project accepted?