NSA Utah Data Center Blueprints Reveal It Holds Less Than Thought
cold fjord writes "Break out the tin foil hats, and make them double thick. Forbes reports, 'The NSA will soon cut the ribbon on a facility in Utah ... the center will be up and running by the "end of the fiscal year," ....Brewster Kahle is the engineering genius behind the Internet Archive,... Kahle estimates that a space of that size could hold 10,000 racks of servers .... "So we are talking $1 billion in machines." Kahle estimates each rack would be capable of storing 1.2 petabytes of data. ... all the phone calls made in the U.S. in a year would take up about 272 petabytes, ... If Kahle's estimations and assumptions are correct, the facility could hold up to 12,000 petabytes, or 12 exabytes – ... but is not of the scale previously reported. Previous estimates would allow the data center to easily hold hypothetical 24-hour video and audio recordings of every person in the United States for a full year. The data center's capacity as calculated by Kahle would only allow the NSA to create archives for the 13 million people living in the Los Angeles metro area. Even that reduced number struck Internet infrastructure expert Paul Vixie as high given the space allocated for data in the facility. ... he came up with an estimate of less than 3 exabytes of data capacity for the facility. That would only allow for 24-hour recordings of what every one of Philadelphia's 1.5 million residents was up to for a year. Still, he says that's a lot of data pointing to a 2009 article about Google planning multiple data centers for a single exabyte of info. '" Update: 07/25 16:33 GMT by T : For even more, see this story.
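As a rough sanity check on the summary's arithmetic, here is a minimal Python sketch. It uses only the figures quoted above (10,000 racks, 1.2 PB per rack, $1 billion in machines, 272 PB for a year of U.S. calls). The variable names and the decimal unit convention (1 EB = 1,000 PB) are assumptions for illustration.

```python
# Figures quoted in the summary; names are illustrative only.
RACKS = 10_000                    # Kahle's estimate of racks that fit in the space
PB_PER_RACK = 1.2                 # Kahle's estimate of storage per rack
MACHINE_COST_USD = 1_000_000_000  # "$1 billion in machines"
US_CALLS_PB_PER_YEAR = 272        # quoted size of one year of U.S. phone calls

total_pb = RACKS * PB_PER_RACK              # about 12,000 PB
total_eb = total_pb / 1000                  # assumes decimal units: 1 EB = 1000 PB
cost_per_rack = MACHINE_COST_USD / RACKS    # implied spend per rack
years_of_calls = total_pb / US_CALLS_PB_PER_YEAR

print(f"{total_pb:,.0f} PB = {total_eb:.0f} EB")
print(f"${cost_per_rack:,.0f} per rack")
print(f"about {years_of_calls:.0f} years of U.S. phone audio")
```

On these numbers the 12 EB figure is simply racks times capacity per rack, and it would hold a few decades of call audio at the quoted 272 PB per year.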
Expect more articles like this that downplay the scale of the NSA.
This is a vast amount of storage. Obviously, the puzzle they've bought a data palace of a storage facility to assemble doesn't require indefinite storage for everyone. They're looking to cache everything they can get and then filter what's interesting. Maybe they have a range of target levels from indefinite storage of everything collected for one group, a year for another group, a month for a third group, a week for another, all the way down to a day or hours for the entire slush.
They don't need it all. They just need to run whatever algorithms they care about so they can toss whatever they think doesn't matter and keep what does.
You do know they lie for a living, right?
After looking through the blueprints I couldn't find anywhere designated for a Stargate. Bummer.
On the bright side, that is one more rumor that can be laid to rest.
tsk tsk everyone knows the stargate is under Cheyenne Mountain, it's probably a storage facility for pilfered alien tech
---Saying gnome 3 is better than windows 8 is not so much a compliment as it is damning with light praise.
Never mind that the annual production of hard drives is about 100 million drives. If they were all on the order of 5 TB, 10 EB would represent 2% of the global hard drive market for a year. Annual tape production is actually of a very similar order of magnitude to annual hard drive production, so it is not like tapes gain you much. At least this is more reasonable than the previous estimates in the zettabyte range, which would have to assume they had ten years' worth of hard drive and/or tape production at current storage density.
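The percentages in that comment can be reproduced in a few lines. This is only a sketch of the commenter's own arithmetic; the 100 million drives and 5 TB per drive are the comment's figures, not verified production data, and the 5 ZB value is the earlier estimate mentioned elsewhere in the thread.

```python
DRIVES_PER_YEAR = 100_000_000   # commenter's figure for annual drive production
TB_PER_DRIVE = 5                # commenter's assumed capacity per drive
TB_PER_EB = 1_000_000           # decimal units: 1 EB = 10^6 TB

annual_output_eb = DRIVES_PER_YEAR * TB_PER_DRIVE / TB_PER_EB

def share_of_annual_output(capacity_eb):
    """Fraction of one year's drive output needed to reach capacity_eb."""
    return capacity_eb / annual_output_eb

print(f"annual output: {annual_output_eb:.0f} EB")
print(f"10 EB is {share_of_annual_output(10):.0%} of a year's output")
print(f"5 ZB is {share_of_annual_output(5000):.0f} years of output")
```

Under those assumptions a 10 EB facility is a small slice of one year's drives, while a 5 ZB facility would swallow about a decade of production.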
House Mulls NSA Restrictions in Collecting Metadata http://defensetech.org/2013/07/24/house-mulls-nsa-restrictions-in-collecting-metadata/#more-21000
I mean, sure, you could record a few million people sleeping for eight hours a day, or watching 4 hours of Simpsons reruns a night, but why? If you're recording the 1-2 hours most people spend on the phone a day (max), then 3 exabytes might actually work out okay.
What "rumor" cold?
Why would any spy agency hold "24-hour video and audio recordings" on every person?
You get a file: work, school, crime, links, where seen on the 'net', hops to other people of interest, past clearances, links to anyone with a clearance.
Political insights, weaknesses, funding....
No service would store video and audio recordings long term; they have computer code to reduce that, versus keeping huge per-frame video and endless audio.
Another trick is to turn the 'voice' into text. So the data per person needed for the "size" of people of interest in the USA is usable as reported.
http://en.wikipedia.org/wiki/Main_Core showed what could be done in the early 1980s; the NSA is hoping to keep a bit more than the "essence" this time.
Domestic spying is now "Benign Information Gathering"
A billion dollars they're spending. The NIH, the people who fund research that is going to cure cancer, they had their funding cut about 1.5 billion.
Hey, NSA! I'm thinking highly unpatriotic, violent thoughts right now!
Well, obviously terrorists kill more than cancer.
Be seeing you...
tsk tsk everyone knows the stargate is under Cheyenne Mountain, it's probably a storage facility for pilfered alien tech
You mean like a warehouse? I'd bet that the NSA would have at least 12 of them prior to this facility.
Or just some Area to keep the stuff in, they'd have to have at least 50 of them by now.
Even that reduced number struck Internet infrastructure expert Paul Vixie as high
My uneducated response was "Holy Fuck!". Lucky the experts were there to clarify.
Sure, it would require a ridiculous server farm to store *recordings* of every phone call placed in the US, much less worldwide. Add emails, texts, IMs, etc., and the NSA would send hard drive prices through the roof all by themselves.
But phone *records* are another thing entirely. To store a record of every phone call (timestamps, caller number, recipient number, and maybe GPS) would only take roughly 30 TB a year (@ 500,000,000 calls placed each year). That's only about 2U worth of well-stocked NAS.
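For what it's worth, the two figures in that estimate can be checked against each other. A minimal sketch, using only the 500,000,000 calls and 30 TB quoted above; the 64-byte record size is purely a hypothetical added for comparison and does not come from the comment.

```python
CALLS_PER_YEAR = 500_000_000      # figure quoted in the comment
QUOTED_TB_PER_YEAR = 30           # figure quoted in the comment
BYTES_PER_TB = 10**12             # decimal terabytes (assumption)

# Record size implied by the two quoted figures taken together.
implied_bytes_per_record = QUOTED_TB_PER_YEAR * BYTES_PER_TB / CALLS_PER_YEAR

def yearly_storage_tb(calls_per_year, bytes_per_record):
    """Storage needed for one year of call records, in decimal TB."""
    return calls_per_year * bytes_per_record / BYTES_PER_TB

# Hypothetical compact record: two numbers, two timestamps, a coordinate pair.
ASSUMED_RECORD_BYTES = 64
compact_tb = yearly_storage_tb(CALLS_PER_YEAR, ASSUMED_RECORD_BYTES)

print(f"implied record size: {implied_bytes_per_record:,.0f} bytes")
print(f"at {ASSUMED_RECORD_BYTES} bytes per record: {compact_tb:.3f} TB per year")
```

The quoted pair works out to tens of kilobytes per record, which is generous for the fields listed, so the 30 TB figure has plenty of headroom. Either way the point stands: call records are tiny next to recorded audio.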
The footprint of the facility doesn't concern me as much as the extent of the NSA's authority. I'm all for stopping terrorism, but I'm not a fan of living in an Orwellian society, regardless of how "safe" it makes us. Slippery slope arguments aside, concentrated power will *always* be abused, and dragnet programs can *never* make us 100% safe.
I couldn't find anywhere designated for a Stargate
That's the emergency action map. Everyone knows that in the event of an emergency you don't use the Stargate, take the stairs instead.
I've been to Cheyenne Mountain and seen the Stargate. It's not what you think, they are not doing what you think they might, it would disappoint you. Every transaction takes a huge amount of paperwork. I believe that Snowden will be releasing that data soon.
If you want news from today, you have to come back tomorrow.
tsk tsk everyone knows the stargate is under Cheyenne Mountain, it's probably a storage facility for pilfered alien tech
They had to move the Stargate during the Borg invasion, just before the Death Star showed up.
Which, oddly enough, is located in British Columbia. They picked the site because the surrounding countryside coincidentally resembles every habitable planet in the galaxy.
Time to get everyone to post huge files of garbage. Let them store that.
I thought that's what we've been doing...
Sheesh, evil *and* a jerk. -- Jade
The US is the easy case. Until you find a way to get China, North Korea, Iran, (oppressive regime X), et al. to give them up, and various terrorist groups to stop attacking*, you're going to be stuck with it.
I hope you are not seriously making the argument that the US must do it because regimes like China, North Korea and Iran are doing it. There are a lot of things that China, North Korea and Iran do that the US would do well not to emulate, starting with oppressing their own citizens.
the free democratic nations need intelligence agencies that are capable of helping to protect their societies.
Nobody disagrees with that broad principle. Whether the intelligence agencies need to have the power to indiscriminately harvest untargeted information on everyone to be capable at their job, however, is at issue. If you want to take it to extremes, you could also make the argument that the NSA should be given the powers once held by the Stasi, KGB, and their Chinese equivalents to be truly capable. It is true that this would increase the effectiveness of the NSA, but I don't think anyone really wants to go there.
Unilateral disarmament in the face of aggression tends to have significant negative consequences.
Strawman argument. No one is suggesting that the US, or the NSA, "unilaterally disarm" against China, North Korea, Iran et al. The whole reason why PRISM blew up was because the NSA was collecting data not on China, North Korea or Iran, but on their own citizens and innocent third parties. That is only insofar as PRISM is concerned; we have no idea what other information may be collected by other programs because the NSA won't tell us.
That would be the equivalent of using your arms on your own family and innocent outsiders in the face of aggression.
After looking through the blueprints I couldn't find anywhere designated for a Stargate. Bummer.
On the bright side, that is one more rumor that can be laid to rest.
Of course. They're building this as the studio for faking the Mars landings. They're not going to blow it by going low-budget this time around.
"NSA Utah" is an anagram for "anus hat".
The claim that a year's worth of phone calls is around 272 petabytes is dead on; it matches up perfectly with some back-of-the-napkin calculations I did a while back based on a published report from the FCC[1]. Depending on the encoding bitrate, the range I had was 107 PB for 8 Kbps audio to 430 PB for 32 Kbps audio. 272 PB is about 20 Kbps, exactly in the middle...
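Since storage scales linearly with bitrate, the middle figure can be recovered from the 8 Kbps anchor alone. A minimal sketch, assuming decimal units (1 PB = 10^15 bytes, 1 Kbps = 1,000 bits per second); the function and variable names are made up for illustration, and the total talk time is back-derived from the 107 PB figure rather than taken from the FCC report.

```python
BYTES_PER_PB = 10**15   # decimal petabytes (assumption)

def storage_pb(total_seconds, kbps):
    """Petabytes needed to store total_seconds of audio at kbps kilobits/s."""
    return total_seconds * kbps * 1000 / 8 / BYTES_PER_PB

# Anchor point from the comment: 107 PB at 8 Kbps.
ANCHOR_PB, ANCHOR_KBPS = 107, 8
total_seconds = ANCHOR_PB * BYTES_PER_PB * 8 / (ANCHOR_KBPS * 1000)

# Bitrate at which the same talk time fills 272 PB.
implied_kbps = 272 / ANCHOR_PB * ANCHOR_KBPS

print(f"implied talk time: {total_seconds / 60:.3g} minutes per year")
print(f"32 Kbps needs about {storage_pb(total_seconds, 32):.0f} PB")
print(f"272 PB corresponds to about {implied_kbps:.1f} Kbps")
```

The 32 Kbps result lands within rounding of the 430 PB quoted above, and 272 PB comes out at roughly 20 Kbps as stated.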
http://slashdot.org/comments.pl?sid=3871487&cid=44027425
[1]: http://transition.fcc.gov/Bureaus/Common_Carrier/Reports/FCC-State_Link/IAD/trend605.pdf
The report only documents up to year 2000, but I presumed POTS service had leveled out with the emergence of VOIP and SMS messaging.
He just wanted the 'big disappointment' soundbite onto his Slashdot obfuscation post.
A data center that size is about 10000 times larger than needed to hold the phone record metadata disclosed. Far larger even than all instant messages, and email content text for everyone.
Scary they can build that and nobody in Congress knows yet. They all think it's for the *disclosed* metadata, but it can't possibly be, it's far too big.
Obviously the previous reports were wrong. Anybody familiar with computers and storage space knew that the numbers reported by NPR and other "news" outlets were ridiculous. They were saying that the center would hold 5 zettabytes, and would only cost $1.2 billion! That's about 25 cents per TB.
Best I could tell, NPR et al. misunderstood a Wired article from over a year ago. In the Wired article, somebody said that they would eventually like the processing power in the center to exceed 1 exaflops, and then maybe someday after that 1 zettaflops.
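The cost-per-terabyte figure is a one-line division. A quick sketch using only the two reported numbers from the comment and decimal units (1 ZB = 10^9 TB), which is an assumption:

```python
TB_PER_ZB = 1_000_000_000          # decimal units: 1 ZB = 10^9 TB

reported_capacity_zb = 5           # capacity attributed to earlier reports
reported_cost_usd = 1_200_000_000  # cost quoted in the comment

usd_per_tb = reported_cost_usd / (reported_capacity_zb * TB_PER_ZB)
print(f"${usd_per_tb:.2f} per TB")
```

That is where the "about 25 cents" comes from, and it covers the whole facility, not just the disks.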
On what is kept. If it really is just the metadata and not the conversation, then the storage requirements are not all that large.
For Landlines, there is a unique identifier applied at the switch. I mis-remember what it's called, but in South Texas, it usually started with BAPA- blah blah blah for several digits.
For cell phones, there is the IMEI/UDID/ESN. Normally around 14 to 20 digits, usually 15.
Next, called number, same info.
Last, call duration.
I believe it's long been known that using particular words in a telephone conversation would raise a flag. I don't know if that's true or not. If so, let's consider this scenario:
Call metadata captured and stored - always.
Call voice session saved to a temporary storage area.
Call concludes.
Voice data is analyzed for key words using automation. (Think about when you call your credit card company, and can input your CC number by voice)
If no keyword flags are raised, delete the conversation after X time (or immediately, who knows?)
If keyword flag is raised, score by number of keywords, flag conversation for human review, preserve all data.
After human review, who knows?
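The scenario above can be sketched in a few lines of Python. Everything here is hypothetical: the keyword list, the `Call` and `Store` types, and the use of a ready-made transcript as a stand-in for automated speech recognition. It only illustrates the keep-metadata, score-content, discard-or-escalate flow described in the steps, not any real system.

```python
from dataclasses import dataclass, field

KEYWORDS = {"alpha", "bravo"}   # placeholder watch list, purely illustrative

@dataclass
class Call:
    caller: str
    callee: str
    duration_s: int
    transcript: str             # stand-in for speech-to-text output

@dataclass
class Store:
    metadata: list = field(default_factory=list)      # kept for every call
    review_queue: list = field(default_factory=list)  # kept only when flagged

def keyword_score(transcript, keywords=KEYWORDS):
    """Count how many words in the transcript are on the watch list."""
    words = (w.strip(".,!?") for w in transcript.lower().split())
    return sum(1 for w in words if w in keywords)

def process_call(call, store):
    """Always record metadata; keep content only if a keyword is hit."""
    store.metadata.append((call.caller, call.callee, call.duration_s))
    score = keyword_score(call.transcript)
    if score > 0:
        store.review_queue.append((score, call))  # preserved for human review
        return True
    return False                                  # content is simply dropped

store = Store()
process_call(Call("555-0100", "555-0101", 60, "hello there"), store)
process_call(Call("555-0102", "555-0103", 30, "Alpha, bravo alpha."), store)
print(len(store.metadata), "metadata rows,", len(store.review_queue), "flagged")
```

The design point is that the expensive part (stored audio) is proportional to the flagged fraction, while the cheap part (metadata) is kept for everything.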
What I think: If preserving our freedom comes at the price of invading all of our privacy, then the terrorists have been gifted with a victory they could have never won for themselves. We have destroyed our freedom with the illusion of security, and now have neither freedom nor security. To draw a parallel, how is having the TSA able to squeeze my balls protecting me? "Dude - don't touch my junk!"
Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.
Scary they can build that and nobody in Congress knows yet.
Right, "nobody" in Congress knows that NSA is building that big data center. Not even the Congress members that have it in their districts.
They all think it's for the *disclosed* metadata, but it can't possibly be, it's far too big.
The NSA has responsibility for signal intelligence world-wide. You may recall from the news that the program involving phone records tied to direct communications with terrorists is a minor program involving only $20,000,000. Don't let your brainstorm carry you away to crank conspiracy theories.
He just wanted the 'big disappointment' soundbite onto his Slashdot obfuscation post.
I'm sure that made sense if you're drunk blogging.
much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
"Or just some Area to keep the stuff in, they'd have to have at least 50 of them by now..."
Let's not forget the 72 "Fusion Centers" located throughout the country.
http://en.wikipedia.org/wiki/Fusion_center
From that article:
"MIAC report
Missouri Information Analysis Center (MIAC) made news in 2009 for targeting supporters of third party candidates, Ron Paul supporters, pro-life activists, and conspiracy theorists (Hi, Mom!) as potential militia members.[14] Anti-war activists and Islamic lobby groups were targeted in Texas, drawing criticism from the ACLU.[15]
According to the Department of Homeland Security:[16]
[T]he Privacy Office has identified a number of risks to privacy presented by the fusion center program:
Justification for fusion centers
Ambiguous Lines of Authority, Rules, and Oversight
Participation of the Military and the Private Sector
Data Mining
Excessive Secrecy
Inaccurate or Incomplete Information
Mission Creep"
Ironically, this is a report from the Dept. of Homeland Security about the risks of such centers. And yet, nobody has even mentioned how many overseas facilities we're paying for on top of all the domestic ones.
Did the author of the summary read the article? The article for some reason mentions individualized video feeds for every American which is unrealistic and nothing like the sort of thing anyone has said the NSA is recording. 12,000 PB is far, far larger than the 272 PB estimated to hold all US domestic phone calls for a year, plus the foreign and international calls (which people forgot the NSA captures).
I recommend people read the archive.org description of the problem of archiving phone calls (TL;DR 272 PB) and DJB's article on cryptanalysis (PDF) (TL;DR NSA isn't stupid).
Jumping Jedis!! Why didn't you tell me before? If the Death Star is coming and we have to evacuate the planet, I'm taking Agent Scully and 7 of 9 with me on the next outbound starship. We'll rendezvous at the nearest Battlestar. Forget Mulder, he can hitch a ride with the Vorlons or the Vulcans.
From The Gentleperson's Guide to Forum Spies:
"4. Use a straw man. Find or create a seeming element of your opponent's argument which you can easily knock down to make yourself look good and the opponent to look bad. Either make up an issue you may safely imply exists based on your interpretation of the opponent/opponent arguments/situation, or select the weakest aspect of the weakest charges. Amplify their significance and destroy them in a way which appears to debunk all the charges, real and fabricated alike, while actually avoiding discussion of the real issues."
http://cryptome.org/2012/07/gent-forum-spies.htm
From The Gentleperson's Guide to Forum Spies:
13. Alice in Wonderland Logic. Avoid discussion of the issues by reasoning backwards or with an apparent deductive logic which forbears any actual material fact.
http://cryptome.org/2012/07/gent-forum-spies.htm
Couldn't all phone calls be converted to text and as a result require much, much less storage space?
The estimates would be reasonable for a private datacenter of that size. This is the federal government. The NSA is evil, but it's an evil GOVERNMENT AGENCY.
On average, it takes 60 months, five years, from the time the govt orders a computer until it's installed. So this will be enterprise storage from 2008. Enterprise, not consumer. Figure SCSI drives of about 200 GB, not 3TB SATA.
Of course it's government efficiency in all aspects, so figure 10% of the floor space is used for server racks, etc.
The NSA is absolutely violating the fourth amendment and what they are all doing is inexcusable. How well are they doing it? Not well enough to notice when someone is taking their databases home, uploading them to several sites, and emailing all their confidential documents to journalists.
I read through many of the posts; the exchange between "cold fjord" and Aca something was cute with its little drama about paid writers (maybe they're both paid writers for the NSA or another government agency). Yet in all these posts, not one poster talked about the root of this article: why would the NSA *need* all this space if it is not supposed to be collecting information without specific warrants, or in bulk against innocent citizens?
That is the story. It is like y'all have just rolled over and accepted that it is okay for the NSA to even do this, so let's argue about size. My own view is that the NSA does *not* need these data centers for they should not be collecting that much information about everyone in the USA and beyond. I listened to a politician this morning (one who voted to continue funding the NSA's current trawling expedition) tell me that their actions "saved" hundreds of American lives, but if I asked for proof he'd say "I cannot disclose that information". I see, so you can't provide facts on what the program has done to save lives, you can't talk about what the program does though we know it gathers information on people who are not related to any illegal activity, and you ask us to "Trust You"? This is a republican who cries out for spending cuts, but votes to continue funding secret projects.
Please...
The spotlight on the NSA is not what it is building, it is on what it is doing, allegedly breaking the law. We should be asking more questions about that, digging into that, pushing Congress to act on that; not on blueprints. That they want to listen in or gather information on bad guys, fine, but when they expand that same action to include everyone then I have a problem.
Life is a great ride, the vehicle doesn't matter
Fill a server rack with Dell 3260 storage units, maxed out at 240TB per server. There is room in each rack for 10 such servers, so that's 2.4 petabytes per cabinet which is twice what the article says.
The blueprints are at best a measure of those portions of the facility where they will allow low level clearance contractors, like vetted electricians.
Even the MCI headquarters in Ashburn has an off-blueprints sub-basement for intel use, so we should hardly expect less of a facility directly owned by a TLA.
Well, that's either an argument for it not being needed, or an argument that whatever we're doing right now (ECHELON/PRISM) is working pretty well. Don't forget that MI5/6, CIA, etc. don't give a press release every time they uncover a plot, or foil one, or recruit an agent. If such a thing were possible, we might take a different view of what they do.
It's aluminum foil. Al != Sn
They couldn't do their absolute worst case scenario if they wanted to! ...for about another 5 years until storage drives jump an order of magnitude.
it's actually like a TARDIS
CNET: Feds Put Heat On Web Firms For Master Encryption Keys
I'm sure Dice holdings folded so hard that they not only gave up their logins, but also a neverending allocation of mod-points. /fnord!
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
The 1.2PB per 42U rack figure seems to assume 3TB drives in Backblaze-style pods: 10 pods, 45 drives per pod, 40 drives' worth of storage per pod in RAID 5 or similar, giving 400 drives of storage, or 1.2PB with 3TB drives; with 4TB drives it is 1.6PB. Taller racks (telecom-style 72RU) nearly double that density and suck to work on, making them the perfect choice for government work. But the number could easily be in the 16-30EB range.
No sir, I don't like it.
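The pod arithmetic is easy to parameterize. A small sketch under the comment's own assumptions (10 pods per rack, 40 usable drives per pod); the 10,000-rack count comes from the story summary, the doubling factor for taller racks is a rough stand-in for "nearly double", and the function names are made up.

```python
def rack_capacity_pb(pods=10, usable_drives_per_pod=40, tb_per_drive=3):
    """Usable petabytes in one rack of storage pods (decimal units)."""
    return pods * usable_drives_per_pod * tb_per_drive / 1000

def facility_capacity_eb(racks, pb_per_rack):
    """Exabytes across the whole floor for a given per-rack capacity."""
    return racks * pb_per_rack / 1000

RACKS = 10_000   # rack count from the story summary

for tb in (3, 4):
    per_rack = rack_capacity_pb(tb_per_drive=tb)
    total = facility_capacity_eb(RACKS, per_rack)
    print(f"{tb}TB drives: {per_rack:.1f} PB per rack, {total:.0f} EB total")

# Taller racks: assume roughly twice the pods per rack (approximation).
tall = facility_capacity_eb(RACKS, rack_capacity_pb(pods=20, tb_per_drive=4))
print(f"tall racks with 4TB drives: about {tall:.0f} EB")
```

With 4TB drives the low end of the quoted 16-30EB range falls out directly, and doubling the pods per rack gets to the high end.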
Disinformation no matter how you look at it.
Really? That long? I must be dreaming, then, working here as a federal contractor in the health sector, where the biggest thing I ordered, a honkin' huge RAID box, got here in 4 mos, and most servers are here in half that time. And as for drives, I think the 20 3TB WD Red drives I ordered were here in 2 weeks from the time I put in the order.....
mark "not under the DoD like the NSA"
According to the Wired article, the data center is for storing encrypted communications between foreign governments.
The plan is that those can eventually be decrypted within a year and while out of date, the conversations should still provide some insight into how the decisions are being made.
They must be grabbing bulk content data. Attachments, Google's cloud printed documents, email content, and a shed load of photos and spreadsheets etc.
Actually, they're likely storing a lot less -- they're creating an associative web. This means that if they're doing 2 degrees of separation, they need to create 2 degrees of links between every bit of data they're archiving. All that meta-metadata adds up, and probably uses up more of the storage space than the actual data itself. Of course, they probably also have round robin pools of data and flags that capture it for analysis/long-term storage based on patterns found in the relationships. This way, while they're SEEING all the data, they don't need to store everything. Who knows? maybe they're using Google's transcription tech to compress audio too -- doesn't have to be accurate, as it's just context for the relational data, and they can use it to trigger audio capture in the cases where it might actually be useful.
Again, not to downplay what they're doing -- the data they're collecting, combined with the way they're likely storing it, will enable them to know more about what motivates individuals/how they relate to others than the individuals likely know about themselves. No need to capture the realtime data to do that.
I work in the monitoring space. But looking at newspapers rather than people.
We get fairly well structured content. And searching it is hard and that is with TEXT content. You want to do that but add speech-to-text in the mix?
The quality of such a database would be crap. It may be useful AFTER the fact when you have a starting point (HE blew up the , who has he spoken to? Oh, disposable phones. Not so helpful) but not to start analyzing The Masses. I don't see it.
There's nothing to see until you hit a certain mass of interrelational data, at which point your graph matches start showing interesting correlations. As soon as you start depending on the content to define the structure, you've lost. You want to depend only on the metadata to define the content. Who cares about whether the phone was disposable? What you really want is all calls made to/from that phone, and where that connects to. Then you see that in certain situations, the same person is communicating with an awful lot of disposable phones, and that communication suddenly stopped yesterday. Flag goes up, and content of communication is searched.
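To make that concrete, here is a toy version of the pattern described: look only at who called whom and when, find people talking to many disposable phones, and flag the ones whose contact suddenly stopped. The data, the thresholds, and the idea that numbers arrive pre-tagged as disposable are all assumptions for the sake of the sketch.

```python
from collections import defaultdict

def burner_heavy_callers(calls, burners, today, min_burners=3, quiet_days=1):
    """Return people who reached many disposable phones and then went quiet.

    calls   : iterable of (caller, callee, day) tuples, day as an integer
    burners : set of numbers already tagged as disposable (an assumption;
              how that tagging would be done is not covered here)
    """
    contacts = defaultdict(set)   # person -> distinct burner numbers contacted
    last_seen = {}                # person -> most recent day of burner contact
    for caller, callee, day in calls:
        # Treat the call as a link in both directions.
        for person, other in ((caller, callee), (callee, caller)):
            if other in burners and person not in burners:
                contacts[person].add(other)
                last_seen[person] = max(day, last_seen.get(person, day))
    return sorted(
        person
        for person, seen in contacts.items()
        if len(seen) >= min_burners and today - last_seen[person] >= quiet_days
    )

# Made-up call records: no content, only metadata.
calls = [
    ("alice", "B1", 1), ("alice", "B2", 2), ("B3", "alice", 3),
    ("bob", "B1", 4), ("carol", "dave", 5),
]
flagged = burner_heavy_callers(calls, burners={"B1", "B2", "B3"}, today=5)
print(flagged)
```

Only after a name comes out of a query like this would anyone need to look at content, which is the point about metadata defining where to search.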
The problem is, that while this does limit the search footprint (and storage footprint), it marginalizes FPs, it doesn't eliminate them. And now that FP is being deep-delved instead of quickly passed off as unimportant as would previously happen.
"24-hour video and audio recordings of [X] persons in [Y] for [Z] years" is a stupid metric. Take a close look at NCIS "Flesh and Blood" (S07E12, the 150th episode) for what a more sensible system would store and process. Listen closely to the report of a "hit" on a public internet terminal at a hotel. It's basically metadata + tags. That doesn't take a lot of storage space.
That is a movie that must be made. Shut up and take my money!
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
3 Exabytes would be plenty, if they had free access to all servers.
Oh you said Vorlons. And I was about to recite some very horrid poetry.............
And some of those planets--the Tok'ra, for instance--have unions.