NSA Utah Data Center Blueprints Reveal It Holds Less Than Thought
cold fjord writes "Break out the tin foil hats, and make them double thick. Forbes reports, 'The NSA will soon cut the ribbon on a facility in Utah ... the center will be up and running by the "end of the fiscal year," ....Brewster Kahle is the engineering genius behind the Internet Archive,... Kahle estimates that a space of that size could hold 10,000 racks of servers .... "So we are talking $1 billion in machines." Kahle estimates each rack would be capable of storing 1.2 petabytes of data. ... all the phone calls made in the U.S. in a year would take up about 272 petabytes, ... If Kahle's estimations and assumptions are correct, the facility could hold up to 12,000 petabytes, or 12 exabytes – ... but is not of the scale previously reported. Previous estimates would allow the data center to easily hold hypothetical 24-hour video and audio recordings of every person in the United States for a full year. The data center's capacity as calculated by Kahle would only allow the NSA to create archives for the 13 million people living in the Los Angeles metro area. Even that reduced number struck Internet infrastructure expert Paul Vixie as high given the space allocated for data in the facility. ... he came up with an estimate of less than 3 exabytes of data capacity for the facility. That would only allow for 24-hour recordings of what every one of Philadelphia's 1.5 million residents was up to for a year. Still, he says that's a lot of data pointing to a 2009 article about Google planning multiple data centers for a single exabyte of info. '" Update: 07/25 16:33 GMT by T : For even more, see this story.
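For anyone who wants to check the summary's arithmetic, here is a minimal sketch using only the figures quoted above. It assumes decimal units (1 EB = 1,000 PB, 1 PB = 1,000 TB); the function names are made up for illustration, and Vixie's figure is treated as an upper bound because the article says "less than 3 exabytes".

```python
# Figures quoted in the summary; everything else is derived from them.
RACKS = 10_000                # Kahle's estimate of racks that fit the space
PB_PER_RACK = 1.2             # Kahle's estimate of storage per rack
US_CALLS_PB_PER_YEAR = 272    # quoted size of a year of U.S. phone calls
VIXIE_PB_UPPER = 3_000        # Vixie: "less than 3 exabytes" (upper bound)

def capacity_pb(racks=RACKS, pb_per_rack=PB_PER_RACK):
    """Total capacity in petabytes: racks times per-rack storage."""
    return racks * pb_per_rack

def years_of_calls(total_pb, pb_per_year=US_CALLS_PB_PER_YEAR):
    """How many years of U.S. phone calls would fit in total_pb."""
    return total_pb / pb_per_year

def tb_per_person(total_pb, people):
    """Terabytes available per person (assumes 1 PB = 1,000 TB)."""
    return total_pb * 1_000 / people

if __name__ == "__main__":
    kahle = capacity_pb()
    print(f"Kahle capacity: {kahle:.0f} PB")
    print(f"Years of U.S. calls: {years_of_calls(kahle):.1f}")
    print(f"TB per person, LA metro: {tb_per_person(kahle, 13_000_000):.2f}")
    print(f"TB per person, Philadelphia: "
          f"{tb_per_person(VIXIE_PB_UPPER, 1_500_000):.2f}")
```

By this arithmetic Kahle's 12,000 PB would hold roughly 44 years of U.S. phone calls at the quoted 272 PB per year. The two per-person numbers are only as good as the population and capacity figures in the article.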
This was submitted by cold fjord, Slashdot's resident neo-con who supports waterboarding, said the Iraq war was "worth it", and said Bradley Manning deserved to be tortured for 'faking' feeling suicidal. What do you expect?
Oh, and this facility will "only allow for 24-hour recordings of what every one of Philadelphia's 1.5 million residents was up to for a year". It is convenient that the article fails to mention that this is only one facility out of a dozen or so.
The next step for the NSA is a small file for every human, with enough space for a day's internet links, chats, text for life. :)
That can be expanded as they get politically active.
The file per person would allow any person's digital life to be tracked back to the first 'connection' of interest.
In the past all that could be done was to track telephone numbers, fax, computer use and voice prints as found or via contact with a past person or group of interest.
The past sorting was very quick and left a very small amount of data to be sent to the US from any distant supercomputing location (UK, Australia).
i.e. the NSA is not after http://www.wired.co.uk/news/archive/2013-06/24/gchq-tempora-101 long term.
They don't want 'big' content long term; they need space for all the IPs you've used, ports, apps used, keywords, links, times, locations, connections to people: all very tiny amounts of text-like info for now. i.e. the "initial filter" will go for your pic, movie, sound, text, not keeping it, but it might give a facial recognition code string to everybody in the pic. You only need a good voice print every so often...
Data size has never been the issue; legality and domestic commercial 'help' have been.
Domestic spying is now "Benign Information Gathering"
And they haven't even taken compression into consideration. Telephone audio has really low quality, so it shouldn't take that much space when compressed.
Or if its processed to transcript and stored as text .. then deduped .. then compressed .. and who said they were using magnetic media? Or only had above-ground capacity?
Pretty unimaginative to assume that this is a giant storage node ..
"Expect more articles like this that downplay the scale..."
Downplay the scale? We haven't even seen the drawings for the below-ground facilities.
But, seriously. From the article...
"...and that the sheer size of the data centers in Utah and elsewhere suggests that the agency wants to vacuum up everything it can..."
That's my emphasis--plural. There is more than one of these centers. Take a look at the layout of the Utah Data Center article at Wikipedia.
http://en.wikipedia.org/wiki/Utah_Data_Center
Does that building layout look anything like the one at the top of the linked Forbes article? The picture of the buildings and the layout right above are a match in the Wikipedia article, yet they don't match the plans in the Forbes article.
So where is this data center that Forbes has the plans to? They're obviously not the same.
The claim that a year's worth of phone calls is around 272 petabytes is dead on; it matches up perfectly with some back-of-the-napkin calculations I did a while back based on a published report from the FCC[1]. Depending on the encoding bitrate, the range I had was 107 PB for 8 Kbps audio to 430 PB for 32 Kbps audio. 272 PB is about 20 Kbps, exactly in the middle...
http://slashdot.org/comments.pl?sid=3871487&cid=44027425
[1]: http://transition.fcc.gov/Bureaus/Common_Carrier/Reports/FCC-State_Link/IAD/trend605.pdf
The report only documents up to year 2000, but I presumed POTS service had leveled out with the emergence of VOIP and SMS messaging.
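A rough sketch of the scaling, for anyone who wants to reproduce it. The FCC call volume itself isn't restated in the comment, so this backs the total call time out of the 107 PB at 8 Kbps endpoint; that is a derived figure, not one from the report. It also assumes 1 Kbps = 1,000 bits/s and 1 PB = 10^15 bytes. Function names are illustrative only.

```python
BYTES_PER_PB = 10**15  # assumption: decimal petabytes

def bytes_per_second(kbps):
    """Audio bytes per second at a given bitrate (1 Kbps = 1,000 bits/s)."""
    return kbps * 1000 / 8

def implied_call_seconds(storage_pb, kbps):
    """Seconds of audio that fill storage_pb at the given bitrate."""
    return storage_pb * BYTES_PER_PB / bytes_per_second(kbps)

def storage_pb_for(call_seconds, kbps):
    """Petabytes needed to store call_seconds of audio at the given bitrate."""
    return call_seconds * bytes_per_second(kbps) / BYTES_PER_PB

def implied_kbps(storage_pb, call_seconds):
    """Bitrate at which call_seconds of audio would occupy storage_pb."""
    return storage_pb * BYTES_PER_PB * 8 / 1000 / call_seconds

if __name__ == "__main__":
    # Back out annual call time from the 8 Kbps endpoint quoted above.
    seconds = implied_call_seconds(107, 8)
    print(f"Implied call time: {seconds / 60:.3e} minutes per year")
    print(f"Storage at 32 Kbps: {storage_pb_for(seconds, 32):.0f} PB")
    print(f"Bitrate implied by 272 PB: {implied_kbps(272, seconds):.1f} Kbps")
```

Storage scales linearly with bitrate, so the 32 Kbps figure is simply four times the 8 Kbps one, and 272 PB lands a little above 20 Kbps.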
On what is kept. If it really is just the metadata and not the conversation, then the storage requirements are not all that large.
For landlines, there is a unique identifier applied at the switch. I can't remember what it's called, but in South Texas, it usually started with BAPA- blah blah blah for several digits.
For cell phones, there is the IMEI/UDID/ESN. Normally around 14 to 20 digits, usually 15.
Next, called number, same info.
Last, call duration.
I believe it's long been known that using particular words in a telephone conversation would raise a flag. I don't know if that's true or not. If so, let's consider this scenario:
Call metadata captured and stored - always.
Call voice session saved to a temporary storage area.
Call concludes.
Voice data is analyzed for key words using automation. (Think about when you call your credit card company, and can input your CC number by voice)
If no keyword flags are raised, delete the conversation after X time (or immediately, who knows?)
If keyword flag is raised, score by number of keywords, flag conversation for human review, preserve all data.
After human review, who knows?
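The scenario above can be sketched in a few lines. This is purely a toy illustration of the steps as described, not a claim about how any real system works: the keyword list is made up, a plain-text transcript stands in for the output of voice analysis, and "delete" just means the transcript is never stored.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical watch list; the comment does not name any actual keywords.
KEYWORDS = {"alpha", "bravo", "charlie"}

@dataclass
class CallRecord:
    caller_id: str                     # e.g. a line or handset identifier
    called_id: str
    duration_s: int
    transcript: Optional[str] = None   # only kept if the call is flagged
    score: int = 0
    flagged: bool = False

def keyword_score(transcript, keywords=KEYWORDS):
    """Count how many words in the transcript are on the watch list."""
    words = re.findall(r"[a-z']+", transcript.lower())
    return sum(1 for word in words if word in keywords)

def process_call(caller_id, called_id, duration_s, transcript,
                 metadata_store, review_queue):
    """Always store metadata; keep content only when keywords are found."""
    record = CallRecord(caller_id, called_id, duration_s)
    metadata_store.append(record)          # metadata captured, always
    score = keyword_score(transcript)      # automated analysis after the call
    if score > 0:
        record.transcript = transcript     # preserve all data
        record.score = score
        record.flagged = True
        review_queue.append(record)        # queue for human review
    # Otherwise the transcript is dropped and only metadata remains.
    return record

if __name__ == "__main__":
    metadata, review = [], []
    process_call("A1", "B2", 65, "see you at lunch", metadata, review)
    process_call("C3", "D4", 300, "meet at bravo, then alpha alpha",
                 metadata, review)
    print(len(metadata), "records stored;", len(review), "flagged")
```

What happens after human review is left out, since the comment leaves it open too.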
What I think: If preserving our freedom comes at the price of invading all of our privacy, then the terrorists have been gifted with a victory they could have never won for themselves. We have destroyed our freedom with the illusion of security, and now have neither freedom nor security. To draw a parallel, how is having the TSA able to squeeze my balls protecting me? "Dude - don't touch my junk!"
Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.
"Or just some Area to keep the stuff in, they'd have to have at least 50 of them by now..."
Let's not forget the 72 "Fusion Centers" located throughout the country.
http://en.wikipedia.org/wiki/Fusion_center
From that article:
"MIAC report
Missouri Information Analysis Center (MIAC) made news in 2009 for targeting supporters of third party candidates, Ron Paul supporters, pro-life activists, and conspiracy theorists (Hi, Mom!) as potential militia members.[14] Anti-war activists and Islamic lobby groups were targeted in Texas, drawing criticism from the ACLU.[15]
According to the Department of Homeland Security:[16]
[T]he Privacy Office has identified a number of risks to privacy presented by the fusion center program:
Justification for fusion centers
Ambiguous Lines of Authority, Rules, and Oversight
Participation of the Military and the Private Sector
Data Mining
Excessive Secrecy
Inaccurate or Incomplete Information
Mission Creep"
Ironically, this is a report from the Dept. of Homeland Security about the risks of such centers. And yet, nobody has even mentioned how many overseas facilities we're paying for on top of all the domestic ones.