NSA Utah Data Center Blueprints Reveal It Holds Less Than Thought
cold fjord writes "Break out the tin foil hats, and make them double thick. Forbes reports, 'The NSA will soon cut the ribbon on a facility in Utah ... the center will be up and running by the "end of the fiscal year," ....Brewster Kahle is the engineering genius behind the Internet Archive,... Kahle estimates that a space of that size could hold 10,000 racks of servers .... "So we are talking $1 billion in machines." Kahle estimates each rack would be capable of storing 1.2 petabytes of data. ... all the phone calls made in the U.S. in a year would take up about 272 petabytes, ... If Kahle's estimations and assumptions are correct, the facility could hold up to 12,000 petabytes, or 12 exabytes – ... but is not of the scale previously reported. Previous estimates would allow the data center to easily hold hypothetical 24-hour video and audio recordings of every person in the United States for a full year. The data center's capacity as calculated by Kahle would only allow the NSA to create archives for the 13 million people living in the Los Angeles metro area. Even that reduced number struck Internet infrastructure expert Paul Vixie as high given the space allocated for data in the facility. ... he came up with an estimate of less than 3 exabytes of data capacity for the facility. That would only allow for 24-hour recordings of what every one of Philadelphia's 1.5 million residents was up to for a year. Still, he says that's a lot of data pointing to a 2009 article about Google planning multiple data centers for a single exabyte of info. '" Update: 07/25 16:33 GMT by T : For even more, see this story.
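For anyone who wants to check the summary's arithmetic, here is a minimal Python sketch. It uses only the figures quoted above (10,000 racks, 1.2 petabytes per rack, 272 petabytes for a year of U.S. phone calls); the variable names are illustrative and it assumes decimal units (1 exabyte = 1,000 petabytes).

```python
# Figures quoted in the summary; names are illustrative, units are decimal.
RACKS = 10_000               # Kahle's estimate of rack count
PB_PER_RACK = 1.2            # Kahle's estimate of storage per rack
US_CALLS_PB_PER_YEAR = 272   # quoted size of one year of U.S. phone calls

total_pb = RACKS * PB_PER_RACK
total_eb = total_pb / 1000   # petabytes -> exabytes

# How many years of U.S. call audio would fit at the quoted rate
years_of_us_calls = total_pb / US_CALLS_PB_PER_YEAR

print(f"{total_pb:,.0f} PB = {total_eb:.0f} EB")
print(f"about {years_of_us_calls:.0f} years of U.S. phone calls")
```

On those figures the facility comes out at 12 exabytes, or roughly 44 years of U.S. call audio.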
They will include this thread for sure. They told me.
After looking through the blueprints I couldn't find anywhere designated for a Stargate. Bummer.
On the bright side, that is one more rumor that can be laid to rest.
much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
Expect more articles like this that downplay the scale of the NSA.
This is a vast amount of storage. Obviously, the puzzle they've bought a data palace of a storage facility to assemble doesn't require indefinite storage for everyone. They're looking to cache everything they can get and then filter what's interesting. Maybe they have a range of target levels from indefinite storage of everything collected for one group, a year for another group, a month for a third group, a week for another, all the way down to a day or hours for the entire slush.
They don't need it all. They just need to run whatever algorithms they care about so they can toss whatever they think doesn't matter and keep what does.
You do know they lie for a living, right?
C3P0: Wonderful!
Did anyone really believe the estimates of zettabytes and yottabytes of data? Even if they used tape it would be nearly impossible to order that much storage. An exabyte is still a lot. This doesn't change our need for these facilities to be dismantled.
Never mind that the annual production of hard drives is about 100 million drives. If they were all on the order of 5 TB, 10 EB would represent 2% of the global hard drive market for a year. Annual tape production is actually of a very similar order of magnitude to annual hard drive production, so it is not like tapes gain you much. At least this is more reasonable than the previous estimates in the zettabyte range, which would have to assume they had ten years' worth of hard drive and/or tape production at current storage density.
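A quick sketch of that market-share arithmetic in Python, taking the comment's own inputs at face value (100 million drives a year, 5 TB each). The 5 ZB figure is the one attributed to earlier news reports elsewhere in this thread; everything here is a back-of-the-envelope assumption, not a measured number.

```python
# Inputs as stated in the comment above; decimal units throughout.
DRIVES_PER_YEAR = 100_000_000
TB_PER_DRIVE = 5
TB_PER_EB = 1_000_000

annual_production_eb = DRIVES_PER_YEAR * TB_PER_DRIVE / TB_PER_EB

# Share of one year's production needed for a 10 EB facility
share_for_10_eb = 10 / annual_production_eb

# Years of total production needed for the 5 ZB (5,000 EB) figure
years_for_5_zb = 5_000 / annual_production_eb

print(f"annual production: {annual_production_eb:.0f} EB")
print(f"10 EB is {share_for_10_eb:.0%} of one year's output")
print(f"5 ZB is {years_for_5_zb:.0f} years of output")
```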
An estimate is an estimate is an estimate. The only people who know for sure how much data they can store aren't telling. Wake me when you have some facts. Be wary of mis/disinformation, goes without saying. I expect they have the capability to store & sift MUCH more than they let on.
House Mulls NSA Restrictions in Collecting Metadata http://defensetech.org/2013/07/24/house-mulls-nsa-restrictions-in-collecting-metadata/#more-21000
I mean, sure, you could record a few million people sleeping for eight hours a day, or watching 4 hours of Simpsons reruns a night, but why? If you're recording the 1-2 hours most people spend on the phone a day (max), then 3 exabytes might actually work out okay.
Storage density will keep increasing, so the storage capacity of the facility will also grow proportionally.
Or this is a red herring and the real data center is elsewhere.
Time to get everyone to post huge files of garbage. Let them store that.
A billion dollars they're spending. The NIH, the people who fund research that is going to cure cancer, they had their funding cut about 1.5 billion.
Hey, NSA! I'm thinking highly unpatriotic, violent thoughts right now!
Even that reduced number struck Internet infrastructure expert Paul Vixie as high
My uneducated response was "Holy Fuck!". Lucky the experts were there to clarify.
Sure, it would require a ridiculous server farm to store *recordings* of every phone call placed in the US, much less worldwide. Add emails, texts, IMs, etc., and the NSA would send hard drive prices through the roof all by themselves.
But phone *records* are another thing entirely. To store a record of every phone call (timestamps, caller number, recipient number, and maybe GPS) would only take roughly 30 TB a year (@ 500,000,000 calls placed each year). That's only about 2U worth of well-stocked NAS.
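To see what those two numbers imply per call record, here is a small Python check. It only divides the stated yearly storage by the stated yearly call count; both inputs are the comment's own figures, not verified ones.

```python
# Figures as stated in the comment above (decimal terabytes).
STORAGE_BYTES_PER_YEAR = 30 * 10**12   # 30 TB
CALLS_PER_YEAR = 500_000_000

bytes_per_record = STORAGE_BYTES_PER_YEAR // CALLS_PER_YEAR
print(f"{bytes_per_record:,} bytes per call record")
```

That works out to 60,000 bytes per record, which readers can compare with how large they think a timestamp, two phone numbers, and a location fix would be.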
The footprint of the facility doesn't concern me as much as the extent of the NSA's authority. I'm all for stopping terrorism, but I'm not a fan of living in an Orwellian society, regardless of how "safe" it makes us. Slippery slope arguments aside, concentrated power will *always* be abused, and dragnet programs can *never* make us 100% safe.
Let's just assume that the smaller number is correct: 3 exabytes of data. Let's also assume that it is correct that the capacity of this data center could store 24-hour surveillance of 1.5MM individuals for a year. Presumably conservative numbers, right? According to the "Global Terrorism Database" maintained by the University of Maryland, there were 5,008 terrorism incidents in 2011. But nearly 85% of those attacks occurred in South Asia, Africa, and the Middle East. Only 12 occurred in North America. We had 92 in Western Europe, but most of those seem to be related to Irish separatist groups. According to the US State Department, not *one* American was killed by terrorism within the US in all of 2011. (As a side note, this means that in addition to pepper spraying students, UC Davis police officers were responsible for more American deaths in 2011 within the US than all of the terrorists in the world combined.) The vast, vast majority of those attacks aren't happening in the G20 nations that seem to be marginally complicit with all of this. Hard to believe that we would need to have this sort of data storage to prevent these sorts of attacks. Per Wikipedia, there were 36,000 Taliban fighters in Afghanistan in 2010. They were responsible for 386 attacks in 2010. I'd give them credit for all of the unattributed attacks worldwide during that time period, but doing so only bolsters my case. Let's assume that other terrorism organizations are equally as "efficient" as the Taliban: 5008/386*36000 would predict that there were ~467K terrorists *worldwide*. So, the 3 exabytes of data is more than enough to store 24/7 audio and video surveillance of every terrorist, worldwide, for about 3 years. Now I don't have a tin foil hat, but it's hard to believe that we have a surveillance program capable of offering this sort of surveillance of every known terrorist worldwide... If we did, it probably wouldn't have taken so long to find Bin Laden.
So, what then, might the NSA use all of this capacity for? One could argue that the NSA wouldn't store all of this data—they're just processing it—but I would only think that this would increase their capacity for surveillance.
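The estimate in that comment can be reproduced with a few lines of Python. This is only a sketch of the commenter's own reasoning: the incident and fighter counts are the ones quoted above, and the "same attacks per fighter everywhere" scaling is the comment's assumption, not an established fact.

```python
# Numbers quoted in the comment above.
INCIDENTS_WORLDWIDE_2011 = 5008
TALIBAN_ATTACKS_2010 = 386
TALIBAN_FIGHTERS_2010 = 36_000

# 3 EB was said to hold 24-hour recordings of 1.5 million people for a year.
PERSON_YEARS_IN_3_EB = 1_500_000

# Assume every group has the Taliban's ratio of fighters to attacks.
estimated_terrorists = (
    INCIDENTS_WORLDWIDE_2011 / TALIBAN_ATTACKS_2010 * TALIBAN_FIGHTERS_2010
)

# How long 3 EB could hold round-the-clock recordings of that many people
years_of_coverage = PERSON_YEARS_IN_3_EB / estimated_terrorists

print(f"estimated terrorists worldwide: {estimated_terrorists:,.0f}")
print(f"years of 24/7 coverage in 3 EB: {years_of_coverage:.1f}")
```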
You don't need to store 30 frames per second video on people. Store 1 frame every 5 seconds, and you'll get the same information about people at 1/150th the scale of data saved. Even 5 seconds is too short. There's a kind of Planck scale for people - people can only move so fast in real life. Choose the interval so that you always get a shot of a person if the person has crossed the camera's field of view, in 95% of cases.
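As a rough Python sketch of that idea: the first function reproduces the 1/150 figure, and the second is one possible way to pick the interval, under the assumption (mine, not the commenter's) that you divide the width of the camera's field of view by the fastest walking speed you care about. Plugging in a 95th-percentile speed would correspond to the "95% of cases" target. All names and sample numbers are hypothetical.

```python
def reduction_factor(video_fps: float, seconds_per_frame: float) -> float:
    """How many times fewer frames you keep when sampling one frame
    every `seconds_per_frame` seconds instead of `video_fps` per second."""
    return video_fps * seconds_per_frame


def max_sampling_interval(view_width_m: float, speed_m_per_s: float) -> float:
    """Longest gap between frames that still catches someone crossing
    a field of view `view_width_m` wide at `speed_m_per_s`."""
    return view_width_m / speed_m_per_s


# 30 fps video sampled once every 5 seconds
print(reduction_factor(30, 5))

# Hypothetical camera covering 10 m, person moving at 2 m/s
print(max_sampling_interval(10.0, 2.0))
```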
JPEG for video and ADPCM for audio !!
'nuf said !!
Get with the times, man !! Get !! With !! The !! Times !! This ain't the 80s !!
That is 42 GB per american.
1 billion dollars is only $3 each though, so not too bad on that front.
The US is the easy case. Until you find a way to get China, North Korea, Iran, (oppressive regime X), et al. to give them up, and various terrorist groups to stop attacking*, you're going to be stuck with it.
I hope you are seriously not making the argument that the US must do it because regimes like China, North Korea and Iran are doing it. There are a lot of things that China, North Korea and Iran do that the US would do well not to emulate, starting with oppressing their own citizens.
the free democratic nations need intelligence agencies that are capable of helping to protect their societies.
Nobody disagrees with that broad principle. Whether the intelligence agencies need to have the power to indiscriminately harvest untargeted information on everyone to be capable at their job, however, is at issue. If you want to take it to extremes, you could also make the argument that the NSA should be given the powers once held by the Stasi, the KGB, and their Chinese equivalents to be truly capable. It is true that this would increase the effectiveness of the NSA, but I don't think anyone really wants to go there.
Unilateral disarmament in the face of aggression tends to have significant negative consequences.
Strawman argument. No one is suggesting that the US, or the NSA, "unilaterally disarm" against China, North Korea, Iran et al. The whole reason why PRISM blew up was because the NSA was collecting data not on China, North Korea or Iran, but on their own citizens and innocent third parties. That is only insofar as PRISM is concerned; we have no idea what other information may be collected by other programs because the NSA won't tell us.
That would be the equivalent of using your arms on your own family and innocent outsiders in the face of aggression.
If there are 3 billion people online on the planet, that's 1GB per person. Enough for all their emails, weblogs, chats, and a huge shitload of photos. The following 3 data centers would presumably increase this to 4GB per person on the planet.
Wow. 'Downplay'???
About the only thing it *can't store* is live video, but your photos, web, chats, VOIP, and even telephone voice, etc. are easy, with plenty to spare.
"That would only allow for 24-hour recordings of what every one of Philadelphia's 1.5 million residents was up to for a year."
*only*? The average call plan is 300 minutes a month, so it could record the cellphone conversations for 432 million people.
Wow. They don't need that for call metadata, which is tiny by comparison (CDRs are tiny records used to log phone calls, and a month's worth fits in under 10k per person). They don't need that for email either; email text is tiny.
They must be grabbing bulk content data. Attachments, Google's cloud-printed documents, email content, and a shed load of photos and spreadsheets etc.
"NSA Utah" is an anagram for "anus hat".
Sheesh, evil *and* a jerk. -- Jade
That works out at 1GB for 3 billion people on the planet who are online, or 10GB for each of 300 million Americans.
Say you visit 300 webpages a day, at 128 bytes per URL = 23 years of URLs logged for every person on the planet. 230 years for US only.
Or perhaps 50/50 weburls and instant messages, for 12 years for everyone on the planet, 120 years for US only.
I find it troubling that cold fjord (who I believe is a defense contractor lobbyist, since he does the talking points the NSA puts out) is pushing the idea of 24/7 voice and video recording as if that's their future goal. Their current storage capacity is already far above the disclosed 'metadata' claims.
That 'boundless informant' leak showed 3 billion items of data per month on Americans, *not* including the metadata. So that will likely be bulk content data if those storage numbers are correct. It seems the NSA has some further truth issues to be resolved.
Who is suggesting the NSA wants to store vast amounts of useless video data? This is the classic STRAWMAN ploy.
Now the NSA is collecting increasing amounts of data mined from video sources, and every Xbox One console from Microsoft will be tracking every person that passes in front of the Kinect sensor system on a daily basis, but the mined information (like identity of person or vehicle, and times) takes a tiny amount of space.
Remember, the NSA facilities use the same hardware and software systems as Google. Everything Google does is designed to provide services to the intelligence community.
When the NSA does want to store video, it is from specific surveillance operations NOT universal surveillance, and the amount of such video is trivial to store.
What the NSA does want is the facility to automatically gather semantic data from general audio and video sources. So, phone calls are automatically transcribed to text as well as possible, even though the original audio is also stored. Experiments to extract data from video are ongoing. Obviously, most of this will be automatic face recognition.
The NSA is seeking to do two main things with their surveillance
1) gather blackmail material for future possible use in coercing well placed individuals. This is 99.99% of all targeted work by the NSA.
2) gather feedback on the current mindset of the population (or subsection of that population) to provide near instantaneous feedback on the effectiveness of ongoing propaganda campaigns in the mainstream media
There is nothing unusual in this operation. Intelligence agencies have always been used by those in power for these two main reasons. The fantasy that they chase criminals or foil plots is just hilarious.
The power elite care only about themselves, and their continued dominance. They spend your money building systems to keep themselves safe and secure. Your power elite see the power elite in other nations as the same tribe. It is YOU they see as different - it is YOU they see as a potential problem. It is you they need to control, so it is you they need to monitor.
The power elite have no morals and no conscience. For them, everything is a means to an end. The power elite are disgusted that you would sit back and let them rule over you, so they consider their contempt for you as a function and consequence of YOUR behaviour, not theirs.
Google and Microsoft, with their work for the NSA, are hoping once and for all to end any possibility that the system can ever be changed by grass roots leadership or activism. They are looking to create an eternal status quo, where the control of the sheeple from the top becomes flawless. Meanwhile, shills will work very hard to distract from the real story of NSA total surveillance of the entire population.
it would require a ridiculous server farm to store *recordings*
I get your overall point, but storing audio is hardly ridiculous at this point in technology, and will be even less so in the future. Compression helps, obviously, and there really isn't that much data to every phone call you make.
The claim that a year's worth of phone calls is around 272 petabytes is dead on; it matches up perfectly with some back of the napkin calculations I did a while back based on a published report from the FCC[1]. Depending on the encoding bitrate, the range I had was 107 PB for 8 Kbps audio to 430 PB for 32 Kbps audio. 272 PB is about 20 Kbps, exactly in the middle...
http://slashdot.org/comments.pl?sid=3871487&cid=44027425
[1]: http://transition.fcc.gov/Bureaus/Common_Carrier/Reports/FCC-State_Link/IAD/trend605.pdf
The report only documents up to year 2000, but I presumed POTS service had leveled out with the emergence of VOIP and SMS messaging.
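The bitrate comparison above is a straight proportion, and a short Python sketch makes it easy to replay. It takes the comment's 8 Kbps / 107 PB data point as the anchor and assumes storage scales linearly with bitrate; the function names are made up for illustration.

```python
# Anchor point taken from the comment above: 8 Kbps audio -> 107 PB per year.
ANCHOR_KBPS = 8
ANCHOR_PB = 107


def storage_pb(kbps: float) -> float:
    """Yearly storage in PB at a given bitrate, scaling linearly from the anchor."""
    return ANCHOR_PB * kbps / ANCHOR_KBPS


def implied_kbps(pb: float) -> float:
    """Bitrate that would produce `pb` petabytes of call audio per year."""
    return ANCHOR_KBPS * pb / ANCHOR_PB


print(f"32 Kbps -> {storage_pb(32):.0f} PB")
print(f"272 PB  -> {implied_kbps(272):.1f} Kbps")
```

Scaled this way, 32 Kbps gives about 428 PB (close to the 430 PB quoted) and 272 PB corresponds to a little over 20 Kbps.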
I feel sorry for the people who will have to go through all this data.
How curious you got modded offtopic - one of the dangers in engaging with cold fjord using reasoned arguments and facts.
Obviously the previous reports were wrong. Anybody familiar with computers and storage space knew that the numbers reported by NPR and other "news" outlets were ridiculous. They were saying that the center would hold 5 zettabytes, and would only cost $1.2 billion! That's about 25 cents per TB.
Best I could tell, NPR et al. misunderstood a Wired article from over a year ago. In the Wired article, somebody said that they would eventually like the processing power in the center to exceed 1 exaflops, and then maybe someday after that 1 zettaflops.
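The cost-per-terabyte figure is easy to check. A minimal Python sketch, using only the two numbers in the comment and decimal units (1 ZB = 1,000,000,000 TB):

```python
# Figures from the comment above.
REPORTED_CAPACITY_ZB = 5
REPORTED_COST_USD = 1_200_000_000
TB_PER_ZB = 1_000_000_000

capacity_tb = REPORTED_CAPACITY_ZB * TB_PER_ZB
cost_per_tb = REPORTED_COST_USD / capacity_tb

print(f"${cost_per_tb:.2f} per TB")
```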
Subject said it all.
On what is kept. If it really is just the metadata and not the conversation, then the storage requirements are not all that large.
For Landlines, there is a unique identifier applied at the switch. I mis-remember what it's called, but in South Texas, it usually started with BAPA- blah blah blah for several digits.
For cell phones, there is the IMEI/UDID/ESN. Normally around 14 to 20 digits, usually 15.
Next, called number, same info.
Last, call duration.
I believe it's long been known that using particular words in a telephone conversation would raise a flag. I don't know if that's true or not. If so, let's consider this scenario:
Call metadata captured and stored - always.
Call voice session saved to a temporary storage area.
Call concludes.
Voice data is analyzed for key words using automation. (Think about when you call your credit card company, and can input your CC number by voice)
If no keyword flags are raised, delete the conversation after X time (or immediately, who knows?)
If keyword flag is raised, score by number of keywords, flag conversation for human review, preserve all data.
After human review, who knows?
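The scenario above can be sketched in Python to show how little would need to be kept for an unflagged call. This is purely a toy model of the steps as the comment describes them: the keyword list, the `Call` record, and the `triage` function are all hypothetical, and a plain-text transcript stands in for whatever automated speech analysis would really be used.

```python
from dataclasses import dataclass

# Placeholder watch list; purely illustrative.
KEYWORDS = {"alpha", "bravo"}


@dataclass
class Call:
    caller: str
    callee: str
    duration_s: int
    transcript: str  # stands in for the output of automated voice analysis


def triage(call: Call, keywords=KEYWORDS) -> dict:
    """Always keep metadata; keep content only if a keyword is hit."""
    metadata = {
        "caller": call.caller,
        "callee": call.callee,
        "duration_s": call.duration_s,
    }
    # Score the call by the number of keyword occurrences.
    words = [w.strip(".,!?").lower() for w in call.transcript.split()]
    score = sum(1 for w in words if w in keywords)

    if score == 0:
        # No flag raised: the conversation is dropped, metadata stays.
        return {"metadata": metadata, "content": None, "score": 0, "review": False}
    # Flag raised: preserve everything and queue for human review.
    return {"metadata": metadata, "content": call.transcript,
            "score": score, "review": True}


boring = triage(Call("555-0100", "555-0101", 60, "See you at lunch."))
flagged = triage(Call("555-0102", "555-0103", 90, "Alpha, then bravo."))
print(boring["review"], flagged["review"], flagged["score"])
```

What happens after human review is left out, since the comment itself leaves it open.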
What I think: If preserving our freedom comes at the price of invading all of our privacy, then the terrorists have been gifted with a victory they could have never won for themselves. We have destroyed our freedom with the illusion of security, and now have neither freedom nor security. To draw a parallel, how is having the TSA able to squeeze my balls protecting me? "Dude - don't touch my junk!"
Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves.
"Right, "nobody" in Congress knows that NSA is building that big data center"
Strawman. Nobody in Congress knows it's about 4 orders of magnitude too big to be just for metadata. Nor do they realize that it's just one out of a whole set of current data centers. It's content, bulk content, and since the biggest source of content is their US intercepts it will be mostly US content.
They're debating whether to cancel the metadata trawl, but the data center shows its massive surveillance of content too.
"The NSA has responsibility for signal intelligence world-wide. You may recall from the news that the program involving phone records tied to direct communications with terrorists is a minor program involving only $20,000,000. Don't let your brainstorm carry you away to crank conspiracy theories."
I agree, it's ordinary people's data, several GB for everyone on the planet. Not a few thousand terrorists. If you recall, Gmail's big selling point was its 1GB mail storage that lets you 'keep all your email'; the NSA could store that several times over FOR EVERYONE ON THE PLANET in just this one data center.
"I'm sure that made sense if you're drunk blogging."
No, you just wanted to add a soundbite to your misinformation post. Just as you raised a strawman in this argument.
There is a saying: "a translator always translates as it is profitable for the translator".
The same is with this data eavesdropping and collection. It may be used to collect and trade commercial secrets in order, well, to gain money.
People who open other people's letters have always been considered sneaks and dastards. That is why it has always been necessary to obtain a specific court decision for each suspect to do it.
Was it really necessary to change it? The damage, which this global carpet-eavesdropping and voyeurism is creating for the moral image of the USA, is enormous. And also for the US companies.
Did the author of the summary read the article? The article for some reason mentions individualized video feeds for every American which is unrealistic and nothing like the sort of thing anyone has said the NSA is recording. 12,000 PB is far, far larger than the 272 PB estimated to hold all US domestic phone calls for a year, plus the foreign and international calls (which people forgot the NSA captures).
I recommend people read the archive.org description of the problem of archiving phone calls (TL;DR 272 PB) and DJB's article on cryptanalysis (PDF) (TL;DR NSA isn't stupid).
They have a prophecy that they will control the US government during the end of the world.
They believe that god wants them to rule the planet.
They believe that it's a sin to cheat other cult members but it's OK to cheat non members.
In the past, they've been convicted of murdering other Americans to steal their supplies and young girls.
I'm just asking - are those the people you want in charge?
"Nobody claims that the data center is there just to store metadata except cranks lining up strawmen."
About 4GB of data per person on the planet, and given the NSA has mostly US signal taps, with UK additions, it will be mostly US and UK data. So more like 20GB of data per US citizen and a few hundred megs for everyone else.
That's not metadata, that's content. The NSA chief General Alexander and President Obama have both claimed it's limited to metadata for US citizens, but that is not borne out by the volume of data.
"Most of them have armed forces. There are many terrorist groups. The NSA is responsible for knowing about them. That might involve a little data."
And yet in your earlier post you pointed out that the terrorist portion of the NSA's work only accounts for $20 million. Even if they had equal coverage of everyone in the world, and not a US focussed feed, then it still wouldn't be only metadata.
About once a year I send an email to my paranoid friends which includes a few buzz phrases.
Dear Spooks,
It is once again time for me to provide you with an update on nefarious activities on the Wild Wild Web.
While you are clandestinely surveilling me through your prism of delusion, why not take a moment to stand back and ask yourself: is what you are doing really protecting liberty, or slowly chiselling it away?
Have a good.
Redo the storage #s with a dedup app managing the disks... (x50 or x100)
Couldn't all phone calls be converted to text and as a result require much, much less storage space?
The estimates would be reasonable for a private datacenter of that size. This is the federal government. The NSA is evil, but it's an evil GOVERNMENT AGENCY.
On average, it takes 60 months, five years, from the time the govt orders a computer until it's installed. So this will be enterprise storage from 2008. Enterprise, not consumer. Figure SCSI drives of about 200 GB, not 3TB SATA.
Of course it's government efficiency in all aspects, so figure 10% of the floor space is used for server racks, etc.
The NSA is absolutely violating the fourth amendment and what they are all doing is inexcusable. How well are they doing it? Not well enough to notice when someone is taking their databases home, uploading them to several sites, and emailing all their confidential documents to journalists.
I read through many of the posts; the exchange between "cold fjord" and Aca something was cute with its little drama about paid writers (maybe they're both paid writers for the NSA or another government agency). Yet in all these posts, not one poster talked about the root of this article. Why would the NSA *need* all this space if it is not supposed to be collecting information without specific warrants or in bulk against innocent citizens?
That is the story. It is like y'all have just rolled over and accepted that it is okay for the NSA to even do this, so let's argue about size. My own view is that the NSA does *not* need these data centers for they should not be collecting that much information about everyone in the USA and beyond. I listened to a politician this morning (one who voted to continue funding the NSA's current trawling expedition) tell me that their actions "saved" hundreds of American lives, but if I asked for proof he'd say "I cannot disclose that information". I see, so you can't provide facts on what the program has done to save lives, you can't talk about what the program does though we know it gathers information on people who are not related to any illegal activity, and you ask us to "Trust You"? This is a republican who cries out for spending cuts, but votes to continue funding secret projects.
Please...
The spotlight on the NSA is not what it is building, it is on what it is doing, allegedly breaking the law. We should be asking more questions about that, digging into that, pushing Congress to act on that; not on blueprints. That they want to listen in or gather information on bad guys, fine, but when they expand that same action to include everyone then I have a problem.
Life is a great ride, the vehicle doesn't matter
Fill a server rack with Dell 3260 storage units, maxed out at 240TB per server. There is room in each rack for 10 such servers, so that's 2.4 petabytes per cabinet which is twice what the article says.
The blueprints are at best a measure of those portions of the facility where they will allow low level clearance contractors, like vetted electricians.
Even the MCI headquarters in Ashburn has an off-blueprints sub-basement for intel use, so we should hardly expect less of a facility directly owned by a TLA.
More than likely the center includes compression and deduplication that's part and parcel of most storage technology these days. And if it's just ascii metadata that will get you 90%+ reduction in space requirements for each call.
I work in the monitoring space. But looking at newspapers rather than people.
We get fairly well structured content. And searching it is hard and that is with TEXT content. You want to do that but add speech-to-text in the mix?
The quality of such a database would be crap. It may be useful AFTER the fact when you have a starting point (HE blew up the , who has he spoken to? Oh, disposable phones. Not so helpful) but not to start analyzing The Masses. I don't see it.
What everyone's really forgetting is that this is just ONE of their data centers. Where do you think they've been holding all this data up until now? Most NSA data centers were probably built during the heyday of mainframes and therefore already bigger than average, with the power and cooling to support big iron. Probably only represents 20% of their capacity at most. Hell, this might just be an off-site backup for agency records or the world's largest WoW server.
It would be pointless to compare an NSA datacenter with a civilian datacenter. The technologies available to the NSA are orders of magnitude faster and larger. The US has laws that can prevent such technology from entering the common marketplace.
That facility will most likely be in the Zettabyte range. Or, given that it is public, they could fill it with standard gear as a decoy.
It's aluminum foil. Al != Sn
They couldn't do their absolute worst case scenario if they wanted to! ...for about another 5 years until storage drives jump an order of magnitude.
it's actually like a TARDIS
Considering the demand for hard drives and such for these data centres, the production of hard drives and other storage and computing elements is kept nice and high. This keeps the production lines humming and the excesses get dumped on the private market at clearance prices. Imagine how much a 2T drive would cost if there were only the commercial market to satisfy.
Now these data centres will be buying the most dense drive they can so they'll be buying the 4T drives at the moment. When the 10T drives become available and they start buying those, we'll see a glut of 4T drives at clearance prices.
Yes, I know I'm being silly but since everyone else is not taking this all seriously, why should I?
The FBI list of attributes for potential terrorists shows concern for one's privacy or excessive secrecy as a sign that one is a potential terrorist.
Anthony Wiener believes in full disclosure.
Anthony Wiener is not a terrorist.
Anthony Wiener is a red-blooded American.
Anthony Wiener is all man !
Anthony Wiener is the paradigm of new 21st century citizen.
Anthony Wiener should be your congressman^W governor^W President ^W American Idol.
oh goody, something new on /.! I was sadly disappointed when the APK/MyCleanPC/TimeCube troll ran out of steam.
The info the feds actually want is a VERY small part of the info in any telephone call...or at least the info they can usefully do something with is minimal (true AI might be able to do anything if they had it).
So think about it. Before they store a call they certainly compress it, and they probably strip out all sorts of other stuff like vocal tone, exact duration/pronunciation of words, etc. Automated voice-to-text systems are almost good enough for them to just store the transcript.
CNET: Feds Put Heat On Web Firms For Master Encryption Keys
I'm sure Dice Holdings folded so hard that they not only gave up their logins, but also a neverending allocation of mod-points. /fnord!
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
So the mormon church could siphon data from the NSA to know where to send their missionaries & who is not paying their proper tithing.
I guarantee you, having worked for a VAR selling exclusively to government agencies, that 60 months is not correct where enterprise storage is concerned. You can figure that they have the latest and greatest enterprise storage 3 months after it's released by EMC/Hitachi/HP/IBM/etc. You can just about bet on them having the best of the best in both speed and capacity where storage is concerned, whether it be enterprise SSD or largest capacity SATA.
You're so going to be trolled by cold fjord's army of accounts/mod points....
1.2PB per 42U rack seems to assume 3TB drives in Backblaze-style pods: 10 pods, 45 drives per pod, 40 drives' worth of storage per pod in RAID 5 or similar, which gives 400 drives of storage, or 1.2PB with 3TB drives. With 4TB drives it is 1.6PB. Taller racks (telecom-style 72RU) nearly double that density and suck to work on, making them the perfect choice for government work. So the number could easily be in the 16-30EB range.
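Here is that pod arithmetic as a small Python sketch. The pod and drive counts are the ones in the comment; the 10,000-rack figure comes from Kahle's estimate in the story summary; the function names are invented for illustration and decimal units are assumed.

```python
def rack_capacity_pb(tb_per_drive: int, pods: int = 10,
                     usable_drives_per_pod: int = 40) -> float:
    """Usable petabytes in one rack of storage pods."""
    return pods * usable_drives_per_pod * tb_per_drive / 1000


def facility_capacity_eb(tb_per_drive: int, racks: int = 10_000,
                         pods: int = 10,
                         usable_drives_per_pod: int = 40) -> float:
    """Usable exabytes across the whole facility (1 EB = 1,000,000 TB)."""
    return racks * pods * usable_drives_per_pod * tb_per_drive / 1_000_000


print(rack_capacity_pb(3))       # 3TB drives
print(rack_capacity_pb(4))       # 4TB drives
print(facility_capacity_eb(4))   # 10,000 racks of 4TB pods
```

With 4TB drives the 42U case lands at 16 EB for 10,000 racks, which is the low end of the range given above; the high end depends on how much denser the taller racks really are.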
No sir, I don't like it.
(@ 500,000,000 calls placed each year)
That's less than 1.5 calls per US resident per year. I think you're off by a few orders of magnitude. I'll leave it as an exercise for the reader as to whether you're off in your storage estimate, your phone call estimate, or (more likely) both.
Disinformation no matter how you look at it.
Really? That long? I must be dreaming, then, working here as a federal contractor in the health sector, where the biggest thing I ordered, a honkin' huge RAID box, got here in 4 mos, and most servers are here in half that time. And as for drives, I think the 20 3TB WD Red drives I ordered were here in 2 weeks from the time I put in the order.....
mark "not under the DoD like the NSA"
And if you post something that is critical of anti-NSA (even if you are still anti-NSA and not pro-NSA), you get linked to an explanation of why you are a spy instead of something relevant, like why you are actually wrong.
It doesn't seem to matter which side of a debate you are on, on the internet there are people with too much free time that will spam you with irrelevance. Doesn't require any government or corporate conspiracy, just idiots arguing endlessly.
I work in the monitoring space. But looking at newspapers rather than people.
We get fairly well structured content. And searching it is hard and that is with TEXT content. You want to do that but add speech-to-text in the mix?
The quality of such a database would be crap. It may be useful AFTER the fact when you have a starting point (HE blew up the , who has he spoken to? Oh, disposable phones. Not so helpful) but not to start analyzing The Masses. I don't see it.
There's nothing to see until you hit a certain mass of interrelational data, at which point your graph matches start showing interesting correlations. As soon as you start depending on the content to define the structure, you've lost. You want to depend only on the metadata to define the content. Who cares about whether the phone was disposable? What you really want is all calls made to/from that phone, and where that connects to. Then you see that in certain situations, the same person is communicating with an awful lot of disposable phones, and that communication suddenly stopped yesterday. Flag goes up, and content of communication is searched.
The problem is, that while this does limit the search footprint (and storage footprint), it marginalizes FPs, it doesn't eliminate them. And now that FP is being deep-delved instead of quickly passed off as unimportant as would previously happen.
3 exabytes are available. If I'm going to be spied on, why not let me have unlimited access to the content that I want to place there? My tax dollars already paid for it, so why can't I have free access to it?
Modern technology scares and confuses the NSA, so how could they possibly be a real threat?
"24-hour video and audio recordings of [X] persons in [Y] for [Z] years" is a stupid metric. Take a close look at NCIS "Flesh and Blood" (S07E12, the 150th episode) for what a more sensible system would store and process. Listen closely to the report of a "hit" on a public internet terminal at a hotel. It's basically metadata + tags. That doesn't take a lot of storage space.
3 Exabytes would be plenty, if they had free access to all servers.