Declassified Report From 2009 Questions Effectiveness of NSA Spying
schwit1 writes: With debate gearing up over the coming expiration of the Patriot Act surveillance law, the Obama administration on Saturday unveiled a 6-year-old report examining the once-secret program code-named Stellarwind, which collected information on Americans' calls and emails. The report was from the inspectors general of various intelligence and law enforcement agencies.
They found that while many senior intelligence officials believe the program filled a gap by increasing access to international communications, others, including FBI agents, CIA analysts and managers, "had difficulty evaluating the precise contribution of [the surveillance system] to counterterrorism efforts because it was most often viewed as one source among many available analytic and intelligence-gathering tools in these efforts."
"The report said that the secrecy surrounding the program made it less useful. Very few working-level C.I.A. analysts were told about it. ... Another part of the newly disclosed report provides an explanation for a change in F.B.I. rules during the Bush administration. Previously, F.B.I. agents had only two types of cases: 'preliminary' and 'full' investigations. But the Bush administration created a third, lower-level type called an 'assessment.' This development, it turns out, was a result of Stellarwind."
Anyone who manages big data can tell you how corrupt most data sets really are. Names spelled different ways, bits of information incorrectly transcribed, copy errors, format errors and import errors are all low-probability events, but when you're dealing with billions of records, there are a lot of them.
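To put rough numbers on that, here is a minimal Python sketch. The per-million error rates and the record count are made-up illustrations, not measurements from any real data set, and `expected_bad_records` is just a hypothetical helper name.

```python
# Hypothetical error rates (errors per million records); illustrative only.
ASSUMED_ERRORS_PER_MILLION = {
    "name spelled differently": 500,
    "transcription error": 100,
    "import/format error": 20,
}


def expected_bad_records(total_records: int, errors_per_million: int) -> float:
    """Expected number of bad records for a given per-million error rate."""
    return total_records * errors_per_million / 1_000_000


if __name__ == "__main__":
    total = 5_000_000_000  # assumed size, standing in for "billions of records"
    for kind, rate in ASSUMED_ERRORS_PER_MILLION.items():
        print(f"{kind}: ~{expected_bad_records(total, rate):,.0f} bad records")
```

Even the rarest of these assumed error types leaves a large pile of bad rows once the total runs into the billions.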
As someone who has spent the better part of two weeks fruitlessly trying to get my Experian data to remotely resemble my Equifax data (and I have exactly 18 months of credit history), I can attest to that. Heck, even in a completely contained ERP system that controls a manufacturing warehouse (one of my clients), the issues that people can cause there are surprising.
In nearly every case they didn't effectively use the information they had
The number one problem with large datasets is not knowing what's in there, and therefore not really knowing how to query the data to find out. Stratfor had a report on that maybe a year ago, discussing the 9/11 "intelligence failure" and the beacon-lit paths the hijackers left behind: essentially, since the FBI wasn't actively looking for people who might be planning a major operation, they never saw the clues.
By way of analogy, if I'm sifting through a ledger table looking for (say) a mismatched transaction, the odd voucher sequence a few rows up might be completely missed. You can't depend on a specific sequence of vouchers in general; that column looks like a lot of noise. But if I'm tracking down an inventory issue, that odd voucher sequence might just be the key.
The point is, it's easy to blame people for missing the obvious after the fact. But that's 20/20 hindsight; the people who missed it may have been working on something much more pressing.
so how is more information going to make things better?
It can't and won't. More unfiltered data = more noise, and more noise can obscure a real signal or give the impression of a false signal.
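A back-of-the-envelope sketch of why, in Python. Every number here (record counts, how many real signals exist, the hit rate, the false-alarm rate) is an assumption picked for illustration, and `flag_precision` is a hypothetical helper, not anything from the report.

```python
def flag_precision(records: int, real_signals: int,
                   hit_rate: float, false_alarm_rate: float) -> float:
    """Fraction of flagged records that are real signals.

    hit_rate: chance a real signal gets flagged (assumed).
    false_alarm_rate: chance an innocent record gets flagged (assumed).
    """
    true_hits = real_signals * hit_rate
    false_alarms = (records - real_signals) * false_alarm_rate
    return true_hits / (true_hits + false_alarms)


if __name__ == "__main__":
    # Same handful of real signals, same filter, ten times the haystack.
    small = flag_precision(1_000_000, 10, hit_rate=0.9, false_alarm_rate=0.001)
    large = flag_precision(10_000_000, 10, hit_rate=0.9, false_alarm_rate=0.001)
    print(f"precision with 1M records:  {small:.4%}")
    print(f"precision with 10M records: {large:.4%}")
```

With the number of real signals held fixed, adding more unfiltered records only adds false alarms, so a smaller share of what gets flagged is worth anyone's time.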