NSA Wants To Dump the Phone Records It Gathered Over 14 Years (thenextweb.com)
According to The Next Web, the NSA would like to get rid of something that a lot of people wish they'd never had in the first place: phone records that the agency has collected over a decade and a half (more, really) of mass surveillance.
However, the EFF wants to make sure that the evidence of snooping doesn't get buried along with the actual recorded data. From the article: [T]he government says that it can't be sued by bodies like the EFF. The organization is currently involved in two pending cases seeking a remedy for the past 14 years of illegal phone record collection.
EFF wrote a letter (PDF) to the secret Foreign Intelligence Surveillance Act court last December which it has now made public, explaining that it is ready to discuss options that will allow destruction of the records in ways that still preserve its ability to prosecute the cases.
It'll be interesting to see how this pans out: if the government doesn't agree to a discussion about how to handle these phone records, it's possible that they will remain on file for years to come. Plus, it could allow the NSA to avoid being held accountable for its illegal mass surveillance.
Nobody with a shred of common sense would believe that the government would actually erase all of this data. There will absolutely be copies of it on a secure, secret server somewhere in Spookland. This is nothing but a diversionary tactic.
Why on earth should the NSA be held accountable for something they implemented on behalf of politicians?
How about prosecuting George W. Bush and Donald Rumsfeld for torture first?
Hmm. Yes, I should have looked that up and crunched the numbers before giving any out-of-my-ass estimates. I'd no idea it was on the order of three billion calls per day. But how big would that number be once the high volume callers (telemarketers, customer service, etc.) were vetted and either eliminated or stored in a separate low-interest database? I suspect you'd lose an order of magnitude there. Perhaps several.
Also, 64 bytes is far too big for a decent tailor-made algorithm. There's no need for a timestamp to be attached to each record. You could have a stream of timestamps in the compressed file and in-between them would be pairs of ID numbers representing the source and destination phone numbers. Four bytes gives you 4.29 billion unique phone numbers, so with this very basic scheme you're looking at barely over 16 bytes per call. This does not include any form of PPM, which for starters would allow you to truncate the destination ID in the vast majority of cases. You could also recover several bits from the ID numbers since you surely don't need a direct index of 4.29 billion numbers for optimal compression.
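To make the timestamp-stream idea concrete, here is a rough Python sketch, not anyone's actual format. It assumes calls arrive sorted by timestamp, that phone numbers have already been mapped to 4-byte IDs, and that each timestamp is followed by a 4-byte count of the pairs that share it. The count field and the names `encode` and `decode` are my own additions for illustration.

```python
import struct
from itertools import groupby


def encode(calls):
    """Pack (timestamp, src_id, dst_id) tuples, sorted by timestamp.

    Layout per distinct timestamp (an assumed, illustrative format):
      4-byte timestamp, 4-byte pair count, then count * (4-byte src, 4-byte dst).
    """
    out = bytearray()
    for ts, group in groupby(calls, key=lambda c: c[0]):
        pairs = [(src, dst) for _, src, dst in group]
        out += struct.pack("<II", ts, len(pairs))
        for src, dst in pairs:
            out += struct.pack("<II", src, dst)
    return bytes(out)


def decode(blob):
    """Inverse of encode(): rebuild the list of (timestamp, src_id, dst_id)."""
    calls = []
    offset = 0
    while offset < len(blob):
        ts, count = struct.unpack_from("<II", blob, offset)
        offset += 8
        for _ in range(count):
            src, dst = struct.unpack_from("<II", blob, offset)
            offset += 8
            calls.append((ts, src, dst))
    return calls


if __name__ == "__main__":
    sample = [(1000, 1, 2), (1000, 3, 4), (1001, 5, 6)]
    blob = encode(sample)
    print(len(blob), "bytes for", len(sample), "calls")
    assert decode(blob) == sample
```

In this sketch each call costs 8 bytes for the ID pair plus its share of an 8-byte header per distinct timestamp, so the more calls share a second, the closer the average gets to 8 bytes. Nothing here does any PPM-style modelling or ID truncation; that would go on top.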
I admit the written transcripts are probably significantly larger than my off-the-cuff estimate, but you have to realize that the vast majority of those phone calls are brief (and you first have to remove the ones that are automated, unanswered, busy signal, etc.). It's still peanuts compared to the total budget of the NSA. I suspect it's something that could be very easily concealed without a thorough independent audit.
A petabyte of storage isn't really anything when you are talking data center sized storage. It wouldn't even be a gallon jug full of microSD cards.
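For a sense of scale, the figures thrown around in this thread can simply be multiplied out. This is back-of-the-envelope only: it assumes a constant three billion calls per day for the full 14 years and 365-day years, neither of which is established anywhere above, and `total_bytes` is just a helper name made up for the example.

```python
def total_bytes(calls_per_day, bytes_per_call, years, days_per_year=365):
    """Raw metadata size, assuming a constant call volume (an assumption)."""
    return calls_per_day * bytes_per_call * days_per_year * years


if __name__ == "__main__":
    PETABYTE = 10**15  # decimal petabyte
    for size in (64, 16):
        total = total_bytes(3_000_000_000, size, 14)
        print(f"{size} bytes/call: {total / PETABYTE:.2f} PB")
```

Under those assumptions the 64-byte record comes out at just under one petabyte and the 16-byte record at about a quarter of that, before any filtering of high-volume callers.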
Time to offend someone