How the NSA Converts Spoken Words Into Searchable Text
Presto Vivace writes: Dan Froomkin reports at The Intercept: "Though perfect transcription of natural conversation apparently remains the Intelligence Community's 'holy grail,' the Snowden documents describe extensive use of keyword searching as well as computer programs designed to analyze and 'extract' the content of voice conversations, and even use sophisticated algorithms to flag conversations of interest." I am torn between admiration of the technical brilliance of building software like this and horror as to how it is being used. It can't just be my brother and me who like to salt all phone conversations with interesting keywords.
" I am torn between admiration of the technical brilliance of building software like this and horror as to how it is being used."
Why? Over most of history spying has saved lives more than taken them.
I find it so odd that people on Slashdot sing the praises of the "Codebreakers" of WWII but are shocked and freaked out that they are still around today.
BTW the US and Britain both spied and used code breaking before the war started so... Yes they were spying in peacetime!!!!! Shocking.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Easy: Pipe it through Siri.
"I don't understand what you mean by, The bag of bees will be in the Reich locket at the county AirPlay."
I can't even get a device - of any power - to recognise my voice beyond the very slow, pronounced basics and I have to train myself to it (not the other way around).
Would love to know how the NSA have access to technology that the top voice-recognition specialists and software can't manage, let alone dealing with noisy backgrounds, masked keywords, variety of languages, etc.
"Acres of datacentres" don't help for the simplest of obscurations in the phone call and guess who has a reason to mask their intentions behind innocent words? Terrorists.
So they use speech to text software, and that's technical brilliance? There's half a dozen exceedingly good ones used in various fields of medicine that are able to handle many different languages and thickly accented English.
I'm sorry, but your opinion seems to be wrong.
Hail Hydra!!!!
"Hey mom, I just named my new dog. I call him President Obama's Secret Isis Terror Gun Bomb. Lets talk about him a lot over the phone using his full name."
All of the voice recognition algorithms I've ever tried are terrible at recognizing anything said with an accent other than the one it was programmed to recognize. I have a Brazilian accent AND a speech impediment. I have never been able to use any voice recognition system in any language.
There was a period of a month or so that I answered the phone (when caller-ID showed that it was a friend who was in on the joke) with things like "Kill the president" or "The dirty bomb is ready". ;-) This was after the news about Total Information Awareness came out (really, anyone who was surprised by the Snowden leaks hadn't been paying attention for the past decade prior... it was obvious what sort of capabilities they were working toward. Heck even back in the 1990s with Carnivore the US government was already starting down this path).
A lot of commercial IVR systems are able to parse for keywords without requiring you to alter your natural speech patterns. I called my credit card company's customer service line yesterday, and its IVR prompt asked that I describe the reason for my call. I said, "I have a question about a charge on my bill," and it correctly connected me to the chargeback section. Speech recognition has progressed quite a bit in the last decade or so.
After seeing Citizenfour, it seems to me habeas corpus is worth jack squat, and the NSA knows no bounds or limits and doesn't have even the faintest hint of human decency.
Just look at what they did to that Lavabit guy: install a backdoor or give us your customers' data, or you will be tossed in some remote jail.
He chose to just go out of business. What a bunch of fucking assholes.
Even as far back as 2010, we piloted speech analytics that were amazing. NICE has an interesting approach, which is pretty good if a bit slow. Nexidia has really, really nice software. It's fast, and the processed data is dumped into a MySQL backend for reporting.
Nexidia was initially an Israeli intelligence application.
And all the voice samples from "OK Google" and SIRI are theirs to sort through.
After turning on the personal voice sampling (to make OK Google more accurate), the tablet never misses a beat. It knows exactly what I'm saying whether I just woke up, am in the middle of eating, or have a cold.
I also left an older version of the Nexus 7 tablet (first gen) off for several months. Upon turning it on, I was surprised to see it was "ready to install" a system update, one that I never approved downloading in the first place, and of course IT HAD BEEN OFF.
Meh, is anything really "off" any more? And what else does it do?
"If any question why we died, Tell them because our fathers lied."
It's not like the NSA can actually DO anything with this shit.
We know the government would like some positive press about anticipating events, but a perfect opportunity arose recently when two men opened fire outside a contest for Prophet Mohammed cartoons in a Dallas suburb Sunday night.
Reminds me of Dionne Warwick, who hawked the Psychic Friends Network. She ran out of money. Why didn't her friends warn her?
It little behooves the best of us to comment on the rest of us.
Perhaps it's time to change our standard telephone salutations to include the keywords they are looking for. :P
You can be sure they're not flagging "hello" or "goodbye". I propose all phone calls now be answered by saying "Death to the president.... Yes, this is Steve. No, I am not interested in changing my long distance provider."
The difference is, when people used to spy during peacetime, we didn't have the Internet. Now anybody who records router logs knows your web traffic history; basically we've all become spies. Except spies of the past were politically motivated; they were to be feared if you thought differently than whomever they worked for. Today, the IT guy fixing your computer remotely has the same technology a spy used, but he's just sipping coffee and running updates. The culture is what screwed it all up: people trusted their computer with their most intimate thoughts, love letters, hate letters, all of which are now being judged by what could best be described as a vengeful IT guy. That does one of two things: it either chills the person away from the Internet and recording, or forces them to put on a happy smile and agree with whatever these new spies are hunting. And now we learn that just not using a computer doesn't exclude you from the wide net of hateful strangers with opinions that could change your life. So yes indeed it's concerning. Perhaps we should all watch each other watching each other at the end, or maybe ... a good idea could be: bring Snowden home, hang a few evildoers, and don't make this whole ISIS thing so ... commercialized. We used to get into wars for moral reasons; now it's just like rooting for a football team. Or we can always take the path of least resistance and just continue to assume the public is a bunch of Duck Dynasty idiots who'll believe whatever they see on T.V. Where's Ron Burgundy?
If you want to search an audio or video recording, even a fairly poor speech to text can be very useful. A 90% success rate (1 word in 10 being incorrect) would provide a very frustrating transcript if you wanted to read it. However, if you are looking for a certain set of keywords or phrases, then 90% is likely to be perfectly adequate - after all, the point is to select "conversations of interest" that can then be listened to more intently.
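To put a rough number on that, here is a minimal sketch of why a 90% transcript is still fine for flagging. It assumes each word is misrecognized independently with the same probability, which real recognizers certainly don't guarantee, and the function names (`corrupt`, `is_flagged`, `detection_rate`, `analytic_rate`) are made up for illustration.

```python
import random

def corrupt(words, error_rate, rng):
    """Crude stand-in for recognition errors: each word is independently
    replaced by a junk token with probability error_rate."""
    return [w if rng.random() >= error_rate else "<err>" for w in words]

def is_flagged(words, keywords):
    """A transcript is flagged if at least one keyword survived transcription."""
    return any(w in keywords for w in words)

def detection_rate(words, keywords, error_rate, trials=10_000, seed=0):
    """Simulate many noisy transcriptions and report how often the call is flagged."""
    rng = random.Random(seed)
    hits = sum(is_flagged(corrupt(words, error_rate, rng), keywords)
               for _ in range(trials))
    return hits / trials

def analytic_rate(mentions, error_rate):
    """Chance that at least one of `mentions` keyword occurrences survives,
    under the same independence assumption."""
    return 1 - error_rate ** mentions

# Hypothetical conversation that mentions the keyword three times.
call = "meet me at the depot then drive to the depot and leave the depot".split()
simulated = detection_rate(call, {"depot"}, error_rate=0.1)
expected = analytic_rate(mentions=3, error_rate=0.1)
```

Under those assumptions a keyword said three times gets missed only about once in a thousand calls, even though one word in ten is wrong. The transcript would be painful to read, but the call still lands in the "listen to this one" pile.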
Human listening was clearly not going to be the solution. "There weren't enough ears," he said.
That's kind of what made the average Joe feel safe. He thought, they can't possibly listen to everyone, and there's nothing interesting about me.
This technology is the bomb! But, I will provide a colloquialism, ala Admiral Ackbar: "Take evasive action!" Incinerate any predisposition you may have to using keywords, like: bomb, infidel, jihad, Great Satan, etc. Instead, peace be upon you, and all your phone conversations.
sig: sauer
A keyword alone is meaningless. "They" have had this sort of search in mind for a very long time.
They inverse model the vocal apparatus as a fingerprint. They know who is talking on either side of the line, by name. Yes, even if it is grandma.
They use signal analytics to determine where the signal comes from. They know where the speakers are located.
Many conspiracy theorists have been salting the conversation for a long time. This is a known and likely long-solved problem.
There are many folks who are not allowed to have their voice recorded, or to speak on a non-classified line. This means multiple nations have this technology.
Interesting problems would include 1) being able to falsify the vocal apparatus in such a way that the voice-print recognition doesn't work, AND doesn't flag as not working and 2) creation and use of non-libraried phoneme/phonology/grammar sets so that recognition is not available by lookup.
In all likelihood, the false positives suggested by the OP and others in this discussion are unlikely to trigger any such NSA attention.
Coming from a data science background, I suspect they are transcribing and indexing all conversations as best as is possible with their elite voice recognition technology. Once it's in ASCII stored in a database, they can datamine the conversations of known radicals and jihadists. The algorithms that are generated don't so much emphasize specific keywords, but they generate a scoring system across a bunch of conversations by known haters-of-American-Freedom.
With filters in hand, they can look at who talked to the known villains and score them and run down the trails of phone calls, emails, text messages, and internet chats to see who else might be a solid villain candidate. Even just monitoring internet traffic to known jihadist websites can likely get the filters applied to a person's communications to see if they might be a person-of-interest.
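For anyone wondering what a "scoring system" like that might look like, here is a minimal sketch. It is pure guesswork about the general technique, not anything from the Snowden documents: it weights each word by how much more typical it is of a set of known-target transcripts than of ordinary background chatter (smoothed log-odds), then averages those weights over a new transcript. The function names and the toy sentences are all invented.

```python
import math
from collections import Counter

def term_weights(target_docs, background_docs, smoothing=1.0):
    """Smoothed log-odds per word: positive means the word is more typical
    of target_docs than of background_docs."""
    target = Counter(w for d in target_docs for w in d.lower().split())
    background = Counter(w for d in background_docs for w in d.lower().split())
    vocab = set(target) | set(background)
    t_total = sum(target.values()) + smoothing * len(vocab)
    b_total = sum(background.values()) + smoothing * len(vocab)
    return {
        w: math.log((target[w] + smoothing) / t_total)
           - math.log((background[w] + smoothing) / b_total)
        for w in vocab
    }

def score(transcript, weights):
    """Average weight per word; words never seen in training count as 0."""
    words = transcript.lower().split()
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)

# Toy, made-up training data.
known_target_calls = ["package arrives at the depot friday",
                      "move the package to the depot"]
ordinary_calls = ["what time is dinner friday",
                  "pick up the kids after dinner"]

weights = term_weights(known_target_calls, ordinary_calls)
suspicious = score("the package is at the depot", weights)
boring = score("dinner is after the kids", weights)
```

Note that no single word decides anything here. A call scores high only when its overall vocabulary looks like the known-target set, which is why salting a call with one or two scary words would barely move the number.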
Keywords will come into play AFTER an attack like the Garland Draw Mohammed contest. The NSA is right now filtering recent past conversations among suspected jihadists looking for relevant keywords such as 'Garland', 'American Freedom Defense Institute', 'Pamela Geller', and 'Elton Simpson'. Any conversation leading up to the attack including those keywords would absolutely put someone on a watchlist. And everyone who that person is talking to would be suspect as well.
Bottom line is, these tools are being used retroactively to bolster detective work. Talking about bombs and the President's name doesn't do anything because there are a thousand-million conversations using those words everyday.
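The retroactive part is cheap once everything is text. A minimal sketch, assuming the transcripts already sit in a dictionary keyed by a call ID (the IDs, the sample sentences, and the function names are hypothetical): build an inverted index once, then any new keyword that becomes interesting after the fact is a dictionary lookup rather than a rescan of every call.

```python
from collections import defaultdict

def build_index(transcripts):
    """Map each lowercase word to the set of call IDs whose transcript contains it."""
    index = defaultdict(set)
    for call_id, text in transcripts.items():
        for token in text.lower().split():
            word = token.strip(".,!?\"'")
            if word:
                index[word].add(call_id)
    return index

def search(index, keywords):
    """Return the call IDs that contain at least one of the given single-word keywords."""
    hits = set()
    for kw in keywords:
        hits |= index.get(kw.lower(), set())
    return hits

# Made-up stored transcripts.
stored = {
    "call-1": "Meet me in Garland on Sunday.",
    "call-2": "What's for dinner?",
    "call-3": "Garland is a long drive.",
}
index = build_index(stored)
after_the_fact = search(index, ["Garland"])
```

This toy version only handles single words. Multi-word names would need phrase matching on top, and the hits would still be handed to a human, which fits the "bolster detective work" point above.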
$5 / month hosted VPS on linux = awesome!
I'VE BEEN WAITING YEARS FOR THIS TO LEAK OUT!
I hoped that metadata would purposely be picked as the 1st phase so people realize how bad that is. But I knew they were listening to all our calls for a really long time now. They have special DSPs that handle many phone conversations at a time and transcribe them into text; the hardware and tech was declassified almost a decade ago, and while it may not have been Siri grade, it never needed to be (plus they don't declassify until they have something much better to replace it).
They don't need super massive data centers to store metadata, but to store the text transcriptions of everything said; that is another situation. They wouldn't even need to trash messages classified as worthless.
As for you being interesting and getting flagged for some reason: just a transcription error can cause you to be flagged for more attention. Still, that might turn out harmless, or you might be molested at the airport for years until finally removed from that list (but likely HMS screws up and you never get off that list even if you were only flagged for a few days).
Later on when times get bad; all that data could provide useful information to use against you-- right now you are a nobody but later on you might have somebody who doesn't like you in power. It doesn't have to be overt either... you could find yourself unable to get loans, jobs, etc with no clue as to why. (as long as they don't ban you from McDonalds, you'll never know and be too busy as you flip burgers at min wage jobs.)
Over most of history spying has saved lives more than taken them.
I find it so odd that people on Slashdot sing the praises of the "Codebreakers" of WWII but are shocked and freaked out that they are still around today. BTW the US and Britain both spied and used code breaking before the war started so... Yes they were spying in peacetime!!!!! Shocking.
No, spying has not saved more lives than it's cost. Spying is what caused countless Russians to be sent to Gulags, countless Jews to be sent to concentration camps, countless people from the DPRK to be killed because they disagreed with the "dear leader". In fact go back further in history and see how many lives spying cost throughout history.
The first problem is that you are attempting to claim foreign and domestic spying are the same. They are not the same, have never been the same. Domestic spying _always_ has nefarious purposes. We could argue similarly with foreign spying as well. How many people in DPRK has China spying caught? How many people in East Germany were caught by Russians?
JFK's famous line "the very word secrecy in a free and open society is repugnant" is spot on. It's not really hard to understand "why" if you look at the big picture.
-The wise argue that there are few absolutes, the fool argues that there are no probabilities.
Phrases are transcribed from things like:
"Government agencies like the NSA have illegally spied on countless americans without a warrant and continue to violate our constitutional right to free speech as well as privacy"
to:
"Isis anchor baby bombs are protected from harming american patrio-tastic freedom time parades using secret 007 spy gear to thwart kenyan muslim abortion terrorists who hate our freedom"
Good people go to bed earlier.
I thought they only recorded phone call metadata. Since when are they recording the content of phone calls? Do they record the content of all phone calls or just their targets?
Srsly, someone needs to write a long-format magazine article summarizing the current state of knowledge regarding the Snowden revelations. It's hard to keep up.
The problem with gadget security is it will always let you down and is why mass surveillance is counter productive. The larger the dataset, the harder it is to extract any useful information. When you're trying to process billions and billions of records, gadget security is your only option. It's a huge waste of effort and, as the Boston Marathon Bombers and those dead idiots in Texas proved, it's still relatively easy to slip through.
Terrorists are smart enough not to speak in plain language, so I don't get the NSA's addiction to mass surveillance. The tactics that work aren't sexy or easy.
That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
You voted for CHANGE, what you got was a change for the worse.
I call BS. I can't even get a device - of any power - to recognise my voice beyond the very slow, pronounced basics and I have to train myself to it (not the other way around).
Sorry to break it to you, but you're wrong.
For one, what you can't do, and what today's computers and networks certainly can - after being configured and programmed accordingly - is sample bazillions of phone calls from millions and millions of people at insane speeds, aggregate speech patterns and their written equivalents by searching for fitting existing transcripts, and do a weighted correlation of those. All with the support of speech- and language-optimized signal processing, sampling of regional habits and the target group's favorite set of vocabulary.
Guess why Apple's Siri and Google Now / Voice Search need an uplink to work ... exactly, that's why. ... And those devices pre-process the signals on a freaking cheapo smartphone before sending them in for analytics to get the results back.
Turning speech into easily searchable transcripts is probably a piece of cake by now for those who have the storage, processing power, access to unlimited phone taps and north of 20,000 mathematicians to program it all.
Like a certain U.S. three letter agency that has been getting so much unwanted attention lately.
We suffer more in our imagination than in reality. - Seneca
...and your comment represents the absolutely fundamental misunderstanding that pervades this discussion.
The truth no one wants to hear:
The distinction is no longer the technology or the place, but the person(s) using a capability: the target. In a free society based on the rule of law, it is not the technological capability to do a thing, but the law, that is paramount.
Gone are the days where the US targeted foreign communications on distant shores, or cracked codes used only by our enemies. No one would have questioned the legitimacy of the US and its allies breaking the German or Japanese codes or exploiting enemy communications equipment during WWII. The difference today is that US adversaries -- from terrorists to nation-states -- use many of the same systems, services, networks, operating systems, devices, software, hardware, cloud services, encryption standards, and so on, as Americans and much of the rest of the world. They use iPhones, Windows, Dell servers, Android tablets, Cisco routers, Netgear wireless access points, Twitter, Facebook, WhatsApp, Gmail, and so on.
US adversaries now often use the very same technologies we use. The fact that Americans or others also use them does not suddenly or magically mean that no element of the US Intelligence Community should ever target them. When a terrorist in Somalia is using Hotmail or an iPhone instead of a walkie-talkie, that cannot mean we pack our bags and go home. That means that, within clear and specific legal authorities and duly authorized statutory missions of the Intelligence Community, we aggressively pursue any and all possible avenues, within the law, that allow us to intercept and exploit the communications of foreign intelligence targets.
If they are using hand couriers, we target them. If they are using walkie-talkies, we target them. If they are using their own custom methods for protecting their communications, we target them. If they are using HF radios, VSATs, satellite phones, or smoke signals, we target them. If they are using Gmail, Windows, OS X, Facebook, iPhone, Android, SSL, web forums running on Amazon Web Services, etc., we target them -- within clear and specific legal frameworks that govern the way our intelligence agencies operate, including with regard to US Persons.
That doesn't mean it's always perfect; that doesn't mean things are not up for debate; that doesn't mean everyone will agree with every possible legal interpretation; that doesn't mean that some may not fundamentally disagree with the US approach to, e.g., counterterrorism. But the intelligence agencies do not make the rules, and while they may inform issues, they do not define national policy or priorities.
Without the authorities granted by the FISA Amendments Act of 2008 (FAA), the United States cannot target non-US Persons who are foreign intelligence targets if their communications enter, traverse, or otherwise touch the United States, a system within the United States, or, arguably, a system or network operated by a US corporation (i.e., a US Person) anywhere in the world. FAA in particular is almost exclusively focused on non-US Persons outside the US, who now exist in the same global web of digital communications as innocent Americans.
Without FAA, the very same Constitutional protections and warrant requirements reserved for US Persons would extend to foreign nations and foreign terrorists simply by using US networks and services -- whether intentionally or not. Without FAA, an individualized warrant would be required to collect on a foreign intelligence target using, say, Facebook, Gmail, or Yahoo!, or even exclusively foreign providers if their communications happen to enter the United States, as 70% of international internet traffic does. If you do not think there is a problem with this, there might be an even greater and more basic misunderstanding about how foreign SIGINT and cyber activities fundamentally must work.
If you believe NSA should not have these capabi
The mobile prepaid contract of my mother apparently locks down when the phone is unused for a month or so and she hates calling anybody with the mobile but still needs it occasionally when travelling. So she calls herself from mobile to landline every few weeks and tells herself hello and a few niceties.
Both metadata and content look quite weird for someone purportedly living alone.
There is a story (I've heard it from several sources over the years but I won't vouch for its veracity) about an early translation program that the US military commissioned, sometime in the sixties I think, that illustrates some of the problems. This program was meant to translate English to Russian and vice versa. At the demo all the higher-ups were there, typically not having a clue about the complexity or pitfalls of the task. One of them suggested that the phrase "The spirit is willing but the flesh is weak" be entered, translated into Russian, and the resulting phrase translated back to English. The technician did what he was told and I have visions of cabinet-sized tape drives spinning furiously for awhile until the processing was complete. Finally, the printer spat out the result. I hear the technician dropped the paper and ran away. When someone picked up the printout it said "The vodka is strong but the meat is rotten!"
Don't they keep insisting that they only collect metadata and not actual conversations? If they collect specific conversations with specific targeted people involved in actual crime, couldn't they just deal with those manually?
Anyone with a minimal level of training knows this, and uses methods that our intercepts won't catch.
We only catch the n00bZ.
And, in point of fact, the times we get people to give away things, they're not in the US, but in the Middle East (Saudi Arabia, Yemen, Pakistan mostly).
Intercepts in the US rarely catch anything useful, and have such a high level of red herrings we waste a lot of resources that would be otherwise used profitably overseas, not in the US itself.
-- Tigger warning: This post may contain tiggers! --
Think about this for a second. Why is this surprising?
I don't know about other people here, but I don't even check my voicemail anymore. Google handles that, and has for years. The voicemail transcription I get through Google Voice is almost always good enough that I can determine who called, what they want, and where to call them back to talk further.
Keep in mind, this is a 'free' service to me; I don't pay anything. Due to the volume of people they do it for, I'm certain they're trying to reach economies of scale and reduce the overhead. Who do you think funds the storage, equipment, etc. for all of this? AdSense?
And it's no secret that the NSA had early involvement with Google.
~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
Like Esperanto :-) or even better some obscure language like the Navaho Code,
... for starters.
Yes, it's not a solution to the problem but it's a start.
There is ZERO need to record ALL phone conversations, instead there must be a small list of persons which have their communications tapped, because of their political, criminal or terrorist affiliations.
But that is not what NSA wants. They want TOTAL COLLECTION, even against Americans and Germans. The computer and automatic transcription make it possible. And the tame public, who allow this Tshekist Shite to go on because of some Flimsy Excuses.
Have the people learned anything since Rome ? No, all they need is bread and the naked flesh of the Kardashians. Damn freedom.
Interesting problems would include 1) being able to falsify the vocal apparatus in such a way that the voice-print recognition doesn't work, AND doesn't flag as not working and 2) creation and use of non-libraried phoneme/phonology/grammar sets so that recognition is not available by lookup.
Or use plain simple end-to-end encryption. (Constant encryption all the way between the two correspondents)
Instead of using Skype (basically a black box, and back before Microsoft bought them, their EULA mentioned that they'd collaborate with any local law enforcement agency) or analog POTS, try using standards like SIP or XMPP/Jabber/Jingle with proper encryption (e.g., Jitsi is software that implements SRTP/ZRTP encryption).
Then anyone trying to tap into that communication will only get noise.
Not that it's impossible for the NSA to do anything against this (they'll happily try to abuse any backdoor that they know of at each end-point).
At least it will make it a bit less trivial for them to simply scan everything.
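To illustrate the "only get noise" part, here is a toy sketch. It is NOT SRTP or ZRTP and not something to use for real calls: it is a one-time pad (XOR with a random key as long as the message), and it simply assumes both ends already share the key somehow. The function name and the sample message are made up. The point is only that whatever sits on the wire between the two endpoints is indistinguishable from random bytes without the key.

```python
import secrets

def xor_bytes(data, key):
    """XOR data with a key of at least the same length (one-time pad).
    Applying it twice with the same key returns the original data."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

# Hypothetical chunk of a conversation.
message = b"see you at the usual place"

# Assumption: both correspondents already hold this key; a tap in the middle does not.
shared_key = secrets.token_bytes(len(message))

on_the_wire = xor_bytes(message, shared_key)    # what an eavesdropper would capture
recovered = xor_bytes(on_the_wire, shared_key)  # what the other endpoint decodes
```

Real protocols negotiate keys per call and use proper ciphers instead of a pre-shared pad, but the consequence for a speech-to-text pipeline is the same: there is nothing to transcribe unless one of the endpoints is compromised.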
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Right ON !
They record all calls, all e-mails, and all text messages.
That's why they need the massive data center in Bluffdale UT !
Blatant violation of the 4th amendment.
Freedom is toast in the USA.
I am torn between admiration of the technical brilliance of building software like this and horror as to how it is being used.
The technical brilliance of voice recognition combined with data mining need not be met with horror... All the horror can be reserved for the separate issue of mass surveillance.
Dialling random numbers from a public phone and saying "Is it done?", or "Man, you gotta help me, I did it but there's blood and brains everywhere!"
For the win.
Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
I was given a demo of a commercial archival/search product that searches audio files (standalone and also as email attachments) for words. It was surprisingly accurate.
There was a period of a month or so that I answered the phone (when caller-ID showed that it was a friend who was in on the joke) with things like "Kill the president" or "The dirty bomb is ready". ;-) This was after the news about Total Information Awareness came out (really, anyone who was surprised by the Snowden leaks hadn't been paying attention for the past decade prior... it was obvious what sort of capabilities they were working toward. Heck even back in the 1990s with Carnivore the US government was already starting down this path).
So if teh evil government are recording all this, why didn't they send someone round to check up on you and/or beat a confession out of you? It's almost like they only apply it to actual malefactors.
To have a right to do a thing is not at all the same as to be right in doing it
That's what all those detainees in Gitmo, and in the dark sites, are doing. It's prison labour, converting speech to text. The accuracy is terrible but the cycles are free!