Siri, Alexa, and Google Assistant Can Be Controlled By Inaudible Commands (venturebeat.com)
Apple's Siri, Amazon's Alexa, and Google's Assistant were meant to be controlled by live human voices, but all three AI assistants are susceptible to hidden commands undetectable to the human ear, researchers in China and the United States have discovered. From a report: The New York Times reports today that the assistants can be controlled using subsonic commands hidden in radio music, YouTube videos, or even white noise played over speakers, a potentially huge security risk for users. According to the report, the assistants can be made to dial phone numbers, launch websites, make purchases, and access smart home accessories -- such as door locks -- at the same time as human listeners are perceiving anything from completely different spoken text to recordings of music.
In some cases, assistants can be instructed to take pictures or send text messages, receiving commands from up to 25 feet away through a building's open windows. Researchers at Berkeley said that they can modestly alter audio files "to cancel out the sound that the speech recognition system was supposed to hear and replace it with a sound that would be transcribed differently by machines while being nearly undetectable to the human ear."
This is not "news" because it's not "new"
It's been known since September 2017: https://www.infosecurity-magazine.com/news/ultrasonic-dolphinattack-hack-voice/
Funny how the original research listed only Chinese researchers. Now, NYT attributes this research to some Berkeley guys, which is highly inaccurate. The DolphinAttack was the sole creation of the Chinese research team.
I wonder how long before we get inaudible malware / trolled -- Alexa add big hairy balls to my shopping list!
And really most of this stuff is just as bad even if it is audible. It just means one has to figure out when you aren't home before they hold a speaker up to your mail slot / under the door / up to a window.
And how are they going to secure it? Voiceprints -- we already have software that can defeat voiceprinting with a small sample. Passwords? That you have to say aloud every time you use the device? That's pretty much pointless.
This type of technology is fundamentally broken and from what I can see so far, it cannot be fixed.
TFA seems to indicate they believe this to be an unexpected and curious flaw in the software, but the fact that this works as well as it does, from up to 25 feet away, is inaudible to humans, and nearly all these PA devices can hear and respond to these types of ostensibly surreptitious commands.. well, maybe I'm paranoid, but maybe they just stumbled onto another NSA backdoor. Or even a Google/Apple/Amazon backdoor.
I find this creepy and suspicious as hell.
Look back up at my post, now look back down, you're on the Internet. Now look back up. I'm a signature.
If you have one of these, you are too stupid to continue as part of the human race. Please, for the benefit of the rest of us, off yourself. At least do not breed.
Researchers at Berkeley said that they can modestly alter audio files "to cancel out the sound that the speech recognition system was supposed to hear and replace it with a sound that would be transcribed differently by machines while being nearly undetectable to the human ear."
But did these so-called researchers see what Siri, Alexa, and Google Assistant do when they play the audio clip backwards? What kind of half-assed research is this?
Anyone know a good tool to play commands to Alexa in an inaudible range? My goals are mostly harmless.
"Alexa Simon Says, Kids go do your homework!"
That kind of thing.
"That's the way to do it" - Punch
foiled again.
-your friendly neighborhood government agency
The article points out the invasion of privacy aspects of some hacks. Secretly listening in and recording things is one way to do that. This is different. This is hacking the speaker to do things that are unwanted and unrequested by the human "owning" the device. Would that not fall under the various hacking, unauthorized access, etc. laws? Granted, the AI needs to know better, but I would think somebody doing this with nefarious purposes (or even not, if it's undisclosed otherwise) might violate one of those statutes.
They're already controlled by inaudible commands. Ethernet packets are silent. Do people think they "control" these things? How fucking stupid do you have to be to think that? Am I living in Douglas Adams's reality, where white mice are really running experiments on humans?
I don't respond to AC's.
In voice recognition, the first thing you usually do is apply filters to the signal, removing anything below 1kHz and above roughly 8kHz or 10kHz.
There is no way there can be a subliminal message in infrasound or ultrasound.
How would you actually "interpret it"? You would need a deliberate trojan horse/backdoor to translate a human voice sentence "transmitted" at infrasound into something the machine can interpret as a message; same for ultrasonic sounds. With infrasound you would probably even need to make the sentence much longer in time: you can never pack a high-pitched command yelled by a woman at around 8kHz and lasting 3 seconds into a 3-second inaudible infrasound command; it would more likely be 15 or 20 seconds long. And why would a machine pick that up if it had not been deliberately hacked with a backdoor to do so?
How do you transform a 3-second message into ultrasonic ranges without making it much, much shorter? Just shifting the frequency? Anyway, it would not go through the filters then.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
If you're aiming for humor I find it fell way short... your silent ethernet packets are aimed at the antenna, not the microphone, which is the subject of TFA.
The phones are susceptible to silent control VIA THE MIKE.
And as for white mice, I, for one, welcome our new Presidential Overlords, Pinky and the Brain. They've *got* to be better than what we've had since 1969!!!
The "Civilized World" jumped the shark ca. 1973.
According to reports a man could be heard yelling the phrase "Alexa open the front door" shortly before the TV was noticed missing.
A suspect was later apprehended with missing TV found in Frunk of his self-driving get away vehicle after it autonomously allided with an inanimate barrier.
His point is these devices are already controlled by the network and the mega corporations that control the device. Those corporations can instruct those devices to do whatever they wish. You don't "control" them, you just use them to get access to some of their functionality. I don't find that humorous myself.
How long before someone goes to jail because they can "start a nuclear war by whistling into an Amazon Echo"?
Yup, that's what I meant, thanks.
I don't respond to AC's.
Hi, former technician here.
I've been constructing and building so many robotic, listening devices, radio communication devices that I have enough under the belt to tell you that you don't really need to worry TOO much about all of that, at least not for now, here's why:
1) For this to be at all possible, the devices involved must meet a range of technical specifications and capabilities. For example, you have a mobile speaker that is specced to work within 20 Hz to 20KHz. Most of these will fail above 10KHz anyway, and for their purpose they don't need to be better than that. Headphones, however, are an entirely different case.
2) I've tested numerous microphones so small we're talking 2-3 mm size, and most of these failed to pick up frequencies above 20KHz. As a young person, you could potentially hear up to 24KHz (I could pick up 23KHz sounds when I was 18 and worked in an electronics store, we tested with a Function Generator and a Piezo speaker specced well above 28KHz). Today I can pick up around 16.5-17KHz, which is not bad for my age, but on the plus side, I don't need expensive headphones anymore.
3) We're talking about sounds inaudible to human ears here, therefore we're above the 20KHz range; to be entirely safe, we should be above 25KHz. Very few phones, televisions, computer speakers and whatnot are capable of vibrating or picking up vibrations at those speeds, therefore this kind of communication in that frequency spectrum would fail drastically.
What you COULD do tho, is use the upper audible frequency spectrum of say just above 10KHz and mix it with existing sounds, time it correctly with proper known synchronization (remember the old modems and their sounds? Now imagine a much higher pitch) - and albeit quite slow, it would still be possible to use it to trigger commands, communicate short messages etc. Anything needing more bandwidth than this would be impractical. You wouldn't notice this; the sound could technically be picked up by ear if it went on too long, but with just a split second here and there, in a sequence not spaced too closely, you'd be able to get away with it, possibly disguised by music or voice. You'd still need some form of "trigger" sequence so the receiver knows when to start reading, otherwise you'd get timing errors. Kinda like "fast Morse code" if you like.
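For anyone curious, the "fast Morse code" idea above can be sketched as simple on-off keying of a high-pitched tone. This is only a toy under stated assumptions: the 15 kHz tone, the 10 ms symbol length, the 48 kHz sample rate and the 4-bit trigger pattern are all made-up values, and the receiver is assumed to be perfectly aligned with the sender on a noise-free channel. A real acoustic link would need far more than this.

```python
import cmath
import math

FS = 48_000            # sample rate in Hz (assumed)
TONE_HZ = 15_000       # high-pitched signalling tone (assumed)
SYMBOL_LEN = 480       # samples per bit: 10 ms, a whole number of tone cycles
PREAMBLE = [1, 0, 1, 1]  # hypothetical "trigger" sequence that marks a message

def encode(bits):
    """Turn bits into audio samples: tone burst for 1, silence for 0."""
    samples = []
    for bit in PREAMBLE + list(bits):
        for n in range(SYMBOL_LEN):
            samples.append(bit * math.sin(2 * math.pi * TONE_HZ * n / FS))
    return samples

def tone_level(window):
    """Amplitude of the TONE_HZ component in one symbol window (single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * TONE_HZ * n / FS)
              for n, x in enumerate(window))
    return 2 * abs(acc) / len(window)

def decode(samples):
    """Recover the payload bits, or None if the trigger sequence is missing."""
    bits = []
    for start in range(0, len(samples) - SYMBOL_LEN + 1, SYMBOL_LEN):
        window = samples[start:start + SYMBOL_LEN]
        bits.append(1 if tone_level(window) > 0.5 else 0)
    if bits[:len(PREAMBLE)] != PREAMBLE:
        return None
    return bits[len(PREAMBLE):]

message = [1, 0, 0, 1, 1, 0, 1, 0]
received = decode(encode(message))
silence = decode([0.0] * (SYMBOL_LEN * 6))
```

At 10 ms per bit this moves about 100 bits per second, which fits the "quite slow, short messages only" point. The trigger check is what keeps the decoder from treating arbitrary audio as a message.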
If you're worried about eavesdropping, you should be far more concerned with your home's windows - those are like giant eardrums, and light hitting those will create a small vibration of the reflected light, this tech has been known for years, you just don't hear about it very often.
What this world is coming to - is for you and me to decide.
The only thing that will disable this is cutting power to the internal microphone. Windows themselves are one of the ways we used to "hear" conversations, typing (which can also be picked up by your cellphone and any device with a microphone, as well as nearby vibration sensors in your cellphone).
Even inaudible humming frequently can be translated.
Just don't install devices in your tin foil shielded and sound baffled escape room, and make sure it's not just airgapped but it's also without fans.
(thinks about people failing to get how air works, or what sound is, and how useless all of this is to virtually everyone)
-- Tigger warning: This post may contain tiggers! --
Same thing as the image detection routine hack that was recently published. Looks like a house, AI sees a dog.
Am I living in Douglas Adams's reality, where white mice are really running experiments on humans?
Of course not.
They're brown mice. Kind of a chestnut brown. The white mice thing was a ruse so you'd choose the wrong observers.
-- Tigger warning: This post may contain tiggers! --
So potentially malicious actors could stand outside my door with speakers and get my Alexa to...do what, exactly? Play my Spotify playlist? If they're already on my property blasting speakers at me, shouldn't I worry more that they might steal something?
This is a panic over nonsense.
framing.
"ok google, find me some child porn"
"ok google, message and say are we still going to have sex again tonight?"
"ok google, send a message to the sheriff that says I love kiddie porn and abusing children, please come arrest me"
and of course, have your phone text-to-speech this post to you near any google device.
The basic form of this problem was solved long ago by using user accounts and permissions to give everyone their own preferences and storage spaces and dictate who has access to what resources. It just needs to be extended to these assistant devices by using voice recognition. Then any attack would have to be personalized for you, which defeats any attack trying to cast a wide net. Personalized attacks would have to be addressed by having the assistant verify that it hears a real voice from a previously identified user and not a synthetic voice that's been shifted into an inaudible range or whatever.
So Alexa is not only a dumb liberal (https://m.youtube.com/watch?v=MECcIJW67-M),
but also a whore cheating on you whenever she can.
Interesting...
I know you can change the "wake word" for Alexa, though you can't change it to anything other than a few words on a list.
If all of these devices had the ability to change to a truly customized wake word, it would be harder for an attacker to activate the device.
I don't own one, but if I did I'd change the wake word to "Shit head". Hey, shit head, what's the weather like today?
The journal article can be found at https://arxiv.org/pdf/1708.09537.pdf
The key is that the microphone is non-linear. The desired command is AM modulated on a 20kHz subcarrier, and the nonlinearity in the microphone demodulates it before the low-pass filter can filter it out.
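A rough numerical sketch of that mechanism, for the skeptics. Everything here is a simplifying assumption rather than the paper's setup: the "voice" is a single 400 Hz tone, the carrier is 25 kHz, the simulation runs at 192 kHz, and the microphone is modeled as output = s + A2 * s^2 with a made-up A2 = 0.1. The point is only that the transmitted signal has no energy at 400 Hz, while the signal after the nonlinearity does.

```python
import cmath
import math

FS = 192_000        # simulation sample rate in Hz (assumed)
N = 19_200          # 0.1 s of samples; every tone fits a whole number of cycles
F_VOICE = 400       # stand-in for the voice command: one audible tone
F_CARRIER = 25_000  # ultrasonic carrier (assumed value)
A2 = 0.1            # strength of the quadratic term in the mic model (hypothetical)

def tone_amplitude(samples, freq):
    """Amplitude of the component at `freq`, measured with a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, x in enumerate(samples))
    return 2 * abs(acc) / len(samples)

t = [n / FS for n in range(N)]
voice = [math.cos(2 * math.pi * F_VOICE * ti) for ti in t]

# Amplitude modulation: all transmitted energy sits at 24.6, 25.0 and 25.4 kHz.
transmitted = [(1 + v) * math.cos(2 * math.pi * F_CARRIER * ti)
               for v, ti in zip(voice, t)]

# Toy nonlinear microphone: the squared term mixes the carrier with its sidebands.
mic_output = [s + A2 * s * s for s in transmitted]

before = tone_amplitude(transmitted, F_VOICE)  # audible-band content in the air
after = tone_amplitude(mic_output, F_VOICE)    # audible-band content after the mic

print(f"400 Hz amplitude in transmitted signal: {before:.6f}")
print(f"400 Hz amplitude after nonlinearity:    {after:.6f}")
```

In this model the squared term expands to a baseband part containing the original tone scaled by A2, plus copies around twice the carrier frequency. A low-pass filter placed after the nonlinearity removes the carrier and the high copies but keeps the 400 Hz tone, so filtering "anything above 8kHz" does not help once the demodulation has already happened.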
Taking the word "control" out of the context of the article is mostly what I'm getting out of this take...
number9number9number9number9
Wait for it ... BECAUSE I DON'T FRIGGIN HAVE an open mic product in my house ...
There. I feel better now.
vibrator to earthquake
Shitty tech journalism is shitty.
My Pixel 2 can't even hear me when it's in my pocket, so I'm not overly concerned
And yes, even though I disable the "ok Google..." hotword on my phone, I know people say that the NSA/FBI/CIA/Whatever can still spy on me through it, but I view that differently. If the government's men in black want to get me, they're going to find a way to get me.
There's a difference between something that can be done by some large corporations that don't want to scare away customers, and something that can be done by anyone with a little technology from outside if your window is open.
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
So these are sounds that humans cannot physically hear? And that travel slower than the speed of sound? Or maybe they are infrasound?
Words don't always mean what writers think they mean. Next time try "unintelligible" and "subliminal".