Alexa Scientists Claim Audio Watermarking Technique Nearing 100% Accuracy (venturebeat.com)
georgecarlyle76 brought our attention to Amazon's claim of an algorithm that "solves the 'second-screen problem' in real-time."
"Ever hear (no pun intended) of audio watermarking?" asks VentureBeat. It's the process of adding distinctive sound patterns identifiable to PCs, and it's a major way web video hosts, set-top boxes, and media players spot copyrighted tracks. But watermarking schemes aren't particularly reliable in noisy environments, like when the audio in question is broadcasted over a loudspeaker. The resulting noise and interference -- referred to in academic literature as the "second-screen" problem -- severely distorts watermarks, and introduces delays that detectors often struggle to reconcile. Researchers at Amazon, though, believe they've pioneered a novel workaround, which they describe in a paper newly published on the preprint server Arxiv ("Audio Watermarking over the Air with Modulated Self-Correlation") and an accompanying blog post. The team claims their method -- which they'll detail at the International Conference on Acoustics, Speech, and Signal Processing in May -- can detect watermarks added to about two seconds of audio with "almost perfect accuracy," even when the distance between the speaker and detector is greater than 20 feet...
So how's it work? As Tai explains, the model employs a "spread-spectrum" technique in which watermark energy is spread across time and frequency, rendering it inaudible to human ears while robustifying it against postprocessing (like compression). And it generates watermarks from noise blocks of a fixed duration, each of which introduces its own distinct pattern to selected frequency components in the host audio signal. Conventional detectors would compare the resulting sequence of noise blocks -- the decoding key -- with a reference copy. But Tai and colleagues take a different approach: Their algorithm embeds the noise pattern in the audio signal multiple times and compares it to itself. Because said signal passes through the same acoustic environment, Tai explains, instances of the pattern are distorted in similar ways, enabling them to be compared directly. "The detector takes advantage of the distortion due to the acoustic channel, rather than combatting it," he added.
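To make the self-comparison idea concrete, here is a minimal Python sketch. It is not Amazon's algorithm, and the paper's modulation scheme is not reproduced. It only illustrates why two copies of one noise pattern, sent through the same room, still correlate with each other when no reference key is available. The function names, the toy room model, the white-noise stand-in for host audio, the block length, and the exaggerated watermark strength are all assumptions made for illustration.

```python
import numpy as np

def embed(host, pattern, strength):
    """Add the same noise pattern to two consecutive blocks of the host signal."""
    n = len(pattern)
    marked = host.copy()
    marked[:n] += strength * pattern
    marked[n:2 * n] += strength * pattern
    return marked

def toy_room_response(rng, taps=400):
    """Hypothetical room: a random, exponentially decaying impulse response."""
    h = rng.standard_normal(taps) * np.exp(-np.arange(taps) / 80.0)
    h[0] = 1.0  # direct path from loudspeaker to microphone
    return h

def through_room(signal, h, rng, noise_level=0.1):
    """Apply the reverberation and add a little background noise."""
    out = np.convolve(signal, h)[:len(signal)]
    return out + noise_level * rng.standard_normal(len(out))

def self_correlation(received, n):
    """Normalized correlation between two consecutive blocks of the received
    signal. No copy of the original pattern (the key) is used."""
    a, b = received[:n], received[n:2 * n]
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

rng = np.random.default_rng(0)
n = 16000                              # assumed block length in samples
pattern = rng.standard_normal(n)       # the watermark noise block
host = rng.standard_normal(2 * n)      # white noise standing in for audio
h = toy_room_response(rng)

# strength=0.5 is far too loud to be inaudible; it keeps the demo short.
marked = through_room(embed(host, pattern, strength=0.5), h, rng)
clean = through_room(host, h, rng)

score_marked = self_correlation(marked, n)
score_clean = self_correlation(clean, n)
print("watermarked:", round(score_marked, 3), "unmarked:", round(score_clean, 3))
```

Both copies of the pattern are smeared by the same impulse response, so they stay similar to each other even though neither resembles the original pattern any more. The unmarked signal has no repeated content, so its score stays near zero.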
"Audio content that Alexa plays -- music, audiobooks, podcasts, radio broadcasts, movies -- could be watermarked on the fly," explains Amazon's blog post. It argues that this could be useful "so that Alexa-enabled devices can better gauge room reverberation and filter out echoes."
"Ever hear (no pun intended) of audio watermarking?" asks VentureBeat. It's the process of adding distinctive sound patterns identifiable to PCs, and it's a major way web video hosts, set-top boxes, and media players spot copyrighted tracks. But watermarking schemes aren't particularly reliable in noisy environments, like when the audio in question is broadcasted over a loudspeaker. The resulting noise and interference -- referred to in academic literature as the "second-screen" problem -- severely distorts watermarks, and introduces delays that detectors often struggle to reconcile. Researchers at Amazon, though, believe they've pioneered a novel workaround, which they describe in a paper newly published on the preprint server Arxiv ("Audio Watermarking over the Air with Modulated Self-Correlation") and an accompanying blog post. The team claims their method -- which they'll detail at the International Conference on Acoustics, Speech, and Signal Processing in May -- can detect watermarks added to about two seconds of audio with "almost perfect accuracy," even when the distance between the speaker and detector is greater than 20 feet...
So how's it work? As Tai explains, the model employs a "spread-spectrum" technique in which watermark energy is spread across time and frequency, rendering it inaudible to human ears while robustifying it against postprocessing (like compression). And it generates watermarks from noise blocks of a fixed duration, each of which introduces its own distinct pattern to selected frequency components in the host audio signal. Conventional detectors would compare the resulting sequence of noise blocks -- the decoding key -- with a reference copy. But Tai and colleagues take a different approach: Their algorithm embeds the noise pattern in the audio signal multiple times and compares it to itself. Because said signal passes through the same acoustic environment, Tai explains, instances of the pattern are distorted in similar ways, enabling them to be compared directly. "The detector takes advantage of the distortion due to the acoustic channel, rather than combatting it," he added.
"Audio content that Alexa plays -- music, audiobooks, podcasts, radio broadcasts, movies -- could be watermarked on the fly," explains Amazon's blog post. It argues that this could be useful "so that Alexa-enabled devices can better gauge room reverberation and filter out echoes."
Yet another reason to buy physical media and keep you out of my house. What's next? Alexa patents method to watermark air so she can charge you for the right to breathe?
Is that even English? I guess it is if I figured out what he meant.
But it certainly sounds lazy.
Add low-intensity, imperceptible white noise to stream, destroy any chance of any detection. Profit.
Fuck those particular scientists and their vapid corporate subservience. The world would be a better place if their mothers would have had abortions.
Why would he lie??
Their algorithm embeds the noise pattern in the audio signal multiple times and compares it to itself.
they are going to make the watermark "more detectable in noisy environments" without making it more detectable to the listener. There is already much discussion of how current watermarks are commonly audible in otherwise high fidelity music files.
https://www.mattmontag.com/mus...
Of course, nobody is ever going to use Alexa for anything remotely related to hifi, but this is certainly not something we would want to see spread anywhere else.
Prove it - Open Source The Code.
CAPTCHA: lipstick
No word on false positives, so I'm going to assume those are 100% too. 100% everywhere for everything! Perfect score!
1. Spy agencies around the world will now be able to effectively eavesdrop on every public conversation and, apparently with "100%" accuracy, convert the speech to text and store in an easy-to-search manner.
2. Elite sites like YouTube and Facebook will be able to "100%" detect every bit of copyrighted material and ban and block anyone who posts, say, more than 5 seconds of something, even if it's in a home video.
The Amazon devs need to be put in prison either now or after the Collapse / Revolution.
You mean those DTMF sequences with silence before and after? Yea, totally inconspicuous. And there's totally nothing you can do against it like say... replace it with silence?
Why would anyone buy a device that might refuse to play any damn sound file you point it at? As if I needed another reason not to buy one of these things.
Laws are rules for the court, but merely a bottom bar to hit for life. Think beyond laws in your actions always.
You misgendered her. Misgendering a normal person isn't a crime, but because she is a trannie, you have committed high treason against the internet. I hereby sentence you to Barbara Hudson for life, without the possibility of escape.
Reading the article title ... Audio Watermarking Algorithm Is First to Solve "Second-Screen Problem" in Real Time
Here's a clue stick. Instead of treating symptoms how about addressing the cause, namely:
a) availability (lack of legal availability), and
b) price (due to expensive licensing)
because Piracy "solves" those two problems. Treating the symptom, audio watermarking, is not going to stop people from sharing music. Content sharing is called free advertising -- or am I in "violation" because the rest of my family can listen to my music even though only I paid for it? If you don't want people to share it, then don't release it. Real simple.
Maybe it is time to bring back Sneaker Net ?
I always enjoy seeing idiots whine about how we should not go to space or get better Iphones or whatever the fuck because the entire world is not working full time on whatever their personal grievance is.
Hey, guess what, moron? There are billions of people on the planet and having all of them work on cancer is just stupid. Putting aside the pedantic questions about who would grow food, if we put every scientist, engineer, and actually smart person on cancer then absolutely not a single second earlier would cancer be cured. You know why? Life is not a video game where you get to assign people. People do what they feel like, not what some net troll wants them to do. And furthermore even if all of them really did want to join your fantasy search for cancer most do not have the skill, education, background or ability to advance the science of cancer research. It would be a waste of resources and actually put cancer research behind if computer scientists and acoustic engineers switched fields and started sucking up cancer research dollars.
You are a moron and a troll but I am bored and felt like setting your dumbass straight anyway.
why are you so hateful
i know the title has it in there but it wasn't really a post about cancer, calm down
I'm not sure how this would work without an Alexa spy device in the house.
I read the headline and thought it was about making Alexa work nicely in an environment where it plays loud music.
Apple's HomePod does that very nicely. Instead of adding a watermark, it compares the signal entering its microphones with the signal leaving the speakers, so if you have loud music playing through your HomePod, it can eliminate that music almost completely before it starts speech recognition.
Next, if some person in the room says "Hey, Siri", it analyses the voice of the person saying the words, and eliminates what anyone else in the room is saying. Apple published a paper about this, and has some demos somewhere. One is very loud music in a room with many people talking. Phase 1 eliminates music, leaving many people talking and a bit of white noise. Phase 2 eliminates the voices of anyone except the person saying "Hey, Siri" and what's left is one perfectly recognisable voice, plus a bit more white noise. So "Hey Siri" works with loud music as long as it is played by the HomePod, and lots of people talking. What Amazon is planning here, on the other hand, doesn't seem to be something that any of the customers buying Alexa is asking for.
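The first phase described above, removing the device's own playback from what its microphones hear, is a standard acoustic echo cancellation task. Below is a minimal Python sketch using a normalized LMS (NLMS) adaptive filter. It is a generic textbook technique chosen for illustration, not Apple's published method, and the function name, filter length, step size, and synthetic signals are all assumptions.

```python
import numpy as np

def nlms_echo_cancel(mic, playback, taps=64, mu=0.2, eps=1e-8):
    """Estimate the speaker-to-microphone echo path with an NLMS adaptive
    filter and subtract the predicted echo from the microphone signal."""
    weights = np.zeros(taps)   # running estimate of the echo path
    recent = np.zeros(taps)    # most recent playback samples, newest first
    cleaned = np.zeros(len(mic))
    for i in range(len(mic)):
        recent = np.roll(recent, 1)
        recent[0] = playback[i]
        error = mic[i] - weights @ recent      # mic minus predicted echo
        weights += mu * error * recent / (recent @ recent + eps)
        cleaned[i] = error
    return cleaned

rng = np.random.default_rng(1)
n = 8000
playback = rng.standard_normal(n)              # stand-in for loud music
voice = 0.1 * np.sin(2 * np.pi * 200 * np.arange(n) / 16000)  # quiet talker

# Hypothetical echo path from loudspeaker to microphone (32 taps).
echo_path = rng.standard_normal(32) * np.exp(-np.arange(32) / 8.0)
echo_path[0] = 1.0
echo = np.convolve(playback, echo_path)[:n]
mic = echo + voice

cleaned = nlms_echo_cancel(mic, playback)

# Compare leftover echo with the original echo once the filter has converged.
half = n // 2
echo_power = float(np.mean(echo[half:] ** 2))
residual_power = float(np.mean((cleaned[half:] - voice[half:]) ** 2))
print("echo power:", echo_power, "residual power:", residual_power)
```

The filter only needs the signal the device itself is playing, which is why this approach works for the device's own output but not for sound from an unrelated TV in the same room.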
Why does enforcing Copyright policy in the information age seem like ramming a square peg into a round hole?
If it previously was 51% accurate and now it is 52% accurate, does that qualify as "nearing 100%?"
I used to work at a startup where the CEO issued press releases saying our user base was "approaching 1 million." Not sure if having only 50,000 users qualifies but that's what he did...
I think Amazon's motive is to come up with a way for a home theater amp, TV, media device, etc. to watermark its audio output in a way that tells a listening Alexa-implementing device, "don't be triggered by THIS SPECIFIC audio" (so every time someone on TV says, "Alexa" the device WON'T be triggered).
The catch is, the device needs a way to distinguish between "media audio" (that should NOT trigger it) and people in the room (who should ALWAYS be able to trigger it, even while watching a TV show or movie with the 'ignore me' watermark).
It has to be something that a device on the consumer end can add, because remastering a century's worth of media to add it at the content-producer's end just plain isn't going to happen.
Amazon is painfully aware of the "TV triggered Alexa" problem. It's not just annoying, it's a real potential vulnerability (mitigated mostly by the fact that buying radio & TV ads is both expensive & non-anonymous, so an ad that INTENTIONALLY tried to exploit it would get the advertiser sued). They don't want to just overlay a "dumb" "ignore everything for {n}ms" ultrasonic tone burst, because THAT could be abused as well (say, by advertisers who wanted to prevent an Alexa-controlled device from accepting commands from ANYONE during the ad). So... it needs to be:
* specific to media being played in the presence of an Alexa-implementing device
* able to be injected at the consumer end, and something that could cheaply be added to something like a blu-ray player (ideally, lightweight enough to implement as a firmware update to existing players).
* NOT affect verbal commands from humans in the room.
Incidentally, I believe Amazon initially considered trying to use Cinavia for this purpose (since it's already present in many movies), but quickly realized it would cause more problems than it solved. Cinavia was designed to robustly (and indiscriminately) scream, "stop recording!", not "ahem... please don't attempt speech-recognition on THIS SPECIFIC audio". If Echo ignored 'Alexa' for {n} seconds after recognizing a Cinavia watermark, mere playback of Cinavia-watermarked content within listening range would effectively disable the use of 'Alexa' entirely for those {n} seconds. Ergo, Amazon had to come up with something better.
To wit, this is NOT about imposing DRM. It's about preventing media content from triggering the device by having someone on-screen say 'Alexa', by giving the device a way to distinguish BETWEEN media content and local users.
I want Alexa verifying that I've paid for what I'm listening to. I also want Alexa verifying that I'm renting all my music and video from Amazon.
This is yet another reason why I don't have Alexa, Siri, or Google's thing in my house or car.
I suspect that this will be mandatory soon, much like the BBC license in the UK.
Well, that's nice and all for the "TV", but it won't do anything to stop other humans (like me) from doing things with strangers' Alexa gadgets. I have a neighbor that routinely leaves his "smart speaker" on too loudly. I just shut it off through the door. If he continues, I'll just start ordering large tubs of lube and rubbers for him.
Also, when I go into somebody's house, I'll ask them to turn off any of their recording devices. Just to make sure, I will also try to order some escorts via a smart speaker. It works 100% of the time.
I don't respond to AC's.
The job of lossy audio compression algorithms is to discard inaudible characteristics of the signal.
Any watermark that's inaudible is, by definition, a candidate.
If I'm applying a watermark and compression at the same time, I can teach my compression software that the watermark signal is an important characteristic not to be discarded.
But if someone recompresses the audio?
Doing THIS.
Hello, 1995 called and wants to return.
Sorry Amazon you can go eat shit with this.
It will get hacked and what then eh?
Sign our lives away just to listen to the latest bit of gangsta rap or drill?
I don't see this as a problem that needs solving. If there's a copyrighted soundtrack playing in a noisy environment, then quite obviously the music (1) is secondary, tertiary, or non-essential to whatever else is going on in the video and thus not a copyright violation, and (2) is not a reproduction someone wanting an illegal copy of the copyrighted work would be interested in. So there's zero reason for a copyright holder to even want to detect it. It would result in stupid things like people getting a copyright strike on their YouTube video because some car passing by in the background has the radio on playing a song.
The only use I can think of for this is to figure out what music and TV shows you like by eavesdropping on what your Alexa / Google Home device's always-on microphone picks up as you play the radio or TV in your home.
can better gauge room reverberation and filter out echoes
For some reason, I read that statement and immediately felt that the speaker was more likely to be thinking about how much more accurately they can measure the room for many other reasons. I guess I'm getting more cynical.
Rooms have audio signatures. Those signatures are altered by how many people are in the room and where they are at. How much more information will they now be able to gain about the room and its contents?
If you're going to listen to or watch anything that may be of "questionable" provenance, better put ear muffs on your Alexa first or she might tattle.
Please sue your grade school for malpractice.
---- The above post was generated by the Turing Institute. Maybe.
Can't you just "return true;"? I bet that would be a lot more accurate. Surely damn near everything is copyrighted.
If a player is certified BD-compliant, it will look for, and respect, Cinavia.
The rest of the world doesn't. So, if you rip a BD disk and play it in, say, Kodi or VLC, Cinavia doesn't matter. Play it on an Oppo player, though, and you may get a surprise.
The Beatles (yes, that band) spent a fortune on a similar scheme to protect their music on tape and record. Didn't work then and won't work now. Here's why.
You watermark- which implies a method to detect the watermark. So the pirate plays with the watermarked source until the watermark detector fails. Game over.
Only if the watermark detector is kept a secret (NOT the watermark maker) can the method enjoy some protection- but clearly this cannot happen.
So how do these cons work? They rely on the fact that the business man buying into the scheme is technically stupid - always a pretty good bet. Good in biz tends to correlate with thick in maths/science.
PS read the article again and notice something VERY creepy. The real intent is using microphones in PRIVATE homes spying on the people there to see if they are playing pirated material. Another reason not to have an NSA linked microphone from Google or Amazon.
Even on all other 364 days of the year
I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
We could have devoted our research efforts to curing disease. Or reducing pollution. Or improving energy efficiency. Or even building better killer robots.
But no. We spent millions in research money on... Preventing unauthorized enjoyment of music.
Capitalism FTW! Fuck you, plebs, that's why!
Doesn't this need a microphone to work? If there's no input from the noisy room, there's no way to compare the outgoing signal to the ambient sound. This is just another argument against IOT devices that snoop on you.
ion. So if copyrighted work is shared, they'll know who it was. More Evil of course.
So figure out the frequencies and generate filters?
I'll have to set up a full scope tonight and play around.
By the way if you own one of these spy devices (Smart TV, Alexa etc...) you are the problem.