How Google's Pixel 2 'Now Playing' Song Identification Works (venturebeat.com)

Posted by BeauHD on Thursday October 19, 2017 @01:00PM from the behind-the-scenes dept.

An anonymous reader shares a report from VentureBeat, written by Emil Protalinski: The most interesting Google Pixel 2 and Pixel 2 XL feature, to me, is Now Playing. If you've ever used Shazam or SoundHound, you probably understand the basics: The app uses your device's microphone to capture an audio sample and creates an acoustic fingerprint to compare against a central song database. If a match is found, information such as the song title and artist is sent back to the user. Now Playing achieves this with two important differentiators. First, Now Playing detects songs automatically without you explicitly asking -- the feature works when your phone is locked and the information is displayed on the Pixel 2's lock screen (you'll eventually be able to ask Google Assistant what's currently playing, but not yet). Secondly, it's an on-device and local feature: Now Playing functions completely offline (we tested this, and indeed it works with mobile data and Wi-Fi turned off). No audio is ever sent to Google.

Re:How big is an "acoustic fingerprint"? by BradleyUffner · 2017-10-19 13:40 · Score: 5, Informative

Yet another lump of unremovable pre-installed stuff taking precious space on your phone.

The Google spokesperson wouldn’t give us an exact size for the database file (which is not surprising, since it changes every week and is based on your country) but did say the whole feature should take up less than 500MB. Again, if you never turn the feature on, don’t worry — you won’t lose this space.
If you don't turn it on, it doesn't ever download the fingerprint database.
Re:works offline? by lucm · 2017-10-19 13:42 · Score: 5, Informative

How in the actual fuck is this possible? They have an audio signature of every song built in?
Yes. And this is not surprising; the data needed to identify songs is tiny. Essentially it's just vectors (big numerical arrays), they don't need to store the whole mp3.
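To make the "it's just vectors" point concrete, here is a minimal sketch of a toy fingerprint. It is not Google's method (the article does not describe one); the sample rate, the probe frequencies, the song names, and every function name below are assumptions made up for illustration. Each "song" is reduced to six numbers, and a quieter, hum-contaminated recording is matched by cosine similarity.

```python
import math

SAMPLE_RATE = 8000                        # Hz, assumed for this toy example
BANDS = [220, 330, 440, 523, 660, 880]    # probe frequencies in Hz, chosen arbitrarily


def fingerprint(samples):
    """Reduce raw samples to a short unit-length vector of band magnitudes."""
    vec = []
    for freq in BANDS:
        step = 2 * math.pi * freq / SAMPLE_RATE
        re = sum(s * math.cos(step * n) for n, s in enumerate(samples))
        im = sum(s * math.sin(step * n) for n, s in enumerate(samples))
        vec.append(math.hypot(re, im))
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def similarity(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def tone(freqs, seconds=0.25):
    """Synthesize a stand-in 'song' as a sum of sine waves."""
    count = int(SAMPLE_RATE * seconds)
    return [sum(math.sin(2 * math.pi * f * n / SAMPLE_RATE) for f in freqs)
            for n in range(count)]


def best_match(samples, database):
    """Return the name of the stored fingerprint closest to the recording."""
    probe = fingerprint(samples)
    return max(database, key=lambda name: similarity(probe, database[name]))


# Hypothetical three-song database: six floats per song instead of 2000 samples.
database = {
    "song_a": fingerprint(tone([220, 440])),
    "song_b": fingerprint(tone([330, 660])),
    "song_c": fingerprint(tone([523, 880])),
}

# A quieter recording of song_b with a 100 Hz hum mixed in.
heard = [0.2 * s + 0.05 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
         for n, s in enumerate(tone([330, 660]))]

print(best_match(heard, database))
```

A real system would need fingerprints that survive noise, tempo differences, and partial clips, but the storage argument is the same: the database holds short vectors, not audio.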
More and more can be done locally on the devices. For instance, look at what is actually needed to detect English speech using CMU sphinx:
https://github.com/cmusphinx/p...
(look at the hmm model)
This used to require huge computing power and storage, but now it can work on a mobile device.
Another example: once upon a time you needed Google datacenters to do gender and age recognition on photos. Now you can download pre-trained models for that, and the result can fit on a mobile device. Or you can download the entire dataset (500k photos of celebs) and train it yourself on your own servers;
https://data.vision.ee.ethz.ch...
Or you want a model to recognize basically any kind of object in a photo?
https://github.com/tensorflow/...
(there's a model specifically designed to run on mobile devices)
I know it's disturbing, but this is where things are today. Just a few years ago, this XKCD comic was true:
https://xkcd.com/1425/
Now you can actually download the code and models to do that completely offline and in a few ms.

--
lucm, indeed.
Re:OK by Dan+East · 2017-10-19 14:28 · Score: 4, Interesting

Although I think you're being funny, no, this couldn't be used in that way. Noise cancelling headphones work by using destructive interference, which requires an exact opposite waveform of the sound being cancelled out. Since the analog waveform of the music would be affected by any number of factors (the quality of the speakers playing it, the equalizer settings of their audio equipment, the bitrate of their source, the echoing of the sound off various objects, multiple speakers playing the audio, which would result in multiple "copies" of the music reaching your ear just very slightly delayed from one another, etc, etc), you couldn't use a "canned" waveform (the original MP3) to cancel out the actual waveform reaching your ears.
Now, while it might be possible, using AI, to try to do a best match of the ambient sound against a canned waveform, and cancel out only the ambient sound that seems to match, it still would not work perfectly. That would result in echoes and certain portions of the frequency spectrum still being heard, which would sound very strange.
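The sensitivity of destructive interference to small timing errors can be sketched in a few lines. This is a toy model, assuming a single 1 kHz sine wave at a 44.1 kHz sample rate; the function names and the 11-sample delay are made up for illustration. Subtracting an exact copy leaves silence, while subtracting a copy that is about a quarter of a millisecond off (roughly 8.6 cm of extra path at 343 m/s) leaves a residual louder than the original.

```python
import math

SAMPLE_RATE = 44100   # Hz, assumed
FREQ = 1000.0         # Hz, test tone


def sine(count, delay_samples=0):
    """A sine wave, optionally arriving a few samples late."""
    return [math.sin(2 * math.pi * FREQ * (n - delay_samples) / SAMPLE_RATE)
            for n in range(count)]


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def residual(heard, reference):
    """What is left after adding the inverted reference to the heard sound."""
    return [h - r for h, r in zip(heard, reference)]


count = 4410                      # 0.1 s of audio
reference = sine(count)           # the "canned" waveform

exact = rms(residual(sine(count), reference))
shifted = rms(residual(sine(count, delay_samples=11), reference))

print("original level:        ", rms(reference))
print("residual, exact copy:  ", exact)
print("residual, 11 samples late:", shifted)
```

Real rooms add many such delayed and filtered copies at once, which is why a stored waveform cannot simply be inverted and played back.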

--
Better known as 318230.
Re:works offline? by lucm · 2017-10-19 15:05 · Score: 4, Insightful

Why shit on mp3 and try to re-invent the wheel with vectors?
First, nobody is shitting on mp3. As for the reason to use tiny vectors instead of storing big mp3 files, I'm not sure why I have to explain it to you, but it comes down to two things:
1) Storage
2) Availability of advanced, high quality vector processing libraries like BLAS or LAPACK
That being said, it was just my guess. For all I know, maybe they are storing data in sqlite3 or in the headers of a jpeg file that shows your mom pleasuring herself with a maglite.

--
lucm, indeed.
Re: works offline? by Anonymous Coward · 2017-10-19 21:40 · Score: 4, Funny

Jazz songs?
Why do you want to stress their app by sending it random data?
Re: works offline? by DontBeAMoran · 2017-10-20 03:40 · Score: 4, Interesting

32 thousand CDs, using slim jewel cases at 5mm thickness, means you have a CD tower 160 metres tall. Given a standard height of three metres per floor, your CD stack is over 53 stories high.
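The arithmetic checks out. A quick sketch, using only the figures from the comment above (the variable names are my own):

```python
cd_count = 32_000          # CDs in the collection
case_thickness_mm = 5      # slim jewel case thickness
floor_height_m = 3         # assumed height of one story

stack_height_m = cd_count * case_thickness_mm / 1000   # mm to metres
stories = stack_height_m / floor_height_m

print(stack_height_m, "metres,", round(stories, 1), "stories")
```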

--
#DeleteFacebook