Google Wants Your Voice Data
00_NOP writes "Peter Norvig, Google's director of research, has told New Scientist that one of the reasons the search engine launched Google Voice is that it needs more human voice data to perfect the sort of 'big data, simple algorithm' probabilistic approach to translating voices to text that drives Google Translate. Norvig says that no one is listening to your calls on Google Voice — it is simply their servers trying to get the translation right."
I will say that the translation of my voice mails is terrible. Although, how can you tell if it is translated correctly if you don't listen to it? You can look for proper English, but even some of my translations are proper English yet still incorrect (names, etc. come out wrong). Though most of the time it's just a jumbled mess that I can't deduce the actual meaning of.
How do servers assess whether they've got the translation correct without having a human-in-the-loop to listen to the conversation and concurrently read what the server translated? Maybe the data is anonymous by the time it gets to a human, but it seems like humans need to interface with the voice data somehow to validate that the server is translating accurately.
oh shit! Google accidentally my voice data!
I can tell that they want the voice data badly. They make it very difficult to delete call and voicemail history. You can't delete more than 10 records at a time and even then they go into trash and keep piling up over there. You can delete the data from trash but again only 10 at a time. There is no option to empty the trash. Their help section says that the history is purged from trash after 30 days automatically, but it isn't. My call history sits in the trash indefinitely unless I painstakingly delete all history 10 records at a time.
Apparently they're already on par with your average Slashdot editor.
PocketPermissions Android Permission Guide
They have another server that checks the first server's translation. Part of their work is checking that server's effectiveness, too.
I gave up trying to get voice software to work over a decade ago. The reason is that I'm trilingual and use all 3 daily. So the software needs to be able to:
- understand a lousy accent: there are some words I cannot and will never be able to pronounce 'right'
- recognize what language is being spoken (having those 3 and only those 3 preset in the options)
Now I haven't tried Google Voice, but none of the software I've tried or heard about could even remotely do those two basic things.
Non-Linux Penguins ?
Hey, that would be useful, a service that translates Slashdot summaries into English.
The truth is that all men having power ought to be mistrusted. James Madison
"Peter Norvig, Google's director of research, has told New Scientist that one of the reasons the search engine launched Google Voice is that it needs more human voice data to perfect the sort of 'big data, simple algorithm' probabilistic approach to translating voices to text that drives Google Translate. Norvig says that no one is listening to your calls on Google Voice — it is simply their servers trying to the translation right."
I think Google Voice translated the last part of that sentence.
Right. Everyone get on Google Voice with funny unnatural accents, unusual intonation and non-native grammar! Let's skew their data.
We are derisive towards "Hai This is Facebook. Plz give us ur full name, address, cell phone number, age, and eye color so we can give you five Farmville sheep."
But you bring up the more interesting case, "Awesome service versus abused data". (Shout out to Holland and TomTom for yesterday's example.)
Or here, Google Translate vs ... a billion hours of juicy phone calls!
Speech is "Audio" - All we need is a hacker and a Wikileaks Dump!
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
There was a Userfriendly.org strip years ago which pretty much summarizes my experience with voice recognition software for the past 15 years. . .
I can't find the link to the comic anymore, but basically, one of the guys in the office had been trying to use voice recog software. Some of his coworkers come to his office. He's not there, and they wonder about the mysterious message on the screen, "Cod Am Pizza Ship".
The Strip.
I'd be willing to let this happen if Google then released the derived heuristics as free open source software. I'll share if you share.
Do you not understand what voicemail is? How can you record a message for someone without consenting to it being recorded?
We hope your rules and wisdom choke you / Now we are one in everlasting peace