Slashdot Mirror


Google Wants Your Voice Data

00_NOP writes "Peter Norvig, Google's director of research, has told New Scientist that one of the reasons the search engine launched Google Voice is that it needs more human voice data to perfect the sort of 'big data, simple algorithm' probabilistic approach to translating voices to text that drives Google Translate. Norvig says that no one is listening to your calls on Google Voice — it is simply their servers trying to get the translation right."

29 of 138 comments

  1. Um, how by C_Kode · · Score: 2

    I will say that the translation of my voice mails is terrible. Although, how can you tell if it is translated correctly if you don't listen to it? You can look for proper English, but even some of my translations are proper English yet still incorrect. (Names, etc. come out wrong.) Though most of the time it's just a jumbled mess that I can't deduce the actual meaning of.

    1. Re:Um, how by The+MAZZTer · · Score: 2

      There is a checkmark and an X you can click if the translation is good/bad.

    2. Re:Um, how by Anonymous Coward · · Score: 5, Insightful

      We should just get over the fact that privacy is gone, eh? Not here, my friend. Not ever. People have a RIGHT to privacy despite what anyone will tell you.

      The fact that the majority of people could give a hoot in hell about their personal and our collective privacy will come back to haunt us. I don't understand why people would willingly give up their privacy for a little functionality, cool tech, gadgets, whatever. I value my privacy and I don't share my info willingly with anyone or any organization without a lawful requirement, e.g. SSN for employers, banks. I even fought my medical insurance company on getting my SSN because they have no legal mandate to possess that information.

      I will not have grocery store cards to save money for the same reasons. I will not trade my personal information for a little savings.

      These companies take our information from us and profit greatly while we in turn get what? A "free" email account laden with ads that track our behavior? This is not a win-win situation and no one really cares, because they can chat with their friends across the globe in real time, make "friends" on Facebook they will never meet or really know.

      Where does all this end? When the entire world is one transparent collective society where no one has any privacy whatsoever? Personal information is a goldmine as is shown by how desperately companies want to get their hands on it. I think there should be a citizens' clearinghouse where people can agree to sell their info for a profit -- opt-in by default. Anyone caught trying to get around this clearinghouse pays dearly legally. Companies bid for your personal information and you profit as well. Anything short of some model like this is completely lopsided in favor of corporate interests that don't have our best interests at heart.

    3. Re:Um, how by Cytotoxic · · Score: 2

      They've gotten some pretty good data from me. My Google Voice number isn't currently published, so all of the voicemails I get are wrong numbers (or tests). They are completely incomprehensible to me, and to Google Voice - although one did a fair approximation of gibberish English (I think it was in some African dialect). Most seem to be in African languages, although a few are central European sounding. Good luck getting a good translation - but that's the magic that Google is trying to accomplish: translate some spoken language without any foreknowledge. Kind of like Google Goggles only for voice. So I volunteer all of these for their translation database.

      Oh, and I do get a fair number of advertisements and service calls. If you had an appointment with Comcast last Thursday, the tech called the wrong number - that's why he didn't show up. Google did a good job on the translation though...

    4. Re:Um, how by thePig · · Score: 2

      In that case, they can use YouTube videos for this, right? Their automatic translation is quite horrible - they could use the good/bad check there too.
      Actually they can translate the same videos every time people see them - and until quite a high percentage of people say yes, they can test it again.
      Also, when they have more than 200 million videos on YouTube, why do they need to store data from Google Voice - which is much more personal and important?

      --
      rajmohan_h@yahoo.com
    5. Re:Um, how by aztracker1 · · Score: 2

      What's funny is this is the case for everyone but both of my grandmothers... either one of them leave a voicemail, and it transcribes > 95% accurate, better than anyone else (30-60% usually). I guess it works well for old women raised in the U.S. midwest.

      --
      Michael J. Ryan - tracker1.info
    6. Re:Um, how by davester666 · · Score: 2

      No"BODY" is listening, but computers are analyzing every call and transcribing it to text?

      Hmm, I would guess there would need to be at least some spot-checks that the transcription is working properly.

      And isn't there some kind of federal wiretapping law preventing this or is it a "well, we told you we were listening in on every call"?

      And methinks it just might be easier for the gov't to get these transcriptions instead of the actual audio recordings. And more convenient too, because it's much faster to read/search them instead of listening to hours and hours of audio.

      --
      Sleep your way to a whiter smile...date a dentist!
  2. Self-checking by rwv · · Score: 2

    How do servers assess whether they've got the translation correct without having a human-in-the-loop to listen to the conversation and concurrently read what the server translated? Maybe the data is anonymous by the time it gets to a human, but it seems like humans need to interface with the voice data somehow to validate that the server is translating accurately.

    1. Re:Self-checking by msauve · · Score: 5, Informative

      "How do servers assess whether they've got the translation correct without having a human-in-the-loop to listen to the conversation and concurrently read what the server translated?"

      If you log into your Google Voice page, and look at a translated message, in the lower right corner there is the question - "Transcript useful?" along with yes/no checkboxes. If you check one, it asks if you want to "donate" that VM to improve the translations, you can answer yes/no/never:

      Want to help Google's automated transcription get better? Donated voicemails will be listened to, manually transcribed, and used to improve our transcribing server's accuracy. They are only used for this purpose.

      --
      "National Security is the chief cause of national insecurity." - Celine's First Law
  3. servers trying to the translation by foma84 · · Score: 2

    oh shit! Google accidentally my voice data!

    1. Re:servers trying to the translation by m0rphin3 · · Score: 2

      Once, I accidentally my voice data, but then the Google server so it was OK.

      --
      for great justice
  4. They Make it Hard to Delete History by Quantum_Infinity · · Score: 5, Interesting

    I can tell that they want the voice data badly. They make it very difficult to delete call and voicemail history. You can't delete more than 10 records at a time and even then they go into trash and keep piling up over there. You can delete the data from trash but again only 10 at a time. There is no option to empty the trash. Their help section says that the history is purged from trash after 30 days automatically, but it isn't. My call history sits in the trash indefinitely unless I painstakingly delete all history 10 records at a time.

  5. Re:nice by alostpacket · · Score: 4, Funny

    Apparently they're already on par with your average Slashdot editor.

    --
    PocketPermissions Android Permission Guide
  6. Another server by KingSkippus · · Score: 2

    They have another server that checks the first server's translation. Part of their work is checking that server's effectiveness, too.

  7. I gave up by dargaud · · Score: 3

    I gave up trying to get voice software to work over a decade ago. The reason is that I'm trilingual and use all 3 daily. So the software needs to be able to:
    - understand a lousy accent: there are some words I cannot and will never be able to pronounce 'right'
    - recognize what language is being spoken (having those 3 and only those 3 preset in the options)
    Now I haven't tried Google Voice, but none of the software I've tried or heard about could even remotely do those two basic things.

    --
    Non-Linux Penguins ?
    1. Re:I gave up by Thavilden · · Score: 2

      Those two requirements don't exactly strike me as "basic".

    2. Re:I gave up by the_fat_kid · · Score: 2

      I used to work with a "trilingual" fella.
      Born in Italy, raised in France, and then lived in the USA for 17+ years.
      He effectively spoke no language.
      Bad Italian, worse French and jumbled English.
      Is there an app for that?

      --
      -- Sig under construction...
    3. Re:I gave up by hedwards · · Score: 2

      I could be wrong, but I doubt most Europeans are fluent in more than two languages, and I bet a significant number aren't fluent in multiple languages. The reason I'm singling out Africa there is that in parts it's very common for people to speak not just one or two, but three, four or more languages and to have to learn a new language at marriage so that they can communicate.

      Trust me, Europeans have nothing on that.

    4. Re:I gave up by _0xd0ad · · Score: 2

      have to learn a new language at marriage so that they can communicate.

      Married people communicate?

    5. Re:I gave up by Abreu · · Score: 2

      I have had "conversations" with people coming back to Mexico after living years in the US... poor fellows, their Spanish is incomprehensible and their English sounds like a racist joke.

      Pochos have really developed their own pidgin language :S

      --
      No sig for the moment.
    6. Re:I gave up by xaxa · · Score: 2

      Considering that outside of Africa only a very small fraction of the population speaks more than two languages let alone fluently, I don't think that it's a basic request.

      40% of Europeans speak English well enough to have a conversation (not including native speakers). In some areas (Switzerland, Belgium, Luxembourg, places near country borders) it's not unusual to speak an extra language.

      If you're a European child you speak [your version of European], learn English at school because English is useful, and if you like languages you might choose another; in the same way, perhaps, that an American child might choose to learn Spanish.

      I know a little French and a little German -- nowhere near enough to have a proper conversation (I'm still learning) but enough that if I see some French or German text I try and understand it before hitting the "Translate" button. I'm also by far the least multi-lingual in my house, the other three people are fluent in either two, three or four languages.

  8. Re:Maybe it tried to translate the summary by Attila+Dimedici · · Score: 2

    Hey, that would be useful, a service that translates slashdot summaries into English.

    --
    The truth is that all men having power ought to be mistrusted. James Madison
  9. Doesn't work well... by sglewis100 · · Score: 3, Insightful

    "Peter Norvig, Google's director of research, has told New Scientist that one of the reasons the search engine launched Google Voice is that it needs more human voice data to perfect the sort of 'big data, simple algorithm' probabilistic approach to translating voices to text that drives Google Translate. Norvig says that no one is listening to your calls on Google Voice — it is simply their servers trying to the translation right."

    I think Google Voice translated the last part of that sentence.

  10. I is have hamburger? by ook_boo · · Score: 2

    Right. Everyone get on Google Voice with funny unnatural accents, unusual intonation and non-native grammar! Let's skew their data.

  11. Re:Is anyone surprised? by TaoPhoenix · · Score: 2

    We are derisive towards "Hai This is Facebook. Plz give us ur full name, address, cell phone number, age, and eye color so we can give you five Farmville sheep."

    But you bring up the more interesting case, "Awesome service versus abused data". (Shout out to Holland and TomTom for yesterday's example.)

    Or here, Google Translate vs ... a billion hours of juicy phone calls!

    Speech is "Audio" - All we need is a hacker and a Wikileaks Dump!

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  12. Oblig. UserFriendly comic by JSBiff · · Score: 2

    There was a Userfriendly.org strip years ago which pretty much summarizes my experience with voice recognition software for the past 15 years. . .

    I can't find the link to the comic anymore, but basically, one of the guys in the office had been trying to use voice recog software. Some of his coworkers come to his office. He's not there, but on the screen, they wonder about the mysterious message, "Cod Am Pizza Ship".

  13. Just found it. . . by JSBiff · · Score: 2
  14. If we get the heuristics back as FOSS by cyrus0101 · · Score: 2

    I'd be willing to let this happen if google then released the derived heuristics as free open source software. I'll share if you share.

  15. Re:Google is breaking wiretapping laws everywhere by amRadioHed · · Score: 3, Informative

    Do you not understand what voicemail is? How can you record a message for someone without consenting to it being recorded?

    --
    We hope your rules and wisdom choke you / Now we are one in everlasting peace