Slashdot Mirror


Google Opens Access To Its Speech Recognition API, Going Head To Head With Nuance (techcrunch.com)

An anonymous reader quotes a report from TechCrunch: Google is planning to compete with Nuance and other voice recognition companies head on by opening up its speech recognition API to third-party developers. To attract developers, the app will be free at launch with pricing to be introduced at a later date. The company formally announced the service today during its NEXT cloud user conference, where it also unveiled a raft of other machine learning developments and updates, most significantly a new machine learning platform. The Google Cloud Speech API, which will cover over 80 languages and will work with any application in real-time streaming or batch mode, will offer a full set of APIs for applications to "see, hear and translate," Google says. It is based on the same neural network tech that powers Google's voice search in the Google app and voice typing in Google's Keyboard. Google's move will have a large impact on the industry as a whole -- and particularly on Nuance, the company long thought of as offering the best voice recognition capabilities in the business, and most certainly the biggest offering such services.

6 of 46 comments

  1. Nuance the Biggest by Ksevio · · Score: 3, Interesting

    It's not so much that Nuance is known for being the best for a long time, it's more that they've bought out all their competitors and have pretty much controlled the market.

    1. Re:Nuance the Biggest by Anonymous Coward · · Score: 3, Interesting

      we work in the transcription business. that is exactly what nuance did, and does, especially in the medical transcription segment.

      american-based, native english speaking transcriptionists are essentially just training nuance's computers to do the transcriptionists' jobs. once the voice recognition accuracy hits a certain mark, they outsource the editing of their now trained and automated output to india or some other piss-poor country with lower wages and more favorable-to-them contract and labor laws.

      and we do that work with lower wages (usually a piece rate per line; if per hour then with quotas to meet) than we got 5 or 10 or even 20 years ago. nuance's end user software is buggy, proprietary, prone to crashes, and really not all that secure either... not even mentioning the security and privacy nightmare of sending the work offshore.

      in five years, our work will be hard to come by, as facilities continue buying into the bullshit nuance sells. the only thing that might save some of our jobs is if congress passes a law that says our medical records have to stay in the country unless absolutely needed for a traveling patient's care.. then we'd at least get the editing work.

      but that editing work (that does stay in country) pays half or less the rate of actually transcribing a document, and nuance already pays half or even only a third the rate a facility or doctor would pay directly to a contracting transcriptionist. they likely pay nuance the same, but of course nuance has to have their own, and the largest, piece of the pie. so not only are we working ourselves out of jobs by training nuance's computers, but we get shitty pay to do it, and have no choice in the matter because of nuance's chokehold on the market.

    2. Re:Nuance the Biggest by Anonymous Coward · · Score: 0, Interesting

      Oh well, another sore loser business with no skills but hearing and typing.
      The whining is very interesting though.

      Before the end of this decade, it is predicted that AI/Machine learning is going to kill off five million jobs.

      The good thing is, you won't have to blame the piss-poor countries for it.

    3. Re:Nuance the Biggest by Tupper · · Score: 3, Interesting

      The nerds at Ma Bell used to provide very high quality telephony; they were shocked and appalled when the market chose low quality low cost telephony. The medical transcription market has gone through the same change.

      The documents, especially the ones used clinically, can suffer from lower quality of ASR and/or offshoring. Also, in the old days, light editing was usually part of the process. This happens less in today's price-obsessed market and sadly results in less readable reports.

      On the other hand, today it's possible to get turnaround times of 0 with document issues identified in real time by NLP. That is a really big improvement. (I don't know if Nuance has that, but if they don't, they will soon.)

  2. Privacy by markdavis · · Score: 4, Interesting

    >" Google says. It is based on the same neural network tech that powers Google's voice search in the Google app and voice typing in Google's Keyboard."

    Indeed. So does this mean Google will store and mine and analyze and profitize the spoken text data too?

  3. Pebble Time has been waiting for this by Wizarth · · Score: 4, Interesting

    I'm waiting to see if/how this affects Pebble Time. We've been wanting access to the Google Voice API for ages now. Personally I want it mostly for Google Now integration, which may or may not be separate.