Baidu Releases Open Source Artificial Intelligence Code (thestack.com)
An anonymous reader writes: Chinese web services company Baidu has released new artificial intelligence software called WARP-CTC. The code is apparently capable of speech recognition that exceeds human capability, particularly for short segments. It uses an approach called 'connectionist temporal classification' and has been released on GitHub.
I developed automatic speech recognition systems in the 1990s. Back then, the best-performing recognizers were based on Hidden Markov Models, and for "out of context" tasks like "determine whether an individual spoken word from an unknown speaker is 'nine' or 'none'", the automatic recognizers already achieved better recognition rates than humans. However, the specific human strength in recognizing fluent speech is the ability to (a) quickly adapt to different speakers and (b) fill in all the uncertain words from an understanding of the context, which requires "world knowledge". And that strength makes a very big difference. So the claim in the article is not really anything special: computers have been expected to beat humans at this narrow task for at least the last 20 years.
Automatic recognizers achieve better rates on this task, but they'll lose against you when complete, sensible sentences are being spoken, even more so if you have already heard other sentences from the same speaker.