IBM to Open Voice Recognition Software
phug writes "According to the NY Times, IBM is donating code that it estimates cost the company $10 million to develop. One collection of speech software for handling basic words for dates, time and locations, like cities and states, will go to the Apache Software Foundation. The company is also contributing speech-editing tools to a second open-source group, the Eclipse Foundation." There's not much information out there yet - e.g. no word on licenses etc. It is worth pointing out that the Eclipse Foundation was started by IBM.
This is great. ViaVoice disappeared from Linux quite a while ago, and I hope this will open up a great variety of cool open source applications. If this is made modular like, e.g., Festival, I can think of endless applications worth using it for.
Life is just nature's way of keeping meat fresh.
Is this ViaVoice? The Linux packages were pulled off the IBM site a year or so ago, but they're still floating around.
8 of 13 people found this answer helpful. Did you?
Are you sure you meant to say "All your base are belong to us?" Did you mean "All you lasers are better than us?"
Shameless self promotion
Eclipse is actually a kind of Swiss Army Chainsaw IDE. You can make plugins for pretty much everything, so one could speculate that a voice recognition plugin would be feasible.
I don't know about everyone else, but the concept of coding by voice does fascinate me. There are obvious issues (like eliminating having to say every single control character (if at all possible)), but with a background of RSI I think it's at least worth a shot.
Thoughts?
.: Max Romantschuk
I love you, IBM. I want you inside me.
Welcome to the land of the free...pay toll ahead...no photography...please open your bag...
Why is IBM doing this? Is it because they think they can make more money with increased software sales? It also might be an advertising campaign; a $10 million donation is buying a lot of free coverage.
Corporations don't usually give away stuff for nothing; in fact, their mission by law is to maximize profit.
When you look at GNU/Linux as a complex system and think of the things that users complain about when Linux usability is concerned, GPL'd speech recognition software is definitely one of them.
Hooray for IBM and as Ali said in the Linux ad "don't back down"!!
Never underestimate the power of idiots in large groups
It is great to see some open voice recognition.
But does there exist any open grammar checker software?
Is voice recognition software really viable? When you take into account the different accents, dialect and slang, is it just a pipe dream? Is it a software or hardware related issue?
...if only computers (namely Macs) had this technology back in the 80's our favourite 23rd century engineering hero wouldn't have had so much trouble using one at the plexiglass plant. "Hellooooo computer". Still cracks me up.
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it." -- Linus Torvalds
Nice way to take a related issue to the company and leave out the important facts to make your point.
My brother (who works for IBM) recently sent me an article on USA Today about the system IBM and Honda have developed for speech-interface with a GPS-enabled navigation computer. Really cool stuff.
For those of you who haven't read it, check out The Unfinished Revolution by Michael Dertouzos. I don't agree with all of his analysis (he was a little lacking in pragmatism on some points), but overall this book was very insightful. This book, along with Weaving the Web by Tim Berners-Lee, caused a big paradigm shift in my thinking about computer technology.
In the late 90s I talked with an IBM representative about releasing the ViaVoice source under a Free Software license, and the person I talked to (I don't recall his name) said that they might be willing to release the source code, since the code wasn't valuable to them. The value in ViaVoice is the "thousands of hours of training" that the code uses to determine words and voices.
So my question is: will the released code include the training needed to make it work, or will someone be able to put together the necessary resources to train the system?
They'll never do better than Microsoft Sam!
Oh...this is voice recognition...umm...let me revise.
They'll never be able to understand Microsoft Sam!
This is not earth-shattering news, since HTK has been available for some years. HTK was owned by a company called Entropic and was released as open source when it was bought by Microsoft. HTK can be found at http://htk.eng.cam.ac.uk/ and can handle network grammars. This lessens the impact of IBM's news.
It is also worth noting that the Eclipse Foundation recently introduced the Eclipse Public License and is in the process of transitioning all code from the CPL to the EPL.
All new contributions will be under the EPL, so if IBM wants to donate anything to the Eclipse project it will be under this license.
Nice title;
.Net technology. What exactly is the difference in quality and approach between the package from M$ and the one here mentioned from IBM ?
Speech code from IBM to become open source
And even better.. the comment from Microsoft, quoted at the end of the article
"IBM has not executed in bringing this technology to a broad market as Microsoft has."
Jokes aside, the article also states that Microsoft introduced its Speech Server 2004 last March, and that 100,000 software programmers have downloaded Microsoft's free software developers' kit for building speech applications on its Windows
IBM Hursley labs had a name dialler 5 years ago that let you phone the computer, say the name of the person you wanted to speak with, and get put through. They also had a system that provided weather forecasts based on the name of the city or country you said. I was pleased to name the latter "Global Weather Information System" or GWIS, pronounced Gee-whizz. Both ran on the machine under my desk. Both worked reasonably well, especially given that a lot of the acoustic models for names and places were automagically generated.
"The new wave is not value-added; it's garbage-subtracted" - Esther Dyson, Dec 1994
2.1 The Licensor hereby grants the Licensee a non-exclusive license to a) make copies of the Licensed Software in source and object code form for use within the Licensee's organisation; b) modify copies of the Licensed Software to create derivative works thereof for use within the Licensee's organisation.
2.2 The Licensed Software either in whole or in part can not be distributed or sub-licensed to any third party in any form.
This license is in no way Open Source. Yes, you can play with the source, but you cannot build something useful with it and redistribute under the same license.
Watch great movie opening scenes!
I've used ViaVoice for dictation and it was very good indeed. One of the serious lacks at this point in the linux community has been speech recognition software - opening this up will make lots of cool things possible.
Good will in the geek community, free publicity for something that would have just laid around collecting dust otherwise, and maybe a $10 million tax deduction for donating to a non-profit. Not sure about the tax deduction, but this is a donation to a charitable organization, and you can deduct the value of what you donate to these organizations, such as the value of a used car.
Tax Write Off
Hmm, this is nice, but I was never impressed by ViaVoice. Sphinx is much better to work with.
Reed
VOS/Interreality project: www.interreality.org
Free software for speech recognition.
For an open-source speech recognition system with a real open source licence, check out the CMU Sphinx Project, a family of speech recognition engines, training tools and associated acoustic and language models. The latest version, Sphinx-4, is written in Java and is released under a BSD-style license.
I make a reference to "Star Trek 4", when Scotty held the mouse up and was speaking into it, and it's labeled "flamebait"?
Oh well, I suppose the moderators are wiser than I.
"Leo Fender was in a 'state of grace' when he designed the Stratocaster." -- Paul Reed Smith
2. sell a lot of hardware and associated extended warranties and servicing and consulting and other services and support.
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
Imagine hooking this up to, say, Quake 3 Arena, and being able to switch weapons without moving your hands.
"Railgun!"
"BFG!"
"Chain gun!"
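For what it's worth, the game-side half of that idea is small once a recognizer hands you text. A minimal sketch, assuming some speech engine already returns the spoken phrase as a string; the command strings below are hypothetical placeholders, not verified Quake 3 console bindings:

```python
import re

# Hypothetical phrase-to-command table. The command strings are
# placeholders for illustration, not real Quake 3 console commands.
WEAPON_COMMANDS = {
    "railgun": "select_railgun",
    "bfg": "select_bfg",
    "chain gun": "select_chaingun",
}


def normalize(utterance):
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", utterance.lower())
    return " ".join(cleaned.split())


def command_for(utterance):
    """Return the mapped command for a recognized phrase, or None."""
    return WEAPON_COMMANDS.get(normalize(utterance))


if __name__ == "__main__":
    for spoken in ["Railgun!", "BFG!", "Chain gun!", "Plasma rifle!"]:
        print(spoken, "->", command_for(spoken))
```

Unknown phrases map to None, so a misrecognized word does nothing instead of switching to a random weapon.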
This is an example of speech recognition, not voice recognition.
Voice recognition is identifying an individual by their voice. Example: the movie Sneakers, which any good geek should have seen. "My voice is my passport, verify."
Speech recognition is simply trying to identify the words being spoken. Like the lackluster system used by United when you call up to get flight times.
The Apache Software Foundation (ASF) is a non-profit 501(c)(3) corporation, incorporated in Delaware, USA, in June of 1999.
From http://www.eclipse.org/org/documents/Eclipse%20BYLAWS%202003_11_10%20Final.pdf
The Eclipse Foundation is formed exclusively as a non-profit trade association, as set out in section 501 (c) (6) of the Internal Revenue Code (the "Code").
Modern voice dictation software is pretty good I'm using viavoice now to write this and I find bark bark shaddup I find that it bark bark shut up damnit bark bark don't make me come down there I find that bark bark okay that's it I'm coming down there argh crash thud bark bark bark bark bark bark
Curiosity was framed. Ignorance killed the cat.
Someone take that moderator stick from the insane guy over there.
This is really going to mess with the "Free as in beer/speech" analogies :)
Does this mean that speech is now free as in beer?
Lost: Sig, white with black letters. No collar. Reward if found!
Sadly, the following, from TFA, is true:
"This is a case of IBM following Microsoft," said James Mastan, director of marketing for Microsoft Speech Technologies. "IBM has not executed in bringing this technology to a broad market as Microsoft has."
IBM could have really taken the lead here if they had opened their doors earlier! Grr. Mind you, IBM does have the ability to really open the doors in the OS world, so all is not lost.
Big Blue is becoming Big Red. It all makes sense when you think about it. Let's hope they are merciful when they start taking prisoners from M$FT.
A voice tools proposal has been posted on the Eclipse site.
(See subject.)
(Duh.)
There are no trails. There are no trees out here.
Parent is sage and insightful, and the OP would do well to follow its advice.
They're not open-sourcing anything resembling ViaVoice to the Eclipse folks. Check out the Eclipse voice tools proposal. It's directed at making it easier to create call-center type voice reco apps, not at making Eclipse a voice-directed IDE.
:-) As a stop-gap they're hoping to get WINE support for Dragon/Scansoft NaturallySpeaking.
If you're interested in open-source voice recognition check out OSSRI - an effort to bring together some sort of practical large vocab speech recog to linux. They're just starting up, but the mailing list archives hold a fair amount of discussion about the current state of the open-source SR world. (Which, to sum up, isn't that great
...all of the IBM voice-recognition software I keep getting spammed about, so the spammers lose their incentive.
It's a product based on the Eclipse platform (not a plugin, more a standalone application).
It's a VoiceXML-oriented IDE. In a nutshell, VoiceXML is a specification that defines how to make a speech recognition (or DTMF) application for the *phone* (not the desktop) using a Web model (that is, exchanging documents over HTTP). The toolkit developed by IBM allows programmers to build call flows graphically, to edit VoiceXML and grammar documents, to manipulate pronunciation dictionaries and to do other related tasks. I believe this is the part that they are going to give to Eclipse.
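To give a rough idea of what such a document looks like, here is a minimal sketch that builds one VoiceXML form with Python's standard library. The form id, field name, prompt text and grammar URL are all made up for illustration; nothing here comes from IBM's toolkit:

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def build_city_form(prompt_text, grammar_src):
    """Build a one-field VoiceXML document and return it as a string.

    The form id and field name are hypothetical examples.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "weather"})
    field = ET.SubElement(form, "field", {"name": "city"})
    # What the caller hears.
    ET.SubElement(field, "prompt").text = prompt_text
    # External grammar listing the phrases the recognizer may accept.
    ET.SubElement(
        field, "grammar", {"src": grammar_src, "type": "application/srgs+xml"}
    )
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # The grammar URL is a placeholder, not a real resource.
    print(build_city_form("Which city?", "http://example.com/cities.grxml"))
```

A web server would return a document like this over HTTP, and the VoiceXML gateway would play the prompt and run recognition against the referenced grammar.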
The other piece they're going to open is "Reusable Dialog Components", a set of VoiceXML documents (or templates), grammars and code. These modules allow programmers to combine different components together in order to build a complete application. I think this part is going to Apache.
Also note that:
Currently, Voice Toolkit for WebSphere Studio is only available on Windows
Although VoiceXML is a growing standard, many areas are still not covered by the spec. AFAIK, this toolkit is not likely to integrate nicely with run-time platforms other than IBM WebSphere Voice Server.
This is just an IDE. You need to buy the runtime (the VoiceXML gateway). I really don't think they will open their speech recognition software (a lot more than a 10M$ investment).
All humans are mortal. Socrates is a human. Socrates is dead.
You are right that HTK is not Open Source, and nowhere on the web site does it claim to be. However, your second claim is totally wrong. Many groups use HTK to train acoustic models and language models that they then ship in their products (with their own recognition software).
Gunnar (maintainer of HTK)
It would be cool if this was integrated into a project like dashboard http://www.nat.org/dashboard/. ;)
Dashboard would create cluepackets from the speech recognition software and dig up info on whatever you are talking about.
It would be cool for a radio station; they could get cool trivia on whatever the callers or DJ are talking about.
And it would be cool if you could just pipe your favourite radio station through Dashboard and get relevant info as you listen!
-- My site
Hey, what about Sphinx?
"This is a case of IBM following Microsoft," said James Mastan, director of marketing for Microsoft Speech Technologies.
Maybe. When did IBM come out with ViaVoice? It's been a number of years. They even offered it for Linux for a while. When did Microsoft jump on board? Maybe Mr. Mastan's statement is just bull too.
Either way, I'm glad to see IBM doing this. Voice recognition enabled programs open a whole new and exciting frontier for software developers, both on the desktop and in embedded projects.
The race isn't always to the swift... but that's the way to bet!
He was joking (or at least flaming in a humorous way). Stop taking life so seriously.
If they release the ViaVoice for Linux stuff again, that would make Linux-based voice recognition a little more accessible and might help improve it.
When shit hits the fan get some of these https://youtu.be/pY-GncsZ-UE
Maybe this means someone (anyone, please!) will be able to bring ViaVoice up to date on the Mac, where it hasn't been updated for more than a couple of years and becomes more and more broken with every OS X update.
Bill Gates announced today that the source code for Microsoft Bob® and Microsoft Clippy®, valued on Microsoft's books at $175 million, has been donated to the Free Software Foundation, a tax-exempt entity.
No Beer Was Harmed In The Making Of This Drivel.
There is no god; get over it already! Never exchange a walk on part in the war, for a lead role in a cage.
This is not so much a comment as a request for help. I do not write code; I build PC Networks and overstress software for fun and profit.(Gates Cracks it, Norton Hacks it and I Stack & Whack it."DeFrags-R-Us.ORG-chicago3.net-natural.shs") (Uh, yeah, i've got a few TWAIN32 compliant USB Imaging Devices attached to the SCRAP.DOC Hauler) My "Native LAN" is built on an HP-3COM NIC P-III 650EMHz INTEL 32(whoppers)Win98-NT 4 Workstation with plenty groovy cable and wireless "thin-clients", two phony lines in and 1 wininet SBC Yahoo! DIAL 123 OUT accountant. (bad prodigybiz.net buzz-off indexing blogger) The TERMINATOR I,II,III, ARNOLD IV, CA.GOV.US V Hasta La Vista, Baby! Let's Chat about RECALL Elections. Because of obsolete SWBT Hardware and MSN Switch Politics (for) Entertainment, I am only receiving illegal broadband data packages TCP Inbound and a 26.4Kbps upstream dial connection. Causes lots of problems in the Intranet(Local MACHINE.INF) Zone. Also, because of my skill at training computers to repair their own softwares and protect themselves from Extranet Idiots, the High-Speed Business Class Internet/E-Commerce MSN Partners have my home under surveillance and are raping, pillaging and plundering its resources.(Abbreviated by us victims of genocide to RP&P. Sort of like "Plug-And-Pray" the uber-geek aryan hordes don't notice your ASL Middle-finger salute pinned to the front gate counting frames rushing past it). We are also getting some really strange results from the SpeakPad listening to me talk and all of the Internet garbage MSN-SBC-YAHOO-PRODIGY are pumping down the PIPE into the machine's event receiver.(SBC Yahoo! DIAL 123 SLIP UPS WIN32 more from SCREWUUNET EDU - Stuff that matters - Bend over, grab your cheeks and crack a smile. 
MopNET ONEAC USA Chloride Power Group, LLC(1)) Are there any Slashdotters out there who know their way around these types of "Techno-Gadget SCAMS,SPAM, VIRUSES and Miscellaneous Malicious Media attacks" that are willing to help us? (me and BIGMAMA1(hp)computer) if yes, please call 512-247-6696 at your earliest convenience to arrange an "Inter-Active - Inter-Operative" live help session. if no, you can call anyway; me and the computer love to record and backtrace crank calls. also, if anybody wants to participate in the multimedia legal circus of the new millenium, you are invited to sign-on as CO-PLAINTIFF in the COMPLAINT (In Composition): IBM PC AND APPLE MAC DOE(S)VS. MICROSOFT CORPORATION ET. AL." to be instantiated into the calendar of the U.S. District Court at Austin, Texas, USA1 World Wide Wierdest CITY JPEG (ASAP)
My organisation is the Popes of Discordianism. Since every man, woman, and child on Earth is a Pope, I can distribute it to anyone. Since I am a pope, and popes are infallible, this is correct.
Not a sentence!
A serial mouse with a built in microphone. Tempted - but practicality won over sheer geek value.
I am a member of the UDA. Fuck the pope!
Thanks for the support. It was lame, however, but I gotta start somewhere.