W3C Seeks Feedback on VoiceXML
jdaly writes: "Today, W3C announced that VoiceXML 2.0 has been issued as a first public Working Draft. Press materials went across various wire services. Rather than send simply a press release here, W3C would like to give more specific information of interest to Slashdot readers. Of note is a section from the "Status of the document" section of VoiceXML 2.0 draft:
"This document seeks Member and public comment on both the technical design and the patent licensing issues arising out of the disclosure and licensing statements that have been made. Our decision to publish this first public working draft has been made to secure early comments from the community, but does not imply that all questions of patent licensing have been resolved or clarified. They must be resolved or work on this document in W3C will stop.
As things stand at the time of publication of this specification, implementations conforming to this specification may require royalty bearing licenses for essential IPR. Further information can be found in the patent disclosures page. The patent policy for W3C as a whole is under wide discussion. A set of commitments by all participants in the Voice Browser Activity to royalty free is a possibility for the future but has NOT been made at time of publication."
As IPR issues are important to Slashdot readers, we are striving to make this information available to them as soon as possible. W3C strongly encourages those with an interest in this specification to consider using the comment list, www-voice@w3.org, which is archived. There is no deadline for comments on a first public Working Draft.
Regards, Janet Daly, W3C"
Thanks
Bruce
Bruce Perens.
...that when we are browsing /. we will actually hear "first post"?
This space left intentionally blank.
I thought it was a given that to get the patent issues resolved things will have to be changed. Why then seek public comment now rather than wait until it is more stable? Is it to create pressure on potential claimants? I can see how pressure would help but why would public comment create it?
Patents are evil, they'll ruin our ability to do free software work with all other software. Imagine if Shakespeare had patented his own findings of how to put the English language to good use. We'd either be paying royalties for speaking or we'd be using different dialects of English to avoid patent issues.
This is the exact same case. XML/HTML/XHTML/etc are the languages of the internet, they define the structure of our speech, its grammar. Patent them and we'll be paying royalties for speaking through this wonderful electronic medium.
This situation just plain sucks.
Pedro Côrte-Real.
Slashdot rejected your story because it is old news.
;)
I saw this commercial a month ago, and during sunday NFL of all places!
So sorry, don't post offtopic you troll
BTW, it is a very good commercial.
...and my karma sinks slowly into oblivion...
We dance to all the wrong songs.
--Refused.
This is unbelievably funny. Always look forward to Troll Tuesday.
I alternate between posting +5 and -1 Comments. Karma: +53 -47 = 6
A talking webserver... now instead of having to guess if it's overheating, has a virus, or is under a denial of service attack, it can tell you. Does this have anything to do with that emotional car that was posted a while ago? I don't want a webserver under my control crying or frowning at me... a server farm is scary enough without having talking servers.
Job? I don't have time to get a job! Who will sit around and bitch about being broke and unemployed then?
I thought that feedback was one of the biggest problems with voices. My ears still ring from a Who concert years ago!
"Provided by the management for your protection."
I currently hold a patent on the idea of an idea. If anybody out there has ideas, you must pay royalties to me. Money is okay, blood is preferred.
imagine if this weren't something as fringe-useful (yes, it is useful to the visually impaired and a token number of folks who desperately want to hear web pages or use TellMe) as VoiceXML... imagine if it were... SMTP (yeah, I know, IETF not W3C).
what if the SMTP spec was approved and made an official "standard" with Micro$oft or $un claiming ownership? Would e-mail be the most widely used Net application? Would we be back in the days of LANs supporting 10 different e-mail standards?
VoiceXML *is* cool (I occasionally use TellMe to get movie times/locations), but what's the point of making it a "standard" if I'll have to license my software from the firm with the highest-paid lawyers?
we should give it all up if this is going to be the wave of the future for the W3C. Why not just develop and license apps to recognise and display docs written in QuarkXPress tags.
better yet, let's all just switch the web to PDF and wait a year for it to d/l @ 56K.
hats off to payware "standards"...
maybe i'll just go back to FTP and plain text unless someone manages to patent that.
Mind the gap...
Sorry couldn't resist.
there are no stupid questions, but there are a lot of inquisitive idiots
OpenVXI 2.0 was released just last week. According to the message on the VXI-discuss mailing list:
There is currently support for Windows (binaries are included) and Linux. Developers are currently working to add Solaris and Mac OS X.
NOTE: This is a VoiceXML interpreter. A real system would require a full speech recognition engine and a full text-to-speech implementation. SpeechWorks International ships a commercial version which connects to their recognizer and TTS products. This is a good playground for experimentation.
Given one hour to live, the student replied: "I'd spend it with professor FP who can make an hour seem like a lifetime."
The thing with VoiceXML is, we probably won't be seeing an open-sourced engine for it. VoiceXML is a standard which works over telephones and VoIP, and thus needs complicated software to run.
Actually the price of IBM's VoiceServer (I think it's called) is around $40,000. All the ones I've found through research were aimed purely at large companies who'll likely host VoiceXML applications for others.
In this sort of situation, I don't see any point in paying royalties to the developers of this technology. These companies are the same which'll be selling the server software. How much money could they possibly need?
(Note: It would be really cool if somebody started developing a free-as-in-everything VoiceXML server.. I'm just not sure if anyone has that much time to devote, since the free text-to-speech technology is a little rough around the edges still)
when the rain comes, they run and hide their heads. they might as well be dead.
As tightly bound up in patents as voice/sound is, unless W3C takes a truly RAND stance (free, no less), they may as well get out of the way. You may be looking at "non-discriminatory" license fees of $10,000 from half a dozen big companies. That's $10,000 EACH. (Or maybe more, depending on how greedy they are.)
Each of these will negotiate a licensing agreement with the others, for free. But they won't discriminate against anybody else, oh, no!
So why does W3C want to get their hands dirty? Let the big boys go off and negotiate it themselves; that's what they're doing now. This patent-encumbered "standard" will be rather like X was in its early days. And it will fall apart, just like X did when XFree86 started doing the real work, maintenance, and innovation.
If there is a real RAND, free to anyone using the standard (as written, no Microsoft extensions), then the standard has a chance. That's what W3C should drive home before they promulgate a bunch of "open" (aka proprietary) standards.
The non-negotiable condition for inclusion of patented technology should be that the patent be provided on a royalty-free, non-exclusive basis. Any patent not available on such terms should be automatically disqualified.
The message must be clear. Software patents do not serve the public interest. Instead, they constitute at the best roadblocks -- useful ideas off limits to the public, and at the worst, landmines -- when the patent office grants a patent on a widely used technology.
Granted June 2001. They must be joking, or we are looking at another case of "standards group member takes notes in meetings and writes stupid patent that is accepted by the patent office".
I don't care how fair and square they enforce their RAND policy. A high tech company, especially one that has INNOVATE as their slogan, should be ashamed of filing such patents. Shows total lack of quality control.
But not to worry. Fiorina will run them into the ground with the Compaq merger, so some geek could buy the patents at the firesale, and then we could have a patent BBQ?
-- Another senseless waste of fine bytes.
Sometimes the W3C comes out with something useful, clear and powerful. SVG [w3.org] and the original version of XML are examples of this. But they quickly forget their design goals and everything goes to hell. Example: XML is supposed to be a human-readable, HTML-like markup language for arbitrary data that is easy for a program to parse and understand. Then the committee does its thing and now, with namespaces and the other additions, XML is about as readable as a binary file. The W3C's problem is that they are victims of feature creep. They take something simple and elegant and turn it into a monster. Features are good but they don't seem to know how to stop.
The easiest thing to do with VoiceXML would be to wait for Microsoft to appropriate it, embrace it, extend it, and make it a free download. They already have pretty decent speech recognition & synthesis (not the best, but serviceable) so chances are they will have the majority of the niche users that actually want to talk to their computers.
I'm sure people out there need text to speech technology. I'm sure that VXML would be used by some niche that desperately needs it.
I'm also sure that it'll never take over the internet, because it's a different medium, and has the same drawbacks as other spoken media, both citizen band and broadcast. Audio is linear, the web is random access. If you are interested in a portion of a web page, you will skip to that portion immediately, am I right? Besides, audio is almost as intrusive as Flash and Shockwave, only with VXML, it'll be a patented standard. The last thing I like is web sites with noise on them. If I wanted a multimedia experience, I'd play a good game, not Joe Generic's lame attempt at an interactive web page. I surf for information, not for a memorable experience.
Hmmph. Seems to me W3C should be documenting emerging standards, not creating them.
Any connection between your reality and mine is purely coincidental.
It is usually the pr0n business that implements new technology, both on the Internet and home multimedia fronts. While it could be really cool to have those nekkid pictures talking to me, the idea of all of those pop-ups literally screaming out at me a dozen at a time would really freak me out.
I thought feedback was something to be avoided on a sound channel...
mp3's are only for those with bad memories
At least people are disclosing their patents right now, not after a standard has become de-facto. We don't need more companies like Rambuzz.
No way. Not even close. VoiceXML is mostly used to rapidly develop IVR systems. WAP was an attempt to squeeze HTML onto a cell phone. VoiceXML is meant to speed development and enhance the capabilities of those automated voice response systems that we've all already been using for years and years.
There are enough posts already claiming that "my web server should yell for help when it gets slashdotted" that it's pretty obvious no one has read the article yet.
VXML does not make your browser "talk". It is a markup language which allows a client known as a "voice browser" to interpret this markup language and speak to you locally.
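To make that concrete, here is a minimal Python sketch of what the interpreting side does: it reads the markup and decides what to say. The sample document is a simplified, VoiceXML-style fragment (no namespace, no grammars), and `collect_prompts` and `speak` are hypothetical names made up for this illustration. `speak` just prints where a real voice browser would hand the text to a TTS engine.

```python
import xml.etree.ElementTree as ET

# Simplified VoiceXML-style document, for illustration only. A real
# VoiceXML 2.0 document would also declare the W3C namespace and grammars.
SAMPLE_VXML = """
<vxml version="2.0">
  <form id="greeting">
    <block><prompt>Welcome to the movie line.</prompt></block>
    <field name="city">
      <prompt>Which city are you in?</prompt>
    </field>
  </form>
</vxml>
"""


def collect_prompts(vxml_text):
    """Return the text of every <prompt> element, in document order."""
    root = ET.fromstring(vxml_text)
    return ["".join(p.itertext()).strip() for p in root.iter("prompt")]


def speak(text):
    """Hypothetical stand-in for a text-to-speech call: prints the text."""
    print("[TTS] " + text)


if __name__ == "__main__":
    for line in collect_prompts(SAMPLE_VXML):
        speak(line)
```

The point of the sketch is only that the markup lives on a server and the "talking" happens wherever the interpreter runs, typically behind a phone number.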
Obligatory Google cache of the slashdotted article here.
The issue here isn't that they can't release a new VXML spec. It is that the new spec will logically include ideas that have been patented by other companies.
The big problem is that VXML is currently at 1.0 and companies are patenting extensions to that spec. Here is a prime example of how, rather than getting involved with creating the spec and helping to push out new revisions, the companies start patenting every obvious thing missing from the 1.0 specification. This is obviously going to prevent implementations of further revisions from emerging from any company that isn't as rich as HP or IBM or MS etc.
As for the usefulness of VXML, whoever posted this story missed the boat. VXML isn't used to make your server speak; it is used to quickly create an IVR system. This is a really useful ability that few slashdotters have realized.
Fear trumps hope and ignorance trumps both
*sigh*
Poof.
Some /.ers don't seem to care so much about speech recognition - niche technology? When natural language parsers get more intelligent, speech recognition will be the internet in your car. Just think star trek.
This completely describes any typical IVR scripting engine that has been around since the late 1980s (AT&T's Conversant IVRs on Unix systems come to mind). The Visual Voice products that I used to create an IVR chat system back in the mid 1990s would do exactly the above, and since it was VB based, I could even pull up web pages for data (which I did just to provide a weather report option to feed to the TTS engine as a secret test menu option, as well as Tuxedo screen scraping of a virtual 3270 hooked to big iron). The patent quoted was applied for after that time. It's clearly bogus.
This is more of the same from a different patent. Like I said, this is all obvious and common practice for IVR script writers, and anyone that has a few brain cells going. Furthermore, "input variables" and things like that are not inventions, they are common sense. It's just not that hard.
Damnit, how do they get away with patenting what are common practices? The patent examiners must be total f**king morons.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo! http://goo.gl/J9bkO
Patents normally take a LONG time to be granted.
When was this applied for? That's what matters.
I had a quick look at it, and it looks pretty nice. The companies that give away the software and then charge for their central 800 number are interesting. I'd like the option of directly supporting voice modems, but I haven't looked at all the companies yet.
Of course it always comes down to how good the voice command recognition is. I found Microsoft Voice to be a little iffy, and you have to train it. I've got more computer horse power now, so maybe I'll give it a try again. (And most voice modems suck! My USR is 8 kHz. Go with high-end stuff like Dialogic if you can afford it.)
One line blog. I hear that they're called Twitters now.
Yup, Adcritic has had a number of IBM's excellent Linux commercials for months.
"Where are the flying cars? I thought we were supposed to have flying cars..."
One line blog. I hear that they're called Twitters now.
Software patents do not serve the public interest. Instead, they constitute at the best roadblocks -- useful ideas off limits to the public, and at the worst, landmines
Absolutely.
The best way to expose the faults of the software patent system is to expose the damage it does, not just talk about it. Kudos to W3C if they make a policy of "no standard for royalty burdened patents". It may take a few years but eventually the comfortable computer community will notice that the available standards suck and are missing obvious and necessary solutions.
If W3C makes a practice of including patented technology, they become a money-making tool for opportunists and big businesses. You don't think smart execs see the $ in getting their patented stuff in a W3C standard?
--- -- - -
Give me LIBERTY, or give me a check.
Dude, thanks for the mental image, google would get insane sysadmins. Imagine running around 3000 nodes of talking servers!
I looked at this article for two seconds, went to the w3c page, and spent the next five hours fixing my html so it was html 4.01 transitional compliant.
Linux: The world's best text-adventure game.
I would prefer to see better content, rather than seeing current content just suck up more bandwidth. Sure, voice is neat, but in most cases it would just be used to add more widgets rather than fulfilling a needed function. Yes, there are legitimate uses for this, but most uses will just be for the 'gee whiz' factor.
I recently had to become a bit of a VoiceXML expert for a project at work. From what I have seen, using VoiceXML for talking web sites is actually not what most people are using it for. VoiceXML is used primarily for automated phone transactions. If you have ever called your bank or credit card company to get your balance or conduct some other type of transaction, then you know the type of phone system that I am talking about. If that system was capable of voice recognition, chances are that it was programmed using VoiceXML. VoiceXML is also quite capable of making outbound calls; it is conceivable that you might start receiving completely automated telemarketer calls in the next couple of years.
I wrote my own VoiceXML app which prompted you to say your name and the name of someone whom you wanted to get ahold of, and the system would hunt down that person on various devices (e-mail, AIM, pager, telephone, SMS, fax, etc.) and let that person know that you are looking for them. It worked unbelievably well and VoiceXML made the voice recognition part of it trivial. And if you need a VoiceXML solution, I would strongly suggest that you consider Voice Genie (www.voicegenie.com).
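For the curious, the back end of an app like that can be sketched in a few lines of Python. This is only an illustration under assumptions, not the actual code: `notify_everywhere` and the channel callables are hypothetical names, and each sender is assumed to take the person and a message and return a truthy value on success. The voice front end would supply the two names once the caller has spoken them.

```python
def notify_everywhere(caller, callee, channels):
    """Try every channel and report which ones delivered the message.

    `channels` maps a channel name (e.g. "email", "pager") to a callable
    taking (person, message). All names here are hypothetical.
    """
    message = f"{caller} is looking for you."
    results = {}
    for name, send in channels.items():
        try:
            results[name] = bool(send(callee, message))
        except Exception:
            # One broken device should not stop the hunt on the others.
            results[name] = False
    return results


if __name__ == "__main__":
    def fake_email(person, message):
        print(f"email to {person}: {message}")
        return True

    def fake_pager(person, message):
        raise RuntimeError("pager gateway down")

    print(notify_everywhere("Alice", "Bob",
                            {"email": fake_email, "pager": fake_pager}))
```

Keeping each device behind the same small interface is what makes it cheap to add another one later.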
Send/track messages to 100K people: www.xPressAlert.com
...less chance of me seeing XML.
I've been playing with the telephony hardware out there. Dialogic and NMS are pretty cool, but make no mistake. The future for hardware in this business is SIP. Internet telephony, either routed through the net, or even over POTS.
The likes of Cisco are making SIP gateways with huge port counts, allowing companies large and small to cut the cost of their telephony solutions by orders of magnitude.
And, even cooler for the Slashdot crowd, there are companies ramping up production on little analog SIP gateways. Get this: you plug your home phone line into the box, you plug an ethernet line in the other end. Now you can use this box to route incoming calls to VoiceXML apps hosted anywhere on the net, or just forward the calls to any SIP phone (say, the softphone application on your desktop at work) or route it through yet another analog line plugged into your little box. A fun toy, if you're into playing around with telephony in the home. And I think some of these will come in well under $1000, even in the initial pricing.
-----
Kvetch is Yiddish for "throw an exception" --Dr. Ron Cytron
Ok, truth first: I haven't read the patent licence, but here's a thought:
Why the heck would I want to look over the public draft, suggest corrections and then (if my corrections are incorporated) pay a fee to use this standard?
Isn't that a bit stupid? Like Microsoft asking you to write code for Windows, which it can sell back to you later?
I say boycott this. W3C Patent = closed standards = no one using them = we need another free body?
Boky
boky
I can't seem to find anything already posted, so I am gonna mention it...
Didn't anyone notice that Slashdot was singled out specifically and appealed to for comment? That's like a huge step in gaining relevance in the community. Slashdot is slowly becoming a legit political force of sorts.
Power Corrupts,Absolute Power Corrupts Absolutely, leaving one person(group)in charge is absolutely corrupt.
who cares about specifications... the whole idea is completely pointless. It's technology for technology's sake... I am struggling to think of any areas where speech has the advantage over a visually represented app...
and besides, think of the amount of training you have to do to train voice recognition software...
PLEASE FLAME ME
Great idea, except they'll only cross-license with each other, leaving the rest of humanity out in the cold.