How IBM Plans To Win Jeopardy!
wjousts writes "Technology Review is reporting on IBM's plans to take on Trebek at his own game. The 'Watson' computer system uses natural-language processing techniques to break down questions into their structural components and then search its database for relevant answers. A televised matchup with Trebek is planned for next year. 'David Ferrucci, the IBM computer scientist leading the effort, explains that the system breaks a question into pieces, searches its own databases for "related knowledge," and then finally makes connections to assemble a result. Watson is not designed to search the Web, and IBM's end goal is a system that it can sell to its corporate customers who need to make large quantities of information more accessible.'"
I wonder what they plan to do with categories that have implications for all the answers. I've seen categories where words must be so many letters in length or perhaps start with certain things and Alex will interject while reading the category such as "'Cats'--and that means all the words in this category start with 'Cat'." Now, with that in mind, a clue could come in as "They are the popular makers of earth moving equipment." That might prompt Watson to find the most popular makers of earth moving equipment--Who is John Deere? The category of 'Cats' would do nothing for Watson without the aid of Alex's interjection ... thus failing at finding "Who is Caterpillar?" (bonus points if you also thought of "Who is Bobcat?" but that answer doesn't start with Cat).
As a fairly avid though novice crossword puzzler, my mind explodes with questions. Could Watson discern a four letter word for "Pleasant French city" (Nice)? Or what about a four letter word for "Beefy Laker" (Kobe)?
Lastly, will Watson have something inane and boring to talk about during the break?
Alex Trebek: Now, Watson, it says here that you are named after Thomas J. Watson who forbade his employees to drink and even frowned upon it while off the job?
Watson: That is correct. It is against IBM regulation 4-245 Section 8 to consume alcohol on the premises of any facility.
Alex Trebek: Fascinating, I'm sure you've never broken that strict regulation, ha ha.
Watson: Good sir, I am a computer, drinking is not within my capacity.
Alex Trebek: Um, right. So could you tell us something interesting about yourself?
Watson: *pauses to search records* During the fabrication of my circuitry, several engineers went months without sleep, leading one to go insane and kill his wife and kid before taking his own life in a double homicide/suicide case.
Alex Trebek: How unfortunate. Well, I wish you the best of luck today in Jeopardy.
Watson: Thank you, my snide game show master.
My work here is dung.
I fed all the Jeopardy questions into Wolfram|Alpha and it got every single one right.
What was an extra-terrestrial?
It can answer in Sean Connery's voice and make your mother jokes at him.
Otherwise I'll probably pass and look up old SNL skits on youtube instead.
They plan to answer "Kebert Xela" and send that bastard back to the dimension where he belongs.
I wonder how well it'll do at Anal bum cover.
sell to its corporate customers who need to make large quantities of information more accessible.
They want to replace the call centres in India with call computers.
"Hello you're speaking to Susan Blue Gene how can I help you?"
I am a free slashdotter. I will not be modded, blogged, DRM'd, patented, podcasted or RFID'd. My life is my own.
What was an extra-terrestrial?
How tastelessly incorrect. Extra-terrestrials don't come back to life. Watson would cross reference The Bible with many recent movies and come up with the correct question we were looking for: "What was a zombie?"
My work here is dung.
A lot of Jeopardy questions are wordplay-dependent, something AI doesn't have the hang of yet (unless IBM has been toiling in secret on something truly amazing). Categories like "Rhyme Time" and questions like "What does a Pharaoh need when he has a cold?" (Answer: an Egyptian Prescription) are beyond the ken of a data search.
Many Jeopardy "answers" have the key to the answer within the question, though in some cases it may be enough to throw the program off. E.g., in a category like "Musicals", an answer like "Unlike his other hits, this musical wasn't 'the cat's meow' on Broadway." Raw data crunching will pair musicals, Broadway and "cats" but won't know where to go with "unlike." Only an aficionado will know that Andrew Lloyd Webber's "Starlight Express" tanked on Broadway.
So the writers, given any knowledge of the limitations of AI, can set a challenge which will be nearly impossible for current AI to meet. John Henry will live another day.
I'm the queer the atheists sent here to take away your gun!
IBM is laying off American citizens but hiring in Asia, and yet is spending all this money on gimmicks. This is the kind of thing that gives big companies bad names. Hopefully, as a consolation prize, the laid-off Americans can watch their former company go down in smoke on the game show, hoping it starts smoking and sparking like a cheesy Trek android meltdown.
Table-ized A.I.
Alex will interject while reading the category such as "'Cats'--and that means all the words in this category start with 'Cat'."
Then the bot would read the closed caption that the category is "CATS MEANING ALL RESPONSES HAVE A WORD THAT STARTS WITH CAT" and include that in its reasoning. Then the clue "They are the popular makers of earth moving equipment" becomes something like "They are the popular makers of earth moving equipment, starting with 'CAT'".
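A rough sketch of that idea in Python, assuming the category hint has already been parsed into a required prefix. The function name and the candidate list are hypothetical, made up for illustration; nothing here is what IBM has described.

```python
def filter_by_category_prefix(candidates, prefix):
    """Keep only candidates that contain a word starting with `prefix`.

    `candidates` is assumed to be a list of answer strings already ranked
    by some other search step (hypothetical, not part of the original post).
    """
    prefix = prefix.lower()
    kept = []
    for candidate in candidates:
        words = candidate.lower().split()
        if any(word.startswith(prefix) for word in words):
            kept.append(candidate)
    return kept


# Hypothetical ranked candidates for the earth moving equipment clue.
ranked = ["John Deere", "Caterpillar", "Bobcat", "Komatsu"]
constrained = filter_by_category_prefix(ranked, "cat")
```

With the "starts with CAT" rule applied, "John Deere" and "Bobcat" drop out and only "Caterpillar" survives, which is the behavior the closed-caption trick is after.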
The summary clearly should have been titled "How does IBM plan to win Jeopardy?"
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
A computer that can play Jeopardy?
THE END IS NEAR!
No sig for you. YOU GET NO SIG!
That's S-words for four hundred...
Because Thomas J. Watson was the man who turned IBM into a global empire, and Thomas J. Watson Jr. brought it into computers. They successively held the top position at IBM for 57 years. So it's a very important name at IBM, and the connection with Sherlock Holmes is serendipitous.
http://en.wikipedia.org/wiki/Thomas_J._Watson
"I'm sorry, Watson. Your answer must be in the form of a question."
There are many Watsons. Working in biotech I've seen dozens of machines and applications named Watson.
sic transit gloria mundi
I hope "how many roads can a man walk down..." is not a question
IBM just MUST make it sound like Sean Connery! Watson: I Googled your mother last night, Trebek!
I love random hex numbers! Just like this one, 09f911029d74e35bd84156c5635688c0.
What is the answer to life, the universe, and everything?
The system is not designed to access the web?
Horse shit.
That huge fucking pile of data is getting in there from the web. It won't be accessing the web during the game, but it's still a fucking cache of random shit (mostly geography and world history) from the internet.
So will IBM then try to get a patent for "Winning Jeopardy", then all the contestants have to pay royalties if they win?
All they need to do is use their super computers to generate some digital footage of Alex Trebek engaged in bestiality (tappin' Rosie O'Donnell) and then tell him that if they win, the footage disappears forever.
It'd be far cheaper than what they are planning to do...and they can always leak the footage to youtube after they walk off with the winnings...
Sig Follows: "Suppose you were an idiot. And suppose you were a member of Congress. But I repeat myself." -- Mark Twain
Alex: "Here to present the Video Daily Double is Harry Mudd, who always lies."
Harry: "I am lying."
Support Right To Repair Legislation.
By searching for all the answers on http://www.wikipedia.org/ because we know all the information on that site is correct!
(Yes.. that was a joke)
Well, we already know that, it's 42.
The real question, is what is the real question for which 42 is the answer? That one is the tough one.
I suggest we build a planet, whose sole purpose is to calculate that question...
Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
It's a shame that 'Watson' works by breaking down questions into their structural components and then searching its database for relevant answers. After all, on Jeopardy all the answers are given freely.
The first time I read it
Fascism starts when the efficiency of the government becomes more important than the rights of the people.
"Hello you're speaking to Susan Blue Gene how can I help you?"
...?
Making that statement in the form of a question was appropriate there... unfortunately all the statements will be in the form of questions, with no answers in sight...
Is it plugged in?
Is it turned on?
Did you reboot?
Do you have your serial number?
Would you like me to drop this call under the guise of transferring you to someone who has no script to follow?
This issue is a bit more complicated than you think.
They claim they won't use Web data, but there's no way they can compile enough databases on their own to handle Jeopardy's general knowledge. Awards, lyrics, plots, characters... the list goes on and on and on.
WolframAlpha is a recent disappointment that's spent years collecting databases and delivers almost nothing useful yet.
I'd suggest celebrity blind-items as a fun test domain that might be manageable, eg:
Here's a way to build a simple Jeopardy player that would kick a human's ass and doesn't require 4 years of programming:
- Type entire "answer" as given on the board directly into google without quotes.
- Search the returned page for the most common word (ignoring 2 letter ones) in the titles of the pages.
- If the most common word appears more than 3 times, print "What is X?" where X is the common word.
- If no one term appears that often, don't ring in.
Voila. Instant human-crushing Jeopardy player.
If you tweak the rule set to make it a little more complicated (looking for whole phrases, etc.) and tweak the threshold for how "certain" it must be before ringing in (the appearance count), you might be unbeatable.
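For what it's worth, the steps above can be sketched in a few lines of Python. This is only a sketch under assumptions: the search itself is left out, so `titles` is a hypothetical list of result-page titles you've already fetched somehow, and `pick_response` and `min_count` are made-up names.

```python
import re
from collections import Counter


def pick_response(titles, min_count=4):
    """Return 'What is X?' for the most common title word, or None to stay silent.

    `titles` is assumed to be a list of page-title strings from a search
    (hypothetical input; no search API is called here).
    """
    words = []
    for title in titles:
        # Drop words of two letters or fewer, per the rule above.
        words.extend(w for w in re.findall(r"[a-z']+", title.lower()) if len(w) > 2)
    if not words:
        return None
    word, count = Counter(words).most_common(1)[0]
    # "More than 3 times" means at least 4 appearances.
    if count >= min_count:
        return f"What is {word}?"
    # No term is common enough: don't ring in.
    return None
```

As written, common filler words like "the" would often win the count, so a real attempt would need at least a stop-word list. `min_count` is the "certainty" knob mentioned above.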
Answer: The number of minutes it would take for gravity-powered travel between antipodes, and the angle in degrees which causes a rainbow to appear.
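The first half of that checks out on the back of an envelope. A quick sketch, assuming a uniform-density Earth, a frictionless tunnel, and rounded values for radius and surface gravity (the constants and the function name are assumptions for illustration): the trip is half a period of simple harmonic motion, pi * sqrt(R / g).

```python
import math

# Assumed round-number constants, not precise geophysical values.
EARTH_RADIUS_M = 6.371e6   # mean Earth radius in metres
SURFACE_GRAVITY = 9.81     # surface gravity in m/s^2


def antipode_travel_minutes(radius_m=EARTH_RADIUS_M, g=SURFACE_GRAVITY):
    """One-way fall time through a uniform-density sphere, in minutes."""
    # Half the period of simple harmonic motion: pi * sqrt(R / g) seconds.
    return math.pi * math.sqrt(radius_m / g) / 60


minutes = antipode_travel_minutes()
```

Under those assumptions the result comes out a little over 42 minutes.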
https://www.eff.org/https-everywhere
IBM has a history of inadvertently making terrible PR for themselves with these man-vs-machine stunts. Everyone here should remember Kasparov vs. Deep Blue. Expect IBM to win Jeopardy, and expect there to be a hailstorm of "IBM cheats" controversy after the game.
http://www.youtube.com/watch?v=cK0YOGJ58a0
What I want to know is this:
The machine will probably be able to come up with an answer (maybe not the right one) much faster than all of the human opponents. But, what confidence will it have in that answer, and will it realize that a wrong answer will cost it?
Obviously if the machine just answers immediately (and no 'confidence' factor is involved) then it could provide wrong answers very quickly, and thus just lose money on every question as it "presses the button" to answer the question before the opponents, but answers incorrectly.
So, IBM, how are you giving Watson a confidence factor? Will Watson's confidence change based on the number of incorrect/correct answers it has given in a row, based on how much time it waited to find the best, most 'confident' answer? In short, will Watson learn?
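The basic trade-off can be written down in a few lines. This is a toy model, not anything IBM has described: it assumes a right answer wins the clue's value, a wrong one loses the same amount, and the machine has some confidence estimate between 0 and 1. The function names are made up.

```python
def expected_gain(confidence, clue_value):
    """Expected winnings from ringing in: win the value if right, lose it if wrong."""
    return confidence * clue_value - (1 - confidence) * clue_value


def should_buzz(confidence, clue_value):
    """Ring in only when the expected gain is positive."""
    return expected_gain(confidence, clue_value) > 0
```

In this model ringing in pays off only when confidence is above 50%, whatever the clue is worth. It ignores the opponents, the scores, and whether the confidence number itself is any good, which is where the "will Watson learn?" question comes in.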
I do not respond to cowards. Especially anonymous ones.
(for the lazy mods who blindly modded the parent up)
The real question, is what is the real question for which 42 is the answer?
And that is why *this* computer's answer would be interesting, because it's designed for Jeopardy, where answers must be in the form of a question.
I suggest we build a planet, whose sole purpose is to calculate that question...
The Earth, in the Douglas Adams universe, was NOT a planet; it was an organic supercomputer frequently mistaken for one.
- RG>
Hey pal, this isn't a pleasantforest, so don't waste my time with pleasantries!
I'd presumed that Watson referred to the assistant of Bell who first understood electric speech. Probably a wink to the quirky and confounding associations Jeopardy delights in.
In theory, there's no difference between theory and practice. In practice, there is.
Let's see if it can Win Ben Stein's Money.
Never answer an anonymous letter. - Yogi Berra
better yet. How to reverse the direction of entropy?
Seriously, he lets himself get taken for a ride by the IDers. What a moron.
Um, Trebek doesn't compete. He's the host of the show. The summary is stupid. The computer will be playing against other human contestants.
My Freakin Blog
Wow. Seriously?
Ok, first, it was a joke, don't have a cow man.
Second, I have no clue why it was modded insightful, but whatever.
Third, no, phrasing answers in the form of a question is easy, have you ever watched Jeopardy before? That's the easiest part of the whole thing: "What is/are X" and "Who is/are X" are all you have to add to the answers to phrase them as a question. No, the interesting thing will be if IBM's machine can parse the question in the correct context and come up with the correct answer faster than a human. It's an area that computers thus far have very much sucked at.
Fourth, wth are you smoking? Do you really take D. Adams that seriously? His whole series was a big joke! Literally!
And last, yes, the Earth was a planet in Adams' universe. Look up the definition of a planet, man. Seriously. Being an organic computer would not negate its planet status. It is a planet because of its size and the fact that it is the largest object in its orbital path. Not being a planet because it was a giant computer doesn't even make sense. One of the questions we ask about whether an object is a planet is NOT "Well, is it a giant computer? Or no?"
Good night man, are you a professional buzz kill or something? Or is it just something you do as a hobby?
Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
Nasty hobbitses! It cheats, Precious!
Seriously, he lets himself get taken for a ride by the IDers. What a moron.
WHAT? Richard Nixon's speechwriter isn't a bubbling font of integrity, wisdom, and truth? MY WORLD VIEW HAS BEEN SHATTERED! MY PARADIGM HAS BEEN PERILOUSLY SHIFTED! I CAN'T GO ON!
A good poker AI might actually be more interesting than a Jeopardy AI, even if the game it played was online so as to eliminate the factor of reading body language. For Jeopardy questions that boil down to "What is the capital of X" or "In what year did X happen?", a winning AI could basically just be Google running on an internal database. In contrast, winning at poker would involve social reasoning about questions like "When this guy suddenly raises his bet, is he often bluffing, and how likely is it that he thinks I think he's bluffing?"
Build a great Jeopardy AI, and you have a slight upgrade to Google. Build a great poker AI, and you have something that can start to tackle other human social situations.
Reality check: how good are actual poker AIs? A quick search turns up claims that some are pretty good.
Revive the Constitution.
Well DUH, that won't work! On Jeopardy they give you the answer and you have to respond with the question. Geez, haven't they ever watched the show?