Clever Clues Clobber Crossword Computer
Hugh Pickens writes "Steve Lohr reports that an impressive crossword-solving computer program called Dr. Fill matched its digital wits against 600 of the nation's best human crossword-solvers, finishing only 141st at the American Crossword Puzzle Tournament in New York. 'I wish it had done better,' says Dr. Matthew Ginsberg, the creator of Dr. Fill and an expert in artificial intelligence. Dr. Fill typically thrives on conventional crosswords, even ones with arcane clues and answers; it solved one of the most difficult puzzles at the tournament perfectly. But the computer does poorly with clever clues based on puns or jokes, because humans and machines solve the crosswords very differently. Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer. The computer program is literal minded, and tends to struggle on puzzles with humor, and puzzles with unusual themes or letter arrangements. Take this clue from a 2010 puzzle in The Times: Apollo 11 and 12 (180 degrees). The answer is SNOISSIWNOOW, seemingly gibberish. A clever human could eventually figure out that those letters, when rotated 180 degrees, spell MOON MISSIONS. Humans get the joke, while a literal-minded computer does not. 'Occasionally, Dr. Fill just doesn't get it,' says Ginsberg. 'That's my nightmare.'"
Can it manage ironic clues?
It's not Dr Fill's fault for not getting it. Clearly it's Dr Ginsberg who is not getting it.
If I see someone doing a crossword I usually say "I was stuck on a crossword the other day - the clue was 'very busy postman'". Eventually (sometimes it takes a while) they ask "how many letters" at which point you can say "hundreds!"
I'm such a funny guy...
Oh - another one is to say "seven up is lemonade"...
I'd like to congratulate the submitter on a clever title. That does not happen very often here.
I do not believe in karma. "Funny"=-6. Do good and forbid evil. Yours, Oft-Offtopic Flamebaiting Troll.
...film at 11. And extended debugging session afterwards.
Non-Linux Penguins ?
Would the Apollo example really trip up a decently-written program though? I mean, my first thought was "Well, what if it had a fallback routine where it tries anagrams of possible answers?" so I have to imagine someone smarter than me has thought of that. I guess there's some limitation I'm not seeing...
On second thought...what am I still doing awake at 5:48am commenting on a post about crossword-solving computers?!
Friend: "The NIC is misconfigured..." Me: "No prob, I'll just telnet in and fix it." *Silence*
All decent crosswords in the UK tend to be of the cryptic kind, rather than just needing a thesaurus most of the time. Writing answers backwards wouldn't be allowed, though, as the answer has to be an actual word. Here's one that a computer might struggle with.... V? (6,2,7) Answer: Centre of Gravity
Different cultures apparently have different rules for crossword puzzles. AFAIK, Finnish crossword puzzles would require that each answer must be a valid, independent word (in singular or plural). Moreover, in Finnish crossword puzzles, half of the clues are graphical.
the computer needs help from Joe Piscopo
I'd like to see if Dr Fill manages these two:
HIJKLMNO (5)
___ (2, 3, 4, 1, 4)
Ydco co
Not being able to guess a few words might not be a problem: skip them and solve the other ones. Once there are enough letters filled in, a computer can easily look up the words that fit, and if there is more than one, even use a nonlinear approach. Even without any clues, a few words can't be that hard to brute-force.
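For what it's worth, the "look up the available words" step is easy to sketch. This is only a toy illustration, not how Dr. Fill works: the word list, the `?` wildcard convention, and the function names are all made up for the example.

```python
# Hypothetical mini word list; a real solver would load a large dictionary.
WORDS = ["moon", "mood", "noon", "mission", "missing", "apollo"]

def fits(pattern, word):
    """True if word has the pattern's length and agrees with every known letter.

    '?' in the pattern marks a square that has not been filled in yet.
    """
    return len(pattern) == len(word) and all(
        p == "?" or p == c for p, c in zip(pattern, word)
    )

def candidates(pattern, words=WORDS):
    """Return every word in the list that fits the partially filled pattern."""
    return [w for w in words if fits(pattern.lower(), w)]

print(candidates("mo??"))   # letters from crossing answers narrow the field
print(candidates("?oon"))
```

When more than one candidate survives, the solver still has to pick between them using the clue or further crossing letters, which is where the hard part starts.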
Yes, that is right, and the computer will randomly change the crossword puzzle. The writer makes a very good point: we humans recognize patterns and make decisions based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer without even knowing what they are doing.
Unfortunately for Dr Fill's creator, the problem of how to get the program to work with such unorthodox solutions is the same as getting it to think like a person. At a certain point, all AI questions become the same AI question: this is the very essence of Turing horizon, and all such efforts converge there.
The program he wants to write is, sadly, doomed, as it will be impossible until such time as our species generates a true artificial consciousness with human intelligence, at which point the problem will be trivial - we will have much larger concerns that day.
The crossword puzzle guy needs lessons from Watson, who clobbered several Jeopardy human champions. That show has clever categories and clues. Watson probably had more impressive computing power, but I doubt that was the issue. The Watson designers clearly had a better grasp of natural language, including humor-filled and storied language.
There is nothing magical about the example given. 180 degrees correlates with rotation. Apollo correlates to Moon missions. Rotation correlates to crew rotation, reversing numbers, rotating around the center... bingo. That fits. Humorous answers follow the exact same rules as any other answers.
Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer.
what's different about that? I've often said, "The more I learn about AI, the more I think it lacks any intelligence at all. The more I learn about psychology, the more I believe that humans think just like an AI."
Humans are also determining the most statistically probable answer. They just have a better algorithm for factoring humor into those statistics.
No, they're correct - you're just thinking of rotation on the wrong axis. To see how it works, try this: Take the word SNOISSIWNOOW and write it on a piece of paper. Rotate the paper 180 degrees so the lower right becomes the upper left. The part of the letters that were the bottom are now on the top - hence "W" becomes an "M" and the result is MOONMISSIONS.
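The paper trick above can be written as a few lines of Python. This is a rough sketch under my own assumptions: the table only covers capital letters that still look like a letter when turned upside down in a plain sans-serif font, and the names `ROTATE_180` and `rotate_180` are invented for the example.

```python
# Capital letters that read as a letter after a 180-degree rotation.
# Assumption: a plain sans-serif font; M and W swap, the rest map to themselves.
ROTATE_180 = {
    "H": "H", "I": "I", "N": "N", "O": "O", "S": "S",
    "X": "X", "Z": "Z", "M": "W", "W": "M",
}

def rotate_180(text):
    """Return text as it reads after turning the page upside down.

    Rotating the page reverses the letter order and flips each letter.
    Returns None if any letter has no rotated counterpart in the table.
    """
    text = text.upper()
    if any(ch not in ROTATE_180 for ch in text):
        return None
    return "".join(ROTATE_180[ch] for ch in reversed(text))

print(rotate_180("SNOISSIWNOOW"))  # MOONMISSIONS
```

So the transformation itself is trivial for a machine. The hard part is realizing from "(180 degrees)" that this is the transformation to try.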
*HSOOM*
Seems like you're a failed AI.
So Doctor Fill has the same ability to comprehend humour as Doctor Phil?
"That's the way to do it" - Punch
Blaine the Mono was unavailable for comment.
The first puzzle it tanked on (#2 in the tournament) had every other across line reading backwards. Thus, the answer to "Title in a Joel Chandler Harris story" was RERB, not BRER. I would expect a computer to fail at this.
The other puzzle it tanked on (#5) had long words containing the trigram ANT split into three parts, with the ANT portion connecting the beginning and the end by running diagonally through the grid. I'd expect the computer to fail on that one too.
However, it did very well on the straightforward, gimmick-less puzzles, cruising through the final puzzle, which was deemed the hardest by humans.
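To illustrate the backwards-row gimmick from puzzle #2: once a solver suspects it, checking an answer in both directions against the crossing letters is cheap. The sketch below is purely hypothetical and says nothing about Dr. Fill's actual code; the `?` wildcard and both function names are assumptions made for the example.

```python
def fits(pattern, entry):
    """True if entry matches the known crossing letters ('?' = unknown square)."""
    return len(pattern) == len(entry) and all(
        p == "?" or p == c for p, c in zip(pattern, entry)
    )

def fit_either_direction(pattern, answer):
    """Return the grid entry for answer, written forwards or backwards,
    that agrees with the crossing letters; None if neither direction fits."""
    for entry in (answer, answer[::-1]):
        if fits(pattern, entry):
            return entry
    return None

# Crossing letters say the row starts with R and ends with B.
print(fit_either_direction("R??B", "BRER"))  # RERB
```

The catch is that nothing in the clue tells the program to consider reversed entries in the first place, so a fallback like this only helps if someone anticipated the gimmick.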
Isn't having a gibberish answer in a crossword puzzle like making up your own words in Scrabble? Doesn't the creation of a crossword puzzle have any rules? No wonder I often do poorly with them; I had no idea that they could be making up nonsense words.
Gamingmuseum.com: Give your 3D accelerator a rest.
...were supposed to be composed of REAL words.
WTF is "SNOISSIWNOOW"??
so it is just plain stupid...
I can't blame the computer for not doing well on these; a lot of crossword puzzles are a puzzle of "guess what the creator was thinking", and not a puzzle of words and language. Quite frankly, I'm not interested in guessing what someone else happens to be thinking when they write down a clue like "blue, red, and big"; I find that fundamentally uninteresting and of no long term value.
I have the same problem with many Mensa puzzles. A lot of them I can do, but puzzles that require deep and specific information from an extremely narrow field are really not useful tests of intelligence. Just as the "SNOISSIWNOOW" question above requires deep and specific information about an extremely narrow field (the narrow field being "how certain subsets of human minds think"), so too do questions such as "guess the next number in the sequence: 3, 5, 205782654, 6, 308" (which also requires knowledge about how certain human minds think). Neither answer is derivable, and both must be guessed via trial and error from models that few people will have.
The most common response to this viewpoint has been along the lines of "ha ha, you just can't think outside the box". In reality, I'm actually pretty damned good at thinking outside the box. What I'm not particularly good at is thinking inside someone else's box, because if I actually need to know, I can simply talk to them and ask them. It's far more efficient.
Alter Aeon Multiclass MUD - http://www.alteraeon.com