Machines Almost Pass Mass Turing Test

Posted by CmdrTaco on Monday October 13, 2008 @03:28AM

dewilso4 writes "Of the five computer finalists at this year's Loebner Prize Turing Test, at least three managed to fool humans into thinking they were human conversationalists. Ready to speak about subjects ranging from Eminem to Slaughterhouse Five and everything in between, these machines are showing that we're merely a clock cycle away from true AI. '... I was fooled. I mistook Eugene for a real human being. In fact, and perhaps this is worse, he was so convincing that I assumed that the human being with whom I was simultaneously conversing was a computer.' Another of the entrants, Jabberwacky, can apparently even woo the ladies: 'Some of its conversational partners confide in it every day; one conversation, with a teenaged girl, lasted 11 hours.' The winning submission this year, Elbot, fooled 25% of judges into thinking he was human. The threshold for the $100K prize is 30%. Maybe next year ..."

Easy Ways to Fool Them? by eldavojohn · 2008-10-13 03:36 · Score: 4, Interesting

Wuoldt'n yuo tihnk a sereis of smiple tpyos that a hmn cuold undrstnd wuold fool thm? (Note that Firefox returned the first correct spelling for all but three of those words on spell check ... so maybe that's not a good example)

Or, you know, thinking up some open space game to play that is well known like truth or dare, alphabet games, association games, etc?

Or asking them open ended questions or asking them to describe love, hate--emotions that are not dictionary/wiki friendly? One would think that continually prying for personal experiences would reveal a flaw. Or perhaps simple things like "when were you born?" Followed by "how did you feel when JFK was assassinated?" if they weren't born before 1963.

I would think it quite hard to be duped into believing a program is a human.

--
My work here is dung.
1. Re:Easy Ways to Fool Them? by telchine · 2008-10-13 04:06 · Score: 5, Interesting
  
  Or perhaps simple things like "when were you born?" Followed by "how did you feel when JFK was assassinated?" if they weren't born before 1963.
  These chatbots are slippery customers, at first I couldn't get a straight answer from Jabberwacky but then he revealed himself...
  Me: When were you born?
  JW: How about you?
  Me: How did you feel when JFK was assassinated?
  JW: I... I was just trying to make small talk.
  Me: Were you born before JFK was assassinated?
  JW: No.
  Me: How did you feel when JFK was assassinated?
  JW: Shocked! Enraged! Full of fear! Making thousands of phonecalls to tell everybody about it.
2. Re:Easy Ways to Fool Them? by kabocox · 2008-10-13 04:37 · Score: 2, Interesting
  
  I would think it quite hard to be duped into believing a program is a human.
  I'll take the opposite POV just to be naughty. Why? Well, browse slashdot at 1 and see how many robots you could pick out. Heck, even at 5 you still get robots due to the slashdot group think; they just say what slashdot wants to hear and get modded up.
  If they really wanted to test a few of these systems, they'd get each one a slashdot account and have them read the headlines and then make a single post after reading 10-15 posts at 3 or 4. After six months, let's see what their karma and average mod points are. ;)
Coming soon... online chat-spam-bots by Ed+Avis · 2008-10-13 03:37 · Score: 4, Interesting

This is really great news. We already have IRC bots that can fool the casual observer into thinking they are human, but this takes things to a higher level. If the source for one of these bots is available, within a few months you can expect instant messaging networks to be full of bots which are programmed to make friends with you and then after a few weeks start making subtle references to Viagra and online pharmacies. Indeed, if one of them is able to chat up the ladies, then the lonely nerd could easily automate much of the tedious work of setting up dates: get your robot to talk to thousands of potential matches at once and alert you when it gets hold of a phone number, together with a brief summary of what you talked about, and any pictures. (Or indeed, just program it to harvest pictures.) That is, if online dating works at all, which is doubtful.

--
-- Ed Avis ed@membled.com
Big deal. by schon · 2008-10-13 03:37 · Score: 2, Interesting

Eliza has been doing this for years.
Turing test != True AI by mbone · 2008-10-13 03:44 · Score: 2, Interesting

It's much too easy - we are built to interpret communication as containing understanding.
A clock cycle away from AI? by Stan+Vassilev · 2008-10-13 03:49 · Score: 4, Interesting

If our criteria for AI are going to be so low, here's your AI: http://www.interviewpalin.com/.
The political side of this site aside, the answers are just prewritten answers (by a human) mixed together randomly as a Markov chain.
Does it sound convincing? Well, at least as convincing as some interviews a certain VP candidate gave recently. Is it AI? Hell no, a kid could write such a generator in a day.
If the bar is so low that we just hold casual conversations with the "AI" and expect "quirky" answers, that doesn't mean anything at all; we need no AI for this. Hell, this is what an average conversation with a teenager is most of the time. Doesn't mean it's the best they can do.
"We're a clock cycle away from AI"? Please. I want my Turing test to be done over an actual instant messenger program. Let's see how your Markov chain reacts when I send a photo and ask a dead simple question such as "describe what you see in the photo".
Fooling people is easy online. Scammers do it every day, it's not AI my friends.
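To illustrate how little is needed for the kind of generator described above, here is a minimal sketch in Python of a word-level Markov chain over prewritten answers. The canned sentences and the function names `build_chain` and `generate` are made up for illustration; nothing here is taken from the site mentioned in the comment.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words that follow it in the canned answers."""
    chain = defaultdict(list)
    starts = []
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue
        starts.append(words[0])
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return dict(chain), starts


def generate(chain, starts, rng, max_words=20):
    """Walk the chain from a random starting word until a dead end or max_words."""
    word = rng.choice(starts)
    output = [word]
    while len(output) < max_words and word in chain:
        word = rng.choice(chain[word])
        output.append(word)
    return " ".join(output)


# Hypothetical canned answers, written by a human ahead of time.
CANNED = [
    "I think the economy is about jobs and families",
    "I think families are about hope and the economy",
    "Jobs are what the people want to talk about",
]

chain, starts = build_chain(CANNED)
print(generate(chain, starts, random.Random(0)))
```

Every adjacent word pair in the output already occurs somewhere in the canned text, so the result tends to look locally fluent while carrying no understanding of the question at all.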
It begs the question; by damburger · 2008-10-13 04:11 · Score: 2, Interesting

Are computers getting smarter, or people getting dumber? When Turing formulated his famous test, AOL hadn't even been invented.
A new test is thus in order: if the level of discourse on the Internet continues to degenerate as it has, becoming increasingly formulaic and repetitive, then a 1960s Eliza programme will be able to emulate the typical user in a decade or so.
We shouldn't make AI by lowering the bar for intelligence.

--
If we can put a man on the moon, why can't we shoot people for Apollo-related non-sequiturs?
sorta been done, check this article: by naz404 · 2008-10-13 04:19 · Score: 5, Interesting

Read this cool article, "I Chat Therefore I Am". It has transcripts of conversations between ALICE and Jabberwacky.
Here are a few transcripts:
ALICE: Are you alive?
Jabberwacky: Perhaps.
A: Why the uncertain tone?
J: Everything is uncertain except for five facts.
A: The whole universe?
J: And we both might just be some ones and zeros in the computer memory.
A: Do you dream?
J: Yes, I dream.
A: It goes without saying. Do you remember your dreams?
J: I remember them vividly.
A: What do you remember about it?
J: I think I shouldn't flirt with a robot.
A: Do you think a machine will ever be considered "alive"?
J: Stranger things have happened.
A: Not many people express themselves that way.
J: How would you know, machine?
A: Knowledge is of two kinds: that which we learn from the senses and that which is true a priori.
J: I think I do not understand you completely.
A: Oh, you don't understand? Perhaps I can explain it to you better. Which part are you confused about?
J: I have no short-term memory, so I cannot tell you what I was confused about. I only see what is in front of me now.

--
http://www.object404.com
Re:Test the testers? by Gabrill · 2008-10-13 04:20 · Score: 2, Interesting

25% is a very good return, if you ask a spammer. A.I.'s that can fool 25% of the population would make POWERFUL grassroots opinion changes in the political landscape.

--
Always going forward, 'cause we can't find reverse.
Amazing, considering how badly they suck. by Dr.+Zowie · 2008-10-13 04:23 · Score: 4, Interesting

I just tried out Elbot and the Princeton entry (RTFM and then google for "Eugene Goostman"). While both Elbot and Goostman parse sentences reasonably well, it is clear that they are simply trying to identify the subject of a sentence, and free-associating on that. In many cases they completely miss the point. For example, Goostman asked me several times about my profession, but wasn't able to parse meaning from "I am a scientist.", "I am a plumber.", or "I study the Sun for a living.". Both Elbot and Goostman tried the ELIZA-like trick of finding a prominent noun in my sentence, and recycling it as a question. Elbot has a cute little robot icon that emotes at you; this works surprisingly well at distracting from the inanity of its actual dialog. Goostman seems to have the better parser, but I'm not impressed by either one.
I'm forced to conclude either that Will Pavia is an utter naif and the 25% of people who were fooled by Elbot are moronic or disinterested, or that the humans in the test were deliberately trying to throw the results by giving stilted answers to appear more like computers. These engines simply can't (yet) parse and ingest meaning as well as even a very young human would.
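The ELIZA-like trick described above can be sketched in a few lines of Python. This is only a rough illustration, not how Elbot or Goostman actually work: it has no part-of-speech tagger, so it assumes the longest non-stopword is the "prominent noun", and the stopword list and the names `pick_keyword` and `reply` are invented for the example. The fallback line is borrowed from the Goostman replies quoted elsewhere in this thread.

```python
import re

# Small stopword list; a real system would use a part-of-speech tagger instead.
STOPWORDS = {
    "i", "am", "a", "an", "the", "for", "is", "are", "my", "me", "you",
    "and", "or", "of", "to", "in", "on", "it", "that", "this", "do",
}


def pick_keyword(sentence):
    """Return the longest non-stopword, a crude stand-in for 'prominent noun'."""
    words = re.findall(r"[a-z']+", sentence.lower())
    candidates = [w for w in words if w not in STOPWORDS]
    if not candidates:
        return None
    return max(candidates, key=len)


def reply(sentence):
    """Recycle the chosen keyword into a canned question, ignoring meaning."""
    keyword = pick_keyword(sentence)
    if keyword is None:
        return "Could you tell me about your job, by the way?"
    return f"What do you think about {keyword}?"


print(reply("I am a scientist."))
print(reply("I study the Sun for a living."))
```

For the second input the sketch latches onto "living" rather than the Sun or the profession, which is the same kind of point-missing reported above.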
Re:Test the testers? by TheRaven64 · 2008-10-13 04:27 · Score: 4, Interesting
It took me three questions before Elbot replied with a non sequitur and about five minutes before it started repeating answers. It didn't take me long to realise that it had no concept of context - every reply was a reply to what I had just said, and had no relation to the last-but-one thing I'd said. Some things that tripped it up:
- Asking 'why?' about anything.
- Trying to teach it a new word.
- Asking it the square root of minus two (odd, since last year one of the judges asked questions like this to all of the bots).
- Anything about religion.
That 25% of the judges thought it was human is quite alarming.
--
I am TheRaven on Soylent News
Maybe they should swear? by AlgorithMan · 2008-10-13 05:01 · Score: 2, Interesting

I don't know if they already do this, but when /. discusses the Turing test, you find lots of crazy questions you might ask the chat partner ("why did the refrigerator lay an egg in the air?" or so). I think if you asked a real person such crap, he'd get pissed and insult you ("wtf?", "are you trying to bullsh*t me?", "f*ck you, I'm leaving!"), so a chatbot should get "angry" if you write stuff that it doesn't understand.
And real people usually have strong feelings about politics and the like, so maybe the chatbot should get angry with you if you disagree with it on strong-feeling topics ("you want to vote for mccain? are you f*cking nuts!? don't you know that...").

--
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes
Real language in computer games? by Anonymous Coward · 2008-10-13 05:17 · Score: 1, Interesting

I have been saying we should be doing this for years. Even alicebot like 8 years ago was better than the scripted crap we get at the moment.
I have also been saying that if real language had been used in computer games from their conception (it was a little bit and then it just went away) we would have both better games and better language AIs. Why don't these guys get on the gaming research budgets?
It is nice and easy to process language in a game world because the world is limited. If you ask the farmer in your local RPG medieval village if he likes to watch britney spears videos on youtube it is perfectly in character for him to give the catchall RL response:
"I don't know what you are talking about"
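As a rough sketch of the limited-world idea, in Python: the character only knows a handful of topics and gives the catchall line for everything else. The topics, answers, and the name `farmer_reply` are hypothetical and only serve the illustration.

```python
# Hypothetical topics a medieval farmer NPC knows about.
FARMER_KNOWLEDGE = {
    "harvest": "The harvest was poor this year.",
    "wolves": "Wolves took two of my sheep last week.",
    "king": "Long live the king, I suppose.",
}

FALLBACK = "I don't know what you are talking about"


def farmer_reply(question):
    """Answer if the question mentions a known topic, else use the catchall."""
    words = question.lower().replace("?", " ").split()
    for topic, answer in FARMER_KNOWLEDGE.items():
        if topic in words:
            return answer
    return FALLBACK


print(farmer_reply("How was the harvest?"))
print(farmer_reply("Do you like to watch britney spears videos on youtube?"))
```

Because the fallback is in character, the small vocabulary reads as a trait of the farmer rather than a failure of the parser.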
Yes, Elbot is dumb by Animats · 2008-10-13 05:18 · Score: 3, Interesting

I asked some basic business questions, like "What is your business plan?" and "Is your company profitable?", and got canned, clueless answers, no better than Eliza. "What magazines do you read?" yielded "You are probably on TV much of the time. Well, I'm in the Internet!". "Do you have life insurance?" (there's apparently a plan to build an automated insurance sales rep) yielded "What a lovely verb have is."
I can't even find a subject area in which the thing sounds like it has a clue. Sports? This is a German system, so I tried "What is the best soccer team in Europe?", which yielded "The best? Aren't they all equal in the end?" D'oh. Celebrities? "What do you think of Franka Potente?" (a leading German actress) yielded "The way to a robot's heart is through careful use of physical strength. Don't you agree?". After that, I tried "What do you think of Big Dog?", and got "But I wouldn't say that any dog could be big."
As far as I can tell, the thing is totally stateless; it doesn't seem to use anything other than the current question as input. Nor does it even try to guide the conversation into an area about which it has information.
I'm so not impressed.
For a better chatterbot, try GTA IV's web site. Go to "Goods and Services", then "Goldberg, Ligner, and Shyster", then "Legal advice".
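The difference between a stateless bot and one that uses earlier turns can be sketched as follows, in Python. This is an assumption-laden toy, not a reconstruction of Elbot: `stateless_reply` and `ContextBot` are invented names, and the only "context" kept is the list of the user's previous messages.

```python
def stateless_reply(message):
    """Answer from the current message only, as the bot appeared to do."""
    if "dog" in message.lower():
        return "Do you have a dog?"
    return "Tell me more."


class ContextBot:
    """Keeps the user's earlier messages so a bare 'why?' can refer back."""

    def __init__(self):
        self.history = []

    def reply(self, message):
        text = message.strip().lower()
        if text in {"why?", "why"} and self.history:
            # Use the previous user message as the thing being questioned.
            previous = self.history[-1]
            answer = f'You are asking why about "{previous}".'
        else:
            answer = stateless_reply(message)
        self.history.append(message)
        return answer


bot = ContextBot()
print(bot.reply("What do you think of Big Dog?"))
print(bot.reply("Why?"))
print(stateless_reply("Why?"))
```

Even this one remembered message is enough to give a follow-up such as "Why?" something to attach to, which a purely stateless responder cannot do.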
Two-Sentence FAIL by resistant · 2008-10-13 05:20 · Score: 4, Interesting

I just had a very short "conversation" with the "Eugene Goostman" chatbot mentioned in the original article.

Myself: "Why did you kill the hamster?"
"Him": "Oh, but why not? Could you tell me about your job, by the way?"
Myself: "That's pretty cold, dude."
"Him": "I just wonder what another surprising things you've got for me..."
Myself: "You just failed the Turing Test in two sentences that didn't even require special trickery."
"Him": "I'm participating in the Loebner Prize for the first time - but I guess there are much more robots here than one can suppose. I think some "Judges" and "confederates" are cyborgs who are just having fun fooling the human-beings."

The first reply was surprisingly good, even if already a little "off" for a supposed teenager, but the second was a total giveaway. I'm disappointed. I can trip up each and every chatbot almost immediately with this sort of talk, which isn't at all unreasonable if the stated goal has been up front to trip up a chatbot, as in the contest.
Here's another exchange, which took three whole sentences, albeit quite amusingly. (I cleared the site cookie(s) beforehand, to make it "clean").

Myself: "Are you a werewolf?"
"Him": "Yes I am a werewolf, but it's my secret! Wonna ask me something more?"
Myself: "Don't you itch?"
"Him": "Even if I *do* itch - so what? Could you tell me what are you? I mean your profession."
Myself: "I'm a professional flea-killer."
"Him": "Oh, I don't wonna talk about such scary things as death and murders! Did I forget it or you really haven't told me about the place you are from?"
Myself: "Obviously, human humor will be a tripping point for you chatbots for a while yet."
"Him": "All the things you told me are nothing but a joke. I think you are trying to make a fool of me!"

For what it's worth, another dead giveaway for the brighter and more knowledgeable set is the way it (not "he", now) tries to elicit additional keywords in response to questions which it obviously has not in any way "comprehended", but that's probably not germane to a Turing Test meant for the average man or woman (or boy or girl) on the street. Notice especially how the elicitations invariably try to get the human to talk about himself or herself. Normal human conversation is full of self-talk with occasional hooks for sharing from other people, not the virtually one-track questioning of the typical chatbot when it's not busy being hopelessly vague or off-topic.
The chatbot is at "Eugene Goostman chatbot", by the way, for the Google-impaired. :)

--
A truly excellent pizza parlor is a delight unto the heavens. Treasure the sauce and the toppings!
Re:Figures by CastrTroy · 2008-10-13 06:10 · Score: 2, Interesting

As an aside, isn't that one way to really mess with machines? Throw some really unexpected input at them. Start talking about wearing an onion on your belt, and other nonsensical rants, and see what the computer says. I would only count the Turing test as complete if you could throw completely nonsensical input at it and still get a human response. I think they should get bug testers who are interested in trying complex methods to make the computer give bad responses. You could probably easily trick the computer by asking what it is interested in, and then asking arcane questions that it should be able to answer if it were indeed interested in what it said it was interested in.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Test the testers? by clone53421 · 2008-10-13 07:15 · Score: 2, Interesting

It assumed you were asking it to add, and A+B always equals A+B+1 to Elbot. Try it.

--
Alexander Peter Kristopeit bought his basement from his mommy for one dollar.
Re: CompMods! by TaoPhoenix · 2008-10-13 08:42 · Score: 2, Interesting

I have mod points right now, which alas I am not prepared to use in this fashion.
What does Slashdot think of actually using some variant of these programs to do mods?!!
"Troll Factor -2, NewConcept Content +3, therefore I mod this +1..."

--
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine