Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?
beaker_72 (1845996) writes "On Sunday we saw a story that the Turing Test had finally been passed. The same story was picked up by most of the mainstream media and reported all over the place over the weekend and yesterday. However, today we see an article in TechDirt telling us that in fact the original press release was just a load of hype. So who's right? Have researchers at a well-established university managed to beat this test for the first time, or should we believe TechDirt, who have pointed out some aspects of the story which, if true, are pretty damning?"
Kevin Warwick gives the bot a thumbs up, but the TechDirt piece takes heavy issue with Warwick himself on this front.
It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.
Why do you ask if the Turing Test was legitimately beaten or just cleverly tricked?
But seriously, yes, it was 'legitimately beaten', just like it's been 'legitimately beaten' in times past, going back to ELIZA in the 60s.
Was it MEANINGFULLY beaten is the question to ask, and no, no it wasn't. Until the computer can actually 'understand' context to a meaningful degree, the answer to that will continue to be no.
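For anyone who hasn't seen how little it takes, here is a toy keyword-reflection responder in the spirit of ELIZA. It is only a sketch: the function name eliza_reply and every pattern and canned reply are made up for illustration and are not Weizenbaum's actual script.

```sh
# Toy ELIZA-style responder (hypothetical patterns, not the original script).
# It matches a few surface patterns and reflects the user's words back;
# nothing here models meaning or context.
eliza_reply() {
  case "$1" in
    "I am "*)   printf 'Why do you say you are %s?\n' "${1#I am }" ;;
    "I want "*) printf 'What would it mean to you to get %s?\n' "${1#I want }" ;;
    *"?")       printf 'Why do you ask?\n' ;;
    *)          printf 'Please tell me more.\n' ;;
  esac
}
```

Calling eliza_reply 'I am tired' echoes the phrase back as a question, which is the whole trick: pattern matching with no understanding behind it.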
I want to talk to these AIs myself! Give me a webpage or IRC chatroom to interact with it directly. It sounds fascinating even if it's only 'close' to passing the test.
Similarly, the computer must convince the judge it is a human with its full mental capacity, not a child, nor a mentally defective person, nor someone in a coma.
The test is whether a computer can, in an extended conversation, fool a competent human into thinking it is a competent human being speaking the same language, at least 50% of the time.
excitingthingstodo.blogspot.com
If only there were a method, where people could let others know about their findings, in enough detail so that the results could be reproduced. Just for fun, we could call this method "the scientific method."
Oh and hey, why don't we create a 'magazine,' where 'scientists' can submit their findings, that way they will be easy to find. We can call them 'scientific journals.' Extra benefit, the journals can make an attempt to filter out stuff that's not original.
Oh wait. Why didn't these guys submit to a journal? Probably because it adds nothing to what Joseph Weizenbaum did back in the 60s.
"First they came for the slanderers and i said nothing."
For those who haven't read the article (I read one yesterday and assume the details are the same): The program claimed to be a Ukrainian boy of 13 years old, a non-native English speaker, writing in English to English speakers. This allowed the program to avoid the problem of people using language to make judgements about whether the responses were from a person or a program. Also, since the program was claiming to be a boy instead of an adult, it greatly reduced what could be expected of the responses, again greatly simplifying the program's parameters and reducing what the testers could use to test. So basically, the Turing Test is supposed to test whether a person can tell if the program acts like a person, but here the test was rewritten to see if the program acted like a child from a different culture who was not supposed to be speaking his native language. Many are apparently crying foul.
I personally agree.
Maybe we need to formalize the Turing test further to give it more rigor?
That or come up with a whole new test ... I don't know, maybe call it the Void Kampf test.
It's a Turing test if I know one of the candidates is, in fact, an AI. If you tell me it's a 13 year old, you're cheating.
Lost at C:>. Found at C.
...why didn't /. just wait for the skeptical posts calling the original news articles bullshit in the first place?
Seriously, weeding out the garbage posts, 3/4ths of the comments were calling bullshit when they saw it, and 1/4th were making pointless references to Skynet and HAL.
"But remember, most lynch mobs aren't this nice." (H.Simpson)
-- Joe
That's the whole point. To cleverly trick the tester into believing something that isn't true. The test can't be beaten without clever tricking.
... was not actually performed in the research. End of story.
It's also worth mentioning that a lot of times, the way these tests are set up (with a human and a computer and the judge has to decide which), what really happens is the human manages to convince the judge that it's a computer, not the other way around.
"First they came for the slanderers and i said nothing."
The first time I saw ELIZA in action, I realized that the Turing test is basically meaningless, as it fails on two fronts. We are not good judges for it, as we are hard-wired to assume intelligence behind communications, and Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.
This is not to fault Turing's work, as you have to start somewhere, but, really, after all of these years we should have a better test for intelligence.
Maybe you design an obstacle course that requires the leg to function in a range of everyday scenarios and that tests its endurance, comfort, and flexibility.
These chat bots would be the equivalent of calling a helicopter a "prosthetic leg" and flying over the course.
In both cases, they're avoiding the meat of the challenge. Yes, arriving at the finish line is the goal, but it's how you got there that is the interesting part. That's not to say these are useless projects - they're fun, and there's some legitimately interesting stuff there. But it'll be a very different beast that truly passes the Turing Test.
Let's not stir that bag of worms...
A 13 year old boy was also able to pass the Turing test, convincing a panel of middle-aged judges that he was an actual person.
Kevin Warwick is a narcissistic, publicity seeking shitcock.
Make him speak Spanish and make sure all the judges only speak Norwegian. See, you can cheat it. But anyway, they should be disqualified for the age. 13 year olds are predictable, quite dumb, and easy to imitate. To be more scientific, between approx age 10 and 18, your brain doubles in its overall processing power and in the middle, your frontal lobe can't process logical decisions very well. That's quite a cover story for an AI to pretend to be a human.
I created a chat bot that emulates a 65-year-old grocery store clerk who speaks perfect English. Here is a sample transcript:
Tester: Hello, and welcome to the Turing test!
Bot: Hey, gimme one sec. I gotta pee really bad. BRB.
.
.
.
Tester: You back yet?
.
.
.
Tester: Hello?
.
.
.
As the saying goes "haters gonna hate", but really, it's a big accomplishment. To pass the Turing test, you'd need to choose some "identity" for your AI. The idea of using a kid with limited cognitive skills was clever, but not cheating -- but it's also not simulating a professor. If there is truly intelligent AI in the future, it's reasonable to expect its evolution to start with easier people to emulate before trying harder.
-- Political fascism requires a Fuhrer.
You can fool all the people some of the time, and some of the people all the time, but you haven't /really/ passed the Turing test until you can fool all of the people all of the time.
No really... Eliza fooled some of the people back in 1966. There is nothing really new to see here, move right along.
Let's see the bot first win the Loebner prize. This "test" it won seemed a little focused on free advertising for the research group.
The Loebner prize, despite not being a true Turing test, is a long-established competition with clear rules and a clear evaluation process.
"Kevin Warwick gives the bot a thumbs up"
That's a point *against*, not a point in favour.
Adam's Law of British Technology Self-Publicists: if the name "Sharkey" is attached, be suspicious. If the name "Warwick" is attached, be very suspicious. If both "Sharkey" and "Warwick" are attached, run like hell.
The Turing Test is the ONLY test we have for artificial intelligence. Every other year we get some research team or the other claiming that their system is as intelligent as a dog, and now it's just a matter of scaling. The Turing Test is analogous to the test the Patent Office has for perpetual motion machines - if you can't pass the test, then you're not there yet. Simple, and easy to measure.
Support microSD: in a post 9/11 world, it is unwise to carry your data on media that you cannot comfortably swallow.
Would be to get two bots to talk to each other and see where the conversation goes after two minutes -- my guess is that all the code is biased towards tricking actual people in a one-on-one "conversation".
But when a machine converses with another machine, all that code no longer has an effect, and pretty soon the two machines will be essentially babbling *at* each other without actually having a conversation. An outside observer will immediately recognize that both of them are machines.
If telephones are outlawed, then only outlaws will have telephones.
As has been repeatedly emphasized, the Turing Test itself is rather subjective. Most schools measure language ability by way of standardized reading comprehension tests, ranging in difficulty according to grade level. I suggest that all natural language programs be similarly graded, using the exact same tests given to students, with the same criteria for passing. Any bot that can pass the college entrance exam (SAT, any version; ACT; or other similar exam) with a perfect score may be considered intelligent.
The trick is to know how to accurately measure what you want to get.
If we want a test that validates human-like behavior in an AI, then the test criteria must rigorously define what that condition is. Tricking a single person in a subjective test is terribly skewed.
Warning: Teh poster of this messaeg is lysdexic
Lt Saavik: [to Kirk] On the test, sir. Will you tell me what you did? I would really like to know.
Dr Leonard McCoy: Lieutenant, you are looking at the only Starfleet cadet who ever beat the "No-Win" scenario.
Saavik: How?
James Kirk: I reprogrammed the simulation so that it was possible to save the ship.
Saavik: What?!
David: He cheated.
Kirk: Changed the conditions of the test. Got a commendation for original thinking. I don't like to lose.
If telephones are outlawed, then only outlaws will have telephones.
So it's true ... you are a chatbot!
Sent from my ASR33 using ASCII
Have a bunch of human judges and some instances of the bot in question all participating in a chat together, or randomly paired together for a while and then re-paired, so that humans are judging humans as well as bots, and have no idea which is which.
If a human is frequently judged as a bot by other humans, that human's judgements are de-weighted, because apparently they're too stupid to be distinguished from an AI themselves, so why should we trust their ability to distinguish other humans from AIs?
Although, I wonder if exceptionally intelligent humans with perfect spelling and grammar, a wide range of knowledge, and high typing speed, might be mis-judged as AIs too, for being "too good". Some hunt-and-pecker who can't tell their/they're/there apart might see someone who gives an intelligent response in complete, grammatically-correct sentences in half a minute as inhuman.
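A rough sketch of how the de-weighting could work, assuming each judge already has a "trust" score equal to the fraction of other humans who judged that judge to be human. The function name weighted_verdict, the input format, and all the numbers below are made up for illustration; nothing like this was part of the actual event.

```sh
# weighted_verdict: reads lines of "JUDGE TRUST VERDICT" on stdin, where
# TRUST is a hypothetical 0..1 score (how often peers judged this judge to
# be human) and VERDICT is "human" or "bot". Each vote counts in proportion
# to the judge's trust; the side with the larger total is printed.
weighted_verdict() {
  awk '
    $3 == "human" { human += $2 }
    $3 == "bot"   { bot   += $2 }
    END {
      if (human + 0 > bot + 0) print "human"
      else print "bot"
    }
  '
}
```

With made-up votes such as alice 0.9 bot, bob 0.8 bot, carol 0.2 human, dave 0.3 human, erin 0.1 human, a plain majority says "human" (3 to 2), but the weighted tally is 1.7 for "bot" against 0.6 for "human", so the two judges who are reliably recognized as human carry the verdict.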
-Forrest Cameranesi, Geek of all Trades
"I am Sam. Sam I am. I do not like trolls, flames, or spam."
Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?
Neither.
a) It wasn't a Turing Test.
b) It may have been legitimately beaten by the rules of this test, but were the rules remotely legitimate as far as rating AI is concerned? Most Turing-type tests set the bar at a 50% fool-rate (and that's versus a human). This bot got 30%.
c) It was about as clever as sending over random keystrokes to pass the Turing-Cat-On-My-Keyboard Test.
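To make the bar in (b) concrete, here is a small check of a fool-rate against a pass threshold. It is just an illustration: the function name meets_threshold is made up, and the "30 out of 100" figure only stands in for the 30% quoted above, not the real judge counts.

```sh
# meets_threshold FOOLED TOTAL PERCENT
# Succeeds (exit status 0) when FOOLED/TOTAL is at least PERCENT percent.
# Uses integer arithmetic only: fooled*100 >= total*percent.
meets_threshold() {
  [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]
}

# Hypothetical numbers: 30 of 100 judgements fooled.
meets_threshold 30 100 50 && echo "passes 50% bar" || echo "fails 50% bar"
meets_threshold 30 100 30 && echo "passes 30% bar" || echo "fails 30% bar"
```

The same result clears a 30% bar and misses a 50% one, so the headline depends entirely on which threshold the organizers picked.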
systemd is Roko's Basilisk.
Really? Please just ignore him. He has become an attention whore and nothing more.
The Dunning-Kruger effect explains most posts on
It would probably be more convincing if the computer could correctly tell whether the judges were real people or programs...
You can hook ELIZA to autorespond to email and fool some people.
That one's obvious. The then/than troll would be sort of a challenge to program.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Tricked but not cleverly.
I can tell my colleague that I just saw his car in the parking lot and one of his tires is flat. This would be tricking him, but it doesn't require any cleverness to do this.
Similarly, you can tell the judges at a Turing test that the "person" on the other end has some sort of deficiency (e.g. mental problems, doesn't speak the language of the judges, etc.) that explains his/her odd (i.e. inhuman) responses. You can even stack the human subjects with humans who are also similarly deficient to further disguise which subjects are computers.
This is basically just creating a rigged test.
It's obvious why it is necessary to create a rigged test. Modern computers are nowhere near passing a legitimate Turing test.
From metalev.org: Every news outlet is currently covering the story that a chatbot pretending to be a 13-year-old Ukrainian boy has deceived 33% of human judges into thinking it is a human, thereby "passing the Turing test for the first time". There are so many problems with the Turing test (even with the numerous refinements to it that many have proposed) that I don't know if it will ever tell us anything useful. The creators of the above chatbot hinted that part of their success in convincing the judges was that “his age ... makes it perfectly reasonable that he doesn’t know everything” -- in other words, to make a believable bot, you can't give your bot super-human knowledge or capabilities, even if this is technically possible to do (e.g. computers can multiply large numbers almost instantly). Limiting computational power to appear human-like is known as "artificial stupidity". The need for artificial stupidity to pass the Turing test illustrates one of the deepest issues with the test, and one that cannot be fixed by simply tweaking the rules: the Turing test is a test of human dupe-ability, not of machine intelligence.
I'm pretty sure we'll start seeing several claims per year that a bot has "passed the Turing test", followed by a flurry of discussion about what was actually tested and whether the result is believable or even meaningful, until it becomes so clichéd to say that your bot passed the Turing test that nobody with a halfway decent AI would actually *want* to claim that their AI passed a test of this form.
Hopefully we see the day when the Turing test is inverted, and we realize we need a test to establish that someone is a "genuine human" and not a bot ;-) But until then, we still have a heck of a lot of work to do!
It's the Turing Test itself that is meaningless. In a possibly apocryphal account of an AI conference in the early 2000s, a learned panel of AI experts elaborated on the Turing test to explain that passing the test didn't just mean a minimal level of intelligence, but intelligence as advanced as humanity's itself, since it was able to fool a human. An undergraduate attendee asked the panel, "So, if I can write a program that can fool a dog into thinking it's interacting with another dog, the program is as intelligent as a dog?"
The room fell silent.
Since then, nobody has proposed a reasonable alternative for what Turing meant by "intelligence" as the target in his test.
Myself, I think AI is Computer Science's biggest Ponzi scheme. We are not one iota closer to actual artificial intelligence than we were in the 1950s. Yet the public's expectation, and the impression given by AI researchers, is that we've been making steady progress. So every new AI "advance" must be more spectacular than the last, with lots of hand waving explaining how this moves us closer to the goal of sentient computing. It started back in the 1960s with natural language processing, which was really just elaborate table lookup. Then it advanced to the 1970s, with Chess-playing machines -- also just elaborate table lookup. The 1980s brought expert systems and neural networks, otherwise known as elaborate table lookup. Today we have computer navigation, plain-language database queries, and speech processing such as Siri. AI? No. Table lookup, elaborate.
We can't even define what intelligence is or how it works in even the simplest organism, let alone explain it in humans. Until we can do that, we can't have an artificial version of it.
Turing was a con man.
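#!/bin/bash
# Turin Baby: answers every line of input with a random baby noise.
# Arrays and $RANDOM are bash features, so this needs bash, not plain sh.
a[0]=""
a[1]="whaaa"
a[2]="gurgll"
a[3]="whaaa aaa aa"
echo "=== Turin Baby 0.1 prototype ==="
while read -r q
do
    # Pick one of the four responses (indices 0-3) at random.
    echo "${a[RANDOM % 4]}"
done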
It doesn't make any difference whether the TT is passed 'legitimately' or by 'tricking', the point is that if the test is valid (which is obviously a huge debate in itself), then a trick sufficiently sophisticated to pass it must be considered just as intelligent as passing the test legitimately. What difference does it make?
Real intelligence is nothing but a sophisticated trick pulled off by having a sufficient density of firing neurons. We're all performing that trick constantly. Some better than others actually - there are plenty of people I meet that couldn't pass the Turing Test.
It was probably the real AI convincing this Warwick guy to do the press release, then adding media hype. After everybody finds out it is a hoax, the scientific community will be convinced that real AI is still not possible, even though the real AI IS there.
.. finishing my third beer. But in case it turns out I am right, I can tell you later that "See, there was a real AI, I told you so!".
Pfff
Stanislaw Lem's Solaris is an imaginative take on a similar problem.
Play Command HQ online
If the Turing Test is a test to see if universities can release press releases that the media churn out without doing any basic thinking or background checking then yes. Otherwise no. See http://en.wikipedia.org/wiki/Churnalism
UK university staff are getting more and more pressure to get publicity for their work. Why? Because the student market is much more competitive than it was. Every Uni now has a small army of press and "impact" people who aim to get the Uni in the papers, on Twitter, etc. Not that Kevin Warwick needs much help with that, he's been doing it for years.
The press release about this so-called Turing Test was pretty much written in a style ideal for lazy journos to cut and paste into QuarkXPress. http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx
actually believe that the comments are written by real people and respond to them.
Please don't blame this on computer scientists. This story was almost certainly generated by marketing types trying to line up with some anniversary of some kind. I'm not exactly sure because TFA for the original story says something about "60th anniversary of Turing's death" and "created in 2001". And the 30% is clearly a lie to make up for their failure to even reach the 59% at which Cleverbot was already tested.
I sometimes ask revealing, often ignorant-seeming questions. Maybe they're harder to answer than you think.
They should use the "Voight-Kampff" test instead.