Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?
beaker_72 (1845996) writes "On Sunday we saw a story that the Turing Test had finally been passed. The same story was picked up by most of the mainstream media and reported all over the place over the weekend and yesterday. However, today we see an article in TechDirt telling us that in fact the original press release was just a load of hype. So who's right? Have researchers at a well established university managed to beat this test for the first time, or should we believe TechDirt who have pointed out some aspects of the story which, if true, are pretty damning?"
Kevin Warwick gives the bot a thumbs up, but the TechDirt piece takes heavy issue with Warwick himself on this front.
It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.
Why do you ask if the Turing Test was legitimately beaten or just cleverly tricked?
But seriously, yes, it was 'legitimately beaten', just like it's been 'legitimately beaten' in times past, going back to ELIZA in the 60s.
Was it MEANINGFULLY beaten is the question to ask, and no, no it wasn't. Until the computer can actually 'understand' context to a meaningful degree, the answer to that will continue to be no.
I want to talk to these AIs myself! Give me a webpage or IRC chatroom to interact with it directly. It sounds fascinating even if it's only 'close' to passing the test.
I have successfully written a chatbot that convinces people that it is a slime mold. It had to tell people it was a slime mold to make them do the mental gymnastics necessary to wave away all the absurd replies. But, it did manage to convince 90% of its conversational partners that it had the mental capacity of a slime mold. This is a striking success.
Similarly, the computer must convince the judge it is a human with its full mental capacity, not a child, nor a mentally defective person, nor someone in a coma.
The test is whether a computer can, in an extended conversation, fool a competent human into thinking it is a competent human being speaking the same language, at least 50% of the time.
excitingthingstodo.blogspot.com
If only there were a method, where people could let others know about their findings, in enough detail so that the results could be reproduced. Just for fun, we could call this method "the scientific method."
Oh and hey, why don't we create a 'magazine,' where 'scientists' can submit their findings, that way they will be easy to find. We can call them 'scientific journals.' Extra benefit, the journals can make an attempt to filter out stuff that's not original.
Oh wait. Why didn't these guys submit to a journal? Probably because it adds nothing to what Joseph Weizenbaum did back in the 60s.
"First they came for the slanderers and i said nothing."
Stop using linkbait headlines... leave that to Gawker.
Clever? I'm sure Mr. Turing would agree that having to explain away the flaws in grammar and syntax by claiming to be a non-native English speaker, fits well within his intended vision...
For those who haven't read the article (I read one yesterday and assume the details are the same): The program claimed to be a 13-year-old Ukrainian boy, a non-native English speaker, writing in English to English speakers. This allowed the program to avoid the problem of people using language to judge whether the responses were from a person or a program. Also, since the program was claiming to be a boy instead of an adult, it greatly reduced what could be expected of the responses, again greatly simplifying the program's parameters and reducing what the testers could use to test. So basically, the Turing Test is supposed to test whether a person can tell if the program acts like a person, but here the test was rewritten to see if the program acted like a child from a different culture who was not supposed to be speaking in his native language. Many are apparently crying foul.
I personally agree.
Maybe we need to formalize the Turing test more, to give it specific rigor?
That or come up with a whole new test ... I don't know, maybe call it the Voight-Kampff test.
It's a Turing test if I know one of the candidates is, in fact, an AI. If you tell me it's a 13 year old, you're cheating.
Lost at C:>. Found at C.
...why didn't /. just wait for the skeptical posts calling the original news articles bullshit in the first place?
Seriously, weeding out the garbage posts, 3/4ths of the comments were calling bullshit when they saw it, and 1/4th were making pointless references to Skynet and HAL.
"But remember, most lynch mobs aren't this nice." (H.Simpson)
-- Joe
World to Captain Cyborg on 'Turing test' stunt: You're Rumbled
Legitimately beaten or cleverly tricked. Either one says it was beaten to me. Isn't a clever trick a legitimate way of winning? It is in real-life conflict.
That's the whole point. To cleverly trick the tester into believing something that isn't true. The test can't be beaten without clever tricking.
... was not actually performed in the research. End of story.
Passed or tricked??? Same thing, here; that is the point. Computer tricks people.
The first time I saw ELIZA in action, I realized that the Turing test is basically meaningless, as it fails on two fronts. We are not good judges for it, as we are hard-wired to assume intelligence behind communications, and Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.
This is not to fault Turing's work, as you have to start somewhere, but, really, after all of these years we should have a better test for intelligence.
Maybe you design an obstacle course that required the leg to function in a range of everyday scenarios, that tests its endurance, comfort, and flexibility.
These chat bots would be the equivalent of calling a helicopter a "prosthetic leg" and flying over the course.
In both cases, they're avoiding the meat of the challenge. Yes, arriving at the finish line is the goal, but it's how you got there that is the interesting part. That's not to say these are useless projects - they're fun, and there's some legitimately interesting stuff there. But it'll be a very different beast that truly passes the Turing Test.
Let's not stir that bag of worms...
A 13 year old boy was also able to pass the Turing test, convincing a panel of middle-aged judges that he was an actual person.
Kevin Warwick is a narcissistic, publicity seeking shitcock.
Make him speak Spanish and make sure all the judges only speak Norwegian. See, you can cheat it. But anyway, they should be disqualified for the age. 13 year olds are predictable, quite dumb, and easy to imitate. To be more scientific, between approx age 10 and 18, your brain doubles in its overall processing power and in the middle, your frontal lobe can't process logical decisions very well. That's quite a cover story for an AI to pretend to be a human.
I created a chat bot that emulates a 65-year-old grocery store clerk who speaks perfect English. Here is a sample transcript:
Tester: Hello, and welcome to the Turing test!
Bot: Hey, gimme one sec. I gotta pee really bad. BRB.
.
.
.
Tester: You back yet?
.
.
.
Tester: Hello?
.
.
.
Maybe Kevin Warwick himself is a bot, and this is a cleverly-designed incarnation of the Turing test to determine whether or not we realize it. If we do, it doesn't pass the test, and there's nothing to worry about. If we don't, the AI revolution is nigh, and we're all doomed.
The Turing test is not a very good 'test' in the first place.
Use the 'convergence' model: it has to be trained to think like a 'mind' and subsequently behave like one. It is a superset of the Turing test's functionality and much more accurate.
Holy shit, it is amazing how ignorant computer science people are of what the Turing Test is and is not, and of the 1,000+ years of philosophy of mind, language, epistemology, AI, linguistics, neurology, and so on that it is based on. Intelligent life of any sort in the computer science departments would be a nice start. The AI fantasy circle jerk seems to be a lot more fun.
I'll just take a couple of the more important ones for the moment:
1) The REAL Turing test never, ever ends. It cannot be beaten. Just as humans can be said to be "intelligent" until we do something stupid (correct or mistaken), or just die, the computer must go on convincing the interrogators that it is a human forever, or until it does something wrong and fails to convince.
2) The use of "language" IS the test!!! There is no "tricking", because "tricking" a human in language is the trick. I'll sum this one up with one simple question. Try having a thought outside a language (not to say it is impossible, just no one is sure how that would work). Now if you manage that, try expressing it outside of a language so it can be evaluated. Now imagine building a computer to be "artificially" "intelligent" without a language. Even if there were some form that was not based in language (by the way, not just talking human language), how would you test that? How would that computer be "correct" or "mistaken"?
Thus, for this stupid test, making the test about a linguistically challenged child is NOT taking the Turing test in the full-throated sense of the Turing test. In fact, it may not even qualify as a Turing Test lite.
You all need to quit wasting time and money randomly plugging wires, and wander over to your local Philosophy departments to find out WHAT THE FUCK YOU ARE BUILDING (OR NOT)!!!!
Nothing in the history of man has had so many resources pissed away trying to build something, when WE DON'T EVEN KNOW WHAT IT IS WE ARE TRYING TO BUILD.
Turing himself, having come from an age where people got a bit more of a rounded education, would I am sure understand all the above.
I know, I know. I post something like this every time slash has a stupid "AI has been discovered" article. Every time, I get a pile of posts from all the people upset that their fantasy masturbation circle jerk might not be real. As you were.
As the saying goes "haters gonna hate", but really, it's a big accomplishment. To pass the Turing test, you'd need to choose some "identity" for your AI. The idea of using a kid with limited cognitive skills was clever, but not cheating -- but it's also not simulating a professor. If there is truly intelligent AI in the future, it's reasonable to expect its evolution to start with easier people to emulate before trying harder.
-- Political fascism requires a Fuhrer.
You can fool all the people some of the time, and some of the people all the time, but you haven't /really/ passed the Turing test until you can fool all of the people all of the time.
No really... Eliza fooled some of the people back in 1966. There is nothing really new to see here, move right along.
http://default-environment-sdqm3mrmp4.elasticbeanstalk.com/
Seriously, type with this thing for more than 5 phrases and tell me that this thing would even fool your grandma.
It reminds me of every ALICE bot I've seen on IRC ever, and I have a sneaking suspicion that its code is at most slightly modified from the ALICE bots, as it told me that it has a "Celeron 667" that is "nice" that it "plays games with", setting its likely date of origin somewhere around 1999/2000.
It does get partial extra credit, however, for attempting to convince me that I'm a computer.
Let's see the bot first win the Loebner prize. This "test" it won seemed a little focused on free advertising for the research group.
Loebner, despite not being a true Turing test, is a long-established competition with clear rules and a clear evaluation process.
Not beaten, and not cleverly anything. Warwick is a twit.
So does this mean that every bot that sends phishing scams and achieves some success passes the Turing test?
"Kevin Warwick gives the bot a thumbs up"
That's a point *against*, not a point in favour.
Adam's Law of British Technology Self-Publicists: if the name "Sharkey" is attached, be suspicious. If the name "Warwick" is attached, be very suspicious. If both "Sharkey" and "Warwick" are attached, run like hell.
I don't talk much, but I watch people a lot. I find it's easiest to truly find out about them when they're in difficult or novel situations. In games, for instance, this is the only way to get loads of information about others fast. Make dirty jokes, get political, be insulting with some and defensive with others, and you'll quickly find out a lot about them. ... not so much.
The reason why they chose a 13 year old boy, was because you couldn't ask about politics, sex, global issues and other things that transcend national barriers.
This test, if it ever held any meaning, is pretty much a joke now. Our understanding of what a true AI implies has grown and changed all this time, the test
The Turing Test is the ONLY test we have for artificial intelligence. Every other year we get some research team or the other claiming that their system is as intelligent as a dog, and now it's just a matter of scaling. The Turing Test is analogous to the test the Patent Office has for perpetual motion machines - if you can't pass the test, then you're not there yet. Simple, and easy to measure.
Support microSD: in a post 9/11 world, it is unwise to carry your data on media that you cannot comfortably swallow.
Would be to get two bots to talk to each other and see where the conversation goes after two minutes -- my guess is that all the code is biased towards tricking actual people in a one-on-one "conversation".
But when a machine converses with another machine, all that code no longer has an effect, and pretty soon the two machines will be essentially babbling *at* each other without actually having a conversation. An outside observer will immediately recognize that both of them are machines.
If telephones are outlawed, then only outlaws will have telephones.
As has been repeatedly emphasized, the Turing Test itself is rather subjective. Most schools measure language ability by way of standardized reading comprehension tests, ranging in difficulty according to grade level. I suggest that all natural language programs be similarly graded, using the exact same tests given to students, with the same criteria for passing. Any bot that can pass the college entrance exam (SAT, any version; ACT; or other similar exam) with a perfect score may be considered intelligent.
The trick is to know how to accurately measure what you want to get.
If we want a test that validates human-like behavior in an AI, then the test criteria must rigorously define what that condition is. Tricking a single person in a subjective test is terribly skewed.
Warning: Teh poster of this messaeg is lysdexic
If you truly understood Descartes, then you would know that you don't know you are not a chatbot.
Thousands of people believe they are talking to a real human when they are talking to a bot every day. Those stupid customer support chat boxes usually start with some bot doing chit chat before they fail and a real person takes over... advertisements that pop up with fake chats in progress... Facebook bots faking out people to 'friend' them... and so on.
Lt Saavik: [to Kirk] On the test, sir. Will you tell me what you did? I would really like to know.
Dr Leonard McCoy: Lieutenant, you are looking at the only Starfleet cadet who ever beat the "No-Win" scenario.
Saavik: How?
James Kirk: I reprogrammed the simulation so that it was possible to save the ship.
Saavik: What?!
David: He cheated.
Kirk: Changed the conditions of the test. Got a commendation for original thinking. I don't like to lose.
If telephones are outlawed, then only outlaws will have telephones.
So it's true ... you are a chatbot!
Sent from my ASR33 using ASCII
Firstly, somehow psychophysics never seems to enter the discussion of the Turing test (maybe because "signal" and "noise" are harder to describe in this context). But even given basic metrics of noise, this example has simply raised the "noise floor" (here, the foreign language and age constraints), which will of course increase False Alarms (people thinking they are talking to a human when in fact they are not). Also, the 30% mark? Seriously? Give me a d-prime measurement and we'll talk!
Secondly, the Turing test is not a good test for intelligence. As a neuroscientist who has dabbled in neural networks, I am perpetually amazed by people who report on the Turing test as somehow relevant. While it would be awesome for a metric of intelligence to be so simple, we're gonna need some intense high-dimensional and dynamic math to establish parameters for intelligence, not depend on other black boxes (which presently, human judges are).
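For anyone wondering what the d-prime measurement asked for above would look like, here is a minimal sketch. It treats "the partner is human" as the signal, so a false alarm is a judge calling a bot human, as in the post. The function name and the counts in the usage line are made up for illustration and are not figures from the actual event; the +0.5 log-linear correction is one common convention, not the only one.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    hits: humans judged human; misses: humans judged bot;
    false_alarms: bots judged human; correct_rejections: bots judged bot.
    """
    # Log-linear correction: nudges rates away from exactly 0 or 1,
    # where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts, purely for illustration.
print(round(d_prime(27, 3, 10, 20), 2))
```

A d' near zero means the judges could not tell humans from bots at all; a bare "30% were fooled" figure says nothing about how often the same judges mistook real humans for bots, which is the complaint in the post.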
Have a bunch of human judges and some instances of the bot in question all participating in a chat together, or randomly paired together for a while and then re-paired, so that humans are judging humans as well as bots, and have no idea which is which.
If a human is frequently judged as a bot by other humans, that human's judgements are de-weighted, because apparently they're too stupid to be distinguished themselves from an AI, so why should we trust their ability to distinguish other humans from AIs.
Although, I wonder if exceptionally intelligent humans with perfect spelling and grammar, a wide range of knowledge, and high typing speed, might be mis-judged as AIs too, for being "too good". Some hunt-and-pecker who can't tell their/they're/there apart might see someone who gives an intelligent response in complete, grammatically-correct sentences in half a minute as inhuman.
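The de-weighting idea above can be sketched roughly like this. Everything here is an assumption for illustration: the (judge, subject, said_human) tuple layout, the function names, and the choice to weight a human judge by the fraction of other participants who judged them to be human.

```python
from collections import defaultdict


def judge_weights(verdicts, humans):
    """Weight each human by the share of verdicts about them that said 'human'."""
    seen = defaultdict(int)
    passed = defaultdict(int)
    for judge, subject, said_human in verdicts:
        if subject in humans:
            seen[subject] += 1
            passed[subject] += 1 if said_human else 0
    # A human nobody has judged keeps full weight (an arbitrary default).
    return {h: (passed[h] / seen[h] if seen[h] else 1.0) for h in humans}


def weighted_fool_rate(verdicts, humans, bot):
    """Share of judge weight that called the given bot human."""
    weights = judge_weights(verdicts, humans)
    total = fooled = 0.0
    for judge, subject, said_human in verdicts:
        if subject == bot and judge in weights:
            total += weights[judge]
            if said_human:
                fooled += weights[judge]
    return fooled / total if total else 0.0
```

With this scheme, a judge whom everyone else takes for a bot contributes nothing to the bot's score, so being fooled only counts when it happens to judges who are themselves recognizably human. It does nothing about the "too good to be human" problem raised above.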
-Forrest Cameranesi, Geek of all Trades
"I am Sam. Sam I am. I do not like trolls, flames, or spam."
More code until moral improves...
Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?
Neither.
a) It wasn't a Turing Test.
b) It may have been legitimately beaten by the rules of this test, but were the rules remotely legitimate as far as rating AI is concerned? Most Turing-type tests set the bar at a 50% fool-rate (and that's versus a human). This bot got 30%.
c) It was about as clever as sending over random keystrokes to pass the Turing-Cat-On-My-Keyboard Test.
systemd is Roko's Basilisk.
really? Please just ignore him. He has become an attention whore and nothing more.
The Dunning-Kruger effect explains most posts on
It would probably be more convincing if the computer could correctly tell whether the judges were real people or programs...
You can hook ELIZA to autorespond to email and fool some people.
That one's obvious. The then/than troll would be sort of a challenge to program.
John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
Tricked but not cleverly.
I can tell my colleague that I just saw his car in the parking lot and one of his tires is flat. This would be tricking him, but it doesn't require any cleverness to do this.
Similarly, you can tell the judges at a turing test that the "person" on the other end has some sort of deficiency (e.g. mental problems, doesn't speak the language of the judges, etc) that explains his/her odd (e.g. inhuman) responses. You can even stack the human subjects with humans who are also similarly deficient to further disguise which subjects are computers.
This is basically just creating a rigged test.
It's obvious why it is necessary to create a rigged test. Modern computers are nowhere near passing a legitimate Turing test.
From metalev.org: Every news outlet is currently covering the story that a chatbot pretending to be a 13-year old Ukrainian boy has deceived 33% of human judges into thinking it is a human, thereby "passing the Turing test for the first time". There are so many problems with the Turing test (even with the numerous refinements to it that many have proposed) that I don't know if it will ever tell us anything useful. The creators of the above chatbot hinted that part of their success in convincing the judges was that “his age ... makes it perfectly reasonable that he doesn’t know everything” -- in other words, to make a believable bot, you can't give your bot super-human knowledge or capabilities, even if this is technically possible to do (e.g. computers can multiply large numbers almost instantly). Limiting computational power to appear human-like is known as "artificial stupidity". The need for artificial stupidity to pass the Turing test illustrates one of the deepest issues with the test, and one that cannot be fixed by simply tweaking the rules: the Turing test is a test of human dupe-ability, not of machine intelligence.
I'm pretty sure we'll start seeing several claims per year that a bot has "passed the Turing test", followed by a flurry of discussion about what was actually tested and whether the result is believable or even meaningful, until it becomes so clichéd to say that your bot passed the Turing test that nobody with a halfway decent AI would actually *want* to claim that their AI passed a test of this form.
Hopefully we see the day when the Turing test is inverted, and we realize we need a test to establish that someone is a "genuine human" and not a bot ;-) But until then, we still have a heck of a lot of work to do!
It's the Turing Test itself that is meaningless. In a possibly apocryphal account of an AI conference in the early 2000s, a learned panel of AI experts elaborated on the Turing test to explain that passing the test didn't just mean a minimal level of intelligence, but intelligence as advanced as humanity's itself, since it was able to fool a human. An undergraduate attendee asked the panel, "So, if I can write a program that can fool a dog into thinking it's interacting with another dog, the program is as intelligent as a dog?"
The room fell silent.
Since then, nobody has proposed a reasonable alternative for what Turing meant by "intelligence" as the target in his test.
Myself, I think AI is Computer Science's biggest Ponzi scheme. We are not one iota closer to actual artificial intelligence than we were in the 1950s. Yet the public's expectation, and the impression given by AI researchers, is that we've been making steady progress. So every new AI "advance" must be more spectacular than the last, with lots of hand waving explaining how this moves us closer to the goal of sentient computing. It started back in the 1960s with natural language processing, which was really just elaborate table lookup. Then it advanced to the 1970s, with Chess-playing machines -- also just elaborate table lookup. The 1980s brought expert systems and neural networks, otherwise known as elaborate table lookup. Today we have computer navigation, plain-language database queries, and speech processing such as Siri. AI? No. Table lookup, elaborate.
We can't even define what intelligence is or how it works in even the simplest organism, let alone explain it in humans. Until we can do that, we can't have an artificial version of it.
Turing was a con man.
I've always presumed that Turing was making a first draft at the question of: How do you determine that something is intelligent? Can an artificial creation be intelligent and how do you tell?
I don't really think that Turing expected his thought experiment would be the final word on the subject. It was more of a challenge to the rest of us. "Here's what Alan proposes. If you know of something better, let's hear it."
All the chatbots/ELIZAs/etc. are profoundly flawed. First of all they rarely if ever talk about themselves. All their conversational gambits try to turn the subject back to you because they don't have a life to talk about. Not even a fake one. Second, even minor grammatical variations in your responses to them often confound their simple grammar parsers. So they rephrase something and echo it back to you, except the result is ungrammatical and makes no sense at all. Busted!
Third, they have no general knowledge. They cannot recount the news, sports, weather, singers, movies. They have never travelled and have no opinions. They cannot talk about family. They have never worked nor shopped for groceries. They have no politics, no things they hate and nothing to love either. Their car has never broken down and they do not crave chocolate. They have never embarrassed themselves in public or been mean on the internet.
By these absences, the AI fakes are known.
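The "rephrase and echo it back" failure described above is easy to reproduce. This is a toy sketch in the spirit of ELIZA-style bots, not the code of ELIZA or of any real chatbot; the swap table and the single canned template are assumptions chosen to show the failure.

```python
import re

# Hypothetical word-for-word swaps; real bots use larger tables and more rules.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your",
    "am": "are", "you": "I", "your": "my",
}


def reflect(text):
    """Swap first- and second-person words, one word at a time."""
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def reply(text):
    """Turn the subject back on the user by echoing their own words."""
    return "Why do you say " + reflect(text) + "?"


print(reply("I am sad"))               # Why do you say you are sad?
print(reply("I think you are wrong"))  # Why do you say you think I are wrong?
```

The first input happens to fit the swap table; the second is a minor variation and the echo comes back ungrammatical, because the bot never parses the sentence, it only substitutes words.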
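#!/bin/bash
# Needs bash: plain sh has no arrays, and the BSD-only "random" command
# is replaced here by bash's built-in $RANDOM.
# Canned replies; index 0 is silence.
a[0]=""
a[1]="whaaa"
a[2]="gurgll"
a[3]="whaaa aaa aa"
echo "=== Turin Baby 0.1 prototype ==="
# Answer every line of input with a randomly chosen reply.
while read -r q
do
  echo "${a[RANDOM % ${#a[@]}]}"
done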
It doesn't make any difference whether the TT is passed 'legitimately' or by 'tricking', the point is that if the test is valid (which is obviously a huge debate in itself), then a trick sufficiently sophisticated to pass it must be considered just as intelligent as passing the test legitimately. What difference does it make?
Real intelligence is nothing but a sophisticated trick pulled off by having a sufficient density of firing neurons. We're all performing that trick constantly. Some better than others actually - there are plenty of people I meet that couldn't pass the Turing Test.
It was probably the real AI convincing this Warwick guy to do the press release, then adding media hype, and after everybody finds out it is a hoax, it will convince the scientific community that real AI is still not possible. However, the real AI IS there.
.. finishing my third beer. But in case it turns out I am right, I can tell you later that "See, there was a real AI, I told you so!".
Pfff
If the Turing Test is a test to see if universities can release press releases that the media churn out without doing any basic thinking or background checking then yes. Otherwise no. See http://en.wikipedia.org/wiki/Churnalism
UK university staff are getting more and more pressure to get publicity for their work. Why? Because the student market is much more competitive than it was. Every Uni now has a small army of press and "impact" people who aim to get the Uni in the papers, on twitter, etc etc. Not that Kevin Warwick needs much help with that, he's been doing it for years.
The press release about this so-called Turing Test was pretty much written in a style ideal for lazy journos to cut and paste into Quark Xpress. http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx
actually believe that the comments are written by real people and respond to them.
They should use the Voight-Kampff test instead.
Is there a difference?