Slashdot Mirror


Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?

beaker_72 (1845996) writes "On Sunday we saw a story that the Turing Test had finally been passed. The same story was picked up by most of the mainstream media and reported all over the place over the weekend and yesterday. However, today we see an article in TechDirt telling us that in fact the original press release was just a load of hype. So who's right? Have researchers at a well established university managed to beat this test for the first time, or should we believe TechDirt who have pointed out some aspects of the story which, if true, are pretty damning?" Kevin Warwick gives the bot a thumbs up, but the TechDirt piece takes heavy issue with Warwick himself on this front.

41 of 309 comments

  1. but that's the problem with the turing test... by Anonymous Coward · · Score: 5, Interesting

    It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.

    1. Re:but that's the problem with the turing test... by flaming+error · · Score: 5, Insightful

      They got 30% of the people to think they were texting with a child with limited language skills. I don't think that's what Alan Turing had in mind.

    2. Re:but that's the problem with the turing test... by i+kan+reed · · Score: 3, Interesting

      Sure it is.

      They convinced a human that they were talking to an unimpressive human. That's definitely a step above "not human at all".

    3. Re:but that's the problem with the turing test... by TheCarp · · Score: 4, Insightful

      I always thought of it as more a philosophical question or thought experiment. How do you know that anything has an internal consciousness when you can't actually observe it? I can't even observe your process, I just assume that you and I are similar in so many other ways (well I assume, you could be a chatbot, whereas I know I am definitely not)....and I have it, so you must too, after all, we can talk.

      So.... if a machine can talk like we can, if it can communicate well enough that we suspect it also has an internal consciousness, then isn't our evidence for it every bit as strong as the real evidence that anyone else does?

      --
      "I opened my eyes, and everything went dark again"
    4. Re:but that's the problem with the turing test... by Jane+Q.+Public · · Score: 5, Insightful

      It has nothing to do with actual artificial intelligence and everything to do with writing deceptive scripts. It's not just this incident, it's a problem with the goal of the Turing test itself. I always found the Turing test a kind of stupid exercise due to this.

      Yes. TechDirt's points 3 and 6 are basically the same thing I wrote here the other day:

      First, that the "natural language" requirement was gamed. It deliberately simulated someone for whom English is not their first language, in order to cover its inability to actually hold a good English conversation. Fail.

      Second, that we have learned over time that the Turing test doesn't really mean much of anything. We are capable of creating a machine that holds its own in limited conversation, but in the process we have learned that it has little to do with "AI".

      I think some of TechDirt's other points are also valid. In point 4, for example, they explain that this wasn't even the real Turing test.

    5. Re:but that's the problem with the turing test... by Anonymous Coward · · Score: 4, Interesting

      So according to you I could make a machine that simulates texting with a baby. Every now and then it would randomly pound out gibberish as if a baby was walking on the keyboard.

    6. Re:but that's the problem with the turing test... by Anonymous Coward · · Score: 3, Funny

      I have a program passing the Turing test simulating a catatonic human to a degree where more than 80% of all judges cannot tell the difference.

      Once you stipulate side conditions like that, the idea falls apart.

    7. Re:but that's the problem with the turing test... by BasilBrush · · Score: 4, Insightful

      The problem is that priming the judges with excuses about why the candidate may make incorrect, irrational, or poor language answers is not part of the test.

      If the unprimed judges themselves came to the conclusion they were speaking to a 13 year old from the Ukraine, then that would not be a problem. But that's not what happened.

    8. Re:but that's the problem with the turing test... by shaitand · · Score: 2

      I don't think a chat bot was what Turing had in mind in any case. A bot that was intelligent enough to be able to LEARN and SPEAK well enough that another human couldn't tell the difference between it and another human is the point.

      Everything we see now is trying to win the letter of the turing test and ignoring the spirit. Turing's point was that if we can make it able to reason as well as we can we no longer have the right to deny it as intelligent life. Scripts that skip the reasoning and learning part and just try to con the judges are just attempts to cheat at the test.

      It's akin to doing nothing but studying test dumps to pass an IT certification exam or memorizing the question bank to get an Amateur radio license. It being possible to cheat on Turing's test does make it a flawed test but it doesn't mean that Turing was wrong about what it would indicate if a machine passed the test WITHOUT cheating.

    9. Re:but that's the problem with the turing test... by TheCarp · · Score: 5, Funny

      Please tell me more about like something a chatbot would say.

      --
      "I opened my eyes, and everything went dark again"
    10. Re:but that's the problem with the turing test... by Agent0013 · · Score: 2

      It was always 30%: "human", "not human", and "not sure".

      Always?! The test created by Turing specified that there were two subjects that the judge was interacting with. One human and one computer. There is no "not sure" choice. There is which one is a human and which one is a computer. You cannot answer that they are both human, both computer, one a human and the other is unknown, one a computer and the other is unknown, both unknown, etc. It seems that 50% is the correct percentage to me!

      --

      -- ssoorrrryy,, dduupplleexx sswwiittcchh oonn.. -Quote found on actual fortune cookie.
    11. Re:but that's the problem with the turing test... by radtea · · Score: 3, Insightful

      So.... if a machine can talk like we can, if it can communicate well enough that we suspect it also has an internal consciousness, then isn't our evidence for it every bit as strong as the real evidence that anyone else does?

      Not even close, because our conclusion about other humans is based on a huge amount of non-verbal communication and experience, starting from the moment we are born. AI researchers (and researchers into "intelligence" generally) conveniently forget that the vast majority of intelligent behaviour is non-verbal, and we rely on that when we are inferring from verbal behaviour that there is intelligence present.

      Simply put: without non-verbal intelligent behaviour we would not even know that other humans are intelligent. Likewise, we know that dogs are intelligent even though they are non-verbal (I'm using an unrestrictive notion of "intelligent" here, quite deliberately in contrast to the restrictive use that is common--although thankfully not universal--in the AI community.)

      With regard to the Turing test as a measure of "intelligence", consider its original form: http://psych.utoronto.ca/users...

      Turing started by considering a situation where a woman and a man are trying to convince a judge which one of them is male, using only a teletype console as a means of communication. He then considered replacing the woman with a computer.

      Think about that for a second. Concluding, "If a computer can convince a judge it is the human more than 50% of the time we can say that it is 'really' intelligent" implies "If a woman can convince a judge she is male more than 50% of the time we can say she is 'really' a dude."

      The absurdity of the latter conclusion should give us pause in putting too much weight on the former.

      --
      Blasphemy is a human right. Blasphemophobia kills.
    12. Re:but that's the problem with the turing test... by BasilBrush · · Score: 4, Insightful

      What should the program have claimed to have been?

      I don't care. What I care about is what the organisers of the "test" told the judges. I was under the impression they had told the judges it was a 13-year-old boy from the Ukraine. Now I look again, it's not clear who told them that. Which brings another problem: we don't know what the judges were told. Given the effort to invite a celebrity to take part as one of the judges, you'd have thought there would be video of the contest. But no.

      If you've been around tech for a while, you will have come across some of Kevin Warwick's bullshit claims to the press before. He's a charlatan. So therefore we need more than his say so that he conducted the test in a reasonable way.

      We also need independent reproduction of the result. You know, the scientific method and all that.

    13. Re:but that's the problem with the turing test... by Java+Pimp · · Score: 2

      I swear when I read about this when it was first posted that they said there were only 3 judges. Now, the best I can find is this from the Reg:

      It's not clear if there were more than three judges, although some reporters say they were told there were 30 judges.

      It would be nice to see exactly how many judges and what the actual conditions were. Fooling 1 person is easy, fooling 10 would be much harder...

      --
      Ascalante: Your bride is over 3,000 years old.
      Kull: She told me she was 19!
    14. Re:but that's the problem with the turing test... by Spy+Handler · · Score: 4, Interesting

      The test as specified by Alan Turing involves a human judge sitting in front of two terminals. One is a computer and the other is human-operated. The judge asks both terminals questions and tries to figure out which one is computer and which is human. It's quite specific.

      It does not involve unsuspecting normal people in everyday situations who are duped into thinking they're interacting with a human... that would be quite easy. For instance if somebody asked the TigerDirect customer service chat window questions they have about a product and receive a good answer, they might not suspect it's a bot. Doesn't mean the TigerDirect bot passed the Turing test.

      Turing also didn't say anything about crippling the test by making it a child who doesn't speak fluent English.

    15. Re:but that's the problem with the turing test... by AthanasiusKircher · · Score: 2

      The test created by Turing specified that there were two subjects that the judge was interacting with. One human and one computer. There is no "not sure" choice.

      Yes, you are correct.

      It seems that 50% is the correct percentage to me!

      Turing's original discussion included the following claim:

      I believe that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.

      This wasn't part of the "test" per se, but where Turing originally thought technology would be 50 years after he wrote those words. (He wrote them in 1950, so that would be a claim about 2000.)

      But I believe that's where all this "30%" stuff comes from.
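
      To make the two thresholds in this subthread concrete, here is a minimal Python sketch. The judge counts are hypothetical (the thread only says roughly 30% of judges were fooled, and the number of judges is disputed), and the function names are made up for illustration. It compares a fool rate against the 30% figure implied by Turing's 70% prediction and against the 50% chance level argued for above.

```python
# Threshold implied by Turing's 1950 prediction: the interrogator is right
# at most 70% of the time, so the machine fools them at least 30% of the time.
TURING_1950_PREDICTION = 0.30
# Threshold argued for in this thread: indistinguishable means a coin flip.
CHANCE_LEVEL = 0.50


def fool_rate(fooled: int, total: int) -> float:
    """Fraction of judges who picked the machine as the human."""
    if total <= 0:
        raise ValueError("total must be positive")
    return fooled / total


def passes(fooled: int, total: int, threshold: float) -> bool:
    """True if the fool rate meets or exceeds the chosen threshold."""
    return fool_rate(fooled, total) >= threshold


# Hypothetical numbers, NOT taken from the actual event.
fooled, total = 10, 30
print(passes(fooled, total, TURING_1950_PREDICTION))
print(passes(fooled, total, CHANCE_LEVEL))
```

      Under these assumed counts the same result clears the 30% bar and misses the 50% bar, which is the whole disagreement in a nutshell.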

    16. Re:but that's the problem with the turing test... by nine-times · · Score: 3, Insightful

      That was my general understanding of it too. It was less about devising an actual test for computer intelligence, and more about making a philosophic point that we never directly observe intelligence. We only observe the effects-- either in words or actions-- and then guess whether something is intelligent by working backward from those effects. For example, my coworker is sitting next to me, and I see him talking in sentences that appear to make sense. I ask him a question, and I get a response back. When I listen to his response, I analyze it and decide whether it seems like an appropriate or insightful response to my question. As a result, I guess that he's reasonably intelligent, but that's the only thing I have to go on.

      So in talking about machine intelligence, Turing suggested that it may not be worthwhile to dwell on whether the machine is actually intelligent, but instead look at whether it can present behavior capable of convincing people that it's intelligent. If I can present questions to a machine and analyze the response, finding that it's as appropriate and insightful as when I'm talking to a human, then maybe we should consider that machine to be intelligent whether it "actually" is intelligent or not.

      Still, to me it seems like there's some room for debate and room for problems. For example, do we want to consider it intelligent when a machine can convince me that it's a person of average intelligence, or do we want to require that it's actually sensible and smart? It may be that if an AI gets to be really intelligent, it starts failing the test again because its answers are too correct, specific, and precise.

      There's a further problem in asking the question, once we have AI that we consider "intelligent", will it be worth talking to it? Maybe it will fool us by telling us what we want to hear or expect to hear.

      I'm not sure Turing had the answers to whether an AI was intelligent any more than Asimov has the perfect rules to keeping AI benign.

    17. Re:but that's the problem with the turing test... by Oligonicella · · Score: 2

      The idea is to create a machine that is intelligent, not to downgrade the definition of intelligent until a cat strolling across a keyboard qualifies. No, a machine pumping out gibberish like an infant does not qualify.

    18. Re:but that's the problem with the turing test... by TapeCutter · · Score: 2

      not to downgrade the definition of intelligent

      The problem with all intelligence tests, regardless of whether they are applied to man or machine, is that the term intelligence is usually left undefined. The Turing test itself is an empirical definition of intelligence, however it measures a qualitative judgement by the humans.

      To give an example of what I mean - An ant colony can solve the travelling salesman problem faster than any human or computer (who does not 'ape' the ant algorithm), in fact there are a number of ant algorithms that solve complex logistical problems more efficiently than we can using traditional maths, does any of that make an ant colony more or less intelligent than a human?

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    19. Re:but that's the problem with the turing test... by Capsaicin · · Score: 2

      The idea is to create a machine that is intelligent

      The idea is to create a machine with verbal behaviour of a level sufficient to convince a human that they are conversing with another human. Neither a cat strolling across a keyboard, nor the gibberish of an infant is likely to satisfy that test. But perhaps you believe the teenagers with whom you converse lack intelligence?

      Turing was explicit that intelligence was to be inferred by that behaviour because, he argued, we accept that other humans, based on their verbal behaviour, have minds. I'm not sure I agree.

      --
      Better to be despised for too anxious apprehensions, than ruined by too confident a security. --Edmund Burke
  2. open access to the AIs by dgp · · Score: 3, Insightful

    I want to talk to these AIs myself! Give me a webpage or irc chatroom to interact with it directly. It sounds fascinating even if it's only 'close' to passing the test.

    1. Re:open access to the AIs by AthanasiusKircher · · Score: 3, Insightful

      I want to talk to these AIs myself! Give me a webpage or irc chatroom to interact with it directly.

      It might be interesting, but when these things have been made available in the past, I've always been disappointed.

      Example: Cleverbot, which, as TFA notes, supposedly passed the Turing test by convincing people it was 59% human, as reported almost three years ago here.

      The numbers for Cleverbot sounded a LOT better than this story, and yet -- well, chat with the damn thing for a couple minutes. See what you think. Try to "test" it with even some basic questions designed to fool an AI that even a relatively stupid 13-year-old could answer. It will fail. It comes across as an unresponsive idiot. It's only if you engage with its crap questions that it begins to seem anything like "conversation" -- if you try to get it to actually talk about ANYTHING, it will rapidly become apparent that it's useless.

      I have no doubt this thing does something similar.

    2. Re: open access to the AIs by angel'o'sphere · · Score: 2

      I guess the AI just typoed and meant: retard, instead of Richard.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  3. Stupidly tricked, not clever by gurps_npc · · Score: 4, Informative
    The Turing test is NOT supposed to be limited to 15 minutes, nor is it supposed to be conducted by someone that does not understand the main language claimed to be used by the computer.

    Similarly, the computer must convince the judge it is a human with its full mental capacity, not a child, nor a mentally defective person, nor someone in a coma.

    The test is whether a computer can, in an extended conversation, fool a competent human into thinking it is a competent human being speaking the same language, at least 50% of the time.

    --
    excitingthingstodo.blogspot.com
    1. Re:Stupidly tricked, not clever by Trepidity · · Score: 2, Interesting

      Restricted Turing tests, which test only indistinguishability from humans in a more limited range of tasks, can sometimes be useful research benchmarks as well, so limiting them isn't entirely illegitimate. For example, an annual AI conference has a "Mario AI Turing test" where the goal is to enter a bot that tries to play levels in a "human-like" way so that judges can't distinguish its play from humans' play, which is a harder task than just beating them (speedrunning a Mario level can be done with standard A* search, so isn't that interesting as an AI benchmark). This is useful as a benchmark for things like algorithms that try to mimic action styles in general (whether in games or elsewhere).

      However it would definitely be misleading to claim passing these kinds of restricted Turing tests constitutes passing the Turing test in the sense that Turing had in mind: obviously playing Mario levels in a human-like way is not equivalent to full general intelligence, and serious researchers wouldn't claim that.
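
      For readers who have not seen it, the "standard A* search" mentioned above can be sketched in a few lines of Python. This is a toy version on a 4-connected grid with a Manhattan-distance heuristic, not the search used by any actual Mario bot; the grid, the `a_star` name, and the cell encoding (0 = free, 1 = wall) are all assumptions made for illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Return the length in steps of a shortest path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (estimate, cost so far, cell)
    best = {start: 0}                           # cheapest known cost per cell
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            return cost
        if cost > best.get(cell, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    estimate = new_cost + heuristic((nr, nc))
                    heapq.heappush(open_heap, (estimate, new_cost, (nr, nc)))
    return None


# Toy "level": 0 is walkable, 1 is a wall.
LEVEL = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(a_star(LEVEL, (0, 0), (2, 0)))
```

      The search finds an optimal route but nothing about it looks human, which is why playing "in a human-like way" is the harder benchmark.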

    2. Re:Stupidly tricked, not clever by Anonymous Coward · · Score: 2, Insightful

      So if there were an AI system which genuinely had the intellect and communication capabilities of a 13-year-old Ukrainian boy (conversing in English), you would not consider it intelligent?

      Not until I posed questions in Ukrainian.

  4. Program pretends to be foreign child, not adult by dunkindave · · Score: 5, Informative

    For those who haven't read the article (I read one yesterday and assume the details are the same): The program claimed to be a Ukrainian boy of 13 years old, a non-native English speaker, writing in English to English speakers. This allowed the program to avoid the problem of people using language to make judgements about whether the responses were from a person or a program. Also, since the program was claiming to be a boy instead of an adult, it also greatly reduced what could be expected of the responses, again greatly simplifying the program's parameters and reducing what the testers could use to test. So basically, the Turing Test is supposed to be a test of whether a person can tell if the program acts like a person, but here the test was rewritten to see if the program acted like a child from a different culture and who was supposed not to be speaking in his native language. Many are apparently crying foul.

    I personally agree.

    1. Re:Program pretends to be foreign child, not adult by iluvcapra · · Score: 2

      Foreign, no cultural context, limited language skills -- It sounds like this AI is ready to be deployed at Dell technical support. (You laugh today.)

      --
      Don't blame me, I voted for Baltar.
  5. The Turing test by KramberryKoncerto · · Score: 4, Informative

    ... was not actually performed in the research. End of story.

  6. Re:I see. by RDW · · Score: 4, Funny

    But seriously, yes, it was 'legitimately beaten', just like it's been 'legitimately beaten' in times past, going back to ELIZA in the 60s.

    How does that make you feel?

  7. Re:Isn't that the only way to beat it? by Anonymous Coward · · Score: 3, Insightful

    Actually, that's not the whole point -- it's not even the point at all, which is what most people here are pointing out.

    The test CAN be beaten without clever tricking: it can be beaten with a program that actually thinks.

    This was Turing's original intent. He didn't think, "I'm going to make a test to find someone who can write a program to trick everyone into thinking the program is intelligent." He thought, "I'm going to make a test to find someone who has written a program that is actually intelligent." See the difference?

    The only reason we're in this stupid mess with the Turing test right now is that most laypeople (including reporters) can't see the difference between those two positions.

    (posting AC because I lost my password)

  8. I don't care by mbone · · Score: 4, Insightful

    The first time I saw ELIZA in action, I realized that the Turing test is basically meaningless, as it fails on two fronts. We are not good judges for it, as we are hard-wired to assume intelligence behind communications, and Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.

    This is not to fault Turing's work, as you have to start somewhere, but, really, after all of these years we should have a better test for intelligence.

    1. Re:I don't care by AthanasiusKircher · · Score: 4, Insightful

      The first time I saw ELIZA in action, I realized that the Turing test is basically meaningless, as it fails on two fronts. We are not good judges for it, as we are hard-wired to assume intelligence behind communications, and Turing's assumption that the ability to carry on a reasonable conversation was a proof of intelligence was wrong.

      But that wasn't Turing's assumption, nor was it the standard for the Turing test.

      Turing assumed that a computer would be tested against a real person who was just having a normal intelligent conversation. Not a mentally retarded person, or a person who only spoke a different language, or a person trying to "trick" the interrogator into thinking he/she is a bot.

      Note that Turing referred to an "interrogator" -- this was an intensive test, where the "interrogator" is familiar with the test and how it works, and is deliberately trying to ask questions to determine which is the machine and which is the person.

      ELIZA only works if you respond to its stupid questions. If you actually try to get it to actually TALK about ANYTHING, you will quickly realize there's nothing there -- or perhaps that you're talking to a mentally retarded unresponsive human.

      The "assumption" is NOT "the ability to carry on a reasonable conversation," but rather the ability to carry on a reasonable conversation with someone specifically trying to probe the "intelligence" while simultaneously comparing responses with a real human.

      I've tried a number of chatbots over the years when these stories come out, and within 30 seconds I generally manage to get the thing to either say something ridiculous that no intelligent human would utter in response to anything I said (breaking conversational or social conventions), or the responses become so repetitive or unresponsive (e.g., just saying random things) that it's clear the "AI" is not engaging with anything I'm saying.

      You're absolutely right that people can and have had meaningful "conversations" with chatbots for decades. That's NOT the standard. The standard is whether I can come up with deliberate conversational tests determined to figure out whether I'm talking to a human or computer, and then have the computer be indistinguishable from an actual intelligent human.

      I've never seen any chatbot that could last 30 seconds with my questions and still seem like (even a fairly stupid) human to me -- assuming the comparison human in the test is willingly participating and just trying to answer questions normally (as Turing assumed). If somebody walked up to me in a social situation and started talking like any of the chatbots do, I'd end up walking away in frustration within a minute or two, having concluded the person is either unwilling to actually have a conversation or is mentally ill. That's obviously not what Turing meant in his "test."
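
      A rough illustration of why ELIZA-style bots fall apart under probing: the sketch below is a minimal keyword-and-template responder in Python. It is not Weizenbaum's original script; the rules, the `respond` name, and the fallback line are invented for this example. Anything that does not match a canned pattern gets the same noncommittal reply.

```python
import re

# Hypothetical rules in the spirit of ELIZA: match a phrase, echo part of it back.
RULES = [
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE), "Tell me more about your {0}."),
]
FALLBACK = "Please go on."


def respond(text: str) -> str:
    """Return the first matching template, or the fallback if nothing matches."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Strip trailing punctuation so the echoed fragment reads cleanly.
            return template.format(match.group(1).rstrip(".!?"))
    return FALLBACK


print(respond("I feel tired."))
print(respond("What is two plus two?"))
```

      The first input plays along with the script and gets a plausible reply; the second is a direct question and gets the fallback, which is exactly the unresponsiveness an interrogator would pounce on.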

    2. Re:I don't care by Darinbob · · Score: 2

      That's not the Turing Test. It is supposed to be done by interrogators who are very suspicious, knowledgeable on the subject, and who are actively trying to discern if it is human or not.

      The whole point of the Turing Test was that if it looks like a duck, acts like a duck, and quacks like a duck, then it's good enough to act as a very reasonable substitute in a pond (even if you can't eat it with orange sauce). Likewise, passing the Turing Test should mean that the other end can serve as a reasonable substitute in applications where intelligence is normally required.

      And the 30% mark was pure B.S. I think. Turing mentioned that the interrogators should be 70% sure, not that there was going to be a voting process.

  9. Re:Isn't that the only way to beat it? by jkauzlar · · Score: 3, Insightful

    This is a good point. I'm guessing every single one of the entries into these Turing test competitions since 'Eliza' has been an attempt by the programmer to trick the judges. Turing's goal, however, was that the AI itself would be doing the tricking. If the programmer is spending time thinking of how to manufacture bogus spelling errors so that the bot looks human, then I'm guessing Turing's response would be that this is missing the point.

  10. Re:Isn't that the only way to beat it? by JMZero · · Score: 2

    A legitimately intelligent computer wouldn't have to do much tricking. It'd have to lie, sure, if it was asked "are you a computer?" - but it could demonstrate its intelligence and basic world understanding without resorting to obfuscation, filibustering, and confusion. Those are "tricks".

    By contrast, building a system that can associate information in ways that result in reasonable answers (eg. Darwin), is not so much a "clever trick" as a reasonable step in building an intelligent agent. Both are clever, but hardly in the same way.

    --
    Let's not stir that bag of worms...
  11. Kevin Warwick by Anonymous Coward · · Score: 2, Insightful

    Kevin Warwick is a narcissistic, publicity seeking shitcock.

  12. Re:I see. by NatasRevol · · Score: 4, Funny

    I can't answer that right now.

    --
    There are two types of people in the world: Those who crave closure
  13. Chatbot transcript by MobyDisk · · Score: 2

    I created a chat bot that emulates a 65-year-old grocery store clerk who speaks perfect English. Here is a sample transcript:

    Tester: Hello, and welcome to the Turing test!
    Bot: Hey, gimme one sec. I gotta pee really bad. BRB.
    .
    .
    .
    Tester: You back yet?
    .
    .
    .
    Tester: Hello?
    .
    .
    .

  14. Kobayashi Maru by tekrat · · Score: 2

    Lt Saavik: [to Kirk] On the test, sir. Will you tell me what you did? I would really like to know.
    Dr Leonard McCoy: Lieutenant, you are looking at the only Starfleet cadet who ever beat the "No-Win" scenario.
    Saavik: How?
    James Kirk: I reprogrammed the simulation so that it was possible to save the ship.
    Saavik: What?!
    David: He cheated.
    Kirk: Changed the conditions of the test. Got a commendation for original thinking. I don't like to lose.

    --
    If telephones are outlawed, then only outlaws will have telephones.
  15. None of the above by wonkey_monkey · · Score: 2

    Was Turing Test Legitimately Beaten, Or Just Cleverly Tricked?

    Neither.

    a) It wasn't a Turing Test.
    b) It may have been legitimately beaten by the rules of this test, but were the rules remotely legitimate as far as rating AI is concerned? Most Turing-type tests set the bar at a 50% fool-rate (and that's versus a human). This bot got 30%.
    c) It was about as clever as sending over random keystrokes to pass the Turing-Cat-On-My-Keyboard Test.

    --
    systemd is Roko's Basilisk.