Slashdot Mirror


Turing Test Passed

schwit1 (797399) writes "Eugene Goostman, a computer program pretending to be a young Ukrainian boy, successfully duped enough humans to pass the iconic test. The Turing Test, which requires that computers are indistinguishable from humans, is considered a landmark in the development of artificial intelligence, but academics have warned that the technology could be used for cybercrime. Computing pioneer Alan Turing said that a computer could be understood to be thinking if it passed the test, which requires that a computer dupes 30 per cent of human interrogators in five-minute text conversations."

23 of 432 comments

  1. Re:Turing Test Failed by Anonymous Coward · · Score: 5, Funny

    Why do you think the test failed and is meaningless?

    --ELIZA

  2. Re:Thirty percent? by NoNonAlphaCharsHere · · Score: 5, Insightful

    Because most humans would fail?

  3. Re:Thirty percent? by nitehawk214 · · Score: 5, Insightful

    By random chance you would detect the computer 50% of the time, so that should be the goal.

    Still 30% as "passing" seems unreasonably low.
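
    A rough sketch of that arithmetic, assuming a hypothetical panel of 30 judges (the panel size is not given in this thread) who each flip a fair coin instead of judging. The function name and constants are illustrative only; the binomial tail gives the chance that at least 30% of such judges label the machine human.

```python
from math import comb


def prob_at_least(n_judges: int, k_fooled: int, p_fooled: float) -> float:
    """Probability that at least k_fooled of n_judges are fooled,
    when each judge is fooled independently with probability p_fooled."""
    return sum(
        comb(n_judges, k) * p_fooled**k * (1.0 - p_fooled) ** (n_judges - k)
        for k in range(k_fooled, n_judges + 1)
    )


# Assumption: 30 judges guessing at random; 30% of 30 is 9 judges.
N_JUDGES = 30
THRESHOLD = 9
chance_pass = prob_at_least(N_JUDGES, THRESHOLD, 0.5)
print(f"P(pass by pure guessing) = {chance_pass:.4f}")
```

    Under these assumptions, coin-flipping judges would clear a 30% bar more than 99% of the time, which is the sense in which the threshold looks low.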

    --
    I'm a good cook. I'm a fantastic eater. - Steven Brust
  4. An autist chat simulator duped 100% of people. by Anonymous Coward · · Score: 5, Interesting

    Way back in my college days, I worked in a lab with a guy who wrote a chat bot that babbled on like an autist or otherwise mentally retarded youth would.

    It would dupe 100% of the people who chatted with it. They couldn't distinguish it from an actual autist.

    After seeing this work in action, I learned a very good lesson: the Turing Test is nothing but academic masturbatory fodder. It is not something to be taken seriously.

    1. Re:An autist chat simulator duped 100% of people. by Anonymous Coward · · Score: 5, Funny

      Point of order: Austin is indeed a form of mental retardation.

      I would extend that to most of Texas.

    2. Re:An autist chat simulator duped 100% of people. by Your.Master · · Score: 5, Funny

      That's absoexactally right. The worditudinality of an utterance is defined completely by comprehension. Anywhom that says otherwise is being an obnoxialous prescriptivist!

  5. Voight-Kampff test? by ScooterComputer · · Score: 5, Funny

    Did anyone ask it the questions we already know will trip up a non-human?

    "You're in a desert, walking along in the sand when all of a sudden you look down and see a tortoise..."
    "You're watching a stage play. A banquet is in progress. The guests are enjoying an appetizer of raw oysters. The entree consists of boiled dog..."

    --
    Scott
    "Hokey religions and ancient weapons are no match for a good blaster at your side, kid."
  6. Re:Not literally a test by NoNonAlphaCharsHere · · Score: 5, Funny

    Next you'll say that Turing machines were a thought experiment and never meant to perform calculations in the real world.

  7. Re:Turing Test Failed by EuclideanSilence · · Score: 5, Insightful

    It's a bit of an underhanded way to pass to pretend to be someone who doesn't speak English natively. The point of the test is to have a conversation for 5 minutes, not 5 minutes of "oh I can't understand you because I'm from Ukraine".

  8. Hasn't this happened a bunch of times? by RyanFenton · · Score: 5, Interesting

    Just googling a few seconds brought me to:

    This article about cleverbot, which also eked out enough votes to 'pass' a turing test.

    It all sounds just like Eliza, just put into a character with enough human limitations that you'd expect it not to string together phrases well, or keep to one topic more than a sentence.

    I'd interpret it basically as an automated DJ sound board with generic text instead of movie quotes - you can certainly string a lot of folks along with even really bad ones, but that speaks more to pareidolia than anything else.

    I'd classify this stage of AI closer to "parlour trick" than "might as well be human" that a lot of people think of when they hear Turing test - but that's also part of the test, to see what we consider to be human.

    Ryan Fenton

  9. Re:Turing Test Failed by Spy Handler · · Score: 5, Interesting

    Not only that, a non-native speaker who is a child.

    5 minutes of "oh I can't understand you because I'm from Ukraine" plus 5 minutes of "oh I don't know about that because I'm only 13".

  10. Re:Turing Test Failed by Culture20 · · Score: 5, Funny

    Heck, one of my first programs mimicked an insensate child. Here's some of the responses:





    And I'm sure it used fewer lines of code.

  11. Re:Dupe 30% of humans? by Tablizer · · Score: 5, Funny

    Damn dogs will pass that test.

    One dog would have if it wasn't for those meddling kids.

  12. Re:Not Really Passed... by James McGuigan · · Score: 5, Funny

    Now if only it could have a 33% success rate in convincing other humans it was an exiled Nigerian dictator who needed some help moving his money out of the country.

  13. Re:A pretty low requirement by tangent · · Score: 5, Insightful

    I'd say we keep raising the bar.

    "If a computer can play chess better than a human, it's intelligent."
    "No, that's just a chess program."

    "If a computer can fly a plane better than a human, it's intelligent."
    "No, that's just an application of control theory."

    "If a computer can solve a useful subset of the knapsack problem, it's intelligent."
    "No, that's just a shipping center expert system."

    "If a computer can understand the spoken word, it's intelligent."
    "No, that's just a big pattern matching program."

    "If a computer can beat top players at Jeopardy, it's intelligent."
    "No, it's just a big fast database."

  14. Re:Turing Test Failed by marcello_dl · · Score: 5, Funny

    I'd say the test is obsolete. It's not measuring the advances in AI, but the involution of humans. Have you looked at Facebook status messages?

    --
    ---- MISSING MISCELLANEOUS DATA SEGMENT --- [sigdash] trolololol
  15. Re:Turing Test Failed by Concerned Onlooker · · Score: 5, Funny

    Someone please verify, but I think we have a double-Whoosh here.

    --
    http://www.rootstrikers.org/
  16. Re:Turing Test Failed by Jane Q. Public · · Score: 5, Interesting

    You may consider it verified... subjectively, by a panel of judges, under very narrowly defined circumstances.

    In more seriousness, GP makes a very important point. Not only was this nothing like a real Turing test (a computer would have to fool the average person in more generalized and everyday circumstances for that to happen), the real point here is that we have learned since the days of Turing that even the full-blown Turing test doesn't really indicate much of anything.

    People were fooled (really, really fooled) by Eliza way back in the day. It doesn't mean squat.

  17. Re:Turing Test Failed by lgw · · Score: 5, Interesting

    The Turing test is a great test if done properly (Turing wasn't envisioning Twitter). While it's hard to pin down a good definition of sapience/intelligence (people want to keep redefining it to what humans have and no computer or animal has demonstrated this year), a good answer comes from studying communication. Intelligence in that sense is the ability to resolve the ambiguity of natural language by interaction as well as context.

    In a very shallow way, search engines do that now - with a big enough data set they don't need an abstract mental model to ask "did you mean X?" But that's not really interactive - it's a single suggestion, with nowhere to go from there. When you're walking your dog and someone greets you with "hey, that's a nice dog" is that a content-free politeness, a flirtation, a discussion about dog breeding, a polite reminder that your neighbors are watching to make sure you clean up after the dog?

    Part of being a socialized human is resolving that sort of ambiguity gracefully. We have an abstract mental model of other people and their motivations (learned from growing up with others) and we can use it without even noticing how neat that is that we can do that. Posing as someone young and socially awkward precisely defeats the purpose of the test.

    Another sort of conversation that's hard to simulate is the way enthusiasts about something technical will talk. While it's easy for the computer to have all the technical details handy for something like a sports car enthusiast and tuner, or a baseball stats hound, the test is in the way people actually talk about that stuff. You see a lot of it on /.. Broad, passionate over-generalizations challenged, emotional argument becoming hot at first but then cooling as you discover that what you're really talking about is two different specific data points, and don't really disagree about anything important, just were over-generalizing from different things. That sort of conversation requires both a social abstraction and an abstraction of the topic at hand. E.g. "you think Honda engines are better because you think X is important in an engine, while I think Toyota engines are better because I think Y is important": to mutually understand that requires more than just a knowledge of parts lists, you have to understand why someone would care.

    IMO, if you have an abstract mental model of both people and the meaningful objects in the world (and, critically, yourself), and you make decisions based on modeling the hypothetical results of those choices, you are sapient/intelligent. Without invoking the supernatural, that's all there is to have.

    --
    Socialism: a lie told by totalitarians and believed by fools.
  18. Wake me up when those programs solve this problem. by aepervius · · Score: 5, Interesting

    Wake me up when those programs solve this problem, which most humans could do, but which a machine not *specifically* coded for it will have a hard time with. "Take the first word of each of the next 7 sentences, put them together to form a new sentence, and then answer the question that sentence forms, please:
    * What is your name ?
    * is it cold here ?
    * The test is going well
    * Color me surprised but are you a machine ?
    * of course I am a human
    * the keyboard is clean
    * sky is the tv channel I watch a lot
    * please answer the question now. "


    When one AI not specifically programmed for that problem answers it correctly, I will be surprised and intrigued. Until then, chatbots are just using cheap tricks to fool humans.
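
    For reference, the mechanical half of the puzzle (pulling out the hidden question) is trivial once it is hard-coded, as this minimal sketch shows. The function name is hypothetical, and this is exactly the kind of task-specific code the challenge rules out; it does not touch the hard parts, which are understanding the instruction in the first place and then answering the question.

```python
def first_word_sentence(sentences):
    """Join the first word of each sentence into a new sentence."""
    words = []
    for sentence in sentences:
        # Drop the list bullet, then split on whitespace.
        tokens = sentence.replace("*", " ").split()
        if tokens:
            words.append(tokens[0])
    return " ".join(words)


# The seven sentences from the challenge, copied as written.
prompts = [
    "* What is your name ?",
    "* is it cold here ?",
    "* The test is going well",
    "* Color me surprised but are you a machine ?",
    "* of course I am a human",
    "* the keyboard is clean",
    "* sky is the tv channel I watch a lot",
]

hidden_question = first_word_sentence(prompts)
print(hidden_question)  # What is The Color of the sky
```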

    --
    C. Sagan : A demon haunted world:
    http://www.amazon.com/gp/product/0345409469/
    visit randi.org
  19. Re:Turing Test Failed by Anonymous Coward · · Score: 5, Interesting

    I was a BBS operator in the early 1990s. I had a game, which I titled "in case you really need for chat". It was an Eliza program, that I somewhat tuned to speak as I would (and translated to my local language). Plus, the user got to see the pretended typing in real time — Even with some typos and corrections.

    Looking at the log files was *really* worth a laugh. But it made me feel wrong — Some users left in disgust, after "I" had insulted them.

    And yes, they were not really aware I was playing a Turing test on them, so I don't know if this would have validity. But, by 1994 standards, I do believe it was quite an achievement (or perhaps, my users were mostly silly teens just like myself, and not worthy deciders for what constituted intelligent behaviour).

    (Or maybe I'm *that* stupid in real life)

  20. The 'test' was fixed by Camael · · Score: 5, Insightful

    What has been conducted precisely matches Turing's proposed imitation game.

    While they may have matched the letter of it, they subverted the spirit of the test. This quote from the programme maker in particular is highly suggestive that they lowered the standards :-

    The computer programme claims to be a 13-year-old boy from Odessa in Ukraine.

    "Our main idea was that he can claim that he knows anything, but his age also makes it perfectly reasonable that he doesn't know everything," said Vladimir Veselov, one of the creators of the programme. "We spent a lot of time developing a character with a believable personality."

    To illustrate what I mean by lowered standards, imagine if I set up the same test, with 10 entries, and I tell the judges some of them are 2 year old babies playing on the keyboard. Armed with this information, some of the judges are likely to interpret even gibberish as typed by a human and it is not too farfetched to get more than 30% of them to agree.

    This "result" is bollocks and a pure publicity stunt conveniently falling on the 60th anniversary of Turing's death.

    I want to see the actual transcripts which do not appear to have been released so far, which in itself is highly suspicious.

    1. Re:The 'test' was fixed by Dr. Spork · · Score: 5, Insightful
      Here was a sample of a hypothetical conversation from Turing's original article:

      Interrogator: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?

      Witness: It wouldn't scan.

      Interrogator: How about "a winter's day," That would scan all right.

      Witness: Yes, but nobody wants to be compared to a winter's day.

      Interrogator: Would you say Mr. Pickwick reminded you of Christmas?

      Witness: In a way.

      Interrogator: Yet Christmas is a winter's day, and I do not think Mr. Pickwick would mind the comparison.

      Witness: I don't think you're serious. By a winter's day one means a typical winter's day, rather than a special one like Christmas.

      I think the problem is that the way Turing was picturing the test, the human interrogators would be as smart as Turing and his friends, people who actually know how to ask probing questions. When you look at the conversation above, you see that he had in mind a program that does things which are decades beyond what chatbots can do today. Everybody is dissing the Turing test, and if it has a problem, it's in that Turing overestimated people, in assuming that they actually know how to have conversations of significance. I still think there is something deeply significant about the Turing test, but in the one that I'm picturing, the interrogators must all be broadly educated experts on natural language processing with specific training in how to expose chatbots. And there should be money on the line for the interrogators: $1000 bonus for each correct identification, $2000 penalty for incorrect identification, no penalty for "not sure". If the majority of such experts can be fooled by an AI under these circumstances, then I think we should all be impressed.
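
      A small sketch of the incentive that payout scheme creates, assuming an interrogator who simply maximizes expected winnings (the function names are hypothetical). With a $1000 bonus and a $2000 penalty, committing to an answer only beats "not sure" when the interrogator is more than two-thirds confident.

```python
def expected_payoff(p_correct, bonus=1000.0, penalty=2000.0):
    """Expected winnings from committing to an identification
    when the interrogator is right with probability p_correct."""
    return p_correct * bonus - (1.0 - p_correct) * penalty


def break_even_confidence(bonus=1000.0, penalty=2000.0):
    """Confidence at which committing pays the same as 'not sure' (zero)."""
    # Solve p * bonus - (1 - p) * penalty = 0 for p.
    return penalty / (bonus + penalty)


threshold = break_even_confidence()
print(f"Commit only when confidence exceeds {threshold:.3f}")
print(f"Expected payoff at 50% confidence: {expected_payoff(0.5):+.0f}")
print(f"Expected payoff at 90% confidence: {expected_payoff(0.9):+.0f}")
```

      Under that assumption a coin-flip guess loses $500 on average, so the scheme pushes unsure interrogators toward "not sure" rather than a guess.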