Replacing the Turing Test
mikejuk writes A plan is afoot to replace the Turing test as a measure of a computer's ability to think. The idea is for an annual or bi-annual Turing Championship consisting of three to five different challenging tasks. A recent workshop at the 2015 AAAI Conference of Artificial Intelligence was chaired by Gary Marcus, a professor of psychology at New York University. His opinion is that the Turing Test had reached its expiry date and has become "an exercise in deception and evasion." Marcus points out: the real value of the Turing Test comes from the sense of competition it sparks amongst programmers and engineers which has motivated the new initiative for a multi-task competition. The one of the tasks is based on Winograd Schemas. This requires participants to grasp the meaning of sentences that are easy for humans to understand through their knowledge of the world. One simple example is: "The trophy would not fit in the brown suitcase because it was too big. What was too big?" Another suggestion is for the program to answer questions about a TV program: No existing program — not Watson, not Goostman, not Siri — can currently come close to doing what any bright, real teenager can do: watch an episode of "The Simpsons," and tell us when to laugh. Another is called the "Ikea" challenge and asks for robots to co-operate with humans to build flat-pack furniture. This involves interpreting written instructions, choosing the right piece, and holding it in just the right position for a human teammate. This at least is a useful skill that might encourage us to welcome machines into our homes.
The Turing test was a CONCEPT, not an actual test.
How about learning what it is about before giving idiotic opinions?
Insightful and helpful comments like this are why /. is gaining participants at an outstanding rate.
To really foul things up, you need a computer. -- Paul Ehrlich
I'll build the furniture on my own, thank you very much.
When a computer can poop, then it is A.I. Because everybody poops... everybody but computers, that is.
"The trophy would not fit in the brown suitcase because it was too big. What was too big?"
If you change this to "The trophy would not fit in the brown suitcase snugly because it was too big" I wouldn't be able to answer it, either.
When the copyright term is "forever minus a day", live every day like it's the last.
I like the idea of the IKEA challenge but why include a human? I would think having a robot open a box, pull out the instructions, and assemble the piece of furniture would be huge. Having a person involved just muddles the issue. You obviously might have to start with simple furniture but this seems like a worthwhile challenge as assembling furniture at times can even stump humans.
that should be the criteria
Clever programming and mechanics do not make "AI" and human "robots." Interesting machines, but nothing more. Nature is not an idiot.
E Proelio Veritas.
Pretty much: "We still haven't figured out how to pass the Turing test," so it's time to move the goalposts.
Which reminds me: did anyone else shit their pants when the Seahawks lined up on the 1-yard line? How come Pete Carroll hasn't been fired yet? I'm pretty sure any computer could have called a better play. I mean, you saw how fast Brian Williams got flushed for doing his John Kerry imitation. Oh look, Clarence, another angel just got his wings.
That's fair. However, the article is fair in its opinion too.
What was good in 1950 may not be so relevant anymore.
The base of the test is probably fine. But an updated one for things we want an AI to do today is a good idea.
Much like the Acid tests for browsers. They help set the bar for what we want out of our computers.
Right now most 'AI' is brute force depth searching with some statistical weighting. Is that AI?
The difference is that the "swiftboaters" were lying and ended up getting sued.
fit
v. fitted or fit, fitted, fitting, fits
v.tr.
1.a. To be the proper size and shape for. e.g: These shoes fit me.
Really -- someone suggests a computer program could identify when to laugh at a sitcom? When humans are likely to disagree rather strongly about which parts are the funniest? Heck, even Mycroft's first jokes were on the weak side of humor. It took a lot of coaching from the humans to get (his) jokes classy.
https://app.box.com/WitthoftResume Code: https://github.com/cellocgw
See? Slashdot is the ultimate platform for these Turing tests. Only a human could properly respond to a nested troll three levels deep. Blue Gene probably would have wound up redirecting to a Saturday Night Live skit.
The problem with the Turing Test is that it's so often done wrong. The test is supposed to be adversarial, with two humans and a computer. One human (the investigator) has two terminals and can communicate with the other human and the computer, but doesn't know which is which. The goal of the computer is to convince the investigator that it is the human. The goal of the second human (the foil) is to convince the investigator that he or she is the human. This is then supposed to be repeated with different investigators and foils, and only when a statistically significant portion of the investigators fail is the test passed by the computer.
Investigators should be trying to find which one is human, not simply chatting with the computer. Too often people are simply connected to a chatbot and not told that it might be a computer until after the fact, no foil is involved, etc. The test is also often declared to be passed if even a single investigator fails.
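The "statistically significant" part of the procedure above can be sketched roughly as follows. This is only one way to frame it, under my own assumptions: each trial is reduced to whether the investigator picked the right terminal, the null hypothesis is that investigators are guessing at chance, and the 0.05 threshold and function names are hypothetical choices, not anything from Turing's paper.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def investigators_beat_chance(correct: int, trials: int, alpha: float = 0.05) -> bool:
    """True if investigators identified the human more often than
    coin-flipping would plausibly explain (one-sided binomial test).

    alpha is an assumed significance threshold, not part of the original test.
    """
    return binomial_tail(correct, trials) < alpha


if __name__ == "__main__":
    # Hypothetical tallies: 30 investigator/foil pairings each.
    for correct in (27, 16):
        verdict = "spotted" if investigators_beat_chance(correct, 30) else "not reliably spotted"
        print(f"{correct}/30 correct identifications: computer {verdict}")
```

Note that a single fooled investigator barely moves the tally, which is exactly why declaring a pass on one failure is meaningless. Failing to beat chance with a small number of trials is also weak evidence, so the number of pairings matters.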
Not a sentence!
BlueGene is "just" a computer architecture. Maybe you're trying to make a funny of the no longer existing Deep Blue?
Yes, you should make sure you're in the call center before you share your opinion with this audience
If I were part of a test, trying to convince the interpreter I was real, I would assert that I was Roko's basilisk and an eternity of pain, torment, and virtual damnation in a simulacra awaited the interpreter and everyone they had ever loved... if the interpreter did not immediately vote that my opponent was a computer and I am not.
TFA repeats a common misconception about the Turing Test. It is not a test of whether an AI can fool an average person, but whether it can fool an expert. ELIZA would never fool an AI expert, because that expert would be well aware that even a simple algorithm can be quite good at generating vacuous chit-chat. The pronoun disambiguation is a good test, because AI does that poorly, and humans do it well. But that is not a replacement for the Turing Test, that IS the Turing Test. Using humor is a good way to distinguish AI from humans. As anyone who has learned a foreign language, or raised children, knows, "getting jokes" is one of the last skills mastered. Humor often requires not only knowledge about the physical world, but deep understanding of cultural nuances. But I am not sure how useful that is, since no current AI would come close to passing it, and understanding jokes is probably not the most economically useful target for current AI research.
It's not about moving the goalposts; it's that bullshit "solutions" keep coming up, like programming a chatbot to pretend it is a child who does not speak the language well. That's not AI, that's meta-gaming.
The goal is to come up with challenges that are less exploitable.
An AI to add a laugh track to the Simpsons so you'll know when there has been a joke.
Sheesh, evil *and* a jerk. -- Jade
It seems like the startup investors would get sucked in then. Way more cool to be 2.0 than 1.0.
And tell it to make something useful.
Virtual junk is okay and any virtual tools can be used.
First, what was talked about in the summary is not a replacement for the Turing test, but other tests unrelated to the Turing test outcomes.
Second, why? It seems that the proposal is simply trying to lower the bar so not-so-bright AIs can appear useful to justify their continued, mediocre development. It's the same BS as No Child Left Behind. Sorry, that didn't work out well and neither will this.
You don't need two terminals. All you need is the human interrogator, and have him/her talk to either a human or a computer. I agree that they must be aware of the challenge, and also have the ability to ask some decent questions.
And of course there should be. But that doesn't diminish the importance of the Turing test.
The Turing test has two huge and closely related advantages: (1) it is conceptually simple and (2) it takes no philosophical position on the fundamental nature of "intelligence". That such huge advantages necessarily entail certain disadvantages should come as no surprise to anyone.
The Turing test treats intelligence as a black box, but once you've contrived to pass the test the next logical step is to open up that black box and ask whether it's reasonable to consider what's inside "intelligent" or a tricky gimmick. That's a messy question, and that's *why* something like the Turing test is valuable. It is what social scientists call an "operational definition"; it allows us to explore an idea we haven't quite nailed down yet, which is a reasonable first step toward creating a useful *conceptual* definition. Science builds theories inductively from observations, after all.
If the Turing test were a suitable *conceptual* definition of intelligence, then an intelligent agent would never fail it, but we know that can and does happen. We have to assume as well that people can be fooled by something nobody would really call "intelligence". Stage magicians do this all the time by manipulating audience expectations and assumptions.
Think of the Turing test as a screening test. Science is full of such procedures -- simple, cheap tests that aren't definitive but allow us to discard events that are probably not worth putting a lot of effort into.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
There are no "current AIs", as that would require them to be intelligent, which in turn requires them to be conscious. And they aren't. Not even close.
If we really want to water the term "intelligence" down so that it applies to any clever algorithm, then perhaps we should spend a few minutes contemplating what it is we plan to call an actual intelligence, once we get that far.
"AI" has reasonable meaning in the context of research towards that goal, or as a bright line in the sand that we have yet to reach, much less cross.
I've fallen off your lawn, and I can't get up.
"The one of the tasks"...
"TV program"...
Idiots.
understanding jokes is probably not the most economically useful target for current AI research.
A joke detector? That's funny. I mean a sarcasm detector? That's real useful.
What has become of those compression tests? Wasn't the answer to AI (at least partially) to be found in the ability to compress?
Religion is what happens when nature strikes and groupthink goes wrong.
Could there be more things wrong with this line?
A plan is afoot to replace the Turing test as a measure of a computer's ability to think.
The Turing test doesn't involve the ability to think. It intentionally avoids the concept of thinking, and explicitly targets imitation instead.
It also doesn't measure anything. The computer either succeeds at a subjective imitation or it doesn't.
It also isn't any sort of standard practice that can be "replaced" by something else.
A test of intelligence should be dealing with unforeseen input. The problem with chatbots is that they are just giving pre-planned responses. How about trying to land a rocket on the moon while being bombarded with spurious input from a radar device that was accidentally left on? Given the computers in use by NASA in 1969 that's pretty intelligent behavior.
Another would be landing a rocket on a small floating platform. We'll see how that plays out tonight.
That's all we need. Computers with a sense of humour:
"Oops! I deleted all your files!"
"Just kidding. I moved them to your Documents folder. :P"
I do not fail; I succeed at finding out what does not work.
The Turing Test has been abused, bypassed, and cheated to the point that almost no one knows what the actual Turing Test is. At this point, a new test needs to be created, a test that is difficult to cheat without making it obvious that it's not the real test. This could be "The Real Turing Test administered by [reputable group]".
Or we could make a new test, with incredibly explicit criteria that no one can nerf with a straight face and a different name. But from the sounds of it, it would be an easier test.
Don't waste your vote! Vote for whoever you want, unless you live in a swing state it won't matter anyways
The original Turing Test, published in "Computing Machinery and Intelligence" as the "Imitation Game", was not about whether a machine could successfully pretend to be a human.
He proposed a test in which a computer and men both pretend to be women, and the test is passed if the computer is more successful at lying about being a woman than the men are.
http://en.wikipedia.org/wiki/T...
I don't see that as a problem with the test itself.
I see that as various individuals trying to cheat in order to claim that they have achieved something they have not.
Suppose someone claimed to have beaten the world record for the 200 meter dash. But could only do it with a 190 meter headstart.
Okay, no headstart but I get to use a motorcycle.
Okay, okay, no headstart and no motorcycle but I will be using "meters" that are 10cm long.
No one would bother reporting on those because those are STUPID.
But the equivalent claims can be made about "beating" the Turing Test because the people reporting on it are STUPID. As you've pointed out, the test itself is easy to set up and easy to verify. There is no problem with the test.
Not having interacted with Watson, I don't know what it is capable of. I have interacted with Goostman, and it can fool you only if you want to be fooled - it does not take more than three or four exchanges to notice that it is barely better than the venerable Eliza. As for Siri, Cortana, etc., I have never been able to use them for anything but the most trivial tasks (which I can do myself faster and more reliably) and for grins and giggles - they are way too stupid for anything else.
The goal of the second human (the foil) is to convince the investigator that he or she is the human.
This is a part that often gets missed or glossed over. Not just that there needs to be a human in the mix with the computers, but that the foils are *actively trying* to convince the investigators that they're *human*. The foils' "win" condition is if the investigator declares them human and the computer not human.
I've seen too many examples where foils somehow misinterpret their goal as making the competition hard for the investigators/judges - that is, the foils try to act like a computer. They've misinterpreted their role, thinking that because the computer is trying to fool the investigator, their job is also to fool the investigator. This, of course, fails miserably. Humans can do a bang-up job of acting like a computer if they put their mind to it. So if you have a computer acting like a human, and a human acting like a computer, the investigator will have great difficulty with it.
As anyone who has spent at least a minute with "Turing Test Winning" chatbots can attest, they're crap. The only way you'd think they might be human is if you think they're a human trying to act like a computer, or are trying to make your life hard. If you knew that the human was doing its best to prove to you it actually was a human, the inanities of CleverBot, or the aloof petulance of Eugene Goostman (the "13 year-old Ukrainian") wouldn't ever fly. You'd know something was wrong because of their inability to work with you to convince you they are human.
tl;dr: Most "Turing Tests" are set up to prove that a human can convince you it's a computer, not the other way around.
A long time ago I used to work in the field of AI (expert systems and neural nets). It frustrates and pisses me off no end how frequently the press and even a lot of IT people fail to understand what is essentially a straightforward test and then complain about it being inadequate for modern computing. No computer has passed it, not even close. The test REQUIRES a human and a computer. The test REQUIRES the expert to be aware that one is human and the other a computer. The test REQUIRES that they then get to interrogate both to try to discover which is which (not ask one question, not read a piece of generated text and then try to guess; they get to question them for a considerable time). The test REQUIRES this to be done many times, with differing test subjects, to get a statistically significant sample. The test is as relevant today as it was when it was devised.
I quite liked how they handled it in the recent film "The Machine" ( http://www.imdb.com/title/tt23... ). Questions like "Which smells better, a hospital corridor or a donkey's ass?" and "Mary saw a puppy in the window and she wanted it. What did Mary want?"
I am a viral sig. Please copy me and help me spread. Thank you.
You are not allowed to redefine the test just because it makes you more comfortable to do so. The original paper simply said "A man, a woman, and an interrogator". It did not qualify that interrogator as an expert, but simply the one who poses the questions (thus, an interrogator) and is then to state his opinion of the gender of each. He will have some amount of inaccuracy. Then, if one of the two is replaced by a machine, does his accuracy improve?
Actually, I think we do. We at least have an actual model, free of woo-woo, for which no counter evidence has been brought forth as yet.
Even the low level stuff seems to finally be yielding some clarity.
I've fallen off your lawn, and I can't get up.
No, they have to talk to both a human (trying to convince the investigator that he/she is human) and a computer. Removing the foil means it's not the Turing test anymore, it's a very different test.
Not a sentence!
You are correct, I should have said it's a problem with the popular conception of the Turing Test. The popular descriptions in the media are quite unlike the test described by Turing.
Not a sentence!
Bullshit. http://www.artificial-intelligence.com/comic/7
People's vanity about human exceptionalism has them move the goal-posts as required to preserve their sense of identity at the top of the food chain.
AI is already able to beat people at most isolated tasks. With ROS uniting the disparate fractured efforts under a single framework, the inefficiency of researchers working on problems in isolation from each other has been solved. There's a standard now, and a couple versions of jQuery from now: your personal shopper sales AI will be loading your psychological profile push-buttons as a cookie and monetizing the fuck out of human frailty.
H1B visas will be able to replace the human touch of the retail experience in under 10 lines of code. The singularity is already here, most people are just too blind to recognize what's staring them in the face. The machines will have us by the balls long before they start cackling like a super villain from a movie. We're already working for them in the same way alcoholics work for the bottle.
It won't be long until every written word on the internet is linguistically fingerprinted, identifying the author better than an IP address. All the sock-puppets will fall off, and a search engine like archive.org will allow you to track down every word written online by anyone, given only a writing sample.
The cylons won. Humans are on the retreat.
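For what "linguistic fingerprinting" might look like at its very crudest, here is a toy sketch. Everything in it is an assumption for illustration: the short function-word list, the relative-frequency "fingerprint", and cosine similarity as the comparison. Real authorship attribution uses far richer features and far more text.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny list of function words; real stylometry
# uses hundreds of features.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "it", "for", "but", "not")


def fingerprint(text: str) -> list:
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two fingerprints (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    known = "The test is not about the machine but about the judge, and that is the point of it."
    unknown = "It is not the program that matters but the interrogator, and that is the problem."
    print(cosine_similarity(fingerprint(known), fingerprint(unknown)))
```

On samples this short the score means very little; the sketch only shows the shape of the idea.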
"No existing program — not Watson, not Goostman, not Siri — can currently come close to doing what any bright, real teenager can do: watch an episode of "The Simpsons," and tell us when to laugh."
Doesn't the Simpsons have a laugh track to tell you when to laugh? I think a program to recognize the laugh track would be pretty easy.
Fucking bravo!!!! Finally.
What the test has proved is that computer science majors have no fucking clue what the Turing test is about.
I ask all computer science morons: in what language do they propose to hold their new test?
If they find one without language, then they are finally no longer using the Turing test; but there is also a good chance they are no longer testing intelligence either.
P.S. The real Turing test never ends. The toy Turing test is simply PR bullshit.
The code that can fool all the human participants and other AIs while detecting all the other AIs is clearly intelligent, and more so than the average human, which is the sort of functionality that is actually required.
The next stage is a GAI that can generate a better AI that passes the above test, including evading and detecting its own parent GAI.
d@3-e.net
The Turing Test is a thought experiment. It's just saying "if you can talk to this, and can't tell if it's a person or a computer, then it doesn't matter: it's intelligent." It's not a method for a scientific, practical process. It's just something to think about when considering what might constitute intelligence.
"The WOPR spends all of its time thinking about [Turing Tests]. 24 hours a day, 365 days a year, it plays an endless series of [Turing test 'games'], using all available information on the state of [human sentience]. It has already proved the existence of [machine intelligence] as a game, time and time again. It estimates human and machine responses to our test responses to their responses, and so on. Estimates probabilities, tallies the score, and it looks for ways to ---"
"The point is, key decisions of every available option in determining [the presence of Artificial Intelligence] have already been made by the WOPR."
"So what you're really telling me is all this trillion dollar hardware is really at the mercy of those men with the little brass keys...?"
"That's exactly right. Whose only problem is that they are human beings. In 30 days, we could upgrade the Turing Test scoring process with electronic relays. Get the men out of the loop."
Which... as it would seem... we might all welcome, I for one.
And then, 150,000 years later...
<blink>down the rabbit hole</blink>
that maybe the point shouldn't be to recreate human intelligence, but lay a foundation for a unique intelligence to evolve itself. It may end up not even understanding the concept of words and sentences, but still be capable of horizontal associations that haven't been considered as of yet that yield data that makes sense to them, and could possibly further our own progress as humans?
If you understand that intelligent life on other planets, aliens, can be as simple as a microbe, then I don't think this should be hard to grasp.
Rather than recreate something that can "beat us at our own game" or "fool" us, maybe we should focus on something that is in itself its own unique, silicon-based, self-referential and self-modifying species.
Using a language based approach completely undermines this concept. Thoughts?
If the trophy is too big, the sentence contains a dangling participle, which would be bad grammar. According to the rules of English, the suitcase is too big. Simple logic tells you which is too big.
How is this a replacement for the Turing test?
"Describe in single words, only the good things that come into your mind. About your mother."
Have gnu, will travel.
Exactly! The actual Turing test is a great test, but the common modifications remove its ability to determine anything of interest.
Not a sentence!
The pronoun disambiguation is a good test, because AI does that poorly, and humans do it well. But that is not a replacement for the Turing Test, that IS the Turing Test.
Indeed. Here's an excerpt from Turing's original paper that described the "imitation game," replying to a possible objection that his test would not be able to be used to gauge true understanding as a human might:
Probably [the objector to the test] would be quite willing to accept the imitation game as a test. The game (with the player B omitted) is frequently used in practice under the name of viva voce to discover whether some one really understands something or has "learnt it parrot fashion." Let us listen in to a part of such a viva voce:
Interrogator: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?
Witness: It wouldn't scan.
Interrogator: How about "a winter's day," That would scan all right.
Witness: Yes, but nobody wants to be compared to a winter's day.
Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
Witness: In a way.
Interrogator: Yet Christmas is a winter's day, and I do not think Mr. Pickwick would mind the comparison.
Witness: I don't think you're serious. By a winter's day one means a typical winter's day, rather than a special one like Christmas.
And so on. What would Professor Jefferson say if the sonnet-writing machine was able to answer like this in the viva voce? I do not know whether he would regard the machine as "merely artificially signalling" these answers, but if the answers were as satisfactory and sustained as in the above passage I do not think he would describe it as "an easy contrivance."
THAT is the sort of standard of AI that Turing was envisioning could be passed in his "test." It isn't a computer pretending to be a non-responsive teenager with an attitude problem who doesn't really speak the same language as the interrogator (as some chatbots might claim).
It's an idea of AI as something that could debate word replacement in a Shakespearean sonnet, would understand and be able to process poetic scansion, understand the subtle word meanings and connotations in language, and be able to synthesize these various things together while applying such concepts to evaluations of classic literary references.
Turing's test then assumes an AI competent enough to have a flawless conversation on the level of a bright university student or even a colleague of Turing's. Now, granted, we might find the literature quiz a little unnecessary, but in a more general sense this example gets at the idea of probing the AI's understanding of concepts, connecting disparate uses of things together (like a literary character to an abstract concept to a matter of style or poetic form), and in general a fluent and adaptive recognition of linguistic meaning.
I think we would all agree that the various chatbots that have claimed in recent years to have "passed the Turing test" are NOWHERE near this level.
This is the kind of standard Turing himself explicitly mentioned in his original article on the test. And frankly, if I encountered an AI that could have a conversation this fluid and wide-ranging (even if not on literature specifically) in flawless English, I'd be happy to declare it "intelligent." But we don't have anything close to that -- and pretending the "Turing test" is obsolete and needs to be more strict is misunderstanding the ridiculously high expectations Turing himself set out many decades ago.
It is not a test of whether an AI can fool an average person, but whether it can fool an expert.
You are not allowed to redefine the test just because it makes you more comfortable to do so. The original paper simply said "A man, a woman, and an interrogator". It did not qualify that interrogator as an expert, but simply the one who poses the questions (thus, an interrogator)
Well, please re-read the original paper.
You are correct that the original test did not specify an AI expert as interrogator. On the other hand, read the types of dialogue Turing offers as examples. It's very clear that he is imagining "interrogators" (note that word -- it implies someone with a strong drive to ask probing questions) who are not only quite intelligent but also keep asking very probing questions designed to test the intellect of the person/thing on the other side.
The standard is clearly NOT, "Gee, can I have a nice small talk conversation?" Instead, the "interrogator" uses questions varying from computational problems to chess problems to questions about composing a sonnet to detailed discussion of subtle linguistic meanings in English, related in abstract ways to classic literature.
That doesn't sound like your "average Joe" interrogator to me. Does it to you? I'm sure Turing didn't expect all his interrogators to be so intelligent, but they were clearly expected (based on his sample dialogues) to understand how to probe intelligence at a pretty sophisticated level.
"Mary saw a puppy in the window and she wanted it. What did Mary want?"
An ambiguous subject in a phrase is a classic problem in AI. However, natural language algorithms (such as the one found in Watson) have been able to resolve ambiguous statements like your "Mary" example and the trophy/bag example in TFS for over a decade now. The trick to resolving such ambiguities is the same one used by humans: context, probability, and lots of prior examples.
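A toy illustration of the "probability and lots of prior examples" idea, not a description of how Watson or any real system works. The prior examples, the verb/object framing, and the add-one smoothing are all hypothetical choices made for the sketch.

```python
from collections import Counter

# Hypothetical "prior examples": (verb, object) pairs such as might be
# harvested from a large corpus. These counts are invented for illustration.
PRIOR_EXAMPLES = [
    ("want", "puppy"), ("want", "puppy"), ("want", "toy"), ("want", "window"),
    ("clean", "window"), ("clean", "window"),
    ("open", "window"),
]


def resolve_pronoun(verb, candidates, examples=PRIOR_EXAMPLES, smoothing=1.0):
    """Pick the candidate most often seen as the object of `verb`.

    Returns (best_candidate, scores), where scores are smoothed estimates
    of P(object | verb) for each candidate.
    """
    counts = Counter(examples)
    verb_total = sum(c for (v, _), c in counts.items() if v == verb)
    vocab = {obj for _, obj in examples} | set(candidates)
    scores = {
        cand: (counts[(verb, cand)] + smoothing) / (verb_total + smoothing * len(vocab))
        for cand in candidates
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # "Mary saw a puppy in the window and she wanted it." What is "it"?
    best, scores = resolve_pronoun("want", ["puppy", "window"])
    print(best, scores)
```

Winograd schemas are built specifically so that this kind of shallow co-occurrence statistic is not enough. Swap "too big" for "too small" in the trophy sentence and the answer flips while the surrounding words barely change.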
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
The Turing Test is a thought experiment. It's just saying "if you can talk to this, and can't tell if it's a person or a computer, then it doesn't matter: it's intelligent." It's not a method for a scientific, practical process.
If that's true, then why did Turing claim in his original paper that by the year 2000, computers would be able to fool humans and "pass the test" 30% of the time? Why state such a specific prediction for a test that was not intended to be practical and only a "thought experiment"?
It's just something to think about when considering what might constitute intelligence.
Why can't it be both? In Turing's time (and still today) there were (and are) people who think real strong human-like AI is impossible. In order to evaluate "intelligence," though, we need a standard test that we could agree on. Turing attempted to roughly define the outlines of such a test, which also involved a lot of philosophical debate. On the other hand, he predicted within 50 years of his paper that computers would be around which could pass this test, which suggests that he thought it was in fact a practical (if a little vague) way of gauging progress in AI.
A rigorous definition of general intelligence now exists and has been applied by the DeepMind folks. See this video lecture by DeepMind's Shane Legg at Singularity Summit 2010 on a new metric for measuring machine intelligence.
If you want something more accessible to the general public, the Hutter Prize for Lossless Compression of Human Knowledge has the same theoretical basis as the test used by DeepMind and has the virtue that it uses a natural language criterion, in the form of a Wikipedia snapshot. If the 100 MB snapshot of Wikipedia used by the Hutter Prize is no longer challenging enough, then substitute Matt Mahoney's Large Text Compression Benchmark, which is basically just the Hutter Prize enlarged by an order of magnitude.
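To make the compression criterion concrete, here is a minimal sketch that compares how small a few standard-library compressors can make a text sample. It is only an illustration of the scoring idea (smaller output means a better model of the text). The sample string and helper names are my own, and the real benchmarks also count the size of the decompressor, which this ignores.

```python
import bz2
import lzma
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Compressed size in bytes under three general-purpose compressors."""
    return {
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


def bits_per_character(original_len: int, compressed_len: int) -> float:
    """Compressed bits spent per original byte; lower is better."""
    return 8 * compressed_len / original_len


if __name__ == "__main__":
    # Hypothetical stand-in for a Wikipedia snapshot.
    sample = b"The trophy would not fit in the brown suitcase because it was too big. " * 200
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {size} bytes, {bits_per_character(len(sample), size):.3f} bits/char")
```

A highly repetitive sample like this one compresses trivially. The interesting part of the real contest is squeezing ordinary prose, where doing better requires predicting what a human writer would say next.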
Seastead this.
Turing wanted to show that computers could be intelligent, while avoiding the nasty problem of giving it a definition. (he was a smart guy--decades later we still haven't found a good definition.)
His trick was to use humans, the only intelligent beings available, as a standard for comparison. Hence the imitation game.
But his real trick was to strip away the bullshit. "Machines can't taste strawberries", "machines can't feel love", "machines don't have consciousness". Blah, blah. By using a teletype for communication he reduced human behavior to a stream of ASCII characters, while still allowing the essence of intelligent behavior to get through.
But he didn't take it far enough. We need a stronger filter that hides pop-culture references, language idioms, maybe even pronoun misuse. Don't imitate a human--just imitate the intelligent aspects of a human.
Anyway, I used to think he was a clever, sneaky bastard, but it turns out he really believed this was a good idea. A few years later he was talking about the imitation game in a radio interview, saying that we should really do it. Sadly, the press still thinks that AI researchers care about the imitation game.
I saw a shadow puppet show today. An expert, with only his hands, created landscapes, animals and detailed caricatures of people, all in captivating, brilliant, morphing motion. The thought struck me: "Here's a good 'Turing test' for robotic prostheses." The dexterity on display is seldom encountered, and seemingly still far from being replicated in any capacity by our crude robotic attempts using rigid polymers and metals.
Reasons the Turing Test will always come up:
1. The Turing Test... was invented by Alan Turing, and he was a genius. He thought about this stuff decades before others. Academics will make new tests. This will always be the original.
2. The Turing Test... is the ultimate test. If you can be convinced a machine is a human, that's it. It is not a test of a few abilities; the Turing Test addresses all mental abilities.
3. The Turing Test... is ambiguous. This is a major criticism, but also why we'll never forget it. It shows how complex it is to be human, and can be qualified in so many ways.
4. The Turing Test... is hard to win. We need intermediate tests to show progress, but the TT will always show how much further we have to go.
Thus, I am officially tired of media trying to call the TT outdated, obsolete, or no longer relevant, because the usual motive is to lessen the blow of how poorly cognitive AI does now. We'll be talking about the Turing Test for decades to come, even for the simple reason that it was the first of its kind. It won't be "replaced".
One cannot make a "plan" to "replace" something which has already been committed to history. Every computer knows this (stupid humans).
"No existing program — not Watson, not Goostman, not Siri — can currently come close to doing what any bright, real teenager can do: watch an episode of "The Simpsons," and tell us when to laugh."
This may be easier to do than you would expect. Laughter is an innate response to a particular type of surprise. The surprise is triggered when you're expecting one thing to happen, but the situation turns out in an unexpected way instead. Your brain then triggers laughter as a way to deal with your shattered worldview.
A neural network trained on voice recognition and text prediction might then be able to predict where these cognitive dissonances occur. They could be measured as a high error rate between expected output and actual output.
Would be fun to test at least ;)
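Here is a toy version of that "high prediction error" idea, as a sketch only. Instead of a neural network it uses a tiny bigram word model with add-one smoothing, and the training text, the 4-bit threshold, and all the function names are made up for illustration. It flags words the model did not expect, which is a long way from knowing when to laugh.

```python
import math
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, word in zip(tokens, tokens[1:]):
        counts[prev][word] += 1
    return counts


def surprisal(counts, vocab_size, prev, word):
    """Surprisal in bits of `word` following `prev`, with add-one smoothing."""
    total = sum(counts[prev].values())
    prob = (counts[prev][word] + 1) / (total + vocab_size)
    return -math.log2(prob)


def surprising_positions(counts, vocab_size, tokens, threshold_bits):
    """Return (index, word, surprisal) for words above the surprisal threshold."""
    flagged = []
    for i in range(1, len(tokens)):
        s = surprisal(counts, vocab_size, tokens[i - 1], tokens[i])
        if s > threshold_bits:
            flagged.append((i, tokens[i], s))
    return flagged


# Made-up training data: the model only ever sees dogs chasing cats.
training = ("the dog chased the cat . " * 20).split()
vocab = set(training) | {"lawyer"}
model = train_bigrams(training)

# The "punchline" breaks the expected pattern.
test_line = "the dog chased the lawyer .".split()
flagged = surprising_positions(model, len(vocab), test_line, threshold_bits=4.0)

for index, word, bits in flagged:
    print(f"position {index}: {word!r} was unexpected ({bits:.2f} bits)")
```

Surprise alone obviously over-generates (typos and plot twists are surprising too), so this would be at best one input to a real when-to-laugh detector.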
To easily detect most AI, tell it this:
The Turing Test was set up as a three-entity interaction: one questioner, one human, and one AI. The questioner is supposed to converse with both the human and the AI (presumably by typing and reading messages), and decide which of the other two is the human and which is the AI. There was no mention of expertise in any field, and it would have been hard for Turing to put that in, since there were no AI experts in Turing's day.
Two of the questions could be put into the Turing test easily: the pronoun assignment one and the when-to-laugh one, although the latter would have to be in reference to something the questionee claimed to have seen. The assembly one couldn't be part of it, but is a good test.
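The three-entity setup described above can be sketched as a small harness. Everything here is hypothetical scaffolding: the function run_trial, the canned human and machine respondents, and the deliberately naive judge are all made up to show the shape of the protocol, not to model a real contest.

```python
import random


def run_trial(judge, human, machine, questions, rng):
    """Run one trial; return True if the judge correctly picks out the machine."""
    # Hide identities behind labels assigned at random.
    labels = ["A", "B"]
    rng.shuffle(labels)
    respondents = {labels[0]: human, labels[1]: machine}
    machine_label = labels[1]

    # The questioner puts every question to both hidden respondents.
    transcript = {"A": [], "B": []}
    for question in questions:
        for label in ("A", "B"):
            transcript[label].append((question, respondents[label](question)))

    # The judge sees only the labelled transcript and names the machine.
    return judge(transcript) == machine_label


# Hypothetical stand-ins for the two respondents.
def human(question):
    return "Hmm, let me think about that."


def machine(question):
    return "I do not understand the question."


def naive_judge(transcript):
    """Accuse whichever respondent gives the canned non-answer."""
    for label, exchanges in transcript.items():
        if any("do not understand" in answer for _, answer in exchanges):
            return label
    return "A"


rng = random.Random(0)
results = [
    run_trial(naive_judge, human, machine, ["What was too big?"], rng)
    for _ in range(10)
]
print(sum(results), "of", len(results), "trials identified the machine")
```

The pronoun question and the when-to-laugh question would simply go into the questions list. The furniture-assembly task has no place in a text-only transcript like this one.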
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
Turing's test was about the ability to imitate human behavior/knowledge. The real question we need to answer I will call the Mycroft test. The purpose of the test is to determine if the program has earned the right to not be turned off, that is, does it have a right to a trial before it is "terminated"? A program that has earned that right has crossed the blurry line between inanimate and "human" in a way that should be important to us. Defining a test that can measure this is at the heart of deciding what makes us us, vs what makes us tick.
"There is no god but allah" - well, they got it half right.
Let me see if I've got this straight. If you can watch an episode of the Simpsons and know when to laugh, then you're intelligent.
Or at least a real person.
Better go with answer number two. Doh!
I have watched episodes of the Simpsons where I had no idea where to laugh...