Slashdot Mirror


'Tit for Tat' Defeated In Prisoner's Dilemma Challenge

colonist writes "Tit for Tat, the reigning champion of the Iterated Prisoner's Dilemma Competition, has been defeated by a group of cooperating programs from the University of Southampton. The Prisoner's Dilemma is a game with two players and two possible moves: cooperate or defect. If the two players cooperate, they both have small wins. If one player cooperates and the other defects, the cooperator has a big loss and the defector has a big win. If both players defect, they both have small losses. Tit for Tat cooperates in the first round and imitates its opponent's previous move for the rest of the game. Tit for Tat is similar to the Mutual Assured Destruction strategy used by the two nuclear superpowers during the Cold War. Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating. If a Southampton program determined that another program was non-Southampton, it would defect." Update: 10/14 15:08 GMT by J : If anyone wants to try writing their own PD strategy and see how it fares in a Darwinian contest, I'll host a tournament of Slashdot readers. Here are the docs, sample code, notes on previous runs, and my email address.

28 of 356 comments (clear)

  1. ...by cheating! by The-Bus · · Score: 4, Insightful
    If the program recognized that another player was not a Southampton entry, it would immediately defect to act as a spoiler for the non-Southampton player. The result is that Southampton had the top three performers -- but also a load of utter failures at the bottom of the table who sacrificed themselves for the good of the team. Another twist to the game was the addition of noise, which allowed some moves to be deliberately misrepresented. In the original game, the two prisoners could not communicate. But Southampton's design lets the prisoners do the equivalent of signaling to each other their intentions by tapping in Morse code on the prison wall. Kendall noted that there was nothing in the competition rules to preclude such a strategy, though he admitted that the ability to submit multiple players means it's difficult to tell whether this strategy would really beat Tit for Tat in the original version. But he believes it would be impossible to prevent collusion between entrants.


    Yeah, that's not the Prisoner's Dilemma. Or even the Iterated PD. This whole "signaligng Morse code" on the prison walls is nonsense, because it was not part of the original plan. Just because it's not in the rules doesn't mean you can do it. In Chess there's no rule specifically against me bringing a SuperGrape(TM) onto the board. The SuperGrape(TM) immediately destroys all pawns on a color of my choosing.

    No, it doesn't work that way.

    While this is an interesting experiment, it's not a true victory.
    --

    Small potatoes make the steak look bigger.

  2. Evolutionarily stable? by Dr.+Manhattan · · Score: 4, Insightful
    From TFA:"Our initial results tell us that ours is an evolutionarily stable strategy -- if we start off with a reasonable number of our colluders in the system, in the end everyone will be a colluder like ours," he said.

    It's not clear to me how the entries determined who would be the 'master' and who would be the 'slave'. It seems that if you had lots of 'colluders' around who could be induced to 'suicide' for another's benefit, you'd very quickly get cheaters who worked to be the 'master' in all situations.

    This strikes me as a lot more reminiscient of the Hawk/Dove situation.

    --
    PHEM - party like it's 1997-2003!
  3. The important codicil to the story is... by aug24 · · Score: 4, Insightful

    "The result is that Southampton had the top three performers -- but also a load of utter failures at the bottom of the table who sacrificed themselves for the good of the team."

    J.

    --
    You're only jealous cos the little penguins are talking to me.
  4. Re:Uh, isn't that just cheating? by Urban+Garlic · · Score: 4, Insightful

    Well, part of the interest is that these programmers found a way, within the rules, to get more information, by means of their "secret handshake". The important lesson (to my mind) is that the environment can be manipulated in surprising ways to get a desired result. That's creativity and innovation doing its thing.

    Interestingly, this strategy is also fairly "brittle", I think, in that simple rule-changes could foil it. Requiring only one submission per team, for example, or scoring teams according to the total (or average) scores of all their programs, would complicate any strategy of collusion.

    --
    2*3*3*3*3*11*251
  5. Re:Uh, isn't that just cheating? by julesh · · Score: 3, Insightful

    I suspect you could come up with a solution that beats this system by mimicing it, then changing its behaviour suddenly.

    That's why this isn't a good answer to the problem -- not because it's somehow "cheating", but because its a strategy that only works in limited circumstances and fails spectacularly in others. Kind of like chasing down losses when gambling.

  6. Re:Uh, isn't that just cheating? by Minwee · · Score: 4, Insightful

    I don't see it as cheating. It's a lot like Bridge -- The rules say that you can't show your partner your hand and you can't tell them what you have, but you are allowed to use prearranged bidding conventions to pass information across the table. All that the Southhampton agents did was use a bidding strategy. They did act as a team, but they had no out-of-game way of knowing that they were up against a team member. That doesn't break any rules, and it did work. The Southampton team took the top three spots in the competition. If you insist on comparing the entrants to people, consider this. They worked as a team, for the good of the team, knowing that at least some of them would win even if the others bombed. People do that kind of thing all the time outside of competitions. Why should it be so out of place here?

  7. Re:Practicality by Daniel_Staal · · Score: 4, Insightful

    Sorry, you've probably already lost that one. The prisoner's dilemma is quite useful in normal life, or at least the thinking that gives rise to the solution is. It applies any time there is significant advantage to be gained by working together, but also much advantage to be the one 'cheating'.

    For /., try this interpretation:
    If we both share our source code, we will both will be more productive.
    If I share my source code, and you don't, you can be more productive. (Assuming you can use mine.)
    If neither of us share, we both will have to re-create other's work...

    --
    'Sensible' is a curse word.
  8. Re:Asian mentality by Perianwyr+Stormcrow · · Score: 3, Insightful

    China has usually been cultured but not powerful. Chinese history is a long sequence of conquests by powerful outsiders (Manchurians, Mongols, Europeans.)

    --

    What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

  9. It always works for card games by CyberGarp · · Score: 2, Insightful

    Communication between secret partners has been one of the most undefeatable stratgies in cards for a long time. Didn't take a computer to figure that out. Someone just figured out how to do in the rules given for this competition.

    --

    I used to wonder what was so holy about a silent night, now I have a child.
  10. Re:The article got it wrong by Anonymous Coward · · Score: 5, Insightful

    "The article got it wrong: they compared the tit-for-tat strategy for the iterated prisoner's dilemma to mutual assured destruction. That's wrong, since nuclear war is usually considered to be a one-time game: once you've blown each other up, there is no next round. Tit-for-tat requires that there always be a following round."

    The nuclear MAD comparison is apt, because of the time lag between launch detection and detonation. During the flight time of the first launch, there is time for several rounds to occur.

    Actually, the nuclear standoff could be considered an ongoing PD game with both sides playing Tit-For-Tat strategies. The rounds occur every few minutes with both sides asking "did the other side screw us yet" and responding "no, so we won't screw them yet". This PD game has consisted of millions and millions of turns already, with both players using historical knowledge to influence their current choices.

  11. Secret societies & paranoia by G4from128k · · Score: 2, Insightful

    This story illustrates the power of groups and societies to coordinate to the detriment of individuals and outsiders. The Southampton team used a "secret handshake" to recognize members of the society and discriminate against outsiders. It is a natural explanation for people's fear of closed/secret societies -- people fear the group's ability to break the rules of individualistic "fair play."

    If the agents in the game were capable of higher order reasoning and could see these coordinated actions between members, then they would become paranoid -- all the Southampton team members were "out to get them."

    --
    Two wrongs don't make a right, but three lefts do.
  12. It's interesting stuff by Perianwyr+Stormcrow · · Score: 2, Insightful

    Tit for tat has a secret handshake too, but it's a code of ethics. It is robust in any iterated situation. That's what makes it neat.

    --

    What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

  13. Tit for Tat always has been beatable... by jafiwam · · Score: 2, Insightful

    Except Tit for Tat is more robust than other plans, deals well with a wide variety of opponents, and is easy for opponents to "figure out" and is "forgiving" so it does not get caught in endless loops of mutual punishment easily.

    Being that, beating Tit for Tat isn't that big of a deal. Doing BETTER than Tit for Tat consistently _IS_ a big deal.

    The game is a positive sum game, so it pays off to end up in a cooperative (or semi-cooperative) sequence over repeated "defections".

    For some good reading on the Prisoner's Dilemma Game and how it fits in some biological systems read;

    "The Evolution of Cooperation" by Robert Axelrod (and newer books)

    "The Selfish-Gene" by Richard Dawkins

    There may be more recent books too, it's been while since I studied the subject.

    Having one plan that can beat Tit for Tat

  14. Re:That's not really so special by harrkev · · Score: 4, Insightful

    Yup! It is the "outnumbers" thing which (in my opinion) makes things unfair.

    Had this been an actual prisoner's dilemma, this winning strategy would require recruiting a large number of thugs who LIKE going to prison and are willing to "take one for the team."

    Although cooperation is not explicitly defined as being against the rules, IMHO, it goes against the "spirit" of the competition. The point is that each algorithm is supposed to act in a greedy manner.

    This will no doubt spark a LOT of discussion, but to me, they "cheated." (OK. Maybe "worked the system" is a better phrase).

    --
    "-1 Troll" is the apparently the same as "-1 I disagree with you."
  15. Re:It is not the first by jaaron · · Score: 2, Insightful

    Southamptom entries, on the other hand, are complex, sneaky, and cheating against (perhaps unwritten, but nonetheless agreed-upon) rules. They're ugly. They only prove that backstabbing cheating bastards may defeat just-and-fair if the referee is looking the other way for a moment.

    Sorry to be the one to break it to you, but sometimes life is just that way. :)

    --
    Who said Freedom was Fair?
  16. Except that's not the "prisoner's dilemma" at all! by tomhudson · · Score: 2, Insightful
    Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other
    The whole idea behind the prisoner's dilemma is that neither party is privy to what the other party is currently doing.

    By using this "recognition system", the program is capable of "knowing" in a deterministic fashion what some of the other programs will do in advance.

    In other words, at the very least, a cheat.

  17. Re:Asian mentality by Anonymous Coward · · Score: 3, Insightful

    Actually they have been powerful, many, many times. But always at the hight of the current dynasty.

    The cycles usually work as such: There's a period of chaos (warring states, etc), usually ending up with some external power coming in and conquering. Then a majority kingdom is established (didn't always own all of what we now call China, or sometimes more than present day, but whatever). Then there'd be a period of hightened trade. The influence of external nations would prompt both an interest in other cultures and a florishing of culture within the country. Then the nation would gain in power, boarders would become more defined, government would be stable and well established. The civilization would reach the peak of it's power. This high period could last anywhere from tens of years (later dynasties) to thousands (early zhao?).

    Then the country would start to get too bureaucratic, too dogmatic. The boarders would be systematically locked down, the country would isolate itself. Xenophobia would reign. Finally interior corruption would fragment the government into separate regions. Civil wars would begin, and with this discord the next batch of foreigners would invade.

    The moral: Isolation and xenophobia suck.

    Mod -2 Too Much Fucking Information ;)

  18. Re:"This must be banned" by maxwell+demon · · Score: 2, Insightful

    Isn't this the way the terrorist organisation works? The actual attackers totally lose (they lose even their life), and their masters profit from it. The experiment shows that tit for tat isn't a good strategy against this.

    --
    The Tao of math: The numbers you can count are not the real numbers.
  19. Re:That's not really so special by KDan · · Score: 2, Insightful

    That's what I thought while reading the article... also making use of that code is assuming a meta-knowledge of the game, ie some sort of way you can have prepared for that particular and specific instance of the game where the specific problem is the prisoner's dilemna with its known simple outcomes. Whereas tit-for-tat is a much more generic theory that can apply to a game which you don't know about yet (eg, say, one that has N possible outcomes rather than only 4), by simply stating "try to choose the mutually beneficial outcome first and then mirror your opponent's moves", the program they devised makes use of specific knowledge of how the game works (eg with those recognition sequences..). Because of that it is clearly inferior and more a hack of PD than a game theory idea.

    But, says Kendall, "Everybody in our field knows the name of Anatol Rapoport, who won the Axelrod competition. So if you can win the 20th-anniversary one, in our field there's a certain historical significance."

    But in this guy's case the significance will be lost because he didn't win through any significant idea, but through a hack. As he says earlier, it's the research that counts, not the outcome.

    Daniel

    --
    Carpe Diem
  20. Re:Scary Stuff by Unoti · · Score: 2, Insightful

    But, if you confess, then freedom marches forward!

  21. Re:Practicality by clausiam · · Score: 2, Insightful
    It applies any time there is significant advantage to be gained by working together, but also much advantage to be the one 'cheating'

    As in VotePairing :-)

  22. Re:Don't you see the beauty? by kisrael · · Score: 4, Insightful

    They "cheated", and the other guy didn't, so they won big! Wasn't that the whole premise?

    Well, they kind of went for a win on the "metalevel", utilizing the circumstances of the competition rather than solving the originally stated issue in an abstract way. On the one hand that's cool because evolution can work like that sometimes, but on the other hand, it really isn't answering the original question any more. (the question is "what's probably the best strategy for any given individual in Prisoner's Dilemna" and they changed the question to "how can we get some individuals to be super-players with the way this prisoner's dilemna simulator is setup"

    --
    SO YOU'RE GOING TO DIE: The Comic for Dealing with Death
  23. Re:Asian mentality by Jerf · · Score: 2, Insightful

    Perhaps, as this experiment shows, the Asian mentality may actually be the superior strategy?

    Oh, this is a bad time to get all multicultural.

    Sure, it works out great for the Masters, who get to the winners circles on the backs of their Slaves.

    Meanwhile, if you want to call Tit-for-Tat the Western strategy, everybody mostly wins after a while, even though few do really well.

    I don't believe either categorization. I'm just pointing out that if you're going to base your argument on this article, you are saying that it is good that a few individuals come out better, at the expense of a lot of other individuals, in the putative Asian system of thought. Which I find barbaric, though YMMV.

    I once defined a political axis as "people who know they would be kings, vs. people who think they would be serfs". Sounds like I can guess where you come out on that.

  24. Interestingly, that's what the omerta is all about by Perianwyr+Stormcrow · · Score: 2, Insightful

    The omerta, or code of silence, is the ideal that the mob works toward when caught. If you get caught, you simply clam up and take whatever's thrown at you as a point of honor. It is instructive, however, that this of course does not apply universally (everyone knows that the mob is rife with snitches.)

    --

    What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

  25. Missing option... by balaam's+ass · · Score: 5, Insightful

    I agree that this defnition of the "Prisoner's Dilemma" is no more than a "meta-game," and not really a problem of philosophical ethics (though it may appear to be to some people.)

    What I find disturbing this is the way that the problem is framed presupposes no underlying system of ethics. To wit....
    * If you confess and your partner denies taking part in the crime, you go free and your partner goes to prison for five years. * If your partner confesses and you deny participating in the crime, you go to prison for five years and yor partner goes free. * If you both confess you will serve four years each. * If you both deny taking part in the crime, you both go to prison for two years. What do you do?

    How about: Tell the truth? Regardless of what your partner does, tell the truth. I find it disturbing that the problem is framed in a way that the actual truth of the matter is irrelevant. (i.e. the problem would be unchanged if I replaced "You and your partner have committed a crime and are caught" with "You and a friend have been accused of a crime which you may or may not have committed.")

    I'm not trolling or off-topic here. I'm dead serious. This formulation of the PD is ethically doomed from the get-go, and thus the results of the experiment may be of interest to mathematical game theorists of this particular game, but I find it unwise to think the results make any significant implications about ethics (or anything else for that matter).

    Someone will counter that since this is a "Prisoner's" dilemma the person involved must be a criminal with no "ethical" principles other than an interest in self-preservation (i.e. the person is already debased as can contribute nothing meaningful on the subject of ethics! ;-) ). I'd say that just because someone committed a crime does not mean they necessarily want to continue committing crimes...

  26. Re:Asian mentality by mdfst13 · · Score: 2, Insightful

    "In many Asian countries, the mentality is to work as a group, rather than individually, with the individual sacrificing themselves for the group if necessary."

    But that isn't what this system does. Individuals do not sacrifice themselves for the good of the group; the group sacrifices itself to build up individuals. It is more like a feudal joust. If the king enters, all his opponents withdraw, making him the defacto winner.

  27. Re:Practicality by spitzak · · Score: 4, Insightful

    Yes I agree that public domain code is very much the same as the prisoner delimma.

    The GPL is an attempt to make it *not* the prisoner delimma by forcing the other side to cooperate if you do. This eliminates the losing part of the cooperation choice and thus it is no longer a delimma.

  28. Kobiyashi Maru by Bugmaster · · Score: 2, Insightful

    So, essentially, the winning program(s) hacked (or exploited, if you prefer) the game in order to win ? That's pretty clever, but does this count as a true victory ? It's sort of like what Captain Kirk did to rig his Kobiyashi Maru scenario. Sure, he won on a technicality, but in doing so he missed the whole point of the challenge.

    --
    >|<*:=