'Tit for Tat' Defeated In Prisoner's Dilemma Challenge
colonist writes "Tit for Tat, the reigning champion of the Iterated Prisoner's Dilemma Competition, has been defeated by a group of cooperating programs from the University of Southampton. The Prisoner's Dilemma is a game with two players and two possible moves: cooperate or defect. If the two players cooperate, they both have small wins. If one player cooperates and the other defects, the cooperator has a big loss and the defector has a big win. If both players defect, they both have small losses. Tit for Tat cooperates in the first round and imitates its opponent's previous move for the rest of the game. Tit for Tat is similar to the Mutual Assured Destruction strategy used by the two nuclear superpowers during the Cold War. Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating. If a Southampton program determined that another program was non-Southampton, it would defect."
Update: 10/14 15:08 GMT by J : If anyone wants to try writing their own PD strategy and see how it fares in a Darwinian contest, I'll host a tournament of Slashdot readers. Here are the docs, sample code, notes on previous runs, and my email address.
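For anyone who wants to try this before writing a tournament entry, here is a minimal sketch of Tit for Tat and a two-player match loop. It is not the tournament's sample code: the function names, the 200-round match length, and the (0, 1, 3, 5) point values are all assumptions made for illustration (those point values give every outcome a non-negative score, unlike the wins and losses described in the summary).

```python
# Points for (player A, player B), keyed by (A's move, B's move).
# The (0, 1, 3, 5) values are an assumption, not taken from the tournament docs.
PAYOFF = {
    ("C", "C"): (3, 3),  # both cooperate
    ("C", "D"): (0, 5),  # A cooperates, B defects
    ("D", "C"): (5, 0),  # A defects, B cooperates
    ("D", "D"): (1, 1),  # both defect
}


def tit_for_tat(my_history, their_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


def always_defect(my_history, their_history):
    """Baseline opponent that never cooperates."""
    return "D"


def play_match(strategy_a, strategy_b, rounds=200):
    """Play an iterated match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play_match(tit_for_tat, tit_for_tat))
    print(play_match(tit_for_tat, always_defect))
```

Under these assumptions Tit for Tat loses only the first round to a pure defector and then matches it move for move, which is why it never falls far behind any single opponent.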
- If you confess and your partner denies taking part in the crime, you go free and your partner goes to prison for five years.
- If your partner confesses and you deny participating in the crime, you go to prison for five years and your partner goes free.
- If you both confess you will serve four years each.
- If you both deny taking part in the crime, you both go to prison for two years.
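The four sentences above can be put in a small table to see why the game is a dilemma. The sketch below is only an illustration; the names `YEARS` and `best_reply` are made up here, and the only data used are the prison terms listed above.

```python
# Years in prison for (you, partner), keyed by (your choice, partner's choice).
YEARS = {
    ("confess", "deny"): (0, 5),
    ("deny", "confess"): (5, 0),
    ("confess", "confess"): (4, 4),
    ("deny", "deny"): (2, 2),
}

CHOICES = ("confess", "deny")


def best_reply(partner_choice):
    """Return the choice that minimises your own sentence for a fixed partner choice."""
    return min(CHOICES, key=lambda mine: YEARS[(mine, partner_choice)][0])


if __name__ == "__main__":
    for partner in CHOICES:
        print(f"partner chooses {partner}: your best reply is {best_reply(partner)}")
```

Confessing is the best reply to either choice by the partner, yet two prisoners who both follow that logic serve four years each instead of the two years they would serve if both denied.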
This sounds pretty much like the RIAA might be involved. I would deny everything if I were you!
The dangers of knowledge trigger emotional distress in human beings.
This seems to me to be an unfair way to "win." The point of the PD simulation is to talk about whether, in the absence of any social consequences, it is better to screw someone over for money or to work cooperatively with them. It's not a perfect model for that question, but that is still the question that makes us care about the PD in the first place.
All this has done is create a meta-game in which the two programs agree to cooperate. That is to say, this is a solution to the PD problem that relies on the cooperation of a cohort (someone to keep choosing loyalty while you defect and get all the money), which is exactly not the point of the PD.
So the real headline, I think, is "Trivial flaw found in definition of Prisoner's Dilemma problem. University of Southampton wastes money demonstrating flaw instead of writing a goddamn paper like a normal person would."
Philip Sandifer's academic website
What's being ignored is that the total profit of all the colluding algorithms is less than that of Tit-for-Tat, which makes the solution unviable in real-world Prisoner's Dilemma situations (bidding on large construction projects under certain auction formats, etc.).
As an analogy for unprofitable collusion, I could win the World Series of Poker by hiring enough shills and paying for their time and entry fees. I would lose money by doing this, probably more than I could recoup with post-tournament income via endorsements/books/whatever.
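A rough way to check the total-profit point is to add up both players' scores for a master/slave pair and for a pair of mutual cooperators such as two Tit-for-Tat players. This sketch assumes the (0, 1, 3, 5) point values and a 200-round match, and it ignores the recognition handshake and all games against outsiders; `totals` is a hypothetical helper.

```python
# Points for (player A, player B), keyed by (A's move, B's move); values are assumed.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def totals(moves_a, moves_b):
    """Sum each player's points over two equal-length move sequences."""
    score_a = score_b = 0
    for a, b in zip(moves_a, moves_b):
        points_a, points_b = PAYOFF[(a, b)]
        score_a += points_a
        score_b += points_b
    return score_a, score_b


ROUNDS = 200  # assumed match length

# Two Tit-for-Tat players cooperate in every round.
tft_pair = totals("C" * ROUNDS, "C" * ROUNDS)
# A master defects every round while its slave cooperates every round.
master_slave = totals("D" * ROUNDS, "C" * ROUNDS)

if __name__ == "__main__":
    print("Tit-for-Tat pair:", tft_pair, "combined:", sum(tft_pair))
    print("master/slave pair:", master_slave, "combined:", sum(master_slave))
```

With these numbers the master alone outscores either cooperator, but the pair together earns less than two cooperators do, which is the shill situation in the poker analogy.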
The parent is correct. Tit-for-Tat is still superior in equal numbers, and a modified Tit-for-Tat that can spoof the recognition algorithm of colluders will trounce them.
Repeated games have radically different outcomes than one-time games. It's long been known that where cooperation is possible, cooperation can beat solitary strategies in repeated games. I really don't think there's anything surprising here.
See what I've been reading.
The length of the code is one of the largest problems to overcome. Performing any signal other than all-cooperate produces a net loss of 1 or 4 points per round for your team in traditional (0,1,3,5) IPD. Simple signalling, i.e. a 4th-round defect, was very effective. While the master/slave aspect was amazingly effective in my research, the "spoiler" was not. A small population of master/slaves could invade an arbitrarily large block of TitForTat if evolution worked by duplicating the winner and removing the loser after n iterations. The population of "spoilers" stagnates very quickly in a large TFT population. TFT should be considered a friend, not an enemy, because it provides a positive growth environment. Going "spoiler" on any non-TFT/ally was quite effective, as any bot not prone to cooperate posed the only real risk of "master" losing.
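A signalling strategy of this kind might be sketched as follows. This is not the Southampton code: the four-move `SIGNAL`, the fixed `role` argument, and the rule of defecting against every unrecognised opponent (the "spoiler" behaviour from the story summary, which treats Tit for Tat as an enemy rather than a friend) are assumptions chosen to keep the example short.

```python
# Points for (player A, player B), keyed by (A's move, B's move); values are assumed.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Hypothetical handshake: cooperate three times, then defect on the 4th round.
SIGNAL = ["C", "C", "C", "D"]


def make_colluder(role):
    """Build a colluding strategy; role is 'master' or 'slave' (assigned by hand here)."""
    def strategy(my_history, their_history):
        turn = len(my_history)
        if turn < len(SIGNAL):
            return SIGNAL[turn]  # still sending the handshake
        if their_history[:len(SIGNAL)] == SIGNAL:
            # Partner recognised: master keeps defecting, slave keeps cooperating.
            return "D" if role == "master" else "C"
        return "D"  # unrecognised opponent: act as a spoiler
    return strategy


def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]


def play_match(strategy_a, strategy_b, rounds):
    """Play an iterated match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + points_a, score_b + points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print(play_match(make_colluder("master"), make_colluder("slave"), 10))
    print(play_match(make_colluder("master"), tit_for_tat, 10))
```

Under the same payoffs, a one-sided defect gives a pair 5 + 0 = 5 points and a joint defect gives 1 + 1 = 2, against 3 + 3 = 6 for mutual cooperation, which matches the 1- or 4-point loss per signalling round quoted above.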
I agree that this definition of the "Prisoner's Dilemma" is no more than a "meta-game," and not really a problem of philosophical ethics (though it may appear to be to some people).
What I find disturbing is that the way the problem is framed presupposes no underlying system of ethics. To wit....
- If you confess and your partner denies taking part in the crime, you go free and your partner goes to prison for five years.
- If your partner confesses and you deny participating in the crime, you go to prison for five years and your partner goes free.
- If you both confess you will serve four years each.
- If you both deny taking part in the crime, you both go to prison for two years.
What do you do?
How about: Tell the truth? Regardless of what your partner does, tell the truth. I find it disturbing that the problem is framed in a way that the actual truth of the matter is irrelevant. (i.e. the problem would be unchanged if I replaced "You and your partner have committed a crime and are caught" with "You and a friend have been accused of a crime which you may or may not have committed.")
I'm not trolling or off-topic here. I'm dead serious. This formulation of the PD is ethically doomed from the get-go, and thus the results of the experiment may be of interest to mathematical game theorists of this particular game, but I find it unwise to think the results have any significant implications for ethics (or anything else for that matter).
Someone will counter that since this is a "Prisoner's" dilemma the person involved must be a criminal with no "ethical" principles other than an interest in self-preservation (i.e. the person is already debased and can contribute nothing meaningful on the subject of ethics! ;-) ). I'd say that just because someone committed a crime does not mean they necessarily want to continue committing crimes...