'Tit for Tat' Defeated In Prisoner's Dilemma Challenge

← Back to Stories (view on slashdot.org)

'Tit for Tat' Defeated In Prisoner's Dilemma Challenge

Posted by ryuzaki0 on Thursday October 14, 2004 @02:06AM from the taking-one-for-the-team dept.

colonist writes "Tit for Tat, the reigning champion of the Iterated Prisoner's Dilemma Competition, has been defeated by a group of cooperating programs from the University of Southampton. The Prisoner's Dilemma is a game with two players and two possible moves: cooperate or defect. If the two players cooperate, they both have small wins. If one player cooperates and the other defects, the cooperator has a big loss and the defector has a big win. If both players defect, they both have small losses. Tit for Tat cooperates in the first round and imitates its opponent's previous move for the rest of the game. Tit for Tat is similar to the Mutual Assured Destruction strategy used by the two nuclear superpowers during the Cold War. Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating. If a Southampton program determined that another program was non-Southampton, it would defect." Update: 10/14 15:08 GMT by J : If anyone wants to try writing their own PD strategy and see how it fares in a Darwinian contest, I'll host a tournament of Slashdot readers. Here are the docs, sample code, notes on previous runs, and my email address.

26 of 356 comments (clear)

Min score:

Reason:

Sort:

That's not really so special by Perianwyr+Stormcrow · 2004-10-14 02:09 · Score: 4, Interesting

In other words, an in-group can work vs. tit for tat if it outnumbers it. I'd like to see a trial with a slow trickle of immigration of tit for tats into a large population of S/M programs. That might be illuminating. I suspect the outcome would be that tit for tat still does well.

--
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
1. Re:That's not really so special by phoenix321 · 2004-10-14 07:30 · Score: 2, Interesting
  
  I think this is a fairly stunning result that is able to explain, why aristocracy and monarchy proved pretty successful for thousands of years.
  
  If Most sacrifice themselves for a few "chosen" ones they can do better than most others against the rest of the "world". Pretty convincing.
  
  If this strategy succeeds, we have a fairly sound theory why and how monarchy evolved from simple tribal structures. Secret societies, hierarchies and everything else would suddenly seem logical.
  
  Does not leave a feelgood-residue like having read Axelrod, but at least we know it now...
Re:Scary Stuff by parvenu74 · 2004-10-14 02:10 · Score: 2, Interesting

Or under the Patriot Act, whether you confess or not you go to jail for an undetermined period of time, during which charges may or may not be filed against you...
Uh, isn't that just cheating? by DoorFrame · 2004-10-14 02:14 · Score: 4, Interesting

I mean, the whole point of the Prisoner's Dilemma is that you don't have all the information. You don't know what your partner/opponent is going to do and you have decide based entirely on what little information you have based on your history with your partner/opponent. What these people are doing is creating a pattern to be recognized by another player, and then working as a team. And, it's not like they're people where one person might change their mind and decide to defect unilaterally... they're programs. Once they've locked onto each other as the same program, that's it. They'll play to their advantage until the end.

The real trick is to find a program that can beat other DIFFERENT programs, not beat itself. This seems really stupid, or am I missing something?

--

--
RumorsDaily
1. Re:Uh, isn't that just cheating? by gadget+junkie · 2004-10-14 08:37 · Score: 2, Interesting
  
  "So the payoff of alternating is 2.5. This is still lower than "tit for tat" if the "tit for tat" programme manages to cooperate all the time. But this will generally not be the case."
  
  In reality this WILL be the case. Tit for Tat cooperates at the first turn, and then copies the opponent, so the payoff is 3 with no volatility of returns.For all the effort they make, both TFTs could go have a beer and let a computer take over ;-).
  S/M has a lower return, PLUS it pays something in the handshake period.
  
  Now for something really interesting: nobody has really spent time on the economy of effort of the TFT strategy. all the other strategies that proved viable, not only did not win, but they needed substantially more resources (line of code, etc). In my view, that's part of the reason of the survival of cooperation in its present form: it is simple, it works, and when you play with someone using it you're buddies for life.
  
  --
  "If a boss demands loyalty, give him integrity. But if he demands integrity, give him loyalty." (John Boyd, 1927-1997)
Does this defeat the purpose? by Snowspinner · 2004-10-14 02:14 · Score: 5, Interesting

This seems to me to be an unfair way to "win." The point of the PD simulation is to talk about whether, in the absence of any social consequences, it is better to screw someone over for money or to work cooperatively with them. It's not a perfect model for that question, but that is still the question that makes us care about the PD in the first place.

All this has done is make a meta-PD game in which the two programs create a meta-game in which they agree to cooperate. That is to say, this is a solution to the PD problem that relies on the cooperation of a cohort (Someone to keep choosing loyalty while you defect and get all the money). Which is exactly not the point of PD.

So the real headline, I think, is "Trivial flaw found in definition of Prisoner's Dillema problem. University of Southhampton wastes money demonstrating flaw instead of writing a goddamn paper like a normal person would."

--
Philip Sandifer's academic website
Interesting Idea, but... by TJ_Phazerhacki · 2004-10-14 02:15 · Score: 2, Interesting

How difficult is something like this, really? One of my graduate level friends did some sociology work on a Game theory system like this, and if you know the rules, you can really beat the system.

Just curious, thats all. Anyone have any experience in the field?

--
Physics is nothing like religion. If it was, we'd have an easier time trying to raise money!
Spirit of the PD by johnthorensen · 2004-10-14 02:19 · Score: 4, Interesting

Not precisely cheating, as the rules are set up to play this way...but this certainly violates the spirit of the original Prisoner's Dilemma. Why?

Real prisoners only get to choose ONCE.

By taking advantage of the multiple-iteration aspect of the simulation with this sort of 'portknocking' strategy, the winning programs kind of take a cheap shot at the original PD.

Of course, it's all hypothetical anyway, and come to think of it Tit For Tat technically takes advantage of the multiple-iteration aspect as well by doing whatever the opponent did the last time...

Ah well, at least the Wikipedia entry makes a distinction between regular "Prisoner's Dilemma" and "Iterated Prisoner's Dilemma".
Asian mentality by pubjames · 2004-10-14 02:26 · Score: 3, Interesting

ok, here's a weird thought. In many Asian countries, the mentality is to work as a group, rather than individually, with the individual sacrificing themselves for the group if necessary. In the USA and most of the "western" world, we tend to act more as individuals. We tend to think think our system is better, but what if we're wrong? Perhaps, as this experiment shows, the Asian mentality may actually be the superior strategy?

China has been most consistently the biggest superpower over mankinds history, and it looks like it's going to be that way again in a couple of decades. Perhaps these things are related...
It is not the first by Flyboy+Connor · 2004-10-14 02:28 · Score: 4, Interesting

Axelrod never claimed that Tit-for-Tat was the best PD-playing program. He just stated that Tit-for-Tat would play well against any other combination of programs. Actually, IIRC, in the second tournament he organised Tit-for-Tat came in second. There was a different program that managed to exploit faults in other programs.
It is easy to score better than Tit-for-Tat in Axelrod's (original) tournament. He included a program that played random moves. It is not difficult to recognise this program after, say, ten moves have been played. You can always defect against random, because its moves are unrelated to its history. So, a program that plays Tit-for-Tat by default, but always defects against Random, scores better than Tit-for-Tat.
Does this dillute Tit-for-Tat's accomplishment? Of course not. Tit-for-Tat still plays well. And it is such a simple strategy that it can be programmed in two lines ("C on move 1, then copy opponent's previous move"), which none of the other programs achieve. Tit-for-Tat is simple, elegant, and strong. It's beautiful.
Southamptom entries, on the other hand, are complex, sneaky, and cheating against (perhaps unwritten, but nonetheless agreed-upon) rules. They're ugly. They only prove that backstabbing cheating bastards may defeat just-and-fair if the referee is looking the other way for a moment.
I must be missing something by Wind_Walker · 2004-10-14 02:41 · Score: 3, Interesting

Wouldn't an algorithm that defected every time (call it Traitor) beat the Tit for Tat program?
1st iteration - Traitor defects, TfT cooperates, TfT loses and Traitor wins.
Nth iteration - both defect, minor losses for both
Thus Traitor beats TfT... What am I missing?
1. Re:I must be missing something by Gogl · 2004-10-14 06:41 · Score: 2, Interesting
  
  You're right that in a one-on-one matchup, always defect would beat TfT. However, the point of TfT isn't that it would win every single one-on-one matchup but that it does extremely well versus any number of other strategies. Your "always defect" would beat TfT, but if you played Grim Trigger then you wouldn't do that well, whereas TfT would do very well playing with Grim Trigger (Grim Trigger is a strategy that cooperates until the opponent defects, and then it defects forever).
  
  As has been stated in this thread, the claim isn't particularly that TfT is the best strategy of all time in all circumstances, but that it is an elegant and versatile strategy that fares well in a variety of situations.
2. Re:I must be missing something by Pretzalzz · 2004-10-14 08:18 · Score: 2, Interesting
  
  This isn't a masochistic game, you win by scoring above 0, and lose by scoring less than 0. In Tit for Tat v Traitor, they both lose.
The winner basically cheated (good for him :) by jamie · 2004-10-14 02:50 · Score: 4, Interesting

It's pretty trivial that if two or more Dilemma agents are able to recognize each other, they have an advantage over those which cannot. I've got a Prisoner's Dilemma simulation running on my website -- I wrote some code for it over the summer and have been playing around with it on and off.
Once I experimented with letting the agents recognize which "species" they were in and which "species" their opponent was. The runaway winner, of course, was the one which always cooperated with itself, and was less nice to every other species. (In my version, "less nice" meant playing Tit-For-Tat, but the idea's the same.)
Being able to do this is like having the teacher's edition. If recognizing which species other agents belong to is allowed, that's a pretty trivial strategy. It's not called cooperation. It's called xenophobia, or to put it into the most familiar anthropomorphization, racism.
(The life lesson, if I may go out on a limb, is that in an environment where some recognize a quality called "race" and discriminate based on it, being unable to see that quality is a liability. Being truly color-blind means you are unable to recognize not only race but racism, which means you will be taken advantage of.)
When I ran my first tournament and got some interesting results based on this, I realized that knowledge of what "species" an agent belongs to is too powerful, it throws a monkey wrench into the works. So I scrapped it and moved on to stuff I found more interesting.
But the winner of this PD tournament was even craftier; he submitted a ton of entries, all of which were xenophobic in this way, except that they all recognized one "species" as the top dog. The other "species" essentially committed suicide to give the highest score to the top dog. That wouldn't have worked in my tournament, since they literally would have committed suicide (my agents starve to death if they don't score high enough) and that would have shaped the resulting environment. Every tournament is artificial in some way, and the human submitting entries to this one was clever enough to take advantage of these particular artificialities.
Since it's now been shown that inter-agent communication is possible, that's going to be fair game for every tournament from now on. The next step is going to be designing tournaments to work with this trick, not against it. As I wrote to this tournament's organizers:

Since that's such a powerful strategy, I think the next step in PD tournaments is not to try to overcome it, but to embrace it: allow agents to communicate, not just with their own species, but with whoever they're playing against. My guess is that mere xenophobia would be eclipsed by the much more powerful strategy of joining the ongoing discussion about which agents can and can't be trusted. That's the next big feature I want to try.
Re:Scary Stuff by Anonymous Coward · 2004-10-14 02:51 · Score: 1, Interesting

The intresting part of game theory contests like this is when you introduce freewill and project the competition onto real world situations. Game theory is great in understanding the best mean outcomes for a given competative scenario. The problem in the real world comes when one outcome is simply unacceptable to one or more competing parties. Then the parties must play to avoid a given outcome rather than achieve the best result. The situation often determines which of the many game theory strategies offers the best chance of success because in the real world the situation and the sticky issues of free will often determine what "success" means where as in most game theory expiraments the definition of "success" is predetermined and agreed upon.

Still very interesting stuff.
Don't you see the beauty? by Q2Serpent · 2004-10-14 03:12 · Score: 3, Interesting

They "cheated", and the other guy didn't, so they won big! Wasn't that the whole premise?

-Serpent
1. Re:Don't you see the beauty? by gadget+junkie · 2004-10-14 05:37 · Score: 2, Interesting
  
  "the question is "what's probably the best strategy for any given individual in Prisoner's Dilemna" and they changed the question to "how can we get some individuals to be super-players with the way this prisoner's dilemna simulator is setup""
  
  . How very true.I doubt that this solution is applicable in real life, if only for the fact that one of the assumptions would be that a subset of a winning team consistently and repeatably wants to be defeated. Mother nature took care of those long ago.
  
  --
  "If a boss demands loyalty, give him integrity. But if he demands integrity, give him loyalty." (John Boyd, 1927-1997)
2. Re:Don't you see the beauty? by kisrael · 2004-10-14 06:02 · Score: 2, Interesting
  
  subset of a winning team consistently and repeatably wants to be defeated. Mother nature took care of those long ago.
  
  Well, it's an interesting philisophical point. It depends on how you define "win" and "lose"...certainly some specices have formed partnerships with other, often larger species, and if you define "win" as something besides "just survive", they might be seen as subjugating themselves to the other creature, so that the partnership prospers, even if their life doesn't seem that swell.
  
  In other words, nature is more complex than the Prisoner's Dilemna, and sometimes ends up finding stuff more like the "cheaters" solution.
  
  --
  SO YOU'RE GOING TO DIE: The Comic for Dealing with Death
Not a cheat by Carmody · 2004-10-14 03:21 · Score: 2, Interesting

I've been giving talks on the Prisoner's Dilemma for a few years. (No original research, just following the thing and explaining the game to the Youth)

It is kind of an orthodoxy in the literature: Tit for Tat always ties or loses by a little bit, but in tournaments, it is the best strategy.

Well - it ain't. Someone found a way around it. Instead of urging rule-changes to prevent this new challenger, we should all be happy and excited that PD tournaments have just got MORE INTERESTING.

I can't wait to see what happens next - what new programs will emerge to have the advantages of Tit for Tat but also the ability to defend against Master-Slave programs that communicate with each other.

The game has changed - now let's leave it alone and watch.

--
God is real unless declared integer
nested bogosity by nusratt · 2004-10-14 04:13 · Score: 2, Interesting

Everyone is posting about how this is bogus because it's really not the same game as PD.

But even if you don't agree with that view, another important question is:
in what meaningful sense is this new strategy a "victory"?
After all, it achieves "victory" for half of the cooperators, at the cost of sacrificing the other half.

To use one nuclear-war analogy, it's a choice between strategy "A",
where you acquiesce to the death of half of your populace, with the reward that the remaining populace is completely unaffected --
and strategy "B", with the guaranteed result that no one dies but everyone is injured.

Which populace would *you* choose to join on the eve of war?
So, GPL is Tit for Tat by john_lewmanny · 2004-10-14 04:20 · Score: 2, Interesting

And GPL is more or less 'Tit for Tat' in which it will only cooperate with those also cooperating.
The next game by Gyorg_Lavode · 2004-10-14 04:22 · Score: 2, Interesting

I think what will become more interesting is that, now that we know the best lone player (tit for tat) can be defeated by players playing together, can we write our players to look for a player trying to communicate to another player so as to take advantage of it. Can my player play tit for tat against normal players, but, when it sees a S/M player, convince the S/M player to play slave for my gain?

--
I do security
Pavlov, Grim, and the other strategies. by DrRobin · 2004-10-14 05:09 · Score: 4, Interesting

As a microbiologist with interest in evolution, I have followed this field from afar for years. Looking over the results, I was surprised at how relatively poorly "Pavlov" (win-stay lose-shift) did, since it performs so strongly in noisy, evolutionany, versions of the game. [see:
http://www.ncbi.nlm.nih.gov/entrez/query.fc gi?hold ing=npg&cmd=Retrieve&db=PubMed&list_uids=8316296&d opt=Abstract
It was also a bit dismaying to see how well "Grim" (hold a grudge forever) did in both games. In evolutionary versions of the game, Pavlov helps keep down the population of "suckers" (thereby decreasing the food supply for more predatory and parasitic strategies) while still rewarding "provokable" cooperators (thereby increasing the total aggregate "reward" of the ecosystem.
Also, one essential part of the payoff structure that deserves emphasis is that the payoff for cooperating has to be more than half the average of the winner and loser's payoff for defection, else one benefits by simply alternating each turn. This is a little bit like the winners did here, where they got the top spots at the cost of a lower total take for their "team". One real world example of slashdot interest where this might make sense is if you take these losses in order to eliminate your rivals from the game and then reap monopoly benefits once you control the game (not to mention any names...).
Maybe someone who has analyzed the results in more detail could comment on how the various well known strategies fared and why.
Key element - guaranteed draw strategy by BobaFett · 2004-10-14 06:04 · Score: 3, Interesting

While entering a team into a tournament scored for individuals and then sacrificing the whole team for one player is by no means a new idea, what makes it so remarkably successfull here is the existance of a "guaranteed draw" strategy (in this case, always defect). The best individual response to "always defect" is to defect yourself, anything else is a suicide, so if you always defect you can force a draw. Then all your team loses to one team member, and he is the winner.

Compare this with, for example, a chess tournament. You could secretly enter a team and have them all lose to you. While this will keep you from ending last, it won't assure victory, unless all players are roughly equal. If there is a very strong player, he'll win against all your team, yourself included. So you can cheat by redistributing players of comparable strenghts, but at least you can't rob a clear champion of his deserved victory.

This is not the case in the PD tournament. But let's redefine the problem slightly: say, if both sides cooperate, each gets a dollar. If then defect, each pays a dollar. Sucker's reward is paying 10 dollars. Now the Southampton team's strategy boils down to using the tournament to give all their money to one player, while paying a hefty tax in the process. There is a cheaper way to do this, just give all money to one guy outside the tournament :) But now we can gauge any strategy: enter one player or a team, recognize your own team members or not, transfer money between team members as you wish, but can you make money, overall, from this tournament?
Does this really apply to human behaviour? by Thangodin · 2004-10-14 06:18 · Score: 2, Interesting

Optimistic Tit-for-Tat models human behaviour well in a social setting--we give others the benefit of the doubt, and continue to cooperate when others do. When someone violates our trust, we stop trusting them and punish them, but if they act beneficially towards us again, we might be willing to forgive. Most notable, OTFT produces the best overall score, which in competition between social groups is the deciding factor.

The Southampton strategy is dependent upon large numbers of people who will sacrifice all for the good of the other, and not for the good of their community (the collective performance is worse than OTFT.) I can see sacrifice for the greater good, but this is sacrifice to another person without hope of recompensation or an increase in general wellbeing. This does happen in human societies (I think it's happening now in some political systems), but only when the winner has managed to convince the losers that its all in everyone's best interest. What Southampton has added to this mix is a capacity for extreme self-delusion that directly contravenes the economic assumption of informed choice and self-interest. For purposes of economic modelling, Southampton should probably be disqualified, or these assumptions dropped. But this should also tell you something about what could happen to those nice economic models when they hit the messy world of human beings, who for the most part aren't very informed and often work against their own best interests as a result.

The consequence for a societal group running Southhampton against an OTFT group would be the defeat of the Southhampton group every time. Selection works at individual AND group levels. So the challenge should probably be two-tier: run the programs individually against each other, and run them as tribes against each other.
Re:Practicality by Anonymous Coward · 2004-10-14 08:08 · Score: 1, Interesting

It's "dilemma", not "delimma"... same root word as "Lemma", meaning theory, in math.