Slashdot Mirror


'Tit for Tat' Defeated In Prisoner's Dilemma Challenge

colonist writes "Tit for Tat, the reigning champion of the Iterated Prisoner's Dilemma Competition, has been defeated by a group of cooperating programs from the University of Southampton. The Prisoner's Dilemma is a game with two players and two possible moves: cooperate or defect. If the two players cooperate, they both have small wins. If one player cooperates and the other defects, the cooperator has a big loss and the defector has a big win. If both players defect, they both have small losses. Tit for Tat cooperates in the first round and imitates its opponent's previous move for the rest of the game. Tit for Tat is similar to the Mutual Assured Destruction strategy used by the two nuclear superpowers during the Cold War. Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating. If a Southampton program determined that another program was non-Southampton, it would defect." Update: 10/14 15:08 GMT by J : If anyone wants to try writing their own PD strategy and see how it fares in a Darwinian contest, I'll host a tournament of Slashdot readers. Here are the docs, sample code, notes on previous runs, and my email address.

19 of 356 comments (clear)

  1. Yes, actually by Perianwyr+Stormcrow · · Score: 3, Informative

    But the proper test is really whether the master half of these programs can do better than tit for tat on a large scale basis. I suspect that the S/M program will still do less simply because it plays a pattern during the interaction phase which is likely to result in tit for tat still coming out ahead- if there is one tit for tat, it won't do so well since the costs of being tit for tat are relevant if you don't know the master sign and most of those you interact with are expecting to hear it. But that's already well known. If tit for tat's numbers start growing, it does better. You see, tit for tat has an identification mechanism too, which is simply that it always starts out nice and immediately gets nasty if it gets fucked. If the number of tit for tats increases to a reasonable critical mass, they can have enough positive reactions to do very well. In fact, they'd become a secret society within the S/Ms!

    In short, if tit for tat is isolated, it won't do so well since everyone is fucking with it. If there are just a few tit for tats out there, their power increases significantly with each one added.

    --

    What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

  2. Re:That's not really so special by Anonymous Coward · · Score: 5, Informative

    What's being ignored is that the total profit of all the colluding algorithms is less than that of Tit-for-Tat, which makes the solution unviable in real-world Prisoner Dilemma situations. (bidding on large construction projects under certain auction formats, etc)

    As an analogy of unprofitable collusion, I could win the World Series of Poker by hiring enough shills and paying their time and entry fees. I would lose money by doing this, probably more than I could recoup with post-tournament income via endorsements/books/whatever.

    The parent is correct. Tit-for-Tat is still superior in equal numbers, and a modified Tit-for-Tat that can spoof the recognition algorithm of colluders will trounce them.

  3. The article got it wrong by nels_tomlinson · · Score: 5, Informative
    The article got it wrong: they compared the tit-for-tat strategy for the iterated prisoner's dilemma to mutual assured destruction. That's wrong, since nuclear war is usually considered to be a one-time game: once you've blown each other up, there is no next round. Tit-for-tat requires that there always be a following round.

    Repeated games have radically different outcomes than one-time games. It's long been known that where cooperation is possible, cooperation can beat solitary strategies in repeated games. I really don't think there's anything surprising here.

  4. Re:...by cheating! by dr_labrat · · Score: 2, Informative

    I'm not sure you read the article correctly.

    this is not cheating, what is happening is throughout the iterations the programs can experience the equivalent of morse code in the patterns of defections and co-operations in the form of the penalties.

    i.e.: 4yrs 4yrs free free 2years.

    As this is iterative, and no actual lifespan (in human terms) the pattern can become quickly evident. Kind of like a "penalty handshake".

    In this way one program can detect the intentions of what is happening, without actually communicating directly.

    --
    The secret of success is honesty and fair dealing. If you can fake those, you've got it made. (Marx)
  5. Re:Uh, isn't that just cheating? by Have+Blue · · Score: 2, Informative

    That's why this is the Iterated PD Challenge, not just the classic PD. If the competitors only played ONE round of PD each, the contest would not be very interesting. Their past performance in the contest itself is part of the "history" they are using to evaluate their choices.

  6. Kin Selection in Genetic Algorithms by Baldrson · · Score: 3, Informative
    This is a clever demonstration of kin selection among groups of competing algorithms.

    A mathematical treatment of population genetics in groups was given by W. D. Hamilton in "Innate Social Aptitudes of Man". In the last sentence of that paper, Hamilton, the originator of modern kin selection theory, states:

    One hears that game theorists, trying to persuade people to play even two-person games like 'Prisoner's Dilemma', often encounter exasperated remarks like: 'There ought to be a law against such games!' Some of the main points of this paper can be summarized as an answer to this comment: that often, in real life, there is a law, and we can see why, and that sadly we also see the protean nature of this Dilemma, which, when suppressed at one level, gathers its strength at another.
    What Hamilton is referring to is the fact that in any structure of components vs composite, there is the opportunity to defect. An individual gene can defect against the organism within which it resides via, say, meiotic drive. An individual may defect against his tribe made up of his close relatives. A tribe may defect against the others making up a nation. A nation may defect against others making up a geographic race. A geographic race may defect against others making up humanity as a whole.

    It is indeed a dilemma but it isn't without a rigorous treatement within genetic theory.

    Steve Sailer has written an an excellent review of the politically touchy issue of ethnic nepotism given from Hamilton's group selective perspective.

  7. Re:Spirit of the PD by amorsen · · Score: 3, Informative

    Err, of course this is the iterated prisoner's dilemma. It is quite easy to do the optimal thing in the non-iterated case: defect. You couldn't make a competition out of that.

    --
    Finally! A year of moderation! Ready for 2019?
  8. Did the same thing a few years ago... by Anonymous Coward · · Score: 5, Informative

    The length of the code is one of the largest problems to overcome. Performing any signal other than all-cooperate produces a net loss of 1 or 4 points per round for your team in traditional (0,1,3,5) IPD. Simple signalling, ie 4th round defect was very effective. While the master/slave aspect was amazingly effective in my research, the "spoiler" was not. A small population of master/slaves could invade an arbitrariliy large block of TitForTat if evolution was by duplicating winner and removing loser after n iterations. The population of "spoilers" stagnates very quickly in a large TFT population. TFT should be considered a friend, not an enemy because they are a positive growth environment. Going "spoiler" on any non-TFT/ally was quite effective as any bot not prone to cooperate posed the only real risk of "master" losing.

  9. Re:...by cheating! by billbaggins · · Score: 4, Informative
    Actually, this is exactly the sort of thing that the organizers were hoping would happen. From the FAQ, question 12:
    But we don't want to [impose limits on the number of entries] as it will be interesting to see if people can come up with strategies that cooperate with themselves within the whole population.
    --
    "The best argument against democracy is a five minute chat with the average voter."
    --Winston Churchill
  10. Re:I must be missing something by Anonymous Coward · · Score: 1, Informative

    You aren't missing anything. In fact, TfT never beats *any* other strategy. The best it can do is tie.

    Also worth noting is that the Southampton programs *were* cooperating -- with each other.

  11. That's right by Anonymous Coward · · Score: 4, Informative

    That's right, traitor (hawk) beats TfT in any given trial.

    BUT, in an environment made up of a few players playing each strategy, then you have the following matchups:

    Hawk vs Hawk. Horrible horrible loss for both of them.
    TfT vs Hawk. Hawk wins, but only by a single round.
    TfT vs TfT. Both TfT 'win' - neither betray the other.

    So, overall, TfT does better than hawk.

    The interesting part isn't beating TfT (which, as you point out, isn't THAT hard to do) but in doing consistently better than it against a wide variety of programs. Which is what TfT has long been the baseline for.

  12. Tit for Tat has already been beaten by bigHairyDog · · Score: 2, Informative

    Tit for Tat is outperformed by "Tit for Two Tats", because it is better at avoiding long runs of damaging mutual recrimination. That was 5 years ago. The performance of any of these strategies is only determined by the opponent strategies that they face, which is arbitrary. It is therefore meaningless to talk of one strategy being 'better' than another - most advanced strategies can beat Tit for Tat given the right opponents.

    --

    foo mane padme hum

  13. Re:That's not really so special by jamie · · Score: 3, Informative
    If you'd like to try your hand at it, please feel free to email me the code any program you'd like to try. (Or just describe to me the strategy you want to take and I can probably write code for it.)

    I've written perl scripts to run Prisoner's Dilemma tournaments on my website (the perl source code for all the species in the last run is on that page) and new submissions for the next tournament are welcome. Since my tournament does Darwinian selection on every agent that plays in it -- agents which don't earn points eventually die off, those which earn a lot of points reproduce -- self-sacrificing pairs of strategies won't do very well.

    I haven't yet written code which can trickle in new members over time, but that really wouldn't be very hard to add (I could probably set it up in an hour if you are really interested :)

  14. Re:Spirit of the PD by b1t+r0t · · Score: 2, Informative
    Yep. If you know that a given move is the final move, the optimal action is to defect. When it's not iterated, there is only one move, so of course it's going to be the last move.

    I presume there is no way the entrants can know how many moves will be in a given round, or Tit-for-Tat could be slightly be beaten by a modified TfT which always defected in the final round.

    --

    --
    "Open source is good." - Steve Jobs
    "Open source is evil." - Microsoft
  15. Re:Practicality by naoursla · · Score: 2, Informative

    Unfortunately, PD has a dominant strategy to cheat your opponent. No matter what your oppoent does, you can do better by cheating.

    Iterated PD is more interested since it lets your opponent punish you for cheating. So you get into some interesting social issues.

    How well a given strategy does depends on the strategies other in its community are using. If the population is heavily cheater based, then agents that cooperate will lose big time. However, if there are enough cooperators, then they will form a coalition of sorts, and even though they will lose to the cheaters, in the end they will come out on top.

  16. Re:Practicality by gadget+junkie · · Score: 2, Informative

    "[...]How well a given strategy does depends on the strategies other in its community are using. If the population is heavily cheater based, then agents that cooperate will lose big time. However, if there are enough cooperators, then they will form a coalition of sorts, and even though they will lose to the cheaters, in the end they will come out on top."

    I remember reading about the tournament in a "Science" magazine article, back when the original tournament was done; Tit for Tat won irrespectively of the number of strategies it played against.
    Furthermore, in a variant in which the winning strategy "spawned" more often, and the loser did not, Tit for Tat became a majority of the population irrespective of the initial sample, except in the extreme case of only one tit for tat and defectors. This is explained better here. The most interesting thing was, for me, the fact that Tit for Tat was superior to a strategy in which the program responded with a delay, i.e. it made the opponent's move two turns down the line. So, remember this if you have kids, or a pet, or both like me: whatever reaction you deem appropriate, it should be done soon, or not at all.

    NOTE: the payoff was described as such:if A cooperates and B cooperates, both get X points; if A "defects" it gets W points to Z for B;if both defect, they get Y, where:

    W > X > Y > Z, all positive integers.

    --
    "If a boss demands loyalty, give him integrity. But if he demands integrity, give him loyalty." (John Boyd, 1927-1997)
  17. Re:The winner basically cheated (good for him :) by jamie · · Score: 2, Informative

    In the particular case I ran, it didn't make any difference. Now, though, I've added errors to what each agent tries to play. Tit-For-Tat responds to those errors and, playing against another Tit-For-Tat, quickly plunks down to (say) 90% cooperation plus or minus error, and random-walks around from there. "Always cooperate," though, sticks at 100% plus or minus error, so it does a little better.

  18. Re:...by cheating! by Kupek · · Score: 2, Informative

    The whole reason people are interested in the Prisoner's Dilemma (and Iterated) is that they are models for situations encountered in science. This isn't a comepetition like chess where we're trying to see who (as in the humans) are the best at it. We're trying to see what interesting results come of it. This is an interesting result.

  19. Re:Uh, isn't that just cheating? by snilloc · · Score: 2, Informative
    I'm not sure how this particular iterated PD game works, but in the serious academic version, there is uncertainty about which round is the last round. This accounts for "last round" defections, and also the subsequent collapse of any cooperation strategy by working backwards induction through the game and defecting on the "real" last round to one-up your opponent. (Knowing that your opponent is likely to defect on round N, you defect on N-1, N-2, N-3, back to the start of the game with a dominant "defect" strategy).

    Sort of like the parable of the teacher who said there'd be a pop quiz but you wouldn't know what day it was. Obviously it wasn't Friday, because on Thursday you'd know the quiz was friday. Likewise for Thursday because if the quiz wasn't friday, and if you got to wednesday w/o a quiz, then you would know the quiz was thursday, and therefore the quiz couldn't be on thursday... leading to the conclusion that there is no quiz. Not a perfect example (since then the teacher could give the quiz on any day and it would be equally unknowable given the possibility of "no quiz"), but I hope it clarifies some.