'Tit for Tat' Defeated In Prisoner's Dilemma Challenge
colonist writes "Tit for Tat, the reigning champion of the Iterated Prisoner's Dilemma Competition, has been defeated by a group of cooperating programs from the University of Southampton. The Prisoner's Dilemma is a game with two players and two possible moves: cooperate or defect. If the two players cooperate, they both have small wins. If one player cooperates and the other defects, the cooperator has a big loss and the defector has a big win. If both players defect, they both have small losses. Tit for Tat cooperates in the first round and imitates its opponent's previous move for the rest of the game. Tit for Tat is similar to the Mutual Assured Destruction strategy used by the two nuclear superpowers during the Cold War. Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating. If a Southampton program determined that another program was non-Southampton, it would defect."
Update: 10/14 15:08 GMT by J : If anyone wants to try writing their own PD strategy and see how it fares in a Darwinian contest, I'll host a tournament of Slashdot readers. Here are the docs, sample code, notes on previous runs, and my email address.
- If you confess and your partner denies taking part in the crime, you go free and your partner goes to prison for five years.
- If your partner confesses and you deny participating in the crime, you go to prison for five years and yor [sic] partner goes free.
- If you both confess you will serve four years each.
- If you both deny taking part in the crime, you both go to prison for two years.
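The four outcomes listed above can be checked mechanically. This is a minimal sketch using only the sentences quoted in the list; the names `YEARS` and `best_reply` are made up for illustration. It shows why confessing is the tempting choice whatever your partner does, even though mutual denial is better for both than mutual confession.

```python
# Years in prison as (you, partner), keyed by (your_choice, partner_choice).
# The numbers are the ones quoted in the list above.
YEARS = {
    ("confess", "deny"): (0, 5),
    ("deny", "confess"): (5, 0),
    ("confess", "confess"): (4, 4),
    ("deny", "deny"): (2, 2),
}


def best_reply(partner_choice):
    """Return the choice that minimises your own sentence against a fixed partner choice."""
    return min(("confess", "deny"), key=lambda mine: YEARS[(mine, partner_choice)][0])


print(best_reply("deny"))     # confess (0 years instead of 2)
print(best_reply("confess"))  # confess (4 years instead of 5)
```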
This sounds pretty much like the RIAA might be involved. I would deny everything if I were you!
The dangers of knowledge trigger emotional distress in human beings.
In other words, an in-group can work vs. tit for tat if it outnumbers it. I'd like to see a trial with a slow trickle of immigration of tit for tats into a large population of S/M programs. That might be illuminating. I suspect the outcome would be that tit for tat still does well.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
Tit-for-tat, while a simple and effective strategy, isn't perfect. This certainly isn't the first time it has been beaten. What's the big deal?
The rules are similar to those of the gameshow Friend or Foe.
Dan East
Better known as 318230.
...fraternities and secret societies work so well!
I'm off to join the Freemasons. Be back in a few.
I claim first use of "Error No. 0B" - or "No. 0B error." It'll be the new ID 10T!
I generally hope that knowledge of the prisoner's dilemma will never become a practical factor in my life.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
Lameness filter encountered. Post aborted!
Reason: Don't use so many caps. It's like YELLING. It's not at all like a TALKING COMPUTER. You are a bad man. Go away.
I mean, the whole point of the Prisoner's Dilemma is that you don't have all the information. You don't know what your partner/opponent is going to do and you have to decide based entirely on what little information you have based on your history with your partner/opponent. What these people are doing is creating a pattern to be recognized by another player, and then working as a team. And, it's not like they're people where one person might change their mind and decide to defect unilaterally... they're programs. Once they've locked onto each other as the same program, that's it. They'll play to their advantage until the end.
The real trick is to find a program that can beat other DIFFERENT programs, not beat itself. This seems really stupid, or am I missing something?
--
RumorsDaily
This seems to me to be an unfair way to "win." The point of the PD simulation is to talk about whether, in the absence of any social consequences, it is better to screw someone over for money or to work cooperatively with them. It's not a perfect model for that question, but that is still the question that makes us care about the PD in the first place.
All this has done is make a meta-PD game in which the two programs create a meta-game in which they agree to cooperate. That is to say, this is a solution to the PD problem that relies on the cooperation of a cohort (Someone to keep choosing loyalty while you defect and get all the money). Which is exactly not the point of PD.
So the real headline, I think, is "Trivial flaw found in definition of Prisoner's Dilemma problem. University of Southampton wastes money demonstrating flaw instead of writing a goddamn paper like a normal person would."
Philip Sandifer's academic website
Yeah, that's not the Prisoner's Dilemma. Or even the Iterated PD. This whole "signaling Morse code" on the prison walls is nonsense, because it was not part of the original plan. Just because it's not in the rules doesn't mean you can do it. In Chess there's no rule specifically against me bringing a SuperGrape(TM) onto the board. The SuperGrape(TM) immediately destroys all pawns on a color of my choosing.
No, it doesn't work that way.
While this is an interesting experiment, it's not a true victory.
Small potatoes make the steak look bigger.
Just curious, that's all. Anyone have any experience in the field?
Physics is nothing like religion. If it was, we'd have an easier time trying to raise money!
It's not clear to me how the entries determined who would be the 'master' and who would be the 'slave'. It seems that if you had lots of 'colluders' around who could be induced to 'suicide' for another's benefit, you'd very quickly get cheaters who worked to be the 'master' in all situations.
This strikes me as a lot more reminiscent of the Hawk/Dove situation.
PHEM - party like it's 1997-2003!
Why should Tat get all the fun?
Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating.
Am I the only one who thinks this is just kind of obvious and silly?
Also, what kind of "moves" can be made by a "prisoner" that can be seen by the other prisoner?
Pat
But the proper test is really whether the master half of these programs can do better than tit for tat on a large scale basis. I suspect that the S/M program will still do less simply because it plays a pattern during the interaction phase which is likely to result in tit for tat still coming out ahead- if there is one tit for tat, it won't do so well since the costs of being tit for tat are relevant if you don't know the master sign and most of those you interact with are expecting to hear it. But that's already well known. If tit for tat's numbers start growing, it does better. You see, tit for tat has an identification mechanism too, which is simply that it always starts out nice and immediately gets nasty if it gets fucked. If the number of tit for tats increases to a reasonable critical mass, they can have enough positive reactions to do very well. In fact, they'd become a secret society within the S/Ms!
In short, if tit for tat is isolated, it won't do so well since everyone is fucking with it. If there are just a few tit for tats out there, their power increases significantly with each one added.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
Not precisely cheating, as the rules are set up to play this way...but this certainly violates the spirit of the original Prisoner's Dilemma. Why?
Real prisoners only get to choose ONCE.
By taking advantage of the multiple-iteration aspect of the simulation with this sort of 'portknocking' strategy, the winning programs kind of take a cheap shot at the original PD.
Of course, it's all hypothetical anyway, and come to think of it Tit For Tat technically takes advantage of the multiple-iteration aspect as well by doing whatever the opponent did the last time...
Ah well, at least the Wikipedia entry makes a distinction between regular "Prisoner's Dilemma" and "Iterated Prisoner's Dilemma".
"The result is that Southampton had the top three performers -- but also a load of utter failures at the bottom of the table who sacrificed themselves for the good of the team."
J.
You're only jealous cos the little penguins are talking to me.
So, the lesson here is... if they're not your friend, fuck them over the first chance you get.
No denying it though - it obviously works damn well. Just ask the Bush administration.
Repeated games have radically different outcomes than one-time games. It's long been known that where cooperation is possible, cooperation can beat solitary strategies in repeated games. I really don't think there's anything surprising here.
See what I've been reading.
Ah, you're using the UK SuperGrape rules. I think most competitions nowadays use the Russian rules (SuperGrape takes _all_ pieces except King on a given color).
Of course in USA play the SuperGrape is almost worthless as your opponent can simply invoke the Citrus Rule next turn (which usually destroys the whole board).
Whence? Hence. Whither? Thither.
You have the nukes. Why shouldn't they have them?
I'll do the stupid thing first and then you shy people follow...
OK, here's a weird thought. In many Asian countries, the mentality is to work as a group, rather than individually, with the individual sacrificing themselves for the group if necessary. In the USA and most of the "western" world, we tend to act more as individuals. We tend to think our system is better, but what if we're wrong? Perhaps, as this experiment shows, the Asian mentality may actually be the superior strategy?
China has been most consistently the biggest superpower over mankind's history, and it looks like it's going to be that way again in a couple of decades. Perhaps these things are related...
It is easy to score better than Tit-for-Tat in Axelrod's (original) tournament. He included a program that played random moves. It is not difficult to recognise this program after, say, ten moves have been played. You can always defect against random, because its moves are unrelated to its history. So, a program that plays Tit-for-Tat by default, but always defects against Random, scores better than Tit-for-Tat.
Does this dilute Tit-for-Tat's accomplishment? Of course not. Tit-for-Tat still plays well. And it is such a simple strategy that it can be programmed in two lines ("C on move 1, then copy opponent's previous move"), which none of the other programs achieve. Tit-for-Tat is simple, elegant, and strong. It's beautiful.
Southampton entries, on the other hand, are complex, sneaky, and cheating against (perhaps unwritten, but nonetheless agreed-upon) rules. They're ugly. They only prove that backstabbing cheating bastards may defeat just-and-fair if the referee is looking the other way for a moment.
..it's half the fun you would normally get ;-)
The possible moves are:
* Defect (turning in your mate) - this means you avoid the 'worst' case (when your opponent turns you in but you stay silent) but also avoids the 'best' case (When you both stay silent)
* Stay silent (the reverse of the above)
Simple PD is interesting but trivial. A quick count of the matrix of possible moves shows that your best option is to always defect. The counterintuitiveness of this is why it is interesting.
HOWEVER - in this game, the simple game is iterated.
And you quickly learn that when the 'last' game is known, everyone struggles to defect then - and it quickly devolves into the simple case.
Interesting things happen once the game is open ended. Dove strategies (which sometimes, or even always, stay silent) start to win.
The whole point of the iterated game is to try and learn about your opponent - applying a strategy which is 'better' than theirs.
The long time 'best' of these is Tit-For-Tat. Which, as named, is the simple strategy of doing to your opponent whatever they did to you last round.
That way you play as well as possible against a hawk (a player who always defects) and also as well as possible against a dove.
Against itself, it ALSO plays as well as possible, since they both sit there never seeing a defection and so never defecting - thus getting the highest possible score.
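The three matchups just described can be played out in a few lines. This is only a sketch: the (0, 1, 3, 5) scoring is the conventional IPD scoring and is assumed here, as are the 10-round match length and the function names.

```python
# Assumed scoring (higher is better): sucker 0, mutual defection 1,
# mutual cooperation 3, temptation 5. Keyed by (my_move, their_move).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, other_history):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if not other_history else other_history[-1]


def hawk(own_history, other_history):
    return "D"  # always defects


def dove(own_history, other_history):
    return "C"  # always stays silent


def play(a, b, rounds=10):
    """Play a fixed-length match and return (score_a, score_b)."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, hawk))         # (9, 14): exploited once, then mutual defection
print(play(tit_for_tat, dove))         # (30, 30)
print(play(tit_for_tat, tit_for_tat))  # (30, 30)
```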
The interesting part of this study is that they claim that their colluding program doesn't just win against the existing programs by overwhelming them - but that it is also a stable strategy. That is, a program which tends to copy good strategies that it sees, will tend to adopt this strategy and ALSO do well.
It does, however, seem very cheaty.
This is not the first time that tit-for-tat has failed to win an iterated prisoner's dilemma competition. (Cannot find a good link to past results of similar competitions, but here's a link to results of one simulation.)
The cooperation strategy used by the Southampton programs is interesting. At first it may seem unrealistic and not very informative about human behavior, especially in the context of life and death decisions. In other words, what incentive do I have to cooperate with a "brother" to my detriment, when at the end of the day, he lives and I die? But you can think of rationales for such a strategy (e.g., familial ties, idealistic reasons). These rationales may not be rational, strictly speaking -- If I die, what do I really care what happens after I'm gone? I cannot know or benefit from the beyond. (Or can I?) -- but being able to identify, account for and respond to them rationally would be beneficial.
This is very easy to defeat! Just imagine - imitate "master" behaviour and then (after recognized by opponent) just abuse it...
Communication between secret partners has been one of the most undefeatable strategies in cards for a long time. Didn't take a computer to figure that out. Someone just figured out how to do it within the rules given for this competition.
I used to wonder what was so holy about a silent night, now I have a child.
What is tat?
Where do I get it?
And how do I exchange it for the other thing?
--Dennis Miller (IIRC)
If Mutually-Assured-Destruction really were an implementation of one-time-only PD, then only non-rational play would explain the outcome.
Or, to put it colloquially:
Either world leaders are insane, or we're all dead.
A mathematical treatment of population genetics in groups was given by W. D. Hamilton in "Innate Social Aptitudes of Man". In the last sentence of that paper, Hamilton, the originator of modern kin selection theory, states:
What Hamilton is referring to is the fact that in any structure of components vs composite, there is the opportunity to defect. An individual gene can defect against the organism within which it resides via, say, meiotic drive. An individual may defect against his tribe made up of his close relatives. A tribe may defect against the others making up a nation. A nation may defect against others making up a geographic race. A geographic race may defect against others making up humanity as a whole. It is indeed a dilemma but it isn't without a rigorous treatment within genetic theory.
Steve Sailer has written an excellent review of the politically touchy issue of ethnic nepotism given from Hamilton's group selective perspective.
Seastead this.
This story illustrates the power of groups and societies to coordinate to the detriment of individuals and outsiders. The Southampton team used a "secret handshake" to recognize members of the society and discriminate against outsiders. It is a natural explanation for people's fear of closed/secret societies -- people fear the group's ability to break the rules of individualistic "fair play."
If the agents in the game were capable of higher order reasoning and could see these coordinated actions between members, then they would become paranoid -- all the Southampton team members were "out to get them."
Two wrongs don't make a right, but three lefts do.
A strategy is just simply a function that maps an observed history into some sort of behaviour.
So this is not unfair. It does, however, require the presence of other nice agents that cooperate.
I conjecture that, given this last fact, it would be easy to design strategies that do better by exploiting this cooperative behaviour: simply design a strategy so that the program initially seems to cooperate (by making the correct "handshake" with the nice agents) but, even while it pretends to be in slave mode, quickly starts to defect. It would probably do even better; and then you would just have the old prisoner's dilemma back (i.e. cooperation is unsustainable).
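To make the "strategy is a function of the observed history" view and the impostor idea concrete, here is a rough sketch. Everything specific in it is hypothetical: the `HANDSHAKE` sequence is invented (the real Southampton code is not given anywhere in this thread), the slave logic is a guess at the behaviour described in the story, and the (0, 1, 3, 5) scoring is the conventional one.

```python
# Assumed conventional scoring, keyed by (my_move, their_move).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Hypothetical 5-move recognition code; NOT the sequence Southampton used.
HANDSHAKE = ["C", "D", "C", "C", "D"]


def slave(own_history, other_history):
    """Play the handshake, then serve a recognised partner and defect against outsiders."""
    turn = len(own_history)
    if turn < len(HANDSHAKE):
        return HANDSHAKE[turn]
    recognised = other_history[:len(HANDSHAKE)] == HANDSHAKE
    return "C" if recognised else "D"


def impostor(own_history, other_history):
    """Copy the handshake to be recognised, then defect for the rest of the match."""
    turn = len(own_history)
    return HANDSHAKE[turn] if turn < len(HANDSHAKE) else "D"


def play(a, b, rounds=20):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(impostor, slave))  # (86, 11): 11 each during the handshake, then 15 rounds of 5 vs 0
```

Under these assumptions the impostor collects the temptation payoff every round after the handshake, which is the exploit the conjecture describes. Whether a real slave could be fooled this way depends on details of the actual entries that are not known here.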
Tit for tat has a secret handshake too, but it's a code of ethics. It is robust in any iterated situation. That's what makes it neat.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
I did some research on this topic and found a fun little game. You can play "The Prisoner's Dilemma" against a computer opponent and discover some strategies.
Well then tat must really suck
"Look Lois, the two symbols of the Republican Party: an elephant, and a fat white guy who is threatened by change."
Except Tit for Tat is more robust than other plans, deals well with a wide variety of opponents, and is easy for opponents to "figure out" and is "forgiving" so it does not get caught in endless loops of mutual punishment easily.
That being so, beating Tit for Tat isn't that big of a deal. Doing BETTER than Tit for Tat consistently _IS_ a big deal.
The game is a positive sum game, so it pays off to end up in a cooperative (or semi-cooperative) sequence over repeated "defections".
For some good reading on the Prisoner's Dilemma Game and how it fits in some biological systems, read:
"The Evolution of Cooperation" by Robert Axelrod (and newer books)
"The Selfish-Gene" by Richard Dawkins
There may be more recent books too, it's been a while since I studied the subject.
Having one plan that can beat Tit for Tat
I beat you thusly anyway.
1st iteration - Traitor defects, TfT cooperates, TfT loses and Traitor wins.
Nth iteration - both defect, minor losses for both
Thus Traitor beats TfT... What am I missing?
Once I "won" a PD-tournament by fiddling with the organising engine. The friendly engine supplied the programs with the recent histories, and I simply inserted all cooperates in my opponent's table. The scores were only counted at the end. This was in a trial session, and I only wrote the program to expose a problem in the engine. I wouldn't dream of entering the program for real, because it would defeat the purpose of the tournament. I can't say I have much respect for the Southampton team that didn't have any qualms about cheating.
The length of the code is one of the largest problems to overcome. Performing any signal other than all-cooperate produces a net loss of 1 or 4 points per round for your team in traditional (0,1,3,5) IPD. Simple signalling, i.e. a 4th round defect, was very effective. While the master/slave aspect was amazingly effective in my research, the "spoiler" was not. A small population of master/slaves could invade an arbitrarily large block of TitForTat if evolution was by duplicating winner and removing loser after n iterations. The population of "spoilers" stagnates very quickly in a large TFT population. TFT should be considered a friend, not an enemy because they are a positive growth environment. Going "spoiler" on any non-TFT/ally was quite effective as any bot not prone to cooperate posed the only real risk of "master" losing.
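The "net loss of 1 or 4 points per round" figure follows directly from the (0,1,3,5) payoffs. A small worked check, with `team_total` as a made-up helper name:

```python
# Traditional (0, 1, 3, 5) IPD payoffs, keyed by (move_a, move_b).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def team_total(move_a, move_b):
    """Combined score of two teammates for one round."""
    return sum(PAYOFF[(move_a, move_b)])


baseline = team_total("C", "C")                      # 3 + 3 = 6
loss_one_defects = baseline - team_total("D", "C")   # 6 - (5 + 0) = 1
loss_both_defect = baseline - team_total("D", "D")   # 6 - (1 + 1) = 4

print(loss_one_defects, loss_both_defect)  # 1 4
```

So every signalling round in which one teammate defects costs the pair 1 point, and every round in which both defect costs 4, which is why a short code matters.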
In an organism as ancient and lowly as the slime mold, a genetic feedback mech evolved so that cheating would balance with altruism...You should hope your species hangs around as long as the slime mold [> 1 billion years!] see this article at BetterHumans and elsewhere.
SLASHDOT: news for people who can't concentrate on work or have no life at all and got tired of yelling back at the TV.
It takes so much of my brain's processing power like no other game!
On a side note, I bet this story will get over 1000 comments.
Sure.
Why does yahoo do this
So...
Did I get that right?
Belief is the currency of delusion.
Because the North Korean government does things like kidnap citizens from other countries, conducts experiments on human subjects, and starves their population.
All in all, since the North Korean really can't build nukes without China's tacit acquiescence, I'd say we should go to the Chinese and say "If North Korea doesn't give up its nukes now, we'll support the nuclear arming of Japan, South Korea, and Taiwan."
That'd go over well with the Commies in Peking.
Oh, and you should see from both game theory and the results of Jimmy Carter's previous negotiations during the Clinton administration with North Korea why negotiating directly with North Korea only is a bad idea. Not that being a bad idea ever stopped John Kerry from opening his yap.
Did you also know that a few days after Kerry went off about North Korea in the first debate, the North Koreans pulled out of negotiations and want to wait until after the US elections? I guess they think they can get a better deal from Kerry.
Is telepathy scientifically possible? (Communicating with another person without any contact, not even eye contact, but by sheer WILL.)
Why does yahoo do this
Once I experimented with letting the agents recognize which "species" they were in and which "species" their opponent was. The runaway winner, of course, was the one which always cooperated with itself, and was less nice to every other species. (In my version, "less nice" meant playing Tit-For-Tat, but the idea's the same.)
Being able to do this is like having the teacher's edition. If recognizing which species other agents belong to is allowed, that's a pretty trivial strategy. It's not called cooperation. It's called xenophobia, or to put it into the most familiar anthropomorphization, racism.
(The life lesson, if I may go out on a limb, is that in an environment where some recognize a quality called "race" and discriminate based on it, being unable to see that quality is a liability. Being truly color-blind means you are unable to recognize not only race but racism, which means you will be taken advantage of.)
When I ran my first tournament and got some interesting results based on this, I realized that knowledge of what "species" an agent belongs to is too powerful, it throws a monkey wrench into the works. So I scrapped it and moved on to stuff I found more interesting.
But the winner of this PD tournament was even craftier; he submitted a ton of entries, all of which were xenophobic in this way, except that they all recognized one "species" as the top dog. The other "species" essentially committed suicide to give the highest score to the top dog. That wouldn't have worked in my tournament, since they literally would have committed suicide (my agents starve to death if they don't score high enough) and that would have shaped the resulting environment. Every tournament is artificial in some way, and the human submitting entries to this one was clever enough to take advantage of these particular artificialities.
Since it's now been shown that inter-agent communication is possible, that's going to be fair game for every tournament from now on. The next step is going to be designing tournaments to work with this trick, not against it. As I wrote to this tournament's organizers:
Hmm, super-cooperation. They are cooperating outside of the problem to achieve a goal outside of the problem. I think this is just a cheat. The programs are not out for personal gain at all and so are not truly participating.
Powered by onion juice.
That's right, traitor (hawk) beats TfT in any given trial.
BUT, in an environment made up of a few players playing each strategy, then you have the following matchups:
Hawk vs Hawk. Horrible horrible loss for both of them.
TfT vs Hawk. Hawk wins, but only by a single round.
TfT vs TfT. Both TfT 'win' - neither betray the other.
So, overall, TfT does better than hawk.
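The matchup list above can be turned into a small round-robin. This is a sketch under assumptions not stated in the thread: three players per strategy, 10 rounds per match, no self-play, and the conventional (0, 1, 3, 5) scoring.

```python
from itertools import combinations

# Assumed conventional scoring, keyed by (my_move, their_move).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own_history, other_history):
    return "C" if not other_history else other_history[-1]


def hawk(own_history, other_history):
    return "D"


def match(a, b, rounds):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pay_a, score_b + pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin(players, rounds=10):
    """Every player meets every other player once; return total score per player."""
    totals = [0] * len(players)
    for i, j in combinations(range(len(players)), 2):
        score_i, score_j = match(players[i], players[j], rounds)
        totals[i] += score_i
        totals[j] += score_j
    return totals


players = [hawk] * 3 + [tit_for_tat] * 3
print(round_robin(players))  # [62, 62, 62, 87, 87, 87]
```

Each hawk beats each TfT head to head (14 to 9), yet finishes well behind in the totals, because hawks earn only 10 from each other while TfT pairs earn 30.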
The interesting part isn't beating TfT (which, as you point out, isn't THAT hard to do) but in doing consistently better than it against a wide variety of programs. Which is what TfT has long been the baseline for.
Curse these researchers, now black hats will be using this technique to let exploit code escape from chroot prisons!
Tit for Tat is outperformed by "Tit for Two Tats", because it is better at avoiding long runs of damaging mutual recrimination. That was 5 years ago. The performance of any of these strategies is only determined by the opponent strategies that they face, which is arbitrary. It is therefore meaningless to talk of one strategy being 'better' than another - most advanced strategies can beat Tit for Tat given the right opponents.
foo mane padme hum
There are a lot of posts about the Southampton programs "cheating" and I agree. However, the interesting part of the prisoner's dilemma, or the zero-sum game is that this is one of those areas where math and philosophy overlap. This math problem yields a fascinating insight into society. I would see the "cheating" Southampton method as analogous to a "charismatic leader or organization" who has amassed an array of followers willing to sacrifice themselves for his (or their) benefit.
Think of people or organizations that have fanatical followers. I think these guys might be on to something quite fascinating.
A goal is a dream with a deadline
After recognition, the two Southampton programs became 'master and slave'
This game will be banned under the Patriot Act. Ashcroft will claim that if Osama got ahold of these results, he will use them to justify making us all Islamic slaves, with Osama the Master.
Table-ized A.I.
At first blush it seems like a "so what?", but in the context of current events, I wonder if it does give us a little insight.
As quoted above, we see that there are a few winners who got there only by making other players siphon their own potential into the designated winner.
Does this differ significantly from the US presidential elections? I mean, here we have two people who have convinced a nation that one of these two is the only one who can win, and all of us other players can only play the game their way. They're ensuring their own success, and for some reason we peons are going along with the game.
Hopefully they affixed plenty of LEDs to the front of their server hardware and forced their developers to use 300bps acoustic couplers.
"How about a nice game of Tit for Tat?"
I can find a similar example: a sports league in which most of the results are draws or very close.
Someone takes part not with one team but with 15. Fourteen of the teams lose without effort against the 15th but fight hard against the other 5 external teams, so team number 15 will have at least 14 victories and maybe some against the other 5... easy champion.
That seems to be what they did. With that strategy, someone could send 1 million entries and win easily.
How did all the Southampton entries do as a combined aggregate?
I'll willingly become a slave if my partners have nice tits. I call this algorithm tit for tit :-p
Table-ized A.I.
So there it implies that the Prisoner's Dilemma parallels everyday living and society, and yet it is not really an appropriate model for day to day life, is it? Maybe I'm missing the point here, it just seems the message is "Being a selfish jerk is better than a nice guy, so live appropriately".
-Don.
Cwm, fjord-bank glyphs vext quiz
Defecting gets you a win .. ..
Copying the other players moves gets you a win
By using this "recognition system", the program is capable of "knowing" in a deterministic fashion what some of the other programs will do in advance.
In other words, at the very least, a cheat.
With a name like BridgeBum, how could I not reply to this? :-)
I'll address the tournament rules of bridge, rather than the Laws of Contract Bridge, which are different.
Basically, the governing bodies in power running tournaments can and do restrict agreements (known as 'conventions') that are allowed between partnerships. However, these restrictions are not so stifling as to only allow known agreements...if that were the case, no invention could happen.
The rules relating to conventions basically fall into two categories:
1) Any agreements you have must be made available to your opponents upon request. There may also be procedural changes ('Alerts') for agreements which are unusual in nature to help inform the opponents when they should inquire.
2) Any conventions you play are 'categorized' and must fall into the allowed categories for the event. In the US/Canada/Mexico part of the world, they have in general the most restrictive rules about conventions, but even within those frameworks, innovation can and does happen.
Another good general rule: the higher the level of competition (National event, International, etc.), the more liberal the rules are about conventions.
In parts of the world outside North America, the local tournaments tend to be more liberal. Australia/New Zealand probably have the loosest rules restrictions of anywhere in the world. (I'm not sure there *are* any restrictions, except for the disclosure rule #1 above.)
More information:
American Contract Bridge League
World Bridge Federation
My UID is the product of 2 primes.
Isn't storing state from one game to the next a form of cheating? I always assumed the process should be stateless.
They "cheated", and the other guy didn't, so they won big! Wasn't that the whole premise?
-Serpent
The results depend greatly on the makeup of the tournament. Even with zero cooperating strategies like Southampton's, it's not hard to select a group of rules such that tit-for-tat loses the round robin tournament.
A much more interesting question is whether a population of S/M rules like Southampton's would survive in an evolutionary simulation (i.e. the population of rule x at iteration t depends on how well rule x did at iteration t-1). I suspect not, since the slave rules will die out quickly, leaving the masters with no victims to leech off of.
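The update rule described here (a rule's share at iteration t depends on how well it did at t-1) can be sketched as a simple replicator step. The payoff table below is entirely assumed for illustration: it ignores the handshake cost, treats matches as long, uses (0, 1, 3, 5) scoring, and guesses that masters and slaves defect against anyone who is not their designated partner. None of it comes from the tournament.

```python
# Assumed average per-round payoff to the row rule against the column rule.
NAMES = ["tft", "master", "slave"]
PAYOFF = [
    # tft  master slave
    [3.0, 1.0, 1.0],  # tft: cooperates with itself, settles into mutual defection otherwise
    [1.0, 1.0, 5.0],  # master: exploits slaves, defects against everyone else
    [1.0, 0.0, 1.0],  # slave: feeds masters, defects against everyone else
]


def step(shares):
    """One generation: each rule's share grows in proportion to its average payoff."""
    fitness = [sum(p * s for p, s in zip(row, shares)) for row in PAYOFF]
    mean = sum(f * s for f, s in zip(fitness, shares))
    return [s * f / mean for s, f in zip(shares, fitness)]


def evolve(shares, generations):
    for _ in range(generations):
        shares = step(shares)
    return dict(zip(NAMES, shares))


# Hypothetical starting mix: 20% TFT, 40% masters, 40% slaves.
result = evolve([0.2, 0.4, 0.4], 50)
print({name: round(share, 3) for name, share in result.items()})
```

With these made-up numbers the slaves collapse within a few generations, after which the masters have nothing to feed on and TFT takes over. A different selection scheme or payoff table could give a different outcome, so this only illustrates the mechanism behind the suspicion rather than settling it.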
The whole concept, contrary to what the wired writeup implies, is very uninteresting. It does not occur in nature or strategic economic/political models since there are no unconditional slave strategies in such systems, i.e. they don't survive and are clearly unfit, in the evolutionary sense of the term.
One easy way to find a "winning" strategy for non-evolutionary round robin simulations, given the other participating rules, is to optimize for a weighted average of the other rules' decisions, creating a "compound rule" that simply consults all other strategies and makes its decision based on what the others would do. In a sufficiently diverse set of rules, such a winning compound rule can almost always be found.
VK
IDRIPD (I did research on the iterated prisoner's dilemma)
Seems like this is the perfect application for evolutionary programming...
I've been giving talks on the Prisoner's Dilemma for a few years. (No original research, just following the thing and explaining the game to the Youth)
It is kind of an orthodoxy in the literature: Tit for Tat always ties or loses by a little bit, but in tournaments, it is the best strategy.
Well - it ain't. Someone found a way around it. Instead of urging rule-changes to prevent this new challenger, we should all be happy and excited that PD tournaments have just got MORE INTERESTING.
I can't wait to see what happens next - what new programs will emerge to have the advantages of Tit for Tat but also the ability to defend against Master-Slave programs that communicate with each other.
The game has changed - now let's leave it alone and watch.
God is real unless declared integer
The PD is indeed different from the zero-sum game, but they are related. When I said the PD *or* the zero-sum game I wasn't referring to them as synonyms; I was merely acknowledging their relationship. I seem to recall that there is a "collective farming" analogy to the zero-sum game.
A goal is a dream with a deadline
I don't like Tit for Tat. How about Tit for Cock?
thats one of the biggest cooperative things..........did asians do it? OR
The western world?
Why does yahoo do this
I always confessed, and I kept winning. Albert kept losing. I stopped the game after Albert got 300 years and I got none.
Am I missing something?
Why does yahoo do this
I kept reading "The prisoner's dilemma is quite useful in normal life, or at least the drinking game that gives rise to the solution is."
10 PRINT CHR$(205.5+RND(1)); : GOTO 10
Just create one that cheats. You just need to figure out the signature of the other S/M programs, emulate it and exploit it. You know in advance now how it's going to respond every time.
Well.. maybe. Or Maybe not. But Definitely not sort of.
Furthermore, in the original Prisoner's Dilemma tournament, there were several rather obvious strategies that would have defeated Tit for Tat in aggregate score, but simply were not submitted.
I read all the visible posts looking for one that talked about evolutionary models and the slaves dying out. Parent is the only one I saw, but I might have missed another. Anyway, Parent is right, long term this slavery strategy doesn't work (just as in human history ...)
On the compound rule strategy, if you know Southampton is taking part, you write a program that takes the first few moves to see if its opponent is a Southampton Slave and takes advantage if it is. If not, it then plays a couple of cooperations in a row (to fix any recriminations with a tit-for-tat-like opponent) then plays simple tit-for-tat.
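A minimal sketch of that probe-then-exploit idea, with loud assumptions: the real Southampton handshake is not given anywhere in this thread, so HANDSHAKE below is an invented placeholder, toy_slave is a made-up stand-in for a slave program, and the 5/3/1/0 payoffs are the common textbook values.

```python
# Hypothetical recognition sequence; the real one is not public here.
HANDSHAKE = "CDCCD"
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    """Cooperate first, then copy the opponent's previous move."""
    return theirs[-1] if theirs else "C"


def toy_slave(mine, theirs):
    """Stand-in slave: plays the handshake, then feeds whoever echoed it."""
    n = len(mine)
    if n < len(HANDSHAKE):
        return HANDSHAKE[n]
    recognised = "".join(theirs[:len(HANDSHAKE)]) == HANDSHAKE
    return "C" if recognised else "D"


def exploiter(mine, theirs):
    """Imitate the handshake, exploit a recogniser, otherwise fall back to TFT."""
    n, k = len(mine), len(HANDSHAKE)
    if n < k:
        return HANDSHAKE[n]              # pose as a Southampton program
    if "".join(theirs[:k]) == HANDSHAKE:
        return "D"                       # opponent looks like Southampton
    if n < k + 2:
        return "C"                       # two repair moves for TFT-like rules
    return theirs[-1]                    # plain tit-for-tat from here on


def play(a, b, rounds=20):
    """Return both scores and both move strings for one match."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + gain_a, score_b + gain_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b, "".join(hist_a), "".join(hist_b)


print(play(exploiter, toy_slave))
print(play(exploiter, tit_for_tat))
```

Against toy_slave the exploiter is mistaken for a master and collects the temptation payoff every round after the handshake. Against tit-for-tat the two repair moves restore cooperation from the seventh round on. Against a program that keeps defecting after the handshake, as a master presumably would, it simply ends up in mutual defection.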
My one and only journal entry is about Prisoner's Dilemma (that I wrote during a class on auctions and game theory).
http://slashdot.org/~naoursla/journal
I always thought that the Prisoner's Dilemma was whether to pick up the soap when you drop it in the shower.
My only question is what's Tat and where do I trade it in???
It is perfectly legitimate iterated PD. There is nothing in the rules of iterated PD that says records of moves cannot be kept and acted on. Indeed, that's precisely the idea of the iterated PD as explicitly stated.
Seastead this.
From TFA: The result is that Southampton had the top three performers -- but also a load of utter failures at the bottom of the table who sacrificed themselves for the good of the team.
Effectively, the Southampton group entered a team in a competition that scores individuals, so the "winners" were individual programs that had the backing of many other individual programs.
An alternative, and arguably superior, means of scoring entries would be one in which teams were scored rather than individual programs. In this contest the Southampton team would not do nearly so well.
If individual scoring continues we can expect to see entries that will attempt to recognize the Southampton programs and put a spanner in their works. These programs will be known as "labour unions", "socialist agitators" or "liberals".
The Southamptonites might counter with a raft of programs that attempt to identify the spoilers, but in this environment it will be hard to do much against them, as their activity will be fundamentally aimed at wrecking rather than winning.
The whole thing starts to look depressingly familiar, doesn't it? And all because the scoring system can be gamed by allowing one program to exploit the efforts of others...
Blasphemy is a human right. Blasphemophobia kills.
You cannot ignore kin selection in any attempt to model game theory if that model is to be relevant at all.
Seastead this.
Rabbit Season!
Duck Season!
Rabbit Season!
Duck Season!
Rabbit Season!
Rabbit Season!
Duck Season fire now!!! Pow!!
Pulls beak back around...
wonder if it can classify images in these categories: * Hardcore * Softcore * Positions * Misionary * 69 ... ... ...
"Me claiming Satan exist is just as valid as you claiming an atom exists" - 1inChrist
kin selection.
Seastead this.
The iterated PD is a different game than the PD and it is in fact more "in the spirit" of natural science than is the PD where there can be no memory of what opponents have done. Indeed, the way these guys "abused" the iterated PD is "in the spirit" of natural science since it gets to the heart of something all evolutionary biologists now accept: kin selection.
Seastead this.
While I have to admit this master/slave approach sounds neat the first time you hear about it, I have to break it to the /.ers: it's been done before. Many times.
I remember The Perl Journal sponsoring a PD competition a few years ago, and I think there was an entry (well, a set of entries) like this. [ And that certainly wasn't new even back then. ] Interestingly, in the post-mortem for that competition, the judge was disappointed that no-one had tried winning the competition by rewriting the Perl symbol table and substituting in an "always defect" subroutine for your opponent. (The PD version of Orbital Mind Control Lasers, I suppose.)
So, while this is an interesting and fun little news story, I wish it weren't presented as though "'Tit for Tat' has been defeated! This is a first, a breakthrough!!!!!" It's not.
Cheers,
Richard
The omerta, or code of silence, is the ideal that the mob works toward when caught. If you get caught, you simply clam up and take whatever's thrown at you as a point of honor. It is instructive, however, that this of course does not apply universally (everyone knows that the mob is rife with snitches.)
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
Everyone is posting about how this is bogus because it's really not the same game as PD.
But even if you don't agree with that view, another important question is:
in what meaningful sense is this new strategy a "victory"?
After all, it achieves "victory" for half of the cooperators, at the cost of sacrificing the other half.
To use one nuclear-war analogy, it's a choice between strategy "A",
where you acquiesce to the death of half of your populace, with the reward that the remaining populace is completely unaffected --
and strategy "B", with the guaranteed result that no one dies but everyone is injured.
Which populace would *you* choose to join on the eve of war?
And GPL is more or less 'Tit for Tat' in which it will only cooperate with those also cooperating.
I think what will become more interesting is that, now that we know the best lone player (tit for tat) can be defeated by players playing together, can we write our players to look for a player trying to communicate to another player so as to take advantage of it. Can my player play tit for tat against normal players, but, when it sees a S/M player, convince the S/M player to play slave for my gain?
I do security
If omerta is coupled with benefits like good legal representation and protection from the worst elements of prison society, it could still be quite effective. Of course, it is also understood that the reverse is true and that any snitch who walks is a marked man. It does work out to a variation on iterated PD. If the mob becomes known for screwing over those who play by omerta rules, the number of snitches rises.
Isn't defecting sort of like a selfish strategy? Someone might want to be nice and cooperate even though it hurts them. If this really happened and the police were asking me this, I wouldn't just squeal on the other person. I would effectively want a contract that says if I squeal but he does also, mine doesn't count. That way I can't lose. If my partners and I agreed that even if we defected we would still use that method, it's like a cooperation.
Why don't you guys have friends or journals?
So basically, we're talking about a computerized version of the TV show Survivor? That could be very interesting. I'm betting a good strategy for that game is to create a small alliance which as a unit plays tit-for-tat against the others.
For all of those saying, "Isn't this just cheating?" I say this:
Creativity is "just cheating." Creativity is breaking the rules in a novel way that sheds new light on reality. And isn't that the holy grail of AI?
So, was this just cheating? Hell yes. And it's fantastic.
The "sacrificial" algorithm described in this article means the defector gets 0 years, but the cooperator gets 10 years, for an average of a whopping 5 years. That stinks; the only way to get worse is to do constant mutual defection (for an average of 6 years).
Most PD games I've seen in the past have involved "community", or team, scores. In those situations, TfT has always done well because they get the best possible average score when they meet each other, and the only way to beat TfT is to do chronic defection, which leads to a pretty pyrrhic victory, because your only advantage over the TfT was the one single defection you did at the start.
If I had to live in a PD society, I'd pick a TfT one in a flash!
I agree that this definition of the "Prisoner's Dilemma" is no more than a "meta-game," and not really a problem of philosophical ethics (though it may appear to be to some people).
What I find disturbing is that the way the problem is framed presupposes no underlying system of ethics. To wit....
- If you confess and your partner denies taking part in the crime, you go free and your partner goes to prison for five years.
- If your partner confesses and you deny participating in the crime, you go to prison for five years and your partner goes free.
- If you both confess you will serve four years each.
- If you both deny taking part in the crime, you both go to prison for two years.
What do you do?
How about: Tell the truth? Regardless of what your partner does, tell the truth. I find it disturbing that the problem is framed in a way that the actual truth of the matter is irrelevant. (i.e. the problem would be unchanged if I replaced "You and your partner have committed a crime and are caught" with "You and a friend have been accused of a crime which you may or may not have committed.")
I'm not trolling or off-topic here. I'm dead serious. This formulation of the PD is ethically doomed from the get-go, and thus the results of the experiment may be of interest to mathematical game theorists of this particular game, but I find it unwise to think the results have any significant implications for ethics (or anything else for that matter).
Someone will counter that since this is a "Prisoner's" dilemma the person involved must be a criminal with no "ethical" principles other than an interest in self-preservation (i.e. the person is already debased and can contribute nothing meaningful on the subject of ethics! ;-) ). I'd say that just because someone committed a crime does not mean they necessarily want to continue committing crimes...
As a microbiologist with interest in evolution, I have followed this field from afar for years. Looking over the results, I was surprised at how relatively poorly "Pavlov" (win-stay lose-shift) did, since it performs so strongly in noisy, evolutionary versions of the game. [see: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=npg&cmd=Retrieve&db=PubMed&list_uids=8316296&dopt=Abstract]
It was also a bit dismaying to see how well "Grim" (hold a grudge forever) did in both games. In evolutionary versions of the game, Pavlov helps keep down the population of "suckers" (thereby decreasing the food supply for more predatory and parasitic strategies) while still rewarding "provokable" cooperators (thereby increasing the total aggregate "reward" of the ecosystem).
Also, one essential part of the payoff structure that deserves emphasis is that the payoff for mutual cooperation has to be more than the average of the winner's and loser's payoffs when one defects and the other cooperates, else a pair benefits by simply alternating each turn. This is a little bit like what the winners did here, where they got the top spots at the cost of a lower total take for their "team". One real world example of slashdot interest where this might make sense is if you take these losses in order to eliminate your rivals from the game and then reap monopoly benefits once you control the game (not to mention any names...).
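In the usual textbook notation (the letters are not used anywhere in this thread, so treat them as assumed labels), T is the payoff for defecting against a cooperator, R for mutual cooperation, P for mutual defection and S for cooperating against a defector. The payoff conditions can then be written as:

```latex
\[
  T > R > P > S, \qquad 2R > T + S .
\]
```

With the 5 / 3 / 0 payoffs quoted elsewhere in this discussion, 2R = 6 and T + S = 5, so a pair that takes turns exploiting each other averages 2.5 per round each, less than the 3 from steady cooperation.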
Maybe someone who has analyzed the results in more detail could comment on how the various well known strategies fared and why.
No, there are four situations in the prisoner's dilemma:
Neither confess
I confess, you don't
You confess, I don't
Both confess
The way that the dilemma works is that each of us is personally better off if we confess. Thus, the greedy solution is that we both confess. The point of omerta is to move us into the neither confess section. In neither confess, both are better off than both confess.
The way that this program works is that it picks one to confess and the other not to confess. This is obviously stupid for the one who does not confess. Since this is a program though, you can get the program to behave that way.
It is the rough equivalent of playing chess by controlling both sides. If you do, it is possible for black to checkmate in two moves. However, this would never happen in a real chess match, because no one would actually make the two stupid moves by white.
If that were true we'd just have a long string of one-off prisoner's dilemmas, with everyone always defecting.
... would like to welcome our new Southampton master overlords.
One of the premises of this dilemma is that the defector becomes free when the other cooperates.
In real life, it is not as simple as this, since usually both players are intertwined with each other in some way, such that if one 'dies', then the other free player cannot sustain itself long enough to survive without the 'dead' (or jailed) player.
I've seen many theories behind evolutionary survival that don't take this into account.
I believe species will only go through this dilemma if there aren't other choices available, e.g., they didn't have to commit the crime in the first place (most probably the environment was set up that way by outside factors, such as greed, fun in competition, not enough resources to share, etc.)
The real solution is not to restrict to these choices, but to make sure it doesn't come to these choices in the first place.
The game would be more interesting if we incorporated some type of penalty or reward based on survival of the whole, not just one player.
Survival of diversity is progress, while survival of a restricted few is doomed to total annihilation.
TFT is the equivalent of "No Child Gets Ahead". If you are content to take a random walk in life, go for it. Don't be fooled by people telling you the "optimum" score per round is 6pts for cooperation, as that only applies to pairs. Without much effort I designed a strategy "casino" that provided randomized "rewards and punishment" for defection. Against someone with a probability of cheating, its personal average could go above 3.0 (6.0/2). Naturally this was at the expense of the cheater having to pay back a lot more than was won. On average my strategy did better against cheaters/randomizing algorithms than TfT. Against TfT it was a draw since my algorithm never cheats first. As long as there were cheaters my algorithm would invade a TFT population of any size.
To make the invasion work my randomizer caused some to "Win Big" These big winners would spawn generations of cheaters usually at the expense of TFT which the cheaters bring the average down on quickly. Thus increasing my strategy's advantage over TFT.
This applies to the real world. Casinos and lotteries use the same principle. Give someone a 1:40,000,000 chance of becoming the richest person around. If there are 300,000,000 people in the population it is worth the negative expectation of buying the lotto ticket. Your odds of being the richest by not playing are 1:300,000,000. Your odds of being the richest by playing are 1:40,000,000. Of course, the house, as always, is the big winner.
Getting back to IPD, the "secret" to beating TfT is to encourage as much cheating as possible in the general population. By making your personal strategy cooperate until cheated upon you will always be doing as-well or better on average than any TfT player.
Enough Grim in the simulation will help keep the TFT from being eliminated by the cheaters early on. Grim's survival was always poor as it could never re-attain cooperation with the occasional cheaters, which is needed for long-term viability.
While entering a team into a tournament scored for individuals and then sacrificing the whole team for one player is by no means a new idea, what makes it so remarkably successful here is the existence of a "guaranteed draw" strategy (in this case, always defect). The best individual response to "always defect" is to defect yourself; anything else is suicide, so if you always defect you can force a draw. Then all your team loses to one team member, and he is the winner.
Compare this with, for example, a chess tournament. You could secretly enter a team and have them all lose to you. While this will keep you from ending last, it won't assure victory, unless all players are roughly equal. If there is a very strong player, he'll win against all your team, yourself included. So you can cheat by redistributing players of comparable strengths, but at least you can't rob a clear champion of his deserved victory.
This is not the case in the PD tournament. But let's redefine the problem slightly: say, if both sides cooperate, each gets a dollar. If they defect, each pays a dollar. Sucker's reward is paying 10 dollars. Now the Southampton team's strategy boils down to using the tournament to give all their money to one player, while paying a hefty tax in the process. There is a cheaper way to do this: just give all the money to one guy outside the tournament :)
But now we can gauge any strategy: enter one player or a team, recognize your own team members or not, transfer money between team members as you wish, but can you make money, overall, from this tournament?
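The "guaranteed draw" point can be checked with a small sketch: in every round always-defect collects either the temptation payoff against the sucker's payoff or the same punishment as its opponent, so it can never finish a match behind. The 5/3/1/0 payoffs, the 50-round match length and the handful of opponents below are assumptions picked for illustration.

```python
import random

# Assumed textbook payoffs: (my payoff, opponent's payoff) per round.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def grim(mine, theirs):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in theirs else "C"


def make_coin_flipper(seed):
    """Random player with its own seeded generator, for repeatable runs."""
    rng = random.Random(seed)

    def coin_flipper(mine, theirs):
        return rng.choice("CD")
    return coin_flipper


def play(a, b, rounds=50):
    """Return the total scores of strategies a and b over one match."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = a(hist_a, hist_b), b(hist_b, hist_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + gain_a, score_b + gain_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


OPPONENTS = {"always_cooperate": always_cooperate,
             "tit_for_tat": tit_for_tat,
             "grim": grim,
             "coin_flipper": make_coin_flipper(seed=1)}

results = {name: play(always_defect, rule) for name, rule in OPPONENTS.items()}
for name, (defector, other) in results.items():
    print(f"always_defect {defector:3d} vs {name} {other:3d}")
```

Never losing a single match is not the same as topping the table: against tit-for-tat or Grim the defector's edge is only the one exploited first round, which is why it needs a team of sacrificial partners to pull ahead on total score.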
The omerta insists that someone up the scale is going to be better off if you sacrifice yourself. The implication is that someday it might be you up there, and that you have to rely on the lower folks burning themselves to better your chances.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
Optimistic Tit-for-Tat models human behaviour well in a social setting--we give others the benefit of the doubt, and continue to cooperate when others do. When someone violates our trust, we stop trusting them and punish them, but if they act beneficially towards us again, we might be willing to forgive. Most notably, OTFT produces the best overall score, which in competition between social groups is the deciding factor.
The Southampton strategy is dependent upon large numbers of people who will sacrifice all for the good of the other, and not for the good of their community (the collective performance is worse than OTFT). I can see sacrifice for the greater good, but this is sacrifice to another person without hope of recompense or an increase in general wellbeing. This does happen in human societies (I think it's happening now in some political systems), but only when the winner has managed to convince the losers that it's all in everyone's best interest. What Southampton has added to this mix is a capacity for extreme self-delusion that directly contravenes the economic assumption of informed choice and self-interest. For purposes of economic modelling, Southampton should probably be disqualified, or these assumptions dropped. But this should also tell you something about what could happen to those nice economic models when they hit the messy world of human beings, who for the most part aren't very informed and often work against their own best interests as a result.
The consequence for a societal group running Southampton against an OTFT group would be the defeat of the Southampton group every time. Selection works at individual AND group levels. So the challenge should probably be two-tier: run the programs individually against each other, and run them as tribes against each other.
Tit for Tat? I don't know the Bible very well, but wasn't it Eye for Eye, Tit for Tit?
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
I saw this article, and immediately thought of this poster. Apparently, truer words were never spoken.
-Hentai [in vita non pacem est]
Whining about whether an innovative strategy is "fair" merely demonstrates the loser's desire to bring the winner down to his own level of incompetence, rather than to improve his own game.
If the rules allow it, it's legal. Fair has nothing to do with it.
The players acted within the rules.
Some of them won the game; others lost. That's why it's called a competition.
Some losers didn't understand the rules as well as the winners did.
That's why they're losers. They didn't win.
Sucks to lose, doesn't it? Now, go home and apply what you learned. Maybe next time you can win.
Just to satisfy your apparent desire to derive social and perhaps even moral implications from this competition, I suggest you read some of the other comments to the article. You will note that in this case, we can apply lessons from the simulation to everyday life. The concept here is called "taking one for the team".
"Reality is that which, when you stop believing in it, it doesn't go away." - Philip K. Dick
So, essentially, the winning program(s) hacked (or exploited, if you prefer) the game in order to win? That's pretty clever, but does this count as a true victory? It's sort of like what Captain Kirk did to rig his Kobayashi Maru scenario. Sure, he won on a technicality, but in doing so he missed the whole point of the challenge.
>|<*:=
NOS ES TURBATUS UT RECOLLIGO VICTUS PRO NOSTRUM CANI... It makes us angry to collect the food for our dogs???????????
Always defect, you're bound to win or tie.
If you get caught, you simply clam up and take whatever's thrown at you as a point of honor.
I figured that was less to do with honour and more to do with not getting killed.
Something slightly different would work much better, I believe, at least if your goal is to maximize the total for your group instead of just one instantiation of your program.
Specifically... it's nice to have 100 slaves starve themselves and work for you, where you get 5 and they get 0. If you intend on letting your slaves starve and then die, then you've just done very very well for yourself.
If you and your slave cooperated, however, you'd each get 3, a total of 6, which is greater than 5 of course.
So perhaps the best thing to do is cooperate with your slaves for a total of 6, and then take some away for yourself. Then you don't have to waste 2 going to find more slaves ;)
Or... use propaganda to convince your slaves to give you 2 of their 3 from the cooperation because you need to use those 2 to keep them safe from terrorists. Or call it "taxes", or tithe, or some such.
In this way you'll be doing right by the group while still being an exploitative bastard of an overlord, and your people will love you for it. Don't outright screw your group by not cooperating... cooperate with them! Then take the benefits.
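The arithmetic behind that suggestion, as a tiny sketch. The 5 / 3 / 0 payoffs are the ones quoted in the comment; the "tax" transfer is a hypothetical side payment that the standard game does not actually provide.

```python
TEMPTATION, REWARD, SUCKER = 5, 3, 0  # per-round payoffs quoted above


def master_slave():
    """Master defects, slave cooperates: (overlord, subject) payoffs."""
    return TEMPTATION, SUCKER


def cooperate_then_tax(tax):
    """Both cooperate, then the subject hands `tax` points to the overlord."""
    return REWARD + tax, REWARD - tax


for label, (overlord, subject) in (("master/slave", master_slave()),
                                   ("cooperate, tax 2", cooperate_then_tax(2))):
    print(f"{label}: overlord {overlord}, subject {subject}, "
          f"total {overlord + subject}")
```

With a tax of 2 the overlord keeps the same 5 per round as under outright exploitation, while the subject gets 1 instead of 0 and the group total rises from 5 to 6.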
Easy:
And for that matter, what is a 'tat'? Whatever a 'tat' is, I doubt anyone will trade its own tits to get it.
Is this that important? I know I wrote a Prisoner's Dilemma bot back in 2000 that could consistently beat Tit-for-Tat, and my classmates wrote bots that consistently beat mine. I had thought Tit-for-Tat was just the champion of a particularly famous PD tournament.
Had a friend who took 2nd in a Core War competition
due to someone else who took this kind of strategy.
You could submit as many candidates as you like.
The winner submitted a bunch who were identical
except for one which had the "queen bee" flag set.
The drones could determine when they were battling
against the "queen bee", and if so, would go belly up.
Tit for Tat tends not to be an optimal strategy in most populations. But it is a very robust one; it works well in many different populations. What has happened previous times algorithms designed specifically to beat tit-for-tat were introduced was that they weren't as good against other such algorithms as tit-for-tat was (they weren't as robust), and tit-for-tat ended up winning anyway.
where is this "tat" and how can I get me some?
If you want to attract women who admire tattoos ("tit for tat"), start here and then replace fort wayne IN with your city and state/province. However, to satisfy most employers' policies, make sure to get your tat where a work uniform would ordinarily cover it up.
You can get a tat (short for tattoo) by opening the yellow pages and turning to "tattooing". Then hang out in bars, show women your new tat, and if you're lucky, they may show you their...
Just make sure you get a 13 and not a 31.
The NBA Finals isn't scored by total points. Nor is the World Series or the US Presidential election. Given a population of Hawks and TfTs, which would win more games?
You see the same thing happen constantly in games like Magic the Gathering. There'll be a "hot new deck" archetype out there dominating the tournaments. Then, out of the blue, someone with an otherwise lousy deck designed specifically to beat that dominant deck takes a few high profile tournaments.
Everyone scrambles to retool their decks to beat the counter-deck and the process begins anew. Usually, a format will eventually reach some sort of equilibrium...
It's basically evolution. Adapt your deck to the environment or lose.
Here is an excellent Prisoner's Dilemma book that deals not only with the Prisoner's Dilemma itself but with all of mathematical game theory, and connects it all to real-world situations in fascinating ways. It really changed the way I look at these things... now I can see Prisoner's Dilemmas everywhere.
For example: Some of my teachers at school favor a lenient teaching style, going easy on actual work, giving troublemakers benefit of the doubt, and so on. This could be seen as "cooperation", where other teachers "defect" by assuming that kids are always trying to do the least work for the most benefit and get away with it, and so are strict and give lots of work. The students "cooperate" by not taking advantage of the curriculum, not cutting corners, doing work thoroughly, not cheating on tests, participating more in class, etc. They "defect" by trying to get away by doing as little work as possible.
Now, the greatest benefit for all involved is mutual cooperation; a lenient teaching style combined with students who won't take advantage of that lenience. However, I've noticed that most teachers (primarily the young, naïve ones) choose to cooperate, and the kids respond by defecting; given an opportunity, they'll slack. Older, more experienced teachers defect by being strict from the start, and students who cooperate by doing work thoughtfully and thoroughly get screwed and fall behind those who cut corners and cheat (the defectors).
It's not an exact parallel, since you can generally tell what the other side is doing, but it's close, and reasonable too.
So the point is, read the book. It's good.
Love the Third Amendment?
I think the human reaction to the way this program behaves is more interesting than the actual contest.
"The program cheated!"
"There's no rule saying it couldn't do what it did!"
etc.
One method of discouraging this strategy would be to run the tournament based on a ladder system rather than a round robin: the slaves fall to the bottom, where they will never interact with the masters again.
I don't need a perl simulation to evaluate the strategies. Online gaming does it nicely for me.
... and the problem with those games is how few players are left as a result. Makes it harder to play a game.
There are always team killers (defectors) in most online games eg. half-life/far cry/call of duty.
The most successful strategy in discouraging defections in those games is TFT (making team killers sit out after a kill).
The M/S strategy is found in clans, clan games or password protected games.
Does it make it easier that there is secrecy? The programs don't know the other programs' strategies.
Spatialization vs Round Robin of Iterated PD actually can yield vastly different outcomes for identical 'worlds'. And my gut feeling is that these colluding strategies would not do nearly as well when they can only interact with the limited number of strategies in their field of vision.
In our group's research we've been using Spatialized IPD, among other projects, to look at prejudice reduction, which is a real-world version of fair play and cooperation versus betrayal.
There are some papers here for anyone interested. Some of the results are simple and yet very striking.
http://www.computationalphilosophy.org/
So fast-forward to what we have today: a strategy where we create 500bazillion smaller firms all for the purpose of going bankrupt, so that one parent firm makes all the money. Huh? Right, that doesn't make any sense.
Billions of people work themselves to death so that the wealthiest top 0.1% can become even wealthier.
I know! It's absurd!
On the duplication algorithm. The worker bees may die but the queen bee might be able to replace them fast enough and with the benefit of pooling resources, grow faster than competing algorithms.
Aren't you just hacking the system by creating an insurance benefits system, so that if a Slave "takes one for the team" they receive a greater reward than the punishment given by the game? In this case the benefit is winning the competition. The flaw of this system is that the size of such a group would be limited by the reward for winning, if you limited everything to the game. But this is OK, since everyone else would be playing fair or tit for tat.
Or in laymen's terms creating.
Then how do you get around the problem of trust?
If you have a secret society then doesn't that imply some sort of hierarchy? For example, like Slashdot, something as arbitrary as a number given based on the order in which members join, so that you have some way of working out who the masters and slaves are going to be in a given situation (once you identify them). So what eventually happens is that a low-ranking member of the society is going to be 100% sure of how a high ranker is going to vote, while a high ranker will never be sure of whether a low ranker is going to sell them out (after all, they lose either way). With such a clear depreciation of trust, doesn't any advantage disappear?
Of course, you could get around this problem by making everything equal with everyone just picking balls out of a hat. But then aren't you just replacing one game with another? How are you supposed to have this game when everyone isn't meant to know each other? And in the ideal situation of total self-interest at play, why would the losers of such a lottery not take advantage by screwing the high rankers? After all, why do the high rankers think the lowers are going to play fair even when - by definition - they are already creating. Hell, how are you supposed to know if the other player is even in the secret society, or that they may be just some dude who got lucky in the bidding process?
It doesn't make sense.
In the second prisoner's dilemma, you should not still be at 2/3, since the warden just stated that you and b will be paroled. Stating the question as "who else besides me will be paroled" assumes that you are one of the people being paroled. Unless the warden is a real dick, you should be able to assume that you are included as one of the two that will be making parole.
TFT is based on the fundamental principle that cooperative behavior is more effective than selfish behavior (so long as people have the brains and back-bone to face defectors when they show up).
Southampton's new program is simply another aspect of cooperative behavior. The apparent superiority stems from the fact that it is utilizing the energy potential of two deliberate cooperators rather than a single 'smart' program trying to shepherd a selfish program into cooperative behavior. Or to put it another way, Southampton uses two programs which are working in a co-linear fashion toward the same goal, while the original TFT program must work by itself toward that goal with the equivalent of a retarded child who, with effort, can sometimes assist, but who will also put up annoying resistance from time to time.
Saying that Southampton's program 'beats' the original TFT program is like saying, Cooperative Behavior 'trumps' Cooperative Behavior, which curiously is itself an example of self-serving terminology. Cooperators don't think in terms of 'winning' against other units within the same system. (Good guys don't play chess.)
In any case... imagine now three programs which are able to work together to maximize energy potentials. Or four. The amazing part is that the harvested energy potentials don't grow in a linear fashion. They have the ability to grow in a geometric curve. Hmm! (One might think of Groklaw, which has the ability to do thousands of hours of legal research on levels which even a large corporation cannot afford to match. Service to Others vs. Service to Self.)
The logic presented by these rudimentary programs, if built upon, can become the most sensible energy management system that can ultimately exist; that is, a cell-type network of other-serving individuals willing to give whenever needed, however needed, and who can be assured of being given to when in need themselves.
Naturally, such a system requires that every member be two things:
1) Willing to co-operate fully and without selfish tendency. And,
2) Mastering the 'Tat'. --That is, overcoming our social programming so that we can in fact be brutal when we are being brutalized.
Cutting off a selfish defector when their defection becomes apparent is clearly one of the most difficult things in this world for people to do. If we could bring ourselves to not "Turn the Other Cheek" and to not, "Forgive and Forget", and generally not "Die on the Cross," then things like Enron and rogue presidents could have been easily recognized early on, (when the first business irregularities showed up in the energy biz, or when Shrub was blowing up frogs with fire-crackers), and have been prevented before they ever got rolling. If people had a higher level of awareness, stronger back-bones and less indoctrination, then society would work, I think, with far fewer wars and psychopathic leaders.
But anyway, my point is that such systems CAN and do work in our contemporary environment, albeit in a limited fashion. On the large scale, though, things founder partly because the selfish have taken too much territory and the spineless (who are also often the mindless) are so massive in number. As such, a sensible system is going to be log-jammed with issues of self-service and self-destruction.
But then, I largely see this world as a school whereby people learn the hard way that selfishness destroys, and that the only road out comes through dropping those aspects of ourselves and growing aware, strong and courageous.
-FL
A certain code injected into a computer system causes it to become a slave of the original master computer. Then the slave works for the master.
Last year I co-authored a paper - Covert Channels for Collusion in Online Computer Games (PDF 151K) which dealt with a similar subject. Rather than IPD, it deals with a Connect-4 competition, but many of the ideas are the same.
It also discusses the link between communication in games like this and the concern of covert channels in (generally military) multi-level secure systems. Another interesting area is the link between these types of competitions and voting algorithms, since they may be a good way of designing collusion resistant competitions, or proving that they are impossible.
Steven Murdoch.
web: http://www.cl.cam.ac.uk/users/sjm217/
The competition however involves your facing an unknown strategy. What they did was construct a profile of strategies which reach a high payoff, and tweak it to allow them to identify each other. The profile of strategies is not even Nash; they did not prove anything, they just fooled the evaluation method.
Sidenote: notes on the repeated PD.
1) If the game lasts one single step, whatever your opponent does it is always better for you to defect. So you defect and so does your opponent.
2) If the game lasts 10542 steps, you know that at the 10542-th step you and your opponent will both defect. So there is no point in cooperating at stage 10541, so you also both defect. And so on. Thus the only sustainable combination of strategies (= Nash) is to defect from day one.
3) If the game has unknown length or is infinite, then cooperating becomes sustainable. Actually any payoff in the convex hull is a Nash payoff.
This is with perfectly rational players; real-world players, however, are not.
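To make points 1) and 2) concrete, here is a minimal Python sketch. The payoff numbers (5, 3, 1, 0) are the commonly used Axelrod values and are an assumption, since the notes above give no numbers; the function names are made up for illustration.

```python
# Payoff to "me" for (my_move, opponent_move).
# ASSUMPTION: standard Axelrod values, not taken from the notes above.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def best_reply(opponent_move):
    """Return the move that maximizes my one-shot payoff against opponent_move."""
    return max("CD", key=lambda my_move: PAYOFF[(my_move, opponent_move)])


def backward_induction(rounds):
    """Unravel a game of known length, from the last round back to the first."""
    plan = []
    for _ in range(rounds):
        # All later rounds are already fixed at mutual defection, so the
        # current round is effectively one-shot: play the dominant move.
        replies = {best_reply("C"), best_reply("D")}
        if len(replies) != 1:
            raise ValueError("no dominant move for this payoff table")
        plan.append(replies.pop())
    return plan


def total(my_moves, their_moves):
    """Sum my payoffs over a whole sequence of rounds."""
    return sum(PAYOFF[(m, t)] for m, t in zip(my_moves, their_moves))
```

With these assumed payoffs, defecting is the best reply to either move, so a game of any known length unravels to all-defect, even though mutual cooperation would pay each player three times as much over the same number of rounds.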
Now have finite state automata play the repeated prisoner's dilemma, and define their "size" as their number of states. A finite state automaton of size n cannot "count" up to n+1; so even in the finitely repeated PD, if the game's length is bigger than both automata's sizes, then cooperation becomes sustainable. The actual result (due to A. Neyman http://ratio.huji.ac.il/dp/dp69.pdf, Th1 p9) is that as soon as _one_ of the two players is approximately not larger than the exponential of the length of the game, then any payoff in the convex hull of rational payoffs can be approximated.
Similar tight results for pushdown automata or Turing machines of bounded Kolmogorov complexity are not yet known.
The interesting question is to design a Nash pair of strategies which reach the highest payoff but do so with a limited number of allowed lines of code (= Kolmogorov complexity). This is definitely no trivial problem: even if I claim to always cooperate, once I know that my opponent is dumb it may be easy (?) for me to pretend to cooperate but later betray him nonetheless ...
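For readers who have not met the automaton picture, a rough Python sketch follows. It is only an illustration of "size = number of states", not Neyman's construction; the `Automaton` class, `defect_after`, and `play` are hypothetical names invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Automaton:
    """A strategy as a finite state machine (illustrative, not from the paper)."""
    start: int
    output: dict  # state -> "C" or "D"
    step: dict    # (state, opponent_move) -> next state

    @property
    def size(self):
        return len(self.output)


# Tit for Tat needs two states: "opponent last cooperated" and "last defected".
TIT_FOR_TAT = Automaton(
    0,
    {0: "C", 1: "D"},
    {(0, "C"): 0, (0, "D"): 1, (1, "C"): 0, (1, "D"): 1},
)
ALWAYS_DEFECT = Automaton(0, {0: "D"}, {(0, "C"): 0, (0, "D"): 0})


def defect_after(k):
    """Cooperate for k rounds, then defect forever; counting to k costs k + 1 states."""
    output = {s: "C" for s in range(k)}
    output[k] = "D"
    step = {(s, m): min(s + 1, k) for s in range(k + 1) for m in "CD"}
    return Automaton(0, output, step)


def play(a, b, rounds):
    """Run two automata against each other and return the list of move pairs."""
    sa, sb = a.start, b.start
    history = []
    for _ in range(rounds):
        ma, mb = a.output[sa], b.output[sb]
        history.append((ma, mb))
        sa, sb = a.step[(sa, mb)], b.step[(sb, ma)]
    return history
```

The point of `defect_after` is that betraying at a precise round requires enough states to count that far, which is why a machine much smaller than the game cannot run the "defect on the last step" argument.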
The original prisoners' dilemma is really about honor among thieves: Criminals do better in a cooperating group like the Mafia than as individuals who will sell each other out at the first opportunity.
But when applied to evolutionary theory, it's talking about genes, which on their own don't have any ethics (or other motivation). It shows that organisms which follow a "tit-for-tat" rule are more likely to survive and reproduce than those that follow some other strategy.
Eventually, you get a population where nearly every organism is following "tit-for-tat." Add consciousness and empathy, and you get the Golden Rule.
If you are interested in a more-challenging version of the iterated prisoners' dilemma, one that allows signalling and coalitions without shills, try the generalized PD. The citation is:
Fader, Peter and John R. Hauser (1988), "Implicit Coalitions in a Generalized Prisoner's Dilemma," Journal of Conflict Resolution, Vol. 32, No. 3, (September), 553-582.
The authors are now on the Wharton and MIT Sloan faculties, respectively.
The next logical step in this is likely to be a Tit for Tat program which starts off by imitating the "code" of the Master Slave, identifying itself as the Master. If the other program does not play along after a certain number of moves, it makes a cooperate move and ignores the move that the other program makes, in order to "seed" a Tit for Tat. Then it goes into Tit for Tat mode.
This program would take slight losses in some cases, but would likely come out ahead of Southampton due to its ability to cooperate with Tit for Tat.
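A rough Python sketch of that idea, under loud assumptions: the real Southampton code sequence is not given anywhere in this discussion, so `HANDSHAKE` is made up, and `slave` is only a stand-in for how a colluding entry might behave. Defecting against a recognized slave, and cooperating twice so that the opponent's reply to the last code move is ignored, are one reading of the description above.

```python
# HYPOTHETICAL recognition code; the actual Southampton sequence is unknown.
HANDSHAKE = ["C", "D", "D", "C", "D"]


def impostor(mine, theirs):
    """Send the code as 'master'; fall back to Tit for Tat if nobody answers."""
    n = len(mine)
    k = len(HANDSHAKE)
    if n < k:
        return HANDSHAKE[n]        # still sending the code
    if theirs[:k] == HANDSHAKE:
        return "D"                 # opponent answered the code: exploit it as master
    if n <= k + 1:
        return "C"                 # seed rounds: cooperate, ignoring the opponent's reply
    return theirs[-1]              # ordinary Tit for Tat from here on


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def slave(mine, theirs):
    """Stand-in colluder: sends the code, then feeds anyone who also sent it."""
    n = len(mine)
    k = len(HANDSHAKE)
    if n < k:
        return HANDSHAKE[n]
    return "C" if theirs[:k] == HANDSHAKE else "D"


def play(a, b, rounds):
    """Play two strategies against each other and return both move histories."""
    ha, hb = [], []
    for _ in range(rounds):
        ma, mb = a(ha, hb), b(hb, ha)
        ha.append(ma)
        hb.append(mb)
    return ha, hb
```

Against `tit_for_tat` the impostor pays for the defections inside the code and then settles into mutual cooperation; against `slave` it collects the master's payoff. Whether that beats Southampton overall would depend on the actual field of entries, which this sketch does not model.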
In fact, I'm surprised that Southampton didn't choose to start their "code" with cooperations to test for Tit for Tat first. Southampton's entries could make a suspicious defect in response to the cooperate at some point to initiate the code. The "Master" entries could have had a higher score by cooperating with any entries that keep cooperating, as Tit for Tat will as long as it is not betrayed.
// harborpirate
// Slashbots off the starboard bow!