Iteration, Feedback, and Change: Prisoner's Dilemma

Published on Thursday, June 02, 2011

Now we've seen the amazing power of complexity arising from simplicity, and learned how it was primarily the study of evolution that made this power part of our mental toolbox. You've also seen the effect that Haeckel, Sumner, and Marx had when applying this to society at large.

There's still the problem of cooperation. Why did early man, and even many animals, ever come around to cooperating in the search for food and other resources?

If you'd asked an astronomer before Copernicus about their field, they would've explained that astronomy was the study of how the planets and the sun revolved around the Earth, in order to develop a more accurate calendar. The very definition prevented most researchers from even considering that the Earth might go around the sun, instead.

Similarly, if you'd asked a biologist about cooperation before 1960, you'd most likely get the reply that cooperation wasn't even something a biologist would study. It was outside their field. Biology was so intertwined with the concept of “survival of the fittest” that cooperation wasn't something biologists gave much consideration.

Often misunderstood, especially in attempts to apply evolutionary principles to society, was the fact that “fittest” didn't refer to the most physically fit, but rather to those organisms that were best adapted to change.

So what happened that suddenly changed our minds about the nature of cooperation? The development of a new field in mathematics called game theory. Game theory gets its name from studying the strategies of multiple players who are working to win a game, and who may even use deceit to try to win (such as bluffing in poker). Since it studies behaviors and decisions, it can apply to other things like economics, politics, and more. Similarly, geometry started out as a way to analyze real estate, but isn't limited to real estate problems.

John von Neumann had been developing effective tools to analyze strategies and decisions since the 1920s, and in 1944, during World War II, he and Oskar Morgenstern published a major work on the topic, titled Theory of Games and Economic Behavior.

Think about the time at which this was going on. A year after von Neumann and Morgenstern's work was released, Nazi Germany surrendered, and Japan surrendered after the US dropped atomic bombs on Hiroshima and Nagasaki. The US had become the only atomic superpower, and expected to remain so for quite some time. This all changed just 4 years later, on August 29, 1949, when the Soviet Union tested its first nuclear weapon.

Now that there were two nuclear superpowers, the US had to consider its options very carefully, as did the Soviet Union. With knowledge of the tools of game theory, John von Neumann and many other top minds initially supported the idea of “preemptive war”, the idea that America should attack first without provocation, so as to have the best chance at winning a nuclear war.

Just a few months later, however, in January 1950, Merrill Flood and Melvin Dresher turned game theory upside-down when they proposed a game that would be popularized as the prisoner's dilemma.

The situation is set up like this:

Imagine you and another person are arrested by the police. The police don't have sufficient evidence for a conviction, so the police split you and this other person up into different rooms. They offer you and the other person the same deal:

• If both you and the other person stay silent, both of you will wind up spending 6 months in jail (the police have enough evidence for a lesser charge).
• If both you and the other person agree to testify against each other, you will both spend the next 2 years in jail.
• If one agrees to testify for the prosecution, and the other person remains silent, the one who testifies will go free, while the one who remains silent will go to jail for 10 years.

What makes this situation so interesting is that the best outcome for both players is to cooperate, but since each player has to make a decision without knowledge of the other player's decision, the most rational choice is for each player to talk. The most rational choice, however, winds up in a less-than-ideal result for both players.
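The dominance argument can be checked with a short Python sketch. The sentences come from the scenario above; the table layout and the function name are my own:

```python
# Payoff table for the dilemma described above, in years of jail time
# (lower is better). Keys are (your move, the other prisoner's move).
SENTENCE = {
    ("silent", "silent"): 0.5,   # both stay silent: 6 months each
    ("talk",   "talk"):   2.0,   # both testify: 2 years each
    ("talk",   "silent"): 0.0,   # you testify, they stay silent: you go free
    ("silent", "talk"):   10.0,  # you stay silent, they testify: 10 years
}

def best_response(their_move):
    """Return the move that minimizes your sentence, given the other's move."""
    return min(["silent", "talk"], key=lambda mine: SENTENCE[(mine, their_move)])

# Whatever the other prisoner does, talking is never worse for you...
assert best_response("silent") == "talk"   # going free beats 6 months
assert best_response("talk") == "talk"     # 2 years beats 10 years
# ...yet mutual talking (2 years each) is worse than mutual silence (6 months each).
```

Talking is a dominant strategy for each player individually, which is exactly why the rational outcome is worse for both than mutual silence.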

Further confounding things was the fact that, when the same two people played this repeatedly, it wasn't uncommon to see them cooperate frequently. If the most rational choice in a single play is to rat out your partner, why does cooperation show up so much when it's played repeatedly?

Originally, the prisoner's dilemma was considered a paradox to be solved. Its applications to the immediate nuclear situation were quite clear, so there was obviously great interest in analyzing this game. We'd quickly learn that it was more a reflection of much of the human condition, instead.

In the following documentary, Nice Guys Finish First, you'll discover the surprising lessons and applications of the prisoner's dilemma. Especially surprising is what happened in 1980, when a concentrated search for the most effective strategy for the iterated prisoner's dilemma began.

To learn more about the prisoner's dilemma, I also suggest watching For All Practical Purposes: Prisoners Dilemma and Mathematics Illuminated: Game Theory as clear and excellent resources.

In his book The Evolution of Cooperation, author Robert Axelrod analyzed all the top-scoring strategies, and found the following four qualities to be essential and effective.

NICE: By far, the most important quality was that the strategy start out by cooperating with the other player. Ironically, this winds up being the self-interested play, even though the “survival of the fittest” principle would lead you to expect otherwise.

RETALIATING: A successful strategy will always betray the other person after it has been betrayed, to show that there will be a price for betrayal.

FORGIVING: When an opponent returns to cooperation, an effective strategy will also return to cooperation. The balance of forgiveness and retaliation helps establish that a player can be trusted.

NON-ENVIOUS: Finally, a strategy shouldn't strive to outscore its opponent's strategy. Once it does, the effectiveness of the other qualities becomes diluted.

That's why the “Tit For Tat” strategy was so effective: it combined these qualities more effectively than any of the other strategies. Interestingly, even people trying to cheat in these prisoner's dilemma competitions have found that “Tit For Tat” is still more effective (usually because cheating strategies aren't non-envious).

The first time the 200-round tournament was held, it was a complete surprise that “Tit For Tat” was the dominant strategy. The second time, programmers developed their strategies to beat “Tit For Tat”, usually with variations like “Two Tits For Tat” (defect twice after any defection), “Tit For Two Tats” (ignore isolated defections, and punish only after two consecutive defections), and “Almost Tit For Tat” (Tit For Tat that throws in a random defection occasionally). The original “Tit For Tat” strategy still came out on top!
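As a rough illustration, here are minimal versions of two of the strategies named above, played over a 200-round match. These are my own sketches using the standard 3/1/5/0 point values from Axelrod's tournaments, not his actual entries:

```python
# Each strategy takes the opponent's full history of moves
# ("C" = cooperate, "D" = defect) and returns its next move.

def tit_for_tat(opp_history):
    """Cooperate first, then copy the opponent's last move."""
    return "C" if not opp_history else opp_history[-1]

def tit_for_two_tats(opp_history):
    """Punish only after two consecutive defections."""
    if len(opp_history) >= 2 and opp_history[-1] == opp_history[-2] == "D":
        return "D"
    return "C"

def always_defect(opp_history):
    return "D"

# Standard iterated-PD point values: 3 each for mutual cooperation,
# 1 each for mutual defection, 5/0 for a lone defector vs. a lone cooperator.
PAYOFF = {("C", "C"): (3, 3), ("D", "D"): (1, 1),
          ("D", "C"): (5, 0), ("C", "D"): (0, 5)}

def play_match(strategy_a, strategy_b, rounds=200):
    """Play one 200-round match and return each side's total score."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(hist_b), strategy_b(hist_a)
        pts_a, pts_b = PAYOFF[(move_a, move_b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Two Tit For Tats playing each other cooperate every round and earn 600 points apiece, while Tit For Tat against Always Defect loses only the first round before locking into mutual defection (199 points to 204).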

For the third competition, there was a new and interesting twist: several 200-round tournaments would be played in succession. The first would be played as originally, with 1 copy of each strategy in play. After 200 rounds, the points would be tallied, and strategies with 0 points would be eliminated from the next round. Surviving strategies would then be represented in the next round in proportion to their score in the first. For example, if “Tit For Tat” scored 15% of all the points in the first round, then 15% of the strategies in the next round would be “Tit For Tat”; if a poor strategy scored only 1% of the points, then only 1% of the strategies in that next round would be that poor strategy.

This gave the game a new dimension, similar to the “artificial selection” used in Conway's Game of Life. What happened? In the earlier rounds, the weaker strategies died off quickly, while “Tit For Tat” and some more exploitative strategies, such as “Always Defect”, survived (the latter always did great against the very weak “Always Cooperate”).

After several more generations of weak strategies dying off, something surprising began to happen. The predatory strategies began to die off, because the weaker strategies they fed on became rarer and rarer. “Tit For Tat”, going up against the surviving “Always Defect” strategies, would wind up always defecting too, minimizing the latter's advantage.

Eventually, “Tit For Tat” wound up dominating the game yet again. With this natural-selection addition, you could see, dramatically, the pressure that selection puts on alternative strategies to cooperate.
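The proportional-replication idea can be sketched as a simple replicator step. This is a toy model with only three strategies and made-up equal starting shares, not Axelrod's actual tournament code:

```python
# After each round-robin, every strategy's share of the next population is
# proportional to its share of the total points scored.

PAYOFF = {("C", "C"): (3, 3), ("D", "D"): (1, 1),
          ("D", "C"): (5, 0), ("C", "D"): (0, 5)}

def tit_for_tat(opp): return "C" if not opp else opp[-1]
def always_defect(opp): return "D"
def always_cooperate(opp): return "C"

STRATEGIES = {"Tit For Tat": tit_for_tat,
              "Always Defect": always_defect,
              "Always Cooperate": always_cooperate}

def match_score(f, g, rounds=200):
    """One 200-round match; returns each side's total points."""
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = f(hb), g(ha)
        pa, pb = PAYOFF[(ma, mb)]
        sa, sb = sa + pa, sb + pb
        ha.append(ma)
        hb.append(mb)
    return sa, sb

def next_generation(shares):
    """One replicator step: fitness = expected score against the population."""
    fitness = {}
    for name, f in STRATEGIES.items():
        # Score per match, weighted by how common each opponent is.
        fitness[name] = sum(share * match_score(f, STRATEGIES[opp])[0]
                            for opp, share in shares.items())
    total = sum(shares[n] * fitness[n] for n in shares)
    return {n: shares[n] * fitness[n] / total for n in shares}

shares = {n: 1 / 3 for n in STRATEGIES}   # start with equal populations
for _ in range(30):
    shares = next_generation(shares)
# "Always Defect" thrives while "Always Cooperate" is common; as its prey
# shrinks, its score collapses and "Tit For Tat" takes the largest share.
```

In this toy run, Always Defect feeds on Always Cooperate at first, but as its prey becomes scarcer its score collapses and it all but vanishes, leaving Tit For Tat with the largest share of the population — the same dynamic described above.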

For a more detailed course in game theory in general, I recommend William Spaniel's Game Theory 101 site, Ben Polak's Game Theory Course at Yale, as well as the documentary about John Nash, A Brilliant Madness, which can help clear up any misconceptions you may have picked up from watching A Beautiful Mind. You also might be surprised at the amount of game theory references in pop culture.

At this point, we not only have nature demonstrating how life behaves and interacts, but a mathematical understanding that gave us a more complete picture of the principles and results.

As I've repeatedly said, game theory, and especially the prisoner's dilemma, ranges far beyond simple games. The question then becomes, how can we use the amazing powers of iteration, feedback, and change to improve our lives as we move into the future?

The answer required development of yet another new branch of mathematics, and is the topic of the next post in this series.
