CS2514 Lab 5

Background information

The purpose of this exercise is for you to put into practice some of the most recent things that were covered in the lectures: class hierarchies, inheritance, polymorphism, dynamic method binding, etc.

You are going to write a program that plays a simple 'game'. The 'game' is called Prisoner's Dilemma. The game seems unbelievably simple, but it has generated huge quantities of research. It's a game whose principles may have wide-ranging explanatory power. For example, some biologists believe "that many wild animals and plants are engaged in ceaseless games of Prisoner's Dilemma". (See chapter 12 of Richard Dawkins' The Selfish Gene (2nd edn.) for a clear and fascinating explanation.) The game may also explain aspects of economic, social and political behaviour.

Here's the game.

You and your opponent are prisoners, suspected of collaborating in a crime. You have no way of communicating with each other. Your gaolers offer both of you a choice of two actions: Cooperate or Defect. Here are the outcomes (in terms of points scored) of the different pairs of actions:

    What your opponent does
    Cooperate Defect
What you do Cooperate 3 0
Defect 5 1

So if you both choose the Cooperate action, you both get 3 points. If you choose the Cooperate action but your opponent chooses to Defect, then you get 0 and your opponent gets 5. And vice versa: if you Defect and your opponent Cooperates, you get 5 and s/he gets 0. If you both Defect, you both get 1 point.

There is only one rational way to play this game: always Defect. (If you think your opponent will Cooperate, then you can get a higher score by choosing Defect. If you think your opponent will Defect, the best you can do is also to Defect.) Yet, and this is the dilemma, if you had both Cooperated, each of you would have done better! If only you could have reached a trust-worthy agreement...

As the game stands, it's only mildly interesting: as we've seen, rational players must both Defect. But there's another version of the game called Iterated Prisoner's Dilemma. In this version, you simply play the game repeatedly, an indefinite number of times. "The successive rounds of the game give us the opportunity to build up trust or mistrust, to reciprocate or placate, forgive or avenge." (Dawkins, p.206)

Now strategies other than 'always Defect' might be worth playing. Consider a prisoner who plays the strategy known as 'nice_tit_for_tat'. In this strategy, the prisoner's first move is Cooperate. On subsequent bouts, this prisoner simply copies the previous move of the other prisoner.

Suppose two prisoners both playing 'nice_tit_for_tat' meet. They both start by Cooperating, and then they both copy each other's previous move. So they end up always Cooperating. And so they score n * 3, where n is the number of bouts they play. This is better than both always Defecting, which would have given them a score of n * 1.

In this exercise, we'll build a system that would allow a biologist, economist or political scientist to experiment with different strategies, with a view to seeing how and whether cooperation can emerge. (Our system is not unlike the system that was built by one of the pioneers of the field: Robert Axelrod.)

The Task

I suggest you read the entire exercise before writing any code. Think about your class hierarchy. Discuss with me or one of my helpers before proceeding, if you want.

Prisoner

A Prisoner will have a name and a total score. In each bout of the game that the prisoner plays, s/he will have to decide whether to Cooperate or Defect. There are several types of prisoner, each playing different strategies:

  • rand: a prisoner who plays this strategy chooses randomly between Cooperate and Defect, each having a probability of 0.5.
  • nice_tit_for_tat: see above.
  • nasty_tit-for_tat: as per nice_tit_for_tat but the first move is Defect.
  • soft_majo: a prisoner who plays this strategy will choose his/her opponent's majority move, i.e. if the majority of the opponent's moves so far have been Cooperates, then Cooperate is chosen; if the majority have been Defects, then Defect is chosen; if it's a tie, then Cooperate is chosen.
  • hard_majo: as per soft_majo but, in the event of a tie, Defect is chosen.

Bouts and Round Robin Tournaments

A bout is a single encounter between two prisoners. In a bout, each prisoner chooses a move. The move is scored using the table given earlier. Each prisoner must then record his/her score.

In a round robin tournament several prisoners play each other repeatedly.

Each prisoner is paired with each other prisoner (excluding itself). The pair then play a game of Iterated Prisoner's Dilemma, lasting n iterations. They then move on to play the next prisoner. (Eventually every prisoner will have gone n bouts with every other prisoner.)

Finally, you print out a league table that shows how well the different strategies have performed. Here's the league table that I got on one occasion:

nice_tit_for_tat: 2042
nasty_tit_for_tat: 1636
rand: 1682
soft_majo: 2162
hard_majo: 1587
Because one of the prisoners has a random strategy, the league table will be different each time you run the program.

Your main method

As you have seen, it is your job to decide exactly what classes your program needs. But there must be one called RoundRobinTester and this one should contain nothing except the main method, which shoud run a tournament between the five prisoners listed earlier and display a league table.

Discussion

Here's the fascinating thing. It is very very likely that the more cooperative strategies (nice_tit_for_tat and soft_majo) will beat the less cooperative strategies (nasty_tit_for_tat and hard_majo). (rand could come anywhere in the league table.)

When you think about this, it's remarkable! The rational thing to do when playing Prisoner's Dilemma is always Defect. But, when playing Iterated Prisoner's Dilemma, while you mustn't be a total sucker, a degree of selective cooperation pays off!

If you're interested enough to read more, here are a few links:

There are numerous other sites and papers on this topic. Put the topic through your search engine if you want to find out more. (Be prepared to be overwhelmed.)

Submission

Deadline: 4pm, Friday 11th March 2016.

Put all your class definitions into a directory called lab5. To submit:

Challenge exercise

Remember that challenge exercises are always optional. They do not form part of your year's work and they are not worth any marks. They are designed for those students who finish the main exercise quickly and easily, and wish to explore further. You may need to use things not covered so far in lectures.

Here are two ideas.

  1. Extend your program and experiment with it.
    • If you look at the literature, you'll find lots more strategies that you could implement. A strategy called pavlov, for example, Cooperates only if both prisoners made the same move in the previous bout. Periodic strategies, such as repeatedly playing C,C,D or C,D or D,D,C, have also been suggested. (If you implement the periodic strategies, try to do so in a general way, rather than requiring one class definition per periodic strategy.)
    • In Axelrod's original formulation, a prisoner could play against him/herself. I excluded this possibility above. Make a change to your program to handle this. It's not as easy as you think because, when a prisoner plays him/herself, we don't want him/her to receive both sets of points.
  2. There is an alternative to Round Robin Tournaments. Let's refer to the alternative as an Evolution Tournament.

    In an evolution tournament, there is an initial population of prisoners. Maybe the initial population contains m prisoners per strategy, for some positive integer m. Call this generation 0.

    The prisoners in generation 0 play each other in round robin fashion.

    At the end of this tournament, a new population of prisoners, generation 1, is created. It contains the same number of prisoners as in generation 0. But, points means offspring! So the number of nice_tit_for_tats in generation 1 will depend on the tit_for_tats' scores in generation 0; and similarly for each of the other strategies.

    Repeat this for a large number of generations. What kind of population mix do you end up with? Do the nice guys come out on top? Does the population stabilise (i.e. no changes in the make-up of successive generations)?