Data Scientist - Benjamin Tovar

100 Prisoners problem, 100 lines of code, a game theory example in R




100 Prisoners problem, 100 lines of code, a game theory example in R


was looking for some Game-Theory post in R-bloggers and found this very cool post, obviously I decided to read the post, to look at the code and implement a version of it.

As Matt Asher describes in his original post:

In this game, the warden places 100 numbers in 100 boxes, at random with equal probability that any number will be in any box. Each convict is assigned a number. One by one they enter the room with the boxes, and try to find their corresponding number. They can open up to 50 different boxes. Once they either find their number or fail, they move on to a different room and all of the boxes are returned to exactly how they were before the prisoner entered the room. The prisoners can communicate with each other before the game begins, but as soon as it starts they have no way to signal to each other. The warden is requiring that all 100 prisoners find their numbers, otherwise she will force them to listen to hundreds of hours of non-stop, loud rock musician interviews. Can they avoid this fate?

The first thing you might notice is that if every prisoner opens 50 boxes at random, they will have a 0.5 probability of finding their number. There’s actually a strategy that can guarantee all the prisoners to get their numbers. For this post, we will call that strategy, model, and the strategy with no strategy at all, we will call it random.model. The model strategy consist in that at the beginning, each prisoner will open the box based on their number, for example, the prisoner 42 will open the box at the position 42, after open it, say it contains the number 12, and given that 42 is not equal to 12, he will have to open the box at position 12. This will happen until he had opened 50 boxes or found his number. The random.model strategy consist in no strategy at all, only opening boxes at random until he had opened 50 boxes or found his number. Every time a prisoner found his number we will call this event a success. Our interest is to find the strategy that maximizes the success rate. We will run each simulation 1000 times.


Let’s take a look at the performance of each model individually with this histogram:

histograms1We can see that the random.model behaves as we were expecting assuming that by chance we have a 0.5 chance of getting our number by selecting 50 random boxes, On the other hand, the other strategy has a very different distribution, there’s a gap between 51 and 99 with 0 of success rate!, any suggestion to explain how is this possible?

Another way to plot the distribution of the success rates obtained from each model is by looking at a density plot:


This histogram is telling the same story than the previous plots, but in the same plot:


And as usual, I always left the best plot for the grand finale.


What we can say about this plot?, before going there, I will continue using Matt’s explanation:

The theoretical success rate of the model strategy is about 31%

One way to look at the distribution of numbers in boxes is to see it as a permutation of the numbers from 1 to 100. Each permutation can be partitioned into what are called cycles. A cycle works like this: pick any number in your permutation. Let’s say it’s 23. Then you look at the number the 23rd place (ie the number in the 23rd box, counting from the left). If that number is 16, you look at the number in the 16th place. If that number is 87, go open box number 87 and follow that number. Eventually, the box you open up will have the number that brings you back to where you started, completing the cycle. Different permutations have different cycles.

The key for the prisoner is that by starting with the box that is the same place from the left as his number, and by following the numbers in the boxes, the prisoner guarantees that if he is in a cycle of length less than 50, he will eventually open the box with his number in it, which would complete the cycle he began. One way to envision cycles of different lengths is to think about the extreme cases. If a particular permutation shifted every single number over one to the left (and wrapped number 1 onto the end), you would have a single cycle of length 100. Box 1 would contain number 2, box 2 number 3 and so on. On the other hand, if a permutation flipped every pair of consecutive numbers, you would have 50 cycles, each of length 2: box 1 would have number 2, box 2 would have number 1. Of course if your permutation doesn’t change anything you have 100 cycles of length 1.

As you can see from the histogram of the model strategy, you can never have between 50 and 100 winning prisoners. Anytime you have a single cycle of length greater than 50, for example 55, then all 55 prisoners who start on that cycle will fail to find their number. If no cycles are longer than 50, everyone wins.

Now, looking at the boxplot, OK we would say, using the model strategy we have like a 30% percent of chance that every one of the prisoners will get their number, but on the other hand, the random strategy guarantees that at least every prisoner will have 50% chance of getting his number, and therefore win.

So, the model strategy is like an all of nothing strategy, you can win the lottery or die trying. This is observable looking at the large standard deviation in the boxplot, on the other hand, the random.model strategy is very consistent with low dispersion values. This means that if the prisoners decides to use the model strategy, they will have to face tree possible outcomes, or everybody wins, or nobody does, or only a few does. On the other hand, with the random.model strategy, they will expect that at least 50 prisoners will win, thinking as a prisoner as a logical individual, I think this is the best strategy he can use.

Write your opinions.

Thanks to Matt Asher for the idea


You can download the code clicking here


Tags: , ,

Leave a comment

Your email address will not be published. Required fields are marked. *