An Interesting Game

- December 26, 2016

Today I'm going to talk about an interesting game that I thought of, and in particular I'll derive the probability that each player wins. The rules of the game are not fair, but the players' skill levels are not necessarily equal either, so I formally define what I mean and find the conditions under which these factors balance out. Without further ado, here is the problem statement:

Alice and Bob play a game consisting of an indeterminate number of rounds. In each round, Alice and Bob compete to win the round, and thus score a point. Alice is more skilled at this game and so she wins each round with probability $ p > 0.5 $ (independent of the score and other context), meaning Bob wins each round with probability $ 1 - p $. The score starts at $ 0 - 0 $ and Bob wins if he ever has more points than Alice, and Alice wins otherwise. What is the probability that Bob wins this game? For what value of $ p $ is this game fair?

While the game never ends unless Bob wins, we can still reason about it in a theoretical setting. The rest of this blog post is devoted to answering this question.
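Before diving into the math, it can help to get a feel for the game by simulating it. The sketch below is only an approximation: since the game never ends unless Bob wins, it cuts each game off after `max_rounds` rounds and counts Alice as the winner at that point. The function name, the cutoff, and the default number of trials are all my own choices for illustration, not part of the problem.

```python
import random


def simulate_bob_win_rate(p, trials=5000, max_rounds=200, seed=0):
    """Estimate the probability that Bob wins when Alice wins each round with probability p.

    Assumption: a game that Bob has not won after max_rounds rounds is
    counted as a win for Alice, so the estimate is slightly biased downward.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    bob_wins = 0
    for _ in range(trials):
        diff = 0  # Alice's points minus Bob's points
        for _ in range(max_rounds):
            if rng.random() < p:
                diff += 1  # Alice scores
            else:
                diff -= 1  # Bob scores
            if diff < 0:
                # Bob has more points than Alice, so he wins immediately
                bob_wins += 1
                break
    return bob_wins / trials
```

For example, comparing `simulate_bob_win_rate(0.6)` with `simulate_bob_win_rate(0.9)` gives a rough sense of how quickly Bob's chances shrink as Alice's edge grows.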

A natural place to start is to think about the probability of Bob winning when the current score is $ (a, b) $, then connect the probabilities between adjacent scores by defining some kind of recurrence relation. The first key observation to make is that the probability that Bob wins only depends on the score differential $ a - b $: the number of points Alice has minus the number of points Bob has. This allows us to define a recurrence relation with respect to one parameter, $ d = a - b $, rather than two, and one-parameter recurrences are much easier to reason about analytically.

If $ d = a - b = -1 $, then Bob has won by definition. Letting $ f(d) $ denote the function mapping score differentials to probabilities of Bob winning, we have $ f(-1) = 1 $. This is our base case, and now we can define a more general formula for $ f(d) $ as follows:
$$ f(d) = (1-p) \cdot f(d-1) + p \cdot f(d+1) $$
If you are familiar with recurrence relations, this formula should be pretty intuitive to you. If not, I encourage you to think carefully about how $ f $ is defined in English, then convince yourself that the mathematical definition is consistent with it. The formula holds for every $ d \geq 0 $ because Bob scores the next point with probability $ 1-p $, in which case the score differential goes down by one. Similarly, Alice scores the next point with probability $ p $, in which case the score differential goes up by one. We are interested in $ f(0) $, the probability that Bob wins when the score differential is $ 0 $, because that's the score differential when the game first starts.
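One way to sanity check the recurrence is to solve it numerically. The sketch below makes an assumption that is not part of the problem: it truncates the game at a large differential `max_diff` and pretends Bob can no longer win from there, i.e. $ f(\text{max\_diff}) = 0 $, which is only reasonable when $ p > 0.5 $. It then repeatedly applies the recurrence until the values settle. The function name and default parameters are hypothetical.

```python
def bob_win_probabilities(p, max_diff=60, sweeps=2000):
    """Approximate f(d) for d = -1, 0, ..., max_diff by iterating the recurrence.

    The returned list is offset by one: f[i] holds f(i - 1), so f[0] is the
    base case f(-1) and f[1] is f(0).
    Assumption: f(max_diff) is fixed at 0 (Bob is treated as unable to come back).
    """
    f = [0.0] * (max_diff + 2)
    f[0] = 1.0  # base case: f(-1) = 1
    for _ in range(sweeps):
        # update f(d) for d = 0 .. max_diff - 1 in place
        for i in range(1, max_diff + 1):
            f[i] = (1 - p) * f[i - 1] + p * f[i + 1]
    return f
```

Calling `bob_win_probabilities(0.75)[1]` gives a numerical estimate of $ f(0) $ for $ p = 0.75 $ that can be compared against whatever closed form you derive.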

Now that we have a recurrence relation for $ f(d) $, we'd like to find a closed form expression for it in terms of $ p $ and $ d $ so we can just plug in $ d = 0 $ to get our answer. However, using another clever insight we can get around the need for doing that (although as I'll show, it's easy to recover $f(d)$ from $ f(0) $ anyway). Using the recurrence relation defined above, we know:
$$ f(0) = (1-p) \cdot f(-1) + p \cdot f(1) = (1-p) + p \cdot f(1) $$
The second key insight is that $ f(1) = f(0)^2 $. Again I encourage you to think about this carefully to convince yourself that it is true. It holds because $ p $ doesn't depend on the score differential, so the game looks the same from every starting point. We can think of $ f(1) $ in two different ways: (1) by definition, it is the probability that Bob wins when the current score differential is $ 1 $, and (2) it is the probability that Bob succeeds twice in a row at lowering the score differential by one. (2) holds because Bob must first bring the differential down from $ 1 $ to $ 0 $, which happens with probability $ f(0) $ since it is the same task as going from $ 0 $ to $ -1 $, and then bring it down from $ 0 $ to $ -1 $, which again happens with probability $ f(0) $. The two stages are independent, so the probabilities multiply. Plugging this into the formula above, we find:
$$ f(0) = (1-p) + p \cdot f(0)^2 $$
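The same argument appears to extend to any starting differential: from $ d \geq 0 $, Bob has to lower the differential by one $ d + 1 $ times in a row, each time with probability $ f(0) $. As a sketch (this general formula is my own added observation, not a step of the derivation), that gives
$$ f(d) = f(0)^{d+1} $$
Substituting this into the recurrence and dividing by $ f(0)^d $, which is positive because $ f(0) \geq 1 - p > 0 $, leads back to the same equation for $ f(0) $:
$$ f(0)^{d+1} = (1-p) \cdot f(0)^{d} + p \cdot f(0)^{d+2} \quad \Longrightarrow \quad f(0) = (1-p) + p \cdot f(0)^2 $$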
This is just a quadratic equation that we need to solve. To make this clearer, we will let $ f(0) = x $, so we want to find the value of $ x $ such that
$$ x = (1-p) + p x^2 $$ $$-p x^2 + x + (p-1) = 0 $$
Using the quadratic formula, we find that $ x = \frac{1-p}{p} $ or $ x = 1 $. Of course $ x = 1 $ is the solution when $ p \leq 0.5 $, but $ \frac{1-p}{p} $ is the solution when $ p > 0.5 $: Alice's advantage makes the score differential drift upward, so Bob is not certain to win and $ f(0) < 1 $. Thus the answer to the first question is $ f(0) = \frac{1-p}{p} $, and we can find the answer to the second question by setting $ f(0) = \frac{1}{2} $, which gives $ 2(1-p) = p $. This shows that when $ p = \frac{2}{3} $ the game is fair for both players.
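For readers who want the intermediate steps, here is a sketch of the quadratic formula applied to $ p x^2 - x + (1-p) = 0 $, which is the equation above multiplied by $ -1 $. The discriminant turns out to be a perfect square:
$$ 1 - 4p(1-p) = 4p^2 - 4p + 1 = (2p-1)^2 $$
$$ x = \frac{1 \pm (2p-1)}{2p} \quad \Longrightarrow \quad x = 1 \ \text{ or } \ x = \frac{1-p}{p} $$
As a quick check of the fairness condition, plugging $ p = \frac{2}{3} $ into the second root gives $ \frac{1/3}{2/3} = \frac{1}{2} $.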

I hope you enjoyed this problem, and if you have other solutions, feel free to post them in the comments section below!

Ryan's Repository of Random Reflections
