Biased Best-of-K Rock Paper Scissors

- December 10, 2022

The topic of today's blog post is an interesting twist on the classical rock paper scissors game.

Alice and Bob agree to play rock paper scissors (best of 1). If Alice loses, she loses gracefully and accepts defeat. If Bob loses, he will insist on playing best of 3. If he loses yet again, he will insist on playing best of 5, and so on and so forth. What is Alice's probability of winning this game if she is willing to agree to Bob's request $k$ times (best of up to $K=2k+1$)?

Let's begin by working out the formula for $k=1$, which is perhaps the most realistic scenario. How much does Alice give up by agreeing to Bob's request once? The probability of Alice winning a given round is $1/2$, and same for Bob (a round is consists of a sequence of ties followed by exactly one non-tie). Here are the possible sequence of events, along with their probabilities, with A/B denoting that Alice/Bob wins the given round respectively.

B - 0.5 - Bob wins in 1 turn
AA - 0.25 - Alice wins in two turns
ABA - 0.125 - Alice wins in three turns
ABB - 0.125 - Bob wins in three turns

Looking at the probabilities above, it is clear that Bob enjoys a $62.5\%$ chance of winning this biased game, while Alice only has a $37.5\%$ chance. Pretty good odds for Bob! What if we consider $k=2$?

B - 1/2 - Bob wins in 1 turn
AAA - 1/8 - Alice wins in 3 turns
ABB - 1/8 - Bob wins in 3 turns
AABA - 1/16 - Alice wins in 4 turns
ABAA - 1/16 - Alice wins in 4 turns
AABBA - 1/32 Alice wins in 5 turns
AABBB - 1/32 Bob wins in 5 turns
ABABA - 1/32 - Alice wins in 5 turns
ABABB - 1/32 - Bob wins in 5 turns

Adding up the probabilities for sequences that Alice won, we see that her probability of winning is $31.25\%$. That is, Bob has more than a 2:1 advantage in this game. The brute force approach above worked fine for $k=1$ and $k=2$, but what if we want to know the probability of Alice winning for general $k$? The brute force approach can get very messy very quickly.

Dynamic Program

We'll begin by deriving a simple recursive program to answer this question for any $k$. For simplicity, we will consider $k$ to be a global constant for now, and let $P(a,b)$ denote the probability of Alice winning given that she has won $a$ rounds and Bob has won $b$ rounds. Our goal is to calculate $P(0,0)$, although we will define $P(a,b)$ recursively in terms of itself. For the base cases, we have $P(a, b) = 0$ if $b=a+1$ and $P(a,b) = 1$ if $a=k+1$. For the general case, we simply have $P(a,b) = 1/2 P(a+1, b) + 1/2 P(a, b+1)$. Putting it together, we have:

$$ P(a,b) = \begin{cases} 0 & b = a+1 \text{ (Bob Wins)} \\ 1 & a = k+1 \text{ (Alice Wins)} \\ \frac{1}{2} P(a+1,b) + \frac{1}{2} P(a, b+1) & otherwise\\ \end{cases} $$

We can pretty easily write a python program to implement this recurrence relation and compute the desired probabilities. Here is what that would look like:

from functools import lru_cache
from sympy import Rational

@lru_cache
def P(a=0, b=0, k=0):
  if a==k+1:
    return Rational(1,1)
  if a+1 == b:
    return Rational(0,1)
  return P(a+1,b,k)/2 + P(a,b+1,k)/2

Here is a short table of probabilities we get by running the function above for various $k$.

k	Probability Alice Wins	Float
0	1/2	0.500000
1	3/8	0.375000
2	5/16	0.312500
3	35/128	0.273438
4	63/256	0.246094
5	231/1024	0.225586
6	429/2048	0.209473
7	6435/32768	0.196381
8	12155/65536	0.185471
9	46189/262144	0.176197
10	88179/524288	0.168188
11	676039/4194304	0.161180
12	1300075/8388608	0.154981
13	5014575/33554432	0.149446
14	9694845/67108864	0.144464
15	300540195/2147483648	0.139950

Closed Form Expression

While the dynamic program is was simple to implement and provided answers very quickly for reasonable $k$, it is somehow unsatisfying because such as simple problem should have a nice closed form expression. Moreover, the dynamic program has $O(k^2)$ complexity, since it essentially fills in a $k \times k$ table. This prevents the method from scaling to $k=1,000,000$ (I would really have to question Alice's state of mind if she lets the game go that far). Looking at the table above, we could pretty easily conjecture a closed form expression assumes the form:

$$ \Pr[\text{Alice Wins}] = \frac{f(k)}{2 \cdot 4^k} $$

Under this conjecture, we can exactly calculate $f(k)$ by multiplying the reduced fraction in the table above by our conjectured un-reduced denominator. Doing so, we obtain the following sequence of numerators: $[1, 3, 10, 35, 126, 462, 1716, 6435, 24310, 92378, \dots]$. Plugging this into the online encyclopedia of integer sequences (OEIS.org), we can readily observe that the closed-form expression for this sequence is $\binom{2k+1}{k+1}$. Thus, our final closed form expression for the probability that Alice wins is

$$ Pr[\text{Alice Wins}] = \frac{\binom{2k+1}{k+1}}{2 \cdot 4^k} $$

I don't currently have a clear interpretation of why this formula is correct here, or even a proof that it is correct in the first place. Empirically, I have verified that the formula is true up to $k=100$. I believe there is probably a nice counting argument to be made here, although it's not immediately obvious to me. If you have any thoughts, please share them in the comments section below!

Search This Blog

Ryan's Repository of Random Reflections