Posts

Showing posts with the label probability

Biased Best-of-K Rock Paper Scissors

The topic of today's blog post is an interesting twist on the classic rock paper scissors game. Alice and Bob agree to play rock paper scissors (best of 1). If Alice loses, she loses gracefully and accepts defeat. If Bob loses, he will insist on playing best of 3. If he loses yet again, he will insist on playing best of 5, and so on and so forth. What is Alice's probability of winning this game if she is willing to agree to Bob's request $k$ times (best of up to $K=2k+1$)? Let's begin by working out the formula for $k=1$, which is perhaps the most realistic scenario. How much does Alice give up by agreeing to Bob's request once? The probability of Alice winning a given round is $1/2$, and the same for Bob (a round consists of a sequence of ties followed by exactly one non-tie). Here are the possible sequences of events, along with their probabilities, with A/B denoting that Alice/Bob wins the given round respectively. B - 0.5 - Bob wins in 1 turn AA -...
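
As a quick sanity check on this setup, here is a minimal Monte Carlo sketch (my own, not code from the post). It assumes, based on the "AA" case listed above, that round wins carry over when Bob escalates to the next best-of:

```python
import random

def alice_wins(k, rng):
    """Alice vs. Bob, escalating from best-of-1 up to best-of-(2k+1).
    Each round is a fair coin flip, since ties are simply replayed."""
    a = b = 0
    t = 1                    # round wins needed to take the current best-of
    while True:
        if rng.random() < 0.5:
            a += 1
        else:
            b += 1
        if b == t:
            return False     # Bob takes the current best-of and stops asking
        if a == t:
            if t == k + 1:
                return True  # Alice wins best-of-(2k+1); Bob is out of requests
            t += 1           # Bob insists on the next best-of; wins carry over

rng = random.Random(0)
trials = 200_000
print(sum(alice_wins(1, rng) for _ in range(trials)) / trials)  # ~0.375 for k = 1
```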

An Interesting Game

Today I'm going to talk about an interesting game that I thought of, and in particular I'll derive the probability that each player wins. While the rules of the game are not fair, the skill levels of the players are not necessarily equal either, so I formally define what I mean and find the conditions under which these factors balance out. Without further ado, here is the problem statement: Alice and Bob play a game consisting of an indeterminate number of rounds. In each round, Alice and Bob compete to win the round, and thus score a point. Alice is more skilled at this game, so she wins each round with probability $ p > 0.5 $ (independent of the score and other context), meaning Bob wins each round with probability $ 1 - p $. The score starts at $ 0 - 0 $ and Bob wins if he ever has more points than Alice; Alice wins otherwise. What is the probability that Bob wins this game? For what value of $ p $ is this game fair? W...
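
The excerpt cuts off before the answer, but both questions are easy to explore empirically. A minimal sketch of my own (not the post's solution); since the walk drifts toward Alice, runs where Bob falls far behind are cut off early:

```python
import random

def bob_ever_leads(p, rng):
    """One simulated game: Bob wins iff his score ever exceeds Alice's.
    Once Bob is 60 points behind, his comeback chance is negligible."""
    lead = 0                              # Bob's score minus Alice's
    while lead > -60:
        lead += -1 if rng.random() < p else 1
        if lead > 0:
            return True
    return False

rng = random.Random(1)
p = 2 / 3                                 # a candidate "fair" skill level to test
trials = 100_000
print(sum(bob_ever_leads(p, rng) for _ in range(trials)) / trials)  # ~0.5 here
```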

Math Puzzle: Rich Get Richer?

Today I'm going to talk about a math problem whose answer is simultaneously simple and unintuitive. Here is the problem statement: You have an urn that holds $1$ red ball and $1$ blue ball. You repeatedly remove one ball from the urn at random, then put two balls of that color back into the urn. How does this system behave over time? Specifically, what is the probability that at some point the urn will contain exactly $r$ red balls and $b$ blue balls? If you want to see my solution, go ahead and expand the solution tab. Otherwise, you can try to figure it out for yourself. Also, if you have other interesting questions you want to ask about this system, feel free to post them in the comments below. Show/Hide Solution Intuitively, this seems like a "rich get richer" problem, in that the long term behavior of the system depends on how "lucky" we are in the early stages (i.e., how many red and blue balls we pick). For example, if we pick m...
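
Before expanding the solution tab, the system is easy to probe empirically. Here is a minimal sketch of my own (not the post's code) that tabulates the urn's composition after a fixed number of draws; the empirical frequencies hint at the surprisingly simple pattern:

```python
import random
from collections import Counter

def urn_state_after(draws, rng):
    """Run the urn process: draw a ball, return it plus one of the same color."""
    r, b = 1, 1
    for _ in range(draws):
        if rng.random() < r / (r + b):
            r += 1
        else:
            b += 1
    return r, b

rng = random.Random(2)
draws = 3      # the urn then holds 5 balls, so (r, b) ranges over (1,4)..(4,1)
counts = Counter(urn_state_after(draws, rng) for _ in range(100_000))
for state in sorted(counts):
    print(state, counts[state] / 100_000)   # each state appears ~1/4 of the time
```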

Challenge Problem: Taking a Quiz

I thought of this problem about two years ago and was going to send it out as one of the weekly challenge problems for the ACM student chapter at UD; however, I was unable to solve it when I first thought of it, so I put it on the back-burner for a while. Recently, I revisited it and made some progress, but I was still unable to find a strategy that solves it at all, let alone efficiently. I am posting the problem to this blog in the hopes that you can help me figure it out by posting your ideas and code in the comments below. Anyway, here is the problem: You are taking a quiz with $n$ true/false questions that you didn't study for, but you are allowed to retake the quiz as many times as you want. Every time you take the quiz, you are told how many questions you got correct, but not which questions were right and wrong. Your final score is the number of answers you got correct on your last submission minus the number of times you retook the quiz. ...
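
To make strategy ideas concrete (and easy to share in the comments), here is a small scoring harness of my own; play_quiz, the strategy callback, and the history format are illustrative inventions, not part of the problem statement:

```python
import random

def play_quiz(n, strategy, rng=random.Random(3)):
    """Score a strategy against a random answer key.  strategy(n, history)
    returns the next guess (a list of n booleans) or None to stop; history
    is a list of (guess, num_correct) pairs.  The final score is the last
    submission's correct count minus the number of retakes."""
    key = [rng.random() < 0.5 for _ in range(n)]
    history = []
    while True:
        guess = strategy(n, history)
        if guess is None:
            break
        history.append((guess, sum(g == k for g, k in zip(guess, key))))
    return history[-1][1] - (len(history) - 1)

# Baseline: submit one uniformly random sheet -- expected score n/2.
one_shot = lambda n, h: [random.random() < 0.5 for _ in range(n)] if not h else None
print(sum(play_quiz(20, one_shot) for _ in range(10_000)) / 10_000)   # ~10
```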

Improving Naive Bayes with MLE

Today I'm going to be talking about the probabilistic classifier known as Naive Bayes, and a recent idea I came up with to improve it. My idea relies on the same assumptions that Naive Bayes does, but it finds different values for the conditional probabilities and class probabilities that describe the data better (or so I originally thought). In this blog post, I am going to quickly go through the traditional Naive Bayes setup, introduce my idea, then compare the two in terms of prediction quality. The Traditional Setup Let's assume we have a data set with \( N \) observations, where each observation has \( n \) attributes \( x_1, x_2, \dots, x_n \) and one class value \( C_k \). Naive Bayes says that we can compute the probability of observing \( C_k \) given the attribute information by evaluating the following formula: $$ P(C_k | x_1, \dots, x_n) = \frac{ P(C_k) \prod_{i=1}^n P(x_i \mid C_k) }{P(x_1, \dots, x_n)} $$ where $$ P(x_1...
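
For reference, here is a minimal sketch of this traditional setup in code (my own illustration; the post's MLE variant isn't shown in the excerpt). Frequency counts stand in for the probability estimates, and the shared denominator \( P(x_1, \dots, x_n) \) is dropped since it doesn't affect which class wins:

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Empirical-frequency estimates of P(C_k) and P(x_i | C_k) for
    discrete attributes (no smoothing, to keep the sketch short)."""
    class_counts = Counter(y)
    cond = defaultdict(Counter)        # (class, attribute index) -> value counts
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            cond[(c, i)][v] += 1
    priors = {c: m / len(y) for c, m in class_counts.items()}

    def predict(row):
        def score(c):
            p = priors[c]
            for i, v in enumerate(row):
                p *= cond[(c, i)][v] / class_counts[c]
            return p
        return max(priors, key=score)

    return predict

predict = train_nb([(1, 0), (1, 1), (0, 1)], ["a", "a", "b"])
print(predict((1, 1)))                 # -> "a"
```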

An Alternate Formulation of Markov Decision Processes for Solving Probabilistic Games

Today I'm going to talk about one of the coolest things I know about: Markov Decision Processes (MDPs). I was formally introduced to MDPs in my undergraduate course in AI; however, I had actually independently discovered something very similar when I analyzed an unfair game. In this blog post, I will generalize my formulation of MDPs, which has many similarities to the traditional setup, but I believe my formulation is better suited for finding theoretically optimal strategies to games with uncertainty. The Traditional Setup In the traditional setup, there is some state space, \( S \), and an agent who moves around the state space with some degree of randomness and some degree of choice (actions). Different actions lead to different transition probabilities for the agent. These probabilities are denoted by \( P_a(s, s') \), where \( a \) is the chosen action, \( s \) is the starting state, and \( s' \) is the state transitioned to. As a side note and to solidify your unde...
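
To make the traditional setup concrete, here is a sketch of value iteration over that formulation (my own illustration, with a made-up two-state example; the post's alternate formulation differs in ways the excerpt doesn't show):

```python
def value_iteration(states, actions, P, R, gamma=0.95, tol=1e-9):
    """P[a][s][t] is the transition probability P_a(s, t); R[s] is the
    reward for being in state s.  Iterates the Bellman optimality update
    until the value function stops changing."""
    V = {s: 0.0 for s in states}
    while True:
        V_new = {s: R[s] + gamma * max(sum(P[a][s][t] * V[t] for t in states)
                                       for a in actions)
                 for s in states}
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new

states, actions = ["s0", "s1"], ["stay", "jump"]
P = {"stay": {"s0": {"s0": 1.0, "s1": 0.0}, "s1": {"s0": 0.0, "s1": 1.0}},
     "jump": {"s0": {"s0": 0.1, "s1": 0.9}, "s1": {"s0": 0.9, "s1": 0.1}}}
R = {"s0": 0.0, "s1": 1.0}
print(value_iteration(states, actions, P, R))
```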

A Neat Random Walk Problem

As many of the people that read this blog probably know, I enjoy inventing math problems, especially problems that are probabilistic in nature, and then working out the math to solve them. In this blog post, I will introduce one problem that I solved that I think is particularly elegant: You are randomly walking along the x-axis starting at \( x=0 \). When you are at \( x = n \), you move to \( x = n-1 \) with probability \( \frac{n}{n+1} \) and to \( x = n+1 \) with probability \( \frac{1}{n+1} \). What is the expected number of steps until you return to \( x = 0 \) for the first time? This problem differs from standard random walk problems because the probability of moving left or right along the x-axis depends on your current location. In this problem, there is a force pulling you towards \( x = 0 \), and the further away you get, the stronger its effect on you. Show/Hide Solution So how do we go about solving this problem? When...
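
Before opening the solution, the answer is easy to estimate empirically. A minimal simulation sketch (mine, not the post's):

```python
import random

def return_time(rng):
    """Steps until the walk first returns to x = 0.  From x = 0 the walk
    moves to x = 1 with probability 1/(0+1) = 1, so the first step is forced."""
    x, steps = 1, 1
    while x != 0:
        # at x = n: left with probability n/(n+1), right with probability 1/(n+1)
        x += -1 if rng.random() < x / (x + 1) else 1
        steps += 1
    return steps

rng = random.Random(4)
trials = 100_000
print(sum(return_time(rng) for _ in range(trials)) / trials)  # empirical mean
```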

Beat the Streak: Day Three

In order to maximize your probability of beating the streak, you should (1) predict the probability that a batter will get a hit in a given game given the game parameters and (2) determine whether it's worthwhile to risk your current streak in order to possibly improve it by 1 or 2 games. In this blog post, I outline my solution to (2). In previous blog posts, I've hinted at what I do to solve (1) and will continue that discussion in a later blog post. Motivation When your current streak is short, the optimal strategy is to pick the best player every day, (almost) regardless of his probability of getting a hit. However, as your streak grows, you have an important decision to consider: is it better to pick the best player today and possibly lose your current streak, or to skip picking a player and instead maintain your current streak? Naturally, this decision should be guided by the player's probability of getting a hit, as well as the distribution of the...
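
To illustrate the flavor of decision (2), here is a toy dynamic program of my own; the uniform range of daily hit probabilities, the 183-day horizon, and the 57-game target are placeholder assumptions for illustration, not the post's actual model:

```python
from functools import lru_cache

# Toy model: each day, the best available player's hit probability is one of
# eleven equally likely values between 0.70 and 0.85.  Compute the optimal
# pick/skip policy for reaching a 57-game streak within the horizon.
PS = [0.70 + 0.015 * i for i in range(11)]

@lru_cache(maxsize=None)
def v(streak, days_left):
    """Probability of reaching 57 from `streak` with `days_left` days,
    averaged over the not-yet-seen daily hit probabilities."""
    if streak >= 57:
        return 1.0
    if days_left == 0:
        return 0.0
    total = 0.0
    for p in PS:     # observe today's p, then act optimally
        pick = p * v(streak + 1, days_left - 1) + (1 - p) * v(0, days_left - 1)
        skip = v(streak, days_left - 1)
        total += max(pick, skip)
    return total / len(PS)

print(v(0, 183))     # chance of beating the streak in one season of this toy model
```

In this toy model, skipping becomes optimal exactly when today's hit probability is at the low end and the current streak is long, which is the trade-off described above.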

Interpolation Search Explained

If you have taken an introductory computer science course, you've probably seen the binary search algorithm - an algorithm to efficiently find the index of an item in a sorted array, if it exists. You might not have heard of interpolation search, however. Interpolation search is an alternative to binary search that utilizes information about the underlying distribution of the data to be searched. By using this additional information, interpolation search can be as fast as $O(\log \log n)$, where $n$ is the size of the array. In this post I am going to talk about two implementations of interpolation search. First, I will discuss the algorithm as described on Wikipedia, then I will describe my own version of the algorithm that has a better worst case time complexity than the first and performs better in practice than the traditional binary search and interpolation search algorithms. The Basic Idea Interpolation search models how humans search a dictionary better than a binary search, ...
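
For readers who haven't seen it, here is a sketch of the first, Wikipedia-style version (my own code; the improved variant discussed later in the post is not reproduced here):

```python
def interpolation_search(arr, target):
    """Search a sorted list of numbers by probing where the target 'should'
    be if the values are roughly uniformly distributed, instead of always
    probing the middle as binary search does."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= target <= arr[hi]:
        if arr[hi] == arr[lo]:          # all remaining values equal; avoid /0
            break
        # linearly interpolate the probe position between the endpoints
        mid = lo + (target - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return lo if lo <= hi and arr[lo] == target else -1

print(interpolation_search(list(range(0, 100, 5)), 35))   # -> 7
```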

Generating Random Numbers from Arbitrary Distributions

Generating random numbers is an important aspect of programming, and just about every programming language offers some source of (pseudo) randomness, usually in the form of a uniform distribution. For example, Java has a built-in function, Math.random(), that returns a random floating point number in the range $[0,1)$. In some instances, we would like to have random numbers coming from a different distribution, but most programming languages don't offer this functionality (except very specialized languages such as R). In the rest of this blog post, I am going to explain how you can use a uniformly distributed random number to generate a number from some other distribution. Special Case: The Normal Distribution If you would like to get numbers from a binomial distribution, or are willing to accept an approximation to the normal distribution in favor of simplicity, then you may want to use this technique. If you add together some num...
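
The excerpt is cut off mid-technique, but the setup it describes (adding together several uniform draws) is the classic central-limit trick. Here is a sketch under that reading; the choice of 12 samples is the usual convention, not necessarily the post's:

```python
import random

rng = random.Random(5)

def approx_standard_normal():
    """The sum of 12 Uniform(0,1) draws has mean 6 and variance 12 * (1/12) = 1,
    so subtracting 6 yields an approximately standard normal sample."""
    return sum(rng.random() for _ in range(12)) - 6

samples = [approx_standard_normal() for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)   # close to 0 and 1 respectively
```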

Triangles Containing the Origin

You may have found this site by searching for information regarding Project Euler Problem 184. This is not a solution to that problem. In fact, it really bothers me when I find people who post their solutions to these problems online, especially the higher level ones. In this blog post, I talk about a simple problem from probability that was motivated by this Project Euler problem, but the solution to this problem is not likely to help you solve that one. Pick three points uniformly at random along the circumference of a unit circle centered at the origin. What is the probability that the triangle connecting these three points contains the origin? Like I said, this problem is not as difficult as the problems that I usually write about, but I decided to write a blog post about it for two main reasons: (1) I wanted to create the animation/simulation for this problem, and (2) I eventually want to extend this problem to explore all convex polygons instead of just triangles (which is a more diff...
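
Here is a Monte Carlo sketch of the problem (my own code, separate from the post's animation). The containment test checks whether the three cross products of consecutive vertex pairs share a sign:

```python
import math
import random

def triangle_contains_origin(rng):
    """Sample three points on the unit circle; the origin lies inside the
    triangle iff A x B, B x C, and C x A all have the same sign."""
    (ax, ay), (bx, by), (cx, cy) = [
        (math.cos(t), math.sin(t))
        for t in (rng.uniform(0, 2 * math.pi) for _ in range(3))
    ]
    d1 = ax * by - ay * bx
    d2 = bx * cy - by * cx
    d3 = cx * ay - cy * ax
    return (d1 > 0) == (d2 > 0) == (d3 > 0)

rng = random.Random(6)
trials = 1_000_000
print(sum(triangle_contains_origin(rng) for _ in range(trials)) / trials)
```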

Mathematics and Darts!

In this blog post I am going to talk about one of my favorite math problems. When I first solved this problem, I developed a method that can be generalized to a lot of other similar problems. The problem is as follows: Alice and Bob take turns throwing darts at a dartboard (with Alice going first). At each turn, Alice tries to get her dart closer to the center than Bob's last throw, and vice versa. The game ends when one of the players fails to get their dart closer than the other player's previous throw. Alice and Bob are equally skilled at throwing darts and they are guaranteed to hit the dart board. What is the probability that Alice wins? To help you visualize this problem, I created a simple program in Racket (then compiled it to JavaScript using whalesong). You can visualize the game being played by pressing the Right Arrow Key (make sure the frame is in focus). You can speed up the simulation by pressing the Space Bar (you can pause the simulation by pressing it again). ...
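
Since the players are equally skilled and always hit the board, each throw's distance from the center can be modeled as an independent Uniform(0,1) draw (only the relative order of the throws matters). Here is a Monte Carlo sketch of my own, separate from the Racket visualization:

```python
import random

def alice_wins_darts(rng):
    """Alice throws on even turns (0, 2, ...), Bob on odd turns.  The game
    ends when the thrower fails to beat the previous throw; the sitting
    player wins.  Alice's opening throw has nothing to beat."""
    best = rng.random()
    turn = 1                      # Bob throws next
    while True:
        d = rng.random()
        if d >= best:             # thrower failed
            return turn % 2 == 1  # Bob failed on an odd turn -> Alice wins
        best = d
        turn += 1

rng = random.Random(7)
trials = 1_000_000
print(sum(alice_wins_darts(rng) for _ in range(trials)) / trials)
```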

Random Walk on a Complete Graph

This is the 4th math problem in a series of math problem blog posts that I am writing. I thought of this problem as a challenging weekly problem for math club, and I will present 2 solutions. I came up with the first solution, and a friend of mine came up with the second solution -- a completely different approach that yielded the same answer. You are randomly walking on a complete graph with \( 2n \) vertices. You start at a random vertex and at each time step you transition to a different, adjacent vertex. Let \( f(n) \) be the expected number of steps until you've visited exactly \( n \) distinct vertices. Find $$ \lim_{n \to \infty} \frac{f(n)}{n} $$ For those of you who are not familiar with graph theory, here is a complete graph with 6 vertices: In the context of this problem, you start on one of the circles (it doesn't matter which, due to symmetry) and at each time step, you cross one of the lines to a new circle. Henceforth, I will refer to the ...
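
A quick simulation (mine, not either of the two solutions presented in the post) makes the limit easy to guess before reading on:

```python
import random

def steps_to_visit(n, rng):
    """Walk on the complete graph with 2n vertices; count steps until
    exactly n distinct vertices have been visited (the start counts)."""
    current, visited, steps = 0, {0}, 0
    while len(visited) < n:
        nxt = rng.randrange(2 * n - 1)      # uniform over the OTHER vertices
        current = nxt if nxt < current else nxt + 1
        visited.add(current)
        steps += 1
    return steps

rng = random.Random(8)
for n in (10, 100, 1000):
    trials = 2000
    est = sum(steps_to_visit(n, rng) for _ in range(trials)) / trials
    print(n, est / n)        # the ratio f(n)/n settles toward the limit
```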

A Challenge for my Readers!

In this blog post, I present a very interesting math problem that I came up with while pondering random walk problems. You are walking along the x-axis and you start at $x=0$. At each time step you move to $x+1$ with probability $ \frac{1}{1+x} $ and to $ x-1 $ with probability $ \frac{x}{x+1} $. What is the expected number of steps until you return to $ 0 $ for the first time? What makes this problem harder than a typical random walk problem is that the probability of moving to $x+1$ vs. $x-1$ is not fixed -- it changes based on your position. Because of this, the typical counting methods for solving problems of this type do not apply. There is a clever and fascinating solution that doesn't require too much advanced knowledge, but I will leave it to you to think about this problem.

A Fun Puzzle from Probability

Here is a problem that came up a few days ago and that I'm still thinking about: There is an amoeba in a box. At each time step, it either splits into 2, splits into 3, stays the same, or dies - all with equal probability. What is the probability that eventually all the amoebae die out? Show/Hide Solution Solution One can quickly see that the probability is at least $\frac{1}{4}$, because there is a $\frac{1}{4}$ chance of dying on the first time step. Similarly, the amoeba could die on the first time step ($\frac{1}{4}$), stay the same and then die ($\frac{1}{4^2}$), etc. Thus, the probability of eventual extinction is at least $\sum_{n=1}^{\infty} \frac{1}{4^n}$. Unfortunately, this strategy leads to a dead end. Instead, I use a clever interpretation of the problem that leads to a very simple and elegant solution. My key insight came from seeing an infinite continued fraction problem, which uses a simple technique to define the value of the continued fraction in terms of itself. T...
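
If you want to check a candidate answer numerically: treating each offspring's entire line as an independent copy of the whole process gives a self-consistency equation for the extinction probability $q$ (quite possibly the same define-it-in-terms-of-itself trick the solution alludes to), which fixed-point iteration solves:

```python
def extinction_probability(tol=1e-12):
    """Each amoeba leaves 0, 1, 2, or 3 offspring with probability 1/4 each,
    and each offspring's line dies out independently with probability q, so
    q = (1 + q + q**2 + q**3) / 4.  Iterating from 0 converges to the
    smallest root in [0, 1], which is the extinction probability."""
    q = 0.0
    while True:
        q_next = (1 + q + q**2 + q**3) / 4
        if abs(q_next - q) < tol:
            return q_next
        q = q_next

print(extinction_probability())   # ~0.4142
```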

Simplifying the Expected Value Formula

Probability theory is by far my favorite field of mathematics. I am particularly interested in expected value problems over a discretized domain. I often find myself thinking up and solving expected value problems that may or may not have any actual relevance. After working on one such problem, I derived a very interesting and useful identity for computing the answer to expected value problems more easily in certain situations. My derivation is easiest to understand when the sample space is the set of natural numbers (i.e., $1, 2, 3, \dots$). In that case, the expected value is given by this formula: $$ \mathbf{E}(X) = \sum_{k=1}^{\infty} k \cdot P(X = k) = P(X = 1) + 2 \cdot P(X = 2) + 3 \cdot P(X = 3) + \dots $$ In some cases, this summation may be difficult to evaluate directly. However, if you represent the sum in a special way, a simple but powerful simp...
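
The excerpt stops just short of the identity, but the rearrangement it sets up is, in all likelihood, the standard tail-sum trick: write each $k$ as a sum of $k$ ones and swap the order of summation, $$ \mathbf{E}(X) = \sum_{k=1}^{\infty} \sum_{j=1}^{k} P(X = k) = \sum_{j=1}^{\infty} \sum_{k=j}^{\infty} P(X = k) = \sum_{j=1}^{\infty} P(X \ge j), $$ so the expected value can be computed from the tail probabilities $P(X \ge j)$, which are often easier to write down than the individual terms $k \cdot P(X = k)$.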