Posts

Beat the Streak 2016: Looking for Help

For anybody who is interested in developing strategies to beat the streak using statistics or machine learning for the 2016 season and beyond, get in touch with me at RMcKenna21@gmail.com. The past six months or so I have been developing statistical models and designing algorithms to automate the pick selection process, and now I am looking for like-minded people to help me improve my methods. If you are interested in working together on this problem, let me know and we can start sharing ideas. I have a repository and a fairly nice python framework for predicting the most likely players to get a hit every day. However, the accuracy of my models are still ~10% lower than my target. I think if we can develop a model that correctly picks a player 83-85% of the time, then we have a pretty decent shot at winning this thing (by "pretty good" I mean like 1000 to 1 odds). To get a sense of what I've been doing to solve this problem so far, check out this paper and this...

An Alternate Formulation of Markov Decision Processes for Solving Probabilistic Games

Today I'm going to talk about one of the coolest things I know about: Markov Decision Processes (MDPs). I was formally introduced to MDPs in my undergraduate course in AI, however I had actually independently discovered something very similar when I analyzed an unfair game . In this blog post, I will generalize my formulation of MDPs which have many similarities to the traditional setup, but I believe my formulation is better suited for finding theoretically optimal strategies to games with uncertainty. The Traditional Setup In the traditional setup, there is some state space, \( S \), and an agent who moves around the state space with some degree of randomness and some degree of choice (actions). Different actions lead to different transition probabilities for the agent. These probabilities are denoted by \( P_a(s, s') \) where a is the chosen action, \( s \) is the starting state, and \( s' \) is the state transitioned to. As a side note and to solidify your unde...

Efficiently Evaluating Recursively Defined Functions

In this blog post, I will share a technique I have used in the past to evaluate linear recurrence relations efficiently using a little bit of linear algebra. As a motivating example, assume you have a sequence of numbers that you want to know more about: 1, 2, 3, 7, 24, 100, 448, 2051, 9444, 43549, 200889, 926770, 4275598, 19725314, 91002117, 419835526. Can you spot a pattern (hint: it's a 3rd order linear recurrence relation with constant term and a dependence on \( n \), the index into the sequence)? Once you figure it out, check out my other blog post on  finding recurrence relations from integer sequences to see if you came up with the same recurrence relation. By this point, you have figured out \( f(0) = 1 \), \( f(1) = 2 \), \( f(2) = 3 \), and \( f(n) = 5 f(n-1) - 2f(n-2) + f(n-3) - 2n + 1 \) for \( n \geq 3 \). Now let's say I want top compute the last \( 8 \) digits of \( f(10^{18}) \). A naive solution would be to write a for loop, but that would many orders...

A Neat Random Walk Problem

As many of the people that read this blog probably know, I enjoy inventing math problems, especially problems that are probabilistic in nature, and then working out the math to solve that problem. In this blog post, I will introduce one problem that I solved that I think is particularly elegant: You are randomly walking along the x-axis starting at \( x=0 \). When you are at \( x = n \), you move to \( x = n-1 \) with probability \( p = \frac{n}{n+1} \) and to \( x = n+1 \) with probability \( p = \frac{1}{n+1} \). What is the expected number of steps until you return to \( x = 0 \) for the first time? This problem differs from standard random walk problems because the probability of moving left and right along the x-axis depends on your current location. In this problem, there is a force that is pulling you towards \( x = 0 \), and the further away you get, the more of an effect this force has on you. Show/Hide Solution So how do we go about solving this problem? When...

Modeling Baseball At Bats

This semester in my mathematics capstone course, students had the chance to develop mathematical models to describe real world research problems.  I took this as an opportunity to research and develop probabilistic models that can be used to predict the outcome distribution of a baseball at bat.  My hope was that I could apply my findings from this research project to my side project of Beat the Streak.  I learned a lot about mathematical modeling in this class, and explored a variety of techniques for making predictions about baseball at bats.  At the end of the class, we wrote a report that introduces the models we came up with, and compares the quality of the predictions they produce.  I have made this report available here .

Evaluating the Sum of a Geometric Sequence

This is a short blog post that shows you an easy and intuitive way to derive the formula for the summation of an infinite geometric sequence. Let \( 0 \leq p < 1 \), and let \( a \) be some constant; then we wish to find the value of \( x \) such that $$ x = \sum_{k=0}^\infty a p^k $$ Writing out the first few terms of the summation, we get: $$ x = a + a p + a p^2 + a p^3 + \dots $$ Rewriting the equation by factoring out a \( p \) from every term except the first, we get: $$ x = a + p (a + a p + a p^2 + \dots) $$ Notice that the expression in parenthesis is exactly how \( x \) is defined. Replacing the expression with \( x \) leaves us with: $$ x = a + p x $$ Solving the equation for \( x \) yields $$ x = \frac{a}{1-p} $$ Just remember this simple derivation and you will never have to look up the formula for evaluating the sum of an infinite geometric sequence ever again!

Beat the Streak: Day Three

Image
In order to maximize your probability of beating the streak, you should (1) predict the probability that a batter will get a hit in a given game given the game parameters and (2) determine if it's worthwhile to risk your current streak in order to possibly improve it by 1 or 2 games. In this blog post, I outline my solution to (2). In previous blog posts, I've hinted at what I do to solve (1) and will continue that discussion in a later blog post. Motivation When your current streak is short, the optimal strategy is to pick the best player every day, regardless of how likely their probability of getting a hit is (to a certain extent). However, as your streak grows, you have an important decision to consider: is it better to pick the best player today and possibly lose your current streak, or skip picking a player and instead maintain your current streak. Naturally, this decision should be guided by the players probability of getting a hit, as well as the distribution of the...