Modeling and Predicting the Outcome of a Sequence of Games

Here's something I though of today that I think is pretty interesting. Consider the following toy problem:
Suppose Alice and Bob play a sequence of games, and the probability that Alice wins game $i$ is given by $p_i$. Further, assume $p_1 \sim U[0,1]$ is uniform and that $ p_{i+1} \sim Beta(\alpha p_i + \beta, \alpha (1-p_i) + \beta) $ for known $ (\alpha, \beta) $. We are unable to directly observe $p_i$, but we do observe the outcome of each game $ o_i $ for $ 1 \leq i \leq k $. The task is to estimate $ p_{k+1} $ given the evidence $ o_1, \dots, o_k $.  
The figure below shows how Alice's win probability might evolve over time.  This particular figure was generated with $ \alpha = 1000 $ and $ \beta = 5 $.

We can model this situation with a modified Hidden Markov Model.  The graph below shows how the random variables are related in the form of a directed graphical model:

There are very efficient algorithms for doing inference and learning on Hidden Markov Models, but they usually assume that the state space of the hidden nodes is discrete.  The difference here is that the transition model $ P(p_i \mid p_{i-1}) $ is a continuous probability distribution, so the standard inference and learning algorithms don't apply.  We could of course discretize the transition model to make it discrete, but I am interesting in doing the mathematically correct thing and handle the continuous probability distribution directly.  I tried working out the mathematical details but it's deceptively hard.  If you have any insights, please share them in the comments below!  I have included some code at the end of this post for you to experiment with if you want to test out your ideas.

I am also interested in the more general problem where there are more than $2$ players, and we observe a sequence of matches $ (x_i, y_i, o_i) $ where $x_i$ and $y_i$ are the players and $o_i$ is the outcome of the match.  To what degree can we predict the outcomes of future matches given this evidence?  With some modifications of the toy example, we can use this model to predict the outcomes of sporting events, which I think is pretty cool.  However, since even the simplified toy example seems challenging, it seems like a really hard thing to do.
import numpy as np

class Learner:
    def __init__(self):
        pass
    def set_params(self, alpha, beta):
        self.alpha = alpha
        self.beta = beta
    '''Update the internal model with the outcome of the current game
    winner = True if Alice won, and False otherwise'''
    def update(self, winner):
        pass
    '''Return the probability that Alice wins the next game'''
    def predict(self):
        pass

class Naive(Learner):
    def __init__(self):
        pass  
    def update(self, winner):
        pass    
    def predict(self):
        return 0.5

class WeightedAverage(Learner):   
    def __init__(self, history):
        self.index = 0
        self.history = history
        # random vector of Trues/Falses
        self.games = np.random.rand(history) < 0.5
    def update(self, winner):
        self.games[self.index] = winner
        self.index = (self.index + 1) % self.history
    def predict(self):
        ''' empirical average of most recent games, with regularization '''
        return (1.0+np.sum(self.games)) / (2.0+self.history)

def evaluate(learner, n = 1000, alpha = 100, beta = 5):
    learner.set_params(alpha, beta)
    p = np.random.rand()
    # burn in period, let the model learn something before testing it
    for i in range(100):
        winner = np.random.rand() < p
        learner.update(winner)
        p = np.random.beta(alpha*p + beta, alpha*(1-p) + beta)
    sqerr = 0.0
    loglike = 0.0
    winner = np.random.rand() < p
    for i in range(n):
        learner.update(winner)
        p = np.random.beta(alpha*p + beta, alpha*(1-p) + beta)
        q = learner.predict()
        winner = np.random.rand() < p
        sqerr += (p-q)**2
        loglike += winner*np.log(q) + (1-winner)*np.log(1-q)
    print learner.__class__
    print '  MSE: %.5f' % (sqerr/n)
    print '  LogLoss: %.5f' % (-loglike/n)

if __name__ == '__main__':
    evaluate(Naive())
    evaluate(WeightedAverage(50))

Comments

  1. This comment has been removed by the author.

    ReplyDelete
  2. Silahkan di kunjungi ya kawan-kawan 100% Memuaskan
    > Hoki anda ada di sini <
    Bandar Q Online Terpercaya dan Teraman di GUNUNGPOKER
    Link daftar : http://bandaraduq.com/Register.aspx?lang=id
    BBM : 56978317

    SEMUA GAME HANYA PAKAI 1 USER ID : Poker, Domino QQ, Capsa Susun, Adu Q, Bandar Poker,

    Segera daftarkan userid anda di GUNUNGPOKER
    Promo Terbaru dari GUNUNGPOKER
    - Minimal DEPOSIT & WITHDRAW Rp 20.000,-
    - Tersedia 7 game dalam 1 USER ID
    - BONUS Turnover 0.5%
    - BONUS Referral 20%

    UNTUK INFORMASI SELANJUTNYA BISA HUB KAMI DI :
    LIVECHAT GUNUNGPOKER 24 JAM ONLINE
    Fanspage FB : @agengunungpoker
    Pin BB : 56978317
    WA : +62812-7287-4416
    LINE : gunungpokercsr1
    WECHAT : gunungpokercsr1
    YM : gunungpokercsr1@yahoo.com
    Agen domino
    BandarQ
    Domino 99
    Domino QQ
    Agen Poker
    Bandar Poker
    Agen Judi QQ
    Judi Online
    Forum Judi Online

    hubungi kami di :
    Line : gunungpokercsr1
    Bbm : 56978317
    Wa : +6281272874416

    ReplyDelete

Post a Comment

Popular posts from this blog

Efficiently Remove Duplicate Rows from a 2D Numpy Array

Multi-Core Programming with Java

Beat the Streak: Day Three