A place where share my thoughts and ideas with the world.

Random Walk on a Complete Graph

Get link

Facebook

Twitter

Pinterest

Email

Other Apps

This is the 4th math problem in a series of math problem blog posts that I am writing about. I thought of this problem as a challenging weekly problem for math club, and I will present 2 solutions. I came up with the first solution, and a friend of mine came up with the second solution -- a completely different approach that yielded the same answer.

You are randomly walking on a complete graph with \( 2 n \) vertices. You start at a random vertex and at each time step you transition to a different connected vertex. Let \( f(n) \) be the expected number of steps until you've visited exactly \( n \) distinct vertices. Find $$ \lim_{n \to \infty} \frac{f(n)}{n} $$

For those of you who are not familiar with graph theory, here is a complete graph with 6 vertices:

In the context of this problem, you start on one of the circles (it doesn't matter which due to symmetry) and at each time step, you cross one of the lines to a new circle. Henceforth, I will refer to the circles as vertices, and the lines as edges.

First Solution

This solution relies on an important result from probability theory. Namely, The Linearity of Expectation. The Linearity of Expectation says if you have random variables \( X_1, X_2, ..., X_n \), then
$$
E(X_1 + X_2 + \dots + X_n) = E(X_1) + E(X_2) + \dots + E(X_n)
$$ where \( E(Y) \) denotes the expected value of the random variable Y. Let \( X_1 \) be the expected number of steps to reach the first new vertex, \( X_2 \) be the expected number of steps to reach the second new vertex given that you've already reached the first vertex, etc. In general \( X_i \) is the expected number of steps to reach the \( i^{th} \) vertex given that you've already visited \( (i-1) \) distinct vertices. Thus, \( f(n) \) as defined in the problem statement is given by
$$
f(n) = \sum_{i=0}^n E(X_i)
$$ Here, \( X_i \) are geometrically distributed random variables where a success is achieved when a new vertex is reached. The probability of reaching a new vertex at any given time is \( p = \frac{2n-i}{2n} \), where the numerator is the number of unreached vertices and the denominator is the total number of vertices. Technically, the \( 2n \) should be a \( 2n-1 \) since you can't transition from a vertex to itself (the graph is <i>simple</i>), but since we are concerned with the limiting behavior of \( f(n) \), I will leave it in for simplicity. The expected value of a geometric random variable is \( E(Y) = \frac{1}{p} \), where \( p \) is the probability of success. Thus, \( E(X_i) = \frac{2 n}{2 n - i} \). Therefore, \( f(n) \) is given by
$$
f(n) = \sum_{i=0}^n \frac{2 n}{2 n - i} \\
f(n) = 2 n \sum_{i=0}^n \frac{1}{2 n - i}
$$ Let's change the indices of summation to simplify the expression. Notice that the summation is given by
$$
\frac{1}{2n} + \frac{1}{2n-1} + \dots + \frac{1}{n}
$$ Thus, we can write \( f(n) \) as
$$
f(n) = 2 n \sum_{i=n}^{2n} \frac{1}{i}
$$ Notice that the sum is related to the harmonic numbers, and we can easily express it in terms of the harmonic numbers.
$$
f(n) = 2 n \Big( \sum_{i=1}^{2n} \frac{1}{i} - \sum_{i=1}^{n-1} \frac{1}{i} \Big) = 2 n (H_{2n} - H_{n-1})
$$ In the limit, the harmonic numbers converge to the natural log function. Thus,
$$
\begin{split}
lim_{n \to \infty} \frac{f(n)}{n} &= lim_{n \to \infty} \frac{2 n (H_{2n} - H_{n-1})}{n}\\
&= lim_{n \to \infty} 2 (ln(2n) - ln(n-1))\\
&= \boxed { 2 ln(2) }
\end{split}
$$

Second Solution

This solution interprets the limiting behavior of \( f(n) \) in a continuous manner instead of solving for the finite discrete case then evaluating the limit. Let \( g(x) \) be the proportion of distinct vertices that we've visited after \( x \) steps. Here, a step is not interpreted in the same way as in the discrete case. Since we are dealing directly with the proportion of visited vertices, \( x \) denotes the ratio of steps (or visited vertices) to total vertices. In the most optimistic case, \( g(x) = x \), but in general \( g(x) < x \). Here is the visual interpretation of this method.

The probability of visiting a new vertex is equal to the proportion of white area to total area, or \( 1 - g(x) \). This interpretation can be thought of as a rate of change, and so we can setup the following differential equation:
$$
g'(x) = 1 - g(x) \\
g'(x) + g(x) - 1 = 0
$$ The general solution to this differential equation is \( g(x) = c e^{-x} + 1 \). We know that \( g(0) = 0 \) and \( g'(0) = 1 \) since at \( x = 0 \), we haven't visited any vertices yet so the probability of seeing a new vertex is \( 1 \). Solving for \( c \), we see that \( c = -1 \), so \( g(x) = 1 - e^{-x} \). We are interested in when half the vertices have been visited. Thus, we want to know the value of \( x \) when \( g(x) = 0.5 \).
$$
g(x) = 1 - e^{-x} = 0.5 \\
e^{-x} = 0.5 \\
x = ln(2)
$$ Thus, the answer to the original problem is
$$
lim_{n \to \infty} \frac{f(n)}{n} = \frac{ln(2)}{0.5} = \boxed { 2 ln(2) }
$$

Conclusion

This is one of the things I find fascinating about mathematics. You can take two seemingly unrelated approaches to get the same result. The first solution relied on ideas from probability theory, whereas the second approach utilized ideas from differential equations. This just goes to show you how closely related all the fields of math are.

In this post, I will discuss my findings in terms of an optimal strategy for Farkle , which is a dice game of chance. If you are unfamiliar with the game, I encourage you to skim the linked article so that you can better understand this blog post. All of my findings are based on sound mathematical methods and a computer program was used to determine the optimal strategy. Before I begin, let me first state the value of different dice rolls I assumed while developing the strategy. Each 1: 100 points Each 5: 50 points 3 1's: 1000 points 3 2's: 200 points ... 3 6's: 600 points 4 1's: 2000 points ... 4 6's 1200 points 5 1's 3000 points ... 5 6's 1800 points Straight 1-6 1500 points 3 Pairs (e.g., 22 33 44): 750 points Everything else is considered to be a Farkle (0 points). There are many variants to Farkle, but I like to play by this set of rules. The "game state" of Farkle roll can roughly be characterized by 2 values: the curre

A few weeks ago, I was using a Markov Chain as a model for a Project Euler problem, and I learned about how to use the transition matrix to find the expected number of steps to reach a certain state. In this post, I will derive the linear system that will help answer that question, and will work out specific an example using a 1-dimensional random walk as the model. Terminology Before I begin the derivation, let me define some terms. The transition matrix is an \( n \) by \( n \) matrix that describes how you can move from one state to the next. The rows of the transition matrix must always sum to 1. States can either be transient or absorbing . An absorbing state is one that cannot be left once reached (it transitions to itself with probability 1). A transient state is a state that is not an absorbing state. In many problems, it is of general interest to compute the expected number of steps to reach an absorbing state (from some start state). Derivation Let \( p_

In a lot of the recreational math and computer science problems I solve, I try to derive a recurrence relation that describes the situation. These recurrence relations pop up pretty often in counting problems (how many ways can you tile a 2xn grid with 2x1 tiles?) In some cases, I am unable to derive a recurrence relation directly from the problem statement, but still suspect that there exists some recurrence relation. In these cases, I end up solving the problem usin g brute force methods for small input sizes and analyzing/extrapolating the pattern. However, when the relation is obscure, it becomes hard to find. In order to make this process easier in the future, I created a program that can automatically find recurrence relations for arbitrary integer sequences. To use it, simply type in the first few terms of a sequence you are interested in, and it will try to find a recurrence relation for that sequence. There are some constraints on the allowable inputs/outputs which I di

## Comments

## Post a Comment