MarkovChain_PythonCode
A Markov chain is used to model a series of events. Each sequence is usually composed of various events, and both the order and the length of the sequence can vary drastically from sample to sample.
Such chains of events are usually very difficult to describe with deterministic models.
For instance, we can think of a monthly subscription service where a user can choose whether they want to subscribe or not each month.
Suppose we obtain the yearly subscription pattern of a user.
User 1
Month 1: Subscribed
Month 2: Subscribed
Month 3: Subscribed
Month 4: Unsubscribed
Month 5: Unsubscribed
Month 6: Unsubscribed
Month 7: Subscribed
Month 8: Unsubscribed
Month 9: Subscribed
Month 10: Subscribed
Month 11: Unsubscribed
Month 12: Unsubscribed
In this example, a user has two types of events: Subscribed and Unsubscribed.
Let's try to model this subscription behavior using a Markov chain.
In [45]: import numpy as np
         # Number of states
         n = 2
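The cell above appears truncated in this export: the definitions of the transition matrix P and the initial distribution s used below are not shown. A minimal reconstruction is sketched here, estimating P from User 1's history by counting transitions between consecutive months. The resulting P = [[0.5, 0.5], [0.4, 0.6]] with s = [1, 0] (starting subscribed) is consistent with Out[171] and Out[172] below, but the variable names and layout are assumptions.

```python
import numpy as np

# User 1's monthly history: S = Subscribed, U = Unsubscribed
history = ["S", "S", "S", "U", "U", "U", "S", "U", "S", "S", "U", "U"]

states = ["S", "U"]
n = len(states)  # number of states

# Count transitions between consecutive months
counts = np.zeros((n, n))
for prev, curr in zip(history, history[1:]):
    counts[states.index(prev), states.index(curr)] += 1

# Row-normalize the counts to obtain the transition probability matrix
P = np.matrix(counts / counts.sum(axis=1, keepdims=True))
print(P)      # [[0.5 0.5], [0.4 0.6]]

# Initial distribution: User 1 starts out subscribed
s = np.matrix([[1.0, 0.0]])
print(s * P)  # [[0.5 0.5]]
```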
In [171]: s * P
Out[171]: matrix([[0.5, 0.5]])
In [172]: s * (P**5)
Out[172]: matrix([[0.44445, 0.55555]])
The probability of this user staying subscribed is 0.5 after 1 month and 0.44445 after 5 months.
User 3
Month 1: Subscribed
Month 2: Subscribed
Month 3: Subscribed
Month 4: Subscribed
Month 5: Subscribed
Month 6: Unsubscribed
Month 7: Unsubscribed
Month 8: Unsubscribed
Month 9: Subscribed
Month 10: Subscribed
Month 11: Subscribed
Month 12: Subscribed
Now we can use all 3 users' subscription history data to recalculate the transition probabilities.
What's more, we can now estimate the initial probability distribution as well.
Of course, all of these computations rest on the assumption that these 3 users' data are enough to obtain statistically significant results (which in reality they are not, but let's assume so for the sake of
this problem).
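The pooled estimation can be sketched as a small helper that accumulates transition counts and first-month states over any number of sequences. This is a sketch, not the notebook's original cell: only Users 1 and 3 appear in this excerpt, so the estimate below (built from those two sequences alone) will differ from the three-user numbers in Out[174]–Out[177].

```python
import numpy as np

def estimate_markov(sequences, states):
    """Pool transition counts over several sequences to estimate the
    transition matrix P and the initial state distribution."""
    n = len(states)
    idx = {st: i for i, st in enumerate(states)}
    counts = np.zeros((n, n))
    initial = np.zeros(n)
    for seq in sequences:
        initial[idx[seq[0]]] += 1  # first observation of each sequence
        for prev, curr in zip(seq, seq[1:]):
            counts[idx[prev], idx[curr]] += 1
    P = counts / counts.sum(axis=1, keepdims=True)  # row-normalize
    return np.matrix(P), np.matrix(initial / initial.sum())

# Users 1 and 3 from above (User 2's history is not shown in this excerpt)
user1 = list("SSSUUUSUSSUU")
user3 = list("SSSSSUUUSSSS")
P, s = estimate_markov([user1, user3], ["S", "U"])
```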
With the updated model, compute the probability that a user is subscribed after 5, 10, 15, and 20 months.
In [174]: s * (P**5)
Out[174]: matrix([[0.59339184, 0.40660816]])
In [175]: s * (P**10)
Out[175]: matrix([[0.59322074, 0.40677926]])
In [176]: s * (P**15)
Out[176]: matrix([[0.59322034, 0.40677966]])
In [177]: s * (P**20)
Out[177]: matrix([[0.59322034, 0.40677966]])
Do the probabilities seem to converge? Let's try to compute all of the probabilities up to the 20th month and visualize them as well.
In [28]: plt.legend()
         plt.ylim(0, 1)
Out[28]: (0.0, 1.0)
[Figure: probability of being Subscribed / Unsubscribed in each month up to month 20]
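The plotting cell is truncated in this export, so here is a minimal sketch of the visualization, assuming matplotlib. The updated three-user transition matrix is not shown in this excerpt, so the sketch uses the two-state chain estimated from User 1's history (P = [[0.5, 0.5], [0.4, 0.6]], starting subscribed); substitute your own P to reproduce the converging values above.

```python
import numpy as np
import matplotlib.pyplot as plt

P = np.matrix([[0.5, 0.5], [0.4, 0.6]])  # transition matrix (User 1's estimate)
s = np.matrix([[1.0, 0.0]])              # initial distribution: start subscribed

# Probability of being in each state after m months, for m = 1..20
months = range(1, 21)
probs = np.vstack([np.asarray(s * (P ** m)) for m in months])

plt.plot(months, probs[:, 0], label="Subscribed")
plt.plot(months, probs[:, 1], label="Unsubscribed")
plt.xlabel("Month")
plt.ylabel("Probability")
plt.legend()
plt.ylim(0, 1)
plt.show()
```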
From the plots, we can see that the probabilities are converging.
Let's formally try to compute the steady-state probabilities.
Recall that we can compute stationary probabilities via: $\pi P = \pi$ and $\sum \pi_i = 1$.
$\pi P = \pi$ can be expressed as $(P^T-I)\pi = 0$ where $\pi$ is a column vector of the steady state probabilities.
Combining with $i \cdot \pi = 1$, where $i$ is a row vector of ones (i.e., the probabilities should sum to 1), we obtain
$A\pi = b$ where $A = [(P^T-I);i]$ and $b$ is a column vector whose elements are all 0 except the last, which is 1.
This overdetermined system can be solved via the normal equations $A^T A\pi = A^T b$.
Below is the python code for solving this linear equation.
In [51]: import pandas as pd
         from random import random

         # Append a row of ones of length n: for n = 3, append [[1, 1, 1]] instead of [[1, 1]]
         A = np.append(np.transpose(P) - np.identity(n), [[1, 1]], axis=0)  # axis=0: append as a new row
         # b has n zeros followed by a 1: for n = 3, use [[0, 0, 0, 1]] instead of [[0, 0, 1]]
         b = np.transpose(np.array([[0, 0, 1]]))
         np.linalg.solve(np.transpose(A).dot(A), np.transpose(A).dot(b))
Out[51]: matrix([[0.44444444],
                 [0.55555556]])
You can check that the steady-state probabilities match the converging numbers identified in the figure above.
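As a cross-check on the normal-equations solution, the stationary distribution is also the left eigenvector of $P$ associated with eigenvalue 1, i.e., the eigenvector of $P^T$ for eigenvalue 1, rescaled to sum to 1. A minimal sketch, using the 2×2 transition matrix implied by Out[51]:

```python
import numpy as np

P = np.array([[0.5, 0.5], [0.4, 0.6]])  # transition matrix

# Eigen-decompose P.T and pick the eigenvector whose eigenvalue is closest to 1
vals, vecs = np.linalg.eig(P.T)
i = np.argmin(np.abs(vals - 1))
pi = np.real(vecs[:, i])
pi = pi / pi.sum()  # rescale so the probabilities sum to 1
print(pi)  # [0.44444444 0.55555556]
```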
Assignment
Formulate a problem setting where you can model the situation as a Markov chain.
You should keep in mind that the Markov property should hold true. You don't have to formally prove the Markov property, but you should be able to give reasons why it holds
(qualitative reasoning is enough).
You are free to find real world data or generate hypothetical data.
Dataset size should consist of at least 10 samples (in the above example, it consisted of 3 user history samples), and the sequence length of each sample should be at least 50.
State space should consist of at least 3 states (in the above example, state space consisted of 2 states $S = \{S, U\}$).
Using the data, you need to go through the same steps as the provided example:
Step 1. Explain the problem setting. Provide and explain the dataset.
Step 2. Define the state and check Markov chain assumptions.
Step 3. Calculate the transition probabilities.
Step 4. Predict the state in n steps (e.g., what is the probability of being in a certain state after 10 transitions?)
Step 5. Visualize the probability of being in each state up to the 20th transition. Are the probabilities converging?
Step 6. If converging, check the steady-state existence conditions and compute the steady-state probabilities. If not converging, explain why.