
Hidden Markov Models (HMMs)

Dhiraj
DSG-MVL

The future is independent of the past, given the present.
Markov models are used in an extraordinarily large number of applications involving temporal or sequential data, e.g., weather, finance, language, music; they deal with how the world evolves over time.

Andrei Andreyevich Markov
(1856-1922)

MARKOV CHAINS

Markov Chain: Auto Insurance Example

[Several slides of worked diagrams for the auto insurance example, including a "Generics" slide; figures not reproduced here.]

The power of a Markov chain is that it allows us to project many, many steps into the future.

Markov Chain: Free Throw Confidence

[Several slides of diagrams for the free-throw confidence example, its transitions, and the resulting transition matrix; figures not reproduced here.]

TRANSITION DIAGRAM: EXAMPLES 1, 2 AND 3

[Three example transition diagrams; Example 3 illustrates relative probability. Figures not reproduced here.]

MARKOV CHAIN

States in a Markov chain may be:
Transient/ephemeral
Recurrent
Absorbing

[Transition matrix layout: rows index the current states, columns the states being gone to.]

System Behavior

[Slides tracing a classroom example over time: starting from an initial state vector (the "arriving" state), the distribution over the states Playing on Phone, Paying Attention, Writing Notes, Listening, and Kicked Out is shown after one, two, n, and 100 time units. Figures not reproduced here.]

Markov Model

A Markov model is a type of stochastic process, sometimes referred to as a chain.
The model is similar to a finite state machine (FSM), except that it is executed by probabilistic moves rather than deterministic moves.
It is nondeterministic, where the FSM is deterministic.

Markov Models

A discrete (finite) system:
N distinct states.
Begins (at time t = 1) in some initial state(s).
At each time step (t = 1, 2, ...) the system moves from the current to the next state according to the transition probabilities associated with the current state.
This kind of system is called a finite, or discrete, Markov model.

Markov Models

Set of states: $\{s_1, s_2, \ldots, s_N\}$
The process moves from one state to another, generating a sequence of states $s_{i_1}, s_{i_2}, \ldots, s_{i_k}, \ldots$
Markov chain property: the probability of each subsequent state depends only on the previous state:

$$P(s_{i_k} \mid s_{i_1}, s_{i_2}, \ldots, s_{i_{k-1}}) = P(s_{i_k} \mid s_{i_{k-1}})$$

To define a Markov model, the following probabilities have to be specified: transition probabilities $a_{ij} = P(s_i \mid s_j)$ and initial probabilities $\pi_i = P(s_i)$.
The output of the process is the set of states at each instant of time.
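One step the slides leave implicit: applying the chain property repeatedly, the probability of a whole state sequence factorizes into the initial probability times a product of one-step transition probabilities (using the slide's convention $a_{ij} = P(s_i \mid s_j)$):

$$P(s_{i_1}, s_{i_2}, \ldots, s_{i_k}) = P(s_{i_1}) \prod_{m=2}^{k} P(s_{i_m} \mid s_{i_{m-1}}) = \pi_{i_1} \prod_{m=2}^{k} a_{i_m\, i_{m-1}}$$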

Markov Property

Markov Property: the state of the system at time t+1 depends only on the state of the system at time t.

$$P[X_{t+1} = x_{t+1} \mid X_t = x_t, X_{t-1} = x_{t-1}, \ldots, X_1 = x_1, X_0 = x_0] = P[X_{t+1} = x_{t+1} \mid X_t = x_t]$$

[Diagram: a chain of nodes X1, X2, X3, X4, X5, each depending only on its predecessor.]

A Markov System

Has N states, called s1, s2 ... sN.
There are discrete timesteps, t=0, t=1, ...
On the t-th timestep the system is in exactly one of the available states. Call it qt. Note: qt ∈ {s1, s2 ... sN}.
Between each timestep, the next state is chosen randomly.
The current state determines the probability distribution for the next state.
Transitions are often notated with arcs between states.

In the running example, N=3; at t=0 the state is q0=s3, and at t=1 it is q1=s2. The transition probabilities are:

P(qt+1=s1|qt=s1) = 0    P(qt+1=s2|qt=s1) = 0    P(qt+1=s3|qt=s1) = 1
P(qt+1=s1|qt=s2) = 1/2  P(qt+1=s2|qt=s2) = 1/2  P(qt+1=s3|qt=s2) = 0
P(qt+1=s1|qt=s3) = 1/3  P(qt+1=s2|qt=s3) = 2/3  P(qt+1=s3|qt=s3) = 0
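A minimal simulation sketch of this three-state system (illustrative code, not from the slides); the rows of P are exactly the probabilities listed above:

```python
import random

states = ["s1", "s2", "s3"]

# P[i][j] = P(q_{t+1} = states[j] | q_t = states[i]); each row sums to 1
P = [
    [0.0, 0.0, 1.0],    # from s1: always move to s3
    [0.5, 0.5, 0.0],    # from s2: s1 or s2, each with probability 1/2
    [1/3, 2/3, 0.0],    # from s3: s1 with prob 1/3, s2 with prob 2/3
]

def simulate(start: int, steps: int) -> list:
    """Walk the chain for `steps` timesteps, starting from state index `start`."""
    q = start
    path = [states[q]]
    for _ in range(steps):
        # the current state's row of P is the distribution over the next state
        q = random.choices(range(len(states)), weights=P[q])[0]
        path.append(states[q])
    return path

print(simulate(start=2, steps=10))  # e.g. ['s3', 's2', 's1', 's3', ...]
```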

Markov Property

qt+1 is conditionally independent of {qt-1, qt-2, ..., q1, q0} given qt.
In other words:
P(qt+1 = sj | qt = si) = P(qt+1 = sj | qt = si, any earlier history)

[Diagram: the same three-state system as above, with its transition probabilities drawn as arcs.]

Hidden Markov Models
(probabilistic finite state automata)

Often we face scenarios where states cannot be directly observed.
We need an extension: Hidden Markov Models.

[Diagram: a four-state model with transition probabilities a11, a12, a22, a23, a33, a34, a44 between states, and emission probabilities b11, b12, b13, b14 linking states to the observed phenomenon.]

aij are state transition probabilities.
bik are observation (output) probabilities.
b11 + b12 + b13 + b14 = 1,
b21 + b22 + b23 + b24 = 1, etc.

Hidden Markov Models - HMM

[Diagram: hidden variables H1, H2, ..., Hi, ..., HL-1, HL form a chain; each Hi emits one item of the observed data X1, X2, ..., Xi, ..., XL-1, XL.]

Definition of Hidden Markov Model

The Hidden Markov Model (HMM) is a finite set of states, each of which is associated with a probability distribution.
A Hidden Markov Model is a statistical model in which the system being modelled is assumed to be a Markov process with unobserved (hidden) states.
Transitions among the states are governed by a set of probabilities called transition probabilities.
In a particular state, an outcome or observation can be generated according to the associated probability distribution.
Only the outcome, not the state, is visible to an external observer; the states are therefore "hidden" from the observer, hence the name Hidden Markov Model.

Hidden Markov Models

A Hidden Markov Model is a statistical model in which the system being modelled is assumed to be a Markov process with unobserved (hidden) states.
In regular Markov models the state is directly visible to the observer, so the state transition probabilities are the only parameters; in an HMM the state is not visible, but the output is.

Hidden Markov Model

Consider a discrete-time Markov process.
Consider a system that may be described at any time as being in one of a set of N distinct states.
At regularly spaced, discrete times, the system undergoes a change of state according to a set of probabilities associated with the state.
We denote the time instants associated with state changes as t = 1, 2, ..., and the actual state at time t as qt.

Essentials

To define a hidden Markov model, the following probabilities have to be specified: a matrix of transition probabilities $A = (a_{ij})$, $a_{ij} = P(s_i \mid s_j)$; a matrix of observation probabilities $B = (b_i(v_m))$, $b_i(v_m) = P(v_m \mid s_i)$; and a vector of initial probabilities $\pi = (\pi_i)$, $\pi_i = P(s_i)$. The model is represented by $M = (A, B, \pi)$.
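A minimal sketch of how $M = (A, B, \pi)$ might be held in code (illustrative names and values, not from the slides); the assertions encode the stochastic constraints that each row of A and B, and $\pi$ itself, sums to 1. Note this uses the common row convention A[i, j] = P(next state j | current state i), transposed from the slide's $a_{ij} = P(s_i \mid s_j)$:

```python
import numpy as np

# M = (A, B, pi) for a toy 2-state, 3-symbol HMM (values are made up)
A  = np.array([[0.7, 0.3],          # A[i, j] = P(next state j | current state i)
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],     # B[i, m] = P(symbol v_m | state s_i)
               [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])           # pi[i] = P(initial state s_i)

# Stochastic constraints: rows of A and B sum to 1, and so does pi
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
assert np.isclose(pi.sum(), 1.0)
```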


Discrete Markov Model: Example

A discrete Markov model with 5 states.
Each aij represents the probability of moving from state i to state j.
The aij are given in a matrix A = {aij}.
The probability of starting in a given state i is $\pi_i$; the vector $\pi$ represents these start probabilities.

Overview

Mathematical notation
Example: flow chart

Mathematical Notation

To obtain the conditional probability of reaching a particular state given the previous state, with variables $X_1, X_2, \ldots, X_n$ where $X_1$ represents the variable at time 1:

$$P[X_{n+1} = j \mid X_n = i] = P(i,j)$$

That is: what is the probability that, given the system is in state i, it will move to state j?

Mathematical Notation

Probability matrix:

$$\begin{pmatrix} P(0,0) & P(0,1) & P(0,2) \\ P(1,0) & P(1,1) & P(1,2) \\ P(2,0) & P(2,1) & P(2,2) \end{pmatrix}$$

P(0,0): probability of moving from state 0 to state 0.

[Diagram: a three-state (0, 1, 2) transition diagram whose arcs carry the entries of this matrix, with the values 0.5, 0.5, 0.3, 0.4, 0.3 and 1.0 assigned to arcs in the figure.]

Example: Orange Juice

Assumption: a family of four buys orange juice once a week.

A = someone using Brand A
A' = someone using another brand

Transition diagram: from A, stay with A with probability 0.9 or switch to A' with probability 0.1; from A', switch to A with probability 0.7 or stay with A' with probability 0.3.

Transition probability matrix (rows: current state A, A'; columns: next state A, A'):

$$P = \begin{pmatrix} 0.9 & 0.1 \\ 0.7 & 0.3 \end{pmatrix}$$

Initial state distribution matrix (over A, A'):

$$S_0 = \begin{pmatrix} 0.2 & 0.8 \end{pmatrix}$$

Example: Orange Juice

Initial state distribution matrix: $S_0 = \begin{pmatrix} 0.2 & 0.8 \end{pmatrix}$

To find the probability that someone uses Brand A after one week:

P(Brand A after 1 week) = (0.2)(0.9) + (0.8)(0.7) = 0.74

So $S_1 = \begin{pmatrix} 0.74 & 0.26 \end{pmatrix}$.
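A minimal sketch reproducing this computation with NumPy (illustrative code, not from the slides); it also demonstrates the earlier point about traveling many steps into the future by taking a matrix power:

```python
import numpy as np

# rows = current state (A, A'), columns = next state (A, A')
P = np.array([[0.9, 0.1],
              [0.7, 0.3]])
S0 = np.array([0.2, 0.8])   # initial distribution over (A, A')

S1 = S0 @ P
print(S1)                   # [0.74 0.26], matching the slide

# Many steps into the future: the distribution converges to a steady state
S100 = S0 @ np.linalg.matrix_power(P, 100)
print(S100)                 # approaches [0.875 0.125]
```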

Markov Model

[Slides: further Markov model diagrams; figures not reproduced here.]

Hidden Markov Model

A Markov model is a process in which each state corresponds to a deterministically observable event, and hence the output of any given state is not random.
We extend the concept of Markov models to include the case in which the observation is a probabilistic function of the state.
That is, the resulting model is a doubly embedded stochastic process: an underlying stochastic process that is not directly observable (hidden), which can be observed only through another set of stochastic processes that produce the sequence of observations.

HMM Components

A set of states (x's)
A set of possible output symbols (y's)
A state transition matrix (a's): the probability of making a transition from one state to the next
An output emission matrix (b's): the probability of emitting/observing a symbol at a particular state
An initial probability vector: the probability of starting at a particular state (not always shown; sometimes assumed to be 1)

COIN-TOSS MODEL

[Slides: the coin-toss model, developed over several slides of diagrams; figures not reproduced here.]

Weather Example Revisited

[Slides: weather HMM diagrams; figures not reproduced here.]

PROBLEM

[Slides: a worked problem and its solution across several slides, ending with a diagram of the states and the observation sequence; figures not reproduced here.]

Main issues using HMMs

Evaluation problem. Given the HMM $M = (A, B, \pi)$ and the observation sequence $O = o_1 o_2 \ldots o_K$, calculate the probability that model M has generated sequence O.
Decoding problem. Given the HMM $M = (A, B, \pi)$ and the observation sequence $O = o_1 o_2 \ldots o_K$, calculate the most likely sequence of hidden states $s_i$ that produced this observation sequence O.
Learning problem. Given some training observation sequences $O = o_1 o_2 \ldots o_K$ and the general structure of the HMM (numbers of hidden and visible states), adjust $M = (A, B, \pi)$ to maximize the probability of the observations.
$O = o_1 \ldots o_K$ denotes a sequence of observations, $o_k \in \{v_1, \ldots, v_M\}$.

Learning/Training Problem

Modify the model parameters to best represent the observed output, given the output sequence and the model structure.

Consider the coin-toss example (with 3 biased coins).
Say we get the observations {HHHHTTHTTTTHHTT}.
Find the model parameters, i.e., the transition matrix, emission matrix and initial distribution, that best represent this output.

Evaluation Problem

What is the chance of a given output observation sequence appearing when the model is known?

Consider the coin-toss example (with 3 biased coins).
We know some previous output sequence obtained from the coin-toss experiment, say {HHHHTTHTTTTHHTT}, and we know the model parameters too.
So what is the probability that we will get an output sequence like {HTHT}?
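A minimal sketch of the forward algorithm, which answers exactly this question. The 3-coin parameters below are assumed for illustration, not taken from the slides:

```python
import numpy as np

# Assumed 3-state (3 biased coins) model; symbols: 0 = H, 1 = T
A  = np.array([[0.6, 0.2, 0.2],   # A[i, j] = P(coin j next | coin i now)
               [0.3, 0.4, 0.3],
               [0.1, 0.4, 0.5]])
B  = np.array([[0.9, 0.1],        # B[i, m] = P(symbol m | coin i): biased coins
               [0.5, 0.5],
               [0.2, 0.8]])
pi = np.array([1/3, 1/3, 1/3])

def forward_probability(obs):
    """P(O | M) via the forward algorithm: alpha[i] = P(o_1..o_t, q_t = i)."""
    alpha = pi * B[:, obs[0]]             # initialization
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # induction: sum over previous states
    return alpha.sum()                    # termination: sum over final states

HTHT = [0, 1, 0, 1]
print(forward_probability(HTHT))  # probability of seeing the sequence HTHT
```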

Decoding Problem

What is the state sequence that best explains the output sequence when the model is known?

Say we get the observations {HHHHTTHTTTTHHTT}.
Decode/find the sequence of states that generates this output sequence.
In simpler words, find the sequence of tosses of the 3 biased coins that generates the given output sequence.
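A minimal Viterbi sketch for this decoding task, reusing the same assumed 3-coin parameters as in the evaluation sketch (illustrative, not from the slides):

```python
import numpy as np

A  = np.array([[0.6, 0.2, 0.2],   # assumed transitions between the 3 coins
               [0.3, 0.4, 0.3],
               [0.1, 0.4, 0.5]])
B  = np.array([[0.9, 0.1],        # assumed emission probabilities, 0 = H, 1 = T
               [0.5, 0.5],
               [0.2, 0.8]])
pi = np.array([1/3, 1/3, 1/3])

def viterbi(obs):
    """Most likely hidden state sequence for obs (a list of symbol indices)."""
    T, N = len(obs), len(pi)
    delta = np.log(pi) + np.log(B[:, obs[0]])    # log-probs avoid underflow
    psi = np.zeros((T, N), dtype=int)            # backpointers
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)      # scores[i, j]: best path i -> j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(delta.argmax())]                 # best final state
    for t in range(T - 1, 0, -1):                # follow backpointers
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

obs = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1]   # HHHHTTHTTTTHHTT
print(viterbi(obs))   # indices of the most likely coin at each toss
```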

Solution to the Problems

Learning/Training:
Baum-Welch Algorithm
Viterbi Training (unsupervised learning)
Evaluation:
Forward Algorithm
Decoding:
Forward-Backward Algorithm
Viterbi Algorithm

Thank you
