lec-1 probabilistic models
Haixu Tang
School of Informatics
Probability
• Experiment: a procedure involving
chance that leads to different results
• Outcome: the result of a single trial of an
experiment;
• Event: one or more outcomes of an
experiment;
• Probability: the measure of how likely an
event is;
Example: a fair 6-sided die
• Outcome: The possible outcomes of
this experiment are 1, 2, 3, 4, 5 and
6;
• Events: 1; 6; even
• Probability: outcomes are equally
likely to occur.
– P(A) = The Number Of Ways Event A Can Occur / The Total
Number Of Possible Outcomes
– P(1)=P(6)=1/6; P(even)=3/6=1/2;
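The counting rule above can be checked with a minimal Python sketch. The helper name `prob` is an illustrative choice, not part of the slides, and the rule only holds when all outcomes are equally likely.

```python
from fractions import Fraction

# Sample space of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]

def prob(event):
    """P(A) = (# outcomes in A) / (# possible outcomes); equally likely outcomes only."""
    favorable = [x for x in outcomes if event(x)]
    return Fraction(len(favorable), len(outcomes))

p_one = prob(lambda x: x == 1)        # event "1"
p_six = prob(lambda x: x == 6)        # event "6"
p_even = prob(lambda x: x % 2 == 0)   # event "even"
print(p_one, p_six, p_even)  # 1/6 1/6 1/2
```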
Probability distribution
• Probability distribution: the assignment of a
probability P(x) to each outcome x.
• A fair die: outcomes are equally likely to occur →
the probability distribution over all six
outcomes is P(x)=1/6, x=1,2,3,4,5 or 6.
• A loaded die: outcomes are unequally likely to
occur → the probability distribution over all
six outcomes is P(x)=f(x), x=1,2,3,4,5 or 6, but
$\sum_x f(x) = 1$.
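As a small sketch, a distribution can be stored as a mapping from outcomes to probabilities and checked for normalization. The loaded-die values below are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Fair die: every outcome has probability 1/6.
fair = {x: Fraction(1, 6) for x in range(1, 7)}

# Hypothetical loaded die (illustrative values): 6 comes up half the time.
loaded = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
          4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

def is_distribution(p):
    """True if all probabilities are non-negative and they sum to exactly 1."""
    return all(v >= 0 for v in p.values()) and sum(p.values()) == 1

print(is_distribution(fair), is_distribution(loaded))
```

Exact fractions avoid floating-point round-off in the normalization check.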
Example: DNA sequences
• Event: Observing a DNA sequence S=s1s2…sn,
si ∈ {A,C,G,T};
• Random sequence model (or independent and
identically distributed, i.i.d. model): si occurs at
random with the probability P(si), independent
of all other residues in the sequence;
• $P(S) = \prod_{i=1}^{n} P(s_i)$
• This model will be used as a background
model (also called the null hypothesis).
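A minimal sketch of the i.i.d. model: the probability of a sequence is the product of its residue probabilities. The uniform background frequencies are an assumption made for illustration; summing logarithms instead of multiplying avoids underflow for long sequences.

```python
import math

# Assumed background frequencies (uniform here, purely for illustration).
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def sequence_log_prob(seq, p=background):
    """log P(S) = sum_i log P(s_i) under the i.i.d. model."""
    return sum(math.log(p[s]) for s in seq)

def sequence_prob(seq, p=background):
    """P(S) = prod_i P(s_i), computed through the log for numerical stability."""
    return math.exp(sequence_log_prob(seq, p))

log_p = sequence_log_prob("ACGT")
print(log_p, sequence_prob("ACGT"))
```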
Conditional probability
• P(i|θ): the measure of how likely an event i
is to happen under the condition θ;
– Example: two dice D1, D2
• P(i|D1): probability of rolling i with die D1
• P(i|D2): probability of rolling i with die D2
Joint probability
• Two experiments X and Y
– P(X,Y) joint probability (distribution) of experiments
X and Y
– P(X,Y)=P(X|Y)P(Y)=P(Y|X)P(X)
– If P(X|Y)=P(X), X and Y are independent
• Example: experiment 1 (selecting a die),
experiment 2 (rolling the selected die)
– P(y): y=D1 or D2
– P(i, D1)=P(i| D1)P(D1)
– If P(i| D1)=P(i| D2), the two experiments are independent
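The two-experiment example can be sketched in Python as a joint table built from P(i|D)P(D). The selection probabilities and the loaded distribution for D2 are hypothetical numbers, not taken from the slides.

```python
from fractions import Fraction

# Hypothetical numbers: D1 is fair and picked 3/4 of the time; D2 favors 6.
p_die = {"D1": Fraction(3, 4), "D2": Fraction(1, 4)}
p_roll_given_die = {
    "D1": {i: Fraction(1, 6) for i in range(1, 7)},
    "D2": {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
           4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)},
}

# Joint probability P(i, D) = P(i | D) * P(D) for every (roll, die) pair.
joint = {(i, d): p_roll_given_die[d][i] * p_die[d]
         for d in p_die for i in range(1, 7)}

print(joint[(6, "D1")], joint[(6, "D2")])  # 1/8 1/8
```

The entries of the joint table sum to 1, as any probability distribution must.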
Marginal probability
• $P(X) = \sum_Y P(X \mid Y) P(Y)$
• Example: experiment 1 (selecting a die),
experiment 2 (rolling the selected die)
– P(y): y=D1 or D2
– P(i) =P(i| D1)P(D1)+P(i| D2)P(D2)
– If P(i| D1)=P(i| D2) (independent events):
• P(i)= P(i| D1)(P(D1)+P(D2))= P(i| D1)
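A short sketch of marginalization for the two-dice example: P(i) is obtained by summing P(i|D)P(D) over the dice. All numeric values are hypothetical and serve only to make the sum concrete.

```python
from fractions import Fraction

# Hypothetical numbers (not from the slides): D1 is fair, D2 favors 6.
p_die = {"D1": Fraction(3, 4), "D2": Fraction(1, 4)}
p_roll_given_die = {
    "D1": {i: Fraction(1, 6) for i in range(1, 7)},
    "D2": {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
           4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)},
}

def marginal(i, cond=p_roll_given_die, prior=p_die):
    """P(i) = sum over dice d of P(i | d) * P(d)."""
    return sum(cond[d][i] * prior[d] for d in prior)

p_six = marginal(6)  # 1/6 * 3/4 + 1/2 * 1/4 = 1/4

# If both dice share one distribution, the die choice drops out: P(i) = P(i | D1).
same = {"D1": p_roll_given_die["D1"], "D2": p_roll_given_die["D1"]}
p_six_same = marginal(6, cond=same)
print(p_six, p_six_same)  # 1/4 1/6
```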
Probability models
• A system that produces different outcomes
with different probabilities.
Mean and variance
• Mean
– $m = \sum_x x\,P(x)$
• Variance
– $\sigma^2 = \sum_k (k-m)^2 P(k)$
– $\sigma$: standard deviation
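As a worked example, the mean and variance of a fair six-sided die can be computed directly from the definitions. The function names are illustrative choices.

```python
import math
from fractions import Fraction

# Fair die: P(x) = 1/6 for x = 1..6.
fair = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(p):
    """m = sum_x x * P(x)."""
    return sum(x * px for x, px in p.items())

def variance(p):
    """sigma^2 = sum_k (k - m)^2 * P(k)."""
    m = mean(p)
    return sum((k - m) ** 2 * pk for k, pk in p.items())

m = mean(fair)        # 7/2
var = variance(fair)  # 35/12
sd = math.sqrt(var)   # standard deviation, as a float
print(m, var, sd)
```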
Typical probability distributions
• Binomial distribution
• Gaussian distribution
• Multinomial distribution
• Dirichlet distribution
• Extreme value distribution (EVD)
Binomial distribution
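• An experiment with binary outcomes: 0 or 1;
n independent trials, each giving 1 with probability p
– pmf: $P(k \mid n, p) = \binom{n}{k} p^k (1-p)^{n-k}$, k = number of 1s
– cdf: $P(K \le k) = \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i}$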
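A minimal sketch of the binomial distribution, assuming the standard parameterization with n independent binary trials and success probability p (these symbols are assumptions, since the slide does not spell them out). The cdf is the running sum of the probability mass function.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k ones in n trials) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(at most k ones in n trials): sum of the pmf from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

print(binom_pmf(2, 4, 0.5))  # 0.375
print(binom_cdf(2, 4, 0.5))  # 0.6875
```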
Probabilistic model
• Selecting a model
– Probability distribution
– Machine learning methods
• Neural nets
• Support Vector Machines (SVMs)
– Probabilistic graphical models
• Markov models
• Hidden Markov models
• Bayesian models
• Stochastic grammars
• Model → data (sampling)
• Data → model (inference)
Sampling
• Probabilistic model with parameter θ: P(x|θ)
for event x;
• Sampling: generate a large set of events xi with
probability P(xi|θ);
• Random number generator: the function rand()
picks a number randomly from the interval [0,1)
with uniform density;
• Sampling from a probabilistic model: reduce
sampling from P(xi|θ) to sampling from a uniform distribution
– For a finite set X (xi ∈ X), find i s.t. P(x1)+…+P(xi-1) ≤
rand(0,1) < P(x1)+…+P(xi-1) + P(xi)
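The cumulative-sum rule above can be sketched as follows, with Python's `random.random()` standing in for rand(). The loaded-die distribution and the function name `sample` are hypothetical, used only for illustration.

```python
import random

def sample(dist, rng=random):
    """Draw one outcome x_i such that
    P(x_1)+...+P(x_{i-1}) <= u < P(x_1)+...+P(x_i), with u uniform on [0, 1)."""
    u = rng.random()
    cumulative = 0.0
    for outcome, p in dist.items():
        cumulative += p
        if u < cumulative:
            return outcome
    return outcome  # guard against round-off in the final cumulative sum

# Hypothetical loaded die: 6 comes up half the time.
loaded = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5}

rng = random.Random(0)  # seeded generator so the run is repeatable
draws = [sample(loaded, rng) for _ in range(10000)]
freq_six = draws.count(6) / len(draws)
print(freq_six)
```

With many draws, the observed frequency of each outcome should approach its probability under the model.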
Inference (ML)
• Estimating the model parameters
(inference): from large sets of trusted
examples
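A minimal sketch of parameter estimation, assuming "ML" here means maximum likelihood and that the model is the i.i.d. sequence model: in that case the maximum-likelihood estimate of each residue probability is its relative frequency in the examples. The training sequences are hypothetical.

```python
from collections import Counter

def ml_estimate(examples, alphabet="ACGT"):
    """Maximum-likelihood estimate for the i.i.d. model:
    P(a) = count(a) / total number of residues in the examples."""
    counts = Counter("".join(examples))
    total = sum(counts[a] for a in alphabet)
    return {a: counts[a] / total for a in alphabet}

# Hypothetical trusted examples (illustrative only).
trusted = ["ACGT", "AACC", "AAGT"]
theta = ml_estimate(trusted)
print(theta)
```

With only a few examples the estimates are noisy, which is why the slide asks for large sets of trusted examples.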