Lecture 3: Parametric Methods (Chapter 4)

The document discusses parametric methods in machine learning, focusing on parametric estimation and maximum likelihood estimation (MLE). It provides examples of Bernoulli and multinomial distributions, along with the Gaussian distribution and its MLE for parameters. Additionally, it covers concepts of bias, variance, and regression, including linear and polynomial regression techniques.


Lecture Slides for

INTRODUCTION TO MACHINE LEARNING, 3RD EDITION

ETHEM ALPAYDIN
© The MIT Press, 2014

CHAPTER 4: PARAMETRIC METHODS
Parametric Estimation

- X = { x^t }t where x^t ~ p(x)
- Parametric estimation: assume a form for p(x|θ) and estimate θ, its sufficient statistics, using X
  e.g., N(μ, σ²) where θ = { μ, σ² }
Maximum Likelihood Estimation

- Likelihood of θ given the sample X:
  l(θ|X) = p(X|θ) = ∏t p(x^t|θ)
- Log likelihood:
  L(θ|X) = log l(θ|X) = ∑t log p(x^t|θ)
- Maximum likelihood estimator (MLE):
  θ* = argmaxθ L(θ|X)
Examples: Bernoulli/Multinomial

- Bernoulli: two states, failure/success, x ∈ {0, 1}
  P(x) = p_o^x (1 − p_o)^(1−x)
  L(p_o|X) = log ∏t p_o^(x^t) (1 − p_o)^(1−x^t)
  MLE: p_o = ∑t x^t / N
- Multinomial: K > 2 states, x_i ∈ {0, 1}
  P(x_1, x_2, ..., x_K) = ∏i p_i^(x_i)
  L(p_1, p_2, ..., p_K|X) = log ∏t ∏i p_i^(x_i^t)
  MLE: p_i = ∑t x_i^t / N
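The Bernoulli and multinomial MLEs above are just sample proportions. A minimal sketch, using toy data assumed for illustration:

```python
import numpy as np

# Bernoulli sample: 0/1 outcomes (toy data, assumed for illustration)
x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])
p_hat = x.mean()          # MLE: p_o = sum_t x^t / N
print(p_hat)              # 0.7

# Multinomial sample: one-hot rows over K = 3 states
X = np.array([[1, 0, 0],
              [0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])
p_i = X.mean(axis=0)      # MLE: p_i = sum_t x_i^t / N
print(p_i)                # [0.5  0.25 0.25]
```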
Gaussian (Normal) Distribution

- p(x) = N(μ, σ²):
  p(x) = [1 / (√(2π) σ)] exp[−(x − μ)² / (2σ²)]
- MLE for μ and σ²:
  m = ∑t x^t / N
  s² = ∑t (x^t − m)² / N
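The Gaussian MLEs can be computed directly; note that s² divides by N, not N − 1. A sketch on a small toy sample (values assumed for illustration):

```python
import numpy as np

# Toy sample assumed drawn from a Gaussian
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

m = x.sum() / len(x)                 # MLE mean:     m  = sum_t x^t / N
s2 = ((x - m) ** 2).sum() / len(x)   # MLE variance: s² = sum_t (x^t - m)² / N
print(m, s2)                         # 5.0 4.0
```

NumPy's `np.var` uses this same N denominator by default (`ddof=0`), so `x.var()` gives the MLE directly.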
Bias and Variance

Unknown parameter θ
Estimator d_i = d(X_i) on sample X_i

Bias: b_θ(d) = E[d] − θ

Variance: E[(d − E[d])²]

Mean square error:
r(d, θ) = E[(d − θ)²]
        = (E[d] − θ)² + E[(d − E[d])²]
        = Bias² + Variance
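The bias–variance decomposition of MSE can be checked by simulation. A sketch (sample size and seed are assumptions): drawing many samples from N(0, 1) and computing the MLE variance s² on each shows its negative bias (≈ −1/N of the true variance) and confirms MSE = Bias² + Variance.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 1.0                      # true variance of N(0, 1)
N = 10

# Compute the MLE variance estimator d = s² on many independent samples
d = np.array([rng.normal(0.0, 1.0, N).var() for _ in range(100_000)])

bias = d.mean() - theta                  # E[d] - theta (negative: s² underestimates)
variance = ((d - d.mean()) ** 2).mean()  # E[(d - E[d])²]
mse = ((d - theta) ** 2).mean()          # E[(d - theta)²]
print(bias, variance, mse, bias**2 + variance)   # mse equals bias² + variance
```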
Regression

r = f(x) + ε
estimator: g(x|θ)
ε ~ N(0, σ²)
p(r|x) ~ N(g(x|θ), σ²)

L(θ|X) = log ∏t=1..N p(x^t, r^t)
       = log ∏t=1..N p(r^t|x^t) + log ∏t=1..N p(x^t)
Regression: From LogL to Error

L(θ|X) = log ∏t=1..N [1 / (√(2π) σ)] exp[−(r^t − g(x^t|θ))² / (2σ²)]
       = −N log(√(2π) σ) − (1 / (2σ²)) ∑t=1..N [r^t − g(x^t|θ)]²

The first term does not depend on θ, so maximizing L(θ|X) is equivalent to minimizing the sum-of-squares error:

E(θ|X) = (1/2) ∑t=1..N [r^t − g(x^t|θ)]²
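The equivalence can be verified numerically: for a one-parameter model g(x|θ) = θx, the θ that maximizes L(θ|X) on a grid is exactly the θ that minimizes E(θ|X). (The data, noise level, and grid here are assumptions for illustration.)

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 50)
r = 2.0 * x + rng.normal(0, 0.3, 50)   # r = f(x) + eps, with f(x) = 2x assumed

sigma = 0.3
thetas = np.linspace(0, 4, 401)

# Log likelihood and squared-error objective for g(x|theta) = theta * x
L = np.array([-len(x) * np.log(np.sqrt(2 * np.pi) * sigma)
              - ((r - t * x) ** 2).sum() / (2 * sigma**2) for t in thetas])
E = np.array([0.5 * ((r - t * x) ** 2).sum() for t in thetas])

print(thetas[L.argmax()] == thetas[E.argmin()])   # True: same optimizer
```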
Linear Regression

g(x^t|w_1, w_0) = w_1 x^t + w_0

Normal equations:
∑t r^t       = N w_0 + w_1 ∑t x^t
∑t r^t x^t   = w_0 ∑t x^t + w_1 ∑t (x^t)²

In matrix form, A w = y:
A = [ N         ∑t x^t    ]     w = [ w_0 ]     y = [ ∑t r^t     ]
    [ ∑t x^t    ∑t (x^t)² ]         [ w_1 ]         [ ∑t r^t x^t ]

w = A⁻¹ y
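A minimal sketch of solving the normal equations above (the toy x, r values are assumptions, chosen near the line r = 2x + 1):

```python
import numpy as np

# Toy data assumed to follow r ≈ w1*x + w0
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
r = np.array([1.1, 2.9, 5.2, 6.8, 9.1])
N = len(x)

# A w = y, with the sums from the normal equations
A = np.array([[N,        x.sum()],
              [x.sum(),  (x**2).sum()]])
y = np.array([r.sum(), (r * x).sum()])

w0, w1 = np.linalg.solve(A, y)   # solves A w = y (equivalent to w = A⁻¹ y)
print(w0, w1)                    # w0 ≈ 1.04, w1 ≈ 1.99
```

`np.linalg.solve` is preferred over forming `A⁻¹` explicitly, since it is faster and numerically more stable.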
Polynomial Regression

g(x^t|w_k, ..., w_2, w_1, w_0) = w_k (x^t)^k + ... + w_2 (x^t)² + w_1 x^t + w_0

Design matrix D (one row per example) and response vector r:
D = [ 1   x^1   (x^1)²   ...   (x^1)^k ]     r = [ r^1 ]
    [ 1   x^2   (x^2)²   ...   (x^2)^k ]         [ r^2 ]
    [ ...                              ]         [ ... ]
    [ 1   x^N   (x^N)²   ...   (x^N)^k ]         [ r^N ]

w = (DᵀD)⁻¹ Dᵀ r
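A sketch of the polynomial fit, on noiseless toy data generated from r = x² − x + 1 (data and degree are assumptions), so the exact coefficients should be recovered:

```python
import numpy as np

# Toy data from a known quadratic (assumed for illustration)
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
r = x**2 - x + 1.0

k = 2
# Design matrix D: rows [1, x^t, (x^t)², ..., (x^t)^k]
D = np.vander(x, k + 1, increasing=True)

# w = (DᵀD)⁻¹ Dᵀ r, computed via a numerically stable least-squares solve
w, *_ = np.linalg.lstsq(D, r, rcond=None)
print(w)   # ≈ [1. -1.  1.]  i.e. w0 = 1, w1 = -1, w2 = 1
```

`np.linalg.lstsq` solves the same least-squares problem as the pseudo-inverse formula without explicitly inverting DᵀD, which matters when D is ill-conditioned (common for high-degree polynomials).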
