Models and Simulation - Class 4 (2016)

Class 4 - Simulation and Models


Random number generation

Probability density -> Random number generation -> Data?

Statistical analysis

Data -> Statistical analysis -> Probability density?
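A minimal sketch of these two directions in Python, assuming a normal density (the parameter values are illustrative, not from the slides):

```python
# Forward: probability density -> random number generation -> data.
# Backward: data -> statistical analysis -> estimated density parameters.
import numpy as np

rng = np.random.default_rng(seed=0)

mu, sigma = 10.0, 0.5                      # assumed "true" density parameters
data = rng.normal(mu, sigma, size=1000)    # random number generation

mu_hat = data.mean()                       # statistical analysis of the data
sigma_hat = data.std(ddof=1)
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```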

Statistical analysis

Statistical test

The Student's t critical value for 29 degrees of freedom and an upper-tail probability of 0.025 (i.e., a 95% two-sided interval):

$$t_{29,\,0.025} \approx 2.045$$
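This value can be checked in Python (SciPy assumed available):

```python
# Student's t critical value: 29 degrees of freedom, upper-tail probability 0.025.
from scipy import stats

t_crit = stats.t.ppf(1 - 0.025, df=29)
print(f"t_29,0.025 = {t_crit:.3f}")  # -> 2.045
```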

Problem

Data: Olympics 100m winning time

Rogers and Girolami (2012). A First Course in Machine Learning. CRC Press.

Least Squares Method


The Least Squares Method (LSM) is widely used to determine models that fit experimental data.

It was invented by... ?

Least Squares Method


In 1805, Legendre published this book, in which he introduced and named the LSM.

Least Squares Method


In 1809, Gauss published this book, in which he states that he had used the LSM since 1795.

Least Squares Method


Believe it or not, 210 years later the controversy remains!

Recall
Three important results:

Linear Equations
Unconstrained Optimization
Probabilities

Linear Systems

Consider a linear system $Mx = b$. If $M$ is nonsingular, this equation has exactly one solution, $x = M^{-1}b$.
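A minimal sketch in Python (the matrix and right-hand side are illustrative):

```python
# Solving M x = b for a nonsingular M.
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.linalg.solve(M, b)   # unique solution, since det(M) != 0
print(x)
```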

Unconstrained Optimization

Gradient and Hessian

Theorem
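The theorem statement is missing from the extracted slide; given the "Gradient and Hessian" heading, it is presumably the standard optimality condition: if $x^*$ is a local minimum of a twice-differentiable $f$, then $\nabla f(x^*) = 0$ and the Hessian $\nabla^2 f(x^*)$ is positive semidefinite.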

Gradient

The gradient points in the direction of the greatest rate of increase!
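A small numeric check of this claim, using an illustrative function $f(x, y) = x^2 + 3y^2$:

```python
# Compare the directional derivative along the gradient with random directions.
import numpy as np

def f(p):
    x, y = p
    return x**2 + 3 * y**2

def grad_f(p):
    x, y = p
    return np.array([2 * x, 6 * y])

p = np.array([1.0, 1.0])
g = grad_f(p)
g_unit = g / np.linalg.norm(g)

eps = 1e-6
rate_along_gradient = (f(p + eps * g_unit) - f(p)) / eps

rng = np.random.default_rng(seed=1)
for _ in range(5):
    d = rng.normal(size=2)
    d /= np.linalg.norm(d)                  # random unit direction
    rate = (f(p + eps * d) - f(p)) / eps
    assert rate <= rate_along_gradient + 1e-3

print(f"greatest rate of increase ~ |grad f| = {np.linalg.norm(g):.3f}")
```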

A book intended for people interested in solving optimization problems.

Probability Theory

Standard Normal Distribution
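The formula on this slide did not survive extraction; for reference, the standard normal density is

$$p(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right)$$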

Joint Probability
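The slide content is missing; presumably it recalls that for independent observations the joint density factorises, which is what later gives the likelihood its product form:

$$p(t_1, \ldots, t_N) = \prod_{n=1}^{N} p(t_n)$$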

Stop!
No more, for now:

Linear Equations
Unconstrained Optimization
Probabilities

Models

Linear Models

The simplest model we can assume is the linear one:

$$t = w_0 + w_1 x$$

where $t$ is the winning time (output), $x$ is the Olympics number (input), and $w_0, w_1$ are the unknown parameters.

What is a good model?

Loss Function

Average Loss Function

Average loss across the whole data set.
First Problem Formulation (by Legendre)

How to find a solution?

A good candidate
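Legendre's formulation and the natural candidate for its solution, reconstructed (the slide equations are missing):

$$\widehat{w}_0, \widehat{w}_1 = \arg\min_{w_0, w_1} \frac{1}{N} \sum_{n=1}^{N} \left(t_n - w_0 - w_1 x_n\right)^2$$

A good candidate is a stationary point, where the gradient of the average loss with respect to $w_0$ and $w_1$ is zero.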

Data: Olympics 100m winning time

Vector Formulation

Stacking the inputs as rows of a design matrix and collecting the parameters in a vector, setting the gradient of the loss to zero leads to a System of Linear Equations.
2nd order model

$$t = w_0 + w_1 x + w_2 x^2$$

The model is still linear in the parameters $w_0, w_1, w_2$, so the least-squares solution has the same form. Yes, it is the same equation!
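Continuing the sketch above, only the design matrix changes (same illustrative data):

```python
# Quadratic fit: one extra x**2 column in the design matrix.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
t = np.array([12.0, 11.9, 11.5, 11.4, 11.0])

X2 = np.column_stack([np.ones_like(x), x, x**2])   # [1, x, x^2]
w2_hat, *_ = np.linalg.lstsq(X2, t, rcond=None)
print(w2_hat)                                      # [w0, w1, w2]
```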

Example: new data

2nd order model

Data: Olympics 100m winning time

What is a good model?

A good model must also be able to generate the most accurate predictions. We say that it must generalise beyond the data we have used for training.

Over-fitting

The 4th order model fits the training data accurately, but its predictions for the next Olympics are poor! This is known as over-fitting.

Validation Data

One way to detect over-fitting is to use a validation data set (not used for training) to test the predictive performance of the model.

Cross-Validation

The available data is randomly split into three equal parts (Set 1, Set 2, Set 3, each holding 1/3 of the data).

Each set then takes its turn as the validation set while the model is trained on the other two:

Fold 1: train on Sets 2 and 3, validate on Set 1
Fold 2: train on Sets 1 and 3, validate on Set 2
Fold 3: train on Sets 1 and 2, validate on Set 3

K-fold cross-validation

The data is split into K equally sized blocks. Each block takes its turn as a validation set.

Rogers and Girolami (2012). A First Course in Machine Learning. CRC Press.

K-fold cross-validation

Averaging over the resulting K loss values gives us our final loss value. When K = N, each data observation is held out in turn and used to test a model trained on the other N - 1 objects.
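A minimal K-fold sketch in Python, reusing the illustrative x, t data from earlier (the function name and data are assumptions for illustration; setting K = N gives leave-one-out cross-validation):

```python
# Mean K-fold validation loss for a polynomial model of a given order.
import numpy as np

def kfold_loss(x, t, order, K, seed=0):
    idx = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(idx, K)
    losses = []
    for k in range(K):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        X = np.vander(x, order + 1)                  # polynomial design matrix
        w, *_ = np.linalg.lstsq(X[train], t[train], rcond=None)
        losses.append(np.mean((t[val] - X[val] @ w) ** 2))
    return np.mean(losses)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
t = np.array([12.0, 11.9, 11.5, 11.4, 11.0])
print(kfold_loss(x, t, order=1, K=5))                # K = N here, i.e. LOOCV
```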

Leave-One-Out Cross-Validation

This form of cross-validation is given the name Leave-One-Out Cross-Validation (LOOCV), and its average loss function is expressed as:
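The formula itself is missing from the extraction; in the book's notation it is

$$\mathcal{L}^{CV} = \frac{1}{N} \sum_{n=1}^{N} \left(t_n - f(x_n; \widehat{\mathbf{w}}^{(-n)})\right)^2$$

where $\widehat{\mathbf{w}}^{(-n)}$ is the parameter estimate obtained with the $n$-th observation held out.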

Leave-One-Out Cross-Validation

Mean LOOCV loss as polynomials of increasing orders are fitted to the winning 100m data.

Rogers and Girolami (2012). A First Course in Machine Learning. CRC Press.

Regularised LSM

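The slide equations are missing from the extraction; in the treatment of Rogers and Girolami, regularised least squares adds a penalty on large parameter values, with $\lambda$ controlling the trade-off (the $N\lambda I$ form below assumes the averaged loss):

$$\mathcal{L}' = \frac{1}{N} (\mathbf{t} - X\mathbf{w})^\top (\mathbf{t} - X\mathbf{w}) + \lambda\, \mathbf{w}^\top \mathbf{w}, \qquad \widehat{\mathbf{w}} = \left(X^\top X + N\lambda I\right)^{-1} X^\top \mathbf{t}$$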

Stochastic Model

What information can we extract from the errors? What can we expect about the quality of our predictions?

When simulating a system, it is often convenient to consider the random behavior we can observe in real data.

The error is different each year, sometimes positive, sometimes negative. There is no obvious relationship between the errors and the years.

Stochastic Model
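The model equation is missing from the extracted text; the additive Gaussian noise model used in the book is

$$t_n = f(x_n; \mathbf{w}) + \epsilon_n, \qquad \epsilon_n \sim \mathcal{N}(0, \sigma^2)$$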

Likelihood

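The likelihood on these slides (not captured by the extraction) is the product of Gaussian densities over the data:

$$L(\mathbf{w}, \sigma^2) = \prod_{n=1}^{N} p(t_n \mid x_n, \mathbf{w}, \sigma^2) = \prod_{n=1}^{N} \mathcal{N}\!\left(t_n \mid \mathbf{w}^\top \mathbf{x}_n, \sigma^2\right)$$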

Log of Likelihood

Taking the log:

(C does not depend on w)
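The expression itself is missing; with Gaussian noise it is

$$\log L(\mathbf{w}, \sigma^2) = C - \frac{1}{2\sigma^2} \sum_{n=1}^{N} \left(t_n - \mathbf{w}^\top \mathbf{x}_n\right)^2, \qquad C = -\frac{N}{2} \log(2\pi\sigma^2)$$

so maximising the log-likelihood over $\mathbf{w}$ is equivalent to minimising the squared loss.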

Maximum Likelihood
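The maximum likelihood solutions (reconstructed; they follow directly from the log-likelihood above) are the least-squares estimate for $\mathbf{w}$ and

$$\widehat{\sigma}^2 = \frac{1}{N} \sum_{n=1}^{N} \left(t_n - \widehat{\mathbf{w}}^\top \mathbf{x}_n\right)^2$$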

Be careful! The maximum likelihood criterion may favour high-order models: risk of over-fitting!

Coefficient of Determination

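The definition is missing from the extracted text; the standard form is

$$R^2 = 1 - \frac{\sum_n (t_n - \hat{t}_n)^2}{\sum_n (t_n - \bar{t})^2}$$

where $\hat{t}_n$ are the model predictions and $\bar{t}$ is the mean of the observed outputs; $R^2$ near 1 indicates that the model explains most of the variability in the data.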

Variability in Parameters

How much could the optimal parameters change, given a different data set?


Parameter Covariance Matrix

Estimation of the Parameter Covariance Matrix
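The formulas are missing from the extraction; for the Gaussian model the classical result is

$$\operatorname{cov}(\widehat{\mathbf{w}}) = \sigma^2 \left(X^\top X\right)^{-1}$$

estimated in practice by substituting $\widehat{\sigma}^2$ for the unknown $\sigma^2$.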

Variability in Predictions

From the parameter covariance we can compute the variability of a new prediction and estimate the prediction interval:
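A sketch of the standard result (the exact degrees of freedom depend on the data size and model order): for a new input $\mathbf{x}_{new}$,

$$\operatorname{var}\!\left(\hat{t}_{new}\right) = \widehat{\sigma}^2\, \mathbf{x}_{new}^\top \left(X^\top X\right)^{-1} \mathbf{x}_{new}$$

and a 95% prediction interval takes the form $\hat{t}_{new} \pm t_{df,\,0.025} \sqrt{\operatorname{var}(\hat{t}_{new})}$, which is where a critical value such as $t_{29,0.025} \approx 2.045$ from the earlier slide is used.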

Comments

Other loss functions have been proposed for regression, and in many cases they will be more appropriate (e.g., weighted squares, sum of absolute values, maximum error, etc.).

Other formulations of the optimization problem could also give better results in practice (e.g., constrained or multi-objective formulations).
