Generalized Linear Models: Ariel Alonso Abad
Outline
Lecture 9: Project
Tutorial groups
Email Prof. Alonso a list with the members of each group (names and
student numbers)
Written: Project
Outline
Statistical Inference
Psychology: Synchronicity
Intelligence quotient
The intelligence quotient (IQ) of
[Figure: histogram of IQ scores; x-axis: Intelligence, y-axis: density]
Probability distributions
If the number of observations increases, a smooth curve emerges and a probability distribution arises. Important examples:
Normal distribution
Student's t-distribution
Chi-squared distribution
F-distribution
[Figure: histogram of IQ scores (x-axis: Intelligence, 0 to 200; y-axis: density) with a smooth density curve overlaid]
Normal distribution

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

[Figure: normal probability density with Mode = Median = Mean at $\mu$; 68% of the probability lies within $\mu \pm 1\sigma$ and 95% within $\mu \pm 2\sigma$, with successive bands of 13.6%, 2.1% and 0.1% per side out to $\mu \pm 3\sigma$]
Body length
Why?
[Figure: quantile plots for the groups High school, University, PhD and Polgar]
Birth weight
Inferential Statistics
Inferential statistics makes quantitative statements about the
characteristics of a population but...
To simplify notation we will use y for both the random variable and
the realized value
$p(\mathbf{y} \mid \theta) = \prod_{i=1}^{n} p(y_i \mid \theta)$
Point estimation
Popular measure of closeness: the mean squared error (MSE) of $\hat{\theta}(\mathbf{y})$, defined as

$\mathrm{MSE}\,\hat{\theta}(\mathbf{y}) = E_{\mathbf{y}\mid\theta}\!\left[\left(\hat{\theta}(\mathbf{y}) - \theta\right)^2\right] = \mathrm{Var}_{\mathbf{y}\mid\theta}\,\hat{\theta}(\mathbf{y}) + \mathrm{bias}\!\left(\hat{\theta}(\mathbf{y})\right)^2$

An estimator is called unbiased when $\mathrm{bias}\,\hat{\theta}(\mathbf{y}) = 0$

Point estimators: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2$

$E(\bar{y}) = \mu$ and $E(S^2) = \frac{n-1}{n}\,\sigma^2 < \sigma^2$
In practice all else is not equal: biased estimators are frequently used because they can have a smaller variance, and hence a smaller MSE
$y \sim P(y \mid \lambda)$ where $P(y \mid \lambda) = \frac{\lambda^{y} e^{-\lambda}}{y!}$ and $y = 0, 1, 2, \ldots$

$E(y) = \mathrm{Var}(y) = \lambda$
Biased but consistent: the sample variance $S^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2$
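A quick simulation illustrates both the bias and the consistency. The sketch below is illustrative, not from the course materials; the Poisson rate and the sample sizes are arbitrary choices.

## Illustrative sketch (not from the slides): the divisor-n sample variance
## underestimates sigma^2, but the bias vanishes as n grows
set.seed(1)
lambda <- 4                       # Poisson rate, so sigma^2 = lambda = 4
for (n in c(5, 20, 100, 1000)) {
  s2 <- replicate(20000, {
    y <- rpois(n, lambda)
    mean((y - mean(y))^2)         # S^2 with divisor n
  })
  cat("n =", n, ": mean(S^2) =", round(mean(s2), 3),
      "vs E(S^2) = (n-1)/n * sigma^2 =", round((n - 1) / n * lambda, 3), "\n")
}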
To gain insight into the behavior of the sample mean we will study
how sample means from a known population behave
From a known population draw several samples of n units and look at the
averages of those samples
Sample means
[Figures: Simulations with 20,000 sample means $\bar{X} = (X_1 + \cdots + X_N)/N$ drawn from Normal, Gamma, Uniform and Beta populations, for samples of N = 4, N = 9 and N = 25 units; as N grows, the histogram of $\bar{X}$ looks increasingly normal for all four populations]
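The simulation behind these figures takes only a few lines of R. The sketch below is a reconstruction, not the course code; in particular, the Gamma and Beta shape parameters are illustrative guesses.

## CLT demo: 20,000 sample means of N units from four populations
set.seed(123)
draw <- list(Normal  = function(n) rnorm(n),
             Gamma   = function(n) rgamma(n, shape = 2, rate = 0.5),
             Uniform = function(n) runif(n),
             Beta    = function(n) rbeta(n, 2, 2))
N <- 4                             # units per sample; also try 9 and 25
par(mfrow = c(2, 2))
for (pop in names(draw)) {
  xbar <- replicate(20000, mean(draw[[pop]](N)))
  hist(xbar, breaks = 50, freq = FALSE, main = pop, xlab = "x", ylab = "Density")
}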
CLT: Univariate
If $y_1, y_2, \ldots, y_n$ are i.i.d. random variables with mean $\mu = E(y)$ and variance $\sigma^2$, then $\sqrt{n}\,(\bar{y}_n - \mu)$ converges in distribution to $N(0, \sigma^2)$, and we write

$\sqrt{n}\,(\bar{y}_n - \mu) \xrightarrow{d} N(0, \sigma^2)$

or also

$\bar{y}_n \xrightarrow{d} N(\mu, \sigma^2/n)$ as $n \to \infty$
CLT: Multivariate
If $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_n$ are i.i.d. random vectors with mean $\boldsymbol{\mu} = E(\mathbf{y})$ and covariance matrix $\boldsymbol{\Sigma}$, then $\sqrt{n}\,(\bar{\mathbf{y}}_n - \boldsymbol{\mu})$ converges in distribution to a multivariate normal distribution, and we write

$\sqrt{n}\,(\bar{\mathbf{y}}_n - \boldsymbol{\mu}) \xrightarrow{d} N(\mathbf{0}, \boldsymbol{\Sigma})$
Body length
Genetic factors
Economic factors
...
Delta Method
Let $\mathbf{y}_n = (y_{1n}, \ldots, y_{pn})^T$ be a sequence of random vectors with $E(\mathbf{y}_n) = \boldsymbol{\mu}$ and $\sqrt{n}\,(\mathbf{y}_n - \boldsymbol{\mu}) \xrightarrow{d} \mathbf{z}$. Then, for a differentiable function $g$,

$\sqrt{n}\,\bigl(g(\mathbf{y}_n) - g(\boldsymbol{\mu})\bigr) \xrightarrow{d} \nabla g(\boldsymbol{\mu})^T\, \mathbf{z}$
For each Bernoulli trial $y_k$: $E(y_k) = \pi$ and $\mathrm{Var}(y_k) = \pi(1-\pi)$

$y = \sum_{k=1}^{n} y_k \sim \mathrm{Bin}(n, \pi)$

$f(k, n, \pi) = P(y = k) = \binom{n}{k}\, \pi^{k} (1-\pi)^{n-k}$

$E(y) = n\pi$ and $\mathrm{Var}(y) = n\pi(1-\pi)$

For large $n$: $y \sim N\!\left(n\pi,\; n\pi(1-\pi)\right)$ approximately
Normal approximation
[Figure: binomial probabilities over y = 25, ..., 55 with the normal approximation curve overlaid and the bars x1 = 36:45 highlighted; the slide defines x1=36:45, x2=c(25:35, 46:55) and x1x2=seq(25, 55, by=.01); y-axis: Binomial Probability]
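The figure can be reconstructed from the code fragments left on the slide. In the sketch below only x1, x2 and x1x2 come from the slide; the binomial parameters (size = 100, prob = 0.4, which put the mean at 40) are illustrative guesses, not from the course.

## Reconstruction sketch: binomial probabilities with a normal overlay
x1   <- 36:45                     # highlighted region (from the slide)
x2   <- c(25:35, 46:55)           # remaining support shown (from the slide)
x1x2 <- seq(25, 55, by = .01)     # grid for the normal curve (from the slide)
n <- 100; p <- 0.4                # illustrative guesses, not from the slide
plot(x2, dbinom(x2, size = n, prob = p), type = "h",
     xlim = c(25, 55), ylim = c(0, 0.09),
     xlab = "", ylab = "Binomial Probability")
lines(x1, dbinom(x1, size = n, prob = p), type = "h", col = "red")
lines(x1x2, dnorm(x1x2, mean = n * p, sd = sqrt(n * p * (1 - p))))  # N(np, np(1-p))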
Delta Method Example: Titanic
On 10 April 1912 the largest passenger steamship in the world left Southampton, England, for New York City. At 23:40 on 14 April it struck an iceberg and sank at 2:20 the following morning, resulting in the deaths of 1,517 people in one of the deadliest peacetime maritime disasters in history.
Odds of Survival
Odds of surviving:

$\Theta_{\mathrm{survival}} = \frac{\pi}{1-\pi}$

$y = \sum_{k=1}^{n} y_k \sim \mathrm{Bin}(n, \pi)$

$\hat{\pi} \sim N\!\left(\pi,\; \frac{\pi(1-\pi)}{n}\right)$ (asymptotically)

$\hat{\Theta}_{\mathrm{survival}} = \frac{\hat{\pi}}{1-\hat{\pi}}$

With $g(x) = \ln\frac{x}{1-x}$, it is easy to show $g'(x) = \frac{1}{x(1-x)}$, so by the delta method

$\ln \hat{\Theta}_{\mathrm{survival}} \sim N\!\left(\ln(\Theta_{\mathrm{survival}}),\; g'(\pi)^2\, \frac{\pi(1-\pi)}{n}\right)$

Thus, asymptotically, $\mathrm{Var}\!\left(\ln \hat{\Theta}_{\mathrm{survival}}\right) = \frac{1}{\pi(1-\pi)\, n}$
> install.packages("msm")   # provides deltamethod()
> library(msm)
> titanic.path="C:/Equizo/Courses/KULeuven/titanicmissing.txt"
> titanic<- read.table(titanic.path, header=T, sep=",")
> head(titanic,5)
  survived pclass sex     age
1        1    1st   0 29.0000
2        0    1st   0  2.0000
3        0    1st   1 30.0000
4        0    1st   0 25.0000
5        1    1st   1  0.9167
> p_survival=mean(titanic$survived)                 # estimated probability of survival
> log_survival_odds=log(p_survival/(1-p_survival))  # estimated log odds of survival
> n=nrow(titanic)
> var_p_survival=(p_survival*(1-p_survival))/n      # asymptotic variance of p_survival
> se_odds_delta=deltamethod(g=~log(x1/(1-x1)), mean=p_survival, cov=var_p_survival)
> se_odds_delta^2                                   # delta-method variance of the log odds
[1] 0.003384579
> log_survival_odds
[1] -0.6545499
> survival_odds=exp(log_survival_odds)   # back-transform the log odds to the odds scale
> survival_odds
[1] 0.5196759
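From here a 95% confidence interval for the survival odds follows by exponentiating the endpoints on the log-odds scale; a short follow-up sketch using the quantities computed above (the approximate numbers are implied by the output shown):

> ci_log_odds=log_survival_odds + c(-1,1)*qnorm(0.975)*se_odds_delta
> exp(ci_log_odds)   # 95% CI for the survival odds, roughly (0.46, 0.58)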
Confidence intervals
Of all the realizations of the interval some will contain the true value
of the parameter, but others will not
The probability that the stochastic interval will contain the true value
of the parameter is called the confidence level of the interval
$p(\mathbf{y} \mid \theta) = \prod_{i=1}^{n} p(y_i \mid \theta)$
Interval estimator: $[a(\mathbf{y}),\, b(\mathbf{y})]$ aims to include the true value with a pre-specified probability
Interval estimator
Many 95% CIs have only asymptotically the correct coverage. For
small samples their good behavior needs to be established via
simulations
$CI_{1-\alpha} = \left[\bar{X} - T_{1-\frac{\alpha}{2},\, n-1}\, \frac{S}{\sqrt{n}},\;\; \bar{X} + T_{1-\frac{\alpha}{2},\, n-1}\, \frac{S}{\sqrt{n}}\right]$
The probability that the stochastic interval will contain the true value
of the parameter is called the confidence level of the interval
$P\left(\mu \text{ lies in } CI_{1-\alpha}\right) = 1 - \alpha$
Confidence intervals
$CI_{0.95} = \left[\bar{X} - T_{0.975,\, n-1}\, \frac{S}{\sqrt{n}},\;\; \bar{X} + T_{0.975,\, n-1}\, \frac{S}{\sqrt{n}}\right]$
The probability that the stochastic interval will contain the true value
of the parameter is called the confidence level of the interval
95% of all realizations of the interval will contain the true value of
the population mean μ
[Figure: 100 realized intervals $[\bar{x} - T_{1-\alpha/2}\, S/\sqrt{n},\; \bar{x} + T_{1-\alpha/2}\, S/\sqrt{n}]$ plotted against the sample index, for $1-\alpha = 0.95$ and $1-\alpha = 0.99$; most realizations contain the true mean]
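A coverage picture like this one can also be checked by simulation. The sketch below is illustrative, not the course code; standard normal data and n = 20 per sample are arbitrary choices.

## Coverage check: how often does the t-based 95% CI contain the true mean?
set.seed(7)
mu <- 0; n <- 20
covered <- replicate(10000, {
  x  <- rnorm(n, mean = mu)
  hw <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)   # half-width T_{0.975,n-1} * S / sqrt(n)
  mean(x) - hw <= mu && mu <= mean(x) + hw
})
mean(covered)   # close to 0.95 by construction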
Likelihood
Bayesian paradigm
Frequentist paradigm
Likelihood paradigm
Likelihood
The probability that the sample $\mathbf{y} = (y_1, \ldots, y_n)$ has occurred under the $p$-dimensional parameter $\theta$ is given by $p(\mathbf{y} \mid \theta)$; viewed as a function of $\theta$ for fixed data, this is the likelihood $L(\theta \mid \mathbf{y})$
Maximum likelihood:
Given a data set $\mathbf{y}$, the value of $\theta$ that maximizes $L(\theta \mid \mathbf{y})$ is called the maximum likelihood estimator (MLE) and is denoted $\hat{\theta}$
To find $\hat{\theta}$ we rather maximize

$\ell(\theta \mid \mathbf{y}) \equiv \log L(\theta \mid \mathbf{y}) = \sum_{i=1}^{n} \log p(y_i \mid \theta)$
We look at the i.i.d. case, but the MLE properties can be extended to the non-i.i.d. case
Score function
The score function is defined as:
$\mathbf{s}(\boldsymbol{\theta}) = \frac{\partial \ell(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = \left(\frac{\partial \ell(\boldsymbol{\theta})}{\partial \theta_1}, \ldots, \frac{\partial \ell(\boldsymbol{\theta})}{\partial \theta_p}\right)^T$
Score function
If the data were generated by $p(y \mid \boldsymbol{\theta}_0)$ then

$E\left[\mathbf{s}(\boldsymbol{\theta}_0)\right] = \mathbf{0}$

$\mathrm{Var}\left[\mathbf{s}(\boldsymbol{\theta}_0)\right] = \sum_{i=1}^{n} \mathrm{Var}\left[\mathbf{s}_i(\boldsymbol{\theta}_0)\right] = \sum_{i=1}^{n} E\left[\mathbf{s}_i(\boldsymbol{\theta}_0)\, \mathbf{s}_i(\boldsymbol{\theta}_0)^T\right]$

In the i.i.d. data case $\mathrm{Var}\left[\mathbf{s}(\boldsymbol{\theta}_0)\right] = n\, E\left[\mathbf{s}_1(\boldsymbol{\theta}_0)\, \mathbf{s}_1(\boldsymbol{\theta}_0)^T\right]$, where $\mathbf{s}_1(\boldsymbol{\theta}_0) = \frac{\partial}{\partial \boldsymbol{\theta}} \log p(y \mid \boldsymbol{\theta}_0)$

Moreover, the so-called Information Matrix Equality (IME) holds:

$E\left[\mathbf{s}_1(\boldsymbol{\theta}_0)\, \mathbf{s}_1(\boldsymbol{\theta}_0)^T\right] = -E\left[\frac{\partial^2}{\partial \boldsymbol{\theta}\, \partial \boldsymbol{\theta}^T} \log p(y \mid \boldsymbol{\theta}_0)\right] = \mathbf{I}_1(\boldsymbol{\theta}_0)$

and, hence, $\mathrm{Var}\left[\mathbf{s}(\boldsymbol{\theta}_0)\right] = n\, \mathbf{I}_1(\boldsymbol{\theta}_0) = \mathbf{I}(\boldsymbol{\theta}_0)$

For large samples $\mathbf{s}(\boldsymbol{\theta}_0) \sim N\left(\mathbf{0},\; \mathbf{I}(\boldsymbol{\theta}_0)\right)$ with $\mathbf{I}(\boldsymbol{\theta}_0) = -E\left[\mathbf{H}(\boldsymbol{\theta}_0)\right]$
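These properties can be verified numerically. For the Poisson model seen earlier, the score of one observation is $s_1(\lambda) = y/\lambda - 1$ and $I_1(\lambda) = 1/\lambda$; the sketch below (an illustrative $\lambda_0 = 4$, not from the slides) checks the zero mean and the variance $n\, I_1(\lambda_0)$:

## Numerical check of E[s(theta0)] = 0 and Var[s(theta0)] = n * I_1(theta0)
set.seed(42)
lambda0 <- 4; n <- 50
S <- replicate(20000, sum(rpois(n, lambda0) / lambda0 - 1))  # score of one sample
c(mean_score = mean(S),        # approximately 0
  var_score  = var(S),         # approximately n / lambda0 = 12.5
  n_times_I1 = n / lambda0)    # since I_1(lambda) = 1 / lambda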
Cramér-Rao bound
In its simplest form, the bound states that the variance of any
unbiased estimator is at least as high as the inverse of the Fisher
information
More generally, for an unbiased estimator of $\boldsymbol{\psi}(\boldsymbol{\theta})$, $\mathrm{Var}(\hat{\boldsymbol{\psi}}) \geq \mathbf{J}(\boldsymbol{\theta})\, \mathbf{I}(\boldsymbol{\theta})^{-1}\, \mathbf{J}(\boldsymbol{\theta})^T$, where $\mathbf{J}(\boldsymbol{\theta}) = \left(\frac{\partial \psi_i}{\partial \theta_j}\right)$ is the Jacobian matrix
Properties MLE
Efficiency: It achieves the CRB as the sample size tends to infinity. This means that no unbiased estimator has a lower asymptotic mean squared error (the MLE is asymptotically unbiased)
Therefore, for large samples, $\hat{\boldsymbol{\theta}}_{\mathrm{mle}}(\mathbf{y}) \sim N\left(\boldsymbol{\theta}_0,\; \mathbf{I}(\boldsymbol{\theta}_0)^{-1}\right)$ with $\mathbf{I}(\boldsymbol{\theta}_0) = -E\left[\mathbf{H}(\boldsymbol{\theta}_0)\right]$
Regularity conditions
The number of parameters must not increase with the sample size, at
least not too quickly
Newton-Raphson algorithm: $\boldsymbol{\theta}^{(k+1)} \approx \boldsymbol{\theta}^{(k)} - \mathbf{H}(\boldsymbol{\theta}^{(k)})^{-1}\, \mathbf{s}(\boldsymbol{\theta}^{(k)})$
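As a concrete illustration of the iteration (a sketch, not course code), Newton-Raphson applied to the binomial log-likelihood of the surgery example that follows (y = 9, n = 12) converges to $\hat{\pi} = y/n$:

## Newton-Raphson for the binomial log-likelihood l(p) = y*log(p) + (n-y)*log(1-p)
y <- 9; n <- 12
score   <- function(p)  y / p - (n - y) / (1 - p)        # first derivative of l
hessian <- function(p) -y / p^2 - (n - y) / (1 - p)^2    # second derivative of l
p <- 0.5                                                 # starting value
for (k in 1:10) p <- p - score(p) / hessian(p)           # theta_{k+1} = theta_k - H^{-1} s
p                                                        # 0.75 = y/n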
$P(y \mid \pi) = \binom{n}{y}\, \pi^{y} (1-\pi)^{n-y} \qquad (y = 0, 1, \ldots, n)$

$L(\pi \mid 9) = \binom{12}{9}\, \pi^{9} (1-\pi)^{3}$

$\ell(\pi \mid 9) = \log\binom{12}{9} + 9\log(\pi) + 3\log(1-\pi)$
Surgery example: Binomial likelihood
MLE: maximize $L(\pi \mid y)$ or, better, $\ell(\pi \mid y)$

$\ell(\pi \mid y) = y \ln \pi + (n-y)\ln(1-\pi) + \text{constant}$

$\frac{d\ell}{d\pi}(\pi \mid y) = \frac{y}{\pi} - \frac{n-y}{1-\pi} = 0 \;\Rightarrow\; \hat{\pi} = y/n$

For $y = 9$ and $n = 12$: $\hat{\pi} = 0.75$
## Likelihood function
n.size=12   # number of operations
y.su=9      # number of successes
llik2 <- function(p) -sum(dbinom(y.su, prob=p, size=n.size, log=TRUE))  # negative log-likelihood
p_MLE=nlm(llik2, p=c(0.5), hessian=TRUE)  # minimize, starting from p=0.5
> p_MLE
$minimum
[1] 1.354394
$estimate
[1] 0.7499995
$gradient
[1] -1.190159e-07
$hessian
[,1]
[1,] 64.03399
[Figure: the negative log-likelihood computed by llik2 plotted against $\pi$, with its minimum at $\hat{\pi} = 0.75$]
$\hat{\pi} \sim N\!\left[\pi,\; \hat{\pi}(1-\hat{\pi})/n\right]$ (asymptotically)
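The Hessian returned by nlm is the observed information of the negative log-likelihood, so its inverse gives the squared standard error of $\hat{\pi}$; a short check, using the fit above, against the closed form $\sqrt{\hat{\pi}(1-\hat{\pi})/n}$:

> se_hessian=sqrt(1/p_MLE$hessian)    # from the Hessian returned by nlm
> se_formula=sqrt(0.75*(1-0.75)/12)   # sqrt(pi_hat*(1-pi_hat)/n)
> c(se_hessian, se_formula)           # both approximately 0.125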