
Score (statistics)

In statistics, the score (or informant[1]) is the gradient of the log-likelihood function with respect to the parameter vector. Evaluated at a particular point of the parameter vector, the score indicates the steepness of the log-likelihood function and thereby the sensitivity to infinitesimal changes in the parameter values. If the log-likelihood function is differentiable over the parameter space, the score vanishes at every interior local maximum or minimum; this fact is used in maximum likelihood estimation to find the parameter values that maximize the likelihood function.

Since the score is a function of the observations, which are subject to sampling error, it lends itself to a test statistic known as the score test, in which the parameter is held at a particular value. Further, the ratio of two likelihood functions evaluated at two distinct parameter values can be understood as a definite integral of the score function.[2]

Definition
The score is the gradient (the vector of partial derivatives) of $\log \mathcal{L}(\theta)$, the natural logarithm of the likelihood function, with respect to an m-dimensional parameter vector $\theta$:

$$s(\theta) \equiv \frac{\partial \log \mathcal{L}(\theta)}{\partial \theta}.$$

This differentiation yields a row vector and indicates the sensitivity of the likelihood (its derivative normalized by its value).
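
As a concrete illustration, the following Python sketch (hypothetical data, not from the article) computes the score of a normal model with known unit variance, where the single parameter is the mean, and checks the analytic gradient against a finite difference:

    import numpy as np

    # Sketch: score of a normal model with known variance 1,
    # parameterized by its mean theta (hypothetical sample).
    rng = np.random.default_rng(0)
    x = rng.normal(loc=2.0, scale=1.0, size=100)

    def log_likelihood(theta):
        # log L(theta) = sum_i log N(x_i; theta, 1)
        return np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (x - theta) ** 2)

    def score(theta):
        # Analytic gradient of the log-likelihood: sum_i (x_i - theta)
        return np.sum(x - theta)

    theta0, h = 1.5, 1e-6
    fd = (log_likelihood(theta0 + h) - log_likelihood(theta0 - h)) / (2 * h)
    print(score(theta0), fd)    # nearly identical
    print(score(x.mean()))      # ~0: the score vanishes at the MLE (sample mean)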

In older literature, "linear score" may refer to the score with respect to an infinitesimal translation of a given density. This convention arises from a time when the primary parameter of interest was the mean or median of a distribution. In this case, the likelihood of an observation is given by a density of the form $\mathcal{L}(\theta; X) = f(X + \theta)$. The "linear score" is then defined as

$$s_{\text{linear}} = \frac{\partial}{\partial X} \log f(X).$$

Properties

Mean

While the score is a function of $\theta$, it also depends on the observations $\mathbf{x} = (x_1, \ldots, x_n)$ at which the likelihood function is evaluated, and in view of the random character of sampling one may take its expected value over the sample space. Under certain regularity conditions on the density functions of the random variables,[3][4] the expected value of the score, evaluated at the true parameter value $\theta$, is zero. To see this, rewrite the likelihood function $\mathcal{L}$ as a probability density function $\mathcal{L}(\theta; x) = f(x; \theta)$, and denote the sample space $\mathcal{R}$. Then:

$$\operatorname{E}[s \mid \theta] = \int_{\mathcal{R}} f(x; \theta)\, \frac{\partial}{\partial \theta} \log \mathcal{L}(\theta; x)\, dx = \int_{\mathcal{R}} f(x; \theta)\, \frac{1}{f(x; \theta)}\, \frac{\partial f(x; \theta)}{\partial \theta}\, dx = \int_{\mathcal{R}} \frac{\partial f(x; \theta)}{\partial \theta}\, dx.$$

The assumed regularity conditions allow the interchange of derivative and integral (see the Leibniz integral rule), hence the above expression may be rewritten as

$$\operatorname{E}[s \mid \theta] = \frac{\partial}{\partial \theta} \int_{\mathcal{R}} f(x; \theta)\, dx = \frac{\partial}{\partial \theta} 1 = 0.$$

It is worth restating the above result in words: the expected value of the score, evaluated at the true parameter value $\theta$, is zero. Thus, if one were to repeatedly sample from some distribution and repeatedly calculate the score at the true parameter value, then the mean of the scores would tend to zero as the number of samples grows.
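
A quick Monte Carlo check of this identity (a sketch, not a proof): for a single observation X ~ N(θ, 1), the score for the mean is s(θ; X) = X − θ, and its sample average at the true parameter should be near zero:

    import numpy as np

    # For X ~ N(theta_true, 1), the score for the mean of one observation
    # is s(theta; X) = X - theta. At the true value its mean should be ~0.
    rng = np.random.default_rng(1)
    theta_true = 0.7
    samples = rng.normal(loc=theta_true, scale=1.0, size=1_000_000)
    scores = samples - theta_true
    print(scores.mean())  # close to 0, as the identity predicts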

Variance

The variance of the score, $\operatorname{Var}(s(\theta)) = \operatorname{E}[s(\theta)\, s(\theta)^{\mathsf{T}}]$, can be derived from the above expression for the expected value: differentiating the identity $\int_{\mathcal{R}} \frac{\partial f(x; \theta)}{\partial \theta}\, dx = 0$ once more with respect to $\theta$, and using $\frac{\partial f}{\partial \theta} = f\, \frac{\partial \log f}{\partial \theta}$, yields

$$\operatorname{E}\!\left[\left(\frac{\partial \log \mathcal{L}}{\partial \theta}\right)\!\left(\frac{\partial \log \mathcal{L}}{\partial \theta}\right)^{\!\mathsf{T}} \,\middle|\, \theta \right] = -\operatorname{E}\!\left[\frac{\partial^2 \log \mathcal{L}}{\partial \theta\, \partial \theta^{\mathsf{T}}} \,\middle|\, \theta \right].$$

Hence the variance of the score is equal to the negative expected value of the Hessian matrix of the log-likelihood.[5] The latter is known as the Fisher information and is written $\mathcal{I}(\theta)$. Note that the Fisher information is not a function of any particular observation, as the random variable $X$ has been averaged out. This concept of information is useful when comparing two methods of observation of some random process.
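
The identity can be checked numerically; the sketch below (hypothetical parameters) uses a single Bernoulli(θ) observation, for which both the variance of the score and the negative mean Hessian should approach the Fisher information 1/(θ(1 − θ)):

    import numpy as np

    # For one Bernoulli(theta) observation x in {0, 1}:
    #   score:   s(theta; x) = x/theta - (1 - x)/(1 - theta)
    #   Hessian: -x/theta^2 - (1 - x)/(1 - theta)^2
    # Var[s] and -E[Hessian] should both equal 1 / (theta * (1 - theta)).
    rng = np.random.default_rng(2)
    theta = 0.3
    x = rng.binomial(1, theta, size=1_000_000).astype(float)

    s = x / theta - (1 - x) / (1 - theta)
    hess = -x / theta**2 - (1 - x) / (1 - theta) ** 2

    print(s.var(), -hess.mean(), 1 / (theta * (1 - theta)))  # all ~4.76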

Examples

Bernoulli process

Consider observing the first n trials of a Bernoulli process, and seeing that A of them are successes and the
remaining B are failures, where the probability of success is θ.

Then the likelihood $\mathcal{L}(\theta; A, B)$ is

$$\mathcal{L}(\theta; A, B) = \binom{A + B}{A} \theta^A (1 - \theta)^B,$$

so the score s is

$$s = \frac{\partial \log \mathcal{L}}{\partial \theta} = \frac{A}{\theta} - \frac{B}{1 - \theta}.$$

We can now verify that the expectation of the score is zero. Noting that the expectation of A is nθ and the expectation of B is n(1 − θ) [recall that A and B are random variables], we can see that the expectation of s is

$$\operatorname{E}(s) = \frac{n\theta}{\theta} - \frac{n(1 - \theta)}{1 - \theta} = n - n = 0.$$

We can also check the variance of s. We know that A + B = n (so B = n − A) and the variance of A is nθ(1 − θ), so the variance of s is

$$\operatorname{Var}(s) = \operatorname{Var}\!\left(\frac{A}{\theta} - \frac{n - A}{1 - \theta}\right) = \left(\frac{1}{\theta} + \frac{1}{1 - \theta}\right)^{\!2} \operatorname{Var}(A) = \frac{1}{\theta^2 (1 - \theta)^2}\, n\theta(1 - \theta) = \frac{n}{\theta(1 - \theta)}.$$
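
These two results can be confirmed by simulation; the following sketch (hypothetical n and θ) draws many replications of the process and compares the empirical mean and variance of the score with the formulas above:

    import numpy as np

    # Simulate many Bernoulli processes and check E[s] = 0 and
    # Var[s] = n / (theta * (1 - theta)) at the true theta.
    rng = np.random.default_rng(3)
    n, theta = 50, 0.4
    A = rng.binomial(n, theta, size=200_000)   # successes per replication
    B = n - A                                  # failures per replication
    s = A / theta - B / (1 - theta)

    print(s.mean())                            # ~0
    print(s.var(), n / (theta * (1 - theta)))  # both ~208.3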

Binary outcome model

For models with binary outcomes (Y = 1 or 0), the model can be scored with the logarithm of predictions

$$S = Y \log(p) + (1 - Y) \log(1 - p),$$

where p is the probability in the model to be estimated and S is the score.[6]
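
A minimal sketch of this logarithmic score in Python, with hypothetical outcomes and predictions (the clipping guard is an implementation detail, not part of the definition):

    import numpy as np

    # Average logarithmic score S = Y*log(p) + (1 - Y)*log(1 - p).
    def log_score(y, p, eps=1e-12):
        p = np.clip(p, eps, 1 - eps)  # guard against log(0)
        return np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    y = np.array([1, 0, 1, 1, 0])             # hypothetical outcomes
    p = np.array([0.9, 0.2, 0.7, 0.6, 0.1])   # hypothetical predictions
    print(log_score(y, p))  # closer to 0 (from below) means better predictions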

Applications

Scoring algorithm
The scoring algorithm is an iterative method for numerically determining the maximum likelihood estimator.
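
As a sketch of the idea (using the Bernoulli model from the examples above, with hypothetical counts), each iteration moves the estimate by the score scaled by the inverse Fisher information, θ_{k+1} = θ_k + I(θ_k)⁻¹ s(θ_k):

    # Fisher scoring for the Bernoulli model:
    #   s(theta) = A/theta - B/(1 - theta),  I(theta) = n/(theta*(1 - theta)).
    A, B = 30, 20          # hypothetical counts; the MLE is A/n = 0.6
    n = A + B
    theta = 0.5            # starting value
    for _ in range(3):
        s = A / theta - B / (1 - theta)
        fisher = n / (theta * (1 - theta))
        theta = theta + s / fisher
        print(theta)       # lands on 0.6 in one step (exact for this model)

For this model the update reaches the maximum likelihood estimate in a single step; in general the iteration is repeated until the score is sufficiently close to zero.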

Score test

Note that $s(\theta)$ is a function of $\theta$ and the observation $\mathbf{x} = (x_1, \ldots, x_n)$, so that, in general, it is not a statistic. However, in certain applications, such as the score test, the score is evaluated at a specific value of $\theta$ (such as a null-hypothesis value), in which case the result is a statistic. Intuitively, if the restricted estimator is near the maximum of the likelihood function, the score should not differ from zero by more than sampling error. In 1948, C. R. Rao first proved that the square of the score divided by the information matrix follows an asymptotic $\chi^2$-distribution under the null hypothesis.[7]
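
For the scalar Bernoulli model, Rao's statistic takes the form s(θ₀)²/I(θ₀); a sketch with hypothetical data:

    from scipy import stats

    # Score test of H0: theta = theta0 in the Bernoulli model.
    A, n, theta0 = 34, 50, 0.5       # hypothetical data and null value
    B = n - A
    s = A / theta0 - B / (1 - theta0)
    fisher = n / (theta0 * (1 - theta0))
    stat = s**2 / fisher             # asymptotically chi-squared, 1 df
    p_value = stats.chi2.sf(stat, df=1)
    print(stat, p_value)             # ~6.48, p ~ 0.011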

Further note that the likelihood-ratio statistic is given by

$$2\left[\log \mathcal{L}(\hat{\theta}) - \log \mathcal{L}(\theta_0)\right] = 2 \int_{\theta_0}^{\hat{\theta}} s(\theta)\, d\theta,$$

which means that the likelihood-ratio test can be understood as the area under the score function between $\theta_0$ and $\hat{\theta}$.[8]

Score matching (machine learning)

Score matching describes the process of applying machine learning algorithms (commonly neural networks) to approximate the score function $\nabla_x \log p(x)$ of an unknown distribution $p(x)$ from finite samples. The learned function can then be used in generative modeling to draw new samples from $p(x)$.[9]
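
A toy sketch of how a score function is used generatively: for the standard normal the score is ∇ₓ log p(x) = −x (known exactly here rather than learned), and unadjusted Langevin dynamics driven by it produces approximate samples from p(x). A trained network would stand in for score_fn in practice:

    import numpy as np

    # Langevin dynamics: x <- x + (eps/2)*score(x) + sqrt(eps)*noise.
    rng = np.random.default_rng(4)

    def score_fn(x):
        return -x  # exact score of N(0, 1); a learned model would go here

    eps = 0.1
    x = rng.normal(size=10_000) * 5.0   # start far from the target
    for _ in range(2_000):
        x += 0.5 * eps * score_fn(x) + np.sqrt(eps) * rng.normal(size=x.shape)
    print(x.mean(), x.std())            # ~0 and ~1: approximate N(0, 1) samples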

It might seem confusing that the word score is used for $\nabla_x \log p(x)$, because here the gradient is taken with respect to the sample $x$ rather than with respect to the parameters of a likelihood function. For more information about this definition, see the referenced paper.[10]

See also
Fisher information – Notion in statistics
Information theory – Scientific study of digital information
Score test – Statistical test based on the gradient of the likelihood function
Scoring algorithm – form of Newton's method used in statistics
Standard score – Number of standard deviations by which an observed datum differs from the mean
Support curve – Function related to statistics and probability theory

Notes
1. Informant in Encyclopaedia of Maths (https://encyclopediaofmath.org/wiki/Informant)
2. Pickles, Andrew (1985). An Introduction to Likelihood Analysis. Norwich: W. H. Hutchins & Sons. pp. 24–29. ISBN 0-86094-190-6.
3. Serfling, Robert J. (1980). Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons. p. 145. ISBN 0-471-02403-1.
4. Greenberg, Edward; Webster, Charles E. Jr. (1983). Advanced Econometrics: A Bridge to the Literature. New York: John Wiley & Sons. p. 25. ISBN 0-471-09077-8.
5. Sargan, Denis (1988). Lectures on Advanced Econometrics. Oxford: Basil Blackwell. pp. 16–18. ISBN 0-631-14956-2.
6. Steyerberg, E. W.; Vickers, A. J.; Cook, N. R.; Gerds, T.; Gonen, M.; Obuchowski, N.; Pencina, M. J.; Kattan, M. W. (2010). "Assessing the performance of prediction models: a framework for traditional and novel measures". Epidemiology. 21 (1): 128–138. doi:10.1097/EDE.0b013e3181c30fb2. PMC 3575184. PMID 20010215.
7. Rao, C. Radhakrishna (1948). "Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation". Mathematical Proceedings of the Cambridge Philosophical Society. 44 (1): 50–57. doi:10.1017/S0305004100023987.
8. Buse, A. (1982). "The Likelihood Ratio, Wald, and Lagrange Multiplier Tests: An Expository Note". The American Statistician. 36 (3a): 153–157. doi:10.1080/00031305.1982.10482817.
9. Song, Yang; Sohl-Dickstein, Jascha; Kingma, Diederik P.; Kumar, Abhishek; Ermon, Stefano; Poole, Ben (2020). "Score-Based Generative Modeling through Stochastic Differential Equations". arXiv:2011.13456 [cs.LG].
10. Hyvärinen, Aapo (2005). "Estimation of Non-Normalized Statistical Models by Score Matching". Journal of Machine Learning Research. 6: 695–709. https://www.jmlr.org/papers/volume6/hyvarinen05a/hyvarinen05a.pdf

References
Chentsov, N. N. (2001) [1994]. "Informant" (https://www.encyclopediaofmath.org/index.php?title=Informant). Encyclopedia of Mathematics. EMS Press.
Cox, D. R.; Hinkley, D. V. (1974). Theoretical Statistics. Chapman & Hall. ISBN 0-412-12420-3.
Schervish, Mark J. (1995). Theory of Statistics. New York: Springer. Section 2.3.1. ISBN 0-387-94546-6.
