Zero-Inflated Model

The document discusses zero-inflated models, which are statistical models that account for excess zero values in count data. Zero-inflated models represent data as a mixture of two distributions, one that generates only zeros and one that generates counts including some zeros. Examples of zero-inflated data include number of fish caught and number of dental extractions. The zero-inflated Poisson model is described as a mixture of a binary distribution for extra zeros and a Poisson distribution for counts.

Uploaded by

ommy333

Zero-inflated model

In statistics, a zero-inflated model is a statistical model based on a zero-inflated probability distribution, i.e. a distribution that
allows for frequent zero-valued observations.

Introduction to Zero-Inflated Models

Zero-inflated models are commonly used in the analysis of count data, such as the number of visits a patient makes to the
emergency room in one year, or the number of fish caught in one day in one lake.[1] Count data can take values of 0, 1, 2, …
(non-negative integer values).[2] Other examples of count data are the number of hits recorded by a Geiger counter in one
minute, patient days in the hospital, goals scored in a soccer game,[3] and the number of episodes of hypoglycemia per year
for a patient with diabetes.[4]

For statistical analysis, the distribution of the counts is often represented using a Poisson distribution or a negative binomial
distribution. Hilbe [3] notes that "Poisson regression is traditionally conceived of as the basic count model upon which a
variety of other count models are based." In a Poisson model, "… the random variable $y$ is the count response and parameter
$\lambda$ (lambda) is the mean. Often, $\lambda$ is also called the rate or intensity parameter… In statistical literature, $\lambda$ is also expressed as
$\mu$ (mu) when referring to Poisson and traditional negative binomial models."

In some data, the number of zeros is greater than would be expected using a Poisson distribution or a negative binomial
distribution. Data with such an excess of zero counts are described as zero-inflated.[4]

Example histograms of zero-inflated Poisson distributions with a mean of 5 or 10 and a zero-inflation proportion of 0.2 or
0.5 (figure not reproduced here) are based on the R program ZeroInflPoiDistPlots.R from Bilder and Loughin.[1]

Examples of zero-inflated count data

Fish counts.[1] "… suppose we recorded the number of fish caught on various lakes in 4-hour fishing trips to Minnesota. Some lakes in Minnesota are too shallow for fish to survive the winter, so fishing in those lakes will yield no catch. On the other hand, even on a lake where fish are plentiful, we may or may not catch any fish due to conditions or our own competence. Thus, the number of fish caught will be zero if the lake does not support fish, and will be zero, one or more if it does."

Number of wisdom teeth extracted.[5] The number of wisdom teeth that a person has had extracted can range from 0 to 4. Some individuals, about one-third of the population, do not have any wisdom teeth. For these individuals, the number of wisdom teeth extracted will always be zero. For other individuals, the number extracted will be between 0 and 4, where a 0 indicates that the subject has not yet had, and may never have, any of their 4 wisdom teeth extracted.

Publications by PhD candidates.[6] Long examined the number of publications by 915 doctoral candidates in biochemistry in the last three years of their PhD studies. The proportion of candidates with zero publications exceeded the proportion predicted by a Poisson model. "Long [6] argued that the PhD candidates might fall into two distinct groups: 'publishers' (perhaps striving for an academic career) and 'non-publishers' (seeking other career paths). One reasonable form of explanation is that the observed zero counts reflect a mixture of the two latent classes – those who simply have not yet published and those who will likely never publish."[7]

Zero-inflated data as a mixture of two distributions

As the examples above show, zero-inflated data can arise as a mixture of two distributions. The first distribution generates
zeros. The second distribution, which may be a Poisson distribution, a negative binomial distribution or other count
distribution, generates counts, some of which may be zeros.[7]

In the statistical literature, different authors may use different names to distinguish zeros from the two distributions. Some
authors describe zeros generated by the first (binary) distribution as "structural" and zeros generated by the second (count)
distribution as "random".[7] Other authors use the terminology "immune" and "susceptible" for the binary and count zeros,
respectively.[1]
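
The two-process description can be made concrete with a small simulation. The sketch below is illustrative only: it assumes a Poisson count distribution, and the names `sample_poisson` and `sample_zero_inflated` as well as the parameter values (30% structural zeros, Poisson mean 2) are hypothetical choices, not taken from the cited sources. It counts structural and random zeros separately and compares the observed share of zeros with the share implied by the mixture.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Suitable for small lam only; chosen here to avoid third-party packages.
    """
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_zero_inflated(n, pi, lam, seed=0):
    """Simulate n counts from a two-process mixture.

    With probability pi the observation is a structural ("immune") zero;
    otherwise it is a Poisson(lam) count, which may itself be zero.
    Returns the counts and the number of structural zeros.
    """
    rng = random.Random(seed)
    counts, structural = [], 0
    for _ in range(n):
        if rng.random() < pi:
            counts.append(0)
            structural += 1
        else:
            counts.append(sample_poisson(lam, rng))
    return counts, structural


# Hypothetical parameter values, for illustration only.
counts, structural = sample_zero_inflated(n=20000, pi=0.3, lam=2.0)
zeros = counts.count(0)
random_zeros = zeros - structural  # zeros produced by the count process
observed_zero_share = zeros / len(counts)
expected_zero_share = 0.3 + (1 - 0.3) * math.exp(-2.0)
print(structural, random_zeros, observed_zero_share, expected_zero_share)
```

In real data only the total number of zeros is observed; the split into structural and random zeros is available here only because the simulation records which process produced each value.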

Zero-inflated Poisson
Figure: Histogram of a zero-inflated Poisson distribution.

One well-known zero-inflated model is Diane Lambert's zero-inflated Poisson model, which concerns a random event containing excess zero-count data in unit time.[8] For example, the number of insurance claims within a population for a certain type of risk would be zero-inflated by those people who have not taken out insurance against the risk and thus are unable to claim. The zero-inflated Poisson (ZIP) model mixes two zero-generating processes. The first process generates zeros. The second process is governed by a Poisson distribution that generates counts, some of which may be zero. The mixture distribution is described as follows:

$$\Pr(Y = 0) = \pi + (1 - \pi) e^{-\lambda}$$

$$\Pr(Y = y_i) = (1 - \pi) \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}, \qquad y_i = 1, 2, 3, \ldots$$

where the outcome variable $y_i$ has any non-negative integer value, $\lambda$ is the expected Poisson count for the $i$th individual, and $\pi$ is the probability of extra zeros.

The mean is $(1 - \pi)\lambda$ and the variance is $\lambda(1 - \pi)(1 + \pi\lambda)$.
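
As a numerical check of the ZIP mixture and its moments, the sketch below evaluates the standard ZIP probability mass function, with probability of extra zeros `pi` and Poisson mean `lam`, and compares the mean and variance computed from it with $(1 - \pi)\lambda$ and $\lambda(1 - \pi)(1 + \pi\lambda)$. The function name `zip_pmf`, the parameter values and the truncation of the support at 100 are assumptions made for illustration.

```python
import math


def zip_pmf(y, pi, lam):
    """Probability of count y under a zero-inflated Poisson distribution.

    pi  : probability of an extra (structural) zero
    lam : mean of the Poisson component
    """
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    if y == 0:
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson


# Hypothetical values; the support is truncated at 100, where the
# remaining Poisson tail is negligible for lam = 5.
pi, lam = 0.2, 5.0
support = range(0, 100)
probs = [zip_pmf(y, pi, lam) for y in support]

total = sum(probs)
mean = sum(y * p for y, p in zip(support, probs))
variance = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(total)                                         # close to 1
print(mean, (1 - pi) * lam)                          # numerical vs. formula
print(variance, lam * (1 - pi) * (1 + pi * lam))     # numerical vs. formula
```

Because the variance exceeds the mean whenever $\pi > 0$, a ZIP distribution is overdispersed relative to a Poisson distribution with the same mean.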

Estimators of ZIP parameters

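The method of moments estimators are given by[9]

$$\hat{\lambda}_{mo} = \frac{s^2 + m^2}{m} - 1,$$

$$\hat{\pi}_{mo} = \frac{s^2 - m}{s^2 + m^2 - m},$$

where $m$ is the sample mean and $s^2$ is the sample variance.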
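
The method of moments estimators follow from equating the sample mean $m$ and sample variance $s^2$ to the ZIP mean $(1 - \pi)\lambda$ and variance $\lambda(1 - \pi)(1 + \pi\lambda)$ and solving for $\lambda$ and $\pi$. A minimal sketch is shown below; the function name `zip_method_of_moments` and the small data set are hypothetical, and the unbiased (n - 1) sample variance is an assumption about which variance estimator is intended.

```python
from statistics import fmean, variance


def zip_method_of_moments(counts):
    """Method of moments estimates (lambda, pi) for a ZIP sample.

    Solves m = (1 - pi) * lam and s2 = lam * (1 - pi) * (1 + pi * lam).
    Note: pi_hat is negative when the sample variance is below the
    sample mean, which signals that a ZIP model does not fit.
    """
    m = fmean(counts)
    s2 = variance(counts)  # unbiased sample variance (assumption)
    lam_hat = (s2 + m * m) / m - 1
    pi_hat = (s2 - m) / (s2 + m * m - m)
    return lam_hat, pi_hat


# Hypothetical data: 100 observations, half of them zeros.
counts = [0] * 50 + [1] * 10 + [2] * 15 + [3] * 12 + [4] * 8 + [5] * 5
lam_hat, pi_hat = zip_method_of_moments(counts)
print(lam_hat, pi_hat)
```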

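The maximum likelihood estimator[10] can be found by solving the following equation

$$m \left(1 - e^{-\hat{\lambda}_{ml}}\right) = \hat{\lambda}_{ml} \left(1 - \frac{n_0}{n}\right),$$

where $m$ is the sample mean and $\frac{n_0}{n}$ is the observed proportion of zeros.

A closed form solution of this equation is given by[11]

$$\hat{\lambda}_{ml} = W_0\left(-s e^{-s}\right) + s,$$

with $W_0$ being the main branch of Lambert's W-function[12] and

$$s = \frac{m}{1 - \frac{n_0}{n}}$$

(the mean of the nonzero counts, not the sample standard deviation).

Alternatively, the equation can be solved by iteration.[13]

The maximum likelihood estimator for $\pi$ is given by

$$\hat{\pi}_{ml} = 1 - \frac{m}{\hat{\lambda}_{ml}}.$$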
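
The iterative route can be sketched as a fixed-point iteration. Writing $m$ for the sample mean, $n_0/n$ for the observed proportion of zeros and $s = m / (1 - n_0/n)$ for the mean of the nonzero counts, the likelihood equation $m(1 - e^{-\lambda}) = \lambda(1 - n_0/n)$ can be rearranged to $\lambda = s(1 - e^{-\lambda})$. The code below iterates this map; it is one simple scheme under the assumption $s > 1$, not necessarily the iteration used in the cited reference, and the name `zip_mle` and the data are hypothetical. The last lines check that $u = \hat{\lambda} - s$ satisfies $u e^{u} = -s e^{-s}$, the defining relation of the Lambert W closed form.

```python
import math


def zip_mle(counts, tol=1e-12, max_iter=1000):
    """Maximum likelihood estimates (lambda, pi) for an i.i.d. ZIP sample.

    Iterates lam <- s * (1 - exp(-lam)), starting from lam = s, where s is
    the mean of the nonzero counts.
    """
    n = len(counts)
    n0 = sum(1 for c in counts if c == 0)
    if n0 == n:
        raise ValueError("all counts are zero; lambda is not identifiable")
    m = sum(counts) / n
    s = m / (1 - n0 / n)
    if s <= 1:
        raise ValueError("mean of nonzero counts must exceed 1")
    lam = s
    for _ in range(max_iter):
        new_lam = s * (1 - math.exp(-lam))
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    pi_hat = 1 - m / lam
    return lam, pi_hat


# Hypothetical data: 100 observations, half of them zeros.
counts = [0] * 50 + [1] * 10 + [2] * 15 + [3] * 12 + [4] * 8 + [5] * 5
lam_ml, pi_ml = zip_mle(counts)

# Consistency check against the Lambert W relation u * exp(u) = -s * exp(-s).
s = (sum(counts) / len(counts)) / (1 - counts.count(0) / len(counts))
u = lam_ml - s
print(lam_ml, pi_ml, u * math.exp(u), -s * math.exp(-s))
```

Starting at $\lambda = s$ steers the iteration toward the nontrivial root and away from the trivial solution $\lambda = 0$, which also satisfies the equation.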

Related models
In 1994, Greene considered the zero-inflated negative binomial (ZINB) model.[14] Daniel B. Hall adapted Lambert's
methodology to an upper-bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model.[15]

Discrete pseudo compound Poisson model

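If the count data $Y$ is such that the probability of zero is larger than the probability of nonzero, namely

$$\Pr(Y = 0) > 0.5,$$

then the discrete data $Y$ obey a discrete pseudo compound Poisson distribution.[16]

In fact, let $G(z) = \sum_{n=0}^{\infty} \Pr(Y = n) z^n$ be the probability generating function of $Y$, and write $p_n = \Pr(Y = n)$. If $p_0 > 0.5$, then for $|z| \le 1$

$$|G(z)| \ge p_0 - \sum_{n=1}^{\infty} p_n = 2 p_0 - 1 > 0.$$

Then, from the Wiener–Lévy theorem,[17] $G(z)$ has the form of the probability generating function of the discrete pseudo compound Poisson distribution.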

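We say that the discrete random variable $Y$ satisfying the probability generating function characterization

$$G_Y(z) = \sum_{n=0}^{\infty} \Pr(Y = n) z^n = \exp\left(\sum_{k=1}^{\infty} \alpha_k \lambda (z^k - 1)\right), \qquad |z| \le 1,$$

has a discrete pseudo compound Poisson distribution with parameters

$$(\lambda_1, \lambda_2, \ldots) := (\alpha_1 \lambda, \alpha_2 \lambda, \ldots) \in \mathbb{R}^{\infty}, \qquad \sum_{k=1}^{\infty} \alpha_k = 1, \quad \sum_{k=1}^{\infty} |\alpha_k| < \infty, \quad \alpha_k \in \mathbb{R}, \quad \lambda > 0.$$

When all the $\alpha_k$ are non-negative, it is the discrete compound Poisson distribution (non-Poisson case) with the overdispersion property.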
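
The condition that the probability generating function stays away from zero on the unit disc can be illustrated numerically for a zero-inflated Poisson distribution, whose probability generating function is $\pi + (1 - \pi) e^{\lambda (z - 1)}$. The sketch below evaluates its modulus on a grid of points on the unit circle and compares the smallest value with $2 p_0 - 1$, where $p_0$ is the probability of zero. The parameter values and the grid size are arbitrary assumptions, and a grid check is an illustration, not a proof.

```python
import cmath
import math


def zip_pgf(z, pi, lam):
    """Probability generating function of a ZIP(pi, lam) distribution."""
    return pi + (1 - pi) * cmath.exp(lam * (z - 1))


# Hypothetical parameters chosen so that Pr(Y = 0) exceeds 0.5.
pi, lam = 0.6, 2.0
p0 = pi + (1 - pi) * math.exp(-lam)
lower_bound = 2 * p0 - 1

# Evaluate |G(z)| at 720 equally spaced points on the unit circle.
angles = [2 * math.pi * k / 720 for k in range(720)]
min_modulus = min(abs(zip_pgf(cmath.exp(1j * t), pi, lam)) for t in angles)

print(p0, lower_bound, min_modulus)
```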

See also
Poisson distribution
Zero-truncated Poisson distribution
Compound Poisson distribution
Sparse approximation
Hurdle model

Software
pscl (https://cran.r-project.org/web/packages/pscl/index.html) and brms (https://paul-buerkner.github.io/brms/)
R packages

References
1. Bilder, Christopher; Loughin, Thomas (2015), Analysis of Categorical Data with R (First ed.), CRC Press / Chapman & Hall, ISBN 978-1439855676
2. Hilbe, Joseph M. (2014), Modeling Count Data (First ed.), Cambridge University Press, ISBN 978-1107611252
3. Hilbe, Joseph M. (2007), Negative Binomial Regression (Second ed.), Cambridge University Press, ISBN 978-0521198158
4. Lachin, John M. (2011), Biostatistical Methods: The Assessment of Relative Risks (Second ed.), Wiley, ISBN 978-0470508220
5. "Biostatistics II. 1.3 - Zero-inflated Models" (https://www.youtube.com/watch?v=14B5QUUmqts). YouTube. Retrieved July 1, 2022.
6. Long, J. Scott (1997), Regression Models for Categorical and Limited Dependent Variables (First ed.), Sage Publications, ISBN 978-0803973749
7. Friendly, Michael; David, Thomas (2016), Discrete Data Analysis with R (First ed.), CRC Press / Chapman & Hall, ISBN 978-1498725835
8. Lambert, Diane (1992). "Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing". Technometrics. 34 (1): 1–14. doi:10.2307/1269547 (https://doi.org/10.2307%2F1269547). JSTOR 1269547 (https://www.jstor.org/stable/1269547).
9. Beckett, Sadie; Jee, Joshua; Ncube, Thalepo; Washington, Quintel; Singh, Anshuman; Pal, Nabendu (2014). "Zero-inflated Poisson (ZIP) distribution: parameter estimation and applications to model data from natural calamities" (https://doi.org/10.2140%2Finvolve.2014.7.751). Involve. 7 (6): 751–767. doi:10.2140/involve.2014.7.751 (https://doi.org/10.2140%2Finvolve.2014.7.751).
10. Johnson, Norman L.; Kotz, Samuel; Kemp, Adrienne W. (1992). Univariate Discrete Distributions (2nd ed.). Wiley. pp. 312–314. ISBN 978-0-471-54897-3.
11. Dencks, Stefanie; Piepenbrock, Marion; Schmitz, Georg (2020). "Assessing Vessel Reconstruction in Ultrasound Localization Microscopy by Maximum-Likelihood Estimation of a Zero-Inflated Poisson Model" (https://doi.org/10.1109%2FTUFFC.2020.2980063). IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control. doi:10.1109/TUFFC.2020.2980063 (https://doi.org/10.1109%2FTUFFC.2020.2980063).
12. Corless, R. M.; Gonnet, G. H.; Hare, D. E. G.; Jeffrey, D. J.; Knuth, D. E. (1996). "On the Lambert W Function". Advances in Computational Mathematics. 5 (1): 329–359. arXiv:1809.07369 (https://arxiv.org/abs/1809.07369). doi:10.1007/BF02124750 (https://doi.org/10.1007%2FBF02124750).
13. Böhning, Dankmar; Dietz, Ekkehart; Schlattmann, Peter; Mendonca, Lisette; Kirchner, Ursula (1999). "The zero-inflated Poisson model and the decayed, missing and filled teeth index in dental epidemiology". Journal of the Royal Statistical Society, Series A. 162 (2): 195–209. doi:10.1111/1467-985x.00130 (https://doi.org/10.1111%2F1467-985x.00130).
14. Greene, William H. (1994). "Some Accounting for Excess Zeros and Sample Selection in Poisson and Negative Binomial Regression Models". Working Paper EC-94-10: Department of Economics, New York University. SSRN 1293115 (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1293115).
15. Hall, Daniel B. (2000). "Zero-Inflated Poisson and Binomial Regression with Random Effects: A Case Study". Biometrics. 56 (4): 1030–1039. doi:10.1111/j.0006-341X.2000.01030.x (https://doi.org/10.1111%2Fj.0006-341X.2000.01030.x).
16. Huiming, Zhang; Yunxiao Liu; Bo Li (2014). "Notes on discrete compound Poisson model with applications to risk theory". Insurance: Mathematics and Economics. 59: 325–336. doi:10.1016/j.insmatheco.2014.09.012 (https://doi.org/10.1016%2Fj.insmatheco.2014.09.012).
17. Zygmund, A. (2002). Trigonometric Series. Cambridge: Cambridge University Press. p. 245.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Zero-inflated_model&oldid=1158077204"
