Journal of Applied Statistics

ISSN: 0266-4763 (Print) 1360-0532 (Online) Journal homepage: www.tandfonline.com/journals/cjas20

The Gamma-count distribution in the analysis of experimental underdispersed data

Walmes Marques Zeviani, Paulo Justiniano Ribeiro Jr, Wagner Hugo Bonat,
Silvia Emiko Shimakura & Joel Augusto Muniz

To cite this article: Walmes Marques Zeviani, Paulo Justiniano Ribeiro Jr, Wagner Hugo Bonat,
Silvia Emiko Shimakura & Joel Augusto Muniz (2014) The Gamma-count distribution in the
analysis of experimental underdispersed data, Journal of Applied Statistics, 41:12, 2616-2626,
DOI: 10.1080/02664763.2014.922168

To link to this article: https://doi.org/10.1080/02664763.2014.922168

Published online: 10 Jun 2014.

Walmes Marques Zeviani (a,∗), Paulo Justiniano Ribeiro Jr (a), Wagner Hugo Bonat (a),
Silvia Emiko Shimakura (a) and Joel Augusto Muniz (b)
(a) Department of Statistics, UFPR, Centro Politecnico, Curitiba, Paraná, Brazil; (b) Department of Exact
Sciences, UFLA, Lavras, Minas Gerais, Brazil

(Received 18 June 2013; accepted 5 May 2014)

Event counts are response variables with non-negative integer values representing the number of times
that an event occurs within a fixed domain such as a time interval, a geographical area or a cell of a
contingency table. Analysis of counts by Gaussian regression models ignores the discreteness, asymmetry and
heteroscedasticity and is inefficient, providing unrealistic standard errors or possibly negative predictions
of the expected number of events. The Poisson regression is the standard model for count data, with
underlying assumptions on the generating process which may be implausible in many applications. Statisticians
have long recognized the limitation of imposing equidispersion under the Poisson regression model. A
typical situation is when the conditional variance exceeds the conditional mean, in which case models
allowing for overdispersion are routinely used. Less reported is the case of underdispersion, with fewer
modeling alternatives and assessments available in the literature. One such alternative, the
Gamma-count model, is adopted here in the analysis of an agronomic experiment designed to investigate the effect
of levels of defoliation at different phenological stages upon the number of cotton bolls. Data set and code
for analysis are available as online supplements. Results show improvements over the Poisson model and
the semi-parametric quasi-Poisson model in capturing the observed variability in the data. Estimating
rather than assuming the underlying variance process leads to important insights into the process.

Keywords: Poisson regression; likelihood inference; Gamma-count; underdispersion; quasi-Poisson; cotton

∗ Corresponding author. Email: [email protected]

© 2014 Taylor & Francis

1. Introduction
Regression models are deeply rooted in the analysis of agronomic experiments and least-squares
methods associated with the linear (Gaussian) model are widely adopted. On the other hand,
response variables in the form of counts are not uncommon. They may represent the number of
fruits produced by a tree, the number of units infected by a disease, the number of insects on a
particular plant structure, among others. Counts are random variables that assume non-negative
integer values, representing the number of times an event occurs within a fixed domain that can
be continuous, such as an interval of time or space, or discrete, such as the evaluation of an
individual or a census tract.
Gaussian regression models for count data are not efficient, typically producing inconsistent
standard errors and even negative predictions for the expected number of events [5]. The
Gaussian linear model ignores the discreteness, heteroscedasticity, asymmetry and non-negativity
that are inherent features of count data. Impacts on the results are greater when the sample size is small
and the counts are low.
Poisson regression became the standard model for count data, in particular after the proposal
of the unifying class of generalized linear models [9] and the subsequent availability of
computational resources for model fitting. The Poisson distribution is an appealing option to model
count data given its domain on the non-negative integer numbers; moreover, it naturally allows
for asymmetry and heteroscedasticity that are intrinsic characteristics of this kind of data.
The assumption of variance equal to the mean (equidispersion) underlying Poisson regression
models imposes practical restrictions. Parameter estimates will be inefficient, with inconsistent
standard errors, and with larger error rates for hypothesis tests when the Poisson model is applied
to non-equidispersed data [12,13].
Overdispersion, with the variance greater than the mean, is largely reported in the literature and
may occur due to the absence of relevant covariates, heterogeneity of sampling units, sampling
levels, and excess of zeros [4]. A usual approach is to adopt a generalized linear mixed model
describing the extra variability by the inclusion of a non-observed latent random variable. An
interesting case is to assume a Poisson model with Gamma distributed random effects leading
to a negative binomial marginal distribution for the responses. El Shaarawi et al. [2] provide an
overview of this and other alternatives.
Less often reported are cases of underdispersion, with variances smaller than the means.
Explanatory mechanisms are scarcer and, typically, heavily dependent on the context. A
possible general description can be derived by revisiting the key property of independent
exponentially distributed times between events underlying the Poisson model. If this assumption is
inadequate, the occurrence of an event affects the probability of another one, generating over- or
underdispersed counts. Other continuous probability distributions with positive domain can be
assumed for the times between events, such as Gamma [11,12], lognormal [3] and Weibull [8].
Alternative approaches include weighting the Poisson distribution [10], the COM-Poisson
distribution [6,7] and heavy tail distributions [14].
Winkelmann [12] explores the connection between models for counts and models for durations
(lifetimes) relaxing the assumption of equidispersion at the cost of an extra parameter denoted by
α. The Gamma-count model is a convenient choice assuming Gamma distributed times between
events. The Poisson model becomes a particular case when the restriction α = 1 implies that the
duration distribution reduces to the exponential distribution. Varying values for the parameter α
induce a flexible probability distribution for the counts, which become underdispersed for α > 1
and overdispersed for 0 < α < 1.
We adopt the Gamma-count model for the analysis of a cotton production agronomic experiment
and compare the results against the ones obtained with Poisson and quasi-Poisson models.
Three points motivate this choice.
First, the standard Poisson model is not excluded since it becomes a particular case. Second,
fitting the Gamma-count model allows for investigating whether the occurrences of bolls within
a plant are independent events, an arguable assumption under the simpler Poisson model. Third,
descriptive analyses of the data provided clear empirical evidence that the variance is a
function of the mean with a constant of proportionality below one. We also analyze the data by a
semi-parametric quasi-Poisson model as the benchmark for quantifying the observed variability
in the data.
The Gamma-count regression model is not the canonical choice amongst users of applied
statistics and is not widely available in statistical software. For this reason, generic functions for
maximum-likelihood inference are available as online supplements. They cover the key aspects
of inference on the parameters of the Gamma-count model: construction of confidence intervals,
either asymptotic or based on profile likelihoods; hypothesis tests; model comparisons; and
prediction with corresponding confidence intervals. All of these are used throughout the data
analysis.

2. Background
Poisson regression models for count data follow directly from the generalized linear model
structure. Alternatively, the Poisson model can be derived by assuming independent and
exponentially distributed times between events. The latter allows for the construction of alternatives
for under- or overdispersed data such as the Gamma-count model [12], as follows.
Elementary probability arguments establish that the distribution of a count variable can be
derived from the distribution of arrival times. Let $\tau_k > 0$, $k \in \mathbb{N}$, denote the sequence of waiting
times between the (k − 1)th and the kth event. Then, the arrival time of the nth event is

$$\vartheta_n = \sum_{k=1}^{n} \tau_k, \qquad n = 1, 2, \ldots. \tag{1}$$

Let $N_T$ represent the total number of events within the interval (0, T). $N_T$ is a count variable. It
follows from the definitions of $N_T$ and $\vartheta_n$ that

$$
\begin{aligned}
N_T < n &\iff \vartheta_n \ge T, \\
\Pr(N_T < n) &= \Pr(\vartheta_n \ge T) = 1 - F_n(T), \\
\Pr(N_T = n) &= F_n(T) - F_{n+1}(T),
\end{aligned}
\tag{2}
$$

where $F_n(T)$ is the cumulative distribution function of $\vartheta_n$. Equation (2) allows obtaining the
distribution of the counts $N_T$ from knowledge of the distribution of the arrival times $\vartheta_n$.
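It is assumed that the $\tau_k$ are independently and identically Gamma distributed, $G(\alpha, \beta)$, with
density

$$f(\tau; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \tau^{\alpha-1} \exp\{-\beta\tau\}, \qquad \alpha, \beta \in \mathbb{R}^{+},$$

with τ > 0, mean E(τ) = α/β and variance Var(τ) = α/β². By Equation (1), $\vartheta_n$ is the sum of
n i.i.d. Gamma random variables and is therefore Gamma(nα, β) distributed. Let G(nα, βT) denote its
cumulative distribution function evaluated at T:

$$
\begin{aligned}
G(n\alpha, \beta T) = F_n(T) &= \int_0^{T} \frac{\beta^{n\alpha}}{\Gamma(n\alpha)}\, \tau^{n\alpha-1} \exp\{-\beta\tau\}\, d\tau \\
&= \frac{1}{\Gamma(n\alpha)}\, \beta^{n\alpha} \int_0^{\beta T} \left(\frac{u}{\beta}\right)^{n\alpha-1} \exp\{-u\}\, \frac{1}{\beta}\, du \\
&= \frac{1}{\Gamma(n\alpha)} \int_0^{\beta T} u^{n\alpha-1} \exp\{-u\}\, du,
\end{aligned}
\tag{3}
$$

where the second line uses the substitution u = βτ.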

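The count distribution (2) for the number of events within the time interval (0, T) is given by

$$\Pr(N_T = n) = G(\alpha n, \beta T) - G(\alpha(n + 1), \beta T), \tag{4}$$

with expected value given by

$$
\begin{aligned}
E(N_T) &= \sum_{i=1}^{\infty} i \Pr(N_T = i) \\
&= \sum_{i=1}^{\infty} i\,\bigl(G(\alpha i, \beta T) - G(\alpha(i + 1), \beta T)\bigr) \\
&= \sum_{i=1}^{\infty} G(\alpha i, \beta T).
\end{aligned}
\tag{5}
$$

The last step holds because the sum telescopes: each term G(αi, βT) enters with net coefficient
i − (i − 1) = 1.
For α = 1, f(τ) reduces to the exponential density and Equation (4) simplifies to the Poisson
distribution.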
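The reduction to the Poisson case can be checked directly. The following is a short sketch, assuming n is a positive integer so that the incomplete Gamma integral in Equation (3) has the closed form obtained by repeated integration by parts:

```latex
% alpha = 1 and n a positive integer: closed form of Equation (3)
G(n, \beta T) = \frac{1}{\Gamma(n)} \int_0^{\beta T} u^{n-1} e^{-u}\, du
              = 1 - e^{-\beta T} \sum_{k=0}^{n-1} \frac{(\beta T)^k}{k!}

% substituting into Equation (4)
\Pr(N_T = n) = G(n, \beta T) - G(n + 1, \beta T)
             = e^{-\beta T}\, \frac{(\beta T)^n}{n!}
```

This is the Poisson probability function with mean βT. For n = 0, Equation (2) gives Pr(N_T = 0) = 1 − F_1(T) = exp{−βT}, in agreement with the same formula.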
For the Gamma-count regression model, the parameters depend on a vector of individual
covariates, indicated by the subscript i. Assuming that the period at risk is the same for all
observations, T can be set to unity without loss of generality. This yields the regression

$$E(\tau_i \mid x_i) = \frac{\alpha}{\beta} = \exp\{-x_i^{\top}\gamma\},$$

so that $\beta = \alpha \exp\{x_i^{\top}\gamma\}$ for the ith observation.
It is important to emphasize that the regression is for the waiting times $\tau_i$ and not for the counts
$N_i$, since $E(N_i \mid x_i) = (E(\tau_i \mid x_i))^{-1}$ does not hold unless α = 1. For a given γ, $E(N_i \mid x_i)$ is evaluated
by Equation (5).
Figure 1 illustrates the relation between the distribution of times between events and the
distribution of counts, showing plots of density and hazard functions with corresponding simulated values.
Gamma distributions with unit mean and different variances are shown in the first row. The
second row displays the corresponding increasing, constant and decreasing hazard functions, related
to variances smaller than, equal to or larger than the mean. The middle plots correspond to the
exponential distribution and its constant hazard function. The middle panel shows simulated values
with time intervals drawn from each of the above-mentioned distributions. Vertical lines indicate
fixed-width intervals for which events are counted, and the counts within each interval are
displayed. In the underdispersed case, the events are nearly regularly spaced and the counts have
smaller variances. In the overdispersed case, the events are clustered and the counts have large
variances. Differences are evident in the resulting histograms.
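The mechanism described for Figure 1 can be mimicked with a few lines of simulation. The sketch below is illustrative only and is not the authors' supplementary code: the function name `dispersion_index`, the number of intervals and the seed are assumptions made for this example. It draws Gamma waiting times with unit mean and shape α, counts events in consecutive unit-width intervals, and returns the ratio of the sample variance to the sample mean of the counts.

```python
import random
from statistics import mean, variance


def dispersion_index(alpha, n_intervals=20000, seed=1):
    """Variance-to-mean ratio of counts in unit-width intervals.

    Waiting times are Gamma with shape `alpha` and scale 1/alpha,
    so their mean is 1 whatever the value of alpha.
    """
    rng = random.Random(seed)
    counts = [0] * n_intervals
    # random.gammavariate takes (shape, scale)
    t = rng.gammavariate(alpha, 1.0 / alpha)
    while t < n_intervals:
        counts[int(t)] += 1          # event falls in interval [int(t), int(t) + 1)
        t += rng.gammavariate(alpha, 1.0 / alpha)
    return variance(counts) / mean(counts)


for a in (5.0, 1.0, 0.5):
    print(a, round(dispersion_index(a), 3))
```

With α > 1 the ratio should fall clearly below one (underdispersion), with α = 1 it should be close to one (the Poisson case), and with α < 1 it should exceed one (overdispersion).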
For a sample of independent counts $y_i$, i = 1, ..., n, estimates α̂ and γ̂ can be obtained by
maximizing the log-likelihood

$$\ell(\gamma, \alpha; y, x) = \sum_{i=1}^{n} \log\bigl(G(y_i\alpha,\ \alpha \exp(x_i^{\top}\gamma)) - G((y_i + 1)\alpha,\ \alpha \exp(x_i^{\top}\gamma))\bigr), \tag{6}$$

where γ is the vector of regression parameters describing the interval between the events, α is
the dispersion parameter, $x_i$ is a vector of covariates and G() is given by Equation (3). Each term
is Equation (4) evaluated at n = $y_i$ with T = 1 and β = α exp($x_i^{\top}\gamma$).
Parameter estimation requires numerical maximization of Equation (6). Confidence
intervals and hypothesis tests can be based either upon quadratic approximations of the likelihood
function (Wald-type intervals) or upon profile likelihoods.
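Equations (3) to (6) translate almost line by line into code. The following is a minimal sketch using only the Python standard library; it is not the authors' supplementary code. The function names are hypothetical, and the power-series evaluation of the regularized incomplete Gamma function is a choice made here to keep the example self-contained (in practice a library routine would normally be used). The numerical values in the usage lines are the Gamma-count estimates of α and γ0 from Table 2.

```python
import math


def reg_lower_gamma(a, x, tol=1e-14, max_terms=100000):
    """G(a, x) of Equation (3): regularized lower incomplete Gamma function.

    Evaluated by the power series x^a e^{-x} * sum_k x^k / Gamma(a + k + 1),
    which is adequate for the moderate values of x used here.
    """
    if a <= 0.0:
        return 1.0                    # F_0(T) = 1: the "0th event" has already occurred
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
    if term == 0.0:
        return 0.0                    # underflow: probability is negligible
    total = term
    for k in range(1, max_terms):
        term *= x / (a + k)
        total += term
        if term < tol * total:
            break
    return min(total, 1.0)


def gamma_count_pmf(n, alpha, beta, T=1.0):
    """Pr(N_T = n), Equation (4)."""
    x = beta * T
    return reg_lower_gamma(alpha * n, x) - reg_lower_gamma(alpha * (n + 1), x)


def gamma_count_mean(alpha, beta, T=1.0, tol=1e-12):
    """E(N_T), Equation (5): sum of G(alpha * i, beta * T) over i >= 1."""
    total, i = 0.0, 1
    while True:
        g = reg_lower_gamma(alpha * i, beta * T)
        total += g
        if g < tol:
            return total
        i += 1


def log_likelihood(gamma, alpha, y, X):
    """Equation (6) with T = 1 and beta_i = alpha * exp(x_i' gamma)."""
    ll = 0.0
    for yi, xi in zip(y, X):
        eta = sum(g * x for g, x in zip(gamma, xi))
        ll += math.log(gamma_count_pmf(yi, alpha, alpha * math.exp(eta)))
    return ll


# Usage with the Gamma-count estimates of Table 2 at zero defoliation
alpha_hat, gamma0_hat = 5.1120, 2.2342
beta_hat = alpha_hat * math.exp(gamma0_hat)
print(gamma_count_mean(alpha_hat, beta_hat))
```

The printed mean should come out close to the 8.93 bolls per two plants reported for zero defoliation, and noticeably below exp(2.2342), which illustrates why the expected count is not simply the reciprocal of the expected waiting time when α differs from 1.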
For a vector x of covariate values, the time between events is predicted, on the scale of the
linear predictor, by

$$\hat{\eta} = x^{\top}\hat{\gamma}.$$

The covariance matrix for the model parameters is

$$V = \begin{pmatrix} V_{\alpha\alpha} & V_{\alpha\gamma} \\ V_{\gamma\alpha} & V_{\gamma\gamma} \end{pmatrix},$$

and is estimated by the negative of the inverse Hessian matrix, numerically obtained around the
maximized log-likelihood. The prediction standard error is given by

$$\operatorname{se}(\hat{\eta}) = \sqrt{x^{\top} V_{\gamma|\alpha}\, x},$$

where $V_{\gamma|\alpha} = V_{\gamma\gamma} - V_{\gamma\alpha} V_{\alpha\alpha}^{-1} V_{\alpha\gamma}$. For the particular case of the Gamma-count model considered
here, α is a scalar and found to be nearly orthogonal to γ, in which case $V_{\gamma|\alpha} \approx V_{\gamma\gamma}$. Confidence
intervals for the mean counts are obtained by computing Equation (5) after transforming the
limits of the confidence interval on the scale of the linear predictor by the inverse link function
$g^{-1}()$.

Figure 1. Comparison of different distributions of time between events. Top panel: Gamma densities and
hazard functions, middle panel: simulated events and corresponding interval counts for each distribution,
and bottom panel: counts histograms.

3. Data set and models

The data that motivated this paper come from a greenhouse experiment with cotton
plants (Gossypium hirsutum) conducted under a completely randomized design with five
replicates. The experiment aimed to assess the effects of five defoliation levels
(0%, 25%, 50%, 75% and 100%) on the observed number of bolls produced by plants at five
growth stages: vegetative, flower-bud, blossom, fig and cotton boll [1]. The experimental unit
was a pot with two plants. The number of cotton bolls was recorded at each crop cycle.
Figure 2 (left) shows the number of cotton bolls recorded for each combination of defoliation
level and growth stage. All the points in the scatter plot of sample variances against sample means
(right) are below the identity line, clearly suggesting the presence of underdispersion.
The analysis and assessment of the effects of the experimental factors are based on the
Gamma-count, Poisson and quasi-Poisson models, with the following structures for the linear
predictor under the log-link function g():

Predictor 1: g(μ) = γ0;
Predictor 2: g(μ) = γ0 + γ1 def (first-order effect of defoliation);
Predictor 3: g(μ) = γ0 + γ1 def + γ2 def² (second-order effect of defoliation);
Predictor 4: g(μ) = γ0 + γ1j def + γ2 def² (first-order effect of defoliation for each growth stage);
Predictor 5: g(μ) = γ0 + γ1j def + γ2j def² (second-order effect of defoliation for each growth stage).

The parameter μ is the expected value of N for the Poisson and quasi-Poisson models and
the expected value of the latent random variable τ equivalent to time between events for the
Gamma-count model.
The nested structure of the predictors allows relevant hypotheses to be tested by likelihood
ratios. Predictor 1 contains only the intercept and is fitted simply as a baseline to assess to what
extent the structured models improve the fit. Linear and quadratic effects of defoliation are added
by Predictor 2 and Predictor 3, respectively. Predictor 4 and Predictor 5 allow the linear and
quadratic effects of defoliation to vary between the growth stages, as indicated by the subscript j.
The parameter γ0 is not allowed to vary between the growth stages since the effect of no defoliation
is the same for all growth stages.

Figure 2. (Left) Number of bolls produced for each artificial defoliation level and each growth stage. (Right)
Sample variance against the sample mean of the five replicates for each combination of defoliation level
and growth stage.

Values of the maximized log-likelihood and the Akaike criterion are recorded for the fully
parametric Poisson and Gamma-count models. The semi-parametric quasi-Poisson model is also
fitted to assess whether the parametric models produce comparable results. This model is less
restrictive concerning model assumptions, albeit without the inferential advantages of the fully
parametric counterparts.

4. Results
Table 1 summarizes the maximized log-likelihoods and likelihood ratio tests comparing the
sequence of predictors for the Poisson and Gamma-count models, as well as fitting results for
the quasi-Poisson. The Gamma-count model has a higher log-likelihood with the hypothesis of
equidispersion (α = 1) being rejected by likelihood ratio tests, even for the predictor without
covariates. Estimates of α̂ > 1 confirm that the number of cotton bolls is underdispersed with
increasing hazard functions indicating that the probability of the development of a new cotton
boll increases as time progresses. This result supports the hypothesis of a regular sharing of
plant resources in the distribution of the number of cotton bolls. The quasi-Poisson model also
indicates underdispersion (φ < 1) even for the null model.
Unlike the others, the Poisson model does not show significant effects under Predictor 5.
This is attributed to the inadequate assumption of equidispersion, which makes the log-likelihoods
of the different predictors less distinguishable. Descriptive levels (p-values) are substantially smaller
for the Gamma-count and quasi-Poisson models than for the Poisson model. In the presence
of underdispersion, the latter becomes conservative for hypothesis testing.

Table 1. Model fit measures and comparisons between predictors and models.

Poisson
Predictor  np  ℓ         AIC      diff np  2(diff ℓ)  P(>χ²)
1           1  −279.933  561.866  –        –          –
2           2  −272.001  548.001  1        15.864     6.805E−05
3           3  −271.354  548.709  1         1.293     2.556E−01
4           7  −258.674  531.348  4        25.360     4.258E−05
5          11  −255.803  533.606  4         5.742     2.193E−01

Gamma-count
Predictor  np  ℓ         AIC      diff np  2(diff ℓ)  P(>χ²)     α̂      P(>χ²) (a)
1           2  −272.396  548.792  –        –          –          1.764  1.034E−04
2           3  −257.350  520.701  1        30.092     4.121E−08  2.266  6.198E−08
3           4  −255.981  519.962  1         2.738     9.796E−02  2.317  2.940E−08
4           8  −220.145  456.291  4        71.671     1.007E−14  4.206  1.661E−18
5          12  −208.386  440.773  4        23.518     9.976E−05  5.112  2.071E−22

Quasi-Poisson
Predictor  np  deviance  diff np  diff dev  P(>F)      φ̂      P(>χ²) (a)
1           1  75.514    –        –         –          0.567  3.660E−04
2           2  59.650    1        34.214    4.235E−08  0.464  5.134E−07
3           3  58.357    1         2.810    9.630E−02  0.460  3.661E−07
4           7  32.997    4        22.768    7.676E−14  0.278  9.154E−16
5          11  27.255    4         5.956    2.241E−04  0.241  3.566E−18

Notes: (a) Two-sided test of the hypothesis that the dispersion parameter equals 1.
np, number of parameters; ℓ, log-likelihood; diff np, difference in np; diff ℓ, difference in ℓ; diff dev, difference in
scaled deviance.

Table 2. Parameter estimates and estimate/standard error ratios for the three models.

                Poisson              quasi-Poisson        Gamma-count
Parameter       Estimate  Est/SE     Estimate  Est/SE     Estimate  Est/SE
γ0               2.1896   34.5724a    2.1896   70.4205a    2.2342   79.7128a
γ1 vegetative    0.4369    0.8473     0.4369    1.7260     0.4122    1.8080
γ2 vegetative   −0.8052   −1.3790    −0.8052   −2.8089a   −0.7628   −2.9544a
γ1 bud           0.2897    0.5706     0.2897    1.1622     0.2744    1.2224
γ2 bud          −0.4879   −0.8613    −0.4879   −1.7544    −0.4642   −1.8534
γ1 blossom      −1.2425   −2.0581a   −1.2425   −4.1921a   −1.1821   −4.4348a
γ2 blossom       0.6728    0.9892     0.6728    2.0149a    0.6453    2.1486a
γ1 fig           0.3649    0.6449     0.3649    1.3135     0.3198    1.2797
γ2 fig          −1.3103   −1.9477    −1.3103   −3.9672a   −1.1990   −4.0385a
γ1 boll          0.0089    0.0178     0.0089    0.0362     0.0070    0.0315
γ2 boll         −0.0200   −0.0361    −0.0200   −0.0736    −0.0185   −0.0756
α                –         –          –         –          5.1120    7.4228a

Note: a Indicates |Est/SE| > 1.96.

Figure 3. Dispersion diagrams of observed values and curves of predicted values and confidence intervals
(95%) as functions of the defoliation level for each growth stage.
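The likelihood ratio comparisons in Table 1 can be re-derived from the reported log-likelihoods alone. The sketch below is an illustration under stated assumptions, not the authors' code: the helper names `chi2_sf` and `lr_test` are hypothetical, and the chi-square tail probability is coded only for the closed-form cases needed here (1 degree of freedom and even degrees of freedom).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed forms only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        half = x / 2.0
        return math.exp(-half) * sum(
            half ** k / math.factorial(k) for k in range(df // 2)
        )
    raise ValueError("this sketch handles only df = 1 or even df")


def lr_test(loglik_small, loglik_big, df):
    """Likelihood ratio statistic 2(diff l) and its chi-square p-value."""
    stat = 2.0 * (loglik_big - loglik_small)
    return stat, chi2_sf(stat, df)


# Log-likelihoods taken from Table 1
print(lr_test(-279.933, -272.001, 1))   # Poisson, Predictor 1 vs 2
print(lr_test(-271.354, -258.674, 4))   # Poisson, Predictor 3 vs 4
print(lr_test(-220.145, -208.386, 4))   # Gamma-count, Predictor 4 vs 5
print(lr_test(-255.803, -208.386, 1))   # Predictor 5: Poisson (alpha = 1) vs Gamma-count
```

The last call compares the Poisson and Gamma-count fits under Predictor 5, which is the test of equidispersion (α = 1) referred to in note (a) of Table 1.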
The Gamma-count and the quasi-Poisson models indicate that both the linear and the quadratic
effects of defoliation level vary between growth stages. Results in Table 2 and Figure 3 show,
for all models, no significant effects of defoliation during the flower-bud and cotton boll stages.
The ratios between the estimates and the corresponding standard errors for these stages are, in
absolute value, smaller than the reference value of 1.96 for a significance level of 5%. The
Poisson model only detects the effect of defoliation for the blossom stage, while the Gamma-count
and quasi-Poisson models indicate a significant effect of defoliation for the vegetative, blossom
and fig stages.
Parameter estimates for the blossom stage have the opposite sign when compared with the
other stages. A negative and significant linear term indicates a rapid decay in the number of
cotton bolls over the lower levels of defoliation. The positive quadratic term indicates a
concave-up response, as seen in Figure 3 for the blossom stage. Therefore, the impact of defoliation is
greater for the blossom stage and there is a tolerance up to approximately 40% of defoliation
for the vegetative stage and 24% for the fig stage. Parameter estimates are not
directly comparable between models since they are related to the number of events in the Poisson
model and to the distribution of the time between events in the Gamma-count model.

Figure 4. Estimated probabilities from Poisson and Gamma-count models for a level zero of defoliation.

Prediction curves for each stage are shown in Figure 3 and are indistinguishable between
the three models. The confidence bands are similar between Gamma-count and quasi-Poisson
models and clearly wider for the Poisson model.
Overall, the Gamma-count and the quasi-Poisson models produced very similar inferential
results: point and interval estimates, hypothesis tests, model comparisons and prediction bands.
The semi-parametric quasi-Poisson model is expected to have a better fit to a particular data
set, as there is no explicit formulation of a probability model and functional relation between
mean and variance. Such flexibility comes with drawbacks. There are no likelihood measures for
comparing models and submodels, nor an estimated probability distribution for the counts,
which could address questions of scientific interest. Figure 4 provides the estimated probability
distributions for the number of cotton bolls obtained under Poisson and Gamma-count models.
At the level zero of defoliation, the expected value is 8.93 cotton bolls per two plants for either
model; however, the probability distribution is more concentrated around the mean value under
the Gamma-count model.
In what follows we further explore aspects of the likelihood function. The profile
log-likelihood for α is slightly skewed (left panel, Figure 5). The 95% confidence interval based
on the χ² distribution is (3.89, 6.59) while the asymptotic interval is (3.76, 6.46). Both have the
same width (2.70), but the former is shifted upwards by 0.13 units. This is a small difference and the
quadratic approximation of the likelihood is considered satisfactory. Although the precision of the
intervals is similar, the interval based on the profile log-likelihood is preferred to describe the uncertainty
associated with α since it is able to detect possible asymmetries and has limits within the (0, ∞)
parameter space.
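The asymptotic interval quoted above can be recovered from Table 2, which reports α̂ and the ratio of the estimate to its standard error. A minimal sketch follows; the function name `wald_interval` is hypothetical, and the profile-likelihood interval is not reproduced because it requires the data.

```python
from statistics import NormalDist


def wald_interval(estimate, est_over_se, level=0.95):
    """Wald interval from an estimate and its Est/SE ratio."""
    se = estimate / est_over_se
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se


# alpha-hat and Est/SE for the Gamma-count model, Table 2
lower, upper = wald_interval(5.1120, 7.4228)
print(f"({lower:.2f}, {upper:.2f})")
```

The standard error implied by Table 2 is about 0.689, and the resulting limits round to the asymptotic interval (3.76, 6.46) given in the text.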
The right panel in Figure 5 shows the confidence regions for α and γ0 obtained via profile
likelihood and quadratic approximation of the likelihood. Axes of the confidence regions are nearly
parallel to the Cartesian axes, suggesting that the parameters are nearly orthogonal. Moreover,
covariances between α̂ and each of the other parameters γ̂ (not shown) are nearly zero, implying
that inferences about one parameter are not influenced by the other parameter. The confidence
regions are symmetric in the direction of γ0 and the asymptotic and profile likelihood-based
confidence intervals are therefore coincident.

Figure 5. Profile likelihood and quadratic approximation for: (left) α with arrows indicating the 95%
confidence intervals and (right) (90, 95, 99%) confidence regions for γ0 and α.

Computationally, the asymptotic confidence interval is easier to obtain since it simply requires
the inversion of the Hessian matrix around the maximum of the log-likelihood function. The
profile log-likelihood requires successive optimizations for a set of values of the parameter of
interest. For a larger number of parameters, obtaining individual intervals based on the profile
likelihoods will increase the computational burden.

5. Conclusion
The Poisson, Gamma-count and semi-parametric quasi-Poisson models were considered for the
analysis of underdispersed count responses from a greenhouse experiment with cotton plants
subjected to different artificial defoliation levels and growth stages.
Significance of experimental factors is the same for the Gamma-count and quasi-Poisson
models, whereas the Poisson model is more conservative, not identifying some experimental factors
as significant. The latter led to larger standard errors and wider prediction bands, being
unable to capture information contained in the data. The analysis suggests that, in the presence
of underdispersion, the standard Poisson model is inadequate and can lead to wrong conclusions
about the effects of experimental factors or covariates of interest.
Results under the Gamma-count model are comparable to the semi-parametric approach which
does not assume a specific probability distribution for the counts. The fully parametric approach
is advantageous since it allows for likelihood-based inference, deriving estimated prediction
probabilities besides enabling generalizations such as specifying a regression model structure
also for the dispersion parameter.
Likelihood analysis showed nearly quadratic behavior for the parameter α controlling the
dispersion of the counts. This parameter has little influence upon point estimates of the regression
parameters, being responsible for stabilizing the estimates of variances of regression parameters,
which are often overestimated under the Poisson distribution.
Despite its advantages and potential for usage, the Gamma-count model is still uncommon; it is a
relevant addition to the suite of models to be considered for the analysis of experimental count
data. The model can be easily implemented in a statistical programming language, as illustrated
by the supplementary material (see Note 1).
Possible topics for further investigation and extensions include assessment of the impact of
misspecification under different levels of dispersion, increased flexibility, possibly by modeling the
dispersion parameter as a function of covariates, and the addition of random effects to account
for grouped data structures such as repeated and longitudinal measures.

Note

1. Supplementary Content may be viewed online at http://www.leg.ufpr.br/doku.php/publications:papercompanions:zeviani-jas2014.

References
[1] A.M. da Silva, P.E. Degrande, M.G. Fernandes, R. Suekane, and W.M. Zeviani, Impacto de diferentes níveis de
desfolha artificial nos estádios fenológicos do algodoeiro, Rev. Ciências Agrárias 35 (2012), pp. 163–172.
[2] A.H. El Shaarawi, R. Zhu, and H. Joe, Modelling species abundance using the Poisson-Tweedie family,
Environmetrics 22(2) (2011), pp. 152–164.
[3] U. Gonzales-Barron and F. Butler, Characterisation of within-batch and between-batch variability in
microbial counts in foods using Poisson-Gamma and Poisson-lognormal regression models, Food Control 22 (2011),
pp. 1268–1278.
[4] G.K. Grunwald, S.L. Bruce, L. Jiang, M. Strand, and N. Rabinovitch, A statistical model for under or overdispersed
clustered and longitudinal count data, Biom. J. 53 (2011), pp. 578–594.
[5] G. King, Variance specification in event count models: From restrictive assumptions to a generalized estimator,
Am. J. Polit. Sci. 33 (1989), pp. 762–784.
[6] D. Lord, S.R. Geedipally, and S.D. Guikema, Extension of the application of Conway-Maxwell-Poisson models:
Analyzing traffic crash data exhibiting underdispersion, Risk Anal. 30 (2010), pp. 1268–1276.
[7] D. Lord, S.D. Guikema, and S.R. Geedipally, Application of the Conway-Maxwell-Poisson generalized linear model
for analyzing motor vehicle crashes, Accid. Anal. Prev. 40 (2008), pp. 1123–1134.
[8] B. McShane, M. Adrian, E.T. Bradlow, and P.S. Fader, Count models based on Weibull interarrival times, J. Bus.
Econom. Statist. 26 (2008), pp. 369–378.
[9] J.A. Nelder and R.W.M. Wedderburn, Generalized linear models, J. Roy. Stat. Soc. Ser. A 135 (1972), pp. 370–384.
[10] M.S. Ridout and P. Besbeas, An empirical model for underdispersed count data, Stat. Model. 4 (2004), pp. 77–89.
[11] N. Toft, G.T. Innocent, D.J. Mellor, and S.W. Reid, The Gamma-Poisson model as a statistical method to determine
if micro-organisms are randomly distributed in a food matrix, Food Microbiol. 23 (2006), pp. 90–94.
[12] R. Winkelmann, Duration dependence and dispersion in count-data models, J. Bus. Econom. Statist. 13 (1995),
pp. 467–474.
[13] R. Winkelmann and K. Zimmermann, Count data models for demographic data, Math. Popul. Stud. 4 (1994),
pp. 205–221.
[14] R. Zhu and H. Joe, Modelling heavy-tailed count data using a generalised Poisson-inverse Gaussian family, Statist.
Probab. Lett. 79 (2009), pp. 1695–1703.

22 pages
Nihms 857131
No ratings yet
Nihms 857131
40 pages
Categorical-Notes-Ch1
No ratings yet
Categorical-Notes-Ch1
18 pages
New Statistical Process Control Charts For Overdispersed Count Data Based On The Bell Distribution
No ratings yet
New Statistical Process Control Charts For Overdispersed Count Data Based On The Bell Distribution
22 pages
Baltagi Poisson
No ratings yet
Baltagi Poisson
37 pages
Modeling
100% (1)
Modeling
300 pages
poisson
No ratings yet
poisson
54 pages
Nature: Measurement of Diversity
No ratings yet
Nature: Measurement of Diversity
1 page
PHYS 423 Lab Manual
No ratings yet
PHYS 423 Lab Manual
115 pages
Journal of Theoretical Biology: Abbas Moghimbeigi
No ratings yet
Journal of Theoretical Biology: Abbas Moghimbeigi
7 pages
MSD_Discrete_count_models_2
No ratings yet
MSD_Discrete_count_models_2
42 pages
Mean, Standard Deviation, and Counting Statistics
No ratings yet
Mean, Standard Deviation, and Counting Statistics
2 pages
Lecture 3
No ratings yet
Lecture 3
39 pages
Nair-DistributionStudentst-1941
No ratings yet
Nair-DistributionStudentst-1941
19 pages
UNIT3 Binomial
No ratings yet
UNIT3 Binomial
21 pages
The Infinite Gamma Poisson Feature Model
No ratings yet
The Infinite Gamma Poisson Feature Model
8 pages
Book Review: Regression Analysis of Count Data
No ratings yet
Book Review: Regression Analysis of Count Data
2 pages
Statistics 21march2018
No ratings yet
Statistics 21march2018
25 pages
STAT2120: Categorical Data Analysis Chapter 1: Introduction
No ratings yet
STAT2120: Categorical Data Analysis Chapter 1: Introduction
51 pages
TB Intro Endsem
No ratings yet
TB Intro Endsem
228 pages
Assignment On Application of Poisson
No ratings yet
Assignment On Application of Poisson
10 pages
Statistical Method Book For Lectures
No ratings yet
Statistical Method Book For Lectures
348 pages
LQ1 Notes
No ratings yet
LQ1 Notes
15 pages
Measures of Central Tendency and Dispersion: Mean or Average
No ratings yet
Measures of Central Tendency and Dispersion: Mean or Average
7 pages
Index_2017_Introductory-Statistics
No ratings yet
Index_2017_Introductory-Statistics
10 pages
Poisson Regression
No ratings yet
Poisson Regression
12 pages
Robert v. Hogg, Allen T. Craig - Introduction To M
No ratings yet
Robert v. Hogg, Allen T. Craig - Introduction To M
448 pages
An Introduction To Probability and Statistics - 2015 - Rohatgi - Subject Index
No ratings yet
An Introduction To Probability and Statistics - 2015 - Rohatgi - Subject Index
11 pages
II PU STATISTICSudupi
No ratings yet
II PU STATISTICSudupi
4 pages
To generate random sample, each of size 30 from normal, Gamma, Exponential and poisson dist
No ratings yet
To generate random sample, each of size 30 from normal, Gamma, Exponential and poisson dist
4 pages
Bayesian Decision Networks: Fundamentals and Applications
From Everand
Bayesian Decision Networks: Fundamentals and Applications
Fouad Sabry
No ratings yet
Bayesian Network: Fundamentals and Applications
From Everand
Bayesian Network: Fundamentals and Applications
Fouad Sabry
No ratings yet
High-Dimensional Covariance Estimation: With High-Dimensional Data
From Everand
High-Dimensional Covariance Estimation: With High-Dimensional Data
Mohsen Pourahmadi
No ratings yet
Stata Journal
No ratings yet
Stata Journal
192 pages
Time Series
No ratings yet
Time Series
61 pages
Prob NStat 3
No ratings yet
Prob NStat 3
10 pages
Parameters and Hyperparameters notes
No ratings yet
Parameters and Hyperparameters notes
2 pages
ML - UT1 Paper
No ratings yet
ML - UT1 Paper
2 pages
Online Syllabuses and Regulations (4 Years Curriculum)
No ratings yet
Online Syllabuses and Regulations (4 Years Curriculum)
3 pages
CIVE 5015: Research Data Analysis: Lecture 9: Correlation and Regression Analysis
No ratings yet
CIVE 5015: Research Data Analysis: Lecture 9: Correlation and Regression Analysis
17 pages
Statistical Hypothesis
No ratings yet
Statistical Hypothesis
70 pages
Lecture Notes 7.2 Estimating A Population Mean
No ratings yet
Lecture Notes 7.2 Estimating A Population Mean
5 pages
D2 Analysis or Cluster
No ratings yet
D2 Analysis or Cluster
15 pages
Solution Work 1.ea
No ratings yet
Solution Work 1.ea
31 pages
Slides On T and Chi Square Distributions
No ratings yet
Slides On T and Chi Square Distributions
22 pages
Short Notes _ Binomial Distribution __ Lakshya MHTCET 2025
No ratings yet
Short Notes _ Binomial Distribution __ Lakshya MHTCET 2025
3 pages
Lesson 1 Normal Curve Distribution
100% (1)
Lesson 1 Normal Curve Distribution
43 pages
Statistics Module 3
No ratings yet
Statistics Module 3
33 pages
Cost Elements Base Case Minimum: Summary Statistics
No ratings yet
Cost Elements Base Case Minimum: Summary Statistics
5 pages
ACJC - H1 - MATH - Prelim QP (2012)
No ratings yet
ACJC - H1 - MATH - Prelim QP (2012)
6 pages
GLM Multivariate Analysis (Presentation 4) Adv Stat
No ratings yet
GLM Multivariate Analysis (Presentation 4) Adv Stat
84 pages
Recsa Cahaya Erlangga 13716039 Tugas Manrek
No ratings yet
Recsa Cahaya Erlangga 13716039 Tugas Manrek
4 pages
Statistics and Probability2021 - Quarter 3 2
No ratings yet
Statistics and Probability2021 - Quarter 3 2
38 pages
Lecture4 ECO521 Web
No ratings yet
Lecture4 ECO521 Web
21 pages
Regression Analysis
No ratings yet
Regression Analysis
29 pages
DC - Unit 1 - Final
No ratings yet
DC - Unit 1 - Final
71 pages
Random Variable 2
100% (1)
Random Variable 2
25 pages
Multiple Regression Applications: Econ 140
No ratings yet
Multiple Regression Applications: Econ 140
26 pages
Particle Filter
No ratings yet
Particle Filter
16 pages
MAST20004 Probability: Student Number
No ratings yet
MAST20004 Probability: Student Number
19 pages
Package Dynlm': R Topics Documented
No ratings yet
Package Dynlm': R Topics Documented
6 pages