Geometric Distribution

The geometric distribution is either of two discrete probability distributions: the distribution of the number X of Bernoulli trials needed to get one success, supported on the set { 1, 2, 3, ... }, or the distribution of the number Y = X − 1 of failures before the first success, supported on the set { 0, 1, 2, ... }. In the latter formulation the probability mass function is

Pr(Y = k) = (1 − p)^k p,   for k = 0, 1, 2, 3, ....
For example, suppose an ordinary die is thrown repeatedly until the first time a "1" appears. The probability
distribution of the number of times it is thrown is supported on the infinite set { 1, 2, 3, ... } and is a
geometric distribution with p = 1/6.
Definitions
Consider a sequence of trials, where each trial has only two possible outcomes (designated failure and
success). The probability of success is assumed to be the same for each trial. In such a sequence of trials,
the geometric distribution is useful for modeling the number of failures before the first success. Unlike the binomial distribution, which has a fixed number of trials, the experiment continues for an indefinite number of trials until the first success occurs. The distribution gives the probability that there are zero failures before the first success, one failure before the first success, two failures before the first success, and so on.
The geometric distribution is an appropriate model if the following assumptions are true: the phenomenon being modeled is a sequence of independent trials; each trial has only two possible outcomes, designated success and failure; and the probability of success, p, is the same for every trial.
If these conditions are true, then the geometric random variable Y is the count of the number of failures before the first success. The possible number of failures before the first success is 0, 1, 2, 3, and so on.
An alternative formulation is that the geometric random variable X is the total number of trials up to and including the first success, and the number of failures is X − 1.
The general formula for the probability of k failures before the first success, where the probability of success is p and the probability of failure is q = 1 − p, is

Pr(Y = k) = q^k p = (1 − p)^k p,   for k = 0, 1, 2, 3, ....
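As a quick numerical check, this formula can be evaluated directly and compared with R's built-in dgeom function (described under Computational methods below); a minimal sketch using the die example's p = 1/6:

p <- 1/6; q <- 1 - p   # rolling a die until the first "1" appears
k <- 0:5
q^k * p                # direct evaluation of Pr(Y = k) = (1 - p)^k p
dgeom(k, p)            # the same values from R's built-in geometric PMF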
E1) A doctor is seeking an antidepressant for a newly diagnosed patient. Suppose that, of the available anti-
depressant drugs, the probability that any particular drug will be effective for a particular patient is p = 0.6.
What is the probability that the first drug found to be effective for this patient is the first drug tried, the
second drug tried, and so on? What is the expected number of drugs that will be tried to find one that is
effective?
The probability that the first drug works. There are zero failures before the first success (Y = 0). The probability Pr(zero failures before first success) is simply the probability that the first drug works:

Pr(Y = 0) = p = 0.6.
The probability that the first drug fails, but the second drug works. There is one failure before the first success (Y = 1). The probability for this sequence of events is Pr(first drug fails) × Pr(second drug succeeds), which is given by

Pr(Y = 1) = (1 − p)p = 0.4 × 0.6 = 0.24.
The probability that the first drug fails, the second drug fails, but the third drug works. There are two failures before the first success (Y = 2). The probability for this sequence of events is Pr(first drug fails) × Pr(second drug fails) × Pr(third drug succeeds), which is given by

Pr(Y = 2) = (1 − p)^2 p = 0.4 × 0.4 × 0.6 = 0.096.
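These values, and the expected counts asked for in E1, can be checked in R (a short sketch):

p <- 0.6
dgeom(0:2, p)   # Pr(Y = 0), Pr(Y = 1), Pr(Y = 2): 0.600, 0.240, 0.096
(1 - p) / p     # expected failures before the first effective drug: 0.67
1 / p           # expected number of drugs tried, including the effective one: 1.67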
E2) A newlywed couple plans to have children and will continue until the first girl. What is the probability
that there are zero boys before the first girl, one boy before the first girl, two boys before the first girl, and
so on?
The probability of having a girl (success) is p = 0.5 and the probability of having a boy (failure) is q = 1 − p = 0.5. The probability of k boys before the first girl is therefore

Pr(Y = k) = q^k p = 0.5^(k+1),

that is, 0.5, 0.25, 0.125, and so on for k = 0, 1, 2, ....
Properties
The expected value of the number X of independent trials needed to get the first success, and the variance of this geometrically distributed random variable, are

E(X) = 1/p,   Var(X) = (1 − p)/p^2.

Similarly, the expected value and variance of the geometrically distributed random variable Y = X − 1 (see the definition of the distribution above) are

E(Y) = (1 − p)/p,   Var(Y) = (1 − p)/p^2.
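These formulas are easy to confirm by simulation in R (a sketch; the sample size and seed are arbitrary):

set.seed(1)
p <- 0.6
y <- rgeom(1e5, p)      # Y: failures before the first success
mean(y); (1 - p) / p    # sample mean vs. E(Y) = 0.667
var(y); (1 - p) / p^2   # sample variance vs. Var(Y) = 1.111
mean(y + 1); 1 / p      # X = Y + 1: sample mean vs. E(X) = 1.667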
Proof
Expected value of X
Consider the expected value E(X) of X as above, i.e. the average number of trials until a success. On the first trial we either succeed with probability p, or we fail with probability 1 − p. If we fail, the mean number of remaining trials until a success is identical to the original mean; this follows from the fact that all trials are independent. From this we get the formula

E(X) = p · 1 + (1 − p)(1 + E(X)),

which, when solved for E(X), gives E(X) = 1/p.
Expected value of Y
That the expected value of Y as above is (1 − p)/p can be shown in the following way. Writing q = 1 − p,

E(Y) = Σ_{k=0}^∞ k p q^k = p q Σ_{k=0}^∞ k q^(k−1) = p q (d/dq) Σ_{k=0}^∞ q^k = p q (d/dq) [1/(1 − q)] = p q/(1 − q)^2 = q/p = (1 − p)/p.

The interchange of summation and differentiation is justified by the fact that convergent power series converge uniformly on compact subsets of the set of points where they converge.
Let μ = (1 − p)/p be the expected value of Y. Then the cumulants κ_n of the probability distribution of Y satisfy the recursion

κ_(n+1) = μ(μ + 1) dκ_n/dμ.
E3) A patient is waiting for a suitable matching kidney donor for a transplant. If the probability that a
randomly selected donor is a suitable match is p = 0.1, what is the expected number of donors who will be
tested before a matching donor is found?
With p = 0.1, the mean number of failures before the first success is E(Y) = (1 − p)/p =(1 − 0.1)/0.1 = 9.
For the alternative formulation, where X is the number of trials up to and including the first success, the
expected value is E(X) = 1/p = 1/0.1 = 10.
For E1 above, with p = 0.6, the mean number of failures before the first success is E(Y) = (1 − p)/p = (1 − 0.6)/0.6 ≈ 0.67.
Higher-order moments
The moments for the number of failures before the first success are given by

E(Y^n) = Σ_{k=0}^∞ (1 − p)^k p · k^n = p Li_(−n)(1 − p)   (for n ≠ 0),

where Li_(−n) is the polylogarithm function.
General properties
The probability-generating functions of X and Y are, respectively,

G_X(s) = ps/(1 − qs),   G_Y(s) = p/(1 − qs),   for |s| < q^(−1).
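These generating functions can be verified numerically by truncating the defining series (a sketch; p and s are arbitrary):

p <- 0.6; q <- 1 - p; s <- 0.8
k <- 0:200                    # truncate the series; the terms decay geometrically
sum(s^k * dgeom(k, p))        # series value of G_Y(s)
p / (1 - q * s)               # closed form: 0.88235...
sum(s^(k + 1) * dgeom(k, p))  # series value of G_X(s), since X = Y + 1
p * s / (1 - q * s)           # closed form: 0.70588...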
Like its continuous analogue (the exponential distribution), the geometric distribution is memoryless. That is, for every m, n ∈ {0, 1, 2, ...},

Pr(X > m + n | X > m) = Pr(X > n).
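Memorylessness is easy to confirm numerically from the survival function of X (a sketch; p, m, and n are arbitrary):

p <- 0.3; m <- 4; n <- 6
# P(X > k) = (1 - p)^k; R's pgeom counts failures, so shift the argument by one
surv <- function(k) pgeom(k - 1, p, lower.tail = FALSE)
surv(m + n) / surv(m)   # P(X > m + n | X > m)
surv(n)                 # P(X > n): identical, 0.7^6 = 0.117649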
Among all discrete probability distributions supported on {1, 2, 3, ... } with given expected
value μ, the geometric distribution X with parameter p = 1/μ is the one with the largest
entropy.[2]
The geometric distribution of the number Y of failures before the first success is infinitely
divisible, i.e., for any positive integer n, there exist independent identically distributed
random variables Y1, ..., Yn whose sum has the same distribution that Y has. These will not
be geometrically distributed unless n = 1; they follow a negative binomial distribution.
The decimal digits of the geometrically distributed random variable Y are a sequence of independent (and not identically distributed) random variables. For example, the hundreds digit D has this probability distribution:

Pr(D = d) = q^(100d) (1 − q^100)/(1 − q^1000),   d = 0, 1, ..., 9,

where q = 1 − p, and similarly for the other digits, and, more generally, similarly for numeral systems with other bases than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
Golomb coding is the optimal prefix code for the geometric discrete distribution.[3]
The sum of two independent Geo(p) distributed random variables is not a geometric distribution.[1]
Related distributions
The geometric distribution Y is a special case of the negative binomial distribution, with r = 1. More generally, if Y_1, ..., Y_r are independent geometrically distributed variables with parameter p, then the sum

Z = Σ_{m=1}^r Y_m

follows a negative binomial distribution with parameters r and p; it counts the total number of failures before the r-th success.[4]
Suppose 0 < r < 1, and for k = 1, 2, 3, ... the random variable X_k has a Poisson distribution with expected value r^k/k. Then

Σ_{k=1}^∞ k X_k

has a geometric distribution taking values in the set {0, 1, 2, ...}, with expected value r/(1 − r).
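This construction can be checked by simulation, truncating the infinite sum at a point where the Poisson means r^k/k are negligible (a sketch; r and the truncation point K are arbitrary):

set.seed(3)
r <- 0.4; K <- 200                 # r^K/K is vanishingly small beyond K
sim <- replicate(1e4, sum((1:K) * rpois(K, r^(1:K) / (1:K))))
mean(sim); r / (1 - r)             # sample mean vs. theoretical 0.667
mean(sim == 0); 1 - r              # Pr(sum = 0) vs. geometric p = 1 - r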
If E is an exponentially distributed random variable with parameter λ, then ⌊E⌋, where ⌊·⌋ is the floor (or greatest integer) function, is a geometrically distributed random variable with parameter p = 1 − e^(−λ) (thus λ = −ln(1 − p)[6]) and taking values in the set {0, 1, 2, ...}. This can be used to generate geometrically distributed pseudorandom numbers by first generating exponentially distributed pseudorandom numbers from a uniform pseudorandom number generator: ⌊ln(U)/ln(1 − p)⌋ is geometrically distributed with parameter p, if U is uniformly distributed in [0, 1].
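A sketch of this generator in R, checked against the theoretical distribution (the parameter and sample size are arbitrary):

set.seed(42)
p <- 0.25
u <- runif(1e5)                  # uniform pseudorandom numbers on [0, 1]
y <- floor(log(u) / log(1 - p))  # geometrically distributed with parameter p
mean(y); (1 - p) / p             # sample mean vs. E(Y) = 3
table(y)[1:4] / 1e5              # empirical Pr(Y = 0), ..., Pr(Y = 3)
dgeom(0:3, p)                    # theoretical 0.2500, 0.1875, 0.1406, 0.1055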
If p = 1/n and X is geometrically distributed with parameter p, then the distribution of X/n approaches an exponential distribution with expected value 1 as n → ∞, since

Pr(X/n > a) = Pr(X > na) = (1 − p)^⌊na⌋ = (1 − 1/n)^⌊na⌋ → e^(−a)   as n → ∞.

More generally, if p = λ/n, where λ is a parameter, then as n → ∞ the distribution of X/n approaches an exponential distribution with rate λ:

Pr(X/n > a) = (1 − λ/n)^⌊na⌋ → e^(−λa),

therefore the distribution function of X/n converges to 1 − e^(−λa), which is that of an exponential random variable with rate λ.
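A numerical illustration of this limit (a sketch; λ, n, and a are chosen arbitrarily):

lambda <- 2; n <- 1000; p <- lambda / n; a <- 0.5
pgeom(n * a - 1, p, lower.tail = FALSE)  # P(X > na) = (1 - p)^(na): ~0.3675
exp(-lambda * a)                         # exponential tail e^(-lambda a): ~0.3679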
Statistical inference
Parameter estimation
For both variants of the geometric distribution, the parameter p can be estimated by equating the expected
value with the sample mean. This is the method of moments, which in this case happens to yield maximum
likelihood estimates of p.[7][8]
Specifically, for the first variant let k = k_1, ..., k_n be a sample where k_i ≥ 1 for i = 1, ..., n. Then p can be estimated as

p̂ = n / Σ_{i=1}^n k_i,

the reciprocal of the sample mean.
In Bayesian inference, the Beta distribution is the conjugate prior distribution for the parameter p. If this parameter is given a Beta(α, β) prior, then the posterior distribution is

p ~ Beta(α + n, β + Σ_{i=1}^n (k_i − 1)).

The posterior mean E[p] approaches the maximum likelihood estimate as α and β approach zero.
In the alternative case, let k_1, ..., k_n be a sample where k_i ≥ 0 for i = 1, ..., n. Then p can be estimated as

p̂ = n / (n + Σ_{i=1}^n k_i).

The posterior distribution of p given a Beta(α, β) prior is in this case

p ~ Beta(α + n, β + Σ_{i=1}^n k_i).

Again the posterior mean E[p] approaches the maximum likelihood estimate as α and β approach zero.
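A small simulation of the failure-count estimator and its Bayesian counterpart (a sketch; the sample size, seed, and uniform Beta(1, 1) prior are arbitrary choices):

set.seed(7)
p <- 0.3
k <- rgeom(500, p)               # sample of failure counts, k_i >= 0
n <- length(k)
n / (n + sum(k))                 # method-of-moments / ML estimate of p
a <- 1; b <- 1                   # Beta(1, 1) prior
(a + n) / (a + n + b + sum(k))   # posterior mean under Beta(a + n, b + sum(k))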
Computational methods
The R function dgeom(k, prob) calculates the probability that there are k failures before the first
success, where the argument "prob" is the probability of success on each trial.
For example,
dgeom(0,0.6) = 0.6
dgeom(1,0.6) = 0.24
R uses the convention that k is the number of failures, so that the number of trials up to and including the
first success is k + 1.
The following R code creates a graph of the geometric distribution from Y = 0 to 10, with p = 0.6.
Y <- 0:10
plot(Y, dgeom(Y, 0.6), type = "h", xlab = "Y", ylab = "Pr(Y)")
The geometric distribution, for the number of failures before the first success, is a special case of the
negative binomial distribution, for the number of failures before s successes.
For example, using Excel's NEGBINOM.DIST function with a single required success,

NEGBINOM.DIST(0, 1, 0.6, FALSE) = 0.6
NEGBINOM.DIST(1, 1, 0.6, FALSE) = 0.24

Like R, Excel uses the convention that k is the number of failures, so that the number of trials up to and including the first success is k + 1.
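The same correspondence can be checked in R with dnbinom, which parallels the Excel call above (a quick sketch):

k <- 0:5
dgeom(k, 0.6)                      # failures before the first success
dnbinom(k, size = 1, prob = 0.6)   # negative binomial with s = 1: identical values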
See also
Hypergeometric distribution
Coupon collector's problem
Compound Poisson distribution
Negative binomial distribution
References
1. Dekking, Michel; et al. (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. London: Springer. pp. 48–50, 61–62, 152. ISBN 9781852338961. OCLC 262680588 (https://www.worldcat.org/oclc/262680588).
2. Park, Sung Y.; Bera, Anil K. (June 2009). "Maximum entropy autoregressive conditional
heteroskedasticity model". Journal of Econometrics. 150 (2): 219–230.
doi:10.1016/j.jeconom.2008.12.014 (https://doi.org/10.1016%2Fj.jeconom.2008.12.014).
3. Gallager, R.; van Voorhis, D. (March 1975). "Optimal source codes for geometrically
distributed integer alphabets (Corresp.)". IEEE Transactions on Information Theory. 21 (2):
228–230. doi:10.1109/TIT.1975.1055357 (https://doi.org/10.1109%2FTIT.1975.1055357).
ISSN 0018-9448 (https://www.worldcat.org/issn/0018-9448).
4. Pitman, Jim (1993). Probability. Springer. p. 372.
5. Ciardo, Gianfranco; Leemis, Lawrence M.; Nicol, David (1 June 1995). "On the minimum of
independent geometrically distributed random variables" (https://dx.doi.org/10.1016/0167-71
52%2894%2900130-Z). Statistics & Probability Letters. 23 (4): 313–326. doi:10.1016/0167-
7152(94)00130-Z (https://doi.org/10.1016%2F0167-7152%2894%2900130-Z).
S2CID 1505801 (https://api.semanticscholar.org/CorpusID:1505801).
6. "Wolfram-Alpha: Computational Knowledge Engine" (http://www.wolframalpha.com/input/?i=
inverse+p+=+1+-+e%5E-l). www.wolframalpha.com.
7. Casella, George; Berger, Roger L. (2002). Statistical Inference (2nd ed.). pp. 312–315. ISBN 0-534-24312-6.
8. "MLE Examples: Exponential and Geometric Distributions Old Kiwi - Rhea" (https://www.proj
ectrhea.org/rhea/index.php/MLE_Examples:_Exponential_and_Geometric_Distributions_Ol
d_Kiwi). www.projectrhea.org. Retrieved 2019-11-17.
9. "3. Conjugate families of distributions" (http://halweb.uc3m.es/esp/Personal/personas/mwipe
r/docencia/English/PhD_Bayesian_Statistics/ch3_2009.pdf) (PDF). Archived (https://web.arc
hive.org/web/20100408092905/http://halweb.uc3m.es/esp/Personal/personas/mwiper/docen
cia/English/PhD_Bayesian_Statistics/ch3_2009.pdf) (PDF) from the original on 2010-04-08.
External links
Geometric distribution (http://mathworld.wolfram.com/GeometricDistribution.html) on
MathWorld.