Vol. 9, 2024-05

A Continuous Normal Approximation to the Binomial Distribution

Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
[email protected]

doi: 10.13140/RG.2.2.26710.05447

Abstract
The binomial distribution is a well-known example of a discrete probability distribution. Only two
outcomes are possible for each independent trial in a binomial experiment. In this report, a
continuous approximation is proposed for describing the discrete binomial probability
function, which can then be used to represent an analogous continuous binomial variable. The
proposed approximation consists of a correction to the combinatorial number approximated
by Stirling’s equation, followed by a Taylor series approximation truncated after the
second-order term. As a result, a normal (Gaussian) distribution function is obtained. The error of
the proposed approximation decays with the number of trials considered. However, even for
small numbers of trials (e.g. fewer than 10), the approximation can be considered satisfactory.

Keywords
Bernoulli trials, Binomial Distribution, Combinatorial, Continuous Approximation, Factorial,
Gamma Function, Normal Distribution, Probability, Taylor Series, Stirling’s Approximation

1. Introduction

Any situation where multiple trials can be independently performed under similar conditions,
each of which has only two possible outcomes, $A$ and $B$¹, is denoted as a binomial
experiment [1], and the trials are denoted as Bernoulli trials. Binomial experiments are also
commonly denoted as combinatorial problems and have been observed and investigated since
ancient times (with the earliest written records dating from about the 5th century BC in ancient
India) [2].

¹ Typically, the binomial outcomes are denoted as success and failure, but the outcomes $A$ and $B$ will be
used here as a generalization.

Cite as: Hernandez, H. (2024). A Continuous Normal Approximation to the Binomial Distribution.
ForsChem Research Reports, 9, 2024-05, 1 - 25. Publication Date: 11/04/2024.

Let us denote by $p$ the invariant probability of obtaining outcome $A$ in an experiment, and by
$q$ the invariant probability of obtaining outcome $B$. Since only those two outcomes are possible,
it can be concluded that, for a single trial:

$$p + q = 1 \tag{1.1}$$
or equivalently,

$$q = 1 - p \tag{1.2}$$

In the case of multiple independent trials, let us say $n$ trials, the total probability of the
different outcomes for the multiple trials, which is the product of the probabilities of each
independent trial, will be given by:

$$(p + q)^n = 1 \tag{1.3}$$
Notice that the binomial power in Eq. (1.3) can be alternatively expressed using the Binomial
Theorem [3,4] as follows:

$$(p + q)^n = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} \tag{1.4}$$
where $\binom{n}{k}$ represents a combinatorial number defined as:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{1.5}$$
and $n!$ is the factorial number given by:

$$n! = \prod_{i=1}^{n} i \tag{1.6}$$
Notice that each term of the sum in Eq. (1.4) represents the probability $P(k)$ of obtaining
outcome $A$ in $k$ trials, and outcome $B$ in $n-k$ trials, in a binomial experiment of $n$ independent
trials. Then, considering Eq. (1.2), we obtain:

$$P(k) = \binom{n}{k} p^k (1-p)^{n-k} \tag{1.7}$$
representing the probability distribution of the binomial experiment, also known as the binomial
distribution. For $k < 0$ or $k > n$, the corresponding probability is zero.
The expected value of the binomial distribution is:

11/04/2024 ForsChem Research Reports Vol. 9, 2024-05


10.13140/RG.2.2.26710.05447 www.forschem.org / t.me/forschem (2 / 25)

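$$\begin{aligned}
E(k) &= \sum_{k=0}^{n} k\,P(k) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k} = \sum_{k=1}^{n} \frac{n!}{(k-1)!\,(n-k)!}\, p^k (1-p)^{n-k} \\
&= np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{n-k} = np \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j} (1-p)^{n-1-j} \\
&= np\,\big(p + (1-p)\big)^{n-1} = np
\end{aligned} \tag{1.8}$$
and its variance is:

$$\begin{aligned}
\mathrm{Var}(k) &= E(k^2) - \big(E(k)\big)^2 = \sum_{k=0}^{n} k^2 P(k) - \left(\sum_{k=0}^{n} k\,P(k)\right)^2 \\
&= \sum_{k=0}^{n} k^2 \binom{n}{k} p^k (1-p)^{n-k} - (np)^2 \\
&= np \sum_{j=0}^{n-1} (j+1) \binom{n-1}{j} p^{j} (1-p)^{n-1-j} - (np)^2 \\
&= np\,\big((n-1)p + 1\big) - (np)^2 \\
&= np\,(1-p)
\end{aligned} \tag{1.9}$$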
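
As a quick numerical check of Eqs. (1.7) to (1.9), the following minimal Python sketch evaluates the binomial probabilities directly and confirms that they add up to 1, with mean $np$ and variance $np(1-p)$. The function name `binomial_pmf` and the values $n = 12$, $p = 0.3$ are illustrative assumptions, not part of the original report.

```python
from math import comb, isclose


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability, Eq. (1.7); zero outside 0 <= k <= n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Illustrative values (assumptions, not taken from the report)
n, p = 12, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(probs)                                            # Eq. (1.3): 1
mean = sum(k * pk for k, pk in enumerate(probs))              # Eq. (1.8): n*p
var = sum(k**2 * pk for k, pk in enumerate(probs)) - mean**2  # Eq. (1.9)

print(total, mean, var)
print(isclose(mean, n * p), isclose(var, n * p * (1.0 - p)))
```

The sums run over all $n + 1$ possible values of $k$, so the check is exact up to floating-point rounding.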
In addition, the cumulative probability function of the binomial distribution is:

$$F(k) = \sum_{i=0}^{k} P(i) = \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} \tag{1.10}$$
The binomial probability distribution function is a discrete function. However, discrete
functions can be approximated by continuous functions, which in certain cases facilitates the
calculation of probabilities and other mathematical operations.
A direct transformation of Eq. (1.7) into a continuous probability density function is obtained by
replacing the factorial function by the corresponding gamma function ( ) [5] and considering a
local average of the density function [6] for , as follows:
( ) ( )
( ) ( )
( ) ( )
(1.11)
where

$$\Gamma(z) = \int_{0}^{\infty} t^{z-1} e^{-t}\,dt \tag{1.12}$$


such that
( ) ( )
(1.13)
and
( )
(1.14)
is a correction term required to satisfy a necessary condition of probability
density functions2:

∫ ( )

(1.15)
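
The idea of replacing the factorials by gamma functions can be sketched numerically as follows. This is a simplified illustration only: it evaluates the gamma-function form of the binomial expression at non-integer arguments and omits the local averaging and the correction term of Eq. (1.11). The function name `gamma_binomial` and the values used are assumptions introduced here.

```python
from math import comb, exp, isclose, lgamma, log


def gamma_binomial(x: float, n: int, p: float) -> float:
    """Binomial expression with each factorial z! replaced by Gamma(z + 1).

    Simplified sketch: no local averaging and no normalization term.
    Requires 0 < p < 1 and -1 < x < n + 1.
    """
    log_coeff = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_coeff + x * log(p) + (n - x) * log(1.0 - p))


n, p = 10, 0.4  # illustrative values (assumptions)

# At integer arguments the expression reproduces Eq. (1.7)
exact_at_4 = comb(n, 4) * p**4 * (1 - p) ** 6
print(gamma_binomial(4, n, p), exact_at_4)

# Between integers it interpolates smoothly
print(gamma_binomial(3.5, n, p), gamma_binomial(4.5, n, p))
```

Working with `lgamma` rather than the gamma function itself avoids overflow for large numbers of trials.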
In addition, the cumulative probability function of the continuous binomial distribution will be:

( )
( ) ∫ ( )
( ) ( )
(1.16)
Alternatively, the cumulative probability function can be expressed in terms of the regularized
incomplete beta function ( ) [7] as follows:

( ) { ( )

(1.17)
where

$$I_x(a, b) = \frac{\int_{0}^{x} t^{a-1} (1-t)^{b-1}\,dt}{\int_{0}^{1} t^{a-1} (1-t)^{b-1}\,dt} \tag{1.18}$$
In this report, an alternative continuous approximation is presented, which can then be related
to the normal distribution. The proposed approximation results from first applying
Stirling’s approximation (Section 2) to the factorial (or the gamma function) and to the combinatorial
number, then incorporating a correction to the combinatorial number approximation (Section
3), and finally using a Taylor series approximation for the logarithm of the probability function
(Section 4). The probability density and cumulative probability functions for a continuous
binomial variable are described in Section 5.

2
Alternatively, we may simply set the upper limit to and use a normalization factor to
fulfill condition (1.15).


2. Stirling’s Approximation

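The factorial of a non-negative integer ($n!$) can be expressed as the following product:

$$n! = \prod_{i=1}^{n} i \tag{2.1}$$
Thus, the logarithm of the factorial will be:

$$\ln(n!) = \sum_{i=1}^{n} \ln(i) \tag{2.2}$$
Assuming a continuous variable $x$ with $\Delta x = 1$, Eq. (2.2) can be considered as a midpoint rule
Riemann sum [8], approximately equivalent to the following integral:

$$\begin{aligned}
\ln(n!) &\approx \int_{1/2}^{n+1/2} \ln(x)\,dx = \Big[x\big(\ln(x) - 1\big)\Big]_{1/2}^{n+1/2} = \left(n + \tfrac{1}{2}\right)\left(\ln\left(n + \tfrac{1}{2}\right) - 1\right) - \tfrac{1}{2}\left(\ln\left(\tfrac{1}{2}\right) - 1\right) \\
&= \left(n + \tfrac{1}{2}\right)\ln(n) + \left(n + \tfrac{1}{2}\right)\ln\left(1 + \tfrac{1}{2n}\right) - n + \tfrac{1}{2}\ln(2)
\end{aligned} \tag{2.3}$$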
Figure 1 illustrates the performance of approximation (2.3) on the estimation of , for
selected values of between and . This approximation has a mean absolute error in the
estimation of of about for and about for .

Figure 1. Comparison between and the continuous approximation shown in Eq. (2.3). Selected
values of between and were considered.

Notice that the term $\left(n + \tfrac{1}{2}\right)\ln\left(1 + \tfrac{1}{2n}\right)$ can be approximated by a Laurent series expansion at
$n \to \infty$ as follows³:

³ See: https://www.wolframalpha.com/input?i=lim%28%28x%2B1%2F2%29*ln%281%2B1%2F%282x%29%29%29%2C+x%3Dinf


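$$\left(n + \tfrac{1}{2}\right)\ln\left(1 + \tfrac{1}{2n}\right) = \tfrac{1}{2} + \tfrac{1}{8n} + O\left(\tfrac{1}{n^2}\right) \tag{2.4}$$
and therefore, for large values of $n$:

$$\left(n + \tfrac{1}{2}\right)\ln\left(1 + \tfrac{1}{2n}\right) \approx \tfrac{1}{2} \tag{2.5}$$
Replacing this result in Eq. (2.3) yields:

$$\ln(n!) \approx \left(n + \tfrac{1}{2}\right)\ln(n) - n + \tfrac{1}{2} + \tfrac{1}{2}\ln(2) \tag{2.6}$$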
The performance of approximation (2.6) is illustrated in Figure 2. The mean absolute error of
this approximation increases to for and to for .

Figure 2. Comparison between and the continuous approximation shown in Eq. (2.6). Selected
values of between and were considered.

Figure 3. Comparison between and the continuous approximation shown in Eq. (2.7). Selected
values of between and were considered.

The advanced work on integrals, series and factorials by John Wallis, Abraham de
Moivre and James Stirling during the 17th and 18th centuries led to a more precise
approximation of $n!$, nowadays known as Stirling’s approximation [9,10]:


$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} \tag{2.7}$$
The performance of Stirling’s approximation (Eq. 2.7) is illustrated in Figure 3. While the
differences in performance between approximations (2.3), (2.6) and (2.7) are barely noticeable
in practice, Stirling’s approximation is better. The mean absolute error obtained for Stirling’s
approximation is for and for .
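
The relative performance of the three approximations to $\ln(n!)$ discussed in this section can be checked with a short numerical sketch. The formulas coded below are the midpoint-rule integral (2.3), its large-$n$ simplification (2.6) and Stirling’s approximation (2.7) in logarithmic form; the range $n = 1$ to $20$ and the function names are illustrative assumptions and do not reproduce the ranges used in Figures 1 to 3.

```python
from math import lgamma, log, pi


def ln_fact_exact(n: int) -> float:
    return lgamma(n + 1)  # ln(n!) via the log-gamma function


def ln_fact_midpoint(n: int) -> float:
    # Integral of ln(x) from 1/2 to n + 1/2 (midpoint-rule view of the sum)
    return (n + 0.5) * log(n + 0.5) - n + 0.5 * log(2.0)


def ln_fact_large_n(n: int) -> float:
    # Same integral with (n + 1/2) ln(1 + 1/(2n)) replaced by 1/2
    return (n + 0.5) * log(n) - n + 0.5 + 0.5 * log(2.0)


def ln_fact_stirling(n: int) -> float:
    # Logarithm of sqrt(2 pi n) (n/e)^n
    return (n + 0.5) * log(n) - n + 0.5 * log(2.0 * pi)


def mean_abs_error(approx, n_values) -> float:
    return sum(abs(approx(n) - ln_fact_exact(n)) for n in n_values) / len(n_values)


n_values = range(1, 21)  # illustrative range (assumption)
for name, f in [("midpoint", ln_fact_midpoint),
                ("large-n", ln_fact_large_n),
                ("Stirling", ln_fact_stirling)]:
    print(f"{name:9s} MAE = {mean_abs_error(f, n_values):.4f}")
```

The three expressions differ only in their constant and low-order terms, which is why their curves are hard to tell apart on a plot of $\ln(n!)$ even though their mean absolute errors differ.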

3. Combinatorial Number Correction

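Now, considering Stirling’s approximation, the logarithm of the combinatorial number
becomes:

$$\begin{aligned}
\ln\binom{n}{k} &= \ln(n!) - \ln(k!) - \ln\big((n-k)!\big) \\
&\approx \left(n + \tfrac{1}{2}\right)\ln(n) - n + \tfrac{1}{2}\ln(2\pi) - \left(k + \tfrac{1}{2}\right)\ln(k) + k - \tfrac{1}{2}\ln(2\pi) \\
&\quad - \left(n - k + \tfrac{1}{2}\right)\ln(n-k) + (n-k) - \tfrac{1}{2}\ln(2\pi) \\
&= \left(n + \tfrac{1}{2}\right)\ln(n) - \left(k + \tfrac{1}{2}\right)\ln(k) - \left(n - k + \tfrac{1}{2}\right)\ln(n-k) - \tfrac{1}{2}\ln(2\pi)
\end{aligned} \tag{3.1}$$
Thus,

$$\binom{n}{k} \approx \frac{n^{n}}{k^{k}\,(n-k)^{n-k}} \sqrt{\frac{n}{2\pi k\,(n-k)}} \tag{3.2}$$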
Figure 4 illustrates the behavior of approximation (3.2) for the estimation of the combinatorial
number, considering different values of $n$. As can be seen, the approximation works well for
intermediate values of $k$, but fails dramatically at the extremes of the interval, that is, when $k = 0$
or $k = n$. In fact, in the limits $k \to 0$ or $k \to n$, the approximated combinatorial number tends
to $\infty$, as the denominator tends to $0$. It can also be observed that the estimated value is always
slightly higher than the exact combinatorial number. Neglecting the extreme values, the mean
relative absolute error (MRE) observed for the estimation of the combinatorial number decays
with the number of trials, approximately as follows:

( )
(3.3)
This behavior of the MRE is illustrated in Figure 5.
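
The overestimation and the divergence at the extremes can be checked with a small sketch of the Stirling-based estimate of the combinatorial number, evaluated in logarithmic form to avoid overflow. The function name `comb_stirling` and the choice $n = 20$ are assumptions made for illustration; the extreme values $k = 0$ and $k = n$ are skipped because the expression diverges there.

```python
from math import comb, exp, log, pi


def comb_stirling(n: int, k: float) -> float:
    """Stirling-based estimate of the combinatorial number for 0 < k < n."""
    if not 0 < k < n:
        raise ValueError("estimate diverges at k = 0 and k = n")
    log_value = ((n + 0.5) * log(n) - (k + 0.5) * log(k)
                 - (n - k + 0.5) * log(n - k) - 0.5 * log(2.0 * pi))
    return exp(log_value)


n = 20  # illustrative number of trials (assumption)
ratios = [comb_stirling(n, k) / comb(n, k) for k in range(1, n)]

# Ratios slightly above 1: the estimate overshoots the exact value
print(min(ratios), max(ratios))

# Mean relative absolute error, neglecting the extremes
mre = sum(abs(r - 1.0) for r in ratios) / len(ratios)
print(mre)
```

The largest ratios occur next to the extremes ($k = 1$ and $k = n - 1$), where Stirling’s formula is applied to a factorial of a very small number.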


Figure 4. Comparison between $\binom{n}{k}$ and the continuous approximation shown in Eq. (3.2). Selected values
of the total number of trials ($n$).

Figure 5. Mean relative absolute error observed for Eq. (3.2) as a function of the number of trials ($n$) and
described by Eq. (3.3), neglecting the extreme values ($k = 0$ and $k = n$).


Since the main issue with Eq. (3.2) is that the denominator becomes zero for $k = 0$ or $k = n$,
the following empirical correction is proposed in this report:

( )
( )
√ ( ) ( )
(3.4)
where is a small positive constant ensuring that the denominator is never zero for .

Now, since ( ) ( ) , then

( )

√ ( )
(3.5)
This approximate relation is satisfied with less than error for using the following
constant value:

(3.6)
as illustrated in Figure 6.

Figure 6. Combinatorial number ( ) approximated by Eq. (3.4), using .

The performance of Eq. (3.4) for different number of trials is graphically shown in Figure 7. The
mean relative absolute error obtained with Eq. (3.4), including the extreme values, can be
described using the following empirical relation:

( )
(3.7)
The MRE for Eq. (3.4) is presented in Figure 8. This MRE is about one order of magnitude lower
than the value obtained for Eq. (3.2).


Figure 7. Comparison between ( ) and the continuous approximation shown in Eq. (3.4), using
. Selected values of the total number of trials ( ).

Figure 8. Mean relative absolute error observed for Eq. (3.4) as a function of the number of trials ( ) and
described by Eq. (3.7), including the extreme values ( ). .


Using approximation (3.4), the binomial probability distribution function (Eq. 1.7) becomes:

( ) √ ( ) ( ) ( )
( )( )
(3.8)
Now, the continuous binomial probability density function can be approximated as follows:

( )
( ) ( )√ ( ) ( ) ( )
( )( )

(3.9)
In this case, a constant value ( ) is used to normalize the probability density function, such
that:

( )
∫ √ ( ) ( ) ( )
( )( )
(3.10)
While it may seem reasonable to set and , it is not necessarily the case, as it
was shown in Eq. (1.11).
In terms of the probability logarithm, we obtain:

( ) ( ) ( ) ( ) ( ) ( ) ( )
( )
( ) ( )
(3.11)
Similarly, the probability density logarithm is:

( ) ( ) ( )
(3.12)

4. Taylor Series Approximation

Let us now use a Taylor Series approximation of ( ) about the expected value of the
distribution: ( ) . Assuming and as constants, let us consider the following function:

( ) ( ) ( ) ( ) ( ) ( ) ( )
( )
( ) ( )
(4.1)


Then,

( ) ( ( ) ( ) ( ) ( ))
( )
( )
( )( ( ) )
(4.2)
( ) ( )
( ) ( ) ( )
( ) ( )
(4.3)
( ) ( ) ( ( ) )
( )
( ) ( ) ( ( ) )( )
(4.4)
( )
( ) ( )
(4.5)
( ) ( )
( ( ) ) ( )
(4.6)
( )
( ) ( ) ( ) ( )
(4.7)
( ) ( )
( ( ) ) ( )
(4.8)
( ) ( ) ( )
( ) ( ) ( ) ( )
(4.9)
( ) ( )
( ( ) ) ( )
(4.10)
( ) ( ) ( )( ) ( ) ( ) ( )( ) ( )
( ) ( ) ( ) ( )
(4.11)
( ) ( ) ( ( )) ( ) ( )
( )
( ( ) ) ( )
(4.12)
Eq. (4.11) and (4.12) valid for .
Thus,


( ) ( )

( )( )

( ( ) ( ) ( ) ( ))
( )
( )
( )( ( ) )
( ) ( ( ) )
( ( ) )( )
( ) ( ) ( ( ) )( )
( ) ( )( )
∑( )( )
( )( ( ) ) ( )( )
(4.13)
Truncating after the second order term we obtain:

( ) ( ( ) ( ) ( ) ( ))
( )
( )
)( (( ) )
( ) ( ( ) )
( ( ) )( )
( ) ( ) ( ( ) )( )
( )
( )( )
( ( ) ) ( )
(4.14)
From which:
( )( ) ( ) ( )
( ) √ ( ) ( )
( )( ( ) ) ( ) ( )( )
( ) ( )( )
( )
( ) ( ( ) )( )
( ) ( )
( )
( )( )
( ( ) ) ( )

(4.15)
Then, for large values of we may neglect in certain sums, resulting in:
( )( ) ( )
( )
( ) ( )
( )
√ ( )
(4.16)
In the case of the uniform binomial distribution ($p = 1/2$), and large values of $n$:


$$P(k) \approx \sqrt{\frac{2}{\pi n}}\,\exp\left(-\frac{2\left(k - \frac{n}{2}\right)^{2}}{n}\right) \tag{4.17}$$
The approximation shown in Eq. (4.17) corresponds to a truncated normal distribution function
with $\mu = n/2$ and $\sigma = \sqrt{n}/2$. Notice that for a uniform binomial distribution we have exactly
(from Eq. 1.8 and 1.9) $E(k) = n/2$ and $\mathrm{Var}(k) = n/4$, which are consistent with the
estimates obtained with the second-order approximation for large $n$. Approximation (4.17) is
compared to the exact binomial probability values (Eq. 1.7) for selected values of $n$ in Figure 9.

Figure 9. Comparison between $P(k)$ and the continuous approximation shown in Eq. (4.17). Selected
values of the total number of trials ($n$).
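
The comparison for the uniform binomial distribution can be reproduced approximately with the sketch below, which evaluates a normal curve with mean $n/2$ and standard deviation $\sqrt{n}/2$ against the exact probabilities of Eq. (1.7) for $p = 1/2$. The function names and the selected values of $n$ are assumptions for illustration and are not those used in the report's figures.

```python
from math import comb, exp, pi, sqrt


def uniform_binomial_exact(k: int, n: int) -> float:
    return comb(n, k) / 2**n  # Eq. (1.7) with p = 1/2


def uniform_binomial_normal(k: float, n: int) -> float:
    # Normal curve with mean n/2 and standard deviation sqrt(n)/2
    return sqrt(2.0 / (pi * n)) * exp(-2.0 * (k - n / 2.0) ** 2 / n)


def mean_abs_error(n: int) -> float:
    errors = [abs(uniform_binomial_normal(k, n) - uniform_binomial_exact(k, n))
              for k in range(n + 1)]
    return sum(errors) / len(errors)


for n in (5, 10, 50, 100):  # illustrative values (assumption)
    print(n, mean_abs_error(n))
```

The mean absolute error is averaged over all $n + 1$ integer outcomes, so it shrinks both because the normal curve fits better and because individual probabilities become smaller as $n$ grows.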


Even though Eq. (4.17) was obtained assuming large values of $n$, the estimation of the
probability for a small number of trials is reasonably satisfactory. Figure 10 shows the effect of the
number of trials on the mean absolute error (MAE) in the estimation of the probability function
for the binomial uniform distribution. For , the obtained is already below ; for
, the drops below ; and for , the descends below . The
following empirical expression can be used to describe the error obtained:

( )
(4.18)

Figure 10. Mean absolute error observed for Eq. (4.17) as a function of the number of trials ($n$) and
described by Eq. (4.18).

Now, Eq. (4.16) can be rearranged as follows:


( ( ( )( ) ))
( ) (( )( ) )
( )
( )
√ ( )
(4.19)
where the first term represents the probability density function of a normal distribution with
mean value ( )( ) and standard deviation √ ( ), and the
(( )( ) )
second term is a correction factor with ( ) .
Thus, for large number of trials:
( ( ( )( ) ))
( )
( )
√ ( )
(4.20)


Figure 11. Behavior of $P(k)$ according to the continuous approximation shown in Eq. (4.20). Selected
values of the probability of outcome $A$ ($p$) and total number of trials ($n$).

Figure 12. Mean absolute error observed for Eq. (4.20) as a function of the number of trials ($n$) and
approximately described by Eq. (4.21).


Figure 11 shows the behavior of the probability function using approximation (4.20) compared
to the exact probabilities (Eq. 1.7) for different values of $n$ and $p$. The differences are quantified
using the mean absolute error (MAE), which can be approximately described by the following
expression:

( )
(4.21)
The behavior of the MAE with the number of trials is illustrated in Figure 12. As expected, the error
decreases with the number of trials. This approximation might be considered suitable for
, as the mean absolute error is below .

Of course, lower errors are expected for higher order Taylor approximations, but at the
expense of more complicated analytical expressions.

5. Continuous Binomial Distribution

A continuous formulation of the binomial distribution requires identifying the corresponding
probability density function. Considering approximation (4.20), a probability density function
can be obtained in analogy to Eq. (3.12) as follows:

( ) ( ) ( )
(5.1)
where, in this case

( )
( ( ( )( ) ))
( )

√ ( )

( ( )( ) ) ( ( )( ) )
( ) ( )
√ ( ) √ ( )
(5.2)
where represents the error function. The error function terms emerge due to the truncation
of the normal distribution between and [11].

Considering that the difference between two consecutive values of the discrete binomial
distribution is $1$, each integer value $k$ is the representative value (midpoint class
mark) of the interval between $k - 1/2$ and $k + 1/2$ [6]. So, instead of setting the limits at $0$
and $n$, we need to consider the whole intervals at the extremes, that is, limits at
$-1/2$ and $n + 1/2$.
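
To illustrate the half-unit intervals, the sketch below integrates a normal density over each unit interval centered at an integer and compares the result with the exact probability of that integer. It uses the large-$n$ normal parameters $np$ and $\sqrt{np(1-p)}$ rather than the corrected expressions of Eq. (5.2), which is an assumption made to keep the example short; the function names and the values of $n$ and $p$ are also illustrative.

```python
from math import comb, erf, sqrt


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def interval_probability(k: int, n: int, p: float) -> float:
    """Normal probability mass on [k - 1/2, k + 1/2], renormalized to the
    truncated support [-1/2, n + 1/2]."""
    mu, sigma = n * p, sqrt(n * p * (1.0 - p))
    mass = normal_cdf(k + 0.5, mu, sigma) - normal_cdf(k - 0.5, mu, sigma)
    support = normal_cdf(n + 0.5, mu, sigma) - normal_cdf(-0.5, mu, sigma)
    return mass / support


n, p = 20, 0.5  # illustrative values (assumptions)
approx = [interval_probability(k, n, p) for k in range(n + 1)]
exact = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

print(sum(approx))  # the n + 1 unit intervals cover the whole support
print(max(abs(a - e) for a, e in zip(approx, exact)))
```

Because the $n + 1$ unit intervals tile the truncated support exactly, the interval probabilities add up to 1, which would not happen with limits placed at the first and last integer values.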


Thus,
( ( ( )( ) ))
√ ( )
( )
( )
( )( ) ( )( )
( )
( ) ( )
√ ( ) √ ( )

(5.3)

The probability density function can be shifted by $1/2$ unit to transform the midpoint class mark
into an upper limit class mark, resulting in:
( ( ( )( ) ))
√ ( )
( )
( )
( )( ) ( )( )
( )
( ) ( )
√ ( ) √ ( )

(5.4)
On the other hand, the cumulative probability can be approximated as follows:

( ) ∫ ( )

( )( ) ( )( )
( ) ( )
√ ( ) √ ( )
( )( ) ( )( )
( )
( ) ( )
√ ( ) √ ( )

(5.5)

Figure 13 shows the behavior of approximation (5.5) compared to the exact cumulative
probability of the discrete distribution (Eq. 1.10) for selected values of $n$ and $p$. The mean
absolute error obtained can be approximately described by the empirical expression:

( )
(5.6)

presented graphically in Figure 14. Even for a small number of trials (e.g. ), the mean
absolute error observed in the estimation of the cumulative probability is already less than .


Figure 13. Behavior of $F(k)$ according to the continuous approximation shown in Eq. (5.5), compared
to the exact results (Eq. 1.10). Selected values of outcome probability ($p$) and total number of trials ($n$).

Figure 14. Mean absolute error observed for Eq. (5.5) as a function of the number of trials ($n$) and
approximately described by Eq. (5.6).


The expected value obtained with the approximate probability density shown in Eq. (5.4) is:

( ) ∫ ( )

( )( )

( ( ) ( ) ) (( )( ) ( ) )
( ) ( ) ( )

( )( ) ( ) ( ) ( )
( ) ( )
√ ( ) √ ( )
(5.7)
and for a large number of trials:

$$E(x) \approx np \tag{5.8}$$
The variance of the approximate probability density function is:

( ) ∫ ( ( )) ( )

( )
( ) √

( ( ) ( ) ) (( )( ) ( ) )
(( ( ) ( ) ) ( ) (( )( ) ( ) ) ( ) )

( )( ) ( ) ( ) ( )
( ) ( )
√ ( ) √ ( )
(5.9)
and for a large number of trials:

$$\mathrm{Var}(x) \approx np\,(1-p) \tag{5.10}$$
Eq. (5.8) and (5.10) correspond to the expected value and variance of the binomial distribution
(Eq. 1.8 and 1.9).
Finally, a type I standard continuous binomial random variable can be defined as follows [12]
(assuming large number of trials):

( )
√ ( )
(5.11)


with probability density (using the change of variable theorem) [13]:

( )( )
( )
√ ( )

( )
( )( ) ( )( )
( )
( ) ( )
√ ( ) √ ( )

(5.12)

Since we are assuming large number of trials, then the standard continuous binomial random
variable might be approximated by the standard normal random variable ( ):

( ) ( )

(5.13)

6. Summary

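The discrete probability distribution function for a binomial experiment considering $n$
independent trials and an invariant probability of outcome $A$ equal to $p$, is:

$$P(k) = \binom{n}{k} p^k (1-p)^{n-k} \tag{1.7}$$
with the corresponding discrete cumulative probability function:

$$F(k) = \sum_{i=0}^{k} P(i) = \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i} \tag{1.10}$$
where

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{1.5}$$
Assuming that the factorials in the combinatorial number can be approximated by Stirling’s
formula:

$$n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n} \tag{2.7}$$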


Then, we obtain:

$$\binom{n}{k} \approx \frac{n^{n}}{k^{k}\,(n-k)^{n-k}} \sqrt{\frac{n}{2\pi k\,(n-k)}} \tag{3.2}$$
Unfortunately, this expression diverges for $k = 0$ and $k = n$, as the denominator approaches a
value of $0$. For this reason, the following approximation is proposed:

( )
( )
√ ( ) ( )
(3.4)
where

(3.6)
Thus, the binomial probability distribution becomes (as a logarithm):

( ) ( ) ( ) ( ) ( ) ( ) ( )
( )
( ) ( )
(3.11)
which is valid for continuous values of and .

The previous expression can be further approximated using a Taylor series expansion at
, yielding:

( ) ( )

( )( )

( ( ) ( ) ( ) ( ))
( )
( )
( )( ( ) )
( ) ( ( ) )
( ( ) )( )
( ) ( ) ( ( ) )( )
( ) ( )( )
∑( )( )
( )( ( ) ) ( )( )
(4.13)
which can be truncated after the second power, resulting in:


( ( ( )( ) ))
( ) (( )( ) )
( )
( )
√ ( )
(4.19)
Now, considering large number of trials, the second factor approaches , and the probability
approximation simplifies into:
( ( ( )( ) ))
( )
( )
√ ( )
(4.20)
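In the case of the uniform binomial distribution ($p = 1/2$), and large values of $n$:

$$P(k) \approx \sqrt{\frac{2}{\pi n}}\,\exp\left(-\frac{2\left(k - \frac{n}{2}\right)^{2}}{n}\right) \tag{4.17}$$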
The probability function approximation can also be used to obtain the probability density
function of an equivalent continuous binomial random variable using:

( ) ( ) ( )
(3.12)
resulting in:
( ( ( )( ) ))
√ ( )
( )
( )
( )( ) ( )( )
( )
( ) ( )
√ ( ) √ ( )

(5.4)
with cumulative probability:

( ) ∫ ( )

( )( ) ( )( )
( ) ( )
√ ( ) √ ( )
( )( ) ( )( )
( )
( ) ( )
√ ( ) √ ( )

(5.5)


Finally, for ( )( ) we simply obtain a normal distribution with and


√ ( ):
( )
( )
( )
√ ( )
(6.1)
with cumulative probability:

( )
√ ( )
( )

(6.2)
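
As a closing illustration of the summary, the sketch below compares the large-$n$ normal limit, taken here with mean $np$ and standard deviation $\sqrt{np(1-p)}$, against the exact probability and cumulative probability functions (Eqs. 1.7 and 1.10). The half-unit shift in the cumulative function follows the midpoint class mark argument of Section 5. The function names and the values $n = 50$, $p = 0.3$ are assumptions, and the correction constant of Eq. (3.6) is not included.

```python
from math import comb, erf, exp, pi, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def normal_pmf_approx(k: float, n: int, p: float) -> float:
    var = n * p * (1.0 - p)
    return exp(-((k - n * p) ** 2) / (2.0 * var)) / sqrt(2.0 * pi * var)


def normal_cdf_approx(k: float, n: int, p: float) -> float:
    # Evaluated at the upper edge of the unit interval centered at k
    z = (k + 0.5 - n * p) / sqrt(2.0 * n * p * (1.0 - p))
    return 0.5 * (1.0 + erf(z))


n, p = 50, 0.3  # illustrative values (assumptions)
cumulative = 0.0
pmf_errors, cdf_errors = [], []
for k in range(n + 1):
    exact = binomial_pmf(k, n, p)
    cumulative += exact  # running sum gives Eq. (1.10)
    pmf_errors.append(abs(normal_pmf_approx(k, n, p) - exact))
    cdf_errors.append(abs(normal_cdf_approx(k, n, p) - cumulative))

print("mean abs error, probability:", sum(pmf_errors) / len(pmf_errors))
print("mean abs error, cumulative: ", sum(cdf_errors) / len(cdf_errors))
```

For asymmetric cases ($p \neq 1/2$) the residual error mostly reflects the skewness of the binomial distribution, which a symmetric normal curve cannot capture.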

Acknowledgment and Disclaimer

This report provides data, information and conclusions obtained by the author(s) as a result of original
scientific research, based on the best scientific knowledge available to the author(s). The main purpose
of this publication is the open sharing of scientific knowledge. Any mistake, omission, error or inaccuracy
published, if any, is completely unintentional.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-
for-profit sectors.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC
4.0). Anyone is free to share (copy and redistribute the material in any medium or format) or adapt
(remix, transform, and build upon the material) this work under the following terms:
• Attribution: Appropriate credit must be given, providing a link to the license, and indicating if
changes were made. This can be done in any reasonable manner, but not in any way that
suggests endorsement by the licensor.
• NonCommercial: This material may not be used for commercial purposes.

References

[1] Devore, J. L. (2016). Probability and Statistics for Engineering and the Sciences. 9th Edition.
Cengage Learning. Boston, MA. Section 3.4. The Binomial Probability Distribution. pp. 117-125.
https://www.cengage.com/c/probability-and-statistics-for-engineering-and-the-sciences-9e-devore/9781305251809PF/.
[2] García-García, J. I., Fernández Coronado, N. A., Arredondo, E. H., & Imilpán Rivera, I. A. (2022). The
binomial distribution: Historical origin and evolution of its problem situations. Mathematics, 10
(15), 2680. doi: 10.3390/math10152680.
[3] Coolidge, J. L. (1949). The story of the binomial theorem. The American Mathematical Monthly, 56
(3), 147-157. doi: 10.2307/2305028.


[4] Laudański, L. M. (2013). Between Certainty and Uncertainty. Springer-Verlag, Berlin-Heidelberg.
Chapter 4: Binomial Distribution, pp. 87-127. doi: 10.1007/978-3-642-25697-4_7.
[5] Fowler, D. (1996). The binomial coefficient function. The American Mathematical Monthly, 103 (1),
1-17. doi: 10.1080/00029890.1996.12004694.
[6] Hernandez, H. (2024). Local Average Probabilities of Randomistic Variables. ForsChem Research
Reports, 9, 2024-04, 1 - 25. doi: 10.13140/RG.2.2.35201.88162.
[7] Dutka, J. (1981). The incomplete beta function—a historical profile. Archive for History of Exact
Sciences, 11-29. https://www.jstor.org/stable/41133604.
[8] Hughes-Hallett, D., et al. (2017). Calculus: Single Variable. 7th Edition. John Wiley & Sons, Hoboken
NJ. Section 7.5 Numerical Methods for Definite Integrals, pp. 376-385.
https://www.wiley.com/en-ae/Calculus%3A+Single+Variable%2C+7th+Edition-p-9781119374268.
[9] Dutka, J. (1991). The early history of the factorial function. Archive for History of Exact Sciences,
225-249. doi: 10.1007/BF00389433.
[10] Pearson, K. (1924). Historical note on the origin of the normal curve of errors. Biometrika, 402-
404. doi: 10.2307/2331714.
[11] Hernandez, H. (2024). Constrained Randomistic Variables. ForsChem Research Reports, 9, 2024-
03, 1 - 25. doi: 10.13140/RG.2.2.12411.73764.
[12] Hernandez, H. (2022). Standard Deterministic, Standard Random, and Randomistic Variables.
ForsChem Research Reports, 7, 2022-06, 1 - 18. doi: 10.13140/RG.2.2.36316.87688.
[13] Hernandez, H. (2017). Multivariate Probability Theory: Determination of Probability Density
Functions. ForsChem Research Reports, 2, 2017-13, 1-13. doi: 10.13140/RG.2.2.28214.60481.
