Predicting Extinction or Explosion in a Galton-Watson Branching Process with Power Series Offspring Distribution¹

Peter Guttorp² and Michael D. Perlman³
University of Washington

Abstract
Extinction is certain in a Galton-Watson (GW) branching process if the off-
spring mean μ ≤ 1, whereas explosion is possible but not certain if μ > 1.
Discriminating between these two possibilities is a well-studied hypothesis-
testing problem. However, deciding whether extinction or explosion will oc-
cur for the current realization of the process is a prediction problem. This can
be formulated as a different testing problem by considering the conditional
distributions of the process given extinction and explosion respectively. For
power series offspring distributions, fixed-sample and sequential parametric
tests are presented for the prediction problem and illustrated with data on
the spread of epidemics and the populations of endangered species.

¹ Key words and phrases: Galton-Watson branching process; extinction; explosion; subcritical; supercritical; stochastic ordering; prediction; hypothesis testing; least favorable distribution; sequential probability ratio test; epidemic; endangered species.
² [email protected]. Research supported in part by National Science Foundation Grant DMS-1106862.
³ [email protected]. Research supported in part by U.S. Department of Defense Grant H98230-10-C-0263/0000 P0004.

Preprint submitted to Elsevier April 15, 2015

© 2015. This manuscript version is made available under the Elsevier user license
http://www.elsevier.com/open-access/userlicense/1.0/
1. Introduction: the 2012 pertussis outbreak in Washington State
In 2011 the weekly numbers of new pertussis (whooping cough) cases in
Washington State remained fairly constant, but in 2012 the numbers in-
creased rapidly (Figure 1, CDC (2012)). Faced with the possibility of a
pandemic, the governor declared a state-wide health emergency in Week 14
and an inoculation/quarantine program was begun.

Figure 1: Weekly counts of new pertussis cases in Washington state.

The spread of an epidemic, at least in its initial stages, can be modeled as a classical Galton-Watson (GW) branching process, cf. §2. The question of predicting extinction or explosion is commonly formulated as that of testing μ ≤ 1 (subcriticality/criticality) vs. μ > 1 (supercriticality), where μ denotes the mean number of infected offspring per individual case; cf. Becker (1974), Heyde (1979), Scott (1987).⁴ Guttorp and Perlman (2015) use a decision-theoretic analysis to show, however, that this problem is more complex than previous literature suggests and that the basis of a standard test procedure is somewhat dubious.
Fortunately, this testing problem usually is not the one of actual interest,
because a supercritical process still may terminate with positive probability.

⁴ Basawa and Scott (1976) and Sweeting (1978) treat a related testing problem for the supercritical case.

Of more interest is the problem of predicting whether the current realization
of a non-terminated process will terminate or explode.
In §5-6 this prediction problem is formulated as a different hypothesis-
testing problem based on the conditional distributions of the process given
eventual extinction and explosion respectively. Unlike the original testing
problem, this prediction problem often has relatively simple solutions in the
fixed-sample (§5) and sequential sample (§6) cases, the latter based on the
classical Wald sequential probability ratio test (SPRT), see §6. Using this
procedure, explosion might have been predicted for the 2012 pertussis out-
break as early as Week 3; see Example 7.2.
Like the authors noted above who treated the original testing problem,
we assume a parametric model for the offspring distribution, a power series
offspring distribution (psod); see §3. The conditional distributions of a GW
process given (eventual) extinction or explosion are given in §2, then special-
ized in §3 to the psod case. If the psod satisfies two total positivity conditions,
these conditional distributions possess the stochastic monotonicity properties
needed to justify our fixed-n and sequential prediction methods; see §4. Ya-
glom’s (1947) well-known exponential approximation for the distribution of
the population size is extended and sharpened in §5.3 and §5.4.
2. Conditional processes derived from a GW branching process
The Galton-Watson branching process is a discrete-time Markov chain that
describes the growth or decline of a population that reproduces by simple
branching, or splitting. Applications include nuclear chain reactions, epi-
demics, and the population size of endangered species. The classic reference
is Harris (1963, Ch. I); also see Karlin (1966), Feller (1968), Athreya and
Ney (1972), Jagers (1975), Taylor and Karlin (1984), Guttorp (1991).
For each n = 0, 1, 2, ... let $X_n$ denote the population size at generation n; assume that $X_0 = x_0 \ge 1$ is known. At generation n = 0 the i-th individual is replaced by a random number $\xi_i^{(1)} \overset{d}{=} \xi$ of first-generation offspring, where the offspring random variable (rv) $\xi \equiv \xi_p$ has probability distribution $p \equiv (p_0, p_1, p_2, \dots)$ on {0, 1, 2, ...}. The i-th individual in generation n − 1 similarly is replaced by a random number $\xi_i^{(n)} \overset{d}{=} \xi$ of n-th generation offspring independently of its siblings. Thus the population size in the n-th generation satisfies

$$X_n = \xi_1^{(n)} + \cdots + \xi_{X_{n-1}}^{(n)}, \qquad n \ge 1, \tag{1}$$

where $\xi_1^{(n)}, \dots, \xi_{X_{n-1}}^{(n)}$ are iid rvs, each $\overset{d}{=} \xi$. We assume that each $p_k < 1$ so the process is not deterministic, and that $p_0 > 0$ so extinction is possible.
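The recursion (1) can be sketched directly in code. The following is a minimal illustration, not part of the paper: the function name `simulate_gw`, the finite-support offspring pmf, and the seed are all assumptions chosen for the example.

```python
import random

def simulate_gw(x0, n_generations, offspring_pmf, rng):
    """Return [X_0, X_1, ..., X_n] for one realization of recursion (1).

    offspring_pmf is a list (p_0, p_1, ..., p_m) on {0, ..., m}; this
    finite support is an assumption made only for this sketch.
    """
    ks = list(range(len(offspring_pmf)))
    path = [x0]
    for _ in range(n_generations):
        parents = path[-1]
        # X_n is the sum of X_{n-1} iid offspring counts; the empty sum is 0,
        # so the state 0 (extinction) is absorbing.
        children = sum(rng.choices(ks, weights=offspring_pmf, k=parents))
        path.append(children)
    return path

rng = random.Random(2015)
# Hypothetical offspring law (p0, p1, p2) = (0.25, 0.25, 0.5), mean 1.25.
print(simulate_gw(x0=3, n_generations=10, offspring_pmf=[0.25, 0.25, 0.5], rng=rng))
```

Repeated calls with different seeds give independent realizations, some of which terminate at 0 while others grow.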
Denote the probability generating function (pgf) of the offspring distribution by

$$\varphi(s) \equiv \varphi_p(s) = E_p(s^{\xi}) = \sum_{k=0}^{\infty} p_k s^k, \qquad s \ge 0, \tag{2}$$

and let $1 \le \rho \equiv \rho_p \le \infty$ be its radius of convergence. Note that φ(1) = 1.

Because φ(s) is convex and p1 < 1, the equation

φ(s) = s (3)

has either one finite root or two distinct finite roots in (0, ρ], one of which
must be 1. If (3) has one finite root in (0, ρ] denote it by u ≡ up ; if (3) has
two distinct finite roots in (0, ρ] denote them by u ≡ up and v ≡ vp , where
0 < u < v ≤ ρ.
If $x_0 = 1$, the pgf of $X_n$ is the n-th functional iterate of φ, denoted by $\varphi_n$. For $x_0 \ge 1$ the pgf of $X_n$ is $\varphi_n^{x_0} \equiv (\varphi_n)^{x_0}$. Either extinction ($X_n = 0$ for some n ≥ 1) or explosion ($X_n \to \infty$) must occur; their probabilities are $u^{x_0}$ and $1 - u^{x_0}$ respectively.
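Since the pgf of $X_n$ is $(\varphi_n)^{x_0}$, the probability of extinction by generation n is $\varphi_n(0)^{x_0}$, and iterating φ from 0 converges to the root u. The sketch below uses this fixed-point iteration; the function name and the example offspring law are assumptions for illustration only.

```python
def extinction_probability(pgf, tol=1e-12, max_iter=100_000):
    """Approximate the smallest root u of pgf(s) = s by iterating s <- pgf(s) from 0.

    The iterates are phi_n(0) = Pr[X_n = 0 | X_0 = 1]; convergence is slow
    near criticality, hence the generous max_iter.
    """
    s = 0.0
    for _ in range(max_iter):
        s_next = pgf(s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s

# Hypothetical supercritical offspring law (p0, p1, p2) = (0.2, 0.3, 0.5), mean 1.3.
pgf = lambda s: 0.2 + 0.3 * s + 0.5 * s * s
u = extinction_probability(pgf)
x0 = 3
print(u, u ** x0, 1 - u ** x0)   # u, Pr[extinction], Pr[explosion]
```

For this example the equation φ(s) = s is the quadratic 0.5s² − 0.7s + 0.2 = 0, whose roots are 0.4 and 1, so the iteration should settle near u = 0.4.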
Denote the mean of the offspring distribution by μ ≡ μp = E(ξ); then
μ = φ′(1). The GW process X ≡ Xp and its pgf φ ≡ φp are called subcritical
(resp., critical, supercritical) if μ < 1 (μ = 1, μ > 1); see Figure 2. In the
subcritical case, u = 1 and v may or may not exist, see §2. In the critical
case, u = 1 and v does not exist. In the supercritical case 0 < u < v = 1, so
both extinction and explosion occur with positive probability.
For a subcritical GW process, if v exists then p, X, and φ are called extendable; in this case 1 = u < v ≤ ρ (see Figure 2). If ρ = 1 then v > 1 cannot exist so φ is not extendable, while if ρ = ∞ then φ is extendable since it grows at a quadratic rate or faster hence eventually crosses the 45° line a second time beyond 1 = u. If 1 < ρ < ∞ then φ is extendable iff φ(ρ) ≥ ρ.
Definition 2.1. For a supercritical GW process X, define the conditional processes

$$\dot{X} \equiv X \mid \text{extinction}, \tag{4}$$

$$\ddot{X} \equiv X \mid \text{explosion}. \tag{5}$$

If X is subcritical or critical, define Ẋ = X.

[Figure 2 plots: two panels, "Supercritical" and "Subcritical, extendable", each showing φ(s) against s for 0 ≤ s ≤ 2.5, with the roots u and v marked.]


Figure 2: The duality between supercritical and extendable subcritical pgfs.

Proposition 2.1. The set of supercritical GW processes conditional on extinction coincides with the set of subcritical extendable GW processes.
Proof. If X is supercritical it is well known⁵ that Ẋ is a subcritical GW process with offspring pgf

$$\dot{\varphi}(s) = \frac{\varphi(us)}{u} \tag{6}$$

and offspring mean $\dot{\mu} = \varphi'(u) < 1$. Furthermore $\dot{\varphi}$ is extendable with second root $\dot{v} = 1/u$.
Now suppose that X is subcritical and extendable. Define

$$\tilde{\varphi}(s) = \frac{\varphi(vs)}{v}. \tag{7}$$

It is straightforward to verify that $\tilde{\varphi}$ is a supercritical offspring pgf with offspring mean $\tilde{\mu} = \varphi'(v) > 1$ and extinction probability $\tilde{u} = 1/v$. Denote the corresponding supercritical GW process by $\tilde{X}$. Then

$$\dot{\tilde{\varphi}}(s) = \frac{\varphi(\tilde{u} v s)}{\tilde{u} v} = \varphi(s). \tag{8}$$

⁵ Waugh (1958, p.248), Athreya and Ney (1972, §I.12, Theorem 3), Guttorp (1991, p.101).

Furthermore, if X is supercritical then

$$\tilde{\dot{\varphi}}(s) = \frac{\varphi(u \dot{v} s)}{u \dot{v}} = \varphi(s). \tag{9}$$

This establishes the asserted result.
Successive conditioning on $X_1, \dots, X_{n-1}$ in (1) shows that the joint probability mass function (pmf) $f \equiv f_p$ of $\mathbf{X}_n \equiv (X_1, \dots, X_n)$ is given by

$$f(\mathbf{x}_n) \equiv \Pr_p[\mathbf{X}_n = \mathbf{x}_n] = \prod_{i=1}^{n} h_p(x_{i-1}, x_i) \equiv h_p(\mathbf{x}_n) \tag{10}$$

(e.g. Jagers (1975, eqn. (2.1.2))), where

$$h_p(k, l) = \sum_{r_1 + \cdots + r_k = l} p_{r_1} \cdots p_{r_k}. \tag{11}$$

Note that $h_p(k, l)$ is the coefficient of $s^l$ in the power series $[\varphi_p(s)]^k$.
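Because $h_p(k, l)$ is a coefficient of $[\varphi_p(s)]^k$, it can be computed by repeated polynomial multiplication when the offspring law has finite support. The sketch below makes that assumption; the helper names `poly_mul`, `h_row`, and `path_pmf` are hypothetical.

```python
def poly_mul(a, b):
    """Coefficients of the product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def h_row(p, k):
    """Coefficients of [phi_p(s)]^k; entry l equals h_p(k, l) in (11)."""
    row = [1.0]                      # [phi(s)]^0 = 1, so h_p(0, 0) = 1
    for _ in range(k):
        row = poly_mul(row, p)
    return row

def path_pmf(p, x0, xs):
    """Joint pmf (10) of (X_1, ..., X_n) = xs given X_0 = x0."""
    prob, prev = 1.0, x0
    for x in xs:
        row = h_row(p, prev)
        prob *= row[x] if x < len(row) else 0.0
        prev = x
    return prob

p = [0.5, 0.0, 0.5]                  # hypothetical law: 0 or 2 offspring, each w.p. 1/2
print(h_row(p, 2))                   # h_p(2, l) for l = 0, ..., 4
print(path_pmf(p, 1, [2, 4]))        # h_p(1, 2) * h_p(2, 4)
```

Recomputing `h_row` at every step is wasteful for long paths; caching the rows would be the natural refinement.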
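From Bayes' formula, the pmf of $\dot{\mathbf{X}}_n \equiv (\dot{X}_1, \dots, \dot{X}_n)$ is given by

$$\begin{aligned}
\dot{f}(\mathbf{x}_n) \equiv \dot{f}_p(\mathbf{x}_n) &= \Pr_p[\,\mathbf{X}_n = \mathbf{x}_n \mid \text{extinction}\,] \\
&= \frac{\Pr[\,\text{extinction} \mid \mathbf{X}_n = \mathbf{x}_n\,]\,\Pr[\,\mathbf{X}_n = \mathbf{x}_n\,]}{\Pr[\,\text{extinction}\,]} \\
&= u^{x_n - x_0} f(\mathbf{x}_n).
\end{aligned} \tag{12}$$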

Similarly the pmf of $\ddot{\mathbf{X}}_n \equiv (\ddot{X}_1, \dots, \ddot{X}_n)$ is given by

$$\ddot{f}(\mathbf{x}_n) \equiv \ddot{f}_p(\mathbf{x}_n) = \frac{1 - u^{x_n}}{1 - u^{x_0}}\, f(\mathbf{x}_n), \qquad \mathbf{x}_n > 0, \tag{13}$$

where $\mathbf{x}_n > 0$ means that $x_1 > 0, \dots, x_n > 0$. From (12) and (13), Ẋ and Ẍ are Markovian with transition probabilities

$$\dot{f}(x_n \mid x_{n-1}) = u^{x_n - x_{n-1}}\, h_p(x_{n-1}, x_n), \tag{14}$$

$$\ddot{f}(x_n \mid x_{n-1}) = \frac{1 - u^{x_n}}{1 - u^{x_{n-1}}}\, h_p(x_{n-1}, x_n), \qquad x_{n-1}, x_n > 0, \tag{15}$$

respectively. However, Ẍ is not a GW process because some individuals may die without offspring even though explosion occurs.
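As a sanity check on (14) and (15), each should define a proper transition distribution. The sketch below verifies this numerically for one hypothetical offspring law with $p_0 = 0.2$, $p_2 = 0.8$ (so that φ(s) = s has roots 1/4 and 1); the function names are illustrative assumptions.

```python
from math import comb

def h_binary(k, l, p2):
    """h_p(k, l) for an offspring law with p_2 = p2 and p_0 = 1 - p2."""
    if l % 2 or l > 2 * k:
        return 0.0
    j = l // 2                        # number of parents leaving two offspring
    return comb(k, j) * p2 ** j * (1 - p2) ** (k - j)

def f_dot(x_prev, x, u, p2):
    """Transition probability (14): process conditioned on extinction."""
    return u ** (x - x_prev) * h_binary(x_prev, x, p2)

def f_ddot(x_prev, x, u, p2):
    """Transition probability (15): process conditioned on explosion."""
    return (1 - u ** x) / (1 - u ** x_prev) * h_binary(x_prev, x, p2)

p2 = 0.8
u = (1 - p2) / p2                     # smaller root of (1 - p2) + p2 * s^2 = s
x_prev = 3
total_dot = sum(f_dot(x_prev, x, u, p2) for x in range(0, 2 * x_prev + 1))
total_ddot = sum(f_ddot(x_prev, x, u, p2) for x in range(1, 2 * x_prev + 1))
print(total_dot, total_ddot)          # both should be 1 up to rounding
```

The sum for (15) starts at x = 1 because the factor $1 - u^{x_n}$ vanishes at $x_n = 0$: the explosion-conditioned process never visits 0.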

3. The GW process with power series offspring distribution
Following Becker (1974) we now specialize this discussion to a parametric model for the offspring distribution $p \equiv (p_0, p_1, \dots)$. The power series offspring distribution (psod) $p_\theta \equiv (p_{\theta;0}, p_{\theta;1}, \dots)$ is given by

$$p_{\theta;k} = \frac{a_k \theta^k}{A(\theta)}, \qquad k = 0, 1, \dots, \quad 0 < \theta < \psi, \tag{16}$$

where $(a_0, a_1, \dots) \equiv a$ are nonnegative constants, θ is the unknown parameter, $A(\theta) = \sum_k a_k \theta^k$, and 0 < ψ ≤ ∞ is the radius of convergence of A(·). We assume that $a_0 > 0$ so extinction is possible, and that $a_k > 0$ for at least one k ≥ 2 so growth is possible; without loss of generality we may take $a_0 = 1$. For simplicity of exposition we limit attention to the case where A(ψ−) = ∞; this includes the familiar Poisson, binomial, geometric, negative binomial, binary splitting, and logarithmic series distributions.
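Denote $X_{p_\theta}, \xi_{p_\theta}, f_{p_\theta}, \varphi_{p_\theta}, \rho_{p_\theta}, u_{p_\theta}, v_{p_\theta}, \mu_{p_\theta}$ by $X_\theta, \xi_\theta, f_\theta, \varphi_\theta, \rho_\theta, u_\theta, v_\theta, \mu_\theta$ respectively. By (2) and (16), $\varphi_\theta$ has radius of convergence $\rho_\theta = \psi/\theta$ and

$$\varphi_\theta(s) = \frac{A(\theta s)}{A(\theta)} = \frac{B(\theta s)}{B(\theta)}\, s, \qquad 0 < s < \rho_\theta, \tag{17}$$

where B(θ) = A(θ)/θ (see Figure 3). Here B(θ) is a strictly convex positive function on (0, ψ) with B(0+) = B(ψ−) = ∞, so B(·) has a unique minimum at some τ ∈ (0, ψ) with B′(τ) = 0; B(θ) is strictly decreasing for θ < τ and strictly increasing for θ > τ.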
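It follows from (17) that for θ ∈ (0, ψ),

$$E_\theta(\xi) \equiv \mu_\theta = \frac{\theta A'(\theta)}{A(\theta)}, \tag{18}$$

$$\frac{\mu_\theta - 1}{\theta} = \frac{B'(\theta)}{B(\theta)} = \frac{d \log B(\theta)}{d\theta}, \tag{19}$$

$$\mathrm{Var}_\theta(\xi) \equiv \sigma_\theta^2 = \theta\, \frac{d\mu_\theta}{d\theta}. \tag{20}$$

By (19), $\mu_\tau = 1$ so $X_\tau$ is critical. By (20), $\mu_\theta$ is strictly increasing in θ, hence the subcritical and supercritical parameter spaces are both open intervals:

$$\{\theta \mid \mu_\theta < 1\} = (0, \tau) \quad \text{(subcritical)}, \tag{21}$$

$$\{\theta \mid \mu_\theta > 1\} = (\tau, \psi) \quad \text{(supercritical)}. \tag{22}$$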
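Since $\mu_\theta$ is strictly increasing, the critical value τ can be located by bisection on $\mu_\theta = 1$. The following sketch assumes the coefficient sequence a is supplied as a finite (possibly truncated) list; the function names, truncation length, and brackets are choices made for this illustration.

```python
from math import factorial

def mean_and_variance(a, theta):
    """mu_theta and sigma_theta^2 for the psod (16) with coefficient list a."""
    weights = [ak * theta ** k for k, ak in enumerate(a)]
    A = sum(weights)                                       # A(theta), truncated
    mu = sum(k * w for k, w in enumerate(weights)) / A     # equals (18)
    second = sum(k * k * w for k, w in enumerate(weights)) / A
    return mu, second - mu * mu

def critical_theta(a, lo, hi, tol=1e-12):
    """Bisection for tau with mu_tau = 1; relies on mu_theta increasing in theta."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_and_variance(a, mid)[0] < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

poisson_a = [1 / factorial(k) for k in range(60)]   # a_k = 1/k!, truncated at 60 terms
binary_a = [1, 0, 1]                                # a_0 = a_2 = 1
print(critical_theta(poisson_a, 0.01, 5.0))         # should be close to 1
print(critical_theta(binary_a, 0.01, 5.0))          # should be close to 1
print(mean_and_variance(binary_a, 2.0))             # compare with 2θ²/(1+θ²), 4θ²/(1+θ²)²
```

The truncation matters only when θ is close to ψ; for the brackets used here the neglected tail is far below floating-point resolution.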

[Figure 3 plot: B(θ) against θ; horizontal-axis labels θ, θu_θ, τ, θ, θv_θ, ψ ≤ ∞.]


Figure 3: The function B(θ) = A(θ)/θ.

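If θ ∈ (τ, ψ) then from (3) and (17), $u_\theta$ is the unique solution to

$$B(\theta u_\theta) = B(\theta), \qquad 0 < \theta u_\theta < \tau. \tag{23}$$

If θ ∈ (0, τ) then $v_\theta$ is the unique solution to

$$B(\theta) = B(\theta v_\theta), \qquad \tau < \theta v_\theta < \psi. \tag{24}$$

Thus each subcritical $X_\theta$ is extendable. It follows from the uniqueness of the solutions of (23) and (24) that

$$v_{\theta u_\theta} = u_\theta^{-1} \quad \text{for } \theta \in (\tau, \psi), \tag{25}$$

$$u_{\theta v_\theta} = v_\theta^{-1} \quad \text{for } \theta \in (0, \tau). \tag{26}$$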
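Equations (23) and (24) reduce the computation of $u_\theta$ and $v_\theta$ to solving B(w) = B(θ) on the decreasing or increasing branch of B. A minimal sketch for the Poisson case B(θ) = e^θ/θ, τ = 1 follows; the bisection helper and the finite upper bracket 50 standing in for ψ = ∞ are assumptions of the example.

```python
from math import exp

def solve_on_branch(B, target, lo, hi, increasing):
    """Bisection for w in (lo, hi) with B(w) = target, B monotone on (lo, hi)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (B(mid) < target) == increasing:
            lo = mid                  # the root lies to the right of mid
        else:
            hi = mid                  # the root lies to the left of mid
    return 0.5 * (lo + hi)

B = lambda th: exp(th) / th           # Poisson psod: A(theta) = e^theta
tau = 1.0

theta = 2.0                           # supercritical
w = solve_on_branch(B, B(theta), 1e-9, tau, increasing=False)   # w = theta * u_theta, (23)
u = w / theta

theta_sub = w                         # the dual subcritical parameter theta * u_theta
w2 = solve_on_branch(B, B(theta_sub), tau, 50.0, increasing=True)  # (24)
v = w2 / theta_sub

print(u, v, 1 / u)                    # relation (25) says v should equal 1/u
```

Working with B rather than with φ(s) = s directly avoids the trivial root s = 1, which a generic root finder might otherwise return.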
Proposition 3.1. (i) For θ ∈ (τ, ψ), $\theta u_\theta$ strictly decreases from τ to 0; $u_\theta$ strictly decreases from 1 to 0.
(ii) For θ ∈ (0, τ), $\theta v_\theta$ strictly decreases from ψ to τ; $v_\theta$ strictly decreases from ∞ to 1.
Proof. (i) It follows from (23) and (19) that for θ ∈ (τ, ψ),

$$d_\theta \equiv \frac{d(\theta u_\theta)}{d\theta} = \frac{\mu_\theta - 1}{\mu_{\theta u_\theta} - 1}\, u_\theta. \tag{27}$$

Thus $d_\theta < 0$ because $\theta u_\theta < \tau < \theta$, so $\theta u_\theta$ is strictly decreasing, a fortiori $u_\theta$ is strictly decreasing. As θ ↓ τ, B(θ) ↓ B(τ), its unique minimum, hence $\theta u_\theta \uparrow \tau$ by (23), so $u_\theta \uparrow 1$. As θ ↑ ψ, B(θ) ↑ ∞, hence $\theta u_\theta \downarrow 0$ by (23), so $u_\theta \downarrow 0$.
(ii) It follows from (24) and (19) that for θ ∈ (0, τ),

$$\frac{d(\theta v_\theta)}{d\theta} = \frac{\mu_\theta - 1}{\mu_{\theta v_\theta} - 1}\, v_\theta, \tag{28}$$

which is < 0 because $\tau < \theta v_\theta < \psi$. The remaining results are verified as in (i). Alternatively, (25) and (26) can be applied to obtain $v_{\tau-}$ and $v_{0+}$.
Proposition 3.1, (25), and (26) establish analytically a 1-1 relation be-
tween the subcritical (0, τ ) and supercritical (τ, ψ) parameter spaces. The
corresponding probabilistic relation between the subcritical and supercritical
processes themselves is now presented.
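Proposition 3.2. The set of supercritical processes $\{\dot{X}_\theta \mid \theta \in (\tau, \psi)\}$ conditional on extinction coincides with the set of subcritical processes $\{X_\theta \mid \theta \in (0, \tau)\}$. Specifically,

$$\dot{X}_\theta \overset{d}{=} X_{\theta u_\theta}, \qquad \theta \in (\tau, \psi), \tag{29}$$

$$X_\theta \overset{d}{=} \dot{X}_{\theta v_\theta}, \qquad \theta \in (0, \tau). \tag{30}$$

(Note too that $\dot{X}_\tau \overset{d}{=} X_\tau$.)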
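Proof. Suppose first that $X_\theta$ is supercritical, i.e., θ ∈ (τ, ψ). From (6), $\dot{X}_\theta$ is a subcritical GW process with offspring pgf in the same psod family (16):

$$\dot{\varphi}_\theta(s) = \frac{A(\theta u_\theta s)}{A(\theta)\, u_\theta} = \frac{A(\theta u_\theta s)}{A(\theta u_\theta)} = \varphi_{\theta u_\theta}(s); \tag{31}$$

cf. Becker (1974, p.394). Since $\theta u_\theta < \tau$, $X_{\theta u_\theta}$ is subcritical.

Suppose next that $X_\theta$ is subcritical, i.e., θ ∈ (0, τ). A similar argument using (7) shows that $\tilde{X}_\theta$ is a supercritical GW process with offspring pgf

$$\tilde{\varphi}_\theta(s) = \varphi_{\theta v_\theta}(s). \tag{32}$$

Now apply (8) to obtain $\varphi_\theta = \dot{\varphi}_{\theta v_\theta}$; since $\theta v_\theta > \tau$, $X_{\theta v_\theta}$ is supercritical.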
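The identity (31) can be checked numerically in a family where $u_\theta$ is available in closed form. The sketch below uses the geometric psod, for which A(θ) = 1/(1 − θ), τ = 1/2, and $u_\theta = (1-\theta)/\theta$ in the supercritical case (see (36)); the evaluation points s are arbitrary choices within the radius of convergence.

```python
def phi_geometric(theta, s):
    """Geometric psod pgf: A(theta*s)/A(theta) with A(theta) = 1/(1 - theta)."""
    return (1 - theta) / (1 - theta * s)

theta = 0.6                            # supercritical, since tau = 1/2
u = (1 - theta) / theta                # extinction probability for x0 = 1

gaps = []
for s in (0.0, 0.3, 0.7, 1.0, 1.2):
    conditioned = phi_geometric(theta, u * s) / u       # pgf (6) of the conditioned process
    subcritical = phi_geometric(theta * u, s)           # pgf at the parameter theta * u
    gaps.append(abs(conditioned - subcritical))

max_gap = max(gaps)
print(theta * u, max_gap)              # dual parameter 0.4 and a gap near 0
```

Here the dual parameter is θu = 1 − θ, so conditioning a supercritical geometric process on extinction simply reflects θ about 1/2.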

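Example 3.1: the Poisson(θ) psod. Here $p_{k;\theta} = e^{-\theta}\theta^k/k!$, 0 < θ < ∞, so $a_k = 1/k!$, $A(\theta) = e^{\theta}$, ψ = ∞, A(ψ−) = ∞, $B(\theta) = e^{\theta}/\theta$, and $\varphi_\theta(s) = e^{\theta(s-1)}$. Then $\mu_\theta = \sigma_\theta^2 = \theta$, τ = 1, and $u_\theta$ and $v_\theta$ satisfy the equation

$$e^{\theta(s-1)} = s \tag{33}$$

by (23) and (24). This cannot be solved explicitly, but necessarily

$$\begin{aligned}
u_\theta = 1,\ v_\theta > 1 &\quad \text{if } \theta < 1 \text{ (subcritical)}; \\
u_\theta < 1,\ v_\theta = 1 &\quad \text{if } \theta > 1 \text{ (supercritical)}.
\end{aligned} \tag{34}$$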

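Example 3.2: the negative binomial NB(r, θ) and geometric GM(θ) psods. For fixed r > 0, the NB(r, θ) psod has $p_{k;\theta} = \frac{\Gamma(r+k)}{\Gamma(r)\,k!}(1-\theta)^r\theta^k$, 0 < θ < 1. Here $a_k = \frac{\Gamma(r+k)}{\Gamma(r)\,k!}$, $A(\theta) = \frac{1}{(1-\theta)^r}$, ψ = 1, $B(\theta) = \frac{1}{\theta(1-\theta)^r}$, and $\varphi_\theta(s) = \frac{(1-\theta)^r}{(1-\theta s)^r}$. Also $\mu_\theta = \frac{r\theta}{1-\theta}$, $\sigma_\theta^2 = \frac{r\theta}{(1-\theta)^2}$, $\tau = \frac{1}{1+r}$, and $u_\theta$ and $v_\theta$ satisfy the equation

$$(1 - \theta)^r = (1 - \theta s)^r s. \tag{35}$$

This can be solved explicitly for the GM(θ) ≡ NB(1, θ) psod where r = 1 and τ = 1/2:

$$\begin{aligned}
u_\theta = 1,\ v_\theta = \tfrac{1-\theta}{\theta} &\quad \text{if } \theta < \tfrac{1}{2} \text{ (subcritical)}; \\
u_\theta = \tfrac{1-\theta}{\theta},\ v_\theta = 1 &\quad \text{if } \theta > \tfrac{1}{2} \text{ (supercritical)}.
\end{aligned} \tag{36}$$

Here the relations (25) and (26) can be verified directly.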

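Example 3.3: binary splitting. Take $a_0 = a_2 = 1$ and $a_k = 0$ for k ≠ 0, 2. Thus $A(\theta) = 1 + \theta^2$ for 0 < θ < ∞ ≡ ψ, so $p_{0;\theta} = \frac{1}{1+\theta^2}$, $p_{2;\theta} = \frac{\theta^2}{1+\theta^2}$, and $p_{k;\theta} = 0$ for k ≠ 0, 2. Here $B(\theta) = \theta^{-1} + \theta$, $\varphi_\theta(s) = \frac{1+\theta^2 s^2}{1+\theta^2}$, $\mu_\theta = \frac{2\theta^2}{1+\theta^2}$, $\sigma_\theta^2 = \frac{4\theta^2}{(1+\theta^2)^2}$, τ = 1, and

$$\begin{aligned}
u_\theta = 1,\ v_\theta = \tfrac{1}{\theta^2} &\quad \text{if } \theta < 1 \text{ (subcritical)}; \\
u_\theta = \tfrac{1}{\theta^2},\ v_\theta = 1 &\quad \text{if } \theta > 1 \text{ (supercritical)}.
\end{aligned} \tag{37}$$

Again the relations (25) and (26) can be verified directly.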

4. Stochastic orderings for a psod GW process
Let $W$, $Z$, $\mathbf{W}_n \equiv (W_1, \dots, W_n)$, $\mathbf{Z}_n \equiv (Z_1, \dots, Z_n)$, and $\mathbf{W} \equiv (W_1, \dots)$, $\mathbf{Z} \equiv (Z_1, \dots)$ be nonnegative-integer-valued random variables, random vectors, and discrete-time stochastic processes, respectively. We say W is stochastically smaller than Z, written W ≺ Z, if E[g(W)] ≤ E[g(Z)] for all increasing bounded nonnegative functions g on the nonnegative integers $\mathbb{Z}_0$ with strict inequality for at least one g. It is straightforward to show that if U, V, W, Z are independent, then

$$U \prec V \ \text{ and } \ W \prec Z \implies U + W \prec V + Z. \tag{38}$$

Similarly, we write $\mathbf{W}_n \prec \mathbf{Z}_n$ if

$$E[g(\mathbf{W}_n)] \le E[g(\mathbf{Z}_n)] \tag{39}$$

for all increasing bounded nonnegative functions g on $\mathbb{Z}_0^n$ with strict inequality for at least one g. Finally, we write $\mathbf{W} \prec \mathbf{Z}$ if $\mathbf{W}_n \prec \mathbf{Z}_n$ for all n = 1, 2, .... The next lemma follows directly from (1) and (38).
Lemma 4.1. Let X and X′ be GW processes with offspring rvs ξ and ξ′ respectively. If ξ ≺ ξ′ then X ≺ X′.
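For scalar rvs, the ordering W ≺ Z defined above can be checked through tail probabilities, using the standard characterization that E[g(W)] ≤ E[g(Z)] for all increasing g exactly when Pr[W ≥ k] ≤ Pr[Z ≥ k] for every k. The sketch below applies this to offspring pmfs with a common finite support; the function names and the tolerance are assumptions of the example.

```python
from itertools import accumulate

def stochastically_smaller(pmf_w, pmf_z, eps=1e-12):
    """True if W ≺ Z for pmfs given as equal-length lists on {0, 1, ..., m}."""
    # 1 - CDF(k) = Pr[. >= k + 1]; dropping the last entry leaves k + 1 = 1, ..., m.
    tail_w = [1 - c for c in accumulate(pmf_w)][:-1]
    tail_z = [1 - c for c in accumulate(pmf_z)][:-1]
    ordered = all(tw <= tz + eps for tw, tz in zip(tail_w, tail_z))
    strict = any(tw < tz - eps for tw, tz in zip(tail_w, tail_z))
    return ordered and strict

def binary_pmf(theta):
    """Binary splitting offspring pmf (p_0, p_1, p_2) from Example 3.3."""
    return [1 / (1 + theta ** 2), 0.0, theta ** 2 / (1 + theta ** 2)]

print(stochastically_smaller(binary_pmf(0.5), binary_pmf(2.0)))   # smaller theta first
print(stochastically_smaller(binary_pmf(2.0), binary_pmf(0.5)))   # order reversed
```

By Lemma 4.1, an ordering established for the offspring rvs in this way carries over to the GW processes themselves.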
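Stochastic orderings satisfied by a GW process $X_\theta$ with psod (16) and by the conditional processes $\dot{X}_\theta$ and $\ddot{X}_\theta$ are now developed. These will be useful for the testing and prediction problems treated below.
From (10), (11), and (16), the pmf of $(\mathbf{X}_\theta)_n$ is

$$f_\theta(\mathbf{x}_n) = \frac{\theta^{y_n - x_0}}{(A(\theta))^{y_{n-1}}}\, h_a(\mathbf{x}_n), \qquad \mathbf{x}_n \in R_{a,n}, \tag{40}$$

where $y_n = x_0 + x_1 + \cdots + x_n$ and $R_{a,n} = \{\mathbf{x}_n \mid h_a(\mathbf{x}_n) > 0\}$. Then (12), (13), and (40) give the following:

$$\text{for } \theta > 0: \quad \dot{f}_\theta(\mathbf{x}_n) = u_\theta^{x_n - x_0}\, \frac{\theta^{y_n - x_0}}{(A(\theta))^{y_{n-1}}}\, h_a(\mathbf{x}_n), \tag{41}$$

$$\text{for } \theta > \tau: \quad \ddot{f}_\theta(\mathbf{x}_n) = \frac{1 - u_\theta^{x_n}}{1 - u_\theta^{x_0}}\, \frac{\theta^{y_n - x_0}}{(A(\theta))^{y_{n-1}}}\, h_a(\mathbf{x}_n), \qquad \mathbf{x}_n > 0. \tag{42}$$

The transition probabilities are obtained from (40)-(42) (recall (14)-(15)):

$$f_\theta(x_n \mid x_{n-1}) = \frac{\theta^{x_n}}{(A(\theta))^{x_{n-1}}}\, h_a(x_{n-1}, x_n), \tag{43}$$

$$\dot{f}_\theta(x_n \mid x_{n-1}) = u_\theta^{x_n - x_{n-1}}\, \frac{\theta^{x_n}}{(A(\theta))^{x_{n-1}}}\, h_a(x_{n-1}, x_n), \tag{44}$$

$$\ddot{f}_\theta(x_n \mid x_{n-1}) = \frac{1 - u_\theta^{x_n}}{1 - u_\theta^{x_{n-1}}}\, \frac{\theta^{x_n}}{(A(\theta))^{x_{n-1}}}\, h_a(x_{n-1}, x_n), \qquad x_{n-1}, x_n > 0. \tag{45}$$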

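The definitions of $\ddot{f}_\theta(\cdot)$ and $\ddot{f}_\theta(\cdot \mid \cdot)$ can be extended to the critical case θ = τ:

$$\ddot{f}_\tau(\mathbf{x}_n) = \lim_{\theta \downarrow \tau} \ddot{f}_\theta(\mathbf{x}_n) \tag{46}$$

$$= \frac{x_n}{x_0}\, \frac{\tau^{y_n - x_0}}{(A(\tau))^{y_{n-1}}}\, h_a(\mathbf{x}_n), \qquad \mathbf{x}_n > 0; \tag{47}$$

$$\ddot{f}_\tau(x_n \mid x_{n-1}) = \frac{x_n}{x_{n-1}}\, \frac{\tau^{x_n}}{(A(\tau))^{x_{n-1}}}\, h_a(x_{n-1}, x_n), \qquad x_{n-1}, x_n > 0. \tag{48}$$

Denote the resulting Markov process by $\ddot{X}_\tau$.⁶ By (46) and Scheffé's Theorem,

$$\ddot{X}_\theta \xrightarrow{\,L^1\,} \ddot{X}_\tau \quad \text{as } \theta \downarrow \tau. \tag{49}$$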
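The limiting kernel (48) is a size-biased version of the critical transition probabilities, and it sums to 1 because the critical offspring mean is 1. This can be confirmed exactly for the binary splitting family of Example 3.3, where τ = 1, A(τ) = 2, and $h_a(x, y)$ is a binomial coefficient; the function names below are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def h_a(x, y):
    """Coefficient of theta^y in (1 + theta^2)^x (binary splitting, a_0 = a_2 = 1)."""
    return comb(x, y // 2) if y % 2 == 0 and 0 <= y <= 2 * x else 0

def f_ddot_tau(x_prev, x, tau=1, A_tau=2):
    """Transition probability (48) at theta = tau, in exact rational arithmetic."""
    return Fraction(x, x_prev) * Fraction(tau) ** x / Fraction(A_tau) ** x_prev * h_a(x_prev, x)

for x_prev in (1, 2, 5):
    row = [f_ddot_tau(x_prev, x) for x in range(1, 2 * x_prev + 1)]
    print(x_prev, sum(row))           # each row should sum to exactly 1
```

Exact fractions are used so that the check is not clouded by rounding.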

Proposition 4.1. (i) $X_\theta$ is stochastically increasing for θ ∈ (0, ψ), that is, θ < θ′ ⇒ $X_\theta \prec X_{\theta'}$.
(ii) $\dot{X}_\theta$ is stochastically decreasing for θ ∈ [τ, ψ), that is, θ < θ′ ⇒ $\dot{X}_\theta \succ \dot{X}_{\theta'}$.
Proof. (i) follows from Lemma 4.1 since θ < θ′ ⇒ $\xi_\theta \prec \xi_{\theta'}$ by the strict monotone likelihood ratio (MLR) property of the psod family.⁷ (ii) follows from (i), (29), and Proposition 3.1(i).
The verifications of the stochastic orderings of $X_\theta$ and $\dot{X}_\theta$ are straightforward because these are GW processes. However, $\ddot{X}_\theta$ is not a GW process so its stochastic ordering properties, if any, are not apparent. Although it might appear that $\ddot{X}_\theta$ should inherit the stochastic increasing property of $X_\theta$, upon closer examination this is not obvious. Conditional on ultimate explosion, as θ increases above the critical value τ those trajectories of $X_\theta$ with relatively small initial values might have increasing likelihood of survival, hence for fixed n, $(\ddot{\mathbf{X}}_\theta)_n$ might tend to decrease stochastically, not increase.
In Proposition 4.2(iii) it will be shown, however, that Ẍθ is indeed stochas-
tically increasing for θ ≥ τ provided that two additional conditions are im-
posed, namely TP2a and/or TP2b (see below) based on total positivity of
order 2 (TP2). Also, it is shown in Proposition 4.2(i) that under TP2a alone,
the conditional random vector (Xθ )n | Xθ,n > 0 is stochastically increasing
for θ ≤ τ .

⁶ This is not to be interpreted as Xτ | explosion, which is vacuous.
⁷ Lehmann and Romano (2005) Lemma 3.4.2 and Problem 3.39; Karlin (1968) Proposition 3.1 and the discussion following Proposition 3.3, both in Chapter 1.

Karlin (1968) is the primary reference for total positivity. The TP2 property is equivalent to MLR, cf. Lehmann and Romano (2005, Problem 50). The following results for the TP2 and FKG properties appear in Kemperman (1977) and Perlman and Olkin (1980).
Definition 4.1. Let f(x) be a nonnegative function defined on a measurable rectangle $R = \prod_{i=1}^{n} R_i \subseteq \mathbb{R}^n$. Then f satisfies the FKG condition on R if

$$f(\mathbf{x}_n) f(\mathbf{y}_n) \le f(\mathbf{x}_n \wedge \mathbf{y}_n)\, f(\mathbf{x}_n \vee \mathbf{y}_n) \qquad \forall\, \mathbf{x}_n, \mathbf{y}_n \in R, \tag{50}$$

where $\mathbf{x}_n \wedge \mathbf{y}_n = (x_1 \wedge y_1, \dots, x_n \wedge y_n)$ and $\mathbf{x}_n \vee \mathbf{y}_n = (x_1 \vee y_1, \dots, x_n \vee y_n)$; we say that f is FKG on R. TP2 is defined as FKG for n = 2.
Some properties of TP2 and FKG: If $h(x_i, x_j)$ is TP2 on $R_i \times R_j$ in a single pair $(x_i, x_j)$ then $f(\mathbf{x}_n) \equiv h(x_i, x_j)$ is FKG on R. If $f_1, \dots, f_m$ are FKG on R then so is $\prod_{i=1}^{m} f_i$. If $f(\mathbf{x}_n) = h_i(x_i)$ for a single i then f is trivially FKG on R, so $f(\mathbf{x}_n) = \prod_{i=1}^{n} h_i(x_i)$ is also trivially FKG. If f is FKG on $R^* \equiv \prod R_i^*$ and if, for each i = 1, ..., n, $\beta_i : R_i \to R_i^*$ is increasing in $x_i$, then $f(\beta_1(x_1), \dots, \beta_n(x_n))$ is FKG on $R \equiv \prod R_i$.
Lemma 4.2. (The FKG Inequality). Let Z be a random vector with an
FKG pdf f on R w.r.to a product measure ν and let g, h be component-wise
increasing nonnegative functions on R ∩ {f > 0}. Then
E[g(Z)h(Z)] ≥ E[g(Z)]E[h(Z)]. (51)
Strict inequality holds in (51) if g is nonconstant w.r.to f (Pr[g(Z) = c] < 1
for all constants c) and h is strictly increasing.
Proof. Perlman and Olkin (1980, Propositions 2.4, 2.6, and Remark 2.5.)

Condition TP2a: $h_a(x, y)$ is TP2 in (x, y) for x, y = 1, 2, .... (Note that $h_a(x, y)$ is the coefficient of $\theta^y$ in the power series $[A(\theta)]^x$.)
Condition TP2b: $(1 - u_\theta^x)\,\theta^x$ is TP2 in (x, θ) for x = 1, 2, ... and τ < θ < ψ.
A sufficient condition for TP2a to hold is that $\{a_k \mid k = 0, 1, \dots\}$ is a one-sided Polya frequency sequence of order 2 (PF2); cf. Karlin (1968, (ii) on pp.142-3, also Ch.8).
Let $(\mathbf{X}_\theta)_n^+$ denote the conditional random vector $(\mathbf{X}_\theta)_n \mid X_{\theta,n} > 0$. For notational convenience the subscript θ sometimes will be omitted. The conditional pmf of $(\mathbf{X}_\theta)_n^+ \equiv \mathbf{X}_n^+$ is given by

$$f_\theta^+(\mathbf{x}_n) = \Pr_\theta[\mathbf{X}_n = \mathbf{x}_n \mid X_n > 0] = b_{\theta,n}\, f_\theta(\mathbf{x}_n), \qquad \mathbf{x}_n > 0, \tag{52}$$

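where $b_{\theta,n}^{-1} = \Pr_\theta[X_n > 0]$. Note that $X_n > 0 \Rightarrow \mathbf{X}_n > 0$. Clearly $\mathbf{X}_n^+ \succ \mathbf{X}_n$ and $\dot{\mathbf{X}}_n^+ \succ \dot{\mathbf{X}}_n$ for all θ > 0, while $\ddot{\mathbf{X}}_n^+ \equiv \ddot{\mathbf{X}}_n$ for θ ≥ τ.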

Proposition 4.2. (i) If TP2a holds then for each n ≥ 1, $(\mathbf{X}_\theta)_n^+$ is stochastically increasing for θ ∈ (0, τ]. Therefore, by Propositions 3.1 and 3.2, $(\dot{\mathbf{X}}_\theta)_n^+ \overset{d}{=} (\mathbf{X}_{\theta u_\theta})_n^+$ is stochastically decreasing for θ ∈ [τ, ψ).
(ii) If TP2a holds then for each n ≥ 1, $(\dot{\mathbf{X}}_\tau)_n \prec (\dot{\mathbf{X}}_\tau)_n^+ \prec (\ddot{\mathbf{X}}_\tau)_n^+ \equiv (\ddot{\mathbf{X}}_\tau)_n$.
(iii) If TP2a and TP2b hold, $\ddot{X}_\theta$ is stochastically increasing for θ ∈ [τ, ψ).
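Proof. (i) We will show that $E_\theta[g(\mathbf{X}_n^+)]$ is strictly increasing in θ ∈ (0, τ] for any increasing bounded nonconstant g ≥ 0 on $\mathbb{Z}_+^n$, where $\mathbb{Z}_+$ denotes the positive integers. The FKG inequality will yield the required result as follows: for $0 < \theta_1 < \theta_2 \le \tau$,

$$\begin{aligned}
E_{\theta_2}[g(\mathbf{X}_n^+)] &= E_{\theta_1}\!\left[ g(\mathbf{X}_n^+)\, \frac{f_{\theta_2}^+(\mathbf{X}_n^+)}{f_{\theta_1}^+(\mathbf{X}_n^+)} \right] \\
&> E_{\theta_1}[g(\mathbf{X}_n^+)]\; E_{\theta_1}\!\left[ \frac{f_{\theta_2}^+(\mathbf{X}_n^+)}{f_{\theta_1}^+(\mathbf{X}_n^+)} \right] \\
&= E_{\theta_1}[g(\mathbf{X}_n^+)].
\end{aligned}$$

To apply the FKG inequality (51) with strict inequality it must be shown that (a) $f_{\theta_1}^+(\mathbf{x}_n)$ is FKG on $\mathbb{Z}_+^n$; and (b) the ratio $r(\mathbf{x}_n) \equiv f_{\theta_2}^+(\mathbf{x}_n) / f_{\theta_1}^+(\mathbf{x}_n)$ is strictly increasing on $\mathbb{Z}_+^n \cap \{f_{\theta_1}^+(\mathbf{x}_n) > 0\} = \mathbb{Z}_+^n \cap \{h_a(\mathbf{x}_n) > 0\}$. First, for all θ > 0 and $\mathbf{x}_n > 0$, it follows from (40) and (52) that

$$f_\theta^+(\mathbf{x}_n) = \frac{b_{\theta,n}\, \theta^{x_1 + \cdots + x_n}}{(A(\theta))^{x_0 + \cdots + x_{n-1}}} \prod_{i=1}^{n} h_a(x_{i-1}, x_i), \qquad \mathbf{x}_n > 0. \tag{53}$$

By TP2a each factor $h_a(x_{i-1}, x_i)$ in (53) is TP2, hence their product is FKG, thus so is $f_\theta^+(\mathbf{x}_n)$; this gives (a). Next, $0 < \theta_1 < \theta_2 \le \tau \Rightarrow B(\theta_1) > B(\theta_2)$, so

$$r(\mathbf{x}_n) \equiv \frac{b_{\theta_2,n}}{b_{\theta_1,n}} \left( \frac{A(\theta_1)}{A(\theta_2)} \right)^{x_0} \left( \frac{B(\theta_1)}{B(\theta_2)} \right)^{x_1 + \cdots + x_{n-1}} \left( \frac{\theta_2}{\theta_1} \right)^{x_n} \tag{54}$$

is strictly increasing in $x_1, \dots, x_{n-1}, x_n$, which establishes (b).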

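(ii) The first inequality is immediate. For the second, apply the FKG inequality as follows:

$$\begin{aligned}
E_\tau[g(\ddot{\mathbf{X}}_n)] &= E_\tau\!\left[ g(\dot{\mathbf{X}}_n^+)\, \frac{\ddot{f}_\tau(\dot{\mathbf{X}}_n^+)}{\dot{f}_\tau^+(\dot{\mathbf{X}}_n^+)} \right] \\
&\ge E_\tau[g(\dot{\mathbf{X}}_n^+)]\; E_\tau\!\left[ \frac{\ddot{f}_\tau(\dot{\mathbf{X}}_n^+)}{\dot{f}_\tau^+(\dot{\mathbf{X}}_n^+)} \right] \\
&= E_\tau[g(\dot{\mathbf{X}}_n^+)].
\end{aligned} \tag{55}$$

As in (i), FKG is applicable in (55) because (a) $\dot{f}_\tau^+(\mathbf{x}_n) \equiv f_\tau^+(\mathbf{x}_n)$ is FKG on $\mathbb{Z}_+^n$ (by (53) with θ = τ); and (b) the ratio

$$r(\mathbf{x}_n) \equiv \frac{\ddot{f}_\tau(\mathbf{x}_n)}{\dot{f}_\tau^+(\mathbf{x}_n)} = \frac{x_n}{b_{\tau,n}\, x_0} \tag{56}$$

(obtained from (47) and (52) with θ = τ) is increasing on $\mathbb{Z}_+^n \cap \{h_a(\mathbf{x}_n) > 0\}$.
To show that $E_\tau[g(\ddot{\mathbf{X}}_n)] > E_\tau[g(\dot{\mathbf{X}}_n^+)]$ for at least one increasing g, take $g(\mathbf{x}_n) = 1_{\{2,3,\dots\}}(x_1)$. Because $\ddot{X}_{\tau,1} \ge 1$ and $\dot{X}_{\tau,1}^+ \ge 1$, it follows from (47) and (52) that

$$1 - E_\tau[g(\ddot{\mathbf{X}}_n)] = \Pr_\tau[\ddot{X}_1 = 1] = \frac{\tau}{x_0\, (A(\tau))^{x_0}}\, h_a(x_0, 1);$$

$$\begin{aligned}
1 - E_\tau[g(\dot{\mathbf{X}}_n^+)] = \Pr_\tau[\dot{X}_1^+ = 1] &= \frac{\dot{b}_{1,\tau}\, \tau}{(A(\tau))^{x_0}}\, h_a(x_0, 1) \\
&= \frac{\tau}{\{1 - \Pr_\tau[\dot{X}_1 = 0]\}\,(A(\tau))^{x_0}}\, h_a(x_0, 1) \\
&= \frac{\tau}{\left[ 1 - \left( \frac{a_0}{A(\tau)} \right)^{x_0} \right] (A(\tau))^{x_0}}\, h_a(x_0, 1).
\end{aligned}$$

Because $a_0 > 0$ and $x_0 \ge 1$, we conclude that $E_\tau[g(\ddot{\mathbf{X}}_n)] > E_\tau[g(\dot{\mathbf{X}}_n^+)]$.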
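(iii) Since $B(\theta_1) < B(\theta_2)$ when $\tau \le \theta_1 < \theta_2$, the FKG inequality is not applicable here (recall (54)). Instead we use induction on n to show that

$$E_\theta[g(\ddot{\mathbf{X}}_n)] \equiv \sum_{\mathbf{x}_n > 0} g(\mathbf{x}_n)\, \ddot{f}_\theta(\mathbf{x}_n) \tag{57}$$

is increasing for θ ∈ [τ, ψ).
For n = 1, (42) gives

$$\ddot{f}_\theta(x_1) = \frac{1 - u_\theta^{x_1}}{1 - u_\theta^{x_0}}\, \frac{\theta^{x_1}}{(A(\theta))^{x_0}}\, h_a(x_0, x_1), \qquad x_1 > 0,$$

which is TP2 in $(\theta, x_1)$ by TP2b, so

$$E_\theta[g(\ddot{X}_1)] \equiv \sum_{x_1 > 0} g(x_1)\, \ddot{f}_\theta(x_1) \tag{58}$$

is increasing for θ ∈ (τ, ψ) by the monotonicity-preserving property of a TP2 ≡ MLR kernel (Karlin (1968, Ch.1, Proposition 3.1)). For n ≥ 2,

$$E_\theta[g(\ddot{\mathbf{X}}_n)] = E_\theta\big[ E_\theta[\, g(\ddot{\mathbf{X}}_{n-1}, \ddot{X}_n) \mid \ddot{\mathbf{X}}_{n-1} ] \big] \tag{59}$$

$$= E_\theta\!\left[ \sum_{x_n > 0} g(\ddot{\mathbf{X}}_{n-1}, x_n)\, \ddot{f}_\theta(x_n \mid \ddot{X}_{n-1}) \right] \tag{60}$$

$$\equiv E_\theta[g_\theta^*(\ddot{\mathbf{X}}_{n-1})]. \tag{61}$$

From TP2a, TP2b, and (45), the transition probability $\ddot{f}_\theta(x_n \mid x_{n-1})$ of the Markov process $\ddot{X}_\theta$ is TP2 in $(x_n, \theta)$ and in $(x_n, x_{n-1})$, so the monotonicity-preserving property implies that $g_\theta^*(\ddot{\mathbf{X}}_{n-1})$ is increasing in θ and in $\ddot{X}_{n-1}$. Thus by (60)-(61) and the induction hypothesis, $E_\theta[g(\ddot{\mathbf{X}}_n)]$ is increasing in θ for θ ∈ (τ, ψ). Lastly, these results extend to [τ, ψ) by (49) and continuity.
To show that $E_\theta[g(\ddot{\mathbf{X}}_n)]$ is strictly increasing in θ for at least one increasing g, take $g(\mathbf{x}_n) = 1_{\{2,3,\dots\}}(x_1)$. Because $\ddot{X}_1 \ge 1$, it follows from (42) that

$$\begin{aligned}
1 - E_\theta[g(\ddot{\mathbf{X}}_n)] = \Pr_\theta[\ddot{X}_1 = 1] &= \frac{1 - u_\theta}{1 - u_\theta^{x_0}}\, \frac{\theta}{(A(\theta))^{x_0}}\, h_a(x_0, 1) \\
&= \frac{(1 - u_\theta)\,\theta}{(1 - u_\theta^{x_0})\,\theta^{x_0}} \left( \frac{1}{B(\theta)} \right)^{x_0} h_a(x_0, 1).
\end{aligned} \tag{62}$$

Because $x_0 \ge 1$ the first factor in (62) is decreasing in θ by TP2b, while the second factor is strictly decreasing because B(θ) is strictly increasing for θ ∈ [τ, ψ).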
Lemma 4.3. Let $X_\theta$ be a GW branching process with psod offspring distribution (16). Each of the following two conditions is equivalent to Condition TP2b: for θ ∈ (τ, ψ),

$$\frac{\mu_\theta - 1}{1 - \mu_{\theta u_\theta}} \le \frac{1}{u_\theta}; \tag{63}$$

$$B'(\theta u_\theta) + B'(\theta) \le 0. \tag{64}$$

Proof. Let $\delta_\theta = du_\theta/d\theta$. Then $(1 - u_\theta^x)\,\theta^x$ is TP2 iff for x = 1, 2, ..., the ratio $\frac{(1 - u_\theta^{x+1})\,\theta^{x+1}}{(1 - u_\theta^{x})\,\theta^{x}}$ is increasing in θ for θ ∈ (τ, ψ), equivalently, iff

$$\frac{d}{d\theta} \log\!\left[ \frac{(1 - u_\theta^{x+1})\,\theta}{1 - u_\theta^{x}} \right] \equiv \frac{-(x+1)\, u_\theta^{x}\, \delta_\theta}{1 - u_\theta^{x+1}} + \frac{x\, u_\theta^{x-1}\, \delta_\theta}{1 - u_\theta^{x}} + \frac{1}{\theta} \ge 0. \tag{65}$$

After some algebra, we find that this is equivalent to the inequality

$$[(1 - u^x) - x u^x (1 - u)] + d_\theta\, u^{x-1} [x(1 - u) - u(1 - u^x)] \ge 0, \tag{66}$$

where we use the relation $d_\theta = \theta \delta_\theta + u_\theta$ and abbreviate $u_\theta$ by u. Because both terms in square brackets are positive and $d_\theta < 0$, this is in turn equivalent to the inequality

$$-d_\theta \le \frac{(1 - u^x) - x u^x (1 - u)}{u^{x-1} [x(1 - u) - u(1 - u^x)]} \equiv \Delta(u, x). \tag{67}$$

But Δ(u, 1) = 1 and Δ(u, x) ≥ 1 for x ≥ 2:

$$\begin{aligned}
\Delta(u, x) - 1 &= \frac{(1 - u^x)(1 + u^x) - u^{x-1} x (1 - u)(1 + u)}{u^{x-1} [x(1 - u) - u(1 - u^x)]} \\
&= \frac{(1 - u^{2x}) - u^{x-1} x (1 - u^2)}{u^{x-1} [x(1 - u) - u(1 - u^x)]} \\
&= \frac{(1 - u^2)\big[ (1 + u^2 + \cdots + u^{2(x-1)}) - u^{2(x-1)/2}\, x \big]}{u^{x-1} [x(1 - u) - u(1 - u^x)]} \\
&\ge 0
\end{aligned}$$

because $u^{2x}$ is convex in x. Thus TP2b is equivalent to the simple relation

$$-d_\theta \le \Delta(u, 1) \equiv 1, \tag{68}$$

which, by (27), is equivalent to (63). Lastly, differentiate (23) with respect to θ to establish the equivalence of (68) and (64).
Example 4.1 (= 3.1 continued). For the Poisson(θ) psod, the coefficient of $\theta^y$ in the power series $[A(\theta)]^x = e^{x\theta}$ is $h_a(x, y) = x^y/y!$, which is TP2 in (x, y) so TP2a is satisfied. Furthermore $\mu_\theta = \theta$, τ = 1, and from (33),

$$-\frac{\log u_\theta}{1 - u_\theta} = \theta \tag{69}$$

for θ ≥ 1, so (63) is equivalent to the inequality

$$-2u \log u \le 1 - u^2, \tag{70}$$

where $u = u_\theta$. This inequality holds for all u ∈ [0, 1], hence TP2b is also satisfied. Thus by Proposition 4.2, $(\dot{\mathbf{X}}_\theta)_n^+$ is stochastically decreasing and $(\ddot{\mathbf{X}}_\theta)_n$ is stochastically increasing for θ ≥ 1, while $(\dot{\mathbf{X}}_\tau)_n^+ \prec (\ddot{\mathbf{X}}_\tau)_n$.
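The inequality (70) is easy to confirm numerically on a grid. The check below is only a sketch (a grid evaluation is not a proof), and the grid spacing is an arbitrary choice.

```python
from math import log

def gap_70(u):
    """Right side minus left side of (70): (1 - u^2) - (-2 u log u)."""
    return (1 - u * u) + 2 * u * log(u)

grid = [k / 1000 for k in range(1, 1000)]        # u in (0, 1), endpoints excluded
smallest = min(gap_70(u) for u in grid)
print(smallest)                                   # positive, approaching 0 as u -> 1
```

The gap shrinks to 0 as u ↑ 1, which corresponds to θ ↓ τ = 1, so (70) is tight at criticality.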
Example 4.2 (= 3.2 continued). For the negative binomial(r, θ) psod, the coefficient of $\theta^y$ in the power series $[A(\theta)]^x = 1/(1 - \theta)^{rx}$ is

$$h_a(x, y) = \frac{\Gamma(rx + y)}{\Gamma(rx)\, y!} = \frac{(rx + y - 1) \cdots (rx)}{y!}, \tag{71}$$

which is TP2 in (x, y), so NB(r, ·) satisfies TP2a for all r > 0. Next, $\mu_\theta = \frac{r\theta}{1 - \theta}$ and $\tau = \frac{t}{1 + t}$, where $t = \frac{1}{r}$. Set $u = u_\theta$ and apply (35) to obtain

$$\frac{1 - \theta}{1 - \theta u} = u^t, \tag{72}$$

$$\frac{1 - u^t}{1 - u^{t+1}} = \theta. \tag{73}$$

After some algebra it is seen that (63) is equivalent to each of the inequalities

$$\frac{1 - u^t}{1 - u^{t+1}} \le \tau\, \frac{1 + u^{t-1}}{1 + u^t}, \tag{74}$$

$$\frac{v^t - v^{-t}}{t} \le v - v^{-1}, \tag{75}$$

where $v = u^{-1} \ge 1$. Because $h(t) \equiv v^t - v^{-t}$ is convex in t and h(0) = 0, (75) holds iff t ≤ 1. Thus the NB(r, ·) psod family satisfies TP2b iff r ≥ 1. This includes the geometric psod family (r = t = 1) where equality holds in (75). Thus by Proposition 4.2, if r ≥ 1 then $(\dot{\mathbf{X}}_\theta)_n^+$ is stochastically decreasing and $(\ddot{\mathbf{X}}_\theta)_n$ is stochastically increasing for τ ≤ θ < 1, while $(\dot{\mathbf{X}}_\tau)_n^+ \prec (\ddot{\mathbf{X}}_\tau)_n$, where τ = 1/(1 + r).
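The dichotomy in (75) can be illustrated numerically by evaluating both sides over a grid of v for several values of r. This is a sketch under stated assumptions: the grid, the small tolerance (needed for the equality case r = 1), and the function name are choices made here.

```python
def holds_75(t, v, eps=1e-12):
    """Inequality (75): (v^t - v^(-t)) / t <= v - 1/v, for v >= 1 and t > 0."""
    return (v ** t - v ** (-t)) / t <= (v - 1 / v) + eps

vs = [1 + k / 10 for k in range(1, 100)]       # v = 1/u over (1, 10.9]
results = {}
for r in (0.5, 1, 2, 5):
    t = 1 / r
    results[r] = all(holds_75(t, v) for v in vs)
    print(r, results[r])
```

For r = 0.5 (t = 2) the inequality already fails at v = 2, where the left side is 1.875 and the right side is 1.5, consistent with TP2b holding only for r ≥ 1.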
Example 4.3 (= 3.3 continued). For the binary splitting GW process, the coefficient of $\theta^y$ in the power series $[A(\theta)]^x = (1 + \theta^2)^x$ is

$$h_a(x, y) = \begin{cases} \dbinom{x}{y/2} & \text{for } y = 0, 2, \dots, 2x, \\ 0 & \text{otherwise,} \end{cases} \tag{76}$$

which is TP2 in (x, y), so TP2a is satisfied. Furthermore $B(\theta) = \theta + \theta^{-1}$ and $u_\theta = \theta^{-2}$ for θ ≥ τ = 1, so (64) is equivalent to the valid inequality $2 - \theta^2 - \theta^{-2} \le 0$, hence TP2b is satisfied. Thus by Proposition 4.2, $(\dot{\mathbf{X}}_\theta)_n^+$ is stochastically decreasing and $(\ddot{\mathbf{X}}_\theta)_n$ is stochastically increasing for θ ≥ 1, while $(\dot{\mathbf{X}}_\tau)_n^+ \prec (\ddot{\mathbf{X}}_\tau)_n$.
Remark 4.1. The maximum likelihood estimate (MLE) $\hat{\theta}$ is derived by differentiating (40), then applying (18) to obtain the relation

$$\hat{\mu} \equiv \mu_{\hat{\theta}} = \frac{Y_n - x_0}{Y_{n-1}}, \tag{77}$$

from which $\hat{\theta}$ can be obtained. Here $\hat{\mu}$ denotes the MLE of the mean $\mu_\theta$.
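Relation (77) is a ratio of cumulative counts, and $\hat{\theta}$ follows by inverting $\mu_\theta$ in the chosen family. The sketch below does this for the Poisson and geometric psods of Examples 3.1 and 3.2; the function names are hypothetical and the counts are made-up numbers, not data from the paper.

```python
def mle_offspring_mean(x0, xs):
    """MLE (77) of mu_theta from X_0 = x0 and observed xs = (x_1, ..., x_n)."""
    y_n = x0 + sum(xs)                  # Y_n     = x_0 + x_1 + ... + x_n
    y_prev = y_n - xs[-1]               # Y_{n-1} = x_0 + x_1 + ... + x_{n-1}
    return (y_n - x0) / y_prev

def theta_hat_poisson(mu_hat):
    """Poisson psod: mu_theta = theta."""
    return mu_hat

def theta_hat_geometric(mu_hat):
    """Geometric psod: inverts mu_theta = theta / (1 - theta)."""
    return mu_hat / (1 + mu_hat)

xs = [3, 5, 4, 7]                       # hypothetical generation sizes x_1, ..., x_4
mu_hat = mle_offspring_mean(2, xs)      # (3 + 5 + 4 + 7) / (2 + 3 + 5 + 4)
print(mu_hat, theta_hat_poisson(mu_hat), theta_hat_geometric(mu_hat))
```

Note that $\hat{\mu} \le 1$ exactly when $x_n \le x_0$, since the numerator and denominator of (77) differ only in those two terms.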

5. Predicting extinction or explosion: the fixed sample size case

Based on observed data $\mathbf{x}_n \equiv (x_1, \dots, x_n)$ from a non-terminated psod GW process $X \equiv X_\theta$ with initial size $x_0$ and fixed n, predict whether extinction or explosion will occur for the current realization of the process.
By the Markov property for X,

$$\Pr_\theta[\,\text{extinction} \mid \mathbf{X}_n = \mathbf{x}_n\,] = u_\theta^{x_n} = 1 - \Pr_\theta[\,\text{explosion} \mid \mathbf{X}_n = \mathbf{x}_n\,]. \tag{78}$$

The MLE $\hat{u}$ of $u_\theta$ is given by $\hat{u} = u_{\hat{\theta}}$ where $\hat{\theta}$ is obtained from (77), so the estimated extinction probability is

$$\Pr_{\hat{\theta}}[\,\text{extinction} \mid \mathbf{X}_n = \mathbf{x}_n\,] = \hat{u}^{x_n} \begin{cases} = 1 & \text{if } x_n \le x_0, \\ < 1 & \text{if } x_n > x_0. \end{cases} \tag{79}$$

The value of $\hat{u}^{x_n}$ can be used to predict extinction or explosion.

However, this procedure may reach unwarranted conclusions. For exam-
ple, if xn = x0 −1 then extinction will be predicted with certainty even though
the population has declined only slightly. Whereas (79) is based solely on
the value of xn , the prediction procedures derived in this section compare
xn to n in order to predict whether the observed process is on a trajectory
toward extinction or toward explosion.
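To make the plug-in estimate (79) and its weakness concrete, the sketch below evaluates it for a Poisson(θ) psod, where $\hat{\theta} = \hat{\mu}$ and $\hat{u}$ is the smaller root of (33). The function name, the fixed-point solver, and the example counts are assumptions of this illustration.

```python
from math import exp

def plug_in_extinction_probability(x0, xs, tol=1e-12):
    """Estimate (79) for a Poisson(theta) psod: u_hat ** x_n with theta_hat = mu_hat."""
    theta_hat = sum(xs) / (x0 + sum(xs[:-1]))       # (77); mu_theta = theta for Poisson
    if theta_hat <= 1.0:                             # theta_hat <= tau = 1, so u_hat = 1
        return 1.0
    u = 0.0                                          # iterate s <- phi(s) from 0
    for _ in range(100_000):
        u_next = exp(theta_hat * (u - 1.0))
        if abs(u_next - u) < tol:
            break
        u = u_next
    return u_next ** xs[-1]

print(plug_in_extinction_probability(2, [3, 5, 4, 7]))   # x_n > x_0: strictly below 1
print(plug_in_extinction_probability(5, [9, 14, 4]))     # x_n = x_0 - 1: exactly 1
```

The second call shows the issue raised above: a population that has grown and then dipped just below its initial size is assigned estimated extinction probability 1.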
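5.1. Prediction as a testing problem. We reformulate the prediction problem as a hypothesis-testing problem to which Neyman-Pearson theory can be applied. As in Proposition 4.2, let $\mathbf{X}_n^+ \equiv (\mathbf{X}_\theta)_n^+$ denote the conditional random vector $\mathbf{X}_n \mid X_n > 0$. Then conditional on non-termination at time n, extinction will occur iff either

$$H_\le^+ : \mathbf{X}_n^+ \overset{d}{=} (\mathbf{X}_\theta)_n^+,\ \theta \le \tau \qquad \text{or} \qquad \dot{H}_\ge^+ : \mathbf{X}_n^+ \overset{d}{=} (\dot{\mathbf{X}}_\theta)_n^+,\ \theta \ge \tau$$

holds, while explosion will occur iff $\ddot{H}_>^+ : \mathbf{X}_n^+ \overset{d}{=} (\ddot{\mathbf{X}}_\theta)_n^+$, θ > τ holds. (Recall that $(\ddot{X}_\theta)^+ = \ddot{X}_\theta$.) However, $H_\le^+ = \dot{H}_\ge^+$ by Proposition 3.2 while by (49) the $L^1$-closure of $\ddot{H}_>^+$ is

$$\ddot{H}_\ge^+ : \mathbf{X}_n^+ \overset{d}{=} (\ddot{\mathbf{X}}_\theta)_n^+, \qquad \theta \ge \tau, \tag{80}$$

so the prediction problem can be formulated as the following testing problem: Based on observing $\mathbf{X}_n^+ = \mathbf{x}_n > 0$, test

$$\dot{H}_\ge^+ \ \text{(eventual extinction)} \quad \text{vs.} \quad \ddot{H}_\ge^+ \ \text{(eventual explosion)}. \tag{81}$$

Either $\dot{H}_\ge^+$ or $\ddot{H}_\ge^+$ may be taken to be the null hypothesis. Note that θ ≥ τ under both $\dot{H}_\ge^+$ and $\ddot{H}_\ge^+$. The conditional pmfs of $\mathbf{X}_n^+$ under $\dot{H}_\ge^+$ and $\ddot{H}_\ge^+$ are

$$\dot{f}_\theta^+(\mathbf{x}_n) \equiv \Pr_\theta[\dot{\mathbf{X}}_n^+ = \mathbf{x}_n] = \dot{b}_{\theta,n}\, \dot{f}_\theta(\mathbf{x}_n), \qquad \mathbf{x}_n > 0, \tag{82}$$

and $\ddot{f}_\theta^+(\mathbf{x}_n) \equiv \ddot{f}_\theta(\mathbf{x}_n)$, respectively, where $\ddot{f}_\theta$ is given by (42) and (47) and

$$\dot{b}_{\theta,n}^{-1} = \Pr_\theta[\dot{X}_n > 0] = \Pr_{\theta u_\theta}[X_n > 0] = b_{\theta u_\theta, n}^{-1}. \tag{83}$$

A version of the generalized LR criterion for (81) is

$$\lambda^+(\mathbf{x}_n) \equiv \frac{\sup_{\tau \le \theta < \psi} \ddot{f}_\theta^+(\mathbf{x}_n)}{\sup_{\tau \le \theta < \psi} \dot{f}_\theta^+(\mathbf{x}_n)}, \tag{84}$$

but the numerator and denominator may be difficult to evaluate.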

5.2. The least favorable distributions for ﬁxed sample size. When
the psod satisﬁes Conditions TP2a and TP2b the testing problem (81) has a
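convenient solution. Proposition 4.2 implies that Ḣ≥+ and Ḧ≥+ are separated
families and that $(\dot f_\tau^+, \ddot f_\tau^+)$ is a pair of least favorable distributions for (81).
By Theorem 3.8.1 of Lehmann and Romano (2005), a test of the form

$$\begin{array}{ll} \text{accept } \dot H_\ge^+ \text{ (predict extinction)} & \text{if } \lambda_\tau^+(X_n^+) \le d, \\ \text{accept } \ddot H_\ge^+ \text{ (predict explosion)} & \text{if } \lambda_\tau^+(X_n^+) > d, \end{array} \tag{85}$$

is the UMP test of its size for (81), where d is a nonnegative constant and,
from (47) and (82),

$$\lambda_\tau^+(x_n) \equiv \frac{\ddot f_\tau^+(x_n)}{\dot f_\tau^+(x_n)} = \frac{x_n}{x_0\, \dot b_{\tau,n}} = \frac{x_n}{x_0\, b_{\tau,n}}, \quad x_n > 0. \tag{86}$$

Because $\lambda_\tau^+(x_n)$ is strictly increasing in $x_n$, the test (85) has the form

$$\begin{array}{ll} \text{accept } \dot H_\ge^+ \text{ (predict extinction)} & \text{if } X_n^+ \le c, \\ \text{accept } \ddot H_\ge^+ \text{ (predict explosion)} & \text{if } X_n^+ \ge c + 1, \end{array} \tag{87}$$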
where c is a nonnegative integer.

5.3. Exponential-type approximations for Ẋn+ when θ = τ . Suppose
ﬁrst that Ḣ≥+ is taken to be the null hypothesis. If Xn+ = xn > 0 is observed,
the p-value Prτ [Ẋn+ ≥ xn ] for test (87) is determined by the distribution of
Ẋn+ under f˙τ+ . For large n the mean and variance of Ẋn+ can be approximated
via (93) as follows:

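$$E_\tau(\dot X_n^+) = x_0\, b_{\tau,n} \approx \frac{n\sigma_\tau^2}{2}, \tag{88}$$
$$\mathrm{Var}_\tau(\dot X_n^+) = x_0\, b_{\tau,n}\,[n\sigma_\tau^2 - x_0(b_{\tau,n} - 1)] \approx \frac{n\sigma_\tau^2}{2}\left(\frac{n\sigma_\tau^2}{2} + x_0\right). \tag{89}$$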
Unfortunately the conditional rv Ẋn+ (≡ Ẋn | Xn > 0) is not the sum of x0
i.i.d. copies each with initial size 1: conditional on Xn > 0, some of the
initial x0 family lines may have terminated by time n. Therefore a normal
approximation is not available for Ẋn+ , even when x0 is large.
Fortunately, however, when n is large Yaglom’s classical exponential ap-
proximation can be applied. For a critical GW process (not necessarily psod)
with oﬀspring variance σ 2 < ∞, Yaglom (1947) showed that if x0 = 1 then
$$\lim_{n\to\infty} \Pr[X_n^+ \ge nz] = e^{-2z/\sigma^2} \tag{90}$$

for any z > 0. This result appears under progressively weaker moment
conditions in Harris (1963, §I.10), Kesten, Ney, and Spitzer (1966, p.582),
Athreya and Ney (1972, §9), and Jagers (1975, Theorem 2.4.2), but only
for the case x0 = 1. When x0 ≥ 2 it might be expected that the limiting
exponential (EXP) distribution in (90) should be replaced by the distribution

of the sum of x0 independent exponential rvs, i.e., a gamma distribution, but
(90) continues to hold without change, cf. (92). However, we will also present
a more accurate gamma (GAM) approximation (94) that does depend on x0 .
Let Gr denote a gamma rv with shape parameter r > 0 and scale parameter 1
and let Gr(z) denote its cdf, that is,

$$G_r(z) = \frac{1}{\Gamma(r)} \int_0^z t^{r-1} e^{-t}\, dt. \tag{91}$$

Proposition 5.1. (i) Let {Xn } be a critical GW process with oﬀspring pgf
φ, oﬀspring variance σ 2 < ∞, and initial size x0 ≥ 1. For any z > 0,
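$$\lim_{n\to\infty} \Pr[X_n^+ \ge nz] = e^{-2z/\sigma^2}, \tag{92}$$
$$\lim_{n\to\infty} n \Pr[X_n > 0] = 2x_0/\sigma^2. \tag{93}$$

(ii) Let $\bar G_r(z) = 1 - G_r(z)$. For x0 ≥ 1 and large n,

$$\Pr[X_n^+ \ge nz] \approx \frac{1}{1 - \left(1 - \frac{2}{n\sigma^2}\right)^{x_0}} \sum_{r=1}^{x_0} \binom{x_0}{r}\, \bar G_r\!\left(\frac{2z}{\sigma^2}\right) \left(\frac{2}{n\sigma^2}\right)^{r} \left(1 - \frac{2}{n\sigma^2}\right)^{x_0 - r}. \tag{94}$$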

Proof. (i) The existing results for the case x0 = 1 are based on the following
fact, cf. Jagers (1975, Lemma 2.4.1):

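$$\lim_{n\to\infty} \frac{1}{n}\left[\frac{1}{1 - \phi_n(s)} - \frac{1}{1 - s}\right] = \frac{\sigma^2}{2} \quad \text{uniformly for } 0 \le s < 1. \tag{95}$$

Set s = 0 to obtain

$$\lim_{n\to\infty} n(1 - \phi_n(0)) = \frac{2}{\sigma^2}. \tag{96}$$

For x0 ≥ 2, Xn has pgf $\phi_n^{x_0}$ so by (96),

$$n \Pr[X_n > 0] = n(1 - \phi_n^{x_0}(0)) \tag{97}$$
$$= n(1 - \phi_n(0))\,[1 + \phi_n(0) + \cdots + \phi_n^{x_0 - 1}(0)] \tag{98}$$
$$\to \frac{2x_0}{\sigma^2} \tag{99}$$

because φn(0) ↑ 1; this confirms (93). Furthermore, the Laplace transform
of $X_n^+/n$ is, for t ≥ 0,

$$\begin{aligned}
E[e^{-tX_n^+/n}] &= \frac{\phi_n^{x_0}(e^{-t/n}) - \phi_n^{x_0}(0)}{1 - \phi_n^{x_0}(0)} \\
&= 1 - \frac{1 - \phi_n^{x_0}(e^{-t/n})}{1 - \phi_n^{x_0}(0)} \\
&= 1 - \frac{n(1 - \phi_n(e^{-t/n}))\,[1 + \phi_n(e^{-t/n}) + \cdots + \phi_n^{x_0 - 1}(e^{-t/n})]}{n(1 - \phi_n(0))\,[1 + \phi_n(0) + \cdots + \phi_n^{x_0 - 1}(0)]} \\
&\to 1 - \frac{\sigma^2}{2x_0} \cdot \frac{x_0}{\frac{1}{t} + \frac{\sigma^2}{2}} = \frac{1}{1 + \frac{t\sigma^2}{2}} \quad \text{as } n \to \infty
\end{aligned}$$

by (95) and the inequalities $\phi_n(0) < \phi_n(e^{-t/n}) < 1$. This is the Laplace
transform of the exponential distribution in (92), confirming that result.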
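(ii) Represent $X_n \overset{d}{=} U_1 + \cdots + U_{x_0}$, where the Ui are i.i.d. copies of Xn
but each with initial size x0 = 1. Then

$$\begin{aligned}
\Pr[X_n > 0]\, \Pr[X_n^+ \ge nz]
&\equiv \Pr[U_1 + \cdots + U_{x_0} > 0]\, \Pr[(U_1 + \cdots + U_{x_0})^+ \ge nz] \\
&= \Pr[U_1 + \cdots + U_{x_0} \ge nz] \\
&= \sum_{\omega \in 2^{x_0} \setminus \emptyset} \Pr\Big[\sum_{i \in \omega} U_i \ge nz,\ U_i > 0 \text{ for } i \in \omega,\ U_i = 0 \text{ for } i \notin \omega\Big] \\
&= \sum_{r=1}^{x_0} \binom{x_0}{r} \Pr[U_1 + \cdots + U_r \ge nz,\ U_1 > 0, \ldots, U_r > 0,\ U_{r+1} = \cdots = U_{x_0} = 0] \\
&= \sum_{r=1}^{x_0} \binom{x_0}{r} \Pr[U_1^+ + \cdots + U_r^+ \ge nz]\, \Pr[U_1 > 0]^r\, \Pr[U_{x_0} = 0]^{x_0 - r} \\
&\approx \sum_{r=1}^{x_0} \binom{x_0}{r}\, \bar G_r\!\left(\frac{2z}{\sigma^2}\right) \left(\frac{2}{n\sigma^2}\right)^{r} \left(1 - \frac{2}{n\sigma^2}\right)^{x_0 - r}
\end{aligned}$$

for large n, by (92) and by (93) with x0 = 1. Furthermore, by (97) and (96),

$$\Pr[X_n > 0] = 1 - \phi_n^{x_0}(0) \approx 1 - \left(1 - \frac{2}{n\sigma^2}\right)^{x_0}, \tag{100}$$

which yields (94).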

From (92) and (94) and a continuity correction we obtain exponential-type
approximations for the p-value of the test (87) when Ḣ≥+ and Ḧ≥+ are
taken to be the null and alternative hypothesis, respectively:

$$\Pr_\tau[\dot X_n^+ \ge x_n] \approx e^{-\frac{2(x_n - 1)}{n\sigma_\tau^2}} \equiv \dot\pi^{EXP}(x_n; n), \tag{101}$$

$$\Pr_\tau[\dot X_n^+ \ge x_n] \approx \frac{1}{1 - \left(1 - \frac{2}{n\sigma_\tau^2}\right)^{x_0}} \sum_{r=1}^{x_0} \binom{x_0}{r} \left(\frac{2}{n\sigma_\tau^2}\right)^{r} \left(1 - \frac{2}{n\sigma_\tau^2}\right)^{x_0 - r} \bar G_r\!\left(\frac{2(x_n - 1)}{n\sigma_\tau^2}\right) \equiv \dot\pi^{GAM}(x_n; n, x_0). \tag{102}$$

The EXP and GAM approximations coincide when x0 = 1. The approximate
p-value π̇EXP(xn; n) does not depend on the value of x0; it conveys significance
for Ḧ≥+ (explosion) iff xn ≫ nστ², but convergence to the exact p-value is slow,
see Remark 5.1. The π̇GAM(xn; n, x0) approximation is noticeably better.
Remark 5.1. The accuracy of the EXP and GAM approximations can be
assessed for the geometric psod (cf. Example 3.2). Here the pgf of Xn ≡ Ẋn
in the critical case can be obtained explicitly8 and expanded in a power series,
from which the exact distribution of Xn can be recovered. By (82) and (83)
this yields the exact distribution of Xn+ ≡ Ẋn+ in the critical case. The exact
p-values and their approximations π̇ EXP and π̇ GAM are shown in Tables 1 and
2, from which the superiority of the GAM approximation is apparent.
5.4. Exponential-type approximations for Ẍn+ when θ = τ . Suppose
next that Ḧ≥+ is the null hypothesis. The p-value Prτ [Ẍn+ ≤ xn ] for test
(87) is determined by the distribution of Ẍn+ ≡ Ẍn under f¨τ+ ≡ f¨τ . Again
a normal approximation is not available for large x0 because Ẍn is not the
sum of x0 i.i.d. copies each with initial size 1: conditional on explosion, some
of the initial x0 family lines nonetheless may become extinct. However, an
exponential-type approximation is available for large n, based on the follow-
ing representation for the process Ẍτ ≡ {Ẍτ,n |n ≥ 1}. We shall abbreviate
Ẍτ,n to Ẍn .
Proposition 5.2. Define Zn = Ẍn − 1, n = 1, 2, . . . , z0 = x0 − 1. When
θ = τ, Z ≡ {Zn | n ≥ 1} is a critical GW process with immigration (GWI).
Specifically,

$$Z_n \mid Z_{n-1} = \xi_1^{(n)} + \cdots + \xi_{Z_{n-1}}^{(n)} + W_n, \tag{103}$$

8
cf. eqn.(8.32) in Taylor and Karlin (1998) for the case x0 = 1.

                 x0 = 1              x0 = 2
  n    xn     π̇EXP    Exact      π̇GAM    Exact
  5    10    0.165   0.194      0.198   0.226
  5    20    0.022   0.031      0.032   0.042
  5    30    0.003   0.005      0.005   0.008
 10    20    0.150   0.164      0.165   0.178
 10    40    0.020   0.024      0.024   0.029
 10    60    0.003   0.004      0.004   0.005
 15    20    0.282   0.293      0.294   0.305
 15    40    0.074   0.081      0.081   0.087
 15    60    0.020   0.022      0.022   0.025
100   150    0.225   0.227      0.227   0.229

Table 1: Exact and approximate p-values Prτ [Ẋn+ ≥ xn ] for the geometric psod when
x0 = 1 and x0 = 2.

                            x0 = 8              x0 = 14
  n    xn     π̇EXP      π̇GAM    Exact      π̇GAM    Exact
  5    10    0.165     0.420   0.428      0.631   0.618
  5    20    0.022     0.129   0.141      0.286   0.288
  5    30    0.003     0.035   0.042      0.108   0.115
 10    20    0.150     0.262   0.273      0.369   0.375
 10    40    0.020     0.058   0.064      0.108   0.115
 10    60    0.003     0.012   0.014      0.029   0.032
 15    20    0.282     0.369   0.378      0.444   0.450
 15    40    0.074     0.126   0.133      0.178   0.184
 15    60    0.020     0.042   0.046      0.069   0.073
100   150    0.225     0.237   0.239      0.248   0.249

Table 2: Exact and approximate p-values Prτ [Ẋn+ ≥ xn ] for the geometric psod when
x0 = 8 and x0 = 14.

where $\xi_1^{(n)}, \ldots, \xi_{Z_{n-1}}^{(n)}, W_n$ are independent rvs, $\xi_j^{(n)} \overset{d}{=} \xi_\tau$, and $W_n$ is a nonnegative
integer-valued rv with pgf $\phi_\tau'(s)$. (This is a pgf since $\phi_\tau'(1) = \mu_\tau = 1$.)

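Proof. From (47),

$$\ddot f_\tau(x_n) = \prod_{i=1}^{n} \frac{x_i}{x_{i-1}} \cdot \frac{\tau^{x_i}}{(A(\tau))^{x_{i-1}}}\, h_a(x_{i-1}, x_i) \equiv \prod_{i=1}^{n} g_\tau(x_i \mid x_{i-1}), \tag{104}$$

so Ẍτ is a Markov chain with transition probability gτ(xi | xi−1). The conditional
pgf corresponding to gτ(xi | xi−1) is

$$\begin{aligned}
E_\tau(s^{\ddot X_i} \mid \ddot X_{i-1} = x_{i-1})
&= \frac{1}{x_{i-1}} \sum_{x_i} x_i\, s^{x_i}\, \frac{\tau^{x_i}}{(A(\tau))^{x_{i-1}}}\, h_a(x_{i-1}, x_i) \\
&= \frac{s}{x_{i-1}} \frac{d}{ds} \sum_{x_i} s^{x_i}\, \frac{\tau^{x_i}}{(A(\tau))^{x_{i-1}}}\, h_a(x_{i-1}, x_i) \\
&= \frac{s}{x_{i-1}} \frac{d}{ds} \left[(\phi_\tau(s))^{x_{i-1}}\right] \\
&= s\, (\phi_\tau(s))^{x_{i-1} - 1}\, \phi_\tau'(s).
\end{aligned} \tag{105}$$

The third equality holds since $\frac{\tau^{x_i}}{(A(\tau))^{x_{i-1}}} h_a(x_{i-1}, x_i)$ is the pmf of $\xi_1^{(i)} + \cdots + \xi_{x_{i-1}}^{(i)}$.
Thus (105) implies that

$$\ddot X_i \mid \ddot X_{i-1} = 1 + \xi_1^{(i)} + \cdots + \xi_{\ddot X_{i-1} - 1}^{(i)} + W_i, \tag{106}$$

where the $\xi_j^{(i)}$'s and $W_i$ are mutually independent rvs, the $\xi_j^{(i)}$'s have common
pgf $\phi_\tau$, and $W_i$ is the nonnegative integer-valued rv with pgf $\phi_\tau'(s)$. Now set
i = n in (106) to obtain (103).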
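By the theorem of Seneta (1970), $2Z_n/n\sigma_\tau^2 \overset{d}{\to} G_2$ (cf. (91)) if z0 = 1
(x0 = 2). Since Ẍn = Zn + 1, we obtain the following approximation when
x0 = 2:

$$\Pr_\tau[\ddot X_n \le x_n] \approx G_2\!\left(\frac{2x_n}{n\sigma_\tau^2}\right) \equiv \ddot\pi^{G2}(x_n; n) \quad \text{for large } n. \tag{107}$$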
We now show that if n is suﬃciently large, π̈ G2 (xn ; n) remains a valid approx-
imation for Prτ [Ẍn ≤ xn ] for all x0 ≥ 2. In the process we derive a sharper
approximation π̈ G23 (xn ; n, x0 ) ≤ π̈ G2 (xn ; n) that depends on x0 as well as n.
The case x0 = 1 is treated separately.

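Proposition 5.3. As in Proposition 5.2 let Zn = Ẍn − 1, z0 = x0 − 1, θ = τ.
(i) Assume that x0 ≥ 2, so z0 ≥ 1. Then if n is large and z > 0,

$$\Pr_\tau[Z_n \le nz] \approx \left(1 - \frac{2(z_0 - 1)}{n\sigma_\tau^2}\right) G_2\!\left(\frac{2z}{\sigma_\tau^2}\right) + \frac{2(z_0 - 1)}{n\sigma_\tau^2}\, G_3\!\left(\frac{2z}{\sigma_\tau^2}\right), \tag{108}$$

so

$$\Pr_\tau[\ddot X_n \le x_n] \approx \left(1 - \frac{2(x_0 - 2)}{n\sigma_\tau^2}\right) G_2\!\left(\frac{2x_n}{n\sigma_\tau^2}\right) + \frac{2(x_0 - 2)}{n\sigma_\tau^2}\, G_3\!\left(\frac{2x_n}{n\sigma_\tau^2}\right) \equiv \ddot\pi^{G23}(x_n; n, x_0). \tag{109}$$

This reduces to (107) if x0 = 2 or nστ² ≫ 2(x0 − 2).

(ii) Assume that x0 = 1, so z0 = 0, and define

$$K = \min\{k \mid \ddot X_k \ge 2\} = \min\{k \mid Z_k \ge 1\}. \tag{110}$$

Then if n − K is large,

$$\Pr_\tau[Z_n \le (n - K)z \mid K, Z_K] \approx \left(1 - \frac{2(Z_K - 1)}{(n - K)\sigma_\tau^2}\right) G_2\!\left(\frac{2z}{\sigma_\tau^2}\right) + \frac{2(Z_K - 1)}{(n - K)\sigma_\tau^2}\, G_3\!\left(\frac{2z}{\sigma_\tau^2}\right), \tag{111}$$

so the conditional p-value given K and ẌK can be approximated as follows:

$$\Pr_\tau[\ddot X_n \le x_n \mid K, \ddot X_K] \approx \left(1 - \frac{2(\ddot X_K - 2)}{(n - K)\sigma_\tau^2}\right) G_2\!\left(\frac{2x_n}{(n - K)\sigma_\tau^2}\right) + \frac{2(\ddot X_K - 2)}{(n - K)\sigma_\tau^2}\, G_3\!\left(\frac{2x_n}{(n - K)\sigma_\tau^2}\right) \equiv \ddot\pi^{G23}(x_n; n - K, \ddot X_K). \tag{112}$$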

Proof. (i) First assume that x0 ≥ 3, so z0 ≥ 2. Rewrite (103) as follows.

For n = 1,

$$Z_1 \mid z_0 = (\xi_1^{(1)} + W_1) + (\bar\xi_1^{(1)} + \cdots + \bar\xi_{z_0 - 1}^{(1)}) \equiv U_1 + V_1, \tag{113}$$

where the ξ's and ξ̄'s are i.i.d. copies of ξτ. For n ≥ 2,

$$Z_n \mid Z_{n-1} = (\xi_1^{(n)} + \cdots + \xi_{U_{n-1}}^{(n)} + W_n) + (\bar\xi_1^{(n)} + \cdots + \bar\xi_{V_{n-1}}^{(n)}) \equiv U_n + V_n. \tag{114}$$

(If Vn−1 = 0, Vn = 0.) Then {Un} is a critical GWI process with immigration
rvs {Wn} and initial size u0 = 1, {Vn} is a critical GW process with initial
size v0 = z0 − 1 = x0 − 2 ≥ 0, and {Un} is independent of {Vn}. Therefore

$$\Pr_\tau[2Z_n \le n\sigma_\tau^2 z] = \Pr_\tau[2U_n \le n\sigma_\tau^2 z]\, \Pr_\tau[V_n = 0] + \Pr_\tau[2(U_n + V_n^+) \le n\sigma_\tau^2 z]\, \Pr_\tau[V_n > 0]. \tag{115}$$

Since u0 = 1, Seneta's result applies to give $2U_n/n\sigma_\tau^2 \overset{d}{\to} G_2$, while by (92)
$2V_n^+/n\sigma_\tau^2 \overset{d}{\to} G_1$. Because Un and Vn are independent,

$$\Pr_\tau[2(U_n + V_n^+) \le n\sigma_\tau^2 z] \to G_3(z) \quad \text{as } n \to \infty. \tag{116}$$

Furthermore, $n \Pr_\tau[V_n > 0] \to 2(z_0 - 1)/\sigma_\tau^2$ by (93), so (115) yields (108),
which, applying the continuity correction, yields (109) since Zn = Ẍn − 1.
If x0 = 2 so z0 = 1, then all Vn = 0 and (108) reduces to Seneta's result
for {Un}.
(ii) The case x0 = 1 differs because when z0 = 0 the first nonzero value for
the GWI process {Zn} is ZK = WK and occurs when n = K. By conditioning
on K and ZK or ẌK, however, (111) and (112) follow directly from (108)
and (109) by replacing z0 by ZK, x0 by ẌK, and n by n − K.
Like π̇EXP(xn; n), the approximate p-value π̈G2(xn; n) does not depend on
x0 (≥ 2); it conveys significance for Ḣ≥+ (eventual extinction) iff xn ≪ nστ².
We expect that π̈G2, like π̇EXP, will converge only slowly to the exact p-value,
but that π̈G23 will perform noticeably better. Note that π̈G23 requires
nστ² > 2(x0 − 2); otherwise the weight assigned to G2 in (109) is negative.
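The approximations (107) and (109) can be evaluated in the same way. The sketch below is illustrative only: the function names are hypothetical, the gamma cdf is computed from the closed form available for integer shape, and the applicability condition nστ² > 2(x0 − 2) is enforced by raising an error, which is a design choice of this sketch rather than anything prescribed in the text.

```python
import math


def gamma_cdf_int(r, z):
    """Cdf G_r(z) of a gamma rv with integer shape r and scale 1."""
    return 1.0 - math.exp(-z) * sum(z**k / math.factorial(k) for k in range(r))


def p_g2(x_n, n, sigma2):
    """G2 approximation (107) to Pr[X_n <= x_n] for the explosion-conditioned process."""
    return gamma_cdf_int(2, 2.0 * x_n / (n * sigma2))


def p_g23(x_n, n, x0, sigma2):
    """G23 approximation (109): a mixture of G_2 and G_3 with weight depending on x0."""
    if x0 < 2:
        raise ValueError("(109) assumes x0 >= 2; see part (ii) for x0 = 1")
    w = 2.0 * (x0 - 2) / (n * sigma2)   # weight assigned to G_3
    if w >= 1.0:
        raise ValueError("requires n * sigma2 > 2 * (x0 - 2)")
    arg = 2.0 * x_n / (n * sigma2)
    return (1.0 - w) * gamma_cdf_int(2, arg) + w * gamma_cdf_int(3, arg)


# Geometric psod (sigma_tau^2 = 2), entries comparable with Table 3.
print(round(p_g2(2, 5, 2.0), 3))
print(round(p_g23(4, 10, 8, 2.0), 3))
print(round(p_g23(8, 20, 14, 2.0), 3))
```

With x0 = 38, n = 8, and στ² = 2 (the condor data of Example 7.3) the call `p_g23(19, 8, 38, 2.0)` raises an error, reflecting the inapplicability noted there.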
Remark 5.2. The accuracy of the G2 and G23 approximations can be
assessed for the geometric psod. The pmf of Ẍn in the critical case can be
obtained from (47) as follows: for xn > 0,
$$\Pr_\tau[\ddot X_n = x_n] = \frac{x_n}{x_0}\, \Pr_\tau[X_n = x_n], \tag{117}$$
and Prτ [Xn = xn ] can be obtained explicitly as in Remark 5.1. Exact p-values
and the G2 and G23 approximations are shown in Table 3, from which the
superiority of G23 is apparent.
Remark 5.3. Moments of Ẍn can be obtained from (106) by recursion, e.g.,
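$$E_\tau(\ddot X_n) = x_0 + n\sigma_\tau^2, \tag{118}$$
$$\mathrm{Var}_\tau(\ddot X_n) = n\left[\omega_\tau + \frac{n - 3}{2}\,\sigma_\tau^4 + (x_0 - 3)\sigma_\tau^2 - 1\right], \tag{119}$$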
where ωτ = E(ξτ3 ).

                x0 = 2             x0 = 8             x0 = 14
  n    xn    π̈G2    Exact      π̈G23   Exact      π̈G23   Exact
  5     2   0.062   0.062        -     0.022        -     0.008
  5     4   0.191   0.169        -     0.069        -     0.028
  5     6   0.337   0.291        -     0.134        -     0.060
 10     4   0.062   0.063      0.029   0.038        -     0.022
 10     8   0.191   0.180      0.105   0.115        -     0.073
 10    12   0.337   0.313      0.207   0.213        -     0.143
 20     8   0.062   0.063      0.045   0.048      0.029   0.037
 20    12   0.122   0.120      0.092   0.094      0.063   0.074
 20    16   0.191   0.186      0.148   0.148      0.105   0.118
 50    16   0.041   0.042      0.037   0.038      0.033   0.034
 50    24   0.084   0.084      0.076   0.076      0.067   0.069
 50    32   0.135   0.134      0.122   0.122      0.109   0.111

Table 3: Exact and approximate p-values Prτ [Ẍn+ ≤ xn ] for the geometric psod when
x0 = 2, 8, 14.

6. Predicting extinction or explosion: sequential sampling

6.1. Sequential probability ratio tests (SPRT). The SPRT (Barnard
(1946), Wald (1947), Ghosh (1970), Stuart and Ord (1991)) is well suited for
the following sequential version of the prediction problem:
Based on sequential data x ≡ (x0 , x1 , x2 , . . . ) from a psod GW process X ≡
Xθ with initial size x0 , predict whether extinction or explosion will occur for
the current realization of the process.
Unlike Section 5, non-termination need not be assumed. This prediction
problem can be formulated as the following testing problem:
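Based on observing X sequentially, test

$$\dot H_\ge : X \overset{d}{=} \dot X_\theta,\ \theta \ge \tau \ \text{(eventual extinction)} \quad \text{vs.} \quad \ddot H_\ge : X \overset{d}{=} \ddot X_\theta,\ \theta \ge \tau \ \text{(eventual explosion)}. \tag{120}$$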

For ﬁxed θ ≥ τ , the SPRT for testing f˙θ vs. f¨θ has the following form:

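The SPRT(θ; B, A): fix 0 < B < 1 < A < ∞. For n = 1, 2, . . . ,

$$\begin{cases} \text{stop and accept } \dot H_\ge \text{ (predict extinction)} & \text{if } \lambda_\theta(x_n) \le B, \\ \text{stop and accept } \ddot H_\ge \text{ (predict explosion)} & \text{if } \lambda_\theta(x_n) \ge A, \\ \text{continue sampling} & \text{if } B < \lambda_\theta(x_n) < A, \end{cases}$$

where $\lambda_\theta(x_n) = \ddot f_\theta(x_n)/\dot f_\theta(x_n)$.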

The stopping time for the SPRT(θ; B, A) is a random variable N (θ; B, A).
Because Prθ [Xn → 0 or ∞] = 1 for all θ ≥ τ , N (θ; B, A) is ﬁnite with
probability 1. As B decreases and A increases, Eθ [N (θ; B, A)] increases
under both Ḣ≥ and Ḧ≥ , but the error probabilities αθ ≡ αθ (θ; B, A) and
βθ ≡ βθ (θ; B, A) need not both decrease (cf. Wald (1947, p.45)). Here,

αθ (θ; B, A) ≡ Prθ [SPRT(θ; B, A) accepts Ḧ≥ | Ḣ≥ ]

βθ (θ; B, A) ≡ Prθ [SPRT(θ; B, A) accepts Ḣ≥ | Ḧ≥ ]

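Wald (1947, §3.2) derived the following upper bounds: for any θ ≥ τ,

$$\alpha_\theta(\theta; B, A) \le \frac{1 - \beta_\theta(\theta; B, A)}{A} \le \frac{1}{A}, \tag{121}$$
$$\beta_\theta(\theta; B, A) \le (1 - \alpha_\theta(\theta; B, A))\,B \le B. \tag{122}$$

Thus if α and β are prespecified, we may choose B = β and A = 1/α to
guarantee that SPRT(θ; β, 1/α) satisfies the error bounds

$$\alpha_\theta\!\left(\theta; \beta, \tfrac{1}{\alpha}\right) \le \alpha \quad \text{and} \quad \beta_\theta\!\left(\theta; \beta, \tfrac{1}{\alpha}\right) \le \beta. \tag{123}$$

Wald also derived the following approximations: if α + β < 1 then
SPRT(θ; β/(1−α), (1−β)/α) more nearly attains the specified error probabilities α and
β than does SPRT(θ; β, 1/α), i.e.,

$$\alpha_\theta\!\left(\theta; \tfrac{\beta}{1-\alpha}, \tfrac{1-\beta}{\alpha}\right) \approx \alpha, \qquad \beta_\theta\!\left(\theta; \tfrac{\beta}{1-\alpha}, \tfrac{1-\beta}{\alpha}\right) \approx \beta, \tag{124}$$

$$\alpha_\theta\!\left(\theta; \tfrac{\beta}{1-\alpha}, \tfrac{1-\beta}{\alpha}\right) + \beta_\theta\!\left(\theta; \tfrac{\beta}{1-\alpha}, \tfrac{1-\beta}{\alpha}\right) \le \alpha + \beta. \tag{125}$$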

6.2. The least favorable distribution for sequential sampling. Be-

cause θ is unknown, the SPRT(θ; ·, ·) cannot be applied directly (but see
Section 6.3.) When the psod satisﬁes TP2a and TP2b, however, like (81) the
testing problem (120) has a convenient solution, namely the SPRT(τ ; ·, ·).

Propositions 4.1 and 4.2 imply that Ḣ≥ and Ḧ≥ are separated families and
that $(\dot f_\tau, \ddot f_\tau)$ is a pair of least favorable distributions for (120). Furthermore,
by Propositions 4.1 and 4.2, αθ and βθ both decrease as θ increases. Therefore
SPRT(τ; β, 1/α) (respectively, SPRT(τ; β/(1−α), (1−β)/α)) is an optimal test for
$\dot f_\tau$ vs. $\ddot f_\tau$ for which αθ ≤ α and βθ ≤ β (resp., approximately) for all θ ≥ τ.
Specifically, by (41) and (47),

$$\lambda_\tau(x_n) \equiv \frac{\ddot f_\tau(x_n)}{\dot f_\tau(x_n)} = \frac{x_n}{x_0}, \tag{126}$$

so the SPRT(τ; B, A) assumes the simple form

$$\begin{cases} \text{stop and accept } \dot H_\ge \text{ (predict extinction)} & \text{if } x_n \le x_0 B, \\ \text{stop and accept } \ddot H_\ge \text{ (predict explosion)} & \text{if } x_n \ge x_0 A, \\ \text{continue sampling} & \text{if } x_0 B < x_n < x_0 A. \end{cases}$$

Note that this is a universal prediction procedure, that is, it is valid for any
psod; in particular it does not depend on στ². As a consequence, however, it
is somewhat conservative.
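Because the universal SPRT(τ; B, A) only compares xn with x0 B and x0 A, it takes a few lines to apply. The sketch below is an illustration under stated assumptions: the function name `universal_sprt` and its return convention are hypothetical, and the thresholds are taken as B = β and A = 1/α as in (123). The data are the counts from Tables 4 and 6.

```python
def universal_sprt(counts, alpha, beta):
    """Apply SPRT(tau; beta, 1/alpha) to counts = [x0, x1, x2, ...].

    Returns (n, decision): the stopping generation n and "extinction" or
    "explosion", or (None, "continue") if no boundary has been crossed yet.
    """
    x0 = counts[0]
    lower = x0 * beta    # x0 * B with B = beta
    upper = x0 / alpha   # x0 * A with A = 1 / alpha
    for n, x in enumerate(counts[1:], start=1):
        if x <= lower:
            return n, "extinction"
        if x >= upper:
            return n, "explosion"
    return None, "continue"


smallpox = [1, 5, 3, 12, 24]                                  # Table 4
pertussis = [1, 7, 22, 38, 50, 65, 78, 61, 74, 96, 102, 98]   # Table 6

print(universal_sprt(smallpox, 0.05, 0.05))
print(universal_sprt(pertussis, 0.05, 0.05))
print(universal_sprt(pertussis, 0.01, 0.01))
```

Note that n indexes generations starting from 0, so for the pertussis data n = 2 corresponds to Week 3 and n = 10 to Week 11.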
6.3. A less conservative sequential prediction procedure. If θ were
known (θ > τ), the SPRT(θ; β, 1/α) (respectively, SPRT(θ; β/(1−α), (1−β)/α)) provides
an optimal test for $\dot f_\theta$ vs. $\ddot f_\theta$ for which αθ ≤ α and βθ ≤ β (resp., αθ ≈ α and
βθ ≈ β). From (12) and (13),

$$\lambda_\theta(x_n) \equiv \frac{\ddot f_\theta(x_n)}{\dot f_\theta(x_n)} = \frac{u_\theta^{-x_n} - 1}{u_\theta^{-x_0} - 1}. \tag{127}$$

Because λθ(xn) is strictly increasing in xn, the SPRT(θ; B, A) is given by

$$\begin{cases} \text{stop and accept } \dot H_\theta \text{ (predict extinction)} & \text{if } X_n \le x_0\, \ell(u_\theta^{x_0}, B), \\ \text{stop and accept } \ddot H_\theta \text{ (predict explosion)} & \text{if } X_n \ge x_0\, \ell(u_\theta^{x_0}, A), \\ \text{continue sampling} & \text{if } x_0\, \ell(u_\theta^{x_0}, B) < X_n < x_0\, \ell(u_\theta^{x_0}, A), \end{cases}$$

where

$$\ell(u, \eta) = \frac{\log\left(\left(\frac{1 - u}{u}\right)\eta + 1\right)}{\log\left(\frac{1}{u}\right)}, \quad 0 < u < 1,\ 0 \le \eta < \infty. \tag{128}$$

For fixed u, ℓ(u, η) increases strictly and continuously from 0 to ∞ as η ranges
from 0 to ∞; also ℓ(u, 0) = 0 and ℓ(u, 1) = 1. Define ℓ(1, η) = ℓ(1−, η) = η.
Lemma 6.1. If 0 < u < 1 and 0 < η < 1 (resp., 1 < η < ∞), then the function ℓ(u, η)
of (128) is strictly decreasing (resp., strictly increasing) in u and

$$\eta < \ell(u, \eta) < 1 \qquad (\text{resp., } 1 < \ell(u, \eta) < \eta). \tag{129}$$

Proof. Set $v = \frac{1 - u}{u}$, so that 0 < v < ∞ and

$$\ell(u, \eta) = \frac{\log(v\eta + 1)}{\log(v + 1)} \equiv \bar\ell(v, \eta).$$

For 1 < η < ∞, to show that ℓ(u, η) is strictly increasing for 0 < u < 1,
it suffices to show that $\bar\ell(v, \eta)$ is strictly decreasing for 0 < v < ∞, that is,
$\partial \bar\ell(v, \eta)/\partial v < 0$. This is equivalent to showing that

$$\frac{\eta \log(v + 1)}{\eta v + 1} - \frac{\log(\eta v + 1)}{v + 1} < 0,$$

equivalently, that

$$\Delta(v, \eta) \equiv (v + 1)\log(v + 1) - \left(v + \frac{1}{\eta}\right)\log(\eta v + 1) < 0. \tag{130}$$

But Δ(v, 1) = 0 for η = 1 and

$$\frac{\partial \Delta(v, \eta)}{\partial \eta} = -\frac{(v + \frac{1}{\eta})\,v}{\eta v + 1} + \frac{\log(\eta v + 1)}{\eta^2} = -\frac{1}{\eta^2}\left[\eta v - \log(\eta v + 1)\right] < 0,$$

hence (130) holds. Then by L'Hospital's rule,

$$\eta = \bar\ell(0+, \eta) > \bar\ell(v, \eta) > \bar\ell(\infty-, \eta) = 1,$$

which yields the desired inequalities for ℓ(u, η). The results for 0 < η < 1
are established in similar fashion.
From Lemma 6.1, the lower (resp., upper) stopping boundary for the
SPRT(θ; B, A) strictly increases (resp., strictly decreases) as θ increases on
[τ, ψ), hence the continuation region shrinks and N(θ; B, A) decreases. Thus,
with ℓ the function defined in (128),

$$x_0 B < x_0\, \ell(u_\theta^{x_0}, B) < x_0 < x_0\, \ell(u_\theta^{x_0}, A) < x_0 A. \tag{131}$$

This difference can be substantial (see Table 5) and implies that

$$E_\theta[N(\theta; B, A)] < E_\theta[N(\tau; B, A)] \quad \text{for all } \theta \ge \tau. \tag{132}$$

Thus if one is willing to assume a fairly unrestrictive upper bound uθ ≤
ū < 1 for the extinction probability uθ (e.g., ū = 0.90, 0.95, or 0.99), corresponding
to an unrestrictive lower bound $\theta \ge \underline\theta \equiv \theta_{\bar u} > \tau$ for θ itself, then
by using the SPRT($\underline\theta$; β, 1/α) or SPRT($\underline\theta$; β/(1−α), (1−β)/α), by Proposition 4.1(ii) one
would control the first error probability for problem (120), i.e., αθ ≤ α for
all $\theta \ge \underline\theta$, while substantially reducing the expected stopping time. If TP2a
and TP2b hold, then by Proposition 4.2(iii) the second error probability also
would be controlled, i.e., βθ ≤ β for all $\theta \ge \underline\theta$.
Remark 6.1. For the Poisson(θ) psod, it follows from (33) that

$$\theta_{\bar u} = -\frac{\log(\bar u)}{1 - \bar u}, \tag{133}$$

so θ.90 = 1.0536, θ.95 = 1.0259, θ.99 = 1.0050, which lower bounds are close
to the critical value τ = 1. For the negative binomial(r, θ) psod, (35) yields

$$\theta_{\bar u} = \frac{1 - \bar u^{1/r}}{1 - \bar u^{(r+1)/r}}, \tag{134}$$

which reduces to θū = 1/(1 + ū) for the geometric(θ) psod when r = 1. Here again
this lower bound will be close to the critical value τ = 1/(1 + r) if ū is close to 1.
In these cases, therefore, the assumption that uθ ≤ ū is not very restrictive
for ū = .90, .95, .99.
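The lower bounds (133) and (134) can be checked numerically. This is a small sketch with hypothetical function names; it simply evaluates the two displayed formulas.

```python
import math


def theta_lower_poisson(u_bar):
    """Lower bound (133) on theta for the Poisson(theta) psod, given u_theta <= u_bar < 1."""
    return -math.log(u_bar) / (1.0 - u_bar)


def theta_lower_negbin(u_bar, r):
    """Lower bound (134) on theta for the negative binomial(r, theta) psod."""
    return (1.0 - u_bar ** (1.0 / r)) / (1.0 - u_bar ** ((r + 1.0) / r))


for u_bar in (0.90, 0.95, 0.99):
    print(u_bar, round(theta_lower_poisson(u_bar), 4), round(theta_lower_negbin(u_bar, 1.0), 4))
```

For r = 1 the second function agrees with 1/(1 + ū), the geometric case mentioned above.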
Remark 6.2. Unlike the ﬁxed-n prediction procedures derived in §5 (cf.
Remarks 5.1 and 5.2), the SPRTs compare xn to x0 rather than to n in order
to predict whether the observed process is on a trajectory toward extinction
or toward explosion. Note that if x0 is small, the SPRTs are useful for
predicting explosion but not for predicting extinction. For example,

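$$x_0\, \ell(u_\theta^{x_0}, \beta) < 1 \iff x_0 < \ell\!\left(u_\theta, \tfrac{1}{\beta}\right), \tag{135}$$

where ℓ is the function defined in (128), so if $x_0 < \ell(u_\theta, \tfrac{1}{\beta})$ then the SPRT(θ; β, 1/α) reduces to the SPRT(θ; 0, 1/α):

$$\begin{cases} \text{stop and accept } \dot H_\theta \text{ (predict extinction)} & \text{if } X_n = 0, \\ \text{stop and accept } \ddot H_\theta \text{ (predict explosion)} & \text{if } X_n \ge x_0\, \ell(u_\theta^{x_0}, \tfrac{1}{\alpha}), \\ \text{continue sampling} & \text{if } 1 \le X_n < x_0\, \ell(u_\theta^{x_0}, \tfrac{1}{\alpha}), \end{cases}$$

hence will predict extinction only if extinction actually occurs. If x0 < 1/β,
the SPRT(τ; β, 1/α) also reduces to SPRT(τ; 0, 1/α), hence behaves similarly.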
Remark 6.3. Note that the SPRT(θ; B, A) depends on θ only through the
value of the extinction probability uθ , not on the speciﬁc oﬀspring distri-
bution, whether a psod or not. Therefore, hereafter we shall use the nota-
tion SPRT(uθ ; B, A), or simply SPRT(u; B, A). In particular, the universal
SPRT(τ; B, A) in §6.2 is now designated as SPRT(1; B, A).

7. Examples
Four examples are presented to illustrate the ﬁxed-n (§5) and sequential (§6)
procedures for predicting extinction or explosion from the current realization
of a GW process. Because the Poisson, negative binomial, and geometric
psods are assumed, conditions TP2a and TP2b are satisﬁed, so these predic-
tion procedures possess the properties asserted in §5.2, 6.2, and 6.3.
Example 7.1: Smallpox in Sao Paolo, Brazil. An outbreak of variola
minor in Sao Paolo occurred in 1956 (see Table 4). This outbreak was caused
by a single infectious individual and lasted four generations before the schools
closed; see Becker (1972), Guttorp (1991, p.59). Becker (1977) and Heyde
(1979) modeled these data by a GW process; also see Guttorp (1991, p.58).
Like Heyde we assume a Poisson(θ) psod; here μθ = θ, τ = 1, στ2 = 1.

n 0 1 2 3 4
xn 1 5 3 12 24
Table 4: Occurrences of variola minor in Sao Paolo, Brazil, 1956.

These data suggest a trajectory toward explosion. To assess the strength

of this prediction, first consider the fixed-n testing problem (81) with Ḣ≥+
(eventual extinction) taken to be the null hypothesis and Ḧ≥+ (explosion) the
alternative. Here x0 = 1, n = 4, xn = 24 so the exponential approximation
(101) for the p-value of the fixed-n prediction procedure (87) is
$$\dot\pi^{EXP}(24; 4) = e^{-47/4} \approx 7.89 \times 10^{-6}, \tag{136}$$

which strongly supports the prediction of explosion. Because n = 4 is not

large, this approximation is not entirely reliable. (Because x0 = 1, the EXP
and GAM approximations coincide.)

                α = β = 0.05          α = β = 0.01
 x0     ū      lower    upper       lower    upper
  1    1.00     0.05     20          0.01     100
       0.99     0.05     18.3        0.01     69.5
       0.95     0.05     14.0        0.01     35.8
       0.90     0.05     11.1        0.01     23.7
 14    1.00     0.70    280          0.14    1400
       0.99     0.75    138.5        0.15     276.5
       0.95     0.99     60.3        0.20      90.9
       0.90     1.48     40.1        0.31      55.3
 38    1.00     1.90    760          0.38    3800
       0.99     2.29    232.1        0.46     384.2
       0.95     5.13     93.6        1.14     124.8
       0.90    12.39     66.3        4.09      81.5

Table 5: Stopping boundaries for SPRT(ū; β, 1/α), α = β = 0.05 and 0.01. The lower
and upper columns give the boundaries of the continuation region, i.e., the values
obtained from (128) at (ū^x0, β) and (ū^x0, 1/α), each multiplied by x0.

Next we apply the sequential testing approach. Here x0 = 1, so the stopping
boundaries for the sequential prediction procedure SPRT(ū; β, 1/α) (cf.
Remark 6.3) appear in the first tier of Table 5 for α = β = .05, .01 and
ū = 1.0, .99, .95, 0.90. As ū decreases, SPRT(ū; β, 1/α) becomes less conservative,
stopping more quickly. For example, SPRT(ū = 1.0; .05, 1/.05) stops
and predicts explosion when xn ≥ 20, which here occurs when n = 4, while
SPRT(ū = .90; .05, 1/.05) stops and predicts explosion when xn ≥ 11.1, which
occurs when n = 3.
The SPRT(ū = .90; .05, 1/.05) requires the assumption that uθ ≤ ū = .90,
equivalently θ ≥ θū = 1.0536, see Remark 6.1. The reliability of this assumption
can be assessed in two ways. First, yn−1 = 21 and yn = 45 so
θ̂ = μ̂ = (45 − 1)/21 ≈ 2.095 from (77), which is substantially larger than
1.0536. Second, an estimate of uθ could be obtained from the nonparametric
MLE p̂ of the oﬀspring distribution pθ (cf. Guttorp (1991, Proposition 3.4),
also Stigler (1971)), but this would require knowledge of the family histories
of each infected individual, which is unavailable. Here, however, p̂ can be
obtained from the EM algorithm because n is small, cf. Guttorp (1991, pp.
119-120). For n = 3, p̂ puts masses (0.239, 0.428, 0.206, 0.127) on 0, 1, 5,
6, and from (3) the estimated extinction probability for this distribution is
0.424. For n = 4 the estimated distribution puts masses (0.332, 0.147, 0.219,

0.302) on 0, 1, 2, and 5, yielding an estimated extinction probability 0.447.
Both estimates fall well below the assumed upper bound ū = 0.9.
Because the outbreak terminated at the 7th generation, ﬁxed-n prediction
methods (§5) are not relevant. Instead, because x0 = 1 the stopping boundaries
of SPRT(ū; β, 1/α) for ū = 1.0, 0.99, 0.95, 0.90 and α = β = .05, .01 again
appear in the ﬁrst tier of Table 5. Because 1 ≤ xn ≤ 7 for n = 1, . . . , 6 in
this example, none of these SPRTs would stop sampling until the extinction
observed at n = 7. Note that θ̂ = μ̂ = (30 − 1)/30 = 0.967 < 1 by (77)
(yn−1 = yn = 30).
Example 7.2: Pertussis in Washington State, 2012. The weekly num-
ber of new cases of pertussis remained fairly constant in 2011 (Figure 1) but
increased dramatically at the beginning of 2012 (Table 6), suggesting possi-
ble explosion. Here x0 = 1, n = 11, xn = 98, yn−1 = 594, and yn = 692. The
MLE μ̂ = 691/594 = 1.1633 and σ̃ 2 = 8.0342, where

$$\tilde\sigma^2 \equiv \frac{1}{n} \sum_{\nu=1}^{n} x_{\nu-1} \left(\frac{x_\nu}{x_{\nu-1}} - \hat\mu\right)^2 \tag{137}$$

is Dion’s (1975) estimate of the oﬀspring variance, cf. Guttorp (1991, p.109).
Because μ̂ ≪ σ̃², the Poisson distribution does not fit these data. Instead,
since μ̂ and σ̃ 2 agree with the mean and variance of the negative binomial
NB(r̂, θ̂) psod with r̂ = 0.1970 and θ̂ = 0.8552 (cf. Example 3.2), we shall
assume the model NB(r = 0.1970, θ) with θ unknown (0 < θ < 1), so
τ = (1 + r)^{−1} = 0.8354 and στ² = 6.0761.
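The estimates quoted in this example can be reproduced from the weekly counts in Table 6. The sketch below is illustrative: the function names are hypothetical, μ̂ is computed as (yn − x0)/yn−1 as in the text, σ̃² follows (137), and the negative binomial fit matches moments under the assumption that NB(r, θ) has mean rθ/(1 − θ) and variance rθ/(1 − θ)² (an assumption of this sketch, since Example 3.2 is not reproduced here).

```python
def gw_estimates(counts):
    """Return (mu_hat, sigma2_tilde) for generation sizes counts = [x0, ..., xn].

    mu_hat = (y_n - x0) / y_{n-1}; sigma2_tilde is Dion's estimate (137).
    Assumes every x_{nu-1} used as a divisor is positive.
    """
    n = len(counts) - 1
    y_n = sum(counts)
    y_prev = y_n - counts[-1]
    mu_hat = (y_n - counts[0]) / y_prev
    sigma2 = sum(
        prev * (cur / prev - mu_hat) ** 2
        for prev, cur in zip(counts[:-1], counts[1:])
    ) / n
    return mu_hat, sigma2


def negbin_moment_fit(mu, sigma2):
    """Match mean r*theta/(1-theta) and variance r*theta/(1-theta)^2 (assumed form)."""
    theta = 1.0 - mu / sigma2
    r = mu * (1.0 - theta) / theta
    return r, theta


pertussis = [1, 7, 22, 38, 50, 65, 78, 61, 74, 96, 102, 98]   # Table 6
mu_hat, sigma2 = gw_estimates(pertussis)
r_hat, theta_hat = negbin_moment_fit(mu_hat, sigma2)
print(round(mu_hat, 4), round(sigma2, 3), round(r_hat, 3), round(theta_hat, 3))
```

Under the same assumed parametrization, the critical variance is (1 + r)/r, which for r = 0.1970 gives the value στ² = 6.0761 used below.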

Week 1 2 3 4 5 6 7 8 9 10 11 12
n 0 1 2 3 4 5 6 7 8 9 10 11
xn 1 7 22 38 50 65 78 61 74 96 102 98
Table 6: Weekly occurrences of pertussis in Washington State, 2012.

To assess the evidence for a prediction of explosion, ﬁrst consider the

ﬁxed-n testing problem (81) with Ḣ≥+ (eventual extinction) taken to be the
null hypothesis and Ḧ≥+ (eventual explosion) the alternative. From (101) the
exponential approximation to the p-value of the ﬁxed-n procedure (87) is
$$\dot\pi^{EXP}(98; 11) = e^{-\frac{195}{11(6.0761)}} \approx 0.054, \tag{138}$$

which moderately supports the prediction of explosion.
By contrast, from Tables 5 and 6 the conservative SPRT(1.0; β, 1/α) with
α = β = 0.05 would have stopped and predicted explosion by Week 3! With
α = β = 0.01 this SPRT would not have stopped until Week 11, but the
SPRT(0.90; 0.01, 1/0.01) would have predicted explosion by Week 4.
In fact a state of health emergency was declared after Week 14 and an
inoculation program begun. The number of new cases9 continued to increase
to a peak of 254 in Week 20, then declined to 23 new cases in Week 52. Had
these sequential prediction procedures been applied, this program could have
begun much earlier, possibly greatly reducing the total number of cases.
We note that the prediction for 2012 (explosion) is the opposite of that
which our methods would obtain for 2011 (extinction), even though the in-
crease in μ̂ from 2011 to 2012 is small, namely 1.0455 vs. 1.1633.
Example 7.3: California condors. Wilbur (1978) gives the annual pop-
ulation counts of the threatened California condor from 1968 through 1976
(see Table 7). Here x0 = 38, n = 8, xn = 19, yn−1 = 183, and yn = 202;
the MLE μ̂ = 164/183 = 0.8962 and σ̃ 2 = 2.2755. Because σ̃ 2 is not greatly
diﬀerent from the estimated variance σθ̂2 = θ̂/(1 − θ̂)2 = 1.6992 under the
geometric GM(θ̂) distribution with θ̂ = μ̂/(1 + μ̂) = .4726 (see Example 3.2),
we will assume the GM(θ) psod model (0 < θ < 1) to illustrate its ease of
application. For this model A(θ) = 1/(1 − θ), τ = 1/2, and στ2 = 2.

Year 1968 1969 1970 1971 1972 1973 1974 1975 1976
n 0 1 2 3 4 5 6 7 8
Count xn 38 26 27 18 25 19 19 11 19
Table 7: Annual counts of California condors 1968-1976.

The data in Table 7 suggest a declining population, hence possible extinc-

tion. To evaluate this prediction, first consider the fixed-n testing problem
(81) with Ḧ≥+ (eventual explosion) as the null hypothesis and Ḣ≥+ (eventual
extinction) the alternative. Here x0 = 38 and n = 8 so nστ2 < 2(x0 −2), hence
the approximations π̈ G2 in (107) and π̈ G23 in (109) for the fixed-n prediction
procedures (87) are inapplicable (cf. Proposition 5.3(i)).

9
The weekly data shown have since been revised. We have used the unrevised data
because it was those upon which public health decisions were based.

By contrast, the sequential prediction procedure SPRT(0.9; 0.05, 1/0.05) would
have stopped in 1975 and predicted extinction. (Compare the data in Table
7 to the stopping boundaries in the last row of Table 5.)
In fact, by the mid 1980’s all remaining wild condors were captured and
moved to zoos, where a breeding program was begun, followed by relocation
back to the wild. By 2011 the total wild population had grown to 191, in
addition to 178 remaining in captivity.
Example 7.4: North American whooping cranes. Miller et al. (1974)
give the annual counts of migrating whooping cranes, an endangered species,
arriving in Texas from 1938 (n = 0) through 1972 (n = 34); see Figure 4 and
Guttorp (1991, p.190). Here x0 = 14, n = 34, xn = 51, yn−1 = 1072, and
yn = 1123; the MLE μ̂ = 1109/1072 = 1.0345. Since μ̂ does not diﬀer greatly
from Dion’s estimate σ̃ 2 = 0.84, the Poisson(θ) psod model is assumed.

[Figure: plot titled "Whooping cranes at Aransas NWR"; horizontal axis: year (1940 to 1970); vertical axis: number of cranes (20 to 50).]

Figure 4: North American whooping crane population counts 1938-1972.

The counts in Figure 4 show an increasing trend, suggesting explosion. To

evaluate this prediction, ﬁrst consider the ﬁxed-n testing problem (81) with
Ḣ≥+ (eventual extinction) as the null hypothesis and Ḧ≥+ (eventual explosion)
as the alternative. Here στ2 = 1, so the EXP and GAM approximations (101)

and (102) for the p-values of the fixed-n prediction procedure (87) are

$$\dot\pi^{EXP}(51; 34) = e^{-101/34} \approx 0.051, \tag{139}$$
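$$\dot\pi^{GAM}(51; 34, 14) = \frac{\left(\frac{16}{17}\right)^{14}}{1 - \left(\frac{16}{17}\right)^{14}} \sum_{r=1}^{14} \binom{14}{r} \left(\frac{1}{16}\right)^{r} \bar G_r\!\left(\frac{101}{34}\right) \approx 0.086, \tag{140}$$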

respectively, with π̇ GAM expected to be more accurate. This provides modest

support for a prediction of explosion.
By contrast, the sequential prediction procedure SPRT(0.9; 0.05, 1/0.05) would
have stopped in 1964 (n = 26, xn = 42 > 40.1) and predicted explosion, while
SPRT(0.9; 0.01, 1/0.01) would have stopped in 1969 (n = 31, xn = 56 > 55.3)
and predicted explosion. (The values 40.1 and 55.3 appear in the second tier
of Table 5.)
Acknowledgement: We are grateful to Brayan Ortiz for his assistance with
numerical computations.

References

Athreya, K. B. and Ney, P. E. (1972). Branching Processes. Springer-Verlag,

Berlin.
Barnard, G. A. (1946). Sequential tests in industrial statistics (with discus-
sion). J. Royal Statist. Soc. Supplement 8, 1-26.
Becker, N. (1972). Vaccination programmes for rare infectious diseases.
Biometrika 59 443-453.
Becker, N. (1974). On parametric estimation for mortal branching processes.
Biometrika 61 393-399.
Becker, N. (1977). Estimation for discrete time branching processes with
application to epidemics. Biometrics 33 515-522.
Centers for Disease Control and Prevention (2012). Pertussis epidemic -
Washington, 2012; Morbidity and Mortality Weekly Report. Online at:
http://www.cdc.gov/mmwr/preview/mmwrhtml/mm6128a1.htm
Dion, J.-P. (1975). Estimation of the variance of a branching process. Ann.
Statist. 3 1183-1187.
Feller, W. (1968). An Introduction to Probability Theory and its Applications,
Third Edition. Wiley, NY.

Ghosh, B. K. (1970). Sequential Tests of Statistical Hypotheses. Addison-Wesley,
Reading, MA.
Guttorp, P. (1991). Statistical Inference for Branching Processes. Wiley, NY.
Guttorp, P. and Perlman, M. D. (2015). Testing subcriticality vs. supercriti-
cality in a Galton-Watson branching process with power series oﬀspring
distribution. In preparation.
Harris, T. E. (1963). The Theory of Branching Processes. Springer-Verlag,
Berlin.
Heyde, C. C. (1979). On assessing the potential severity of an outbreak of a
rare infectious disease: a Bayesian approach. Australian J. Statist. 21
282-292.
Jagers, P. (1975). Branching Processes with Biological Applications. Wiley, NY.
Karlin, S. (1966). A First Course in Stochastic Processes. Acad. Press, NY.
Karlin, S. (1968). Total Positivity. Stanford University Press, Stanford, CA.
Kemperman, J. H. B. (1977). On the FKG-inequality for measures on a
partially ordered space. Indag. Math. 39 313-331.
Kesten, H., Ney, P., and Spitzer, F. (1966). The Galton-Watson process
with mean one and finite variance. Teor. Veroyatnost. i Primenen. 11
579-611.
Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses,
Third Edition. Springer, NY.
Miller, R. S., Botkin, D. B., and Mendelssohn, R. (1974). The whooping
crane (Grus americana) population of North America. Biol. Cons. 6
106-111.
Perlman, M. D. and Olkin, I. (1980). Unbiasedness of invariant tests for
MANOVA and other multivariate problems. Ann. Statist. 8 1326-1341.
Seneta, E. (1970). An explicit-limit theorem for the critical Galton-Watson
process with immigration. J. Royal Statist. Soc. Series B 32 149-152.
Scott, D. (1987). On posterior asymptotic normality and asymptotic nor-
mality of estimators for the Galton-Watson process. J. Royal Statist.
Soc. Series B 49 209-214.

Stigler, S. M. (1971). The estimation of the probability of extinction and
other parameters associated with branching processes. Biometrika 58
499-508.
Stuart, A. and Ord, J. K. (1991). Kendall’s Advanced Theory of Statistics,
Vol. 2. Oxford University Press, NY.
Taylor, H. and Karlin, S. (1998). An Introduction to Stochastic Modeling,
Third Edition. Academic Press, Orlando.
Waugh, W. A. (1958). Conditioned Markov processes. Biometrika 45 241-249.
Wilbur, S. R. (1978). The California Condor, 1966-76: A Look at its Past
and Future. North American Fauna No. 72. 136 pp. U.S. Dept. of the
Interior, Fish and Wildlife Service, Washington, D.C. Online at:
http://library.sandiegozoo.org/journal list/sWilbur CAcondor.pdf.

