Adventures in Stochastic Processes

WELCOME TO THE RANDOM WORLD OF HAPPY HARRY: famed restaurateur, happy hour host, community figure, former semi-pro basketball player, occasional software engineer, talent agent, budding television star, world traveller, nemesis of the street gang called the Mutant Creepazoids, theatre patron, supporter of precise and elegant use of the English language, supporter of the war on drugs, unsung hero of the fairy tale Sleeping Beauty, and the target of a vendetta by the local chapter of the Young Republicans. Harry and his restaurant are well known around his Optima Street neighborhood both to the lovers of fine food and the public health service. Obviously this is a man of many talents and experiences who deserves to have a book written about his life.

Sidney Resnick
Adventures in Stochastic Processes
With Illustrations
Birkhäuser: Boston, Basel, Berlin

Sidney Resnick, School of Operations Research and Industrial Engineering, Cornell University, Ithaca, NY 14853, USA

ISBN 0-8176-3591-2 (printed on acid-free paper)

© 1992 Birkhäuser Boston. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Birkhäuser Boston, c/o Springer Science+Business Media Inc., Rights and Permissions, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Cover design by Minna Resnick. Typeset by the author in AMS-TeX. Printed in the United States of America.

Table of Contents

Preface

CHAPTER 1. PRELIMINARIES: DISCRETE INDEX SETS AND/OR DISCRETE STATE SPACES
1.1. Non-negative integer valued random variables
1.2. Convolution
1.3. Generating functions
1.3.1. Differentiation of generating functions
1.3.2. Generating functions and moments
1.3.3. Generating functions and convolution
1.3.4. Generating functions, compounding and random sums
1.4. The simple branching process
1.5. Limit distributions and the continuity theorem
1.5.1. The law of rare events
1.6. The simple random walk
1.7. The distribution of a process*
1.8. Stopping times*
1.8.1. Wald's identity
1.8.2. Splitting an iid sequence at a stopping time*
Exercises for Chapter 1

CHAPTER 2. MARKOV CHAINS
2.1. Construction and first properties
2.2. Examples
2.3. Higher order transition probabilities
2.4. Decomposition of the state space
2.5. The dissection principle
2.6. Transience and recurrence
2.7. Periodicity
2.8. Solidarity properties
2.9. Examples
2.10. Canonical decomposition
2.11. Absorption probabilities
2.12. Invariant measures and stationary distributions
2.12.1. Time averages
2.13. Limit distributions
2.13.1. More on null recurrence and transience
2.14. Computation of the stationary distribution
2.15. Classification techniques
Exercises for Chapter 2

CHAPTER 3. RENEWAL THEORY
3.1. Basics
3.2. Analytic interlude
3.2.1. Integration
3.2.2. Convolution
3.2.3. Laplace transforms
3.3. Counting renewals
3.4. Renewal reward processes
3.5. The renewal equation
3.5.1. Risk processes
3.6. The Poisson process as a renewal process
3.7. Informal discussion of renewal limit theorems; regenerative processes
3.7.1. An informal discussion of regenerative processes
3.8. Discrete renewal theory
3.9. Stationary renewal processes*
3.10. Blackwell and key renewal theorems*
3.10.1. Direct Riemann integrability*
3.10.2. Equivalent forms of the renewal theorems*
3.10.3. Proof of the renewal theorem*
3.11. Improper renewal equations
3.12. More regenerative processes*
3.12.1. Definitions and examples*
3.12.2. The renewal equation and Smith's theorem*
3.12.3. Queueing examples
Exercises for Chapter 3

CHAPTER 4. POINT PROCESSES
4.1. Basics
4.2. The Poisson process
4.3. Transforming Poisson processes
4.3.1. Max-stable and stable random variables*
4.4. More transformation theory; marking and thinning
4.5. The order statistic property
4.6. Variants of the Poisson process
4.7. Technical basics*
4.7.1. The Laplace functional*
4.8. More on the Poisson process*
4.9. A general construction of the Poisson process; a simple derivation of the order statistic property*
4.10. Records
Exercises for Chapter 4

CHAPTER 5. CONTINUOUS TIME MARKOV CHAINS
5.1. Definitions and construction
5.2. Stability and explosions
5.2.1. The Markov property*
5.3. Dissection
5.3.1. More detail on dissection*
5.4. The backward equation and the generator matrix
5.5. Stationary and limiting distributions
5.5.1. More on invariant measures*
5.6. Laplace transform methods
5.7. Calculations and examples
5.7.1. Queueing networks
5.8. Time dependent solutions*
5.9. Reversibility
5.10. Uniformizability
5.11. The linear birth process as a point process
Exercises for Chapter 5

CHAPTER 6. BROWNIAN MOTION
6.1. Introduction
6.2. Preliminaries
6.3. Construction of Brownian motion*
6.4. Simple properties of standard Brownian motion
6.5. The reflection principle and the distribution of the maximum
6.6. The strong independent increment property and reflection*
6.7. Escape from a strip
6.8. Brownian motion with drift
6.9. Heavy traffic approximations in queueing theory
6.10. The Brownian bridge and the Kolmogorov-Smirnov statistic
6.11. Path properties*
6.12. Quadratic variation
6.13. Khintchine's law of the iterated logarithm for Brownian motion*
Exercises for Chapter 6

CHAPTER 7. THE GENERAL RANDOM WALK
7.1. Stopping times
7.2. Global properties
7.3. Prelude to Wiener-Hopf: Probabilistic interpretations of transforms
7.4. Dual pairs of stopping times
7.5. Wiener-Hopf decompositions
7.6. Consequences of the Wiener-Hopf factorization
7.7. The maximum of a random walk
7.8. Random walks and the G/G/1 queue
7.8.1. Exponential right tail
7.8.2. Application to the G/M/1 queueing model
7.8.3. Exponential left tail
7.8.4. The M/G/1 queue
7.8.5. Queue lengths

References
Index

*This section contains advanced material which may be skipped on first reading by beginning readers.

Preface

While this is a book about Harry and his adventurous life, it is primarily a serious text about stochastic processes. It features the basic stochastic processes that are necessary ingredients for building models of a wide variety of phenomena exhibiting time varying randomness.

The book is intended as a first year graduate text for courses usually called Stochastic Processes (perhaps amended by the words "Applied" or "Introduction to ...") or Applied Probability, or sometimes Stochastic Modelling. It is meant to be very accessible to beginners, and at the same time, to serve those who come to the course with strong backgrounds. This flexibility also permits the instructor to push the sophistication level up or down.

For the novice, discussions and motivation are given carefully and in great detail. In some sections beginners are advised to skip certain developments, while in others, they can read the words and skip the symbols in order to get the content without more technical detail than they are ready to assimilate. In fact, with the numerous readings and variety of problems, it is easy to carve a path so that the book challenges more advanced students but remains instructive and manageable for beginners. Some sections are starred and come with a warning that they contain material which is more mathematically demanding. Several discussions have been modularized to facilitate flexible adaptation to the needs of students with differing backgrounds. The text makes crystal clear distinctions between the following: proofs, partial proofs, motivations, plausibility arguments and good old fashioned hand-waving.

Where did Harry, Zeke and the rest of the gang come from? Courses in Stochastic Processes tend to contain overstuffed curricula. It is, therefore, useful to have quick illustrations of how the theory leads to techniques for calculating numbers. With the Harry vignettes, the student can get in and out of numerical illustrations quickly. Of course, the vignettes are not meant to replace often stimulating but time consuming real applications. A variety of examples with applied appeal are sprinkled throughout the exposition and exercises. Our students are quite fond of Harry and enjoy psychoanalyzing him, debating whether he is "a polyester sort of guy" or the "jeans and running shoes type." They seem to have no trouble discerning the didactic intent of the Harry stories and accept the need for some easy numerical problems before graduating to more serious ones. Student culture has become so ubiquitous that foreign students who are not native English speakers can quickly get into the swing. I think Harry is a useful and entertaining guy, but if you find that you loathe him, he is easy to avoid in the text.

    Where do they come from? I can't say.
    But I bet they have come a long, long way.³

To the instructor: The discipline imposed during the writing was that the first six chapters should not use advanced notions of conditioning which involve relatively sophisticated ideas of integration. Only the elementary definition is used: $P(A|B) = P(A \cap B)/P(B)$. Instead of conditioning arguments we find independence where we need it and apply some form of the product rule: $P(A \cap B) = P(A)P(B)$ if $A$ and $B$ are independent.
This maintains rigor and keeps the sophistication level down.

No knowledge of measure theory is assumed, but it is assumed that the student has already digested a good graduate level pre-measure theoretic probability course. A bit of measure theory is discussed here and there in starred portions of the text. In most cases it is simple and intuitive, but if it scares you, skip it and you will not be disadvantaged as you journey through the book. If, however, you know some measure theory, you will understand things in more depth. There is a sprinkling of references throughout the book to Fubini's theorem, the monotone convergence theorem and the dominated convergence theorem. These are used to justify the interchange of operations such as summation and integration. A relatively unsophisticated student would not and should not worry about justifications for these interchanges of operations; these three theorems should merely remind such students that somebody knows how to check the correctness of these interchanges.

Analysts who build models are supposed to know how to build models. So for each class of process studied, a construction of that process is included. Independent, identically distributed sequences are usually assumed as primitives in the constructions. Once a concrete version of the process is at hand, many properties are fairly transparent. Another benefit is that if you know how to construct a stochastic process, you know how to simulate the process. While no specific discussion of simulation is included here, I have tried to avoid pretending the computer does not exist. For instance, in the Markov chain chapters, formulas are frequently put in matrix form to make them suitable for solution by machine rather than by hand. Packages such as Minitab, Mathematica, Gauss, Matlab, etc., have been used successfully as valuable aids in the solution of problems, but local availability of computing resources and the rapidly changing world of hardware and software make specific suggestions unwise. Ask your local guru for suggestions. You need to manipulate some matrices and find roots of polynomials, but nothing too fancy. If you have access to a package that does symbolic calculations, so much the better. A companion disk to this book is being prepared by Douglas McBeth which will allow easy solutions to many numerical problems.

³ Dr. Seuss, One Fish, Two Fish, Red Fish, Blue Fish.

There is much more material here than can be covered in one semester. Some selection according to the needs of the students is required. Here is the core of the material:

Chapter 1: 1.1–1.6. Skip the proof of the continuity theorem in 1.5 if necessary but mention Wald's identity. Some instructors may prefer to skip Chapter 1 and return later to these topics, as needed. If you are tempted by this strategy, keep in mind that Chapter 1 discusses the interesting and basic random walk and branching processes and that facility with transforms is worthwhile.

Chapter 2: 2.1–2.12, 2.12.1. In Section 2.13, a skilled lecturer is advised to skip most of the proof of Theorem 2.13.2, explain coupling in 15 minutes, and let it go at that. This is one place where hand-waving really conveys something. The material from Section 2.13.1 should be left to the curious. If time permits, try to cover Sections 2.14 and 2.15, but you will have to move at a brisk pace.

Chapter 3: In renewal theory stick to basics.
After all the discrete state space theory in Chapters 1 and 2, the switch to the continuous state space world leaves many students uneasy. The core is Sections 3.1–3.5, 3.6, 3.7, and 3.7.1. Sections 3.8 and 3.12.3 are accessible if there is time, but 3.9–3.12.2 are only for supplemental reading by advanced students.

Chapter 4: The jewels are in Sections 4.1 to 4.7. You can skip 4.3.1. If you have a group that can cope with a bit more sophistication, try 4.7.1, 4.8 and 4.9. Once you come to know and love the Laplace functional, the rest is incredibly easy and short.

Chapter 5: The basics are 5.1–5.8. If you are pressed for time, skip possibly 5.6 and 5.8; beginners may avoid 5.2.1, 5.3.1 and 5.5.1. Section 5.7.1 is on queueing networks and is a significant application of standard techniques, so try to reserve some time for it. Section 5.9 is nice if there is time. Despite its beauty, leave 5.11 for supplemental reading by advanced students.

Chapter 6: Stick to some easy path properties, strong independent increments, reflection, and some explicit calculations. I recommend 6.1, 6.2, 6.4, 6.5, 6.6, 6.7, and 6.8. For beginners, a quick survey of 6.11–6.13 may be adequate. If there is time and strong interest in queueing, try 6.9. If there is strong interest in statistics, try 6.10.

I like Chapter 7, but it is unlikely it can be covered in a first course. Parts of it require advanced material.

In the course of teaching, I have collected problems which have been inserted into the examples and problem sections; there should be a good supply. These illustrate a variety of applied contexts where the skills mastered in the chapter can be used. Queueing theory is a frequent context for many exercises. Many problems emphasize calculating numbers, which seems to be a skill most students need these days, especially considering the wide clientele who enroll for courses in stochastic processes. There is a big payoff for the student who will spend serious time working out the problems. Failure to do so will relegate the novice reader to the status of voyeur.

Some acknowledgements and thank you's: The staff at Birkhäuser has been very supportive, efficient and collegial, and the working relationship could not have been better. Minna Resnick designed a stunning cover and logo. Cornell's Kathy King probably does not realize how much cumulative help she intermittently provided in turning scribbled lecture notes into something I could feed the TeX machine. Richard Davis (Colorado State University), Gennady Samorodnitsky (Cornell) and Richard Serfozo (Georgia Institute of Technology) used the manuscript in classroom settings and provided extensive lists of corrections and perceptive suggestions. A mature student perspective was provided by David Lando (Cornell), who read almost the whole manuscript and made an uncountable number of amazingly wise suggestions about organization and presentation, as well as finding his quota of mistakes. Douglas McBeth made useful comments about appropriate levels of presentation and numerical issues. David Lando and Eleftherios Iakovou helped convince me that Harry could become friends with students whose mother tongue was different from English. Joan Lieberman convinced me even a lawyer could appreciate Harry. Minna, Rachel and Nathan Resnick provided a warm, loving family life and generously shared the home computer with me. They were also very consoling as I coped with two hard disk crashes and a monitor melt-down.
While writing a previous book in 1985, I wore out two mechanical pencils. The writing of this book took place on four different computers. Financial support for modernizing the computer equipment came from the National Science Foundation, Cornell's Mathematical Sciences Institute and Cornell's School of Operations Research and Industrial Engineering. Having new equipment postponed the arrival of bifocals and made that marvellous tool called TeX almost fun to use.

CHAPTER 1

Preliminaries: Discrete Index Sets and/or Discrete State Spaces

THIS CHAPTER eases us into the subject with a review of some useful techniques for handling non-negative integer valued random variables and their distributions. These techniques are applied to some significant examples, namely, the simple random walk and the simple branching process. Towards the end of the chapter stopping times are introduced and applied to obtain Wald's identity and some facts about the random walk. The beginning student can skip the advanced discussion on sigma-fields and needs only a primitive understanding that sigma-fields organize information within probability spaces.

Section 1.7, intended for somewhat advanced students, discusses the distribution of a process and leads to a more mature and mathematically useful understanding of what a stochastic process is than is provided by the elementary definition: A stochastic process is a collection of random variables $\{X(t), t \in T\}$ defined on a common probability space, indexed by the index set $T$, which describes the evolution of some system. Often $T = [0, \infty)$ if the system evolves in continuous time. For example, $X(t)$ might be the number of people in a queue at time $t$, or the accumulated claims paid by an insurance company in $[0, t]$. Alternatively, we could have $T = \{0, 1, \dots\}$ if the system evolves in discrete time. Then $X(n)$ might represent the number of arrivals to a queue during the service interval of the $n$th customer, or the socio-economic status of a family after $n$ generations. When considering stationary processes, $T = \{\dots, -1, 0, 1, \dots\}$ is a common index set. In more exotic processes, $T$ might be a collection of regions, and $X(A)$ the number of points in region $A$.

1.1. Non-Negative Integer Valued Random Variables

Suppose $X$ is a random variable whose range is $\{0, 1, \dots, \infty\}$. (Allowing a possible value of $\infty$ is a convenience. For instance, if $X$ is the waiting time for a random event to occur and if this event never occurs, it is natural to think of the value of $X$ as $\infty$.) Set
$$P[X = k] = p_k, \quad 0 \le k \le \infty,$$
so that $P[X < \infty] = \sum_{k < \infty} p_k$ and $p_\infty = 1 - \sum_{k < \infty} p_k$. If $p_\infty > 0$, define $E(X) = \infty$; otherwise
$$E(X) = \sum_{k=0}^{\infty} k\, p_k.$$
If $f : \{0, 1, \dots, \infty\} \mapsto [0, \infty]$, then in an elementary course you probably saw the derivation of the fact that
$$E f(X) = \sum_{k} f(k)\, p_k.$$
If $f : \{0, 1, \dots, \infty\} \mapsto [-\infty, \infty]$, then define two positive functions $f^+$ and $f^-$ by
$$f^+ = \max\{f, 0\}, \qquad f^- = -\min\{f, 0\},$$
so that $E f^+(X)$ and $E f^-(X)$ are both well defined, and
$$E f(X) = E f^+(X) - E f^-(X),$$
provided at least one of $E f^+(X)$ and $E f^-(X)$ is finite. In the contrary case, where both are infinite, the expectation does not exist. The expectation is finite if $\sum_{0 \le k \le \infty} |f(k)|\, p_k < \infty$.

If $p_\infty = 0$ and $f(k) = k^n$, then $E f(X) = E X^n$, the $n$th moment. If $f(k) = (k - E(X))^n$, then $E f(X) = E(X - E(X))^n$, the $n$th central moment. In particular, when $n = 2$ in the second case we get
$$\mathrm{Var}(X) = E(X - E(X))^2 = E X^2 - (E(X))^2.$$

Some examples of distributions $\{p_k\}$ that you should review, and that will be particularly relevant, are the following.
1. Binomial, denoted $b(k; n, p)$, which is the distribution of the number of successes in $n$ Bernoulli trials when the success probability is $p$. Then, with $q = 1 - p$,
$$p_k = b(k; n, p) = P[X = k] = \binom{n}{k} p^k q^{n-k}, \quad 0 \le k \le n,\ 0 \le p \le 1,$$
and $E(X) = np$, $\mathrm{Var}(X) = np(1-p)$.

2. Poisson, denoted $p(k; \lambda)$. Then for $k = 0, 1, \dots$ and $\lambda > 0$
$$P[X = k] = p(k; \lambda) = e^{-\lambda} \frac{\lambda^k}{k!},$$
and $E(X) = \lambda$, $\mathrm{Var}(X) = \lambda$.

3. Geometric, denoted $g(k; p)$, so that for $k = 0, 1, \dots$
$$P[X = k] = g(k; p) = (1-p)^k p, \quad 0 \le p \le 1,$$
which is the distribution of the number of failures before the first success in repeated Bernoulli trials. The usual notation is to set $q = 1 - p$. Then $E(X) = q/p$.

Lemma 1.1.1. For a non-negative integer valued random variable $X$,
$$E(X) = \sum_{k=0}^{\infty} P[X > k].$$

Proof. To verify this formula involves reversing the steps of the previous computation:
$$\sum_{k=0}^{\infty} P[X > k] = \sum_{k=0}^{\infty} \sum_{j=k+1}^{\infty} p_j = \sum_{j=1}^{\infty} p_j \sum_{k=0}^{j-1} 1 = \sum_{j=1}^{\infty} j\, p_j = E(X). \quad \blacksquare$$

In the multivariate case we have a random vector with non-negative integer valued components $\mathbf{X}' = (X_1, \dots, X_k)$ with a mass function
$$P[X_1 = j_1, \dots, X_k = j_k] = p_{j_1, \dots, j_k}$$
for non-negative integers $j_1, \dots, j_k$. If $f : \{0, 1, \dots, \infty\}^k \mapsto [0, \infty]$, then
$$E f(X_1, \dots, X_k) = \sum_{j_1} \cdots \sum_{j_k} f(j_1, \dots, j_k)\, p_{j_1, \dots, j_k},$$
and if $f : \{0, 1, \dots, \infty\}^k \mapsto [-\infty, \infty]$, then
$$E f(X_1, \dots, X_k) = E f^+(X_1, \dots, X_k) - E f^-(X_1, \dots, X_k),$$
as in the case $k = 1$, if at least one of the expectations on the right is finite.

Now recall the following properties:

1. For $a_1, \dots, a_k \in \mathbb{R}$,
$$E\Bigl(\sum_{i=1}^{k} a_i X_i\Bigr) = \sum_{i=1}^{k} a_i E(X_i)$$
(provided the right side makes sense; no $\infty - \infty$ please).

2. If $X_1, \dots, X_k$ are independent, so that the joint mass function of $X_1, \dots, X_k$ factors into a product of marginal mass functions, then for any bounded functions $f_1, \dots, f_k$ with domain $\{0, 1, \dots, \infty\}$ we have
$$(1.1.3)\qquad E \prod_{i=1}^{k} f_i(X_i) = \prod_{i=1}^{k} E f_i(X_i).$$
The proof is easy for non-negative integer valued random variables, based on the factorization of the joint mass function.

3. If $E X_i^2 < \infty$, $i = 1, \dots, k$, and $\mathrm{Cov}(X_i, X_j) = 0$ for $i \ne j$, $1 \le i, j \le k$, then
$$\mathrm{Var}\Bigl(\sum_{i=1}^{k} a_i X_i\Bigr) = \sum_{i=1}^{k} a_i^2\, \mathrm{Var}(X_i)$$
for $a_i \in \mathbb{R}$.

1.2. Convolution

Suppose $X$ and $Y$ are independent, non-negative, integer valued random variables with
$$P[X = k] = a_k, \qquad P[Y = k] = b_k, \qquad k = 0, 1, \dots.$$
Since for $n \ge 0$
$$[X + Y = n] = \bigcup_{j=0}^{n} [X = j,\ Y = n - j],$$
a union of disjoint events, we get
$$P[X + Y = n] = \sum_{j=0}^{n} P[X = j]\, P[Y = n - j] = \sum_{j=0}^{n} a_j b_{n-j}.$$
This operation on sequences arises frequently and is called convolution.

Definition. The convolution of the two sequences $\{a_n, n \ge 0\}$ and $\{b_n, n \ge 0\}$ is the new sequence $\{c_n, n \ge 0\}$ whose $n$th element $c_n$ is defined by
$$c_n = \sum_{j=0}^{n} a_j b_{n-j}.$$
We write $\{c_n\} = \{a_n\} * \{b_n\}$.

Although this definition applies to any two sequences defined on the non-negative integers, it is most interesting to us because it gives the distribution of a sum of independent non-negative integer valued random variables. The calculation before the definition shows that
$$\{P[X + Y = n]\} = \{P[X = n]\} * \{P[Y = n]\}.$$
For the examples, the following notational convention is convenient: Write $X \sim \{p_n\}$ to indicate the mass function of $X$ is $\{p_n\}$. The notation $X \overset{d}{=} Y$ will be used to mean that $X$ and $Y$ have the same distribution. In the case of non-negative integer valued random variables, the mass functions or discrete densities are the same.

Example 1.2.1. If $X \sim p(k; \lambda)$ and $Y \sim p(k; \mu)$ and $X$ and $Y$ are independent, then $X + Y \sim p(k; \lambda + \mu)$.

Example 1.2.2. If $X \sim b(k; n, p)$ and $Y \sim b(k; m, p)$ and $X$ and $Y$ are independent, then $X + Y \sim b(k; n + m, p)$, since the number of successes in $n$ Bernoulli trials coupled with the number of successes in $m$ independent Bernoulli trials yields the number of successes in $n + m$ Bernoulli trials.

The results of both examples have easy analytic proofs using just the generating functions of the next section.
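The defining sum for $c_n$ translates directly into code, which gives a quick empirical check of Example 1.2.1. The sketch below is ours, not the book's (the function names and truncation level are illustrative):

```python
from math import exp, factorial

def convolve(a, b):
    """nth element of {a}*{b}: c_n = sum_{j=0}^n a_j * b_{n-j}."""
    return [sum(a[j] * b[n - j]
                for j in range(n + 1) if j < len(a) and n - j < len(b))
            for n in range(len(a) + len(b) - 1)]

def poisson(lam, kmax):
    """Poisson mass function p(k; lam), truncated at kmax."""
    return [exp(-lam) * lam ** k / factorial(k) for k in range(kmax + 1)]

# Example 1.2.1: Poisson(1) convolved with Poisson(2) agrees with Poisson(3)
lhs = convolve(poisson(1.0, 40), poisson(2.0, 40))
rhs = poisson(3.0, 40)
print(max(abs(x - y) for x, y in zip(lhs, rhs)))  # ~1e-17: the masses agree
```

The comparison is exact (up to floating point) for the indices shown, since the truncated lists contain every term the convolution sum needs there.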
1.3. Generating Functions

Definition. If $\{a_k, k \ge 0\}$ is a sequence of real numbers and there exists $s_0 > 0$ such that
$$A(s) = \sum_{k=0}^{\infty} a_k s^k$$
converges in $|s| < s_0$, then we call $A(s)$ the generating function (gf) of the sequence $\{a_k\}$.

We are most interested in generating functions of probability densities. Let $X$ be a non-negative integer valued random variable with density $\{p_k, k \ge 0\}$. The generating function of $\{p_k\}$ is
$$P(s) = \sum_{k=0}^{\infty} p_k s^k = E s^X,$$
and by an abuse of language this is also called the generating function of $X$. Note that $P(0) = p_0$ and that $P(1) = \sum_k p_k \le 1$, so the radius of convergence of $P(s)$ is at least 1 (and may be greater than 1). Note $P(1) = 1$ iff $P[X < \infty] = 1$.

We will see that a generating function, when it exists, uniquely determines its sequence (and in fact we will give a differentiation scheme which generates the sequence from the gf). There are five main uses of generating functions:

(1) Generating functions aid in the computation of the mass function of a sum of independent non-negative integer valued random variables.
(2) Generating functions aid in the calculation of moments. Moments are frequently of interest in stochastic models because they provide easy (but rough) methods for statistical estimation of the parameters of the model.
(3) Using the continuity theorem (see Section 1.5), generating functions aid in the calculation of limit distributions.
(4) Generating functions aid in the solution of difference equations or recursions. Generating function techniques convert the problem of solving a recursion into the problem of solving a differential equation.
(5) Generating functions aid in the solution of linked systems of differential-difference equations. The generating function technique necessitates the solution of a partial differential equation. This technique is applied frequently in continuous time Markov chain theory and is discussed in Section 5.8.

Example 1.3.1. $X \sim p(k; \lambda)$. Then
$$P(s) = \sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!}\, s^k = e^{-\lambda} e^{\lambda s} = e^{\lambda(s-1)}, \quad \text{for all } s > 0.$$

Example 1.3.2. $X \sim b(k; n, p)$. Then
$$P(s) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} s^k = \sum_{k=0}^{n} \binom{n}{k} (ps)^k q^{n-k} = (q + ps)^n, \quad \text{for all } s > 0.$$

Example 1.3.3. $X \sim g(k; p)$. Then
$$P(s) = \sum_{k=0}^{\infty} q^k p\, s^k = p \sum_{k=0}^{\infty} (qs)^k = \frac{p}{1 - qs}, \quad \text{for } 0 \le s < 1/q.$$
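Use (2) in the list above is worth seeing in action. The standard relations developed in Sections 1.3.1 and 1.3.2 are $E X = P'(1)$ and $E X(X-1) = P''(1)$, so that $\mathrm{Var}(X) = P''(1) + P'(1) - (P'(1))^2$. The preface recommends symbolic packages where available; the sketch below uses Python's sympy (our choice of tool, not the book's) to recover the means and variances quoted in Section 1.1 for the three examples:

```python
import sympy as sp

s, lam, n, p = sp.symbols('s lam n p', positive=True)
q = 1 - p

gfs = {                                   # the gfs of Examples 1.3.1-1.3.3
    'Poisson':   sp.exp(lam * (s - 1)),
    'binomial':  (q + p * s) ** n,
    'geometric': p / (1 - q * s),
}

for name, P in gfs.items():
    mean = sp.simplify(sp.diff(P, s).subs(s, 1))      # E X = P'(1)
    second = sp.diff(P, s, 2).subs(s, 1)              # E X(X-1) = P''(1)
    var = sp.simplify(second + mean - mean ** 2)      # Var X
    print(name, mean, var)
# Poisson: lam, lam; binomial: n*p, n*p*(1-p); geometric: (1-p)/p, (1-p)/p**2
```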
Proposition 1.3.2. Let $X$ have mass function $\{p_k\} = \{P[X = k], k \ge 0\}$ satisfying $\sum_{k=0}^{\infty} p_k = 1$. Define
$$P(s) = E s^X, \qquad q_k = P[X > k], \qquad Q(s) = \sum_{k=0}^{\infty} q_k s^k.$$
Then
$$(1.3.2.1)\qquad Q(s) = \frac{1 - P(s)}{1 - s}, \quad 0 \le s < 1.$$

Proof. Follow your nose: Since $q_k = \sum_{j > k} p_j$, we have
$$Q(s) = \sum_{k=0}^{\infty} s^k \sum_{j=k+1}^{\infty} p_j = \sum_{j=1}^{\infty} p_j \sum_{k=0}^{j-1} s^k,$$
and summing the geometric terms shows that $Q(s)$ equals
$$\sum_{j=1}^{\infty} p_j\, \frac{1 - s^j}{1 - s} = \frac{1 - P(s)}{1 - s}. \quad \blacksquare$$

In (1.3.2.1) let $s \uparrow 1$. On the one hand, by monotone convergence, we get
$$\lim_{s \uparrow 1} Q(s) = \sum_{k=0}^{\infty} q_k = E(X),$$
the last step following from Lemma 1.1.1. On the other hand,
$$\lim_{s \uparrow 1} Q(s) = \lim_{s \uparrow 1} \frac{1 - P(s)}{1 - s},$$
so that $E(X) = \lim_{s \uparrow 1} (1 - P(s))/(1 - s)$, finite or infinite.

1.3.4. Generating Functions, Compounding and Random Sums
Let $\{X_n, n \ge 1\}$ be independent, identically distributed (iid) non-negative integer valued random variables, and suppose $X_1 \sim \{p_k\}$ and
$$E s^{X_1} = P_{X_1}(s), \quad 0 \le s \le 1.$$
Let $N$ be independent of $\{X_n, n \ge 1\}$ and suppose $N$ is non-negative integer valued with
$$P[N = j] = \alpha_j,\ j \ge 0; \qquad E s^N = P_N(s),\ 0 \le s \le 1.$$
Define
$$S_N = \sum_{i=1}^{N} X_i.$$
Then $S_N$ is a random sum with a compound distribution: For $j \ge 0$
$$P[S_N = j] = \sum_{k=0}^{\infty} P[S_N = j,\ N = k] = \sum_{k=0}^{\infty} \alpha_k\, p_j^{(k*)},$$
where $P[S_k = j] = p_j^{(k*)}$ is the $j$th element of the $k$th convolution power of the sequence $\{p_j\}$. Thus the gf of $S_N$ is
$$P_{S_N}(s) = \sum_{j=0}^{\infty} P[S_N = j]\, s^j = \sum_{k=0}^{\infty} \alpha_k \sum_{j=0}^{\infty} p_j^{(k*)} s^j = \sum_{k=0}^{\infty} \alpha_k \bigl(P_{X_1}(s)\bigr)^k \quad \text{(from Proposition 1.3.3)},$$
and we conclude
$$(1.3.4.1)\qquad P_{S_N}(s) = P_N\bigl(P_{X_1}(s)\bigr).$$
Note the gf of the random index is on the outside of the functional composition. In the special case where $N \sim p(k; \lambda)$ we get a compound-Poisson distribution with gf given by (1.3.4.1): Since $P_N(s) = \exp\{\lambda(s - 1)\}$,
$$(1.3.4.2)\qquad P_{S_N}(s) = \exp\{\lambda(P_{X_1}(s) - 1)\}.$$

Example. Harry Delivers Pizzas. Harry's restaurant has a delivery service for pizzas. Friday night Harry goes on a drinking binge, which causes him to be muddled all day Saturday while answering the phone at the restaurant. A Poisson($\lambda$) number of orders are phoned in on Saturday, but there is only probability $p$ that Harry notes the address of a caller correctly. What is the distribution of the number of pizzas successfully delivered?

Let
$$X_i = \begin{cases} 1, & \text{if the address of the } i\text{th caller is correctly noted,} \\ 0, & \text{otherwise.} \end{cases}$$
Let $N \sim p(k; \lambda)$. The number of successful deliveries is $S_N$ with gf
$$P_{S_N}(s) = P_N\bigl(P_{X_1}(s)\bigr) = P_N(q + ps),$$
and from (1.3.4.2) this is
$$\exp\{\lambda(q + ps - 1)\} = \exp\{\lambda(ps - p)\} = \exp\{\lambda p(s - 1)\},$$
so we recognize that $S_N \sim p(k; \lambda p)$.

The effect of compounding has been to reduce the parameter from $\lambda$ to $\lambda p$, a phenomenon which is called thinning in Poisson process theory. The Poisson compounding of Bernoulli variables occurs often, and the previous simple example serves as a paradigm. Other examples: Imagine a Poisson
Barlier we considered the geometric distribution whieh concentrates on {0,1,2,...}:) The total fines paid is Sy and since Puls) =o at's =ps/(1~ 4s) ps olay we get : ee Poy (8) = Pw(Pxi(s)) = Pe(Gs + 531). 48 PRELIMINARIES Note from (1.2.4.1) ESw =P, (s)low1 = Pre(Pxi(8)) Px, (oar (remember how to difforentiate the composition of two functions?) and assuming Px, (1) = 1 this gives = Pie(Px, (1) Px, (1) = Pu (1) Ph, (2) (0.3.4.8) ESy = E(N)E(%). ‘The patter of this expectation will repeat itself when we discuss Wald’s Identity in Section 1.8.1, In the seeond example, BX; = }(60-+ 100) = 75 and (= asptpas) | G=ap-tm Gaye ag ~ pa + pa)/9* = p/p* = Ap = BUN) Pasion therefore (Sy) = T5/p. 1.4, THE SiMPLe Brancainc Process, ‘We now discuss a significant application of generating functions. The sim- ple branching process (sometimes called the Galton-Watson-Bienymé Pro- cess) uses generating functions in an essential manner. Informally the process is described as follows: The basic ingredient is ‘density {p.} on the non-negative integers. A population starts with a progenitor who forms genecation mumber 0. This initial progenitor splits into F offspring with probability ph. ‘These oftpring constitute the first generation. Esch of the first generation offspring independently spli into ‘a random number of offspring; the number for cach is determined by the density {p_}. This process continues until extinction, which occurs when- ‘ever all the members of a generation {sil o produce offspring Such an idealized model can be thought of as the snodel for population growth in the absence of environmental pressures. In nuclear fission exper- Jments, it was an early model for the cascading of neutrons. Historially, it frst arose from a study of the likelihood of survival of family names 14. THe Supe BRANCHING PROCESS 19 How fertile must a family be to insure that in no future generation will the fainily name die out? ‘The branching process formalism has also been used Jn the study of queues; the offspring of a customer in the system are those ‘who arrive while the customer is in service. Here is a formal definition of the model: Let {2,,j,m > 1,j > 1) be iid non-negative integer valued random variables, with each vaviable having the common distribution {p,). For what follows, interpret @ random sum 5 0 when the number of summands is 0. Define the branching process {Zim 2 0} by fot Kyat Yn = lay bo 4 aay Dy Ina bot Lon © that: Zp. can be thought of as the number of members of the nth ‘eneration which are oflspring of the jth member of the (n—1}st generation, Note that Zn = 0 implies Z,41 = 0 so that once the process hits 0 it stays at 0. Also observe that Z,1 is independent of {Zn .j = 1} which 4s crucial to what follows, since we will need this independence to apply (1.3.4) For n> 0 define Pa(s) = Bs! and set P(s)= EB = S mst, 0S 61. i ‘Thus Fo(s) = s,Fi{s) = P(s), and from (1.3.4.1) we get Pals) = P-1(P(s)). ‘And therefore Pa(s) = P(P(s)) Ps(s) = Pa(P(s)) = P(P(P(s))) = P(P2(s)) Pals) = Paa(P(s)) = PUP-a{9)) ‘Thus, the analytic equivalent of the branching effect is functional com- Position. In general, explicit calculations are hard, but in principle this determines the distribution of Z,, for any m > 0, One case where some explicit ealeulations are possible is the case of binomial replacement. 20 PRELIMINARIES ‘Example. Binomial replacement: If P(s) = ¢+ps then PAs) a+ P(g+ Bs) = 9+ Pat es Pals) =o-+pa+ P(g tps) = 9+ patra + P's Poar(s) = qt pg teat + phat ps. 
For later purposes note that in the binomial replacement example, for $0 \le s \le 1$ and $p < 1$,
$$P_{n+1}(s) = q(1 + p + \dots + p^n) + p^{n+1} s \to \frac{q}{1-p} = 1 \quad \text{as } n \to \infty.$$

Define the extinction probability
$$\pi = P[\text{extinction}] = P[Z_n = 0 \text{ for some } n],$$
and let $m = P'(1) = E Z_1$ be the mean number of offspring per individual.

Proposition. Suppose $p_0 > 0$ and $P(s)$ is not the degenerate gf $P(s) = s$. If $m \le 1$, then $\pi = 1$. If $m > 1$, then $\pi < 1$ and $\pi$ is the unique non-negative solution to the equation
$$s = P(s)$$
which is less than 1.

Proof. STEP 1: We first show $\pi$ is a solution of the equation $s = P(s)$. Since the events $[Z_n = 0]$ are non-decreasing,
$$[Z_n = 0] \subset [Z_{n+1} = 0],$$
we have that
$$\pi_n = P[Z_n = 0]$$
is a non-decreasing sequence converging to $\pi$. Since
$$P_{n+1}(s) = P(P_n(s)),$$
we get, by setting $s = 0$, that
$$\pi_{n+1} = P(\pi_n).$$
Letting $n \to \infty$ and using the continuity of $P(s)$ yields $\pi = P(\pi)$.

STEP 2: We show $\pi$ is the smallest solution of $s = P(s)$ in $[0, 1]$. Suppose $q$ is some solution of the equation. Then, since $0 \le q$ and $P(s)$ is non-decreasing on $[0, 1]$,
$$\pi_1 = P(0) \le P(q) = q,$$
and, continuing in this manner one more step,
$$\pi_2 = P(\pi_1) \le P(q) = q.$$
In general we obtain $\pi_n \le q$. Letting $n \to \infty$ yields $\pi \le q$.

STEP 3: Since $P''(s) \ge 0$, the gf $P$ is convex, and the graphs of $y = P(s)$ and $y = s$ for $0 \le s \le 1$ have at most two points in common. One of these is $s = 1$. If $P'(1) = m \le 1$, then in a left neighborhood of 1 the graph of $y = P(s)$ cannot be below that of $y = s$, and hence by convexity of $P(s)$ the only intersection is $s = 1$. In the contrary case, if $P'(1) = m > 1$, then in a left neighborhood of 1 the graph of $y = P(s)$ is below the diagonal, and there must be an additional intersection of the two graphs to the left of 1. See Figure 1.4.1. $\blacksquare$

[Figure 1.4.1: the graphs of $y = s$ and $y = P(s)$ on $[0,1]$ in the cases $m \le 1$ and $m > 1$.]

Note in the binomial replacement example that $s = P(s)$ yields the equation
$$s = q + ps,$$
whose only solution is $s = 1$, which agrees with the fact that $m = p < 1$.

We now give the following complement, which ties in with the continuity theorem of the next section.

Complement. For $0 \le s < 1$, $P_n(s) \to \pi$.

Proof. (Note the exceptional case $s = 1$: $P_n(1) = 1$ for all $n$.) For $0 \le s < \pi$, monotonicity gives
$$\pi_n = P_n(0) \le P_n(s) \le P_n(\pi) = \pi,$$
and since $\pi_n \to \pi$, the squeeze yields $P_n(s) \to \pi$. For $\pi \le s < 1$, since $P(s)$ is non-decreasing and $P(s) \le s$ there, from the previous inequalities we get
$$\pi \le P_{n+1}(s) \le P_n(s) \le \dots \le P(s) \le s.$$
Thus $\{P_n(s)\}$ is non-increasing for $\pi \le s < 1$; let $P_\infty(s) = \lim_{n \to \infty} P_n(s)$. Suppose for some $s_0 \in (\pi, 1)$ we have $P_\infty(s_0) = a > \pi$. Then
$$P(a) = \lim_{n \to \infty} P(P_n(s_0)) = \lim_{n \to \infty} P_{n+1}(s_0) = a,$$
so $a$ is a fixed point in $(\pi, 1)$, a contradiction, since on $(\pi, 1)$ we have $P(s) < s$. $\blacksquare$

What $P_n(s) \to \pi$ says is that
$$\sum_{k=0}^{\infty} P[Z_n = k]\, s^k \to \pi = \pi s^0 + \sum_{k \ge 1} 0 \cdot s^k.$$
Anticipating the Continuity Theorem for generating functions presented in the next section, this implies that
$$P[Z_n = 0] \to \pi, \qquad P[Z_n = k] \to 0 \quad \text{for } k \ge 1.$$
In fact, using Markov chain or martingale methods, we can get a stronger result, namely that
$$P[Z_n \to 0 \text{ or } Z_n \to \infty] = 1 \quad \text{and} \quad P[Z_n \to 0] = 1 - P[Z_n \to \infty].$$
The simple branching process exhibits an instability: Either extinction occurs or the process explodes.

Example. Harry Yearns for a Coffee Break. In order to help some friends, Harry becomes the east coast sales representative of B & D Software. The software has been favorably reviewed and demand is heavy. Harry sets up a sales booth at the local computer show and takes orders. Each order takes three minutes to fill. While each order is being filled, there is probability $p_j$ that $j$ more customers will arrive and join the line. Assume $p_0 = .2$, $p_1 = .2$ and $p_2 = .6$. Harry cannot take a coffee break until service is completed and no one is waiting in line to order the software. If present conditions persist, what is the probability that Harry will ever take a coffee break?

Consider a branching process with offspring distribution given by $(p_0, p_1, p_2)$. Harry can take a coffee break if and only if extinction occurs in the branching process. We have
$$P(s) = .2 + .2s + .6s^2 \quad \text{and} \quad m = (.2)(1) + (.6)(2) = 1.4 > 1,$$
so $s = P(s)$ yields the equation
$$s = .2 + .2s + .6s^2.$$
Therefore we must solve the quadratic equation
$$.6s^2 - .8s + .2 = 0,$$
and the two roots
$$\frac{.8 \pm \sqrt{.64 - 4(.6)(.2)}}{2(.6)} = \frac{.8 \pm .4}{1.2}$$
yield the numbers 1 and $\frac{1}{3}$, and thus $\pi = \frac{1}{3}$. Thus the probability that Harry will ever take a coffee break, if present conditions persist, is $1/3$.
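Both of the numerical procedures described next (root finding and fixed-point iteration) confirm this answer in a few lines. The sketch below is ours, in Python rather than the Mathematica mentioned in the text:

```python
import numpy as np

P = lambda s: 0.2 + 0.2 * s + 0.6 * s ** 2   # offspring gf from the example

# Fixed-point iteration: pi_1 = P(0), pi_{n+1} = P(pi_n)
pi = 0.0
for _ in range(60):
    pi = P(pi)
print(pi)                                # -> 0.3333..., i.e. 1/3

# Alternatively, the smallest root of P(s) - s = 0 in [0, 1]
roots = np.roots([0.6, -0.8, 0.2])       # coefficients of .6 s^2 - .8 s + .2
print(min(r.real for r in roots))        # -> 0.3333...
```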
When $P(s)$ is of degree higher than two, solution by hand of the equation $s = P(s)$ may be difficult, while a numerical solution is easy. The procedure is first to compute $m = \sum_k k\, p_k$. If $m \le 1$, then $\pi = 1$ and we are done. Otherwise we must solve numerically: Root finding packages are common. A program such as Mathematica makes short work of finding the solution. Typing

Solve[P(s) - s == 0, s]

will immediately yield all the roots of the equation, and the smallest root in the unit interval can easily be identified. Alternately, $\pi$ can be found by computing $\pi = \lim_{n \to \infty} P_n(0)$. The recursion
$$\pi_1 = P(0), \qquad \pi_{n+1} = P(\pi_n)$$
can be easily programmed on a computer and the solution will converge quickly. In fact the convergence will be geometrically fast, since for $c = P'(\pi) \in (0, 1)$
$$0 \le \pi - \pi_n \le c^n\, \pi.$$
The reason for this inequality is that, by the mean value theorem and the monotonicity of $P_n'$,
$$\pi - \pi_n = P_n(\pi) - P_n(0) \le P_n'(\pi)\, \pi.$$
We need to check that $P_n'(\pi) = (P'(\pi))^n$ and $P'(\pi) < 1$. Since
$$P_{n+1}'(s) = P_n'(P(s))\, P'(s),$$
we get, because $P(\pi) = \pi$,
$$P_{n+1}'(\pi) = P_n'(\pi)\, P'(\pi).$$
The difference equation, when iterated, shows the desired power solution. It remains to check that $P'(\pi) < 1$. If this were false and $P'(\pi) \ge 1$, then for $s > \pi$, by monotonicity and the mean value theorem, we would get
$$P(s) - P(\pi) \ge P'(\pi)(s - \pi) \ge (s - \pi),$$
so that
$$P(s) \ge \pi + s - \pi = s,$$
which for $s > \pi$ is a contradiction, since on $(\pi, 1)$ we have $P(s) < s$ (assuming $P(s)$ is not linear).

1.5. Limit Distributions and the Continuity Theorem

Let $\{X_n, n \ge 0\}$ be non-negative, integer valued random variables with
$$(1.5.1)\qquad P[X_n = k] = p_k^{(n)},\ n \ge 0,\ k \ge 0; \qquad P_n(s) = E s^{X_n}.$$
Then $X_n$ converges in distribution to $X_0$, written $X_n \Rightarrow X_0$, if
$$(1.5.2)\qquad \lim_{n \to \infty} p_k^{(n)} = p_k^{(0)}$$
for $k = 0, 1, 2, \dots$. As the next result shows, this is equivalent to
$$(1.5.3)\qquad P_n(s) \to P_0(s) \quad \text{for } 0 \le s < 1.$$

Theorem 1.5.1. Suppose $\{p_k^{(0)}, k \ge 0\}$ is a probability mass function on $\{0, 1, 2, \dots\}$, so that $\sum_{k=0}^{\infty} p_k^{(0)} = 1$, with generating function $P_0(s) = \sum_k p_k^{(0)} s^k$. Then
$$(1.5.4)\qquad \lim_{n \to \infty} p_k^{(n)} = p_k^{(0)}, \quad k \ge 0,$$
if and only if
$$(1.5.5)\qquad \lim_{n \to \infty} P_n(s) = P_0(s), \quad 0 \le s < 1.$$

Proof. Suppose (1.5.4) holds, and fix $0 \le s < 1$. For any $\epsilon > 0$ we may pick $m$ so large that $2 \sum_{k > m} s^k < \epsilon$. We have
$$|P_n(s) - P_0(s)| \le \sum_{k=0}^{m} |p_k^{(n)} - p_k^{(0)}|\, s^k + \sum_{k > m} \bigl(p_k^{(n)} + p_k^{(0)}\bigr) s^k \le \sum_{k=0}^{m} |p_k^{(n)} - p_k^{(0)}| + \epsilon.$$
Letting $n \to \infty$, we get $\limsup_n |P_n(s) - P_0(s)| \le \epsilon$, and because $\epsilon$ is arbitrary we obtain (1.5.5).

The proof of the converse is somewhat more involved and is deferred to the appendix at the end of this section, which can be read by the interested student or skipped by a beginner.

Example. Harry and the Mushroom Staples.* Each morning, Harry buys enough salad ingredients to prepare 200 salads for the lunch crowd at his restaurant. Included in the salad are mushrooms which come in small boxes held shut by relatively large staples. For each salad, there is probability .005 that the person preparing the salad will sloppily drop a staple into it. During a three week period, Harry's precocious twelfth grade niece, who has just completed a statistics unit in high school, keeps track of the number of staples dropped in salads. (Harry's customers are not reticent about complaining about such things, so detection of the sin and collection of the data pose no problem.) After drawing a histogram, the niece decides that the number of salads per day containing a staple is Poisson distributed with parameter $(200)(.005) = 1$.

* A semi-true story.

Harry's niece has empirically rediscovered the Poisson approximation to the binomial distribution: If $X_n \sim b(k; n, p(n))$ and
$$(1.5.8)\qquad \lim_{n \to \infty} E X_n = \lim_{n \to \infty} n\, p(n) = \lambda \in (0, \infty),$$
then $X_n \Rightarrow X_0$ as $n \to \infty$, where $X_0 \sim p(k; \lambda)$.
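Harry's niece's numbers make a convenient check of the approximation. The short sketch below (ours) tabulates $b(k; 200, .005)$ next to $p(k; 1)$:

```python
from math import comb, exp, factorial

n, p, lam = 200, 0.005, 1.0   # the mushroom-staple numbers: n * p = 1

binom = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(8)]
poiss = [exp(-lam) * lam ** k / factorial(k) for k in range(8)]
for k, (b, po) in enumerate(zip(binom, poiss)):
    print(k, round(b, 5), round(po, 5))
# k=0: 0.36695 vs 0.36788; k=1: 0.36880 vs 0.36788; agreement to two decimals
```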
The verification of the Poisson approximation is easy using generating functions. We have
$$\lim_{n \to \infty} P_n(s) = \lim_{n \to \infty} E s^{X_n} = \lim_{n \to \infty} \bigl(1 - p(n) + p(n)s\bigr)^n = \lim_{n \to \infty} \Bigl(1 + \frac{n\, p(n)(s - 1)}{n}\Bigr)^n = e^{\lambda(s-1)},$$
using (1.5.8).

Appendix: Continuation of the Proof of Theorem 1.5.1. We now return to the proof of Theorem 1.5.1 and show why convergence of the generating functions implies convergence of the sequences.

Assume we know the following fact: Any sequence of mass functions $\{\{f_j^{(n)}, j \ge 0\},\ n \ge 1\}$ has a convergent subsequence $\{\{f_j^{(n')}, j \ge 0\}\}$, meaning that for all $j$
$$\lim_{n' \to \infty} f_j^{(n')} \quad \text{exists}.$$
If $\{p_k^{(n)}\}$ has two different subsequential limits along $\{n'\}$ and $\{n''\}$, by the first half of Theorem 1.5.1 and hypothesis (1.5.3) we would have
$$\lim_{n' \to \infty} \sum_k p_k^{(n')} s^k = \lim_{n' \to \infty} P_{n'}(s) = P_0(s),$$
and also
$$\lim_{n'' \to \infty} \sum_k p_k^{(n'')} s^k = \lim_{n'' \to \infty} P_{n''}(s) = P_0(s).$$
Thus any two subsequential limits of $\{p_k^{(n)}\}$ have the same generating function. Since generating functions uniquely determine the sequence, all subsequential limits are equal, and thus $\lim_{n \to \infty} p_k^{(n)}$ exists for all $k$. The limit has generating function $P_0(s)$.

It remains to verify the claim that a sequence of mass functions $\{\{f_j^{(n)}, j \ge 0\},\ n \ge 1\}$ has a subsequential limit. Since for each $n$ we have
$$\{f_j^{(n)}, j \ge 0\} \in [0, 1]^\infty,$$
and $[0, 1]^\infty$ is a compact set (being a product of the compact sets $[0, 1]$), we have an infinite sequence of elements in a compact set. Hence a subsequential limit must exist. If the compactness argument is not satisfying, a subsequential limit can be manufactured by a diagonalization procedure. (See Billingsley, 1986, page 568.)

1.5.1. The Law of Rare Events

A more sophisticated version of the Poisson approximation, sometimes called the Law of Rare Events, is discussed next.

Theorem 1.5.2. Suppose we have a doubly indexed array of random variables such that for each $n = 1, 2, \dots$, $\{X_{n,k}, k \ge 1\}$ is a sequence of independent (but not necessarily identically distributed) Bernoulli random variables satisfying
$$(1.5.1.1)\qquad P[X_{n,k} = 1] = p_k(n) = 1 - P[X_{n,k} = 0],$$
$$(1.5.1.2)\qquad \bigvee_{k=1}^{n} p_k(n) = \delta(n) \to 0, \quad n \to \infty,$$
$$(1.5.1.3)\qquad \sum_{k=1}^{n} p_k(n) = E \sum_{k=1}^{n} X_{n,k} \to \lambda \in (0, \infty), \quad n \to \infty.$$
If $PO(\lambda)$ is a Poisson distributed random variable with mean $\lambda$, then
$$\sum_{k=1}^{n} X_{n,k} \Rightarrow PO(\lambda).$$

Remarks. The effect of (1.5.1.2) is that each $X_{n,k}$, $k = 1, \dots, n$, has a uniformly small probability of being 1. Think of $X_{n,k}$ as being the indicator of the event $A_{n,k}$, viz. $X_{n,k} = 1_{A_{n,k}}$, in which case
$$\sum_{k=1}^{n} X_{n,k} = \sum_{k=1}^{n} 1_{A_{n,k}} = \text{the number of } A_{n,k},\ 1 \le k \le n, \text{ which occur}.$$

1.6. The Simple Random Walk

Let $\{X_n, n \ge 1\}$ be iid random variables with
$$P[X_1 = 1] = p, \qquad P[X_1 = -1] = q = 1 - p,$$
and define the simple random walk by $S_0 = 0$, $S_n = X_1 + \dots + X_n$, $n \ge 1$. Let
$$N = \inf\{n \ge 1 : S_n = 1\}$$
be the first passage time to 1, set $\phi_0 = 0$ and $\phi_n = P[N = n]$ for $n \ge 1$, and let $\Phi(s) = \sum_n \phi_n s^n$.

If $n \ge 2$, then in order for the random walk to go from 0 to 1 in $n$ steps, the first step must be to $-1$ (which has probability $q$). From $-1$ the walk must make its way back up to 0. Say this takes $j$ steps. Then it seems reasonable that the probability of the walk going from $-1$ to 0 in $j$ steps is $\phi_j$. From 0 the random walk still must get up to 1. Say this takes $k$ steps. Then this probability should be $\phi_k$, and the constraint on $j$ and $k$ is that $1 + j + k = n$, where the 1 is used for the initial step to $-1$. Thus the equation should be
$$\phi_n = q \sum_{\substack{j, k \ge 0 \\ j + k = n - 1}} \phi_j \phi_k.$$
The argument just given seems plausible, and we now make it precise. (Those who found the argument convincing can skip to (1.6.2).) For $n \ge 2$ we have
$$(1.6.1)\qquad [N = n] = \bigcup_{j=1}^{n-2} \bigl([X_1 = -1] \cap A_j \cap B_{n-j-1}\bigr),$$
Sinez the union in (1.6.1) is union of disjoint events we have on aPC, P(Ba- 5-3) PIN Now (1, Xo, } (XX, Xe} ning the finite dimensional distributions of both sequences are identi cal; ie, for any mand sequence ky... of elements chosen from {1,1} wwe have PI = hays Xm PUK = bay Xen Fel since both sequences are just independent, identically distributed. There- ma] P(A,) = Phin ‘and similarly PByj-1 = Onis 16. THE SnaPte Ranpom Wack 35 from which we get the recursion 40 =0,1=7 (1.6.2) bn = DW jbngars 2B cat ‘This difference equation summarizes the probability structure. To solve, multiply (1.6.2) by s* and snm over n. Set &(s) = Dig dns”. We have aes) See (Serer) * “E(Seown)a Reversing the summation order (note n ~2> j > Oimplies n > j+2), we set the above equal to Setting m= n—j~1 yields E(Ear)onm BSR) osshs = 90818) S450" -gs0"(s), ‘The left side of (1.6.3) is and we conclude 36 Preuiinanies Solve the quadratic for the unknown (3). We get (0) = (1+ VI 4pas?) /2a8. ‘The solution with the “4 siga is probabilistically inadmissible. We know (0) <1 but the solution with the “+” sign has the property La YT= apg? 11 2 es gs ~ Dgs as ¢ +0 (where a(3) ~ B(s) as s —+ 0 means lim, 9 a(s)/4(s) = 1). So we conclude T= tas (1.6.4) a= Osse2 We can expand this to get an explicit sokution for {n) using the Binomial ‘Theorem. On the one hand we know (3) = D7 dns™ and on the other, by expanding 1.6.4, we also have 6) (: E Q)corer) on ‘The “1” and the *j = 0” terms cancel; taking the minus sign in front of, the sum inside the sum yields SUD by By Bis) = (-1)°? (4pq)4844 /298 =) so (U2 jar 409)! soja EC ow eg jst (s+ Os? +. We conclude ons 2) (ay s#1(4pqy8 (P)cwearaiiaa 524 ‘and, for the even indices, we have for j > 1 16. Tae Simpce Ranbom WALK 37 bys =0. For obtaining qualitative conclusions, it may be easier to extract in- formation from the generating function. For instance, from (1.6.4) PIN 0] > 0 and on the set of positive proba- bility iq > 0. OmolSn <0) the gambler is never ahead ‘When P[V = co] > 0 we have by defaition EN =o. When p > q we compute EN = (1) by differentiating in (1.6.4) 24st = 4oas?)-¥*( Spgs) — (1 VT — 4oas")2a ag?s® (ugly but correct), and letting s 7 1 yields en = (2a( ) = 200 vI= a) Hq. mumerator and denominator by 2q we get Be) = i a ew = (4 (1-19) /20 2p _ (= \p~al) p-al ee 38 PRELIMINARIES and so EN {Geos ine An extension of these techniques can be used to analyze the distribu- tion of the first return time to 0. Define Np = infin 1:8, =0), st fo = 0 and fan = PlNo = 2], n> 1. Also . FO) = fans™, OS #51. Now we have At inf{m: Dh Xin eee { 14+ infin: Sh, Keg Yon [Xs 1) on [X Set wr ttte Sax meas SX =} = N and observe that because {X,,i 2 1} 4 {Xi 2 2} we have NW 4 Nt ‘Also N* is determined by (Xi4a,i 2 1) and is therefore independent of X;. Similarly N~ is independent of X;. We have Pls) = Bs" = Bey yaa) + BN SES eyo y+ BS yas, By independence this is = Bs" PIX, = -1]+ Es" PIX, = 1] 6.7) = o0(s\q+ spEs* 16. THE SmaPLe Ranpom Wak 39 Note Wo = inffns So Xiga = 1) Snffa SO = = inf(n SX) =) infin: > XP* = 1} Moreover, the process {Sf Xifsn 2 1) is a simple random walk with step distribution PIX = Pixt 1-%1 = 1] = PX; P. To get P= BN, vee simply use the formula (1.6.4) with p and g reversed. Consequently, from (1.6.7), Fs) 90 (iS = ME) +p (=== ae) gs ps (1.6.8) =~ v1— pas. Ruther, PUI) = Ng < oo] = 1 = T= Apa = 1 [pal 0 1 fpw@ PINs g dp, ifp 2, then X isa random vector. (3) If5 = R®, then X is a stochastic sequence; ie, a stochastic process with a discrete index set. 
So a stochastie process with index set {0,1,...} sa random element of R®. (4) If $= C, the space of continuous functions on (0,0), then X {is a continuous path stochastic process. A prominent example of such » process is Brownian motion, (6) If S = M,(B), the space of all point measures on some nice space F, then X is a stochastic point process. Some promi- nent examples when £ = (0,00) are the Poisson processes and renewal processes, Other examples abound. ‘The distribution of the random element X is the probability measure on (5,5) induced by X, namely Po X-!, so that for Be S PoX""(B) = PIX € Bi AAs before, the distribution of X determines the probabilities of all events determined by X, namely X~(5). Usually it is convenient to find a small cass of sets, as we did in the case of ranclom variables, so that Po X~? is determined by its values on this small class. Recall that the relevant technique is the consequence of Dynkin's Theorem (Billingsley, 1986, p. 38) given next. Proposition 1.7.1. IFC is a class of subsots of S which is closed under finite intersections and generates S, i, if o(C) = S, and iftwo probability measures P,, Pp agree on C, thon P, =P, on S, Sequence Space: We now concentrate attention on RO = (x:x = (ey12,...) andy € R, i> 1} since this class is most appropriate for discrete time stochastie processes. Let C be the class of finite dimensional rectangles in R™; ue, AE C if there exist real intervals Jy,... 1 for some & and A= {xe R® neh, a Note that C is closed under intersections, and it is also true that 72, the cralgebra generated by the open sets in R, is generated by the finite dimensional rectangles in C (cf. for example, Billingsley, 1968). So we have the important conclusion that any measure on (R™, R®) is uniquely ‘determined by its values on the finite dimensional rectangles C. a2 PRELIMINARIES Suppose X= (X;, Kay...) is a random element of R defined on the probability spoce (Q,4,P). lis distribution Po X~ is determined by its values on the finite dimensional rectangles C. ‘This can be expressed another way. We say two random elements X and X’ are equal in distribution (written X £ X") if PoX-! = Po (X")-! on R™. X’ is then called a version of X Proposition 1.7.2. IfX and X! are two random elements of R™ then xéx’ if for every KEV: (yy... Xa) 4 (XY. XD € RE Proof. Define the projections My : °° ++ RE by Hye, 22,-+-) (215. 4) Each Tl, is continuous and hence measurable. IfX £ X" then also Ta(X) = Clays Me) SMR) = (Xp... VQ) 1s desired. Conversely if for every BE 1: (Nyy. Xs) £ (Xhyee- AD ERE then the distributions of X and X’ agree on C and hence everywhere, Ml Call the collection of distributions PoX"tolk() = PU(X1,.-- Xa) E41 on RE (k > 1) the finite dimensional distributions of the process X and our proposition may be phrased as the distribution of a process is determined by the finite dimensional distributions. Define a new class €’ as follows: A set \’ isin C’ if i is of the form N= {y eR? ry Sayi ohh for some & > 1,(2y,... 24) € RE Note that C’ is still closed under Intersections and still generates R™; also Pox al) = PIX San... Xe S tah 17. THE DISTRIBUTION OF A PROCESS 43 which is a &-dimensional distribution function. The analogue of Proposition 1.7.1 is that the distribution of the process is determined by the finite dimensional distribution functions. ‘Two random elements X,X’ in R which are equal in distribution will be probabilistically indistinguishable. ‘This last statement is somewhat vague. 
What is meant is that any probsbility caleulation dane for X yields the same answer when done for X’. (This rephrases the statement Pik €B)= PK eB, WeR™) In succueding chapters we frequently will construct 9 convenient represen- tation of the stochastic process X= (Xe). (This was already done with the branching procass.) We are assured that any othor version X= {Xi} will have the same properties as the constructed X. Here is one last bit of information: Define the coordinate map 7 Bes Rby miltytay-- for k > 1. The following shows there is nothing mysterious about a mea- surable map from 9 to R*. Proposition 1.7.3. If X is a random element of f° then for each k > 1 we have that m_(X) i a random variable. Conversely, if Xi,X2,... are random variables defined on ®, then defining X by Xu) = (w), Nal), --) ‘Yields a random element of R. ‘A random element then is just a sequence of random variables. Proof. my is continuous and hence measurable. Therefore if X is a random element of R, meoX 06 RB being the composition of two measurable maps is measurable and hence is 4 random variable For the converse, we must show XURB) CA. However, R® = o(C) and XMo(O)) = o(X"C)) 44 PRELIMINARIES But for a typiesl A € C. XNA) = eG eLied since X1,..- Xe are random variables. Hence XMOQCA and (XO) CA 1s desired. 1.8. SrorPine Times. * Information in a probability space is organized with the help of o-fields If we have a stochastic prooess (Xq,n > 1} we frequently have to know what information is available if we hypothetically observe the process for n time units, Imagine that you will observe the process next week for m time units, The information at our disposal today from this hypothetical future experiment is the o-algebra generated by X;,-..,%y which we denote as O(Xpy.++yXq)- Another way to think about this i that o(X,...Xn) consists of those events such that when we know the value of Xsy...4Xny wwe can decide whether or not the events occurred. Note for n> 0 (Xin) CO(Kty soy Xnt) and the information from hypothetically observing the whole process is of X,5 21) = of) olXs,.- Xn). In general, suppose we have a probability space (9,4, P) and an in- creasing family of o-Belds F,,n>0; ie, Fy C Fasr C A. Define Vanna) Fee * This section contains advanced material which may be skipped om first reading by beginning readers 1.8, SroPeING Times 45 30 that Fa C Fog CA, Think of (Fa,0 1: Sq =} Js a stopping time. Note the convention: Tie infimum of sn empty set is 490. So [Sn <1, for all n] = [{n 21: Sy = 1} = 0] =[W = oe] We have that, N a stopping time with respect to {F,,m > 0} where Fo= (00), Fa=o(X1,..-)Xn)n 2 1 ‘The reason is that for n> 1 iv ($1 1 frp =n] = Ko € BY... Gun € BG EB] E Fr. In Markov chain models, the state space tay be {0,2,...}- A typical case is that B= {0}. We are interested in 79 where {2 0&4 =O} Back to the general discussion. If « is a stopping time with respect to {Fa}, the information up to time a is contained in the o-feld 7 Fa ={N€ Foo: AN a= nl € Fa for all 1 1}- It is the information available after time a 1.8, STOPPING ThaEs ar 1.8.1, WAto’s IDewriry, Wald’s identity and its generalizations are special cases of martingale stop- ping theorems. They are useful for computing moments of randomly stopped sums, although checking the validity of moment: assumptions nee- essary for the identities to hold ean be tricky. 
We have already seen an identity lke Wald’s in (1.3.4.3) Ifyou did not read Section 1.7, think of a stopping time a with respect to the soquence {Xn,n > 1} as a random variable such that the sot [a = m] is dotermined only by X1,...,Xwms for any m > 1. Thus a takes on the value m regardless of future (beyond time m) values of the process, Proposition 1.8.1. Suppose {Xq.n > 1} are independent, identically distributed with E|X,| < 00. Suppose @ is a stopping time with respect to {Xan > 1} and Ea =E(X)Ea fom Lemma 1.1.1 ‘The rest of the proof justifying the interchange of E and 22, requires f bit of measure theory. A student. who does not have this background should skip the rest of the proof and proceed to the example below. For ‘those who continue, note ED XAtcail = BDO 1XMice 48 PRELIMINARIES Since all terms are positive, Fubini or monotone convergence justify the interchange and SF IKMjsai = BUG IBA < 20 by assumption. ‘Therefore the function of two variables é and w Xu) coil) js absolutely integrable with respect to the product of P and counting measure. This justifies a Pubini interchange of the iterated integration. Ml Example. Consider the simple random walk (Sq) with Sp = 0 and set N =inf{n > 1:84 =1} Recall P[X; = 1] = p= 1—P[Xy = —1] 0 that BX; = p—g. On [N < oe] wwe have Sy =1. If EN < oo then Wald’s identity holds and Sy E(X)EN = (p-)EN. If p = q, we get a contradiction: If BN < oo then PIN’ < oo] ‘moreover, on the one hand ESy=EL=1 (since Sy = 1) ‘and on the othe, by Wald Sy —hew=0 Hence BN ~ 00.11 EN < co and p < q then Wald implies 1 = (p-)EN < 0, a contradiction. So we get the weak conclusion EN = oo, whereas we know from (1.6) that in fact PIN’ = oo] > 0. If p > g and EN < co we conclude from Wald EN = (p~ 4), in ageeoment with (2.6.6), but this argument does not prove EN < co, 1.8.2. SpurrrinG AN IID Sequence ar SroppiNe Trust Suppose {Xn,n > 0} are iid random elements and set Fa =a(Xoy... Xa), Fa =a Xngs Xng aioe) “This section contains advanced material which may be skipped on fist reading by beginning readers. However, the main result Proposition 1.8 2s easy ‘to understand in the case that @ i five as. 1a. Sroprine Times 49 ‘Fa sepresents the history up to time n and is the future after n, For jid random clements, these two o-fields are independent. ‘The same is true svien nis replaced by a stopping time a; however care must be taken to handle the case that Pla = co] >0. As before, we must define Fu=V Fa= el Fe) ae tees If a= co it makes no sense to talk about splitting (Xa) into two piwous-—the pre-and posta pieces. Instead we restrict attention to. the Have probability space. If (9,4, ) is the probability space on which {G65} and ar are defined, the trae probability spnce is (O*, A®, P#) = (IA fe < oo), AN [er < oo], Pf-lar < 00}) {assuming Pla < 0] > 0). If Pla < oo] = 1, then there is no essential difference between the original and the trace space. Proposition 1.8.2. Let (Xq,n > 0) be iid and suppose a isa stopping time of the sequence (ie, with respect to {Fq}). In the trace probability space (2#, F#, P#), the pro- and post-a o-fields F,,F., are independent and {Xun 20} £ (Kaen 2 I fn the sense that for BE R™ (as2a) PH[(Xasesk 2 1) € B= Pl(Xqn 20) € Bl Proof. 
1.8.2. Splitting an IID Sequence at a Stopping Time.*

* This section contains advanced material which may be skipped on first reading by beginning readers. However, the main result, Proposition 1.8.2, is easy to understand in the case that α is finite a.s.

Suppose {X_n, n ≥ 0} are iid random elements and set

F_n = σ(X_0, …, X_n), F′_n = σ(X_{n+1}, X_{n+2}, …).

F_n represents the history up to time n and F′_n is the future after n. For iid random elements, these two σ-fields are independent. The same is true when n is replaced by a stopping time α; however, care must be taken to handle the case that P[α = ∞] > 0. As before, we define

F_∞ = ∨_{n=0}^{∞} F_n = σ( ∪_{n=0}^{∞} F_n ).

If α = ∞ it makes no sense to talk about splitting {X_n} into two pieces, the pre-α and post-α pieces. Instead we restrict attention to the trace probability space. If (Ω, A, P) is the probability space on which {X_n} and α are defined, the trace probability space is

(Ω#, A#, P#) = (Ω ∩ [α < ∞], A ∩ [α < ∞], P(· | [α < ∞]))

(assuming P[α < ∞] > 0). If P[α < ∞] = 1, then there is no essential difference between the original and the trace space.

Proposition 1.8.2. Let {X_n, n ≥ 0} be iid and suppose α is a stopping time of the sequence (i.e., with respect to {F_n}). In the trace probability space (Ω#, A#, P#), the pre- and post-α σ-fields F_α, F′_α are independent and

{X_n, n ≥ 0} =d {X_{α+k}, k ≥ 1},

in the sense that for B ∈ B(R^∞)

(1.8.2.1) P#[(X_{α+k}, k ≥ 1) ∈ B] = P[(X_n, n ≥ 0) ∈ B].

Proof. Suppose Λ ∈ F_α. Then

(1.8.2.2) P[Λ ∩ [α < ∞] ∩ [(X_{α+k}, k ≥ 1) ∈ B]] = Σ_{n=0}^{∞} P[Λ ∩ [α = n] ∩ [(X_{n+k}, k ≥ 1) ∈ B]].

From the definition of F_α we have Λ ∩ [α = n] ∈ F_n. Since [(X_{n+k}, k ≥ 1) ∈ B] ∈ F′_n, which is independent of F_n, we get (1.8.2.2) equal to

Σ_{n=0}^{∞} P[Λ ∩ [α = n]] P[(X_{n+k}, k ≥ 1) ∈ B]
= Σ_{n=0}^{∞} P[Λ ∩ [α = n]] P[(X_k, k ≥ 0) ∈ B]
= P[Λ ∩ [α < ∞]] P[(X_k, k ≥ 0) ∈ B].

By dividing (1.8.2.2) by P[α < ∞] we conclude

(1.8.2.3) P#[Λ ∩ [(X_{α+k}, k ≥ 1) ∈ B]] = P#(Λ) P[(X_k, k ≥ 0) ∈ B].

Let Λ = Ω. We conclude that (1.8.2.1) is true. Once we know (1.8.2.1) is true we may rewrite (1.8.2.3) as

(1.8.2.4) P#[Λ ∩ [(X_{α+k}, k ≥ 1) ∈ B]] = P#(Λ) P#[(X_{α+k}, k ≥ 1) ∈ B],

which gives the required independence. ∎

Example. Let {X_n, n ≥ 1} be iid Bernoulli, and let the associated simple random walk be

S_0 = 0, S_n = X_1 + ⋯ + X_n, n ≥ 1.

We derive the quadratic equation for φ(s) = E s^N, where N = inf{n ≥ 1 : S_n = 1}, without first deriving a recursion for {P[N = k], k ≥ 1}. We have

φ(s) = E s^N = E s^N 1_{[X_1 = 1]} + E s^N 1_{[X_1 = −1]} = ps + E s^N 1_{[X_1 = −1]}.

Define

N_1 = inf{n ≥ 1 : Σ_{j=2}^{n+1} X_j = 1},

and on [N_1 < ∞] define

N_2 = inf{n ≥ 1 : Σ_{j=N_1+2}^{N_1+1+n} X_j = 1},

so that on [N_1 < ∞, X_1 = −1]

N = 1 + N_1 + N_2.

Define P# = P(· | [N_1 < ∞]). Now on [N_1 < ∞], N_2 is the same functional of {X_{N_1+1+k}, k ≥ 1} as N_1 is of {X_{1+k}, k ≥ 1}. So from the previous results, for any k,

P[N_1 = k] = P#[N_2 = k].

Thus for 0 < s < 1

E s^N 1_{[X_1 = −1]} = E s^{1+N_1+N_2} 1_{[X_1 = −1]} 1_{[N_1 < ∞]}

(since on [X_1 = −1, N_1 = ∞] we have N = ∞ and s^∞ = 0). Let E# be expectation with respect to the measure P#. Then

E s^{1+N_1+N_2} 1_{[X_1 = −1]} 1_{[N_1 < ∞]} = sq E#( s^{N_1+N_2} ) P[N_1 < ∞],

and since N_1, N_2 are independent with respect to P# (by 1.8.2), this is

= sq E#( s^{N_1} ) E#( s^{N_2} ) P[N_1 < ∞].

Using (1.8.2.1) we get E#( s^{N_2} ) = E s^{N_1} = φ(s), while E#( s^{N_1} ) P[N_1 < ∞] = E( s^{N_1} 1_{[N_1 < ∞]} ) = φ(s), and therefore

φ(s) = ps + qs φ(s)²,

the promised quadratic equation for φ.
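Solving the quadratic for the root which vanishes as s ↓ 0 gives φ(s) = (1 − √(1 − 4pqs²))/(2qs), and this can be checked against simulation. A sketch (our own illustration; p = 0.6, s = 0.9 and the truncation are arbitrary choices):

    import math, random

    def phi_exact(s, p):
        # root of q s phi^2 - phi + p s = 0 that stays bounded as s -> 0
        q = 1 - p
        return (1 - math.sqrt(1 - 4 * p * q * s * s)) / (2 * q * s)

    def phi_mc(s, p, reps=50_000, horizon=100_000):
        # Monte Carlo estimate of E[s^N]; walks not reaching 1 by `horizon`
        # contribute 0, consistent with the convention s^infinity = 0
        total = 0.0
        for _ in range(reps):
            walk, n = 0, 0
            while n < horizon:
                n += 1
                walk += 1 if random.random() < p else -1
                if walk == 1:
                    total += s ** n
                    break
        return total / reps

    random.seed(2)
    print(phi_exact(0.9, 0.6), phi_mc(0.9, 0.6))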
Exercises for Chapter 1

1.2. Let {X_n, n ≥ 1} be iid Bernoulli random variables with

P[X_1 = 1] = p = 1 − P[X_1 = 0],

and let S_n = Σ_{k=1}^{n} X_k be the number of successes in n trials. Show S_n has a binomial distribution by the following method: (1) Prove for m ≥ 0, 1 ≤ …

1.4. For each of the following, describe the index set and the state space of the stochastic process.
(a) … {X(t), t ≥ 0}, where X(t) is the quantity ordered by time t.
(b) Thirty-six points are chosen randomly in Alaska according to some probability distribution. A circle of random radius is drawn about each point, yielding a random set S. Let X(A) be the value of the oil in the ground under region A ⊂ S. The process is {X(B), B ⊂ Alaska}.
(c) Sleeping Beauty sleeps in one of three positions:
(1) On her back looking radiant.
(2) Curled up in the fetal position.
(3) In the fetal position, sucking her thumb and looking radiant only to an orthodontist.
Let X(t) be Sleeping Beauty's position at time t. The process is {X(t), t ≥ 0}.
(d) For n = 0, 1, …, let X_n be the value in dollars of property damage to West Palm Beach, Florida and Charleston, South Carolina by the nth hurricane to hit the coast of the United States.

1.5. If X is a non-negative integer valued random variable with X ~ {p_k} and P(s) = E s^X, express the generating functions, if possible, in terms of P(s), of
(a) {P[X < n]},
(b) {P[X ≤ n]}.

1.7. Suppose {X_n, n ≥ 1} is independent, identically distributed. Define S_0 = X_0 = 1 and for n ≥ 1

S_n = X_0 + X_1 + ⋯ + X_n.

For n ≥ 1 the distribution of X_n is specified by

P[X_n = j − 1] = p_j, j = 0, 1, …,

where f(s) = Σ_{j=0}^{∞} p_j s^j, 0 ≤ s ≤ 1. (The random walk starts at 1; when it moves in the negative direction, it does so only by jumps of −1. The walk cannot jump over states when moving in the negative direction.) Let

N = inf{n : S_n = 0}.

If P(s) = E s^N, show P(s) = s f(P(s)). (Note what happens at the first step: either the random walk goes from 1 to 0 with probability p_0 or from 1 to j with probability p_j.) If f(s) = p/(1 − qs), corresponding to a geometric distribution, find the smallest solution of P(s) = s f(P(s)).

1.8. In a branching process P(s) = as² + bs + c, where a > 0, b > 0, c > 0 and P(1) = 1. Compute π. Give a condition for sure extinction.

1.9. In a binomial replacement branching model, let

T = inf{n : Z_n = 0}.

(1) Find P[T = n] for n ≥ 1.
(2) Find P[T = n] assuming Z_0 has generating function P(s) = q + ps, 0 < p < 1.

1.10. Harry lets his health habits slip during a depressed period and discovers spots growing between his toes according to a branching process with generating function

P(s) = .15 + .05s + .03s² + .07s³ + .4s⁴ + .25s⁵ + .05s⁶.

Will the spots survive? With what probability?

1.11. A Point Process. Let N(A) be the number of points in a region A. Assume that for any n, the set A can be decomposed as A = ∪_{i=1}^{n} A_i^{(n)}, where A_1^{(n)}, …, A_n^{(n)} are disjoint,

N(A) = Σ_{i=1}^{n} N(A_i^{(n)}),

and N(A_1^{(n)}), …, N(A_n^{(n)}) are independent. Assume

P[N(A_i^{(n)}) = 0] = exp{−λ/n}, P[N(A_i^{(n)}) ≥ 2] ≤ δ(1/n)/n,

where δ(z) is a positive function such that δ(z) → 0 as z → 0. Show N(A) has a Poisson distribution.

1.12. For a branching process {Z_n}, let S = 1 + Σ_{n=1}^{∞} Z_n be the total population ever born. Find a recursion which is satisfied by the generating function of S. Solve this in the cases P(s) = q + ps and P(s) = p/(1 − qs). What is E(S)?

1.13. Let [x] be the greatest integer ≤ x. Check by integral comparison or another such method that for 0 < s < t

Σ_{j=[ns]+1}^{[nt]} 1/j → log(t/s), n → ∞.

Let {X_j, j ≥ 1} be independent random variables with

P[X_j = 1] = 1/j = 1 − P[X_j = 0],

and set S_n = Σ_{j=1}^{n} X_j, n ≥ 1.
(1) What is the generating function of S_{[nt]} − S_{[ns]}?
(2) Use the continuity theorem for generating functions to show

lim_{n→∞} P[S_{[nt]} − S_{[ns]} = k] = e^{−λ} λ^k / k!, λ = log(t/s).

(3) Define L(1) = inf{j > 1 : X_j = 1}. Compute the generating function of {P[L(1) > n], n ≥ 1}. What is E L(1)?
(4) If {Z_n, n ≥ 1} is a sequence of iid random variables with a continuous distribution, show that

{1_{[Z_j is a record]}, j ≥ 1} =d {X_j, j ≥ 1},

where Z_j is a record if Z_j > Z_i for all i < j.
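The Poisson limit in part (2) of Exercise 1.13 can be previewed numerically before it is proved. A sketch (our own illustration; n = 200, s = 1, t = 4 and the replication count are arbitrary choices):

    import math, random
    from collections import Counter

    def increment_count(n, s, t):
        # realize S_[nt] - S_[ns], where P[X_j = 1] = 1/j independently
        lo, hi = int(n * s), int(n * t)
        return sum(1 for j in range(lo + 1, hi + 1) if random.random() < 1 / j)

    random.seed(3)
    n, s, t, reps = 200, 1, 4, 20_000
    lam = math.log(t / s)
    counts = Counter(increment_count(n, s, t) for _ in range(reps))
    for k in range(5):
        print(k, counts[k] / reps, math.exp(-lam) * lam ** k / math.factorial(k))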
1.14. Harry comes from a long line of descendants who do not get along with their parents. Consequently each generation vows to be different from their elders. Assume the offspring distribution for the progenitor has generating function f(s) and that the offspring distribution governing the number of children per individual in the first generation has generating function g(s). The next generation has offspring governed by f(s) and the next by g(s), so that the functions alternate from generation to generation. Determine the extinction probability of this process and the mean number of individuals in the nth (assume n is even) generation.

1.15. (a) Suppose X is a non-negative integer valued random variable. Conduct a compound experiment: observe X items and mark each of the X items independently with probability p, where 0 < p < 1. Let Y be the number of marked items; express the generating function of Y in terms of that of X.
(b) If P[X > n] = s^n for n ≥ 0, where 0 < s < 1, compute P[Y > n].

1.16. Stopping Times. (a) If α is a stopping time with respect to the σ-fields {F_n}, then prove F_α is a σ-field.
(b) If α_k, k ≥ 1, are stopping times with respect to {F_n}, show α_1 ∨ α_2 and α_1 ∧ α_2 are stopping times. (Note ∨ means "max" and ∧ means "min".) If {α_k} is a monotone increasing family of stopping times, then lim_k α_k is a stopping time.
(c) If α_1 ≤ α_2, show F_{α_1} ⊂ F_{α_2}.

1.17. For a simple random walk {S_n}, let u_0 = 1 and for n ≥ 1 let

u_n = P[S_n = 0].

Compute by combinatorics the value of u_n. Find the generating function U(s) = Σ_{n=0}^{∞} u_n s^n in closed form. To get this in closed form you need the identity

binom(2n, n) = (−4)^n binom(−1/2, n).

1.18. Happy Harry's Fan Club. Harry's restaurant is located near Orwell University, a famous institution of higher learning. Because of the crucial culinary, social and intellectual role played by Harry and the restaurant in the life of the University, a fan club is started consisting of two types of members: students and faculty. Due to the narrow focus of the club, membership automatically terminates after one year. Student members of the Happy Harry's Fan Club are so fanatical they recruit other members when their membership expires. Faculty members never recruit because they are too busy. A student recruiter will recruit two students with probability 1/4, one student and one faculty member with probability 2/3, and two faculty with probability 1/12. Assume the club was started by one student. After n years, what is the probability that no faculty will have yet been recruited? What is the probability the club will eventually have no members?

1.19. At 2 AM business is slow and Harry contemplates closing his establishment for the night. He starts flipping an unfair coin out of boredom and decides to close when he gets r consecutive heads. Let T be the number of flips necessary to obtain r consecutive heads. Suppose the probability of a head is p and the probability of a tail is q. Define p_k = P[T = k], so that p_k = 0 for k < r and p_r = p^r.
(1) Show for n > r

p_n = P[T = n] = p^r q (1 − p_0 − p_1 − ⋯ − p_{n−r−1}).

(2) Compute the generating function of T and verify P(1) = 1.
(3) Compute ET. If you are masochistic, try Var(T).
The next night at midnight, Harry is bored, so again he starts flipping coins. To vary the routine he looks for a pattern of HT (head then tail). For n ≥ 2, let

f_n = P[the pattern HT first appears at trial number n].

Compute the generating function of {f_n} and find the mean and variance.
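Both waiting times in Exercise 1.19 are easy to estimate by simulation, which gives a check on the answers for ET and for the mean of {f_n}. A sketch (our own, with p = 0.5 and r = 3 chosen arbitrarily):

    import random

    def time_to_r_heads(p, r):
        # number of flips until the first run of r consecutive heads
        run, n = 0, 0
        while run < r:
            n += 1
            run = run + 1 if random.random() < p else 0
        return n

    def time_to_HT(p):
        # number of flips until the pattern head-then-tail first appears
        n, prev_head = 0, False
        while True:
            n += 1
            head = random.random() < p
            if prev_head and not head:
                return n
            prev_head = head

    random.seed(4)
    p, r, reps = 0.5, 3, 100_000
    print(sum(time_to_r_heads(p, r) for _ in range(reps)) / reps)
    print(sum(time_to_HT(p) for _ in range(reps)) / reps)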
1.20. In a branching process, suppose P(s) = q + ps², 0 < p < 1, and define T = inf{n ≥ 1 : Z_n = 0}.
(1) Find the probability of eventual extinction.
(2) Suppose the population starts with one individual. Find P[T > n].

1.21. Let {N(t), t ≥ 0} be a process with independent increments, which means that for any k and times 0 ≤ t_1 ≤ t_2 ≤ ⋯ ≤ t_k,

N(t_1), N(t_2) − N(t_1), …, N(t_k) − N(t_{k−1})

are independent random variables. Suppose for each t that N(t) is non-negative integer valued with generating function P_t(s) = E s^{N(t)}. For … Is this a branching process? If so, what is the offspring distribution which generates this process?

1.23. For a branching process with offspring distribution

p_n = p q^n, n ≥ 0, p + q = 1, 0 < p < 1, …

1.24. Consider a population in which individuals of the nth generation reproduce according to an offspring distribution {p_{nk}, k ≥ 0} with generating function g_n(s) = Σ_k p_{nk} s^k. As before, let Z_n be the number in the nth generation.
(1) Construct a model for this population analogous with the construction of Section 1.4.
(2) Express the generating function f_n(s) = E s^{Z_n} in terms of g_k(s), k ≥ 0, where f_0(s) = s.
(3) Express m_n = E Z_n in terms of μ_i, i ≥ 0, where μ_i = g_i′(1).

1.25. Harry and the Management Software. Eager to give Happy Harry's Restaurant every possible competitive advantage, Harry writes inventory management software that is supposedly geared to restaurants. Harry, sly fox that he is, has designed the software to contain a virus that wipes out all computer memory and results in a restaurant being unable to continue operation. He starts by crossing the street and giving a copy to the trendy sprouts bar. The software is presented with the condition that the recipient must give a copy to two other restaurateurs, thus spreading the joy of technology. The time it takes a recipient to find someone else for the software is random. Upon receipt of the software, the length of time until it wipes out a restaurant's computer memory is also random. Of course, once a restaurant's computer memory is wiped out, the owner would not continue to disburse the software. Thus a restaurateur may distribute the software to 0, 1 or 2 other restaurants. For j = 0, 1, 2, define

p_j = P[a restaurateur distributes the software to j other restaurants].

Suppose p_0 = .2, p_1 = .1, p_2 = .7. What is the probability that Harry's plans for world domination of the restaurant business will succeed?

1.26.* Suppose X_1, X_2 are independent N(0,1) random variables on the space (Ω, A, P).
(a) Prove X_1 =d −X_1; i.e., prove that P ∘ X_1^{−1} = P ∘ (−X_1)^{−1} on R.
(b) Prove (X_1, X_1 + X_2) =d (X_1, X_1 − X_2) in R²; i.e., prove P ∘ (X_1, X_1 + X_2)^{−1} = P ∘ (X_1, X_1 − X_2)^{−1} on B(R²).
Now suppose {X_i, i ≥ 1} is an iid sequence of N(0,1) random variables.
(c) Prove (X_1 + X_2, …) … in R^∞.
(d) If X, Y are random elements of a metric space S, and g : S → S′ is a measurable mapping from S to a second metric space S′, show that X =d Y implies g(X) =d g(Y).

* This problem requires some advanced material which should be skipped on the first reading by beginning readers.

1.27. If X_r has a negative binomial distribution with parameters p, r (cf. Example 1.3.6, Section 1.3.3), show that if r → ∞ and rq → λ > 0, then the negative binomial random variable X_r converges in distribution to a Poisson random variable with parameter λ.

1.28. Consider the simple branching process {Z_n} with offspring distribution {p_k} and generating function P(s).
(a) When is the total number of offspring Σ_{n=0}^{∞} Z_n < ∞?
(b) When the total number of offspring is finite, give the functional equation satisfied by the generating function ψ(s) of Σ_{n=0}^{∞} Z_n.
(c) Zeke initiates a family line which is sure to die out. Lifetime earnings of each individual in Zeke's line of descent (including Zeke) constitute iid random variables which are independent of the branching process and have common distribution function F(x), where F concentrates on (0, ∞). Thus to each individual in the line of descent is associated a non-negative random variable. What is the probability H(x) that no one in Zeke's line earns more than x in his/her lifetime, where of course x > 0?
(d) When P(s) = … find ψ(s). If in addition F(x) = …, x > 0, find H(x).

CHAPTER 2

Markov Chains

In trying to make a realistic stochastic model of any physical situation, one is forced to confront the fact that real life is full of dependencies. For example, purchases next week at the supermarket may depend on satisfaction with purchases made up to now. Similarly, an hourly reading of pollution concentration at a fixed monitoring station will depend on previous readings; tomorrow's stock inventory will depend on the stock level today, as well as on demand. The number of customers awaiting service at a facility depends on the number of waiting customers in previous time periods.

The dilemma is that dependencies make for realistic models but also for unwieldy or impossible probability calculations. The more independence built into a probability model, the more possibility for explicit calculations, but the more questionable is the realism of the model. Imagine the absurdity of a probability model of a nuclear reactor which assumes each component of the complex system fails independently. The independence assumptions would allow for calculations of the probability of a core meltdown, but the model is so unrealistic that no government agency would be so foolish as to base policy on such unreliable numbers, at least not for long.

When constructing a stochastic model, the challenge is to have dependencies which allow for sufficient realism but which can be analytically tamed to permit sufficient mathematical tractability. Markov processes frequently balance these two demands nicely.
A Markov process has the property that, conditional on a history up to the present, the probabilistic structure of the future does not depend on the whole history but only on the present. Dependencies are thus manageable, since they are conditional on the present state; the future becomes conditionally independent of the past. Markov chains are Markov processes with discrete index set and countable or finite state space.

We start with a construction of a Markov chain process {X_n, n ≥ 0}. The process has a discrete state space denoted by S. Usually we take the state space S to be a subset of the integers such as {0, 1, …} (infinite state space) or {0, 1, …, m} (finite state space). When considering stationary Markov chains, it is frequently convenient to let the index set be {…, −1, 0, 1, …}, but for now the non-negative integers suffice for the index set.

How does a Markov chain evolve? To fix ideas, think of the following scenario. During a decadent period of Harry's life he used to visit a bar every night. The bars were chosen according to a random mechanism: Harry's random choice of a bar was dependent only on the bar he had visited the previous night, not on the choices prior to the previous night. What would be the ingredients necessary for the specification of a model of bar selection? We would need an initial distribution {a_k} so that when Harry's decadent period commenced he chose his initial bar to be the kth with probability a_k. We would also need transition probabilities p_{ij} which would determine the probability of choosing the jth pub if on the prior night the ith was visited. Section 2.1 begins with a construction of a Markov chain and a discussion of elementary properties. The construction also describes how one would simulate a Markov chain.

2.1. Construction and First Properties

Let us first recall how to simulate a random variable with non-negative integer values {0, 1, …}. Suppose X is a random variable with

P[X = k] = a_k, k ≥ 0, Σ_{k=0}^{∞} a_k = 1.

Let U be uniformly distributed on (0, 1). We may simulate X by observing U, and if U falls in the interval (Σ_{i=0}^{k−1} a_i, Σ_{i=0}^{k} a_i], the simulated value of X is k. (As a convention here and in what follows, set Σ_{i=0}^{−1} a_i = 0.) Now if we define

Y = Σ_{k=0}^{∞} k 1_{(Σ_{i=0}^{k−1} a_i, Σ_{i=0}^{k} a_i]}(U),

so that Y = k iff U ∈ (Σ_{i=0}^{k−1} a_i, Σ_{i=0}^{k} a_i], then Y has the same distribution as X, and we have simulated X.
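In code the recipe is only a few lines. A minimal sketch (the function name and the example distribution are our own illustrative choices):

    import random

    def simulate_discrete(a, u):
        # return k when u falls in (a_0 + ... + a_{k-1}, a_0 + ... + a_k]
        cum = 0.0
        for k, a_k in enumerate(a):
            cum += a_k
            if u <= cum:
                return k
        return len(a) - 1  # guard against floating-point round-off

    random.seed(5)
    a = [0.2, 0.5, 0.3]
    sample = [simulate_discrete(a, random.random()) for _ in range(100_000)]
    print([sample.count(k) / len(sample) for k in range(3)])  # close to a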
We now construct a Markov chain. For concreteness we assume the state space S is {0, 1, …}. Only minor modifications are necessary if the state space is finite, for example S = {0, 1, …, m}. We need an initial distribution {a_k}, where a_k ≥ 0 and Σ_k a_k = 1, to govern the choice of an initial state. We also need a transition matrix to govern transitions from state to state. A transition matrix is a matrix which in the infinite state space case is P = (p_{ij}, i ≥ 0, j ≥ 0), or, written out,

    p_00  p_01  p_02  ⋯
    p_10  p_11  p_12  ⋯
    p_20  p_21  p_22  ⋯
    ⋮

and where the entries satisfy

p_{ij} ≥ 0, Σ_j p_{ij} = 1, all i.

(In the case where the state space S is finite and equal to {0, 1, …, m}, P is (m+1) × (m+1) dimensional.)

We now construct a Markov chain {X_n, n ≥ 0}. We need a scheme which will choose an initial state k with probability a_k and will generate transitions from i to j with probability p_{ij}. Let {U_n, n ≥ 0} be iid uniform random variables on (0, 1). Define

X_0 = Σ_k k 1_{(Σ_{i=0}^{k−1} a_i, Σ_{i=0}^{k} a_i]}(U_0).

This is the construction given above, which produces a random variable that takes the value k with probability a_k. The rest of the process is defined inductively. Define the function f(i, u) with domain S × [0, 1] by

f(i, u) = Σ_k k 1_{(Σ_{j=0}^{k−1} p_{ij}, Σ_{j=0}^{k} p_{ij}]}(u),

so that f(i, u) = k iff u ∈ (Σ_{j=0}^{k−1} p_{ij}, Σ_{j=0}^{k} p_{ij}]. Now for n ≥ 0 define

X_{n+1} = f(X_n, U_{n+1}).

Note that if X_n = i, we have constructed X_{n+1} so that it equals k with probability p_{ik}. Also observe that X_0 is a function of U_0; X_1 is a function of X_0 and U_1 and hence is a function of U_0 and U_1; and so on, so that in general X_{n+1} is a function of U_0, U_1, …, U_{n+1}.
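The construction is exactly a simulation algorithm, and translating it into code makes that plain. A minimal sketch (the function names and the two-state matrix are our own illustrative choices):

    import random

    def draw(dist, u):
        # returns k iff u lies in (dist_0 + ... + dist_{k-1}, dist_0 + ... + dist_k]
        cum = 0.0
        for k, p in enumerate(dist):
            cum += p
            if u <= cum:
                return k
        return len(dist) - 1  # guard against floating-point round-off

    def simulate_chain(a, P, nsteps):
        # X_0 has distribution {a_k}; thereafter X_{n+1} = f(X_n, U_{n+1}),
        # where f(i, u) reads off a state from row i of P, as in the text
        X = [draw(a, random.random())]
        for _ in range(nsteps):
            X.append(draw(P[X[-1]], random.random()))
        return X

    random.seed(6)
    a = [1.0, 0.0]               # start in state 0
    P = [[0.9, 0.1],
         [0.4, 0.6]]             # each row sums to 1
    print(simulate_chain(a, P, 20))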
Some elementary properties of the construction follow.

1. We have

(2.1.1) P[X_0 = k] = a_k,

and for any n ≥ 0

(2.1.2) P[X_{n+1} = j | X_n = i] = p_{ij}.

This follows since the conditional probability in (2.1.2) is equal to

P[f(X_n, U_{n+1}) = j | X_n = i] = P[f(i, U_{n+1}) = j | X_n = i] = P[f(i, U_{n+1}) = j],

since U_{n+1} and X_n are independent. By the construction at the beginning of Section 2.1, this probability is p_{ij}.

2. As a generalization of (2.1.2) we show we may condition on a history with no change in the conditional probability, provided the history ends in state i. More specifically, we have

(2.1.3) P[X_{n+1} = j | X_0 = i_0, …, X_{n−1} = i_{n−1}, X_n = i] = p_{ij}

for integers i_0, i_1, …, i_{n−1}, i. As with property 1, this conditional probability is

P[f(i, U_{n+1}) = j | X_0 = i_0, …, X_n = i],

and since X_0, …, X_n are independent of U_{n+1}, the foregoing probability is

P[f(i, U_{n+1}) = j] = p_{ij}.

Processes satisfying (2.1.2) and (2.1.3) possess the Markov property, meaning

P[X_{n+1} = j | X_0 = i_0, …, X_{n−1} = i_{n−1}, X_n = i] = P[X_{n+1} = j | X_n = i].

3. As a generalization of (2.1.3), we show that the probability of the future subsequent to time n, given the history up to n, is the same as the probability of the future given only the state at time n; and this conditional probability is independent of n (but dependent on the state). Precisely, we have for any integer m and any states k_1, …, k_m

(2.1.4) P[X_{n+1} = k_1, …, X_{n+m} = k_m | X_0 = i_0, …, X_{n−1} = i_{n−1}, X_n = i]
= P[X_{n+1} = k_1, …, X_{n+m} = k_m | X_n = i]
= P[X_1 = k_1, …, X_m = k_m | X_0 = i].

In shorthand notation, denote the event [X_{n+1} = k_1, …, X_{n+m} = k_m] by [(X_j, j ≥ n+1) ∈ B]. Note that in the probability in (2.1.4) we can replace X_{n+1} by f(i, U_{n+1}), and we can replace X_{n+2} by f(X_{n+1}, U_{n+2}) = f(f(i, U_{n+1}), U_{n+2}), and so on. Thus in the probability in (2.1.4) we can replace (X_j, j ≥ n+1) by something depending only on U_j, j ≥ n+1, which is independent of X_0, …, X_n. Therefore the conditional probability is

P[(f(i, U_{n+1}), f(f(i, U_{n+1}), U_{n+2}), …) ∈ B].

Since this also equals

P[(f(i, U_1), f(f(i, U_1), U_2), …) ∈ B],

the result follows.

The three properties above are the essential characteristics of a Markov chain.

Definition. Any process {X_n, n ≥ 0} satisfying (2.1.1)–(2.1.3) is called a Markov chain with initial distribution {a_k} and transition probability matrix P.

Sometimes a transition probability matrix is called a Markov or a stochastic matrix. The constructed Markov chain has stationary transition probabilities since the conditional probability in (2.1.2) is independent of n. Sometimes a Markov chain with stationary transition probabilities is called homogeneous.

Warning. Although the constructed process possesses stationary transition probabilities, the process in general is not stationary. For the process {X_n} to be stationary, the following condition, describing a translation property of the finite dimensional distributions, must hold: for any non-negative integers m, v and any states k_0, …, k_v we have

P[X_0 = k_0, …, X_v = k_v] = P[X_m = k_0, …, X_{m+v} = k_v].

(Roughly speaking, this says the statistical evolution of the process over an interval is the same as that of the process over a translated interval.) The concept of a Markov chain being a stationary stochastic process and having stationary transition probabilities should not be confused. Conditions for the Markov chain to be stationary are discussed in Section 2.12.

The process constructed above will sometimes be referred to as the simulated Markov chain. We will show in Proposition 2.1.1 that any Markov chain {X#_n, n ≥ 0} satisfying (2.1.1) and (2.1.2) will be indistinguishable from the simulated chain {X_n} in the sense that

{X_n, n ≥ 0} =d {X#_n, n ≥ 0};

that is, the finite dimensional distributions of both processes are the same. Together, the ingredients {a_k} and P in fact determine the distribution of the process, as shown next.

Proposition 2.1.1. Given a Markov chain satisfying (2.1.1)–(2.1.3), the finite dimensional distributions are of the form

(2.1.5) P[X_0 = i_0, X_1 = i_1, …, X_k = i_k] = a_{i_0} p_{i_0 i_1} p_{i_1 i_2} ⋯ p_{i_{k−1} i_k}

for i_0, …, i_k integers in the state space and k ≥ 0 arbitrary. Conversely, given a density {a_k} and a transition matrix P and a process {X_n} whose finite dimensional distributions are given by (2.1.5), we have that {X_n} is a Markov chain satisfying (2.1.1)–(2.1.3).

So the Markov property, i.e. (2.1.2)–(2.1.3), can be recognized by the form of the finite dimensional distributions given in (2.1.5).

Proof. Recall the chain rule of conditional probability: if A_0, …, A_k are events, then

P(∩_{i=0}^{k} A_i) = P(A_k | ∩_{i=0}^{k−1} A_i) P(A_{k−1} | ∩_{i=0}^{k−2} A_i) ⋯ P(A_1 | A_0) P(A_0),

provided P(∩_{i=0}^{j} A_i) > 0 for j = 0, 1, …, k − 1.

Suppose (2.1.1)–(2.1.3) hold and set A_j = [X_j = i_j], so that if

(2.1.6) P[X_0 = i_0, …, X_j = i_j] > 0, j = 0, …, k − 1,

then

P[X_0 = i_0, …, X_k = i_k] = P(A_0) ∏_{j=1}^{k} P(A_j | A_0 ∩ ⋯ ∩ A_{j−1}),

and applying (2.1.3) to the right side we get

P[X_0 = i_0, …, X_k = i_k] = a_{i_0} p_{i_0 i_1} ⋯ p_{i_{k−1} i_k}.

What if (2.1.6) fails for some j? Let

j* = inf{j ≥ 0 : P[X_0 = i_0, …, X_j = i_j] = 0}.

If j* = 0 then a_{i_0} = 0 and (2.1.5) holds trivially, both sides being 0. If j* > 0 then by what was already proved

P[X_0 = i_0, …, X_{j*−1} = i_{j*−1}] = a_{i_0} p_{i_0 i_1} ⋯ p_{i_{j*−2} i_{j*−1}} > 0.

Consequently,

p_{i_{j*−1} i_{j*}} = P[X_0 = i_0, …, X_{j*} = i_{j*}] / P[X_0 = i_0, …, X_{j*−1} = i_{j*−1}] = 0,

so again both sides of (2.1.5) vanish and (2.1.5) holds.

Conversely, suppose for all k and choices of i_0, …, i_k that (2.1.5) holds. For

a_{i_0} p_{i_0 i_1} ⋯ p_{i_{k−1} i_k} > 0

we have

P[X_k = i_k | X_0 = i_0, …, X_{k−1} = i_{k−1}] = P[X_0 = i_0, …, X_k = i_k] / P[X_0 = i_0, …, X_{k−1} = i_{k−1}]
= (a_{i_0} p_{i_0 i_1} ⋯ p_{i_{k−1} i_k}) / (a_{i_0} p_{i_0 i_1} ⋯ p_{i_{k−2} i_{k−1}}) = p_{i_{k−1} i_k},

showing that the Markov property holds. ∎
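Formula (2.1.5) is easy to test against the simulated chain. A sketch (the two-state chain and the path (0, 0, 1) are arbitrary choices of ours):

    import random

    def draw(dist, u):
        # inverse-transform sampling, as in the construction of Section 2.1
        cum = 0.0
        for k, p in enumerate(dist):
            cum += p
            if u <= cum:
                return k
        return len(dist) - 1

    random.seed(7)
    a = [0.5, 0.5]
    P = [[0.9, 0.1],
         [0.4, 0.6]]
    path = (0, 0, 1)                  # estimate P[X_0 = 0, X_1 = 0, X_2 = 1]
    reps, hits = 200_000, 0
    for _ in range(reps):
        x = [draw(a, random.random())]
        for _ in range(len(path) - 1):
            x.append(draw(P[x[-1]], random.random()))
        hits += (tuple(x) == path)
    exact = a[0] * P[0][0] * P[0][1]  # a_{i0} p_{i0 i1} p_{i1 i2} from (2.1.5)
    print(hits / reps, exact)         # both near 0.5 * 0.9 * 0.1 = 0.045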
2.2. Examples

Here are some examples of Markov chains. Some will be used to illustrate concepts discussed later, and some show the range of applications of Markov chains.

Example 2.2.1. Independent Trials. Independence is a special case of Markov dependence. If {X_n} are iid with

P[X_0 = k] = a_k, k = 0, 1, …, m,

then

P[X_{n+1} = i_{n+1} | X_0 = i_0, …, X_n = i_n] = P[X_{n+1} = i_{n+1}] = a_{i_{n+1}},

and the transition matrix has identical rows: p_{ij} = a_j for all i.

Example 2.2.2. The Simple Branching Process. Consider the simple branching process of Section 1.4. Recall that the offspring variables {Z_{n+1}(j)} are iid with common distribution {p_k}, that Z_0 = 1, and that

Z_{n+1} = Σ_{j=1}^{Z_n} Z_{n+1}(j).

Thus

P[Z_{n+1} = i_{n+1} | Z_0 = i_0, …, Z_n = i_n] = P[Σ_{j=1}^{i_n} Z_{n+1}(j) = i_{n+1} | Z_0 = i_0, …, Z_n = i_n] = P[Σ_{j=1}^{i_n} Z_{n+1}(j) = i_{n+1}],

giving the Markov property, since this depends only on i_n and i_{n+1}. Thus

P[Z_n = j | Z_{n−1} = i] = P[Σ_{k=1}^{i} Z_n(k) = j] = p_j^{*i},

where {p_j^{*i}} denotes the i-fold convolution of {p_j} with itself.

Example 2.2.3. Random Walks. Let {X_n, n ≥ 1} be iid with

P[X_n = k] = a_k, −∞ < k < ∞.

Define the random walk by

S_0 = 0, S_n = X_1 + ⋯ + X_n, n ≥ 1.

Then {S_n} is a Markov chain, since

P[S_{n+1} = i_{n+1} | S_0 = 0, S_1 = i_1, …, S_n = i_n]
= P[X_{n+1} + i_n = i_{n+1} | S_0 = 0, …, S_n = i_n]
= P[X_{n+1} = i_{n+1} − i_n] = a_{i_{n+1} − i_n},

since X_{n+1} is independent of S_0, …, S_n.

A common special case is where only increments of ±1 and 0 are allowed and where 0 and m are absorbing barriers. The transition matrix is then of the form

    1  0  0  0  ⋯  0
    q  r  p  0  ⋯  0
    0  q  r  p  ⋯  0
    ⋮              ⋮
    0  ⋯  0  q  r  p
    0  0  ⋯  0  0  1

with p + q + r = 1, the tridiagonal structure being indicative of a random walk with steps ±1, 0. Note

P[S_n = 0 | S_{n−1} = 0] = 1 = P[S_n = m | S_{n−1} = m],

which models the hypothesized absorbing nature of states 0 and m, and

P[S_{n+1} = i + 1 | S_n = i] = p, P[S_{n+1} = i − 1 | S_n = i] = q, P[S_{n+1} = i | S_n = i] = r

for 1 ≤ i ≤ m − 1. If a_1 = 1, we have p_{i,i+1} = 1, so that the process marches deterministically through the integers towards +∞.

A common method for generating Markov chains with state space S ⊂ {0, 1, …} is the following. Suppose {Y_n, n ≥ 0} are iid random elements of some space E; for instance, E could be R or R^∞. Given two functions

g_1 : E → S, g_2 : S × E → S,

define

X_0 = g_1(Y_0), and for n ≥ 1, X_n = g_2(X_{n−1}, Y_n).

The branching process and random walks follow this paradigm, as do the following examples.

Example 2.2.6. An Inventory Model. Let I(t) be the inventory level of an item at time t. Stock levels are checked at fixed times T_0, T_1, T_2, …. A commonly used restocking policy is that there be two critical values of inventory, s and S, where 0 ≤ s < S: if at a check time the stock level is found to be at most s, the inventory is immediately replenished up to level S, while if the stock level exceeds s, no replenishment takes place. Let X_n be the stock level at time T_n and let D_{n+1} be the demand during the period (T_n, T_{n+1}]. Suppose {D_n, n ≥ 1} is iid and independent of X_0, and suppose X_0 ≤ S. Then

(2.2.1) X_{n+1} = (X_n − D_{n+1})⁺, if s < X_n ≤ S,
        X_{n+1} = (S − D_{n+1})⁺, if X_n ≤ s,

where as usual

x⁺ = x if x > 0, and x⁺ = 0 if x ≤ 0.

This follows the paradigm

X_{n+1} = g_2(X_n, D_{n+1}), n ≥ 0,

and hence {X_n} is a Markov chain. For this inventory model, a number of descriptive quantities are of interest.
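Before computing such quantities analytically, one can watch the recursion (2.2.1) run. A sketch (the Poisson demand and the parameter values s = 2, S = 10 are our own illustrative assumptions):

    import math, random

    def poisson(lam):
        # Knuth's product-of-uniforms method; adequate for small lam
        L, k, prod = math.exp(-lam), 0, random.random()
        while prod > L:
            k += 1
            prod *= random.random()
        return k

    def inventory_path(x0, s, S, lam, nsteps):
        # (2.2.1): X_{n+1} = (X_n - D_{n+1})^+ if s < X_n <= S,
        #          X_{n+1} = (S - D_{n+1})^+  if X_n <= s
        X = [x0]
        for _ in range(nsteps):
            d = poisson(lam)
            base = X[-1] if s < X[-1] <= S else S
            X.append(max(base - d, 0))
        return X

    random.seed(8)
    print(inventory_path(10, 2, 10, 3.0, 25))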