
SOME NOTES ON ERGODIC THEOREM FOR U-STATISTICS OF ORDER m FOR STATIONARY AND NOT NECESSARILY ERGODIC SEQUENCES

DAVIDE GIRAUDO

Abstract. In this note, we give sufficient conditions for the almost sure convergence and the convergence in $L^p$ of a U-statistic of order $m$ built on a strictly stationary but not necessarily ergodic sequence.
arXiv:2309.05988v1 [math.PR] 12 Sep 2023

1. Introduction and main results


Hoeffding (1948) introduced the concept of U-statistics of order $m \in \mathbb{N}^*$, defined as follows: if $(X_i)_{i \geq 1}$ is a strictly stationary sequence taking values in a measurable space $(S, \mathcal{S})$ and $h \colon S^m \to \mathbb{R}$, the U-statistic of kernel $h$ is given by

$$U_{m,n,h} := \frac{1}{\binom{n}{m}} \sum_{(i_\ell)_{\ell \in [\![1,m]\!]} \in \operatorname{Inc}^m_n} h\left(X_{i_1}, \dots, X_{i_m}\right), \tag{1.1}$$

where $[\![1,m]\!] = \{k \in \mathbb{N}, 1 \leq k \leq m\}$ and $\operatorname{Inc}^m_n = \left\{(i_\ell)_{\ell \in [\![1,m]\!]}, 1 \leq i_1 < i_2 < \dots < i_m \leq n\right\}$. If $(X_i)_{i \geq 1}$ is i.i.d. and $\mathbb{E}\left[|h(X_1, \dots, X_m)|\right]$ is finite, then $U_{m,n,h} \to \mathbb{E}\left[h(X_1, \dots, X_m)\right]$ a.s. and in $L^1$. A natural question is whether for a strictly stationary sequence $(X_i)_{i \geq 1}$, the sequence $(U_{m,n,h})_{n \geq m}$ converges almost surely or in $L^1$ to some random variable. Assume first that $(X_i)_{i \geq 1}$ is ergodic. It is shown in Aaronson et al. (1996) that if $S = \mathbb{R}$, $(X_i)_{i \geq 1}$ has common distribution $\mathbb{P}_{X_0}$, $h$ is bounded and $\mathbb{P}_{X_0} \times \dots \times \mathbb{P}_{X_0}$-almost everywhere continuous, then

$$U_{m,n,h} \to \int h(x_1, \dots, x_m) \, d\mathbb{P}_{X_0}(x_1) \dots d\mathbb{P}_{X_0}(x_m) \quad \text{a.s.} \tag{1.2}$$

Convergence in probability was also investigated in Borovkova et al. (2002). A proof of (1.2) in the context of absolutely regular sequences has been given in Arcones (1998). Moreover, a Marcinkiewicz law of large numbers for U-statistics of order two has been established in Dehling and Sharipov (2009) for absolutely regular sequences and in Giraudo (2021) for sequences expressible as functions of an independent sequence.
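Definition (1.1) can be made concrete with a short computation. The following is only a minimal sketch: the function name `u_statistic`, the example kernel and the sample values are illustrative assumptions and do not come from the paper. It enumerates the increasing index tuples of $\operatorname{Inc}^m_n$ directly, so it is meant for small $n$ only.

```python
from itertools import combinations
from math import comb

def u_statistic(xs, h, m):
    """U-statistic of order m with kernel h, computed directly from (1.1)."""
    n = len(xs)
    if n < m:
        raise ValueError("need at least m observations")
    # combinations() yields exactly the tuples with indices i_1 < ... < i_m.
    total = sum(h(*tup) for tup in combinations(xs, m))
    return total / comb(n, m)

# Hypothetical example: the kernel h(x, y) = (x - y)^2 / 2 of order 2,
# whose U-statistic coincides with the unbiased sample variance.
def half_squared_difference(x, y):
    return (x - y) ** 2 / 2

sample = [1.0, 2.0, 4.0, 7.0]
u_value = u_statistic(sample, half_squared_difference, 2)
```

With this kernel the value of `u_value` agrees with the usual unbiased sample variance of `sample`, which gives a quick sanity check of the normalisation by $\binom{n}{m}$.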
It is worth pointing out that, in general, the sequence $(U_{2,n,h})_{n \geq 2}$ may fail to converge. For instance, Aaronson et al. (1996), Example 4.5, found an unbounded kernel $h$ and a strictly stationary sequence such that $\limsup_{n \to \infty} U_{2,n,h} = \infty$. Moreover, the example given in Proposition 3 of Dehling et al. (2023) shows the existence of a bounded kernel $h$ and a stationary ergodic sequence $(X_i)_{i \geq 1}$ such that a subsequence of $(U_{2,n,h})_{n \geq 2}$ converges to 0 almost surely and another subsequence of $(U_{2,n,h})_{n \geq 2}$ converges to 1 almost surely. Also, as Proposition 4 shows, boundedness in $L^1$ of $(h(X_1, X_j))_{j \geq 2}$ plays a key role: otherwise, we can find a kernel $h$ and a strictly stationary sequence $(X_i)_{i \geq 1}$ for which the sequence $(U_{2,n,h} - \mathbb{E}[U_{2,n,h}])_{n \geq 2}$ converges to a non-degenerate normal distribution.
Some results have been established in Dehling et al. (2023), assuming that the strictly stationary sequence $(X_i)_{i \geq 1}$ is ergodic.

Date: September 13, 2023.


Key words and phrases. U-statistics, ergodic theorem, stationary sequences.

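(1) If $S$ is a separable metric space, $h \colon S \times S \to \mathbb{R}$ is a symmetric kernel that is bounded and $\mathbb{P}_{X_0} \times \mathbb{P}_{X_0}$-almost everywhere continuous, then, as $n \to \infty$, $U_{2,n,h} \to \int h(x, y) \, d\mathbb{P}_{X_0}(x) \, d\mathbb{P}_{X_0}(y)$ almost surely.
(2) If $S = \mathbb{R}^d$, the family $\{h(X_1, X_j), j \geq 1\}$ is uniformly integrable, $h$ is $\mathbb{P}_{X_0} \times \mathbb{P}_{X_0}$-almost everywhere continuous and symmetric, then

$$\lim_{n \to \infty} \mathbb{E}\left[\left| U_{2,n,h} - \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} h(x, y) \, d\mathbb{P}_{X_0}(x) \, d\mathbb{P}_{X_0}(y) \right|\right] = 0. \tag{1.3}$$

(3) If $S = \mathbb{R}^d$, the family $\{h(X_1, X_j), j \geq 1\}$ is uniformly integrable, $\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |h(x, y)| \, d\mathbb{P}_{X_0}(x) \, d\mathbb{P}_{X_0}(y)$ is finite, the random variable $X_0$ has a bounded density with respect to the Lebesgue measure on $\mathbb{R}^d$ and for each $k \geq 1$, the vector $(X_0, X_k)$ has a density $f_k$ with respect to the Lebesgue measure of $\mathbb{R}^d \times \mathbb{R}^d$ and $\sup_{k \geq 1} \sup_{s,t \in \mathbb{R}^d} f_k(s, t)$ is finite, then (1.3) holds.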
Such results lead us to consider the following extensions. The case of U-statistics of order two has been addressed and we may want to extend these results to U-statistics of arbitrary order, especially because such mathematical objects are widely used in statistics, for instance in Lyons (2013) for the distance covariance, and in stochastic geometry (see Lachièze-Rey and Reitzner (2016)). Moreover, it is a natural question to see what happens in the non-ergodic case. It is natural to consider a decomposition of $\Omega$ into ergodic components and to apply the results of the ergodic case to each of them. However, the multiple integral expression of the limit does not give a simple expression. Moreover, the assumptions of the ergodic case for each ergodic component, namely, almost everywhere continuity (for the product law of the marginal distribution) of the kernel and an assumption on the density of the vector $(X_{i_1}, \dots, X_{i_m})$, do not seem to give a tractable condition. Instead, we will use the following approach: when $h$ is symmetric and bounded, the convergence of the considered U-statistic is viewed as the convergence of random product measures toward a product of random measures (deterministic measures in the ergodic case). When we make an assumption on the density of $(X_{i_1}, \dots, X_{i_m})$, we approximate $h$ by linear combinations of products of indicator functions. This approach has similarities with the one used in Denker and Gordin (2014). The case of products of indicators then follows from an application of the usual ergodic theorem.
We will assume that the strictly stationary sequence is such that $X_i = X_0 \circ T^i$, where $T \colon \Omega \to \Omega$ is a measure preserving map. Since the convergences we will study only involve the law of $(X_i)_{i \geq 1}$, there is no loss of generality in assuming such a representation (see Cornfeld et al. (1982), page 178).
We will study the almost sure convergence of $(U_{m,n,h})_{n \geq m}$ and the convergence in $L^p$. We will denote by $\|Z\|_p := \left(\mathbb{E}\left[|Z|^p\right]\right)^{1/p}$ the norm of a real-valued random variable $Z$.
It turns out that the limit will be expressed as an integral with respect to products of a random
measure defined as follows:

$$\mu_\omega(A) = \mathbb{E}\left[\mathbf{1}_{X_0 \in A} \mid \mathcal{I}\right](\omega), \quad A \in \mathcal{B}(S), \tag{1.4}$$

where $\mathcal{I}$ denotes the σ-algebra of invariant sets, that is, the sets $E$ such that $T^{-1}E = E$. The limit of U-statistics will be expressed as an integral with respect to the product measure of $\mu_\omega$, which leads us to define

$$I_m(S, h, \omega) := \int_{S^m} h(x_1, \dots, x_m) \, d\mu_\omega(x_1) \dots d\mu_\omega(x_m). \tag{1.5}$$

We will also denote by $I_m(S, h, \cdot)$ the random variable given by

$$I_m(S, h, \cdot) \colon \omega \mapsto I_m(S, h, \omega). \tag{1.6}$$

Some assumptions will be made on the set of discontinuity points of $h$, which will be denoted by $D(h)$.
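To see why the limit in the non-ergodic case is a random variable, the following minimal sketch simulates a toy sequence that is not taken from the paper: a parameter $\theta$ is drawn once from $\{0.2, 0.8\}$ and the $X_i$ are then i.i.d. Bernoulli($\theta$) given $\theta$. Assuming the canonical shift representation, the random measure in (1.4) is then the Bernoulli($\theta(\omega)$) law, so for the kernel $h(x, y) = xy$ one expects $I_2(S, h, \omega) = \theta(\omega)^2$, whereas integrating $h$ against $\mathbb{P}_{X_0} \times \mathbb{P}_{X_0}$ gives $0.25$. All function names and numerical values are illustrative assumptions.

```python
import random

def simulate_mixture(n, rng):
    """Draw theta once, then n Bernoulli(theta) variables.

    The sequence is strictly stationary (even exchangeable) but not ergodic:
    conditionally on theta it is i.i.d. Bernoulli(theta).
    """
    theta = rng.choice([0.2, 0.8])
    xs = [1 if rng.random() < theta else 0 for _ in range(n)]
    return theta, xs

def u2_product_kernel(xs):
    """U-statistic of order 2 for h(x, y) = x * y on 0/1 data.

    With k ones among n observations, exactly k(k-1)/2 pairs contribute 1.
    """
    n, k = len(xs), sum(xs)
    return k * (k - 1) / (n * (n - 1))

rng = random.Random(0)
theta, xs = simulate_mixture(20000, rng)
u_value = u2_product_kernel(xs)
random_limit = theta ** 2        # candidate value of I_2(S, h, omega)
deterministic_guess = 0.5 ** 2   # integral of h against P_{X_0} x P_{X_0}
```

In such a run `u_value` should be close to `random_limit` and far from `deterministic_guess`, which illustrates why the product of the marginal laws is not the right limit outside the ergodic case.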

1.1. Almost sure convergence. Our first result deals with the almost sure convergence of a U-statistic under the assumption of boundedness of the kernel and negligibility of the set of discontinuity points with respect to the product of the marginal law.

Theorem 1.1. Let $(S, d)$ be a separable metric space and let $(X_i)_{i \in \mathbb{Z}}$ be a strictly stationary sequence. Suppose that $h \colon S^m \to \mathbb{R}$ satisfies the following assumptions:

(A.1.1) $h$ is symmetric, that is, $h\left(x_{\sigma(1)}, \dots, x_{\sigma(m)}\right) = h(x_1, \dots, x_m)$ for each $x_1, \dots, x_m \in S$ and each bijective $\sigma \colon [\![1,m]\!] \to [\![1,m]\!]$,
(A.1.2) $h$ is bounded and
(A.1.3) for almost every $\omega \in \Omega$, $I_m(S, \mathbf{1}_{D_h}, \omega) = 0$, where $D_h$ denotes the set of discontinuity points of $h$.

Then for almost every $\omega \in \Omega$, the following convergence holds:

$$\lim_{n \to \infty} U_{m,n,h}(\omega) = I_m(S, h, \omega), \tag{1.7}$$

where $I_m(S, h, \omega)$ is defined as in (1.5).

This result extends Theorem 1 in Dehling et al. (2023) in two directions: first, the case of U-statistics of arbitrary order is considered. Second, we address here the not necessarily ergodic case.
When $(X_i)_{i \geq 1}$ is ergodic, the measure $\mu_\omega$ is simply the distribution of $X_0$, hence the right-hand side of (1.7) can be simply expressed as $\mathbb{E}\left[h\left(X_1^{(1)}, \dots, X_1^{(m)}\right)\right]$, where $X_1^{(1)}, \dots, X_1^{(m)}$ are independent copies of $X_1$.
The symmetry assumption is needed in order to relate $U_{m,n,h}$ to a sum over a rectangle and then see the convergence in (1.7) as a convergence in distribution of products of random measures.
1.2. Convergence in $L^p$, $p \geq 1$. In this subsection, we present sufficient conditions for the convergence in $L^p$ of $(U_{m,n,h})_{n \geq m}$.
We start by mentioning the following consequence of Theorem 1.1.

Corollary 1.2. Let $(S, d)$ be a separable metric space, let $(X_i)_{i \in \mathbb{Z}}$ be a strictly stationary sequence and let $p \geq 1$. Suppose that $h \colon S^m \to \mathbb{R}$ satisfies the following assumptions:

(A.1.1) $h$ is symmetric, that is, $h\left(x_{\sigma(1)}, \dots, x_{\sigma(m)}\right) = h(x_1, \dots, x_m)$ for each $x_1, \dots, x_m \in S$ and each bijective $\sigma \colon [\![1,m]\!] \to [\![1,m]\!]$,
(A.1.2) the family $\left\{|h(X_{i_1}, \dots, X_{i_m})|^p, 1 \leq i_1 < \dots < i_m\right\}$ is uniformly integrable,
(A.1.3) the following integral is finite:

$$\int_\Omega \int_{S^m} |h(x_1, \dots, x_m)|^p \, d\mu_\omega(x_1) \dots d\mu_\omega(x_m) \, d\mathbb{P}(\omega), \tag{1.8}$$

(A.1.4) $\mathbb{P}_{X_0} \times \dots \times \mathbb{P}_{X_0}(D_h) = 0$, where $D_h$ denotes the set of discontinuity points of $h$ and $\mathbb{P}_{X_0}$ the distribution of $X_0$.

Then the following convergence takes place:

$$\lim_{n \to \infty} \left\| U_{m,n,h} - I_m(S, h, \cdot) \right\|_p = 0, \tag{1.9}$$

where $I_m(S, h, \cdot)$ is defined as in (1.6).

One can wonder what happens if we remove the symmetry assumption.

Theorem 1.3. Let $(S, d)$ be a separable metric space, let $(X_0 \circ T^i)_{i \in \mathbb{Z}}$ be a strictly stationary sequence taking values in $S$ and let $p \geq 1$. Suppose that $h \colon S^2 \to \mathbb{R}$ and $(X_0 \circ T^i)_{i \in \mathbb{Z}}$ satisfy the following assumptions:

(A.2.1) the collection $\left\{|h(X_i, X_j)|^p, 1 \leq i < j\right\}$ is uniformly integrable,
(A.2.2) for almost every $\omega \in \Omega$, $I_2(S, \mathbf{1}_{D_h}, \omega) = 0$, where $D_h$ denotes the set of discontinuity points of $h$,
(A.2.3) the following integral is finite:

$$\int_\Omega \int_{S^2} |h(x, y)|^p \, d\mu_\omega(x) \, d\mu_\omega(y) \, d\mathbb{P}(\omega). \tag{1.10}$$

Then the following convergence takes place:

$$\lim_{n \to \infty} \left\| U_{2,n,h} - I_2(S, h, \cdot) \right\|_p = 0, \tag{1.11}$$

where $I_2(S, h, \cdot)$ is defined as in (1.6).

This improves Theorem 2 in Dehling et al. (2023) under assumption (A.1) of that paper, since we do not require symmetry of the kernel.
One may wonder why we do not present a similar result for U-statistics of order $m$. A first idea would be an argument by induction on the dimension. In order to perform the induction step, say from $m = 2$ to $m = 3$, we would need to show, after a use of the weighted ergodic theorem, the convergence in $L^p$ of $\binom{n}{2}^{-1} \sum_{1 \leq i < j \leq n} h(X_{-j}, X_{-i}, X_0)$. Since we assume uniform integrability, it suffices to show the almost sure convergence, which could be established by seeing this almost sure convergence as that of a product of random measures. But without symmetry, we do not know whether the almost sure convergence of the sequence of random measures $\binom{n}{2}^{-1} \sum_{1 \leq i < j \leq n} \delta_{(X_{-j}, X_{-i})}$ takes place.
Let us now state a result on the convergence in $L^p$ without imposing any continuity of the kernel, but making assumptions on the distribution of the vectors $(X_{i_1}, \dots, X_{i_m})$.

Theorem 1.4. Let $(X_0 \circ T^i)_{i \in \mathbb{Z}}$ be a strictly stationary sequence taking values in $\mathbb{R}^d$ and let $p \geq 1$. Suppose that $h \colon (\mathbb{R}^d)^m \to \mathbb{R}$ and $(X_0 \circ T^i)_{i \in \mathbb{Z}}$ satisfy the following assumptions:

(A.3.1) the collection $\left\{|h(X_{i_1}, \dots, X_{i_m})|^p, 1 \leq i_1 < \dots < i_m\right\}$ is uniformly integrable,
(A.3.2) for each $(i_\ell)_{\ell \in [\![1,m]\!]}$ such that $1 \leq i_1 < \dots < i_m$, the vector $(X_{i_1}, \dots, X_{i_m})$ has a density $f_{i_1, \dots, i_m}$ and there exists a $q_0 > 1$ such that

$$M_1 := \sup_{(i_\ell)_{\ell \in [\![1,m]\!]} : 1 \leq i_1 < \dots < i_m} \int_{(\mathbb{R}^d)^m} f_{i_1, \dots, i_m}(t_1, \dots, t_m)^{q_0} \, dt_1 \dots dt_m < \infty, \tag{1.12}$$

(A.3.3) for almost every $\omega$, the measure $\mu_\omega$ defined as in (1.4) admits a density $f_\omega$ with respect to the Lebesgue measure and there exist a set $\Omega'$ having probability one and $q_1 > 1$ for which

$$M_2 := \sup_{\omega \in \Omega'} \int_{\mathbb{R}^d} f_\omega(t)^{q_1} \, dt < \infty, \tag{1.13}$$

(A.3.4) the following integral is finite:

$$\int_\Omega \int_{(\mathbb{R}^d)^m} |h(x_1, \dots, x_m)|^p \, d\mu_\omega(x_1) \dots d\mu_\omega(x_m) \, d\mathbb{P}(\omega). \tag{1.14}$$

Then the following convergence holds:

$$\lim_{n \to \infty} \left\| U_{m,n,h} - I_m\left(\mathbb{R}^d, h, \cdot\right) \right\|_p = 0, \tag{1.15}$$

where $I_m\left(\mathbb{R}^d, h, \cdot\right)$ is defined as in (1.6).

Assumption (A.3.2) is needed in order to approximate $h$ by a linear combination of indicator functions of products of Borel sets, uniformly with respect to the distribution of $(X_{i_1}, \dots, X_{i_m})$.

Our Theorem 1.4 improves Theorem 2 in Dehling et al. (2023) under assumption (A.2) in the following directions. First, we provide a result for U-statistics of arbitrary order. Second, the not necessarily ergodic case is addressed. Third, even in the ergodic case, our assumption only requires a uniform control on the $L^{q_1}$ norm of the densities instead of a uniform bound.

2. Proofs
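2.1. Proof of Theorem 1.1. The symmetry assumption guarantees the following decomposition

$$U_{m,n,h} = \frac{1}{m! \binom{n}{m}} \sum_{i_1, \dots, i_m = 1}^n h(X_{i_1}, \dots, X_{i_m}) - \frac{1}{m! \binom{n}{m}} \sum_{(i_\ell)_{\ell \in [\![1,m]\!]} \in J_n} h(X_{i_1}, \dots, X_{i_m}), \tag{2.1}$$

where $J_n$ denotes the set of elements $(i_\ell)_{\ell \in [\![1,m]\!]} \in [\![1,n]\!]^m$ for which there exist at least two distinct indexes $\ell$ and $\ell'$ for which $i_\ell = i_{\ell'}$. Since $h$ is bounded, $\operatorname{Card}(J_n) / \binom{n}{m}$ goes to 0 as $n$ goes to infinity and $m! \binom{n}{m} / n^m \to 1$, it suffices to prove that for almost every $\omega \in \Omega$,

$$\lim_{n \to \infty} \frac{1}{n^m} \sum_{i_1, \dots, i_m = 1}^n h(X_{i_1}(\omega), \dots, X_{i_m}(\omega)) = \int_{S^m} h(x_1, \dots, x_m) \, d\mu_\omega(x_1) \dots d\mu_\omega(x_m), \tag{2.2}$$

where $\mu_\omega$ is defined as in (1.4). Observe that for each $\omega \in \Omega$,

$$\frac{1}{n^m} \sum_{i_1, \dots, i_m = 1}^n h(X_{i_1}(\omega), \dots, X_{i_m}(\omega)) = \int_{S^m} h(x_1, \dots, x_m) \, d\nu_{n,\omega}(x_1) \dots d\nu_{n,\omega}(x_m), \tag{2.3}$$

where

$$\nu_{n,\omega} = \frac{1}{n} \sum_{i=1}^n \delta_{X_i(\omega)}. \tag{2.4}$$

Separability of $S$ guarantees the existence of a countable collection $(f_k)_{k \geq 1}$ of continuous and bounded functions from $S$ to $\mathbb{R}$ such that a sequence $(\mu_n)_{n \geq 1}$ of probability measures converges weakly to a probability measure $\mu$ if and only if for each $k \geq 1$, $\int f_k \, d\mu_n \to \int f_k \, d\mu$. By the ergodic theorem, we know that for each $k \geq 1$, there exists a set $\Omega_k$ having probability one for which the convergence

$$\lim_{n \to \infty} \int f_k(x) \, d\nu_{n,\omega}(x) = \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^n f_k(X_j(\omega)) = \mathbb{E}\left[f_k(X_0) \mid \mathcal{I}\right](\omega) = \int f_k(x) \, d\mu_\omega(x) \tag{2.5}$$

holds for each $\omega \in \Omega_k$. Therefore, for each $\omega$ belonging to the set of probability one $\Omega' := \bigcap_{k \geq 1} \Omega_k$, the sequence $(\nu_{n,\omega})_{n \geq 1}$ converges weakly to $\mu_\omega$.
Recall that Theorem 3.2 (page 21) of Billingsley (1968) shows that if $\mu_n \to \mu$ and $\mu_n' \to \mu'$ in distribution on metric spaces $S_1$ and $S_2$ respectively, then $\mu_n \times \mu_n' \to \mu \times \mu'$ in distribution on $S_1 \times S_2$. Applying this result inductively shows that for each $\omega \in \Omega'$, the $m$-fold product of $\nu_{n,\omega}$ converges weakly to the $m$-fold product of $\mu_\omega$. Since $h$ is bounded by (A.1.2) and, by (A.1.3), its set of discontinuity points is negligible for this limiting product measure for almost every $\omega$, the integrals in (2.3) converge to the right-hand side of (2.2), hence (1.7) holds for almost every $\omega \in \Omega$.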
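The counting behind (2.1) can be checked by brute force on a small example. The sketch below is illustrative only: the helper name `rectangle_decomposition`, the kernel and the data are assumptions made for the example. It verifies that, for a symmetric kernel, the sum over the whole rectangle $[\![1,n]\!]^m$ equals $m!$ times the sum over increasing indices plus the sum over the tuples with at least one repeated index.

```python
from itertools import combinations, product
from math import factorial

def rectangle_decomposition(xs, h, m):
    """Return (full, increasing, tied) sums of h over index tuples in [1, n]^m."""
    n = len(xs)
    # Sum over the whole rectangle {1, ..., n}^m.
    full = sum(h(*(xs[i] for i in idx)) for idx in product(range(n), repeat=m))
    # Sum over strictly increasing index tuples (the set Inc^m_n).
    increasing = sum(h(*(xs[i] for i in idx)) for idx in combinations(range(n), m))
    # Sum over tuples with at least one repeated index (the set J_n).
    tied = sum(
        h(*(xs[i] for i in idx))
        for idx in product(range(n), repeat=m)
        if len(set(idx)) < m
    )
    return full, increasing, tied

# Hypothetical symmetric kernel of order 3 with integer values.
def symmetric_kernel(x, y, z):
    return x * y * z + x + y + z

data = [2, -1, 3, 5, 0]
full, increasing, tied = rectangle_decomposition(data, symmetric_kernel, 3)
identity_holds = full == factorial(3) * increasing + tied
```

Integer data are used so that the comparison is exact. For a kernel that is not symmetric the identity fails in general, which is one way to see where assumption (A.1.1) enters the proof.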

2.2. Proof of Corollary 1.2. Let $h_R$ be as in (2.13). Observe that assumption (A.1.3) guarantees that $I_m(S, h, \cdot)$ defined as in (1.5) belongs to $L^p$. Moreover, the triangle inequality implies

$$\left\| U_{m,n,h} - I_m(S, h, \cdot) \right\|_p \leq \left\| U_{m,n,h} - U_{m,n,h_R} \right\|_p + \left\| U_{m,n,h_R} - I_m(S, h_R, \cdot) \right\|_p + \left\| I_m(S, h, \cdot) - I_m(S, h_R, \cdot) \right\|_p.$$

Using assumption (A.1.2) combined with the triangle inequality, one gets

$$\lim_{R \to \infty} \sup_{n \geq m} \left\| U_{m,n,h} - U_{m,n,h_R} \right\|_p = 0.$$

Moreover, assumption (A.1.3) combined with monotone convergence shows that

$$\lim_{R \to \infty} \left\| I_m(S, h, \cdot) - I_m(S, h_R, \cdot) \right\|_p = 0,$$

hence it suffices to show that for each fixed $R > 0$, $\left\| U_{m,n,h_R} - I_m(S, h_R, \cdot) \right\|_p \to 0$ as $n$ goes to infinity. This follows from an application of Theorem 1.1 with $h$ replaced by $h_R$ (note that continuity of $\varphi_R$ guarantees that $D(h_R) \subset D(h)$), which gives that $U_{m,n,h_R} \to I_m(S, h_R, \cdot)$ almost surely. Since $|U_{m,n,h_R}| \leq R$, the dominated convergence theorem allows us to conclude.

2.3. Convergence of weighted averages. The proofs of Theorems 1.3 and 1.4 rest on weighted versions of the ergodic theorem, which read as follows.

Lemma 2.1. Let $T$ be a measure preserving map on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and let $\mathcal{I}$ be the σ-algebra of $T$-invariant sets. Let $p \geq 1$ and let $(f_j)_{j \geq 1}$ be a sequence of functions such that $\|f_j - f\|_p \to 0$. Then for each $m > 0$, the following convergence holds:

$$\lim_{n \to \infty} \left\| \frac{1}{\binom{n}{m+1}} \sum_{j=m+1}^n \binom{j-1}{m} f_j \circ T^j - \mathbb{E}\left[f \mid \mathcal{I}\right] \right\|_p = 0. \tag{2.6}$$
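Proof. First observe that since $T$ is measure preserving,

$$\left\| \frac{1}{\binom{n}{m+1}} \sum_{j=m+1}^n \binom{j-1}{m} f_j \circ T^j - \mathbb{E}\left[f \mid \mathcal{I}\right] \right\|_p \leq \frac{1}{\binom{n}{m+1}} \sum_{j=m+1}^n \binom{j-1}{m} \left\| f_j - f \right\|_p + \left\| \frac{1}{\binom{n}{m+1}} \sum_{j=m+1}^n \binom{j-1}{m} f \circ T^j - \mathbb{E}\left[f \mid \mathcal{I}\right] \right\|_p. \tag{2.7}$$

The first term goes to zero as $n$ goes to infinity by the elementary fact that $\sum_{i=1}^n c_i x_i / \sum_{j=1}^n c_j \to 0$ if $x_i \to 0$, $c_j \geq 0$ and $\sum_{j=1}^n c_j \to \infty$. For the second term, we assume without loss of generality that $\mathbb{E}\left[f \mid \mathcal{I}\right] = 0$; otherwise, we replace $f$ by $f - \mathbb{E}\left[f \mid \mathcal{I}\right]$. Let $S_j := \sum_{i=1}^j f \circ T^i$. Then

$$\sum_{j=m+1}^n \binom{j-1}{m} f \circ T^j = \sum_{j=m+1}^n \binom{j-1}{m} S_j - \sum_{j=m}^{n-1} \binom{j}{m} S_j \tag{2.8}$$

and it follows that

$$\left\| \frac{1}{\binom{n}{m+1}} \sum_{j=m+1}^n \binom{j-1}{m} f \circ T^j \right\|_p \leq \frac{\binom{n-1}{m}}{\binom{n}{m+1}} \left\| S_n \right\|_p + \frac{1}{\binom{n}{m+1}} \sum_{j=m}^{n-1} \left( \binom{j}{m} - \binom{j-1}{m} \right) \left\| S_j \right\|_p. \tag{2.9}$$

Since $\binom{n-1}{m} / \binom{n}{m+1} = (m+1)/n$ and $\|S_n\|_p / n \to 0$, the first term of the right-hand side of (2.9) goes to 0 as $n$ goes to infinity. For the second term, one has for each $m \leq R \leq n$ that

$$\frac{1}{\binom{n}{m+1}} \sum_{j=m}^{n-1} \left( \binom{j}{m} - \binom{j-1}{m} \right) \left\| S_j \right\|_p \leq \frac{R}{\binom{n}{m+1}} \sum_{j=m}^{R} \left( \binom{j}{m} - \binom{j-1}{m} \right) \left\| S_j \right\|_p + \frac{1}{\binom{n}{m+1}} \sum_{j=R}^{n-1} \left( \binom{j}{m} - \binom{j-1}{m} \right) j \, \sup_{k \geq R} \frac{\left\| S_k \right\|_p}{k}, \tag{2.10}$$

hence, using that $\left( \binom{j}{m} - \binom{j-1}{m} \right) j = m \binom{j}{m}$ and $\sum_{j=m}^{n-1} \binom{j}{m} = \binom{n}{m+1}$,

$$\limsup_{n \to \infty} \frac{1}{\binom{n}{m+1}} \sum_{j=m}^{n-1} \left( \binom{j}{m} - \binom{j-1}{m} \right) \left\| S_j \right\|_p \leq m \sup_{k \geq R} \frac{\left\| S_k \right\|_p}{k} \tag{2.11}$$

and we conclude using again that $\|S_k\|_p / k \to 0$ as $k$ goes to infinity. □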
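The normalisation in (2.6) relies on elementary binomial identities, which can be checked numerically. The sketch below is illustrative and its function names are assumptions, not notation from the paper. It verifies that the weights $\binom{j-1}{m} / \binom{n}{m+1}$, $j = m+1, \dots, n$, sum to one, so that the left-hand side of (2.6) is a genuine weighted average, and that the coefficient of $\|S_n\|_p$ in (2.9) equals $(m+1)/n$.

```python
from math import comb

def weights(n, m):
    """Weights binom(j-1, m) / binom(n, m+1), j = m+1, ..., n, from (2.6)."""
    return [comb(j - 1, m) / comb(n, m + 1) for j in range(m + 1, n + 1)]

def weights_sum_exactly_to_one(n, m):
    """Check the identity sum_{j=m+1}^{n} binom(j-1, m) = binom(n, m+1)."""
    return sum(comb(j - 1, m) for j in range(m + 1, n + 1)) == comb(n, m + 1)

def first_term_ratio(n, m):
    """Coefficient of ||S_n||_p in (2.9); it equals (m + 1) / n."""
    return comb(n - 1, m) / comb(n, m + 1)

checks = [
    weights_sum_exactly_to_one(n, m)
    for m in range(1, 5)
    for n in range(m + 1, 30)
]
```

For $m = 1$ the weights are proportional to $j - 1$, which is the case used in the proof of Theorem 1.3.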

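2.4. Proof of Theorem 1.3. The proofs will lead us to consider truncated versions of the kernel $h$. Define for each fixed $R > 0$ the maps $\varphi_R \colon \mathbb{R} \to \mathbb{R}$ by

$$\varphi_R(t) := \begin{cases} -R & \text{if } t < -R, \\ t & \text{if } -R \leq t < R, \\ R & \text{if } t \geq R \end{cases} \tag{2.12}$$

and $h_R \colon S^m \to \mathbb{R}$ by

$$h_R(x_1, \dots, x_m) := \varphi_R\left(h(x_1, \dots, x_m)\right), \quad x_1, \dots, x_m \in S. \tag{2.13}$$

Then $|h_R|$ is bounded by $R$ and since $D_{h_R} \subset D_h$, the equality $\mathbb{P}_{X_0} \times \mathbb{P}_{X_0}(D_{h_R}) = 0$ holds. We claim that it suffices to prove that (1.11) holds for each $R$ with $h$ replaced by $h_R$. Indeed, by the triangle inequality,

$$\sup_{n \geq 2} \left\| U_{2,n,h} - U_{2,n,h_R} \right\|_p \leq \sup_{1 \leq i < j} \left\| h(X_i, X_j) \mathbf{1}_{|h(X_i, X_j)| > R} \right\|_p \tag{2.14}$$

and

$$\int_\Omega \left| \int_{S^2} h(x, y) \, d\mu_\omega(x) \, d\mu_\omega(y) - \int_{S^2} h_R(x, y) \, d\mu_\omega(x) \, d\mu_\omega(y) \right|^p d\mathbb{P}(\omega) \leq \int_\Omega \int_{S^2} |h(x, y)|^p \mathbf{1}_{|h(x,y)| > R} \, d\mu_\omega(x) \, d\mu_\omega(y) \, d\mathbb{P}(\omega), \tag{2.15}$$

hence assumptions (A.2.1) and (A.2.3) allow us to choose $R$ making the previous quantities as small as we wish. Defining

$$d_{j,R} := \frac{1}{j} \sum_{i=1}^{j-1} h_R(X_{-i}, X_0), \tag{2.16}$$

we get that

$$U_{2,n,h_R} = \frac{1}{\binom{n}{2}} \sum_{j=2}^n j \, d_{j,R} \circ T^j. \tag{2.17}$$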
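We first show that there exists a set of probability one $\Omega'$ such that for each $\omega \in \Omega'$,

$$d_{j,R}(\omega) \to \int_{S^2} h_R(x, y) \, d\mu_\omega(x) \, d\delta_{X_0(\omega)}(y) =: Y_R(\omega). \tag{2.18}$$

First, separability of $S$ guarantees the existence of a countable collection $(f_k)_{k \geq 1}$ of continuous and bounded functions from $S$ to $\mathbb{R}$ such that a sequence $(\mu_n)_{n \geq 1}$ of probability measures converges weakly to a probability measure $\mu$ if and only if for each $k \geq 1$, $\int f_k \, d\mu_n \to \int f_k \, d\mu$.
Taking $\mu_{n,\omega} := n^{-1} \sum_{i=1}^n \delta_{X_{-i}(\omega)}$, the ergodic theorem furnishes for each $k \geq 1$ a set $\Omega_k$ having probability one for which the convergence

$$\lim_{n \to \infty} \int f_k(x) \, d\mu_{n,\omega}(x) = \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^n f_k(X_{-j}(\omega)) = \mathbb{E}\left[f_k(X_0) \mid \mathcal{I}\right](\omega) = \int f_k(x) \, d\mu_\omega(x) \tag{2.19}$$

holds for each $\omega \in \Omega_k$. Consequently, for each $\omega \in \Omega' := \bigcap_{k \geq 1} \Omega_k$, one has $\mu_{n,\omega} \to \mu_\omega$ weakly in $S$ and by Theorem 3.2 (page 21) of Billingsley (1968), we get that $\mu_{n,\omega} \times \delta_{X_0(\omega)} \to \mu_\omega \times \delta_{X_0(\omega)}$ weakly in $S^2$. Since $h_R$ is bounded and for almost every $\omega \in \Omega$, $\mu_\omega \times \delta_{X_0(\omega)}(D_{h_R}) = 0$, we get (2.18). Moreover, $|d_{j,R}| \leq R$, hence by dominated convergence,

$$\lim_{j \to \infty} \left\| d_{j,R} - \int_{S^2} h_R(x, y) \, d\mu_\omega(x) \, d\delta_{X_0(\omega)}(y) \right\|_p = 0. \tag{2.20}$$

By (2.17) and the fact that $T$ is measure preserving, we infer that

$$\left\| U_{2,n,h_R} - \mathbb{E}\left[Y_R \mid \mathcal{I}\right] \right\|_p \leq \frac{1}{\binom{n}{2}} \sum_{j=2}^n j \left\| d_{j,R} - Y_R \right\|_p + \left\| \frac{1}{\binom{n}{2}} \sum_{j=2}^n j \, Y_R \circ T^j - \mathbb{E}\left[Y_R \mid \mathcal{I}\right] \right\|_p. \tag{2.21}$$

An application of (2.18) combined with the dominated convergence theorem shows that the first term of the right-hand side of (2.21) goes to 0 as $n$ goes to infinity. Then Lemma 2.1 with $m = 1$ shows that $\left\| U_{2,n,h_R} - \mathbb{E}\left[Y_R \mid \mathcal{I}\right] \right\|_p \to 0$. It remains to check that

$$\mathbb{E}\left[Y_R \mid \mathcal{I}\right](\omega) = \int_{S^2} h_R(x, y) \, d\mu_\omega(x) \, d\mu_\omega(y). \tag{2.22}$$

Observe that by the ergodic theorem, $\mathbb{E}\left[Y_R \mid \mathcal{I}\right](\omega) = \lim_{n \to \infty} n^{-1} \sum_{k=1}^n Y_R\left(T^k \omega\right)$. Since $\mu_{T^k \omega} = \mu_\omega$, it follows that

$$\mathbb{E}\left[Y_R \mid \mathcal{I}\right](\omega) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^n \int_{S^2} h_R(x, y) \, d\mu_\omega(x) \, d\delta_{X_k(\omega)}(y). \tag{2.23}$$

Using similar arguments as before gives that for almost every $\omega$, $\mu_\omega \times \left(n^{-1} \sum_{k=1}^n \delta_{X_k(\omega)}\right)$ converges in distribution to $\mu_\omega \times \mu_\omega$. This ends the proof of Theorem 1.3.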
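The regrouping behind (2.16) and (2.17) amounts to summing over pairs $i < j$ by the value of the larger index $j$. The sketch below checks this on a small data set; it is illustrative only, the function names, kernel and data are assumptions, and the shift $T^j$ is emulated by re-indexing the list. A kernel that is not symmetric is used on purpose, since Theorem 1.3 does not require symmetry.

```python
from itertools import combinations
from math import comb

def u2_direct(xs, h):
    """U_{2,n,h}: average of h(X_i, X_j) over pairs i < j."""
    n = len(xs)
    return sum(h(x, y) for x, y in combinations(xs, 2)) / comb(n, 2)

def u2_regrouped(xs, h):
    """Same quantity, grouped by the larger index j as in (2.16)-(2.17)."""
    n = len(xs)
    total = 0.0
    for j in range(2, n + 1):
        # d_j composed with T^j: (1/j) * sum_{i=1}^{j-1} h(X_{j-i}, X_j),
        # where X_k is stored in xs[k - 1].
        d_j_shifted = sum(h(xs[j - i - 1], xs[j - 1]) for i in range(1, j)) / j
        total += j * d_j_shifted
    return total / comb(n, 2)

# Hypothetical non-symmetric kernel: the order of the arguments matters.
def asymmetric_kernel(x, y):
    return x - 2 * y + x * y

data = [0.5, -1.0, 2.0, 3.5, 1.0, -0.5]
direct = u2_direct(data, asymmetric_kernel)
regrouped = u2_regrouped(data, asymmetric_kernel)
```

Both functions return the same number up to rounding, which is the content of (2.17) with $h_R$ replaced by an arbitrary kernel.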

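2.5. Proof of Theorem 1.4. We start by proving Theorem 1.4 in the case where $h(x_1, \dots, x_m) = \prod_{\ell=1}^m \mathbf{1}_{x_\ell \in A_\ell}$. We show by induction over $m$ that if $A_\ell$, $\ell \in [\![1,m]\!]$, are Borel subsets of $\mathbb{R}^d$ and $(X_0 \circ T^i)_{i \in \mathbb{Z}}$ is a stationary sequence with invariant σ-algebra $\mathcal{I}$, then

$$\lim_{n \to \infty} \left\| \frac{1}{\binom{n}{m}} \sum_{(i_\ell)_{\ell \in [\![1,m]\!]} \in \operatorname{Inc}^m_n} \prod_{\ell=1}^m \mathbf{1}_{X_{i_\ell} \in A_\ell} - \prod_{\ell=1}^m \mathbb{E}\left[\mathbf{1}_{X_0 \in A_\ell} \mid \mathcal{I}\right] \right\|_p = 0. \tag{2.24}$$

The case $m = 1$ is a direct consequence of the ergodic theorem. Let us show the case $m = 2$. We start from

$$\frac{1}{\binom{n}{2}} \sum_{1 \leq i < j \leq n} \mathbf{1}_{X_i \in A_1} \mathbf{1}_{X_j \in A_2} = \frac{1}{\binom{n}{2}} \sum_{j=2}^n j \left( \frac{1}{j} \sum_{i=1}^{j-1} \mathbf{1}_{X_{-i} \in A_1} \mathbf{1}_{X_0 \in A_2} \right) \circ T^j. \tag{2.25}$$

Let $f_j := \frac{1}{j} \sum_{i=1}^{j-1} \mathbf{1}_{X_{-i} \in A_1} \mathbf{1}_{X_0 \in A_2}$. By the ergodic theorem, $f_j \to \mathbb{E}\left[\mathbf{1}_{X_0 \in A_1} \mid \mathcal{I}\right] \mathbf{1}_{X_0 \in A_2}$ almost surely and, since $|f_j| \leq 1$, also in $L^p$. Applying Lemma 2.1 with $m = 1$ and noting that the conditional expectation of this limit with respect to $\mathcal{I}$ is $\mathbb{E}\left[\mathbf{1}_{X_0 \in A_1} \mid \mathcal{I}\right] \mathbb{E}\left[\mathbf{1}_{X_0 \in A_2} \mid \mathcal{I}\right]$ gives (2.24) for $m = 2$.
Suppose now that (2.24) holds for all Borel subsets $A_1, \dots, A_m$ of $\mathbb{R}^d$ and each strictly stationary sequence $(X_0 \circ T^i)_{i \in \mathbb{Z}}$. Let $A_1, \dots, A_{m+1}$ be Borel subsets of $\mathbb{R}^d$ and let $(X_0 \circ T^i)_{i \in \mathbb{Z}}$ be a strictly stationary sequence. We start from

$$\frac{1}{\binom{n}{m+1}} \sum_{(i_\ell)_{\ell \in [\![1,m+1]\!]} \in \operatorname{Inc}^{m+1}_n} \prod_{\ell \in [\![1,m+1]\!]} \mathbf{1}_{X_{i_\ell} \in A_\ell} = \frac{1}{\binom{n}{m+1}} \sum_{j=m+1}^n \left( \mathbf{1}_{X_0 \in A_{m+1}} \sum_{(i_\ell)_{\ell \in [\![1,m]\!]} \in \operatorname{Inc}^m_{j-1}} \prod_{\ell \in [\![1,m]\!]} \mathbf{1}_{X_{i_\ell - j} \in A_\ell} \right) \circ T^j. \tag{2.26}$$

Define

$$f_j := \frac{1}{\binom{j-1}{m}} \mathbf{1}_{X_0 \in A_{m+1}} \sum_{(i_\ell)_{\ell \in [\![1,m]\!]} \in \operatorname{Inc}^m_{j-1}} \prod_{\ell \in [\![1,m]\!]} \mathbf{1}_{X_{i_\ell - j} \in A_\ell}. \tag{2.27}$$

Doing the changes of index $k_1 = j - i_m, \dots, k_m = j - i_1$, the previous expression can be rewritten as

$$f_j = \mathbf{1}_{X_0 \in A_{m+1}} \frac{1}{\binom{j-1}{m}} \sum_{(k_\ell)_{\ell \in [\![1,m]\!]} \in \operatorname{Inc}^m_{j-1}} \prod_{\ell \in [\![1,m]\!]} \mathbf{1}_{X_{-k_\ell} \in A_{m-\ell+1}} \tag{2.28}$$

and using the induction assumption, we derive that

$$\lim_{j \to \infty} \left\| f_j - \mathbf{1}_{X_0 \in A_{m+1}} \prod_{\ell \in [\![1,m]\!]} \mathbb{E}\left[\mathbf{1}_{X_0 \in A_{m-\ell+1}} \mid \mathcal{I}\right] \right\|_p = 0. \tag{2.29}$$

Then we conclude by (2.6).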


We now show (1.15) in the general case. Fix a positive $\varepsilon$ and define for a positive $K$

$$h^{(K)}(x_1, \dots, x_m) = h(x_1, \dots, x_m) \mathbf{1}_{|h(x_1, \dots, x_m)| \leq K} \prod_{\ell=1}^m \mathbf{1}_{|x_\ell|_d \leq K}, \tag{2.30}$$

where $|\cdot|_d$ denotes the Euclidean norm on $\mathbb{R}^d$. Observe that by the triangle inequality,

$$\left\| U_{m,n,h} - U_{m,n,h^{(K)}} \right\|_p \leq \sup_{1 \leq i_1 < \dots < i_m} \left\| h(X_{i_1}, \dots, X_{i_m}) \mathbf{1}_{|h(X_{i_1}, \dots, X_{i_m})| > K} \right\|_p + \sum_{\ell=1}^m \sup_{1 \leq i_1 < \dots < i_m} \left\| h(X_{i_1}, \dots, X_{i_m}) \mathbf{1}_{|X_{i_\ell}|_d > K} \right\|_p, \tag{2.31}$$

hence by assumption (A.3.1), we can find $K'$ such that for each $K \geq K'$,

$$\sup_{n \geq m} \left\| U_{m,n,h} - U_{m,n,h^{(K)}} \right\|_p \leq \varepsilon. \tag{2.32}$$

Moreover, by assumption (A.3.4), we can choose $K''$ such that for each $K \geq K''$,

$$\int_\Omega \left| I_m\left(\mathbb{R}^d, h - h^{(K)}, \omega\right) \right|^p d\mathbb{P}(\omega) \leq \varepsilon^p. \tag{2.33}$$

Let $K_0 = \max\{K', K''\}$. Observe that in assumptions (A.3.2) and (A.3.3), we can assume without loss of generality that $q_0 = q_1$. By standard results in measure theory, we know that we can find an integer $J$, constants $c_1, \dots, c_J$ and Borel subsets $A_{\ell,j}$, $\ell \in [\![1,m]\!]$, $j \in [\![1,J]\!]$, such that
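$$\int_{(\mathbb{R}^d)^m} \left| h^{(K_0)}(x_1, \dots, x_m) - \widetilde{h}^{(K_0)}(x_1, \dots, x_m) \right|^{p \frac{q_0}{q_0 - 1}} dx_1 \dots dx_m < (M_1 + M_2)^{p} \, \varepsilon^{\frac{p q_0}{q_0 - 1}}, \tag{2.34}$$

where

$$\widetilde{h}^{(K_0)}(x_1, \dots, x_m) = \sum_{j=1}^J c_j \prod_{\ell=1}^m \mathbf{1}_{x_\ell \in A_{\ell,j}}. \tag{2.35}$$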

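Notice that for each $1 \leq i_1 < \dots < i_m$,

$$\left\| h^{(K_0)}(X_{i_1}, \dots, X_{i_m}) - \widetilde{h}^{(K_0)}(X_{i_1}, \dots, X_{i_m}) \right\|_p^p = \int_{(\mathbb{R}^d)^m} \left| h^{(K_0)}(x_1, \dots, x_m) - \widetilde{h}^{(K_0)}(x_1, \dots, x_m) \right|^p f_{i_1, \dots, i_m}(x_1, \dots, x_m) \, dx_1 \dots dx_m, \tag{2.36}$$

hence using Hölder's inequality, (2.34) and assumption (A.3.2), we derive that

$$\sup_{1 \leq i_1 < \dots < i_m} \left\| h^{(K_0)}(X_{i_1}, \dots, X_{i_m}) - \widetilde{h}^{(K_0)}(X_{i_1}, \dots, X_{i_m}) \right\|_p \leq \varepsilon \tag{2.37}$$

and by the triangle inequality,

$$\sup_{N \geq m} \left\| U_{m,N,h^{(K_0)}} - U_{m,N,\widetilde{h}^{(K_0)}} \right\|_p \leq \varepsilon. \tag{2.38}$$

Moreover, using Hölder's inequality, we find that

$$\int_\Omega \left| I_m\left(\mathbb{R}^d, h^{(K_0)} - \widetilde{h}^{(K_0)}, \omega\right) \right|^p d\mathbb{P}(\omega) \leq \varepsilon^p. \tag{2.39}$$

As a consequence,

$$\begin{aligned} \left\| U_{m,n,h} - I_m\left(\mathbb{R}^d, h, \cdot\right) \right\|_p \leq{} & \sup_{N \geq m} \left\| U_{m,N,h} - U_{m,N,h^{(K_0)}} \right\|_p + \sup_{N \geq m} \left\| U_{m,N,h^{(K_0)}} - U_{m,N,\widetilde{h}^{(K_0)}} \right\|_p \\ & + \left\| U_{m,n,\widetilde{h}^{(K_0)}} - I_m\left(\mathbb{R}^d, \widetilde{h}^{(K_0)}, \cdot\right) \right\|_p + \left\| I_m\left(\mathbb{R}^d, \widetilde{h}^{(K_0)}, \cdot\right) - I_m\left(\mathbb{R}^d, h^{(K_0)}, \cdot\right) \right\|_p \\ & + \left\| I_m\left(\mathbb{R}^d, h^{(K_0)}, \cdot\right) - I_m\left(\mathbb{R}^d, h, \cdot\right) \right\|_p. \end{aligned}$$

By (2.24) and (2.35), we can find $n_0$ such that for each $n \geq n_0$, $\left\| U_{m,n,\widetilde{h}^{(K_0)}} - I_m\left(\mathbb{R}^d, \widetilde{h}^{(K_0)}, \cdot\right) \right\|_p \leq \varepsilon$. Combining this with (2.32), (2.38), (2.39) and (2.33), which bound the four other terms by $\varepsilon$ each, we derive that for such $n$'s, $\left\| U_{m,n,h} - I_m\left(\mathbb{R}^d, h, \cdot\right) \right\|_p \leq 5\varepsilon$. This ends the proof of Theorem 1.4.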

References
J. Aaronson, R. Burton, H. Dehling, D. Gilat, T. Hill, and B. Weiss. Strong laws for L- and U-statistics. Trans. Amer. Math. Soc., 348(7):2845–2866, 1996. ISSN 0002-9947. URL https://doi.org/10.1090/S0002-9947-96-01681-9.
M. A. Arcones. The law of large numbers for U-statistics under absolute regularity. Electron. Comm. Probab., 3:13–19, 1998. MR 1624866.
P. Billingsley. Convergence of probability measures. John Wiley & Sons Inc., New York, 1968.
S. Borovkova, R. Burton, and H. Dehling. From dimension estimation to asymptotics of dependent U-statistics. In Limit theorems in probability and statistics, Vol. I (Balatonlelle, 1999), pages 201–234. János Bolyai Math. Soc., Budapest, 2002.
I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinaĭ. Ergodic theory, volume 245 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, New York, 1982. ISBN 0-387-90580-4. URL http://dx.doi.org/10.1007/978-1-4615-6927-5. Translated from the Russian by A. B. Sosinskiĭ.
H. Dehling, D. Giraudo, and D. Volny. Some remarks on the ergodic theorem for U-statistics, 2023. URL https://arxiv.org/abs/2302.04539, to appear in Comptes Rendus Mathématique.
H. G. Dehling and O. Sh. Sharipov. Marcinkiewicz-Zygmund strong laws for U-statistics of weakly dependent observations. Statist. Probab. Lett., 79(19):2028–2036, 2009. MR 2571765.
M. Denker and M. Gordin. Limit theorems for von Mises statistics of a measure preserving transformation. Probab. Theory Related Fields, 160(1-2):1–45, 2014. ISSN 0178-8051. URL https://doi.org/10.1007/s00440-013-0522-z.
D. Giraudo. Limit theorems for U-statistics of Bernoulli data. ALEA Lat. Am. J. Probab. Math. Stat., 18(1):793–828, 2021. MR 4243516.
W. Hoeffding. A class of statistics with asymptotically normal distribution. Ann. Math. Statistics, 19:293–325, 1948. ISSN 0003-4851. URL https://doi.org/10.1214/aoms/1177730196.
R. Lachièze-Rey and M. Reitzner. U-statistics in stochastic geometry. In Stochastic analysis for Poisson point processes, volume 7 of Bocconi Springer Ser., pages 229–253. Bocconi Univ. Press, 2016.
R. Lyons. Distance covariance in metric spaces. Ann. Probab., 41(5):3284–3305, 2013. ISSN 0091-1798. URL https://doi.org/10.1214/12-AOP803.
(†) Institut de Recherche Mathématique Avancée UMR 7501, Université de Strasbourg and CNRS 7 rue
René Descartes 67000 Strasbourg, France
