Introduction To Polynomial Chaos With NISP: Michael Baudin (EDF), Jean-Marc Martinez (CEA), January 2013
Abstract
This document is an introduction to polynomial chaos with the NISP module for Scilab.
Contents
1 Orthogonal polynomials
1.1 Definitions and examples
1.2 Orthogonal polynomials for probabilities
1.3 Hermite polynomials
1.4 Legendre polynomials
1.5 Laguerre polynomials
1.6 Chebyshev polynomials of the first kind
1.7 Accuracy of evaluation
1.8 Notes and references
2 Multivariate polynomials
2.1 Occupancy problems
2.2 Multivariate monomials
2.3 Multivariate polynomials
2.4 Generating multi-indices
2.5 Multivariate orthogonal functions
2.6 Tensor product of orthogonal polynomials
2.7 Multivariate orthogonal polynomials and probabilities
2.8 Notes and references
3 Polynomial chaos
3.1 Introduction
3.2 Truncated decomposition
3.3 Univariate decomposition examples
3.4 Generalized polynomial chaos
3.5 Transformations
3.6 Notes and references
4 Thanks
A Gaussian integrals
A.1 Gaussian integral
A.2 Weighted integral of x^n
Bibliography
Copyright
© 2013 - Michael Baudin
This file must be used under the terms of the Creative Commons Attribution-ShareAlike 3.0
Unported License:
http://creativecommons.org/licenses/by-sa/3.0
1 Orthogonal polynomials
In this section, we define the orthogonal polynomials used in the context of polynomial chaos decomposition. We first present the weight function, which defines the orthogonality of square integrable functions. Then we present the Hermite, Legendre, Laguerre and Chebyshev orthogonal polynomials.
1.1 Definitions and examples

Example (Weight function for Hermite polynomials) The weight function for Hermite polynomials is

w(x) = exp(-x^2/2),  (1)

for x ∈ R. This function is presented in the figure 1. Clearly, this function is differentiable and nonnegative, since exp(-x^2/2) > 0, for any finite x. The integral

∫_R exp(-x^2/2) dx = √(2π)  (2)
is called the Gaussian integral (see the proposition A.1 for the proof).
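As a quick illustration, the following Scilab sketch approximates this integral with the built-in intg integrator. Since intg requires finite bounds, the real line is truncated to [-10, 10], an assumption that is harmless given the fast decay of the weight.

// A sketch checking the Gaussian integral numerically.
// Assumption: the tail contribution beyond [-10,10] is negligible.
function y = gausw ( x )
    y = exp ( - x ^2/2)
endfunction
I = intg ( -10 , 10 , gausw );
disp ([ I sqrt (2* %pi )]) // both entries should be close to 2.5066283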
Definition 1.2. (Weighted L2 space in R) Let I be an interval in R and let L2w(I) be the set of functions g which are square integrable with respect to the weight function w, i.e. such that the integral

‖g‖^2 = ∫_I g(x)^2 w(x) dx  (3)

is finite.
Example (Functions in L2w(R)) Consider the Gaussian weight w(x) = exp(-x^2/2), for x ∈ R. Obviously, the function g(x) = 1 is in L2w(R).
Now consider the function

g(x) = x  (4)

for x ∈ R. The proposition A.3 (see in appendix) proves that

∫_R x^2 exp(-x^2/2) dx = √(2π),  (5)

so that g is in L2w(R) as well.
Figure 1: The Gaussian weight function w(x) = exp(-x^2/2).
Definition 1.3. (Inner product in L2w(I) space) For any g, h ∈ L2w(I), the inner product of g and h is

⟨g, h⟩ = ∫_I g(x) h(x) w(x) dx.  (6)

Assume that g ∈ L2w(I). We can combine the equations 3 and 6, which implies that the L2w(I) norm of g can be expressed as an inner product:

‖g‖^2 = ⟨g, g⟩.  (7)
Example (Inner product in L2w(R)) Consider the Gaussian weight w(x) = exp(-x^2/2), for x ∈ R. Then consider the functions g(x) = 1 and h(x) = x. We have

∫_R g(x) h(x) w(x) dx = ∫_R x exp(-x^2/2) dx  (8)
= ∫_{x≤0} x exp(-x^2/2) dx + ∫_{x≥0} x exp(-x^2/2) dx.  (9)

However,

∫_{x≤0} x exp(-x^2/2) dx = - ∫_{x≥0} x exp(-x^2/2) dx,  (10)

by symmetry. Therefore,

∫_R g(x) h(x) w(x) dx = 0.  (11)
The L2w (I) space is a Hilbert space.
Definition 1.4. (Orthogonality in L2w(I) space) Let I be an interval of R. Let g and h be two functions in L2w(I). The functions g and h are orthogonal if

⟨g, h⟩ = 0.  (12)

A family {Pn}n≥0 of polynomials, where Pn has degree n, is a family of orthogonal polynomials if

⟨Pi, Pj⟩ = 0  (14)

for i ≠ j.
We assume that, for all orthogonal polynomials, the degree zero polynomial P0 is equal to one:

P0(x) = 1,  (15)

for any x ∈ I.
Proposition 1.7. (Integral of orthogonal polynomials) Let {Pn}n≥0 be orthogonal polynomials. We have

∫_I P0(x) w(x) dx = ∫_I w(x) dx  (16)

and

∫_I Pn(x) w(x) dx = 0,  (17)

for n ≥ 1.
Proof. The equation 16 is a straightforward consequence of 15. Moreover, for any n ≥ 1, we have

∫_I Pn(x) w(x) dx = ∫_I P0(x) Pn(x) w(x) dx  (18)
= ⟨P0, Pn⟩  (19)
= 0,  (20)

by the orthogonality of P0 and Pn.
Figure 2: The function Hen(x)w(x), for n = 0, 1, ..., 5, and the Gaussian weight w(x) = exp(-x^2/2).
1.2 Orthogonal polynomials for probabilities

Example (Distribution function for Hermite polynomials) The distribution function for Hermite polynomials is

f(x) = (1/√(2π)) exp(-x^2/2),  (24)

for x ∈ R.
Figure 3: The Gaussian distribution function f(x) = (1/√(2π)) exp(-x^2/2).
Proposition 1.9. (Expectation of orthogonal polynomials) Let {Pn}n≥0 be orthogonal polynomials. Assume that X is a random variable associated with the probability distribution function f, derived from the weight function w. We have

E(P0(X)) = 1

and

E(Pn(X)) = 0,

for n ≥ 1.
Proof. Indeed,

E(Pn(X)) = (∫_I Pn(x) w(x) dx) / (∫_I w(x) dx) = 0,

for n ≥ 1, where the first equation derives from the equation 21, and the second equality is implied by 17.
Proposition 1.10. (Variance of orthogonal polynomials) Let {Pn}n≥0 be orthogonal polynomials. Assume that X is a random variable associated with the probability distribution function f, derived from the weight function w. We have

V(P0(X)) = 0  (33)

and

V(Pn(X)) = ‖Pn‖^2 / ∫_I w(x) dx,  (34)

for n ≥ 1.
Proof. The equation 33 is implied by the fact that P0 is a constant. Moreover, for n ≥ 1, the proposition 1.9 implies V(Pn(X)) = E(Pn(X)^2) - E(Pn(X))^2 = E(Pn(X)^2), which leads to the equation 34.
Figure 4: Histogram of N = 10000 outcomes of Hen (X), where X is a standard normal random
variable, for n = 0, 1, ..., 5.
Example In order to experimentally check the propositions 1.9 and 1.10, we consider Hermite polynomials. We generate 10000 pseudorandom outcomes of a standard normal variable X, and compute Hen(X). The figure 4 presents the empirical histograms of Hen(X), for n = 0, 1, ..., 5. The histogram for n = 0 is concentrated on the value 1, since He0(x) = 1. This confirms that E(He0(X)) = 1 and V(He0(X)) = 0. The histograms for n = 1, 3, 5 are symmetric, which is consistent with the fact that E(Hen(X)) = 0, for n ≥ 1. The following Scilab session presents more detailed numerical results. The first column prints n, the second column prints the empirical mean, the third column prints the empirical variance and the last column prints n!, which is the exact value of the variance.
n mean ( X ) variance ( X ) n!
0. 1. 0. 1.
1. 0.0041817 0.9978115 1.
2. - 0.0021810 2.0144023 2.
3. 0.0179225 6.1480027 6.
4. 0.0231110 24.483042 24.
5. - 0.0114358 112.13277 120.
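The session above can be reproduced by the following sketch, which samples X with grand and evaluates Hen(X) by the three-term recurrence. The sample size is an assumption, and the printed digits will of course differ from run to run.

// A sketch of the experiment: sample X ~ N(0,1) and evaluate
// He_n(X) by the recurrence He_n = X*He_{n-1} - (n-1)*He_{n-2}.
N = 10000;
X = grand ( N , 1 , " nor " , 0 , 1 );
yn2 = ones ( N , 1 ); // He_0 ( X )
yn1 = X ; // He_1 ( X )
mprintf ( " %d %f %f %d \n " , 0 , mean ( yn2 ) , variance ( yn2 ) , 1 )
mprintf ( " %d %f %f %d \n " , 1 , mean ( yn1 ) , variance ( yn1 ) , 1 )
for n = 2 : 5
    y = X .* yn1 - ( n - 1 ) * yn2 ; // He_n ( X )
    mprintf ( " %d %f %f %d \n " , n , mean ( y ) , variance ( y ) , factorial ( n ))
    yn2 = yn1 ; yn1 = y ;
end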
Proposition 1.11. Let {Pn}n≥0 be orthogonal polynomials. Assume that X is a random variable associated with the probability distribution function f, derived from the weight function w. For two integers i, j ≥ 0, we have

E(Pi(X)Pj(X)) = 0  (39)

if i ≠ j, and, for i ≥ 1,

E(Pi(X)^2) = V(Pi(X)).  (40)
Proof. We have

E(Pi(X)Pj(X)) = ∫_I Pi(x) Pj(x) f(x) dx  (41)
= (1 / ∫_I w(x) dx) ∫_I Pi(x) Pj(x) w(x) dx  (42)
= ⟨Pi, Pj⟩ / ∫_I w(x) dx.  (43)
If i ≠ j, the orthogonality of the polynomials implies 39. If, on the other hand, we have i = j, then

E(Pi(X)^2) = ⟨Pi, Pi⟩ / ∫_I w(x) dx  (44)
= ‖Pi‖^2 / ∫_I w(x) dx.  (45)
We then use the equation 34, which leads to the equation 40.
In the next sections, we review a few orthogonal polynomials which are important in the
context of polynomial chaos. The table 5 summarizes the results.
Distrib.   Support  Poly.     w(x)          f(x)                    ‖Pn‖^2     V(Pn)
N(0,1)     R        Hermite   exp(-x^2/2)   (1/√(2π)) exp(-x^2/2)   √(2π) n!   n!
U(-1,1)    [-1,1]   Legendre  1             1/2                     2/(2n+1)   1/(2n+1)
E(1)       R+       Laguerre  exp(-x)       exp(-x)                 1          1

Figure 5: The weight function w(x), the distribution function f(x), the squared norm and the variance of the main orthogonal polynomials.
1.3 Hermite polynomials

The distribution function for Hermite polynomials is the standard normal distribution:

f(x) = (1/√(2π)) exp(-x^2/2),  (48)

for x ∈ R.
Here, we consider the probabilist polynomials Hen , as opposed to the physicist polynomials
Hn [17].
The Hermite polynomials satisfy the recurrence

He_{n+1}(x) = x Hen(x) - n He_{n-1}(x),

for n = 1, 2, .... The first Hermite polynomials are

He0(x) = 1
He1(x) = x
He2(x) = x^2 - 1
He3(x) = x^3 - 3x
He4(x) = x^4 - 6x^2 + 3
He5(x) = x^5 - 10x^3 + 15x

Figure 6: The first Hermite polynomials.
n     c0     c1     c2     c3    c4    c5   c6   c7  c8  c9
0      1
1             1
2     -1             1
3            -3             1
4      3            -6            1
5            15           -10          1
6    -15            45           -15        1
7          -105           105          -21       1
8    105          -420           210          -28      1
9           945         -1260           378         -36      1
The Hermite polynomials are orthogonal with respect to the weight w(x). Moreover,

‖Hen‖^2 = √(2π) n!.  (59)

Hence,

V(Hen(X)) = n!,  (60)

for n ≥ 1, where X is a standard normal random variable.
The following HermitePoly Scilab function creates the Hermite polynomial of order n.
function y = HermitePoly ( n )
if ( n ==0) then
y = poly (1 , " x " ," coeff " )
elseif ( n ==1) then
y = poly ([0 1] , " x " ," coeff " )
else
polyx = poly ([0 1] , " x " ," coeff " )
// y (n -2)
yn2 = poly (1 , " x " ," coeff " )
// y (n -1)
yn1 = polyx
for k =2: n
y = polyx * yn1 -( k -1)* yn2
yn2 = yn1
yn1 = y
end
end
endfunction
The script:
for n =0:10
    y = HermitePoly ( n );
    disp ( y )
end
produces the following output:
1
x
- 1 + x^2
- 3x + x^3
3 - 6x^2 + x^4
15x - 10x^3 + x^5
...
1.4 Legendre polynomials

The weight function for Legendre polynomials is w(x) = 1, for x ∈ [-1, 1]. The first Legendre polynomials are

P0(x) = 1
P1(x) = x
P2(x) = (1/2)(3x^2 - 1)
P3(x) = (1/2)(5x^3 - 3x)
P4(x) = (1/8)(35x^4 - 30x^2 + 3)
P5(x) = (1/8)(63x^5 - 70x^3 + 15x)

The Legendre polynomials satisfy the recurrence

(n+1) P_{n+1}(x) = (2n+1) x Pn(x) - n P_{n-1}(x),

for n = 1, 2, .... For n = 1, the recurrence gives 2 P2(x) = 3x P1(x) - P0(x) = 3x^2 - 1, which implies

P2(x) = (1/2)(3x^2 - 1).  (69)
Similarly, for n = 2, the recurrence gives 3 P3(x) = 5x P2(x) - 2 P1(x), which implies P3(x) = (1/2)(5x^3 - 3x).
Figure 10: The Legendre polynomials Pn , for n = 0, 1, 2, 3.
function y = LegendrePoly ( n )
if ( n ==0) then
y = poly (1 , " x " ," coeff " )
elseif ( n ==1) then
y = poly ([0 1] , " x " ," coeff " )
else
polyx = poly ([0 1] , " x " ," coeff " )
// y (n -2)
yn2 = poly (1 , " x " ," coeff " )
// y (n -1)
yn1 = polyx
for k =2: n
y =((2* k -1)* polyx * yn1 -( k -1)* yn2 )/ k
yn2 = yn1
yn1 = y
end
end
endfunction
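As a sanity check, one may verify numerically that P2 and P3 are orthogonal on [-1, 1] and that ‖P2‖^2 = 2/(2·2+1) = 2/5. The following sketch uses intg and horner for this purpose.

// A sketch checking orthogonality and the norm of the
// Legendre polynomials on [-1, 1].
P2 = LegendrePoly ( 2 );
P3 = LegendrePoly ( 3 );
function y = p23 ( x ) , y = horner ( P2 , x ) * horner ( P3 , x ) , endfunction
function y = p22 ( x ) , y = horner ( P2 , x )^2 , endfunction
disp ( intg ( -1 , 1 , p23 )) // expected : 0
disp ([ intg ( -1 , 1 , p22 ) 2/5 ]) // expected : 0.4 0.4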
1.5 Laguerre polynomials

The weight function for Laguerre polynomials is

w(x) = exp(-x),

for x ≥ 0.
The figure 11 plots the exponential weight function.
We have:

∫_0^{+∞} w(x) dx = 1.  (79)

The first Laguerre polynomials are

L0(x) = 1,  (81)
L1(x) = -x + 1,  (82)

and the Laguerre polynomials satisfy the recurrence

(n+1) L_{n+1}(x) = (2n+1-x) Ln(x) - n L_{n-1}(x),

for n = 1, 2, ....
The first Laguerre polynomials are presented in the figure 12.
L0(x) = 1
L1(x) = -x + 1
L2(x) = (1/2!)(x^2 - 4x + 2)
L3(x) = (1/3!)(-x^3 + 9x^2 - 18x + 6)
L4(x) = (1/4!)(x^4 - 16x^3 + 72x^2 - 96x + 24)
L5(x) = (1/5!)(-x^5 + 25x^4 - 200x^3 + 600x^2 - 600x + 120)

For n = 1, the recurrence gives 2 L2(x) = (3 - x) L1(x) - L0(x) = x^2 - 4x + 2. Hence,

L2(x) = (1/2)(x^2 - 4x + 2).  (88)

Similarly, for n = 2, the recurrence gives 3 L3(x) = (5 - x) L2(x) - 2 L1(x), which implies

L3(x) = (1/6)(-x^3 + 9x^2 - 18x + 6).  (94)
The figure 13 plots the Laguerre polynomials Ln, for n = 0, 1, 2, 3.
The Laguerre polynomials are orthogonal with respect to w(x) = exp(-x). Moreover,

‖Ln‖^2 = 1.  (95)

Furthermore,

V(Ln(X)) = 1,

for n ≥ 1, where X is an exponential random variable.
Figure 13: The Laguerre polynomials Ln , for n = 0, 1, 2, 3.
The following LaguerrePoly Scilab function creates the Laguerre polynomial of order n.
function y = LaguerrePoly ( n )
if ( n ==0) then
y = poly (1 , " x " ," coeff " )
elseif ( n ==1) then
y = poly ([1 -1] , " x " ," coeff " )
else
polyx = poly ([0 1] , " x " ," coeff " )
// y (n -2)
yn2 = poly (1 , " x " ," coeff " )
// y (n -1)
yn1 =1 - polyx
for k =2: n
y =((2* k -1 - polyx )* yn1 -( k -1)* yn2 )/ k
yn2 = yn1
yn1 = y
end
end
endfunction
1.6 Chebyshev polynomials of the first kind

The weight function for Chebyshev polynomials of the first kind is w(x) = 1/√(1 - x^2), for x ∈ (-1, 1). The first Chebyshev polynomials of the first kind are

T0(x) = 1
T1(x) = x
T2(x) = 2x^2 - 1
T3(x) = 4x^3 - 3x
T4(x) = 8x^4 - 8x^2 + 1
T5(x) = 16x^5 - 20x^3 + 5x

They satisfy the recurrence T_{n+1}(x) = 2x Tn(x) - T_{n-1}(x), for n = 1, 2, ....
Figure 15: The Chebyshev polynomials Tn , for n = 0, 1, 2, 3.
The following ChebyshevPoly Scilab function creates the Chebyshev polynomial of order n.
function y = ChebyshevPoly ( n )
polyx = poly ([0 1] , " x " ," coeff " )
yn2 = poly (1 , " x " ," coeff " ) // T (n -2)
yn1 = polyx // T (n -1)
y = yn2
if ( n ==1) then , y = yn1 , end
for k =2: n
y =2* polyx * yn1 - yn2 ; yn2 = yn1 ; yn1 = y
end
endfunction
1.7 Accuracy of evaluation

Consider the value of the polynomial L100 at the point x = 10. From Wolfram Alpha, which uses arbitrary-precision arithmetic, the exact value of L100(10) is
13.277662844303454137789892644070318985922997400185621...
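Assuming the LaguerrePoly function above, the following sketch evaluates L100(10) in double precision with horner and compares it with the first digits of the reference value, illustrating the accuracy loss discussed here.

// A sketch: evaluate L100(10) in double precision and compare
// with the first digits of the arbitrary-precision value.
exact = 13.277662844303454; // truncated reference value
computed = horner ( LaguerrePoly ( 100 ) , 10 );
mprintf ( " computed = %e , exact = %e , rel . err . = %e \n " , ..
    computed , exact , abs ( computed - exact ) / abs ( exact ))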
Figure 16: The number of common digits in the evaluation of Laguerre polynomials, for increasing
n and x.
Assume that we compute the sum

S = y1 + y2 + ... + yn.

We know from numerical analysis [8] that the condition number of a sum is

κ = (|y1| + |y2| + ... + |yn|) / |y1 + y2 + ... + yn|.
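For the Laguerre evaluation above, this condition number can be estimated from the monomial terms of the polynomial, as in the following sketch. This is a rough estimate, since the coefficients are themselves computed in floating point.

// A sketch: estimate the condition number of the sum of the
// monomial terms of L100 at x = 10.
c = coeff ( LaguerrePoly ( 100 )); // coefficients of 1 , x , ... , x ^100
terms = c .* ( 10 .^ ( 0 : 100 ));
kappa = sum ( abs ( terms )) / abs ( sum ( terms ));
disp ( kappa )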
Cell 1 Cell 2 Cell 3
1 | *** | - | - |
2 | - | *** | - |
3 | - | - | *** |
4 | ** | * | - |
5 | ** | - | * |
6 | * | ** | - |
7 | - | ** | * |
8 | * | - | ** |
9 | - | * | ** |
10 | * | * | * |
Figure 17: Placing d = 3 balls into p = 3 cells. The balls are represented by stars ∗.
1.8 Notes and references

In the appendices of [11], we can find a presentation of the most common orthogonal polynomials used in uncertainty analysis.
2 Multivariate polynomials
In this section, we define the multivariate polynomials which are used in the context of polynomial
chaos decomposition. We consider the regular polynomials as well as multivariate orthogonal
polynomials.
2.1 Occupancy problems

Consider the problem of distributing d balls into p cells, and let αi ≥ 0 be the number of balls in the i-th cell, so that

α1 + α2 + ... + αp = d.  (107)
Example (Case where d = 8 and p = 6) Consider the case where there are d = 8 balls to distribute into p = 6 cells. We represent the balls with stars "*" and the cells are represented by the spaces between the p + 1 bars "|". For example, the string "|***|*| | | |****|" represents the configuration where α = (3, 1, 0, 0, 0, 4).
Proposition 2.1. (Occupancy problem) The number of ways to distribute d balls into p cells is the multiset coefficient:

((p, d)) = C(p+d-1, d).  (108)
Proof. Within the string, there is a total of p + 1 bars: the first and last bars, and the intermediate p - 1 bars. We are free to move only the p - 1 intermediate bars and the d stars, for a total of p + d - 1 symbols. Therefore, the problem reduces to finding the place of the d stars in a string of p + d - 1 symbols. From combinatorics, we know that the number of ways to choose k items in a set of n items is

C(n, k) = n! / (k! (n-k)!),  (109)

for any nonnegative integers n and k, with k ≤ n. Hence, the number of ways to choose the place of the d stars in a string of p + d - 1 symbols is given by the equation 108.
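For instance, with d = 3 balls and p = 3 cells, the formula gives C(5, 3) = 10, matching the 10 configurations of the figure 17; a one-line Scilab check:

// A sketch: number of ways to place d = 3 balls into p = 3 cells.
p = 3; d = 3;
disp ( factorial ( p + d - 1 ) / ( factorial ( d ) * factorial ( p - 1 )))
// expected : 10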
2.2 Multivariate monomials

Definition 2.2. (Multivariate monomial) Let α ∈ N^p be a multi-index. The multivariate monomial M associated with α is

M(x) = x1^{α1} x2^{α2} ... xp^{αp},  (112)

for any x ∈ R^p.
Definition 2.3. ( Degree of a multivariate monomial) Let M (x) be a monomial associated with
the exponents α, for any x ∈ Rp . The degree of M is
d = α1 + α2 + ... + αp . (113)
The degree d of a multivariate monomial is also denoted by |α|.
The figure 18 presents the 10 monomials with degree d = 2 and p = 4 variables.
Proposition 2.4. (Number of multivariate monomials) The number of degree d multivariate monomials of p variables is:

((p, d)) = C(p+d-1, d).  (114)
Proof. The problem can be reduced to distributing d balls into p cells: the i-th cell represents
the variable xi where we have to put αi balls. It is then straightforward to use the proposition
2.1.
The figure 19 presents the number of monomials for various d and p.
α1 α2 α3 α4
2 0 0 0
1 1 0 0
1 0 1 0
1 0 0 1
0 2 0 0
0 1 1 0
0 1 0 1
0 0 2 0
0 0 1 1
0 0 0 2

Figure 18: The multi-indices of the 10 monomials of degree d = 2 with p = 4 variables.
2.3 Multivariate polynomials

Definition 2.5. (Multivariate polynomial) Let β_α, for |α| ≤ d, be real coefficients. The associated multivariate polynomial of degree d is

P(x) = Σ_{|α|≤d} β_α M_α(x),  (115)

for any x ∈ R^p, where M_α is the monomial associated with the multi-index α.
We can plug the equation 112 into 115 and get

P(x) = Σ_{|α|≤d} β_α Π_{i=1}^{p} xi^{αi},  (116)

for any x ∈ R^p.
Example (Case where d = 3 and p = 3) Consider the degree d = 3 multivariate polynomial with p = 3 variables:

P(x1, x2, x3) = 4 + 2x3 + 5x1^2 - 3x1x2x3  (117)

for any x ∈ R^3. The relevant multivariate monomial exponents α are:

(0, 0, 0), (0, 0, 1),  (118)
(2, 0, 0), (1, 1, 1).  (119)
The only nonzero coefficients β are:

β(0,0,0) = 4, β(0,0,1) = 2, β(2,0,0) = 5, β(1,1,1) = -3.

Proposition 2.6. (Number of multivariate polynomials) The number P_d^p of multivariate monomials of p variables with degree at most d is

P_d^p = C(p+d, d).  (122)

Proof. The proof proceeds by induction on d. The only polynomial of degree d = 0 is the constant

P0(x) = 1.  (123)

Hence, there is P_0^p = 1 = C(p, 0) multivariate polynomial of degree d = 0. This proves that the equation 122 is true for d = 0. The polynomials of degree d = 1 are generated by the constant and the monomials

Pi(x) = xi,  (125)

for i = 1, 2, ..., p and any x ∈ R^p. Hence, the number of multivariate polynomials of degree d + 1 is the sum of the number of multivariate polynomials of degree d and the number of multivariate monomials of degree d + 1. From the equation 114, the number of such monomials is

((p, d+1)) = C(p+d, d+1).  (129)
p\d 0 1 2 3 4 5 6
1 1 2 3 4 5 6 7
2 1 3 6 10 15 21 28
3 1 4 10 20 35 56 84
4 1 5 15 35 70 126 210
5 1 6 21 56 126 252 462
6 1 7 28 84 210 462 924

Figure 20: The number P_d^p = C(p+d, d) of multivariate polynomials of p variables with degree at most d.
From Pascal's rule, we have

C(n, k) + C(n, k-1) = C(n+1, k),  (131)

for any nonnegative integers n and k. We apply the previous equality with n = p + d and k = d + 1, and get

C(p+d, d+1) + C(p+d, d) = C(p+d+1, d+1),  (132)

which proves that the equation 122 is also true for d + 1, and concludes the proof.
The figures 20 and 21 present the number of polynomials for various d and p.
One possible issue with the equation 115 is that it does not specify a way of ordering the P_d^p multivariate monomials of degree at most d. However, we will present in the next section a constructive way of ordering the multi-indices in such a way that there is a one-to-one mapping from the single index k, in the range from 1 to P_d^p, to the corresponding multi-index α^(k). With this ordering, the equation 115 becomes

P(x) = Σ_{k=1}^{P_d^p} β_k M_{α^(k)}(x),  (133)

for any x ∈ R^p.
Example (Case where d = 2 and p = 2) Consider the degree d = 2 multivariate polynomial with p = 2 variables:

P(x1, x2) = β(0,0) + β(1,0) x1 + β(0,1) x2 + β(2,0) x1^2 + β(1,1) x1x2 + β(0,2) x2^2  (134)

for any (x1, x2) ∈ R^2.
Figure 21: Number of degree d polynomials with p variables.
2.4 Generating multi-indices

The polymultiindex Scilab function computes the matrix a of multi-indices of the monomials of degree at most d with p variables. Within its loop over the degree, each multi-index of degree k is created by incrementing the j-th component of a previously generated multi-index, the rows to extend being selected by the counts pmat(j, k):
end
for j =1: p
for m =L - pmat (j , k )+1: L
P = P +1
a (P ,1: p )= a (m ,1: p )
a (P , j )= a (P , j )+1
end
end
end
endfunction
For example, in the following session, we compute the list of multi-indices corresponding to
multivariate polynomials of degree d = 3 with p = 3 variables.
-->p =3;
-->d =3;
-->a = polymultiindex (p , d )
a =
0. 0. 0.
1. 0. 0.
0. 1. 0.
0. 0. 1.
2. 0. 0.
1. 1. 0.
1. 0. 1.
0. 2. 0.
0. 1. 1.
0. 0. 2.
3. 0. 0.
2. 1. 0.
2. 0. 1.
1. 2. 0.
1. 1. 1.
1. 0. 2.
0. 3. 0.
0. 2. 1.
0. 1. 2.
0. 0. 3.
There are 20 rows in the previous matrix a, which is consistent with the table 20. In the previous session, the first row corresponds to the zero-th degree polynomial, the rows 2 to 4 correspond to the degree one monomials, the rows 5 to 10 correspond to the degree 2 monomials, whereas the rows 11 to 20 correspond to the degree 3 monomials.
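One may check that the number of rows of a equals P_d^p = C(p+d, d), as in this short sketch, which assumes the variables p, d and a from the session above:

// A sketch: the number of multi-indices should be (p+d)!/(p! d!).
expected = factorial ( p + d ) / ( factorial ( p ) * factorial ( d ));
disp ([ size ( a , " r " ) expected ]) // expected : 20 20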
2.5 Multivariate orthogonal functions

Consider the set I of points x ∈ R^p such that xi ∈ Ii, for i = 1, 2, ..., p, where I1, I2, ..., Ip are intervals of R. We denote this tensor product by:

I = I1 ⊗ I2 ⊗ ... ⊗ Ip.  (137)

Now that we have multivariate intervals, we can consider a multivariate weight function on such an interval. Such a weight can be created by making the tensor product of univariate weights.
Proposition 2.9. (Tensor product of univariate weights) Let I1, I2, ..., Ip be intervals of R. Assume that w1, w2, ..., wp are univariate weights on I1, I2, ..., Ip. Therefore, the tensor product function

w(x) = w1(x1) w2(x2) ... wp(xp),

for x ∈ I, is a weight function on I.
Proof. We must prove that the integral of w on I is finite. The integral of w is the product of the integrals of the univariate weights:

∫_I w(x) dx = (∫_{I1} w1(x1) dx1) (∫_{I2} w2(x2) dx2) ... (∫_{Ip} wp(xp) dxp).

However, all the individual integrals are finite, since each function wi is, by hypothesis, a weight on Ii, for i = 1, 2, ..., p. Hence, the product is finite, so that the function w is, indeed, a weight on I.
Example (Multivariate Gaussian weight) In the particular case where all weights are the same, the expression simplifies further. Consider, for example, the Gaussian weight

w(x) = exp(-(x1^2 + x2^2 + ... + xp^2)/2),

for x1, x2, ..., xp ∈ R.
Example (Multivariate Hermite-Legendre weight) Consider the weight for Hermite polynomials w1(x1) = exp(-x1^2/2) for x1 ∈ R, and consider the weight for Legendre polynomials w2(x2) = 1 for x2 ∈ [-1, 1]. Therefore, the tensor product weight is

w(x) = exp(-x1^2/2),

for x = (x1, x2) ∈ R ⊗ [-1, 1].
In the remainder of this document, we will not make a notational difference between the univariate weight function w(x) and the multivariate weight w(x), assuming that it is based on a tensor product if needed.
Definition 2.10. (Multivariate weighted L2 space in R^p) Let I be an interval in R^p. Let w be a multivariate weight function on I. Let L2w(I) be the set of functions g which are square integrable with respect to the weight function w, i.e. such that the integral

‖g‖^2 = ∫_I g(x)^2 w(x) dx  (144)

is finite.
Let I be an interval of R^p and assume that g ∈ L2w(I). We can combine the equations 144 and 145, which implies that the L2w(I) norm of g can be expressed as an inner product:

‖g‖^2 = ⟨g, g⟩.  (146)
Proposition 2.12. ( Multivariate probability distribution function) Let I be an interval of Rp .
Let w be a multivariate weight function on I. Assume {Xi }i=1,2,...,p are independent random
variables associated with the probability distribution functions fi , derived from the weight functions
wi , for i = 1, 2, ..., p. Therefore, the function
f (x) = f1 (x1 )f2 (x2 )...fp (xp ), (147)
for x ∈ I is a probability distribution function.
Proof. We must prove that the integral of f is equal to one. Indeed,

∫_I f(x) dx = ∫_I f1(x1) f2(x2) ... fp(xp) dx1 dx2 ... dxp  (148)
= (∫_{I1} f1(x1) dx1) (∫_{I2} f2(x2) dx2) ... (∫_{Ip} fp(xp) dxp)  (149)
= 1,  (150)

where each integral is equal to one since, by hypothesis, fi is a probability distribution function for i = 1, 2, ..., p.
2.6 Tensor product of orthogonal polynomials

Example (Tensor product of Hermite polynomials) Consider the case p = 2, with the multivariate Gaussian weight w(x) = w(x1)w(x2). We can make the tensor product of some Hermite polynomials, which leads, for example, to the two polynomials

Ψ1(x) = He1(x1), Ψ2(x) = He1(x1) He2(x2),

associated with the multivariate Gaussian weight w(x). We may wonder if these two polynomials are orthogonal. We have

⟨Ψ1, Ψ2⟩  (154)
= ∫_{R^2} Ψ1(x1, x2) Ψ2(x1, x2) w(x1) w(x2) dx1 dx2  (155)
= ∫_{R^2} He1(x1)^2 He2(x2) w(x1) w(x2) dx1 dx2  (156)
= (∫_R He1(x1)^2 w(x1) dx1) (∫_R He2(x2) w(x2) dx2).  (157)

However, the proposition 1.7 implies that ∫_R He2(x2) w(x2) dx2 = 0. Hence,

⟨Ψ1, Ψ2⟩ = 0,  (160)

so that the two polynomials are orthogonal.
Definition 2.13. (Tensor product of orthogonal polynomials) Let I be a tensor product interval of R^p and let w be the associated multivariate tensor product weight function. Let φ_{α_i^(k)}(xi) be a family of univariate orthogonal polynomials. The associated multivariate tensor product polynomials are

Ψk(x) = Π_{i=1}^{p} φ_{α_i^(k)}(xi)  (164)

for any x ∈ I.
d  Multi-index       Polynomial
0  α^(1) = [0, 0]    Ψ1(x) = He0(x1)He0(x2) = 1
1  α^(2) = [1, 0]    Ψ2(x) = He1(x1)He0(x2) = x1
1  α^(3) = [0, 1]    Ψ3(x) = He0(x1)He1(x2) = x2
2  α^(4) = [2, 0]    Ψ4(x) = He2(x1)He0(x2) = x1^2 - 1
2  α^(5) = [1, 1]    Ψ5(x) = He1(x1)He1(x2) = x1x2
2  α^(6) = [0, 2]    Ψ6(x) = He0(x1)He2(x2) = x2^2 - 1
3  α^(7) = [3, 0]    Ψ7(x) = He3(x1)He0(x2) = x1^3 - 3x1
3  α^(8) = [2, 1]    Ψ8(x) = He2(x1)He1(x2) = (x1^2 - 1)x2
3  α^(9) = [1, 2]    Ψ9(x) = He1(x1)He2(x2) = x1(x2^2 - 1)
3  α^(10) = [0, 3]   Ψ10(x) = He0(x1)He3(x2) = x2^3 - 3x2

Figure 22: The multivariate Hermite polynomials of p = 2 variables and degree d = 3.

Figure 23: The multivariate Hermite polynomials, with degree d = 2 and p = 2 variables.
Example (Multivariate Hermite polynomials) The figure 22 presents the multivariate Hermite
polynomials of p = 2 variables and degree d = 3. The figure 23 presents the multivariate Hermite
polynomials, with degree d = 2 and p = 2 variables.
Proposition 2.14. ( Multivariate orthogonal polynomials) The multivariate tensor product poly-
nomials defined in 2.13 are orthogonal.
Proof. We must prove that, for two different integers k and ℓ, we have

⟨Ψk, Ψℓ⟩ = 0.  (166)

By the definition of the inner product on L2w(I), we have

⟨Ψk, Ψℓ⟩ = ∫_I Ψk(x) Ψℓ(x) w(x) dx.  (167)
By assumption, the multivariate weight function w on I is the tensor product of univariate weight functions wi on Ii. Hence,

⟨Ψk, Ψℓ⟩ = ∫_I (Π_{i=1}^{p} φ_{α_i^(k)}(xi)) (Π_{i=1}^{p} φ_{α_i^(ℓ)}(xi)) (Π_{i=1}^{p} wi(xi)) dx  (168)
= Π_{i=1}^{p} ∫_{Ii} φ_{α_i^(k)}(xi) φ_{α_i^(ℓ)}(xi) wi(xi) dxi.  (169)

In other words,

⟨Ψk, Ψℓ⟩ = Π_{i=1}^{p} ⟨φ_{α_i^(k)}, φ_{α_i^(ℓ)}⟩.  (170)
However, the multi-index ordering implies that, if k ≠ ℓ, then there exists an integer i ∈ {1, 2, ..., p} such that

α_i^(k) ≠ α_i^(ℓ).  (171)

By assumption, the φ_{α_i^(k)}(xi) polynomials are orthogonal. This implies

⟨φ_{α_i^(k)}, φ_{α_i^(ℓ)}⟩ = 0,  (172)

so that the product in the equation 170 is zero, which concludes the proof.
Proposition 2.15. (Norm of multivariate orthogonal polynomials) The L2w(I) norm of the multivariate orthogonal polynomials defined in 2.13 is:

‖Ψk‖^2 = Π_{i=1}^{p} ‖φ_{α_i^(k)}‖^2.  (176)
Example (Norm of the multivariate Hermite polynomials) Consider the multivariate Hermite polynomials in the case where p = 2. The figure 22 indicates that

Ψ8(x) = He2(x1) He1(x2).

The norm of the univariate Hermite polynomial is given by the equation 59. Hence,

‖Ψ8‖^2 = (√(2π) · 2!) · (√(2π) · 1!)  (179)
= 4π.  (180)
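A Monte Carlo sketch can confirm the moments of Ψ8: with X1, X2 independent standard normal variables, E(Ψ8(X)) should be close to 0 and, by the proposition 2.17 below, V(Ψ8(X)) close to 2!·1! = 2. The sample size is an assumption.

// A sketch: Monte Carlo check of Psi_8(x) = He2(x1) He1(x2).
N = 100000;
x1 = grand ( N , 1 , " nor " , 0 , 1 );
x2 = grand ( N , 1 , " nor " , 0 , 1 );
psi8 = ( x1 .^2 - 1 ) .* x2 ;
disp ([ mean ( psi8 ) variance ( psi8 ) ]) // expected : 0 and 2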
2.7 Multivariate orthogonal polynomials and probabilities

Proposition 2.16. (Expectation of multivariate orthogonal polynomials) Let X = (X1, X2, ..., Xp) be independent random variables associated with the probability distribution functions fi, derived from the weight functions wi, for i = 1, 2, ..., p. We have

E(Ψ1(X)) = 1,

and

E(Ψk(X)) = 0,

for k > 1.
Proof. Indeed,

E(Ψ1(X)) = ∫_I Ψ1(x) f(x) dx  (185)
= ∫_I f(x) dx  (186)
= 1,  (187)

since Ψ1(x) = 1 and f is a probability distribution function. For k > 1, the independence of the variables X1, X2, ..., Xp implies

E(Ψk(X)) = Π_{i=1}^{p} E(φ_{α_i^(k)}(Xi)).  (191)

The proposition 1.9 states that

E(φ_{α_i^(k)}(Xi)) = 1  (192)

if α_i^(k) = 0, and

E(φ_{α_i^(k)}(Xi)) = 0  (193)

if α_i^(k) ≥ 1.
Since k > 1, there is at least one integer i ∈ {1, 2, ..., p} such that α_i^(k) ≥ 1. If this was not true, we would have α_i^(k) = 0, for i = 1, 2, ..., p, which would imply that k = 1, contradicting the hypothesis.
Therefore, there is at least one integer i such that the expectation is zero, so that the product in the equation 191 is also zero.
Proposition 2.17. (Variance of multivariate orthogonal polynomials) Under the same hypotheses as in the proposition 2.16, we have

V(Ψ1(X)) = 0,  (194)

and

V(Ψk(X)) = Π_{i=1}^{p} E(φ_{α_i^(k)}(Xi)^2),  (195)

for k > 1.
Proof. Obviously, the random variable Ψ1(X) = 1 has a zero variance. Furthermore, for k > 1, we have

V(Ψk(X)) = E((Ψk(X) - E(Ψk(X)))^2)  (196)
= E(Ψk(X)^2),  (197)

since E(Ψk(X)) = 0. We then use the tensor product definition of both Ψk and f, and finally get to the equation 195.
For k > 1, we have

E(φ_{α_i^(k)}(Xi)^2) = V(φ_{α_i^(k)}(Xi)) + E(φ_{α_i^(k)}(Xi))^2.  (198)

Therefore, based on the propositions 1.9 and 1.10, we can use the equation 195 to evaluate the variance of Ψk.
Proposition 2.18. (Covariance of multivariate orthogonal polynomials) Under the same hypotheses as in the proposition 2.16, for any two integers k and ℓ, we have

Cov(Ψk(X), Ψℓ(X)) = 0,  (199)

if k ≠ ℓ.
Proof. Assume that k > 1. Therefore,

Cov(Ψ1(X), Ψk(X)) = E((Ψ1(X) - μ1)(Ψk(X) - μk)),  (200)

where μ1 = E(Ψ1(X)) and μk = E(Ψk(X)). However, we have Ψ1(X) = E(Ψ1(X)) = 1. Hence,

Cov(Ψ1(X), Ψk(X)) = 0.  (201)

The same is true if we consider

Cov(Ψk(X), Ψ1(X)) = Cov(Ψ1(X), Ψk(X))  (202)
= 0,  (203)

by the symmetry property of the covariance.
Assume now that k and ℓ are two integers such that k, ℓ > 1. We have,

Cov(Ψk(X), Ψℓ(X)) = E((Ψk(X) - μk)(Ψℓ(X) - μℓ))  (204)
= E(Ψk(X)Ψℓ(X)),  (205)

since, by the proposition 2.16, we have μk = μℓ = 0. Hence,

Cov(Ψk(X), Ψℓ(X)) = ∫_I Ψk(x) Ψℓ(x) f(x) dx  (206)
= (1 / ∫_I w(x) dx) ∫_I Ψk(x) Ψℓ(x) w(x) dx  (207)
= ⟨Ψk, Ψℓ⟩ / ∫_I w(x) dx,  (208)

by the definition of the weight function w. By the proposition 2.14, the polynomials Ψk and Ψℓ are orthogonal, which concludes the proof.
2.8 Notes and references
The stars and bars proof used in the section 2.1 is presented in Feller's book [7], in the section "Application to occupancy problems" of the chapter 2 "Elements of combinatorial analysis". The book [11] and the thesis [10] present spectral methods, including the multivariate orthogonal polynomials involved in polynomial chaos. The figures 22 and 23 are presented in several papers and slides related to multivariate orthogonal polynomials, including [12], for example.
3 Polynomial chaos
3.1 Introduction
The polynomial chaos, introduced by Wiener [16], uses Hermite polynomials as the basis and
involves independent Gaussian random variables.
Denote by J [4, 13] the set of multi-indices with a finite number of nonzero components:

J = {α = (α1, α2, ...) : αi ∈ N, |α| < ∞},  (209)

where

|α| = Σ_{i=1}^{∞} αi.  (210)

If α ∈ J, then there is only a finite number of nonzero components (otherwise |α| would be infinite).
Let {Xi}i≥0 be an infinite set of independent standard normal random variables and let f be the associated multivariate tensor product normal distribution function. Assume that g(X) is a random variable with finite variance. We are interested in the decomposition of g(X) onto He_{αi}, the Hermite polynomial of degree αi ≥ 0.
The expectation of g(X) is

E(g(X)) = ∫ g(x) f(x) dx,  (211)

and its variance is

V(g(X)) = E((g(X) - μ)^2),  (212)

where μ = E(g(X)).
For any α ∈ J, the Wick polynomial is

Ψα(X) = Π_{i=1}^{∞} He_{αi}(Xi).  (213)

The degree of the polynomial Ψα is |α|. Notice that, since there is only a finite number of nonzero components in α, the right hand side of the expression 213 has a finite number of factors.
We consider the inner product

⟨g, Ψα⟩ = ∫ g(x) Ψα(x) f(x) dx.  (214)
We use the norm

‖Ψα‖^2 = ∫ Ψα(x)^2 f(x) dx.  (215)

The polynomial chaos decomposition of g is

g(X) = Σ_{α∈J} aα Ψα(X),  (216)

where the coefficients are

aα = ⟨g, Ψα⟩ / ‖Ψα‖^2.  (217)

3.2 Truncated decomposition

In practice, the decomposition 216 is truncated to the set J_{p,d} of multi-indices α with at most p nonzero components and total degree |α| ≤ d, where X = (X1, X2, ..., Xp) are independent standard normal random variables. There is a one-to-one mapping from the multi-indices α of the set J_{p,d} to the indices k = 1, 2, ..., P_d^p defined in the section 2.4. Therefore, the decomposition 216 can be written:

g(X) ≈ Σ_{k=1}^{P_d^p} ak Ψk(X),  (222)
Proposition 3.2. (Expectation and variance of the truncated PC expansion) The truncated expansion 222 is such that

E(g(X)) = a1  (223)

and

V(g(X)) = Σ_{k=2}^{P_d^p} ak^2 V(Ψk(X)).  (224)
Since the expansion 222 involves only a finite number of terms, it is relatively easy to prove
the previous proposition.
Proof. The expectation of g(X) is

E(g(X)) = E(Σ_{k=1}^{P_d^p} ak Ψk(X))  (225)
= Σ_{k=1}^{P_d^p} E(ak Ψk(X))  (226)
= Σ_{k=1}^{P_d^p} ak E(Ψk(X)),  (227)
since the expectation of a sum is the sum of expectations. Then, the equation 223 is a straight-
forward consequence of the proposition 2.16.
The variance of g(X) is

V(g(X)) = V(Σ_{k=1}^{P_d^p} ak Ψk(X))  (228)
= Σ_{k=1}^{P_d^p} V(ak Ψk(X)) + Σ_{k,ℓ=1; k≠ℓ}^{P_d^p} Cov(ak Ψk(X), aℓ Ψℓ(X))  (229)
= Σ_{k=1}^{P_d^p} ak^2 V(Ψk(X)) + Σ_{k,ℓ=1; k≠ℓ}^{P_d^p} ak aℓ Cov(Ψk(X), Ψℓ(X)).  (230)
However, the proposition 2.18 states that the covariances are zero. Moreover, the variance of Ψ1
is zero, since this is a constant. This leads to the equation 224.
In the following proposition, we prove the equation 217 in the case of the truncated decom-
position.
Proposition 3.3. (Coefficients of the truncated decomposition) The truncated expansion 222 is such that

ak = ⟨g, Ψk⟩ / ‖Ψk‖^2.  (231)
Proof. Indeed,

⟨g, Ψk⟩ = ∫_I g(x) Ψk(x) w(x) dx  (232)
= Σ_{ℓ=1}^{P_d^p} aℓ ∫_I Ψℓ(x) Ψk(x) w(x) dx  (233)
= Σ_{ℓ=1}^{P_d^p} aℓ ⟨Ψℓ, Ψk⟩  (234)
= ak ⟨Ψk, Ψk⟩  (235)
= ak ‖Ψk‖^2,  (236)
where the equation 235 is implied by the orthogonality of the functions {Ψk }k≥1 .
An immediate consequence of the proposition 3.3 is that the decomposition 222 is unique. Indeed, assume that ak and bk are real numbers and consider the decompositions

g(X) ≈ Σ_{k=1}^{P_d^p} ak Ψk(X),  (237)

and

g(X) ≈ Σ_{k=1}^{P_d^p} bk Ψk(X).  (238)

The equation 231 is satisfied both by the coefficients ak and bk, so that

ak = bk.  (239)
The following proposition shows how the coefficients can be expressed in terms of expectations and variances.
Proposition 3.4. (Coefficients of the truncated decomposition (2)) The truncated expansion 222 is such that

a1 = E(g(X))  (240)

and

ak = E(g(X) Ψk(X)) / V(Ψk(X))  (241)
for k > 1.
Proof. By definition of the expectation, we have

E(g(X) Ψk(X)) = ∫_I g(x) Ψk(x) f(x) dx  (242)
= (1 / ∫_I w(x) dx) ∫_I g(x) Ψk(x) w(x) dx  (243)
= ⟨g, Ψk⟩ / ∫_I w(x) dx  (244)
= ak ‖Ψk‖^2 / ∫_I w(x) dx,  (245)
where the last equation is implied by 231. Hence,

E(g(X) Ψk(X)) = (ak / ∫_I w(x) dx) ∫_I Ψk(x)^2 w(x) dx  (246)
= ak ∫_I Ψk(x)^2 f(x) dx
= ak E(Ψk(X)^2).

This implies:

ak = E(g(X) Ψk(X)) / E(Ψk(X)^2).  (249)

For k = 1, we have Ψ1(X) = 1, so that the previous equation implies 240. For k > 1, we have

ak = E(g(X) Ψk(X)) / (V(Ψk(X)) + E(Ψk(X))^2).  (250)
However, we know from proposition 2.16 that, for k > 1, we have E(Ψk (X)) = 0, which concludes
the proof.
3.3 Univariate decomposition examples

In this section, we consider the univariate case p = 1, so that the truncated decomposition 222 is

g(X) ≈ Σ_{k=1}^{P_d^1} ak Ψk(X),  (251)

where Ψk are the univariate Hermite polynomials. The number of univariate polynomials is given by the proposition 2.6, which states that, for p = 1, we have

P_d^1 = C(1+d, d)  (252)
= (1+d)! / (d! 1!)  (253)
= 1 + d.  (254)

Moreover, the equation 164 states that the functions Ψk are defined in terms of the Hermite polynomials as

Ψk(x) = He_{α^(k)}(x),  (255)

where the ordering of the multi-indices implies

α^(k) = k - 1.  (256)
This implies:

Ψk(x) = He_{k-1}(x),  (257)

for any x ∈ R and k ≥ 1. The equation 251 simplifies to:

g(X) ≈ a1 He0(X) + a2 He1(X) + ... + a_{d+1} Hed(X),  (258)

where d ≥ 1 is an integer and {Hei}i=0,1,...,d are the Hermite polynomials.
The proposition 3.4 states that the coefficients are

a1 = E(g(X))  (259)

and

ak = E(g(X) He_{k-1}(X)) / V(He_{k-1}(X))  (260)
= E(g(X) He_{k-1}(X)) / (k-1)!,  (261)

for k > 1. In general, this requires computing the integral

E(g(X) He_{k-1}(X)) = ∫_R g(x) He_{k-1}(x) f(x) dx,  (262)

for k ≥ 1, where f is the Gaussian probability distribution function defined in the equation 48.
In this section, we consider several specific examples of functions g and present the associated coefficients ai, for i = 1, 2, ..., d + 1.
Example (Constant) Assume that

g(x) = c,  (263)

for some real constant c. A possible exact decomposition is:

g(X) = c He0(X).  (264)

Since the polynomial chaos decomposition is unique, this is the decomposition. But it is interesting to see how the coefficients can be computed in this particular case. We have

a1 = E(c)  (265)
= c.  (266)

Moreover,

ak = E(c He_{k-1}(X)) / (k-1)!  (267)
= c E(He_{k-1}(X)) / (k-1)!.  (268)

However, the proposition 1.9 states that E(He_{k-1}(X)) = 0 for k > 1. This implies

ak = 0  (269)

for k > 1.
Example (Standard normal random variable) Assume that

g(x) = x,  (270)

for any x ∈ R. A possible exact decomposition is:

g(X) = He1(X),  (271)

since He1(X) = X. Again, it is interesting to see how the coefficients can be computed in this particular case. We have

a1 = E(X)  (272)
= 0,  (273)

and

ak = E(X He_{k-1}(X)) / (k-1)!  (274)
= E(He1(X) He_{k-1}(X)) / (k-1)!,  (275)

since the first Hermite polynomial is He1(x) = x. The equation 39 then implies ak = 0, for k > 2. Moreover,

a2 = E(He1(X)^2) / 1!  (276)
= V(He1(X)) / 1!  (277)
= 1!/1!  (278)
= 1,  (279)

where the equation 277 comes from the equation 40. This immediately leads to the equation 271.
Example (Square of a standard normal random variable) Assume that

g(x) = x^2,  (280)

for any x ∈ R. Consider the random variable g(X) = X^2, where X is a standard normal random variable. From the table 6, we see that

x^2 = x^2 - 1 + 1  (281)
= He2(x) + He0(x).  (282)

Hence, g(X) = He0(X) + He2(X), so that a1 = a3 = 1 and all the other coefficients are zero.
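The following Monte Carlo sketch estimates the coefficients of g(x) = x^2 with the formulas of the proposition 3.4, assuming the HermitePoly function above; a1 and a3 should be close to 1, and a2, a4 close to 0.

// A sketch: estimate a_k = E(g(X) He_{k-1}(X))/(k-1)! by
// Monte Carlo, for g(x) = x^2.
N = 100000;
X = grand ( N , 1 , " nor " , 0 , 1 );
g = X .^2;
mprintf ( " a1 = %f \n " , mean ( g ))
for k = 2 : 4
    hk = horner ( HermitePoly ( k - 1 ) , X );
    mprintf ( " a %d = %f \n " , k , mean ( g .* hk ) / factorial ( k - 1 ))
end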
       a0     a1     a2     a3    a4    a5   a6   a7  a8  a9
x^0     1
x^1            1
x^2     1             1
x^3            3             1
x^4     3             6            1
x^5           15            10          1
x^6    15            45            15        1
x^7          105           105           21       1
x^8   105           420           210           28      1
x^9           945          1260           378          36      1

Figure 24: Coefficients ak in the decomposition of x^n into Hermite polynomials, for n = 0, 1, ..., 9.
Example (Cosine) Assume that g(x) = cos(x). One can prove that

∫_{-∞}^{+∞} cos(x) Hen(x) w(x) dx = √(2π) (-1)^{n/2} exp(-1/2), if n is even, and 0, if n is odd.  (286)

The corresponding coefficients are therefore a_n = (-1)^{n/2} exp(-1/2)/n! for even n and a_n = 0 for odd n, which is consistent with the figure 25.
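These coefficients can also be approximated by numerical integration, as in the following sketch, which assumes the HermitePoly function above; the truncation of the real line to [-10, 10] is an assumption.

// A sketch: compute a_k = E(cos(X) He_{k-1}(X))/(k-1)! by
// numerical integration (truncated bounds are an assumption).
function y = itg ( x )
    y = cos ( x ) * horner ( Hek , x ) * exp ( - x ^2/2) / sqrt ( 2 * %pi )
endfunction
for k = 1 : 5
    Hek = HermitePoly ( k - 1 );
    ak = intg ( -10 , 10 , itg ) / factorial ( k - 1 );
    mprintf ( " a %d = %f \n " , k - 1 , ak )
end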
        cos(x)        sin(x)
a0      0.6065307     0
a1      0             0.6065307
a2     -0.3032653     0
a3      0            -0.1010884
a4      0.0252721     0
a5      0             0.0050544
a6     -0.0008424     0
a7      0            -0.0001203
a8      0.0000150     0
a9      0             0.0000017
a10    -0.0000002     0
a11     0            -1.519D-08
a12     1.266D-09     0
a13     0             9.740D-11
a14    -6.957D-12     0

Figure 25: Coefficients ak in the decomposition of several functions into Hermite polynomials.
              Distribution           Polynomial   Support
Continuous    Normal                 Hermite      (-∞, ∞)
              Gamma (Exponential)    Laguerre     [0, ∞)
              Beta                   Jacobi       [a, b]
              Uniform                Legendre     [a, b]
Discrete      Poisson                Charlier     {0, 1, 2, ...}
              Binomial               Krawtchouk   {0, 1, 2, ..., N}
              Negative Binomial      Meixner      {0, 1, 2, ...}
              Hypergeometric         Hahn         {0, 1, 2, ..., N}

Figure 26: Map from a distribution function to the associated orthogonal polynomials [12].
3.4 Generalized polynomial chaos

In the generalized polynomial chaos [18, 19], the orthogonal polynomials are constructed from limit relationships of the hypergeometric orthogonal polynomials. This leads to a classification which allows one to create a tree of relations between the polynomials, called the Askey scheme. The mapping from the distribution to the corresponding orthogonal polynomials is presented in the figure 26.
The finite decomposition is

g(X) ≈ Σ_{α ∈ J_{p,d}} aα Ψα(X),  (287)

where X = (X1, X2, ..., Xp) are independent random variables and Ψk is the tensor product of orthogonal polynomials φ, so that

Ψk(x) = Π_{i=1}^{p} φ_{α_i^(k)}(xi)  (288)

for any x ∈ R^p.
Example Assume that X1 is a standard normal random variable, X2 is a uniform random vari-
able in the interval [−1, 1], and assume that X1 and X2 are independent. Therefore, the gener-
alized polynomial chaos associated with the random variable X = (X1 , X2 ) involves the Hermite
orthogonal polynomials (for X1 ) and the Legendre orthogonal polynomials (for X2 ).
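As a sketch of this mixed basis, the following Monte Carlo check verifies that the tensor product He1(X1)P2(X2) has zero mean under the corresponding product measure; it assumes the HermitePoly and LegendrePoly functions above and a hypothetical sample size.

// A sketch: E(He1(X1)*P2(X2)) should be close to 0 when
// X1 ~ N(0,1) and X2 ~ U(-1,1) are independent.
N = 100000;
X1 = grand ( N , 1 , " nor " , 0 , 1 );
X2 = grand ( N , 1 , " unf " , -1 , 1 );
psi = horner ( HermitePoly ( 1 ) , X1 ) .* horner ( LegendrePoly ( 2 ) , X2 );
disp ( mean ( psi )) // expected : close to 0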
3.5 Transformations
TODO
4 Thanks
TODO
A Gaussian integrals
A.1 Gaussian integral
Proposition A.1. (Gaussian integral)

∫_{-∞}^{+∞} exp(-x^2/2) dx = √(2π).  (289)
Proof. We have,

(∫_{-∞}^{+∞} exp(-x^2/2) dx)^2 = (∫_{-∞}^{+∞} exp(-x^2/2) dx) (∫_{-∞}^{+∞} exp(-x^2/2) dx)  (290)
= (∫_{-∞}^{+∞} exp(-x^2/2) dx) (∫_{-∞}^{+∞} exp(-y^2/2) dy)  (291)
= ∫_{-∞}^{+∞} ∫_{-∞}^{+∞} exp(-(x^2 + y^2)/2) dx dy.  (292)

Consider the polar change of variables
x = r cos(θ),  (293)
y = r sin(θ),  (294)

where r ≥ 0 and θ ∈ [0, 2π]. Its Jacobian is J(r, θ) = r (see, e.g. [6]). Hence,
(∫_{-∞}^{+∞} exp(-x^2/2) dx)^2 = ∫_{0}^{+∞} ∫_{0}^{2π} exp(-r^2/2) r dr dθ  (295)
= 2π ∫_{0}^{+∞} r exp(-r^2/2) dr.  (296)

The change of variable u = r^2/2 implies

∫_{0}^{+∞} r exp(-r^2/2) dr = ∫_{0}^{+∞} exp(-u) du = 1.  (300)
We plug the equation 300 into the previous equality and get 289.
The previous proof does not take into account the improper integrals which appear during the
computation.
A.2 Weighted integral of x^n

Proposition A.2. For any odd integer n, we have

∫_{-∞}^{+∞} x^n exp(-x^2/2) dx = 0.

Proof. If n is odd, the function x^n is antisymmetric, since (-x)^n = -x^n for any x ∈ R. On the other hand, the function exp(-x^2/2) is symmetric, since exp(-(-x)^2/2) = exp(-x^2/2). We consider the change of variable y = -x and get:

∫_{-∞}^{0} x^n exp(-x^2/2) dx = ∫_{0}^{+∞} (-y)^n exp(-(-y)^2/2) dy  (303)
= - ∫_{0}^{+∞} y^n exp(-y^2/2) dy.  (304)

This implies

∫_{-∞}^{+∞} x^n exp(-x^2/2) dx = ∫_{-∞}^{0} x^n exp(-x^2/2) dx + ∫_{0}^{+∞} x^n exp(-x^2/2) dx  (305)
= 0,  (306)

which concludes the proof.
Proposition A.3. For any even integer n, we have

∫_{-∞}^{+∞} x^n exp(-x^2/2) dx = 1 · 3 · 5 · ... · (n-1) √(2π).  (307)

Proof. Let

In = ∫_{-∞}^{+∞} x^n exp(-x^2/2) dx.

Integrating the identity d/dx (x^{n+1} exp(-x^2/2)) = (n+1) x^n exp(-x^2/2) - x^{n+2} exp(-x^2/2) over R implies

In = ∫_{-∞}^{+∞} (x^{n+2}/(n+1)) exp(-x^2/2) dx  (313)
= I_{n+2}/(n+1).  (314)

In other words, I_{n+2} = (n+1) In. Moreover, we know from the proposition A.1 that I0 = √(2π), which concludes the proof.
Consider the Gaussian weight function w(x) = exp(-x^2/2). The consequence of the propositions A.2 and A.3 is that the monomial x^n is in L2w(R), for any integer n.
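These integrals are easy to check numerically; for n = 4, the proposition A.3 gives 1·3·√(2π) ≈ 7.5199, as in this sketch (the truncated bounds are an assumption):

// A sketch: check proposition A.3 for n = 4.
function y = m4 ( x ) , y = x ^4 * exp ( - x ^2/2) , endfunction
disp ([ intg ( -10 , 10 , m4 ) 3 * sqrt ( 2 * %pi ) ])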
References
[1] Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York, ninth Dover printing, tenth GPO printing edition, 1964.
[2] George E. Andrews, Richard Askey, and Ranjan Roy. Special Functions, volume 71 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1999.
[3] G. Arfken. Mathematical Methods for Physicists, 3rd ed. Elsevier Academic Press, 2005.
[4] M. Branicki and A.J Majda. Fundamental limitations of polynomial chaos for uncertainty
quantification in systems with intermittent instabilities, 2012.
[6] Philip J. Davis and Philip Rabinowitz. Methods of Numerical Integration. Academic Press,
New York, 1984.
[7] William Feller. An Introduction to Probability Theory and Its Applications - Volume I. John Wiley and Sons, third edition, 1968.
[8] Nicholas J. Higham. Accuracy and Stability of Numerical Algorithms. Society for Industrial
and Applied Mathematics, Philadelphia, PA, USA, second edition, 2002.
[9] Thomas Y. Hou, Wuan Luo, Boris Rozovskii, and Hao-Min Zhou. Wiener chaos expansions and numerical solutions of randomly forced equations of fluid mechanics. J. Comput. Phys., 216:687-706, 2006.
[10] O.P. Le Maître. Méthodes spectrales pour la propagation d'incertitudes paramétriques dans les modèles numériques, mémoire d'habilitation à diriger des recherches. Springer, 2005.
[11] O.P. Le Maître and O.M. Knio. Spectral Methods for Uncertainty Quantification. Springer, 2010.
[13] Wuan Luo. Wiener Chaos Expansion and Numerical Solutions of Stochastic Partial Differential Equations. PhD thesis, California Institute of Technology, May 2006.
[16] Norbert Wiener. The homogeneous chaos. Amer. J. Math., 60(4):897–936, 1938.
[17] Wikipedia. Hermite polynomials — Wikipedia, The Free Encyclopedia, 2013. [Online; accessed 26-January-2013].
[18] Dongbin Xiu and George Em Karniadakis. The Wiener-Askey polynomial chaos for stochastic differential equations. SIAM J. Sci. Comput., 24(2):619-644, February 2002.
[19] Dongbin Xiu and George Em Karniadakis. Modeling uncertainty in flow simulations via
generalized polynomial chaos. J. Comput. Phys., 187(1):137–167, May 2003.