Rafaely B Fundamentals of Spherical Array Processing
Boaz Rafaely
Fundamentals
of Spherical
Array Processing
Second Edition
Springer Topics in Signal Processing
Volume 16
Series editors
Jacob Benesty, Montreal, Canada
Walter Kellermann, Erlangen, Germany
More information about this series at http://www.springer.com/series/8109
Boaz Rafaely
Department of Electrical and Computer
Engineering
Ben-Gurion University of the Negev
Beer-Sheva, Israel
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To my parents, Nitzan and Rivka Rafaely
Preface
Microphone arrays and associated array processing techniques have been developed
for a wide range of applications over the past few decades. These applications
include speech communication, spatial audio, room acoustics analysis, noise control
and acoustic holography, defense and security, entertainment, and many more. In
the cases of speech in rooms and music in concert halls, the sound tends to travel
throughout the entire enclosed space, producing a three-dimensional sound field.
Microphone arrays that effectively measure and process three-dimensional sound
fields typically require the positioning of microphones within a volume in
three-dimensional space. Planar arrays, mounted on an enclosure wall, have been
studied for several decades; more recently, spherical arrays, in which microphones
are mounted around a rigid sphere, for example, have been developed. These offer
several advantages over classical linear, rectangular, or circular arrays:
(i) The sphere, having complete rotational symmetry, facilitates spatial filtering
or beamforming that can be designed to effectively enhance or attenuate
sources in any direction.
(ii) Array processing and performance analysis can be formulated in the spherical
harmonics domain, which is the Fourier domain for the sphere. This
domain facilitates efficient algorithms and extensive acoustic modeling of
both the array and the surrounding sound field.
(iii) Beamforming can be efficiently implemented by decoupling beam pattern
design from beam pattern steering, therefore providing simplicity and flexibility
in array realization.
These advantages have motivated an increasing number of researchers in recent
years to develop spherical microphone array systems, to study spherical array
configurations, to develop algorithms for spherical arrays, and to apply these arrays
in a wide range of applications. This growing activity has provided the author with
the motivation and inspiration to write this book, with the aim of presenting the
fundamentals of spherical array processing in a tutorial manner suitable for
researchers, graduate students, and engineers interested in this topic.
The first two chapters provide the reader with the necessary mathematical and
physical background, including an introduction to the spherical Fourier transform
and to the formulation of plane-wave sound fields in the spherical harmonics
domain. The third chapter covers the theory of spatial sampling, which becomes
useful when selecting the positions of microphones to sample sound pressure
functions in space. The next chapter presents various spherical array configurations,
including the popular configuration based on a rigid sphere. The fifth chapter
introduces the concept of beamforming and its basic equations, including popular
design methods such as delay-and-sum and regular beamforming. The following
chapter presents methods for the optimal design of beam patterns, formulated to
achieve various objectives such as maximum robustness, maximum directivity, or
minimum side-lobe level. The final chapter develops more advanced array processing
algorithms such as the minimum variance distortionless response (MVDR)
algorithm. These algorithms aim to enhance a desired signal while attenuating
undesired noise components in the sound field by exploring their unique formulation
in the spherical harmonics domain.
My own interest in spherical array processing began during a six-month visit to
the sensory communication group at MIT in 2002, working with Julie Greenberg
and greatly enjoying the stimulating vibe of Boston. I would like to thank Julie for
providing this opportunity, for the hospitality, and for the helpful discussions.
During my visit to Boston I was exposed to the inspiring publications on spherical
arrays by Jens Meyer and Gary Elko. Their pioneering work planted the seeds that
later flourished to an extensive research effort at my lab, the Acoustics Laboratory
at Ben-Gurion University of the Negev. The research at the Acoustics Laboratory
was pursued through an invaluable cooperation with a great number of research
students, postdoctoral researchers, and visitors. The relaxed atmosphere at the lab,
the great teamwork, and the endless discussions were the fuel that kept the writing
of this book viable. I would like to express great thanks to the Acoustics Laboratory
researchers: Dr. Jonathan Sheaffer, Dr. Jonathan Rathsam, Dr. Noam Shabtai, Dr.
Dror Lederman, Dr. Yotam Peled, Dr. Etan Fisher, Dr. Vladimir Tourbabin, Dr. Hai
Morgenstern, Dr. David Alon, Zamir Ben-Hur, Lior Madmoni, Moti Lugasi, Koby
Alhaiany, Mickey Jeffet, Eran Miller, Hanan Beit-On, Itay Ifergan, Ran Weisman,
Tom Shlomo, Amir Musicant, Yoav Biderman, Uri Abend, Elad Cohen, Dima
Lvov, Or Nadiri, Shahar Villeval, Tal Szpruch, Nejem Hulihel, Ilan Ben-Hagai,
Tomer Peleg, Amir Avni, Morag Agmon, Maor Klieder, Dima Haykin, Itai Peer,
and Ilya Balmages. Also, special thanks to Dr. Franz Zotter for the helpful comments
on a draft version of the manuscript made during a visit to the lab. Thanks
also to Debbie Kedar for the prompt and professional editing and proofreading of
this book. Finally, thanks to my family, Vered, Asaf, Yonathan, and Tal, for providing
love therapy that time and again pulled me out of the writing stumbles and
falls.
This second edition of Fundamentals of Spherical Array Processing, in addition
to the correction of all known errors, now includes comprehensive support in
MATLAB. A manual has been developed that includes MATLAB code to reproduce
all examples and figures in the book, with the aim of providing MATLAB
support to complement the theory and signal processing methods presented in this
book. The MATLAB manual is provided as additional material to this book and can
be downloaded from https://www.mathworks.com or from the author’s website
http://www.ee.bgu.ac.il/~br.
Contents
1 Mathematical Background
  1.1 Functions on the Sphere
  1.2 Spherical Harmonics
  1.3 Exponential and Legendre Functions
  1.4 Spherical Fourier Transform
  1.5 Some Useful Functions
  1.6 Rotation of Functions
  1.7 Spherical Convolution and Correlation
  References
2 Acoustical Background
  2.1 The Acoustic Wave Equation
  2.2 Spherical Bessel and Hankel Functions
  2.3 A Single Plane Wave
  2.4 Plane-Wave Composition
  2.5 Point Sources
  2.6 Sound Pressure Around a Rigid Sphere
  2.7 Translation of Fields
  References
3 Sampling the Sphere
  3.1 Sampling Order-Limited Functions
  3.2 Equal-Angle Sampling
  3.3 Gaussian Sampling
  3.4 Uniform and Nearly-Uniform Sampling
  3.5 Numerical Computation of Sampling Weights
  3.6 The Discrete Spherical Fourier Transform
  3.7 Spatial Aliasing
  References
Abstract This chapter provides the mathematical background necessary for studying
spherical array processing. Spherical arrays typically sample functions on a
sphere (e.g. sound pressure); therefore, this chapter begins by presenting the spherical
coordinate system, as well as some examples of functions on the sphere. Spherical
harmonics are a central theme of this book as they form a basis for representing
functions on the sphere. Therefore, spherical harmonics are first defined and illustrated,
and then an introduction to the spherical Fourier transform and a description
of functions on the sphere in Hilbert space follows. The chapter concludes with a
presentation of the topics of rotation, convolution, and correlation defined for functions
on the sphere.
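1.1 Functions on the Sphere

A position in the Cartesian coordinate system is denoted by

$$\mathbf{x} \equiv (x, y, z) \in \mathbb{R}^3. \tag{1.1}$$

The unit sphere, $S^2$, is defined as

$$S^2 \equiv \{\mathbf{x} \in \mathbb{R}^3 : \|\mathbf{x}\| = 1\}, \tag{1.2}$$

which represents all positions having unit distance from the origin, with $\|\cdot\|$ denoting
the Euclidean norm. Positions on $S^2$ can be denoted by elevation and azimuth angles,
$\theta$ and $\phi$, which define the spherical coordinates, together with the radial distance (or
radius), $r$:

$$\mathbf{r} \equiv (r, \theta, \phi). \tag{1.3}$$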
Fig. 1.1 Spherical coordinate system defined relative to the Cartesian coordinate system
Fig. 1.2 Plot of function $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$ over the surface of a unit sphere
The azimuth angle φ is measured from the x-axis towards the y-axis, while the
elevation angle θ is measured downwards from the z-axis, as illustrated in Fig. 1.1.
Strictly speaking, θ denotes inclination, but the term elevation will be used in this
book.
A position r = (r, θ, φ) represented in spherical coordinates can be related to the
same position represented in Cartesian coordinates x = (x, y, z) using
$$
\begin{aligned}
x &= r \sin\theta \cos\phi \\
y &= r \sin\theta \sin\phi \\
z &= r \cos\theta.
\end{aligned}
\tag{1.4}
$$
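As a minimal sketch of Eq. (1.4), the conversion can be written in a few lines of Python. The function names `sph_to_cart` and `cart_to_sph` are illustrative choices, and the inverse mapping is a standard construction that is not given in the text.

```python
import math

def sph_to_cart(r, theta, phi):
    """Eq. (1.4): theta is measured down from the z-axis, phi from the x-axis towards y."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cart_to_sph(x, y, z):
    """Inverse of Eq. (1.4) (assumed standard form, not taken from the text)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0
    phi = math.atan2(y, x) % (2 * math.pi)   # wrap azimuth into [0, 2*pi)
    return r, theta, phi

# A point on the equator (theta = 90 degrees) at azimuth 90 degrees lies on the +y axis.
point = sph_to_cart(1.0, math.pi / 2, math.pi / 2)
round_trip = cart_to_sph(*sph_to_cart(2.0, 0.7, 4.0))
```

The round trip recovers $(r, \theta, \phi)$ as long as $\theta$ is kept in $[0, \pi]$ and $\phi$ in $[0, 2\pi)$.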
Fig. 1.3 Plot of function $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$ over the $\theta\phi$ plane
Spherical functions, or functions defined over the unit sphere, are central to this book.
An example of a function over the sphere is

$$f(\theta, \phi) = \sin^2\theta \cos(2\phi). \tag{1.5}$$

The function can be presented graphically in various ways: as a color map on the
surface of a unit sphere, as in Fig. 1.2, as a color contour map on a $\theta\phi$ plane mapping
the surface of a unit sphere, as in Fig. 1.3, and with magnitude denoted by the
distance from the origin (balloon plot), as in Fig. 1.4. In the latter plot, cyan (green-blue)
shades represent positive values, while magenta (purple-red) shades represent
negative values. All three figures show one maximum and two zeros over $\theta$, due to
the $\sin^2\theta$ term in the range $\theta \in [0, \pi]$, and two maxima, two minima and four zeros
over $\phi$, due to the $\cos(2\phi)$ term in the range $\phi \in [0, 2\pi]$.
In this book more than one single notation is used to represent functions on the
unit sphere. A common notation uses the angles of the spherical coordinate system
directly, i.e.
$$f(\theta, \phi), \quad (\theta, \phi) \in S^2. \tag{1.6}$$
Sometimes a more compact notation is desired, in which case the two angles will be
denoted by a single parameter, e.g. μ ≡ μ(θ, φ), using the function representation
Fig. 1.4 Balloon plot of function $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$, with the distance from the origin
defined by $|f(\theta, \phi)|$, and with cyan (green-blue) shades representing positive values of $f$, and
magenta (purple-red) shades representing negative values of $f$
1.2 Spherical Harmonics

In the sections that follow, functions on the unit sphere are presented as a weighted
sum of a set of basis functions, also forming the Fourier basis for functions on the
sphere. These basis functions are the spherical harmonics, defined as follows [13]:

$$Y_n^m(\theta, \phi) \equiv \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \, P_n^m(\cos\theta) \, e^{im\phi}, \tag{1.9}$$

where $(\cdot)!$ represents the factorial function, $P_n^m(\cdot)$ are the associated Legendre functions,
$m \in \mathbb{Z}$ is an integer denoting the function degree and $n \in \mathbb{N}$ is a natural number
denoting the function order.
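The definition in Eq. (1.9) can be sketched numerically using only the Python standard library. This is an illustrative implementation, not the book's MATLAB code: the names `assoc_legendre` and `ynm` are hypothetical, and the associated Legendre function is computed with a standard upward recurrence in the order, assuming the $(-1)^m$ phase factor of Eq. (1.30) and the negative-degree relation of Eq. (1.31).

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) including the (-1)^m factor of Eq. (1.30); m < 0 via Eq. (1.31)."""
    if abs(m) > n:
        return 0.0
    if m < 0:
        ratio = math.factorial(n + m) / math.factorial(n - m)
        return (-1) ** (-m) * ratio * assoc_legendre(n, -m, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):            # P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev    # P_{m+1}^m
    for l in range(m + 2, n + 1):        # upward recurrence in the order l
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr

def ynm(n, m, theta, phi):
    """Spherical harmonic Y_n^m(theta, phi) following Eq. (1.9)."""
    if abs(m) > n:
        return 0j
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

y00 = ynm(0, 0, 0.3, 1.2)           # constant, equal to 1 / sqrt(4*pi)
y22 = ynm(2, 2, math.pi / 2, 0.0)   # on the equator at phi = 0
```

The recurrence is adequate for the low orders used in this chapter; for high orders a normalized recurrence would be a safer choice because the factorials grow quickly.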
Table 1.1 presents expressions for the spherical harmonics of orders zero to four
[11]. Note that the spherical harmonics have a complex exponential dependence on
$\phi$, so that the absolute value, $|Y_n^m(\theta, \phi)|$, will be constant along $\phi$. Therefore, plots of
the real and imaginary parts of the spherical harmonics are typically presented, rather
than plots of the magnitude and phase. The order $n$ determines the highest power of
the $\cos\theta$ and $\sin\theta$ terms controlling the dependence of the spherical harmonics over
$\theta$, while $m$ determines the dependence over $\phi$ through the exponential term $e^{im\phi}$.
Figure 1.5 presents balloon plots of the real and imaginary parts of the spherical
harmonics, $\mathrm{Re}\{Y_n^m(\theta, \phi)\}$ and $\mathrm{Im}\{Y_n^m(\theta, \phi)\}$, with a view angle of
$(\theta, \phi) = (60^\circ, -127.5^\circ)$. The rows in the figure present plots for $n = 0$ (top row) to
$n = 4$ (bottom row), while the columns present plots for $m = -n$ (leftmost column)
to $m = n$ (rightmost column). $\mathrm{Im}\{Y_n^m(\theta, \phi)\}$ is presented for $m < 0$, $\mathrm{Re}\{Y_n^m(\theta, \phi)\}$ is
presented for $m > 0$, and $Y_n^0(\theta, \phi)$, which is real, is presented in the central column.
Table 1.2 explicitly illustrates the functions presented in Fig. 1.5, for clarity. Figure
1.5 shows that $Y_0^0$ is constant over the sphere, similar to a monopole function. The
real and imaginary parts of the spherical harmonics of order $n = 1$ have dipole-like
shapes, while higher orders have more complex forms, with the number of lobes
increasing with $n$ and $m$.

Fig. 1.5 Balloon plot of the spherical harmonics for $n = 0$ (top row) to $n = 4$ (bottom row), with
$Y_n^0(\theta, \phi)$, which is a real function, presented in the central column. $\mathrm{Im}\{Y_n^m(\theta, \phi)\}$ for $-n \le m \le -1$
are presented in the left-hand side columns, and $\mathrm{Re}\{Y_n^m(\theta, \phi)\}$ for $1 \le m \le n$ are presented in the
right-hand side columns. The view direction is indicated by the orientation of the axes presented at
the top of the figure. Colors indicate the sign of the spherical harmonic functions, with cyan (green-blue)
shades representing positive values, and magenta (purple-red) shades representing negative
values
With the aim of visualizing the spherical harmonics more clearly, Fig. 1.6 presents
the spherical harmonics in a similar manner to Fig. 1.5, but as viewed from the z-axis,
i.e. downwards from above. In this case, the behavior of the real and imaginary
parts of the spherical harmonics over the azimuth angle $\phi$ is illustrated clearly. All
spherical harmonics at $m = 0$ are constant over $\phi$, while exhibiting $\cos(m\phi)$ behavior
for the real parts, and $\sin(m\phi)$ behavior for the imaginary parts. The plots on the left
side (imaginary part, $m < 0$) are therefore rotated versions of the plots on the right
side (real part, $m > 0$), by $90^\circ/m$.

Fig. 1.6 Same as Fig. 1.5, but viewed from the z-axis (top view)

Figures 1.7 and 1.8 follow the same approach as Fig. 1.6, but with x-axis and y-axis
viewpoints, respectively, showing the dependence on $\theta$ more clearly. Spherical
harmonics $Y_n^0$ have a high value around $\theta = 0$ and $\theta = \pi$ due to the $\cos^n\theta$ terms. The
behavior of the other spherical harmonics is more complex. For example, spherical
harmonics $Y_n^n$ and $Y_n^{-n}$ have a $\sin^n\theta$ dependence, producing “flat” looking functions
from the viewpoints shown in Figs. 1.7 and 1.8.

Fig. 1.7 Same as Fig. 1.5, but viewed from the x-axis (front view)

Fig. 1.8 Same as Fig. 1.5, but viewed from the y-axis (side view)
Some of the properties of the spherical harmonics are presented next, starting with
basic properties and progressing to properties involving integration and summation.
• Complex conjugate. The spherical harmonics are complex functions due to the complex
exponential term, $e^{im\phi}$, while the associated Legendre functions, $P_n^m(\cos\theta)$,
are all real. The complex conjugate of the spherical harmonics takes the form

$$[Y_n^m(\theta, \phi)]^* = (-1)^m Y_n^{-m}(\theta, \phi), \tag{1.10}$$

which is derived from the expression of the associated Legendre function for
negative values of $m$ [see Eq. (1.31)]. The complex conjugate property also defines
the relation between $Y_n^m(\theta, \phi)$ and $Y_n^{-m}(\theta, \phi)$, which are spherical harmonics of
the same order and opposite degrees.
• Limit on degree value, $m$. By definition, spherical harmonics with a degree that is
higher in magnitude than the order are zero, i.e.

$$Y_n^m(\theta, \phi) = 0, \quad |m| > n. \tag{1.11}$$
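• Zeros of the spherical harmonics. The spherical harmonics contain $\sin^{|m|}\theta$ terms,
defining the zeros of the function for $m \neq 0$, i.e.

$$Y_n^m(0, \phi) = Y_n^m(\pi, \phi) = 0, \quad m \neq 0. \tag{1.12}$$

• Spherical harmonics at $m = 0$. Substituting $m = 0$ in Eq. (1.9) gives

$$Y_n^0(\theta, \phi) = \sqrt{\frac{2n+1}{4\pi}} \, P_n(\cos\theta). \tag{1.13}$$

These spherical harmonics are not dependent on $\phi$ and are therefore axis-symmetric
relative to the z-axis. This is clearly illustrated in Figs. 1.7 and 1.8,
by the spherical harmonic functions in the central columns.
• Spherical harmonics at $m = n$ and $m = -n$. At these extreme values of $m$, the
spherical harmonics have a sine dependence on $\theta$ and a simplified form:

$$
\begin{aligned}
Y_n^{-n}(\theta, \phi) &= \frac{1}{2^{n+1} n!} \sqrt{\frac{(2n+1)!}{\pi}} \, \sin^n\theta \, e^{-in\phi} \\
Y_n^{n}(\theta, \phi) &= \frac{(-1)^n}{2^{n+1} n!} \sqrt{\frac{(2n+1)!}{\pi}} \, \sin^n\theta \, e^{in\phi}.
\end{aligned}
\tag{1.14}
$$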
• Mirror symmetry along $\theta$ with respect to the equator, $\theta = \pi/2$. The spherical harmonics
have a mirror symmetry in $\theta$, such that the function on the upper hemisphere
is equal to the function on the lower hemisphere, up to a sign factor:

$$Y_n^m(\pi - \theta, \phi) = (-1)^{n+m} Y_n^m(\theta, \phi). \tag{1.15}$$

This symmetry is clearly illustrated in Figs. 1.7 and 1.8 by the real and imaginary
parts of the spherical harmonics, in which the sign is indicated by color. For even
$n + m$ the functions are symmetric about the equator, whereas for odd $n + m$ the
functions are antisymmetric about the equator.
• Symmetry with respect to $\phi$. The spherical harmonics have mirror symmetry with
respect to $\phi$ due to the exponential function, such that

$$Y_n^m(\theta, \phi + \pi) = (-1)^m Y_n^m(\theta, \phi). \tag{1.16}$$

This property is illustrated in Fig. 1.6, where spherical harmonic functions for
even values of $m$ are equal at opposite sides of the circle defined by $\phi$, while for
odd values of $m$ the functions have the opposite sign (different color) at a phase
shift of $180^\circ$ along $\phi$.
Another symmetry along $\phi$ is defined relative to the x-axis, again due to the
behavior of the exponential function:

$$Y_n^m(\theta, -\phi) = [Y_n^m(\theta, \phi)]^*. \tag{1.17}$$

Figure 1.6 illustrates that the real part of the spherical harmonics, plotted in the
right-hand side columns, is symmetric about the x-axis, while the imaginary part
is antisymmetric.
• Opposite direction. Combining the last two properties, Eqs. (1.15) and (1.16), the
spherical harmonics at $(\pi - \theta, \phi + \pi)$, which is the opposite direction to $(\theta, \phi)$,
can be written as

$$Y_n^m(\pi - \theta, \phi + \pi) = (-1)^n Y_n^m(\theta, \phi). \tag{1.18}$$

• Periodicity with respect to $\phi$. The spherical harmonics are periodic with respect to
$\phi$ with a period of $2\pi/m$, due to the exponential term $e^{im\phi}$, and therefore satisfy

$$Y_n^m(\theta, \phi + 2\pi/m) = Y_n^m(\theta, \phi), \quad m \neq 0. \tag{1.19}$$

The periodicity is illustrated in Fig. 1.6, where, for example, the spherical harmonics
in the central column with $m = 0$ are constant along $\phi$, spherical harmonics
corresponding to $m = \pm 1$ have a period of $2\pi$, those corresponding to $m = \pm 2$
have a period of $\pi$, and so on.
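The conjugate and symmetry properties listed above, Eqs. (1.10), (1.15), (1.16) and (1.17), can be spot-checked numerically. The sketch below is only an illustration using the Python standard library: `assoc_legendre`, `ynm` and `max_property_error` are hypothetical helper names, and the test angles are arbitrary.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) with the (-1)^m factor of Eq. (1.30); m < 0 via Eq. (1.31)."""
    if abs(m) > n:
        return 0.0
    if m < 0:
        ratio = math.factorial(n + m) / math.factorial(n - m)
        return (-1) ** (-m) * ratio * assoc_legendre(n, -m, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr

def ynm(n, m, theta, phi):
    """Spherical harmonic following Eq. (1.9)."""
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def max_property_error(max_order=4):
    """Largest deviation from Eqs. (1.10), (1.15), (1.16), (1.17) over a few angles."""
    worst = 0.0
    for n in range(max_order + 1):
        for m in range(-n, n + 1):
            for theta in (0.3, 1.1, 2.4):
                for phi in (0.2, 1.7, 4.0):
                    y = ynm(n, m, theta, phi)
                    errors = (
                        abs(y.conjugate() - (-1) ** m * ynm(n, -m, theta, phi)),         # (1.10)
                        abs(ynm(n, m, math.pi - theta, phi) - (-1) ** (n + m) * y),      # (1.15)
                        abs(ynm(n, m, theta, phi + math.pi) - (-1) ** m * y),            # (1.16)
                        abs(ynm(n, m, theta, -phi) - y.conjugate()),                     # (1.17)
                    )
                    worst = max(worst, *errors)
    return worst

worst_error = max_property_error()
```

A value of `worst_error` near machine precision indicates that the four identities hold for all orders up to four at the sampled angles; it is a sanity check, not a proof.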
The next set of properties is related to the integration of the spherical harmonic
functions over the unit sphere. In general, integration over a sphere of radius r can
be calculated by dividing the sphere area into elements, as illustrated in Fig. 1.9. The
length along φ of each element on the sphere surface is given by r sin θ dφ, denoting
the fact that the elements are narrower in the azimuth dimension nearer the poles.
The width along θ of each element is given by r dθ . The area element is therefore
defined as
$$r^2 \, d\Omega = r^2 \sin\theta \, d\theta \, d\phi, \tag{1.20}$$
where Ω is the solid angle and dΩ is the area element covered by sin θ dθ dφ on a unit
sphere. With a finer grid on the sphere surface, and elements becoming infinitesimally
small, the area can be calculated by integrating over the entire sphere surface:
$$\int_{S^2} r^2 \, d\Omega = r^2 \int_0^{2\pi} \int_0^{\pi} \sin\theta \, d\theta \, d\phi = r^2 \int_0^{2\pi} \int_{-1}^{1} dz \, d\phi = 4\pi r^2, \tag{1.21}$$
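The grid-refinement argument behind Eq. (1.21) can be illustrated by summing the area elements of Eq. (1.20) on a finite grid. This is a rough sketch using a midpoint rule; the function name `sphere_area` and the grid sizes are arbitrary choices.

```python
import math

def sphere_area(r, n_theta=200, n_phi=16):
    """Sum the area elements r^2 sin(theta) d_theta d_phi of Eq. (1.20) on a midpoint grid."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta          # midpoint of each elevation strip
        for _ in range(n_phi):               # elements in a strip all have the same area
            total += r * r * math.sin(theta) * d_theta * d_phi
    return total

area = sphere_area(2.0)
exact = 4 * math.pi * 2.0 ** 2
```

As the number of elevation strips grows, `area` approaches `exact`; the azimuth resolution does not matter here because the integrand is independent of $\phi$.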
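• Integral over the sphere. The integral of the spherical harmonics over the unit sphere is given by

$$\int_0^{2\pi} \int_0^{\pi} Y_n^m(\theta, \phi) \sin\theta \, d\theta \, d\phi = \sqrt{4\pi} \, \delta_{n0} \delta_{m0}, \tag{1.22}$$

where $\delta_{n0}$ is the Kronecker delta function, which is zero for all $n$ except for $n = 0$.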
• Orthogonality of spherical harmonics. The previous property can be easily derived
from the orthogonality property of the spherical harmonics over the sphere surface,
given by

$$\int_0^{2\pi} \int_0^{\pi} Y_{n'}^{m'}(\theta, \phi) \, [Y_n^m(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi = \delta_{nn'} \delta_{mm'}, \tag{1.23}$$

where $\delta_{nn'}$ is equal to unity for $n = n'$ and zero otherwise. Although spherical
harmonics are normalized to maintain orthonormality, the term orthogonality will
be used in this book.
• Completeness of spherical harmonics. The completeness property states that

$$\sum_{n=0}^{\infty} \sum_{m=-n}^{n} Y_n^m(\theta, \phi) \, [Y_n^m(\theta', \phi')]^* = \delta(\cos\theta - \cos\theta') \, \delta(\phi - \phi'), \tag{1.24}$$

where $\delta(\cos\theta)\delta(\phi)$ is the Dirac delta function on the sphere, which is zero everywhere
on the sphere except at $(\theta, \phi) = (\pi/2, 0)$, and satisfies

$$\int_0^{2\pi} \int_0^{\pi} \delta(\cos\theta) \delta(\phi) \sin\theta \, d\theta \, d\phi = \int_0^{2\pi} \int_{-1}^{1} \delta(z) \delta(\phi) \, dz \, d\phi = 1, \tag{1.25}$$

where $z = \cos\theta$ was used to remove the dependence of the Dirac delta function
on the cosine function.
• Spherical harmonics addition theorem. Another property related to completeness
is the addition theorem, which involves a summation over $m$:

$$\sum_{m=-n}^{n} Y_n^m(\theta, \phi) \, [Y_n^m(\theta', \phi')]^* = \frac{2n+1}{4\pi} P_n(\cos\Theta), \tag{1.26}$$

where

$$\cos\Theta = \cos\theta \cos\theta' + \cos(\phi - \phi') \sin\theta \sin\theta', \tag{1.27}$$

$\Theta$ is the angle between $(\theta, \phi)$ and $(\theta', \phi')$ and $P_n(\cdot)$ is the Legendre polynomial.
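The addition theorem, Eqs. (1.26) and (1.27), lends itself to a direct numerical check: the sum over $m$ on the left should match the single Legendre polynomial on the right for any pair of directions. The following is a sketch using only the Python standard library, with hypothetical helper names (`assoc_legendre`, `ynm`, `addition_theorem_gap`) and arbitrarily chosen directions.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) with the (-1)^m factor of Eq. (1.30); m < 0 via Eq. (1.31)."""
    if abs(m) > n:
        return 0.0
    if m < 0:
        ratio = math.factorial(n + m) / math.factorial(n - m)
        return (-1) ** (-m) * ratio * assoc_legendre(n, -m, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr

def ynm(n, m, theta, phi):
    """Spherical harmonic following Eq. (1.9)."""
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def addition_theorem_gap(n, theta, phi, theta_p, phi_p):
    """Absolute difference between the two sides of Eq. (1.26)."""
    lhs = sum(ynm(n, m, theta, phi) * ynm(n, m, theta_p, phi_p).conjugate()
              for m in range(-n, n + 1))
    cos_big_theta = (math.cos(theta) * math.cos(theta_p)
                     + math.cos(phi - phi_p) * math.sin(theta) * math.sin(theta_p))  # Eq. (1.27)
    rhs = (2 * n + 1) / (4 * math.pi) * assoc_legendre(n, 0, cos_big_theta)           # P_n = P_n^0
    return abs(lhs - rhs)

gaps = [addition_theorem_gap(n, 0.4, 1.0, 2.0, 5.0) for n in range(5)]
```

Because the right-hand side depends only on the angle $\Theta$ between the two directions, the same check also illustrates that the sum over $m$ is invariant to a joint rotation of both directions.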
1.3 Exponential and Legendre Functions

The properties of the spherical harmonic functions presented in Sect. 1.2 are the direct
result of the properties of the functions that compose the spherical harmonics, i.e.
the complex exponential $e^{im\phi}$, the associated Legendre function $P_n^m(\cos\theta)$ and the
Legendre polynomials, $P_n(\cos\theta)$, for $m = 0$. Therefore, these functions and some
of their properties are presented in this section.
The complex exponential, widely used in signal processing, forms a complete and
orthogonal basis for functions on the circle, i.e.

$$\sum_{m=-\infty}^{\infty} e^{-im\phi'} e^{im\phi} = 2\pi \delta(\phi - \phi') \tag{1.28}$$

and

$$\frac{1}{2\pi} \int_0^{2\pi} e^{-im\phi} e^{im'\phi} \, d\phi = \delta_{mm'}. \tag{1.29}$$

The associated Legendre functions, for $m \ge 0$, are written in terms of the Legendre polynomials as

$$P_n^m(x) = (-1)^m (1 - x^2)^{m/2} \frac{d^m}{dx^m} P_n(x), \quad x \in [-1, 1]. \tag{1.30}$$
Table 1.3 presents expressions for the associated Legendre function for orders zero to
four. Figure 1.10 presents plots of $P_n^m(x)$ for $m \ge 0$. Associated Legendre functions
for negative values of $m$ are proportional to the same functions with a positive value
of $m$, and are given by

$$P_n^{-m}(x) = (-1)^m \frac{(n-m)!}{(n+m)!} P_n^m(x). \tag{1.31}$$

They are therefore not illustrated in Fig. 1.10. The behavior of $P_n^m(x)$ illustrated in
the curves in Fig. 1.10 is responsible for the behavior of the spherical harmonics over
the elevation angle $\theta$, as illustrated in Figs. 1.7 and 1.8.

Fig. 1.10 Associated Legendre function $P_n^m(x)$, with $(n, m)$ denoted on each figure
The associated Legendre functions for different orders $n$ and the same degree $m$
are orthogonal under the integral satisfying

$$\int_{-1}^{1} P_n^m(x) P_{n'}^m(x) \, dx = \frac{2}{2n+1} \frac{(n+m)!}{(n-m)!} \delta_{nn'}, \quad -n \le m \le n. \tag{1.32}$$
This property is responsible for the orthogonality of the spherical harmonics, Eq.
(1.23), when integrating along θ . Combining Eqs. (1.32) and (1.29) (the orthogonality
of the exponential functions) one can directly derive the orthogonality of the spherical
harmonics, Eq. (1.23).
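The orthogonality relation of Eq. (1.32) can be checked with a simple numerical integration. The sketch below assumes a composite Simpson rule and the recurrence-based `assoc_legendre` helper (both illustrative choices, restricted here to $m \ge 0$); since the integrand is a polynomial in $x$ for equal degrees, the quadrature converges quickly.

```python
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n with the (-1)^m factor of Eq. (1.30)."""
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr

def simpson(func, a, b, intervals=2000):
    """Composite Simpson rule on [a, b]; intervals must be even."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def legendre_inner(n, n_prime, m):
    """Left-hand side of Eq. (1.32), evaluated numerically."""
    return simpson(lambda x: assoc_legendre(n, m, x) * assoc_legendre(n_prime, m, x), -1.0, 1.0)

def legendre_norm(n, m):
    """Right-hand side of Eq. (1.32) for n = n'."""
    return 2 / (2 * n + 1) * math.factorial(n + m) / math.factorial(n - m)

same_order = legendre_inner(3, 3, 2)       # expected to match legendre_norm(3, 2)
different_order = legendre_inner(2, 4, 1)  # expected to vanish
```

Note that Eq. (1.32) concerns different orders with the same degree; functions with different degrees are made orthogonal on the sphere by the exponential term, Eq. (1.29), rather than by the integral over $x$.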
The values of the associated Legendre function for $m = 0$, i.e. $P_n^0(x)$, or the values
of the spherical harmonics for $m = 0$, i.e. $Y_n^0(\theta, \phi)$, are determined by the Legendre
polynomials that satisfy

$$P_n(x) = P_n^0(x). \tag{1.33}$$

Table 1.3 (surviving rows): $P_4^3(x) = -105x(1 - x^2)^{3/2}$, $P_4^4(x) = 105(1 - x^2)^2$

Table 1.4 presents expressions for the Legendre polynomials for orders zero to four.
Figure 1.11 presents plots of $P_n(x)$ for $n = 0, \ldots, 4$. Note that these curves are the
same as the curves presented in Fig. 1.10, left column, for the associated Legendre
function.
The Legendre polynomials can be derived directly through the following differentiation
formula:

$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n. \tag{1.34}$$
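As a short worked example of Eq. (1.34), the case $n = 2$ can be carried out by hand by expanding the power and differentiating twice:

```latex
P_2(x) = \frac{1}{2^2\, 2!} \frac{d^2}{dx^2} (x^2 - 1)^2
       = \frac{1}{8} \frac{d^2}{dx^2} \left( x^4 - 2x^2 + 1 \right)
       = \frac{1}{8} \left( 12x^2 - 4 \right)
       = \frac{1}{2} \left( 3x^2 - 1 \right).
```

The result satisfies $P_2(1) = 1$, in line with the normalization $P_n(1) = 1$ of the Legendre polynomials.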
The Legendre polynomials form a complete and orthogonal set of basis functions
over the line section $x \in [-1, 1]$. They are in $L^2([-1, 1])$, the space of square-integrable
functions on this line section, and satisfy [1]

$$\sum_{n=0}^{\infty} \frac{2n+1}{2} P_n(x) P_n(x') = \delta(x - x'), \tag{1.35}$$

$$\int_{-1}^{1} P_n(x) P_{n'}(x) \, dx = \frac{2}{2n+1} \delta_{nn'}. \tag{1.36}$$

Therefore, one can define a Legendre transform, or Fourier Legendre series [1], as
will be presented in Eq. (1.48). Substituting Eq. (1.26) into Eq. (1.24), or simply
substituting $x' = 1$ and $P_n(1) = 1$ [1] in Eq. (1.35), leads to

$$\sum_{n=0}^{\infty} \frac{2n+1}{2} P_n(x) = \delta(x - 1). \tag{1.37}$$
$$\sum_{n=0}^{N} (2n+1) P_n(x) P_n(x') = \frac{N+1}{x - x'} \left[ P_{N+1}(x) P_N(x') - P_N(x) P_{N+1}(x') \right], \tag{1.38}$$
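The finite summation formula of Eq. (1.38) can be verified numerically for a given order $N$. The sketch below computes the Legendre polynomials with the standard three-term recurrence (an assumed choice, not taken from the text); the names `legendre`, `finite_sum` and `closed_form` are illustrative.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term (Bonnet) recurrence."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(2, n + 1):
        p_prev, p_curr = p_curr, ((2 * k - 1) * x * p_curr - (k - 1) * p_prev) / k
    return p_curr

def finite_sum(N, x, x_prime):
    """Left-hand side of Eq. (1.38)."""
    return sum((2 * n + 1) * legendre(n, x) * legendre(n, x_prime) for n in range(N + 1))

def closed_form(N, x, x_prime):
    """Right-hand side of Eq. (1.38); requires x != x_prime."""
    return (N + 1) / (x - x_prime) * (
        legendre(N + 1, x) * legendre(N, x_prime)
        - legendre(N, x) * legendre(N + 1, x_prime)
    )

lhs = finite_sum(5, 0.3, -0.7)
rhs = closed_form(5, 0.3, -0.7)
```

The closed form is numerically delicate when $x$ and $x'$ are very close, since it divides a small difference by a small number; in that regime the direct sum is the safer evaluation.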
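1.4 Spherical Fourier Transform

A function $f(\theta, \phi) \in L^2(S^2)$ can be represented as a weighted sum of spherical harmonics,

$$f(\theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm} Y_n^m(\theta, \phi), \tag{1.40}$$

where $f_{nm}$ are the weights. These weights form the spherical Fourier transform of
$f(\theta, \phi)$ and can be derived from $f(\theta, \phi)$ by

$$f_{nm} = \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \, [Y_n^m(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi. \tag{1.41}$$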
Equations (1.41) and (1.40) form the spherical Fourier transform and its inverse,
respectively. Although denoted in this book (and elsewhere) as “transform”, Fourier
series may be a more suitable name, as Eq. (1.40) involves a summation rather than
an integral, f nm is discrete rather than continuous, and f (θ, φ) has a finite support
over (θ, φ), similar to Fourier series representations of periodic functions over R.
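To make the transform of Eq. (1.41) concrete, the sketch below approximates the integral by a simple quadrature and applies it to the example function of this chapter, $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$. The quadrature (midpoint rule in $\theta$, uniform samples in $\phi$), the grid sizes and the helper names are assumptions made for illustration only; sampling schemes for the sphere are the subject of a later chapter.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) with the (-1)^m factor of Eq. (1.30); m < 0 via Eq. (1.31)."""
    if abs(m) > n:
        return 0.0
    if m < 0:
        ratio = math.factorial(n + m) / math.factorial(n - m)
        return (-1) ** (-m) * ratio * assoc_legendre(n, -m, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr

def ynm(n, m, theta, phi):
    """Spherical harmonic following Eq. (1.9)."""
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def sft(func, n, m, n_theta=200, n_phi=32):
    """Approximate Eq. (1.41) by a midpoint rule in theta and uniform samples in phi."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for k in range(n_phi):
            phi = k * d_phi
            total += func(theta, phi) * ynm(n, m, theta, phi).conjugate() * math.sin(theta)
    return total * d_theta * d_phi

def example(theta, phi):
    return math.sin(theta) ** 2 * math.cos(2 * phi)

f_22 = sft(example, 2, 2)
f_2m2 = sft(example, 2, -2)
f_20 = sft(example, 2, 0)
f_00 = sft(example, 0, 0)
```

Only the coefficients with $n = 2$ and $m = \pm 2$ are non-zero for this function, each equal to $\sqrt{8\pi/15}$, which follows from writing $\cos(2\phi)$ as a sum of two complex exponentials and comparing with Eq. (1.14) for $n = 2$.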
The requirement that $f(\theta, \phi) \in L^2(S^2)$ is also a sufficient condition for a bounded
spherical Fourier transform, i.e. $|f_{nm}| < \infty$, $n \in \mathbb{N}$, $-n \le m \le n$. The Cauchy-Schwarz
inequality is employed in the proof, as follows:
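$$
\begin{aligned}
|f_{nm}|^2 &= \left| \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \, [Y_n^m(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi \right|^2 \\
&\le \int_0^{2\pi} \int_0^{\pi} |f(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi \times \int_0^{2\pi} \int_0^{\pi} |Y_n^m(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi} \int_0^{\pi} |f(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi < \infty,
\end{aligned}
\tag{1.42}
$$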
where the orthogonality property, as in Eq. (1.23), has been used to evaluate the
integral over $|Y_n^m(\theta, \phi)|^2$, and $f \in L^2(S^2)$ has been substituted in deriving the final
inequality. Equation (1.42) shows that any function in $L^2(S^2)$ will have a spherical
Fourier transform with bounded coefficients. This is clearly a sufficient condition
and not a necessary condition. For example, $f(\theta, \phi) = \delta(\cos\theta - \cos\theta')\delta(\phi - \phi')$
is not in $L^2(S^2)$, since the integral over the square of a delta function diverges.
However, the spherical harmonic coefficients in this case are $f_{nm} = [Y_n^m(\theta', \phi')]^*$, as
can be deduced from Eq. (1.24), and are bounded for all $n$ and $m$.
Some of the properties of the spherical Fourier transform and of functions defined
as a linear combination of spherical harmonics are outlined next.
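• Parseval’s relation. Orthogonality and completeness of the spherical harmonics
have been presented in Eqs. (1.23) and (1.24), respectively. Parseval’s relation
follows directly:

$$\int_0^{2\pi} \int_0^{\pi} |f(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} |f_{nm}|^2, \tag{1.43}$$

and, for two functions $f$ and $g$,

$$\int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \, [g(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm} \, g_{nm}^*. \tag{1.44}$$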
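• Linearity. The spherical Fourier transform maintains the property of linearity due
to the integral operation of the transform. This implies that scaling and addition
of two functions lead to scaling and addition of their transforms:

$$\alpha f(\theta, \phi) + \beta g(\theta, \phi) \;\Longleftrightarrow\; \alpha f_{nm} + \beta g_{nm}. \tag{1.45}$$

• Functions independent of $\phi$. For a function that depends only on $\theta$, i.e. $f(\theta, \phi) = f(\theta)$,
the spherical Fourier transform reduces to

$$
\begin{aligned}
f_{nm} &= \int_0^{2\pi} \int_0^{\pi} f(\theta) \, [Y_n^m(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi \\
&= 2\pi \delta_{m0} \sqrt{\frac{2n+1}{4\pi}} \int_0^{\pi} f(\theta) P_n^m(\cos\theta) \sin\theta \, d\theta \\
&= \sqrt{\frac{4\pi}{2n+1}} \, f_n \delta_{m0},
\end{aligned}
\tag{1.47}
$$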
where $f_n$ depends only on $n$. This property has been derived by solving for the
integral over $\phi$. In this case, $f_n$ inherits the properties of the Legendre series [1]:

$$
\begin{aligned}
f_n &= \frac{2n+1}{2} \int_0^{\pi} f(\theta) P_n(\cos\theta) \sin\theta \, d\theta \\
f(\theta) &= \sum_{n=0}^{\infty} f_n P_n(\cos\theta),
\end{aligned}
\tag{1.48}
$$
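• Functions independent of $\theta$. Similarly, for a function that depends only on $\phi$, i.e.
$f(\theta, \phi) = f(\phi)$, the spherical Fourier transform reduces to

$$
\begin{aligned}
f_{nm} &= \int_0^{2\pi} \int_0^{\pi} f(\phi) \, [Y_n^m(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi \\
&= \frac{1}{2\pi} \int_0^{2\pi} f(\phi) e^{-im\phi} \, d\phi \times 2\pi \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \int_0^{\pi} P_n^m(\cos\theta) \sin\theta \, d\theta \\
&= f_m C_{nm},
\end{aligned}
\tag{1.49}
$$

where $f_m$ is the Fourier series coefficient of $f(\phi)$ and $C_{nm}$ is a constant derived by
evaluating the integral over $\theta$ [4]. In this case, $f_m$ inherits all the properties of the
Fourier series:

$$
\begin{aligned}
f_m &= \frac{1}{2\pi} \int_0^{2\pi} f(\phi) e^{-im\phi} \, d\phi \\
f(\phi) &= \sum_{m=-\infty}^{\infty} f_m e^{im\phi},
\end{aligned}
\tag{1.50}
$$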
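$$
\begin{aligned}
f(\theta, \phi) &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm} Y_n^m(\theta, \phi) \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm} Y_n^m(\theta, \pi - \phi) \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm} (-1)^m [Y_n^m(\theta, \phi)]^* \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm} Y_n^{-m}(\theta, \phi) \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{n(-m)} Y_n^m(\theta, \phi),
\end{aligned}
\tag{1.51}
$$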
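$$\int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \, \delta(\cos\theta - \cos\theta') \, \delta(\phi - \phi') \sin\theta \, d\theta \, d\phi = f(\theta', \phi'), \tag{1.52}$$

$$\delta(\cos\theta - \cos\theta') \, \delta(\phi - \phi') = \frac{1}{\sin\theta} \, \delta(\theta - \theta') \, \delta(\phi - \phi'). \tag{1.53}$$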
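As a worked illustration of Parseval's relation, Eq. (1.43), consider the example function of Sect. 1.1, $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$. Writing $\cos(2\phi)$ as a sum of two complex exponentials and using Eq. (1.14) with $n = 2$ shows that $f$ has only two non-zero coefficients, $f_{2,2} = f_{2,-2} = \sqrt{8\pi/15}$. Both sides of Eq. (1.43) can then be evaluated in closed form:

```latex
\int_0^{2\pi}\!\!\int_0^{\pi} |f(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi
  = \int_0^{\pi} \sin^5\theta \, d\theta \int_0^{2\pi} \cos^2(2\phi) \, d\phi
  = \frac{16}{15} \cdot \pi
  = \frac{16\pi}{15},
\qquad
\sum_{n=0}^{\infty} \sum_{m=-n}^{n} |f_{nm}|^2
  = |f_{2,2}|^2 + |f_{2,-2}|^2
  = 2 \cdot \frac{8\pi}{15}
  = \frac{16\pi}{15}.
```

The two results agree, as expected from the orthonormality of the spherical harmonics.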
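Some of the properties of the spherical harmonics and functions defined on the
sphere are a direct result of the spherical harmonics forming a basis in Hilbert space
$L^2(S^2)$, i.e. the space of all square-integrable functions on the unit sphere. The inner
product of two functions in this space is defined as

$$\langle f, g \rangle \equiv \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \, [g(\theta, \phi)]^* \sin\theta \, d\theta \, d\phi. \tag{1.54}$$

The spherical Fourier transform, Eq. (1.41), can therefore be written as an inner product,

$$f_{nm} = \langle f, Y_n^m \rangle, \tag{1.55}$$

such that

$$f = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \langle f, Y_n^m \rangle \, Y_n^m. \tag{1.56}$$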
1.5 Some Useful Functions

Some useful functions defined over the sphere and their spherical Fourier transform
are presented in this section.
• Constant function. A function that is constant along both $\theta$ and $\phi$ can be represented
using only the zero-order spherical harmonics, leading to the following
transform pair:

$$
\begin{aligned}
f(\theta, \phi) &= 1 \\
f_{nm} &= \sqrt{4\pi} \, \delta_{n0} \delta_{m0},
\end{aligned}
\tag{1.57}
$$

which can be derived by noting that $f(\theta, \phi) = 1 = \sqrt{4\pi} \, Y_0^0(\theta, \phi)$, substituting in
Eq. (1.41), and evaluating the integral using the orthogonality property, as in Eq.
(1.23).
• Dirac delta function. The Dirac delta function over the sphere, $\delta(\cos\theta - \cos\theta') \, \delta(\phi - \phi')$,
is considered next. Substituting the Dirac delta function in Eq. (1.41)
(the spherical Fourier transform) and evaluating the integral using the sifting property,
as in Eq. (1.52), the spherical Fourier coefficients for the Dirac delta function
are found to be simply the spherical harmonics:

$$
\begin{aligned}
f(\theta, \phi) &= \delta(\cos\theta - \cos\theta') \, \delta(\phi - \phi') \\
f_{nm} &= [Y_n^m(\theta', \phi')]^*.
\end{aligned}
\tag{1.58}
$$
• Spherical harmonics. The spherical Fourier transform, Eq. (1.41), for $f(\theta, \phi) =
Y_{n'}^{m'}(\theta, \phi)$, can be evaluated using the orthogonality property, Eq. (1.23), leading
to the following spherical Fourier transform pair:

$$
\begin{aligned}
f(\theta, \phi) &= Y_{n'}^{m'}(\theta, \phi) \\
f_{nm} &= \delta_{nn'} \delta_{mm'}.
\end{aligned}
\tag{1.59}
$$
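• Truncated spherical harmonics series. Limiting the spherical harmonics series of the
Dirac delta function to order $N$ gives a sinc-like function (see Fig. 1.12):

$$
\begin{aligned}
f(\theta, \phi) &= \sum_{n=0}^{N} \sum_{m=-n}^{n} [Y_n^m(\theta', \phi')]^* \, Y_n^m(\theta, \phi) \\
&= \sum_{n=0}^{N} \frac{2n+1}{4\pi} P_n(\cos\Theta) \\
&= \frac{N+1}{4\pi(\cos\Theta - 1)} \left[ P_{N+1}(\cos\Theta) - P_N(\cos\Theta) \right].
\end{aligned}
\tag{1.60}
$$

The spherical harmonics addition theorem, Eq. (1.26), was used to derive the
second line in the equation, where $\Theta$ is the angle between $(\theta, \phi)$ and $(\theta', \phi')$,
defined in Eq. (1.27). The third line was derived using (1.39) [7], leading to the
following transform pair:

$$
\begin{aligned}
f(\theta, \phi) &= \frac{N+1}{4\pi(\cos\Theta - 1)} \left[ P_{N+1}(\cos\Theta) - P_N(\cos\Theta) \right] \\
f_{nm} &= \begin{cases} [Y_n^m(\theta', \phi')]^*, & n \le N \\ 0, & n > N. \end{cases}
\end{aligned}
\tag{1.61}
$$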
2π α
∗
f nm = Ynm (θ, φ) sin θ dθ dφ
0 0
2π α
2n + 1 (n − m)! m
= P (cos θ )e−imφ sin θ dθ dφ
4π (n + m)! n
0 0
α
2n + 1 (n − m)!
= 2π δm0 Pnm (cos θ ) sin θ dθ
4π (n + m)!
0
1
2n + 1
= 2π δm0 Pn (z)dz. (1.62)
4π
cos α
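For $n = 0$, $P_n(\cos\theta)$ reduces to 1, leading to $f_{00} = \sqrt{\pi}\,(1 - \cos\alpha)$, while for $n > 0$ a recurrence formula for the Legendre polynomials can be used to evaluate the integral [13], leading to
$$ f(\theta,\phi) = \begin{cases} 1, & 0 \le \theta \le \alpha \\ 0, & \alpha < \theta \le \pi \end{cases} \qquad
f_{nm} = \begin{cases} \sqrt{\pi}\,(1 - \cos\alpha), & n = 0 \\ \delta_{m0}\sqrt{\dfrac{\pi}{2n+1}} \left[ P_{n-1}(\cos\alpha) - P_{n+1}(\cos\alpha) \right], & n > 0. \end{cases} \tag{1.63} $$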
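The spherical cap pair can be checked numerically. The following is a minimal sketch, using only the Python standard library, that evaluates the closed form of Eq. (1.63) for m = 0 and compares it with a direct midpoint-rule evaluation of the last line of Eq. (1.62). The function names (`legendre`, `cap_coefficient`, `cap_coefficient_numeric`) and the choice α = 30° are illustrative assumptions, not part of the original text.

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x) from the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for l in range(2, n + 1):
        p_prev, p = p, ((2 * l - 1) * x * p - (l - 1) * p_prev) / l
    return p

def cap_coefficient(n, alpha):
    """Closed-form f_n0 of the spherical cap, following Eq. (1.63)."""
    c = math.cos(alpha)
    if n == 0:
        return math.sqrt(math.pi) * (1.0 - c)
    return math.sqrt(math.pi / (2 * n + 1)) * (legendre(n - 1, c) - legendre(n + 1, c))

def cap_coefficient_numeric(n, alpha, steps=20000):
    """Midpoint-rule estimate of the last line of Eq. (1.62), for m = 0."""
    c = math.cos(alpha)
    h = (1.0 - c) / steps
    integral = h * sum(legendre(n, c + (i + 0.5) * h) for i in range(steps))
    return 2.0 * math.pi * math.sqrt((2 * n + 1) / (4.0 * math.pi)) * integral

alpha = math.radians(30.0)  # illustrative cap angle (an assumption)
coeffs = [cap_coefficient(n, alpha) for n in range(3)]
```

For α = 30° the first three coefficients round to 0.24, 0.38 and 0.43, which coincide with the values quoted for the order-2 example in Eq. (1.81).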
Functions defined on the unit sphere can be shifted, in a similar manner to functions defined over the line or the unit circle. For functions defined over the line, $f(x) \in L^2(\mathbb{R})$, and for functions defined over the circle, $f(\phi) \in L^2([0, 2\pi])$, the shift parameter is in the same domain as the function argument, e.g. $f(x - x_0)$, $x, x_0 \in \mathbb{R}$ and $f(\phi - \phi_0)$, $\phi, \phi_0 \in [0, 2\pi]$. However, for a function defined over the unit sphere, $f(\theta,\phi)$, the “shift” parameter is a three-dimensional operation, not in the same domain as $(\theta,\phi)$. For example, the function $f(\theta,\phi)$ can be rotated around the z-axis (a one-parameter operation), and can then be further rotated by moving the point on the sphere intersecting the z-axis (the north pole) to any other point on the sphere (a two-parameter operation), therefore supporting a three-dimensional rotation operation.
Fig. 1.12 A truncated spherical harmonics series to order N (a sinc-like function) with coefficients $[Y_n^m(\theta',\phi')]^*$, illustrated for orders N = 8, 20
Rotation of a function on the sphere is typically defined using the parameter set
(α, β, γ ), formulated using Euler angles [1]. In this case, an initial counter-clockwise
rotation of angle γ is performed about the z-axis, followed by a counter-clockwise
rotation by angle β about the y-axis, and concluded by a counter-clockwise rotation
of angle α about the z-axis. See, for example, [1] for more details on Euler angles
and rotations. First, a position on the unit sphere in Cartesian coordinates is written
in algebraic vector notation:
Fig. 1.13 Spherical cap function $f(\theta,\phi)$ as a function of θ (top) and its spherical Fourier transform $f_{nm}$, shown for m = 0 (bottom), for α = 15°, 45°
and
$$ R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}. \tag{1.67} $$
The Euler matrices are defined in SO(3), i.e. the Special Orthogonal group of 3 × 3 orthogonal matrices, satisfying
$$ R^T R = I, \qquad \det(R) = 1. \tag{1.68} $$
The rotation matrices introduced in Eqs. (1.66) and (1.67) operate on position vectors in Cartesian coordinates and so, in this section, functions on the unit sphere are presented in a similar manner, i.e. $f(\mathbf{x})$, $\mathbf{x} \in S^2$ (see Sect. 1.1). The rotation operation is denoted by Λ and is written as
$$ \Lambda(\alpha,\beta,\gamma) f(\mathbf{x}) = f\!\left( \left[ R_z(\alpha) R_y(\beta) R_z(\gamma) \right]^{-1} \mathbf{x} \right), \tag{1.70} $$
where the left-hand side denotes rotation of the function values, while keeping the coordinate system fixed; this is equivalent to keeping the function values fixed, while rotating the coordinate system with an inverse rotation, as represented by the right-hand side. Now, a series of L rotations $R_1, R_2, \ldots, R_L$ is described by the product of the rotation matrices:
$$ R = R_L \cdots R_2 R_1. \tag{1.71} $$
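The properties in Eqs. (1.68) and (1.71) can be illustrated with a short sketch. It assumes the standard counter-clockwise form of $R_z$ (taken here to match Eq. (1.66)) together with $R_y$ from Eq. (1.67), and builds the Euler rotation as the product $R_z(\alpha) R_y(\beta) R_z(\gamma)$. The helper names (`rot_z`, `rot_y`, `euler_rotation`, and so on) are hypothetical and use plain Python lists rather than a matrix library.

```python
import math

def rot_z(angle):
    """Counter-clockwise rotation about the z-axis (assumed form of Eq. (1.66))."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle):
    """Counter-clockwise rotation about the y-axis, Eq. (1.67)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(r, x):
    """Apply rotation matrix r to the 3-vector x."""
    return [sum(r[i][k] * x[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def euler_rotation(alpha, beta, gamma):
    """R = R_z(alpha) R_y(beta) R_z(gamma); the gamma rotation acts first, as in Eq. (1.71)."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))

R = euler_rotation(0.3, 0.7, -1.1)            # arbitrary test angles in radians
north_pole_image = apply(euler_rotation(0.0, math.pi / 4, 0.0), [0.0, 0.0, 1.0])
```

In this sketch `north_pole_image` is the point to which the north pole is moved by a rotation of β = 45° about the y-axis, i.e. the point with θ = 45°, φ = 0.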
$$ \Lambda(\alpha,\beta,\gamma)\, Y_n^m(\theta,\phi) = \sum_{m'=-n}^{n} D^n_{m'm}(\alpha,\beta,\gamma)\, Y_n^{m'}(\theta,\phi), \tag{1.72} $$
where $D^n_{m'm}(\alpha,\beta,\gamma)$ is the Wigner-D function [11] [see Sect. 4.3, Eq. (1)]:
$$ D^n_{m'm}(\alpha,\beta,\gamma) = e^{-im'\alpha}\, d^n_{m'm}(\beta)\, e^{-im\gamma}, \tag{1.73} $$
and $d^n_{m'm}$ is the Wigner-d function, which is real and can be written in terms of the Jacobi polynomial [5, 11]:
$$ d^n_{m'm}(\beta) = \zeta_{m'm} \sqrt{\frac{s!\,(s+\mu+\nu)!}{(s+\mu)!\,(s+\nu)!}}\; \sin(\beta/2)^{\mu} \cos(\beta/2)^{\nu}\, P_s^{(\mu,\nu)}(\cos\beta). \tag{1.74} $$
The Wigner-D functions form a basis for the rotational Fourier transform, applied to functions defined over the rotation group SO(3) [5, 11].
Equation (1.72) is useful in formulating rotations in the spherical harmonics
domain:
Using the final line in Eqs. (1.76) and (1.40), rotation in the spherical harmonics domain can now be written as
$$ g_{nm'} = \sum_{m=-n}^{n} f_{nm}\, D^n_{m'm}(\alpha,\beta,\gamma), \tag{1.77} $$
such that the Fourier coefficients of the rotated function are formulated as a weighted
sum of the Fourier coefficients of the original function. For order-limited functions,
Eq. (1.77) can be written in a matrix form:
with
$$ \mathbf{g}_{nm} = \left[ g_{00}, g_{1(-1)}, g_{10}, g_{11}, \ldots, g_{NN} \right]^T, \qquad
\mathbf{f}_{nm} = \left[ f_{00}, f_{1(-1)}, f_{10}, f_{11}, \ldots, f_{NN} \right]^T, \tag{1.79} $$
and so on.
The rotation of a function defined over the unit sphere is presented next. Consider the spherical cap function, defined in Eq. (1.63), but with spherical harmonic coefficients truncated to an order of N = 2, i.e. all coefficients above n = 2 are set to zero. Figure 1.15 illustrates the function using a balloon plot, marked as “Original” in the figure. The function is then rotated by multiplying its spherical Fourier coefficient vector with the appropriate Wigner-D rotation matrix, as defined in Eq. (1.78). In the figure, balloon plots of the rotated function are illustrated for different rotations. In this example, the spherical harmonic coefficients vector of the original function is given by
$$ \mathbf{f}_{nm} = \left[ (0.24), (0, 0.38, 0), (0, 0, 0.43, 0, 0) \right]^T, \tag{1.81} $$
with round brackets artificially separating coefficients with the same order. The elements in $\mathbf{f}_{nm}$ are non-zero only for m = 0, as expected, because the function is constant along φ [see Eq. (1.47)]. However, when rotated, the operation of multiplication with the Wigner-D rotation matrix results in a vector $\mathbf{f}_{nm}$, which is no longer non-zero only at m = 0, and the function is no longer constant along φ. In this example, the original function after rotation by Λ(0, 45°, 0) has the following vector of the spherical harmonic coefficients:
$$ \mathbf{f}_{nm} = \left[ (0.24), (0.19, 0.27, -0.19), (0.13, 0.26, 0.11, -0.26, 0.13) \right]^T. \tag{1.82} $$
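The numbers in Eq. (1.82) can be cross-checked without constructing the Wigner-D matrix. The sketch below is not the Wigner-D procedure of Eq. (1.78); it swaps in the addition theorem, Eq. (1.26), which applies here only because the original function is constant along φ. Under the assumptions that Λ(α, β, γ) moves the north pole to (θ, φ) = (β, α) and that $Y_n^m$ includes the Condon-Shortley phase, the rotated coefficients are $g_{nm} = f_{n0}\sqrt{4\pi/(2n+1)}\,[Y_n^m(\beta,\alpha)]^*$. All function names are hypothetical.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, including the Condon-Shortley phase."""
    pmm = 1.0
    root = math.sqrt(max(0.0, 1.0 - x * x))
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * root
    if n == m:
        return pmm
    p_prev, p = pmm, x * (2 * m + 1) * pmm
    for l in range(m + 2, n + 1):
        p_prev, p = p, ((2 * l - 1) * x * p - (l + m - 1) * p_prev) / (l - m)
    return p

def sph_harm(n, m, theta, phi):
    """Complex spherical harmonic Y_n^m(theta, phi); theta is the elevation angle."""
    am = abs(m)
    norm = math.sqrt((2 * n + 1) / (4.0 * math.pi)
                     * math.factorial(n - am) / math.factorial(n + am))
    y = norm * assoc_legendre(n, am, math.cos(theta)) * cmath.exp(1j * am * phi)
    # Negative orders follow from Y_n^{-m} = (-1)^m [Y_n^m]^*
    return (-1) ** am * y.conjugate() if m < 0 else y

def rotate_zonal(f_n0, alpha, beta):
    """Coefficients of a phi-independent function after its axis is moved to (beta, alpha)."""
    return [[f_n0[n] * math.sqrt(4.0 * math.pi / (2 * n + 1))
             * sph_harm(n, m, beta, alpha).conjugate()
             for m in range(-n, n + 1)]
            for n in range(len(f_n0))]

# m = 0 coefficients taken from Eq. (1.81), rotated by Lambda(0, 45 deg, 0)
g = rotate_zonal([0.24, 0.38, 0.43], 0.0, math.radians(45.0))
g_flat = [value.real for order in g for value in order]
```

To two decimal places `g_flat` agrees with the vector in Eq. (1.82); small differences in the third decimal are expected because the input coefficients are themselves rounded.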
Convolution and correlation are widely used in signal processing to describe the
operation of linear systems and to investigate similarity between signals. Convolu-
tion and correlation can also be defined for functions on the unit sphere. Spherical
convolution, for example, has been previously applied to describe the sound pres-
sure measured on a spherical surface [7], while spherical correlation has been used
to describe spatial filtering on the sphere [9].
The operations of convolution and correlation of functions defined over the unit
sphere are presented in this section. The operation of convolution of two functions
defined over the line or the circle is typically formulated as an integral over one
function multiplied by a reversed and shifted version of the other function. Similarly,
convolution over the sphere can be described as the result of integrating the product
of one function with a rotated version of another function. However, since rotation is
a three-parameter operation, it involves a triple integral. The operation of convolving
function f (θ, φ) with function g(θ, φ) to produce y(θ, φ) is formulated as follows.
First, a compact notation is introduced for a double integral over the sphere and a
triple integral over the rotation angles:
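$$ \int_{S^2} d\mu \equiv \int_0^{2\pi}\!\!\int_0^{\pi} \sin\theta \, d\theta \, d\phi, \tag{1.83} $$
$$ \int_{SO(3)} d\xi \equiv \int_0^{2\pi} d\alpha \int_0^{\pi} \sin\beta \, d\beta \int_0^{2\pi} d\gamma, \tag{1.84} $$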
such that ξ ≡ ξ(α, β, γ ) ∈ S O(3). Using this notation, and denoting the functions
defined over the unit sphere f (μ), g(μ), and y(μ) as in Sect. 1.1, the convolution is
now defined as [2]
Note that gnm is evaluated only at m = 0. This is a result of the fact that f (β, α)
is not dependent on γ , and so the integral over γ defined within the integral over ξ
operates only on the rotated function g, averaging its value along the azimuth. The
coefficients gn0 evaluated only for m = 0 (gn0 δm0 ) represent a function that varies
only with elevation, while the coefficients gn0 evaluated for all m (gn0 ∀m) represent
a function with symmetry along φ that satisfies f (θ, φ) = f (θ, π − φ), because
this is a special case of the symmetry property presented in Eq. (1.51). A detailed
derivation of Eq. (1.86) can be found in [2].
The correlation between two functions is a measure of the similarity of the two
functions. It is typically formulated as the integral of the product of the two functions,
with one of the functions shifted, or rotated in the case of functions on a sphere.
Therefore, the correlation between f (μ) and g(μ), denoted by c(ξ ), is defined as
[5]
$$ c(\xi) = \int_{S^2} f(\mu) \left[ \Lambda(\xi)\, g(\mu) \right]^* d\mu. \tag{1.87} $$
Note that the result of the correlation operation, $c(\xi)$, is a function of the three parameters of the rotation ξ. Using the spherical harmonics representation for f and g, as in Eq. (1.40), and substituting in Eq. (1.87), $c(\xi)$ can be written in terms of $f_{nm}$ and $g_{nm}$ as [5]
$$ c(\xi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \sum_{m'=-n}^{n} f_{nm}\, g^*_{nm'} \left[ D^n_{mm'}(\xi) \right]^*. \tag{1.88} $$
Equation (1.88) may be more useful than Eq. (1.87) as it involves summations rather
than integrals, which could be particularly useful if the functions are order-limited.
References
1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic, San Diego
(2001)
2. Driscoll, J.R., Healy Jr., D.M.: Computing Fourier transforms and convolutions on the 2-sphere.
Adv. Appl. Math. 15(2), 202–250 (1994)
3. Gelb, A.: The resolution of the Gibbs phenomenon for spherical harmonics. Math. Comput.
66(218), 699–717 (1997)
4. Jespen, D.W., Haugh, E.F., Hirschfelder, J.O.: The integral of the associated Legendre function.
Technical report, University of Wisconsin, Naval Research Laboratory (1955)
5. Kostelec, P.J., Rockmore, D.N.: FFTs on the rotation group. J. Fourier Anal. Appl. 14, 145–179
(2008)
6. Legendre polynomials (2014). http://functions.wolfram.com/Polynomials/LegendreP/
7. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution.
J. Acoust. Soc. Am. 116(4), 2149–2157 (2004)
8. Rafaely, B.: Spherical loudspeaker array for local active control of sound. J. Acoust. Soc. Am.
125(5), 3006–3017 (2009)
9. Rafaely, B., Weiss, B., Bachmat, E.: Spatial aliasing in spherical microphone arrays. IEEE
Trans. Sig. Process. 55(3), 1003–1010 (2007)
10. Sansone, G.: Orthogonal Functions. Interscience Publishers, New York (1959)
11. Varshalovich, D.A., Moskalev, A.N., Khersonskii, V.K.: Quantum Theory of Angular Momen-
tum, 1st edn. World Scientific Publishing, Singapore (1988)
12. Weyl, H.: Die Gibbssche erscheinung in der theorie der kugelfunktionen. Rend. Circ. Mat.
Palermo 29(1), 308–323 (1910)
13. Williams, E.G.: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography.
Academic, New York (1999)
Chapter 2
Acoustical Background
Abstract The mathematical background for functions defined on the unit sphere
was presented in Chap. 1. Spherical harmonics played an important role in presenting
and manipulating these functions. In this chapter, functions on the sphere are defined
through the formulations of fields in three dimensions. Although sound fields are
of primary concern in this book, which is oriented towards microphone arrays, the
material presented here can be applied to scalar fields in general. This chapter begins
by presenting the acoustic wave equation in Cartesian and spherical coordinates,
with possible solutions. Solutions to the wave equation in spherical coordinates are
shown to involve spherical harmonics and spherical Bessel and Hankel functions.
Having formulated the fundamental solutions, sound fields due to a plane wave and
a point source are presented, including an analysis of the effect of a rigid sphere
introduced into the sound field. The latter is useful for describing the sound field
around a microphone array configured over a rigid sphere, for example. The chapter
concludes with a formulation of the three-dimensional translation of sound fields.
Sound pressure in free three-dimensional space, denoted by p(x, t), and measured
in Pascals, with x = (x, y, z) ∈ R3 measured in meters, and t representing time in
seconds, satisfies the homogeneous acoustic wave equation [6]:
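$$ \nabla_x^2\, p(\mathbf{x}, t) - \frac{1}{c^2} \frac{\partial^2}{\partial t^2}\, p(\mathbf{x}, t) = 0, \tag{2.1} $$
with c denoting the speed of sound in air, typically 343 m/s under normal ambient conditions, and $\nabla_x^2$ denoting the Laplacian in Cartesian coordinates, defined for a function $f(x, y, z)$ as
$$ \nabla_x^2 f \equiv \frac{\partial^2}{\partial x^2} f + \frac{\partial^2}{\partial y^2} f + \frac{\partial^2}{\partial z^2} f. \tag{2.2} $$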
where ω is the radial frequency in radians per second. In this representation, p(x)
can be regarded as the space-dependent amplitude of the sound pressure at frequency
ω. With k = ω/c denoting the wave number in radians per meter, the dependence of
the pressure amplitude on ω or on k can be explicitly described using the notation
p(k, x). Substituting Eq. (2.3) into Eq. (2.1), the wave equation transforms into the
Helmholtz equation (with time-dependence omitted):
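The substitution step can be written out explicitly. The following worked sketch assumes the single-frequency form $p(\mathbf{x}, t) = p(\mathbf{x})e^{i\omega t}$, by analogy with Eq. (2.10); the time derivative then contributes a factor $-\omega^2$, and the common exponential can be divided out:

```latex
\begin{align*}
\nabla_x^2\!\left[p(\mathbf{x})e^{i\omega t}\right]
  - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\!\left[p(\mathbf{x})e^{i\omega t}\right]
  &= e^{i\omega t}\left[\nabla_x^2\, p(\mathbf{x}) + \frac{\omega^2}{c^2}\, p(\mathbf{x})\right] = 0 \\
\Longrightarrow\quad \nabla_x^2\, p(k,\mathbf{x}) + k^2\, p(k,\mathbf{x}) &= 0,
  \qquad k = \omega/c .
\end{align*}
```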
In this book, sound fields are measured by spherical microphone arrays, so that it
is preferable to represent the position vector in spherical coordinates, r = (r, θ, φ).
The wave equation is now rewritten in spherical coordinates, for which the Laplacian
in spherical coordinates is first defined for a function f (r, θ, φ):
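$$ \nabla_r^2 f \equiv \frac{1}{r^2} \frac{\partial}{\partial r}\!\left( r^2 \frac{\partial}{\partial r} f \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta}\!\left( \sin\theta \frac{\partial}{\partial \theta} f \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2}{\partial \phi^2} f. \tag{2.7} $$
Equation (2.7) can be derived from the Laplacian in Cartesian coordinates [Eq. (2.2)] using Eq. (1.4) and the chain rule. The wave equation in spherical coordinates can now be written as
$$ \nabla_r^2\, p(\mathbf{r}, t) - \frac{1}{c^2} \frac{\partial^2}{\partial t^2}\, p(\mathbf{r}, t) = 0, \tag{2.8} $$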
where p(r, t) is the sound pressure as a function of time and space in spherical
coordinates. For single frequency, i.e. harmonic, sound fields, the Helmholtz equation
can also be written in spherical coordinates as
where p(k, r) is the amplitude of the sound pressure over space and the dependence
on k is explicitly expressed. The amplitude of the pressure can be represented in a
way similar to Eq. (2.3), as
$$ p(\mathbf{r}, t) = p(\mathbf{r})\, e^{i\omega t}. \tag{2.10} $$
A solution to the wave equation (2.8) can be obtained using separation of variables:
Substituting Eq. (2.11) into the wave equation (2.8), the single equation as a function
of p can be decomposed into four partial equations in the separate variables as
a function of Θ, Φ, R and T . The equation representing dependence on time is a
second-order differential equation:
$$ \frac{d^2 T}{dt^2} + \omega^2 T = 0, \tag{2.12} $$
with a fundamental solution
as also implied by Eq. (2.10). Substituting Eq. (2.11) in the Helmholtz equation (2.9)
and multiplying by r 2 sin2 θ/ p, the term depending on φ can be isolated, satisfying
$$ \frac{d^2 \Phi}{d\phi^2} + m^2 \Phi = 0, \tag{2.14} $$
with μ = cos θ . This equation is known as the associated Legendre differential equa-
tion and has two types of solutions, one singular at μ = 1 and a second solution that
is typically selected and is referred to as the associated Legendre function of the first
kind:
$$ \Theta(\theta) = P_n^m(\cos\theta), \qquad n \in \mathbb{N},\; m \in \mathbb{Z}. \tag{2.17} $$
Substituting Eq. (2.16) into the Helmholtz equation and applying some further manipulations, a term dependent only on r can be isolated, satisfying
$$ r^2 \frac{d^2}{dr^2} R + 2r \frac{d}{dr} R + \left[ (kr)^2 - n(n+1) \right] R = 0. \tag{2.18} $$
This equation is known as the spherical Bessel equation and its solution comprises
spherical Bessel functions of the first kind, jn (kr ), or spherical Hankel functions of
the first kind, h n (kr ), or both (see Sect. 2.2).
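Combining the solutions over r, θ, φ, and t, a fundamental solution for the wave equation in spherical coordinates can be written in the form
$$ p(\mathbf{r}, t) = j_n(kr)\, Y_n^m(\theta,\phi)\, e^{i\omega t}, \tag{2.19} $$
or
$$ p(\mathbf{r}, t) = h_n(kr)\, Y_n^m(\theta,\phi)\, e^{i\omega t}, \tag{2.20} $$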
Solutions to the wave equation in spherical coordinates include spherical Bessel and
Hankel functions. These functions are presented in this section. The spherical Bessel
function of the first kind, jn (x), and of the second kind, yn (x), can be written using
Rayleigh formulas as [1]
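$$ j_n(x) = (-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^{n} \frac{\sin(x)}{x} \tag{2.21} $$
and
$$ y_n(x) = -(-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^{n} \frac{\cos(x)}{x}, \tag{2.22} $$
with the spherical Hankel functions of the first kind, $h_n(x)$, and the second kind, $h_n^{(2)}(x)$, written as
$$ h_n(x) = -i(-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^{n} \frac{e^{ix}}{x} \tag{2.23} $$
and
$$ h_n^{(2)}(x) = i(-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^{n} \frac{e^{-ix}}{x}, \tag{2.24} $$
which are related to the spherical Bessel functions through
$$ h_n(x) = j_n(x) + i\, y_n(x) \tag{2.25} $$
and
$$ h_n^{(2)}(x) = j_n(x) - i\, y_n(x). \tag{2.26} $$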
Because the spherical Bessel functions are real, $j_n(x)$ and $y_n(x)$ compose the real and imaginary parts of $h_n(x)$, i.e.
$$ j_n(x) = \mathrm{Re}\{ h_n(x) \} \tag{2.27} $$
and
$$ y_n(x) = \mathrm{Im}\{ h_n(x) \}. \tag{2.28} $$
The spherical Bessel and Hankel functions are also related to the Bessel function, $J_\alpha(x)$, and the Hankel function, $H_\alpha(x)$, through
$$ j_n(x) = \sqrt{\frac{\pi}{2x}}\, J_{n+\frac{1}{2}}(x) \tag{2.29} $$
and
$$ h_n(x) = \sqrt{\frac{\pi}{2x}}\, H_{n+\frac{1}{2}}(x). \tag{2.30} $$
$$ j_n(x) \approx \frac{x^n}{(2n+1)!!}, \qquad x \ll 1, \tag{2.31} $$
where $(\cdot)!!$ is the double factorial function, i.e. $(2n+1)!! = (2n+1)(2n-1)\cdots 1$.
$$ j_3(x) = \frac{\cos x}{x} - 6\,\frac{\sin x}{x^2} - 15\,\frac{\cos x}{x^3} + 15\,\frac{\sin x}{x^4} $$
Figure 2.1 shows that at large values of x, the amplitude of $j_n(x)$ decays in a similar manner for all n. Indeed, for $x \gg n$ [or more specifically, for $x \gg n(n+1)/2$] $j_n(x)$, as expressed in Table 2.1, is dominated by the first term, decays as 1/x, and can be approximated by [1]
$$ j_n(x) \approx \frac{1}{x} \sin(x - n\pi/2), \qquad x \gg n(n+1)/2. \tag{2.32} $$
Fig. 2.2 Magnitude of the spherical Bessel function, $|j_n(x)|$, for n = 0, ..., 6 and for x < 1
Figure 2.1 also shows that the spherical Bessel function has zeros. The zeros of $j_0(x)$ are at $\pm l\pi$, $l = 1, 2, \ldots, \infty$; for higher orders, the first zeros are positioned at $x > \pi$, but tend to appear at a spacing of π for large x, as suggested by Eq. (2.32).
Figure 2.3 presents $|h_n(x)|$, illustrating that the spherical Hankel functions, unlike the spherical Bessel functions, diverge towards the origin. Furthermore, Fig. 2.4 illustrates that for $x \ll 1$, higher orders increase towards the origin with a larger slope. This is supported by the small argument approximation of the spherical Hankel function:
$$ h_n(x) \approx -i\, \frac{(2n-1)!!}{x^{n+1}}, \qquad x \ll 1. \tag{2.33} $$
Fig. 2.4 Magnitude of the spherical Hankel function, $|h_n(x)|$, for n = 0, ..., 6 and x < 1
On the other hand, for large values of x, $h_n(x)$ decays similarly for all n, which is supported by the large argument approximation:
$$ h_n(x) \approx (-i)^{n+1}\, \frac{e^{ix}}{x}, \qquad x \gg n(n+1)/2. \tag{2.34} $$
The spherical Bessel function also satisfies recurrence equations:
$$ \frac{2n+1}{x}\, j_n(x) = j_{n-1}(x) + j_{n+1}(x) \tag{2.35} $$
and
$$ (2n+1)\, j_n'(x) = n\, j_{n-1}(x) - (n+1)\, j_{n+1}(x), \tag{2.36} $$
with $j_n'(x)$ denoting the first derivative of $j_n(x)$ with respect to x. These relations also hold for the spherical Bessel function of the second kind and the spherical Hankel functions of the first and second kinds [1].
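The approximations and recurrences of this section can be explored numerically. The sketch below is one possible standard-library implementation, not a method taken from the text: it evaluates $j_n(x)$ from its ascending power series, $j_n(x) = x^n \sum_k (-x^2/2)^k / [k!\,(2n+2k+1)!!]$, which is an assumption of convenience that behaves well for moderate arguments (roughly x ≤ 20) and for n > x. The derivative uses Eq. (2.36). Function names are hypothetical.

```python
import math

def sph_jn(n, x, terms=200):
    """Spherical Bessel function j_n(x) from its ascending power series."""
    term = 1.0
    for k in range(1, n + 1):
        term *= x / (2 * k + 1)          # builds the leading term x^n / (2n+1)!!
    total = term
    for k in range(terms):
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
        total += term
    return total

def sph_jn_derivative(n, x):
    """First derivative of j_n(x), using the recurrence of Eq. (2.36)."""
    if n == 0:
        return -sph_jn(1, x)
    return (n * sph_jn(n - 1, x) - (n + 1) * sph_jn(n + 1, x)) / (2 * n + 1)

def small_argument(n, x):
    """Small-argument approximation, Eq. (2.31)."""
    value = 1.0
    for k in range(1, n + 1):
        value *= x / (2 * k + 1)
    return value

def large_argument(n, x):
    """Large-argument approximation, Eq. (2.32)."""
    return math.sin(x - n * math.pi / 2.0) / x

# Recurrence (2.35) at an arbitrary point: both sides should agree
n, x = 3, 5.0
lhs = (2 * n + 1) / x * sph_jn(n, x)
rhs = sph_jn(n - 1, x) + sph_jn(n + 1, x)
```

A quick comparison such as `sph_jn(3, 0.01)` against `small_argument(3, 0.01)`, or `sph_jn(1, 20.0)` against `large_argument(1, 20.0)`, shows how closely each approximation tracks the function in its stated range.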
The exponential representation of a plane wave, as in the first line of Eq. (2.37),
is simple and natural, compared to the infinite summation on the second line of the
same equation. However, the advantage of representing a plane wave in spherical har-
monics lies in the possibility of performing separation of variables. Terms including
kr , wave arrival direction (θk , φk ), and position (θ, φ) on the surface of a sphere of
radius r , can thus be formulated as parameters of separate functions. This advantage
is exploited later in the book when developing array processing algorithms in the
spherical harmonics domain. Derivation of Eqs. (2.37) and (2.38) and further reading can be found in [1, 8], for example.
The shortcoming of the representation of plane waves using spherical harmonics with an infinite summation is typically overcome in practice by approximating the infinite summation with a finite summation, i.e. Eq. (2.37) is rewritten as
$$ p(k, r, \theta, \phi) \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^* Y_n^m(\theta, \phi). \tag{2.39} $$
Comparing Eqs. (2.37) and (2.40), the spherical harmonic coefficients for the
sound pressure over a sphere of radius r , in a sound field composed of a single
unit-amplitude plane wave arriving from (θk , φk ), can be written as
$$ p_{nm}(k, r) = 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^*. \tag{2.41} $$
Equation (2.41) also shows that the magnitude of pnm is proportional to the magnitude
of jn (kr ). It is therefore expected that pnm for a plane-wave sound field decays as a
function of n for n > kr , as suggested by Fig. 2.1, and, more explicitly, as illustrated
in Fig. 2.6 for kr = 8 and kr = 16. This is an important result, as it suggests that the
sound field represented by the infinite summation in Eq. (2.37) can be represented by
a finite summation as in Eq. (2.39) with little error. The spherical harmonics series
for a plane-wave sound field can therefore be considered as nearly order limited,
so that sampling theories for order-limited functions, as detailed in Chap. 3, can be
applied with little error.
This behavior is clearly illustrated in Fig. 2.5 for N = 16, for example. The figure
shows a circle of radius r = 16 (equivalent to kr = 16 because k = 1), illustrating
that with N = 16, the pressure inside the circle satisfying kr < N is reconstructed
accurately, while outside the circle, with kr > N , the pressure is reconstructed with
significant error.
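The kr < N rule can be reproduced with a few lines of code. The sketch below compares the exact plane-wave pressure $e^{ikr\cos\Theta}$ with the order-N sum of Eq. (2.39), where the inner sum over m has been collapsed with the addition theorem, Eq. (1.26), so that each order contributes $i^n (2n+1) j_n(kr) P_n(\cos\Theta)$, with Θ the angle between the arrival direction and the measurement direction. The power-series evaluation of $j_n$ and all function names are assumptions made for this illustration.

```python
import cmath

def legendre(n, x):
    """Legendre polynomial P_n(x) from the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for l in range(2, n + 1):
        p_prev, p = p, ((2 * l - 1) * x * p - (l - 1) * p_prev) / l
    return p

def sph_jn(n, x, terms=200):
    """Spherical Bessel j_n(x) from its power series (adequate for x up to about 20)."""
    term = 1.0
    for k in range(1, n + 1):
        term *= x / (2 * k + 1)
    total = term
    for k in range(terms):
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
        total += term
    return total

def plane_wave_truncated(kr, cos_angle, order):
    """Order-limited plane-wave pressure: Eq. (2.39) with the m-sum collapsed."""
    return sum((1j ** n) * (2 * n + 1) * sph_jn(n, kr) * legendre(n, cos_angle)
               for n in range(order + 1))

def truncation_error(kr, order, points=19):
    """Largest absolute error over a grid of angles between arrival and look directions."""
    worst = 0.0
    for i in range(points):
        c = -1.0 + 2.0 * i / (points - 1)
        exact = cmath.exp(1j * kr * c)
        worst = max(worst, abs(exact - plane_wave_truncated(kr, c, order)))
    return worst

error_high_order = truncation_error(10.0, 20)   # N > kr
error_low_order = truncation_error(10.0, 5)     # N < kr
```

With kr = 10, the order-20 reconstruction is accurate to well below one percent of the unit amplitude, whereas the order-5 reconstruction has errors comparable to the amplitude itself.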
The condition of kr < N for accurate sound pressure reconstruction is further
illustrated in the following example, analyzing the magnitude of sound pressure over
the surface of a sphere of a fixed radius r = 1, at wave number k = 10, satisfying
kr = 10. The pressure is produced by a single unit-amplitude plane wave arriving
from direction (θk , φk ) = (45◦ , −45◦ ), which is then reconstructed using Eq. (2.39)
for various values of N . Figure 2.7 shows that for N = 20, satisfying N > kr , good
Fig. 2.6 Magnitude of the normalized spherical Bessel function, $|4\pi i^n j_n(kr)|$, for kr = 8, 16
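$$ \begin{aligned}
p(k, r, \theta, \phi) &= \int_0^{2\pi}\!\!\int_0^{\pi} a(k, \theta_k, \phi_k)\, e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} \sin\theta_k \, d\theta_k \, d\phi_k \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n j_n(kr)\, Y_n^m(\theta, \phi) \\
&\quad \times \int_0^{2\pi}\!\!\int_0^{\pi} a(k, \theta_k, \phi_k) \left[ Y_n^m(\theta_k, \phi_k) \right]^* \sin\theta_k \, d\theta_k \, d\phi_k \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k)\, j_n(kr)\, Y_n^m(\theta, \phi),
\end{aligned} \tag{2.42} $$
where $a_{nm}(k)$ is the spherical Fourier transform of $a(k, \theta_k, \phi_k)$. Comparing Eqs. (2.37) and (2.42), it is clear that for a sound field composed of a single unit-amplitude plane wave, the following holds:
$$ a_{nm}(k) = \left[ Y_n^m(\theta_k, \phi_k) \right]^*, \tag{2.43} $$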
This is a very useful result, relating directly the spherical harmonic coefficients of
the plane-wave amplitude density to the spherical harmonic coefficients of the sound
pressure. This is also the advantage of analyzing the sound pressure over the surface
of a sphere – the measured function pnm is in the same domain (spherical harmonics)
as the function generating the sound field, anm , thus facilitating a direct relation
between the two.
Equation (2.42) involves an infinite summation, but, similar to the case of a single plane wave, the infinite summation may be approximated in practice by a finite summation, leading to
$$ p(k, r, \theta, \phi) \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k)\, j_n(kr)\, Y_n^m(\theta, \phi). \tag{2.46} $$
$$ p(k, r', \theta', \phi') = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{j_n(kr')}{j_n(kr)}\, p_{nm}(k, r)\, Y_n^m(\theta', \phi'). \tag{2.47} $$
This use of Eq. (2.47) is limited in practice by several factors. First, kr values corresponding to the zeros of the spherical Bessel functions lead to division by zero and a diverging quotient. Second, as discussed above, $p_{nm}(kr)$ has significant terms only up to order n = kr, whereas if $r' \gg r$, accurate reconstruction of the pressure at $r'$ will require terms up to order $n = kr' \gg kr$. Therefore, accurate calculation of $p(k, r', \theta', \phi')$ may require division by $j_n(kr)$, which may have low magnitude at n > kr, again leading to numerical instability. Furthermore, if the infinite summation in Eq. (2.47) is replaced by a finite summation of order N, as expressed in the following equation, the order-limited equation will be useful only in the range where both kr and $kr'$ are smaller than N:
$$ p(k, r', \theta', \phi') \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{j_n(kr')}{j_n(kr)}\, p_{nm}(k, r)\, Y_n^m(\theta', \phi'). \tag{2.48} $$
Real-world sources produce sound fields in their immediate vicinity with a behavior
that makes it appropriate to model them as a simple point source (a monopole source)
or a combination of these. Consider a point source located at rs = (rs , θs , φs ), pro-
ducing unit-amplitude sound pressure at a distance of 1 m from the source. The source
produces a spherical sound field, i.e. the pressure magnitude decays at a rate that is
inversely proportional to the distance from the source, while the phase is constant
as a function of θ and φ for a fixed distance from the source. The sound pressure at
location r = (r, θ, φ) for this spherical radiation field can be written using a series
of spherical harmonics as [5, 8]
$$ \frac{e^{-ik\|\mathbf{r} - \mathbf{r}_s\|}}{\|\mathbf{r} - \mathbf{r}_s\|} = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi(-i)k\, h_n^{(2)}(kr_s)\, j_n(kr) \left[ Y_n^m(\theta_s, \phi_s) \right]^* Y_n^m(\theta, \phi), \qquad r < r_s, \tag{2.49} $$
where $r = \|\mathbf{r}\|$ and $\|\cdot\|$ is the Euclidean norm. The condition $r < r_s$ means that the measurement point is nearer the origin relative to the point source. If a spherical measurement surface of radius r is considered, then the point source is assumed to be outside the measurement sphere. Note the similarity, in this case, between the sound field produced by the point source and the plane-wave sound field; the latter is described by Eq. (2.37), with the plane-wave arrival direction replaced by the direction of the point source. Indeed, a point source positioned far from a measurement region will produce a sound field similar to a plane-wave sound field. The minus sign in the exponential $e^{-ik\|\mathbf{r} - \mathbf{r}_s\|}$ guarantees that when combined with the time-dependent exponential, $e^{i\omega t}$, the sound radiation is outwards from the point source.
When the point source is nearer the origin relative to the measurement point, r and $r_s$ exchange places, such that
$$ \frac{e^{-ik\|\mathbf{r} - \mathbf{r}_s\|}}{\|\mathbf{r} - \mathbf{r}_s\|} = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi(-i)k\, h_n^{(2)}(kr)\, j_n(kr_s) \left[ Y_n^m(\theta_s, \phi_s) \right]^* Y_n^m(\theta, \phi), \qquad r > r_s. \tag{2.50} $$
The sound pressure at the surface of a sphere of radius r , p(k, r, θ, φ), due to a
point source positioned at (rs , θs , φs ) can be described using the spherical harmonic
coefficients by comparing Eq. (2.40) with Eqs. (2.49) and (2.50), leading to
$$ p_{nm}(k, r) = 4\pi(-i)k\, h_n^{(2)}(kr_s)\, j_n(kr) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \qquad r < r_s, \tag{2.51} $$
and
$$ p_{nm}(k, r) = 4\pi(-i)k\, h_n^{(2)}(kr)\, j_n(kr_s) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \qquad r > r_s. \tag{2.52} $$
$$ p_{nm}(k, r) \approx \frac{e^{-ikr_s}}{r_s}\, 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \qquad r < r_s, \quad kr_s \gg n(n+1)/2. \tag{2.53} $$
The spherical harmonic coefficients of the sound pressure on a sphere of radius r are similar to the coefficients produced on the same sphere by a plane wave, as shown in Eq. (2.41), with $(\theta_k, \phi_k) = (\theta_s, \phi_s)$, normalized by the term $e^{-ikr_s}/r_s$ representing the phase shift and attenuation due to the propagation from the point source to the origin.
Furthermore, if we consider the sound pressure limited to a sphere of radius r and
approximately order limited to N = kr and assume that rs satisfies krs > N (N +
1)/2, then the sound pressure produced by the point source is approximately the same
as the sound pressure produced by a plane wave with (θk , φk ) = (θs , φs ). This is a
useful result, as it allows the sound pressure in a limited region in space, produced
by a distant point source, to be approximated by the sound pressure produced by a
plane wave and thus to inherit the properties of a plane-wave sound field. For a more
detailed comparison between the sound field produced around the origin by a point
source and by a plane wave, the reader is referred to [4].
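The far-field equivalence can be checked by comparing the factors that multiply $j_n(kr)[Y_n^m(\theta_s,\phi_s)]^*$ in Eqs. (2.51) and (2.53). The sketch below is a minimal illustration under stated assumptions: $h_n^{(2)} = j_n - i y_n$ is computed by upward recurrence from the closed forms of orders 0 and 1, which is adequate when the argument is larger than the order, and the function names are hypothetical.

```python
import cmath
import math

def sph_jn_yn(n, x):
    """(j_n(x), y_n(x)) by upward recurrence; intended for arguments x larger than n."""
    j_prev, y_prev = math.sin(x) / x, -math.cos(x) / x
    if n == 0:
        return j_prev, y_prev
    j = math.sin(x) / x**2 - math.cos(x) / x
    y = -math.cos(x) / x**2 - math.sin(x) / x
    for l in range(1, n):
        j_prev, j = j, (2 * l + 1) / x * j - j_prev
        y_prev, y = y, (2 * l + 1) / x * y - y_prev
    return j, y

def sph_h2(n, x):
    """Spherical Hankel function of the second kind, j_n(x) - i y_n(x)."""
    j, y = sph_jn_yn(n, x)
    return complex(j, -y)

def point_source_factor(n, k, r_s):
    """Radial factor of Eq. (2.51): 4 pi (-i) k h_n^(2)(k r_s)."""
    return 4.0 * math.pi * (-1j) * k * sph_h2(n, k * r_s)

def far_field_factor(n, k, r_s):
    """Radial factor of Eq. (2.53): (e^{-i k r_s} / r_s) 4 pi i^n."""
    return cmath.exp(-1j * k * r_s) / r_s * 4.0 * math.pi * (1j ** n)

# Ratio of the two factors for a distant and a nearby source (k = 1, n = 3)
ratio_far = point_source_factor(3, 1.0, 1000.0) / far_field_factor(3, 1.0, 1000.0)
ratio_near = point_source_factor(3, 1.0, 5.0) / far_field_factor(3, 1.0, 5.0)
```

For $kr_s = 1000$, far above $n(n+1)/2 = 6$, the ratio is within about one percent of unity; for $kr_s = 5$ the condition is violated and the two factors differ substantially.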
The sound pressure on the surface of a sphere in a free field due to plane waves and
point sources has been analyzed in previous sections. In this section the pressure
around a rigid sphere is derived. This is useful when measuring microphones are
placed around a rigid sphere, which is often the case in practice, or when such a rigid
sphere is employed to mimic a human head.
The sound pressure around a rigid sphere is composed of the incident sound field,
which is the sound field in free field without the rigid sphere, and the scattered sound
field, which is the sound field that is scattered from the rigid sphere due to the incident
field. The contribution of both fields to the sound pressure around a rigid sphere is
formulated next. Consider a rigid sphere of radius ra . The sphere imposes a boundary
condition on its surface of zero radial velocity:
$$ u_r(k, r_a, \theta, \phi) = 0, \tag{2.54} $$
because of the infinite impedance at the sphere boundary and the inability of the
sound pressure to generate radial motion at this boundary. Acoustic velocity can
be related to pressure through the equation of momentum conservation (or Euler
equation) in spherical coordinates:
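$$ \nabla p \equiv \frac{\partial p}{\partial r}\, \hat{r} + \frac{1}{r} \frac{\partial p}{\partial \theta}\, \hat{\theta} + \frac{1}{r \sin\theta} \frac{\partial p}{\partial \phi}\, \hat{\phi}. \tag{2.56} $$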
ρ0 is the air density in kilograms per cubic meter and r̂ , θ̂ , φ̂ are unit vectors, as
shown in Fig. 2.8, with r̂ pointing in the direction of r, θ̂ is tangential to the surface
of a sphere of radius r , pointing downwards along the longitude, and φ̂ is tangential
to the surface of a sphere of radius r , pointing along the latitude.
$$ \frac{\partial}{\partial r} \left[ p_i(k, r, \theta, \phi) + p_s(k, r, \theta, \phi) \right] \Big|_{r = r_a} = 0. \tag{2.57} $$
The scattered pressure is now written as a spherical harmonics series as
$$ p_s(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} c_{nm}(k)\, h_n^{(2)}(kr)\, Y_n^m(\theta, \phi). \tag{2.58} $$
Note that anm assumes an incident sound field composed of plane waves described
in the notation previously used [see Eq. (2.42)]. However, a similar formulation also
holds for sound fields due to point sources, as long as they are outside the sphere of
radius r [see Eq. (2.49)].
Writing Eq. (2.57) in the spherical harmonics domain, by substituting Eq. (2.58)
for the scattered pressure and Eq. (2.59) for the incident pressure, yields
$$ c_{nm}(k) = -a_{nm}(k)\, 4\pi i^n\, \frac{j_n'(kr_a)}{h_n^{(2)\prime}(kr_a)}. \tag{2.60} $$
Substituting cnm in Eq. (2.58) and adding the incident field, Eq. (2.59), the total sound
field around a rigid sphere is given by
$$ p(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k)\, 4\pi i^n \left[ j_n(kr) - \frac{j_n'(kr_a)}{h_n^{(2)\prime}(kr_a)}\, h_n^{(2)}(kr) \right] Y_n^m(\theta, \phi). \tag{2.61} $$
By denoting
$$ b_n(kr) = 4\pi i^n \left[ j_n(kr) - \frac{j_n'(kr_a)}{h_n^{(2)\prime}(kr_a)}\, h_n^{(2)}(kr) \right], \tag{2.62} $$
the pressure outside the rigid sphere can be written in the spherical harmonics domain as
$$ p_{nm}(k, r) = a_{nm}(k)\, b_n(kr). \tag{2.63} $$
Fig. 2.9 Function $|b_n(kr)|/(4\pi)$ for a rigid sphere with $r = r_a$, for n = 0, ..., 6
Note the similarity to Eq. (2.45) with 4πi n jn (kr ) replaced by bn (kr ), now containing
a scattering term. Also note that the explicit dependence of bn on ra has been omitted
for notation simplicity. The behavior of the magnitude of bn , normalized by 4π , is
presented in Fig. 2.9. Compared to Fig. 2.1, showing the magnitude of jn , function
bn does not have zeros away from the origin. This important property is useful when
a division by jn is replaced by a division by bn , such as in sound extrapolation [see
Eq. (2.48)] or, generally, in array processing, as presented later in this book.
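The absence of zeros in $b_n$ can be seen in a small numerical experiment. The sketch below evaluates Eq. (2.62), with the derivatives obtained from the recurrence of Eq. (2.36), and compares it with the free-field term $4\pi i^n j_n(kr)$ at $kr = \pi$, the first zero of $j_0$. The upward-recurrence evaluation of $j_n$ and $y_n$ and all function names are assumptions made for this illustration; the recurrence is reasonable only when the order is not much larger than the argument.

```python
import math

def sph_jn_yn(n, x):
    """(j_n(x), y_n(x)) by upward recurrence from the closed forms of orders 0 and 1."""
    j_prev, y_prev = math.sin(x) / x, -math.cos(x) / x
    if n == 0:
        return j_prev, y_prev
    j = math.sin(x) / x**2 - math.cos(x) / x
    y = -math.cos(x) / x**2 - math.sin(x) / x
    for l in range(1, n):
        j_prev, j = j, (2 * l + 1) / x * j - j_prev
        y_prev, y = y, (2 * l + 1) / x * y - y_prev
    return j, y

def sph_h2(n, x):
    """Spherical Hankel function of the second kind, j_n(x) - i y_n(x)."""
    j, y = sph_jn_yn(n, x)
    return complex(j, -y)

def sph_h2_derivative(n, x):
    """Derivative of h_n^(2)(x) via the recurrence of Eq. (2.36)."""
    if n == 0:
        return -sph_h2(1, x)
    return (n * sph_h2(n - 1, x) - (n + 1) * sph_h2(n + 1, x)) / (2 * n + 1)

def sph_jn_derivative(n, x):
    """Derivative of j_n(x) via the recurrence of Eq. (2.36)."""
    if n == 0:
        return -sph_jn_yn(1, x)[0]
    return (n * sph_jn_yn(n - 1, x)[0] - (n + 1) * sph_jn_yn(n + 1, x)[0]) / (2 * n + 1)

def b_n_open(n, kr):
    """Free-field radial term, 4 pi i^n j_n(kr)."""
    return 4.0 * math.pi * (1j ** n) * sph_jn_yn(n, kr)[0]

def b_n_rigid(n, kr, kra):
    """Rigid-sphere radial term of Eq. (2.62) for a sphere with k * r_a = kra."""
    scattering = sph_jn_derivative(n, kra) / sph_h2_derivative(n, kra)
    return 4.0 * math.pi * (1j ** n) * (sph_jn_yn(n, kr)[0] - scattering * sph_h2(n, kr))

open_at_zero = abs(b_n_open(0, math.pi))             # j_0 vanishes at kr = pi
rigid_at_zero = abs(b_n_rigid(0, math.pi, math.pi))  # remains well away from zero
```

On the sphere surface ($r = r_a$) the Wronskian of $j_n$ and $y_n$ reduces the bracket in Eq. (2.62) to $-i / [(kr_a)^2 h_n^{(2)\prime}(kr_a)]$, which offers an independent check of the implementation and makes it plain why the magnitude cannot vanish for finite $kr_a$.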
Similar to the case of the pressure around a sphere in a free field, around a rigid
sphere the magnitude of the spherical harmonic coefficients of the pressure due to a
plane-wave sound field decreases for n > kr , as shown in Fig. 2.10. This figure is
similar to Fig. 2.6, only here the functions are smoother for low values of n due to
the absence of the zeros.
Figure 2.11 shows the sound pressure, Re{ p(k, r, θ, φ)}, around rigid spheres of
radii ra = 1, 3, 10, due to a single unit-amplitude plane wave arriving from (θk , φk ) =
(90◦ , 20◦ ), with k = 1. The sound pressure was calculated using Eq. (2.61), with
terms limited to order N = 32. Comparing Figs. 2.5 and 2.11, the effect of the sound
pressure scattered from the rigid sphere is clear. For large radii, e.g. ra = 3, 10, the
sound field around the rigid sphere is significantly altered by the scattered field, while
for smaller radii, e.g. ra = 1, the change is minor.
The relation between the radius of the rigid sphere and the magnitude of the scattered sound field can be studied analytically. The scattered sound field is dependent on the scattering term $j_n'(kr_a)/h_n^{(2)\prime}(kr_a)$ in $b_n$ [see Eq. (2.62)]. For a small rigid sphere satisfying $kr_a \ll 1$, substituting the relation in Eq. (2.36) for the derivatives and using the small argument approximations in Eqs. (2.31) and (2.33), $j_n'(kr_a)/h_n^{(2)\prime}(kr_a)$ is proportional to $(kr_a)^{2n+1}$; this term tends to zero as $kr_a \to 0$, therefore leading to a negligible contribution from the scattered field.
Fig. 2.10 Function $|b_n(kr)|$ for a rigid sphere with $r = r_a$ and kr = 8, 16
The sound pressure on the surface of a rigid sphere due to a plane-wave
sound field is illustrated in Fig. 2.12, showing Re{ p(k, r, θ, φ)} on the surface
Fig. 2.12 Re{ p(k, r, θ, φ)} due to a unit-amplitude plane wave with arrival direction (45◦ , −45◦ ),
evaluated using Eq. (2.61) with kra = 10 and plotted on the surface of a rigid sphere
So far in this chapter the sound pressure has been presented relative to the origin of
the spherical coordinate system. It may be useful to present the sound pressure in
the spherical harmonics domain, relative to a translated spherical coordinate system.
For example, the sound pressure around several spheres can be presented relative to
a common origin. Other examples of translated sound fields represented in spherical
harmonics have been investigated in recent publications [2, 7]. The aim of this section
is therefore to provide an overview of the operation of translation of sound fields and
of the effect of translation on the representation of the sound fields in spherical
harmonics.
Sound fields due to plane waves or distant point sources at (r, θ, φ) are described as a series of weighted $j_n(kr)Y_n^m(\theta,\phi)$ terms, whereas sound fields due to point sources that are near the origin are described as a series of weighted $h_n^{(2)}(kr)Y_n^m(\theta,\phi)$ terms [see Eq. (2.50)]. Consider a translation in the coordinate system from the origin to $\mathbf{r}'' = (r'', \theta'', \phi'')$, such that
\[
\mathbf{r} = \mathbf{r}' + \mathbf{r}'',
\tag{2.64}
\]
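\[
\ldots \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} \ldots
\times \sum_{n''=0}^{\infty} j_{n''}(kr'')\, Y_{n''}^{m-m'}(\theta'',\phi'')\, C_{n'm'n''}^{nm}, \quad r' > r''
\tag{2.66}
\]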
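where
\[
C_{n'm'n''}^{nm} = 4\pi i^{(n'+n''-n)} (-1)^m \sqrt{\frac{(2n+1)(2n'+1)(2n''+1)}{4\pi}}
\begin{pmatrix} n & n' & n'' \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} n & n' & n'' \\ -m & m' & m-m' \end{pmatrix}
\tag{2.68}
\]
and $\begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & m_3 \end{pmatrix}$ is the Wigner 3-j symbol [3]. Equation (2.65) was derived from the equation $e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}'} e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}''}$ by first substituting Eq. (2.37) for all terms and then multiplying by $Y_n^m(\theta_k, \phi_k)$ and integrating over the sphere with respect to $(\theta_k, \phi_k)$. Equations (2.66) and (2.67) can then be derived by exploring relationships between spherical Bessel and spherical Hankel functions [3].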
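We now consider the case where a sound field composed of multiple plane waves is measured around a spherical surface, $\mathbf{r} = (r, \theta, \phi)$, such that r is constant. In this case, the function on the sphere can be represented by coefficients in the spherical harmonics domain, as in Eq. (2.45):
\[
p(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k)\, j_n(kr)\, Y_n^m(\theta, \phi).
\tag{2.69}
\]
The coefficients $a_{nm}(k)$ provide information on the sound field and can be used to calculate the sound pressure at a position (r, θ, φ) relative to the origin. Now, keeping the same sound field, but shifting the origin of the coordinate system to $\mathbf{r}''$, we would like to calculate the sound pressure at position $(r', \theta', \phi')$ relative to the new origin, using a similar set of coefficients $a'_{nm}(k)$. We would like to formulate a direct relation between $a_{nm}(k)$ and $a'_{nm}(k)$. The sound pressure can be written using Eqs. (2.65) and (2.69) as
\[
\begin{aligned}
p(k, r, \theta, \phi) &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k)\, j_n(kr)\, Y_n^m(\theta, \phi) \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k) \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} j_{n'}(kr')\, Y_{n'}^{m'}(\theta', \phi') \\
&\quad \times \sum_{n''=0}^{\infty} j_{n''}(kr'')\, Y_{n''}^{m-m'}(\theta'', \phi'')\, C_{n'm'n''}^{nm} \\
&= \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} 4\pi i^{n'} j_{n'}(kr')\, Y_{n'}^{m'}(\theta', \phi') \\
&\quad \times \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \sum_{n''=0}^{\infty} a_{nm}(k)\, j_{n''}(kr'')\, Y_{n''}^{m-m'}(\theta'', \phi'')\, i^{n-n'}\, C_{n'm'n''}^{nm}.
\end{aligned}
\tag{2.70}
\]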
which provides a relationship between the sound field coefficients in the original and in the translated coordinates. Similar relations can also be developed using Eqs. (2.66) and (2.67). Note that $C_{n'm'n''}^{nm}$ is non-zero only for $|n - n'| \le n'' \le n + n'$, and so if $a_{nm}$ is of finite order, each coefficient $a'_{n'm'}$ can be calculated by a finite number of summations.
References
1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic Press, San
Diego (2001)
2. Ben Hagai, I., Pollow, M., Vorlander, M., Rafaely, B.: Acoustic centering of sources measured
by surrounding spherical microphone arrays. J. Acoust. Soc. Am. 130(4), 2003–2015 (2011)
3. Chew, W.C.: Waves and Fields in Inhomogeneous Media, 1st edn. Wiley-IEEE Press, New York
(1999)
4. Fisher, E.: Near-field spherical microphone array processing with radial filtering. IEEE Trans.
Audio Speech Lang. Process. 19(2), 256–265 (2011)
5. Jackson, J.D.: Classical Electrodynamics, 3rd edn. Wiley, New York (1999)
6. Kinsler, L.E., Frey, A.R., Coppens, A.B., Sanders, J.V.: Fundamentals of Acoustics, 4th edn.
Wiley, New York (1999)
7. Peleg, T., Rafaely, B.: Investigation of spherical loudspeaker arrays for local active control of
sound. J. Acoust. Soc. Am. 130(4), 1926–1935 (2011)
8. Williams, E.G.: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Academic, New York (1999)
Chapter 3
Sampling the Sphere
The sampling of functions defined over continuous variables such as time and space is
often required to enable digital processing of the sampled functions using computers.
The sampling of sound pressure functions in space requires microphones, where the
positions of the microphones determine the sampling points. The design of a spatial
sampling system using microphones involves a trade-off: reducing the number of
microphones leads to a reduction in system complexity, while increasing the number
of microphones may lead to improvement in the accuracy of the reconstruction of
the sound pressure function.
Sampling theorems, such as the Nyquist theorem, e.g. [11], require the function
to be band-limited to achieve perfect reconstruction from the samples. This means
that the function can be represented by a finite number of basis functions. In a similar
manner, sampling theorems for functions on the sphere require the functions to be
order-limited, or represented by a finite number of spherical harmonics.
A general formulation of the sampling problem can be derived starting from the
problem of quadrature. Quadrature methods aim to compute the integral of a given
function using a summation over samples of the function. Cubature is sometimes
used to refer to multiple integration. Consider a function g(θ, φ) defined on the unit
sphere. A quadrature method aims to approximate the integral of g(θ, φ), given a set
of samples on the sphere, (θq , φq ), and sampling weights, αq , as follows:
\[
\int_0^{2\pi} \int_0^{\pi} g(\theta, \phi) \sin\theta \, d\theta \, d\phi \approx \sum_{q=1}^{Q} \alpha_q\, g(\theta_q, \phi_q),
\tag{3.1}
\]
where Q is the total number of samples. The quadrature formulation for estimating the
area under the function can be extended to function reconstruction by substituting
g(θ, φ) = f (θ, φ)[Ynm (θ, φ)]∗ . Starting from Eq. (1.41), this substitution leads to
the approximation of the spherical Fourier transform of function f (θ, φ) from its
samples:
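\[
\begin{aligned}
f_{nm} &= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&\approx \sum_{q=1}^{Q} \alpha_q\, f(\theta_q, \phi_q) \left[ Y_n^m(\theta_q, \phi_q) \right]^*.
\end{aligned}
\tag{3.2}
\]
\[
\sum_{q=1}^{Q} \alpha_q\, Y_{n'}^{m'}(\theta_q, \phi_q) \left[ Y_n^m(\theta_q, \phi_q) \right]^* \approx \delta_{nn'} \delta_{mm'},
\tag{3.3}
\]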
with the total number of samples given by $4(N+1)^2$, determined by N. N also represents the maximum order of an order-limited function that can be reconstructed from these samples, as detailed later in this section. Note that the value of 1/2 added to the index q [5] ensures that samples are not selected at the poles. Placing samples at the pole [2] leads to 2N + 2 collocated samples due to the repetition of azimuth samples and, therefore, reduces the total number of non-collocated samples.
Figure 3.1 clearly shows that although the samples adhere to a uniform angular
distribution, as illustrated on the θ φ plane plot, they are not uniformly distributed on
the sphere, as illustrated on the unit sphere plot, since samples are more dense around
the poles. Such a sampling scheme may be useful when mechanically scanning
microphone positions or when representing sampled functions on the θ φ plane, for
example, due to the uniform grid when measured along the angles. A complete
theorem is available for this type of sampling, and is presented in this section in
some detail. The main results, providing expressions for the sampling weights and
the spherical Fourier transform, are presented in Eqs. (3.11) and (3.15).
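The equal-angle grid can be generated in a few lines. The sketch below assumes the grid $\theta_q = \pi(q + 1/2)/(2N+2)$ and $\phi_l = 2\pi l/(2N+2)$ with $q, l = 0, \ldots, 2N+1$; this form is an assumption chosen to be consistent with the sample count $4(N+1)^2$ and the half-sample offset described above, and the function name `equal_angle_grid` is illustrative.

```python
import math

def equal_angle_grid(N):
    """Return elevation and azimuth angles of an equal-angle grid for order N."""
    M = 2 * N + 2                                          # samples along each angle
    theta = [math.pi * (q + 0.5) / M for q in range(M)]    # half-sample offset avoids the poles
    phi = [2 * math.pi * l / M for l in range(M)]
    return theta, phi

theta, phi = equal_angle_grid(5)
num_samples = len(theta) * len(phi)                        # 4 (N + 1)^2 samples in total
```

For N = 5 this produces the 144 samples quoted for Fig. 3.1, with elevation angles placed symmetrically about the equator and none at the poles.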
Sampling of functions defined on the real line can be represented mathematically
by multiplication with a delta function at the sampling position. Similarly, for the
sphere, a “train” of delta functions at the sampling positions is defined as
\[
s(\theta, \phi) = \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, \delta(\cos\theta - \cos\theta_q)\, \delta(\phi - \phi_l).
\tag{3.5}
\]
The coefficients αq determine the amplitude of the delta functions, which reduces
towards the poles to compensate for the increased density of the samples. The deriva-
tion of the values of αq is presented later in this section. The spherical Fourier trans-
form of s(θ, φ), denoted by snm , is derived by substituting Eq. (3.5) in the spherical
Fourier transform, Eq. (1.41), and using the sifting property of the delta function, as
in Eq. (1.52):
\[
\begin{aligned}
s_{nm} &= \int_0^{2\pi} \int_0^{\pi} s(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi} \int_0^{\pi} \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, \delta(\cos\theta - \cos\theta_q)\, \delta(\phi - \phi_l) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q \left[ Y_n^m(\theta_q, \phi_l) \right]^*.
\end{aligned}
\tag{3.6}
\]
Fig. 3.1 Equal-angle sampling distribution, for N = 5 and a total of 144 samples, illustrated on the surface of a unit sphere and over the θφ plane
The summation over l can be evaluated by substituting the definition of the spherical
harmonics in Eq. (1.9) and noting that αq are independent of l, leading to
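\[
\begin{aligned}
s_{nm} &= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}}\, P_n^m(\cos\theta_q)\, e^{-im\phi_l} \\
&= \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \sum_{q=0}^{2N+1} \alpha_q\, P_n^m(\cos\theta_q) \sum_{l=0}^{2N+1} e^{-im\phi_l} \\
&= \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \sum_{q=0}^{2N+1} \alpha_q\, P_n^m(\cos\theta_q)\, (2N+2)\, \delta_{((m))_{2N+2}},
\end{aligned}
\tag{3.7}
\]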
where δm is a short notation for δm0 . The summation over l has been reduced to
a periodic delta function due to the uniform distribution of the samples along the
azimuth, where ((·)) N denotes modulo N . The modulo operation is also denoted
as (·) mod N in this book. In the range 0 ≤ n ≤ 2N + 1 with −N ≤ m ≤ N the
periodic delta function has only one non-zero term, therefore reducing to (2N + 2)δm .
The expression for snm can therefore be simplified further in this limited range:
\[
s_{nm} = 2(N+1)\, \delta_m \sqrt{\frac{2n+1}{4\pi}} \sum_{q=0}^{2N+1} \alpha_q\, P_n(\cos\theta_q), \quad n \le 2N+1.
\tag{3.8}
\]
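The reduction of the azimuth sum in Eq. (3.7) to a periodic delta function can be verified directly. The following minimal sketch assumes equally spaced azimuth samples $\phi_l = 2\pi l/(2N+2)$; the function name `azimuth_sum` is illustrative.

```python
import cmath
import math

def azimuth_sum(m, N):
    """Sum of exp(-i m phi_l) over the 2N + 2 equally spaced azimuth samples."""
    M = 2 * N + 2
    return sum(cmath.exp(-1j * m * 2 * math.pi * l / M) for l in range(M))

N = 3
# Evaluate the sum for a range of m spanning several periods of 2N + 2.
sums = {m: azimuth_sum(m, N) for m in range(-2 * (2 * N + 2), 2 * (2 * N + 2) + 1)}
```

The sum equals 2N + 2 when m is a multiple of 2N + 2 and is zero (to rounding error) otherwise, so within $|m| \le 2N+1$ only m = 0 survives.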
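The sampling weights $\alpha_q$ are selected to satisfy
\[
\sum_{q=0}^{2N+1} \alpha_q\, P_n(\cos\theta_q) = \frac{2\pi}{N+1}\, \delta_n, \quad n \le 2N+1,
\tag{3.10}
\]
which is achieved with
\[
\alpha_q = \frac{2\pi}{(N+1)^2} \sin(\theta_q) \sum_{q'=0}^{N} \frac{1}{2q'+1} \sin\left( [2q'+1]\, \theta_q \right), \quad 0 \le q \le 2N+1.
\tag{3.11}
\]
Substituting Eq. (3.10) in Eq. (3.8), the spherical Fourier transform of the impulse train can be written as
\[
s_{nm} = \sqrt{4\pi}\, \delta_n \delta_m + \tilde{s}_{nm},
\tag{3.12}
\]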
where s̃nm is non-zero only for n > 2N + 1. Therefore, the spherical harmonics
transform of the impulse train is non-zero for n = 0, m = 0, and zero elsewhere in
the range n ≤ 2N + 1.
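The weights of Eq. (3.11) and the condition of Eq. (3.10) can be checked numerically. This is a minimal sketch, assuming the equal-angle elevation samples $\theta_q = \pi(q + 1/2)/(2N+2)$; the function names are illustrative and the Legendre polynomials are computed with the standard three-term recurrence.

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def equal_angle_weights(N):
    """Elevation angles and weights alpha_q of Eq. (3.11) for order N."""
    M = 2 * N + 2
    theta = [math.pi * (q + 0.5) / M for q in range(M)]
    alpha = []
    for t in theta:
        series = sum(math.sin((2 * k + 1) * t) / (2 * k + 1) for k in range(N + 1))
        alpha.append(2 * math.pi / (N + 1) ** 2 * math.sin(t) * series)
    return theta, alpha

N = 4
theta, alpha = equal_angle_weights(N)
# Left-hand side of Eq. (3.10) for n = 0, ..., 2N + 1.
moments = [sum(a * legendre(n, math.cos(t)) for a, t in zip(alpha, theta))
           for n in range(2 * N + 2)]
```

Under these assumptions `moments[0]` equals $2\pi/(N+1)$ and the remaining entries vanish to rounding error, in agreement with Eq. (3.10).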
A sampled function on the sphere is now defined as f s (θ, φ) = f (θ, φ)s(θ, φ);
that is, an impulse train with the amplitude (area) of individual impulses being equal
to the amplitude of function f at the sampling points. The sampled function can be
written in terms of the original function by using Eq. (3.12):
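\[
\begin{aligned}
f_s(\theta, \phi) = f(\theta, \phi)\, s(\theta, \phi) &= f(\theta, \phi) \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left[ \sqrt{4\pi}\, \delta_n \delta_m + \tilde{s}_{nm} \right] Y_n^m(\theta, \phi) \\
&= f(\theta, \phi) \left[ \sqrt{4\pi}\, Y_0^0(\theta, \phi) + \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \tilde{s}_{nm}\, Y_n^m(\theta, \phi) \right] \\
&= f(\theta, \phi) + f(\theta, \phi)\, \tilde{s}(\theta, \phi),
\end{aligned}
\tag{3.13}
\]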
where s̃(θ, φ) is the inverse spherical Fourier transform of s̃nm , containing spherical
harmonics orders of 2N + 2 and above. It is argued in [2] that, because f (θ, φ)
and s̃(θ, φ) are polynomials in cos θ generated by the associated Legendre function,
the lowest order of the product of the two functions in the spherical harmonics
domain is given by the minimal difference between the orders of the individual
functions. Assuming f (θ, φ) is order-limited to n ≤ N , and knowing that s̃(θ, φ)
contains only orders n ≥ 2N + 2, the minimal difference between the orders of the
spherical harmonic coefficients of the two functions is therefore (2N + 2) − N =
N + 2. It follows that the product f (θ, φ)s̃(θ, φ) has a spherical Fourier transform
with coefficients that are zero in the range n ≤ N + 1, leading to the following
equality:
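\[
f_{s_{nm}} = f_{nm}, \quad n \le N.
\tag{3.14}
\]
\[
\begin{aligned}
f_{nm} &= f_{s_{nm}}, \quad n \le N \\
&= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi)\, s(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, \delta(\cos\theta - \cos\theta_q) \\
&\quad \times \delta(\phi - \phi_l) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, f(\theta_q, \phi_l) \left[ Y_n^m(\theta_q, \phi_l) \right]^*,
\end{aligned}
\tag{3.15}
\]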
with $\alpha_q$ given by Eq. (3.11). The sifting property, Eq. (1.52), together with Eqs. (3.5) and (3.14), has been employed in the derivation. This equation has the same form as Eq. (3.2) defined through quadrature computation, with $\alpha_q$, in this case, defining the quadrature weights. Substituting $f(\theta, \phi) = Y_{n'}^{m'}(\theta, \phi)$, the orthogonality condition for equal-angle sampling can be written in the same form as Eq. (3.3):
\[
\sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, Y_{n'}^{m'}(\theta_q, \phi_l) \left[ Y_n^m(\theta_q, \phi_l) \right]^* = \delta_{nn'} \delta_{mm'}, \quad n, n' \le N.
\tag{3.16}
\]
Second, function f (θ, φ) can be reconstructed from the sampled function fs (θ, φ)
by applying an ideal low-pass filter in the spherical harmonics domain, with a cut-off
order of N . This low-pass filter should set to zero the coefficients of f snm for n > N ,
and keep unchanged the coefficients for n ≤ N . Selecting the filter
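\[
h(\theta, \phi) = \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{1}{2\pi} \sqrt{\frac{2n+1}{4\pi}}\, Y_n^m(\theta, \phi)
\tag{3.17}
\]
and applying spherical convolution, $f_s(\theta, \phi) * h(\theta, \phi)$, which transforms to multiplication in the spherical harmonics domain [see Eq. (1.86)], $f_{nm}$ can be written as
\[
f_{nm} = 2\pi \sqrt{\frac{4\pi}{2n+1}}\, f_{s_{nm}}\, h_{n0}
= \begin{cases} f_{s_{nm}}, & n \le N \\ 0, & \text{otherwise} \end{cases},
\tag{3.18}
\]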
The Gaussian sampling scheme described in this section requires only $2(N+1)^2$
samples, which is half of the number of samples required by the equal-angle sampling
scheme. The azimuth angle is sampled at 2(N + 1) equal-angle samples, but the
elevation angle requires only (N + 1) samples, which are nearly equally spaced.
The mathematical formulation of the Gaussian sampling scheme is similar to the
formulation derived in Sect. 3.2 for the equal-angle scheme. There is, however, a
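difference in that for the Gaussian sampling scheme, the orthogonality over the summation of the Legendre functions
\[
\sum_{q=0}^{N} \alpha_q\, P_n(\cos\theta_q) = \frac{2\pi}{N+1}\, \delta_n, \quad n \le 2N+1
\tag{3.19}
\]
is achieved with only N + 1 samples along θ, positioned at the zeros of the Legendre polynomial of order N + 1,
\[
P_{N+1}(\cos\theta_q) = 0, \quad 0 \le q \le N,
\tag{3.20}
\]
with the sampling weights given by
\[
\alpha_q = \frac{\pi}{N+1}\, \frac{2(1 - \cos^2\theta_q)}{(N+2)^2 \left[ P_{N+2}(\cos\theta_q) \right]^2}, \quad 0 \le q \le N.
\tag{3.21}
\]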
The coefficients can also be found in tables [7], which also provide the sampling
positions. The spherical Fourier transform is given in this case by
\[
f_{nm} = \sum_{q=0}^{N} \sum_{l=0}^{2N+1} \alpha_q\, f(\theta_q, \phi_l) \left[ Y_n^m(\theta_q, \phi_l) \right]^*, \quad n \le N.
\tag{3.22}
\]
The advantage of the Gaussian sampling scheme is the reduced number of sample
points for a given order N compared with the equal-angle sampling scheme. The drawback is the potential inconvenience of the unequal spacing along θ, for example when microphones are rotated mechanically and an equal-step rotation would be an advantage.
Figure 3.2 illustrates an example of a Gaussian sampling distribution for N = 7
and a total of 128 samples. The figure shows the samples plotted on the surface
of a unit sphere and over the θ φ plane. The figure shows the features of Gaussian
sampling – twice as many samples are distributed along the azimuth compared to the
elevation, while, similar to the equal-angle sampling scheme, the samples are more
dense near the poles.
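The Gaussian sampling positions and weights can be computed without tables. The sketch below is one possible approach, not necessarily the procedure behind the tabulated values: it locates the zeros of $P_{N+1}$ by Newton iteration (using only the Python standard library) and then evaluates Eq. (3.21). The function names and the initial-guess formula are assumptions made for illustration.

```python
import math

def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) using the three-term recurrence, for n >= 1."""
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p, p_prev

def gaussian_sampling(N):
    """Elevation angles theta_q and weights alpha_q for Gaussian sampling of order N."""
    n_pts = N + 1
    theta, alpha = [], []
    for i in range(n_pts):
        x = math.cos(math.pi * (i + 0.75) / (n_pts + 0.5))   # initial guess for the i-th zero
        for _ in range(50):                                   # Newton iterations on P_{N+1}(x) = 0
            p, p_prev = legendre_pair(n_pts, x)
            dp = n_pts * (x * p - p_prev) / (x * x - 1)       # derivative of P_{N+1}
            x -= p / dp
        p_next, _ = legendre_pair(n_pts + 1, x)               # P_{N+2} at the zero
        theta.append(math.acos(x))
        alpha.append(math.pi / (N + 1) * 2 * (1 - x * x)
                     / ((N + 2) ** 2 * p_next ** 2))          # Eq. (3.21)
    return theta, alpha

N = 7
theta, alpha = gaussian_sampling(N)
total_samples = len(theta) * (2 * N + 2)                      # 2 (N + 1)^2
```

For N = 7 this gives 8 elevation angles and, combined with 16 azimuth samples, the 128 samples of Fig. 3.2; the weights sum to $2\pi/(N+1)$ as required by Eq. (3.19) with n = 0.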
Fig. 3.2 Gaussian sampling distribution for N = 7 and a total of 128 samples, illustrated on the surface of a unit sphere and over the θφ plane
3.4 Uniform and Nearly-Uniform Sampling
The equal-angle and Gaussian sampling schemes have a uniform (or nearly-uniform)
distribution of samples along θ and φ, but, as illustrated in Figs. 3.1 and 3.2, the distributions are not uniform on the surface of the sphere. An attempt to distribute sampling points uniformly around the surface of a sphere leads directly to the five convex regular polyhedra, known as Platonic solids, named after the Greek philosopher Plato. Figure 3.3 shows the five Platonic solids, namely, the tetrahedron, the
cube (or hexahedron), the octahedron, the dodecahedron and the icosahedron. The
Greek prefix denotes the number of faces for each Platonic solid (see Table 3.1).
Fig. 3.3 The five Platonic solids; from left to right, top row: tetrahedron and cube, bottom row:
octahedron, dodecahedron and icosahedron
Table 3.1 Properties of the sampling designs based on the five Platonic solids: the number of faces, the number of vertices representing the sampling points (Q), the t-design order and the corresponding maximum spherical harmonics order, calculated as N = ⌊t/2⌋

Design             Faces  Vertices  t-design  N = ⌊t/2⌋
Tetrahedron            4         4         2          1
Hexahedron (cube)      6         8         3          1
Octahedron             8         6         3          1
Dodecahedron          12        20         5          2
Icosahedron           20        12         5          2
\[
\int_0^{2\pi} \int_0^{\pi} g(\theta, \phi) \sin\theta \, d\theta \, d\phi = \frac{4\pi}{Q} \sum_{q=1}^{Q} g(\theta_q, \phi_q),
\tag{3.23}
\]
such that the sampling weights, in reference to Eq. (3.1), are constant, satisfying
αq = 4π/Q. Equation (3.23) holds for an order-limited function, with an upper order
denoted by t in a t-design that is defined for each Platonic solid, as shown in Table 3.1.
The term t-design is used in spherical designs that aim to find a set of Q points on a
sphere such that Eq. (3.23) holds for a function of a polynomial order t or lower [4].
Spherical designs can be used for the sampling of order-limited functions represented
by spherical harmonics, by replacing g(θ, φ) with f (θ, φ)[Ynm (θ, φ)]∗ , such that
Eq. (3.23) can be written in the form of Eq. (3.2) as
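\[
\begin{aligned}
f_{nm} &= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \frac{4\pi}{Q} \sum_{q=1}^{Q} f(\theta_q, \phi_q) \left[ Y_n^m(\theta_q, \phi_q) \right]^*, \quad n \le N.
\end{aligned}
\tag{3.24}
\]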
Assuming that $f(\theta, \phi)$ has a maximum order N, and substituting $Y_n^m(\theta, \phi)$ for $n \le N$, the maximum order of the product $f(\theta, \phi)[Y_n^m(\theta, \phi)]^*$ is 2N. This is the polynomial order that the t-design must support, giving the relation $N = \lfloor t/2 \rfloor$ for a given value of t, where $\lfloor \cdot \rfloor$ denotes the floor function, as presented in Table 3.1.
Sampling distributions based on the Platonic solids and Table 3.1 offer uniform
distributions of samples, with a simple equation to compute the spherical Fourier
transform. However, they are available only for a limited number of configurations,
and for a maximum of 20 samples, supporting a maximum order of only N = 2.
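The constant-weight quadrature of Eq. (3.24) can be illustrated with the octahedron, which supports N = 1 according to Table 3.1. The following minimal sketch places the six vertices at the two poles and at four equally spaced points on the equator (one possible orientation, chosen here for convenience) and checks that the spherical harmonics up to order 1 are orthonormal under the weights 4π/Q. The explicit order-1 harmonics assume the usual normalization with the Condon-Shortley phase; the function name `sh_order1` is illustrative.

```python
import cmath
import math

def sh_order1(theta, phi):
    """Spherical harmonics up to order 1, ordered (0,0), (1,-1), (1,0), (1,1)."""
    c = math.sqrt(3 / (8 * math.pi))
    return [
        1 / math.sqrt(4 * math.pi),
        c * math.sin(theta) * cmath.exp(-1j * phi),
        math.sqrt(3 / (4 * math.pi)) * math.cos(theta),
        -c * math.sin(theta) * cmath.exp(1j * phi),
    ]

# Octahedron vertices as (theta, phi): two poles and four points on the equator.
points = [(0.0, 0.0), (math.pi, 0.0)] + [(math.pi / 2, k * math.pi / 2) for k in range(4)]
Q = len(points)
Y = [sh_order1(t, p) for t, p in points]          # Q x (N + 1)^2 values, N = 1

# Gram matrix (4*pi/Q) * Y^H Y, which should equal the identity for N = 1.
gram = [[4 * math.pi / Q * sum(Y[q][a].conjugate() * Y[q][b] for q in range(Q))
         for b in range(4)] for a in range(4)]
```

With only six samples, the 4 × 4 matrix `gram` is the identity to rounding error, so all coefficients up to order N = 1 are recovered exactly from the samples with equal weights.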
This limited number of samples offered by the Platonic solids in a uniform-sampling
configuration motivated the search for methods to distribute a larger number of sam-
ples on the sphere in an almost-uniform manner. A wide range of methods have been
presented in the literature. Some are optimal in the sense that an objective function
is defined, after which the position of the samples and the corresponding sampling
weights are computed via numerical optimization. Other methods are characterized
by a special procedure for selecting the samples, or by other characteristics, such as
constant sampling weights. This section briefly reviews some of these methods.
Hardin and Sloane [4] extend the t-designs of the Platonic solids to a larger
set of sampling configurations, each satisfying Eq. (3.24) for some t value and a
corresponding order N . Similar to the Platonic solids, these designs offer an almost
uniform distribution of samples, with the convenience of constant sampling weights.
Although Hardin and Sloane computed and published the coordinates of a large
number of sampling sets, these sets are not available for any number of desired
samples, Q.
Saff and Kuijlaars [13] present an overview of approaches and methods for dis-
tributing many points on a sphere. They outline objectives for distributing points on
the sphere, which include maximizing the smallest distance between all points on the
sphere and minimizing the “energy” of points on the sphere. The latter is derived by
considering each point to be a charged particle repelling all other particles; therefore,
minimizing the sum of the inverse of the distances between these particles is analogous to minimizing "energy". The latter objective was also used by Fliege and Maier [3], who presented a numerical method for computing the sampling positions and weights. This method was recently employed in the design of spherical microphone arrays [9].
Other approaches are characterized by the way in which the points are selected.
Equal-area partitioning aims to partition the sphere surface into equal area segments,
which each have a minimal diameter. One such method, described in [13] by Saff
and Kuijlaars and, more recently, by Leopardi [8], partitions the sphere surface into
azimuthal strips, each further divided into sections, with each section having the
same area. Sampling points are then positioned, one in each area element. Another
method described in [13] distributes points on spirals covering the sphere surface,
providing a relatively simple approach for a nearly-uniform distribution of samples.
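As an illustration of a spiral-type construction, the sketch below uses the golden-angle (Fibonacci) spiral, a commonly used scheme that is named here explicitly because it is not necessarily the construction described in [13]. The points are placed in equal-area bands along z = cos θ, and the azimuth advances by the golden angle from one point to the next. The function name `spiral_points` is illustrative.

```python
import math

def spiral_points(Q):
    """Q nearly-uniform points (theta, phi) on the unit sphere along a golden-angle spiral."""
    golden_angle = math.pi * (3 - math.sqrt(5))
    points = []
    for k in range(Q):
        z = 1 - (2 * k + 1) / Q                 # equal-area bands: z equally spaced in (-1, 1)
        theta = math.acos(z)
        phi = (k * golden_angle) % (2 * math.pi)
        points.append((theta, phi))
    return points

def angular_distance(p, q):
    """Great-circle angle between two points given as (theta, phi)."""
    c = (math.cos(p[0]) * math.cos(q[0])
         + math.sin(p[0]) * math.sin(q[0]) * math.cos(p[1] - q[1]))
    return math.acos(max(-1.0, min(1.0, c)))

pts = spiral_points(32)
min_separation = min(angular_distance(p, q)
                     for i, p in enumerate(pts) for q in pts[i + 1:])
```

Such a set has no exact quadrature weights associated with it, so in practice it would be combined with numerically computed sampling weights.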
Figure 3.4 illustrates an example of a uniform sampling distribution defined by
the vertices of a dodecahedron, with N = 2 and a total of 20 samples. Figure 3.5
illustrates a t-design with N = 8 and a total of 144 samples. Both figures show the
uniform distribution of the samples over the sphere surface and the non-uniform
distribution over the θ φ plane.
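\[
f(\theta_q, \phi_q) = \sum_{n=0}^{N} \sum_{m=-n}^{n} f_{nm}\, Y_n^m(\theta_q, \phi_q), \quad 1 \le q \le Q.
\tag{3.25}
\]
\[
\mathbf{f} = \mathbf{Y} \mathbf{f}_{nm},
\tag{3.26}
\]
where column vectors $\mathbf{f}$ of length Q and $\mathbf{f}_{nm}$ of length $(N+1)^2$ are defined as
\[
\mathbf{f} = \left[ f(\theta_1, \phi_1), f(\theta_2, \phi_2), \ldots, f(\theta_Q, \phi_Q) \right]^T
\tag{3.27}
\]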
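and
\[
\mathbf{f}_{nm} = \left[ f_{00}, f_{1(-1)}, f_{10}, f_{11}, \ldots, f_{NN} \right]^T,
\tag{3.28}
\]
and the $Q \times (N+1)^2$ matrix $\mathbf{Y}$ holds the spherical harmonics evaluated at the sampling points:
\[
\mathbf{Y} = \begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & Y_1^{-1}(\theta_1, \phi_1) & \cdots & Y_N^N(\theta_1, \phi_1) \\
Y_0^0(\theta_2, \phi_2) & Y_1^{-1}(\theta_2, \phi_2) & \cdots & Y_N^N(\theta_2, \phi_2) \\
\vdots & \vdots & \ddots & \vdots \\
Y_0^0(\theta_Q, \phi_Q) & Y_1^{-1}(\theta_Q, \phi_Q) & \cdots & Y_N^N(\theta_Q, \phi_Q)
\end{bmatrix}.
\tag{3.29}
\]
Fig. 3.4 Uniform sampling distribution for N = 2 and a total of 20 samples, illustrated on the surface of a unit sphere and over the θφ plane
Fig. 3.5 Nearly-uniform sampling distribution for N = 8 and a total of 144 samples, illustrated on the surface of a unit sphere and over the θφ plane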
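For the special case of $Q = (N+1)^2$, the system of equations defined in Eq. (3.26) can be solved by taking the inverse of matrix $\mathbf{Y}$:
\[
\mathbf{f}_{nm} = \mathbf{Y}^{-1} \mathbf{f}.
\tag{3.30}
\]
For the case $Q > (N+1)^2$, the system is over-determined and a least-squares solution is obtained using the pseudo-inverse of $\mathbf{Y}$:
\[
\mathbf{f}_{nm} = \mathbf{Y}^{\dagger} \mathbf{f},
\tag{3.31}
\]
with $\mathbf{Y}^{\dagger} = (\mathbf{Y}^H \mathbf{Y})^{-1} \mathbf{Y}^H$. For the case $Q < (N+1)^2$, the number of samples is insufficient, signifying under-sampling, and Eq. (3.26) may not provide the correct solution.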
Equations (3.30) and (3.31) can be used to find f nm for a general sampling set,
from which the function on the sphere, f (θ, φ), can be reconstructed using the
inverse spherical Fourier transform. This is employed below to formulate the com-
putation of f nm in a more standard manner, i.e. as the sum of the product of sam-
ples and sampling weights. Equations (3.30) or (3.31) are rewritten in the following
form:
\[
f_{nm} = \sum_{q=1}^{Q} \alpha_q^{nm}\, f(\theta_q, \phi_q).
\tag{3.32}
\]
Equation (3.32) has a form similar to Eq. (3.2), and so αqnm can be considered as the
sampling weights used to compute f nm given the samples f (θq , φq ). Note that in
this case the value of the weights may vary independently as a function of n and m.
Furthermore, the similarity between Eq. (3.32) and Eqs. (3.30) and (3.31) suggests that the sampling weights, $\alpha_q^{nm}$, are the elements of matrices $\mathbf{Y}^{-1}$ or $\mathbf{Y}^{\dagger}$, having a row index given by $(n^2 + n + m)$ and a column index given by q.
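The numerical computation of the weights can be sketched with NumPy. The example below assumes N = 1, for which the spherical harmonics have simple closed forms, and a hypothetical irregular set of Q = 6 sampling points chosen only for illustration; the function name `sh_matrix_order1` is likewise illustrative. The weights $\alpha_q^{nm}$ are obtained as the rows of the pseudo-inverse of the spherical harmonics matrix.

```python
import numpy as np

def sh_matrix_order1(theta, phi):
    """Q x 4 matrix of spherical harmonics up to N = 1, columns (0,0), (1,-1), (1,0), (1,1)."""
    c = np.sqrt(3 / (8 * np.pi))
    return np.column_stack([
        np.full(theta.shape, 1 / np.sqrt(4 * np.pi), dtype=complex),
        c * np.sin(theta) * np.exp(-1j * phi),
        np.sqrt(3 / (4 * np.pi)) * np.cos(theta) + 0j,
        -c * np.sin(theta) * np.exp(1j * phi),
    ])

# Hypothetical, irregular sampling set with Q = 6 > (N + 1)^2 = 4 points.
theta = np.array([0.3, 1.0, 1.4, 1.9, 2.4, 2.9])
phi = np.array([0.1, 2.0, 4.1, 5.5, 1.2, 3.3])

Y = sh_matrix_order1(theta, phi)
weights = np.linalg.pinv(Y)                 # row n^2 + n + m, column q holds alpha_q^{nm}

f_nm_true = np.array([1.0, 0.5 - 0.2j, -0.3, 0.1j])   # arbitrary order-limited coefficients
f = Y @ f_nm_true                           # samples of the function, Eq. (3.26)
f_nm = weights @ f                          # recovered coefficients, Eqs. (3.31) and (3.32)
```

Because the function is order-limited to N = 1 and the sampling set is over-determined, `f_nm` reproduces `f_nm_true` to rounding error; for a less favorable sampling set the conditioning of `Y` would determine how robust this inversion is.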
The problem of first computing weights and then fnm given samples of the function
is related to the problem of interpolating a function given the samples. Substituting
Eq. (3.32) into the inverse spherical Fourier transform, Eq. (1.40), leads to the fol-
lowing derivation:
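\[
\begin{aligned}
f(\theta, \phi) &= \sum_{n=0}^{N} \sum_{m=-n}^{n} f_{nm}\, Y_n^m(\theta, \phi)
= \sum_{n=0}^{N} \sum_{m=-n}^{n} \left[ \sum_{q=1}^{Q} \alpha_q^{nm}\, f(\theta_q, \phi_q) \right] Y_n^m(\theta, \phi) \\
&= \sum_{q=1}^{Q} \left[ \sum_{n=0}^{N} \sum_{m=-n}^{n} \alpha_q^{nm}\, Y_n^m(\theta, \phi) \right] f(\theta_q, \phi_q) \\
&= \sum_{q=1}^{Q} \alpha_q(\theta, \phi)\, f(\theta_q, \phi_q),
\end{aligned}
\tag{3.33}
\]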
where αq (θ, φ) is the inverse spherical Fourier transform of αqnm . Functions αq (θ, φ)
can be considered as interpolating functions, such that when multiplied by the
value of the samples, f (θq , φq ), and added together, they provide the values of
f (θ, φ) in between the samples. This is in line with the interpolatory quadrature
method [1].
Equations (3.26) and (3.31), derived in Sect. 3.5, can be considered to be discrete
versions of the spherical Fourier transform and its inverse, presented in Eqs. (1.40)
and (1.41). Therefore, these are denoted the discrete spherical Fourier transform and
its inverse:
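\[
\begin{aligned}
\mathbf{f}_{nm} &= \mathbf{Y}^{\dagger} \mathbf{f} \\
\mathbf{f} &= \mathbf{Y} \mathbf{f}_{nm}.
\end{aligned}
\tag{3.34}
\]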
For the special cases of the equal-angle, Gaussian and uniform sampling config-
urations, where closed-form expressions are available for the sampling weights, the
discrete spherical Fourier transform can be computed without the need for matrix
inversion, using
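\[
\mathbf{f}_{nm} = \mathbf{Y}^H \operatorname{diag}(\boldsymbol{\alpha})\, \mathbf{f},
\tag{3.35}
\]
where vector $\boldsymbol{\alpha}$, of length Q, holds the sampling weights. Equation (3.35) is a matrix representation of Eq. (3.2).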
Substituting Eq. (3.35) into the inverse discrete spherical Fourier transform in
Eq. (3.34), the following holds:
\[
\mathbf{Y}^H \operatorname{diag}(\boldsymbol{\alpha})\, \mathbf{Y} = \mathbf{I},
\tag{3.37}
\]
which shows the orthogonality of the weighted columns in the spherical harmonics
matrix Y. Furthermore, for the uniform and nearly-uniform sampling configurations,
in which αq are constants equal to 4π/Q, Eqs. (3.35) and (3.37) reduce to
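\[
\mathbf{f}_{nm} = \frac{4\pi}{Q} \mathbf{Y}^H \mathbf{f}
\tag{3.38}
\]
and
\[
\frac{4\pi}{Q} \mathbf{Y}^H \mathbf{Y} = \mathbf{I}.
\tag{3.39}
\]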
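The three forms of the discrete spherical Fourier transform, Eqs. (3.34), (3.35), and (3.38), can be written in a unified manner by defining matrix $\mathbf{S}$ such that
\[
\mathbf{f}_{nm} = \mathbf{S} \mathbf{f},
\tag{3.40}
\]
where, for the three forms in turn,
\[
\mathbf{S} = \mathbf{Y}^{\dagger},
\tag{3.41}
\]
\[
\mathbf{S} = \mathbf{Y}^H \operatorname{diag}(\boldsymbol{\alpha})
\tag{3.42}
\]
\[
\mathbf{S} = \frac{4\pi}{Q} \mathbf{Y}^H.
\tag{3.43}
\]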
Equation (3.39) suggests that matrix $\sqrt{4\pi/Q}\, \mathbf{Y}$ is unitary, when square. This property
is similar to the property of discrete Fourier transform (DFT) matrices; therefore,
uniform and nearly-uniform sampling schemes with the associated discrete spherical
Fourier transform matrices can be considered to be equivalent to the DFT matrices
in the time domain.
An important property of unitary matrices is that all their singular values are equal to one, and all their eigenvalues have unit magnitude. Sampling schemes in which the samples are distributed less uniformly over the sphere will produce a spread in the magnitudes of the singular values,
and so the matrix inversion process required in the computation of the spherical
Fourier transform may have reduced numerical robustness. This motivates the design
of sampling sets that distribute samples on the sphere surface in an approximately
uniform manner.
Similar to the fast Fourier transform (FFT), developed to compute the DFT effi-
ciently, studies proposing fast and efficient computations of the spherical Fourier
transform have been published. The reader is referred to [10], for example, for fur-
ther reading on this topic.
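\[
\begin{aligned}
\hat{f}_{nm} &= \sum_{q=1}^{Q} \alpha_q^{nm}\, f(\theta_q, \phi_q) \\
&= \sum_{q=1}^{Q} \alpha_q^{nm} \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} f_{n'm'}\, Y_{n'}^{m'}(\theta_q, \phi_q) \\
&= \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} \left[ \sum_{q=1}^{Q} \alpha_q^{nm}\, Y_{n'}^{m'}(\theta_q, \phi_q) \right] f_{n'm'} \\
&= \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} \varepsilon_{nm}^{n'm'}\, f_{n'm'},
\end{aligned}
\tag{3.44}
\]
where
\[
\varepsilon_{nm}^{n'm'} = \sum_{q=1}^{Q} \alpha_q^{nm}\, Y_{n'}^{m'}(\theta_q, \phi_q)
\tag{3.45}
\]
has been defined to denote the contribution of each coefficient $f_{n'm'}$ to the approximation of coefficient $\hat{f}_{nm}$. Under conditions of ideal, aliasing-free sampling, $\varepsilon_{nm}^{n'm'} = \delta_{nn'} \delta_{mm'}$. Equation (3.44) can be written in matrix form as
\[
\hat{\mathbf{f}}_{nm} = \mathbf{E}\, \mathbf{f}_{nm},
\tag{3.46}
\]
where column vector $\hat{\mathbf{f}}_{nm}$ of length $(N+1)^2$ holds the approximated spherical harmonic coefficients $\hat{f}_{nm}$, column vector $\mathbf{f}_{nm}$ of length $(\tilde{N}+1)^2$ holds the spherical harmonic coefficients $f_{nm}$ of the original function, with $\tilde{N} \ge N$ and potentially very large, and matrix $\mathbf{E}$ is of dimensions $(N+1)^2 \times (\tilde{N}+1)^2$, having elements $\varepsilon_{nm}^{n'm'}$, with row index $(n^2 + n + m)$ and column index $(n'^2 + n' + m')$.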
Fig. 3.6 Elements of the aliasing matrix $\varepsilon_{nm}^{n'm'}$ (in dB) for $n \le 3$ and $n' \le 9$ and for three sampling configurations; a equal-angle (64 samples), b Gaussian (32 samples) and c nearly-uniform (32 samples)
Sampling schemes that guarantee aliasing-free sampling for order-limited functions satisfying $Q \ge (N+1)^2$, where N is the order limit, should produce a matrix $\mathbf{E}$ with the top-left part of dimensions $(N+1)^2 \times (N+1)^2$ being the unit matrix $\mathbf{I}$. In this case, only orders higher than N may produce aliasing. For an arbitrary sampling scheme, with $\alpha_q^{nm}$ denoting the elements of $\mathbf{Y}^{\dagger}$ (see Sect. 3.5), matrix $\mathbf{E}$ can be written as
\[
\mathbf{E} = \mathbf{Y}^{\dagger} \tilde{\mathbf{Y}},
\tag{3.47}
\]
where matrix $\mathbf{Y}$ of dimensions $Q \times (N+1)^2$ has been defined in Eq. (3.29) and matrix $\tilde{\mathbf{Y}}$, holding the values of $Y_{n'}^{m'}(\theta_q, \phi_q)$ as in Eq. (3.45), is of dimensions $Q \times (\tilde{N}+1)^2$. For equal-angle and Gaussian sampling, the sampling weights are provided in closed form and no matrix inversion is required. In these cases matrix $\mathbf{E}$ can be written as
\[
\mathbf{E} = \mathbf{Y}^H \operatorname{diag}(\boldsymbol{\alpha})\, \tilde{\mathbf{Y}},
\tag{3.48}
\]
where vector α holds the sampling weights, as in Eq. (3.35). In the case of uniform
and nearly-uniform sampling, the expression for E is further simplified due to the
constant sampling weights, and is written as
\[
\mathbf{E} = \frac{4\pi}{Q} \mathbf{Y}^H \tilde{\mathbf{Y}}.
\tag{3.49}
\]
The magnitudes of the elements of matrix $\mathbf{E}$, i.e. $|\varepsilon_{nm}^{n'm'}|$, are presented in Fig. 3.6 for the three sampling configurations: equal-angle, Gaussian and nearly uniform, for N = 3 and $\tilde{N} = 9$. The values of (n, m) are presented on a single axis, with a running index $n^2 + n + m$, where sections of equal order n are partitioned by a horizontal line. The values of (n', m') are presented similarly. The figure shows the manner in which high orders, n' > N, are aliased into lower orders. The figure shows that not all elements (n', m') contribute to the aliasing error in each (n, m).
An example is presented next to illustrate the process of sampling and aliasing.
Consider a function on the sphere:
This function is illustrated in Fig. 3.8. The figure confirms the sampling and aliasing
process described above: the spherical harmonic of order zero is reconstructed with
no error, while spherical harmonics of order n = 5 are aliased to spherical harmonics
of order n = 3.
The aliasing structure for the equal-angle and Gaussian sampling configurations
has been analyzed in detail in [12] and is now outlined here. First, note that although
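\[
\begin{aligned}
\varepsilon_{nm}^{n'm'} &= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q \left[ Y_n^m(\theta_q, \phi_l) \right]^* Y_{n'}^{m'}(\theta_q, \phi_l) \\
&= \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \sqrt{\frac{2n'+1}{4\pi} \frac{(n'-m')!}{(n'+m')!}} \\
&\quad \times \sum_{q=0}^{2N+1} \alpha_q\, P_n^m(\cos\theta_q)\, P_{n'}^{m'}(\cos\theta_q) \sum_{l=0}^{2N+1} e^{i\phi_l (m'-m)}.
\end{aligned}
\tag{3.52}
\]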
For the Gaussian sampling case, the summation over q ranges from zero to N. Now, due to the equal spacing the summation over l is zero, unless $(m' - m) \bmod (2N+2) = 0$. Therefore, aliasing clearly occurs for terms with m' = m, as is evident from the diagonal behavior within given orders n, n'. For the higher orders of n', replicas of the diagonal term are also evident, due to the modulo operation.
The final property affecting the behavior of aliasing is due to the summation over q. The samples, arranged symmetrically relative to the equator along the elevation, with a similar symmetry for the sampling weights, produce a sum of zero along q when n + m + n' + m' is odd [12]. Now, because $(m' - m) \bmod (2N+2)$ is zero, the condition for a sum of zero along q reduces to n + n' being odd. This is clearly evident in Fig. 3.6a, b, where alternating regions of constant n and n' are zero, which indeed occurs when n + n' is odd.
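The aliasing structure described above can be reproduced numerically. The following is a minimal sketch for equal-angle sampling using only the Python standard library; it assumes the grid $\theta_q = \pi(q + 1/2)/(2N+2)$, $\phi_l = 2\pi l/(2N+2)$ together with the weights of Eq. (3.11), and spherical harmonics normalized as in Eq. (3.7) with the Condon-Shortley phase. The function names and the small orders N = 2 and Ñ = 6 are chosen only to keep the example short.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x), m >= 0, with the Condon-Shortley phase."""
    p_mm = 1.0
    for k in range(1, m + 1):                       # P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
        p_mm *= -(2 * k - 1) * math.sqrt(1 - x * x)
    if n == m:
        return p_mm
    p_prev, p = p_mm, x * (2 * m + 1) * p_mm        # P_{m+1}^m
    for k in range(m + 2, n + 1):
        p_prev, p = p, (x * (2 * k - 1) * p - (k + m - 1) * p_prev) / (k - m)
    return p

def sph_harm(n, m, theta, phi):
    """Spherical harmonic Y_n^m; negative m via Y_n^{-m} = (-1)^m conj(Y_n^m)."""
    if m < 0:
        return (-1) ** (-m) * sph_harm(n, -m, theta, phi).conjugate()
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def aliasing_matrix(N, N_tilde):
    """Elements eps[(n, m, n', m')] of Eq. (3.52) for equal-angle sampling."""
    M = 2 * N + 2
    thetas = [math.pi * (q + 0.5) / M for q in range(M)]
    phis = [2 * math.pi * l / M for l in range(M)]
    alphas = [2 * math.pi / (N + 1) ** 2 * math.sin(t)
              * sum(math.sin((2 * k + 1) * t) / (2 * k + 1) for k in range(N + 1))
              for t in thetas]                      # weights of Eq. (3.11)
    eps = {}
    for n in range(N + 1):
        for m in range(-n, n + 1):
            for n2 in range(N_tilde + 1):
                for m2 in range(-n2, n2 + 1):
                    eps[(n, m, n2, m2)] = sum(
                        a * sph_harm(n, m, t, p).conjugate() * sph_harm(n2, m2, t, p)
                        for t, a in zip(thetas, alphas) for p in phis)
    return eps

N, N_tilde = 2, 6
eps = aliasing_matrix(N, N_tilde)
```

In this sketch the block with n' ≤ N is the identity, every element with (m' − m) mod (2N + 2) ≠ 0 or with n + n' odd vanishes to rounding error, and the remaining elements with n' > N carry the aliasing.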
Other sampling configurations may not exhibit such a regular aliasing pattern. For example, $\varepsilon_{nm}^{n'm'}$ for a nearly-uniform sampling configuration with 32 samples is presented in Fig. 3.6c. Although some patterns similar to that shown in Fig. 3.6a, b are observed, e.g. diagonal aliasing terms, the pattern is more complex in general.
References
1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic, San Diego
(2001)
2. Driscoll, J.R., Healy Jr., D.M.: Computing Fourier transforms and convolutions on the 2-sphere.
Adv. Appl. Math. 15(2), 202–250 (1994)
3. Fliege, J., Maier, U.: The distribution of points on the sphere and corresponding cubature
formulae. IMA J. Numer. Anal. 19(2), 317–334 (1999)
4. Hardin, R.H., Sloane, N.J.A.: McLaren’s improved snub cube and other new spherical designs
in three dimensions. Discret. Comput. Geom. 15(4), 429–441 (1995)
5. Healy Jr., D.M., Rockmore, D.N., Kostelec, P.J., Moore, S.: FFTs for the 2-sphere - improve-
ments and variations. J. Fourier Anal. Appl. 9(4), 341–384 (2003)
6. Hildebrand, F.B.: Introduction to Numerical Analysis, 2nd edn. McGraw-Hill, New York (1974)
7. Krylov, V.I.: Approximate Calculation of Integrals. Macmillan, New York (1962)
8. Leopardi, P.: A partition of the unit sphere into regions of equal area and small diameter.
Electron. Trans. Numer. Anal. 25, 309–327 (2006)
9. Li, Z., Duraiswami, R.: Flexible and optimal design of spherical microphone arrays for beam-
forming. IEEE Trans. Audio Speech Lang. Process. 15(2), 702–714 (2007)
10. Mohlenkamp, M.J.: A fast transform for spherical harmonics. J. Fourier Anal. Appl. 5(2/3),
159–184 (1999)
11. Proakis, J.G., Manolakis, D.K.: Digital Signal Processing, 4th edn. Prentice Hall, New Jersey
(2006)
12. Rafaely, B., Weiss, B., Bachmat, E.: Spatial aliasing in spherical microphone arrays. IEEE
Trans. Signal Process. 55(3), 1003–1010 (2007)
13. Saff, E.B., Kuijlaars, A.B.J.: Distributing many points on a sphere. Math. Intel. 19(1), 5–11
(1997)
Chapter 4
Spherical Array Configurations
$$
p_{nm}(k, r) = \sum_{q=1}^{Q} \alpha_q^{nm}\, p(k, r, \theta_q, \phi_q), \quad n \le N. \tag{4.1}
$$
The total number of samples is given by Q and the maximum reconstructed order is
N , where αqnm are the sampling weights. Perfect reconstruction will be achieved only
if the sampled pressure function is order limited, i.e. pnm = 0 ∀ n > N . However, as
discussed in Sect. 2.3 and illustrated in Fig. 2.6, a sound field composed of plane
waves, for example, is not order limited, so that errors due to spatial aliasing are
unavoidable when reconstructing the sound pressure from its samples. Nevertheless,
these errors can be made negligible if the magnitude of the high-order coefficients
is kept sufficiently small. This is maintained for all n ≫ kr. Hence, assuming that
the choice of sampling method, frequency and sphere radius satisfy kr < N , spatial
aliasing error can be kept small.
Although the reconstruction of sound pressure on the surface of the measurement
sphere may be feasible with some limited aliasing error, the reconstruction of sound
pressure around the sphere requires the following formulation [see Eq. (2.47)]:
$$
p(k, r', \theta', \phi') = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \frac{j_n(kr')}{j_n(kr)}\, p_{nm}(k, r)\, Y_n^m(\theta', \phi'), \tag{4.2}
$$
difficult to avoid a division by zero, unless a very restricted set of frequencies, radii
and orders are selected. This is the main drawback of the single-sphere configuration
with pressure microphones in free field and is, therefore, the reason for consider-
ing other spherical array configurations, such as an array configured around a rigid
sphere.
Another important issue related to array configuration is sensitivity to sensor noise.
Figure 2.2 shows that jn (kr ) vanishes for all n > 0 as kr → 0. Furthermore, the decay
towards zero is steeper for higher orders. This means that pressure reconstruction
away from the sphere, as in Eq. (4.2), or general array processing methods, may
require division by a small value for low kr ; this may, potentially, amplify noise in
a practical array system. One way to avoid this undesirable effect is to reduce the
effective array order, N , at low frequencies, by including only coefficients with a
sufficiently large magnitude. This, however, may come at the expense of performance
in terms of accuracy of reconstruction and spatial resolution, which depend on N
(see Chap. 5).
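The low-frequency behavior can be sketched numerically. The example below is an illustration under stated assumptions: it evaluates the open-sphere magnitude $|4\pi i^n j_n(kr)|$ in dB, uses a speed of sound of c = 343 m/s, and applies a hypothetical threshold of -30 dB to select the effective order. The function names and the threshold value are not taken from the text.

```python
import math

def sph_jn(n, x, terms=40):
    """Spherical Bessel function j_n(x) from its power series (adequate for modest x)."""
    prefactor = 1.0
    for k in range(1, n + 1):
        prefactor *= x / (2 * k + 1)          # builds x^n / (2n+1)!!
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return prefactor * total

def open_bn_db(n, f, r, c=343.0):
    """Magnitude of 4*pi*i^n*j_n(kr) in dB for an open sphere of radius r at frequency f."""
    kr = 2 * math.pi * f * r / c
    return 20 * math.log10(4 * math.pi * abs(sph_jn(n, kr)))

def effective_order(f, r, N, threshold_db, c=343.0):
    """Largest order n <= N such that all orders up to n stay above the threshold."""
    order = 0
    for n in range(N + 1):
        if open_bn_db(n, f, r, c) >= threshold_db:
            order = n
        else:
            break
    return order

f, r, N = 1000.0, 0.08, 5
for n in range(N + 1):
    print(n, round(open_bn_db(n, f, r), 1))
print("effective order:", effective_order(f, r, N, threshold_db=-30.0))
```

The magnitudes fall steeply with n at this frequency, so a lower threshold retains more orders at the cost of stronger noise amplification when dividing by the small values.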
It is clear from the analysis presented above that the array configuration may affect
various aspects of array performance. The theoretical analysis is now summarized
in the following points, which serve as considerations for spherical array design,
demonstrated in the design example that follows.
(i) The spatial sampling method is first selected (see Chap. 3). This defines the
angular part of the position of the microphones, (θq , φq ), q = 1, . . . , Q, and
the maximum order N for aliasing-free sampling of functions on the sphere.
(ii) The radius of the sphere, r , is then selected. This defines the radial part of the
position of the microphones. With r and N defined, the frequency range of
operation can be established.
(iii) The upper frequency limit is bounded by spatial aliasing. The upper frequency
f determines the wave number, k = 2π f /c, which, together with r and N , must
satisfy kr < N to avoid significant error due to spatial aliasing.
(iv) The lower frequency limit is bounded by sensor noise and other errors, such
as mismatch in microphone gain and phase response, inaccurate positioning
of microphones and limited computational accuracy. Array processing that
involves a division by jn (kr ) may be ill-conditioned at low frequencies and
low values of kr if the magnitude of jn (kr ) is small. For a given frequency
and radius satisfying kr ≪ N, the highest order, n = N, will have the lowest
magnitude and so will contribute most significantly to performance degrada-
tion due to noise. The exact frequency at which j N (kr ) will no longer be useful
may depend on the noise level and the level of other errors and may change,
depending on the system specifications in practice.
(v) At some frequencies within the operating range, i.e. between the upper and
lower frequency limits, jn (kr ) may also become small if these frequencies
satisfy jn (kr ) ≈ 0. This is an inherent limitation of the single open-sphere
configuration.
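Considerations (iii) and (v) can be sketched for a concrete case. The example assumes r = 8 cm, N = 5 and c = 343 m/s; the helper names are hypothetical. It computes the upper frequency from kr = N and locates the frequencies inside the operating range at which some $j_n(kr)$ crosses zero.

```python
import math

def sph_jn(n, x, terms=40):
    """Spherical Bessel function j_n(x) from its power series (adequate for modest x)."""
    prefactor = 1.0
    for k in range(1, n + 1):
        prefactor *= x / (2 * k + 1)
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return prefactor * total

def upper_frequency(N, r, c=343.0):
    """Frequency at which kr = N, taken as the aliasing limit."""
    return N * c / (2 * math.pi * r)

def in_band_zeros(n, kr_max, steps=1000):
    """Zeros of j_n(kr) in (0, kr_max], found by a sign-change scan plus bisection."""
    zeros = []
    xs = [kr_max * (i + 1) / steps for i in range(steps)]
    for a, b in zip(xs[:-1], xs[1:]):
        fa = sph_jn(n, a)
        if fa * sph_jn(n, b) < 0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                f_mid = sph_jn(n, mid)
                if fa * f_mid <= 0:
                    b = mid
                else:
                    a, fa = mid, f_mid
            zeros.append(0.5 * (a + b))
    return zeros

r, N, c = 0.08, 5, 343.0
f_max = upper_frequency(N, r, c)
zero_freqs = sorted(
    z * c / (2 * math.pi * r) for n in range(N + 1) for z in in_band_zeros(n, N)
)
print("upper frequency [Hz]:", round(f_max))
print("in-band Bessel zeros [Hz]:", [round(f) for f in zero_freqs])
```

Under these assumptions only the first zeros of $j_0$ and $j_1$ fall below kr = 5; the zeros of the higher orders lie above the aliasing limit.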
An example of an open-sphere array design is presented next. Consider a sphere of
radius r = 8 cm with 72 microphones arranged using a Gaussian sampling scheme,
Fig. 4.1 The magnitude of $4\pi i^n j_n(kr)$ for n = 0, ..., 8 as a function of frequency, with r = 8 cm
and k = 2πf/c, showing the limit at f = 3,412 Hz where kr = 5 is satisfied
the sphere and the amplitude of the plane waves composing the sound field in the
spherical harmonics domain, as in Eq. (2.45):
$$
p_{nm}(k, r) = a_{nm}(k)\, b_n(kr), \tag{4.3}
$$
where
$$
b_n(kr) = 4\pi i^n j_n(kr). \tag{4.4}
$$
This is an important relation as it defines the way in which the plane wave sound
field, anm , is measured on the sphere surface, pnm , with the function bn (kr ) defining
the projection of the sound field onto the sphere surface. It is clear now that the
computation of the sound field, anm , given the measurement, pnm , requires a division
by bn (kr ), which, in the case of a single open sphere, means a division by the spherical
Bessel function. Equations (4.3) and (4.4) represent a general and useful way to
present the effect of array configuration. As presented in the following sections,
other array configurations will also be presented in the form of Eq. (4.3), but with
different terms composing bn (kr ). The aim is to develop array configurations for
which bn (kr ) do not possess zeros within the operating frequency range and for the
selected radius values.
The rigid-sphere array configuration [8] comprises microphones placed on the surface
of a sphere composed of a hard, fully reflecting material, such as hard wood or thick
metal. The analysis of the single open-sphere configuration presented in Sect. 4.1
applies also to the rigid-sphere configuration. However, the relation between the
sound field around the sphere and the pressure on the sphere surface is characterized
by a different function bn (kr ), due to scattering from the rigid sphere. Chapter 2
presented an analysis of sound fields; in Eq. (2.62), the term for bn that includes the
effect of the incident sound field and the scattered sound field around a rigid sphere
is developed and is rewritten here for convenience:
$$
b_n(kr) = 4\pi i^n \left[ j_n(kr) - \frac{j_n'(kr_a)}{h_n^{(2)\prime}(kr_a)}\, h_n^{(2)}(kr) \right]. \tag{4.5}
$$
Function bn is dependent on both ra , the radius of the rigid sphere, and r , satisfying
r ≥ ra , representing the distance from the origin of a point on or outside the rigid-
sphere surface. Note, however, that the explicit dependence of bn on ra is not shown,
for notation simplicity.
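Equation (4.5) can be evaluated with a few lines of code. The following sketch assumes that the ratio in Eq. (4.5) involves the derivatives of $j_n$ and $h_n^{(2)}$ with respect to their arguments, as in the standard rigid-sphere scattering result, and that $h_n^{(2)} = j_n - i y_n$. The speed of sound c = 343 m/s and all function names are assumptions made for illustration.

```python
import math

def sph_jn(n, x, terms=40):
    """Spherical Bessel function of the first kind, power series (modest x)."""
    prefactor = 1.0
    for k in range(1, n + 1):
        prefactor *= x / (2 * k + 1)
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return prefactor * total

def sph_yn(n, x):
    """Spherical Bessel function of the second kind, upward recurrence (x > 0)."""
    y_prev = -math.cos(x) / x
    if n == 0:
        return y_prev
    y = -math.cos(x) / x**2 - math.sin(x) / x
    for k in range(1, n):
        y_prev, y = y, (2 * k + 1) / x * y - y_prev
    return y

def derivative(fn, n, x):
    """d/dx of a spherical Bessel function: f_n'(x) = (n/x) f_n(x) - f_{n+1}(x)."""
    return (n / x) * fn(n, x) - fn(n + 1, x)

def bn_open(n, kr):
    return 4 * math.pi * (1j ** n) * sph_jn(n, kr)

def bn_rigid(n, kr, kra):
    """Eq. (4.5): incident plus scattered field for a rigid sphere of radius ra."""
    h2 = complex(sph_jn(n, kr), -sph_yn(n, kr))
    dj = derivative(sph_jn, n, kra)
    dh2 = complex(dj, -derivative(sph_yn, n, kra))
    return 4 * math.pi * (1j ** n) * (sph_jn(n, kr) - dj / dh2 * h2)

kra = 2 * math.pi * 1000.0 * 0.08 / 343.0   # ra = 8 cm at 1 kHz
print("rigid |b_5| [dB]:", round(20 * math.log10(abs(bn_rigid(5, kra, kra))), 1))
print("open  |b_0(pi)|:", abs(bn_open(0, math.pi)))
print("rigid |b_0(pi)|:", abs(bn_rigid(0, math.pi, math.pi)))
```

At kr = π the open-sphere term vanishes while the rigid-sphere term does not, which is the behavior the rigid configuration is meant to provide.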
The magnitude of functions bn for an open sphere, or a sphere in free field, and a
rigid sphere, have been presented in Chap. 2 in Figs. 2.1 and 2.9, respectively. These
plots are presented here in a single figure, Fig. 4.2, omitting the order indices, for
simplicity. In the case of the rigid sphere, r = ra was assumed. The figure clearly
Fig. 4.2 The magnitude of $b_n(kr)$ for n = 0, ..., 3 for a rigid sphere with r = ra and an open
sphere
Fig. 4.3 The magnitude of $b_n(kr)$ in Eq. (4.5) with r = ra for n = 0, ..., 8 as a function of
frequency, with ra = 8 cm and k = 2πf/c, showing the limit at f = 3,412 Hz where kr = 5 is
satisfied
The example of a design introduced in Sect. 4.1 for the open-sphere configuration
is outlined here for the rigid-sphere configuration. The open sphere of radius r = 8 cm
is replaced by a rigid sphere of the same radius, ra = 8 cm. Figure 4.3 shows the
magnitude of bn (kr ) for this design, computed using Eq. (4.5) with r = ra . The figure
shows that the low-magnitude problem at frequencies 2,144 and 3,066 Hz no longer
exists due to the elimination of the zeros of the spherical Bessel function. Otherwise,
the designs are similar, with the exception that the magnitude of bn is slightly higher
in the rigid-sphere design. This becomes an advantage for b5(kra, kra) at 1,000 Hz,
for example, where it has a magnitude of −37 dB, as marked in the
figure; the rigid-sphere design is therefore less sensitive to noise compared to the
open-sphere array under these conditions.
A spherical microphone array configuration that uses microphones in free field, but
nevertheless overcomes the problem introduced by the nulls of the spherical Bessel
function, is presented in this section. This configuration is the same as the single
open-sphere configuration discussed in Sect. 4.1, only here the microphones are
of the cardioid type rather than of the pressure type [4]. This means that instead
of using omni-directional microphones, one uses directional microphones with a
first-order cardioid directivity that measures a combination of pressure and radial
$$
x(k, r, \theta, \phi) = p(k, r, \theta, \phi) + \frac{1}{ik} \frac{\partial}{\partial r} p(k, r, \theta, \phi). \tag{4.6}
$$
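The microphone signal in response to a unit-amplitude plane wave can be derived by
substituting $p(k, r, \theta, \phi) = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} = e^{ikr\cos\Theta}$ in Eq. (4.6), where Θ denotes the angle
away from the radial look direction, and is given by
$$
x(k, r, \theta, \phi) = (1 + \cos\Theta)\, e^{ikr\cos\Theta}. \tag{4.7}
$$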
Here, k̃ = (k, θk , φk ) denotes the wave vector pointing in the arrival direction, as in
Eq. (2.37), and r = (r, θ, φ) denotes the position of the microphone. The output of
the microphone includes the term (1 + cos Θ), which is the cardioid directivity [4].
Figure 4.4 illustrates the directivity of a cardioid microphone on a polar plot.
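The cardioid factor can be verified numerically. The sketch below applies Eq. (4.6) to a unit-amplitude plane wave, approximating the radial derivative by a central finite difference, and compares the result with the closed form $(1 + \cos\Theta)\, e^{ikr\cos\Theta}$ obtained by carrying out the differentiation analytically. The function names, the step size and the values of k and r are illustrative assumptions.

```python
import cmath
import math

def plane_wave(k, r, Theta):
    """Unit-amplitude plane wave pressure at distance r, angle Theta from the arrival direction."""
    return cmath.exp(1j * k * r * math.cos(Theta))

def cardioid_output(k, r, Theta, dr=1e-6):
    """Eq. (4.6): pressure plus (1/ik) times the radial derivative (central difference)."""
    dp_dr = (plane_wave(k, r + dr, Theta) - plane_wave(k, r - dr, Theta)) / (2 * dr)
    return plane_wave(k, r, Theta) + dp_dr / (1j * k)

def cardioid_closed_form(k, r, Theta):
    """Analytic result of applying Eq. (4.6) to the plane wave."""
    return (1 + math.cos(Theta)) * plane_wave(k, r, Theta)

k = 2 * math.pi * 1000.0 / 343.0   # wave number at 1 kHz, assuming c = 343 m/s
r = 0.08
for Theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
    numeric = cardioid_output(k, r, Theta)
    exact = cardioid_closed_form(k, r, Theta)
    print(round(Theta, 3), round(abs(numeric), 4), round(abs(exact), 4))
```

The magnitude follows $1 + \cos\Theta$: it is largest for a wave arriving along the look direction and vanishes for a wave arriving from the opposite direction.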
Equation (4.6) can also be written in the spherical harmonics domain, by substi-
tuting the spherical harmonics representation of a unit-amplitude plane wave for p,
as in Eq. (2.37):
$$
x_{nm}(k, r) = 4\pi i^n \left[ j_n(kr) - i j_n'(kr) \right] \left[ Y_n^m(\theta_k, \phi_k) \right]^*. \tag{4.8}
$$
Considering a plane wave with an amplitude of a(k, θk, φk), and extending the sound
field to include a continuum of plane waves, as in Sect. 2.4, leads to
$$
x_{nm}(k, r) = a_{nm}(k)\, b_n(kr), \tag{4.9}
$$
4.3 Open Sphere with Cardioid Microphones 89
Fig. 4.5 The magnitude of $b_n(kr)$ for n = 0, ..., 3 for spherical arrays with microphones in free
field using cardioid microphones and pressure microphones
where
$$
b_n(kr) = 4\pi i^n \left[ j_n(kr) - i j_n'(kr) \right]. \tag{4.10}
$$
Equations (4.9) and (4.10) show that the output of a spherical array composed of
cardioid microphones in free field can be written in the same form as the output of
a spherical array with pressure microphones, either in free field or around a rigid
sphere, but with a different function bn (kr ). In this case, function bn includes a jn
term due to the pressure component and a term with a derivative of jn due to the
pressure gradient component.
Figure 4.5 compares |bn | for open-sphere arrays with pressure and with cardioid
microphones for n = 0, . . . , 3. The figure shows that, similar to the rigid-sphere
array, the use of cardioid microphones eliminates the zeros of the spherical Bessel
function. Furthermore, similar to the rigid-sphere array, the magnitude of bn at low
values of kr is higher than with the pressure microphone configuration. The increase
in magnitude is even larger than the increase in the rigid-sphere case, as illustrated
in Fig. 4.2, suggesting a potential improvement in robustness to noise. However, this
improvement may not be evident in practice. This is because cardioid microphones
usually suffer from excessive noise at low frequencies due to the spatial derivative
operation, which is typically approximated by pressure difference measurement,
which may be small at low frequencies.
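The removal of the Bessel zeros by the derivative term can be checked at the zeros themselves. This sketch assumes that the second term in Eq. (4.10) is the derivative of $j_n$ with respect to its argument, and evaluates both the pressure and the cardioid $b_n$ at the first zeros of $j_0$ and $j_1$. The function names are illustrative.

```python
import math

def sph_jn(n, x, terms=40):
    """Spherical Bessel function j_n(x) from its power series (adequate for modest x)."""
    prefactor = 1.0
    for k in range(1, n + 1):
        prefactor *= x / (2 * k + 1)
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return prefactor * total

def d_sph_jn(n, x):
    """Derivative of j_n with respect to its argument: (n/x) j_n(x) - j_{n+1}(x)."""
    return (n / x) * sph_jn(n, x) - sph_jn(n + 1, x)

def bn_pressure(n, kr):
    """Eq. (4.4): open sphere with pressure microphones."""
    return 4 * math.pi * (1j ** n) * sph_jn(n, kr)

def bn_cardioid(n, kr):
    """Eq. (4.10): open sphere with cardioid microphones."""
    return 4 * math.pi * (1j ** n) * (sph_jn(n, kr) - 1j * d_sph_jn(n, kr))

first_zeros = {0: math.pi, 1: 4.493409457909064}  # first zeros of j_0 and j_1
for n, kr in first_zeros.items():
    print(n, abs(bn_pressure(n, kr)), abs(bn_cardioid(n, kr)))
```

Because $j_n$ and its derivative do not vanish at the same argument for kr > 0, the cardioid magnitude stays away from zero at the points where the pressure term is lost.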
Although the use of a single open-sphere array with cardioid microphones seems
attractive due to the simplicity of this configuration, it has drawbacks. First, in addition
to excessive noise at low frequencies, deviation from the cardioid pattern may
produce errors in the array model function bn , when used in array processing, for
example. Furthermore, pressure microphones are often the microphone of choice in
acoustic measurement systems, and so a spherical array based on pressure micro-
phones may be preferable. An open-sphere array that employs pressure microphones
and overcomes the limitations imposed by the spherical Bessel null is presented in
the next section.
with
$$
b_n(kr_1, kr_2) = \left[ 1 - \beta_n(kr_1, kr_2) \right] 4\pi i^n j_n(kr_1) + \beta_n(kr_1, kr_2)\, 4\pi i^n j_n(kr_2). \tag{4.14}
$$
4.4 Dual-Radius Open Sphere 91
Fig. 4.6 The magnitude of $b_n(k) \equiv b_n(kr_1, kr_2)$ for n = 0, ..., 3 for spherical arrays with a dual-
radius open-sphere configuration of radii r1 = 1 m and r2 = 0.833 m and two single open-sphere
configurations of radii r1 and r2
Function $p_{nm}^{12}(k)$ represents the spherical harmonic coefficients of the pressure
function from both spheres. Equations (4.13) and (4.14) describe a relation between
the measured pressure and the plane-wave sound field through function bn , defined
here for the case of the dual-radius array.
Figure 4.6 shows the magnitude of bn (k) ≡ bn (kr1 , kr2 ) for the dual-radius array
with r1 = 1 m and r2 = 0.833 m. The figure shows that the zeros of the spherical
Bessel function are avoided using this approach. The figure also shows bn for the
two open-sphere arrays with r1 and r2 both having zeros, but at scaled locations.
An important design issue for the dual-radius spherical array is the choice of the
ratio of the two radii, denoted as α = r1 /r2 . Balmages and Rafaely [2] proposed
both numerical and analytical approaches for finding the best ratio. Given r1 , and
assuming r2 is constrained to a smaller radius, r2 < r1 , the radius ratio should produce
the highest possible magnitude of jn (kr ), where for each wave number k and order n
the largest of the two values | jn (kr1 )| and | jn (kr2 )| is selected. This is formulated
as follows:
$$
\alpha_{\mathrm{opt}} = \arg\max_{\alpha} \min_{n} \min_{k} \max \left\{ |j_n(kr_1)|, |j_n(kr_2)| \right\}. \tag{4.15}
$$
The minimization over k is typically taken in the range kr1 ≥ n to avoid low values
of jn that are due to the high-pass characteristic of jn at low kr values. Furthermore,
in typical arrays, aliasing is significant for kr > N and so k is typically restricted to
the range n ≤ kr1 ≤ N . The minimization over n is taken in the range 0 ≤ n ≤ N .
Examples for the numerical calculation of α have been presented in [2].
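A brute-force evaluation of Eq. (4.15) can be sketched as follows. This is not the procedure of [2]; it is a minimal grid search under stated assumptions: a small order N = 4, a uniform grid over n ≤ kr1 ≤ N for each order, and candidate ratios between 1.0 and 1.6. The function names and grid sizes are hypothetical choices.

```python
import math

def sph_jn(n, x, terms=30):
    """Spherical Bessel function j_n(x) from its power series (adequate for modest x)."""
    prefactor = 1.0
    for k in range(1, n + 1):
        prefactor *= x / (2 * k + 1)
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return prefactor * total

def worst_case_magnitude(alpha, N, grid=100):
    """Inner part of Eq. (4.15): min over n and k of max(|j_n(kr1)|, |j_n(kr2)|)."""
    worst = float("inf")
    for n in range(N + 1):
        for i in range(grid + 1):
            kr1 = n + (N - n) * i / grid      # restrict to n <= kr1 <= N
            kr2 = kr1 / alpha                 # alpha = r1 / r2 >= 1
            best_of_two = max(abs(sph_jn(n, kr1)), abs(sph_jn(n, kr2)))
            worst = min(worst, best_of_two)
    return worst

def search_alpha(N, alphas):
    """Outer maximization of Eq. (4.15) over a list of candidate ratios."""
    return max(alphas, key=lambda a: worst_case_magnitude(a, N))

N = 4
alphas = [1.0 + 0.02 * i for i in range(31)]   # candidate ratios 1.00 ... 1.60
alpha_best = search_alpha(N, alphas)
print("single sphere:", worst_case_magnitude(1.0, N))
print("best ratio on grid:", alpha_best, worst_case_magnitude(alpha_best, N))
```

For α = 1 the worst case is dominated by the zero of $j_0$ near kr = π, whereas a ratio above unity lifts the worst-case magnitude well away from zero.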
A simplified expression for α has also been proposed in [2]. Increasing α from
α = 1 (a single sphere) is equivalent to scaling the argument of jn (kr ) and thereby
shifting the zeros to higher wave numbers. When shifted zeros of jn (kr2 ) re-coincide
with the original zeros of jn (kr1 ), the zero at the given wave number cannot be
recovered. Now, taking the mid-point between α = 1 and the value of α leading to
coincidence of zeros, and assuming limits on the gaps between zeros along kr , it has
been shown that a good approximation for the optimal α is given by [2]
$$
\alpha_{\mathrm{opt}} \approx 1 + \frac{\pi}{2N}. \tag{4.16}
$$
An example of a design has been presented in [11] for the measurement of room
impulse responses in an auditorium. The dual-sphere array was composed of 882
microphone positions on each sphere, arranged using the Gaussian sampling scheme.
This provides aliasing-free sampling up to order N = 20, such that $2(N + 1)^2 = 882$.
The radius of the first sphere was set to r1 = 0.43 m, such that kr1 = N was satisfied
at frequency f ≈ 2.5 kHz, which thus constitutes the upper operating frequency of
the array. Note that a slightly higher upper frequency was used in [11]. Substituting
N = 20 in Eq. (4.16) leads to α ≈ 1.078 and r2 = 0.4 m. This example illustrates that
even though the two radii in this dual-sphere configuration are a very small distance
apart, this is sufficient to eliminate the nulls due to the spherical Bessel functions.
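The numbers in this example can be reproduced with a short calculation. The sketch assumes a speed of sound of c = 343 m/s, which is not stated in the example itself; the function name is illustrative.

```python
import math

def dual_radius_ratio(N):
    """Approximate optimal radius ratio of Eq. (4.16)."""
    return 1 + math.pi / (2 * N)

N = 20
r1 = 0.43                                  # radius of the first sphere in metres
c = 343.0                                  # assumed speed of sound in m/s

alpha = dual_radius_ratio(N)
r2 = r1 / alpha                            # alpha = r1 / r2
num_mics_per_sphere = 2 * (N + 1) ** 2     # Gaussian sampling
f_upper = N * c / (2 * math.pi * r1)       # frequency at which k * r1 = N

print("alpha:", round(alpha, 4))
print("r2 [m]:", round(r2, 3))
print("microphone positions per sphere:", num_mics_per_sphere)
print("upper frequency [Hz]:", round(f_upper))
```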
Although the dual-radius spherical array presented in this section provides a prac-
tical solution to the problem of the zeros of the spherical Bessel function using pres-
sure microphones, the downside is that it requires two spheres and twice as many
microphones compared to the single open-sphere array. More efficient methods are
presented in the following sections, based on a design framework that is developed
in the next section.
The array configurations presented above were all based on a predefined distribution
of samples on a sphere, as discussed in Chap. 3. The spherical harmonic coefficients
of the sound pressure, pnm (k, r ), were then computed using appropriate sampling
weights, leading to the computation of anm (k), or plane-wave decomposition, through
a division of pnm (k, r ) by bn (kr ), as in Eq. (4.3). Ill-conditioning in this computation
is a direct result of the low magnitude of bn (kr ), particularly affecting the single-
sphere open array configuration. In the configurations presented above, microphone
positions were constrained to the surface of a single or a dual sphere. In the case
where microphones are placed more freely in three-dimensional space, a different
formulation is required to account for the numerical robustness of the proposed
configuration. Such a formulation is presented in this section.
Equation (4.3) presents the relation in the spherical harmonics domain between
the plane-waves amplitude composing the sound field and the pressure on a sphere
4.5 Robustness to Errors and Numerical Array Design 93
The pressure at these sampling points can be written using Eq. (4.3) as
$$
p(k, r_q, \theta_q, \phi_q) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k)\, b_n(kr_q)\, Y_n^m(\theta_q, \phi_q), \quad 1 \le q \le Q. \tag{4.18}
$$
Note that this equation holds for various configurations, represented by different
bn functions, such as pressure microphones around open or rigid spheres, an open
sphere with cardioid microphones or the dual-radius configuration. Denoting the
maximum radius r̄ = max{rq } for all 1 ≤ q ≤ Q, and assuming that the wave number
satisfies k r̄ < N , the infinite summation in Eq. (4.18) can be approximated by a finite
summation, as discussed in Sect. 2.3:
$$
p(k, r_q, \theta_q, \phi_q) \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} a_{nm}(k)\, b_n(kr_q)\, Y_n^m(\theta_q, \phi_q), \quad 1 \le q \le Q. \tag{4.19}
$$
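$$
\mathbf{p} = \mathbf{B}\mathbf{a}_{nm}, \tag{4.20}
$$
the $(N + 1)^2 \times 1$ vector $\mathbf{a}_{nm}$ represents the coefficients of the sound field:
$$
\mathbf{a}_{nm} = \left[ a_{00}, a_{1(-1)}, a_{10}, a_{11}, \ldots, a_{NN} \right]^T \tag{4.22}
$$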
where $\mathbf{a}_{nm}^{o}$ is the solution. Assuming over-sampling, such that $Q > (N + 1)^2$, the
pseudo-inverse is given by
$$
\mathbf{B}^{\dagger} = \left( \mathbf{B}^H \mathbf{B} \right)^{-1} \mathbf{B}^H. \tag{4.25}
$$
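The pseudo-inverse of Eq. (4.25) can be illustrated with NumPy. The sketch uses a random complex matrix as a stand-in for B, not an actual array matrix, with Q = 12 rows and 4 columns so that the over-sampling assumption holds. It compares the explicit formula with `numpy.linalg.pinv`, which computes the same pseudo-inverse through a singular value decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, num_coeffs = 12, 4          # Q > (N + 1)^2 with N = 1 (over-sampled case)

# Random stand-in for the array matrix B and for the sound-field coefficients.
B = rng.standard_normal((Q, num_coeffs)) + 1j * rng.standard_normal((Q, num_coeffs))
a_true = rng.standard_normal(num_coeffs) + 1j * rng.standard_normal(num_coeffs)
p = B @ a_true                  # noise-free pressure vector, as in Eq. (4.20)

# Eq. (4.25): pseudo-inverse of a tall matrix with full column rank.
B_pinv = np.linalg.inv(B.conj().T @ B) @ B.conj().T
a_opt = B_pinv @ p              # least-squares solution for the coefficients

print(np.allclose(B_pinv, np.linalg.pinv(B)))
print(np.allclose(a_opt, a_true))
```

In numerical work, `numpy.linalg.pinv` or `numpy.linalg.lstsq` is usually preferable to forming the inverse explicitly, because the product $\mathbf{B}^H \mathbf{B}$ squares the condition number.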
When $\mathbf{a}_{nm}^{o}$ is substituted back into Eq. (4.20), it is expected that the equation is
satisfied exactly (or with a small error), validating the solution. In practice, however,
matrix B may not be known exactly. There are a number of possible causes for the
uncertainty: the microphone positions, (rq, θq, φq), are only known with
finite precision; the gain and phase response of the microphones may deviate from
their assumed values; cardioid microphones may have a non-ideal directional
response; the microphone casing or the microphone boom may excite reflections
under an assumed free-field condition; and a constructed rigid sphere may exhibit
non-negligible absorption.
The perturbation in matrix B is denoted by δB and, when substituting back into
Eq. (4.20), will lead to a perturbation in p denoted by δp:
$$
\mathbf{p} + \delta\mathbf{p} = (\mathbf{B} + \delta\mathbf{B})\, \mathbf{a}_{nm}^{o}. \tag{4.26}
$$
It is desired that a small perturbation δB will lead to a small perturbation δp, so that
the extent to which Eq. (4.20) is not satisfied is minimized. This sensitivity relation
is formulated by substituting Eq. (4.20) with $\mathbf{a}_{nm}^{o}$ into Eq. (4.26), leading to
$$
\delta\mathbf{p} = \delta\mathbf{B}\, \mathbf{a}_{nm}^{o}. \tag{4.27}
$$
Rearranging and substituting the 2-norm condition number, the sensitivity of variation
in p to variation in B is written as [12]
$$
\frac{\|\delta\mathbf{p}\|}{\|\mathbf{p}\|} \le \kappa(\mathbf{B}) \frac{\|\delta\mathbf{B}\|}{\|\mathbf{B}\|}, \tag{4.29}
$$
where κ(B) is the condition number of matrix B, which for the 2-norm case can be
written as [12]
$$
\kappa(\mathbf{B}) = \|\mathbf{B}\| \cdot \|\mathbf{B}^{\dagger}\| = \frac{\bar{\sigma}(\mathbf{B})}{\underline{\sigma}(\mathbf{B})}, \tag{4.30}
$$
where $\bar{\sigma}$ denotes the maximal singular value and $\underline{\sigma}$ denotes the minimal singular
value. Equation (4.29) shows that the condition number amplifies the error in matrix
B, so it is important to keep the condition number as close to unity as possible.
For the special case of a square matrix B, with a full rank equal to $(N + 1)^2$, the
condition number is written as in Eq. (4.30), but with the pseudo-inverse replaced
by the inverse.
Perturbation can also take place in vector p. The sound pressure vector is typi-
cally measured by microphones, so that amplifier noise and quantization error when
sampled by a computer may produce errors, or a perturbation, in vector p. It has been
shown that the bound on the error of the solution, anm in this case, for errors in p
and for the non-square matrix case grows with κ(B), motivating the reduction of the
condition number in these cases as well [12].
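The role of the condition number can be sketched numerically with NumPy. As before, a random complex matrix stands in for B; the perturbation size and the helper name `condition_number` are illustrative assumptions. The example computes κ(B) as the ratio of the largest to the smallest singular value and checks the bound of Eq. (4.29) for one random perturbation.

```python
import numpy as np

def condition_number(B):
    """2-norm condition number: ratio of largest to smallest singular value, Eq. (4.30)."""
    singular_values = np.linalg.svd(B, compute_uv=False)   # sorted in descending order
    return singular_values[0] / singular_values[-1]

rng = np.random.default_rng(1)
Q, num_coeffs = 12, 4
B = rng.standard_normal((Q, num_coeffs)) + 1j * rng.standard_normal((Q, num_coeffs))
a_opt = rng.standard_normal(num_coeffs) + 1j * rng.standard_normal(num_coeffs)
p = B @ a_opt

# Small random perturbation of the matrix and the resulting perturbation of p, Eq. (4.27).
delta_B = 1e-3 * (rng.standard_normal(B.shape) + 1j * rng.standard_normal(B.shape))
delta_p = delta_B @ a_opt

lhs = np.linalg.norm(delta_p) / np.linalg.norm(p)
rhs = condition_number(B) * np.linalg.norm(delta_B, 2) / np.linalg.norm(B, 2)
print("kappa(B):", condition_number(B))
print("relative change in p:", lhs, "bound:", rhs)
```

The bound is a worst case, so the observed relative change is typically well below it; the closer κ(B) is to unity, the tighter the guarantee.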
Having established that the condition number of matrix B is an important measure
for the robustness of the solution of Eq. (4.20) to errors in the data represented by
vector p and matrix B, the condition number can be used as an objective for mini-
mization when designing a spherical array configuration. For example, the following
optimization problem can be formulated for searching for microphone positions that
will produce the most robust design [10]:
Such an optimization problem may not be convex, requiring global search methods
such as genetic algorithms. Selection of the sphere configuration, e.g. open or rigid,
and of the type of microphone, e.g. pressure or cardioid, can also be integrated
into such a design. In the next two sections, examples of κ(B) for some of the
designs described in this chapter are presented, after which the shell configuration is
introduced, which uses the design optimization presented in Eq. (4.31).
Practical limitations in the realization of arrays, causing deviations from the the-
oretical “ideal” design, will produce errors that propagate to the array output. As
discussed earlier in this chapter, common causes of errors may include, for example,
accuracy of microphone positioning, mismatch in the frequency response of micro-
phones and non-ideal acoustic models of the sphere. These errors may be represented
as perturbations in matrix B relative to an ideal matrix, so that the condition number
of matrix B can be used as a general measure of the sensitivity of the array output to
these errors, as discussed in Sect. 4.5.
Several array configurations are investigated in this section. The condition number
of matrix B is computed for these selected array configurations, with the aim of illus-
trating and comparing their robustness. Matrix B is computed for each configuration
in the range 0 ≤ n ≤ 3 and 0 ≤ kr ≤ 6. In most cases, the sampling configuration
is designed for an order-limited function with a maximum order of N = 6, such
that spatial over-sampling is maintained. The reason for this relatively significant
over-sampling is to guarantee an operating region in the range 3 ≤ kr ≤ 6, in which
function bn (kr ) has a relatively uniform magnitude as a function of n.
In the first example, a spherical array configured around a rigid sphere is investi-
gated. Three sampling schemes, namely equal-angle, Gaussian and nearly uniform, as
discussed in Chap. 3, are studied. The three schemes are designed for order N = 6,
with 196, 98 and 84 samples, respectively. Matrix B for each of these three configurations
is computed and has dimensions Q by $(N + 1)^2$, where Q is the total
number of samples and $(N + 1)^2 = 49$. The condition number of these matrices is
then computed for a range of values along kr . Although all three configurations were
considered robust when studied above, due to the inherent robustness of the rigid
sphere with regard to eliminating the zeros of the spherical Bessel function, Fig. 4.7
clearly shows that the nearly-uniform distribution is slightly more robust than the
Gaussian and the equal-angle distributions. This is probably due to the more uniform
manner in which the samples are distributed on the sphere, avoiding the clustering
at the poles. The figure also shows that the condition number is high at the lower-
Fig. 4.7 The condition number κ(B) as a function of kr for array configurations around a rigid
sphere, with sampling distributions as follows: equal-angle with 196 samples, Gaussian with 98
samples and nearly-uniform with 84 samples, all providing aliasing-free sampling up to order 6
4.6 Design Examples with Robustness Analysis 97
Fig. 4.8 The condition number κ(B) as a function of kr for three array configurations: (i) around
a rigid sphere and (ii) around an open sphere, both with nearly-uniform sampling distribution
with 84 samples, and (iii) an open array configuration with an additional sample at the origin. All
configurations provide aliasing-free sampling up to order 6
frequency end (for kr < 3). This is due to the inherently low magnitude of b1 to b3
for kr < 3. This increase in condition number at low values of kr cannot be avoided
by re-distribution of the microphones and will typically require an increase in the
radius of the sphere, such that the cut-off point (kr = 3 in this case) occurs at a lower
wave number.
In the next example, three array configurations are compared, including one
around a rigid sphere and one around an open sphere, both with nearly-uniform sampling
distributions of 84 samples. The configuration around a rigid sphere is the same as
in Fig. 4.7 and is presented here as a reference. The third configuration is the same
as the open-array configuration, only an additional sample has been added to matrix
B, at the array origin. Figure 4.8 presents the condition number of matrix B for these
configurations. The open-array configuration clearly shows high condition numbers
at kr values close to the zeros of the spherical Bessel function. Equation (4.23) shows
that matrix B has its first column equal to zero for kr = π , due to the zero j0 (π ) = 0.
Now, when an additional row is added due to the sample at the origin, this column
will not be zero because j0(0) = 1, and so the loss of rank due to the zero column is
recovered. This is also evident in Fig. 4.8, where the condition number for this new
configuration follows that of an open sphere, but avoids the high condition number
values around the first zero.
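The recovery of rank by a sample at the origin can be reproduced on a small scale. The sketch below is a reduced illustration, not the 84-sample design of the figure: it assumes order N = 1, six pressure microphones at the vertices of an octahedron on an open sphere, kr = π, and explicit expressions for the complex spherical harmonics up to first order. Function names and the rank tolerance are hypothetical choices.

```python
import numpy as np

def sph_harm_order1(theta, phi):
    """Y_0^0, Y_1^-1, Y_1^0, Y_1^1 at elevation theta and azimuth phi."""
    return np.array([
        np.sqrt(1 / (4 * np.pi)),
        np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(-1j * phi),
        np.sqrt(3 / (4 * np.pi)) * np.cos(theta),
        -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi),
    ])

def j0(x):
    return 1.0 if x == 0 else np.sin(x) / x

def j1(x):
    return 0.0 if x == 0 else np.sin(x) / x**2 - np.cos(x) / x

def open_sphere_row(kr, theta, phi):
    """One row of B: b_n(kr) Y_n^m(theta, phi), with b_n = 4 pi i^n j_n(kr)."""
    b = 4 * np.pi * np.array([j0(kr), 1j * j1(kr), 1j * j1(kr), 1j * j1(kr)])
    return b * sph_harm_order1(theta, phi)

# Octahedron vertices as (theta, phi) pairs.
angles = [(0, 0), (np.pi, 0), (np.pi / 2, 0), (np.pi / 2, np.pi / 2),
          (np.pi / 2, np.pi), (np.pi / 2, 3 * np.pi / 2)]

kr = np.pi                                                  # first zero of j_0
B_open = np.array([open_sphere_row(kr, t, p) for t, p in angles])
B_origin = np.vstack([B_open, open_sphere_row(0.0, 0.0, 0.0)])  # extra sample at r = 0

rank_open = np.linalg.matrix_rank(B_open, tol=1e-9)
rank_with_origin = np.linalg.matrix_rank(B_origin, tol=1e-9)
print(rank_open, rank_with_origin)
```

The added row has a single non-zero entry, in the n = 0 column, which is exactly the column that vanishes on the sphere at kr = π.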
In the final example, the condition numbers of an open array with cardioid micro-
phones and of a dual-sphere array with a second radius that is 1.3 times smaller than
the first radius, have been computed and are presented in Fig. 4.9. A nearly-uniform
sampling scheme with 84 samples has been used for both arrays. For the dual-sphere
Fig. 4.9 The condition number κ(B) as a function of kr for four array configurations: (i) around
a rigid sphere, (ii) around an open sphere with cardioid microphones, (iii) around a dual-sphere
array with the second radius 1.3 times smaller than the first radius (dual-max) and (iv) around
another dual-sphere array with matrix B composed of a combination of elements from both spheres
(dual-both). All configurations provide aliasing-free sampling up to order 6 and use nearly-uniform
sampling with 84 samples on each sphere
array, only data points corresponding to the radius having the maximum magnitude
of bn (kr ) are selected, as discussed in Sect. 4.4. The figure shows that, as expected,
both the array based on cardioid microphones and the dual-sphere array overcome
the ill-conditioning due to the zeros of the spherical Bessel function, and achieve a
reasonably low condition number. In addition, the same dual-radius configuration is
presented with matrix B composed of rows from both spheres, rather than using the
maximization selection criterion. The result is a condition number very similar to that
of the original dual-radius array. In this case, matrix B has twice as many rows,
but the near-zero entries around the Bessel zeros, which do not contribute useful
information, are simply redundant. Therefore, in this case, the maximization process
can be avoided by simply using a larger matrix.
Section 4.4 showed how the ill-conditioning in the design of an open-sphere array
is removed by positioning microphones on the surfaces of dual concentric spheres.
Although the dual-sphere array solved the ill-conditioning due to the zeros of the
spherical Bessel function, it required twice as many microphones, compared to the
single-sphere configuration. Motivated by the theory behind the dual-sphere array,
4.7 Spherical Shell Configuration 99
and with the aim of minimizing the increase in the number of microphones, the
spherical shell configuration is presented in this section [10]. In this configuration,
microphones are distributed inside the volume enclosed by the two spheres of the
dual-sphere configuration. However, the overall number of microphones is the same
as that of the equivalent single-sphere configuration, such as the single open sphere
and the single rigid sphere. The design of the array in this configuration requires
selection of the angles (θ, φ) and the radius r for each microphone. Because of the
increased degree-of-freedom in this configuration (due to the varying radius), the
design framework presented in Sect. 4.5 can be used both to compare designs based
on some regular selection of the radius and angles of microphone positions and as a
framework for optimizing microphone positions.
A straightforward way to select microphone positions in this configuration is to
distribute the microphones over (θ, φ) using a known nearly-uniform sampling
scheme, or one of the other known sampling methods, and to distribute them
uniformly along the radius between the two spheres. Figure 4.10 shows the condition
number of a rigid-sphere array with the same configuration as presented in Sect. 4.6, with 84
nearly-uniformly distributed samples. The condition number of the spherical shell
with the same microphone distribution along the angles and with uniform radial
distribution between the two spheres is also shown. The first sphere has the same
radius as that of the rigid-sphere array, while the second radius is smaller by a factor of
1.3. The figure shows that, although the condition number of the shell array is higher
than that of the rigid-sphere array, it is still relatively low and so this configuration
can be considered relatively robust.
Fig. 4.10 The condition number κ(B) as a function of kr for three array configurations: (i) around a
rigid sphere (rigid), and (ii), (iii) around an open sphere with microphones distributed in the volume of a shell
with uniform radial distribution (shell-uniform) and with optimal radial distribution (shell-optimal),
respectively. All configurations provide aliasing-free sampling up to order 6
100 4 Spherical Array Configurations
Further details on the spherical shell array design, including other methods for
the distribution of samples within a shell volume, are presented in [10].
Other spherical array configurations, not presented in previous sections of this chapter,
have been developed and reported in the literature, and are briefly outlined in this
section. The first example can be viewed as a continuation of the spherical shell array.
Although the shell configuration provides numerical robustness without increasing
the number of microphones, it may possess drawbacks related to the irregular distri-
bution of samples. For example, in a mechanical-scanning microphone array system,
the dual-sphere array can be realized by a two degrees-of-freedom system, where
elevation and azimuth are controlled using separate motors or turn-tables, with an
additional single manual change of microphone radius. The spherical shell array, with
a uniform distribution of radial position, for example, may require a three degrees-
of-freedom system, i.e. with three motors, for automatic placement of microphones.
This means an additional cost and complexity. With the aim of maintaining the advan-
tages of the spherical shell array, Alon and Rafaely [1] proposed a realization of a
microphone scanning system with two motors arranged off-axis, therefore allowing
positioning of microphones within the approximate volume of a spherical shell. This
configuration, termed the spindle torus array due to the resulting scanning surface,
was shown to provide robustness at a similar level to that found in the shell array,
but with a realization that required only two degrees-of-freedom.
Parthy and Jin [9] presented an interesting design concept, combining both rigid
and open spheres in a single concentric arrangement. Such a design benefits from
both improved robustness due to the effect of the rigid sphere and improved frequency
range due to the measurement with two spheres of different radii. The larger open
sphere allows improved analysis in the lower frequency range and the smaller rigid
sphere allows an extension of the aliasing-free range to a higher frequency. In [9],
the proposed array was built and investigated for acoustic holography.
Another design variation that is based around a rigid sphere was introduced by
Li and Duraiswami [5]. It was proposed for situations in which the array is mounted
near a large rigid surface, such as a wall or a desk. Assuming this surface is infi-
nite and rigid, incoming waves undergo specular reflection, so that the outgoing
waves are a mirror image of the incoming waves. This symmetry allows the use of
a rigid microphone array in the shape of a hemisphere, where the pressure at the
missing microphones can be calculated by incorporating the symmetry in the sound
field. Although a hemispherical microphone array is used with half the number of
microphones, all methods developed for spherical arrays can be readily used by this
array due to the symmetry in the sound field. The proposed array, in addition to
saving half the number of microphones, has the shape of a hemisphere, which can be
conveniently placed on a large desk in a video conferencing scenario, for example.
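The mirror-image argument can be illustrated with a short numerical sketch. It assumes, purely for illustration, that the rigid surface is the plane z = 0 and that the hemisphere is centered on it, so that a microphone at (θ, φ) has a virtual image at (π − θ, φ) observing the same pressure. The function name `mirror_hemisphere` and the sample values are hypothetical and are not taken from [5].

```python
import numpy as np

def mirror_hemisphere(theta, phi, p):
    """Extend hemispherical samples to a full sphere by mirror symmetry.

    Assumption (illustration only): the rigid surface is the plane z = 0,
    so the image of a microphone at (theta, phi) lies at (pi - theta, phi)
    and observes the same sound pressure.
    """
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    p = np.asarray(p, dtype=complex)
    theta_full = np.concatenate([theta, np.pi - theta])  # real + image elevations
    phi_full = np.concatenate([phi, phi])                # azimuth is unchanged
    p_full = np.concatenate([p, p])                      # image pressure equals real pressure
    return theta_full, phi_full, p_full

# Hypothetical example: three microphones on the upper hemisphere
theta = np.array([0.3, 0.8, 1.2])
phi = np.array([0.0, 2.0, 4.0])
p = np.array([1.0 + 0.5j, 0.2 - 0.1j, -0.7j])
theta_full, phi_full, p_full = mirror_hemisphere(theta, phi, p)
```

The extended set `(theta_full, phi_full, p_full)` can then be passed to any processing stage written for a full spherical array, provided the combined positions form a valid sampling scheme.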
References
1. Alon, D., Rafaely, B.: Spindle-torus sampling for an efficient-scanning spherical microphone
array. Acta Acust. united Ac. 98(1), 83–90 (2012)
2. Balmages, I., Rafaely, B.: Open-sphere designs for spherical microphone arrays. IEEE Trans.
Audio Speech Lang. Process. 15(2), 727–732 (2007)
3. Hulsebos, E., Schuurmans, T., de Vries, D., Boone, R.: Circular microphone array for discrete
multichannel audio recording. In: Proceedings of the 114th Meeting of the Audio Engineering
Society, 5716. Amsterdam (2003)
4. Kinsler, L.E., Frey, A.R., Coppens, A.B., Sanders, J.V.: Fundamentals of Acoustics, 4th edn.
Wiley, New York (1999)
5. Li, Z., Duraiswami, R.: Hemispherical microphone arrays for sound capture and beamforming.
In: Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and
Acoustics (WASPAA 2005). New York (2005)
6. Melchior, F., Thiergart, O., Del Galdo, G., de Vries, D., Brix, S.: Dual radius spherical cardioid
microphone arrays for binaural auralization. In: Proceedings of the 127th Meeting of the Audio
Engineering Society, 7855. New York (2009)
7. Meyer, J.: Beamforming for a circular microphone array mounted on spherically shaped objects.
J. Acoust. Soc. Am. 109(1), 185–193 (2001)
8. Meyer, J., Elko, G.W.: A highly scalable spherical microphone array based on an orthonormal
decomposition of the soundfield. In: IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP 2002), vol. II, pp. 1781–1784. Orlando (2002)
9. Parthy, A., Jin, C., van Schaik, A.: Acoustic holography with a concentric rigid and open
spherical microphone array. In: IEEE International Conference on Acoustics, Speech, and
Signal Processing (ICASSP 2009), pp. 2173–2176. Taipei (2009)
10. Rafaely, B.: The spherical-shell microphone array. IEEE Trans. Audio Speech Lang. Process.
16(4), 740–747 (2008)
11. Rafaely, B., Balmages, I., Eger, L.: High-resolution plane-wave decomposition in an auditorium
using a dual-radius scanning spherical microphone array. J. Acoust. Soc. Am. 122(5), 2661–
2668 (2007)
12. Trefethen, L.N., Bau, D.: Numerical Linear Algebra. SIAM, Philadelphia (1997)
Chapter 5
Spherical Array Beamforming
$$ y = \int_0^{2\pi}\!\int_0^{\pi} [w(k,\theta,\phi)]^{*}\, p(k,r,\theta,\phi)\,\sin\theta\, d\theta\, d\phi. \tag{5.1} $$
A discrete version of the spatial filter is also defined in a similar manner, with weight
wq (k) corresponding to microphone number q. The Q × 1 weight vector is defined
as
$$ \mathbf{w} = \left[ w_1(k), w_2(k), \ldots, w_Q(k) \right]^T. \tag{5.3} $$
In the standard space domain array processing literature, the array output is given as
an inner product of the two vectors [7] (see also Fig. 5.1) such that
$$ y = \mathbf{w}^H \mathbf{p}. \tag{5.4} $$
5.1 Beamforming Equations 105
However, it is very important to note that the definitions in Eqs. (5.1) and (5.4)
are not equivalent. A discrete version of the array equation in the space domain that
is equivalent to Eq. (5.1) has to take the effect of spatial sampling into account. The
relation between the two forms will be derived later in this section, using the formu-
lation of the array equation in the spherical harmonics domain. It is also important
to note that the array equation in the form of Eq. (5.1) does not suffer from spatial
aliasing and may, therefore, be useful when studying aspects of array processing
other than spatial aliasing.
The general problem of array beamforming, or spatial filtering, can be defined
as designing w such that, for a given array input p, the array output y is produced
with some desired properties. When characterizing array properties, an array input
for a sound field composed of a single, unit-amplitude plane wave is often assumed
[7]. In this case, the measured pressure is replaced by a steering vector, or manifold
vector, which represents the plane-wave amplitude measured at each microphone.
The steering vector, denoted by v, has a simple analytical form for arrays composed
of pressure microphones in free field, which is
$$ \mathbf{v} = \left[ v_1, v_2, \ldots, v_Q \right]^T, \tag{5.5} $$
where
$$ v_q = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}}, \quad 1 \le q \le Q. \tag{5.6} $$
The wave vector k̃ = (k, θk , φk ) denotes the plane-wave arrival direction (see
Chap. 2), and the position vector r = (r, θq , φq ) denotes the position of microphone
q. The array output can now be written as
$$ y = \mathbf{w}^H \mathbf{v}. \tag{5.7} $$
This is now an explicit function of the wave arrival direction, through the dependence
of v on (θk , φk ), that defines the directional response (or directivity) of the array. It is
important to note that when other array configurations are considered, e.g. pressure
microphones around a rigid sphere, the steering vector includes the effect of the
scattering of sound from the sphere. This complicates the analytical expressions for
vector v, which motivates the representation of the array equations in the spherical
harmonics domain, a mathematically more natural domain in this case.
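As an illustration of Eqs. (5.5)–(5.7), the following minimal sketch builds the free-field steering vector and evaluates y = w^H v. It assumes θ is measured from the z-axis and φ is the azimuth; the six-microphone layout, the values of k and r, and the helper names `sph_to_cart` and `steering_vector` are hypothetical choices made only for this example.

```python
import numpy as np

def sph_to_cart(r, theta, phi):
    """Spherical (r, theta from z-axis, phi azimuth) to Cartesian coordinates."""
    return np.stack([r * np.sin(theta) * np.cos(phi),
                     r * np.sin(theta) * np.sin(phi),
                     r * np.cos(theta)], axis=-1)

def steering_vector(k, theta_k, phi_k, r, theta_q, phi_q):
    """Free-field steering vector, v_q = exp(i k~ . r), as in Eq. (5.6)."""
    k_vec = sph_to_cart(k, theta_k, phi_k)   # shape (3,)
    r_vec = sph_to_cart(r, theta_q, phi_q)   # shape (Q, 3)
    return np.exp(1j * (r_vec @ k_vec))

# Hypothetical open array: six microphones on the coordinate axes
theta_q = np.array([0.0, np.pi, np.pi / 2, np.pi / 2, np.pi / 2, np.pi / 2])
phi_q = np.array([0.0, 0.0, 0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
k, r = 2.0, 0.1

# Phase-matched weights for a chosen look direction, scaled for unit gain
v_look = steering_vector(k, np.pi / 2, np.pi / 4, r, theta_q, phi_q)
w = v_look / len(theta_q)

y_look = np.vdot(w, v_look)                  # w^H v at the look direction
v_other = steering_vector(k, np.pi / 2, 5 * np.pi / 4, r, theta_q, phi_q)
y_other = np.vdot(w, v_other)                # w^H v for another arrival direction
```

Evaluating `np.vdot(w, v)` over a grid of arrival directions traces out the directional response described in the text.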
Array equations developed in the space domain are derived next in the spherical
harmonics domain. Consider Eq. (5.1), where the pressure function p(k, r, θ, φ) and
the weight function w(k, θ, φ) are defined over the sphere, and denote by pnm (k)
and wnm (k) their respective spherical Fourier transforms. Substituting in Eq. (5.1)
the spherical harmonics expansion for p and w, as in Eq. (1.41), and evaluating the
integral using the orthogonality property of the spherical harmonics, Eq. (1.23), the
array output can be written as a function of the spherical Fourier coefficients:
$$
\begin{aligned}
y &= \int_0^{2\pi}\!\int_0^{\pi} [w(k,\theta,\phi)]^{*}\, p(k,r,\theta,\phi)\,\sin\theta\, d\theta\, d\phi \\
  &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r).
\end{aligned}
\tag{5.8}
$$
Now, assuming coefficients beyond order N are zero, wnm = 0 ∀n > N , this equation
can be written in a matrix form (see also Fig. 5.2) as
$$ y = \mathbf{w}_{nm}^H \mathbf{p}_{nm}, \tag{5.9} $$
The array beam pattern, or array output due to a unit-amplitude plane-wave sound
field, can also be written in the spherical harmonics domain, in a manner similar to
Eq. (5.7):
$$ y = \mathbf{w}_{nm}^H \mathbf{v}_{nm}, \tag{5.12} $$
with vnm representing the array input due to the plane-wave sound field. The expres-
sion for vnm is derived from the sound pressure, pnm , due to the unit-amplitude plane
wave. For an open-sphere array configuration [see Eq. (2.41)] pnm is written as
$$ p_{nm}(k,r) = 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*}, \tag{5.14} $$
with the plane-wave arrival directions denoted by (θk , φk ). Following the notation
introduced in Chap. 4, the open-sphere configuration can be written more generally
as
$$ v_{nm} = b_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*}, \tag{5.16} $$
with $b_n(kr) = 4\pi i^n j_n(kr)$. This can now be extended for a wide range of array
configurations, simply by modifying the expression for $b_n(kr)$ to apply to a rigid-sphere
array, a dual-sphere open array, and more (see Chap. 4). This flexibility, facilitating
the modeling of the steering vectors of various array configurations within the same
framework, is a significant advantage of formulating array equations in the spherical
harmonics domain.
Another advantage of formulating the equations in the spherical harmonics
domain (compared with the space domain) is computational efficiency. In practice,
arrays perform over-sampling, such that $Q > (N+1)^2$. This means that the vectors
and matrices in the spherical harmonics domain are of lower dimension than the
same vectors and matrices in the space domain.
In the remainder of this book, the formulation in the spherical harmonics domain
will be used as the standard formulation. As shown above, the spherical harmon-
ics formulation is more flexible, as it allows a unified representation for various
array configurations and sampling schemes. However, in some cases, formulation in
the space domain may be required; this formulation is more standard in the array
processing literature because it uses the microphone signals directly. Therefore, the
relation between the spherical harmonics domain formulation and the space domain
formulation is presented next.
Starting with the spherical harmonics formulation, the array equation, as in
Eq. (5.9), is rewritten here:
$$ y = \mathbf{w}_{nm}^H \mathbf{p}_{nm}. \tag{5.17} $$
Next, the relations between the spherical harmonics vectors wnm and pnm and the
space domain vectors w and p are derived by introducing the effect of sampling, as in
Eqs. (3.34), (3.35) and (3.38), for the three types of sampling schemes, as presented
in Sect. 3.6.
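Similarly, for the equal-angle sampling and the Gaussian sampling schemes, substituting
$\mathbf{w}_{nm} = \mathbf{Y}^H \mathrm{diag}(\boldsymbol{\alpha})\,\mathbf{w}$ and a similar expression for $\mathbf{p}$, the array output becomes
$$ y = \mathbf{w}^H \mathrm{diag}(\boldsymbol{\alpha})\, \mathbf{Y}\mathbf{Y}^H \mathrm{diag}(\boldsymbol{\alpha})\, \mathbf{p}. \tag{5.19} $$
Finally, for the uniform and nearly-uniform sampling schemes, with $\mathbf{w}_{nm} = \frac{4\pi}{Q}\mathbf{Y}^H \mathbf{w}$
and a similar expression for $\mathbf{p}$, the array output is expressed as
$$ y = \left(\frac{4\pi}{Q}\right)^{2} \mathbf{w}^H \mathbf{Y}\mathbf{Y}^H \mathbf{p}. \tag{5.20} $$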
Equations (5.18) to (5.20) are the space domain equivalent to the spherical harmonics
domain array equation, Eq. (5.17). It is important to note that they are different from
the standard space domain equation $y = \mathbf{w}^H \mathbf{p}$, and so the two forms, $y = \mathbf{w}_{nm}^H \mathbf{p}_{nm}$
and $y = \mathbf{w}^H \mathbf{p}$, are not the same and cannot be used interchangeably. Equations (5.18)–
(5.20) can be written in a unified manner by using matrix S, as defined in Eqs. (3.41)–
(3.43), such that
$$ y = \mathbf{w}^H \mathbf{S}^H \mathbf{S}\, \mathbf{p}. \tag{5.21} $$
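The distinction between the two forms can be checked numerically. In the sketch below, the matrix `S` is a random complex stand-in for the sampling matrix of Eqs. (3.41)–(3.43), used only to exercise the algebra; it is not an actual spherical harmonics sampling matrix, and the sizes Q = 12 and N = 2 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, N = 12, 2                      # hypothetical number of microphones and order
n_coeffs = (N + 1) ** 2

# Random stand-in for the sampling matrix S (illustration only)
S = rng.standard_normal((n_coeffs, Q)) + 1j * rng.standard_normal((n_coeffs, Q))
w = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)   # space-domain weights
p = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)   # space-domain pressure

w_nm = S @ w                      # weights in the transform domain
p_nm = S @ p                      # pressure in the transform domain

y_sh = np.vdot(w_nm, p_nm)                     # y = w_nm^H p_nm, Eq. (5.17)
y_space = w.conj() @ (S.conj().T @ (S @ p))    # y = w^H S^H S p, Eq. (5.21)
y_standard = np.vdot(w, p)                     # y = w^H p, Eq. (5.4)
```

Here `y_sh` and `y_space` agree to numerical precision, whereas `y_standard` differs because S^H S is not the identity matrix.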
Equation (5.12) presented the array output as a function of the array input and the
beamforming weights in the spherical harmonics domain. Meyer and Elko [2] pro-
posed a useful formulation for the weights wnm . These weights are functions of
two parameters, n and m (or, equivalently, θ and φ), in the two-dimensional space
domain, when taking the inverse spherical Fourier transform of wnm to calculate
w(θ, φ). The approach proposed in [2] was to reduce the beamforming weights to
a one-dimensional function, such that the resulting beam pattern is axis-symmetric,
with the look direction forming the axis of symmetry. The proposal used the following
formulation:
$$ [w_{nm}(k)]^{*} = \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l). \tag{5.22} $$
The new beamforming weights, dn (k), which may be a function of frequency, are
dependent only on n and can therefore be considered as one-dimensional. A division
by bn (kr ) guarantees that the resulting steering vectors and the beam pattern are not
dependent on the physical behavior of the sound field around the array. For example,
5.2 Axis-Symmetric Beamforming 109
Fig. 5.3 A block diagram of a spherical harmonics domain, axis-symmetric beamforming system
the effect of scattering from an array configured around a rigid sphere is removed
by this division. This is illustrated in the formulation that follows. Finally, (θl , φl )
denotes the array look direction. This will also be evident from the derivation that
follows.
Substituting Eq. (5.22) in Eq. (5.9), and rewriting the equation explicitly using
summations, leads to
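$$
\begin{aligned}
y &= \sum_{n=0}^{N}\sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r) \\
  &= \sum_{n=0}^{N}\sum_{m=-n}^{n} \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l)\, p_{nm}(k,r) \\
  &= \sum_{n=0}^{N} \frac{d_n(k)}{b_n(kr)} \sum_{m=-n}^{n} p_{nm}(k,r)\, Y_n^m(\theta_l,\phi_l).
\end{aligned}
\tag{5.23}
$$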
The third line in Eq. (5.23) is presented in a form that is more computationally
efficient (see also the block diagram in Fig. 5.3) exploiting the single dimension of
the beamforming coefficients.
The array beam pattern for the axis-symmetric beamformer can be formulated by
substituting Eq. (5.16) for pnm , leading to
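$$
\begin{aligned}
y &= \sum_{n=0}^{N}\sum_{m=-n}^{n} \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l)\, p_{nm}(k,r) \\
  &= \sum_{n=0}^{N}\sum_{m=-n}^{n} \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l)\, b_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*} \\
  &= \sum_{n=0}^{N} d_n(k) \sum_{m=-n}^{n} \left[ Y_n^m(\theta_k,\phi_k) \right]^{*} Y_n^m(\theta_l,\phi_l) \\
  &= \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi}\, P_n(\cos\Theta),
\end{aligned}
\tag{5.24}
$$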
where the spherical harmonics addition theorem [see Eq. (1.26)] was employed in
the last line of the derivation, with
$$ \cos\Theta = \cos\theta_l \cos\theta_k + \cos(\phi_l - \phi_k)\, \sin\theta_l \sin\theta_k \tag{5.25} $$
[see also Eq. (1.27)], where Θ denotes the angle between (θl , φl ) and (θk , φk ).
Equation (5.24) can be written in a matrix form by defining a steering vector vn
and an array weights vector dn :
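$$
\begin{aligned}
y &= \mathbf{d}_n^T \mathbf{v}_n \\
\mathbf{v}_n &= \frac{1}{4\pi} \left[ P_0(\cos\Theta),\, 3P_1(\cos\Theta),\, \ldots,\, (2N+1)P_N(\cos\Theta) \right]^T \\
\mathbf{d}_n &= \left[ d_0, d_1, \ldots, d_N \right]^T.
\end{aligned}
\tag{5.26}
$$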
Now, array weights dn control y(Θ), which is the beam pattern of the array, or the
array response to a unit-amplitude plane wave. The output y depends on Θ, the angle
between (θl , φl ) and (θk , φk ). Typically (but not necessarily), y(Θ) peaks at Θ = 0,
which means that plane waves arriving from this direction are subject to the highest
amplification. Hence, this direction is typically considered as the look direction, or
the direction of most interest, already denoted as (θl , φl ). The beam pattern y depends
on Θ, the angle away from (θl , φl ), and so it is axis-symmetric around (θl , φl ). Now,
by changing the value of (θl , φl ), the function y(Θ) itself does not change, but it is
rotated, or steered, such that Θ = 0 coincides with (θl , φl ). Therefore, by changing
the value of (θl , φl ) in Eq. (5.22), the beam pattern is steered to the new direction
(θl , φl ). This shows that steering is achieved in a simple and direct manner in this case,
and that the beam pattern, y(Θ), controlled through dn , is independent of steering,
which is controlled through (θl , φl ), as also illustrated in Fig. 5.3.
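The axis-symmetric beam pattern of Eq. (5.24) depends only on the weights d_n and on Θ, so it can be evaluated with a Legendre series. The following is a minimal sketch; the function name `beam_pattern` and the choice d_n = 1 with N = 2 are assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import legendre

def beam_pattern(d, Theta):
    """Evaluate y(Theta) = sum_n d_n (2n+1)/(4 pi) P_n(cos Theta), Eq. (5.24)."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    coeffs = d * (2 * n + 1) / (4 * np.pi)      # Legendre series coefficients
    return legendre.legval(np.cos(Theta), coeffs)

N = 2
d = np.ones(N + 1)                              # d_n = 1 for n = 0, ..., N
Theta = np.linspace(0.0, np.pi, 181)            # angle away from the look direction
y = beam_pattern(d, Theta)
```

Because the look direction enters only through Θ, steering to a new (θl, φl) amounts to recomputing Θ for each arrival direction; the function y(Θ) itself is unchanged.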
The array output, y, in response to a unit-amplitude plane wave, has already been
presented as defining the directivity, or the beam pattern, of the array. A scalar that
quantifies the array directivity is the directivity index, which provides a measure for
the ratio between the peak and the average values of the squared beam pattern. The
directivity factor, with symbol D F, is defined as [7]
5.3 Directivity Index 111
$$ DF = \frac{|y(\theta_l,\phi_l)|^{2}}{\dfrac{1}{4\pi}\displaystyle\int_0^{2\pi}\!\int_0^{\pi} |y(\theta,\phi)|^{2} \sin\theta\, d\theta\, d\phi}, \tag{5.27} $$
where vnm in the numerator is given by Eq. (5.16). It is typically assumed that the look
direction employed in the design of wnm equals the wave arrival direction, (θk , φk ).
The directivity factor can be rewritten in a matrix form as a generalized Rayleigh
quotient:
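$$
\begin{aligned}
DF &= \frac{\mathbf{w}_{nm}^H \mathbf{A}\, \mathbf{w}_{nm}}{\mathbf{w}_{nm}^H \mathbf{B}\, \mathbf{w}_{nm}} \\
\mathbf{A} &= \mathbf{v}_{nm} \mathbf{v}_{nm}^H \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right) \\
\mathbf{v}_{nm} &= \left[ v_{00}, v_{1(-1)}, v_{10}, v_{11}, \ldots, v_{NN} \right]^T,
\end{aligned}
\tag{5.29}
$$
with $v_{nm} = b_n [Y_n^m(\theta_k,\phi_k)]^{*}$, as in Eq. (5.16), and with matrices A and B of
dimensions $(N+1)^2 \times (N+1)^2$. The explicit dependence of $b_n(kr)$ on kr has been
dropped for notational simplicity.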
A similar derivation of the directivity factor can also be obtained for the case of
an axis-symmetric beam pattern by substituting Eq. (5.22) in Eq. (5.28):
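$$
\begin{aligned}
DF &= \frac{\left| \sum_{n=0}^{N}\sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*} \right|^{2}}
           {\dfrac{1}{4\pi}\displaystyle\int_0^{2\pi}\!\int_0^{\pi} \left| \sum_{n=0}^{N}\sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l) \left[ Y_n^m(\theta,\phi) \right]^{*} \right|^{2} \sin\theta\, d\theta\, d\phi} \\
   &= \frac{\left| \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi}\, P_n(\cos 0) \right|^{2}}
           {\dfrac{1}{4\pi} \sum_{n=0}^{N} |d_n(k)|^{2}\, \frac{2n+1}{4\pi}\, P_n(\cos 0)} \\
   &= \frac{\left| \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi} \right|^{2}}
           {\dfrac{1}{4\pi} \sum_{n=0}^{N} |d_n(k)|^{2}\, \frac{2n+1}{4\pi}},
\end{aligned}
\tag{5.30}
$$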
where it has been assumed that (θl , φl ) = (θk , φk ) in the derivation of the numerator,
i.e. the look direction equals the plane-wave arrival direction. Also, the orthogonality
property of the spherical harmonics, Eq. (1.23), and the spherical harmonics addi-
tion theorem, Eq. (1.26), have been employed in the derivation of the denominator.
Equation (5.30) can be written in a matrix form, in a similar manner to Eq. (5.29):
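$$
\begin{aligned}
DF &= \frac{\mathbf{d}_n^H \mathbf{A}\, \mathbf{d}_n}{\mathbf{d}_n^H \mathbf{B}\, \mathbf{d}_n} \\
\mathbf{A} &= \mathbf{v}_n \mathbf{v}_n^H \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}(\mathbf{v}_n) \\
\mathbf{v}_n &= \frac{1}{4\pi} \left[ 1, 3, 5, \ldots, 2N+1 \right]^T,
\end{aligned}
\tag{5.31}
$$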
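The directivity factor of an axis-symmetric beamformer can be computed directly from the weights d_n. The sketch below evaluates both the summation form, Eq. (5.30), and the matrix form, Eq. (5.31); the function names are hypothetical, and the two test cases (d_n = 1 with N = 0 and N = 2) are chosen to match the values quoted in the caption of Fig. 5.4.

```python
import numpy as np

def directivity_factor(d):
    """Directivity factor from the summation form, Eq. (5.30)."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    v = (2 * n + 1) / (4 * np.pi)
    num = np.abs(np.sum(d * v)) ** 2
    den = np.sum(np.abs(d) ** 2 * v) / (4 * np.pi)
    return num / den

def directivity_factor_matrix(d):
    """Directivity factor from the Rayleigh-quotient form, Eq. (5.31)."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    v = (2 * n + 1) / (4 * np.pi)
    A = np.outer(v, v)                 # A = v_n v_n^H (v_n is real)
    B = np.diag(v) / (4 * np.pi)       # B = (1 / 4 pi) diag(v_n)
    return (d.conj() @ A @ d).real / (d.conj() @ B @ d).real

df_omni = directivity_factor(np.ones(1))          # N = 0
df_dir = directivity_factor(np.ones(3))           # N = 2
df_dir_matrix = directivity_factor_matrix(np.ones(3))
```

With d_n = 1 the two forms agree, giving a directivity factor of 1 for N = 0 and 9 for N = 2.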
Arrays typically operate under non-ideal conditions, which include, for example,
sensor noise and uncertainties in the frequency response and in the position of the
microphones. It is important that the performance of the array, e.g. directivity index,
remains robust to the undesired effect of noise and uncertainties. A common param-
eter employed as a measure for array robustness is the WNG [7]. It is defined as the
improvement in SNR at the array output compared to the array input. The array input
is the signal at the individual microphones, or sensors, and the array output is the
combined signal, after array processing (such as beamforming) is applied.
With the aim of formulating simple expressions for the WNG, the following is
assumed.
(i) The sound field is composed of a single, unit-amplitude plane wave.
5.4 White Noise Gain 113
Fig. 5.4 Polar directivity plot, |y(Θ)|, for an axis-symmetric beamformer with $d_n = 1$, n = 0, …, N,
for an omni-directional directivity with DF = 1 and N = 0 (panel "Omni-directional") and a directional
response with DF = 9 and N = 2 (panel "Directional")
(ii) The array is composed of sound pressure microphones in a free field. Other
array configurations are considered later in this section.
(iii) The array beamforming weights are designed with a look direction equal to the
plane-wave arrival direction.
(iv) The noise at the sensors is assumed to be uncorrelated across sensors, or
microphones, and to have a variance of $\sigma^2$ with zero mean.
Under these conditions, the signal magnitude at the array input is unity and the
variance of the noise at the array input is $\sigma^2$. The signal at the array output due to
the plane wave can be computed using Eq. (5.12) as $|y|^2 = |\mathbf{w}_{nm}^H \mathbf{v}_{nm}|^2$, where $\mathbf{v}_{nm}$
in this case is the steering vector in the look direction. The variance of the noise at
the array output can be derived from the array equation:
$$ E\left[ |y|^2 \right] = E\left[ y y^H \right] = \mathbf{w}_{nm}^H\, E\left[ \mathbf{p}_{nm} \mathbf{p}_{nm}^H \right] \mathbf{w}_{nm}. \tag{5.32} $$
Using the general form of the discrete spherical Fourier transform, pnm = Sp, as in
Eq. (3.40), and assuming that the signal at the individual microphones, p, includes
the sensor noise component, which is uncorrelated between sensors, the array output
reduces to
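$$
\begin{aligned}
E\left[ |y|^2 \right] &= \mathbf{w}_{nm}^H \mathbf{S}\, E\left[ \mathbf{p}\mathbf{p}^H \right] \mathbf{S}^H \mathbf{w}_{nm} \\
  &= \mathbf{w}_{nm}^H \mathbf{S}\, \sigma^2 \mathbf{I}\, \mathbf{S}^H \mathbf{w}_{nm} \\
  &= \sigma^2\, \mathbf{w}_{nm}^H \mathbf{S}\mathbf{S}^H \mathbf{w}_{nm}.
\end{aligned}
\tag{5.33}
$$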
Now, the WNG, computed as the ratio of the SNR at the array output and the SNR
at the array input, is given by
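$$ \mathrm{WNG} = \frac{\dfrac{|\mathbf{w}_{nm}^H \mathbf{v}_{nm}|^{2}}{\sigma^2\, \mathbf{w}_{nm}^H \mathbf{S}\mathbf{S}^H \mathbf{w}_{nm}}}{1/\sigma^2} = \frac{|\mathbf{w}_{nm}^H \mathbf{v}_{nm}|^{2}}{\mathbf{w}_{nm}^H \mathbf{S}\mathbf{S}^H \mathbf{w}_{nm}}. \tag{5.34} $$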
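$$ \mathbf{B} = \mathbf{S}\mathbf{S}^H. \tag{5.35} $$
$$ \mathbf{S}\mathbf{S}^H = \frac{4\pi}{Q}\, \mathbf{Y}^H \mathbf{Y}\, \frac{4\pi}{Q} = \frac{4\pi}{Q}\, \mathbf{I}, \tag{5.36} $$
$$ \mathbf{A} = \mathbf{v}_{nm} \mathbf{v}_{nm}^H. \tag{5.37} $$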
the look direction, is substituted in Eq. (5.37). Now, using the spherical harmonics
addition theorem, Eq. (1.26), the WNG is rewritten here using summations in the
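spherical harmonics domain, as derived in [5]:
$$
\begin{aligned}
\mathrm{WNG} &= \frac{\left| \sum_{n=0}^{N}\sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l) \left[ Y_n^m(\theta_l,\phi_l) \right]^{*} \right|^{2}}
                     {\dfrac{4\pi}{Q} \sum_{n=0}^{N}\sum_{m=-n}^{n} \left| \left[ d_n(k)/b_n(kr) \right] Y_n^m(\theta_l,\phi_l) \right|^{2}} \\
             &= \frac{\left| \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi} \right|^{2}}
                     {\dfrac{4\pi}{Q} \sum_{n=0}^{N} \left| d_n(k)/b_n(kr) \right|^{2} \frac{2n+1}{4\pi}}.
\end{aligned}
\tag{5.38}
$$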
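$$
\begin{aligned}
\mathrm{WNG} &= \frac{\mathbf{d}_n^H \mathbf{A}\, \mathbf{d}_n}{\mathbf{d}_n^H \mathbf{B}\, \mathbf{d}_n} \\
\mathbf{A} &= \mathbf{v}_n \mathbf{v}_n^H \\
\mathbf{B} &= \frac{4\pi}{Q}\, \mathrm{diag}(\mathbf{v}_n) \times \mathrm{diag}\left( |b_0|^{-2}, |b_1|^{-2}, \ldots, |b_N|^{-2} \right) \\
\mathbf{v}_n &= \frac{1}{4\pi} \left[ 1, 3, 5, \ldots, 2N+1 \right]^T.
\end{aligned}
\tag{5.39}
$$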
The derivations of expressions for the WNG presented above assumed sensors
in free field. This is convenient, because the SNR at the array input is the same for
all sensors, so that any sensor can be selected as representing the array input. This
is not the case for other array configurations. For example, the SNR at the array
input for an array configured around a rigid sphere may differ between sensors. Due
to the shadowing effect of the sphere, the SNR at the array input will degrade for
sensors located at angles on the sphere further away from the plane-wave arrival
direction. In this case, the definition of the WNG may need readjustment to take
into consideration the contributions from all sensors. It has been shown [4] that the
variation due to scattering of sound from the rigid sphere is smaller than 3 dB. In this
book this difference is ignored in favor of using the same WNG formulation across
all array configurations, even though this formulation strictly holds only for the free
field configuration.
The WNG for an axis-symmetric beamformer is presented next. Consider an
array with Q = 9 microphones arranged uniformly on the surface of an open sphere,
providing spherical harmonics analysis to order N = 2. Employing the same example
as in Sect. 5.3, beamforming coefficients are chosen with dn = 1, and Eq. (5.39) is
used to compute the WNG for kr = 0 to N . Figure 5.5 shows the WNG as a function
of kr , first for a single microphone and then for an array with Q = 9 microphones.
The WNG for the single microphone is unity, as expected, because in this case the
array input is the same as the array output. The WNG for the array of order N = 2 and
for large values of kr is larger than unity, meaning that the SNR at the array output
has improved, compared to the SNR at the array input. However, for low values of kr
the WNG is less than one, meaning that the SNR is degraded, which is an undesirable
property in array processing. For further discussion of WNG, including addressing
the factors affecting the WNG and ways to design arrays that maximize WNG, see
Chap. 6.
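The array example above (Q = 9, N = 2, d_n = 1, open sphere) can be reproduced approximately with the following sketch of Eq. (5.38), using b_n(kr) = 4π i^n j_n(kr). The helper `spherical_jn_series` is a hypothetical power-series evaluation of the spherical Bessel function, adequate only for modest kr; where SciPy is available, `scipy.special.spherical_jn` would be the usual choice.

```python
import numpy as np

def spherical_jn_series(n, x, terms=40):
    """Spherical Bessel function j_n(x) from its power series (modest x only)."""
    x = np.asarray(x, dtype=float)
    double_fact = np.prod(np.arange(1, 2 * n + 2, 2))   # (2n+1)!!
    term = np.ones_like(x)
    total = np.ones_like(x)
    for k in range(terms):
        # ratio of consecutive series terms
        term = term * (-(x ** 2) / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
        total = total + term
    return x ** n / double_fact * total

def wng_open_sphere(d, kr, Q):
    """WNG of an axis-symmetric beamformer on an open sphere, Eq. (5.38)."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    v = (2 * n + 1) / (4 * np.pi)
    b = np.array([4 * np.pi * (1j ** int(order)) * spherical_jn_series(int(order), kr)
                  for order in n])
    num = np.abs(np.sum(d * v)) ** 2
    den = (4 * np.pi / Q) * np.sum(np.abs(d / b) ** 2 * v)
    return num / den

wng_high = wng_open_sphere(np.ones(3), kr=2.0, Q=9)   # upper end of the kr range
wng_low = wng_open_sphere(np.ones(3), kr=0.5, Q=9)    # low kr
```

In this sketch the WNG exceeds unity at kr = 2 and falls below unity at kr = 0.5, in line with the behavior described for the array in the text; kr = 0 itself is excluded because b_n(0) vanishes for n > 0.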
Fig. 5.5 WNG as a function of kr for an axis-symmetric beamformer with $d_n = 1$, n = 0, …, N, for a single
microphone with Q = 1 and N = 0 and an array with Q = 9 and N = 2
Examples of simple beamformers are presented in this section. The first beamformer
is the delay-and-sum beamformer, which is widely used due to its simple realization,
i.e. the beamforming weights are composed of delays. The delays are selected such
that the phase of a plane wave arriving from the array look direction is matched at all
sensors, providing maximum output at the look direction [7]. Furthermore, the delay-
and-sum beamformer also offers maximum WNG and therefore maximum robustness
to noise and uncertainties. This is discussed further in the next chapter. Note that the
delay-and-sum approach will work only if the plane waves propagate in free field,
so that the delay-and-sum beamformer is applicable to open array configurations.
However, its realization is also possible for other configurations, as detailed in this
section.
The beamforming integral equation, as presented in Eq. (5.1), is now employed
with the aim of developing an analytical formulation for the delay-and-sum beam-
former. The sound pressure on a sphere of radius r due to a single unit-amplitude
plane wave arriving from direction $\tilde{\mathbf{k}}$ can be expressed as $e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}}$. Phase alignment for
waves with arrival direction, (θk , φk ), equal to the array look direction, (θl , φl ), is
therefore achieved when selecting the beamforming weighting function to be
$$ [w(k,\theta,\phi)]^{*} = e^{-i\tilde{\mathbf{k}}_l\cdot\mathbf{r}}, \tag{5.40} $$
with k̃l = (k, θl , φl ) and r = (r, θ, φ) representing the array spherical surface. Using
Eq. (2.37), the coefficients of the beamforming weights can be written in the spherical
harmonics domain as
$$ w_{nm}(k) = b_n(kr) \left[ Y_n^m(\theta_l,\phi_l) \right]^{*}, \tag{5.41} $$
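$$
\begin{aligned}
y &= \sum_{n=0}^{N}\sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r) = \sum_{n=0}^{N}\sum_{m=-n}^{n} d_n(k)\, \frac{p_{nm}(k,r)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l) \\
  &= \sum_{n=0}^{N}\sum_{m=-n}^{n} |b_n(kr)|^{2}\, a_{nm}(k)\, Y_n^m(\theta_l,\phi_l).
\end{aligned}
\tag{5.43}
$$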
Now, anm (k) can be computed from the sound field measured by the various array
configurations presented in Chap. 4, using the appropriate function bn (kr ) for the
actual configuration; the terms bn (kr ) that replace the array weights are those repre-
senting an open sphere, regardless of the actual configuration. This is an illustration
of the flexibility of array design and processing in the spherical harmonics domain.
Another widely used beamformer is characterized by beamforming weights of
unit value, i.e. dn = 1. Equation (5.43) can be rewritten for this case, by substituting
dn = 1, as
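$$
\begin{aligned}
y &= \sum_{n=0}^{N}\sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r) = \sum_{n=0}^{N}\sum_{m=-n}^{n} a_{nm}(k)\, Y_n^m(\theta_l,\phi_l) \\
  &\approx a(k,\theta_l,\phi_l),
\end{aligned}
\tag{5.44}
$$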
A beamforming example is presented in this section with the aim of illustrating the
way in which sound field composition, sampling, beamforming, and analysis are
formulated and realized using computer simulations. The example is broken down
into stages for clarity.
(i) Sound pressure in free field. Consider a sound field, composed of S harmonic
plane waves with wave number k, arrival directions denoted by (θs , φs ), s =
1, . . . , S, and amplitudes at the origin of the coordinate system, $a_s(k)$, s =
1, . . . , S. Using Eqs. (2.40) and (2.41), the sound pressure at (r, θ, φ) can be
written as
$$
\begin{aligned}
p(k,r,\theta,\phi) &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} p_{nm}(k,r)\, Y_n^m(\theta,\phi) \\
  &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\sum_{s=1}^{S} 4\pi i^n j_n(kr)\, a_s(k) \left[ Y_n^m(\theta_s,\phi_s) \right]^{*} Y_n^m(\theta,\phi).
\end{aligned}
\tag{5.45}
$$
This equation is exact. However, when the aim is to generate this sound field
using a computer simulation, an approximation must be applied by constraining
the summation to be finite.
(ii) Finite-order sound field. The finite-order sound field is computed by replacing
the upper summation limit over n with Ñ . The approximation error can still be
small if kr Ñ (see Sect. 2.3), with r denoting the distance from the origin.
The sound field generated in practice is therefore given by
$$ p(k,r,\theta,\phi) = \sum_{n=0}^{\tilde{N}}\sum_{m=-n}^{n}\sum_{s=1}^{S} 4\pi i^n j_n(kr)\, a_s(k) \left[ Y_n^m(\theta_s,\phi_s) \right]^{*} Y_n^m(\theta,\phi). \tag{5.46} $$
(iii) Sampling by microphones. In the next stage of this simulation example, a spher-
ical microphone array is introduced into the sound field, centered at the origin.
It is assumed that the array is composed of a rigid sphere of radius ra with
Q microphones arranged on its surface, following a t-design configuration
(see Sect. 3.4), which allows for aliasing-free sampling up to order N . Equa-
tion (5.46) can now be used directly to represent the pressure at the micro-
phone positions, (ra , θq , φq ), q = 1, . . . , Q. Note that, in this case, the term
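$4\pi i^n j_n(kr)$ is replaced by $b_n(kr)$, with $r = r_a$, to represent a rigid-sphere
configuration, as in Eq. (2.62):
$$
\begin{aligned}
p(k,r_a,\theta_q,\phi_q) &= \sum_{n=0}^{\tilde{N}}\sum_{m=-n}^{n} p_{nm}(k,r_a)\, Y_n^m(\theta_q,\phi_q) \\
  &= \sum_{n=0}^{\tilde{N}}\sum_{m=-n}^{n}\sum_{s=1}^{S} b_n(kr_a)\, a_s(k) \left[ Y_n^m(\theta_s,\phi_s) \right]^{*} Y_n^m(\theta_q,\phi_q), \\
  &\qquad q = 1, \ldots, Q.
\end{aligned}
\tag{5.47}
$$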
5.6 Beamforming Example 119
(iv) Spherical Fourier transform. In the next stage, the spherical Fourier transform
of the sound pressure at the sphere surface, pnm , is computed directly from the
pressure measurements at the microphones, p(k, ra , θq , φq ), using the spherical
Fourier transform for the nearly-uniform sampling scheme [see Eq. (3.24)], as
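$$ p_{nm}(k,r_a) = \frac{4\pi}{Q} \sum_{q=1}^{Q} p(k,r_a,\theta_q,\phi_q) \left[ Y_n^m(\theta_q,\phi_q) \right]^{*}, \quad n \le N. \tag{5.48} $$
$$ p_{nm}(k,r_a) = b_n(kr_a) \sum_{s=1}^{S} a_s(k) \left[ Y_n^m(\theta_s,\phi_s) \right]^{*}, \quad n \le N. \tag{5.49} $$
$$
\begin{aligned}
y(\theta_l,\phi_l) &= \sum_{n=0}^{N}\sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r_a) \\
  &= \sum_{n=0}^{N}\sum_{m=-n}^{n} \frac{p_{nm}(k,r_a)}{b_n(kr_a)}\, Y_n^m(\theta_l,\phi_l).
\end{aligned}
\tag{5.50}
$$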
It is important to note that the angles (θl , φl ) can be selected at any desired
density over the sphere and are not related to the original sampling set (θq , φq ).
In particular, when plotting y(θl , φl ) over the sphere, a high sampling density
may be desired.
As a numerical example, consider a sound field composed of S = 3 harmonic
plane waves with amplitudes 1.0, $0.7e^{i\pi/3}$ and $0.4e^{i\pi/2}$ and arrival directions
(90°, 45°), (117°, 90°) and (45°, 270°), respectively, at wave number k and radius r
satisfying kr = kra = 6. The pressure on the surface of the rigid-sphere array is
measured by Q = 84 microphones, allowing aliasing-free sampling up to order N = 6.
The sound pressure at the microphones is computed as in stage (iii) with Ñ = 10.
Then, pnm is computed as in stage (iv) and beamforming is applied as in stage (vi)
to produce the plane-wave decomposition, y(θl , φl ).
Figure 5.6 shows the normalized magnitude of y(θl , φl ) in this case. An equal-
angle grid of 60 × 60 points was used to generate (θl , φl ). The figure shows three
peaks, corresponding to the actual arrival directions of the plane waves, marked as
“+” on the figure. Note that with plane-wave decomposition, due to the finite
spherical harmonics order of the beamforming, each plane wave contributes a sinc-like
function to y (see Fig. 1.12), so that y is composed of the weighted summation of
these functions. This may explain effects such as peaks at directions other than the
wave arrival directions, wide peaks around the arrival directions and peaks not
corresponding exactly to arrival directions. Methods to reduce these effects by controlling
the beam pattern of the array are presented in the next chapter.
Fig. 5.6 Normalized magnitude of y(θl , φl ), the plane-wave decomposition, computed using
Eq. (5.50) with pnm computed from Eq. (5.48), for kr = kra = 6. The arrival directions of the
three plane waves are marked with white “+”
Figure 5.7 shows the normalized magnitude of y(θl , φl ); this time with pnm com-
puted directly from Eq. (5.49), therefore avoiding errors due to finite-order and spatial
aliasing. As Figs. 5.6 and 5.7 are relatively similar, it is clear that, in this case, the lim-
ited order and spatial sampling do not produce significant errors. This is reasonable,
because with Ñ = 10, N = 6 and kr = 6, both errors are expected to be small.
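The error-free case of Fig. 5.7 can be sketched in a few lines. When pnm is taken directly from Eq. (5.49), the factor bn cancels in Eq. (5.50), and the sum over m collapses through the spherical harmonics addition theorem into Legendre polynomials of the angle between the look direction and each arrival direction. The following is a minimal sketch under that assumption; the function names (`legendre`, `cos_angle`, `pwd_exact`) are hypothetical and not taken from the text.

```python
import cmath
import math

def legendre(N, x):
    """Return [P_0(x), ..., P_N(x)] using the three-term recurrence."""
    P = [1.0, x]
    for n in range(1, N):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P[: N + 1]

def cos_angle(dir1, dir2):
    """Cosine of the angle between two directions (theta, phi), in radians."""
    (t1, p1), (t2, p2) = dir1, dir2
    return (math.cos(t1) * math.cos(t2)
            + math.sin(t1) * math.sin(t2) * math.cos(p1 - p2))

def pwd_exact(look, amplitudes, arrivals, N):
    """y(look) from Eq. (5.50) with p_nm from Eq. (5.49).

    b_n cancels, and the sum over m is replaced by (2n + 1) / (4 pi) P_n(cos Theta)
    using the addition theorem.
    """
    y = 0j
    for a, arrival in zip(amplitudes, arrivals):
        P = legendre(N, cos_angle(look, arrival))
        y += a * sum((2 * n + 1) / (4 * math.pi) * P[n] for n in range(N + 1))
    return y

deg = math.pi / 180
amps = [1.0, 0.7 * cmath.exp(1j * math.pi / 3), 0.4 * cmath.exp(1j * math.pi / 2)]
dirs = [(90 * deg, 45 * deg), (117 * deg, 90 * deg), (45 * deg, 270 * deg)]
N = 6

# A single unit-amplitude wave, observed at its own arrival direction.
single_peak = pwd_exact(dirs[0], [1.0], [dirs[0]], N)
# All three waves of the example, observed at the first arrival direction.
y_at_first = pwd_exact(dirs[0], amps, dirs, N)
```

For a single wave the peak value equals (N + 1)^2 / (4π), while with three waves each look direction also collects the side lobes of the other two, which is the effect described for Fig. 5.6.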
In contrast, errors cannot be expected to be small in the next example, where
the computation of y(θl , φl ) is repeated, as in Fig. 5.6, but this time for kr = 10.
Figure 5.8 shows a larger number of peaks away from the plane-wave arrival direc-
tions. These peaks are mostly due to aliasing errors, with the higher orders aliased
to the lower orders, n = 0, . . . , 6.
Fig. 5.7 Normalized magnitude of y(θl , φl ), the plane-wave decomposition, computed using
Eq. (5.50) with pnm computed from Eq. (5.49), for kr = kra = 6. The arrival directions of the
three plane waves are marked with white “+”
Fig. 5.8 Normalized magnitude of y(θl , φl ), the plane-wave decomposition, computed using
Eq. (5.50) with pnm computed from Eq. (5.48), for kr = kra = 10. The arrival directions of the
three plane waves are marked with white “+”
The formulations in this simulation example are presented here in a matrix form,
because this is the form most likely to be employed in practice using computer
programming. First, the pressure at the microphones, Eq. (5.47), is rewritten as
\[
\begin{aligned}
\mathbf{p} &= \tilde{\mathbf{Y}}_q \tilde{\mathbf{B}} \tilde{\mathbf{Y}}_s^H \mathbf{a}_s \\
\mathbf{a}_s &= [a_1(k), a_2(k), \ldots, a_S(k)]^T \\
\tilde{\mathbf{Y}}_s &=
\begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_1, \phi_1) \\
\vdots & \ddots & \vdots \\
Y_0^0(\theta_S, \phi_S) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_S, \phi_S)
\end{bmatrix} \\
\tilde{\mathbf{B}} &= \mathrm{diag}\left( b_0, b_1, b_1, b_1, \cdots, b_{\tilde{N}} \right) \\
\tilde{\mathbf{Y}}_q &=
\begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_1, \phi_1) \\
\vdots & \ddots & \vdots \\
Y_0^0(\theta_Q, \phi_Q) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_Q, \phi_Q)
\end{bmatrix} \\
\mathbf{p} &= \left[ p(k, r_a, \theta_1, \phi_1), \cdots, p(k, r_a, \theta_Q, \phi_Q) \right]^T,
\end{aligned} \tag{5.51}
\]
where the S × 1 vector $\mathbf{a}_s$ holds the plane waves’ amplitudes, the Q × 1 vector $\mathbf{p}$
holds the sound pressure amplitude at the microphones, the $(\tilde{N} + 1)^2 \times (\tilde{N} + 1)^2$
diagonal matrix $\tilde{\mathbf{B}}$ holds the values of $b_n(kr)$ for a rigid sphere with $r = r_a$, the
$S \times (\tilde{N} + 1)^2$ matrix $\tilde{\mathbf{Y}}_s$ holds the spherical harmonics with the plane wave arrival
directions and, similarly, the $Q \times (\tilde{N} + 1)^2$ matrix $\tilde{\mathbf{Y}}_q$ holds the spherical harmonics
with the microphone positions’ directions.
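The stacking order used in these matrices can be made concrete with a small sketch. It assumes the ordering (0,0), (1,−1), (1,0), (1,1), ... shown for the coefficient vector in Eq. (5.52), so that coefficient (n, m) sits at position n² + n + m, and it builds the diagonal of the matrix in Eq. (5.51) by repeating each per-order value 2n + 1 times. The helper names are hypothetical, and the strings stand in for numerical values of bn(kr).

```python
def sh_index(n, m):
    """Position of coefficient (n, m) in the stacked vector [00, 1(-1), 10, 11, ...]."""
    return n * n + n + m

def expand_per_order(values):
    """Repeat the order-n value (2n + 1) times, giving the diagonal of the B matrix."""
    diag = []
    for n, value in enumerate(values):
        diag.extend([value] * (2 * n + 1))
    return diag

# Placeholders for b_n(kr), n = 0, 1, 2 (a real design would use numbers).
b = ["b0", "b1", "b2"]
diag_B = expand_per_order(b)
```

With orders up to 2 the diagonal has (2 + 1)² = 9 entries, and `diag_B[sh_index(n, m)]` returns the value belonging to order n for any valid m.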
In the next stage, the spherical harmonic coefficients of the sound pressure on the
sphere, pnm , are computed, as in Eq. (5.48):
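\[
\begin{aligned}
\mathbf{p}_{nm} &= \frac{4\pi}{Q} \mathbf{Y}_q^H \mathbf{p} \\
\mathbf{p}_{nm} &= \left[ p_{00}, p_{1(-1)}, p_{10}, p_{11}, \ldots, p_{NN} \right]^T,
\end{aligned} \tag{5.52}
\]
where the $(N + 1)^2 \times 1$ vector $\mathbf{p}_{nm}$ holds coefficients $p_{nm}$, and $\mathbf{Y}_q$ is similar to $\tilde{\mathbf{Y}}_q$
but with $\tilde{N}$ replaced by N. In the final stage, plane-wave decomposition is computed,
as in Eq. (5.50):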
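\[
\begin{aligned}
\mathbf{y} &= \mathbf{Y}_l \mathbf{B}^{-1} \mathbf{p}_{nm} \\
\mathbf{Y}_l &=
\begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & \cdots & Y_N^N(\theta_1, \phi_1) \\
\vdots & \ddots & \vdots \\
Y_0^0(\theta_L, \phi_L) & \cdots & Y_N^N(\theta_L, \phi_L)
\end{bmatrix},
\end{aligned} \tag{5.53}
\]
where the $L \times (N + 1)^2$ matrix $\mathbf{Y}_l$ holds the spherical harmonics with the plane-wave
decomposition look directions, and $\mathbf{B}$ is similar to $\tilde{\mathbf{B}}$ but with $\tilde{N}$ replaced by
N.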
5.7 Steering Non Axis-Symmetric Beam Patterns 123
\[
[w_{nm}(k)]^* = \frac{c_{nm}(k)}{b_n(kr)}. \tag{5.54}
\]
The array beam pattern, defined as the array output in response to a unit-amplitude
plane wave, can be formulated by substituting Eqs. (5.54) and (5.16) into Eq. (5.8):
\[
y = \sum_{n=0}^{N} \sum_{m=-n}^{n} c_{nm}(k) \left[ Y_n^m(\theta_k, \phi_k) \right]^*. \tag{5.55}
\]
Beam pattern y and coefficients cnm (k) are therefore related through the spherical
Fourier transform and complex-conjugate operations, i.e. $[y(\theta_k, \phi_k)]^*$ is the inverse
spherical Fourier transform of $c_{nm}^*$. This provides a simple framework for the calculation of
cnm once a desired beam pattern is available. However, the steering of such a beam
pattern may not be as simple as in the case of the axis-symmetric beam pattern.
Recall that in the case of the axis-symmetric beam pattern, steering was achieved by
substituting a desired look direction, (θl , φl ), in Eq. (5.22), without any modification
to the beamforming coefficients, dn . In the case of non axis-symmetric beamforming,
Eq. (5.55), steering will directly change the coefficients, cnm . However, steering the
beam pattern is equivalent to rotating function y(θk , φk ), and so the rotation operation
of functions on the sphere, as presented in Sect. 1.6, is employed [6].
Let us denote by $y^r(\theta_k, \phi_k) \equiv \Lambda(\alpha, \beta, \gamma)\, y(\theta_k, \phi_k)$ the function on the sphere, y,
rotated by Euler angles (α, β, γ ) (see Sect. 1.6 for more details on rotation using
Euler angles). In the case of beamforming, the rotation will steer the beam pattern to
the desired orientation. It is important to note that in the case of a non axis-symmetric
beam pattern, in addition to conventional steering, which is the change in the look
direction, another degree-of-freedom is available; this can be interpreted as a rotation
of the beam pattern itself about the look direction. Such a rotation will only change
the beam pattern if it is non axis-symmetric about the look direction. This explains
the need for three angles, (α, β, γ ), when performing steering of non axis-symmetric
beam patterns.
Steering is now formulated based on Eq. (1.72) and Sect. 1.6, where a rotation of
a function on the sphere is decomposed into a set of rotations of spherical harmonics,
which, in turn, are formulated using multiplication with the Wigner-D functions [6]:
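\[
\begin{aligned}
y^r(\theta_k, \phi_k) &= \Lambda(\alpha, \beta, \gamma)\, y(\theta_k, \phi_k) \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} \sum_{m'=-n}^{n} c_{nm}(k) \left[ D^n_{m'm}(\alpha, \beta, \gamma) \right]^* \left[ Y_n^{m'}(\theta_k, \phi_k) \right]^* \\
&= \sum_{n=0}^{N} \sum_{m'=-n}^{n} \sum_{m=-n}^{n} c_{nm}(k) \left[ D^n_{m'm}(\alpha, \beta, \gamma) \right]^* \left[ Y_n^{m'}(\theta_k, \phi_k) \right]^* \\
&= \sum_{n=0}^{N} \sum_{m'=-n}^{n} c^r_{nm'}(k) \left[ Y_n^{m'}(\theta_k, \phi_k) \right]^*.
\end{aligned} \tag{5.56}
\]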
\[
c^r_{nm'}(k) = \sum_{m=-n}^{n} c_{nm}(k) \left[ D^n_{m'm}(\alpha, \beta, \gamma) \right]^*. \tag{5.57}
\]
Substituting Eq. (5.54) into Eq. (5.57), the rotated coefficients $w^r_{nm}$ can be written in
terms of the original coefficients $w_{nm}$ as
\[
w^r_{nm'}(k) = \sum_{m=-n}^{n} w_{nm}(k)\, D^n_{m'm}(\alpha, \beta, \gamma). \tag{5.58}
\]
where $\mathbf{w}^r_{nm}$ is the $(N + 1)^2 \times 1$ vector of coefficients of the rotated beam pattern,
$\mathbf{w}_{nm}$ has been defined in Eq. (5.10) and the block-diagonal Wigner-D matrix $\mathbf{D}$ has
been defined in Sect. 1.6.
Rotations can be applied successively in an ongoing steering process, e.g. succes-
sive rotations D1 and D2 can be realized by multiplying the two rotation matrices, i.e.
D2 D1 , to produce an equivalent rotation. This can be useful to simplify the steering
process from the current look direction, $(\theta_l, \phi_l)$, to the desired new look direction,
$(\theta_l', \phi_l')$, where $\psi_l$ and $\psi_l'$ represent the rotations about the current and desired look
directions, respectively. First, a rotation of $\Lambda(-\psi_l, -\theta_l, -\phi_l)$ is applied to align the
beam pattern look direction with the positive z-axis direction, without any further
rotation about this direction. Then, a rotation of $\Lambda(\theta_l', \phi_l', \psi_l')$ is applied
to steer the beam pattern to the new direction. This entire process can be realized
using a single rotation matrix, by multiplying the two rotation matrices, as explained
above [6].
References
1. Li, Z., Duraiswami, R.: Flexible and optimal design of spherical microphone arrays for beam-
forming. IEEE Trans. Audio Speech Lang. Process. 15(2), 702–714 (2007)
2. Meyer, J., Elko, G.W.: A highly scalable spherical microphone array based on an orthonormal
decomposition of the soundfield. In: IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP 2002), vol. II, pp. 1781–1784. Orlando (2002)
3. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution. J.
Acoust. Soc. Am. 116(4), 2149–2157 (2004)
4. Rafaely, B.: Analysis and design of spherical microphone arrays. IEEE Trans. Speech Audio
Process. 13(1), 135–143 (2005)
5. Rafaely, B.: Phase-mode versus delay-and-sum spherical microphone array processing. IEEE
Signal Process. Lett. 12(10), 713–716 (2005)
6. Rafaely, B.: Spherical microphone array beam steering using Wigner-D weighting. IEEE Signal
Process. Lett. 15, 417–420 (2008)
7. Van Trees, H.L.: Optimum Array Processing (Detection, Estimation, and Modulation Theory,
Part IV), 1st edn. Wiley, New York (2002)
Chapter 6
Optimal Beam Pattern Design
The directivity factor has been introduced in Sect. 5.3 to account for the ratio between
the array response in the look direction and the average response across all directions.
It is common in array processing to normalize the response in the look direction by
introducing a distortionless-response constraint [14], such that the average response is
minimized subject to the constraint of a unit response in the look direction. Following
the directivity factor derived in Eq. (5.29), the maximum directivity beamformer is
designed to satisfy
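\[
\begin{aligned}
&\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
&\text{subject to} \quad \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1,
\end{aligned} \tag{6.1}
\]
with
\[
\mathbf{B} = \frac{1}{4\pi} \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right) \tag{6.2}
\]
and with the elements of the steering vector $\mathbf{v}_{nm}$ defined in Eq. (5.16).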
A solution to the optimization problem in Eq. (6.1) is obtained using the method
of Lagrange multipliers, widely employed in array processing [14]. Note that the
average directivity, denoted by the denominator in Eq. (5.27), is a real quantity, and
so the denominator in Eq. (5.29) derived thereafter, i.e. $\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}$, is also real. The
function to be minimized in Eq. (6.1) is therefore real. Using the method of Lagrange
multipliers, the constrained optimization problem is reduced to an unconstrained one
as follows:
\[
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} + \lambda^* \left( \mathbf{w}_{nm}^H \mathbf{v}_{nm} - 1 \right) + \lambda \left( \mathbf{v}_{nm}^H \mathbf{w}_{nm} - 1 \right). \tag{6.4}
\]
Taking the derivative with respect to the complex vector $\mathbf{w}_{nm}$ and setting the result
to zero, gives
\[
\mathbf{w}_{nm}^H \mathbf{B} + \lambda \mathbf{v}_{nm}^H = 0, \tag{6.5}
\]
which, when satisfied, implies that at the solution point both the quadratic objective
function and the linear constraint function have gradients in the same direction, only
normalized by λ. The solution therefore satisfies
\[
\mathbf{w}_{nm}^H = -\lambda \mathbf{v}_{nm}^H \mathbf{B}^{-1}. \tag{6.6}
\]
Multiplying both sides from the right by $\mathbf{v}_{nm}$ and substituting the constraint in
Eq. (6.1), the value of λ is given by
\[
\lambda = -\frac{1}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}}. \tag{6.7}
\]
The optimal value of $\mathbf{w}_{nm}$ can now be written in the final form as
\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{B}^{-1}}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}}. \tag{6.8}
\]
6.1 Maximum Directivity Beamformer 129
Note that matrix B must be invertible, which amounts to requiring all values of bn (kr)
to be non-zero [see Eq. (6.2) and Chap. 4]. By substituting the elements of matrix B
and vnm in Eq. (6.8), the elements of wnm can be expressed as
\[
[w_{nm}(k)]^* = \frac{4\pi}{(N + 1)^2} \frac{1}{b_n(kr)} Y_n^m(\theta_k, \phi_k). \tag{6.9}
\]
Two conclusions can be drawn from this result. First, comparing Eq. (6.9) to
Eq. (5.22), it is clear that the maximum directivity beamformer is axis-symmetric,
with
\[
d_n(k) = \frac{4\pi}{(N + 1)^2}. \tag{6.10}
\]
It immediately follows that the optimal beamformer in Eq. (6.9) is also a solution
to the axis-symmetric maximum directivity beamformer, with a directivity factor as
defined in Eq. (5.30). The second conclusion follows directly, by noting that Eq. (6.10)
is a normalized version of the plane-wave decomposition array described in Sect. 5.5.
The plane-wave decomposition array therefore achieves maximum directivity. This
is evidence of the following characteristic of the spherical harmonics domain formu-
lation, in particular with axis-symmetric beam patterns: the naive solution of setting
all coefficients to a constant value achieves the best directivity index! An alternative
approach to solve for the maximum directivity beamformer is outlined at the end of
this section.
The directivity factor of the maximum directivity beamformer is derived next, by
substituting the solution from Eq. (6.8) and the satisfied constraint into Eq. (5.29):
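\[
\begin{aligned}
\mathrm{DF}_{\max} &= \frac{\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm}}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}} \\
&= \frac{\mathbf{w}_{nm}^H \mathbf{v}_{nm} \mathbf{v}_{nm}^H \mathbf{w}_{nm}}{\left[ \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm} \right]^{-1}} \\
&= \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm} \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{4\pi}{|b_n(kr)|^2} [b_n(kr)]^* \, Y_n^m(\theta_k, \phi_k) \, b_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^* \\
&= 4\pi \sum_{n=0}^{N} \sum_{m=-n}^{n} \left[ Y_n^m(\theta_k, \phi_k) \right]^* Y_n^m(\theta_k, \phi_k) = 4\pi \sum_{n=0}^{N} \frac{2n + 1}{4\pi} P_n(\cos 0) \\
&= (N + 1)^2.
\end{aligned} \tag{6.11}
\]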
The maximum achievable directivity factor therefore depends on the array order.
Arrays with a high directivity factor require a high-order N , which, in turn, requires
a large number of microphones. As the number of microphones for aliasing-free
sampling requires Q ≥ (N + 1)2 , it is clear that the maximum achievable directivity
is directly proportional to the number of microphones in the array.
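The closed-form result can be checked numerically with a short sketch. To keep it self-contained, the look direction is placed at the north pole (θ = 0), where only the m = 0 harmonics are non-zero and equal sqrt((2n + 1)/(4π)); this is an assumption made for the illustration, as are the arbitrary values chosen for bn and the function names. Because B in Eq. (6.2) is diagonal, the weights of Eq. (6.8) follow element by element.

```python
import math

def max_directivity_weights(b):
    """Eq. (6.8) for a look direction at theta = 0.

    b[n] holds b_n(kr). Returns (w_conj, v): the m = 0 elements of w^H and of
    the steering vector v, whose elements are b_n [Y_n^0]^* from Eq. (5.16).
    """
    y0 = [math.sqrt((2 * n + 1) / (4 * math.pi)) for n in range(len(b))]
    v = [complex(bn) * yn for bn, yn in zip(b, y0)]
    B_inv = [4 * math.pi / abs(bn) ** 2 for bn in b]          # inverse of Eq. (6.2)
    vH_Binv = [vn.conjugate() * bi for vn, bi in zip(v, B_inv)]
    norm = sum(x * vn for x, vn in zip(vH_Binv, v))           # v^H B^-1 v
    return [x / norm for x in vH_Binv], v

def directivity_factor(w_conj, v, b):
    """|w^H v|^2 / (w^H B w), with B as in Eq. (6.2)."""
    num = abs(sum(wc * vn for wc, vn in zip(w_conj, v))) ** 2
    den = sum(abs(wc) ** 2 * abs(bn) ** 2 / (4 * math.pi)
              for wc, bn in zip(w_conj, b))
    return num / den

# Arbitrary non-zero values standing in for b_n(kr), n = 0..3 (so N = 3).
b = [2.0 + 1.0j, -0.5j, 0.3 + 0.1j, 0.05 - 0.02j]
w_conj, v = max_directivity_weights(b)
DF = directivity_factor(w_conj, v, b)
```

For these four orders the directivity factor evaluates to (N + 1)² = 16 regardless of the values of bn, in line with Eq. (6.11), and each element of `w_conj` matches Eq. (6.9).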
Maximum directivity arrays exhibit a beam pattern that is referred to as hyper-cardioid
[6]. This beam pattern, well known for a directivity of $\frac{1}{4}(1 + 3\cos\Theta)$ for a
first order array, can also be extended specifically to spherical arrays by exploiting
the maximum directivity solution. The array beam pattern for an axis-symmetric
array, with $d_n = \frac{4\pi}{(N+1)^2}$, can be written using Eq. (5.24) as
\[
\begin{aligned}
y(\Theta) &= \frac{4\pi}{(N + 1)^2} \sum_{n=0}^{N} \frac{2n + 1}{4\pi} P_n(\cos\Theta) \\
&= \frac{P_{N+1}(\cos\Theta) - P_N(\cos\Theta)}{(N + 1)(\cos\Theta - 1)}
\end{aligned} \tag{6.12}
\]
[see Sect. 1.5, describing the spherical Fourier transform of Ynm (θ, φ)]. Substituting
the expressions for the Legendre polynomials (see Sect. 1.3), Table 6.1 shows the
hyper-cardioid directivity for several array orders and Fig. 6.1 illustrates the beam
patterns for orders N = 1, . . . , 4. The figure shows that improved hyper-cardioid
directivity at high orders comes with reduced side-lobe level, and a narrower main
lobe. In fact, Rafaely [11] showed that for arrays with an order higher than about
N = 4, the width of the main lobe, defined as the angle between the two zeros on
either side of the main lobe, can be approximated by the following simple expression:
\[
2\Theta_0 \approx \frac{2\pi}{N}, \tag{6.13}
\]
with Θ0 denoting the angle of the main lobe zero. The width of the main lobe is
also related to the ability of the array to spatially separate two plane waves arriving
from different directions. The limit of this separation ability is known in optics as
the Rayleigh resolution [2], such that
\[
\Theta_{\mathrm{Rayleigh}} \approx \frac{\pi}{N}. \tag{6.14}
\]
For arrays with low orders, the Rayleigh resolution is poor, but as the order
increases, resolution improves in a proportional manner.
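The two forms of Eq. (6.12) and the main-lobe zero can be explored with the following sketch. It evaluates the Legendre polynomials with the standard three-term recurrence and locates the first zero of the beam pattern by a coarse scan followed by bisection. The function names and the scan resolution are choices made for this illustration only.

```python
import math

def legendre(N, x):
    """Return a list whose entries 0..N are P_0(x), ..., P_N(x)."""
    P = [1.0, x]
    for n in range(1, N):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P

def hypercardioid_sum(N, Theta):
    """First line of Eq. (6.12): normalized sum of Legendre polynomials."""
    P = legendre(N, math.cos(Theta))
    return sum((2 * n + 1) * P[n] for n in range(N + 1)) / (N + 1) ** 2

def hypercardioid_closed(N, Theta):
    """Second line of Eq. (6.12); not defined at Theta = 0."""
    x = math.cos(Theta)
    P = legendre(N + 1, x)
    return (P[N + 1] - P[N]) / ((N + 1) * (x - 1))

def first_zero(N, steps=20000):
    """Scan from Theta = 0 for the first sign change, then refine by bisection."""
    lo, hi = 0.0, math.pi
    for i in range(1, steps + 1):
        hi = math.pi * i / steps
        if hypercardioid_sum(N, hi) <= 0.0:
            break
        lo = hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if hypercardioid_sum(N, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta0_N1 = first_zero(1)   # zero of (1 + 3 cos Theta) / 4, i.e. acos(-1/3)
theta0_N4 = first_zero(4)   # to be compared with pi / N from Eq. (6.13)
```

For N = 1 the zero falls at acos(−1/3), far from π/N, which is consistent with the approximation in Eq. (6.13) being stated only for higher orders; for N = 4 the computed zero lies within a few percent of π/4.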
An alternative approach to the derivation of the maximum directivity beamformer,
which does not require the Lagrange multiplier, is briefly outlined next. In this
approach the directivity factor is maximized directly, without imposing a distor-
tionless response constraint, after which the solution is normalized to satisfy the
constraint. Maximizing the directivity factor, as in Eq. (5.29), can be written as
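\[
\underset{\mathbf{w}_{nm}}{\text{maximize}} \ \lambda, \quad \lambda = \frac{\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm}}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}}. \tag{6.15}
\]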
A solution to this scalar equation can be found by solving the following vector
equation:
\[
\mathbf{A} \mathbf{w}_{nm} = \lambda \mathbf{B} \mathbf{w}_{nm}, \tag{6.17}
\]
because left-multiplication of Eq. (6.17) with $\mathbf{w}_{nm}^H$ preserves the equality. Equation
(6.17) is a generalized eigenvalue problem [5], with Eq. (6.15) representing a gen-
eralized Rayleigh quotient. We now use the special structures of matrices A and
B, as defined in Eq. (5.29), to simplify the generalized eigenvalue problem into a
(standard) eigenvalue problem. First, both sides of the equation are multiplied by the
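inverse of matrix B. Second, matrix A is written as a dyadic or outer product of two
vectors, $\mathbf{v}_{nm} \mathbf{v}_{nm}^H$, such that $\mathbf{B}^{-1}\mathbf{A} = \tilde{\mathbf{v}}_{nm} \mathbf{v}_{nm}^H$, with $\tilde{\mathbf{v}}_{nm} = \mathbf{B}^{-1} \mathbf{v}_{nm}$. Equation (6.17)
can now be rewritten as
\[
\tilde{\mathbf{v}}_{nm} \mathbf{v}_{nm}^H \mathbf{w}_{nm} = \lambda \mathbf{w}_{nm}. \tag{6.18}
\]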
Equation (6.18) is an eigenvalue problem, with the matrix under consideration having
unit rank, as it is composed of the outer product of two vectors. Due to the single
rank, there is only one non-zero eigenvalue, with a corresponding right eigenvector
$\tilde{\mathbf{v}}_{nm}$ [9]. Substituting $\mathbf{w}_{nm} = \tilde{\mathbf{v}}_{nm}$, this becomes a solution, provided $\lambda = \mathbf{v}_{nm}^H \tilde{\mathbf{v}}_{nm}$.
These are therefore the eigenvector and eigenvalue in this case; the eigenvalue is
the largest, as it is real and positive, and all other eigenvalues are zero. The optimal
beamforming coefficients can therefore be written as
\[
\mathbf{w}_{nm}^H = \mathbf{v}_{nm}^H \mathbf{B}^{-1}, \tag{6.19}
\]
which is a scaled version of the solution derived in Eq. (6.8). Further normalization
can now be applied, as in Eq. (6.8), to satisfy the distortionless-response
constraint.
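The rank-one structure of Eq. (6.18) can be verified numerically. The sketch below uses NumPy with a randomly generated vector and a random positive diagonal matrix as stand-ins for the steering vector and for B; these values are illustrative assumptions and do not correspond to any particular array.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 9                                                      # (N + 1)^2 with N = 2
v = rng.standard_normal(K) + 1j * rng.standard_normal(K)   # stand-in steering vector
B = np.diag(rng.uniform(0.5, 2.0, K))                      # stand-in positive diagonal B

v_tilde = np.linalg.solve(B, v)          # B^-1 v
M = np.outer(v_tilde, v.conj())          # B^-1 A = v_tilde v^H, a rank-one matrix

eigvals = np.linalg.eigvals(M)
lam = eigvals[np.argmax(np.abs(eigvals))]    # the single non-zero eigenvalue
lam_expected = np.vdot(v, v_tilde)           # v^H v_tilde (vdot conjugates its first argument)

# Eq. (6.19) followed by the normalization of Eq. (6.8); B is Hermitian,
# so (B^-1 v)^H equals v^H B^-1.
w_H = v_tilde.conj() / lam_expected.real
```

The largest eigenvalue returned by `eigvals` agrees with v^H B^-1 v, the vector `v_tilde` is reproduced by the matrix up to that factor, and the normalized weights satisfy the distortionless-response constraint.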
WNG was introduced in Sect. 5.4 as a general measure for array robustness. Arrays
that achieve maximum WNG will therefore be most robust to the effect of sensor
noise and other uncertainties in system parameters. This section presents the deriva-
tion of a spherical array with maximum WNG. Similar to the design of maximum
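directivity beamformers, we constrain the beam pattern to have unit response at the
look direction, such that $\mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1$, and so the numerator in Eq. (5.35) satisfies
$\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm} = 1$. Maximum WNG beamformers can therefore be designed by solving
\[
\begin{aligned}
&\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
&\text{subject to} \quad \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1,
\end{aligned} \tag{6.20}
\]
with
\[
\mathbf{B} = \mathbf{S}\mathbf{S}^H. \tag{6.21}
\]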
This problem is similar to the maximum directivity problem defined in Eq. (6.1), and
therefore a solution similar to Eq. (6.8) applies, leading to
\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{B}^{-1}}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}}. \tag{6.22}
\]
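The maximum WNG in this case is derived by substituting the solution, Eq. (6.22),
in the expression for the WNG, Eq. (5.34), assuming $\mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1$:
\[
\begin{aligned}
\mathrm{WNG}_{\max} &= \frac{\left| \mathbf{w}_{nm}^H \mathbf{v}_{nm} \right|^2}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}} = \frac{1}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}} \\
&= \frac{\left( \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm} \right)^2}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{B} \mathbf{B}^{-H} \mathbf{v}_{nm}} \\
&= \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}.
\end{aligned} \tag{6.23}
\]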
The last line of the derivation requires that B is Hermitian, which is satisfied because
B = SSH .
In the special case of uniform or nearly-uniform sampling [see Eq. (5.36)] matrix
B simplifies to
\[
\mathbf{B} = \mathbf{S}\mathbf{S}^H = \frac{4\pi}{Q} \mathbf{I}. \tag{6.24}
\]
Substituting Eq. (6.24) in Eqs. (6.22) and (6.23), the expressions for the optimal
weights and the maximum WNG for the case of uniform and nearly-uniform sampling
can be written as
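\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H}{\mathbf{v}_{nm}^H \mathbf{v}_{nm}} \tag{6.25}
\]
and
\[
\mathrm{WNG}_{\max} = \frac{Q}{4\pi} \mathbf{v}_{nm}^H \mathbf{v}_{nm}. \tag{6.26}
\]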
The expression for the maximum WNG can be further simplified using the following
relation [see Eqs. (5.34) and (5.39)]:
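\[
\mathbf{v}^H \mathbf{v} = \mathbf{v}_{nm}^H \mathbf{Y}^H \mathbf{Y} \mathbf{v}_{nm} = \frac{Q}{4\pi} \mathbf{v}_{nm}^H \mathbf{v}_{nm}. \tag{6.27}
\]
\[
\mathrm{WNG}_{\max} = \mathbf{v}^H \mathbf{v} = Q. \tag{6.28}
\]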
The equality to Q is achieved for the case of sensors in free field; in this case, the
steering vector is defined as in Eqs. (5.5) and (5.6), i.e. with elements $v_q = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}}$, $\mathbf{r} =
(r, \theta_q, \phi_q)$, and so the maximum WNG is equal to Q, the number of sensors. This is
a well-known result for the maximum achievable WNG [14].
Substituting Eq. (5.16) in Eqs. (6.25) and (6.26), the solution for the optimal
weights and the maximum WNG can be expressed more explicitly in the spheri-
cal harmonics domain as
and
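\[
\mathrm{WNG}_{\max} = \frac{Q}{4\pi} \sum_{n=0}^{N} |b_n(kr)|^2 \, \frac{2n + 1}{4\pi}. \tag{6.30}
\]
\[
d_n(k) = \frac{|b_n(kr)|^2}{\sum_{n'=0}^{N} \frac{2n' + 1}{4\pi} |b_{n'}(kr)|^2}. \tag{6.31}
\]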
Note also that this beamformer is similar to the beamformer presented in Eq. (5.42),
i.e. the delay-and-sum beamformer, when sensors are in a free field. It is therefore
clear that for free field arrays, the maximum WNG beamformer is equivalent to the
delay-and-sum beamformer. This further justifies the popular use of the delay-and-
sum beamformer in the literature, due to its robustness property [14]. Nevertheless,
Eq. (6.31) can be used to design maximum WNG beamformers for general array
configurations, not only for sensors in free fields, e.g. rigid-sphere arrays.
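Equations (6.30) and (6.31) are straightforward to evaluate for an open sphere. The sketch below assumes the open-sphere form bn(kr) = 4π iⁿ jn(kr), and evaluates the spherical Bessel function jn from its power series so that no external library is needed; the series is adequate for the moderate values of kr used here. Function names are hypothetical.

```python
import math

def spherical_jn(n, x, terms=60):
    """Spherical Bessel function j_n(x) from its power series (moderate x only)."""
    term = x ** n
    for k in range(1, n + 1):
        term /= 2 * k + 1                      # divide by (2n + 1)!!
    total = 0.0
    for k in range(terms):
        total += term
        term *= -0.5 * x * x / ((k + 1) * (2 * n + 2 * k + 3))
    return total

def bn_open_sq(n, kr):
    """|b_n(kr)|^2 for an open sphere, assuming b_n = 4 pi i^n j_n(kr)."""
    return (4 * math.pi * spherical_jn(n, kr)) ** 2

def wng_max(N, Q, kr):
    """Maximum WNG for nearly-uniform sampling, Eq. (6.30)."""
    s = sum((2 * n + 1) / (4 * math.pi) * bn_open_sq(n, kr) for n in range(N + 1))
    return Q / (4 * math.pi) * s

def dn_max_wng(N, kr):
    """Axis-symmetric maximum-WNG coefficients d_n, Eq. (6.31)."""
    b2 = [bn_open_sq(n, kr) for n in range(N + 1)]
    s = sum((2 * n + 1) / (4 * math.pi) * b for n, b in enumerate(b2))
    return [b / s for b in b2]

for kr in (0.5, 1.0, 2.0, 3.0):
    wng_db = 10 * math.log10(wng_max(3, 32, kr))
    print(f"kr = {kr:3.1f}   WNG_max = {wng_db:.2f} dB")
```

At very small kr the sum is dominated by n = 0 and the result approaches Q, while the coefficients returned by `dn_max_wng` always give a unit response in the look direction, since the sum of dn (2n + 1)/(4π) equals one by construction.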
Figure 6.2 shows the WNG for an array of order N = 3, in the range kr ∈ [0, 3],
designed to achieve maximum WNG. The values of the WNG were calculated using
Eq. (6.30), substituting values of bn (kr) for rigid and open spheres. The open-sphere
array achieves WNG close to Q (about 15 dB), as expected. Only as kr approaches 3
does the value of the WNG slightly reduce, as in this range the contribution of orders
higher than 3 to the sound field becomes significant, and the approximation of the
complex exponential sound field function becomes less accurate. The rigid-sphere
array achieves a WNG slightly higher than Q in the higher frequency range. This
is due to the effect of scattering; however, as discussed in Sect. 5.4, the WNG was
defined for sensors in a free field and hence may not apply directly in the case of
sensors around a rigid sphere. This means that the increase in the WNG is somewhat
artificial.
Fig. 6.2 WNG for an array of order N = 3 with Q = 32 microphones nearly-uniformly arranged
on the surface of rigid and open spheres (curves: Open, Rigid, Q)
The previous two sections presented two alternatives to the design of spherical arrays,
one that achieves maximum directivity index and the other that achieves maximum
WNG. These two designs are compared in this section by means of an example
[12]. Maximum directivity and maximum WNG beamformers are designed for a
spherical array composed of Q = 36 microphones arranged around an open sphere
and using a nearly-uniform sampling configuration, which achieves aliasing-free
sampling up to and including order N = 4. The directivity index and WNG for these
two beamformers are presented in Fig. 6.3.
Fig. 6.3 Directivity index (top) and WNG (bottom) for two arrays of order N = 4, with Q = 36
microphones nearly-uniformly arranged on the surface of an open sphere; one is designed to achieve
maximum directivity index (curve “Max DI”) and the other is designed to achieve maximum WNG
(curve “Max WNG”)
Several conclusions can be drawn from this example:
• The directivity index plot clearly shows that the array designed for maximum
directivity does achieve a better directivity index than the array designed for maximum
WNG. The value of the directivity index in this case (for the fourth-order
array) is given by $10 \log_{10} (N + 1)^2 \approx 14$ dB, as illustrated in the figure.
• The WNG plots show that the array designed for maximum WNG does achieve
a better WNG than the array designed for maximum directivity. The value of
the WNG for this delay-and-sum type array is given by $10 \log_{10} Q \approx 15.5$ dB, as
illustrated in the figure.
• The directivity index of the array designed for maximum WNG decreases towards
the low frequencies, achieving DI = 0 dB at kr = 0. This is a result of the require-
ment introduced in this design to achieve maximum WNG. The required WNG
can only be achieved at low values of kr by allocating low-magnitude weights to
high-order coefficients, as is evident from the solution in this case, i.e. dn is proportional
to $|b_n(kr)|^2$. For high n and low kr the magnitude of bn (kr) is small. The
low effective order of the array at low kr produces low directivity index values.
The high orders with their low magnitude present poor SNRs, so that allocating
weights with high gains to these orders would increase the noise and reduce the
WNG.
• For the same reason, the array designed for maximum directivity index achieves
poor WNG at low values of kr.
• At kr = N , both designs achieve a similar directivity index and WNG. This is
due to the behavior of bn (kr), n = 0, . . . , N , which have a similar magnitude at
kr = N . Arrays designed for narrow-band signals can therefore have both the best
directivity index and the best WNG if designed to operate at kr = N .
• The arrays designed for maximum directivity index have poor WNG at frequencies
around which bn (kr) = 0, i.e. the zeros of the spherical Bessel function. As dis-
cussed above, low values of bn (kr) impose poor WNG when attempting to achieve
a high directivity index. The disadvantage of the open-sphere array regarding
robustness is therefore clearly illustrated in this example.
The example presented above clearly shows the inherent trade-off between the
directivity index and WNG. This trade-off calls for a design which takes both the
directivity index and WNG into account. Such design approaches are presented in
the following sections.
Spherical microphone array designs for maximum directivity and maximum WNG
were presented in Sects. 6.1 and 6.2. The design example presented in the previous
section (Sect. 6.3) demonstrated the inherent trade-off between directivity and WNG, i.e. high
directivity index may come at the expense of robustness. Design of spherical arrays
in practice therefore involves a balance between these two measures of performance.
Spherical arrays with maximum directivity are particularly useful when it is nec-
essary to reduce the effect of diffuse or spherically isotropic noise fields. On the
other hand, spherical arrays with maximum WNG are particularly useful when it
is necessary to reduce the effect of sensor noise. Therefore, the balance between
directivity and WNG represents the balance between reducing acoustic noise and
reducing sensor noise. Now, by minimizing the overall noise at the array output,
which is composed of both acoustic noise and sensor noise, a natural balance can be
achieved between directivity and WNG [10]. A framework for designing a spherical
array that minimizes the overall noise at the array output is presented in this section.
First, an expression for the overall noise at the array output is formulated and next,
a closed-form expression for the array beamforming coefficients is derived by min-
imizing the overall noise at the array output, subject to a distortionless-response
constraint.
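Assuming spatially uncorrelated sensor noise with variance $\sigma_s^2$ and following the
derivation in Sect. 5.4, the variance of the sensor noise at the array output, $\sigma_{so}^2$, can
be expressed, as in Eq. (5.33):
\[
\sigma_{so}^2 = \sigma_s^2 \, \mathbf{w}_{nm}^H \mathbf{S}\mathbf{S}^H \mathbf{w}_{nm}, \tag{6.32}
\]
with matrix S dependent on the sampling scheme [see Eqs. (3.41)–(3.43)]. For the
particular case of uniform and nearly-uniform sampling, $\sigma_{so}^2$ reduces to
\[
\sigma_{so}^2 = \sigma_s^2 \, \frac{4\pi}{Q} \, \mathbf{w}_{nm}^H \mathbf{w}_{nm}. \tag{6.33}
\]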
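\[
\begin{aligned}
\sigma_{ao}^2 &= \int_0^{2\pi} \int_0^{\pi} |y(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi} \int_0^{\pi} \left| \sum_{n=0}^{N} \sum_{m=-n}^{n} [w_{nm}(k)]^* \, \sigma_a \, b_n(kr) \left[ Y_n^m(\theta, \phi) \right]^* \right|^2 \sin\theta \, d\theta \, d\phi \\
&= \sigma_a^2 \sum_{n=0}^{N} \sum_{m=-n}^{n} \left| [w_{nm}(k)]^* \, b_n(kr) \right|^2 \\
&= \sigma_a^2 \, \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm},
\end{aligned} \tag{6.34}
\]
with
\[
\mathbf{B} = \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right), \tag{6.35}
\]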
similar to the expression found in Eq. (5.29). The orthogonality property of the spher-
ical harmonics, as formulated in Eq. (1.23), was used in the derivation to evaluate the
integral. The overall noise at the array output can now be written as a composition
of the acoustic noise and sensor noise:
6.4 Mixed-Objective Design 139
where
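\[
\begin{aligned}
\sigma_{so}^2 &= \sigma_s^2 \, \frac{4\pi}{Q} \sum_{n=0}^{N} \sum_{m=-n}^{n} |w_{nm}(k)|^2 \\
&= \sigma_s^2 \, \frac{4\pi}{Q} \sum_{n=0}^{N} \frac{|d_n(k)|^2}{|b_n(kr)|^2} \sum_{m=-n}^{n} \left| Y_n^m(\theta_l, \phi_l) \right|^2 \\
&= \sigma_s^2 \, \frac{1}{Q} \sum_{n=0}^{N} \frac{|d_n(k)|^2}{|b_n(kr)|^2} (2n + 1) \\
&= \sigma_s^2 \, \mathbf{d}_n^H \mathbf{A} \mathbf{d}_n,
\end{aligned} \tag{6.40}
\]
with
\[
\mathbf{A} = \frac{1}{Q} \mathrm{diag}\left( 1/|b_0|^2, 3/|b_1|^2, \ldots, (2N + 1)/|b_N|^2 \right). \tag{6.41}
\]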
The spherical harmonics addition theorem, formulated in Eq. (1.26), was used to
simplify the summation over spherical harmonics.
The variance of the acoustic noise for the case of an axis-symmetric beamformer
with nearly-uniform sampling can be derived directly from Eq. (6.34) by substituting
Eq. (5.22):
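\[
\begin{aligned}
\sigma_{ao}^2 &= \int_0^{2\pi} \int_0^{\pi} |y(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi = \sigma_a^2 \sum_{n=0}^{N} \sum_{m=-n}^{n} \left| [w_{nm}(k)]^* \, b_n(kr) \right|^2 \\
&= \sigma_a^2 \sum_{n=0}^{N} |d_n(k)|^2 \sum_{m=-n}^{n} \left| Y_n^m(\theta_l, \phi_l) \right|^2 \\
&= \sigma_a^2 \sum_{n=0}^{N} |d_n(k)|^2 \, \frac{2n + 1}{4\pi} \\
&= \sigma_a^2 \, \mathbf{d}_n^H \mathbf{B} \mathbf{d}_n,
\end{aligned} \tag{6.42}
\]
with
\[
\mathbf{B} = \frac{1}{4\pi} \mathrm{diag}\left( 1, 3, 5, \ldots, 2N + 1 \right). \tag{6.43}
\]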
Matrix R in this case has the same form as in Eq. (6.37), i.e. $\mathbf{R} = \sigma_s^2 \mathbf{A} + \sigma_a^2 \mathbf{B}$. An
optimization problem similar to the one in Eq. (6.38) can now be written as
where, in this case, the elements of the steering vector, $\mathbf{v}_n$, are $v_n = \frac{2n+1}{4\pi}$, n =
0, . . . , N [see Eq. (5.31)]. It is assumed in this case that the angle between the incoming
plane wave and the look direction is zero. The solution becomes
\[
\mathbf{d}_n^T = \frac{\mathbf{v}_n^H \mathbf{R}^{-1}}{\mathbf{v}_n^H \mathbf{R}^{-1} \mathbf{v}_n}. \tag{6.45}
\]
Table 6.2 presents examples of spherical microphone array designs using the
mixed-objective method. In all examples, an optimization problem, as formulated
in Eq. (6.44), was formulated and solved using Eq. (6.45). Then, the values for the
directivity factor and the WNG were computed using Eqs. (5.31) and (5.39), respec-
tively.
The first two rows of the table illustrate two simplified designs, based on a second-order
spherical array in an open configuration, at kr = 2, composed of 12 microphones
and using a uniform sampling scheme. The first design, with $\sigma_s^2 = 0$, $\sigma_a^2 = 1$,
reduces to a maximum directivity beamformer. Indeed, DF = 9 is achieved, following
the theoretical upper limit of $(N + 1)^2$ for this case. The second design, with
$\sigma_s^2 = 1$, $\sigma_a^2 = 0$, reduces to the maximum WNG beamformer, achieving a WNG of
11.67, which is just below the upper limit of Q for an array in free field (or open
configuration), which is 12 in this case. This example illustrates that the maximum
directivity and the maximum WNG designs are special cases of the mixed objectives
design.
Table 6.2 Directivity factor and WNG are shown for several designs, with parameters presented
on the left-hand side of the table

Sphere   N   Q    kr   σ_s^2   σ_a^2   DF      WNG
Open     2   12   2    0.0     1.0      9.00    6.58
Open     2   12   2    1.0     0.0      5.97   11.67
Rigid    3   32   3    0.0     1.0     16.00   44.52
Rigid    3   32   3    1.0     0.0     15.31   46.72
Rigid    3   32   3    1.0     1.0     16.00   44.64
Rigid    4   36   2    0.0     1.0     25.00    1.60
Rigid    4   36   2    1.0     0.0      9.35   51.50
Rigid    4   36   2    1.0     0.4     17.78   15.48
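The two open-sphere rows of Table 6.2 can be reproduced with a short sketch of the axis-symmetric mixed-objective design. It assumes the open-sphere form bn(kr) = 4π iⁿ jn(kr), takes A and B from Eqs. (6.41) and (6.43), and evaluates the directivity factor as 4π |dᵀv|² / (dᴴ B d) and the WNG as |dᵀv|² / (dᴴ A d); these two expressions are this sketch's reading of the referenced definitions rather than a quotation of them. Since R is diagonal here, Eq. (6.45) is applied element by element.

```python
import math

def spherical_jn(n, x, terms=60):
    """Spherical Bessel function j_n(x) from its power series (moderate x only)."""
    term = x ** n
    for k in range(1, n + 1):
        term /= 2 * k + 1
    total = 0.0
    for k in range(terms):
        total += term
        term *= -0.5 * x * x / ((k + 1) * (2 * n + 2 * k + 3))
    return total

def mixed_objective(N, Q, kr, sigma_s2, sigma_a2):
    """Axis-symmetric mixed-objective design for an open sphere.

    Returns (d, DF, WNG), with d from Eq. (6.45) and R = sigma_s2 A + sigma_a2 B.
    """
    b2 = [(4 * math.pi * spherical_jn(n, kr)) ** 2 for n in range(N + 1)]  # |b_n|^2
    A = [(2 * n + 1) / (Q * b2[n]) for n in range(N + 1)]                  # Eq. (6.41)
    B = [(2 * n + 1) / (4 * math.pi) for n in range(N + 1)]                # Eq. (6.43)
    v = B[:]                                                               # v_n = (2n + 1) / (4 pi)
    R = [sigma_s2 * a + sigma_a2 * bb for a, bb in zip(A, B)]
    norm = sum(vn * vn / r for vn, r in zip(v, R))                         # v^H R^-1 v
    d = [vn / r / norm for vn, r in zip(v, R)]                             # Eq. (6.45)
    out = sum(dn * vn for dn, vn in zip(d, v))                             # look-direction response
    DF = 4 * math.pi * out ** 2 / sum(dn * dn * bb for dn, bb in zip(d, B))
    WNG = out ** 2 / sum(dn * dn * a for dn, a in zip(d, A))
    return d, DF, WNG

_, DF1, WNG1 = mixed_objective(2, 12, 2.0, 0.0, 1.0)   # acoustic noise only
_, DF2, WNG2 = mixed_objective(2, 12, 2.0, 1.0, 0.0)   # sensor noise only
```

Under these assumptions the first call gives the maximum-directivity design and the second the maximum-WNG design, and the resulting DF and WNG values agree with the first two rows of the table to the printed precision.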
The design of microphone arrays with an optimized beam pattern has been presented
in Sect. 6.1, where the ratio between the magnitude of the beam pattern in a single
look direction and the magnitude of the beam pattern averaged over all directions was
maximized. The underlying assumption in this maximum directivity design is that
the desired signal arrives from a single direction. This may not always be the case.
Consider, for example, the recording of live music, with the microphone facing the
stage. In this case the directivity factor should be maximized over a wider directional
range, to capture the sound sources from the entire stage. In addition, low magnitude
of the beam pattern from other directions (e.g. the audience) may be desired. A simple
design objective suitable for this example is to maximize the ratio between the front
and back parts of the beam pattern. Directional microphones with maximum front-
back ratio have been discussed in [4], with optimal solutions derived for differential
microphones. In this section, the maximum front-back ratio solution is derived for
the spherical microphone array.
The measure for the front-back ratio can be written as [4]
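\[
F = \frac{\int_0^{2\pi} \int_0^{\pi/2} |y(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi}{\int_0^{2\pi} \int_{\pi/2}^{\pi} |y(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi}. \tag{6.46}
\]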
In this formulation, the “front” refers to the upper hemisphere and the “back” to the
lower hemisphere. As the problem is symmetric around the z-axis, the axis-symmetric
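beam pattern is employed, as in Eq. (5.24), substituting $y = \sum_{n=0}^{N} d_n \frac{2n+1}{4\pi} P_n(\cos\theta)$,
omitting the dependence on k. The resulting integral in the numerator of Eq. (6.46)
is evaluated next, denoting $F = F_{\mathrm{NUM}} / F_{\mathrm{DEN}}$. We first solve for $F_{\mathrm{NUM}}$:
\[
\begin{aligned}
F_{\mathrm{NUM}} &= \int_0^{2\pi} \int_0^{\pi/2} |y(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \frac{1}{8\pi} \sum_{n=0}^{N} \sum_{n'=0}^{N} d_n^* (2n + 1) \, d_{n'} (2n' + 1) \\
&\quad \times \int_0^{\pi/2} P_n(\cos\theta) P_{n'}(\cos\theta) \sin\theta \, d\theta.
\end{aligned} \tag{6.47}
\]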
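\[
\begin{aligned}
\int_0^1 P_n(z) P_{n'}(z) \, dz &= \sum_{q=0}^{n} \sum_{l=0}^{n'} p_q^n \, p_l^{n'} \int_0^1 z^{q+l} \, dz \\
&= \sum_{q=0}^{n} \sum_{l=0}^{n'} \frac{1}{q + l + 1} \, p_q^n \, p_l^{n'}.
\end{aligned} \tag{6.48}
\]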
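\[
\begin{aligned}
F_{\mathrm{DEN}} &= \int_0^{2\pi} \int_{\pi/2}^{\pi} |y(\theta, \phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \frac{1}{8\pi} \sum_{n=0}^{N} \sum_{n'=0}^{N} d_n^* (2n + 1) \, d_{n'} (2n' + 1) \\
&\quad \times \int_{\pi/2}^{\pi} P_n(\cos\theta) P_{n'}(\cos\theta) \sin\theta \, d\theta \\
&= \frac{1}{8\pi} \sum_{n=0}^{N} \sum_{n'=0}^{N} d_n^* (2n + 1) \, d_{n'} (2n' + 1) \sum_{q=0}^{n} \sum_{l=0}^{n'} \frac{(-1)^{q+l}}{q + l + 1} \, p_q^n \, p_l^{n'}.
\end{aligned} \tag{6.51}
\]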
\[
F = \frac{\mathbf{d}_n^H \mathbf{A} \mathbf{d}_n}{\mathbf{d}_n^H \mathbf{B} \mathbf{d}_n}, \tag{6.52}
\]
Fig. 6.4 Super-cardioid beam patterns for order N = 1, . . . , 4, with the corresponding F values in
decibels
Matrices A and B are real, symmetric and positive definite, and so the eigenvalues
are positive real and the eigenvectors are real (see also [4]). Writing the Rayleigh
quotient as a generalized eigenvalue problem,
the largest eigenvalue is the value of the maximum front-back ratio, and the corre-
sponding vector is the solution dn .
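As a numerical illustration of this eigenvalue formulation, the following Python sketch builds matrices A and B from Eqs. (6.47), (6.48) and (6.51) and extracts the largest generalized eigenvalue. It is a minimal sketch assuming NumPy is available; the function names `half_range_matrix` and `max_front_back` are hypothetical, and the Cholesky-based reduction to a standard symmetric eigenvalue problem is one convenient numerical route, not necessarily the one used in [4].

```python
import numpy as np
from numpy.polynomial import legendre


def half_range_matrix(N, back=False):
    """Matrix A (front) or B (back) of Eq. (6.52), from Eqs. (6.47)-(6.51)."""
    # p[n][q] is the coefficient p_q^n of z^q in the Legendre polynomial P_n(z)
    p = [legendre.leg2poly([0] * n + [1]) for n in range(N + 1)]
    G = np.zeros((N + 1, N + 1))
    for n in range(N + 1):
        for n2 in range(N + 1):
            for q, pq in enumerate(p[n]):
                for l, pl in enumerate(p[n2]):
                    sign = (-1) ** (q + l) if back else 1
                    G[n, n2] += sign * pq * pl / (q + l + 1)
    w = 2 * np.arange(N + 1) + 1            # factors (2n + 1)
    return np.outer(w, w) * G / (8 * np.pi)


def max_front_back(N):
    """Largest generalized eigenvalue of (A, B) and the corresponding d_n."""
    A = half_range_matrix(N)
    B = half_range_matrix(N, back=True)
    L = np.linalg.cholesky(B)               # B = L L^T
    Linv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(Linv @ A @ Linv.T)
    d = Linv.T @ vecs[:, -1]                # map back to the original variable
    return vals[-1], d


F, d = max_front_back(1)
F_db = 10 * np.log10(F)                     # front-back ratio in decibels
```

For N = 1 the maximization can also be carried out by hand, giving F = 7 + 4√3 (about 11.4 dB), which the sketch reproduces. Matrix B becomes increasingly ill-conditioned as N grows, so higher orders may call for a more careful numerical treatment.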
The maximum front-back ratio beam pattern is also known as the super-cardioid
pattern [4]. Figure 6.4 shows examples of the super-cardioid beam pattern for spher-
ical arrays of orders N = 1, . . . , 4. Note that very high front-back ratios can be
achieved with these arrays, as detailed on the figures.
Beam pattern design often involves some assumptions about the desired signal and
the unwanted noise. For example, in the maximum directivity beamformer design,
the desired signal is a plane wave arriving from the array look direction, while the
noise is composed of waves arriving from all directions, e.g. a diffuse sound field.
However, the noise may be composed of a smaller number of plane waves arriving
from unknown directions. In this case, constraining the level of the beam pattern side
lobes can guarantee a desired level of noise attenuation. A framework for the design
of such beam patterns, called the Dolph-Chebyshev design method, is presented in
this section.
In particular, beam patterns with minimal width of the main lobe can be designed
for a given limit on the level of the side lobes, or beam patterns with a minimal level
of side lobes can be designed given a limit on the width of the main lobe. A brief
overview of Dolph-Chebyshev beam patterns is first presented [14], followed by a
derivation of a closed-form Dolph-Chebyshev design method for spherical arrays
[7].
The Dolph-Chebyshev beam pattern is based on the Chebyshev polynomials, char-
acterized by equal-amplitude oscillations in the range [−1, 1] and rapid divergence
beyond this range. Figure 6.5 shows an example of a Chebyshev polynomial, T8 (x),
illustrating that |T8 (x)| ≤ 1 in the range x ∈ [0, 1] and rapidly increases thereafter.
With the design of a Dolph-Chebyshev beam pattern, the oscillatory part of the poly-
nomials is transformed into the equal-ripple side-lobe response of the beam pattern,
while the diverging part contributes to the main lobe with a monotonic response. To
set the width of the main lobe and the relative attenuation of the side lobes, a point
(x0 , R) is selected, as shown in Fig. 6.5. The point x0 is to be transformed into the
look direction, or the peak of the main lobe, such that a relative side-lobe attenua-
tion of 1/R is achieved. Finally, the polynomial undergoes parameter scaling, with
x = x0 cos(θ/2). The fundamental equation describing the Dolph-Chebyshev beam
pattern based on the Chebyshev polynomials is therefore given by
\[
y(\theta) = \frac{1}{R}\, T_M\big(x_0 \cos(\theta/2)\big), \tag{6.55}
\]
Alternatively, the desired zero of the main lobe is set to θ0 , from which x0 and then
R are derived:
\[
R = \cosh\big(M \cosh^{-1}(x_0)\big), \tag{6.58}
\]
with
\[
x_0 = \frac{\cos\left(\frac{\pi}{2M}\right)}{\cos(\theta_0/2)}. \tag{6.59}
\]
Following the development in [7], the axis-symmetric beam pattern for the spherical
array, as in Eq. (5.23),
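is equated to that in Eq. (6.55), with further substitutions of
$z = \cos\theta$, $\cos(\theta/2) = \sqrt{\frac{1+\cos\theta}{2}}$
and $M = 2N$, leading to
\[
\sum_{n=0}^{N} d_n \frac{2n+1}{4\pi} P_n(z) = \frac{1}{R}\, T_{2N}\left(x_0 \sqrt{\frac{1+z}{2}}\right). \tag{6.60}
\]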
The Chebyshev polynomial, $T_{2N}$, for an even order $2N$, consists only of even powers
[1], and the polynomial $T_{2N}\left(x_0 \sqrt{\frac{1+z}{2}}\right)$ is therefore of order $N$ in $z$. The polynomial
on the left-hand side of Eq. (6.60) is also of order $N$ in $z$ (see Sect. 1.3) and so the
coefficients of the two polynomials can be equated, leading to a derivation of $d_n$ for
a Dolph-Chebyshev beam pattern. First, both sides of Eq. (6.60) are multiplied by
$2\pi P_m(z)$, $m = 0, \ldots, N$, and then they are integrated over the range $z \in [-1, 1]$. The
left-hand side reduces to $d_m$, due to the orthogonality of the Legendre polynomials
[see Eq. (1.36)], leading to
\[
d_m = \frac{2\pi}{R} \int_{-1}^{1} P_m(z)\, T_{2N}\left(x_0 \sqrt{\frac{1+z}{2}}\right) dz, \quad m = 0, \ldots, N. \tag{6.61}
\]
To solve the integral, both polynomials are written explicitly in an expanded form
as
\[
\begin{aligned}
P_m(z) &= \sum_{s=0}^{m} p_s^{m} z^{s} \\
T_{2N}(z) &= \sum_{l=0}^{N} t_{2l}^{2N} z^{2l},
\end{aligned} \tag{6.62}
\]
where $p_s^{m}$ and $t_{2l}^{2N}$ denote the coefficients of the Legendre and Chebyshev polynomials,
respectively. Although $T_{2N}(z)$ is of order $2N$, it has only $N + 1$ coefficients, as the
coefficients of the odd powers are zero. Substituting Eq. (6.62) into Eq. (6.61) and
rearranging terms yields
\[
d_m = \frac{2\pi}{R} \sum_{s=0}^{m} \sum_{l=0}^{N} 2^{-l}\, t_{2l}^{2N}\, p_s^{m}\, x_0^{2l} \int_{-1}^{1} z^{s} (1+z)^{l}\, dz. \tag{6.63}
\]
Further simplification is obtained by substituting the binomial expansion [1]
$(1+z)^{l} = \sum_{q=0}^{l} \frac{l!}{q!(l-q)!} z^{q}$, and then solving the integral, with odd powers of $z$ integrating
to zero, leading to
\[
d_m = \frac{2\pi}{R} \sum_{s=0}^{m} \sum_{l=0}^{N} \sum_{q=0}^{l}
\frac{1-(-1)^{q+s+1}}{q+s+1}\, \frac{l!}{q!(l-q)!}\, 2^{-l}\, t_{2l}^{2N}\, p_s^{m}\, x_0^{2l}. \tag{6.64}
\]
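\[
\mathbf{d} = \frac{2\pi}{R}\, \mathbf{P}\mathbf{A}\mathbf{C}\mathbf{T}\mathbf{x}_0 \tag{6.65}
\]
where
\[
\begin{aligned}
\mathbf{d} &= [d_0, d_1, \ldots, d_N]^{T} \\
\mathbf{x}_0 &= \left[1, x_0^{2}, x_0^{4}, \ldots, x_0^{2N}\right]^{T} \\
\mathbf{P} &= \begin{bmatrix}
p_0^{0} & 0 & \cdots & 0 \\
p_0^{1} & p_1^{1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
p_0^{N} & p_1^{N} & \cdots & p_N^{N}
\end{bmatrix} \\
\mathbf{A} &= \begin{bmatrix}
2 & 0 & \cdots & \frac{1-(-1)^{N+1}}{N+1} \\
0 & \frac{2}{3} & \cdots & \frac{1-(-1)^{N+2}}{N+2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{1-(-1)^{N+1}}{N+1} & \frac{1-(-1)^{N+2}}{N+2} & \cdots & \frac{1-(-1)^{2N+1}}{2N+1}
\end{bmatrix} \\
\mathbf{C} &= \begin{bmatrix}
1 & \frac{1}{2} & \cdots & \frac{1}{2^{N}} \\
0 & \frac{1}{2} & \cdots & \frac{N}{2^{N}} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \frac{1}{2^{N}}
\end{bmatrix} \\
\mathbf{T} &= \mathrm{diag}\left(t_0^{2N}, t_2^{2N}, \ldots, t_{2N}^{2N}\right).
\end{aligned} \tag{6.66}
\]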
All four matrices are of size $(N + 1) \times (N + 1)$, with matrix A consisting of
elements $(s, q)$ given by $\frac{1-(-1)^{q+s+1}}{q+s+1}$, and the upper triangular matrix C consisting of
elements $(q, l)$ given by $\frac{l!}{q!(l-q)!}\, 2^{-l}$ for $l \geq q$. The Dolph-Chebyshev beam pattern
for a spherical array can now be designed as follows:
(i) The array order N is defined.
(ii) Desired side-lobe level 1/R or desired main-lobe width 2θ0 is selected.
(iii) Either Eq. (6.56) or Eq. (6.58) is evaluated to make both x0 and R available.
(iv) Array coefficients are computed using Eq. (6.65).
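The four design steps above can be sketched in Python as follows. This is a minimal illustration assuming NumPy; the function names `dolph_chebyshev` and `beam_pattern` are hypothetical. When the side-lobe level is specified, x0 is obtained here by inverting Eq. (6.58); when the main-lobe zero θ0 is specified, Eqs. (6.59) and (6.58) are used directly. The matrices follow Eq. (6.66).

```python
import numpy as np
from math import comb
from numpy.polynomial import chebyshev, legendre


def dolph_chebyshev(N, R=None, theta0=None):
    """Coefficients d of Eq. (6.65); give either R or theta0 (radians)."""
    M = 2 * N
    if theta0 is not None:
        x0 = np.cos(np.pi / (2 * M)) / np.cos(theta0 / 2)   # Eq. (6.59)
        R = np.cosh(M * np.arccosh(x0))                     # Eq. (6.58)
    elif R is not None:
        x0 = np.cosh(np.arccosh(R) / M)                     # inverse of Eq. (6.58)
    else:
        raise ValueError("either R or theta0 must be given")
    # P: rows hold the power-series coefficients p_s^m of P_m(z)
    P = np.zeros((N + 1, N + 1))
    for m in range(N + 1):
        P[m, :m + 1] = legendre.leg2poly([0] * m + [1])
    idx = np.arange(N + 1)
    e = idx[:, None] + idx[None, :] + 1
    A = (1 - (-1.0) ** e) / e                               # integral of z^(s+q) over [-1, 1]
    C = np.array([[comb(l, q) * 2.0 ** (-l) for l in range(N + 1)]
                  for q in range(N + 1)])
    t = chebyshev.cheb2poly([0] * M + [1])[::2]             # even-power coefficients of T_2N
    x0_vec = x0 ** (2 * idx)
    d = (2 * np.pi / R) * P @ A @ C @ np.diag(t) @ x0_vec
    return d, x0, R


def beam_pattern(d, theta):
    """Axis-symmetric beam pattern y(theta), left-hand side of Eq. (6.60)."""
    n = np.arange(len(d))
    return legendre.legval(np.cos(theta), d * (2 * n + 1) / (4 * np.pi))


d, x0, R = dolph_chebyshev(4, R=10 ** (25 / 20))            # 25 dB side-lobe attenuation
```

The resulting pattern can be checked against Eq. (6.55) by evaluating `beam_pattern(d, theta)` on a grid of angles; by construction it equals one in the look direction θ = 0.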
Figure 6.7 illustrates two examples of Dolph-Chebyshev beam patterns for spher-
ical arrays of orders N = 4, 9. For both designs 20 log10 R = 25 dB and in both
designs a side-lobe level of −25 dB is maintained. The high-order array achieves a
narrower main lobe, as is clearly shown in the figure. A second set of design examples
Fig. 6.7 Beam pattern for an axis-symmetric spherical array with a Dolph-Chebyshev design with
R set to achieve a side-lobe level reduction of 25 dB, for arrays of orders N = 4, 9
Fig. 6.8 Beam pattern for an axis-symmetric spherical array with a Dolph-Chebyshev design with
x0 set to achieve a main-lobe with a zero at θ0 = 45◦ , for arrays of orders N = 4, 9
is illustrated in Fig. 6.8, where for both designs θ0 = 45◦ , achieving a zero-to-zero
main-lobe width of 90◦ . The higher-order array achieves a lower side-lobe level, as is
clearly shown in the figure. In summary, the figures illustrate the trade-off in design
between main-lobe width and side-lobe level and further show that a spherical array
with a higher order achieves better performance, either in terms of main-lobe width
or in terms of side-lobe level.
In the previous sections of this chapter, various approaches to the design of spherical
microphone array beamformers were presented. Each of these design methods is
based on a different objective, which expresses a desired characteristic of the array.
These objectives include maximum directivity, maximum WNG, minimum side-lobe
level, and minimum main-lobe width, among other objectives. Design methods that
include a single objective, or two objectives as in the case presented in Sect. 6.4,
allowed standard formulations and closed-form solutions. However, in practice, a
design which considers all (or many) of these objectives may be desired, because
all of these objectives relate to important array characteristics. Although multiple-
objective formulations typically do not have a closed-form solution, they can be
integrated into an optimization problem that can be solved numerically, as presented
in recent studies [8, 13, 15].
Two example design methods based on numerical optimization are presented in
this section. Similar formulations that include other mixtures of objectives are also
possible. As a first example, consider the design of a spherical array that maximizes
directivity, but maintains a minimum desired level of robustness by imposing a
lower limit on the WNG. In addition, the beam pattern is designed to maintain a
distortionless-response constraint in the look direction. Using the results presented in
the design for maximum directivity, maximum WNG and the distortionless-response
constraint, as presented in Eqs. (6.1) and (6.20), the following optimization problem
is formulated:
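\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mathbf{w}_{nm}^{H} \mathbf{B} \mathbf{w}_{nm} \\
\text{subject to} \quad & \mathbf{w}_{nm}^{H} \mathbf{v}_{nm} = 1 \\
& \mathbf{w}_{nm}^{H} \mathbf{A} \mathbf{w}_{nm} \leq \frac{1}{WNG_{min}},
\end{aligned} \tag{6.67}
\]
where
\[
\begin{aligned}
\mathbf{A} &= \mathbf{S}\mathbf{S}^{H} \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}\left(|b_0|^{2}, |b_1|^{2}, |b_1|^{2}, |b_1|^{2}, \ldots, |b_N|^{2}\right)
\end{aligned} \tag{6.68}
\]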
and WNGmin is the lower limit on the WNG. Matrix S is dependent on the sampling
scheme [see Eqs. (3.41)–(3.43)]. Due to the special structure of matrices A and B,
these matrices are positive definite, i.e. the matrices are Hermitian and the scalars
xH Ax and xH Bx are positive for all non-zero vectors x. The optimization problem
in Eq. (6.67) is therefore convex and is called a quadratically-constrained quadratic
program (QCQP), having readily available numerical solution methods [3].
QCQP is a special case of second-order cone programming (SOCP), so that this
optimization problem can also be presented as a SOCP problem:
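\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mu \\
\text{subject to} \quad & \mathbf{w}_{nm}^{H} \mathbf{v}_{nm} = 1 \\
& \left\| \mathbf{w}_{nm}^{H} \mathbf{B}^{\frac{1}{2}} \right\| \leq \mu \\
& \left\| \mathbf{w}_{nm}^{H} \mathbf{S} \right\| \leq \frac{1}{\sqrt{WNG_{min}}},
\end{aligned} \tag{6.69}
\]
with
\[
\mathbf{B}^{\frac{1}{2}} = \frac{1}{\sqrt{4\pi}}\, \mathrm{diag}\left(|b_0|, |b_1|, |b_1|, |b_1|, \ldots, |b_N|\right), \tag{6.70}
\]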
where
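\[
\begin{aligned}
\mathbf{A} &= \frac{4\pi}{Q}\, \mathrm{diag}(\mathbf{v}_n) \times \mathrm{diag}\left(|b_0|^{-2}, |b_1|^{-2}, \ldots, |b_N|^{-2}\right) \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}(\mathbf{v}_n) \\
\mathbf{v}_n &= \frac{1}{4\pi}\, [1, 3, 5, \ldots, 2N + 1]^{T}.
\end{aligned} \tag{6.72}
\]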
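For the axis-symmetric case, matrices A and B of Eq. (6.72) are diagonal, so the directivity design with a WNG constraint can be illustrated without a general-purpose convex solver. The Python sketch below, assuming NumPy, replaces the QCQP solver of [3] with a one-dimensional bisection over the Lagrange multiplier of the WNG constraint; this substitution, the function name `design_wng_constrained`, and the magnitudes `b_abs` are all assumptions made for illustration. In particular, `b_abs` holds placeholder values and not the rigid-sphere values of $|b_n|$, so the numbers produced do not correspond to the design examples of this section.

```python
import numpy as np


def design_wng_constrained(N, Q, b_abs, wng_min):
    """Maximize DF subject to d^T v_n = 1 and WNG >= wng_min (axis-symmetric)."""
    n = np.arange(N + 1)
    v = (2 * n + 1) / (4 * np.pi)              # v_n of Eq. (6.72)
    a = (4 * np.pi / Q) * v / b_abs ** 2       # diagonal of A
    b = v / (4 * np.pi)                        # diagonal of B

    def weights(lam):
        d = v / (b + lam * a)                  # stationary point of the Lagrangian
        return d / (v @ d)                     # enforce d^T v_n = 1

    def wng(d):
        return 1.0 / np.sum(a * d ** 2)

    d = weights(0.0)                           # maximum-directivity solution
    if wng(d) < wng_min:
        lo, hi = 0.0, 1.0
        while wng(weights(hi)) < wng_min:      # bracket the multiplier
            hi *= 2.0
            if hi > 1e12:
                raise ValueError("wng_min exceeds the maximum achievable WNG")
        for _ in range(200):                   # bisection on the multiplier
            mid = 0.5 * (lo + hi)
            if wng(weights(mid)) < wng_min:
                lo = mid
            else:
                hi = mid
        d = weights(hi)
    df = 1.0 / np.sum(b * d ** 2)
    return d, df, wng(d)


# Placeholder magnitudes |b_n| for N = 4 (hypothetical, for illustration only)
b_abs = 4 * np.pi * np.sqrt(np.array([0.5, 0.3, 0.1, 0.02, 0.003]))
d, df, g = design_wng_constrained(N=4, Q=36, b_abs=b_abs, wng_min=10.0)
```

With a multiplier of zero the sketch returns the maximum-directivity weights, for which DF = (N + 1)²; as the multiplier grows the solution moves toward the maximum-WNG weights, trading directivity for robustness.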
In the next example, a constraint on the maximum side-lobe level of the array
beam pattern is introduced. An array beam pattern, as in Eq. (5.12), is presented
here, explicitly denoting the plane-wave arrival direction by (θk , φk ):
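\[
y(\theta_k, \phi_k) = \mathbf{w}_{nm}^{H} \mathbf{v}_{nm}(\theta_k, \phi_k), \tag{6.73}
\]
with
\[
\begin{aligned}
\mathbf{v}_{nm}(\theta_k, \phi_k) &= \left[v_{00}(\theta_k, \phi_k), v_{1(-1)}(\theta_k, \phi_k), v_{10}(\theta_k, \phi_k), v_{11}(\theta_k, \phi_k), \ldots, v_{NN}(\theta_k, \phi_k)\right]^{T} \\
v_{nm}(\theta_k, \phi_k) &= b_n(kr) \left[Y_n^{m}(\theta_k, \phi_k)\right]^{*},
\end{aligned} \tag{6.74}
\]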
similar to the expressions in Eqs. (5.13) and (5.16). Now, as in [13], the entire direc-
tional region is divided into one region denoting the main-lobe directions and a
second region denoting the side-lobe directions. The side lobes directional region is
denoted by ΩSL , such that the arrival directions within this region satisfy
Now, the requirement that the magnitude of the side lobes is not larger than a limit
denoted by lSL can be formulated as a constraint on the maximum side-lobe level:
\[
|y(\theta_k, \phi_k)| \leq l_{SL}, \quad (\theta_k, \phi_k) \in \Omega_{SL}. \tag{6.76}
\]
\[
|y(\theta_i, \phi_i)| \leq l_{SL}, \quad (\theta_i, \phi_i) \in \Omega_{SL}, \quad i = 1, \ldots, I. \tag{6.77}
\]
It is important to note that the discrete formulation is not equal to the continuous
formulation, because maintaining the constraint is not guaranteed at directions other
than the selected set. However, assuming the beam pattern is order-limited in the
spherical harmonics domain, it cannot facilitate rapid changes along (θ, φ), so that
dense sampling of ΩSL will tend to reduce the error (due to sampling) in maintaining
the constraint [13].
Equation (6.73) is substituted into Eq. (6.77), forming a discrete formulation of
the side-lobe level constraint that can be integrated into the QCQP optimization. One
possibility is to simply add a side-lobe level constraint such that Eq. (6.67) is written
as
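\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mathbf{w}_{nm}^{H} \mathbf{B} \mathbf{w}_{nm} \\
\text{subject to} \quad & \mathbf{w}_{nm}^{H} \mathbf{v}_{nm} = 1 \\
& \mathbf{w}_{nm}^{H} \mathbf{A} \mathbf{w}_{nm} \leq \frac{1}{WNG_{min}} \\
& \mathbf{w}_{nm}^{H} \mathbf{B}_i \mathbf{w}_{nm} \leq l_{SL}^{2}, \quad i = 1, \ldots, I,
\end{aligned} \tag{6.78}
\]
where
\[
\begin{aligned}
\mathbf{A} &= \mathbf{S}\mathbf{S}^{H} \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}\left(|b_0|^{2}, |b_1|^{2}, |b_1|^{2}, |b_1|^{2}, \ldots, |b_N|^{2}\right) \\
\mathbf{B}_i &= \mathbf{v}_{nm}(\theta_i, \phi_i)\, \mathbf{v}_{nm}^{H}(\theta_i, \phi_i), \quad i = 1, \ldots, I.
\end{aligned} \tag{6.79}
\]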
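\[
\begin{aligned}
\underset{\mathbf{d}_n}{\text{minimize}} \quad & \mathbf{d}_n^{H} \mathbf{B} \mathbf{d}_n \\
\text{subject to} \quad & \mathbf{d}_n^{T} \mathbf{v}_n = 1 \\
& \mathbf{d}_n^{H} \mathbf{A} \mathbf{d}_n \leq \frac{1}{WNG_{min}} \\
& \mathbf{d}_n^{H} \mathbf{B}_i \mathbf{d}_n \leq l_{SL}^{2}, \quad i = 1, \ldots, I,
\end{aligned} \tag{6.81}
\]
where
\[
\begin{aligned}
\mathbf{A} &= \frac{4\pi}{Q}\, \mathrm{diag}(\mathbf{v}_n) \times \mathrm{diag}\left(|b_0|^{-2}, |b_1|^{-2}, \ldots, |b_N|^{-2}\right) \\
\mathbf{v}_n &= \frac{1}{4\pi}\, [1, 3, 5, \ldots, 2N + 1]^{T} \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}(\mathbf{v}_n) \\
\mathbf{B}_i &= \mathbf{v}_n(\Theta_i)\, \mathbf{v}_n^{H}(\Theta_i) \\
\mathbf{v}_n(\Theta_i) &= \frac{1}{4\pi} \left[P_0(\cos\Theta_i), 3P_1(\cos\Theta_i), \ldots, (2N + 1)P_N(\cos\Theta_i)\right]^{T},
\end{aligned} \tag{6.82}
\]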
with Θi denoting the angle between the array look direction and (θi , φi ).
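To make the discrete side-lobe constraints concrete, the following Python sketch (assuming NumPy; the helper name `steering` is hypothetical) forms $\mathbf{v}_n(\Theta_i)$ of Eq. (6.82) on a grid of side-lobe angles and evaluates the quadratic forms $\mathbf{d}_n^H \mathbf{B}_i \mathbf{d}_n$ for the maximum-directivity weights of order N = 4. The grid of 50 angles in [60°, 180°] and the −30 dB limit are chosen for illustration. The sketch only evaluates the constraints; it does not solve the optimization problem.

```python
import numpy as np
from numpy.polynomial import legendre


def steering(N, Theta):
    """v_n(Theta) of Eq. (6.82); Theta is the angle from the look direction (rad)."""
    n = np.arange(N + 1)
    Pn = np.array([legendre.legval(np.cos(Theta), [0] * k + [1]) for k in n])
    return (2 * n + 1) * Pn / (4 * np.pi)


N = 4
n = np.arange(N + 1)
v = (2 * n + 1) / (4 * np.pi)                # v_n
B = np.diag(v) / (4 * np.pi)                 # B of Eq. (6.82)
d = np.linalg.solve(B, v)
d /= v @ d                                   # maximum-directivity weights, d^T v_n = 1

l_SL = 10 ** (-30 / 20)                      # -30 dB side-lobe limit
Thetas = np.deg2rad(np.linspace(60, 180, 50))
levels = np.array([d @ np.outer(steering(N, T), steering(N, T)) @ d
                   for T in Thetas])         # d^H B_i d = |y(Theta_i)|^2
violated = levels > l_SL ** 2                # True where a constraint is not met
```

For these weights the beam pattern at Θ = 180° equals (1 − 3 + 5 − 7 + 9)/25 = 0.2, that is about −14 dB, so the −30 dB limit is violated over part of the grid. This is why adding the side-lobe constraints changes the solution and reduces the achievable directivity.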
Design examples using the multiple-objective method are presented next. Con-
sider a spherical microphone array with 36 microphones nearly-uniformly distributed
on the surface of a rigid sphere. The array order is N = 4, operating at kr = 2.
Fig. 6.9 The magnitude of the beam pattern, |y(Θ)|, for an axis-symmetric spherical array designed
to maximize the directivity factor, while maintaining a constraint of WNGmin = 10. The design
achieves DF = 19.5, while maintaining exactly the WNG constraint WNGmin = 10 and achieving
a maximum side-lobe level of −18.4 dB
Axis-symmetric beamformers are designed for this array. Table 6.2 shows that the
maximum directivity beamformer achieved DF = 25 with WNG = 1.6, while the
maximum WNG beamformer achieved WNG = 51.5 with DF = 9.35.
The optimization problems in Eqs. (6.71) and (6.81) are used in the design of
two beamformers. In both designs, a WNG constraint of WNGmin = 10 is desired,
and a distortionless-response constraint is also introduced. In the first design, using
Eq. (6.71), the directivity factor is maximized while maintaining the two constraints.
In the second design, using Eq. (6.81), an additional constraint of side-lobe level of
−30 dB, or $l_{SL}^{2} = 0.001$, is introduced, within the side-lobe range of θ ∈ [60°, 180°].
The side-lobe range was sampled by I = 50 uniformly distributed samples, each
defining an individual constraint.
Figure 6.9 shows the magnitude of the beam pattern for the first design. Both the
WNG and the distortionless-response constraints are maintained. Due to the WNG
constraint, the directivity factor achieved (DF = 19.5) is smaller than the maximum
achievable of DF = 25. A maximum side-lobe level of −18.4 dB is achieved for this
design.
The aim of the second design is to reduce the maximum side-lobe level, while
maintaining the same WNG constraint and maximizing the directivity factor. A maximum
side-lobe level constraint of −30 dB is introduced using the formulation in
Eq. (6.81) with $l_{SL}^{2} = 0.001$.
Figure 6.10 shows the magnitude of the beam pattern for the second design. The
WNG constraint is maintained with WNG = 10 and the maximum side-lobe level
Fig. 6.10 The magnitude of the beam pattern, |y(Θ)|, for an axis-symmetric spherical array
designed to maximize the directivity factor, while maintaining a constraint of WNGmin = 10 and
maximum side-lobe level of −30 dB. The design achieves DF = 18.2, while maintaining exactly
the WNG constraint and the side-lobe level constraints
constraint is maintained at −30 dB. Due to the introduction of the side-lobe level
constraint, the directivity factor is further reduced to DF = 18.2.
These design examples demonstrated the flexibility of the multiple-objective
approach with a numerical optimization solution, providing beamformer design with
a high level of detail in the specification of performance.
References
1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic Press, San
Diego (2001)
2. Born, M., Wolf, E.: Principles of Optics: Electromagnetic Theory of Propagation, Interference
and Diffraction of Light, 7th edn. Cambridge University Press, Cambridge (1999)
3. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge
(2004)
4. Elko, G.W.: Differential microphone arrays. In: Huang, Y., Benesty, J. (eds.) Audio Signal Pro-
cessing for Next-Generation Multimedia Communication Systems, pp. 11–89. Kluwer Aca-
demic Publishers, Boston (2004)
5. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University
Press, Baltimore (1996)
6. Huang, Y., Benesty, J. (eds.): Audio Signal Processing for Multimedia Communication Sys-
tems. Kluwer Academic Publishers, Boston (2004)
7. Koretz, A., Rafaely, B.: Dolph-Chebyshev beampattern design for spherical arrays. IEEE Trans.
Signal Process. 57(6), 2417–2420 (2009)
8. Li, Z., Duraiswami, R.: Flexible and optimal design of spherical microphone arrays for beam-
forming. IEEE Trans. Audio Speech Lang. Process. 15(2), 702–714 (2007)
9. Osnaga, S.M.: On rank one matrices and invariant subspaces. Balk. J. Geom. Appl. 10(1),
145–148 (2005)
10. Peled, Y., Rafaely, B.: Objective performance analysis of spherical microphone arrays for
speech enhancement in rooms. J. Acoust. Soc. Am. 132(3), 1473–1481 (2012)
11. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution.
J. Acoust. Soc. Am. 116(4), 2149–2157 (2004)
12. Rafaely, B.: Phase-mode versus delay-and-sum spherical microphone array processing. IEEE
Signal Process. Lett. 12(10), 713–716 (2005)
13. Sun, H., Yan, S., Svensson, U.P.: Robust minimum sidelobe beamforming for spherical micro-
phone arrays. IEEE Trans. Speech Audio Process. 19(4), 1045–1051 (2011)
14. Van Trees, H.L.: Optimum Array Processing (Detection, Estimation, and Modulation Theory,
Part IV), 1st edn. Wiley, New York (2002)
15. Yan, S., Sun, H., Svensson, U.P., Xiaochuan, M., Hovem, J.M.: Optimal modal beamforming
for spherical microphone arrays. IEEE Trans. Speech Audio Process. 19(2), 361–371 (2011)
Chapter 7
Beamforming with Noise Minimization
Array equations in the space domain were developed in Sect. 5.1. Typical equations
for array processing also include the effect of noise and disturbing sources [6] and
so, in this section, the equations developed in Sect. 5.1 are extended to include noise.
The sound pressure at the microphones, denoted by p, is now replaced by x, which
includes noise:
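\[
\mathbf{x} = \mathbf{p} + \mathbf{n}, \tag{7.1}
\]
where
\[
\mathbf{p} = \left[p_1(k), p_2(k), \ldots, p_Q(k)\right]^{T} \tag{7.2}
\]
represents the sound pressure at the Q sensors due to the desired sources and, similarly,
\[
\mathbf{n} = \left[n_1(k), n_2(k), \ldots, n_Q(k)\right]^{T} \tag{7.3}
\]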
represents the noise at the sensors. The array output can now be formulated by
applying array coefficients to the array input:
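\[
y = \mathbf{w}^{H} \mathbf{x}. \tag{7.4}
\]
The variance of the array output (assuming zero mean, see discussion below) can
now be computed as follows:
\[
E\left[|y|^{2}\right] = E\left[\mathbf{w}^{H} \mathbf{x}\mathbf{x}^{H} \mathbf{w}\right] = \mathbf{w}^{H} \mathbf{S}_{xx} \mathbf{w}, \tag{7.5}
\]
where
\[
\mathbf{S}_{xx} = E\left[\mathbf{x}\mathbf{x}^{H}\right] \tag{7.6}
\]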
is the spatial spectral matrix of the array input, in which each element represents
the cross-spectral density at wave number k between the signals at two sensors.
Substituting Eq. (7.1) into Eq. (7.6), the spatial cross-spectral density matrix of the
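array input can be written as
\[
\mathbf{S}_{xx} = \mathbf{S}_{pp} + \mathbf{S}_{pn} + \mathbf{S}_{np} + \mathbf{S}_{nn}, \tag{7.7}
\]
with
\[
\mathbf{S}_{pp} = E\left[\mathbf{p}\mathbf{p}^{H}\right] \tag{7.8}
\]
and
\[
\mathbf{S}_{nn} = E\left[\mathbf{n}\mathbf{n}^{H}\right], \tag{7.9}
\]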
representing the spatial cross-spectrum matrices due to the desired pressure signal
and the noise signal, respectively, and Spn , Snp , representing the cross-spectrum
matrices between the signal and the noise. It is common to assume that the desired
pressure signal and the noise signal are independent, as they typically originate
from different, independent sources. Furthermore, in most applications of acoustics
no useful information is contained in the constant component of the time-domain
signals and so it can be removed in practice, if different from zero. The zero mean
in the time domain is transformed into a zero mean in the frequency domain, such
that E[p] = E[n] = 0, where 0 is the zero vector. Therefore, independence between
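the desired pressure signal and the noise signal, i.e. $E[\mathbf{p}\mathbf{n}^{H}] = E[\mathbf{p}] \cdot E[\mathbf{n}]^{H}$, leads
to a zero cross-spectral density between the desired pressure and the noise signals,
$\mathbf{S}_{pn} = \mathbf{S}_{np} = \mathbf{0}$, in this case and Eq. (7.7) is rewritten as
\[
\mathbf{S}_{xx} = \mathbf{S}_{pp} + \mathbf{S}_{nn}. \tag{7.10}
\]
For sensor noise that is independent and identically distributed across the sensors,
the noise cross-spectrum matrix is
\[
\mathbf{S}_{nn} = \sigma_n^{2} \mathbf{I}, \tag{7.11}
\]
where $\mathbf{I}$ is a $Q \times Q$ unit matrix and $\sigma_n^{2}$ is the variance of the sensor noise.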
Another common noise model is due to acoustic noise in the form of a diffuse
sound field. This may represent an environment with a large number of sources
distributed in all directions, e.g. a hall occupied by many speakers, or a highly rever-
berant environment in which the sound field due to late reflections tends to be diffuse
[4]. A diffuse sound field is composed of an infinite number of plane waves having
amplitudes with equal magnitude and random phases, arriving from all directions
with equal distribution. Denoting the sound pressure at the qth microphone due to
the diffuse sound field by n q , the array input can be written in a manner identical
to Eq. (7.1). The spatial cross-spectrum matrix of the noise term in this case, Snn , is
composed of the spatial cross-spectrum between microphone pairs. When the micro-
phones are positioned in an open-sphere configuration, the spatial correlation is given
by [3]
\[
E\left[n_q n_{q'}^{*}\right] = \sigma_n^{2}\, \mathrm{sinc}(k r_{qq'}), \tag{7.12}
\]
where $r_{qq'}$ is the distance between microphone $q$ and microphone $q'$; the expectation
operation in this case represents averaging over different realizations of diffuse
fields. For example, consider equal-angle sampling with $4(N + 1)^{2}$ microphones; the
distance between adjacent microphones on the equator is approximated by dividing
the circumference $2\pi r$ by the number of microphones on the equator, $2(N + 1)$. At
the highest operating frequency of the array, satisfying $kr \approx N$, the distance between
adjacent microphones satisfies
\[
k r_{qq'} \approx \frac{k 2\pi r}{2(N + 1)} \approx \pi \frac{N}{N + 1} \approx \pi. \tag{7.13}
\]
Because $\mathrm{sinc}(\pi) = 0$ is the first zero of the sinc function, the correlation between
adjacent microphones is near zero in this case. For microphone pairs that are not
adjacent the distance is larger and the sinc function oscillates, while converging to
zero for $k r_{qq'} \gg \pi$. In this case, matrix $\mathbf{S}_{nn}$ will have larger terms on the diagonal
[where $\mathrm{sinc}(0) = 1$] and terms with a generally decreasing magnitude off the diagonal.
At low frequencies, $k r_{qq'} \ll \pi$ and $\mathrm{sinc}(k r_{qq'}) \to 1$. In this case, matrix $\mathbf{S}_{nn}$ will
have all its elements close to $\sigma_n^{2}$ and will be close to rank one. It is therefore clear that
matrix $\mathbf{S}_{nn}$ for the case of acoustic noise in the form of a diffuse field may change
its characteristics considerably as a function of the operating frequency.
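The frequency dependence of the diffuse-field noise matrix can be illustrated with a short Python sketch based on Eq. (7.12). It assumes NumPy and an open-sphere configuration; the function name `diffuse_noise_matrix`, the six-microphone octahedral layout, and the radius are hypothetical choices made only for illustration. Note that `np.sinc(x)` is defined as sin(πx)/(πx), so the argument is divided by π to obtain sin(x)/x as used in the text.

```python
import numpy as np


def diffuse_noise_matrix(positions, k, sigma2=1.0):
    """S_nn of Eq. (7.12) for free-field microphones at `positions` (Q x 3)."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)               # r_{qq'}
    # np.sinc(x) = sin(pi x) / (pi x), hence the division by pi
    return sigma2 * np.sinc(k * dist / np.pi)


r = 0.05                                               # hypothetical radius in metres
positions = r * np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                          [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)

S_low = diffuse_noise_matrix(positions, k=0.01 / r)    # kr << pi
S_high = diffuse_noise_matrix(positions, k=np.pi / (2 * r))  # k * 2r = pi

sv_low = np.linalg.svd(S_low, compute_uv=False)        # singular values at low frequency
```

At the low frequency all elements of `S_low` are close to σn² and one singular value dominates, while at the higher frequency the correlation between the two opposite microphones, separated by 2r, falls on the first zero of the sinc function.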
For an array configured around a rigid sphere, the effect of scattering from the rigid
sphere is to slightly reduce the correlation values, so that the correlation function is
sinc-like and slightly compressed along the axis of the argument kr [2].
Array equations in the spherical harmonics domain were presented in Sect. 5.1.
In this section these equations are extended to include the effect of noise. The array
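input can therefore be rewritten to include both sound pressure and noise:
\[
\mathbf{x}_{nm} = \mathbf{p}_{nm} + \mathbf{n}_{nm}, \tag{7.14}
\]
where
\[
\mathbf{p}_{nm} = \left[p_{00}(k), p_{1(-1)}(k), p_{10}(k), p_{11}(k), \ldots, p_{NN}(k)\right]^{T} \tag{7.15}
\]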
\[
y = \mathbf{w}_{nm}^{H} \mathbf{x}_{nm}, \tag{7.17}
\]
with $\mathbf{w}_{nm}$ representing the $(N + 1)^{2} \times 1$ vector of array coefficients in the
spherical harmonics domain, defined as in Eq. (5.10):
\[
\mathbf{w}_{nm} = \left[w_{00}(k), w_{1(-1)}(k), w_{10}(k), w_{11}(k), \ldots, w_{NN}(k)\right]^{T}. \tag{7.18}
\]
Similar to the space domain formulation, Eq. (7.5), the variance of the array output
can be formulated as
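\[
E\left[|y|^{2}\right] = E\left[\mathbf{w}_{nm}^{H} \mathbf{x}_{nm} \mathbf{x}_{nm}^{H} \mathbf{w}_{nm}\right] = \mathbf{w}_{nm}^{H} \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} \mathbf{w}_{nm}, \tag{7.19}
\]
where
\[
\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} = E\left[\mathbf{x}_{nm} \mathbf{x}_{nm}^{H}\right] \tag{7.20}
\]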
with
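\[
\mathbf{S}_{\mathbf{p}_{nm}\mathbf{p}_{nm}} = E\left[\mathbf{p}_{nm} \mathbf{p}_{nm}^{H}\right] \tag{7.22}
\]
and
\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = E\left[\mathbf{n}_{nm} \mathbf{n}_{nm}^{H}\right], \tag{7.23}
\]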
representing the cross-spectrum matrices due to the desired pressure signal and the
noise signal, respectively, in the spherical harmonics domain.
When the noise at the array input is due to sensor noise and using the discrete
formulation of the spherical Fourier transform, as in Eq. (3.40), sensor noise in the
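spherical harmonics domain can be written as
\[
\mathbf{n}_{nm} = \mathbf{S}\mathbf{n}, \tag{7.24}
\]
with matrix $\mathbf{S}$ dependent on the sampling scheme (see Sect. 3.6). This leads to
\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = E\left[\mathbf{S}\mathbf{n}\mathbf{n}^{H}\mathbf{S}^{H}\right] = \sigma_n^{2}\, \mathbf{S}\mathbf{S}^{H}, \tag{7.25}
\]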
where it has been assumed that the noise is independent and identically distributed
such that Eq. (7.11) holds. In this case the spatial cross-spectrum matrix of the noise
depends on the sampling scheme. In the special case of uniform or nearly-uniform
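sampling, $\mathbf{S} = \frac{4\pi}{Q} \mathbf{Y}^{H}$ [see Eqs. (3.43) and (3.39)] such that
\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^{2}\, \frac{4\pi}{Q} \mathbf{Y}^{H} \mathbf{Y} \frac{4\pi}{Q} = \sigma_n^{2}\, \frac{4\pi}{Q}\, \mathbf{I}. \tag{7.26}
\]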
In this case the cross-spectrum matrix is proportional to a unit matrix, similar to the
space domain formulation.
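The result in Eq. (7.26) can be checked numerically on a small example. The Python sketch below, assuming NumPy, uses the six vertices of an octahedron as a uniform sampling scheme and order N = 1; both are assumptions chosen for brevity and are unrelated to the arrays used elsewhere in this chapter. The spherical harmonics of orders 0 and 1 are written explicitly in Cartesian form, with the columns of Y ordered as (n, m) = (0, 0), (1, −1), (1, 0), (1, 1).

```python
import numpy as np

# Octahedron vertices: a uniform sampling of the unit sphere with Q = 6 points
xyz = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
x, y, z = xyz.T
Q = len(xyz)

# Spherical harmonics up to N = 1, one row per sample, one column per (n, m)
Y = np.column_stack([
    np.full(Q, 1 / np.sqrt(4 * np.pi)),             # Y_0^0
    np.sqrt(3 / (8 * np.pi)) * (x - 1j * y),        # Y_1^{-1}
    np.sqrt(3 / (4 * np.pi)) * z,                   # Y_1^0
    -np.sqrt(3 / (8 * np.pi)) * (x + 1j * y),       # Y_1^1
])

sigma2 = 0.1                                        # sensor-noise variance
S = (4 * np.pi / Q) * Y.conj().T                    # S = (4 pi / Q) Y^H
S_noise = sigma2 * S @ S.conj().T                   # sigma_n^2 S S^H
```

For this sampling set `S_noise` equals σn² (4π/Q) I to within rounding, in agreement with Eq. (7.26). For sampling schemes that are only nearly uniform, the equality holds approximately.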
In the case where the noise originates from a diffuse sound field, $n_{nm}(k)$ can be
represented in a manner similar to Eq. (2.63) as
\[
n_{nm}(k) = b_n(kr)\, a_{nm}(k) = b_n(kr) \int_{0}^{2\pi}\int_{0}^{\pi} a(k, \theta_k, \phi_k) \left[Y_n^{m}(\theta_k, \phi_k)\right]^{*} \sin\theta_k \, d\theta_k \, d\phi_k. \tag{7.27}
\]
In this case, the integral represents a continuum of plane waves, or an infinite num-
ber of plane waves, in which case a(k, θk , φk ) is the plane-wave amplitude density
function. For a diffuse sound field it is assumed that a(k, θk , φk ) has unit or equal
magnitude in all directions, with random phases, which defines a white noise process
along (θk , φk ) satisfying
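\[
E\left[a(k, \theta_k, \phi_k)\, a(k, \theta_k', \phi_k')^{*}\right] = \sigma_n^{2}\, \delta(\cos\theta_k - \cos\theta_k')\, \delta(\phi_k - \phi_k'). \tag{7.28}
\]
Now, $E[n_{nm} n_{n'm'}^{*}]$ can be derived using Eqs. (7.27) and (7.28) and the orthogonality
property of the spherical harmonics:
\[
\begin{aligned}
E\left[n_{nm} n_{n'm'}^{*}\right] &= \int_{0}^{2\pi}\int_{0}^{\pi}\int_{0}^{2\pi}\int_{0}^{\pi} b_n(kr) \left[b_{n'}(kr)\right]^{*} E\left[a(k, \theta_k, \phi_k) \left[a(k, \theta_k', \phi_k')\right]^{*}\right] \\
&\quad \times \left[Y_n^{m}(\theta_k, \phi_k)\right]^{*} Y_{n'}^{m'}(\theta_k', \phi_k') \sin\theta_k \, d\theta_k \, d\phi_k \sin\theta_k' \, d\theta_k' \, d\phi_k' \\
&= b_n(kr) \left[b_{n'}(kr)\right]^{*} \sigma_n^{2} \int_{0}^{2\pi}\int_{0}^{\pi} \left[Y_n^{m}(\theta_k, \phi_k)\right]^{*} Y_{n'}^{m'}(\theta_k, \phi_k) \sin\theta_k \, d\theta_k \, d\phi_k \\
&= \sigma_n^{2}\, |b_n(kr)|^{2}\, \delta_{nn'} \delta_{mm'}.
\end{aligned} \tag{7.29}
\]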
This result shows that the noise due to a diffuse sound field is uncorrelated in the
spherical harmonics domain [7]. Written in a matrix form, this is
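\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = E\left[\mathbf{n}_{nm} \mathbf{n}_{nm}^{H}\right] = \sigma_n^{2}\, \mathbf{B}^{H}\mathbf{B}, \tag{7.30}
\]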
The array equation, Eq. (7.14), can also be rewritten with multiplication by the
inverse of matrix B:
In this form, the desired signal is anm , which is the plane-wave amplitude density
function in the spherical harmonics domain, satisfying anm = pnm /bn [see Eq. (2.63)].
The cross-spectrum matrix of the array input has a simple form in this case:
This form is particularly useful, because in the case of a diffuse sound field the noise
term is a scaled unit matrix, or spatially white.
Optimal beamformers have been discussed in Chap. 6. In particular, Sect. 6.1 pre-
sented beam patterns that are optimal in attenuating noise due to a diffuse sound
field. However, when the noise field is not perfectly diffuse, this maximum directiv-
ity beam pattern is no longer optimal. In this case an optimal beam pattern, tailored
to the actual measured noise, can be designed. One such design is the minimum
variance distortionless response (MVDR), where the beam pattern is constrained to
be unity in the look direction, while minimizing the variance of the array output.
This beamformer is particularly useful when the desired signal is a plane wave arriv-
ing from the array look direction, with all other contributions to the array output
considered as noise and, therefore, to be minimized.
Consider a desired signal s(k), originating from a distant source at direction
(θk , φk ); the source generates a plane wave at the array position, with a steering
vector v denoting the transfer function from the source s(k) to the array input. The
array also measures noise, such that array input can be written in a manner similar
to Eq. (7.1):
\[
\mathbf{x} = \mathbf{p} + \mathbf{n} = \mathbf{v}s + \mathbf{n}, \tag{7.34}
\]
where the dependence of s(k) on k has been removed for simplicity and with p and n
denoting the desired pressure signal and noise at the sensors, respectively. Applying
beamforming, as in Eq. (7.4), the variance of the signal at the array output is given
by
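\[
\begin{aligned}
E\left[|y|^{2}\right] &= \mathbf{w}^{H} \mathbf{S}_{xx} \mathbf{w} \\
&= \mathbf{w}^{H} \mathbf{S}_{pp} \mathbf{w} + \mathbf{w}^{H} \mathbf{S}_{nn} \mathbf{w} \\
&= \left|\mathbf{w}^{H} \mathbf{v}\right|^{2} E\left[|s|^{2}\right] + \mathbf{w}^{H} \mathbf{S}_{nn} \mathbf{w}.
\end{aligned} \tag{7.35}
\]
\[
\begin{aligned}
\underset{\mathbf{w}}{\text{minimize}} \quad & \mathbf{w}^{H} \mathbf{S}_{xx} \mathbf{w} \\
\text{subject to} \quad & \mathbf{w}^{H} \mathbf{v} = 1.
\end{aligned} \tag{7.36}
\]
\[
\mathbf{w}^{H} = \frac{\mathbf{v}^{H} \mathbf{S}_{xx}^{-1}}{\mathbf{v}^{H} \mathbf{S}_{xx}^{-1} \mathbf{v}}. \tag{7.37}
\]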
The optimal solution requires the inversion of Sxx , so that this matrix has to be of
full rank. With a desired signal composed of a single plane wave, Spp has unit rank,
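and so the inversion of $\mathbf{S}_{xx}$ requires $\mathbf{S}_{nn}$ to be of full rank or nearly full rank. It is
important to note that the beamformer described here is sometimes referred to as the
minimum power distortionless response (MPDR) beamformer. Under that naming, the
MVDR beamformer is the same as in Eq. (7.37), with $\mathbf{S}_{xx}^{-1}$ replaced by $\mathbf{S}_{nn}^{-1}$:
\[
\begin{aligned}
\underset{\mathbf{w}}{\text{minimize}} \quad & \mathbf{w}^{H} \mathbf{S}_{nn} \mathbf{w} \\
\text{subject to} \quad & \mathbf{w}^{H} \mathbf{v} = 1,
\end{aligned} \tag{7.38}
\]
with a solution
\[
\mathbf{w}^{H} = \frac{\mathbf{v}^{H} \mathbf{S}_{nn}^{-1}}{\mathbf{v}^{H} \mathbf{S}_{nn}^{-1} \mathbf{v}}. \tag{7.39}
\]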
In the context of this section, with a single plane-wave sound field for the desired
signal and a distortionless response in the same direction, the two forms are equiv-
alent. However, when the desired signal has additional components, for example
due to a reflection from a wall in a room, minimization of Sxx may lead to sig-
nal cancellation, i.e. the reflection component cancels the desired signal from the
look direction, even when the distortionless-response constraint is maintained (see
Sect. 7.4 for examples and further discussion). This can be avoided by directly mini-
mizing Snn , although an estimate of Snn may not always be available separately from
the desired signal.
In the special case of sensor noise, substituting Eq. (7.11), $\mathbf{S}_{nn} = \sigma_n^{2} \mathbf{I}$, leads to
\[
\mathbf{w} = \frac{\mathbf{v}}{\mathbf{v}^{H} \mathbf{v}}. \tag{7.40}
\]
For sensors in free field, with the steering vector v composed of complex expo-
nentials [see Eq. (5.6)] the solution reduces to that of a delay-and-sum beamformer
(see Sect. 5.5) or a maximum WNG beamformer (see Sect. 6.2) formulated in the
space domain [6]. Indeed, the MVDR beamformer in this case maximizes the signal
to sensor-noise ratio.
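The closed-form solutions of Eqs. (7.37), (7.39) and (7.40) can be illustrated with the following Python sketch, assuming NumPy. The function name `mvdr_weights`, the randomly generated steering vector and the randomly generated noise matrix are hypothetical stand-ins rather than a model of a particular array. The sketch also checks numerically that, for a single plane wave with a distortionless constraint in its direction, using $\mathbf{S}_{xx}$ or $\mathbf{S}_{nn}$ gives the same weights.

```python
import numpy as np


def mvdr_weights(S, v):
    """w = S^{-1} v / (v^H S^{-1} v), the column form of Eqs. (7.37) and (7.39)."""
    Sinv_v = np.linalg.solve(S, v)
    return Sinv_v / (v.conj() @ Sinv_v)


rng = np.random.default_rng(0)
Q = 8
v = np.exp(1j * rng.uniform(0, 2 * np.pi, Q))       # hypothetical steering vector
G = rng.standard_normal((Q, Q)) + 1j * rng.standard_normal((Q, Q))
S_nn = G @ G.conj().T + np.eye(Q)                   # arbitrary full-rank noise matrix
S_xx = 2.0 * np.outer(v, v.conj()) + S_nn           # single plane wave plus noise

w_mvdr = mvdr_weights(S_nn, v)                      # Eq. (7.39)
w_mpdr = mvdr_weights(S_xx, v)                      # Eq. (7.37)
w_white = mvdr_weights(0.1 * np.eye(Q), v)          # sensor noise only, Eq. (7.40)
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly, which is generally preferable numerically. The equality of `w_mvdr` and `w_mpdr` relies on the signal term being exactly rank one along v; it does not hold when the desired signal has additional components, such as reflections.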
The MVDR beamformer can also be formulated in the spherical harmonics
domain, following the array equations developed in the spherical harmonics domain
in Sect. 7.1. Starting from Eq. (7.14), and using a steering-vector notation as in
Eq. (7.34), the equation can be written as
Now, following the derivation in Eq. (7.35), the MVDR optimization problem can
be written in the spherical harmonics domain in a way similar to Eq. (7.36) as
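\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mathbf{w}_{nm}^{H} \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} \mathbf{w}_{nm} \\
\text{subject to} \quad & \mathbf{w}_{nm}^{H} \mathbf{v}_{nm} = 1.
\end{aligned} \tag{7.42}
\]
Similar to Eq. (7.37), a solution can be written for the spherical harmonics
beamforming coefficients:
\[
\mathbf{w}_{nm}^{H} = \frac{\mathbf{v}_{nm}^{H} \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1}}{\mathbf{v}_{nm}^{H} \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1} \mathbf{v}_{nm}}. \tag{7.43}
\]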
In the case of sensor noise and a spherical array with a nearly-uniform sampling
scheme configuration, the spatial cross-spectrum matrix of the noise is proportional
to a unit matrix [see Eq. (7.26)] and the solution in this case becomes
\[
\mathbf{w}_{nm}^{H} = \frac{\mathbf{v}_{nm}^{H}}{\mathbf{v}_{nm}^{H} \mathbf{v}_{nm}}. \tag{7.45}
\]
This result is the same as the maximum WNG beamformer [see Eq. (6.25)], showing
a similar behavior to the space domain formulation.
In the case of noise generated by a diffuse sound field, and using the formulation
as in Eq. (7.32), the spatial cross-spectrum matrix of the noise is proportional to a
unit matrix. A solution in the form of Eq. (7.45) can be written as
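\[
\tilde{\mathbf{w}}_{nm}^{H} = \frac{\tilde{\mathbf{v}}_{nm}^{H}}{\tilde{\mathbf{v}}_{nm}^{H} \tilde{\mathbf{v}}_{nm}}, \tag{7.46}
\]
\[
\left[w_{nm}(k)\right]^{*} = \frac{4\pi}{(N + 1)^{2}}\, \frac{1}{b_n(kr)}\, Y_n^{m}(\theta_k, \phi_k), \tag{7.48}
\]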
which is equivalent to the maximum directivity beamformer [see Eq. (6.9)], devel-
oped in Sect. 6.1. Indeed, the maximum directivity beamformer maximizes the SNR
in the case where the noise originates from a diffuse field, arriving equally from all
directions.
Examples of beam patterns designed using the MVDR method are presented in
this section. Consider a spherical microphone array designed around a rigid sphere,
operating at kr = N , with N = 4. The array is composed of Q = 36 microphones
arranged nearly-uniformly, with sensor noise assumed to be spatially uncorrelated
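and with variance $\sigma_n^{2} = 0.1$. In this case, $\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}$ due to the sensor noise can be
written as in Eq. (7.26):
\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^{2}\, \frac{4\pi}{Q}\, \mathbf{I}. \tag{7.49}
\]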
The desired signal is assumed to propagate with a plane wave arriving from direction
$(\theta_0, \phi_0) = (60^{\circ}, 36^{\circ})$, having a variance of $\sigma_0^{2} = 1$ at the operating frequency. As the
desired signal and noise are assumed uncorrelated, the solution for the beamforming
weights in the spherical harmonics domain can be calculated from Eq. (7.45), having
a maximum WNG beam pattern. The resulting beam pattern is then calculated using
wnm and Eq. (5.12) as
y(θ, φ) = wnm
H
vnm (θ, φ)
N
n
∗
= [wnm (k)]∗ bn (kr ) Ynm (θ, φ) . (7.50)
n=0 m=−n
Figure 7.1 shows the magnitude of the beam pattern for this example. The contour
plot shows that the main lobe is directed at the desired signal, marked by the “+” sign,
while the balloon plot illustrates that the beam pattern is symmetric around the look
direction axis, as expected from the maximum WNG beamformer (see Sect. 6.2).
In the second part of this example, a disturbance is added to the noise signal
in the form of a plane wave arriving from direction $(\theta_1, \phi_1) = (60^\circ, 320^\circ)$, with a
disturbance signal uncorrelated to the desired signal and to the sensor noise signal,
having a variance of $\sigma_1^2 = 0.5$. The spatial spectrum matrix of the noise for this
example, formulated in the spherical harmonics domain, can be written as
\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^2\,\frac{4\pi}{Q}\,\mathbf{I} + \sigma_1^2\,\mathbf{v}_{nm1}\mathbf{v}_{nm1}^{H}, \tag{7.51}
\]
where vnm1 is the steering vector in the direction of the disturbance. The optimal
beamforming weights for this example are given by Eq. (7.44) and the resulting
beam pattern by Eq. (7.50). Figure 7.2 illustrates the magnitude of the beam pattern
for this example. The main lobe is directed at the desired signal, as in the first
example. In the direction of the disturbance signal, marked by the dark “+” sign,
the beam pattern has low magnitude, as expected if the array output due to Snnm nnm
is to be minimized. It is interesting to note, by comparing the balloon plots of Figs.
7.1 and 7.2, that the first side lobe has been modified and now includes a null in
the direction of the disturbance signal, therefore breaking the axis-symmetry of the
beam pattern around the look direction. This example illustrates the advantage of the
MVDR beamformer: the ability to shape the beam pattern to account for uncorrelated
disturbances in the sound field.

Fig. 7.1 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer with sensor
noise. Upper: contour plot, the arrival direction of the plane wave holding the desired signal is
marked by the white “+”. Lower: balloon plot. In this plot cyan (green-blue) color shades represent
positive values of Re{y(θ, φ)}, while magenta (purple-red) color shades represent negative values
of Re{y(θ, φ)}
Fig. 7.2 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer with sensor
noise and a single disturbance. Upper: contour plot, the arrival direction of the plane wave holding
the desired signal is marked by the white “+” and the arrival direction of the plane wave holding the
disturbance signal is marked by the dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1
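The effect of an uncorrelated disturbance on the MVDR weights can be sketched numerically. The example below is an illustration under simplifying assumptions, not the book's implementation: the helper names `sh_vector` and `mvdr_weights` are hypothetical, and the radial terms are set to $b_n(kr) = 1$ instead of the rigid-sphere values, so the numbers will differ from those behind Figs. 7.1 and 7.2. The directions, $Q$, $\sigma_n^2$ and $\sigma_1^2$ follow the text.

```python
import numpy as np
from math import factorial, pi

def sh_vector(N, theta, phi):
    """conj(Y_n^m(theta, phi)) stacked for n = 0..N, m = -n..n (theta measured from the z-axis)."""
    x, s = np.cos(theta), np.sin(theta)
    out = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            a = abs(m)
            # Associated Legendre P_n^a(x) by upward recurrence, Condon-Shortley phase included.
            p_prev, p = 0.0, (-1.0) ** a * np.prod(np.arange(1, 2 * a, 2)) * s ** a
            for l in range(a + 1, n + 1):
                p_prev, p = p, ((2 * l - 1) * x * p - (l + a - 1) * p_prev) / (l - a)
            norm = np.sqrt((2 * n + 1) / (4 * pi) * factorial(n - a) / factorial(n + a))
            y = norm * p * np.exp(1j * a * phi)                  # Y_n^{|m|}
            out.append(np.conj(y) if m >= 0 else (-1) ** a * y)  # conj(Y_n^m)
    return np.array(out)

def mvdr_weights(v, S):
    """w with w^H = v^H S^{-1} / (v^H S^{-1} v), for Hermitian positive-definite S."""
    Sinv_v = np.linalg.solve(S, v)
    return Sinv_v / np.real(v.conj() @ Sinv_v)

N, Q = 4, 36
var_n, var_1 = 0.1, 0.5
v0 = sh_vector(N, np.radians(60), np.radians(36))    # look direction (b_n = 1 assumed)
v1 = sh_vector(N, np.radians(60), np.radians(320))   # disturbance direction

# Noise matrix in the form of Eq. (7.51): sensor noise plus one uncorrelated plane wave.
S_n = var_n * (4 * pi / Q) * np.eye((N + 1) ** 2) + var_1 * np.outer(v1, v1.conj())
w = mvdr_weights(v0, S_n)
w_wng = v0 / np.real(v0.conj() @ v0)                 # maximum WNG weights, Eq. (7.45)

norm_sq = np.real(v0.conj() @ v0)                    # addition theorem: (N + 1)^2 / (4 pi)
resp_look = abs(w.conj() @ v0)                       # |w^H v0|
resp_dist = abs(w.conj() @ v1)                       # |w^H v1| with the disturbance modeled
resp_dist_wng = abs(w_wng.conj() @ v1)               # |w^H v1| for maximum WNG weights
print(resp_look, resp_dist, resp_dist_wng)
```

Comparing `resp_dist` with `resp_dist_wng` shows how including the disturbance in the noise matrix lowers the response in its direction while the look-direction response stays at unity.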
7.4 Example: MVDR with Correlated Disturbance

In this section, the example presented in Sect. 7.3 is further extended to include a
disturbance signal that is correlated to the desired signal. This may occur in practice,
for example, when the disturbance is the result of the desired signal being reflected
from a nearby surface like a wall in a room. At the operating frequency, the disturbance
signal is therefore an attenuated and phase-shifted version of the desired signal.
Denoting by $s_0$ the amplitude of the desired signal at the origin of the coordinate
system, the disturbance signal satisfies $s_1 = A s_0$, where $A$ is a complex constant.
The same spherical array as in Sect. 7.3 is also used in this example, i.e. a
rigid-sphere array with nearly-uniform sampling, $Q = 36$, $N = 4$, and $kr = N$.
The desired signal propagates as a plane wave arriving from $(\theta_0, \phi_0) = (60^\circ, 36^\circ)$
with $\sigma_0^2 = 1$ and the disturbance is another plane wave with arrival direction
$(\theta_1, \phi_1) = (60^\circ, 320^\circ)$, with $\sigma_1^2 = |A|^2 \sigma_0^2$ and $A = 0.8e^{-i\pi/3}$. Sensor noise with
$\sigma_n^2 = 0.1$ is also assumed. The spatial spectrum matrix of the noise, including the
contribution from the disturbance, is given by
\[
\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^2\,\frac{4\pi}{Q}\,\mathbf{I} + \sigma_1^2\,\mathbf{v}_{nm1}\mathbf{v}_{nm1}^{H}. \tag{7.52}
\]
Now, recalling that the disturbance is correlated to the desired signal, the spatial
spectrum matrix of the overall input signal is derived:
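\[
\begin{aligned}
\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} = {} & \sigma_n^2\,\frac{4\pi}{Q}\,\mathbf{I} + \sigma_0^2\,\mathbf{v}_{nm0}\mathbf{v}_{nm0}^{H} + \sigma_1^2\,\mathbf{v}_{nm1}\mathbf{v}_{nm1}^{H} \\
& + A^{*}\sigma_0^2\,\mathbf{v}_{nm0}\mathbf{v}_{nm1}^{H} + A\,\sigma_0^2\,\mathbf{v}_{nm1}\mathbf{v}_{nm0}^{H},
\end{aligned} \tag{7.53}
\]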
Fig. 7.3 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer, as presented
in Eq. (7.44), with sensor noise and a single correlated disturbance. Upper: contour plot, arrival
direction of the plane wave holding the desired signal is marked by the white “+” and the arrival
direction of the plane wave holding the disturbance signal is marked by the dark “+”. Lower:
balloon plot. For color scheme see Fig. 7.1

Fig. 7.4 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer, as presented
in Eq. (7.43), with sensor noise and a single correlated disturbance. Upper: contour plot, arrival
directions of the plane waves holding the desired signal and the disturbance are marked by the white
“+”. Lower: balloon plot. For color scheme see Fig. 7.1

satisfies $|\mathbf{w}_{nm}^{H}\mathbf{v}_{nm1}| = 1.25$, which shows that the disturbance is not attenuated but
even enhanced! Now, because the desired signal and the disturbance are correlated,
the combined contribution of the desired signal and the disturbance at the array output
is given by
\[
\begin{aligned}
|y|^2 &= \left| s_0\,\mathbf{w}_{nm}^{H}\mathbf{v}_{nm0} + s_1\,\mathbf{w}_{nm}^{H}\mathbf{v}_{nm1} \right|^2 \\
&= |s_0|^2 \left| \mathbf{w}_{nm}^{H}\mathbf{v}_{nm0} + A\,\mathbf{w}_{nm}^{H}\mathbf{v}_{nm1} \right|^2 \\
&= 7.5 \times 10^{-6}\,|s_0|^2,
\end{aligned} \tag{7.54}
\]
showing that the overall signal at the array output has been attenuated by more than
50 dB. This phenomenon is referred to as signal cancellation [6], where instead of
keeping the desired signal unchanged and attenuating the disturbance, the beamformer
satisfies the distortionless-response constraint in the look direction, but then
employs the correlated disturbance to cancel the desired signal through the
minimization of the array output due to $\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}$, which includes contributions from both.
This example shows the limitation of the MVDR method for correlated distur-
bances. One way to overcome this limitation is by designing a null at the direction
of the disturbance through an additional constraint. This is made possible using an
extended method, the LCMV, as detailed in the next section.
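Signal cancellation can be reproduced in a small numerical sketch. The code below is illustrative only and rests on stated assumptions: the helper names are hypothetical and the radial terms are set to $b_n(kr) = 1$, so the values will not match the 1.25 and $7.5 \times 10^{-6}$ quoted for the rigid-sphere example. It builds the matrices of Eqs. (7.52) and (7.53) term by term and compares the output due to the correlated pair of plane waves when the weights are computed from the noise matrix and from the overall input matrix.

```python
import numpy as np
from math import factorial, pi

def sh_vector(N, theta, phi):
    """conj(Y_n^m(theta, phi)) stacked for n = 0..N, m = -n..n (theta measured from the z-axis)."""
    x, s = np.cos(theta), np.sin(theta)
    out = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            a = abs(m)
            # Associated Legendre P_n^a(x) by upward recurrence, Condon-Shortley phase included.
            p_prev, p = 0.0, (-1.0) ** a * np.prod(np.arange(1, 2 * a, 2)) * s ** a
            for l in range(a + 1, n + 1):
                p_prev, p = p, ((2 * l - 1) * x * p - (l + a - 1) * p_prev) / (l - a)
            norm = np.sqrt((2 * n + 1) / (4 * pi) * factorial(n - a) / factorial(n + a))
            y = norm * p * np.exp(1j * a * phi)
            out.append(np.conj(y) if m >= 0 else (-1) ** a * y)
    return np.array(out)

def mvdr_weights(v, S):
    """w with w^H = v^H S^{-1} / (v^H S^{-1} v), for Hermitian positive-definite S."""
    Sinv_v = np.linalg.solve(S, v)
    return Sinv_v / np.real(v.conj() @ Sinv_v)

N, Q = 4, 36
var_n, var_0 = 0.1, 1.0
A = 0.8 * np.exp(-1j * pi / 3)                        # s1 = A * s0
v0 = sh_vector(N, np.radians(60), np.radians(36))     # desired signal (b_n = 1 assumed)
v1 = sh_vector(N, np.radians(60), np.radians(320))    # correlated disturbance

I = np.eye((N + 1) ** 2)
S_n = var_n * (4 * pi / Q) * I + abs(A) ** 2 * var_0 * np.outer(v1, v1.conj())   # Eq. (7.52)
S_x = (S_n + var_0 * np.outer(v0, v0.conj())
       + np.conj(A) * var_0 * np.outer(v0, v1.conj())
       + A * var_0 * np.outer(v1, v0.conj()))                                    # Eq. (7.53)

w_n = mvdr_weights(v0, S_n)          # weights from the noise matrix
w_x = mvdr_weights(v0, S_x)          # weights from the overall input matrix

u = v0 + A * v1                      # desired signal plus disturbance, per unit s0
look_x = abs(w_x.conj() @ v0)        # distortionless constraint still holds
power_n = abs(w_n.conj() @ u) ** 2   # |y|^2 / |s0|^2 with noise-matrix weights
power_x = abs(w_x.conj() @ u) ** 2   # |y|^2 / |s0|^2 with input-matrix weights
print(look_x, power_n, power_x)
```

Because the signal-dependent terms of Eq. (7.53) sum to the rank-one matrix $\sigma_0^2(\mathbf{v}_{nm0} + A\mathbf{v}_{nm1})(\mathbf{v}_{nm0} + A\mathbf{v}_{nm1})^{H}$, minimizing the output due to this matrix suppresses the combined wavefront even though the response towards $\mathbf{v}_{nm0}$ alone remains unity.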
Section 7.2 presented the MVDR design method that aims to minimize the noise
at the array output, while avoiding distortion of the signal by imposing a constraint
in the array look direction. The MVDR method can be extended by introducing
additional constraints to the desired beam pattern. For example, the distortionless-
response constraint, or a similar constraint, can be introduced at directions near the
look direction, thereby improving robustness against errors in the estimation of the
arrival direction of the desired signal. Also, if the noise field is composed of disturbing
sources, then the effect of these can be explicitly removed by constraining the beam
pattern to be zero at these directions. These are referred to as null constraints. In addition,
spatial derivatives of the beam pattern can be employed, for example, with the aim of
controlling the width of the main lobe in the look direction, or the width of the nulls
in the direction of disturbances. The general formulation of the linearly constrained
minimum variance (LCMV) beamformer, incorporating linear constraints within the
beamformer design, is derived in this section, while the following sections present
more specific designs.
The LCMV beamformer is first formulated in the space domain, designed as the
solution to the following optimization problem [6]:
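\[
\begin{aligned}
& \underset{\mathbf{w}}{\text{minimize}} && \mathbf{w}^{H}\,\mathbf{S}_{\mathbf{xx}}\,\mathbf{w} \\
& \text{subject to} && \mathbf{V}^{H}\mathbf{w} = \mathbf{c}.
\end{aligned} \tag{7.55}
\]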
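\[
\mathbf{w}^{H}\mathbf{S}_{\mathbf{xx}} + \boldsymbol{\lambda}^{H}\mathbf{V}^{H} = \mathbf{0}, \tag{7.57}
\]
with $\mathbf{w}$ satisfying
\[
\mathbf{w}^{H} = -\boldsymbol{\lambda}^{H}\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1}. \tag{7.58}
\]
Multiplying from the right by $\mathbf{V}$ and substituting the constraint term in Eq. (7.55), $\boldsymbol{\lambda}$
can be written as
\[
\boldsymbol{\lambda}^{H} = -\mathbf{c}^{H}\left(\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1}\mathbf{V}\right)^{-1}. \tag{7.59}
\]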
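Substituting Eq. (7.59) back into Eq. (7.58) gives the space-domain weights in closed form. The following is a sketch of that step, using only the two equations above and assuming that $\mathbf{S}_{\mathbf{xx}}$ is Hermitian and invertible:

```latex
\mathbf{w}^{H}
  = -\boldsymbol{\lambda}^{H}\,\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1}
  = \mathbf{c}^{H}\left(\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1}\mathbf{V}\right)^{-1}\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1},
\qquad
\mathbf{w}^{H}\mathbf{V}
  = \mathbf{c}^{H}\left(\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1}\mathbf{V}\right)^{-1}\left(\mathbf{V}^{H}\mathbf{S}_{\mathbf{xx}}^{-1}\mathbf{V}\right)
  = \mathbf{c}^{H}.
```

The second relation confirms that the constraint $\mathbf{V}^{H}\mathbf{w} = \mathbf{c}$ is met by construction.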
The solution can be derived in a similar manner to the derivation of the space-
domain solution, Eqs. (7.56)–(7.60), and is given by
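\[
\mathbf{w}_{nm}^{H} = \mathbf{c}^{H}\left(\mathbf{V}_{nm}^{H}\,\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1}\,\mathbf{V}_{nm}\right)^{-1}\mathbf{V}_{nm}^{H}\,\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1}. \tag{7.63}
\]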
The LCMV in the spherical harmonics domain can also be formulated and solved
by replacing Sxnm xnm with Snnm nnm . Furthermore, in the case of sensor noise and a
spherical array with a nearly-uniform sampling scheme, Snnm nnm is proportional to a
unit matrix [see Eq. (7.26)] and the solution becomes
\[
\mathbf{w}_{nm}^{H} = \mathbf{c}^{H}\left(\mathbf{V}_{nm}^{H}\mathbf{V}_{nm}\right)^{-1}\mathbf{V}_{nm}^{H}. \tag{7.64}
\]
The spatial cross-spectrum matrix of the noise is also proportional to a unit matrix,
when using the array equations as shown in Eq. (7.32) and assuming that the noise
signal is generated by a diffuse sound field. The solution in this case becomes
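\[
\tilde{\mathbf{w}}_{nm}^{H} = \mathbf{c}^{H}\left(\tilde{\mathbf{V}}_{nm}^{H}\tilde{\mathbf{V}}_{nm}\right)^{-1}\tilde{\mathbf{V}}_{nm}^{H}, \tag{7.65}
\]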
with Ṽnm = B−1 Vnm [see Eq. (7.32)] and with the columns of Ṽnm equal to Ynm (θ, φ)
for the case where Vnm represent steering vectors.
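The closed-form LCMV solutions of Eqs. (7.63) and (7.64) translate directly into a few lines of linear algebra. The sketch below is a generic illustration, not code from the book: the function name `lcmv_weights` and the random stand-in matrices are assumptions, and any Hermitian positive-definite matrix can play the role of the spatial spectrum matrix.

```python
import numpy as np

def lcmv_weights(V, S, c):
    """Return w with w^H = c^H (V^H S^{-1} V)^{-1} V^H S^{-1}, for Hermitian positive-definite S."""
    Sinv_V = np.linalg.solve(S, V)          # S^{-1} V
    G = V.conj().T @ Sinv_V                 # V^H S^{-1} V
    return Sinv_V @ np.linalg.solve(G, c)

# Hypothetical stand-ins: 25 coefficients ((N + 1)^2 with N = 4) and two constraints.
rng = np.random.default_rng(1)
n_coef, n_con = 25, 2
V = rng.standard_normal((n_coef, n_con)) + 1j * rng.standard_normal((n_coef, n_con))
M = rng.standard_normal((n_coef, n_coef)) + 1j * rng.standard_normal((n_coef, n_coef))
S = M @ M.conj().T + np.eye(n_coef)         # Hermitian positive-definite by construction
c = np.array([1.0, 0.0])                    # unit response for column 0, null for column 1

w = lcmv_weights(V, S, c)
constraint_error = np.max(np.abs(V.conj().T @ w - c))

# With S proportional to a unit matrix the solution reduces to the form of Eq. (7.64).
w_unit = lcmv_weights(V, np.eye(n_coef), c)
w_closed = V @ np.linalg.solve(V.conj().T @ V, c)
unit_case_matches = np.allclose(w_unit, w_closed)
print(constraint_error, unit_case_matches)
```

Using `np.linalg.solve` instead of forming explicit inverses is a design choice made here for numerical robustness; it computes the same weights.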
The steering matrix includes both the direction of the desired signal and the direction
of the null, and is defined by
where vnm0 and vnm1 are the steering vectors corresponding to plane waves arriving
from directions (θ0 , φ0 ) and (θ1 , φ1 ), respectively. The constraint vector is given by
Fig. 7.5 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and a single null constraint. Upper: contour plot, arrival direction of the plane wave holding
the desired signal is marked by the white “+” and direction of the null constraint is marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1
the need to identify the direction-of-arrival of the disturbance, while in the LCMV
approach, the null is achieved by explicitly specifying the null direction in Vnm .
However, the advantage of the LCMV design is that the null is achieved regardless
of the type of disturbance signal, while the MVDR design is significantly degraded
if the disturbance is correlated with the desired signal, due to signal cancellation.
As discussed above, this LCMV design requires the knowledge of the direction-
of-arrival of the disturbance to set the null constraint. In the case where this direction
is estimated inaccurately, it may be advisable to extend the width of the null in the
beam pattern, so that the disturbance is significantly attenuated even if the arrival
direction of the disturbance is slightly different from the null direction. One way
to achieve this is to introduce additional null constraints at directions close to the
original null, as illustrated in the following example.
In addition to the distortionless-response and the null constraints introduced in
the previous example, there are two nulls at directions (θ2 , φ2 ) = (70◦ , 290◦ ) and
(θ3 , φ3 ) = (15◦ , 310◦ ). Matrix Sxnm xnm is defined as in Eq. (7.66), while the steering
matrix Vnm is reconstructed to include the new steering vectors:
and, accordingly,
\[
\mathbf{c} = [1, 0, 0, 0]^{T}. \tag{7.70}
\]
The solution is computed as in Eq. (7.63), and the resulting beam pattern is
computed using Eq. (7.50). Figure 7.6 shows the magnitude of the beam pattern,
also denoting the directions of the three nulls. It is clear that compared with Fig. 7.5,
a wider near-zero response of the beam pattern is achieved around the directions
of the nulls, thereby achieving a wider directional region with low magnitude, as
desired. The corresponding balloon plot is also presented in Fig. 7.6.
In the final example of this section, a single null constraint is used, as in the first
example, but now four distortionless-response constraints are added around the array
look direction. This could be useful to extend the width of the main lobe in a case
where the arrival direction of the desired signal is not known with high accuracy.
In this example, the look direction is (θ0 , φ0 ) = (60◦ , 36◦ ) and four distortionless-
response constraints are added at (60 ± 5◦ , 36 ± 5◦ ). A null constraint is applied, as
before, at (θ1 , φ1 ) = (60◦ , 320◦ ). In this case, the steering matrix Vnm is constructed
as follows:
Vnm = [vnm0 , vnm1 , vnm2 , vnm3 , vnm4 , vnm5 ] , (7.71)
Figure 7.7 shows the magnitude of the beam pattern and also denotes the directions
of all constraints. It is clear that the null constraint is maintained, while the width of
the main lobe is significantly increased compared to Fig. 7.5, showing the ability of
the LCMV to control the main-lobe width by introducing additional constraints.
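Building the steering matrix and constraint vector for several simultaneous constraints can be sketched as follows. This is an illustration under stated assumptions: the helper name `sh_vector` is hypothetical, the radial terms are set to $b_n(kr) = 1$, the noise matrix is taken as proportional to a unit matrix so that the form of Eq. (7.64) applies, and the constraint vector with unit entries for the four additional distortionless-response directions is inferred from the description in the text.

```python
import numpy as np
from math import factorial, pi

def sh_vector(N, theta, phi):
    """conj(Y_n^m(theta, phi)) stacked for n = 0..N, m = -n..n (theta measured from the z-axis)."""
    x, s = np.cos(theta), np.sin(theta)
    out = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            a = abs(m)
            # Associated Legendre P_n^a(x) by upward recurrence, Condon-Shortley phase included.
            p_prev, p = 0.0, (-1.0) ** a * np.prod(np.arange(1, 2 * a, 2)) * s ** a
            for l in range(a + 1, n + 1):
                p_prev, p = p, ((2 * l - 1) * x * p - (l + a - 1) * p_prev) / (l - a)
            norm = np.sqrt((2 * n + 1) / (4 * pi) * factorial(n - a) / factorial(n + a))
            y = norm * p * np.exp(1j * a * phi)
            out.append(np.conj(y) if m >= 0 else (-1) ** a * y)
    return np.array(out)

N = 4
# Column order as in Eq. (7.71): look direction, null, then the four directions at (60 +/- 5, 36 +/- 5).
directions_deg = [(60, 36), (60, 320), (55, 31), (55, 41), (65, 31), (65, 41)]
V = np.column_stack([sh_vector(N, np.radians(t), np.radians(p)) for t, p in directions_deg])
c = np.array([1, 0, 1, 1, 1, 1], dtype=complex)    # assumed: unity at all five look-region points

w = V @ np.linalg.solve(V.conj().T @ V, c)          # w^H = c^H (V^H V)^{-1} V^H
responses = w.conj() @ V                            # beam pattern at the six constrained directions
print(np.round(np.abs(responses), 6))
```

Closely spaced constraint directions make the columns of the steering matrix nearly dependent, so the conditioning of $\mathbf{V}_{nm}^{H}\mathbf{V}_{nm}$ is worth monitoring when constraints are placed only a few degrees apart.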
Fig. 7.6 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and three null constraints. Upper: contour plot, arrival direction of the plane wave holding the
desired signal is marked by the white “+” and directions of the null constraints are marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1
Fig. 7.7 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and several distortionless-response and null constraints. Upper: contour plot, directions of the
distortionless-response constraints are marked by the white “+” and direction of the null constraint
is marked by the dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

7.7 LCMV with Derivative Constraints

The LCMV with amplitude constraints, as presented in the previous section, illustrates
the broadening of the main lobe and the null region by adding constraints at
directions close to the look direction and the null. A similar effect can be achieved
using a more analytical approach by constraining the first (and higher) order derivatives
of the beam pattern to be zero. This derivative constraint can be formulated as
linear constraints in the beam pattern optimization problem, hence directly integrating
into the LCMV framework [6]. The steering vectors in the spherical harmonics
domain are first written more explicitly as a function of the angles. It is important
to note that the following formulation of the derivative constraint has a closed-form
expression, due to the efficient separation of frequency, distance and angles in the
spherical harmonics domain. The steering vectors are written as in Eq. (5.16):
\[
v_{nm}(\theta, \phi) = b_n(kr)\,[Y_n^m(\theta, \phi)]^{*}, \tag{7.73}
\]
with
\[
\mathbf{v}_{nm}(\theta, \phi) = \left[v_{00}(\theta, \phi),\, v_{1(-1)}(\theta, \phi),\, v_{10}(\theta, \phi),\, v_{11}(\theta, \phi),\, \ldots,\, v_{NN}(\theta, \phi)\right]^{T}. \tag{7.74}
\]
A partial derivative of the beam pattern with respect to the azimuth angle, φ, is
derived first. The derivative is written as
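\[
\frac{\partial}{\partial\phi}\,y(\theta, \phi) = \frac{\partial}{\partial\phi}\left[\mathbf{w}_{nm}^{H}\,\mathbf{v}_{nm}(\theta, \phi)\right] = \mathbf{w}_{nm}^{H}\,\frac{\partial}{\partial\phi}\,\mathbf{v}_{nm}(\theta, \phi), \tag{7.75}
\]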
where
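\[
\frac{\partial}{\partial\phi}\,\mathbf{v}_{nm}(\theta, \phi) = \left[\frac{\partial}{\partial\phi}\,v_{00}(\theta, \phi),\, \ldots,\, \frac{\partial}{\partial\phi}\,v_{NN}(\theta, \phi)\right]^{T}. \tag{7.76}
\]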
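the elements of $\frac{\partial}{\partial\phi}\mathbf{v}_{nm}$ can be derived:
\[
\frac{\partial}{\partial\phi}\,v_{nm}(\theta, \phi) = -im\,b_n(kr)\,[Y_n^m(\theta, \phi)]^{*} = -im\,v_{nm}(\theta, \phi). \tag{7.78}
\]
\[
\frac{\partial^{2}}{\partial\phi^{2}}\,v_{nm}(\theta, \phi) = -m^{2}\,v_{nm}(\theta, \phi). \tag{7.79}
\]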
where
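\[
\frac{\partial}{\partial\theta}\,\mathbf{v}_{nm}(\theta, \phi) = \left[\frac{\partial}{\partial\theta}\,v_{00}(\theta, \phi),\, \ldots,\, \frac{\partial}{\partial\theta}\,v_{NN}(\theta, \phi)\right]^{T} \tag{7.82}
\]
and
\[
\frac{\partial}{\partial\theta}\,v_{nm}(\theta, \phi) = b_n(kr)\,\frac{\partial}{\partial\theta}\,[Y_n^m(\theta, \phi)]^{*}. \tag{7.83}
\]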
The derivative of the spherical harmonics can be derived from the following
relation [5]:
\[
\frac{\partial}{\partial\theta}\,Y_n^m(\theta, \phi) = m\cot\theta\; Y_n^m(\theta, \phi) + \sqrt{(n-m)(n+m+1)}\;e^{-i\phi}\,Y_n^{m+1}(\theta, \phi). \tag{7.84}
\]
A few notes about this equation are presented next. First, $Y_n^{m+1}(\theta, \phi) = 0$ for
$m + 1 > n$ in general, and for $m = n$ in the context of this equation. Second, for $\theta = 0$
and $\theta = \pi$, the cotangent function diverges, but for these angles $\frac{\partial}{\partial\theta} Y_n^m(\theta, \phi) = 0$.
This is because $Y_n^m(0, \phi) = Y_n^m(\pi, \phi) = 0 \;\forall\, m \neq 0$, while for $m = 0$ the spherical
harmonics reduce to the Legendre polynomials, composed of cosine functions that
have a gradient of zero at $\theta = 0$ and $\theta = \pi$ [1]. The same argument follows for the
$Y_n^{m+1}(\theta, \phi)$ term. Therefore, at these specific angles the first-order derivative with
respect to $\theta$ is zero, and a zero constraint would be satisfied anyway.
In summary, the derivative of the elements of the steering vector with respect to
θ can be written as
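\[
\begin{aligned}
\frac{\partial}{\partial\theta}\,v_{nm}(\theta, \phi) &= g_1\,v_{nm}(\theta, \phi) + g_2\,v_{n(m+1)}(\theta, \phi) \\
g_1 &= m\cot\theta \\
g_2 &= \sqrt{(n-m)(n+m+1)}\;e^{i\phi} \\
v_{n(m+1)}(\theta, \phi) &= 0 \quad \forall\; m = n.
\end{aligned} \tag{7.85}
\]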
Finally, derivatives with respect to both θ and φ can be set to zero using linear
constraints within the LCMV framework, as follows:
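\[
\mathbf{w}_{nm}^{H}\left.\left[\frac{\partial}{\partial\theta}\,\mathbf{v}_{nm}(\theta, \phi),\; \frac{\partial}{\partial\phi}\,\mathbf{v}_{nm}(\theta, \phi)\right]\right|_{(\theta_0, \phi_0)} = [0, 0]. \tag{7.87}
\]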
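The closed-form derivatives of Eqs. (7.78) and (7.85), and their use as linear constraints, can be checked with a short numerical sketch. The code is illustrative and rests on assumptions: the helper names `sh_vector` and `derivative_vectors` are hypothetical, the radial terms are set to $b_n(kr) = 1$, the coefficient $g_2$ is taken with the square root $\sqrt{(n-m)(n+m+1)}$, and the noise matrix is assumed proportional to a unit matrix. Finite differences serve only as an independent check.

```python
import numpy as np
from math import factorial, pi

def sh_vector(N, theta, phi):
    """conj(Y_n^m(theta, phi)) stacked for n = 0..N, m = -n..n (theta measured from the z-axis)."""
    x, s = np.cos(theta), np.sin(theta)
    out = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            a = abs(m)
            # Associated Legendre P_n^a(x) by upward recurrence, Condon-Shortley phase included.
            p_prev, p = 0.0, (-1.0) ** a * np.prod(np.arange(1, 2 * a, 2)) * s ** a
            for l in range(a + 1, n + 1):
                p_prev, p = p, ((2 * l - 1) * x * p - (l + a - 1) * p_prev) / (l - a)
            norm = np.sqrt((2 * n + 1) / (4 * pi) * factorial(n - a) / factorial(n + a))
            y = norm * p * np.exp(1j * a * phi)
            out.append(np.conj(y) if m >= 0 else (-1) ** a * y)
    return np.array(out)

def derivative_vectors(N, theta, phi):
    """Closed-form d/dtheta and d/dphi of sh_vector, following Eqs. (7.85) and (7.78)."""
    v = sh_vector(N, theta, phi)
    d_theta, d_phi = np.zeros_like(v), np.zeros_like(v)
    for n in range(N + 1):
        for m in range(-n, n + 1):
            i = n * n + n + m                         # position of (n, m) in the stacked vector
            v_next = v[i + 1] if m < n else 0.0       # v_{n(m+1)} = 0 for m = n
            g1 = m / np.tan(theta)
            g2 = np.sqrt((n - m) * (n + m + 1)) * np.exp(1j * phi)
            d_theta[i] = g1 * v[i] + g2 * v_next
            d_phi[i] = -1j * m * v[i]
    return d_theta, d_phi

N, h = 4, 1e-6
t1, p1 = np.radians(60), np.radians(90)               # null direction used in Sect. 7.8
d_theta, d_phi = derivative_vectors(N, t1, p1)
fd_theta = (sh_vector(N, t1 + h, p1) - sh_vector(N, t1 - h, p1)) / (2 * h)
fd_phi = (sh_vector(N, t1, p1 + h) - sh_vector(N, t1, p1 - h)) / (2 * h)
err_theta = np.max(np.abs(d_theta - fd_theta))
err_phi = np.max(np.abs(d_phi - fd_phi))

# LCMV with a distortionless response, a null, and zero derivatives at the null.
v0 = sh_vector(N, np.radians(60), np.radians(36))
v1 = sh_vector(N, t1, p1)
V = np.column_stack([v0, v1, d_theta, d_phi])
c = np.array([1, 0, 0, 0], dtype=complex)
w = V @ np.linalg.solve(V.conj().T @ V, c)            # unit-matrix noise assumed

def pattern(theta, phi):
    return w.conj() @ sh_vector(N, theta, phi)        # y = w^H v

slope_phi = abs(pattern(t1, p1 + h) - pattern(t1, p1 - h)) / (2 * h)
slope_theta = abs(pattern(t1 + h, p1) - pattern(t1 - h, p1)) / (2 * h)
print(err_theta, err_phi, abs(pattern(t1, p1)), slope_phi, slope_theta)
```

The derivative vectors enter the steering matrix as ordinary columns with zero entries in the constraint vector, which is what allows derivative constraints to reuse the LCMV solution unchanged.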
7.8 Example: Robust LCMV with Derivative Constraints
An LCMV design example is presented in this section with the aim of illustrating
the use of derivative constraints. A spherical microphone array with the same con-
figuration as in the design example in Sect. 7.6 is used in this section. An LCMV
beamformer with a distortionless-response constraint at (θ0 , φ0 ) = (60◦ , 36◦ ) and
one null constraint at (θ1 , φ1 ) = (60◦ , 90◦ ) is designed, following the formulation
presented in Sect. 7.5. The input signal to the array was assumed to be composed
of a desired signal with a plane wave arriving from the look direction (θ0 , φ0 ) with
variance σ02 = 1 and sensor noise with variance σn2 = 0.1. Figure 7.8 shows a con-
tour plot and a balloon plot of the beam pattern, clearly illustrating the main lobe at
the look direction and the null at (60◦ , 90◦ ).
In the next step, a derivative constraint is added to the LCMV design, following
the formulation developed in Sect. 7.7, with a single derivative constraint with respect
to φ at (θ1 , φ1 ), i.e. the null direction. Figure 7.9 shows the results for this design.
Comparing with Fig. 7.8, two effects of the derivative constraint on the beam pattern
are observed. First, the width of the low-level magnitude of the beam pattern around
the null constraint in the φ direction has been increased. This is an expected effect
because now both the beam pattern function and its derivative along φ are zero. The
advantage of this null-width increase is improved robustness with respect to uncer-
tainty in the arrival direction of a potential disturbance. However, another change
to the beam pattern is a slight shift in the main lobe, such that its peak value seems
to be slightly to the left of the look direction shown in the contour plot of Fig. 7.9.
This can be regarded as a degradation, as we would like to have the maximum gain
exactly at the look direction. This issue will be discussed towards the end of this
design example.
In the following step, a derivative constraint with respect to θ at the null direction
has been added to the derivative constraint with respect to φ. Figure 7.10 shows
the resulting beam pattern. An increase in the width of the low-magnitude region
around the null constraint along θ is observed when compared with Fig. 7.9. This is
expected, as in this design the derivatives with respect to both θ and φ are set to zero
around the null direction.
In the final step of this design, two additional derivative constraints are included.
These are derivative constraints with respect to both θ and φ, but this time at the
look direction, (θ0 , φ0 ) = (60◦ , 36◦ ). Now, both the look direction and the null
are set to have zero derivatives. The effect on the main lobe is clear, as illustrated in
Fig. 7.11. The peak of the main lobe has shifted back to the look direction, because the
derivative constraints have forced the main lobe to have a local maximum at the look
direction. This has therefore corrected the undesired shift generated by the derivative
constraints at the null direction. However, this correction comes at a cost: with this
complex set of constraints, the LCMV introduces high side lobes at directions away
from the constraints. The overall behavior of this beam pattern may not be attractive,
although all imposed constraints are maintained. This example shows that constraints
have to be introduced with care, as they may come at the expense of reduction of
noise and disturbances arriving from other directions.

Fig. 7.8 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and a single null constraint. Upper: contour plot, arrival direction of the plane wave holding
the desired signal is marked by the white “+” and direction of the null constraint is marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

Fig. 7.9 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise, a single null constraint and a derivative constraint with respect to φ at the null direction.
Upper: contour plot, arrival direction of the plane wave holding the desired signal is marked by the
white “+” and direction of the null constraint is marked by the dark “+”. Lower: balloon plot. For
color scheme see Fig. 7.1

Fig. 7.10 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise, a single null constraint and derivative constraints with respect to both θ and φ at the null
direction. Upper: contour plot, arrival direction of the plane wave holding the desired signal is
marked by the white “+” and direction of the null constraint is marked by the dark “+”. Lower:
balloon plot. For color scheme see Fig. 7.1

Fig. 7.11 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise, a single null constraint and derivative constraints with respect to both θ and φ at the null
direction and at the look direction. Upper: contour plot, arrival direction of the plane wave holding
the desired signal is marked by the white “+” and direction of the null constraint is marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1
References
1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic, San Diego
(2001)
2. Avni, A., Rafaely, B.: Interaural cross-correlation and spatial correlation in a sound field repre-
sented by spherical harmonics. In: First International Symposium on Ambisonics and Spherical
Acoustics (Ambisonics 2009). Graz, Austria (2009)
3. Cook, R.K., Waterhouse, R.V., Berendt, R.D., Seymour, E., Thompson, M.C.: Measurement
of correlation coefficients in reverberant sound fields. J. Acoust. Soc. Am. 27(6), 1072–1077
(1955)
4. Kinsler, L.E., Frey, A.R., Coppens, A.B., Sanders, J.V.: Fundamentals of Acoustics, 4th edn.
Wiley, New York (1999)
5. Spherical harmonics, low order differentiation with respect to θ (2013). http://functions.wolfram.com/05.10.20.0001.01
6. Van Trees, H.L.: Optimum Array Processing (Detection, Estimation, and Modulation Theory,
Part IV), 1st edn. Wiley, New York (2002)
7. Yan, S., Sun, H., Svensson, U.P., Xiaochuan, M., Hovem, J.M.: Optimal modal beamforming
for spherical microphone arrays. IEEE Trans. Speech Audio Process. 19(2), 361–371 (2011)
Glossary
Acronyms
Mathematical Operators
‖·‖ 2-norm
(·)∗ Complex conjugate
(·)T Transpose
(·) H Hermitian or complex transpose
(·)† Pseudo matrix inverse
(·)! Factorial
∇ Gradient
∇x2 Laplacian in Cartesian coordinates
∇r2 Laplacian in spherical coordinates
E[ · ] Expectation
Im{·} Imaginary part
κ(·) Condition number of a matrix
Re{·} Real part
Λ(·) Rotation operator
Greek Symbols
Symbols
© Springer Nature Switzerland AG 2019
B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8