Springer Topics in Signal Processing

Boaz Rafaely

Fundamentals
of Spherical
Array Processing
Second Edition
Springer Topics in Signal Processing

Volume 16

Series editors
Jacob Benesty, Montreal, Canada
Walter Kellermann, Erlangen, Germany
More information about this series at http://www.springer.com/series/8109
Boaz Rafaely
Department of Electrical and Computer
Engineering
Ben-Gurion University of the Negev
Beer-Sheva, Israel

ISSN 1866-2609    ISSN 1866-2617 (electronic)
Springer Topics in Signal Processing
ISBN 978-3-319-99560-1    ISBN 978-3-319-99561-8 (eBook)
https://doi.org/10.1007/978-3-319-99561-8

Library of Congress Control Number: 2018952886

1st edition: © Springer-Verlag Berlin Heidelberg 2015


2nd edition: © Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar
methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To my parents, Nitzan and Rivka Rafaely
Preface

Microphone arrays and associated array processing techniques have been developed
for a wide range of applications over the past few decades. These applications
include speech communication, spatial audio, room acoustics analysis, noise control
and acoustic holography, defense and security, entertainment, and many more. In
the cases of speech in rooms and music in concert halls, the sound tends to travel
throughout the entire enclosed space, producing a three-dimensional sound field.
Microphone arrays that effectively measure and process three-dimensional sound
fields typically require the positioning of microphones within a volume in
three-dimensional space. Planar arrays, mounted on an enclosure wall, have been
studied for several decades; more recently, spherical arrays, in which microphones
are mounted around a rigid sphere, for example, have been developed. These offer
several advantages over classical linear, rectangular, or circular arrays:
(i) The sphere, having complete rotational symmetry, facilitates spatial filtering
or beamforming that can be designed to effectively enhance or attenuate
sources in any direction.
(ii) Array processing and performance analysis can be formulated in the spherical
harmonics domain, which is the Fourier domain for the sphere. This
domain facilitates efficient algorithms and extensive acoustic modeling of
both the array and the surrounding sound field.
(iii) Beamforming can be efficiently implemented by decoupling beam pattern
design from beam pattern steering, therefore providing simplicity and flexibility
in array realization.
These advantages have motivated an increasing number of researchers in recent
years to develop spherical microphone array systems, to study spherical array
configurations, to develop algorithms for spherical arrays, and to apply these arrays
in a wide range of applications. This growing activity has provided the author with
the motivation and inspiration to write this book, with the aim of presenting the
fundamentals of spherical array processing in a tutorial manner suitable for
researchers, graduate students, and engineers interested in this topic.


The first two chapters provide the reader with the necessary mathematical and
physical background, including an introduction to the spherical Fourier transform
and to the formulation of plane-wave sound fields in the spherical harmonics
domain. The third chapter covers the theory of spatial sampling, which becomes
useful when selecting the positions of microphones to sample sound pressure
functions in space. The next chapter presents various spherical array configurations,
including the popular configuration based on a rigid sphere. The fifth chapter
introduces the concept of beamforming and its basic equations, including popular
design methods such as delay-and-sum and regular beamforming. The following
chapter presents methods for the optimal design of beam patterns, formulated to
achieve various objectives such as maximum robustness, maximum directivity, or
minimum side-lobe level. The final chapter develops more advanced array processing
algorithms such as the minimum variance distortionless response (MVDR)
algorithm. These algorithms aim to enhance a desired signal while attenuating
undesired noise components in the sound field by exploring their unique formulation
in the spherical harmonics domain.
My own interest in spherical array processing began during a six-month visit to
the sensory communication group at MIT in 2002, working with Julie Greenberg
and greatly enjoying the stimulating vibe of Boston. I would like to thank Julie for
providing this opportunity, for the hospitality, and for the helpful discussions.
During my visit to Boston I was exposed to the inspiring publications on spherical
arrays by Jens Meyer and Gary Elko. Their pioneering work planted the seeds that
later flourished to an extensive research effort at my lab, the Acoustics Laboratory
at Ben-Gurion University of the Negev. The research at the Acoustics Laboratory
was pursued through an invaluable cooperation with a great number of research
students, postdoctoral researchers, and visitors. The relaxed atmosphere at the lab,
the great teamwork, and the endless discussions were the fuel that kept the writing
of this book viable. I would like to express great thanks to the Acoustics Laboratory
researchers: Dr. Jonathan Sheaffer, Dr. Jonathan Rathsam, Dr. Noam Shabtai, Dr.
Dror Lederman, Dr. Yotam Peled, Dr. Etan Fisher, Dr. Vladimir Tourbabin, Dr. Hai
Morgenstern, Dr. David Alon, Zamir Ben-Hur, Lior Madmoni, Moti Lugasi, Koby
Alhaiany, Mickey Jeffet, Eran Miller, Hanan Beit-On, Itay Ifergan, Ran Weisman,
Tom Shlomo, Amir Musicant, Yoav Biderman, Uri Abend, Elad Cohen, Dima
Lvov, Or Nadiri, Shahar Villeval, Tal Szpruch, Nejem Hulihel, Ilan Ben-Hagai,
Tomer Peleg, Amir Avni, Morag Agmon, Maor Klieder, Dima Haykin, Itai Peer,
and Ilya Balmages. Also, special thanks to Dr. Franz Zotter for the helpful comments
on a draft version of the manuscript made during a visit to the lab. Thanks
also to Debbie Kedar for the prompt and professional editing and proofreading of
this book. Finally, thanks to my family, Vered, Asaf, Yonathan, and Tal, for providing
love therapy that time and again pulled me out of the writing stumbles and
falls.
This second edition of Fundamentals of Spherical Array Processing, in addition
to the correction of all known errors, now includes comprehensive support in
MATLAB. A manual has been developed that includes MATLAB code to reproduce
all examples and figures in the book, with the aim of providing MATLAB
support to complement the theory and signal processing methods presented in this
book. The MATLAB manual is provided as additional material to this book and can
be downloaded from https://www.mathworks.com or from the author's website
http://www.ee.bgu.ac.il/~br.

Beer-Sheva, Israel Boaz Rafaely


September 2018
Contents

1 Mathematical Background
  1.1 Functions on the Sphere
  1.2 Spherical Harmonics
  1.3 Exponential and Legendre Functions
  1.4 Spherical Fourier Transform
  1.5 Some Useful Functions
  1.6 Rotation of Functions
  1.7 Spherical Convolution and Correlation
  References
2 Acoustical Background
  2.1 The Acoustic Wave Equation
  2.2 Spherical Bessel and Hankel Functions
  2.3 A Single Plane Wave
  2.4 Plane-Wave Composition
  2.5 Point Sources
  2.6 Sound Pressure Around a Rigid Sphere
  2.7 Translation of Fields
  References
3 Sampling the Sphere
  3.1 Sampling Order-Limited Functions
  3.2 Equal-Angle Sampling
  3.3 Gaussian Sampling
  3.4 Uniform and Nearly-Uniform Sampling
  3.5 Numerical Computation of Sampling Weights
  3.6 The Discrete Spherical Fourier Transform
  3.7 Spatial Aliasing
  References
4 Spherical Array Configurations
  4.1 Single Open Sphere
  4.2 Rigid Sphere
  4.3 Open Sphere with Cardioid Microphones
  4.4 Dual-Radius Open Sphere
  4.5 Robustness to Errors and Numerical Array Design
  4.6 Design Examples with Robustness Analysis
  4.7 Spherical Shell Configuration
  4.8 Other Configurations
  References
5 Spherical Array Beamforming
  5.1 Beamforming Equations
  5.2 Axis-Symmetric Beamforming
  5.3 Directivity Index
  5.4 White Noise Gain
  5.5 Simple Axis-Symmetric Beamformers
  5.6 Beamforming Example
  5.7 Steering Non Axis-Symmetric Beam Patterns
  References
6 Optimal Beam Pattern Design
  6.1 Maximum Directivity Beamformer
  6.2 Maximum WNG Beamformer
  6.3 Example: Directivity Versus WNG
  6.4 Mixed-Objective Design
  6.5 Maximum Front-Back Ratio
  6.6 Dolph-Chebyshev Beam Pattern
  6.7 Multiple-Objective Design
  References
7 Beamforming with Noise Minimization
  7.1 Beamforming Equations Including Noise
  7.2 Minimum Variance Distortionless Response
  7.3 Example: MVDR with Sensor Noise and Disturbance
  7.4 Example: MVDR with Correlated Disturbance
  7.5 Linearly Constrained Minimum Variance
  7.6 Example: LCMV with Beam Pattern Amplitude Constraints
  7.7 LCMV with Derivative Constraints
  7.8 Example: Robust LCMV with Derivative Constraints
  References
Glossary
Index
Chapter 1
Mathematical Background

Abstract This chapter provides the mathematical background necessary for studying
spherical array processing. Spherical arrays typically sample functions on a
sphere (e.g. sound pressure); therefore, this chapter begins by presenting the spherical
coordinate system, as well as some examples of functions on the sphere. Spherical
harmonics are a central theme of this book as they form a basis for representing
functions on the sphere. Therefore, spherical harmonics are first defined and illustrated,
and then an introduction to the spherical Fourier transform and a description
of functions on the sphere in Hilbert space follows. The chapter concludes with a
presentation of the topics of rotation, convolution, and correlation defined for functions
on the sphere.

1.1 Functions on the Sphere

Consider the standard Cartesian coordinate system with coordinates

$$ \mathbf{x} \equiv (x, y, z) \in \mathbb{R}^3, \tag{1.1} $$

where $\mathbb{R}^3$ is the three-dimensional space of real numbers and $\mathbf{x}$ represents a vector in
geometric notation. A spherical surface of unit radius, denoted by $S^2$, can be defined
in the Cartesian coordinate system as

$$ S^2 = \{\mathbf{x} \in \mathbb{R}^3 : \|\mathbf{x}\| = 1\}, \tag{1.2} $$

which represents all positions having unit distance from the origin, with $\|\cdot\|$ denoting
the Euclidean norm. Positions on $S^2$ can be denoted by elevation and azimuth angles,
θ and φ, which define the spherical coordinates, together with the radial distance (or
radius), r:

$$ \mathbf{r} \equiv (r, \theta, \phi). \tag{1.3} $$

© Springer Nature Switzerland AG 2019
B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8_1

Fig. 1.1 Spherical coordinate system defined relative to the Cartesian coordinate system

Fig. 1.2 Plot of function $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$ over the surface of a unit sphere

The azimuth angle φ is measured from the x-axis towards the y-axis, while the
elevation angle θ is measured downwards from the z-axis, as illustrated in Fig. 1.1.
Strictly speaking, θ denotes inclination, but the term elevation will be used in this
book.
A position r = (r, θ, φ) represented in spherical coordinates can be related to the
same position represented in Cartesian coordinates x = (x, y, z) using

x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ. (1.4)
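
The mapping in Eq. (1.4) can be sketched in a few lines of Python. This is only an illustration, not the book's MATLAB material: the function names are hypothetical, and the inverse mapping (with φ wrapped to [0, 2π)) is an assumption added here for convenience rather than something stated in the text.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Eq. (1.4): theta is measured down from the z-axis, phi from the x-axis towards the y-axis."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cartesian_to_spherical(x, y, z):
    """Inverse mapping (an assumption of this sketch); phi is wrapped to [0, 2*pi)."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0
    phi = math.atan2(y, x) % (2 * math.pi)
    return r, theta, phi

# A point on the equator (theta = pi/2) at phi = 0 lies on the positive x-axis.
x, y, z = spherical_to_cartesian(1.0, math.pi / 2, 0.0)
```

Because θ is an inclination (measured from the z-axis), `z` depends on cos θ rather than sin θ, which is the convention the text adopts under the name "elevation".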
Fig. 1.3 Plot of function $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$ over the θφ plane (horizontal axis: φ in degrees; vertical axis: θ in degrees; color scale from −1 to 1)

Spherical functions, or functions defined over the unit sphere, are central to this book.
An example of a function over the sphere is

$$ f(\theta, \phi) = \sin^2\theta \cos(2\phi). \tag{1.5} $$

The function can be presented graphically in various ways: as a color map on the
surface of a unit sphere, as in Fig. 1.2, as a color contour map on a θ φ plane mapping
the surface of a unit sphere, as in Fig. 1.3, and with magnitude denoted by the
distance from the origin (balloon plot), as in Fig. 1.4. In the latter plot, cyan (green-blue)
shades represent positive values, while magenta (purple-red) shades represent
negative values. All three figures show one maximum and two zeros over θ, due to
the $\sin^2\theta$ term in the range θ ∈ [0, π], and two maxima, two minima and four zeros
over φ, due to the cos(2φ) term in the range φ ∈ [0, 2π].
In this book more than one notation is used to represent functions on the
unit sphere. A common notation uses the angles of the spherical coordinate system
directly, i.e.

$$ f(\theta, \phi), \quad (\theta, \phi) \in S^2. \tag{1.6} $$

Sometimes a more compact notation is desired, in which case the two angles will be
denoted by a single parameter, e.g. μ ≡ μ(θ, φ), using the function representation

$$ f(\mu), \quad \mu \equiv \mu(\theta, \phi) \in S^2. \tag{1.7} $$

Fig. 1.4 Balloon plot of function $f(\theta, \phi) = \sin^2\theta \cos(2\phi)$, with the distance from the origin
defined by $|f(\theta, \phi)|$, and with cyan (green-blue) shades representing positive values of f, and
magenta (purple-red) shades representing negative values of f

Finally, it may be desired to represent the sphere surface in Cartesian coordinates,
in which case the following notation is used:

$$ f(\mathbf{x}), \quad \mathbf{x} = (\sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta) \in S^2. \tag{1.8} $$

1.2 Spherical Harmonics

In the sections that follow, functions on the unit sphere are presented as a weighted
sum of a set of basis functions, also forming the Fourier basis for functions on the
sphere. These basis functions are the spherical harmonics, defined as follows [13]:

$$ Y_n^m(\theta, \phi) \equiv \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}}\, P_n^m(\cos\theta)\, e^{im\phi}, \tag{1.9} $$

where (·)! represents the factorial function, $P_n^m(\cdot)$ are the associated Legendre functions,
m ∈ Z is an integer denoting the function degree and n ∈ N is a natural number
denoting the function order.
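
As a rough illustration of Eq. (1.9), the sketch below evaluates $Y_n^m(\theta, \phi)$ using only the Python standard library. The names `assoc_legendre` and `sph_harm` are hypothetical. The associated Legendre function is computed with a standard upward recurrence in the order, which is an implementation choice of this sketch and not a method described in the text; it carries the $(-1)^m$ factor of Eq. (1.30) and handles negative degrees through Eq. (1.31).

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x) for x in [-1, 1]; zero when |m| > n."""
    if abs(m) > n:
        return 0.0
    if m < 0:
        a = -m  # Eq. (1.31) relates the negative degree to the positive one.
        scale = (-1) ** a * math.factorial(n - a) / math.factorial(n + a)
        return scale * assoc_legendre(n, a, x)
    # Start from P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2).
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * root
    if n == m:
        return pmm
    # P_{m+1}^m(x) = x (2m+1) P_m^m(x), then step the order up to n.
    prev, curr = pmm, x * (2 * m + 1) * pmm
    for k in range(m + 2, n + 1):
        prev, curr = curr, ((2 * k - 1) * x * curr - (k + m - 1) * prev) / (k - m)
    return curr

def sph_harm(n, m, theta, phi):
    """Spherical harmonic Y_n^m(theta, phi) following Eq. (1.9)."""
    if abs(m) > n:
        return 0j
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

# Compare with the closed form Y_4^3 = -sqrt(315/(64 pi)) cos(theta) sin^3(theta) e^{3 i phi}.
theta, phi = 0.7, 1.3
y43 = sph_harm(4, 3, theta, phi)
closed_form = (-math.sqrt(315 / (64 * math.pi)) * math.cos(theta)
               * math.sin(theta) ** 3 * cmath.exp(3j * phi))
```

The two values `y43` and `closed_form` should agree to rounding error, which gives a quick sanity check of the normalization and sign convention.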
Table 1.1 presents expressions for the spherical harmonics of orders zero to four
[11]. Note that the spherical harmonics have a complex exponential dependence on
φ, so that the absolute value, $|Y_n^m(\theta, \phi)|$, will be constant along φ. Therefore, plots of
the real and imaginary parts of the spherical harmonics are typically presented, rather
than plots of the magnitude and phase. The order n determines the highest power of
the cos θ and sin θ terms controlling the dependence of the spherical harmonics over
θ, while m determines the dependence over φ through the exponential term $e^{im\phi}$.

Table 1.1 Spherical harmonics $Y_n^m(\theta, \phi)$ for orders n = 0, ..., 4

n = 0   $Y_0^0(\theta, \phi) = \sqrt{\frac{1}{4\pi}}$

n = 1   $Y_1^{-1}(\theta, \phi) = \sqrt{\frac{3}{8\pi}} \sin\theta\, e^{-i\phi}$
        $Y_1^{0}(\theta, \phi) = \sqrt{\frac{3}{4\pi}} \cos\theta$
        $Y_1^{1}(\theta, \phi) = -\sqrt{\frac{3}{8\pi}} \sin\theta\, e^{i\phi}$

n = 2   $Y_2^{-2}(\theta, \phi) = \sqrt{\frac{15}{32\pi}} \sin^2\theta\, e^{-2i\phi}$
        $Y_2^{-1}(\theta, \phi) = \sqrt{\frac{15}{8\pi}} \sin\theta \cos\theta\, e^{-i\phi}$
        $Y_2^{0}(\theta, \phi) = \sqrt{\frac{5}{16\pi}} (3\cos^2\theta - 1)$
        $Y_2^{1}(\theta, \phi) = -\sqrt{\frac{15}{8\pi}} \sin\theta \cos\theta\, e^{i\phi}$
        $Y_2^{2}(\theta, \phi) = \sqrt{\frac{15}{32\pi}} \sin^2\theta\, e^{2i\phi}$

n = 3   $Y_3^{-3}(\theta, \phi) = \sqrt{\frac{35}{64\pi}} \sin^3\theta\, e^{-3i\phi}$
        $Y_3^{-2}(\theta, \phi) = \sqrt{\frac{105}{32\pi}} \cos\theta \sin^2\theta\, e^{-2i\phi}$
        $Y_3^{-1}(\theta, \phi) = \sqrt{\frac{21}{64\pi}} (5\cos^2\theta - 1) \sin\theta\, e^{-i\phi}$
        $Y_3^{0}(\theta, \phi) = \sqrt{\frac{7}{16\pi}} (5\cos^3\theta - 3\cos\theta)$
        $Y_3^{1}(\theta, \phi) = -\sqrt{\frac{21}{64\pi}} (5\cos^2\theta - 1) \sin\theta\, e^{i\phi}$
        $Y_3^{2}(\theta, \phi) = \sqrt{\frac{105}{32\pi}} \cos\theta \sin^2\theta\, e^{2i\phi}$
        $Y_3^{3}(\theta, \phi) = -\sqrt{\frac{35}{64\pi}} \sin^3\theta\, e^{3i\phi}$

n = 4   $Y_4^{-4}(\theta, \phi) = \sqrt{\frac{315}{512\pi}} \sin^4\theta\, e^{-4i\phi}$
        $Y_4^{-3}(\theta, \phi) = \sqrt{\frac{315}{64\pi}} \cos\theta \sin^3\theta\, e^{-3i\phi}$
        $Y_4^{-2}(\theta, \phi) = \sqrt{\frac{45}{128\pi}} (7\cos^2\theta - 1) \sin^2\theta\, e^{-2i\phi}$
        $Y_4^{-1}(\theta, \phi) = \sqrt{\frac{45}{64\pi}} (7\cos^3\theta - 3\cos\theta) \sin\theta\, e^{-i\phi}$
        $Y_4^{0}(\theta, \phi) = \sqrt{\frac{9}{256\pi}} (35\cos^4\theta - 30\cos^2\theta + 3)$
        $Y_4^{1}(\theta, \phi) = -\sqrt{\frac{45}{64\pi}} (7\cos^3\theta - 3\cos\theta) \sin\theta\, e^{i\phi}$
        $Y_4^{2}(\theta, \phi) = \sqrt{\frac{45}{128\pi}} (7\cos^2\theta - 1) \sin^2\theta\, e^{2i\phi}$
        $Y_4^{3}(\theta, \phi) = -\sqrt{\frac{315}{64\pi}} \cos\theta \sin^3\theta\, e^{3i\phi}$
        $Y_4^{4}(\theta, \phi) = \sqrt{\frac{315}{512\pi}} \sin^4\theta\, e^{4i\phi}$

Figure 1.5 presents balloon plots of the real and imaginary parts of the spherical
harmonics, $\mathrm{Re}\{Y_n^m(\theta, \phi)\}$ and $\mathrm{Im}\{Y_n^m(\theta, \phi)\}$, with a view angle of
(θ, φ) = (60°, −127.5°). The rows in the figure present plots for n = 0 (top row) to
n = 4 (bottom row), while the columns present plots for m = −n (leftmost column)
to m = n (rightmost column). $\mathrm{Im}\{Y_n^m(\theta, \phi)\}$ is presented for m < 0, $\mathrm{Re}\{Y_n^m(\theta, \phi)\}$ is
presented for m > 0, and $Y_n^0(\theta, \phi)$, which is real, is presented in the central column.
Table 1.2 explicitly illustrates the functions presented in Fig. 1.5, for clarity. Figure
1.5 shows that $Y_0^0$ is constant over the sphere, similar to a monopole function. The
real and imaginary parts of the spherical harmonics of order n = 1 have dipole-like
shapes, while higher orders have more complex forms, with the number of lobes
increasing with n and m.

Fig. 1.5 Balloon plot of the spherical harmonics for n = 0 (top row) to n = 4 (bottom row), with
$Y_n^0(\theta, \phi)$, which is a real function, presented in the central column. $\mathrm{Im}\{Y_n^m(\theta, \phi)\}$ for −n ≤ m ≤ −1
are presented in the left-hand side columns, and $\mathrm{Re}\{Y_n^m(\theta, \phi)\}$ for 1 ≤ m ≤ n are presented in the
right-hand side columns. The view direction is indicated by the orientation of the axes presented at
the top of the figure. Colors indicate the sign of the spherical harmonic functions, with cyan (green-blue)
shades representing positive values, and magenta (purple-red) shades representing negative
values
With the aim of visualizing the spherical harmonics more clearly, Fig. 1.6 presents
the spherical harmonics in a similar manner to Fig. 1.5, but as viewed from the z-axis,
i.e. downwards from above. In this case, the behavior of the real and imaginary
parts of the spherical harmonics over the azimuth angle φ is illustrated clearly. All
spherical harmonics at m = 0 are constant over φ, while those at m ≠ 0 exhibit cos(mφ)
behavior for the real parts and sin(mφ) behavior for the imaginary parts. The plots on the left
side (imaginary part, m < 0) are therefore rotated versions of the plots on the right
side (real part, m > 0), by 90°/m.
Figures 1.7 and 1.8 follow the same approach as Fig. 1.6, but with x-axis and y-axis
viewpoints, respectively, showing the dependence on θ more clearly. Spherical
harmonics $Y_n^0$ have a high value around θ = 0 and θ = π due to the $\cos^n\theta$ terms. The
behavior of the other spherical harmonics is more complex. For example, spherical
harmonics $Y_n^n$ and $Y_n^{-n}$ have a $\sin^n\theta$ dependence, producing "flat" looking functions
from the viewpoints shown in Figs. 1.7 and 1.8.

Table 1.2 Illustration of the functions presented in Fig. 1.5

Fig. 1.6 Same as Fig. 1.5, but viewed from the z-axis (top view)

Fig. 1.7 Same as Fig. 1.5, but viewed from the x-axis (front view)

Fig. 1.8 Same as Fig. 1.5, but viewed from the y-axis (side view)

Some of the properties of the spherical harmonics are presented next, starting with
basic properties and progressing to properties involving integration and summation.
• Complex conjugate. The spherical harmonics are complex functions due to the complex
exponential term, $e^{im\phi}$, while the associated Legendre functions, $P_n^m(\cos\theta)$,
are all real. The complex conjugate of the spherical harmonics takes the form

$$ \left[Y_n^m(\theta, \phi)\right]^* = (-1)^m Y_n^{-m}(\theta, \phi), \tag{1.10} $$

which is derived from the expression of the associated Legendre function for
negative values of m [see Eq. (1.31)]. The complex conjugate property also defines
the relation between $Y_n^m(\theta, \phi)$ and $Y_n^{-m}(\theta, \phi)$, which are spherical harmonics of
the same order and opposite degrees.
• Limit on degree value, m. By definition, spherical harmonics with a degree that is
higher than the order are zero, i.e.

$$ Y_n^m(\theta, \phi) = 0 \quad \forall\, |m| > n. \tag{1.11} $$

• Zeros of the spherical harmonics. The spherical harmonics contain $\sin^{|m|}\theta$ terms,
defining the zeros of the function for m ≠ 0, i.e.

$$ Y_n^m(0, \phi) = Y_n^m(\pi, \phi) = 0 \quad \forall\, m \neq 0. \tag{1.12} $$

• Spherical harmonics at m = 0. At m = 0, the associated Legendre function degenerates
to the Legendre polynomials (see Sect. 1.3) and so the spherical harmonics
have a simplified expression:

$$ Y_n^0(\theta, \phi) = \sqrt{\frac{2n+1}{4\pi}}\, P_n(\cos\theta). \tag{1.13} $$

These spherical harmonics are not dependent on φ and are therefore axis-symmetric
relative to the z-axis. This is clearly illustrated in Figs. 1.7 and 1.8,
by the spherical harmonic functions in the central columns.
• Spherical harmonics at m = n and m = −n. At these extreme values of m, the
spherical harmonics have a sine dependence on θ and a simplified form:

$$ Y_n^{-n}(\theta, \phi) = \frac{1}{2^{n+1}\, n!} \sqrt{\frac{(2n+1)!}{\pi}}\, \sin^n\theta\, e^{-in\phi} $$
$$ Y_n^{n}(\theta, \phi) = \frac{(-1)^n}{2^{n+1}\, n!} \sqrt{\frac{(2n+1)!}{\pi}}\, \sin^n\theta\, e^{in\phi}. \tag{1.14} $$

• Mirror symmetry along θ with respect to the equator, θ = π/2. The spherical harmonics
have a mirror symmetry in θ, such that the function on the upper hemisphere
is equal to the function on the lower hemisphere, up to a sign factor:

$$ Y_n^m(\pi - \theta, \phi) = (-1)^{n+m}\, Y_n^m(\theta, \phi). \tag{1.15} $$

This symmetry is clearly illustrated in Figs. 1.7 and 1.8 by the real and imaginary
parts of the spherical harmonics, in which the sign is indicated by color. For even
n + m the functions are symmetric about the equator, whereas for odd n + m the
functions are antisymmetric about the equator.
• Symmetry with respect to φ. The spherical harmonics have mirror symmetry with
respect to φ due to the exponential function, such that

$$ Y_n^m(\theta, \phi + \pi) = (-1)^m\, Y_n^m(\theta, \phi). \tag{1.16} $$

This property is illustrated in Fig. 1.6, where spherical harmonic functions for
even values of m are equal at opposite sides of the circle defined by φ, while for
odd values of m the functions have the opposite sign (different color) at a phase
shift of 180° along φ.
Another symmetry along φ is defined relative to the x-axis, again due to the
behavior of the exponential function:

$$ Y_n^m(\theta, -\phi) = \left[Y_n^m(\theta, \phi)\right]^*. \tag{1.17} $$

Figure 1.6 illustrates that the real part of the spherical harmonics, plotted in the
right-hand side columns, is symmetric about the x-axis, while the imaginary part
is antisymmetric.
• Opposite direction. Combining the last two properties, Eqs. (1.15) and (1.16), the
spherical harmonics at (π − θ, φ + π), which is the opposite direction to (θ, φ),
can be written as

$$ Y_n^m(\pi - \theta, \phi + \pi) = (-1)^n\, Y_n^m(\theta, \phi). \tag{1.18} $$

• Periodicity with respect to φ. The spherical harmonics are periodic with respect to
φ with a period of 2π/m, due to the exponential term $e^{im\phi}$, and therefore satisfy

$$ Y_n^m(\theta, \phi + 2\pi/m) = Y_n^m(\theta, \phi). \tag{1.19} $$

The periodicity is illustrated in Fig. 1.6, where, for example, the spherical harmonics
in the central column with m = 0 are constant along φ, spherical harmonics
corresponding to m = ±1 have a period of 2π, those corresponding to m = ±2
have a period of π, and so on.

Fig. 1.9 Surface of a sphere illustrating the area elements
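
The symmetry relations in Eqs. (1.10) and (1.15) to (1.18) can be checked numerically at an arbitrary direction. The sketch below is a minimal, self-contained check; the helper names are hypothetical and the recurrence used for the associated Legendre function is an assumption of this sketch, with the sign convention of Eq. (1.30) and negative degrees taken from Eq. (1.31).

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x), |m| <= n, x in [-1, 1]."""
    if m < 0:
        a = -m  # negative degree via Eq. (1.31)
        return ((-1) ** a * math.factorial(n - a) / math.factorial(n + a)
                * assoc_legendre(n, a, x))
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * root
    if n == m:
        return pmm
    prev, curr = pmm, x * (2 * m + 1) * pmm
    for k in range(m + 2, n + 1):
        prev, curr = curr, ((2 * k - 1) * x * curr - (k + m - 1) * prev) / (k - m)
    return curr

def sph_harm(n, m, theta, phi):
    """Spherical harmonic of Eq. (1.9)."""
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def worst_symmetry_error(max_order=4, theta=0.7, phi=1.3):
    """Largest deviation from Eqs. (1.10), (1.15)-(1.18) over all (n, m) up to max_order."""
    worst = 0.0
    for n in range(max_order + 1):
        for m in range(-n, n + 1):
            y = sph_harm(n, m, theta, phi)
            errors = [
                abs(y.conjugate() - (-1) ** m * sph_harm(n, -m, theta, phi)),         # (1.10)
                abs(sph_harm(n, m, math.pi - theta, phi) - (-1) ** (n + m) * y),      # (1.15)
                abs(sph_harm(n, m, theta, phi + math.pi) - (-1) ** m * y),            # (1.16)
                abs(sph_harm(n, m, theta, -phi) - y.conjugate()),                     # (1.17)
                abs(sph_harm(n, m, math.pi - theta, phi + math.pi) - (-1) ** n * y),  # (1.18)
            ]
            worst = max(worst, *errors)
    return worst

worst = worst_symmetry_error()
```

The returned value should be at the level of floating-point rounding. A check at a single direction is of course not a proof, only a quick guard against sign and indexing mistakes.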
The next set of properties is related to the integration of the spherical harmonic
functions over the unit sphere. In general, integration over a sphere of radius r can
be calculated by dividing the sphere area into elements, as illustrated in Fig. 1.9. The
length along φ of each element on the sphere surface is given by r sin θ dφ, denoting
the fact that the elements are narrower in the azimuth dimension nearer the poles.
The width along θ of each element is given by r dθ. The area element is therefore
defined as

$$ r^2\, d\Omega = r^2 \sin\theta\, d\theta\, d\phi, \tag{1.20} $$

where Ω is the solid angle and dΩ is the area element covered by sin θ dθ dφ on a unit
sphere. With a finer grid on the sphere surface, and elements becoming infinitesimally
small, the area can be calculated by integrating over the entire sphere surface:

$$ \int_{S^2} r^2\, d\Omega = \int_0^{2\pi}\!\!\int_0^{\pi} r^2 \sin\theta\, d\theta\, d\phi = \int_0^{2\pi}\!\!\int_{-1}^{1} r^2\, dz\, d\phi = 4\pi r^2, \tag{1.21} $$

where z = cos θ has been substituted to derive the last integral.
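
The area-element argument can be illustrated by summing the elements of Eq. (1.20) on a finite grid. The sketch below uses a midpoint rule along θ; the function name and grid size are arbitrary choices made here for illustration.

```python
import math

def sphere_area(r, n_theta=2000):
    """Midpoint-rule approximation of the surface integral in Eq. (1.21).

    The integrand r^2 sin(theta) does not depend on phi, so the phi
    integral contributes a factor of exactly 2*pi.
    """
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta  # centre of the i-th band of elements
        total += r * r * math.sin(theta) * d_theta
    return 2 * math.pi * total

area = sphere_area(2.0)
exact = 4 * math.pi * 2.0 ** 2
```

As the grid is refined, `area` approaches `exact`, mirroring the limit of infinitesimally small elements described in the text.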


Properties related to integration and summation of spherical harmonics are presented
next.
• Integration of spherical harmonic functions. The integral of the spherical harmonic
functions over the unit sphere is zero for all spherical harmonics, except for the
spherical harmonic of zero order:

$$ \int_0^{2\pi}\!\!\int_0^{\pi} Y_n^m(\theta, \phi) \sin\theta\, d\theta\, d\phi = \sqrt{4\pi}\, \delta_{n0}\, \delta_{m0}, \tag{1.22} $$

where $\delta_{n0}$ is the Kronecker delta function, which is zero for all n except for n = 0.
• Orthogonality of spherical harmonics. The previous property can be easily derived
from the orthogonality property of the spherical harmonics over the sphere surface,
given by

$$ \int_0^{2\pi}\!\!\int_0^{\pi} \left[Y_{n'}^{m'}(\theta, \phi)\right]^* Y_n^m(\theta, \phi) \sin\theta\, d\theta\, d\phi = \delta_{nn'}\, \delta_{mm'}, \tag{1.23} $$

where $\delta_{nn'}$ is equal to unity for n = n′ and zero otherwise. Although spherical
harmonics are normalized to maintain orthonormality, the term orthogonality will
be used in this book.
• Completeness of spherical harmonics. The completeness property states that

$$ \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left[Y_n^m(\theta, \phi)\right]^* Y_n^m(\theta', \phi') = \delta(\cos\theta - \cos\theta')\, \delta(\phi - \phi'), \tag{1.24} $$

where δ(cos θ)δ(φ) is the Dirac delta function on the sphere, which is zero everywhere
on the sphere except at (θ, φ) = (π/2, 0), and satisfies

$$ \int_0^{2\pi}\!\!\int_0^{\pi} \delta(\cos\theta)\, \delta(\phi) \sin\theta\, d\theta\, d\phi = \int_0^{2\pi}\!\!\int_{-1}^{1} \delta(z)\, \delta(\phi)\, dz\, d\phi = 1, \tag{1.25} $$

where z = cos θ was used to remove the dependence of the Dirac delta function
on the cosine function.
• Spherical harmonics addition theorem. Another property related to completeness
is the addition theorem, which involves a summation over m:

$$ \sum_{m=-n}^{n} \left[Y_n^m(\theta, \phi)\right]^* Y_n^m(\theta', \phi') = \frac{2n+1}{4\pi}\, P_n(\cos\Theta), \tag{1.26} $$

where

$$ \cos\Theta = \cos\theta \cos\theta' + \cos(\phi - \phi') \sin\theta \sin\theta', \tag{1.27} $$

Θ is the angle between (θ, φ) and (θ′, φ′) and $P_n(\cdot)$ is the Legendre polynomial.
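
The addition theorem, Eqs. (1.26) and (1.27), lends itself to a direct numerical check: sum the products over m for one order and compare with the Legendre polynomial of the cosine of the angle between the two directions. The sketch below is self-contained; its helper names and the recurrence-based Legendre evaluation are assumptions of this illustration.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x), |m| <= n, x in [-1, 1]."""
    if m < 0:
        a = -m  # negative degree via Eq. (1.31)
        return ((-1) ** a * math.factorial(n - a) / math.factorial(n + a)
                * assoc_legendre(n, a, x))
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * root
    if n == m:
        return pmm
    prev, curr = pmm, x * (2 * m + 1) * pmm
    for k in range(m + 2, n + 1):
        prev, curr = curr, ((2 * k - 1) * x * curr - (k + m - 1) * prev) / (k - m)
    return curr

def sph_harm(n, m, theta, phi):
    """Spherical harmonic of Eq. (1.9)."""
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)

def addition_theorem_sides(n, theta1, phi1, theta2, phi2):
    """Return (left-hand side, right-hand side) of Eq. (1.26) for order n."""
    lhs = sum(sph_harm(n, m, theta1, phi1).conjugate() * sph_harm(n, m, theta2, phi2)
              for m in range(-n, n + 1))
    cos_angle = (math.cos(theta1) * math.cos(theta2)
                 + math.cos(phi1 - phi2) * math.sin(theta1) * math.sin(theta2))  # Eq. (1.27)
    rhs = (2 * n + 1) / (4 * math.pi) * assoc_legendre(n, 0, cos_angle)  # P_n = P_n^0
    return lhs, rhs

lhs, rhs = addition_theorem_sides(3, 0.7, 1.3, 2.0, -0.5)
```

Note that the sum over m is real even though the individual terms are complex, which is why the right-hand side depends only on the angle Θ between the two directions.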

1.3 Exponential and Legendre Functions

The properties of the spherical harmonic functions presented in Sect. 1.2 are the direct
result of the properties of the functions that compose the spherical harmonics, i.e.
the complex exponential $e^{im\phi}$, the associated Legendre function $P_n^m(\cos\theta)$ and the
Legendre polynomials, $P_n(\cos\theta)$, for m = 0. Therefore, these functions and some
of their properties are presented in this section.
The complex exponential, widely used in signal processing, forms a complete and
orthogonal basis for functions on the circle, i.e.

$$ \sum_{m=-\infty}^{\infty} e^{-im\phi}\, e^{im\phi'} = 2\pi\, \delta(\phi - \phi') \tag{1.28} $$

$$ \frac{1}{2\pi} \int_0^{2\pi} e^{-im\phi}\, e^{im'\phi}\, d\phi = \delta_{mm'}, \tag{1.29} $$

and is responsible for the behavior of the spherical harmonics as a function of φ.
Defined over the unit circle, the complex exponential functions are periodic with
periods of 2π/m for |m| > 0, have unit magnitude, and are the reason that spherical
harmonics are complex, rather than real functions.
The associated Legendre function, less common in signal processing or engineering,
will be presented in this section in more detail. This function is derived by
differentiation of the Legendre polynomials (presented later in this section):

$$P_n^m(x) = (-1)^m (1-x^2)^{m/2}\,\frac{d^m}{dx^m}P_n(x), \quad x\in[-1,1]. \tag{1.30}$$

Table 1.3 presents expressions for the associated Legendre function for orders zero to
four. Figure 1.10 presents plots of $P_n^m(x)$ for $m \ge 0$. Associated Legendre functions
for negative values of $m$ are proportional to the same functions with a positive value
of $m$, and are given by

$$P_n^{-m}(x) = (-1)^m\,\frac{(n-m)!}{(n+m)!}\,P_n^m(x). \tag{1.31}$$

They are therefore not illustrated in Fig. 1.10. The behavior of $P_n^m(x)$ illustrated in
the curves in Fig. 1.10 is responsible for the behavior of the spherical harmonics over
the elevation angle $\theta$, as illustrated in Figs. 1.7 and 1.8.

Fig. 1.10 Associated Legendre function $P_n^m(x)$, with $(n, m)$ denoted on each figure (panels for $n = 0, \ldots, 4$ and $m = 0, \ldots, n$, each plotted over $x \in [-1, 1]$)

The associated Legendre functions for different orders $n$ and the same degree $m$
are orthogonal under the integral satisfying

$$\int_{-1}^{1}P_n^m(x)\,P_{n'}^m(x)\,dx = \frac{2}{2n+1}\,\frac{(n+m)!}{(n-m)!}\,\delta_{nn'}, \quad -n\le m\le n. \tag{1.32}$$

This property is responsible for the orthogonality of the spherical harmonics, Eq.
(1.23), when integrating along θ . Combining Eqs. (1.32) and (1.29) (the orthogonality
of the exponential functions) one can directly derive the orthogonality of the spherical
harmonics, Eq. (1.23).
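
The orthogonality relation in Eq. (1.32) lends itself to a quick numerical check. The sketch below is illustrative only: `assoc_legendre`, `inner` and `norm_sq` are hypothetical helper names, the functions are generated by a standard upward recurrence rather than by differentiating as in Eq. (1.30), negative degrees use Eq. (1.31), and the integral is approximated with a composite Simpson rule whose step count is an arbitrary choice.

```python
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for -n <= m <= n, with the (-1)^m phase of Eq. (1.30)."""
    if m < 0:                              # Eq. (1.31) for negative degrees
        m = -m
        scale = (-1) ** m * math.factorial(n - m) / math.factorial(n + m)
        return scale * assoc_legendre(n, m, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):              # P_m^m = (-1)^m (2m-1)!! (1-x^2)^(m/2)
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev      # P_{m+1}^m
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr


def inner(n1, n2, m, steps=2000):
    """Composite Simpson estimate of the integral of P_n1^m P_n2^m over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps + 1):
        x = -1.0 + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * assoc_legendre(n1, m, x) * assoc_legendre(n2, m, x)
    return total * h / 3.0


def norm_sq(n, m):
    """Right-hand side of Eq. (1.32) for n = n'."""
    return 2.0 / (2 * n + 1) * math.factorial(n + m) / math.factorial(n - m)


print(inner(3, 3, 2), norm_sq(3, 2))   # same order and degree: the two values should agree
print(inner(2, 4, 2))                  # different orders, same degree: close to zero
```

Because the product $P_n^m P_{n'}^m$ is a polynomial in $x$ when both functions share the same degree, a modest Simpson grid is already accurate here.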
The values of the associated Legendre function for $m = 0$, i.e. $P_n^0(x)$, or the values
of the spherical harmonics for $m = 0$, i.e. $Y_n^0(\theta,\phi)$, are determined by the Legendre
polynomials that satisfy

$$P_n(x) = P_n^0(x). \tag{1.33}$$

Table 1.3 Associated Legendre function $P_n^m(x)$ for orders $n = 0, \ldots, 4$

n = 0:
  $P_0^0(x) = 1$
n = 1:
  $P_1^{-1}(x) = \frac{1}{2}(1-x^2)^{1/2}$
  $P_1^{0}(x) = x$
  $P_1^{1}(x) = -(1-x^2)^{1/2}$
n = 2:
  $P_2^{-2}(x) = \frac{1}{8}(1-x^2)$
  $P_2^{-1}(x) = \frac{1}{2}x(1-x^2)^{1/2}$
  $P_2^{0}(x) = \frac{1}{2}(3x^2-1)$
  $P_2^{1}(x) = -3x(1-x^2)^{1/2}$
  $P_2^{2}(x) = 3(1-x^2)$
n = 3:
  $P_3^{-3}(x) = \frac{1}{48}(1-x^2)^{3/2}$
  $P_3^{-2}(x) = \frac{1}{8}x(1-x^2)$
  $P_3^{-1}(x) = \frac{1}{8}(5x^2-1)(1-x^2)^{1/2}$
  $P_3^{0}(x) = \frac{1}{2}(5x^3-3x)$
  $P_3^{1}(x) = -\frac{3}{2}(5x^2-1)(1-x^2)^{1/2}$
  $P_3^{2}(x) = 15x(1-x^2)$
  $P_3^{3}(x) = -15(1-x^2)^{3/2}$
n = 4:
  $P_4^{-4}(x) = \frac{1}{384}(1-x^2)^2$
  $P_4^{-3}(x) = \frac{1}{48}x(1-x^2)^{3/2}$
  $P_4^{-2}(x) = \frac{1}{48}(7x^2-1)(1-x^2)$
  $P_4^{-1}(x) = \frac{1}{8}(7x^3-3x)(1-x^2)^{1/2}$
  $P_4^{0}(x) = \frac{1}{8}(35x^4-30x^2+3)$
  $P_4^{1}(x) = -\frac{5}{2}(7x^3-3x)(1-x^2)^{1/2}$
  $P_4^{2}(x) = \frac{15}{2}(7x^2-1)(1-x^2)$
  $P_4^{3}(x) = -105x(1-x^2)^{3/2}$
  $P_4^{4}(x) = 105(1-x^2)^2$

Table 1.4 presents expressions for the Legendre polynomials for orders zero to four.
Figure 1.11 presents plots of $P_n(x)$ for $n = 0, \ldots, 4$. Note that these curves are the
same as the curves presented in Fig. 1.10, left column, for the associated Legendre
function.
The Legendre polynomials can be derived directly through the following differentiation
formula:

$$P_n(x) = \frac{1}{2^n n!}\,\frac{d^n}{dx^n}(x^2-1)^n. \tag{1.34}$$

Table 1.4 Legendre polynomials $P_n(x)$ for orders $n = 0, \ldots, 4$

  $P_0(x) = 1$
  $P_1(x) = x$
  $P_2(x) = \frac{1}{2}(3x^2-1)$
  $P_3(x) = \frac{1}{2}(5x^3-3x)$
  $P_4(x) = \frac{1}{8}(35x^4-30x^2+3)$

Fig. 1.11 Legendre polynomials $P_n(x)$ for $n = 0, \ldots, 4$ (each plotted over $x \in [-1, 1]$)

The Legendre polynomials form a complete and orthogonal set of basis functions
over the line section $x \in [-1, 1]$. They are in $L^2([-1, 1])$, the space of functions on
this line section, and satisfy [1]

$$\sum_{n=0}^{\infty}\frac{2n+1}{2}\,P_n(x)\,P_n(x') = \delta(x-x'), \tag{1.35}$$

$$\int_{-1}^{1}P_n(x)\,P_{n'}(x)\,dx = \frac{2}{2n+1}\,\delta_{nn'}. \tag{1.36}$$

Therefore, one can define a Legendre transform, or Fourier-Legendre series [1], as
will be presented in Eq. (1.48). Substituting Eq. (1.26) into Eq. (1.24), or simply
substituting $x' = 1$ and $P_n(1) = 1$ [1] in Eq. (1.35), leads to

$$\sum_{n=0}^{\infty}\frac{2n+1}{2}\,P_n(x) = \delta(x-1). \tag{1.37}$$

This equation can be viewed as a Legendre series pair, i.e. $\frac{2n+1}{2}$ and $\delta(\cos\Theta - 1)$,
where $\Theta$ is the angle between $(\theta,\phi)$ and $(\theta',\phi')$, as defined in Eq. (1.27). Equation
(1.35) for the case of a finite summation over $n$ can be written as [6]

$$\sum_{n=0}^{N}(2n+1)\,P_n(x)\,P_n(x') = \frac{N+1}{x-x'}\left[P_{N+1}(x)P_N(x') - P_N(x)P_{N+1}(x')\right], \tag{1.38}$$

which is known as the Christoffel summation formula [10]. Substituting $x' = 1$ and
$P_n(1) = 1$, Eq. (1.37) can also be written for the case of a finite summation over $n$
as

$$\sum_{n=0}^{N}(2n+1)\,P_n(x) = \frac{N+1}{x-1}\left[P_{N+1}(x) - P_N(x)\right]. \tag{1.39}$$
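
The finite sum in Eq. (1.39) can be compared with its closed form directly. This is a small sketch under stated assumptions: `legendre`, `finite_sum` and `closed_form` are hypothetical names, the polynomials are computed with Bonnet's three-term recurrence rather than Eq. (1.34), and the closed form is only evaluated for $x \neq 1$ (at $x = 1$ the sum equals $(N+1)^2$).

```python
def legendre(n, x):
    """P_n(x) via the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def finite_sum(N, x):
    """Left-hand side of Eq. (1.39)."""
    return sum((2 * n + 1) * legendre(n, x) for n in range(N + 1))


def closed_form(N, x):
    """Right-hand side of Eq. (1.39); not defined at x = 1."""
    return (N + 1) / (x - 1.0) * (legendre(N + 1, x) - legendre(N, x))


for N in (1, 4, 8):
    print(N, finite_sum(N, 0.3), closed_form(N, 0.3))  # the two columns should agree
```

The closed form needs only two polynomial evaluations regardless of $N$, which is convenient when plotting the truncated delta function for large orders.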

1.4 Spherical Fourier Transform

The spherical Fourier transform, based on the spherical harmonics, is introduced in
this section. The set of spherical harmonics $Y_n^m(\theta,\phi)$, for $n \ge 0$ and $-n \le m \le n$,
can be used to compose a wide range of functions on the sphere. In fact, $Y_n^m(\theta,\phi)$
form a basis in Hilbert space $L^2(S^2)$ that is the set of all square-integrable functions
on the unit sphere. The norm $L^2$ implies that the spherical harmonics can compose
any square-integrable function with a diminishing square-integrated error.
A function $f(\theta,\phi) \in L^2(S^2)$ can be represented using a weighted sum of spherical
harmonics as

$$f(\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,Y_n^m(\theta,\phi), \tag{1.40}$$

where $f_{nm}$ are the weights. These weights form the spherical Fourier transform of
$f(\theta,\phi)$ and can be derived from $f(\theta,\phi)$ by

$$f_{nm} = \int_0^{2\pi}\!\!\int_0^{\pi} f(\theta,\phi)\left[Y_n^m(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi. \tag{1.41}$$

Equations (1.41) and (1.40) form the spherical Fourier transform and its inverse,
respectively. Although denoted in this book (and elsewhere) as “transform”, Fourier
series may be a more suitable name, as Eq. (1.40) involves a summation rather than
an integral, f nm is discrete rather than continuous, and f (θ, φ) has a finite support
over (θ, φ), similar to Fourier series representations of periodic functions over R.
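
As an illustration of Eq. (1.41), the coefficients of a simple function can be estimated by numerical quadrature. The sketch below is not the book's method: `sft` is a hypothetical helper that applies a plain midpoint rule on an equal-angle grid (grid sizes chosen arbitrarily), and `assoc_legendre` and `sph_harm` are hypothetical helpers assuming the normalization $\sqrt{\frac{2n+1}{4\pi}\frac{(n-m)!}{(n+m)!}}$. The test function $f(\theta,\phi) = \cos\theta$ equals $\sqrt{4\pi/3}\,Y_1^0(\theta,\phi)$, so only $f_{10}$ should be non-zero.

```python
import cmath
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, including the (-1)^m phase (assumed convention)."""
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr


def sph_harm(n, m, theta, phi):
    """Y_n^m(theta, phi); theta is the elevation, phi the azimuth."""
    ma = abs(m)
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - ma) / math.factorial(n + ma))
    y = norm * assoc_legendre(n, ma, math.cos(theta)) * cmath.exp(1j * ma * phi)
    return y if m >= 0 else (-1) ** ma * y.conjugate()


def sft(f, n, m, n_theta=200, n_phi=64):
    """Midpoint-rule estimate of f_nm in Eq. (1.41)."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += f(theta, phi) * sph_harm(n, m, theta, phi).conjugate() * math.sin(theta)
    return total * d_theta * d_phi


def cos_theta(theta, phi):
    return math.cos(theta)


print(sft(cos_theta, 1, 0), math.sqrt(4 * math.pi / 3))  # the two values should be close
print(abs(sft(cos_theta, 2, 1)))                         # should be close to zero
```

The midpoint rule is used only for brevity; sampling schemes designed for the sphere give exact results for order-limited functions with far fewer points.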
The requirement that $f(\theta,\phi) \in L^2(S^2)$ is also a sufficient condition for a bounded
spherical Fourier transform, i.e. $|f_{nm}| < \infty$, $n \in \mathbb{N}$, $-n \le m \le n$. The Cauchy-Schwarz
inequality is employed in the proof, as follows:

$$\begin{aligned}
|f_{nm}|^2 &= \left|\int_0^{2\pi}\!\!\int_0^{\pi} f(\theta,\phi)\left[Y_n^m(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi\right|^2 \\
&\le \int_0^{2\pi}\!\!\int_0^{\pi} |f(\theta,\phi)|^2\sin\theta\,d\theta\,d\phi \times \int_0^{2\pi}\!\!\int_0^{\pi} \left|Y_n^m(\theta,\phi)\right|^2\sin\theta\,d\theta\,d\phi \\
&= \int_0^{2\pi}\!\!\int_0^{\pi} |f(\theta,\phi)|^2\sin\theta\,d\theta\,d\phi < \infty,
\end{aligned} \tag{1.42}$$

where the orthogonality property, as in Eq. (1.23), has been used to evaluate the
integral over $|Y_n^m(\theta,\phi)|^2$, and $f \in L^2(S^2)$ has been substituted in deriving the final
inequality. Equation (1.42) suggests that any function in $L^2(S^2)$ will have a spherical
Fourier transform with bounded coefficients. This is clearly a sufficient condition
and not a necessary condition. For example, $f(\theta,\phi) = \delta(\cos\theta - \cos\theta')\delta(\phi - \phi')$
is not in $L^2(S^2)$, since the integral over the square of a delta function diverges.
However, the spherical harmonic coefficients in this case are $f_{nm} = [Y_n^m(\theta',\phi')]^*$, as
can be deduced from Eq. (1.24), and are bounded for all $n$ and $m$.
Some of the properties of the spherical Fourier transform and of functions defined
as a linear combination of spherical harmonics are outlined next.
• Parseval’s relation. Orthogonality and completeness of the spherical harmonics
have been presented in Eqs. (1.23) and (1.24), respectively. Parseval’s relation
follows directly:

$$\int_0^{2\pi}\!\!\int_0^{\pi} |f(\theta,\phi)|^2\sin\theta\,d\theta\,d\phi = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} |f_{nm}|^2, \tag{1.43}$$

or, more generally,

$$\int_0^{2\pi}\!\!\int_0^{\pi} f(\theta,\phi)\left[g(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,g_{nm}^*. \tag{1.44}$$

• Linearity. The spherical Fourier transform maintains the property of linearity due
to the integral operation of the transform. This implies that scaling and addition
of two functions lead to scaling and addition of their transforms:

$$\begin{aligned}
h(\theta,\phi) &= \alpha f(\theta,\phi) + \beta g(\theta,\phi) \\
h_{nm} &= \alpha f_{nm} + \beta g_{nm}, \quad \alpha,\beta \in \mathbb{R}.
\end{aligned} \tag{1.45}$$

• Complex conjugate. The complex conjugate of spherical harmonics, as in Eq.
(1.10), together with the inverse spherical Fourier transform, Eq. (1.40), can be
combined to derive the complex conjugate of $f(\theta,\phi)$ and its transform:

$$\begin{aligned}
g(\theta,\phi) &= \left[f(\theta,\phi)\right]^* \\
g_{nm} &= (-1)^m f_{n(-m)}^*.
\end{aligned} \tag{1.46}$$

• Constancy along $\phi$. If the function $f$ is constant along $\phi$, $f(\theta,\phi) = f(\theta)$, the
coefficients will have the following property:

$$\begin{aligned}
f_{nm} &= \int_0^{2\pi}\!\!\int_0^{\pi} f(\theta)\left[Y_n^m(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi \\
&= \sqrt{\frac{2n+1}{4\pi}}\;2\pi\,\delta_{m0}\int_0^{\pi} f(\theta)\,P_n^m(\cos\theta)\sin\theta\,d\theta \\
&= \sqrt{\frac{4\pi}{2n+1}}\;f_n\,\delta_{m0},
\end{aligned} \tag{1.47}$$

where $f_n$ depends only on $n$. This property has been derived by solving for the
integral over $\phi$. In this case, $f_n$ inherits the properties of the Legendre series [1]:

$$\begin{aligned}
f_n &= \frac{2n+1}{2}\int_0^{\pi} f(\theta)\,P_n(\cos\theta)\sin\theta\,d\theta \\
f(\theta) &= \sum_{n=0}^{\infty} f_n\,P_n(\cos\theta),
\end{aligned} \tag{1.48}$$

and the two-dimensional spherical Fourier transform reduces to the one-dimensional
Fourier-Legendre series.
• Constancy along $\theta$. If the function $f$ is constant along $\theta$, $f(\theta,\phi) = f(\phi)$, the
spherical harmonic coefficients will reduce to

$$\begin{aligned}
f_{nm} &= \int_0^{2\pi}\!\!\int_0^{\pi} f(\phi)\left[Y_n^m(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi \\
&= \frac{1}{2\pi}\int_0^{2\pi} f(\phi)\,e^{-im\phi}\,d\phi \times 2\pi\sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\int_0^{\pi} P_n^m(\cos\theta)\sin\theta\,d\theta \\
&= f_m\,C_{nm},
\end{aligned} \tag{1.49}$$

where $f_m$ is the Fourier series coefficient of $f(\phi)$ and $C_{nm}$ is a constant derived by
evaluating the integral over $\theta$ [4]. In this case, $f_m$ inherits all the properties of the
Fourier series:

$$\begin{aligned}
f_m &= \frac{1}{2\pi}\int_0^{2\pi} f(\phi)\,e^{-im\phi}\,d\phi \\
f(\phi) &= \sum_{m=-\infty}^{\infty} f_m\,e^{im\phi},
\end{aligned} \tag{1.50}$$

reducing the two-dimensional spherical Fourier transform to one dimension, with
an additional two-dimensional factor, $C_{nm}$.
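• Symmetry with respect to $\phi$. A function that is symmetric with respect to $\phi$, obeying
$f(\theta,\phi) = f(\theta,\pi-\phi)$, has a symmetric spherical Fourier transform:

$$\begin{aligned}
f(\theta,\phi) &= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,Y_n^m(\theta,\phi) \\
&= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,Y_n^m(\theta,\pi-\phi) \\
&= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,(-1)^m\left[Y_n^m(\theta,\phi)\right]^* \\
&= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,Y_n^{-m}(\theta,\phi) \\
&= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{n(-m)}\,Y_n^m(\theta,\phi),
\end{aligned} \tag{1.51}$$

leading to $f_{nm} = f_{n(-m)}$. The properties of the spherical harmonics presented in
Eqs. (1.16), (1.17) and (1.10) were employed in the derivation.
• Sifting. The sifting property for the integral of a function multiplied by a Dirac
delta function holds for functions on the sphere:

$$\int_0^{2\pi}\!\!\int_0^{\pi} f(\theta,\phi)\,\delta(\cos\theta-\cos\theta')\,\delta(\phi-\phi')\sin\theta\,d\theta\,d\phi = f(\theta',\phi'), \tag{1.52}$$

where the following equality has been used in the derivation:

$$\delta(\cos\theta-\cos\theta')\,\delta(\phi-\phi') = \frac{1}{\sin\theta}\,\delta(\theta-\theta')\,\delta(\phi-\phi'). \tag{1.53}$$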
Some of the properties of the spherical harmonics and functions defined on the
sphere are a direct result of the spherical harmonics forming a basis in Hilbert space
$L^2(S^2)$, i.e. the space of all square-integrable functions on the unit sphere. The inner
product in this space is defined as

$$\langle f, g\rangle \equiv \int_0^{2\pi}\!\!\int_0^{\pi} f(\theta,\phi)\left[g(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi. \tag{1.54}$$

Now, the spherical Fourier transform can be written in a compact form as

$$f_{nm} = \langle f, Y_n^m\rangle, \tag{1.55}$$

such that

$$f = \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\langle f, Y_n^m\rangle\,Y_n^m. \tag{1.56}$$

Coefficients $f_{nm}$ provide a complete description of functions in $L^2(S^2)$, apart from
functions with discontinuities. In this case, the representation is subject to the Gibbs
phenomenon. This has been studied intensively for the Fourier series, but also applies
for Fourier representations with other basis functions, such as the spherical harmonics
[1, 3, 12]. For functions with discontinuities, reconstruction using Fourier series, e.g.
as in Eq. (1.56), will not be identical to the original function, but the difference will
be zero in an $L^2$ sense.

1.5 Some Useful Functions

Some useful functions defined over the sphere and their spherical Fourier transform
are presented in this section.
• Constant function. A function that is constant along both $\theta$ and $\phi$ can be
represented using only the zero-order spherical harmonics, leading to the following
transform pair:

$$\begin{aligned}
f(\theta,\phi) &= 1 \\
f_{nm} &= \sqrt{4\pi}\,\delta_{n0}\,\delta_{m0},
\end{aligned} \tag{1.57}$$

which can be derived by noting that $f(\theta,\phi) = 1 = \sqrt{4\pi}\,Y_0^0(\theta,\phi)$, substituting in
Eq. (1.41), and evaluating the integral using the orthogonality property, as in Eq.
(1.23).
• Dirac delta function. The Dirac delta function over the sphere, $\delta(\cos\theta-\cos\theta')\,\delta(\phi-\phi')$,
is considered next. Substituting the Dirac delta function in Eq. (1.41)
(the spherical Fourier transform) and evaluating the integral using the sifting
property, as in Eq. (1.52), the spherical Fourier coefficients for the Dirac delta function
are found to be simply the spherical harmonics:

$$\begin{aligned}
f(\theta,\phi) &= \delta(\cos\theta-\cos\theta')\,\delta(\phi-\phi') \\
f_{nm} &= \left[Y_n^m(\theta',\phi')\right]^*.
\end{aligned} \tag{1.58}$$

• Spherical harmonics. The spherical Fourier transform, Eq. (1.41), for $f(\theta,\phi) = Y_{n'}^{m'}(\theta,\phi)$,
can be evaluated using the orthogonality property, Eq. (1.23), leading
to the following spherical Fourier transform pair:

$$\begin{aligned}
f(\theta,\phi) &= Y_{n'}^{m'}(\theta,\phi) \\
f_{nm} &= \delta_{nn'}\,\delta_{mm'}.
\end{aligned} \tag{1.59}$$

• Truncated spherical harmonics series. An infinite spherical harmonics series with
coefficients $[Y_n^m(\theta',\phi')]^*$ forms the Dirac delta function over the sphere around
$(\theta',\phi')$, as suggested by Eq. (1.58). If this summation is truncated to a finite order
$N$, the result can be reduced to a closed-form expression as follows [7]:

$$\begin{aligned}
f(\theta,\phi) &= \sum_{n=0}^{N}\sum_{m=-n}^{n}\left[Y_n^m(\theta',\phi')\right]^* Y_n^m(\theta,\phi) \\
&= \sum_{n=0}^{N}\frac{2n+1}{4\pi}\,P_n(\cos\Theta) \\
&= \frac{N+1}{4\pi(\cos\Theta-1)}\left[P_{N+1}(\cos\Theta) - P_N(\cos\Theta)\right].
\end{aligned} \tag{1.60}$$

The spherical harmonics addition theorem, Eq. (1.26), was used to derive the
second line in the equation, where $\Theta$ is the angle between $(\theta,\phi)$ and $(\theta',\phi')$,
defined in Eq. (1.27). The third line was derived using Eq. (1.39) [7], leading to the
following transform pair:

$$\begin{aligned}
f(\theta,\phi) &= \frac{N+1}{4\pi(\cos\Theta-1)}\left[P_{N+1}(\cos\Theta) - P_N(\cos\Theta)\right] \\
f_{nm} &= \begin{cases}\left[Y_n^m(\theta',\phi')\right]^*, & n \le N \\ 0, & n > N.\end{cases}
\end{aligned} \tag{1.61}$$

This function behaves in a sinc-like manner, converging to a delta function as
$N \to \infty$ (see examples in Fig. 1.12).
• Spherical cap. A useful function over the sphere is a spherical cap centered at
the north pole, defined as having unity value for $\theta \le \alpha$, and zero elsewhere. The
spherical Fourier transform of this function is derived as follows [8, 13]:

$$\begin{aligned}
f_{nm} &= \int_0^{2\pi}\!\!\int_0^{\alpha}\left[Y_n^m(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi \\
&= \int_0^{2\pi}\!\!\int_0^{\alpha}\sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\;P_n^m(\cos\theta)\,e^{-im\phi}\sin\theta\,d\theta\,d\phi \\
&= \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\;2\pi\,\delta_{m0}\int_0^{\alpha} P_n^m(\cos\theta)\sin\theta\,d\theta \\
&= \sqrt{\frac{2n+1}{4\pi}}\;2\pi\,\delta_{m0}\int_{\cos\alpha}^{1} P_n(z)\,dz.
\end{aligned} \tag{1.62}$$

For $n = 0$, $P_n(\cos\theta)$ reduces to 1, leading to $f_{00} = \sqrt{\pi}\,(1-\cos\alpha)$, while for
$n > 0$ a recurrence formula for the Legendre polynomials can be used to evaluate
the integral [13], leading to

$$\begin{aligned}
f(\theta,\phi) &= \begin{cases}1, & 0 \le \theta \le \alpha \\ 0, & \alpha < \theta \le \pi\end{cases} \\
f_{nm} &= \begin{cases}\sqrt{\pi}\,(1-\cos\alpha), & n = 0 \\ \delta_{m0}\sqrt{\dfrac{\pi}{2n+1}}\left[P_{n-1}(\cos\alpha) - P_{n+1}(\cos\alpha)\right], & n > 0.\end{cases}
\end{aligned} \tag{1.63}$$

The coefficients $f_{nm}$ have sinc-like behavior, as illustrated in Fig. 1.13, which
shows examples of the function and its spherical Fourier transform.
The spherical cap is used next to illustrate the Gibbs phenomenon. A spherical
cap function with α = 30◦ is defined, with spherical harmonic coefficients computed
using Eq. (1.63), and illustrated in Fig. 1.14 using a balloon plot, where the function
is truncated to various finite orders N . A constant value around the sphere has also
been added to the function by increasing the value of f 00 . Note that even for high
orders, the functions show some ripple in their values due to the Gibbs phenomenon.
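
The spherical cap coefficients of Eq. (1.63), namely $\sqrt{\pi}(1-\cos\alpha)$ for $n = 0$ and $\sqrt{\pi/(2n+1)}\,[P_{n-1}(\cos\alpha) - P_{n+1}(\cos\alpha)]$ for $n > 0$ and $m = 0$, are easy to evaluate. The sketch below uses hypothetical helper names (`legendre`, `cap_coefficient`, `partial_energy`) and, as an added illustration that is not part of the text, accumulates $\sum_n |f_{n0}|^2$ to show how slowly the truncated series approaches the Parseval limit $2\pi(1-\cos\alpha)$ of Eq. (1.43), which is one way to see why ripple persists at high orders.

```python
import math


def legendre(n, x):
    """P_n(x) via the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def cap_coefficient(n, m, alpha):
    """f_nm of a unit-height cap of half-angle alpha (radians) at the north pole."""
    if m != 0:
        return 0.0
    c = math.cos(alpha)
    if n == 0:
        return math.sqrt(math.pi) * (1.0 - c)
    return math.sqrt(math.pi / (2 * n + 1)) * (legendre(n - 1, c) - legendre(n + 1, c))


def partial_energy(N, alpha):
    """Sum of |f_n0|^2 up to order N; tends to 2*pi*(1 - cos(alpha)) as N grows."""
    return sum(cap_coefficient(n, 0, alpha) ** 2 for n in range(N + 1))


alpha = math.radians(30.0)
print([round(cap_coefficient(n, 0, alpha), 2) for n in range(3)])
total = 2.0 * math.pi * (1.0 - math.cos(alpha))
for N in (2, 10, 60):
    print(N, partial_energy(N, alpha) / total)  # fraction of the cap energy captured
```

For $\alpha = 30^\circ$ the first three coefficients evaluate to approximately 0.24, 0.38 and 0.43.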

1.6 Rotation of Functions

Functions defined on the unit sphere can be shifted, in a similar manner to functions
defined over the line or the unit circle. For functions defined over the line, $f(x) \in L^2(\mathbb{R})$,
and for functions defined over the circle, $f(\phi) \in L^2([0, 2\pi])$, the shift parameter
is in the same domain as the function argument, e.g. $f(x - x_0)$, $x, x_0 \in \mathbb{R}$ and
$f(\phi - \phi_0)$, $\phi, \phi_0 \in [0, 2\pi]$. However, for a function defined over the unit sphere,
$f(\theta,\phi)$, the “shift” parameter is a three-dimensional operation, not in the same
domain as $(\theta,\phi)$. For example, the function $f(\theta,\phi)$ can be rotated around the z-axis
(a one-parameter operation), and can then be further rotated by moving the point on
the sphere intersecting the z-axis (the north pole) to any other point on the sphere
(a two-parameter operation), therefore supporting a three-dimensional rotation operation.

Fig. 1.12 A truncated spherical harmonics series to order $N$ (a sinc-like function) with coefficients
$[Y_n^m(\theta',\phi')]^*$, illustrated for orders $N = 8, 20$

Rotation of a function on the sphere is typically defined using the parameter set
$(\alpha, \beta, \gamma)$, formulated using Euler angles [1]. In this case, an initial counter-clockwise
rotation of angle $\gamma$ is performed about the z-axis, followed by a counter-clockwise
rotation by angle $\beta$ about the y-axis, and concluded by a counter-clockwise rotation
of angle $\alpha$ about the z-axis. See, for example, [1] for more details on Euler angles
and rotations. First, a position on the unit sphere in Cartesian coordinates is written
in algebraic vector notation:

$$\mathbf{x} = [x, y, z]^T = [\sin\theta\cos\phi,\ \sin\theta\sin\phi,\ \cos\theta]^T. \tag{1.64}$$

Then, a rotated position using Euler angles is calculated as

$$\mathbf{x}' = R_z(\alpha)\,R_y(\beta)\,R_z(\gamma)\,\mathbf{x}, \tag{1.65}$$

where the $3 \times 3$ Euler rotation matrices are given by

$$R_z(\alpha) = \begin{bmatrix}\cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1\end{bmatrix} \tag{1.66}$$

and

$$R_y(\beta) = \begin{bmatrix}\cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta\end{bmatrix}. \tag{1.67}$$

Fig. 1.13 Spherical cap function $f(\theta,\phi)$ as a function of $\theta$ (top) and its spherical Fourier transform
$f_{nm}$, shown for $m = 0$ (bottom), for $\alpha = 15^\circ, 45^\circ$


The Euler matrices are defined in $SO(3)$, i.e. the Special Orthogonal group of $3 \times 3$
orthogonal matrices, satisfying

$$R^T R = I, \quad \det(R) = 1, \tag{1.68}$$

such that an inverse rotation is defined as

$$\begin{aligned}
\left[R_z(\alpha)\,R_y(\beta)\,R_z(\gamma)\right]^{-1} &= R_z^T(\gamma)\,R_y^T(\beta)\,R_z^T(\alpha) \\
&= R_z(-\gamma)\,R_y(-\beta)\,R_z(-\alpha).
\end{aligned} \tag{1.69}$$

Fig. 1.14 Balloon plot of a spherical cap function, as defined in Eq. (1.63), for $\alpha = 30^\circ$,
reconstructed using spherical harmonic coefficients truncated to various orders $N$, as indicated
in the figure. The value of $f_{00}$ has been increased to add a constant term to the cap function for
clarity of visualization

The rotation matrices introduced in Eqs. (1.66) and (1.67) operate on position vectors
in Cartesian coordinates and so, in this section, functions on the unit sphere are
presented in a similar manner, i.e. $f(\mathbf{x})$, $\mathbf{x} \in S^2$ (see Sect. 1.1). The rotation operation
is denoted by $\Lambda$ and is written as

$$\Lambda(\alpha,\beta,\gamma)\,f(\mathbf{x}) = f\!\left(\left[R_z(\alpha)\,R_y(\beta)\,R_z(\gamma)\right]^{-1}\mathbf{x}\right), \tag{1.70}$$

where the left-hand side denotes rotation of the function values, while keeping the
coordinate system fixed; this is equivalent to keeping the function values fixed, while
rotating the coordinate system with an inverse rotation, as represented by the right-hand
side. Now, a series of $L$ rotations $R_1, R_2, \ldots, R_L$ is described by the product
of the rotation matrices:

$$R = R_L \cdots R_2 R_1, \tag{1.71}$$

with matrix $R$ denoting the overall rotation operation.
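
The Euler rotation of Eqs. (1.65) to (1.69) can be sketched with plain nested lists. The helper names (`rot_z`, `rot_y`, `matmul`, `matvec`, `transpose`, `euler`) are hypothetical and the angles are arbitrary; the sketch checks the orthogonality condition of Eq. (1.68), the inverse of Eq. (1.69), and that the north pole $[0, 0, 1]^T$ is moved to the direction with elevation $\beta$ and azimuth $\alpha$.

```python
import math


def rot_z(a):
    """R_z(a) of Eq. (1.66)."""
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]


def rot_y(b):
    """R_y(b) of Eq. (1.67)."""
    return [[math.cos(b), 0.0, math.sin(b)],
            [0.0, 1.0, 0.0],
            [-math.sin(b), 0.0, math.cos(b)]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def euler(alpha, beta, gamma):
    """R_z(alpha) R_y(beta) R_z(gamma), as used in Eq. (1.65)."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))


alpha, beta, gamma = 0.4, 1.1, -0.7
R = euler(alpha, beta, gamma)
R_inv = euler(-gamma, -beta, -alpha)          # Eq. (1.69)
print(matmul(transpose(R), R))                # close to the identity matrix
print(matmul(R_inv, R))                       # close to the identity matrix
print(matvec(R, [0.0, 0.0, 1.0]))             # rotated north pole
print([math.sin(beta) * math.cos(alpha),      # direction (theta, phi) = (beta, alpha)
       math.sin(beta) * math.sin(alpha),
       math.cos(beta)])
```

Note that the rotation by $\gamma$ about the z-axis leaves the north pole unchanged, so the rotated pole depends only on $\beta$ and $\alpha$.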


The rotation of the spherical harmonic functions $Y_n^m(\theta,\phi)$ for a given $n$ and $m$
produces a function on the sphere that can be represented by a weighted sum of
spherical harmonics of the same order $n$ and a range of degrees, as follows [5]:

$$\Lambda(\alpha,\beta,\gamma)\,Y_n^m(\theta,\phi) = \sum_{m'=-n}^{n} D_{m'm}^{n}(\alpha,\beta,\gamma)\,Y_n^{m'}(\theta,\phi), \tag{1.72}$$

where $D_{m'm}^{n}(\alpha,\beta,\gamma)$ is the Wigner-D function [11] [see Sect. 4.3, Eq. (1)]:

$$D_{m'm}^{n}(\alpha,\beta,\gamma) = e^{-im'\alpha}\,d_{m'm}^{n}(\beta)\,e^{-im\gamma}, \tag{1.73}$$

and $d_{m'm}^{n}$ is the Wigner-d function, which is real and can be written in terms of the
Jacobi polynomial [5, 11]:

$$d_{m'm}^{n}(\beta) = \zeta_{m'm}\sqrt{\frac{s!\,(s+\mu+\nu)!}{(s+\mu)!\,(s+\nu)!}}\;\sin(\beta/2)^{\mu}\cos(\beta/2)^{\nu}\,P_s^{(\mu,\nu)}(\cos\beta), \tag{1.74}$$

with $\mu = |m'-m|$, $\nu = |m'+m|$, $s = n - (\mu+\nu)/2$, and $\zeta_{m'm}$ given by

$$\zeta_{m'm} = \begin{cases}1, & m \ge m' \\ (-1)^{m-m'}, & m < m'.\end{cases} \tag{1.75}$$

The Wigner-D functions form a basis for the rotational Fourier transform, applied
to functions defined over the rotation group $SO(3)$ [5, 11].
Equation (1.72) is useful in formulating rotations in the spherical harmonics
domain:

$$\begin{aligned}
g(\theta,\phi) &= \Lambda(\alpha,\beta,\gamma)\,f(\theta,\phi) \\
&= \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\,\Lambda(\alpha,\beta,\gamma)\,Y_n^m(\theta,\phi) \\
&= \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\sum_{m'=-n}^{n} f_{nm}\,D_{m'm}^{n}(\alpha,\beta,\gamma)\,Y_n^{m'}(\theta,\phi) \\
&= \sum_{n=0}^{\infty}\sum_{m'=-n}^{n}\left[\sum_{m=-n}^{n} f_{nm}\,D_{m'm}^{n}(\alpha,\beta,\gamma)\right]Y_n^{m'}(\theta,\phi).
\end{aligned} \tag{1.76}$$

Using the final line in Eq. (1.76) and Eq. (1.40), rotation in the spherical harmonics
domain can now be written as

$$g_{nm'} = \sum_{m=-n}^{n} f_{nm}\,D_{m'm}^{n}(\alpha,\beta,\gamma), \tag{1.77}$$

such that the Fourier coefficients of the rotated function are formulated as a weighted
sum of the Fourier coefficients of the original function. For order-limited functions,
Eq. (1.77) can be written in a matrix form:

$$\mathbf{g}_{nm} = \mathbf{D}\,\mathbf{f}_{nm}, \tag{1.78}$$

with

$$\begin{aligned}
\mathbf{g}_{nm} &= \left[g_{00},\ g_{1(-1)},\ g_{10},\ g_{11},\ \ldots,\ g_{NN}\right]^T \\
\mathbf{f}_{nm} &= \left[f_{00},\ f_{1(-1)},\ f_{10},\ f_{11},\ \ldots,\ f_{NN}\right]^T,
\end{aligned} \tag{1.79}$$

and $\mathbf{D}$ is an $(N+1)^2 \times (N+1)^2$ block diagonal matrix, having block elements of
$\mathbf{D}^0, \mathbf{D}^1, \ldots, \mathbf{D}^N$. Matrices $\mathbf{D}^n$ are of dimension $(2n+1) \times (2n+1)$ with elements
$D_{m'm}^{n}(\alpha,\beta,\gamma)$. For example, $\mathbf{D}^0 = D_{00}^{0}$,

$$\mathbf{D}^1 = \begin{bmatrix} D_{(-1)(-1)}^{1} & D_{(-1)0}^{1} & D_{(-1)1}^{1} \\ D_{0(-1)}^{1} & D_{00}^{1} & D_{01}^{1} \\ D_{1(-1)}^{1} & D_{10}^{1} & D_{11}^{1} \end{bmatrix} \tag{1.80}$$

and so on.

Fig. 1.15 Balloon plot of a spherical cap function defined in Eq. (1.63) for $\alpha = 30^\circ$,
reconstructed using spherical harmonic coefficients truncated to order $N = 2$, marked as
“Original” in the figure. The function is then rotated using various rotation operations
denoted by $\Lambda(\alpha,\beta,\gamma)$ in the figure. The plots are viewed from the direction of the
y-axis, which can be inferred from the Cartesian coordinate system showing the orientation
of the balloon plots

The rotation of a function defined over the unit sphere is presented next. Consider
the spherical cap function, defined in Eq. (1.63), but with spherical harmonic coefficients
truncated to an order of $N = 2$, i.e. all coefficients above $n = 2$ are set to zero.
Figure 1.15 illustrates the function using a balloon plot, marked as “Original” in the
figure. The function is then rotated by multiplying its spherical Fourier coefficient
vector with the appropriate Wigner-D rotation matrix, as defined in Eq. (1.78). In the
figure, balloon plots of the rotated function are illustrated for different rotations. In
this example, the spherical harmonic coefficients vector of the original function is
given by

$$\mathbf{f}_{nm} = [(0.24),\ (0, 0.38, 0),\ (0, 0, 0.43, 0, 0)]^T, \tag{1.81}$$

with round brackets artificially separating coefficients with the same order. The elements
in $\mathbf{f}_{nm}$ are non-zero only for $m = 0$, as expected, because the function is
constant along $\phi$ [see Eq. (1.47)]. However, when rotated, the operation of multiplication
with the Wigner-D rotation matrix results in a vector $\mathbf{f}_{nm}$, which is no longer
non-zero only at $m = 0$, and the function is no longer constant along $\phi$. In this example,
the original function after rotation by $\Lambda(0, 45^\circ, 0)$ has the following vector of
the spherical harmonic coefficients:

$$\mathbf{f}_{nm} = [(0.24),\ (0.19, 0.27, -0.19),\ (0.13, 0.26, 0.11, -0.26, 0.13)]^T. \tag{1.82}$$
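
The rotated coefficient vector above can be reproduced with a short sketch. Because the cap is constant along $\phi$, only the $m = 0$ column of each Wigner-D block contributes. The sketch relies on an identity that is not stated in the text and is introduced here as an assumption: $d_{m'0}^{n}(\beta) = \sqrt{\frac{(n-m')!}{(n+m')!}}\,P_n^{m'}(\cos\beta)$, with the $(-1)^{m'}$ phase included in $P_n^{m'}$, so that the rotated coefficients are $f_{n0}\,e^{-im'\alpha}\,d_{m'0}^{n}(\beta)$. The helper names are hypothetical, and a general rotation would need the full Wigner-d function of Eq. (1.74).

```python
import cmath
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for -n <= m <= n, with the (-1)^m phase of Eq. (1.30)."""
    if m < 0:                              # Eq. (1.31)
        m = -m
        scale = (-1) ** m * math.factorial(n - m) / math.factorial(n + m)
        return scale * assoc_legendre(n, m, x)
    s = math.sqrt(max(0.0, 1.0 - x * x))
    p_prev = 1.0
    for k in range(1, m + 1):
        p_prev *= -(2 * k - 1) * s
    if n == m:
        return p_prev
    p_curr = (2 * m + 1) * x * p_prev
    for l in range(m + 2, n + 1):
        p_prev, p_curr = p_curr, ((2 * l - 1) * x * p_curr - (l + m - 1) * p_prev) / (l - m)
    return p_curr


def cap_coefficient(n, cap_angle):
    """m = 0 coefficient of a spherical cap of half-angle cap_angle, Eq. (1.63)."""
    c = math.cos(cap_angle)
    if n == 0:
        return math.sqrt(math.pi) * (1.0 - c)
    return math.sqrt(math.pi / (2 * n + 1)) * (
        assoc_legendre(n - 1, 0, c) - assoc_legendre(n + 1, 0, c))


def rotate_axisymmetric(f_n0, n, alpha, beta):
    """Order-n coefficients (m' = -n..n) after rotation, for input non-zero only at m = 0."""
    rotated = []
    for mp in range(-n, n + 1):
        d = math.sqrt(math.factorial(n - mp) / math.factorial(n + mp)) \
            * assoc_legendre(n, mp, math.cos(beta))      # assumed form of d^n_{m'0}(beta)
        rotated.append(f_n0 * cmath.exp(-1j * mp * alpha) * d)
    return rotated


cap_angle = math.radians(30.0)
alpha, beta = 0.0, math.radians(45.0)
for n in range(3):
    block = rotate_axisymmetric(cap_coefficient(n, cap_angle), n, alpha, beta)
    print(n, [round(c.real, 2) for c in block])
```

With $\alpha = 0$ the coefficients are real, and rounding to two decimals is intended to match the values quoted in Eq. (1.82).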

1.7 Spherical Convolution and Correlation

Convolution and correlation are widely used in signal processing to describe the
operation of linear systems and to investigate similarity between signals. Convolu-
tion and correlation can also be defined for functions on the unit sphere. Spherical
convolution, for example, has been previously applied to describe the sound pres-
sure measured on a spherical surface [7], while spherical correlation has been used
to describe spatial filtering on the sphere [9].
The operations of convolution and correlation of functions defined over the unit
sphere are presented in this section. The operation of convolution of two functions
defined over the line or the circle is typically formulated as an integral over one
function multiplied by a reversed and shifted version of the other function. Similarly,
convolution over the sphere can be described as the result of integrating the product
of one function with a rotated version of another function. However, since rotation is
a three-parameter operation, it involves a triple integral. The operation of convolving
function f (θ, φ) with function g(θ, φ) to produce y(θ, φ) is formulated as follows.
First, a compact notation is introduced for a double integral over the sphere and a
triple integral over the rotation angles:

$$\int_{S^2} d\mu \equiv \int_0^{2\pi}\!\!\int_0^{\pi}\sin\theta\,d\theta\,d\phi, \tag{1.83}$$

such that $\mu \equiv \mu(\theta,\phi) \in S^2$, and

$$\int_{SO(3)} d\xi \equiv \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^{2\pi} d\alpha\,\sin\beta\,d\beta\,d\gamma, \tag{1.84}$$

such that $\xi \equiv \xi(\alpha,\beta,\gamma) \in SO(3)$. Using this notation, and denoting the functions
defined over the unit sphere $f(\mu)$, $g(\mu)$, and $y(\mu)$ as in Sect. 1.1, the convolution is
now defined as [2]

$$\begin{aligned}
y(\mu) &= f(\mu) * g(\mu) \\
&= \int_{SO(3)} f\big(R(\xi)\eta\big)\,\Lambda(\xi)\,g(\mu)\,d\xi \\
&= \int_{SO(3)} f\big(R(\xi)\eta\big)\,g\big(R^{-1}(\xi)\mu\big)\,d\xi,
\end{aligned} \tag{1.85}$$

where, in this notation, $R(\xi) \equiv R_z(\alpha)R_y(\beta)R_z(\gamma)$ represents a rotation by
$\xi(\alpha,\beta,\gamma)$, with $\eta$ representing the north pole, i.e. $\eta = [0, 0, 1]^T$ in Cartesian
coordinates. Rotation of $\eta$ by $\xi$ involves an initial rotation by $\gamma$ about the z-axis, which does
not affect $\eta$, followed by a rotation by $\beta$ about the y-axis, shifting $\eta$ to $(\beta, 0)$ in angles
of spherical coordinates, followed by a final rotation by $\alpha$ about the z-axis, which
shifts $(\beta, 0)$ to $(\beta, \alpha)$. $f(R(\xi)\eta)$ is therefore simply $f(\beta, \alpha)$. Similarly, $R^{-1}(\xi)\mu$
represents an inverse rotation of $\mu$ by $\xi$.
Similar to the Fourier transform over the line, spherical convolution transforms
to multiplication in the spherical harmonics domain, such that [2]

$$y_{nm} = 2\pi\sqrt{\frac{4\pi}{2n+1}}\;f_{nm}\,g_{n0}. \tag{1.86}$$

Note that gnm is evaluated only at m = 0. This is a result of the fact that f (β, α)
is not dependent on γ , and so the integral over γ defined within the integral over ξ
operates only on the rotated function g, averaging its value along the azimuth. The
coefficients gn0 evaluated only for m = 0 (gn0 δm0 ) represent a function that varies
only with elevation, while the coefficients gn0 evaluated for all m (gn0 ∀m) represent
a function with symmetry along φ that satisfies f (θ, φ) = f (θ, π − φ), because
this is a special case of the symmetry property presented in Eq. (1.51). A detailed
derivation of Eq. (1.86) can be found in [2].
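
The coefficient-domain form of spherical convolution can be sketched in a few lines. The sketch assumes the relation $y_{nm} = 2\pi\sqrt{4\pi/(2n+1)}\,f_{nm}\,g_{n0}$, as given by Driscoll and Healy [2]; `convolve_sh` is a hypothetical helper, coefficients are stored in dictionaries keyed by $(n, m)$, and the numerical values of `f` are made up for illustration. The kernel used in the example is a Dirac delta at the north pole truncated to order 2, whose $m = 0$ coefficients are $\sqrt{(2n+1)/(4\pi)}$ by Eq. (1.58), so the output is expected to be $2\pi f_{nm}$.

```python
import math


def convolve_sh(f_nm, g_nm):
    """Spherical convolution in the coefficient domain (assumed form of Eq. (1.86)).

    f_nm and g_nm map (n, m) tuples to coefficients; only g at m = 0 is used.
    """
    return {
        (n, m): 2.0 * math.pi * math.sqrt(4.0 * math.pi / (2 * n + 1)) * c * g_nm.get((n, 0), 0.0)
        for (n, m), c in f_nm.items()
    }


# Illustrative (made-up) coefficients of an order-2 function.
f = {(0, 0): 1.0, (1, -1): 0.5j, (1, 0): -0.2, (1, 1): 0.5j, (2, 0): 0.1}

# Truncated Dirac delta at the north pole: g_n0 = sqrt((2n+1)/(4*pi)).
delta_pole = {(n, 0): math.sqrt((2 * n + 1) / (4.0 * math.pi)) for n in range(3)}

y = convolve_sh(f, delta_pole)
for key in sorted(y):
    print(key, y[key], 2.0 * math.pi * f[key])  # the last two columns should agree
```

The factor $2\pi$ that survives in this example reflects the integration over $\gamma$, which acts only on the rotated kernel.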
The correlation between two functions is a measure of the similarity of the two
functions. It is typically formulated as the integral of the product of the two functions,
with one of the functions shifted, or rotated in the case of functions on a sphere.
Therefore, the correlation between $f(\mu)$ and $g(\mu)$, denoted by $c(\xi)$, is defined as
[5]

$$c(\xi) = \int_{S^2} f(\mu)\left[\Lambda(\xi)\,g(\mu)\right]^* d\mu. \tag{1.87}$$

Note that the result of the correlation operation, $c(\xi)$, is a function of the three
parameters of the rotation $\xi$. Using the spherical harmonics representation for $f$ and
$g$, as in Eq. (1.40), and substituting in Eq. (1.87), $c(\xi)$ can be written in terms of $f_{nm}$
and $g_{nm}$ as [5]

$$c(\xi) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\sum_{m'=-n}^{n} f_{nm}\,g_{nm'}^*\left[D_{mm'}^{n}(\xi)\right]^*. \tag{1.88}$$

Equation (1.88) may be more useful than Eq. (1.87) as it involves summations rather
than integrals, which could be particularly useful if the functions are order-limited.

References

1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic, San Diego
(2001)
2. Driscoll, J.R., Healy Jr., D.M.: Computing Fourier transforms and convolutions on the 2-sphere.
Adv. Appl. Math. 15(2), 202–250 (1994)
3. Gelb, A.: The resolution of the Gibbs phenomenon for spherical harmonics. Math. Comput.
66(218), 699–717 (1997)
4. Jespen, D.W., Haugh, E.F., Hirschfelder, J.O.: The integral of the associated Legendre function.
Technical report, University of Wisconsin, Naval Research Laboratory (1955)
5. Kostelec, P.J., Rockmore, D.N.: FFTs on the rotation group. J. Fourier Anal. Appl. 14, 145–179
(2008)
6. Legendre polynomials (2014). http://functions.wolfram.com/Polynomials/LegendreP/
7. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution.
J. Acoust. Soc. Am. 116(4), 2149–2157 (2004)
8. Rafaely, B.: Spherical loudspeaker array for local active control of sound. J. Acoust. Soc. Am.
125(5), 3006–3017 (2009)
9. Rafaely, B., Weiss, B., Bachmat, E.: Spatial aliasing in spherical microphone arrays. IEEE
Trans. Sig. Process. 55(3), 1003–1010 (2007)
10. Sansone, G.: Orthogonal Functions. Interscience Publishers, New York (1959)
11. Varshalovich, D.A., Moskalev, A.N., Khersonskii, V.K.: Quantum Theory of Angular Momentum,
1st edn. World Scientific Publishing, Singapore (1988)
12. Weyl, H.: Die Gibbssche erscheinung in der theorie der kugelfunktionen. Rend. Circ. Mat.
Palermo 29(1), 308–323 (1910)
13. Williams, E.G.: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography.
Academic, New York (1999)
Chapter 2
Acoustical Background

Abstract The mathematical background for functions defined on the unit sphere
was presented in Chap. 1. Spherical harmonics played an important role in presenting
and manipulating these functions. In this chapter, functions on the sphere are defined
through the formulations of fields in three dimensions. Although sound fields are
of primary concern in this book, which is oriented towards microphone arrays, the
material presented here can be applied to scalar fields in general. This chapter begins
by presenting the acoustic wave equation in Cartesian and spherical coordinates,
with possible solutions. Solutions to the wave equation in spherical coordinates are
shown to involve spherical harmonics and spherical Bessel and Hankel functions.
Having formulated the fundamental solutions, sound fields due to a plane wave and
a point source are presented, including an analysis of the effect of a rigid sphere
introduced into the sound field. The latter is useful for describing the sound field
around a microphone array configured over a rigid sphere, for example. The chapter
concludes with a formulation of the three-dimensional translation of sound fields.

2.1 The Acoustic Wave Equation

Sound pressure in free three-dimensional space, denoted by $p(\mathbf{x}, t)$, and measured
in Pascals, with $\mathbf{x} = (x, y, z) \in \mathbb{R}^3$ measured in meters, and $t$ representing time in
seconds, satisfies the homogeneous acoustic wave equation [6]:

$$\nabla_x^2\,p(\mathbf{x}, t) - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\,p(\mathbf{x}, t) = 0, \tag{2.1}$$

with $c$ denoting the speed of sound in air, typically 343 m/s under normal ambient
conditions, and $\nabla_x^2$ denoting the Laplacian in Cartesian coordinates, defined for a
function $f(x, y, z)$ as

$$\nabla_x^2 f \equiv \frac{\partial^2}{\partial x^2}f + \frac{\partial^2}{\partial y^2}f + \frac{\partial^2}{\partial z^2}f. \tag{2.2}$$

For a single-frequency sound field, the sound pressure can be expressed as

$$p(\mathbf{x}, t) = p(\mathbf{x})\,e^{i\omega t}, \tag{2.3}$$

where ω is the radial frequency in radians per second. In this representation, p(x)
can be regarded as the space-dependent amplitude of the sound pressure at frequency
ω. With k = ω/c denoting the wave number in radians per meter, the dependence of
the pressure amplitude on ω or on k can be explicitly described using the notation
p(k, x). Substituting Eq. (2.3) into Eq. (2.1), the wave equation transforms into the
Helmholtz equation (with time-dependence omitted):

∇x2 p(k, x) + k 2 p(k, x) = 0. (2.4)

The notation p(k, x) can also be used to represent broadband or multiple-frequency


sound fields in steady state, in which case p(k, x) is the Fourier transform of the
sound pressure at frequency ω = kc. Note that p is complex, representing the com-
plex amplitude of the sound pressure. The actual sound pressure, such as would be
measured by a microphone, for example, is given by the real part of p.
A solution to the wave equation can be derived by using separation of variables.
The most commonly used solution is the plane wave, given by

$$ p(\mathbf{x}, t) = A e^{-i\mathbf{k}\cdot\mathbf{x}} e^{i\omega t}, \tag{2.5} $$

where A is the amplitude, $\mathbf{k} \equiv (k_x, k_y, k_z)$ represents the wave vector, denoting the
propagation direction of the plane wave, and $\mathbf{k}\cdot\mathbf{x} = k_x x + k_y y + k_z z$ represents the
dot product of vectors $\mathbf{k}$ and $\mathbf{x}$. Plane wave sound fields can be described directly using
this solution. Representation of other sound fields is also possible using the Fourier
transform, with $e^{-i\mathbf{k}\cdot\mathbf{x}}$ providing the basis function for describing the spatial variation
of the sound pressure amplitude. In some cases it may be more useful to denote the
direction of arrival of the plane wave, rather than the direction of propagation. For
this purpose, wave vector $\tilde{\mathbf{k}} = -\mathbf{k}$ is introduced, denoting the arrival direction, and
will be used later in this chapter. In this case, the sound pressure is given by

$$ p(\mathbf{x}, t) = A e^{i\tilde{\mathbf{k}}\cdot\mathbf{x}} e^{i\omega t}. \tag{2.6} $$
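
That the plane wave of Eq. (2.5) satisfies the Helmholtz equation (2.4) can be checked numerically. The following is a minimal sketch using a central-difference estimate of the Laplacian; the wave vector, evaluation point, step size and function names are illustrative assumptions and do not come from the text.

```python
import cmath

def plane_wave(k_vec, x, amplitude=1.0):
    """Spatial part of Eq. (2.5): A * exp(-i k.x)."""
    phase = sum(kc * xc for kc, xc in zip(k_vec, x))
    return amplitude * cmath.exp(-1j * phase)

def laplacian(f, x, h=1e-3):
    """Central-difference estimate of the Cartesian Laplacian of Eq. (2.2)."""
    total = 0.0
    for axis in range(3):
        x_plus, x_minus = list(x), list(x)
        x_plus[axis] += h
        x_minus[axis] -= h
        total += (f(x_plus) - 2.0 * f(x) + f(x_minus)) / h**2
    return total

k_vec = (1.2, -0.7, 2.0)                 # assumed wave vector, rad/m
k = sum(c * c for c in k_vec) ** 0.5     # wave number k = |k|
x0 = (0.3, 0.1, -0.4)                    # assumed evaluation point, m

def p(x):
    return plane_wave(k_vec, x)

# Left-hand side of Eq. (2.4); it should vanish up to discretization error.
residual = laplacian(p, x0) + k**2 * p(x0)
print(abs(residual))
```

The residual is not exactly zero because the finite-difference Laplacian carries an error of order h squared; reducing h (within floating-point limits) shrinks it further.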

In this book, sound fields are measured by spherical microphone arrays, so that it
is preferable to represent the position vector in spherical coordinates, r = (r, θ, φ).
The wave equation is now rewritten in spherical coordinates, for which the Laplacian
in spherical coordinates is first defined for a function f (r, θ, φ):
   
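$$ \nabla_r^2 f \equiv \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial}{\partial r} f \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial}{\partial \theta} f \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2}{\partial \phi^2} f. \tag{2.7} $$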

Equation (2.7) can be derived from the Laplacian in Cartesian coordinates [Eq. (2.2)]
using Eq. (1.4) and the chain rule. The wave equation in spherical coordinates can
now be written as

$$ \nabla_r^2 p(\mathbf{r}, t) - \frac{1}{c^2} \frac{\partial^2}{\partial t^2} p(\mathbf{r}, t) = 0, \tag{2.8} $$

where p(r, t) is the sound pressure as a function of time and space in spherical
coordinates. For single frequency, i.e. harmonic, sound fields, the Helmholtz equation
can also be written in spherical coordinates as

$$ \nabla_r^2 p(k, \mathbf{r}) + k^2 p(k, \mathbf{r}) = 0, \tag{2.9} $$

where $p(k, \mathbf{r})$ is the amplitude of the sound pressure over space and the dependence
on k is explicitly expressed. The time-dependent pressure is related to this amplitude in a
way similar to Eq. (2.3), as

$$ p(\mathbf{r}, t) = p(\mathbf{r}) e^{i\omega t}. \tag{2.10} $$

A solution to the wave equation (2.8) can be obtained using separation of variables:

$$ p(\mathbf{r}, t) = R(r)\,\Theta(\theta)\,\Phi(\phi)\,T(t). \tag{2.11} $$

Substituting Eq. (2.11) into the wave equation (2.8), the single equation in p
decomposes into four ordinary differential equations, one for each of the separate
functions Θ, Φ, R and T. The equation representing dependence on time is a
second-order differential equation:

$$ \frac{d^2 T}{dt^2} + \omega^2 T = 0, \tag{2.12} $$

with a fundamental solution

$$ T(t) = e^{i\omega t}, \quad \omega \in \mathbb{R}, \tag{2.13} $$

as also implied by Eq. (2.10). Substituting Eq. (2.11) in the Helmholtz equation (2.9)
and multiplying by $r^2 \sin^2\theta / p$, the term depending on φ can be isolated, satisfying

$$ \frac{d^2 \Phi}{d\phi^2} + m^2 \Phi = 0, \tag{2.14} $$

with a fundamental exponential solution

$$ \Phi(\phi) = e^{im\phi}, \quad m \in \mathbb{Z}, \tag{2.15} $$

where m is an integer, because of the periodicity of Φ as a function defined over
the unit circle, with φ ∈ [0, 2π). Substituting Eq. (2.15) back into the Helmholtz
equation, a term dependent only on θ can be isolated, satisfying

$$ \frac{d}{d\mu} \left[ (1 - \mu^2) \frac{d}{d\mu} \Theta \right] + \left[ n(n+1) - \frac{m^2}{1 - \mu^2} \right] \Theta = 0, \tag{2.16} $$

with μ = cos θ. This equation is known as the associated Legendre differential equation
and has two types of solutions, one singular at μ = 1 and a second solution that
is typically selected and is referred to as the associated Legendre function of the first
kind:

$$ \Theta(\theta) = P_n^m(\cos\theta), \quad n \in \mathbb{N}, \ m \in \mathbb{Z}. \tag{2.17} $$

Substituting Eq. (2.16) into the Helmholtz equation and applying some further
manipulations, a term dependent only on r can be isolated, satisfying

$$ r^2 \frac{d^2}{dr^2} R + 2r \frac{d}{dr} R + \left[ (kr)^2 - n(n+1) \right] R = 0. \tag{2.18} $$

This equation is known as the spherical Bessel equation and its solution comprises
spherical Bessel functions of the first kind, $j_n(kr)$, or spherical Hankel functions of
the first kind, $h_n(kr)$, or both (see Sect. 2.2).
Combining the solutions over r, θ, φ, and t, a fundamental solution for the wave
equation in spherical coordinates can be written in the form

$$ p(\mathbf{r}, t) = j_n(kr) Y_n^m(\theta, \phi) e^{i\omega t} \tag{2.19} $$

or

$$ p(\mathbf{r}, t) = h_n(kr) Y_n^m(\theta, \phi) e^{i\omega t}, \tag{2.20} $$

or as a combination of these solutions for various values of n and m. Specific solutions


in the case of a plane-wave sound field and a sound field produced by a point source
are presented later in this chapter.

2.2 Spherical Bessel and Hankel Functions

Solutions to the wave equation in spherical coordinates include spherical Bessel and
Hankel functions. These functions are presented in this section. The spherical Bessel
function of the first kind, $j_n(x)$, and of the second kind, $y_n(x)$, can be written using
Rayleigh formulas as [1]

$$ j_n(x) = (-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^n \frac{\sin(x)}{x} \tag{2.21} $$

and

$$ y_n(x) = -(-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^n \frac{\cos(x)}{x}, \tag{2.22} $$

with the spherical Hankel functions of the first kind, $h_n(x)$, and the second kind,
$h_n^{(2)}(x)$, written as

$$ h_n(x) = -i(-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^n \frac{e^{ix}}{x} \tag{2.23} $$

and

$$ h_n^{(2)}(x) = i(-1)^n x^n \left( \frac{1}{x} \frac{d}{dx} \right)^n \frac{e^{-ix}}{x}, \tag{2.24} $$

with the relations

$$ h_n(x) = j_n(x) + i y_n(x) \tag{2.25} $$

and

$$ h_n^{(2)}(x) = j_n(x) - i y_n(x). \tag{2.26} $$

Because the spherical Bessel functions are real, $j_n(x)$ and $y_n(x)$ compose the real
and imaginary parts of $h_n(x)$, i.e.

$$ j_n(x) = \mathrm{Re}\{h_n(x)\} \tag{2.27} $$

and

$$ y_n(x) = \mathrm{Im}\{h_n(x)\}. \tag{2.28} $$

The spherical Bessel and Hankel functions are also related to the Bessel function,
$J_\alpha(x)$, and the Hankel function, $H_\alpha(x)$, through

$$ j_n(x) = \sqrt{\frac{\pi}{2x}} J_{n+\frac{1}{2}}(x) \tag{2.29} $$

and

$$ h_n(x) = \sqrt{\frac{\pi}{2x}} H_{n+\frac{1}{2}}(x). \tag{2.30} $$

Solutions to the wave equation can be represented as a linear combination of spherical


Bessel functions of the first and second kinds, or of spherical Bessel and Hankel
functions. The latter representation is more common and will be employed in this
book.
Tables 2.1 and 2.2 present expressions for the spherical Bessel function and the
spherical Hankel function of the first kind, respectively, for the first few orders.
Figures 2.1 and 2.2 illustrate | jn (x)| for the first few orders.
Figure 2.2 shows that at low values of x, jn (x) has a steeper slope for larger orders.
Indeed, $j_n(x)$ for $x \ll 1$ can be approximated by [1]

$$ j_n(x) \approx \frac{x^n}{(2n+1)!!}, \quad x \ll 1, \tag{2.31} $$

where (·)!! is the double factorial function, i.e. (2n + 1)!! = (2n + 1)(2n − 1) · · · 1.
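
The small-argument behavior of Eq. (2.31) can be checked numerically. The sketch below is an illustration only: it evaluates $j_n(x)$ from the standard ascending power series (a formula not given in this chapter, assumed here to be adequate for x up to about 10), and the helper names `spherical_jn` and `double_factorial` are hypothetical.

```python
import math

def spherical_jn(n, x, terms=80):
    """j_n(x) from its ascending power series:
    x^n * sum_k (-x^2/2)^k / (k! * (2n+2k+1)!!)."""
    term = 1.0
    for j in range(1, n + 1):
        term *= x / (2 * j + 1)            # leading factor x^n / (2n+1)!!
    total = term
    for k in range(1, terms):
        term *= -x * x / (2.0 * k * (2 * n + 2 * k + 1))
        total += term
    return total

def double_factorial(m):
    """m!! = m (m - 2) (m - 4) ... down to 1 or 2."""
    return math.prod(range(m, 0, -2))

x = 0.1
relative_errors = []
for n in range(5):
    exact = spherical_jn(n, x)
    approx = x**n / double_factorial(2 * n + 1)    # Eq. (2.31)
    relative_errors.append(abs(approx - exact) / abs(exact))
    print(f"n = {n}: j_n = {exact:.6e}, approximation = {approx:.6e}")
```

For x = 0.1 the approximation agrees with the series to within a fraction of a percent for every order shown, and the steeper decay with increasing n is visible directly in the printed values.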

Table 2.1 Spherical Bessel functions of the first kind for n = 0, . . . , 3

$j_0(x) = \frac{\sin x}{x}$
$j_1(x) = -\frac{\cos x}{x} + \frac{\sin x}{x^2}$
$j_2(x) = -\frac{\sin x}{x} - \frac{3\cos x}{x^2} + \frac{3\sin x}{x^3}$
$j_3(x) = \frac{\cos x}{x} - \frac{6\sin x}{x^2} - \frac{15\cos x}{x^3} + \frac{15\sin x}{x^4}$

Table 2.2 Spherical Hankel functions of the first kind for n = 0, . . . , 3

$h_0(x) = \frac{e^{ix}}{ix}$
$h_1(x) = -\frac{e^{ix}(i + x)}{x^2}$
$h_2(x) = \frac{i e^{ix}(-3 + 3ix + x^2)}{x^3}$
$h_3(x) = \frac{e^{ix}(-15i - 15x + 6ix^2 + x^3)}{x^4}$

Fig. 2.1 Magnitude of the spherical Bessel function, $|j_n(x)|$, for n = 0, . . . , 6

Figure 2.1 shows that at large values of x, the amplitude of $j_n(x)$ decays in a similar
manner for all n. Indeed, for $x \gg n$ [or more specifically, for $x \gg n(n+1)/2$], $j_n(x)$,
as expressed in Table 2.1, is dominated by the first term, decays as 1/x, and can be
approximated by [1]

$$ j_n(x) \approx \frac{1}{x} \sin(x - n\pi/2), \quad x \gg n(n+1)/2. \tag{2.32} $$

Fig. 2.2 Magnitude of the spherical Bessel function, $|j_n(x)|$, for n = 0, . . . , 6 and for x < 1

Fig. 2.3 Magnitude of the spherical Hankel function, $|h_n(x)|$, for n = 0, . . . , 6

Figure 2.1 also shows that the spherical Bessel function has zeros. The zeros of j0 (x)
are at ±lπ, l = 1, 2, . . . ∞; for higher orders, the first zeros are positioned at x > π ,
but tend to appear at a spacing of π for large x, as suggested by Eq. (2.32).
Figure 2.3 presents $|h_n(x)|$, illustrating that the spherical Hankel functions, unlike
the spherical Bessel functions, diverge towards the origin. Furthermore, Fig. 2.4
illustrates that for $x \ll 1$, higher orders increase towards the origin with a larger
slope. This is supported by the small argument approximation of the spherical Hankel
function:

$$ h_n(x) \approx -i \frac{(2n-1)!!}{x^{n+1}}, \quad x \ll 1. \tag{2.33} $$

Fig. 2.4 Magnitude of the spherical Hankel function, $|h_n(x)|$, for n = 0, . . . , 6 and x < 1

On the other hand, for large values of x, h n (x) decays similarly for all n, which is
supported by the large argument approximation:

$$ h_n(x) \approx (-i)^{n+1} \frac{e^{ix}}{x}, \quad x \gg n(n+1)/2. \tag{2.34} $$

The spherical Bessel function also satisfies recurrence equations:

$$ \frac{2n+1}{x} j_n(x) = j_{n-1}(x) + j_{n+1}(x) \tag{2.35} $$

and

$$ (2n+1) j_n'(x) = n j_{n-1}(x) - (n+1) j_{n+1}(x), \tag{2.36} $$

with $j_n'(x)$ denoting the first derivative of $j_n(x)$ with respect to x. These relations also
hold for the spherical Bessel function of the second kind and the spherical Hankel
functions of the first and second kinds [1].
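
The recurrence relations (2.35) and (2.36) can be verified against the closed-form expressions of Table 2.1. The following sketch is illustrative only: the evaluation point x = 2.5, the finite-difference step and the function names are assumptions, and the derivative in Eq. (2.36) is estimated numerically rather than analytically.

```python
import math

# Closed-form expressions taken from Table 2.1
def j0(x):
    return math.sin(x) / x

def j1(x):
    return -math.cos(x) / x + math.sin(x) / x**2

def j2(x):
    return -math.sin(x) / x - 3 * math.cos(x) / x**2 + 3 * math.sin(x) / x**3

def j3(x):
    return (math.cos(x) / x - 6 * math.sin(x) / x**2
            - 15 * math.cos(x) / x**3 + 15 * math.sin(x) / x**4)

table = [j0, j1, j2, j3]

def recurrence_residual(n, x):
    """Left side minus right side of Eq. (2.35)."""
    return (2 * n + 1) / x * table[n](x) - table[n - 1](x) - table[n + 1](x)

def derivative_residual(n, x, h=1e-5):
    """Left side minus right side of Eq. (2.36), with j_n' from central differences."""
    dj = (table[n](x + h) - table[n](x - h)) / (2 * h)
    return (2 * n + 1) * dj - n * table[n - 1](x) + (n + 1) * table[n + 1](x)

x = 2.5
for n in (1, 2):
    print(n, recurrence_residual(n, x), derivative_residual(n, x))
```

Only n = 1 and n = 2 are tested because each relation needs the neighboring orders n − 1 and n + 1, and the table stops at n = 3.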

2.3 A Single Plane Wave

Consider a unit-amplitude, single-frequency plane wave, arriving from direction
$(\theta_k, \phi_k)$ with a wave vector $\tilde{\mathbf{k}} = -\mathbf{k} = (k, \theta_k, \phi_k)$ written in spherical coordinates.
The plane wave is a solution to the homogeneous wave equation in Cartesian
coordinates, and so could also be represented as a combination of the general
solutions of the wave equation in spherical coordinates. As the spherical Hankel functions
diverge at the origin, spherical Bessel functions are used to represent a plane-wave
sound field. The familiar expression for the sound pressure at $\mathbf{r} = (r, \theta, \phi)$ due to
a plane wave, i.e. $e^{-i\mathbf{k}\cdot\mathbf{r}}$, can be written as a summation of spherical harmonics and
spherical Bessel functions [5, 8]:

$$ \begin{aligned}
p(k, r, \theta, \phi) &= e^{-i\mathbf{k}\cdot\mathbf{r}} = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^* Y_n^m(\theta, \phi).
\end{aligned} \tag{2.37} $$

The dot product is given by $\tilde{\mathbf{k}}\cdot\mathbf{r} = kr \cos\Theta$. By applying the spherical harmonics
addition theorem, as in Eq. (1.26), Eq. (2.37) is reduced to

$$ p(k, r, \Theta) = e^{ikr\cos\Theta} = \sum_{n=0}^{\infty} i^n j_n(kr) (2n+1) P_n(\cos\Theta). \tag{2.38} $$

The exponential representation of a plane wave, as in the first line of Eq. (2.37),
is simple and natural, compared to the infinite summation on the second line of the
same equation. However, the advantage of representing a plane wave in spherical har-
monics lies in the possibility of performing separation of variables. Terms including
kr , wave arrival direction (θk , φk ), and position (θ, φ) on the surface of a sphere of
radius r , can thus be formulated as parameters of separate functions. This advantage
is exploited later in the book when developing array processing algorithms in the
spherical harmonics domain. Derivation of Eqs. (2.37) and (2.38) and further reading
can be found in [1, 8], for example.
The shortcoming of the representation of plane waves using spherical harmonics
with an infinite summation is typically overcome in practice by approximating the
infinite summation with a finite summation, i.e. Eq. (2.37) is rewritten as


$$ p(k, r, \theta, \phi) \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^* Y_n^m(\theta, \phi), \tag{2.39} $$

introducing truncation errors.


As an example of spherical harmonics representation of a plane-wave sound field,
consider a sound field composed of a single unit-amplitude plane wave arriving from
(θk, φk) = (90°, 20°). Figure 2.5 shows the real part of the sound pressure, Re{p},
for k = 1, computed using Eq. (2.39) for various values of N and plotted over the
xy plane. The figure shows that for N = 32, a sinusoidal behavior is observed, as
expected from the real part of the amplitude of a single plane wave. However, for
smaller values of N, e.g. N = 16 and N = 8, the sinusoidal behavior is distorted, and
is only maintained within a limited circle around the origin. This behavior is typical
of the representation of plane waves using spherical harmonics: it is accurate only
within the volume of a sphere. The radius of the sphere depends on k and N, as
discussed next.

Fig. 2.5 Re{p(k, r, θ, φ)} for a unit-amplitude plane wave arriving from
(θk, φk) = (90°, 20°) and computed using Eq. (2.39) with N = 8, 16, 32 and
k = 1, plotted over the xy plane
Equation (2.37) provided an expression for the sound pressure at (r, θ, φ) for a
sound field composed of a single plane wave. Now, the sound pressure is evaluated at
the surface of a sphere of radius r . Therefore, p(k, r, θ, φ) is a function defined over
a sphere, having a spherical Fourier transform with coefficients denoted by pnm (k, r )
satisfying
$$ p(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} p_{nm}(k, r) Y_n^m(\theta, \phi). \tag{2.40} $$

Comparing Eqs. (2.37) and (2.40), the spherical harmonic coefficients for the
sound pressure over a sphere of radius r, in a sound field composed of a single
unit-amplitude plane wave arriving from (θk, φk), can be written as

$$ p_{nm}(k, r) = 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^*. \tag{2.41} $$

Equation (2.41) also shows that the magnitude of pnm is proportional to the magnitude
of jn (kr ). It is therefore expected that pnm for a plane-wave sound field decays as a
function of n for n > kr , as suggested by Fig. 2.1, and, more explicitly, as illustrated
in Fig. 2.6 for kr = 8 and kr = 16. This is an important result, as it suggests that the
sound field represented by the infinite summation in Eq. (2.37) can be represented by
a finite summation as in Eq. (2.39) with little error. The spherical harmonics series
for a plane-wave sound field can therefore be considered as nearly order limited,
so that sampling theories for order-limited functions, as detailed in Chap. 3, can be
applied with little error.
This behavior is clearly illustrated in Fig. 2.5 for N = 16, for example. The figure
shows a circle of radius r = 16 (equivalent to kr = 16 because k = 1), illustrating
that with N = 16, the pressure inside the circle satisfying kr < N is reconstructed
accurately, while outside the circle, with kr > N , the pressure is reconstructed with
significant error.
The condition of kr < N for accurate sound pressure reconstruction is further
illustrated in the following example, analyzing the magnitude of sound pressure over
the surface of a sphere of a fixed radius r = 1, at wave number k = 10, satisfying
kr = 10. The pressure is produced by a single unit-amplitude plane wave arriving
from direction (θk, φk) = (45°, −45°), which is then reconstructed using Eq. (2.39)
for various values of N. Figure 2.7 shows that for N = 20, satisfying N > kr, good
reconstruction is achieved, as shown by the sinusoidal behavior of the pressure. For
N = kr = 10, some distortion is observed in the reconstructed sound pressure, while
for N = 5, the reconstructed pressure is significantly different from the expected
pressure.

Fig. 2.6 Magnitude of the normalized spherical Bessel function, $|4\pi i^n j_n(kr)|$, for kr = 8, 16

Fig. 2.7 Re{p(k, r, θ, φ)} due to a unit-amplitude plane wave with arrival direction
(45°, −45°), evaluated using Eq. (2.39) and plotted on the surface of a sphere of radius
r = 1 at kr = 10, for N = 5, 10, 20
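
The kr < N rule for truncating the plane-wave expansion can be examined numerically. The sketch below is not the procedure used to produce the figures in this section: to stay self-contained it evaluates the truncated Legendre form of Eq. (2.38) instead of Eq. (2.39), computes $j_n$ from its ascending power series (assumed adequate for kr = 10), and uses hypothetical helper names and an arbitrary grid of 51 angles.

```python
import cmath
import math

def spherical_jn(n, x, terms=80):
    """j_n(x) from its ascending power series (adequate for moderate x)."""
    term = 1.0
    for j in range(1, n + 1):
        term *= x / (2 * j + 1)
    total = term
    for k in range(1, terms):
        term *= -x * x / (2.0 * k * (2 * n + 2 * k + 1))
        total += term
    return total

def legendre_all(N, mu):
    """P_0(mu) ... P_N(mu) from the three-term recurrence."""
    P = [1.0, mu]
    for n in range(1, N):
        P.append(((2 * n + 1) * mu * P[n] - n * P[n - 1]) / (n + 1))
    return P[:N + 1]

def plane_wave_truncated(kr, Theta, N):
    """Eq. (2.38) with the sum stopped at order N."""
    P = legendre_all(N, math.cos(Theta))
    return sum((1j ** n) * (2 * n + 1) * spherical_jn(n, kr) * P[n]
               for n in range(N + 1))

kr = 10.0
angles = [math.pi * i / 50 for i in range(51)]
max_error = {}
for N in (5, 10, 20):
    max_error[N] = max(
        abs(plane_wave_truncated(kr, T, N) - cmath.exp(1j * kr * math.cos(T)))
        for T in angles)
    print(f"N = {N}: maximum error over the angle grid = {max_error[N]:.3e}")
```

With kr = 10, truncating well below kr leaves an error comparable to the unit amplitude of the wave, whereas N = 20 reduces the error to a small fraction of it.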

2.4 Plane-Wave Composition

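A sound field composed of multiple plane waves can be represented as a summation
over plane-wave terms, as in Eq. (2.37). When the sound field is composed of an
infinite number of plane waves, or a continuum of plane waves, with directional
amplitude density denoted by $a(k, \theta_k, \phi_k)$, then the sound pressure can be written as

$$ \begin{aligned}
p(k, r, \theta, \phi) &= \int_0^{2\pi} \int_0^{\pi} a(k, \theta_k, \phi_k) e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} \sin\theta_k \, d\theta_k \, d\phi_k \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n j_n(kr) Y_n^m(\theta, \phi) \\
&\quad \times \int_0^{2\pi} \int_0^{\pi} a(k, \theta_k, \phi_k) \left[ Y_n^m(\theta_k, \phi_k) \right]^* \sin\theta_k \, d\theta_k \, d\phi_k \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k) j_n(kr) Y_n^m(\theta, \phi),
\end{aligned} \tag{2.42} $$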

where $a_{nm}(k)$ is the spherical Fourier transform of $a(k, \theta_k, \phi_k)$. Comparing Eqs.
(2.37) and (2.42), it is clear that for a sound field composed of a single unit-amplitude
plane wave, the following holds:

$$ a_{nm}(k) = \left[ Y_n^m(\theta_k, \phi_k) \right]^*, \tag{2.43} $$

in which case, following Eq. (1.58),

a(k, θ, φ) = δ(cos θ − cos θk )δ(φ − φk ). (2.44)

When a sound field constructed from a composition of plane waves is evaluated


at the surface of a sphere of radius r , it can be written in the spherical harmonics
domain, following Eq. (2.42), as

$$ p_{nm}(k, r) = 4\pi i^n a_{nm}(k) j_n(kr). \tag{2.45} $$

This is a very useful result, relating directly the spherical harmonic coefficients of
the plane-wave amplitude density to the spherical harmonic coefficients of the sound
pressure. This is also the advantage of analyzing the sound pressure over the surface
of a sphere – the measured function pnm is in the same domain (spherical harmonics)
as the function generating the sound field, anm , thus facilitating a direct relation
between the two.

Equation (2.42) involves an infinite summation, but, similar to the case of a single
plane wave, the infinite summation may be approximated in practice by a finite
summation, leading to


$$ p(k, r, \theta, \phi) \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k) j_n(kr) Y_n^m(\theta, \phi). \tag{2.46} $$

The properties derived for a finite-summation approximation of a sound field com-


posed of a single plane wave also hold here, due to the similar dependence on the
radial function jn (kr ).
Equations (2.40) and (2.45) suggest that complete information in a three-
dimensional space about a sound field composed of a single plane wave, or mul-
tiple plane waves, is available simply from the knowledge of the sound pressure at
the surface of a single sphere. This is facilitated by the direct relation between the
spherical harmonic coefficients of the sound pressure over a sphere, pnm (k, r ), and
the plane-wave amplitude density, anm (k), composing the sound field in the entire
space, as shown in Eq. (2.45). Given $p(k, r, \theta, \phi)$ and having computed $p_{nm}(k, r)$
using the spherical Fourier transform, Eq. (1.41), the sound pressure at any other
position in space, $(r', \theta', \phi')$, can be derived. First, the plane-wave amplitude density
is computed by extracting $a_{nm}(k)$ through a division by $4\pi i^n j_n(kr)$ [see Eq. (2.45)]
and then $p_{nm}(k, r')$ is reconstructed by a multiplication with $4\pi i^n j_n(kr')$, leading to

$$ p(k, r', \theta', \phi') = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} p_{nm}(k, r) \frac{j_n(kr')}{j_n(kr)} Y_n^m(\theta', \phi'). \tag{2.47} $$

This use of Eq. (2.47) is limited in practice by several factors. First, kr values
corresponding to the zeros of the spherical Bessel functions lead to division by zero
and a diverging quotient. Second, as discussed above, $p_{nm}(k, r)$ has significant terms
only up to order n = kr, whereas if $r' \gg r$, accurate reconstruction of the pressure
at $r'$ will require terms up to order $n = kr' \gg kr$. Therefore, accurate calculation of
$p(k, r', \theta', \phi')$ may require division by $j_n(kr)$, which may have low magnitude at n >
kr, again leading to numerical instability. Furthermore, if the infinite summation in
Eq. (2.47) is replaced by a finite summation of order N, as expressed in the following
equation, the order-limited equation will be useful only in the range where both kr
and kr' are smaller than N:

$$ p(k, r', \theta', \phi') \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} p_{nm}(k, r) \frac{j_n(kr')}{j_n(kr)} Y_n^m(\theta', \phi'). \tag{2.48} $$
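
The two sources of numerical instability described above can be made concrete by looking at the magnitude of the extrapolation factor in Eq. (2.47). The following is a minimal sketch under stated assumptions: the values kr = 2 and kr' = 8, the offset of 1e-6 from the zero of $j_0$, and the helper names are all illustrative, and $j_n$ is computed from its ascending power series.

```python
import math

def spherical_jn(n, x, terms=80):
    """j_n(x) from its ascending power series (adequate for moderate x)."""
    term = 1.0
    for j in range(1, n + 1):
        term *= x / (2 * j + 1)
    total = term
    for k in range(1, terms):
        term *= -x * x / (2.0 * k * (2 * n + 2 * k + 1))
        total += term
    return total

def extrapolation_gain(n, kr_from, kr_to):
    """Magnitude of j_n(kr') / j_n(kr), the factor applied to p_nm in Eq. (2.47)."""
    return abs(spherical_jn(n, kr_to) / spherical_jn(n, kr_from))

# Outward extrapolation with r' > r: orders above kr are strongly amplified.
kr, kr_prime = 2.0, 8.0
gains = [extrapolation_gain(n, kr, kr_prime) for n in range(9)]
for n, g in enumerate(gains):
    print(f"n = {n}: gain = {g:.3g}")

# Measurement radius close to the first zero of j_0 (kr = pi).
near_zero_gain = extrapolation_gain(0, math.pi - 1e-6, 1.0)
print(f"gain near the zero of j_0: {near_zero_gain:.3g}")
```

Any measurement noise in $p_{nm}(k, r)$ is multiplied by the same factors, which is why large gains translate directly into unreliable extrapolated pressure.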

2.5 Point Sources

Real-world sources produce sound fields in their immediate vicinity with a behavior
that makes it appropriate to model them as a simple point source (a monopole source)
or a combination of these. Consider a point source located at rs = (rs , θs , φs ), pro-
ducing unit-amplitude sound pressure at a distance of 1 m from the source. The source
produces a spherical sound field, i.e. the pressure magnitude decays at a rate that is
inversely proportional to the distance from the source, while the phase is constant
as a function of θ and φ for a fixed distance from the source. The sound pressure at
location r = (r, θ, φ) for this spherical radiation field can be written using a series
of spherical harmonics as [5, 8]


$$ \frac{e^{-ik\|\mathbf{r} - \mathbf{r}_s\|}}{\|\mathbf{r} - \mathbf{r}_s\|} = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi(-i) k \, h_n^{(2)}(kr_s) j_n(kr) \left[ Y_n^m(\theta_s, \phi_s) \right]^* Y_n^m(\theta, \phi), \quad r < r_s, \tag{2.49} $$

where $r = \|\mathbf{r}\|$ and $\|\cdot\|$ is the Euclidean norm. The condition $r < r_s$ means that the
measurement point is nearer the origin relative to the point source. If a spherical
measurement surface of radius r is considered, then the point source is assumed to
be outside the measurement sphere. Note the similarity, in this case, between the
sound field produced by the point source and the plane-wave sound field; the latter is
described by Eq. (2.37), with the plane-wave arrival direction replaced by the
direction of the point source. Indeed, a point source positioned far from a measurement
region will produce a sound field similar to a plane-wave sound field. The minus sign
in the exponential $e^{-ik\|\mathbf{r} - \mathbf{r}_s\|}$ guarantees that when combined with the time-dependent
exponential, $e^{i\omega t}$, the sound radiation is outwards from the point source.
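
The point-source expansion of Eq. (2.49) can be checked numerically against the closed-form left-hand side. The sketch below is illustrative only: it collapses the sum over m with the spherical harmonics addition theorem, in the same way that Eq. (2.37) reduces to Eq. (2.38), so that only Legendre polynomials of the angle Θ between r and r_s are needed. The geometry (k = 1, r = 1, r_s = 3, Θ = 0.7 rad), the truncation order and all helper names are assumptions.

```python
import cmath
import math

def spherical_jn(n, x, terms=80):
    """j_n(x) from its ascending power series (adequate for moderate x)."""
    term = 1.0
    for j in range(1, n + 1):
        term *= x / (2 * j + 1)
    total = term
    for k in range(1, terms):
        term *= -x * x / (2.0 * k * (2 * n + 2 * k + 1))
        total += term
    return total

def spherical_yn_all(N, x):
    """y_0(x) ... y_N(x) by upward recurrence; Eq. (2.35) also holds for y_n."""
    y = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for n in range(1, N):
        y.append((2 * n + 1) / x * y[n] - y[n - 1])
    return y[:N + 1]

def legendre_all(N, mu):
    """P_0(mu) ... P_N(mu) from the three-term recurrence."""
    P = [1.0, mu]
    for n in range(1, N):
        P.append(((2 * n + 1) * mu * P[n] - n * P[n - 1]) / (n + 1))
    return P[:N + 1]

def point_source_series(k, r, rs, Theta, N):
    """Eq. (2.49) truncated at order N, with the sum over m collapsed
    to (2n+1) P_n(cos Theta) / (4 pi) by the addition theorem."""
    P = legendre_all(N, math.cos(Theta))
    y = spherical_yn_all(N, k * rs)
    total = 0j
    for n in range(N + 1):
        h2 = spherical_jn(n, k * rs) - 1j * y[n]          # Eq. (2.26)
        total += (2 * n + 1) * h2 * spherical_jn(n, k * r) * P[n]
    return -1j * k * total

k, r, rs, Theta = 1.0, 1.0, 3.0, 0.7                      # requires r < rs
distance = math.sqrt(r * r + rs * rs - 2 * r * rs * math.cos(Theta))
exact = cmath.exp(-1j * k * distance) / distance
approx = point_source_series(k, r, rs, Theta, N=30)
print(abs(approx - exact))
```

The series converges quickly here because successive terms shrink roughly like (r/r_s)^n; moving the evaluation point closer to r = r_s would require a larger truncation order.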
When the point source is nearer the origin relative to the measurement point, r
and rs exchange places, such that


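$$ \frac{e^{-ik\|\mathbf{r} - \mathbf{r}_s\|}}{\|\mathbf{r} - \mathbf{r}_s\|} = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi(-i) k \, h_n^{(2)}(kr) j_n(kr_s) \left[ Y_n^m(\theta_s, \phi_s) \right]^* Y_n^m(\theta, \phi), \quad r > r_s. \tag{2.50} $$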

Similarly, considering a spherical measurement surface of radius r, the point source is
inside the measurement sphere. This equation is useful when analyzing sound
radiation of a source by measuring the sound pressure at a surface surrounding the source.
Note that, in this case, the spherical harmonic coefficients of the sound pressure
function measured at a sphere of radius r have radial dependence due to the
spherical Hankel function, $h_n^{(2)}(kr)$, rather than a spherical Bessel function dependence,
$j_n(kr)$. The latter would be the case for a far point source or a plane wave. Although
both spherical functions are solutions to the wave equation along r, the Hankel
function has a singularity that is more suitable when describing a point source, as both
produce infinite sound pressure at the singularity and source locations, respectively.

The sound pressure at the surface of a sphere of radius r, $p(k, r, \theta, \phi)$, due to a
point source positioned at $(r_s, \theta_s, \phi_s)$ can be described using the spherical harmonic
coefficients by comparing Eq. (2.40) with Eqs. (2.49) and (2.50), leading to

$$ p_{nm}(k, r) = 4\pi(-i) k \, h_n^{(2)}(kr_s) j_n(kr) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \quad r < r_s, \tag{2.51} $$

and

$$ p_{nm}(k, r) = 4\pi(-i) k \, h_n^{(2)}(kr) j_n(kr_s) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \quad r > r_s. \tag{2.52} $$

Equation (2.47), representing extrapolation of the sound pressure from a measurement
sphere to other positions, can also be used for the case of a sound field generated
by a point source, or by point sources, as long as the sources are outside the spheres
of radii r and r'. In the case where the sources are inside the spheres of radii r
and r', $j_n(kr)$ and $j_n(kr')$ in Eq. (2.47) should be replaced by $h_n^{(2)}(kr)$ and $h_n^{(2)}(kr')$,
respectively, consistent with the radial dependence in Eq. (2.52).
Equation (2.49) can be used to describe the pressure around the origin at (r, θ, φ),
for a point source that is positioned a significant distance away from the origin.
Substituting the large argument approximation for the spherical Hankel function in
this case, as in Eq. (2.34), the term $h_n^{(2)}(kr_s)$ can be replaced by $i^{n+1} e^{-ikr_s}/(kr_s)$
and, when substituted back into Eq. (2.51), the following approximation holds:

$$ p_{nm}(k, r) \approx \frac{e^{-ikr_s}}{r_s} 4\pi i^n j_n(kr) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \quad r < r_s, \ kr_s \gg n(n+1)/2. \tag{2.53} $$

The spherical harmonic coefficients of the sound pressure on a sphere of radius r are
similar to the coefficients produced on the same sphere by a plane wave, as shown
in Eq. (2.41), with (θk, φk) = (θs, φs), normalized by the term $e^{-ikr_s}/r_s$ representing the
phase shift and attenuation due to the propagation from the point source to the origin.
Furthermore, if we consider the sound pressure limited to a sphere of radius r and
approximately order limited to N = kr and assume that rs satisfies krs > N (N +
1)/2, then the sound pressure produced by the point source is approximately the same
as the sound pressure produced by a plane wave with (θk , φk ) = (θs , φs ). This is a
useful result, as it allows the sound pressure in a limited region in space, produced
by a distant point source, to be approximated by the sound pressure produced by a
plane wave and thus to inherit the properties of a plane-wave sound field. For a more
detailed comparison between the sound field produced around the origin by a point
source and by a plane wave, the reader is referred to [4].
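
The step from Eq. (2.51) to the far-field approximation of Eq. (2.53) can be spelled out as follows. This is a sketch that assumes the large-argument form of $h_n^{(2)}$ is the complex conjugate of Eq. (2.34), which holds for real arguments because of Eqs. (2.25) and (2.26).

```latex
\begin{align*}
h_n^{(2)}(kr_s) &\approx i^{\,n+1}\, \frac{e^{-ikr_s}}{kr_s},
  \qquad kr_s \gg n(n+1)/2, \\
4\pi(-i)k \, h_n^{(2)}(kr_s)
  &\approx 4\pi(-i)k \; i^{\,n+1}\, \frac{e^{-ikr_s}}{kr_s}
   = 4\pi\, i^{\,n}\, \frac{e^{-ikr_s}}{r_s},
  \qquad \text{since } (-i)\, i^{\,n+1} = i^{\,n}.
\end{align*}
```

The factor k cancels, leaving the plane-wave coefficient $4\pi i^n j_n(kr) [Y_n^m(\theta_s, \phi_s)]^*$ scaled by a term whose magnitude $1/r_s$ gives the attenuation and whose complex exponential gives the phase shift.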

2.6 Sound Pressure Around a Rigid Sphere

The sound pressure on the surface of a sphere in a free field due to plane waves and
point sources has been analyzed in previous sections. In this section the pressure
around a rigid sphere is derived. This is useful when measuring microphones are

placed around a rigid sphere, which is often the case in practice, or when such a rigid
sphere is employed to mimic a human head.
The sound pressure around a rigid sphere is composed of the incident sound field,
which is the sound field in free field without the rigid sphere, and the scattered sound
field, which is the sound field that is scattered from the rigid sphere due to the incident
field. The contribution of both fields to the sound pressure around a rigid sphere is
formulated next. Consider a rigid sphere of radius ra . The sphere imposes a boundary
condition on its surface of zero radial velocity:

$$ u_r(k, r_a, \theta, \phi) = 0, \tag{2.54} $$

because of the infinite impedance at the sphere boundary and the inability of the
sound pressure to generate radial motion at this boundary. Acoustic velocity can
be related to pressure through the equation of momentum conservation (or Euler
equation) in spherical coordinates:

$$ i\rho_0 c k \, \mathbf{u}(k, r, \theta, \phi) = -\nabla p(k, r, \theta, \phi), \tag{2.55} $$

where the gradient operator in spherical coordinates is given by

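$$ \nabla p \equiv \frac{\partial p}{\partial r} \hat{r} + \frac{1}{r} \frac{\partial p}{\partial \theta} \hat{\theta} + \frac{1}{r \sin\theta} \frac{\partial p}{\partial \phi} \hat{\phi}. \tag{2.56} $$

Here $\rho_0$ is the air density in kilograms per cubic meter and $\hat{r}$, $\hat{\theta}$, $\hat{\phi}$ are unit vectors, as
shown in Fig. 2.8, with $\hat{r}$ pointing in the direction of $\mathbf{r}$, $\hat{\theta}$ tangential to the surface
of a sphere of radius r and pointing downwards along the longitude, and $\hat{\phi}$ tangential
to the surface of a sphere of radius r and pointing along the latitude.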

Fig. 2.8 Spherical coordinate system showing coordinate directions

Substituting Eqs. (2.56) and (2.54) in Eq. (2.55) with $p = p_i + p_s$ and $u_r = u_{ri} + u_{rs}$
representing the total pressure and the total radial velocity, respectively, composed
of the incident and scattered components, leads to

$$ \frac{\partial}{\partial r} \left[ p_i(k, r, \theta, \phi) + p_s(k, r, \theta, \phi) \right] \Big|_{r = r_a} = 0. \tag{2.57} $$
The scattered pressure is now written as a spherical harmonics series as

$$ p_s(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} c_{nm}(k) h_n^{(2)}(kr) Y_n^m(\theta, \phi). \tag{2.58} $$

Note the use of the spherical Hankel function, $h_n^{(2)}(kr)$, as the scattered sound field
originates from within the sphere of radius r, propagating outwards from the rigid
sphere. The spherical Hankel function of the second kind is used; this is because it
has terms of the form $e^{-ikr}$ that, when combined with the time-dependent term, i.e.
$e^{i(\omega t - kr)}$, suggest that the waves propagate in the positive $\hat{r}$ direction, i.e. outwards
from the rigid sphere. The incident sound pressure around the sphere can be written
in the spherical harmonics domain as

$$ p_i(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k) 4\pi i^n j_n(kr) Y_n^m(\theta, \phi). \tag{2.59} $$

Note that anm assumes an incident sound field composed of plane waves described
in the notation previously used [see Eq. (2.42)]. However, a similar formulation also
holds for sound fields due to point sources, as long as they are outside the sphere of
radius r [see Eq. (2.49)].
Writing Eq. (2.57) in the spherical harmonics domain, by substituting Eq. (2.58)
for the scattered pressure and Eq. (2.59) for the incident pressure, yields

$$ c_{nm}(k) = -a_{nm}(k) 4\pi i^n \frac{j_n'(kr_a)}{{h_n^{(2)}}'(kr_a)}. \tag{2.60} $$

Substituting $c_{nm}$ in Eq. (2.58) and adding the incident field, Eq. (2.59), the total sound
field around a rigid sphere is given by

$$ p(k, r, \theta, \phi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k) 4\pi i^n \left[ j_n(kr) - \frac{j_n'(kr_a)}{{h_n^{(2)}}'(kr_a)} h_n^{(2)}(kr) \right] Y_n^m(\theta, \phi). \tag{2.61} $$

By denoting

$$ b_n(kr) = 4\pi i^n \left[ j_n(kr) - \frac{j_n'(kr_a)}{{h_n^{(2)}}'(kr_a)} h_n^{(2)}(kr) \right], \tag{2.62} $$

the pressure outside the rigid sphere can be written in the spherical harmonics domain
as

$$ p_{nm}(k, r) = a_{nm}(k) b_n(kr). \tag{2.63} $$

Fig. 2.9 Function $|b_n(kr)|/(4\pi)$ for a rigid sphere with r = ra, for n = 0, . . . , 6

Note the similarity to Eq. (2.45) with 4πi n jn (kr ) replaced by bn (kr ), now containing
a scattering term. Also note that the explicit dependence of bn on ra has been omitted
for notation simplicity. The behavior of the magnitude of bn , normalized by 4π , is
presented in Fig. 2.9. Compared to Fig. 2.1, showing the magnitude of jn , function
bn does not have zeros away from the origin. This important property is useful when
a division by jn is replaced by a division by bn , such as in sound extrapolation [see
Eq. (2.48)] or, generally, in array processing, as presented later in this book.
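
The absence of zeros in $b_n$ can be illustrated at a point where the open-sphere radial function vanishes. The sketch below evaluates Eq. (2.62) on the surface of the rigid sphere (r = r_a) at kr = π, the first zero of $j_0$. It is a minimal illustration under stated assumptions: the helper names are hypothetical, $j_n$ is computed from its ascending power series, $y_n$ by upward recurrence, and the derivatives use the identity $f_n'(x) = (n/x) f_n(x) - f_{n+1}(x)$, which follows from combining Eqs. (2.35) and (2.36).

```python
import math

def spherical_jn(n, x, terms=80):
    """j_n(x) from its ascending power series (adequate for moderate x)."""
    term = 1.0
    for j in range(1, n + 1):
        term *= x / (2 * j + 1)
    total = term
    for k in range(1, terms):
        term *= -x * x / (2.0 * k * (2 * n + 2 * k + 1))
        total += term
    return total

def spherical_yn(n, x):
    """y_n(x) by upward recurrence; Eq. (2.35) also holds for y_n."""
    y_prev = -math.cos(x) / x
    y_curr = -math.cos(x) / x**2 - math.sin(x) / x
    if n == 0:
        return y_prev
    for m in range(1, n):
        y_prev, y_curr = y_curr, (2 * m + 1) / x * y_curr - y_prev
    return y_curr

def h2n(n, x):
    """Spherical Hankel function of the second kind, Eq. (2.26)."""
    return spherical_jn(n, x) - 1j * spherical_yn(n, x)

def derivative(f, n, x):
    """f_n'(x) = (n/x) f_n(x) - f_{n+1}(x), from Eqs. (2.35) and (2.36)."""
    return n / x * f(n, x) - f(n + 1, x)

def b_n(n, kr, kra):
    """Eq. (2.62) for a rigid sphere of radius r_a."""
    scatter = derivative(spherical_jn, n, kra) / derivative(h2n, n, kra)
    return 4 * math.pi * (1j ** n) * (spherical_jn(n, kr) - scatter * h2n(n, kr))

x = math.pi                                     # first zero of j_0
open_sphere = 4 * math.pi * abs(spherical_jn(0, x))
rigid_sphere = abs(b_n(0, x, x))
print(open_sphere, rigid_sphere / (4 * math.pi))
```

For n = 0 on the sphere surface the normalized magnitude works out to $1/\sqrt{(kr)^2 + 1}$, so the rigid-sphere term stays well away from zero where the open-sphere term vanishes.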
Similar to the case of the pressure around a sphere in a free field, around a rigid
sphere the magnitude of the spherical harmonic coefficients of the pressure due to a
plane-wave sound field decreases for n > kr , as shown in Fig. 2.10. This figure is
similar to Fig. 2.6, only here the functions are smoother for low values of n due to
the absence of the zeros.
Figure 2.11 shows the sound pressure, Re{ p(k, r, θ, φ)}, around rigid spheres of
radii ra = 1, 3, 10, due to a single unit-amplitude plane wave arriving from (θk , φk ) =
(90◦ , 20◦ ), with k = 1. The sound pressure was calculated using Eq. (2.61), with
terms limited to order N = 32. Comparing Figs. 2.5 and 2.11, the effect of the sound
pressure scattered from the rigid sphere is clear. For large radii, e.g. ra = 3, 10, the
sound field around the rigid sphere is significantly altered by the scattered field, while
for smaller radii, e.g. ra = 1, the change is minor.
The relation between the radius of the rigid sphere and the magnitude of the scattered
sound field can be studied analytically. The scattered sound field is dependent on
the scattering term $j_n'(kr_a)/{h_n^{(2)}}'(kr_a)$ in $b_n$ [see Eq. (2.62)]. For a small rigid sphere
satisfying $kr_a \ll 1$, substituting the relation in Eq. (2.36) for the derivatives and using
the small argument approximations in Eqs. (2.31) and (2.33), $j_n'(kr_a)/{h_n^{(2)}}'(kr_a)$ is
proportional to $(kr_a)^{2n+1}$ for n ≥ 1 [and to $(kr_a)^3$ for n = 0]; this term tends to zero
as $kr_a \to 0$, therefore leading to a negligible contribution from the scattered field.

Fig. 2.10 Function $|b_n(kr)|$ for a rigid sphere with r = ra and kr = 8, 16
The sound pressure on the surface of a rigid sphere due to a plane-wave
sound field is illustrated in Fig. 2.12, showing Re{p(k, r, θ, φ)} on the surface
of a sphere of a radius satisfying kra = 10. The plane-wave arrival direction is
(θk, φk) = (45°, −45°), and the pressure is computed using Eq. (2.61) with terms up to
order N = 32. The figure clearly shows that the magnitude of the sound pressure on the
surface of the rigid sphere is highest at the location on the sphere near the arrival
direction of the plane wave and attenuated along the propagation direction due to the
effect of the rigid sphere.

Fig. 2.11 Re{p(k, r, θ, φ)} for a unit-amplitude plane wave arriving from
(θk, φk) = (90°, 20°), with k = 1, plotted over the xy plane. Rigid spheres of radii
ra = 1, 3, 10 m are positioned at the origin, also illustrated in the figure

Fig. 2.12 Re{p(k, r, θ, φ)} due to a unit-amplitude plane wave with arrival direction (45°, −45°),
evaluated using Eq. (2.61) with kra = 10 and plotted on the surface of a rigid sphere

2.7 Translation of Fields

So far in this chapter the sound pressure has been presented relative to the origin of
the spherical coordinate system. It may be useful to present the sound pressure in
the spherical harmonics domain, relative to a translated spherical coordinate system.
For example, the sound pressure around several spheres can be presented relative to
a common origin. Other examples of translated sound fields represented in spherical
harmonics have been investigated in recent publications [2, 7]. The aim of this section
is therefore to provide an overview of the operation of translation of sound fields and
of the effect of translation on the representation of the sound fields in spherical
harmonics.
Sound fields due to plane waves or distant point sources at $(r, \theta, \phi)$ are described as
a series of weighted $j_n(kr)Y_n^m(\theta, \phi)$ terms, whereas sound fields due to point sources
that are near the origin are described as a series of weighted $h_n^{(2)}(kr)Y_n^m(\theta, \phi)$ terms
[see Eq. (2.50)]. Consider a translation in the coordinate system from the origin to
$\mathbf{r}'' = (r'', \theta'', \phi'')$, such that

\[ \mathbf{r} = \mathbf{r}' + \mathbf{r}'', \tag{2.64} \]

as illustrated in Fig. 2.13.

Fig. 2.13 Translation of the origin to $\mathbf{r}''$


It may be useful to compute the coefficients of the sound field in the spherical
harmonics domain with respect to the translated coordinates, relative to the original
coefficients. Such a formulation can take different forms, depending on whether the
original and the translated sound fields employ spherical Bessel or Hankel terms.
The formulation of the translation therefore uses the following three transformations
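of terms: (i) from spherical Bessel functions to spherical Bessel functions [3],

\[
j_n(kr)\,Y_n^m(\theta, \phi) = \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} j_{n'}(kr')\,Y_{n'}^{m'}(\theta', \phi')
\sum_{n''=0}^{\infty} j_{n''}(kr'')\,Y_{n''}^{m-m'}(\theta'', \phi'')\,C_{n'm'n''}^{nm}, \tag{2.65}
\]

(ii) from spherical Hankel functions to spherical Hankel functions,

\[
h_n^{(2)}(kr)\,Y_n^m(\theta, \phi) = \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} h_{n'}^{(2)}(kr')\,Y_{n'}^{m'}(\theta', \phi')
\sum_{n''=0}^{\infty} j_{n''}(kr'')\,Y_{n''}^{m-m'}(\theta'', \phi'')\,C_{n'm'n''}^{nm}, \quad r' > r'' \tag{2.66}
\]

and (iii) from spherical Hankel functions to spherical Bessel functions,

\[
h_n^{(2)}(kr)\,Y_n^m(\theta, \phi) = \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} j_{n'}(kr')\,Y_{n'}^{m'}(\theta', \phi')
\sum_{n''=0}^{\infty} h_{n''}^{(2)}(kr'')\,Y_{n''}^{m-m'}(\theta'', \phi'')\,C_{n'm'n''}^{nm}, \quad r' < r'', \tag{2.67}
\]

where

\[
C_{n'm'n''}^{nm} = 4\pi\, i^{(n'+n''-n)} (-1)^m \sqrt{\frac{(2n+1)(2n'+1)(2n''+1)}{4\pi}}
\begin{pmatrix} n & n' & n'' \\ 0 & 0 & 0 \end{pmatrix}
\begin{pmatrix} n & n' & n'' \\ -m & m' & m-m' \end{pmatrix} \tag{2.68}
\]

and $\begin{pmatrix} j_1 & j_2 & j_3 \\ m_1 & m_2 & m_3 \end{pmatrix}$ is the Wigner 3-j symbol [3]. Equation (2.65) was derived from
the equation $e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}'} e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}''}$ by first substituting Eq. (2.37) for all terms and then
multiplying by $Y_n^m(\theta_k, \phi_k)$ and integrating over the sphere with respect to $(\theta_k, \phi_k)$.
Equations (2.66) and (2.67) can then be derived by exploring relationships between
spherical Bessel and spherical Hankel functions [3].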
We now consider the case where a sound field composed of multiple plane waves
is measured around a spherical surface, r = (r, θ, φ), such that r is constant. In this
case, the function on the sphere can be represented by coefficients in the spherical
harmonics domain, as in Eq. (2.45):

\[ p_{nm}(k, r) = 4\pi i^n a_{nm}(k)\, j_n(kr). \tag{2.69} \]

The coefficients $a_{nm}(k)$ provide information on the sound field and can be used to
calculate the sound pressure at a position $(r, \theta, \phi)$ relative to the origin. Now, keeping
the same sound field, but shifting the origin of the coordinate system to $\mathbf{r}''$, we would
like to calculate the sound pressure at position $(r', \theta', \phi')$ relative to the new origin,
using a similar set of coefficients $a'_{nm}(k)$. We would like to formulate a direct relation
between $a_{nm}(k)$ and $a'_{nm}(k)$. The sound pressure can be written using Eqs. (2.65) and
(2.69) as

\[
\begin{aligned}
p(k, r, \theta, \phi) &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k)\, j_n(kr)\, Y_n^m(\theta, \phi) \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} 4\pi i^n a_{nm}(k) \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} j_{n'}(kr')\, Y_{n'}^{m'}(\theta', \phi')
\sum_{n''=0}^{\infty} j_{n''}(kr'')\, Y_{n''}^{m-m'}(\theta'', \phi'')\, C_{n'm'n''}^{nm} \\
&= \sum_{n'=0}^{\infty} \sum_{m'=-n'}^{n'} 4\pi i^{n'} j_{n'}(kr')\, Y_{n'}^{m'}(\theta', \phi')
\sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k) \sum_{n''=0}^{\infty} j_{n''}(kr'')\, Y_{n''}^{m-m'}(\theta'', \phi'')\, i^{n-n'} C_{n'm'n''}^{nm}.
\end{aligned}
\tag{2.70}
\]

Therefore, the following holds:

\[
a'_{n'm'}(k) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k) \sum_{n''=0}^{\infty} j_{n''}(kr'')\, Y_{n''}^{m-m'}(\theta'', \phi'')\, i^{n-n'} C_{n'm'n''}^{nm}, \tag{2.71}
\]

which provides a relationship between the sound field coefficients in the original
and in the translated coordinates. Similar relations can also be developed using Eqs.
(2.66) and (2.67). Note that $C_{n'm'n''}^{nm}$ is non-zero only for $|n - n'| \leq n'' \leq n + n'$, and
so if $a_{nm}$ is of finite order, each coefficient $a'_{n'm'}$ can be calculated by a finite number
of summations.
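The finite range of $n''$ can be illustrated numerically. The sketch below is only an
illustration under stated assumptions: it evaluates the Wigner 3-j symbol for integer
arguments with Racah's formula and assumes that the coefficient of Eq. (2.68) contains the
factor $\sqrt{(2n+1)(2n'+1)(2n''+1)/4\pi}$, as for orthonormal spherical harmonics. The
function names `wigner_3j` and `translation_coeff` are hypothetical.

```python
from math import factorial as f, sqrt, pi

def wigner_3j(j1, j2, j3, m1, m2, m3):
    """Wigner 3-j symbol for integer arguments (Racah's formula)."""
    if m1 + m2 + m3 != 0 or not (abs(j1 - j2) <= j3 <= j1 + j2):
        return 0.0
    if abs(m1) > j1 or abs(m2) > j2 or abs(m3) > j3:
        return 0.0
    delta = f(j1 + j2 - j3) * f(j1 - j2 + j3) * f(-j1 + j2 + j3) / f(j1 + j2 + j3 + 1)
    norm = sqrt(delta * f(j1 + m1) * f(j1 - m1) * f(j2 + m2)
                * f(j2 - m2) * f(j3 + m3) * f(j3 - m3))
    t_min = max(0, j2 - j3 - m1, j1 - j3 + m2)
    t_max = min(j1 + j2 - j3, j1 - m1, j2 + m2)
    total = 0.0
    for t in range(t_min, t_max + 1):
        denom = (f(t) * f(j3 - j2 + t + m1) * f(j3 - j1 + t - m2)
                 * f(j1 + j2 - j3 - t) * f(j1 - t - m1) * f(j2 - t + m2))
        total += (-1) ** t / denom
    return (-1) ** (j1 - j2 - m3) * norm * total

def translation_coeff(n, m, n1, m1, n2):
    """Coefficient of Eq. (2.68), with (n', m', n'') written as (n1, m1, n2)."""
    pref = 4 * pi * (1j) ** (n1 + n2 - n) * (-1) ** m
    pref *= sqrt((2 * n + 1) * (2 * n1 + 1) * (2 * n2 + 1) / (4 * pi))
    return (pref * wigner_3j(n, n1, n2, 0, 0, 0)
            * wigner_3j(n, n1, n2, -m, m1, m - m1))

# Orders n'' with a non-zero coefficient for n = 2, n' = 1 (m = m' = 0).
nonzero = [n2 for n2 in range(8) if abs(translation_coeff(2, 0, 1, 0, n2)) > 1e-12]
print(nonzero)
```

Only orders inside the triangle $|n - n'| \leq n'' \leq n + n'$ appear in the list, so the
inner sum over $n''$ in Eq. (2.71) is finite for each pair $(n, n')$.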

References

1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic Press, San
Diego (2001)
2. Ben Hagai, I., Pollow, M., Vorlander, M., Rafaely, B.: Acoustic centering of sources measured
by surrounding spherical microphone arrays. J. Acoust. Soc. Am. 130(4), 2003–2015 (2011)
3. Chew, W.C.: Waves and Fields in Inhomogeneous Media, 1st edn. Wiley-IEEE Press, New York
(1999)
4. Fisher, E.: Near-field spherical microphone array processing with radial filtering. IEEE Trans.
Audio Speech Lang. Process. 19(2), 256–265 (2011)
5. Jackson, J.D.: Classical Electrodynamics, 3rd edn. Wiley, New York (1999)
6. Kinsler, L.E., Frey, A.R., Coppens, A.B., Sanders, J.V.: Fundamentals of Acoustics, 4th edn.
Wiley, New York (1999)
7. Peleg, T., Rafaely, B.: Investigation of spherical loudspeaker arrays for local active control of
sound. J. Acoust. Soc. Am. 130(4), 1926–1935 (2011)
8. Williams, E.G.: Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography.
Academic, New York (1999)
Chapter 3
Sampling the Sphere

Abstract Spherical microphone arrays are realized by placing microphones in three-dimensional
space and recording the signals at the microphone locations. When the
microphones are placed on the surface of a sphere, they sample the sound pressure
at the sphere surface. Estimation of the sound pressure function on the measure-
ment sphere may depend on the sampling configuration and, therefore, methods for
sampling functions on the sphere, such as equal-angle sampling, Gaussian sampling,
and uniform sampling, are presented in this chapter. An important feature of the
sampling methods is their capacity to facilitate computation of the spherical Fourier
transform of the function on the sphere in the case of order-limited functions. When
this capacity is not fully achieved, sampling errors occur and the function cannot
be reconstructed from its samples. The sampling methods mentioned above have
closed-form expressions for computing the spherical harmonic coefficients from the
samples, using a summation rather than integration. Computation of the spherical
harmonic coefficients can also be realized for arbitrary sampling configurations,
using an inversion of the sampled spherical harmonics matrix, as detailed in this
chapter. The methods presented here will provide the basis for selecting microphone
locations in the process of spherical microphone array design.

3.1 Sampling Order-Limited Functions

The sampling of functions defined over continuous variables such as time and space is
often required to enable digital processing of the sampled functions using computers.
The sampling of sound pressure functions in space requires microphones, where the
positions of the microphones determine the sampling points. The design of a spatial
sampling system using microphones involves a trade-off: reducing the number of
microphones leads to a reduction in system complexity, while increasing the number
of microphones may lead to improvement in the accuracy of the reconstruction of
the sound pressure function.
Sampling theorems, such as the Nyquist theorem, e.g. [11], require the function
to be band-limited to achieve perfect reconstruction from the samples. This means
that the function can be represented by a finite number of basis functions. In a similar
manner, sampling theorems for functions on the sphere require the functions to be
order-limited, or represented by a finite number of spherical harmonics.

© Springer Nature Switzerland AG 2019
B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8_3
A general formulation of the sampling problem can be derived starting from the
problem of quadrature. Quadrature methods aim to compute the integral of a given
function using a summation over samples of the function. Cubature is sometimes
used to refer to multiple integration. Consider a function g(θ, φ) defined on the unit
sphere. A quadrature method aims to approximate the integral of g(θ, φ), given a set
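of samples on the sphere, $(\theta_q, \phi_q)$, and sampling weights, $\alpha_q$, as follows:

\[ \int_0^{2\pi} \int_0^{\pi} g(\theta, \phi) \sin\theta \, d\theta \, d\phi \approx \sum_{q=1}^{Q} \alpha_q\, g(\theta_q, \phi_q), \tag{3.1} \]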

where Q is the total number of samples. The quadrature formulation for estimating the
area under the function can be extended to function reconstruction by substituting
g(θ, φ) = f (θ, φ)[Ynm (θ, φ)]∗ . Starting from Eq. (1.41), this substitution leads to
the approximation of the spherical Fourier transform of function f (θ, φ) from its
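samples:

\[
\begin{aligned}
f_{nm} &= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&\approx \sum_{q=1}^{Q} \alpha_q f(\theta_q, \phi_q) \left[ Y_n^m(\theta_q, \phi_q) \right]^*.
\end{aligned}
\tag{3.2}
\]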

For order-limited functions, the approximation becomes an equality, given a sufficiently
large Q. In this case, $f(\theta, \phi)$ can be reconstructed perfectly on the sphere
using the inverse spherical Fourier transform, Eq. (1.40). Substituting $Y_{n'}^{m'}(\theta, \phi)$ for
$f(\theta, \phi)$ and, consequently, $\delta_{nn'}\delta_{mm'}$ for $f_{nm}$ [see Eq. (1.59)], Eq. (3.2) reduces to

\[ \sum_{q=1}^{Q} \alpha_q Y_{n'}^{m'}(\theta_q, \phi_q) \left[ Y_n^m(\theta_q, \phi_q) \right]^* \approx \delta_{nn'}\delta_{mm'}, \tag{3.3} \]

where for order-limited functions the approximation becomes an equality within
a given order range. This shows a basic property of an ideal sampling scheme:
orthogonality of the spherical harmonics is maintained, at least for a limited range
of orders.
Several common sampling schemes are presented in the following sections, for
which the sampling weight αq and sampling points (θq , φq ) are derived such that
Eq. (3.3) is maintained with an equality for order-limited functions.

3.2 Equal-Angle Sampling

Equal-angle sampling is a sampling method on the sphere in which a function $f(\theta, \phi)$
is sampled at uniformly spaced angular positions along θ and φ. Figure 3.1 shows
an equal-angle sampling distribution on the sphere, where 12 samples are positioned
along the azimuth for φ ∈ [0, 2π) and along the elevation for θ ∈ [0, π]. Accordingly,
the positions of the samples on a unit sphere are given by

\[
\begin{aligned}
\theta_q &= \left( q + \tfrac{1}{2} \right) \frac{\pi}{2N+2}, \quad q = 0, \ldots, 2N+1 \\
\phi_l &= l\, \frac{2\pi}{2N+2}, \quad l = 0, \ldots, 2N+1,
\end{aligned}
\tag{3.4}
\]

with the total number of samples given by $4(N+1)^2$, determined by N. N also
represents the maximum order of an order-limited function that can be reconstructed
from these samples, as detailed later in this section. Note that the value of 1/2 added
to the index q [5] ensures that samples are not selected at the poles. Placing samples
at the pole [2] leads to 2N + 2 collocated samples due to the repetition of azimuth
samples and, therefore, reduces the total number of non-collocated samples.
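The grid of Eq. (3.4) can be generated in a few lines. This is a minimal sketch, with the
azimuth step taken as $2\pi/(2N+2)$; the function name `equal_angle_grid` is a hypothetical
choice for illustration.

```python
from math import pi

def equal_angle_grid(N):
    """Return the elevation and azimuth angles of Eq. (3.4), in radians."""
    n_side = 2 * N + 2
    thetas = [(q + 0.5) * pi / n_side for q in range(n_side)]   # no sample at a pole
    phis = [l * 2 * pi / n_side for l in range(n_side)]
    return thetas, phis

thetas, phis = equal_angle_grid(5)
samples = [(t, p) for t in thetas for p in phis]
print(len(samples))   # 4 * (N + 1)**2 samples for N = 5
```

With N = 5 the sketch produces the 12 by 12 grid shown in Fig. 3.1, and the half-step
offset keeps every elevation angle strictly between 0 and π.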
Figure 3.1 clearly shows that although the samples adhere to a uniform angular
distribution, as illustrated on the θ φ plane plot, they are not uniformly distributed on
the sphere, as illustrated on the unit sphere plot, since samples are more dense around
the poles. Such a sampling scheme may be useful when mechanically scanning
microphone positions or when representing sampled functions on the θ φ plane, for
example, due to the uniform grid when measured along the angles. A complete
theorem is available for this type of sampling, and is presented in this section in
some detail. The main results, providing expressions for the sampling weights and
the spherical Fourier transform, are presented in Eqs. (3.11) and (3.15).
Sampling of functions defined on the real line can be represented mathematically
by multiplication with a delta function at the sampling position. Similarly, for the
sphere, a “train” of delta functions at the sampling positions is defined as

\[ s(\theta, \phi) = \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, \delta(\cos\theta - \cos\theta_q)\, \delta(\phi - \phi_l). \tag{3.5} \]

The coefficients αq determine the amplitude of the delta functions, which reduces
towards the poles to compensate for the increased density of the samples. The deriva-
tion of the values of αq is presented later in this section. The spherical Fourier trans-
form of s(θ, φ), denoted by snm , is derived by substituting Eq. (3.5) in the spherical
Fourier transform, Eq. (1.41), and using the sifting property of the delta function, as
in Eq. (1.52):
\[
\begin{aligned}
s_{nm} &= \int_0^{2\pi} \int_0^{\pi} s(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi} \int_0^{\pi} \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, \delta(\cos\theta - \cos\theta_q)\, \delta(\phi - \phi_l) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q \left[ Y_n^m(\theta_q, \phi_l) \right]^*.
\end{aligned}
\tag{3.6}
\]

Fig. 3.1 Equal-angle sampling distribution, for N = 5 and a total of 144 samples, illustrated on
the surface of a unit sphere and over the θφ plane


The summation over l can be evaluated by substituting the definition of the spherical
harmonics in Eq. (1.9) and noting that αq are independent of l, leading to
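
\[
\begin{aligned}
s_{nm} &= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}}\, P_n^m(\cos\theta_q)\, e^{-im\phi_l} \\
&= \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \sum_{q=0}^{2N+1} \alpha_q P_n^m(\cos\theta_q) \sum_{l=0}^{2N+1} e^{-im\phi_l} \\
&= \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \sum_{q=0}^{2N+1} \alpha_q P_n^m(\cos\theta_q)\, (2N+2)\, \delta_{((m))_{2N+2}},
\end{aligned}
\tag{3.7}
\]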

where δm is a short notation for δm0 . The summation over l has been reduced to
a periodic delta function due to the uniform distribution of the samples along the
azimuth, where ((·)) N denotes modulo N . The modulo operation is also denoted
as (·) mod N in this book. In the range 0 ≤ n ≤ 2N + 1 with −N ≤ m ≤ N the
periodic delta function has only one non-zero term, therefore reducing to (2N + 2)δm .
The expression for $s_{nm}$ can therefore be simplified further in this limited range:

\[ s_{nm} = 2(N+1) \sqrt{\frac{2n+1}{4\pi}}\, \delta_m \sum_{q=0}^{2N+1} \alpha_q P_n(\cos\theta_q), \quad n \leq 2N+1. \tag{3.8} \]

The values of $\alpha_q$ are selected to satisfy

\[ 2(N+1) \sqrt{\frac{2n+1}{4\pi}} \sum_{q=0}^{2N+1} \alpha_q P_n(\cos\theta_q) = \sqrt{4\pi}\, \delta_n, \quad n \leq 2N+1, \tag{3.9} \]

which, due to the delta function on the right, reduces to

\[ \sum_{q=0}^{2N+1} \alpha_q P_n(\cos\theta_q) = \frac{2\pi}{N+1}\, \delta_n, \quad n \leq 2N+1. \tag{3.10} \]

This orthogonality condition amounts to finding 2N + 2 parameters $\alpha_q$ by solving
the system of 2N + 2 linear equations. The system of equations has a closed-form
solution given by [2]

\[ \alpha_q = \frac{2\pi}{(N+1)^2} \sin(\theta_q) \sum_{q'=0}^{N} \frac{1}{2q'+1} \sin\left( [2q'+1]\theta_q \right), \quad 0 \leq q \leq 2N+1. \tag{3.11} \]

Substituting Eq. (3.9) in Eq. (3.8), $s_{nm}$ can be written as

\[ s_{nm} = \sqrt{4\pi}\, \delta_n \delta_m + \tilde{s}_{nm}, \tag{3.12} \]


where s̃nm is non-zero only for n > 2N + 1. Therefore, the spherical harmonics
transform of the impulse train is non-zero for n = 0, m = 0, and zero elsewhere in
the range n ≤ 2N + 1.
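The closed-form weights can be checked against the orthogonality condition directly. The
sketch below assumes that Eq. (3.10) has the right-hand side $2\pi\delta_n/(N+1)$, which is
what Eq. (3.9) gives for n = 0, and that Eq. (3.11) carries the prefactor $2\pi/(N+1)^2$; the
helper names `legendre` and `equal_angle_weights` are hypothetical.

```python
from math import pi, sin, cos

def legendre(n, x):
    """Legendre polynomial P_n(x) by the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p

def equal_angle_weights(N):
    """Elevation angles of Eq. (3.4) and weights alpha_q of Eq. (3.11)."""
    thetas = [(q + 0.5) * pi / (2 * N + 2) for q in range(2 * N + 2)]
    alphas = [
        2 * pi / (N + 1) ** 2 * sin(t)
        * sum(sin((2 * k + 1) * t) / (2 * k + 1) for k in range(N + 1))
        for t in thetas
    ]
    return thetas, alphas

N = 3
thetas, alphas = equal_angle_weights(N)
# Left-hand side of Eq. (3.10) for every order n = 0, ..., 2N + 1.
lhs = [sum(a * legendre(n, cos(t)) for a, t in zip(alphas, thetas))
       for n in range(2 * N + 2)]
print([round(v, 12) for v in lhs])
```

Only the n = 0 entry is non-zero, which is the property that removes all terms of $s_{nm}$
other than n = 0, m = 0 in the range $n \leq 2N+1$.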
A sampled function on the sphere is now defined as f s (θ, φ) = f (θ, φ)s(θ, φ);
that is, an impulse train with the amplitude (area) of individual impulses being equal
to the amplitude of function f at the sampling points. The sampled function can be
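written in terms of the original function by using Eq. (3.12):

\[
\begin{aligned}
f_s(\theta, \phi) = f(\theta, \phi)\, s(\theta, \phi) &= f(\theta, \phi) \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \left[ \sqrt{4\pi}\, \delta_n \delta_m + \tilde{s}_{nm} \right] Y_n^m(\theta, \phi) \\
&= f(\theta, \phi) \left[ \sqrt{4\pi}\, Y_0^0(\theta, \phi) + \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \tilde{s}_{nm} Y_n^m(\theta, \phi) \right] \\
&= f(\theta, \phi) + f(\theta, \phi)\, \tilde{s}(\theta, \phi),
\end{aligned}
\tag{3.13}
\]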

where s̃(θ, φ) is the inverse spherical Fourier transform of s̃nm , containing spherical
harmonics orders of 2N + 2 and above. It is argued in [2] that, because f (θ, φ)
and s̃(θ, φ) are polynomials in cos θ generated by the associated Legendre function,
the lowest order of the product of the two functions in the spherical harmonics
domain is given by the minimal difference between the orders of the individual
functions. Assuming $f(\theta, \phi)$ is order-limited to $n \leq N$, and knowing that $\tilde{s}(\theta, \phi)$
contains only orders $n \geq 2N+2$, the minimal difference between the orders of the
spherical harmonic coefficients of the two functions is therefore $(2N+2) - N = N+2$.
It follows that the product $f(\theta, \phi)\tilde{s}(\theta, \phi)$ has a spherical Fourier transform
with coefficients that are zero in the range $n \leq N+1$, leading to the following
equality:

\[ f_{s_{nm}} = f_{nm}, \quad n \leq N. \tag{3.14} \]

This result implies that if an order-limited function with a maximum order N is
sampled using equal-angle sampling with 2N + 2 samples along the azimuth and
along the elevation, the replicas in the spherical harmonics domain will occur at
orders beyond N and so aliasing-free sampling is achieved; the sampled function
can be reconstructed by removing spherical harmonic coefficients of orders N + 1
and beyond. This is similar to the sampling of band-limited functions of time, for
example, where the sampled function has the same Fourier transform as the original
function at the operating bandwidth if the sampling condition is satisfied.
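Two results can be derived from the analysis presented above. First, the coefficients
$f_{nm}$, $n \leq N$, of an order-limited function $f(\theta, \phi)$ with a spherical Fourier transform
$f_{nm} = 0$, $n > N$ can be computed as follows:

\[
\begin{aligned}
f_{nm} &= f_{s_{nm}}, \quad n \leq N \\
&= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi)\, s(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q\, \delta(\cos\theta - \cos\theta_q)\, \delta(\phi - \phi_l) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q f(\theta_q, \phi_l) \left[ Y_n^m(\theta_q, \phi_l) \right]^*,
\end{aligned}
\tag{3.15}
\]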

with $\alpha_q$ given by Eq. (3.11). The sifting property, Eq. (1.52), together with Eqs. (3.5)
and (3.14), has been employed in the derivation. This equation has the same form as
Eq. (3.2) defined through quadrature computation, with $\alpha_q$, in this case, defining the
quadrature weights. Substituting $f(\theta, \phi) = Y_{n'}^{m'}(\theta, \phi)$, the orthogonality condition for
equal-angle sampling can be written in the same form as Eq. (3.3):

\[ \sum_{q=0}^{2N+1} \sum_{l=0}^{2N+1} \alpha_q Y_{n'}^{m'}(\theta_q, \phi_l) \left[ Y_n^m(\theta_q, \phi_l) \right]^* = \delta_{nn'}\delta_{mm'}, \quad n, n' \leq N. \tag{3.16} \]

Second, function f (θ, φ) can be reconstructed from the sampled function fs (θ, φ)
by applying an ideal low-pass filter in the spherical harmonics domain, with a cut-off
order of N . This low-pass filter should set to zero the coefficients of f snm for n > N ,
and keep unchanged the coefficients for $n \leq N$. Selecting the filter

\[ h(\theta, \phi) = \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{1}{2\pi} \sqrt{\frac{2n+1}{4\pi}}\, Y_n^m(\theta, \phi) \tag{3.17} \]

and applying spherical convolution, $f_s(\theta, \phi) * h(\theta, \phi)$, which transforms to
multiplication in the spherical harmonics domain [see Eq. (1.86)], $f_{nm}$ can be written as

\[
f_{nm} = 2\pi \sqrt{\frac{4\pi}{2n+1}}\, f_{s_{nm}} h_{n0}
= \begin{cases} f_{s_{nm}}, & n \leq N \\ 0, & \text{otherwise} \end{cases}, \tag{3.18}
\]

such that perfect reconstruction of $f(\theta, \phi)$ can be achieved.


3.3 Gaussian Sampling

The Gaussian sampling scheme described in this section requires only 2(N + 1)2
samples, which is half of the number of samples required by the equal-angle sampling
scheme. The azimuth angle is sampled at 2(N + 1) equal-angle samples, but the
elevation angle requires only (N + 1) samples, which are nearly equally spaced.
The mathematical formulation of the Gaussian sampling scheme is similar to the
formulation derived in Sect. 3.2 for the equal-angle scheme. There is, however, a
difference in that for the Gaussian sampling scheme, the orthogonality over the
summation of the Legendre functions

\[ \sum_{q=0}^{N} \alpha_q P_n(\cos\theta_q) = \frac{2\pi}{N+1}\, \delta_n, \quad n \leq 2N+1 \tag{3.19} \]

is not achieved by selecting 2(N + 1) equal-angle samples along θ; it is achieved by
selecting (N + 1) samples that are the zeros of $P_{N+1}(\cos\theta)$,

\[ P_{N+1}(\cos\theta_q) = 0, \quad 0 \leq q \leq N, \tag{3.20} \]

and the weights are given by [6]

\[ \alpha_q = \frac{\pi}{N+1}\, \frac{2(1 - \cos^2\theta_q)}{(N+2)^2 \left[ P_{N+2}(\cos\theta_q) \right]^2}, \quad 0 \leq q \leq N. \tag{3.21} \]

The coefficients can also be found in tables [7], which also provide the sampling
positions. The spherical Fourier transform is given in this case by

\[ f_{nm} = \sum_{q=0}^{N} \sum_{l=0}^{2N+1} \alpha_q f(\theta_q, \phi_l) \left[ Y_n^m(\theta_q, \phi_l) \right]^*, \quad n \leq N. \tag{3.22} \]
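Instead of reading the angles and weights from tables, they can be computed. The sketch
below is one possible approach rather than the method of [6] or [7]: it finds the zeros of
$P_{N+1}$ by Newton iteration from a standard initial guess and then evaluates Eq. (3.21),
assuming the right-hand side of Eq. (3.19) is $2\pi\delta_n/(N+1)$. The names `legendre_pair`
and `gaussian_sampling` are hypothetical.

```python
from math import pi, cos, acos

def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) from the three-term recurrence, n >= 1."""
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p, p_prev

def gaussian_sampling(N):
    """Elevation angles (zeros of P_{N+1}, Eq. 3.20) and weights (Eq. 3.21)."""
    M = N + 1
    thetas, alphas = [], []
    for i in range(M):
        x = cos(pi * (i + 0.75) / (M + 0.5))        # initial guess for the i-th zero
        for _ in range(50):                          # Newton iterations
            p, p_prev = legendre_pair(M, x)
            dp = M * (x * p - p_prev) / (x * x - 1.0)
            x -= p / dp
        p_next, _ = legendre_pair(M + 1, x)          # P_{N+2} at the zero
        alphas.append(pi / (N + 1) * 2 * (1 - x * x) / ((N + 2) ** 2 * p_next ** 2))
        thetas.append(acos(x))
    return thetas, alphas

thetas, alphas = gaussian_sampling(7)
print(2 * (7 + 1) * len(thetas))   # total number of samples for N = 7
```

For N = 7 this gives 8 elevation angles and, with 16 azimuth samples, the 128 samples of
the Gaussian scheme.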

The advantage of the Gaussian sampling scheme is the reduced number of sample
points for a given order N compared with the equal-angle sampling scheme. The
drawback is the potential inconvenience of the unequal spacing along θ, for example
when microphones are rotated mechanically and an equal-step rotation would be
preferable.
Figure 3.2 illustrates an example of a Gaussian sampling distribution for N = 7
and a total of 128 samples. The figure shows the samples plotted on the surface
of a unit sphere and over the θ φ plane. The figure shows the features of Gaussian
sampling – twice as many samples are distributed along the azimuth compared to the
elevation, while, similar to the equal-angle sampling scheme, the samples are more
dense near the poles.
Fig. 3.2 Gaussian sampling distribution for N = 7 and a total of 128 samples, illustrated on the
surface of a unit sphere and over the θφ plane

3.4 Uniform and Nearly-Uniform Sampling

The equal-angle and Gaussian sampling schemes have a uniform (or nearly-uniform)
distribution of samples along θ and φ, but, as illustrated in Figs. 3.1 and 3.2, the
distributions are not uniform on the surface of the sphere. An attempt to distribute
sampling points uniformly around the surface of a sphere leads directly to the five
convex regular polyhedra, known as Platonic solids, named after the Greek philoso-
pher Plato. Figure 3.3 shows the five Platonic solids, namely, the tetrahedron, the
cube (or hexahedron), the octahedron, the dodecahedron and the icosahedron. The
Greek prefix denotes the number of faces for each Platonic solid (see Table 3.1).

Fig. 3.3 The five Platonic solids; from left to right, top row: tetrahedron and cube, bottom row:
octahedron, dodecahedron and icosahedron

Table 3.1 Properties of the sampling designs based on the five Platonic solids: the number of
faces, the number of vertices representing the sampling points (Q), the t-design order and the
corresponding maximum spherical harmonics order, calculated as N = ⌊t/2⌋

Design              Faces   Vertices (Q)   t-design   N = ⌊t/2⌋
Tetrahedron         4       4              2          1
Hexahedron (cube)   6       8              3          1
Octahedron          8       6              3          1
Dodecahedron        12      20             5          2
Icosahedron         20      12             5          2

The vertices of each of these polyhedra can be considered as sampling points on a
circumscribed sphere, having a total number of samples as shown in the table.
Sampling points based on the Platonic solids satisfy the following quadrature
relation [4]:

\[ \int_0^{2\pi} \int_0^{\pi} g(\theta, \phi) \sin\theta \, d\theta \, d\phi = \frac{4\pi}{Q} \sum_{q=1}^{Q} g(\theta_q, \phi_q), \tag{3.23} \]

such that the sampling weights, in reference to Eq. (3.1), are constant, satisfying
αq = 4π/Q. Equation (3.23) holds for an order-limited function, with an upper order
denoted by t in a t-design that is defined for each Platonic solid, as shown in Table 3.1.
The term t-design is used in spherical designs that aim to find a set of Q points on a
sphere such that Eq. (3.23) holds for a function of a polynomial order t or lower [4].
Spherical designs can be used for the sampling of order-limited functions represented
by spherical harmonics, by replacing $g(\theta, \phi)$ with $f(\theta, \phi)[Y_n^m(\theta, \phi)]^*$, such that
Eq. (3.23) can be written in the form of Eq. (3.2) as

\[
\begin{aligned}
f_{nm} &= \int_0^{2\pi} \int_0^{\pi} f(\theta, \phi) \left[ Y_n^m(\theta, \phi) \right]^* \sin\theta \, d\theta \, d\phi \\
&= \frac{4\pi}{Q} \sum_{q=1}^{Q} f(\theta_q, \phi_q) \left[ Y_n^m(\theta_q, \phi_q) \right]^*, \quad n \leq N.
\end{aligned}
\tag{3.24}
\]

Assuming that $f(\theta, \phi)$ has a maximum order N, and substituting $Y_n^m(\theta, \phi)$ for $n \leq N$,
the maximum order of the product $f(\theta, \phi)[Y_n^m(\theta, \phi)]^*$ is 2N. The quadrature must
therefore be exact up to polynomial order t = 2N, which gives the relation N = ⌊t/2⌋
for a given value of t, where ⌊·⌋ denotes the floor function, as presented in Table 3.1.
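The role of the t-design order can be illustrated with a simple polynomial. The sketch below
averages $x^4$ over the vertices of an octahedron (t = 3) and an icosahedron (t = 5) and
compares the result with the exact mean of $x^4$ over the unit sphere, which is 1/5. The
helper names `platonic_vertices` and `sphere_mean` are hypothetical.

```python
from math import sqrt

def platonic_vertices(name):
    """Unit-norm Cartesian vertices of the octahedron or the icosahedron."""
    if name == "octahedron":
        pts = ([(s, 0, 0) for s in (1, -1)] + [(0, s, 0) for s in (1, -1)]
               + [(0, 0, s) for s in (1, -1)])
    elif name == "icosahedron":
        g = (1 + sqrt(5)) / 2                        # golden ratio
        base = [(0, a, b * g) for a in (1, -1) for b in (1, -1)]
        # cyclic permutations of (0, +-1, +-g)
        pts = (base + [(z, x, y) for x, y, z in base]
               + [(y, z, x) for x, y, z in base])
    else:
        raise ValueError(name)
    return [tuple(c / sqrt(sum(v * v for v in p)) for c in p) for p in pts]

def sphere_mean(points, func):
    """(1/Q) * sum of func over the points, i.e. Eq. (3.23) divided by 4*pi."""
    return sum(func(*p) for p in points) / len(points)

ico = platonic_vertices("icosahedron")
octa = platonic_vertices("octahedron")
x4 = lambda x, y, z: x ** 4
print(sphere_mean(ico, x4), sphere_mean(octa, x4))
```

The icosahedron reproduces the exact value 1/5, whereas the octahedron gives 1/3: a
polynomial of order 4 exceeds t = 3, which is why the octahedron supports only N = 1.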
Sampling distributions based on the Platonic solids and Table 3.1 offer uniform
distributions of samples, with a simple equation to compute the spherical Fourier
transform. However, they are available only for a limited number of configurations,
and for a maximum of 20 samples, supporting a maximum order of only N = 2.
This limited number of samples offered by the Platonic solids in a uniform-sampling
configuration motivated the search for methods to distribute a larger number of sam-
ples on the sphere in an almost-uniform manner. A wide range of methods have been
presented in the literature. Some are optimal in the sense that an objective function
is defined, after which the position of the samples and the corresponding sampling
weights are computed via numerical optimization. Other methods are characterized
by a special procedure for selecting the samples, or by other characteristics, such as
constant sampling weights. This section briefly reviews some of these methods.
Hardin and Sloane [4] extend the t-designs of the Platonic solids to a larger
set of sampling configurations, each satisfying Eq. (3.24) for some t value and a
corresponding order N . Similar to the Platonic solids, these designs offer an almost
uniform distribution of samples, with the convenience of constant sampling weights.
Although Hardin and Sloane computed and published the coordinates of a large
number of sampling sets, these sets are not available for any number of desired
samples, Q.
Saff and Kuijlaars [13] present an overview of approaches and methods for dis-
tributing many points on a sphere. They outline objectives for distributing points on
the sphere, which include maximizing the smallest distance between all points on the
sphere and minimizing the “energy” of points on the sphere. The latter is derived by
considering each point to be a charged particle repelling all other particles; therefore,
minimizing the sum of the inverse of the distances between these particles is analogous
to minimizing “energy”. The latter objective was also used by Fliege and Maier
[3], who presented a numerical method for computing the sampling positions and
weights. This method was recently employed in the design of spherical microphone
arrays [9].
Other approaches are characterized by the way in which the points are selected.
Equal-area partitioning aims to partition the sphere surface into equal area segments,
which each have a minimal diameter. One such method, described in [13] by Saff
and Kuijlaars and, more recently, by Leopardi [8], partitions the sphere surface into
azimuthal strips, each further divided into sections, with each section having the
same area. Sampling points are then positioned, one in each area element. Another
method described in [13] distributes points on spirals covering the sphere surface,
providing a relatively simple approach for a nearly-uniform distribution of samples.
Figure 3.4 illustrates an example of a uniform sampling distribution defined by
the vertices of a dodecahedron, with N = 2 and a total of 20 samples. Figure 3.5
illustrates a t-design with N = 8 and a total of 144 samples. Both figures show the
uniform distribution of the samples over the sphere surface and the non-uniform
distribution over the θ φ plane.
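As an illustration of a spiral-type construction, the sketch below places Q points along a
golden-angle (Fibonacci) spiral with equal steps in $z = \cos\theta$. This is a commonly used
variant chosen here for its simplicity; it is not claimed to be the specific spiral
described in [13], and the function names are hypothetical.

```python
from math import sqrt, pi, cos, sin, acos

def fibonacci_sphere(Q):
    """Q nearly-uniform points (theta, phi) on the unit sphere."""
    golden_angle = pi * (3 - sqrt(5))
    points = []
    for k in range(Q):
        z = 1 - (2 * k + 1) / Q                # equal-area bands in z = cos(theta)
        points.append((acos(z), (k * golden_angle) % (2 * pi)))
    return points

def to_cartesian(theta, phi):
    """Cartesian coordinates of a point on the unit sphere."""
    return (sin(theta) * cos(phi), sin(theta) * sin(phi), cos(theta))

points = fibonacci_sphere(100)
xyz = [to_cartesian(t, p) for t, p in points]
min_dist = min(
    sqrt(sum((a - b) ** 2 for a, b in zip(xyz[i], xyz[j])))
    for i in range(len(xyz)) for j in range(i + 1, len(xyz))
)
print(min_dist)   # smallest distance between any two points
```

Unlike the t-designs, such a point set comes without a guarantee that constant weights
$4\pi/Q$ integrate polynomials exactly, so sampling weights would still have to be
computed separately.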

3.5 Numerical Computation of Sampling Weights

Equal-angle, Gaussian and nearly-uniform sampling methods provide both sampling
positions and sampling weights, such that the spherical Fourier coefficients can be
computed directly using Eq. (3.2). In some cases, it may not be feasible to select sam-
pling sets from these or other predefined sampling configurations due to mechanical
constraints in the positioning of microphones, for example. Therefore, methods that
facilitate the computation of sampling weights for any given sampling set and then
employ Eq. (3.2) to compute the spherical Fourier coefficients may be of great value
in practice.
Consider an order-limited function $f(\theta, \phi)$ defined on the unit sphere, satisfying
$f_{nm} = 0 \;\forall\; n > N$. The samples of the function, $f(\theta_q, \phi_q)$, are given, together with
the positions of the samples, $(\theta_q, \phi_q)$, for $q = 1, \ldots, Q$. Using the inverse spherical
Fourier transform, Eq. (1.40), the samples can be written as a function of the Fourier
coefficients as

\[ f(\theta_q, \phi_q) = \sum_{n=0}^{N} \sum_{m=-n}^{n} f_{nm} Y_n^m(\theta_q, \phi_q), \quad 1 \leq q \leq Q. \tag{3.25} \]

This equation can be written in a matrix form as

f = Yfnm , (3.26)

where column vectors $\mathbf{f}$ of length Q and $\mathbf{f}_{nm}$ of length $(N+1)^2$ are defined as

\[ \mathbf{f} = \left[ f(\theta_1, \phi_1), f(\theta_2, \phi_2), \ldots, f(\theta_Q, \phi_Q) \right]^T \tag{3.27} \]

and

\[ \mathbf{f}_{nm} = \left[ f_{00}, f_{1(-1)}, f_{10}, f_{11}, \ldots, f_{NN} \right]^T, \tag{3.28} \]

and the matrix $\mathbf{Y}$ of dimensions $Q \times (N+1)^2$ is given by

\[
\mathbf{Y} = \begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & Y_1^{-1}(\theta_1, \phi_1) & Y_1^0(\theta_1, \phi_1) & Y_1^1(\theta_1, \phi_1) & \cdots & Y_N^N(\theta_1, \phi_1) \\
Y_0^0(\theta_2, \phi_2) & Y_1^{-1}(\theta_2, \phi_2) & Y_1^0(\theta_2, \phi_2) & Y_1^1(\theta_2, \phi_2) & \cdots & Y_N^N(\theta_2, \phi_2) \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
Y_0^0(\theta_Q, \phi_Q) & Y_1^{-1}(\theta_Q, \phi_Q) & Y_1^0(\theta_Q, \phi_Q) & Y_1^1(\theta_Q, \phi_Q) & \cdots & Y_N^N(\theta_Q, \phi_Q)
\end{bmatrix}. \tag{3.29}
\]

Fig. 3.4 Uniform sampling distribution for N = 2 and a total of 20 samples, illustrated on the
surface of a unit sphere and over the θφ plane

Fig. 3.5 Nearly-uniform sampling distribution for N = 8 and a total of 144 samples, illustrated on
the surface of a unit sphere and over the θφ plane

For the special case of $Q = (N+1)^2$, the system of equations defined in Eq. (3.26)
can be solved by taking the inverse of matrix $\mathbf{Y}$:

\[ \mathbf{f}_{nm} = \mathbf{Y}^{-1} \mathbf{f}. \tag{3.30} \]

To compute $\mathbf{f}_{nm}$ it is required that matrix $\mathbf{Y}$ is invertible. In many cases over-sampling
is employed, such that $Q > (N+1)^2$. The linear system of equations in (3.26) is then
over-determined, with a solution in a least-squares sense given by the pseudo-inverse:

\[ \mathbf{f}_{nm} = \mathbf{Y}^{\dagger} \mathbf{f}, \tag{3.31} \]

with $\mathbf{Y}^{\dagger} = (\mathbf{Y}^H \mathbf{Y})^{-1} \mathbf{Y}^H$. For the case $Q < (N+1)^2$, the number of samples is
insufficient, signifying under-sampling, and Eq. (3.26) may not provide the correct
solution.
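The over-sampled case can be sketched with NumPy. The example below assumes the spherical
harmonics follow the usual orthonormal definition with the Condon-Shortley phase (which is
taken here to match Eq. (1.9)), draws arbitrary random sample positions, and recovers the
coefficients of an order-limited function with `numpy.linalg.pinv`. The helpers `sph_harm`
and `sh_matrix` are hypothetical names written for this illustration.

```python
import numpy as np
from math import factorial

def sph_harm(n, m, theta, phi):
    """Y_n^m(theta, phi); theta is the elevation from the z-axis, phi the azimuth."""
    if m < 0:
        return (-1) ** m * np.conj(sph_harm(n, -m, theta, phi))
    x = np.cos(theta)
    p_mm = np.ones_like(x)                      # P_m^m with Condon-Shortley phase
    for k in range(1, m + 1):
        p_mm = -p_mm * (2 * k - 1) * np.sqrt(1 - x * x)
    p_prev, p = np.zeros_like(x), p_mm
    for k in range(m + 1, n + 1):               # upward recurrence in the degree
        p_prev, p = p, ((2 * k - 1) * x * p - (k + m - 1) * p_prev) / (k - m)
    norm = np.sqrt((2 * n + 1) / (4 * np.pi) * factorial(n - m) / factorial(n + m))
    return norm * p * np.exp(1j * m * phi)

def sh_matrix(N, theta, phi):
    """Q x (N+1)^2 matrix Y of Eq. (3.29); zero-based column index n^2 + n + m."""
    cols = [sph_harm(n, m, theta, phi)
            for n in range(N + 1) for m in range(-n, n + 1)]
    return np.stack(cols, axis=1)

rng = np.random.default_rng(0)
N, Q = 2, 30                                    # Q > (N + 1)^2: over-sampling
theta = np.arccos(rng.uniform(-1, 1, Q))
phi = rng.uniform(0, 2 * np.pi, Q)
Y = sh_matrix(N, theta, phi)

f_nm_true = rng.standard_normal((N + 1) ** 2) + 1j * rng.standard_normal((N + 1) ** 2)
f = Y @ f_nm_true                               # Eq. (3.26)
f_nm = np.linalg.pinv(Y) @ f                    # Eq. (3.31)
print(np.max(np.abs(f_nm - f_nm_true)))
```

With noise-free samples of an order-limited function the recovered coefficients agree with
the original ones up to numerical precision, provided the random positions give a
well-conditioned matrix.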
Equations (3.30) and (3.31) can be used to find f nm for a general sampling set,
from which the function on the sphere, f (θ, φ), can be reconstructed using the
inverse spherical Fourier transform. This is employed below to formulate the com-
putation of f nm in a more standard manner, i.e. as the sum of the product of sam-
ples and sampling weights. Equations (3.30) or (3.31) are rewritten in the following
form:

\[ f_{nm} = \sum_{q=1}^{Q} \alpha_q^{nm} f(\theta_q, \phi_q). \tag{3.32} \]

Equation (3.32) has a form similar to Eq. (3.2), and so $\alpha_q^{nm}$ can be considered as the
sampling weights used to compute $f_{nm}$ given the samples $f(\theta_q, \phi_q)$. Note that in
this case the value of the weights may vary independently as a function of n and m.
Furthermore, the similarity between Eq. (3.32) and Eqs. (3.30) and (3.31) suggests
that the sampling weights, $\alpha_q^{nm}$, are the elements of matrices $\mathbf{Y}^{-1}$ or $\mathbf{Y}^{\dagger}$, having a
row index given by $(n^2 + n + m)$ and a column index given by q.
The problem of first computing weights and then fnm given samples of the function
is related to the problem of interpolating a function given the samples. Substituting
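Eq. (3.32) into the inverse spherical Fourier transform, Eq. (1.40), leads to the
following derivation:

\[
\begin{aligned}
f(\theta, \phi) = \sum_{n=0}^{N} \sum_{m=-n}^{n} f_{nm} Y_n^m(\theta, \phi)
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} \left[ \sum_{q=1}^{Q} \alpha_q^{nm} f(\theta_q, \phi_q) \right] Y_n^m(\theta, \phi) \\
&= \sum_{q=1}^{Q} \left[ \sum_{n=0}^{N} \sum_{m=-n}^{n} \alpha_q^{nm} Y_n^m(\theta, \phi) \right] f(\theta_q, \phi_q) \\
&= \sum_{q=1}^{Q} \alpha_q(\theta, \phi) f(\theta_q, \phi_q),
\end{aligned}
\tag{3.33}
\]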

where $\alpha_q(\theta,\phi)$ is the inverse spherical Fourier transform of $\alpha_q^{nm}$. Functions $\alpha_q(\theta,\phi)$
can be considered as interpolating functions, such that when multiplied by the
value of the samples, $f(\theta_q,\phi_q)$, and added together, they provide the values of
$f(\theta,\phi)$ in between the samples. This is in line with the interpolatory quadrature
method [1].

3.6 The Discrete Spherical Fourier Transform

Equations (3.26) and (3.31), derived in Sect. 3.5, can be considered to be discrete
versions of the spherical Fourier transform and its inverse, presented in Eqs. (1.40)
and (1.41). Therefore, these are denoted the discrete spherical Fourier transform and
its inverse:

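$$
\begin{aligned}
\mathbf{f}_{nm} &= \mathbf{Y}^{\dagger}\mathbf{f} \\
\mathbf{f} &= \mathbf{Y}\mathbf{f}_{nm}.
\end{aligned}
\tag{3.34}
$$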

For the special cases of the equal-angle, Gaussian and uniform sampling
configurations, where closed-form expressions are available for the sampling weights, the
discrete spherical Fourier transform can be computed without the need for matrix
inversion, using

$$\mathbf{f}_{nm} = \mathbf{Y}^{H}\operatorname{diag}(\boldsymbol{\alpha})\,\mathbf{f}, \tag{3.35}$$

where the column vector

$$\boldsymbol{\alpha} = [\alpha_1, \alpha_2, \ldots, \alpha_Q]^{T} \tag{3.36}$$

holds the sampling weights, one for each of the $Q$ samples. Equation (3.35) is a matrix representation of Eq. (3.2).
Substituting Eq. (3.35) into the inverse discrete spherical Fourier transform in
Eq. (3.34), the following holds:

$$\mathbf{Y}^{H}\operatorname{diag}(\boldsymbol{\alpha})\,\mathbf{Y} = \mathbf{I}, \tag{3.37}$$

which shows the orthogonality of the weighted columns in the spherical harmonics
matrix $\mathbf{Y}$. Furthermore, for the uniform and nearly-uniform sampling configurations,
in which $\alpha_q$ are constants equal to $4\pi/Q$, Eqs. (3.35) and (3.37) reduce to

$$\mathbf{f}_{nm} = \frac{4\pi}{Q}\mathbf{Y}^{H}\mathbf{f} \tag{3.38}$$

$$\frac{4\pi}{Q}\mathbf{Y}^{H}\mathbf{Y} = \mathbf{I}. \tag{3.39}$$

The three forms of the discrete spherical Fourier transform, Eqs. (3.34), (3.35),
and (3.38), can be written in a unified manner by defining matrix $\mathbf{S}$ such that

$$\mathbf{f}_{nm} = \mathbf{S}\mathbf{f}. \tag{3.40}$$

In the case of a general sampling scheme, matrix $\mathbf{S}$ is given by

$$\mathbf{S} = \mathbf{Y}^{\dagger}, \tag{3.41}$$

in the case of equal-angle and Gaussian sampling schemes it is given by

$$\mathbf{S} = \mathbf{Y}^{H}\operatorname{diag}(\boldsymbol{\alpha}), \tag{3.42}$$

and for uniform and nearly-uniform sampling schemes it is given by

$$\mathbf{S} = \frac{4\pi}{Q}\mathbf{Y}^{H}. \tag{3.43}$$

Equation (3.39) suggests that matrix $\sqrt{4\pi/Q}\,\mathbf{Y}$ is unitary, when square. This property
is similar to the property of discrete Fourier transform (DFT) matrices; therefore,
uniform and nearly-uniform sampling schemes with the associated discrete spherical
Fourier transform matrices can be considered to be equivalent to the DFT matrices
in the time domain.
An important property of unitary matrices is that all their singular values are equal
to one, so that their condition number is one. Sampling schemes in which the samples are distributed less
uniformly over the sphere will produce a spread in the magnitude of the singular values of $\mathbf{Y}$,
and so the matrix inversion process required in the computation of the spherical
Fourier transform may have reduced numerical robustness. This motivates the design
of sampling sets that distribute samples on the sphere surface in an approximately
uniform manner.
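The effect of sample distribution on conditioning can be checked with a small experiment. The sketch below assumes NumPy, order $N = 1$, and two hypothetical sampling sets: the six vertices of an octahedron, used here as a simple uniform set, and six points clustered near the north pole. The helper `sh_matrix_order1` is an illustrative name, not part of any library.

```python
import numpy as np

def sh_matrix_order1(theta, phi):
    """Q x 4 spherical harmonics matrix for N = 1, columns indexed by n**2 + n + m."""
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    y00 = np.full(theta.shape, 1.0 / np.sqrt(4 * np.pi), dtype=complex)
    y1m1 = np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(-1j * phi)
    y10 = np.sqrt(3 / (4 * np.pi)) * np.cos(theta) + 0j
    y11 = -np.sqrt(3 / (8 * np.pi)) * np.sin(theta) * np.exp(1j * phi)
    return np.column_stack([y00, y1m1, y10, y11])

# Uniform set (assumption): two poles and four equally spaced equatorial points.
theta_u = np.array([0.0, np.pi, np.pi / 2, np.pi / 2, np.pi / 2, np.pi / 2])
phi_u = np.array([0.0, 0.0, 0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
Y_u = sh_matrix_order1(theta_u, phi_u)
Q = len(theta_u)
gram_u = (4 * np.pi / Q) * Y_u.conj().T @ Y_u   # left-hand side of Eq. (3.39)

# Clustered set (assumption): all samples close to the north pole.
theta_c = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
phi_c = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
Y_c = sh_matrix_order1(theta_c, phi_c)

cond_u = np.linalg.cond(Y_u)   # ratio of largest to smallest singular value
cond_c = np.linalg.cond(Y_c)
print(f"uniform: cond = {cond_u:.3f}, clustered: cond = {cond_c:.1f}")
```

For the uniform set the Gram matrix should reproduce the identity and the condition number should be one, whereas the clustered set should yield a markedly larger condition number.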
Similar to the fast Fourier transform (FFT), developed to compute the DFT effi-
ciently, studies proposing fast and efficient computations of the spherical Fourier
transform have been published. The reader is referred to [10], for example, for fur-
ther reading on this topic.

3.7 Spatial Aliasing

Sampling of order-limited functions on the sphere with an appropriate sampling
scheme should lead to an exact and aliasing-free computation of the spherical
harmonic coefficients. However, in practice, high-order harmonics of a sampled
function may not be zero, and so it may be useful to understand the way in which errors
may occur in non-ideal sampling and to provide ways to analyze and describe these
errors. Consider a function on the sphere, $f(\theta,\phi)$, with a spherical Fourier
transform $f_{nm}$ of infinite order. The function is sampled at $Q$ sampling points, denoted by
$(\theta_q,\phi_q)$, $q = 1,\ldots,Q$. Assuming, at this stage, an arbitrary set of sampling points,
and substituting the inverse spherical Fourier transform, Eq. (1.40), in the general
form of the discrete spherical Fourier transform, Eq. (3.32), a relation between $f_{nm}$
and the values approximated from the samples, denoted by $\hat{f}_{nm}$, is derived:


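$$
\begin{aligned}
\hat{f}_{nm} &= \sum_{q=1}^{Q}\alpha_q^{nm} f(\theta_q,\phi_q) \\
&= \sum_{q=1}^{Q}\alpha_q^{nm}\sum_{n'=0}^{\infty}\sum_{m'=-n'}^{n'} f_{n'm'}\, Y_{n'}^{m'}(\theta_q,\phi_q) \\
&= \sum_{n'=0}^{\infty}\sum_{m'=-n'}^{n'}\left[\sum_{q=1}^{Q}\alpha_q^{nm}\, Y_{n'}^{m'}(\theta_q,\phi_q)\right] f_{n'm'} \\
&= \sum_{n'=0}^{\infty}\sum_{m'=-n'}^{n'}\varepsilon_{nm}^{n'm'}\, f_{n'm'},
\end{aligned}
\tag{3.44}
$$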

where

$$\varepsilon_{nm}^{n'm'} = \sum_{q=1}^{Q}\alpha_q^{nm}\, Y_{n'}^{m'}(\theta_q,\phi_q) \tag{3.45}$$

has been defined to denote the contribution of each coefficient $f_{n'm'}$ to the
approximation of coefficient $\hat{f}_{nm}$. Under conditions of ideal, aliasing-free sampling, $\varepsilon_{nm}^{n'm'}$
should equal one for $(n, m) = (n', m')$ and zero elsewhere.


It may be convenient to represent Eq. (3.44) in a matrix form, in which case the
spherical harmonic coefficients of the original function before sampling, $f_{n'm'}$, need
to be order-limited. However, the limit can be extended to very high orders, denoted
here $\tilde{N}$, beyond which the magnitude of $f_{n'm'}$ may be insignificant. Equation (3.44)
is now written as

$$\hat{\mathbf{f}}_{nm} = \mathbf{E}\,\mathbf{f}_{nm}, \tag{3.46}$$

where column vector $\hat{\mathbf{f}}_{nm}$ of length $(N+1)^2$ holds the approximated spherical
harmonic coefficients $\hat{f}_{nm}$, column vector $\mathbf{f}_{nm}$ of length $(\tilde{N}+1)^2$ holds the spherical
harmonic coefficients $f_{nm}$ of the original function, with $\tilde{N} \geq N$ and potentially very
large, and matrix $\mathbf{E}$ is of dimensions $(N+1)^2 \times (\tilde{N}+1)^2$, having elements $\varepsilon_{nm}^{n'm'}$,
with row index $(n^2 + n + m)$ and column index $(n'^2 + n' + m')$.
Sampling schemes that guarantee aliasing-free sampling for order-limited
functions satisfying $Q \geq (N+1)^2$, where $N$ is the order limit, should produce a matrix
$\mathbf{E}$ with the top-left part of dimensions $(N+1)^2 \times (N+1)^2$ being the unit matrix $\mathbf{I}$.
In this case, only orders higher than $N$ may produce aliasing. For an arbitrary
sampling scheme, with $\alpha_q^{nm}$ denoting the elements of $\mathbf{Y}^{\dagger}$ (see Sect. 3.5), matrix $\mathbf{E}$ can be
written as

$$\mathbf{E} = \mathbf{Y}^{\dagger}\tilde{\mathbf{Y}}, \tag{3.47}$$

where matrix $\mathbf{Y}$ of dimensions $Q \times (N+1)^2$ has been defined in Eq. (3.29) and
matrix $\tilde{\mathbf{Y}}$, holding the values of $Y_{n'}^{m'}(\theta_q,\phi_q)$ as in Eq. (3.45), is of dimensions $Q \times
(\tilde{N}+1)^2$. For equal-angle and Gaussian sampling, the sampling weights are provided
in closed form and no matrix inversion is required. In these cases matrix $\mathbf{E}$ can be
written as

$$\mathbf{E} = \mathbf{Y}^{H}\operatorname{diag}(\boldsymbol{\alpha})\,\tilde{\mathbf{Y}}, \tag{3.48}$$

where vector $\boldsymbol{\alpha}$ holds the sampling weights, as in Eq. (3.35). In the case of uniform
and nearly-uniform sampling, the expression for $\mathbf{E}$ is further simplified due to the
constant sampling weights, and is written as

$$\mathbf{E} = \frac{4\pi}{Q}\mathbf{Y}^{H}\tilde{\mathbf{Y}}. \tag{3.49}$$

Fig. 3.6 Elements of the aliasing matrix $\varepsilon_{nm}^{n'm'}$ (in dB) for $n \leq 3$ and $n' \leq 9$ and for three sampling
configurations; a equal-angle (64 samples), b Gaussian (32 samples) and c nearly-uniform (32 samples)

The magnitude of the elements of matrix $\mathbf{E}$, i.e. $|\varepsilon_{nm}^{n'm'}|$, are presented in Fig. 3.6 for the
three sampling configurations; equal-angle, Gaussian and nearly uniform, for N = 3
and Ñ = 9. The values of (n, m) are presented on a single axis, with a running index
n 2 + n + m, where sections of equal order n are partitioned by a horizontal line. The
values of (n  , m  ) are presented similarly. The figure shows the manner in which high
orders, n  > N , are aliased into lower orders. The figure shows that not all elements
(n  , m  ) contribute to the aliasing error in each (n, m).
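The running index used for the axes of Fig. 3.6, and for the rows and columns of $\mathbf{E}$, maps each pair $(n, m)$ to a single integer. A small sketch of this mapping and its inverse follows; the function names are illustrative only.

```python
import math

def nm_to_index(n, m):
    """Running index n**2 + n + m (starting at 0 for n = m = 0)."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return n * n + n + m

def index_to_nm(idx):
    """Inverse mapping: recover (n, m) from the running index."""
    n = math.isqrt(idx)          # n**2 <= idx < (n + 1)**2
    return n, idx - n * n - n

# All indices for orders 0..N are contiguous and there are (N + 1)**2 of them.
N = 3
indices = [nm_to_index(n, m) for n in range(N + 1) for m in range(-n, n + 1)]
```

With this convention, orders $n \leq 3$ occupy indices 0 to 15 and orders $n' \leq 9$ occupy indices 0 to 99, which matches the dimensions of $\mathbf{E}$ for $N = 3$ and $\tilde{N} = 9$.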
An example is presented next to illustrate the process of sampling and aliasing.
Consider a function on the sphere:

$$
\begin{aligned}
f(\theta,\phi) &= f_1(\theta,\phi) + f_2(\theta,\phi) \\
&= \sqrt{4\pi}\, Y_0^0(\theta,\phi) + \frac{1}{2}\sqrt{\frac{1024\pi}{693}}\left[Y_5^{-5}(\theta,\phi) - Y_5^{5}(\theta,\phi)\right].
\end{aligned}
\tag{3.50}
$$

Fig. 3.7 Function $f(\theta,\phi)$, as in Eq. (3.50), illustrated with a balloon plot (upper plot) and with
function elements $f_1(\theta,\phi)$ and $f_2(\theta,\phi)$ illustrated separately (lower plots, left and right, respectively)

The function is composed of the zero-order spherical harmonic normalized to unit
magnitude and two spherical harmonics of order $n = 5$ and degrees $m = -5, 5$,
which, when combined after normalization, form a real function of unit magnitude.
The function is illustrated in Fig. 3.7 with the separate components shown as well. The
function has been sampled using an equal-angle sampling scheme designed for $N = 3$,
with 64 sampling points. Note that this sampling scheme guarantees aliasing-free
sampling only for functions order-limited to $N = 3$. Figure 3.6a suggests that, using
this sampling scheme, elements of the function of order $n = 5$ will be aliased to order
$n = 3$, without significant scaling. In particular, $Y_5^{-5}(\theta,\phi)$ will be aliased to $Y_3^{3}(\theta,\phi)$
and $Y_5^{5}(\theta,\phi)$ will be aliased to $Y_3^{-3}(\theta,\phi)$. The axes on Fig. 3.6 denote the running
index $n^2 + n + m$ and so $n' = 5$, $m' = -5$, which represents $n'^2 + n' + m' = 25$, will
be aliased to $n = 3$, $m = 3$, which represents $n^2 + n + m = 15$ on the figure. After
sampling, $f(\theta_q,\phi_q)$, $q = 1,\ldots,64$ is employed for reconstruction using Eq. (3.15).
The reconstructed function is denoted $\hat{f}(\theta,\phi)$ and is given by

$$
\begin{aligned}
\hat{f}(\theta,\phi) &= \hat{f}_1(\theta,\phi) + \hat{f}_2(\theta,\phi) \\
&\approx \sqrt{4\pi}\, Y_0^0(\theta,\phi) + \frac{1}{2}\sqrt{\frac{64\pi}{35}}\left[Y_3^{-3}(\theta,\phi) - Y_3^{3}(\theta,\phi)\right].
\end{aligned}
\tag{3.51}
$$

This function is illustrated in Fig. 3.8. The figure confirms the sampling and aliasing
process described above: the spherical harmonic of order zero is reconstructed with
no error, while spherical harmonics of order n = 5 are aliased to spherical harmonics
of order n = 3.
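The normalization constants in Eqs. (3.50) and (3.51) can be checked in closed form. The following worked step assumes the standard spherical harmonic normalization with the Condon-Shortley phase, for which $Y_n^{-m} = (-1)^m [Y_n^m]^*$; this convention is taken as an assumption here, since it is defined outside this section.

$$
\begin{aligned}
Y_5^{\pm 5}(\theta,\phi) &= \mp\frac{3}{32}\sqrt{\frac{77}{\pi}}\,\sin^5\theta\, e^{\pm i5\phi}
&&\Rightarrow\quad
f_2(\theta,\phi) = \frac{1}{2}\sqrt{\frac{1024\pi}{693}}\cdot\frac{3}{16}\sqrt{\frac{77}{\pi}}\,\sin^5\theta\cos 5\phi = \sin^5\theta\cos 5\phi, \\
Y_3^{\pm 3}(\theta,\phi) &= \mp\frac{1}{8}\sqrt{\frac{35}{\pi}}\,\sin^3\theta\, e^{\pm i3\phi}
&&\Rightarrow\quad
\hat{f}_2(\theta,\phi) \approx \frac{1}{2}\sqrt{\frac{64\pi}{35}}\cdot\frac{1}{4}\sqrt{\frac{35}{\pi}}\,\sin^3\theta\cos 3\phi = \sin^3\theta\cos 3\phi.
\end{aligned}
$$

Both components are therefore real and of unit peak magnitude, and sampling replaces a pattern with five azimuthal periods by one with three.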

Fig. 3.8 Function $f(\theta,\phi)$, as in Eq. (3.50), after sampling using an equal-angle sampling
scheme with $N = 3$ and after reconstruction using Eq. (3.15), leading to $\hat{f}(\theta,\phi)$, as in Eq. (3.51),
illustrated with a balloon plot (upper plot) and with $\hat{f}_1(\theta,\phi)$ and $\hat{f}_2(\theta,\phi)$
illustrated separately (lower plots, left and right, respectively)

The aliasing structure for the equal-angle and Gaussian sampling configurations
has been analyzed in detail in [12] and is now outlined here. First, note that although
these sampling configurations are designed for order-limited functions of maximal
order $N$, the contribution of aliasing starts at $N + 2$ and higher. This is due to the
fact that when computing the spherical Fourier transform, the product of the function
and the spherical harmonics, $f(\theta,\phi)[Y_n^m(\theta,\phi)]^*$, is sampled and then weighted and
summed. Aliasing-error-free computation is guaranteed when the sum of the orders
of the function and the spherical harmonics is limited, i.e. $n + n' \leq 2N + 1$. This
means that the contribution of aliasing error for harmonics $n = N$ will occur for
$n' \geq N + 2$, while for $n = 0$ aliasing will only start at $n' \geq 2N + 2$. This behavior
is shown in Fig. 3.6a, b.
To analyze the other properties of the aliasing error, Eq. (3.45) is written for the
special case of equal-angle and Gaussian sampling. Different indices are used to
denote the elevation and azimuth coordinates of the samples, as in Eq. (3.4), and the
sampling weights are independent of $n$ and $m$, as in Eq. (3.11):

$$
\begin{aligned}
\varepsilon_{nm}^{n'm'} &= \sum_{q=0}^{2N+1}\sum_{l=0}^{2N+1}\alpha_q\left[Y_n^m(\theta_q,\phi_l)\right]^{*} Y_{n'}^{m'}(\theta_q,\phi_l) \\
&= \sqrt{\frac{2n+1}{4\pi}\frac{(n-m)!}{(n+m)!}}\;\sqrt{\frac{2n'+1}{4\pi}\frac{(n'-m')!}{(n'+m')!}} \\
&\quad\times \sum_{q=0}^{2N+1}\sum_{l=0}^{2N+1}\alpha_q\, P_n^m(\cos\theta_q)\, P_{n'}^{m'}(\cos\theta_q)\, e^{i\phi_l(m'-m)}.
\end{aligned}
\tag{3.52}
$$

For the Gaussian sampling case, the summation over $q$ ranges from zero to $N$. Now,
due to the equal spacing the summation over $l$ is zero, unless $(m' - m) \bmod (2N + 2) = 0$.
Therefore, aliasing clearly occurs for terms with $m' = m$, as is evident from
the diagonal behavior within given orders $n$, $n'$. For the higher orders of $n'$, replicas
of the diagonal term are also evident, due to the modulo operation.

The final property affecting the behavior of aliasing is due to the summation over
$q$. The samples, arranged symmetrically relative to the equator along the elevation,
with a similar symmetry for the sampling weights, produce a sum of zero along $q$
when $n + m + n' + m'$ is odd [12]. Now, because $(m' - m) \bmod (2N + 2)$ is zero,
the condition for a sum of zero along $q$ reduces to $n + n'$ being odd. This is clearly
evident in Fig. 3.6a, b, where alternating regions of constant $n$ and $n'$ are zero, which
indeed occurs when $n + n'$ is odd.
Other sampling configurations may not exhibit such a regular aliasing pattern.
For example, $\varepsilon_{nm}^{n'm'}$ for a nearly-uniform sampling configuration with 32 samples is
presented in Fig. 3.6c. Although some patterns similar to that shown in Fig. 3.6a, b
are observed, e.g. diagonal aliasing terms, the pattern is more complex in general.
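The structure of the aliasing matrix for Gaussian sampling can be reproduced numerically. The following is a minimal sketch, assuming NumPy and assuming that the Gaussian sampling weights equal $\pi/(N+1)$ times the Gauss-Legendre weights for $N+1$ nodes, with $2N+2$ equally spaced azimuth samples; all function names are hypothetical helpers written for this illustration.

```python
import math, cmath
import numpy as np

def assoc_legendre(n, m, x):
    """Associated Legendre function P_n^m(x), 0 <= m <= n, Condon-Shortley phase."""
    pmm = 1.0
    s = math.sqrt((1.0 - x) * (1.0 + x))
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * s              # P_m^m
    if n == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm              # P_{m+1}^m
    for l in range(m + 2, n + 1):            # upward recurrence in the order
        pmm, pm1 = pm1, ((2 * l - 1) * x * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1

def sph_harm(n, m, theta, phi):
    """Y_n^m(theta, phi), with theta the elevation from the z axis and phi the azimuth."""
    am = abs(m)
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - am) / math.factorial(n + am))
    y = norm * assoc_legendre(n, am, math.cos(theta)) * cmath.exp(1j * am * phi)
    # Negative degrees follow from Y_n^{-m} = (-1)^m conj(Y_n^m).
    return (-1) ** am * y.conjugate() if m < 0 else y

def gaussian_sampling(N):
    """Sample positions and weights (assumed form) for Gaussian sampling of order N."""
    x, w = np.polynomial.legendre.leggauss(N + 1)       # nodes in cos(theta)
    phis = 2 * np.pi * np.arange(2 * N + 2) / (2 * N + 2)
    pts = [(math.acos(xq), p) for xq in x for p in phis]
    alpha = np.repeat(np.pi / (N + 1) * w, 2 * N + 2)    # same ordering as pts
    return pts, alpha

def sh_matrix(order, pts):
    """Q x (order + 1)**2 matrix with columns indexed by n**2 + n + m."""
    return np.array([[sph_harm(n, m, t, p)
                      for n in range(order + 1) for m in range(-n, n + 1)]
                     for (t, p) in pts])

N, N_tilde = 3, 9
pts, alpha = gaussian_sampling(N)                        # Q = 32 samples
Y = sh_matrix(N, pts)
Y_tilde = sh_matrix(N_tilde, pts)
E = Y.conj().T @ np.diag(alpha) @ Y_tilde                # Eq. (3.48)
```

Under these assumptions, the top-left $(N+1)^2 \times (N+1)^2$ block of `E` should be the identity, elements with $n + n'$ odd or with $(m' - m) \bmod (2N+2) \neq 0$ should vanish, and the element at row 15, column 25 should be clearly non-zero, corresponding to $Y_5^{-5}$ aliasing into $Y_3^{3}$.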

References

1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic, San Diego
(2001)
2. Driscoll, J.R., Healy Jr., D.M.: Computing Fourier transforms and convolutions on the 2-sphere.
Adv. Appl. Math. 15(2), 202–250 (1994)
3. Fliege, J., Maier, U.: The distribution of points on the sphere and corresponding cubature
formulae. IMA J. Numer. Anal. 19(2), 317–334 (1999)
4. Hardin, R.H., Sloane, N.J.A.: McLaren’s improved snub cube and other new spherical designs
in three dimensions. Discret. Comput. Geom. 15(4), 429–441 (1995)
5. Healy Jr., D.M., Rockmore, D.N., Kostelec, P.J., Moore, S.: FFTs for the 2-sphere - improve-
ments and variations. J. Fourier Anal. Appl. 9(4), 341–384 (2003)
6. Hildebrand, F.B.: Introduction to Numerical Analysis, 2nd edn. McGraw-Hill, New York (1974)
7. Krylov, V.I.: Approximate Calculation of Integrals. Macmillan, New York (1962)
8. Leopardi, P.: A partition of the unit sphere into regions of equal area and small diameter.
Electron. Trans. Numer. Anal. 25, 309–327 (2006)
9. Li, Z., Duraiswami, R.: Flexible and optimal design of spherical microphone arrays for beam-
forming. IEEE Trans. Audio Speech Lang. Process. 15(2), 702–714 (2007)
10. Mohlenkamp, M.J.: A fast transform for spherical harmonics. J. Fourier Anal. Appl. 5(2/3),
159–184 (1999)
11. Proakis, J.G., Manolakis, D.K.: Digital Signal Processing, 4th edn. Prentice Hall, New Jersey
(2006)
12. Rafaely, B., Weiss, B., Bachmat, E.: Spatial aliasing in spherical microphone arrays. IEEE
Trans. Signal Process. 55(3), 1003–1010 (2007)
13. Saff, E.B., Kuijlaars, A.B.J.: Distributing many points on a sphere. Math. Intel. 19(1), 5–11
(1997)
Chapter 4
Spherical Array Configurations

Abstract Motivated by the problem of spatial sampling of a sound field by a
spherical array, Chap. 3 presented methods for sampling functions on a sphere,
followed by methods for reconstructing a function from its samples. These could
form the basis for computing the sound pressure on the surface of a sphere, given
measurements by an array of microphones. However, in spherical microphone array
processing one may also be interested in computing the sound field around the array
by decomposing the sound field into plane-wave components, for example. In this
case, placing pressure or omni-directional microphones on the surface of a single
sphere in free-field may not yield accurate plane-wave decomposition, due to zeros
of the spherical Bessel function. This problem is presented at the beginning of the
chapter. One possible solution is to place microphones on the surface of a rigid
sphere. This configuration offers a practical advantage – the rigid sphere provides an
ideal housing for all microphone wiring and conditioning electronics. However, one
drawback of the rigid sphere is that sound scattered from the sphere can be reflected
back by surrounding objects, thereby modifying the sound field it measures. This
is particularly important for arrays used for sound field analysis in room acoustics,
for example, in which case placing microphones in a free field, in an open-sphere
configuration, may be preferable. Open spherical array configurations that avoid the
problem of the zeros of the spherical Bessel function are therefore presented next.
The array configuration may also affect other aspects of array performance related
to the frequency range of operation and to the sensitivity to sensor noise and to
other uncertainties. A general framework for array design that considers a range of
objectives is introduced, followed by example designs. The chapter concludes with
a description of an open spherical array configuration in which the microphones are
placed within the volume of a shell. Other array configurations, including the hemi-
spherical array, another array comprised of concentric rigid and open spheres, and
an array incorporating non-spherical sampling surfaces, are also discussed.

© Springer Nature Switzerland AG 2019
B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8_4

4.1 Single Open Sphere

This section presents one of the simplest configurations of a spherical microphone
array. Here, pressure microphones, or microphones that directly measure sound
pressure [4], are placed on the surface of a virtual sphere in a free field. Typically, these
microphones require some mechanical support, but it is assumed that the
construction is sufficiently slim so that free field pressure measurement is attained. The
sound pressure measured at the microphone locations can be considered as
samples of the continuous sound pressure function on the sphere surface. Therefore, the
methods for sampling and reconstruction presented in Chap. 3 can be used here to
reconstruct the sound pressure at the sphere surface, $p(k, r, \theta, \phi)$, given the
samples, $p(k, r, \theta_q, \phi_q)$, $q = 1,\ldots,Q$. Following the previously-presented notation, $k$
denotes the wave number and $r$ the sphere radius. Reconstruction is achieved through
computation of the spherical harmonic coefficients, as formulated, for example, in
Eq. (3.32), and rewritten here:

$$p_{nm}(k, r) = \sum_{q=1}^{Q}\alpha_q^{nm}\, p(k, r, \theta_q, \phi_q), \quad n \leq N. \tag{4.1}$$

The total number of samples is given by $Q$ and the maximum reconstructed order is
$N$, where $\alpha_q^{nm}$ are the sampling weights. Perfect reconstruction will be achieved only
if the sampled pressure function is order limited, i.e. $p_{nm} = 0\ \forall\, n > N$. However, as
discussed in Sect. 2.3 and illustrated in Fig. 2.6, a sound field composed of plane
waves, for example, is not order limited, so that errors due to spatial aliasing are
unavoidable when reconstructing the sound pressure from its samples. Nevertheless,
these errors can be made negligible if the magnitude of the high-order coefficients
is kept sufficiently small. This is maintained for all $n \gg kr$. Hence, assuming that
the choice of sampling method, frequency and sphere radius satisfy $kr < N$, spatial
aliasing error can be kept small.
Although the reconstruction of sound pressure on the surface of the measurement
sphere may be feasible with some limited aliasing error, the reconstruction of sound
pressure around the sphere requires the following formulation [see Eq. (2.47)]:

$$p(k, r', \theta', \phi') = \sum_{n=0}^{\infty}\sum_{m=-n}^{n}\frac{j_n(kr')}{j_n(kr)}\, p_{nm}(k, r)\, Y_n^m(\theta', \phi'), \tag{4.2}$$

where $(r', \theta', \phi')$ is a position outside the measurement sphere, such that $r' > r$.
It is clear from Eq. (4.2) that reconstruction of the pressure outside the
measurement sphere is only possible if $j_n(kr) \neq 0$. In practice, $j_n(kr)$ must be significantly
different from zero to avoid numerical errors due to division by a small number.
This requirement also holds for general array processing methods, not only
pressure reconstruction, see Chap. 5. Figure 2.1 clearly shows that the spherical Bessel
function equals zero for various values of $n$ and $kr$ and so, in practice, it may be
difficult to avoid a division by zero, unless a very restricted set of frequencies, radii
and orders are selected. This is the main drawback of the single-sphere configuration
with pressure microphones in free field and is, therefore, the reason for
considering other spherical array configurations, such as an array configured around a rigid
sphere.
Another important issue related to array configuration is sensitivity to sensor noise.
Figure 2.2 shows that jn (kr ) vanishes for all n > 0 as kr → 0. Furthermore, the decay
towards zero is steeper for higher orders. This means that pressure reconstruction
away from the sphere, as in Eq. (4.2), or general array processing methods, may
require division by a small value for low kr ; this may, potentially, amplify noise in
a practical array system. One way to avoid this undesirable effect is to reduce the
effective array order, N , at low frequencies, by including only coefficients with a
sufficiently large magnitude. This, however, may come at the expense of performance
in terms of accuracy of reconstruction and spatial resolution, which depend on N
(see Chap. 5).
It is clear from the analysis presented above that the array configuration may affect
various aspects of array performance. The theoretical analysis is now summarized
in the following points, which serve as considerations for spherical array design,
demonstrated in the design example that follows.
(i) The spatial sampling method is first selected (see Chap. 3). This defines the
angular part of the position of the microphones, (θq , φq ), q = 1, . . . , Q, and
the maximum order N for aliasing-free sampling of functions on the sphere.
(ii) The radius of the sphere, r , is then selected. This defines the radial part of the
position of the microphones. With r and N defined, the frequency range of
operation can be established.
(iii) The upper frequency limit is bounded by spatial aliasing. The upper frequency
f determines the wave number, k = 2π f /c, which, together with r and N , must
satisfy kr < N to avoid significant error due to spatial aliasing.
(iv) The lower frequency limit is bounded by sensor noise and other errors, such
as mismatch in microphone gain and phase response, inaccurate positioning
of microphones and limited computational accuracy. Array processing that
involves a division by $j_n(kr)$ may be ill-conditioned at low frequencies and
low values of $kr$ if the magnitude of $j_n(kr)$ is small. For a given frequency
and radius satisfying $kr \ll N$, the highest order, $n = N$, will have the lowest
magnitude and so will contribute most significantly to performance
degradation due to noise. The exact frequency at which $j_N(kr)$ will no longer be useful
may depend on the noise level and the level of other errors and may change,
depending on the system specifications in practice.
(v) At some frequencies within the operating range, i.e. between the upper and
lower frequency limits, jn (kr ) may also become small if these frequencies
satisfy jn (kr ) ≈ 0. This is an inherent limitation of the single open-sphere
configuration.
An example of an open-sphere array design is presented next. Consider a sphere of
radius $r = 8$ cm with 72 microphones arranged using a Gaussian sampling scheme,
facilitating aliasing-free sampling of functions that are order-limited to $N = 5$.
Assuming the sound field is composed of a superposition of plane waves, the upper
frequency limit may be chosen to satisfy $kr = N$ to limit aliasing error. Substituting
the values of $r$ and $N$ and using the relation $k = 2\pi f/c$, where $c$ is the speed of sound
(343 m/s at 20 °C) and $f$ is the frequency in Hertz, the upper operating frequency
of the array is about 3,400 Hz. Figure 4.1 shows the magnitude of $4\pi i^n j_n(kr)$ as a
function of frequency and also illustrates the limit $kr = N$ at 3,412 Hz. The figure
shows that the spherical Bessel function $j_0(kr)$ is zero at 2,144 Hz and the spherical
Bessel function $j_1(kr)$ is zero at 3,066 Hz. The spherical harmonic coefficients of
the measured sound pressure, $p_{00}(k, r)$ and $p_{1m}(k, r)$, $m = -1, 0, 1$, are expected to
be of low magnitude around the zero frequencies and are, therefore, sensitive to the
effect of noise. Furthermore, at frequencies below about 1,000 Hz the magnitude of
the spherical Bessel functions for $n > 0$ decays towards the origin, therefore
increasing the sensitivity of the measured $p_{nm}(k, r)$ to noise. For example, the magnitude of
$4\pi i^n j_n(kr)$ for $n = 5$ is about −43 dB at 1,000 Hz, as marked on the figure. If this is the lowest
magnitude to be measured with a useful signal-to-noise ratio (SNR), then coefficients
$p_{5m}$ for $-5 \leq m \leq 5$ will be usable only above about 1,000 Hz; as a consequence
any measurement below this frequency will be effectively of maximum order $N = 4$
or lower.

Fig. 4.1 The magnitude of $4\pi i^n j_n(kr)$ for $n = 0,\ldots,8$ as a function of frequency, with $r = 8$ cm
and $k = 2\pi f/c$, showing the limit at $f = 3{,}412$ Hz where $kr = 5$ is satisfied
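The numbers quoted in this design example can be reproduced with a short calculation. The sketch below uses only the Python standard library and evaluates $j_n$ from its ascending power series, which is an implementation choice made here for self-containment; the function names are illustrative.

```python
import math

def sph_jn(n, x, terms=60):
    """Spherical Bessel function j_n(x) from its ascending power series.

    j_n(x) = x**n * sum_k (-x**2 / 2)**k / (k! * (2n + 2k + 1)!!)
    """
    term = x ** n
    for odd in range(1, 2 * n + 2, 2):       # divide by (2n + 1)!!
        term /= odd
    total = 0.0
    for k in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
    return total

def bisect_zero(func, lo, hi, iters=80):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c, r, N = 343.0, 0.08, 5                      # speed of sound (m/s), radius (m), order

def kr_to_hz(kr):
    return kr * c / (2 * math.pi * r)

f_upper = kr_to_hz(N)                                              # kr = N
f_null_j0 = kr_to_hz(bisect_zero(lambda x: sph_jn(0, x), 2.0, 4.0))  # first zero of j_0
f_null_j1 = kr_to_hz(bisect_zero(lambda x: sph_jn(1, x), 4.0, 5.0))  # first zero of j_1

def open_sphere_db(n, f_hz):
    """Magnitude of 4*pi*i**n*j_n(kr) in dB for the open sphere."""
    kr = 2 * math.pi * f_hz * r / c
    return 20 * math.log10(abs(4 * math.pi * sph_jn(n, kr)))

level_n5_1khz = open_sphere_db(5, 1000.0)
print(round(f_upper), round(f_null_j0), round(f_null_j1), round(level_n5_1khz, 1))
```

The computed upper frequency and the two null frequencies should agree with the values quoted above, and the level for $n = 5$ at 1,000 Hz should fall within about half a decibel of the quoted −43 dB.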
The following sections are dedicated to presenting array configurations that over-
come the limitation outlined in point (v) above, i.e. the effect of the zeros of the
spherical Bessel function. These configurations typically show a behavior similar
to that of the single open sphere with regard to the other points in the above list.
With regard to point (v), it is useful to present the relation between the pressure on
the sphere and the amplitude of the plane waves composing the sound field in the
spherical harmonics domain, as in Eq. (2.45):

$$p_{nm}(k, r) = b_n(kr)\, a_{nm}(k), \tag{4.3}$$

where

$$b_n(kr) = 4\pi i^n j_n(kr). \tag{4.4}$$

This is an important relation as it defines the way in which the plane wave sound
field, $a_{nm}$, is measured on the sphere surface, $p_{nm}$, with the function $b_n(kr)$ defining
the projection of the sound field onto the sphere surface. It is clear now that the
computation of the sound field, $a_{nm}$, given the measurement, $p_{nm}$, requires a division
by $b_n(kr)$, which, in the case of a single open sphere, means a division by the spherical
Bessel function. Equations (4.3) and (4.4) represent a general and useful way to
present the effect of array configuration. In the following sections,
other array configurations are also expressed in the form of Eq. (4.3), but with
different terms composing $b_n(kr)$. The aim is to develop array configurations for
which $b_n(kr)$ does not possess zeros within the operating frequency range and for the
selected radius values.

4.2 Rigid Sphere

The rigid-sphere array configuration [8] comprises microphones placed on the surface
of a sphere composed of a hard, fully reflecting material, such as hard wood or thick
metal. The analysis of the single open-sphere configuration presented in Sect. 4.1
applies also to the rigid-sphere configuration. However, the relation between the
sound field around the sphere and the pressure on the sphere surface is characterized
by a different function $b_n(kr)$, due to scattering from the rigid sphere. Chapter 2
presented an analysis of sound fields; in Eq. (2.62), the term for $b_n$ that includes the
effect of the incident sound field and the scattered sound field around a rigid sphere
is developed and is rewritten here for convenience:

$$b_n(kr) = 4\pi i^n\left[j_n(kr) - \frac{j_n'(kr_a)}{h_n^{(2)\prime}(kr_a)}\, h_n^{(2)}(kr)\right]. \tag{4.5}$$

Function bn is dependent on both ra , the radius of the rigid sphere, and r , satisfying
r ≥ ra , representing the distance from the origin of a point on or outside the rigid-
sphere surface. Note, however, that the explicit dependence of bn on ra is not shown,
for notation simplicity.
The magnitude of functions $b_n$ for an open sphere, or a sphere in free field, and a
rigid sphere, have been presented in Chap. 2 in Figs. 2.1 and 2.9, respectively. These
plots are presented here in a single figure, Fig. 4.2, omitting the order indices, for
simplicity. In the case of the rigid sphere, $r = r_a$ was assumed. The figure clearly
shows the elimination of the nulls of $b_n$ in the rigid-sphere configuration compared
to the open-sphere configuration. Also note that the magnitude of $b_n$ is slightly larger
in the rigid-sphere configuration, due to the scattered sound field component. This
is actually advantageous, because it means that the magnitude of $b_n$ is higher at high
orders and low frequencies and, as a result, the array is slightly more robust to the
effect of sensor noise and other errors, as discussed in Sect. 4.1.

Fig. 4.2 The magnitude of $b_n(kr)$ for $n = 0,\ldots,3$ for a rigid sphere with $r = r_a$ and an open
sphere
An additional advantage of the rigid-sphere array is the ease of microphone mount-
ing and the potential use of the interior space of the sphere for housing the microphone
amplifiers and other conditioning electronics. It is therefore suitable for real-time
microphone array applications, which require simultaneous recording of all
microphone signals. One clear disadvantage of the rigid-sphere array concerns low
frequency performance. If it is required to compute $a_{nm}$ up to a high order, $N$, but at low
frequencies, one needs to design an array with a large radius $r_a$ to avoid operating
at a condition of $kr \ll N$, where there would be excessive noise in the
measurement of $p_{nm}$ at high orders. However, arrays built around very large rigid spheres
may not be easy to handle in practice, and may be undesirable for other practical
reasons. Furthermore, scattering of the incident sound from the large rigid sphere
might be reflected back into the measurement region by surrounding objects (such as
room walls), modifying the measured sound field. To summarize, small rigid-sphere
arrays may be useful; however, in some cases it is desirable to design array config-
urations that avoid the nulls of the spherical Bessel functions without introducing a
rigid sphere into the measurement region. Such configurations are discussed in the
following sections.

Fig. 4.3 The magnitude of $b_n(kr)$ in Eq. (4.5) with $r = r_a$ for $n = 0,\ldots,8$ as a function of
frequency, with $r_a = 8$ cm and $k = 2\pi f/c$, showing the limit at $f = 3{,}412$ Hz where $kr = 5$ is
satisfied

The example of a design introduced in Sect. 4.1 for the open-sphere configuration
is outlined here for the rigid-sphere configuration. The open sphere of radius r = 8 cm
is replaced by a rigid sphere of the same radius, ra = 8 cm. Figure 4.3 shows the
magnitude of bn (kr ) for this design, computed using Eq. (4.5) with r = ra . The figure
shows that the low-magnitude problem at frequencies 2,144 and 3,066 Hz no longer
exists due to the elimination of the zeros of the spherical Bessel function. Otherwise,
the designs are similar, with the exception that the magnitude of $b_n$ is slightly higher
in the rigid-sphere design. This becomes an advantage for $b_5(kr_a)$, evaluated on the
sphere surface ($r = r_a$), at 1,000 Hz, for example, where it has a magnitude of −37 dB,
as marked in the figure; the rigid-sphere design is therefore less sensitive to noise
compared to the open-sphere array under these conditions.
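The comparison between the open and rigid designs can be checked numerically. The sketch below evaluates Eq. (4.5) with the standard library only; it assumes the $h_n^{(2)} = j_n - i y_n$ convention and uses a power series for $j_n$ and an upward recurrence for $y_n$, both of which are implementation choices made for this illustration.

```python
import math

def sph_jn(n, x, terms=60):
    """j_n(x) from its ascending power series (well behaved for small x)."""
    term = x ** n
    for odd in range(1, 2 * n + 2, 2):
        term /= odd
    total = 0.0
    for k in range(terms):
        total += term
        term *= -(x * x / 2.0) / ((k + 1) * (2 * n + 2 * k + 3))
    return total

def sph_yn(n, x):
    """y_n(x) by upward recurrence, starting from y_0 and y_1."""
    y_prev = -math.cos(x) / x
    if n == 0:
        return y_prev
    y_curr = -math.cos(x) / x ** 2 - math.sin(x) / x
    for l in range(1, n):
        y_prev, y_curr = y_curr, (2 * l + 1) / x * y_curr - y_prev
    return y_curr

def sph_h2(n, x):
    """Spherical Hankel function of the second kind."""
    return complex(sph_jn(n, x), -sph_yn(n, x))

def deriv(fn, n, x):
    """Derivative of a spherical Bessel-type function: f_n' = (n/x) f_n - f_{n+1}."""
    return n / x * fn(n, x) - fn(n + 1, x)

def bn_open(n, kr):
    return 4 * math.pi * 1j ** n * sph_jn(n, kr)                      # Eq. (4.4)

def bn_rigid(n, kr, kra):
    ratio = deriv(sph_jn, n, kra) / deriv(sph_h2, n, kra)
    return 4 * math.pi * 1j ** n * (sph_jn(n, kr) - ratio * sph_h2(n, kr))  # Eq. (4.5)

c, ra = 343.0, 0.08
kr = 2 * math.pi * 1000.0 * ra / c            # kr at 1,000 Hz on the sphere surface
rigid_db = 20 * math.log10(abs(bn_rigid(5, kr, kr)))
open_db = 20 * math.log10(abs(bn_open(5, kr)))
null_check = abs(bn_rigid(0, math.pi, math.pi))   # open sphere has b_0 = 0 at kr = pi
print(round(rigid_db, 1), round(open_db, 1), round(null_check, 2))
```

At 1,000 Hz the rigid-sphere magnitude for $n = 5$ should come out several decibels above the open-sphere value, consistent with the levels marked in Figs. 4.1 and 4.3, and $b_0$ for the rigid sphere should remain well away from zero at $kr = \pi$.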

4.3 Open Sphere with Cardioid Microphones

A spherical microphone array configuration that uses microphones in free field, but
nevertheless overcomes the problem introduced by the nulls of the spherical Bessel
function, is presented in this section. This configuration is the same as the single
open-sphere configuration discussed in Sect. 4.1, only here the microphones are
of the cardioid type rather than of the pressure type [4]. This means that instead
of using omni-directional microphones, one uses directional microphones with a
first-order cardioid directivity that measures a combination of pressure and radial
pressure gradient. First-order directional microphones have been recently employed
in microphone arrays with circular configurations [3, 7]. For a spherical array, the
use of these microphones has been discussed in [2, 6].
The output of a cardioid microphone facing the radial direction can be written as

$$x(k, r, \theta, \phi) = p(k, r, \theta, \phi) + \frac{1}{ik}\frac{\partial}{\partial r}p(k, r, \theta, \phi). \tag{4.6}$$

The microphone signal in response to a unit-amplitude plane wave can be derived by
substituting $p(k, r, \theta, \phi) = e^{i\tilde{\mathbf{k}}\cdot\mathbf{r}} = e^{ikr\cos\Theta}$ in Eq. (4.6), where $\Theta$ denotes the angle
away from the radial look direction, and is given by

$$x(k, r, \theta, \phi) = e^{ikr\cos\Theta}(1 + \cos\Theta). \tag{4.7}$$

Here, k̃ = (k, θk , φk ) denotes the wave vector pointing in the arrival direction, as in
Eq. (2.37), and r = (r, θ, φ) denotes the position of the microphone. The output of
the microphone includes the term (1 + cos Θ), which is the cardioid directivity [4].
Figure 4.4 illustrates the directivity of a cardioid microphone on a polar plot.
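As a brief check of Eq. (4.7), the radial derivative can be worked out explicitly. The following is only a sketch of the intermediate step, assuming that Θ is held fixed while r varies:

\[ \frac{1}{ik} \frac{\partial}{\partial r} e^{ikr\cos\Theta} = \frac{1}{ik}\, ik\cos\Theta\, e^{ikr\cos\Theta} = \cos\Theta\, e^{ikr\cos\Theta}, \]

so that the pressure and pressure-gradient terms add up to $e^{ikr\cos\Theta}(1 + \cos\Theta)$. The magnitude of the response is therefore 2 for Θ = 0 and zero for Θ = π.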
Equation (4.6) can also be written in the spherical harmonics domain, by substi-
tuting the spherical harmonics representation of a unit-amplitude plane wave for p,
as in Eq. (2.37):

\[ x_{nm}(k, r) = 4\pi i^n \left[ j_n(kr) - i j_n'(kr) \right] \left[ Y_n^m(\theta_k, \phi_k) \right]^*. \tag{4.8} \]
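The factor multiplying the derivative term can be traced as follows. This is a short sketch, with $j_n'$ denoting the derivative of $j_n$ with respect to its argument:

\[ \frac{1}{ik} \frac{\partial}{\partial r} j_n(kr) = \frac{1}{ik}\, k\, j_n'(kr) = -i\, j_n'(kr). \]

Adding this to the pressure term $j_n(kr)$ gives the combination of $j_n$ and its derivative that appears in Eq. (4.8).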

Considering a plane wave with an amplitude of a(k, θk , φk ), and extending the sound
field to include a continuum of plane waves, as in Sect. 2.4, leads to

\[ x_{nm}(k, r) = b_n(kr)\, a_{nm}(k), \tag{4.9} \]

where

\[ b_n(kr) = 4\pi i^n \left[ j_n(kr) - i j_n'(kr) \right]. \tag{4.10} \]

Fig. 4.4 Normalized polar plot of the directivity of a cardioid microphone, (1/2)(1 + cos Θ)

Fig. 4.5 The magnitude of bn (kr ) for n = 0, . . . , 3 for spherical arrays with microphones in free
field using cardioid microphones and pressure microphones

Equations (4.9) and (4.10) show that the output of a spherical array composed of
cardioid microphones in free field can be written in the same form as the output of
a spherical array with pressure microphones, either in free field or around a rigid
sphere, but with a different function bn (kr ). In this case, function bn includes a jn
term due to the pressure component and a term with a derivative of jn due to the
pressure gradient component.
Figure 4.5 compares |bn | for open-sphere arrays with pressure and with cardioid
microphones for n = 0, . . . , 3. The figure shows that, similar to the rigid-sphere
array, the use of cardioid microphones eliminates the zeros of the spherical Bessel
function. Furthermore, similar to the rigid-sphere array, the magnitude of bn at low
values of kr is higher than with the pressure microphone configuration. The increase
in magnitude is even larger than the increase in the rigid-sphere case, as illustrated
in Fig. 4.2, suggesting a potential improvement in robustness to noise. However, this
improvement may not be realized in practice, because cardioid microphones
usually suffer from excessive noise at low frequencies. The spatial derivative
is typically approximated by a pressure-difference measurement, and this
difference may be small at low frequencies.
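The removal of the Bessel zeros can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not taken from the text, and the spherical Bessel function is computed by upward recurrence, which is assumed adequate only for low orders and for kr not much smaller than n. It evaluates bn for pressure microphones, 4πi^n jn(kr), and for cardioid microphones, Eq. (4.10), at kr = π, the first zero of j0.

```python
import math

def spherical_jn(n, x):
    """Spherical Bessel function j_n(x) by upward recurrence (low orders only)."""
    if x == 0.0:
        return 1.0 if n == 0 else 0.0
    j_prev = math.sin(x) / x                        # j_0(x)
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x   # j_1(x)
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def spherical_jn_prime(n, x):
    """Derivative of j_n with respect to its argument (x > 0)."""
    if n == 0:
        return -spherical_jn(1, x)
    return spherical_jn(n - 1, x) - (n + 1) / x * spherical_jn(n, x)

def b_open_pressure(n, kr):
    """b_n for pressure microphones in free field."""
    return 4 * math.pi * (1j ** n) * spherical_jn(n, kr)

def b_open_cardioid(n, kr):
    """b_n for cardioid microphones in free field, Eq. (4.10)."""
    return 4 * math.pi * (1j ** n) * (
        spherical_jn(n, kr) - 1j * spherical_jn_prime(n, kr)
    )

kr = math.pi  # first zero of j_0
print(abs(b_open_pressure(0, kr)))   # numerically zero
print(abs(b_open_cardioid(0, kr)))   # remains finite
```

At this value of kr the pressure term vanishes, while the derivative term j0'(π) = −1/π keeps the cardioid magnitude at 4π · (1/π) = 4.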
Although the use of a single open-sphere array with cardioid microphones seems
attractive due to the simplicity of this configuration, it has drawbacks. First, in addition
to excessive noise at low frequencies, deviation from the cardioid pattern may
produce errors in the array model function bn , when used in array processing, for
example. Furthermore, pressure microphones are often the microphone of choice in
acoustic measurement systems, and so a spherical array based on pressure micro-
phones may be preferable. An open-sphere array that employs pressure microphones
and overcomes the limitations imposed by the spherical Bessel null is presented in
the next section.

4.4 Dual-Radius Open Sphere

The dual-radius open-sphere array configuration is composed of two concentric open-sphere
arrays with pressure microphones. Figure 4.2 shows that the zeros of bn for
an open-sphere array appear at specific values of kr . Therefore, if we measure the
sound field using two concentric open-sphere arrays with different radii, r1 and r2 ,
each zero will appear at a different frequency, or wave number, for each sphere. This
property is the basis for the dual-radius array; missing information in one array due
to the zeros of the spherical Bessel function is obtained from the other array. Hence,
the arrays operate in a complementary manner to overcome the limitations imposed
by the zeros. The sound pressure measured by both arrays, presented in the spherical
harmonics domain, for a plane-wave sound field with an amplitude density of anm (k),
can be written, following Eqs. (4.3) and (4.4), as

\[ \begin{aligned} p_{1\,nm}(k) &= 4\pi i^n j_n(kr_1)\, a_{nm}(k) \\ p_{2\,nm}(k) &= 4\pi i^n j_n(kr_2)\, a_{nm}(k). \end{aligned} \tag{4.11} \]

Computation of anm (k) at each frequency requires a division by jn (kr ), so that at
each frequency, or wave number, and each order, only one of the equations in (4.11) is
selected, according to the magnitude of jn (kr ). More formally, a selection parameter,
β, is first introduced as follows [2]:

\[ \beta_n(kr_1, kr_2) = \begin{cases} 0, & |j_n(kr_1)| \ge |j_n(kr_2)| \\ 1, & |j_n(kr_1)| < |j_n(kr_2)| \end{cases}. \tag{4.12} \]

Now, an expression combining terms from both arrays can be derived:

\[ p_{12\,nm}(k) = b_n(kr_1, kr_2)\, a_{nm}(k), \tag{4.13} \]

with

\[ b_n(kr_1, kr_2) = \left[ 1 - \beta_n(kr_1, kr_2) \right] 4\pi i^n j_n(kr_1) + \beta_n(kr_1, kr_2)\, 4\pi i^n j_n(kr_2). \tag{4.14} \]

Fig. 4.6 The magnitude of bn (k) ≡ bn (kr1 , kr2 ) for n = 0, . . . , 3 for spherical arrays with a dual-
radius open-sphere configuration of radii r1 = 1 m and r2 = 0.833 m and two single open-sphere
configurations of radii r1 and r2

Function p12 nm (k) represents the spherical harmonic coefficients of the pressure
function from both spheres. Equations (4.13) and (4.14) describe a relation between
the measured pressure and the plane-wave sound field through function bn , defined
here for the case of the dual-radius array.
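The selection rule of Eqs. (4.12) and (4.14) can be sketched in a few lines. The code below is an illustrative sketch using only the Python standard library; the helper names are assumptions, and the upward recurrence used for jn is suitable only for low orders. It uses the radii r1 = 1 m and r2 = 0.833 m and evaluates order n = 0 at the wave number where j0(kr1) is zero.

```python
import math

def spherical_jn(n, x):
    """Spherical Bessel function j_n(x) by upward recurrence (low orders only)."""
    if x == 0.0:
        return 1.0 if n == 0 else 0.0
    j_prev = math.sin(x) / x
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def beta(n, kr1, kr2):
    """Selection parameter of Eq. (4.12): 0 selects sphere 1, 1 selects sphere 2."""
    return 0 if abs(spherical_jn(n, kr1)) >= abs(spherical_jn(n, kr2)) else 1

def b_dual(n, kr1, kr2):
    """Dual-radius b_n of Eq. (4.14)."""
    sel = beta(n, kr1, kr2)
    scale = 4 * math.pi * (1j ** n)
    return (1 - sel) * scale * spherical_jn(n, kr1) + sel * scale * spherical_jn(n, kr2)

r1, r2 = 1.0, 0.833
k = math.pi / r1                      # j_0(k r1) = 0 at this wave number
print(beta(0, k * r1, k * r2))        # sphere 2 is selected
print(abs(b_dual(0, k * r1, k * r2))) # magnitude taken from sphere 2
```

At this wave number the first sphere contributes no information for n = 0, so the rule falls back on the second sphere, for which |j0(kr2)| is approximately 0.19.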
Figure 4.6 shows the magnitude of bn (k) ≡ bn (kr1 , kr2 ) for the dual-radius array
with r1 = 1 m and r2 = 0.833 m. The figure shows that the zeros of the spherical
Bessel function are avoided using this approach. The figure also shows bn for the
two single open-sphere arrays of radii r1 and r2 ; both have zeros, but at scaled locations.
An important design issue for the dual-radius spherical array is the choice of the
ratio of the two radii, denoted as α = r1 /r2 . Balmages and Rafaely [2] proposed
both numerical and analytical approaches for finding the best ratio. Given r1 , and
assuming r2 is constrained to a smaller radius, r2 < r1 , the radius ratio should produce
the highest possible magnitude of jn (kr ), where for each wave number k and order n
the largest of the two values | jn (kr1 )| and | jn (kr2 )| is selected. This is formulated
as follows:

\[ \alpha_{\mathrm{opt}} = \arg\max_{\alpha} \min_{n} \min_{k} \max \left\{ |j_n(kr_1)|, |j_n(kr_2)| \right\}. \tag{4.15} \]

The minimization over k is typically taken in the range kr1 ≥ n to avoid low values
of jn that are due to the high-pass characteristic of jn at low kr values. Furthermore,
in typical arrays, aliasing is significant for kr > N and so k is typically restricted to
the range n ≤ kr1 ≤ N . The minimization over n is taken in the range 0 ≤ n ≤ N .
Examples for the numerical calculation of α have been presented in [2].
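A brute-force version of the search in Eq. (4.15) can be sketched as follows. This is not the procedure of [2]; it is a simple grid search under stated assumptions: N = 6, α restricted to a hypothetical grid between 1 and 1.5, kr1 sampled in steps of 0.02 over n ≤ kr1 ≤ N, and jn computed by upward recurrence using only the Python standard library. All names are illustrative.

```python
import math

def spherical_jn(n, x):
    """Spherical Bessel function j_n(x) by upward recurrence (low orders only)."""
    if x == 0.0:
        return 1.0 if n == 0 else 0.0
    j_prev = math.sin(x) / x
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def worst_case_magnitude(alpha, N, step=0.02):
    """Inner part of Eq. (4.15): min over n and k of max(|j_n(kr1)|, |j_n(kr2)|),
    with r2 = r1 / alpha and kr1 sampled over n <= kr1 <= N."""
    worst = float("inf")
    for n in range(N + 1):
        n_steps = int(round((N - n) / step))
        for i in range(n_steps + 1):
            kr1 = n + i * step
            larger = max(abs(spherical_jn(n, kr1)),
                         abs(spherical_jn(n, kr1 / alpha)))
            worst = min(worst, larger)
    return worst

def search_alpha(N, alphas):
    """Outer arg max over a grid of candidate radius ratios."""
    return max(alphas, key=lambda a: worst_case_magnitude(a, N))

N = 6
alphas = [1.0 + i / 100 for i in range(51)]   # hypothetical search grid
alpha_best = search_alpha(N, alphas)
print(alpha_best, worst_case_magnitude(alpha_best, N))
print(worst_case_magnitude(1.0, N))           # single sphere, for comparison
```

The value returned depends on the grid and on how ties are broken, since several ratios may yield the same worst-case magnitude, so it should be read as indicative only.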
A simplified expression for α has also been proposed in [2]. Increasing α from
α = 1 (a single sphere) is equivalent to scaling the argument of jn (kr ) and thereby
shifting the zeros to higher wave numbers. When shifted zeros of jn (kr2 ) re-coincide
with the original zeros of jn (kr1 ), the zero at the given wave number cannot be
recovered. Now, taking the mid-point between α = 1 and the value of α leading to
coincidence of zeros, and assuming limits on the gaps between zeros along kr , it has
been shown that a good approximation for the optimal α is given by [2]

\[ \alpha_{\mathrm{opt}} \approx 1 + \frac{\pi}{2N}. \tag{4.16} \]

An example of a design has been presented in [11] for the measurement of room
impulse responses in an auditorium. The dual-sphere array was composed of 882
microphone positions on each sphere, arranged using the Gaussian sampling scheme.
This provides aliasing-free sampling up to order N = 20, such that 2(N + 1)^2 = 882.
The radius of the first sphere was set to r1 = 0.43 m, such that kr1 = N was satisfied
at frequency f ≈ 2.5 kHz, which thus constitutes the upper operating frequency of
the array. Note that a slightly higher upper frequency was used in [11]. Substituting
N = 20 in Eq. (4.16) leads to α ≈ 1.078 and r2 = 0.4 m. This example illustrates that
even though the two radii in this dual-sphere configuration are a very small distance
apart, this is sufficient to eliminate the nulls due to the spherical Bessel functions.
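The numbers in this example can be reproduced with a short calculation. The sketch below assumes a speed of sound of c = 343 m/s, which is not stated in the text, and uses illustrative variable names.

```python
import math

N = 20          # array order
r1 = 0.43       # radius of the first sphere in metres
c = 343.0       # assumed speed of sound in m/s (not given in the text)

mics_per_sphere = 2 * (N + 1) ** 2       # Gaussian sampling scheme
f_max = N * c / (2 * math.pi * r1)       # frequency at which k * r1 = N
alpha = 1 + math.pi / (2 * N)            # Eq. (4.16)
r2 = r1 / alpha                          # radius of the second sphere

print(mics_per_sphere)
print(round(f_max), "Hz")
print(round(alpha, 4), round(r2, 3), "m")
```

Under this assumption the upper frequency is close to 2.5 kHz, and the two radii differ by about 3 cm.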
Although the dual-radius spherical array presented in this section provides a prac-
tical solution to the problem of the zeros of the spherical Bessel function using pres-
sure microphones, the downside is that it requires two spheres and twice as many
microphones compared to the single open-sphere array. More efficient methods are
presented in the following sections, based on a design framework that is developed
in the next section.

4.5 Robustness to Errors and Numerical Array Design

The array configurations presented above were all based on a predefined distribution
of samples on a sphere, as discussed in Chap. 3. The spherical harmonic coefficients
of the sound pressure, pnm (k, r ), were then computed using appropriate sampling
weights, leading to the computation of anm (k), or plane-wave decomposition, through
a division of pnm (k, r ) by bn (kr ), as in Eq. (4.3). Ill-conditioning in this computation
is a direct result of the low magnitude of bn (kr ), particularly affecting the single-
sphere open array configuration. In the configurations presented above, microphone
positions were constrained to the surface of a single or a dual sphere. In the case
where microphones are placed more freely in three-dimensional space, a different
formulation is required to account for the numerical robustness of the proposed
configuration. Such a formulation is presented in this section.
Equation (4.3) presents the relation in the spherical harmonics domain between
the plane-waves amplitude composing the sound field and the pressure on a sphere
of radius r. Now, consider Q sampling points distributed in three-dimensional space
at positions

\[ (r_q, \theta_q, \phi_q), \quad 1 \le q \le Q. \tag{4.17} \]

The pressure at these sampling points can be written using Eq. (4.3) as

\[ p(k, r_q, \theta_q, \phi_q) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a_{nm}(k)\, b_n(kr_q)\, Y_n^m(\theta_q, \phi_q), \quad 1 \le q \le Q. \tag{4.18} \]

Note that this equation holds for various configurations, represented by different
bn functions, such as pressure microphones around open or rigid spheres, an open
sphere with cardioid microphones or the dual-radius configuration. Denoting the
maximum radius r̄ = max{rq } for all 1 ≤ q ≤ Q, and assuming that the wave number
satisfies k r̄ < N , the infinite summation in Eq. (4.18) can be approximated by a finite
summation, as discussed in Sect. 2.3:

\[ p(k, r_q, \theta_q, \phi_q) \approx \sum_{n=0}^{N} \sum_{m=-n}^{n} a_{nm}(k)\, b_n(kr_q)\, Y_n^m(\theta_q, \phi_q), \quad 1 \le q \le Q. \tag{4.19} \]

Equation (4.19) can be written in a matrix form as

\[ \mathbf{p} = \mathbf{B}\, \mathbf{a}_{nm}, \tag{4.20} \]

where the Q × 1 vector p represents the pressure samples:

\[ \mathbf{p} = \left[ p(k, r_1, \theta_1, \phi_1),\; p(k, r_2, \theta_2, \phi_2),\; \ldots,\; p(k, r_Q, \theta_Q, \phi_Q) \right]^T, \tag{4.21} \]

the (N + 1)^2 × 1 vector anm represents the coefficients of the sound field:

\[ \mathbf{a}_{nm} = \left[ a_{00},\; a_{1(-1)},\; a_{10},\; a_{11},\; \ldots,\; a_{NN} \right]^T \tag{4.22} \]

and the Q × (N + 1)^2 matrix B is given by

\[ \mathbf{B} = \begin{bmatrix}
b_0(kr_1) Y_0^0(\theta_1, \phi_1) & b_1(kr_1) Y_1^{-1}(\theta_1, \phi_1) & \cdots & b_N(kr_1) Y_N^N(\theta_1, \phi_1) \\
b_0(kr_2) Y_0^0(\theta_2, \phi_2) & b_1(kr_2) Y_1^{-1}(\theta_2, \phi_2) & \cdots & b_N(kr_2) Y_N^N(\theta_2, \phi_2) \\
\vdots & \vdots & \ddots & \vdots \\
b_0(kr_Q) Y_0^0(\theta_Q, \phi_Q) & b_1(kr_Q) Y_1^{-1}(\theta_Q, \phi_Q) & \cdots & b_N(kr_Q) Y_N^N(\theta_Q, \phi_Q)
\end{bmatrix}. \tag{4.23} \]

Plane-wave decomposition, which requires a division by bn as in Eq. (4.3), now
involves an inversion of matrix B. Therefore, the requirement for avoiding low
magnitude in bn is replaced in the more general case by the requirement that matrix B is
invertible, either directly if Q = (N + 1)^2 , or through pseudo-inversion in the general
case. In the case of a single-sphere configuration, matrix B can be decomposed
into a diagonal matrix holding values of function bn (kr ) and a matrix holding values
of the spherical harmonics at the sampling points, Ynm (θq , φq ). Hence, inversion of
matrix B requires that the magnitude of bn is not too small, which is consistent with
the analysis presented in the previous sections.
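The structure of matrix B in Eq. (4.23) can be illustrated for a very small case. The sketch below is restricted to N = 1 and to pressure microphones in free field, with bn(kr) = 4πi^n jn(kr). It assumes the standard complex spherical harmonics with the Condon-Shortley phase, θ measured from the z-axis, and a hypothetical layout of six microphones on the coordinate axes. The function names are illustrative and only the Python standard library is used.

```python
import cmath
import math

def spherical_jn_low(n, x):
    """j_0 and j_1 in closed form (x > 0)."""
    if n == 0:
        return math.sin(x) / x
    return math.sin(x) / x**2 - math.cos(x) / x

def b_open(n, kr):
    """b_n for pressure microphones in free field."""
    return 4 * math.pi * (1j ** n) * spherical_jn_low(n, kr)

def sph_harm_up_to_1(theta, phi):
    """[Y_0^0, Y_1^-1, Y_1^0, Y_1^1] at angle theta (from the z-axis) and azimuth phi."""
    c1 = math.sqrt(3 / (8 * math.pi))
    return [
        1 / math.sqrt(4 * math.pi),
        c1 * math.sin(theta) * cmath.exp(-1j * phi),
        math.sqrt(3 / (4 * math.pi)) * math.cos(theta),
        -c1 * math.sin(theta) * cmath.exp(1j * phi),
    ]

def build_B(k, positions):
    """Q x (N+1)^2 matrix of Eq. (4.23) for N = 1, as a list of rows."""
    orders = [0, 1, 1, 1]   # order n of each column (n, m)
    B = []
    for r, theta, phi in positions:
        Y = sph_harm_up_to_1(theta, phi)
        B.append([b_open(n, k * r) * y for n, y in zip(orders, Y)])
    return B

# Hypothetical layout: six microphones on the axes of a sphere of radius 0.1 m.
half_pi = math.pi / 2
positions = [(0.1, 0.0, 0.0), (0.1, math.pi, 0.0)]
positions += [(0.1, half_pi, i * half_pi) for i in range(4)]

B = build_B(5.0, positions)   # k = 5 rad/m, so kr = 0.5 < N
print(len(B), len(B[0]))
```

Because all radii are equal in this layout, every column of B is a column of spherical harmonics scaled by a single value of bn(kr), which is the single-sphere decomposition described above.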
Having measured the sound pressure at the microphones, vector p, and formulated
a model for matrix B that describes the array configuration in use, vector anm can be
computed by solving Eq. (4.20), either exactly or in a least-squares sense:

\[ \mathbf{a}_{nm}^{o} = \mathbf{B}^{\dagger} \mathbf{p}, \tag{4.24} \]

where $\mathbf{a}_{nm}^{o}$ is the solution. Assuming over-sampling, such that Q > (N + 1)^2 , the
pseudo-inverse is given by

\[ \mathbf{B}^{\dagger} = \left( \mathbf{B}^H \mathbf{B} \right)^{-1} \mathbf{B}^H. \tag{4.25} \]

When $\mathbf{a}_{nm}^{o}$ is substituted back into Eq. (4.20), it is expected that the equation is
satisfied exactly (or with a small error), validating the solution. In practice, however,
matrix B may not be known exactly. Possible causes for this uncertainty include:
microphone positions, (rq , θq , φq ), that are known only with finite precision;
perturbations from the assumed values of the gain and phase response of the
microphones; a non-ideal directional response of cardioid microphones; reflections
from the microphone casing or the microphone boom under an assumed free-field
condition; and non-negligible absorption in a constructed rigid sphere.
The perturbation in matrix B is denoted by δB and, when substituting back into
Eq. (4.20), will lead to a perturbation in p denoted by δp:

\[ \mathbf{p} + \delta\mathbf{p} = (\mathbf{B} + \delta\mathbf{B})\, \mathbf{a}_{nm}^{o}. \tag{4.26} \]

It is desired that a small perturbation δB will lead to a small perturbation δp, so that
the extent to which Eq. (4.20) is not satisfied is minimized. This sensitivity relation
is formulated by substituting Eq. (4.20) with $\mathbf{a}_{nm}^{o}$ into Eq. (4.26), leading to

\[ \delta\mathbf{p} = \delta\mathbf{B}\, \mathbf{a}_{nm}^{o}. \tag{4.27} \]

Substituting Eq. (4.24) and evaluating the 2-norm leads to

\[ \|\delta\mathbf{p}\| \le \|\delta\mathbf{B}\| \cdot \|\mathbf{B}^{\dagger}\| \cdot \|\mathbf{p}\|. \tag{4.28} \]

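Rearranging and substituting the 2-norm condition number, the sensitivity of variation
in p to variation in B is written as [12]

\[ \frac{\|\delta\mathbf{p}\|}{\|\mathbf{p}\|} \le \kappa(\mathbf{B}) \frac{\|\delta\mathbf{B}\|}{\|\mathbf{B}\|}, \tag{4.29} \]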

where κ(B) is the condition number of matrix B, which for the 2-norm case can be
written as [12]

\[ \kappa(\mathbf{B}) = \|\mathbf{B}\| \cdot \|\mathbf{B}^{\dagger}\| = \frac{\bar{\sigma}(\mathbf{B})}{\underline{\sigma}(\mathbf{B})}, \tag{4.30} \]

where $\bar{\sigma}$ denotes the maximal singular value and $\underline{\sigma}$ denotes the minimal singular
value. Equation (4.29) shows that the condition number amplifies the error in matrix
B, so it is important to keep the condition number as close to unity as possible.
For the special case of a square matrix B, with a full rank equal to (N + 1)2 , the
condition number is written as in Eq. (4.30), but with the pseudo-inverse replaced
by the inverse.
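The bound of Eq. (4.29) can be checked on a toy problem. The sketch below uses a hypothetical real 3 × 2 matrix rather than an actual array matrix, so that the singular values can be obtained in closed form from the 2 × 2 Gram matrix using only the Python standard library. The matrices, the vector and the helper names are all illustrative assumptions.

```python
import math

def singular_values(M):
    """Singular values (largest first) of a real m x 2 matrix,
    from the eigenvalues of its 2 x 2 Gram matrix M^T M."""
    a = sum(row[0] * row[0] for row in M)
    b = sum(row[0] * row[1] for row in M)
    d = sum(row[1] * row[1] for row in M)
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))

def matvec(M, v):
    """Product of an m x 2 matrix with a length-2 vector."""
    return [row[0] * v[0] + row[1] * v[1] for row in M]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

B = [[1.0, 0.0], [0.0, 0.1], [1.0, 0.1]]        # toy model matrix
dB = [[0.0, 0.01], [0.01, 0.0], [0.0, -0.01]]   # toy perturbation
a = [1.0, 2.0]                                  # toy coefficient vector

p = matvec(B, a)     # exact data, so the least-squares solution equals a
dp = matvec(dB, a)   # perturbation in p, as in Eq. (4.27)

s_max, s_min = singular_values(B)
kappa = s_max / s_min                           # Eq. (4.30)
lhs = norm(dp) / norm(p)
rhs = kappa * singular_values(dB)[0] / s_max    # right-hand side of Eq. (4.29)
print(kappa, lhs, rhs)
```

In this toy case the second column of B is small relative to the first, which raises the condition number and loosens the bound, in the same way that a low-magnitude bn does for an array matrix.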
Perturbation can also take place in vector p. The sound pressure vector is typi-
cally measured by microphones, so that amplifier noise and quantization error when
sampled by a computer may produce errors, or a perturbation, in vector p. It has been
shown that the bound on the error of the solution, anm in this case, for errors in p
and for the non-square matrix case grows with κ(B), motivating the reduction of the
condition number in these cases as well [12].
Having established that the condition number of matrix B is an important measure
for the robustness of the solution of Eq. (4.20) to errors in the data represented by
vector p and matrix B, the condition number can be used as an objective for mini-
mization when designing a spherical array configuration. For example, the following
optimization problem can be formulated for searching for microphone positions that
will produce the most robust design [10]:

\[ (r_q, \theta_q, \phi_q) = \arg\min_{r_q, \theta_q, \phi_q} \kappa(\mathbf{B}), \quad 1 \le q \le Q. \tag{4.31} \]

Such an optimization problem may not be convex, requiring global search methods
such as genetic algorithms. Selection of the sphere configuration, e.g. open or rigid,
and of the type of microphone, e.g. pressure or cardioid, can also be integrated
into such a design. In the next two sections, examples of κ(B) for some of the
designs described in this chapter are presented, after which the shell configuration is
introduced, which uses the design optimization presented in Eq. (4.31).

4.6 Design Examples with Robustness Analysis

Practical limitations in the realization of arrays, causing deviations from the the-
oretical “ideal” design, will produce errors that propagate to the array output. As
discussed earlier in this chapter, common causes of errors may include, for example,
accuracy of microphone positioning, mismatch in the frequency response of micro-
phones and non-ideal acoustic models of the sphere. These errors may be represented
as perturbations in matrix B relative to an ideal matrix, so that the condition number
of matrix B can be used as a general measure of the sensitivity of the array output to
these errors, as discussed in Sect. 4.5.
Several array configurations are investigated in this section. The condition number
of matrix B is computed for these selected array configurations, with the aim of illus-
trating and comparing their robustness. Matrix B is computed for each configuration
in the range 0 ≤ n ≤ 3 and 0 ≤ kr ≤ 6. In most cases, the sampling configuration
is designed for an order-limited function with a maximum order of N = 6, such
that spatial over-sampling is maintained. The reason for this relatively significant
over-sampling is to guarantee an operating region in the range 3 ≤ kr ≤ 6, in which
function bn (kr ) has a relatively uniform magnitude as a function of n.
In the first example, a spherical array configured around a rigid sphere is investi-
gated. Three sampling schemes, namely equal-angle, Gaussian and nearly uniform, as
discussed in Chap. 3, are studied. The three schemes are designed for order N = 6,
with 196, 98 and 84 samples, respectively. Matrix B for each of these three con-
figurations is computed and has dimensions Q by (N + 1)2 , where Q is the total
number of samples and (N + 1)2 = 49. The condition number of these matrices is
then computed for a range of values along kr . Although all three configurations were
considered robust when studied above, due to the inherent robustness of the rigid
sphere with regard to eliminating the zeros of the spherical Bessel function, Fig. 4.7
clearly shows that the nearly-uniform distribution is slightly more robust than the
Gaussian and the equal-angle distributions. This is probably due to the more uniform
manner in which the samples are distributed on the sphere, avoiding the clustering
at the poles. The figure also shows that the condition number is high at the lower-frequency
end (for kr < 3). This is due to the inherently low magnitude of b1 to b3
for kr < 3. This increase in condition number at low values of kr cannot be avoided
by re-distribution of the microphones and will typically require an increase in the
radius of the sphere, such that the cut-off point (kr = 3 in this case) occurs at a lower
wave number.

Fig. 4.7 The condition number κ(B) as a function of kr for array configurations around a rigid
sphere, with sampling distributions as follows: equal-angle with 196 samples, Gaussian with 98
samples and nearly-uniform with 84 samples, all providing aliasing-free sampling up to order 6

Fig. 4.8 The condition number κ(B) as a function of kr for three array configurations; (i) around
a rigid sphere and (ii) around an open sphere, both with nearly-uniform sampling distribution
with 84 samples, and (iii) an open array configuration with an additional sample at the origin. All
configurations provide aliasing-free sampling up to order 6

In the next example, three array configurations are compared, including one
around a rigid sphere and one around an open sphere, both with nearly-uniform sampling
distributions of 84 samples. The configuration around a rigid sphere is the same as
in Fig. 4.7 and is presented here as a reference. The third configuration is the same
as the open-array configuration, only an additional sample has been added to matrix
B, at the array origin. Figure 4.8 presents the condition number of matrix B for these
configurations. The open-array configuration clearly shows high condition numbers
at kr values close to the zeros of the spherical Bessel function. Equation (4.23) shows
that matrix B has its first column equal to zero for kr = π , due to the zero j0 (π ) = 0.
Now, when an additional row is added due to the sample at the origin, this column
will not be zero because j0 (0) = 1 ≠ 0, and so the loss of rank due to the zero column is
recovered. This is also evident in Fig. 4.8, where the condition number for this new
configuration follows that of an open sphere, but avoids the high condition number
values around the first zero.
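The effect of the additional sample can be seen directly in the first column of matrix B. The sketch below is illustrative only: it assumes an open sphere of radius 1 m with 84 pressure microphones, uses Y00 = 1/√(4π), and evaluates the column entries b0(krq)Y00 at the wave number where j0(kr) = 0. The function names are assumptions and only the Python standard library is used.

```python
import math

def j0(x):
    """Zeroth-order spherical Bessel function, with j_0(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

Y00 = 1 / math.sqrt(4 * math.pi)

def first_column(k, radii):
    """Entries b_0(k r_q) Y_0^0 of the first column of B for an open array."""
    return [4 * math.pi * j0(k * r) * Y00 for r in radii]

r = 1.0
k = math.pi / r                                    # first zero of j_0(kr)
sphere_only = first_column(k, [r] * 84)            # 84 samples on the sphere
with_origin = first_column(k, [r] * 84 + [0.0])    # extra sample at the origin

print(max(abs(v) for v in sphere_only))   # numerically zero column
print(with_origin[-1])                    # entry contributed by the origin sample
```

The single non-zero entry, equal to √(4π), is what restores the rank of B at this value of kr. Columns of order n ≥ 1 are unaffected by the origin sample because jn(0) = 0 for n ≥ 1, which is consistent with only the first zero being avoided in Fig. 4.8.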
In the final example, the condition numbers of an open array with cardioid micro-
phones and of a dual-sphere array with a second radius that is 1.3 times smaller than
the first radius, have been computed and are presented in Fig. 4.9. A nearly-uniform
sampling scheme with 84 samples has been used for both arrays. For the dual-sphere
array, only data points corresponding to the radius having the maximum magnitude
of bn (kr ) are selected, as discussed in Sect. 4.4. The figure shows that, as expected,
both the array based on cardioid microphones and the dual-sphere array overcome
the ill-conditioning due to the zeros of the spherical Bessel function, and achieve a
reasonably low condition number. In addition, the same dual-radius configuration is
presented with matrix B composed of rows from both spheres, rather than using the
maximization selection criterion. The result is a condition number very similar to that
of the original dual-radius array. In this case, matrix B has twice as many rows,
but the near-zero entries around the Bessel zeros, which do not contribute useful
information, are simply redundant. Therefore, in this case, the maximization process
can be avoided by simply using a larger matrix.

Fig. 4.9 The condition number κ(B) as a function of kr for four array configurations; (i) around
a rigid sphere, (ii) around an open sphere with cardioid microphones, (iii) around a dual-sphere
array with the second radius 1.3 times smaller than the first radius (dual-max) and (iv) around
another dual-sphere array with matrix B composed of a combination of elements from both spheres
(dual-both). All configurations provide aliasing-free sampling up to order 6 and use nearly-uniform
sampling with 84 samples on each sphere

4.7 Spherical Shell Configuration

Section 4.4 showed how the ill-conditioning in the design of an open-sphere array
is removed by positioning microphones on the surfaces of dual concentric spheres.
Although the dual-sphere array solved the ill-conditioning due to the zeros of the
spherical Bessel function, it required twice as many microphones, compared to the
single-sphere configuration. Motivated by the theory behind the dual-sphere array,
and with the aim of minimizing the increase in the number of microphones, the
spherical shell configuration is presented in this section [10]. In this configuration,
microphones are distributed inside the volume enclosed by the two spheres of the
dual-sphere configuration. However, the overall number of microphones is the same
as that of the equivalent single-sphere configuration, such as the single open sphere
and the single rigid sphere. The design of the array in this configuration requires
selection of the angles (θ, φ) and the radius r for each microphone. Because of the
increased degree-of-freedom in this configuration (due to the varying radius), the
design framework presented in Sect. 4.5 can be used both to compare designs based
on some regular selection of the radius and angles of microphone positions and as a
framework for optimizing microphone positions.
A straightforward way to select microphone positions in this configuration is to
distribute the microphones along (θ, φ) using a known nearly-uniform sampling
distribution, or one of the other known sampling schemes, and to distribute them
uniformly along the radius between the two spheres. Figure 4.10 shows the condition
number of a rigid sphere with the same configuration as presented in Sect. 4.6, with 84
nearly-uniformly distributed samples. The condition number of the spherical shell
with the same microphone distribution along the angles and with uniform radial
distribution between the two spheres is also shown. The first sphere has the same
radius as that of the rigid-sphere array, while the second radius is smaller by a factor of
1.3. The figure shows that, although the condition number of the shell array is higher
than that of the rigid-sphere array, it is still relatively low and so this configuration
can be considered relatively robust.

Fig. 4.10 The condition number κ(B) as a function of kr for three array configurations; (i) around a
rigid sphere, (ii) and (iii) around an open sphere with microphones distributed in the volume of a shell
with uniform radial distribution (shell-uniform) and with optimal radial distribution (shell-optimal),
respectively. All configurations provide aliasing-free sampling up to order 6

Fig. 4.11 Radius distribution for the optimal radial design, illustrated using x-marks on two polar
plots, with the upper plot showing (r, θ) for each position and the lower plot showing (r, φ) for each
position

In an attempt to improve the robustness by lowering the condition number, the
radial component of the microphone positions was selected by numerical optimization,
based on the formulation in Sect. 4.5 and a genetic-algorithm solver [10], in the
radial range from zero to the larger radius in the dual-sphere configuration. Figure
4.10 shows the condition number for this configuration, which is, indeed, lower at
most kr values compared to the uniform distribution along the radius, with a lower
upper-bound in the range 3 ≤ kr ≤ 6.
The radii generated using this optimized design are presented in Fig. 4.11, showing
(r, θ ) and (r, φ) for each optimized position, with r = 1 representing the larger radius
in the dual-sphere design. The figure shows that most radii are at or near the maximum
allowed radius, and some are distributed inside the sphere.

Further details on the spherical shell array design, including other methods for
the distribution of samples within a shell volume, are presented in [10].

4.8 Other Configurations

Other spherical array configurations not presented in previous sections of this chapter
have been developed and reported in the literature, and are outlined in this section
briefly. The first example can be viewed as a continuation of the spherical shell array.
Although the shell configuration provides numerical robustness without increasing
the number of microphones, it may possess drawbacks related to the irregular distri-
bution of samples. For example, in a mechanical-scanning microphone array system,
the dual-sphere array can be realized by a two degrees-of-freedom system, where
elevation and azimuth are controlled using separate motors or turn-tables, with an
additional single manual change of microphone radius. The spherical shell array, with
a uniform distribution of radial position, for example, may require a three degrees-
of-freedom system, i.e. with three motors, for automatic placement of microphones.
This means an additional cost and complexity. With the aim of maintaining the advan-
tages of the spherical shell array, Alon and Rafaely [1] proposed a realization of a
microphone scanning system with two motors arranged off-axis, therefore allowing
positioning of microphones within the approximate volume of a spherical shell. This
configuration, termed the spindle torus array due to the resulting scanning surface,
was shown to provide robustness at a similar level to that found in the shell array,
but with a realization that required only two degrees-of-freedom.
Parthy et al. [9] presented an interesting design concept, combining both rigid
and open spheres in a single concentric arrangement. Such a design benefits from
both improved robustness due to the effect of the rigid sphere and improved frequency
range due to the measurement with two spheres of different radii. The larger open
sphere allows improved analysis in the lower frequency range and the smaller rigid
sphere allows an extension of the aliasing-free range to a higher frequency. In [9],
the proposed array was built and investigated for acoustic holography.
Another design variation that is based around a rigid sphere was introduced by
Li and Duraiswami [5]. It was proposed for situations in which the array is mounted
near a large rigid surface, such as a wall or a desk. Assuming this surface is infi-
nite and rigid, incoming waves undergo specular reflection, so that the outgoing
waves are a mirror image of the incoming waves. This symmetry allows the use of
a rigid microphone array in the shape of a hemisphere, where the pressure at the
missing microphones can be calculated by incorporating the symmetry in the sound
field. Although a hemispherical microphone array is used with half the number of
microphones, all methods developed for spherical arrays can be readily used by this
array due to the symmetry in the sound field. The proposed array, in addition to
saving half the number of microphones, has the shape of a hemisphere, which can be
conveniently placed on a large desk in a video conferencing scenario, for example.
Another array configuration that aims to achieve an improved frequency range
of operation while overcoming the ill-conditioning introduced by the zeros of the
spherical Bessel function has been presented by Melchior et al. [6]. This array is
based on two concentric spheres, similar to the dual-sphere array, only here, car-
dioid microphones are employed. These overcome the ill-conditioning at the null
frequencies at each of the two spheres (see Sect. 4.3). Now, with the spheres having
significantly different radii, the frequency range of operation can be extended beyond
that achievable with the single-sphere design, or even with the dual-sphere design
with radii close in value. Sound field data measured by this array has been used for
binaural auralization.

References

1. Alon, D., Rafaely, B.: Spindle-torus sampling for an efficient-scanning spherical microphone
array. Acta Acust. united Ac. 98(1), 83–90 (2012)
2. Balmages, I., Rafaely, B.: Open-sphere designs for spherical microphone arrays. IEEE Trans.
Audio Speech Lang. Process. 15(2), 727–732 (2007)
3. Hulsebos, E., Schuurmans, T., de Vries, D., Boone, R.: Circular microphone array for discrete
multichannel audio recording. In: Proceedings of the 114th Meeting of the Audio Engineering
Society, 5716. Amsterdam (2003)
4. Kinsler, L.E., Frey, A.R., Coppens, A.B., Sanders, J.V.: Fundamentals of Acoustics, 4th edn.
Wiley, New York (1999)
5. Li, Z., Duraiswami, R.: Hemispherical microphone arrays for sound capture and beamforming.
In: Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and
Acoustics (WASPAA 2005). New York (2005)
6. Melchior, F., Thiergart, O., Del Galdo, G., de Vries, D., Brix, S.: Dual radius spherical cardioid
microphone arrays for binaural auralization. In: Proceedings of the 127th Meeting of the Audio
Engineering Society, 7855. New York (2009)
7. Meyer, J.: Beamforming for a circular microphone array mounted on spherically shaped objects.
J. Acoust. Soc. Am. 109(1), 185–193 (2001)
8. Meyer, J., Elko, G.W.: A highly scalable spherical microphone array based on an orthonormal
decomposition of the soundfield. In: IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP 2002), vol. II, pp. 1781–1784. Orlando (2002)
9. Parthy, A., Jin, C., van Schaik, A.: Acoustic holography with a concentric rigid and open
spherical microphone array. In: IEEE International Conference on Acoustics, Speech, and
Signal Processing (ICASSP 2009), pp. 2173–2176. Taipei (2009)
10. Rafaely, B.: The spherical-shell microphone array. IEEE Trans. Audio Speech Lang. Process.
16(4), 740–747 (2008)
11. Rafaely, B., Balmages, I., Eger, L.: High-resolution plane-wave decomposition in an auditorium
using a dual-radius scanning spherical microphone array. J. Acoust. Soc. Am. 122(5), 2661–
2668 (2007)
12. Trefethen, L.N., Bau, D.: Numerical Linear Algebra. SIAM, Philadelphia (1997)
Chapter 5
Spherical Array Beamforming

Abstract Chapter 4 presented various ways to configure a spherical microphone
array and discussed the advantages of each configuration. Once microphones are
positioned in space in a desired configuration, e.g. on the surface of a rigid sphere,
they can be connected to conditioning equipment, and the signal at each microphone
can be recorded. In this chapter, the signals at the microphones are defined as the
inputs to an array processor, producing a single processed output with some desired
characteristics. One possible desired characteristic is to enhance signals from a sound
source that is located in a specific direction and to attenuate signals from sources
located in other directions, therefore forming a spatial, or directional filter. Such a
filter is called a beamformer, because the beam it forms looks at a desired direction,
and is probably the simplest form of array processing. The first section of this chapter
presents array equations, with array input, spatial filter, and array output formulated
in the space domain. This is followed by the derivation of the same equations in
the spherical harmonics domain, where the benefits of processing in this domain
are emphasized. Two important measures of array performance, namely directivity
index and white noise gain (WNG), are presented in the following sections. These are
derived both in the space and in the spherical harmonics domains. A simplified beam-
forming structure that produces axis-symmetric beam patterns and that decouples the
shaping and the steering of a beam pattern is also introduced. The chapter contin-
ues with a presentation of two common beamformers, namely delay-and-sum and
plane-wave decomposition. Finally, steering of non-axis-symmetric beamformers is
presented, and the chapter concludes with a beamforming example.

5.1 Beamforming Equations

Array equations, or beamforming equations, are initially defined in this section in
the space domain. First, a theoretical framework is developed using a continuous
pressure function over the surface of a sphere. Although in practice the continuous
pressure function is not available, the continuous form of the array equation will
be used as a theoretical reference for the developments to follow.

© Springer Nature Switzerland AG 2019
B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8_5

Fig. 5.1 A block diagram of a space domain beamforming system

Consider sound pressure on a sphere of radius r, denoted by p(k, r, θ, φ). A spatial
filter is defined by multiplying the sound pressure function with a weighting function,
[w(k, θ, φ)]∗, and integrating over the entire sphere surface to produce an array output y:

$$
y = \int_{0}^{2\pi}\!\int_{0}^{\pi} [w(k,\theta,\phi)]^{*}\, p(k,r,\theta,\phi)\, \sin\theta \, d\theta \, d\phi. \tag{5.1}
$$

In the next step, a spherical microphone array composed of Q microphones positioned
at the surface of the same sphere of radius r is introduced. Microphone positions are
denoted by (r, θq , φq ), q = 1, . . . , Q. The sound pressure measured by microphone
q at wave number k is denoted by pq (k) ≡ p(k, r, θq , φq ); these form the elements
of a Q × 1 vector of measured sound pressure amplitudes:
$$
\mathbf{p} = \left[ p_1(k),\, p_2(k),\, \ldots,\, p_Q(k) \right]^{T}. \tag{5.2}
$$

A discrete version of the spatial filter is also defined in a similar manner, with weight
$w_q(k)$ corresponding to microphone number q. The Q × 1 weight vector is defined
as
$$
\mathbf{w} = \left[ w_1(k),\, w_2(k),\, \ldots,\, w_Q(k) \right]^{T}. \tag{5.3}
$$

In the standard space domain array processing literature, the array output is given as
an inner product of the two vectors [7] (see also Fig. 5.1) such that

$$
y = \mathbf{w}^{H} \mathbf{p}. \tag{5.4}
$$

However, it is very important to note that the definitions in Eqs. (5.1) and (5.4)
are not equivalent. A discrete version of the array equation in the space domain that
is equivalent to Eq. (5.1) has to take the effect of spatial sampling into account. The
relation between the two forms will be derived later in this section, using the formu-
lation of the array equation in the spherical harmonics domain. It is also important
to note that the array equation in the form of Eq. (5.1) does not suffer from spatial
aliasing and may, therefore, be useful when studying aspects of array processing
other than spatial aliasing.
The general problem of array beamforming, or spatial filtering, can be defined
as designing w such that, for a given array input p, the array output y is produced
with some desired properties. When characterizing array properties, an array input
for a sound field composed of a single, unit-amplitude plane wave is often assumed
[7]. In this case, the measured pressure is replaced by a steering vector, or manifold
vector, which represents the plane-wave amplitude measured at each microphone.
The steering vector, denoted by v, has a simple analytical form for arrays composed
of pressure microphones in free field, which is
$$
\mathbf{v} = \left[ v_1,\, v_2,\, \ldots,\, v_Q \right]^{T}, \tag{5.5}
$$
where
$$
v_q = e^{i \tilde{\mathbf{k}} \cdot \mathbf{r}}, \quad 1 \le q \le Q. \tag{5.6}
$$

The wave vector k̃ = (k, θk , φk ) denotes the plane-wave arrival direction (see
Chap. 2), and the position vector r = (r, θq , φq ) denotes the position of microphone
q. The array output can now be written as

$$
y = \mathbf{w}^{H} \mathbf{v}. \tag{5.7}
$$

This is now an explicit function of the wave arrival direction, through the dependence
of v on (θk , φk ), that defines the directional response (or directivity) of the array. It is
important to note that when other array configurations are considered, e.g. pressure
microphones around a rigid sphere, the steering vector includes the effect of the
scattering of sound from the sphere. This complicates the analytical expressions for
vector v, which motivates the representation of the array equations in the spherical
harmonics domain, a mathematically more natural domain in this case.
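
The space domain beam pattern of Eqs. (5.5)–(5.7) can be sketched numerically as follows. This is a minimal illustration, not a reference implementation: the six-microphone layout, the values of k and r, the look direction, and the choice of phase-matched weights w = v(look)/Q are all assumptions made for the example, and the helper names are hypothetical.

```python
import numpy as np

def unit_vectors(theta, phi):
    """Cartesian unit vectors for polar angle theta and azimuth phi (radians)."""
    theta, phi = np.asarray(theta, float), np.asarray(phi, float)
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

def steering_vector(k, r, mic_theta, mic_phi, theta_k, phi_k):
    """Free-field steering vector, v_q = exp(i k~ . r_q), as in Eq. (5.6)."""
    r_q = r * unit_vectors(mic_theta, mic_phi)   # (Q, 3) microphone positions
    k_vec = k * unit_vectors(theta_k, phi_k)     # (3,) wave vector k~
    return np.exp(1j * r_q @ k_vec)

# Assumed layout: Q = 6 microphones on the coordinate axes, r = 0.1 m, kr = 1.
mic_theta = np.array([np.pi / 2, np.pi / 2, np.pi / 2, np.pi / 2, 0.0, np.pi])
mic_phi = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2, 0.0, 0.0])
k, r = 10.0, 0.1

# Assumed weights: phase-matched to the look direction and scaled by 1/Q.
look = (np.pi / 2, 0.0)
w = steering_vector(k, r, mic_theta, mic_phi, *look) / len(mic_theta)

# Beam pattern y = w^H v, Eq. (5.7), scanned over azimuth in the horizontal plane.
azimuths = np.linspace(0.0, 2 * np.pi, 73)
pattern = np.array([
    np.vdot(w, steering_vector(k, r, mic_theta, mic_phi, np.pi / 2, az))
    for az in azimuths
])
```

Here `np.vdot` conjugates its first argument, so it evaluates the inner product of Eq. (5.7) directly. With these weights the response is unity in the look direction and no larger elsewhere.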
Array equations developed in the space domain are derived next in the spherical
harmonics domain. Consider Eq. (5.1), where the pressure function p(k, r, θ, φ) and
the weight function w(k, θ, φ) are defined over the sphere, and denote by pnm (k)
and wnm (k) their respective spherical Fourier transforms. Substituting in Eq. (5.1)
the spherical harmonics expansion for p and w, as in Eq. (1.41), and evaluating the
integral using the orthogonality property of the spherical harmonics, Eq. (1.23), the
array output can be written as a function of the spherical Fourier coefficients:
$$
\begin{aligned}
y &= \int_{0}^{2\pi}\!\int_{0}^{\pi} [w(k,\theta,\phi)]^{*}\, p(k,r,\theta,\phi)\, \sin\theta \, d\theta \, d\phi \\
  &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r).
\end{aligned} \tag{5.8}
$$

Fig. 5.2 A block diagram of a spherical harmonics domain beamforming system

Now, assuming coefficients beyond order N are zero, wnm = 0 ∀n > N , this equation
can be written in a matrix form (see also Fig. 5.2) as

$$
y = \mathbf{w}_{nm}^{H} \mathbf{p}_{nm}, \tag{5.9}
$$

where the $(N+1)^2 \times 1$ vector $\mathbf{w}_{nm}$ is given by
$$
\mathbf{w}_{nm} = \left[ w_{00}(k),\, w_{1(-1)}(k),\, w_{10}(k),\, w_{11}(k),\, \ldots,\, w_{NN}(k) \right]^{T}, \tag{5.10}
$$

and the $(N+1)^2 \times 1$ vector $\mathbf{p}_{nm}$ is given by
$$
\mathbf{p}_{nm} = \left[ p_{00}(k,r),\, p_{1(-1)}(k,r),\, p_{10}(k,r),\, p_{11}(k,r),\, \ldots,\, p_{NN}(k,r) \right]^{T}. \tag{5.11}
$$

The array beam pattern, or array output due to a unit-amplitude plane-wave sound
field, can also be written in the spherical harmonics domain, in a manner similar to
Eq. (5.7):
$$
y = \mathbf{w}_{nm}^{H} \mathbf{v}_{nm}, \tag{5.12}
$$

where the $(N+1)^2 \times 1$ column vector $\mathbf{v}_{nm}$ is defined as
$$
\mathbf{v}_{nm} = \left[ v_{00},\, v_{1(-1)},\, v_{10},\, v_{11},\, \ldots,\, v_{NN} \right]^{T}, \tag{5.13}
$$

with vnm representing the array input due to the plane-wave sound field. The expres-
sion for vnm is derived from the sound pressure, pnm , due to the unit-amplitude plane
wave. For an open-sphere array configuration [see Eq. (2.41)] $p_{nm}$ is written as
$$
p_{nm}(k,r) = 4\pi i^{n} j_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*}, \tag{5.14}
$$

which is also the expression for $v_{nm}$, i.e.
$$
v_{nm} = 4\pi i^{n} j_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*}, \tag{5.15}
$$

with the plane-wave arrival directions denoted by (θk , φk ). Following the notation
introduced in Chap. 4, the open-sphere configuration can be written more generally
as
$$
v_{nm} = b_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*}, \tag{5.16}
$$

with $b_n(kr) = 4\pi i^{n} j_n(kr)$. This can now be extended for a wide range of array
configurations, simply by modifying the expression for $b_n(kr)$ to apply to a rigid-sphere
array, a dual-sphere open array, and more (see Chap. 4). This flexibility, facilitating
the modeling of the steering vectors of various array configurations within the same
framework, is a significant advantage of formulating array equations in the spherical
harmonics domain.
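
As a hedged illustration of Eqs. (5.13)–(5.16), the sketch below builds the open-sphere steering vector in the spherical harmonics domain and checks that its truncated expansion reproduces the free-field plane-wave pressure at one microphone position. The order N = 8, kr = 1, and the two directions are arbitrary assumptions; `Ynm`, `bn_open`, `steering_vector_sh` and `sh_row` are hypothetical helper names. The `Ynm` wrapper is an assumption about the installed SciPy: it uses `sph_harm_y` where available and falls back to the older `sph_harm`, whose argument order differs.

```python
import numpy as np
from scipy.special import spherical_jn

try:  # SciPy >= 1.15: sph_harm_y(n, m, polar, azimuth)
    from scipy.special import sph_harm_y

    def Ynm(n, m, theta, phi):
        return sph_harm_y(n, m, theta, phi)
except ImportError:  # older SciPy: sph_harm(m, n, azimuth, polar)
    from scipy.special import sph_harm

    def Ynm(n, m, theta, phi):
        return sph_harm(m, n, phi, theta)

def bn_open(n, kr):
    """Open-sphere radial function b_n(kr) = 4 pi i^n j_n(kr)."""
    return 4 * np.pi * (1j ** n) * spherical_jn(n, kr)

def steering_vector_sh(N, kr, theta_k, phi_k, bn=bn_open):
    """v_nm = b_n(kr) [Y_n^m(theta_k, phi_k)]^*, ordered as in Eq. (5.13)."""
    return np.array([bn(n, kr) * np.conj(Ynm(n, m, theta_k, phi_k))
                     for n in range(N + 1) for m in range(-n, n + 1)])

def sh_row(N, theta, phi):
    """Row [Y_00, Y_1(-1), ..., Y_NN] evaluated at (theta, phi)."""
    return np.array([Ynm(n, m, theta, phi)
                     for n in range(N + 1) for m in range(-n, n + 1)])

# Assumed values: kr = 1, order N = 8, one arrival direction, one microphone.
kr, N = 1.0, 8
theta_k, phi_k = 1.1, 0.4
theta_q, phi_q = 2.0, 3.0

v_nm = steering_vector_sh(N, kr, theta_k, phi_k)
p_sh = sh_row(N, theta_q, phi_q) @ v_nm   # truncated expansion at the microphone

# Direct free-field pressure exp(i k~ . r), using the angle between the directions.
cos_gamma = (np.cos(theta_k) * np.cos(theta_q)
             + np.sin(theta_k) * np.sin(theta_q) * np.cos(phi_k - phi_q))
p_direct = np.exp(1j * kr * cos_gamma)
```

Passing a different `bn` callable to `steering_vector_sh` is one way to switch to another array configuration without touching the rest of the code, which mirrors the flexibility described in the text.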
Another advantage of formulating the equations in the spherical harmonics
domain (compared with the space domain) is computational efficiency. In practice,
arrays perform over-sampling, such that $Q > (N+1)^2$. This means that the vectors
and matrices in the spherical harmonics domain are of lower dimension than the
same vectors and matrices in the space domain.
In the remainder of this book, the formulation in the spherical harmonics domain
will be used as the standard formulation. As shown above, the spherical harmon-
ics formulation is more flexible, as it allows a unified representation for various
array configurations and sampling schemes. However, in some cases, formulation in
the space domain may be required; this formulation is more standard in the array
processing literature because it uses the microphone signals directly. Therefore, the
relation between the spherical harmonics domain formulation and the space domain
formulation is presented next.
Starting with the spherical harmonics formulation, the array equation, as in
Eq. (5.9), is rewritten here:
$$
y = \mathbf{w}_{nm}^{H} \mathbf{p}_{nm}. \tag{5.17}
$$

Next, the relations between the spherical harmonics vectors wnm and pnm and the
space domain vectors w and p are derived by introducing the effect of sampling, as in
Eqs. (3.34), (3.35) and (3.38), for the three types of sampling schemes, as presented
in Sect. 3.6.
Substituting $\mathbf{w}_{nm} = \mathbf{Y}^{\dagger} \mathbf{w}$ for a general sampling scheme and a similar expression
for $\mathbf{p}_{nm}$ into Eq. (5.17), the array output can be written in the space domain as
$$
y = \mathbf{w}^{H} \left( \mathbf{Y}^{\dagger H} \mathbf{Y}^{\dagger} \right) \mathbf{p}. \tag{5.18}
$$

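Similarly, for the equal-angle sampling and the Gaussian sampling schemes, substituting
$\mathbf{w}_{nm} = \mathbf{Y}^{H} \mathrm{diag}(\boldsymbol{\alpha}) \mathbf{w}$ and a similar expression for $\mathbf{p}$, the array output becomes
$$
y = \mathbf{w}^{H} \left( \mathrm{diag}(\boldsymbol{\alpha})\, \mathbf{Y} \mathbf{Y}^{H} \mathrm{diag}(\boldsymbol{\alpha}) \right) \mathbf{p}. \tag{5.19}
$$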

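Finally, for the uniform and nearly-uniform sampling schemes, with $\mathbf{w}_{nm} = \frac{4\pi}{Q} \mathbf{Y}^{H} \mathbf{w}$
and a similar expression for $\mathbf{p}$, the array output is expressed as
$$
y = \left( \frac{4\pi}{Q} \right)^{2} \mathbf{w}^{H} \mathbf{Y} \mathbf{Y}^{H} \mathbf{p}. \tag{5.20}
$$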

Equations (5.18) to (5.20) are the space domain equivalent to the spherical harmonics
domain array equation, Eq. (5.17). It is important to note that they are different from
the standard space domain equation $y = \mathbf{w}^{H} \mathbf{p}$, and so the two forms, $y = \mathbf{w}_{nm}^{H} \mathbf{p}_{nm}$
and $y = \mathbf{w}^{H} \mathbf{p}$, are not the same and cannot be used interchangeably. Equations (5.18)–
(5.20) can be written in a unified manner by using matrix S, as defined in Eqs. (3.41)–
(3.43), such that
$$
y = \mathbf{w}^{H} \left( \mathbf{S}^{H} \mathbf{S} \right) \mathbf{p}. \tag{5.21}
$$
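
The distinction between the two space domain forms can be checked numerically. The sketch below assumes the general sampling scheme, with S taken as the pseudo-inverse of the spherical harmonics matrix Y, and uses randomly drawn microphone angles, weights and signals purely for illustration; none of these values come from the text, and `Ynm` and `sh_matrix` are hypothetical helpers (the `Ynm` wrapper adapts to the installed SciPy version).

```python
import numpy as np

try:  # SciPy >= 1.15: sph_harm_y(n, m, polar, azimuth)
    from scipy.special import sph_harm_y

    def Ynm(n, m, theta, phi):
        return sph_harm_y(n, m, theta, phi)
except ImportError:  # older SciPy: sph_harm(m, n, azimuth, polar)
    from scipy.special import sph_harm

    def Ynm(n, m, theta, phi):
        return sph_harm(m, n, phi, theta)

def sh_matrix(N, theta, phi):
    """Q x (N+1)^2 matrix whose rows are [Y_00, Y_1(-1), ..., Y_NN] per microphone."""
    return np.column_stack([Ynm(n, m, theta, phi)
                            for n in range(N + 1) for m in range(-n, n + 1)])

rng = np.random.default_rng(0)
N, Q = 2, 16                                  # over-sampling: Q > (N + 1)^2
theta = np.arccos(rng.uniform(-1.0, 1.0, Q))  # assumed random microphone angles
phi = rng.uniform(0.0, 2 * np.pi, Q)

Y = sh_matrix(N, theta, phi)
S = np.linalg.pinv(Y)                         # general scheme: S = Y^dagger

w = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)  # arbitrary weights
p = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)  # arbitrary signals

y_sh = np.vdot(S @ w, S @ p)                  # w_nm^H p_nm, Eq. (5.17)
y_space = w.conj() @ (S.conj().T @ S) @ p     # w^H (S^H S) p, Eq. (5.21)
y_standard = np.vdot(w, p)                    # w^H p, Eq. (5.4)
```

In this sketch `y_sh` and `y_space` agree to numerical precision, while `y_standard` differs, because the matrix S^H S has rank (N + 1)^2 and is therefore not the Q × Q identity.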

5.2 Axis-Symmetric Beamforming

Equation (5.12) presented the array output as a function of the array input and the
beamforming weights in the spherical harmonics domain. Meyer and Elko [2] pro-
posed a useful formulation for the weights wnm . These weights are functions of
two parameters, n and m (or, equivalently, θ and φ), in the two-dimensional space
domain, when taking the inverse spherical Fourier transform of wnm to calculate
w(θ, φ). The approach proposed in [2] was to reduce the beamforming weights to
a one-dimensional function, such that the resulting beam pattern is axis-symmetric,
with the look direction forming the axis of symmetry. The proposal used the following
formulation:
$$
[w_{nm}(k)]^{*} = \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l). \tag{5.22}
$$

The new beamforming weights, dn (k), which may be a function of frequency, are
dependent only on n and can therefore be considered as one-dimensional. A division
by bn (kr ) guarantees that the resulting steering vectors and the beam pattern are not
dependent on the physical behavior of the sound field around the array. For example,
the effect of scattering from an array configured around a rigid sphere is removed
by this division. This is illustrated in the formulation that follows. Finally, (θl , φl )
denotes the array look direction. This will also be evident from the derivation that
follows.

Fig. 5.3 A block diagram of a spherical harmonics domain, axis-symmetric beamforming system

Substituting Eq. (5.22) in Eq. (5.9), and rewriting the equation explicitly using
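summations, leads to
$$
\begin{aligned}
y &= \sum_{n=0}^{N} \sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r) \\
  &= \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l)\, p_{nm}(k,r) \\
  &= \sum_{n=0}^{N} \frac{d_n(k)}{b_n(kr)} \sum_{m=-n}^{n} p_{nm}(k,r)\, Y_n^m(\theta_l,\phi_l).
\end{aligned} \tag{5.23}
$$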

The third line in Eq. (5.23) is presented in a form that is more computationally
efficient (see also the block diagram in Fig. 5.3) exploiting the single dimension of
the beamforming coefficients.
The array beam pattern for the axis-symmetric beamformer can be formulated by
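substituting Eq. (5.16) for $p_{nm}$, leading to
$$
\begin{aligned}
y &= \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l)\, p_{nm}(k,r) \\
  &= \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{d_n(k)}{b_n(kr)}\, Y_n^m(\theta_l,\phi_l)\, b_n(kr) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*} \\
  &= \sum_{n=0}^{N} \sum_{m=-n}^{n} d_n(k) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*} Y_n^m(\theta_l,\phi_l) \\
  &= \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi}\, P_n(\cos\Theta),
\end{aligned} \tag{5.24}
$$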

where the spherical harmonics addition theorem [see Eq. (1.26)] was employed in
the last line of the derivation, with

cos Θ = cos θl cos θk + cos(φl − φk ) sin θl sin θk (5.25)

[see also Eq. (1.27)], where Θ denotes the angle between (θl , φl ) and (θk , φk ).
Equation (5.24) can be written in a matrix form by defining a steering vector vn
and an array weights vector dn :

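$$
\begin{aligned}
y &= \mathbf{d}_n^{T} \mathbf{v}_n \\
\mathbf{v}_n &= \frac{1}{4\pi} \left[ P_0(\cos\Theta),\, 3P_1(\cos\Theta),\, \ldots,\, (2N+1)P_N(\cos\Theta) \right]^{T} \\
\mathbf{d}_n &= \left[ d_0,\, d_1,\, \ldots,\, d_N \right]^{T}.
\end{aligned} \tag{5.26}
$$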

Now, array weights dn control y(Θ), which is the beam pattern of the array, or the
array response to a unit-amplitude plane wave. The output y depends on Θ, the angle
between (θl , φl ) and (θk , φk ). Typically (but not necessarily), y(Θ) peaks at Θ = 0,
which means that plane waves arriving from this direction are subject to the highest
amplification. Hence, this direction is typically considered as the look direction, or
the direction of most interest, already denoted as (θl , φl ). The beam pattern y depends
on Θ, the angle away from (θl , φl ), and so it is axis-symmetric around (θl , φl ). Now,
by changing the value of (θl , φl ), the function y(Θ) itself does not change, but it is
rotated, or steered, such that Θ = 0 coincides with (θl , φl ). Therefore, by changing
the value of (θl , φl ) in Eq. (5.22), the beam pattern is steered to the new direction
(θl , φl ). This shows that steering is achieved in a simple and direct manner in this case,
and that the beam pattern, y(Θ), controlled through dn , is independent of steering,
which is controlled through (θl , φl ), as also illustrated in Fig. 5.3.
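
The decoupling of beam shape and steering can be illustrated with a short sketch of Eqs. (5.24) and (5.25). The weights d_n = 1 with N = 2 and the specific angles are assumptions chosen for the example, and the function names are hypothetical; the Legendre series is evaluated with NumPy's `legval`.

```python
import numpy as np
from numpy.polynomial import legendre

def cos_angle(theta_l, phi_l, theta_k, phi_k):
    """cos(Theta) between the look and arrival directions, Eq. (5.25)."""
    return (np.cos(theta_l) * np.cos(theta_k)
            + np.cos(phi_l - phi_k) * np.sin(theta_l) * np.sin(theta_k))

def beam_pattern(d, cos_Theta):
    """y(Theta) = sum_n d_n (2n+1)/(4 pi) P_n(cos Theta), Eq. (5.24)."""
    d = np.asarray(d)
    n = np.arange(len(d))
    coeffs = d * (2 * n + 1) / (4 * np.pi)   # Legendre-series coefficients
    return legendre.legval(cos_Theta, coeffs)

d = np.ones(3)                               # assumed weights: d_n = 1, N = 2

# Two different look directions, each with an arrival direction 0.5 rad away
# along the same meridian: the pattern value depends only on the offset angle.
y_a = beam_pattern(d, cos_angle(0.3, 1.0, 0.3 + 0.5, 1.0))
y_b = beam_pattern(d, cos_angle(1.2, 2.0, 1.2 + 0.5, 2.0))

y_peak = beam_pattern(d, 1.0)                # Theta = 0, i.e. the look direction
y_omni = beam_pattern([1.0], 0.3)            # N = 0: constant pattern
```

Changing the look direction only changes the argument cos Θ passed to `beam_pattern`, while the weights `d` alone determine the shape of the pattern.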

5.3 Directivity Index

The array output, y, in response to a unit-amplitude plane wave, has already been
presented as defining the directivity, or the beam pattern, of the array. A scalar that
quantifies the array directivity is the directivity index, which provides a measure for
the ratio between the peak and the average values of the squared beam pattern. The
directivity factor, with symbol D F, is defined as [7]
$$
DF = \frac{|y(\theta_l,\phi_l)|^{2}}{\dfrac{1}{4\pi} \displaystyle\int_{0}^{2\pi}\!\int_{0}^{\pi} |y(\theta,\phi)|^{2} \sin\theta \, d\theta \, d\phi}, \tag{5.27}
$$

from which the directivity index, with symbol DI, is computed by $DI = 10 \log_{10}(DF)$.
The directivity index can be interpreted in several ways. First, it can be considered
as the output of the directional array, relative to an omni-directional microphone
with the same root-mean-squared directional gain. It can also be interpreted as the
SNR for a plane-wave signal arriving from the look direction and a noise sound field
which is diffuse (or spherically isotropic). In both cases it quantifies the improvement
in SNR provided by the array due to its directional response.
Substituting Eqs. (5.12) and (5.16) in Eq. (5.27), and applying the orthogonality
property of the spherical harmonics [Eq. (1.23)], the directivity factor can be written
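as a function of the beamforming weights, $w_{nm}$, as
$$
\begin{aligned}
DF &= \frac{\left| \sum_{n=0}^{N} \sum_{m=-n}^{n} w_{nm}^{*}(k)\, v_{nm} \right|^{2}}
{\dfrac{1}{4\pi} \displaystyle\int_{0}^{2\pi}\!\int_{0}^{\pi} \left| \sum_{n=0}^{N} \sum_{m=-n}^{n} w_{nm}^{*}(k)\, b_n(kr) \left[ Y_n^m(\theta,\phi) \right]^{*} \right|^{2} \sin\theta \, d\theta \, d\phi} \\
&= \frac{\left| \sum_{n=0}^{N} \sum_{m=-n}^{n} w_{nm}^{*}(k)\, v_{nm} \right|^{2}}
{\dfrac{1}{4\pi} \sum_{n=0}^{N} \sum_{m=-n}^{n} \left| w_{nm}^{*}(k)\, b_n(kr) \right|^{2}},
\end{aligned} \tag{5.28}
$$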

where vnm in the numerator is given by Eq. (5.16). It is typically assumed that the look
direction employed in the design of wnm equals the wave arrival direction, (θk , φk ).
The directivity factor can be rewritten in a matrix form as a generalized Rayleigh
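quotient:
$$
\begin{aligned}
DF &= \frac{\mathbf{w}_{nm}^{H} \mathbf{A} \mathbf{w}_{nm}}{\mathbf{w}_{nm}^{H} \mathbf{B} \mathbf{w}_{nm}} \\
\mathbf{A} &= \mathbf{v}_{nm} \mathbf{v}_{nm}^{H} \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}\left( |b_0|^{2},\, |b_1|^{2},\, |b_1|^{2},\, |b_1|^{2},\, \ldots,\, |b_N|^{2} \right) \\
\mathbf{v}_{nm} &= \left[ v_{00},\, v_{1(-1)},\, v_{10},\, v_{11},\, \ldots,\, v_{NN} \right]^{T},
\end{aligned} \tag{5.29}
$$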

with $v_{nm} = b_n [Y_n^m(\theta_k,\phi_k)]^{*}$, as in Eq. (5.16), and with matrices A and B of
dimensions $(N+1)^2 \times (N+1)^2$. The explicit dependence of $b_n(kr)$ on kr has been
dropped for notation simplicity.
A similar derivation of the directivity factor can also be obtained for the case of
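an axis-symmetric beam pattern by substituting Eq. (5.22) in Eq. (5.28):
$$
\begin{aligned}
DF &= \frac{\left| \sum_{n=0}^{N} \sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l) \left[ Y_n^m(\theta_k,\phi_k) \right]^{*} \right|^{2}}
{\dfrac{1}{4\pi} \displaystyle\int_{0}^{2\pi}\!\int_{0}^{\pi} \left| \sum_{n=0}^{N} \sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l) \left[ Y_n^m(\theta,\phi) \right]^{*} \right|^{2} \sin\theta \, d\theta \, d\phi} \\
&= \frac{\left| \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi}\, P_n(\cos 0) \right|^{2}}
{\dfrac{1}{4\pi} \sum_{n=0}^{N} |d_n(k)|^{2}\, \frac{2n+1}{4\pi}\, P_n(\cos 0)} \\
&= \frac{\left| \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi} \right|^{2}}
{\dfrac{1}{4\pi} \sum_{n=0}^{N} |d_n(k)|^{2}\, \frac{2n+1}{4\pi}},
\end{aligned} \tag{5.30}
$$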

where it has been assumed that (θl , φl ) = (θk , φk ) in the derivation of the numerator,
i.e. the look direction equals the plane-wave arrival direction. Also, the orthogonality
property of the spherical harmonics, Eq. (1.23), and the spherical harmonics addi-
tion theorem, Eq. (1.26), have been employed in the derivation of the denominator.
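Equation (5.30) can be written in a matrix form, in a similar manner to Eq. (5.29):
$$
\begin{aligned}
DF &= \frac{\mathbf{d}_n^{H} \mathbf{A} \mathbf{d}_n}{\mathbf{d}_n^{H} \mathbf{B} \mathbf{d}_n} \\
\mathbf{A} &= \mathbf{v}_n \mathbf{v}_n^{H} \\
\mathbf{B} &= \frac{1}{4\pi}\, \mathrm{diag}(\mathbf{v}_n) \\
\mathbf{v}_n &= \frac{1}{4\pi} \left[ 1,\, 3,\, 5,\, \ldots,\, 2N+1 \right]^{T},
\end{aligned} \tag{5.31}
$$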

where, in this case, both A and B are (N + 1) × (N + 1) matrices of known constants
and $\mathbf{d}_n = [d_0, d_1, \ldots, d_N]^{T}$.
Figure 5.4 shows two examples of directivity plots, |y(Θ)|, as formulated in
Eq. (5.24), with $d_n = 1$, $n = 0, \ldots, N$. For N = 0, $y(\Theta) = \frac{1}{4\pi}$; this shows a constant
directivity, or an omni-directional beam pattern, with DF = 1. For N = 2,
$y(\Theta) = \sum_{n=0}^{2} \frac{2n+1}{4\pi} P_n(\cos\Theta) = \frac{1}{4\pi} \cdot \frac{3}{2} \left( 5\cos^{2}\Theta + 2\cos\Theta - 1 \right)$ [3], showing a
directional response with a clear maximum at Θ = 0 and with DF = 9.
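
The two directivity factors quoted for Fig. 5.4 can be reproduced with a brief sketch. The first function evaluates the matrix form of Eq. (5.31); the second is an independent cross-check, added here as an assumption-laden illustration, that integrates the squared beam pattern numerically as in Eq. (5.27) using a simple trapezoidal rule on a grid of cos Θ. Function names and the grid size are hypothetical choices.

```python
import numpy as np
from numpy.polynomial import legendre

def directivity_factor(d):
    """DF of an axis-symmetric beamformer from the matrix form, Eq. (5.31)."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    v = (2 * n + 1) / (4 * np.pi)
    A = np.outer(v, v)                     # A = v_n v_n^H (v_n is real)
    B = np.diag(v) / (4 * np.pi)           # B = (1 / 4 pi) diag(v_n)
    return np.real(d.conj() @ A @ d) / np.real(d.conj() @ B @ d)

def directivity_factor_numeric(d, num=20001):
    """Same quantity from Eq. (5.27), integrating |y(Theta)|^2 over the sphere."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    coeffs = d * (2 * n + 1) / (4 * np.pi)
    x = np.linspace(-1.0, 1.0, num)        # x = cos(Theta)
    y2 = np.abs(legendre.legval(x, coeffs)) ** 2
    integral = np.sum(0.5 * (y2[1:] + y2[:-1]) * np.diff(x))  # trapezoidal rule
    mean_power = 0.5 * integral            # (1 / 4 pi) * 2 pi * integral over x
    peak = np.abs(legendre.legval(1.0, coeffs)) ** 2
    return peak / mean_power

df_omni = directivity_factor(np.ones(1))   # N = 0
df_dir = directivity_factor(np.ones(3))    # N = 2
df_dir_numeric = directivity_factor_numeric(np.ones(3))
di_dir = 10 * np.log10(df_dir)             # directivity index in dB
```

For d_n = 1 the matrix form reduces to DF = (N + 1)², which matches the values DF = 1 and DF = 9 stated above.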

5.4 White Noise Gain

Arrays typically operate under non-ideal conditions, which include, for example,
sensor noise and uncertainties in the frequency response and in the position of the
microphones. It is important that the performance of the array, e.g. directivity index,
remains robust to the undesired effect of noise and uncertainties. A common param-
eter employed as a measure for array robustness is the WNG [7]. It is defined as the
improvement in SNR at the array output compared to the array input. The array input
is the signal at the individual microphones, or sensors, and the array output is the
combined signal, after array processing (such as beamforming) is applied.
With the aim of formulating simple expressions for the WNG, the following is
assumed.
(i) The sound field is composed of a single, unit-amplitude plane wave.
(ii) The array is composed of sound pressure microphones in a free field. Other
array configurations are considered later in this section.
(iii) The array beamforming weights are designed with a look direction equal to the
plane-wave arrival direction.
(iv) The noise at the sensors is assumed to be uncorrelated across sensors, or micro-
phones, and to have a variance of σ² with zero mean.

Fig. 5.4 Polar directivity plot, |y(Θ)|, for an axis-symmetric beamformer with dn = 1, n =
0, . . . , N , for an omni-directional directivity with D F = 1 and N = 0 and a directional response
with D F = 9 and N = 2

Under these conditions, the signal magnitude at the array input is unity and the
variance of the noise at the array input is σ². The signal at the array output due to
the plane wave can be computed using Eq. (5.12) as $|y|^{2} = |\mathbf{w}_{nm}^{H} \mathbf{v}_{nm}|^{2}$, where $\mathbf{v}_{nm}$
in this case is the steering vector in the look direction. The variance of the noise at
the array output can be derived from the array equation:
$$
E\left[ |y|^{2} \right] = E\left[ y y^{H} \right] = \mathbf{w}_{nm}^{H}\, E\left[ \mathbf{p}_{nm} \mathbf{p}_{nm}^{H} \right] \mathbf{w}_{nm}. \tag{5.32}
$$

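Using the general form of the discrete spherical Fourier transform, $\mathbf{p}_{nm} = \mathbf{S}\mathbf{p}$, as in
Eq. (3.40), and assuming that the signal at the individual microphones, p, includes
the sensor noise component, which is uncorrelated between sensors, the array output
reduces to
$$
\begin{aligned}
E\left[ |y|^{2} \right] &= \mathbf{w}_{nm}^{H} \mathbf{S}\, E\left[ \mathbf{p} \mathbf{p}^{H} \right] \mathbf{S}^{H} \mathbf{w}_{nm} \\
&= \mathbf{w}_{nm}^{H} \mathbf{S}\, \sigma^{2} \mathbf{I}\, \mathbf{S}^{H} \mathbf{w}_{nm} \\
&= \sigma^{2}\, \mathbf{w}_{nm}^{H} \left( \mathbf{S} \mathbf{S}^{H} \right) \mathbf{w}_{nm}.
\end{aligned} \tag{5.33}
$$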

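Now, the WNG, computed as the ratio of the SNR at the array output and the SNR
at the array input, is given by
$$
\mathrm{WNG} = \frac{\dfrac{|\mathbf{w}_{nm}^{H} \mathbf{v}_{nm}|^{2}}{\sigma^{2}\, \mathbf{w}_{nm}^{H} \left[ \mathbf{S} \mathbf{S}^{H} \right] \mathbf{w}_{nm}}}{1/\sigma^{2}}
= \frac{|\mathbf{w}_{nm}^{H} \mathbf{v}_{nm}|^{2}}{\mathbf{w}_{nm}^{H} \left( \mathbf{S} \mathbf{S}^{H} \right) \mathbf{w}_{nm}}. \tag{5.34}
$$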

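Reformulated, this equation takes a generalized Rayleigh quotient form:
$$
\begin{aligned}
\mathrm{WNG} &= \frac{\mathbf{w}_{nm}^{H} \mathbf{A} \mathbf{w}_{nm}}{\mathbf{w}_{nm}^{H} \mathbf{B} \mathbf{w}_{nm}} \\
\mathbf{A} &= \mathbf{v}_{nm} \mathbf{v}_{nm}^{H} \\
\mathbf{B} &= \mathbf{S} \mathbf{S}^{H}.
\end{aligned} \tag{5.35}
$$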

If the beamforming weights are normalized such that $|\mathbf{w}_{nm}^{H} \mathbf{v}_{nm}|^{2} = 1$, the numerator
in Eq. (5.35) reduces to unity.
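In the particular case of uniform and nearly-uniform sampling, using the orthogonality
property of matrix Y, as stated in Eq. (3.39), the expression for $\mathbf{S}\mathbf{S}^{H}$ becomes
$$
\mathbf{S} \mathbf{S}^{H} = \frac{4\pi}{Q}\, \mathbf{Y}^{H} \mathbf{Y}\, \frac{4\pi}{Q} = \frac{4\pi}{Q}\, \mathbf{I}, \tag{5.36}
$$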

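and the WNG simplifies to the form of a Rayleigh quotient:
$$
\begin{aligned}
\mathrm{WNG} &= \frac{\mathbf{w}_{nm}^{H} \mathbf{A} \mathbf{w}_{nm}}{\dfrac{4\pi}{Q}\, \mathbf{w}_{nm}^{H} \mathbf{w}_{nm}} \\
\mathbf{A} &= \mathbf{v}_{nm} \mathbf{v}_{nm}^{H}.
\end{aligned} \tag{5.37}
$$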

An expression for the WNG in the case of axis-symmetric beamforming can be
derived by substituting Eq. (5.22) for $w_{nm}$, rewritten here, omitting the dependence on
k and r, $w_{nm}^{*} = \frac{d_n}{b_n} Y_n^m(\theta_l,\phi_l)$. In addition, the expression for $v_{nm}$ given in Eq. (5.16),
$v_{nm} = b_n [Y_n^m(\theta_l,\phi_l)]^{*}$, in which the plane-wave arrival direction is assumed to equal
the look direction, is substituted in Eq. (5.37). Now, using the spherical harmonics
addition theorem, Eq. (1.26), the WNG is rewritten here using summations in the
spherical harmonics domain, as derived in [5]:
$$
\begin{aligned}
\mathrm{WNG} &= \frac{\left| \sum_{n=0}^{N} \sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l) \left[ Y_n^m(\theta_l,\phi_l) \right]^{*} \right|^{2}}
{\dfrac{4\pi}{Q} \sum_{n=0}^{N} \sum_{m=-n}^{n} \left| \left[ d_n(k)/b_n(kr) \right] Y_n^m(\theta_l,\phi_l) \right|^{2}} \\
&= \frac{\left| \sum_{n=0}^{N} d_n(k)\, \frac{2n+1}{4\pi} \right|^{2}}
{\dfrac{4\pi}{Q} \sum_{n=0}^{N} \left| d_n(k)/b_n(kr) \right|^{2} \frac{2n+1}{4\pi}}.
\end{aligned} \tag{5.38}
$$

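This can be written in a matrix form:
$$
\begin{aligned}
\mathrm{WNG} &= \frac{\mathbf{d}_n^{H} \mathbf{A} \mathbf{d}_n}{\mathbf{d}_n^{H} \mathbf{B} \mathbf{d}_n} \\
\mathbf{A} &= \mathbf{v}_n \mathbf{v}_n^{H} \\
\mathbf{B} &= \frac{4\pi}{Q}\, \mathrm{diag}(\mathbf{v}_n) \times \mathrm{diag}\left( |b_0|^{-2},\, |b_1|^{-2},\, \ldots,\, |b_N|^{-2} \right) \\
\mathbf{v}_n &= \frac{1}{4\pi} \left[ 1,\, 3,\, 5,\, \ldots,\, 2N+1 \right]^{T}.
\end{aligned} \tag{5.39}
$$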

The derivations of expressions for the WNG presented above assumed sensors
in free field. This is convenient, because the SNR at the array input is the same for
all sensors, so that any sensor can be selected as representing the array input. This
is not the case for other array configurations. For example, the SNR at the array
input for an array configured around a rigid sphere may differ between sensors. Due
to the shadowing effect of the sphere, the SNR at the array input will degrade for
sensors located at angles on the sphere further away from the plane-wave arrival
direction. In this case, the definition of the WNG may need readjustment to take
into consideration the contributions from all sensors. It has been shown [4] that the
variation due to scattering of sound from the rigid sphere is smaller than 3 dB. In this
book this difference is ignored in favor of using the same WNG formulation across
all array configurations, even though this formulation strictly holds only for the free
field configuration.
The WNG for an axis-symmetric beamformer is presented next. Consider an
array with Q = 9 microphones arranged uniformly on the surface of an open sphere,
providing spherical harmonics analysis to order N = 2. Employing the same example
as in Sect. 5.3, beamforming coefficients are chosen with dn = 1, and Eq. (5.39) is
used to compute the WNG for kr = 0 to N . Figure 5.5 shows the WNG as a function
of kr , first for a single microphone and then for an array with Q = 9 microphones.
The WNG for the single microphone is unity, as expected, because in this case the
array input is the same as the array output. The WNG for the array of order N = 2 and
for large values of kr is larger than unity, meaning that the SNR at the array output
has improved, compared to the SNR at the array input. However, for low values of kr
the WNG is less than one, meaning that the SNR is degraded, which is an undesirable
property in array processing. For further discussion of WNG, including addressing
the factors affecting the WNG and ways to design arrays that maximize WNG, see
Chap. 6.
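
The array example above (Q = 9, N = 2, d_n = 1, open sphere) can be sketched by evaluating the summation form of the WNG, Eq. (5.38), over a range of kr. The function names are hypothetical, and the sweep is started at kr = 0.05 rather than kr = 0 as an assumption of this sketch, because j_n(0) = 0 for n ≥ 1 and the division by b_n would otherwise fail.

```python
import numpy as np
from scipy.special import spherical_jn

def bn_open(N, kr):
    """Open-sphere b_n(kr) = 4 pi i^n j_n(kr) for n = 0, ..., N."""
    n = np.arange(N + 1)
    return 4 * np.pi * (1j ** n) * spherical_jn(n, kr)

def wng_axis_symmetric(d, b, Q):
    """WNG of an axis-symmetric beamformer, Eq. (5.38), for uniform sampling."""
    d = np.asarray(d, dtype=complex)
    b = np.asarray(b, dtype=complex)
    n = np.arange(len(d))
    v = (2 * n + 1) / (4 * np.pi)
    numerator = np.abs(np.sum(d * v)) ** 2
    denominator = (4 * np.pi / Q) * np.sum(np.abs(d / b) ** 2 * v)
    return numerator / denominator

N, Q = 2, 9
kr_values = np.linspace(0.05, 2.0, 40)       # assumed sweep, avoiding kr = 0
wng = np.array([wng_axis_symmetric(np.ones(N + 1), bn_open(N, kr), Q)
                for kr in kr_values])
wng_db = 10 * np.log10(wng)                  # same curve on a dB scale
```

In this sketch the WNG is far below unity at the low end of the sweep and exceeds unity near kr = 2, in line with the behavior described for the array in the text.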

5.5 Simple Axis-Symmetric Beamformers

Examples of simple beamformers are presented in this section. The first beamformer
is the delay-and-sum beamformer, which is widely used due to its simple realization,
i.e. the beamforming weights are composed of delays. The delays are selected such
that the phase of a plane wave arriving from the array look direction is matched at all
sensors, providing maximum output at the look direction [7]. Furthermore, the delay-
and-sum beamformer also offers maximum WNG and therefore maximum robustness
to noise and uncertainties. This is discussed further in the next chapter. Note that the
delay-and-sum approach will work only if the plane waves propagate in free field,
so that the delay-and-sum beamformer is applicable to open array configurations.
However, its realization is also possible for other configurations, as detailed in this
section.

Fig. 5.5 WNG for an axis-symmetric beamformer with dn = 1 , n = 0, . . . , N , for a single micro-
phone with Q = 1 and N = 0 and an array with Q = 9 and N = 2

The beamforming integral equation, as presented in Eq. (5.1), is now employed
with the aim of developing an analytical formulation for the delay-and-sum beam-
former. The sound pressure on a sphere of radius r due to a single unit-amplitude
plane wave arriving from direction k̃ can be expressed as $e^{i \tilde{\mathbf{k}} \cdot \mathbf{r}}$. Phase alignment for
waves with arrival direction, (θk , φk ), equal to the array look direction, (θl , φl ), is
therefore achieved when selecting the beamforming weighting function to be
$$
[w(k,\theta,\phi)]^{*} = e^{-i \tilde{\mathbf{k}}_l \cdot \mathbf{r}}, \tag{5.40}
$$
with k̃l = (k, θl , φl ) and r = (r, θ, φ) representing the array spherical surface. Using
Eq. (2.37), the coefficients of the beamforming weights can be written in the spherical
harmonics domain as
$$
w_{nm}(k) = b_n(kr) \left[ Y_n^m(\theta_l,\phi_l) \right]^{*}, \tag{5.41}
$$

with $b_n$ representing an open-sphere array configuration, as in Eq. (4.4). Now, with
the formulation of axis-symmetric beamforming, as in Eq. (5.22), the axis-symmetric
beamforming weights, $d_n$, for the delay-and-sum beamformer are given by [5]
$$
d_n(k) = |b_n(kr)|^{2}. \tag{5.42}
$$

Although popular, the delay-and-sum beamformer is typically restricted to arrays
composed of sensors in free field, due to the assumption that the magnitude of
the incoming wave is the same at all sensors, so that only phase compensation is
required. In the case of a spherical array formulated in the spherical harmonics
domain, the delay-and-sum beamformer, with its highly desired robustness property,
can be extended to other array configurations. Substituting Eq. (5.41) for the array
weights and Eq. (4.3) for the measured sound pressure in the array equation, Eq. (5.8),
the array output can be written for this case as
$$
\begin{aligned}
y &= \sum_{n=0}^{N} \sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r)
   = \sum_{n=0}^{N} \sum_{m=-n}^{n} d_n(k)\, Y_n^m(\theta_l,\phi_l)\, \frac{p_{nm}(k,r)}{b_n(kr)} \\
  &= \sum_{n=0}^{N} \sum_{m=-n}^{n} |b_n(kr)|^{2}\, a_{nm}(k)\, Y_n^m(\theta_l,\phi_l).
\end{aligned} \tag{5.43}
$$

Now, anm (k) can be computed from the sound field measured by the various array
configurations presented in Chap. 4, using the appropriate function bn (kr ) for the
actual configuration; the terms bn (kr ) that replace the array weights are those repre-
senting an open sphere, regardless of the actual configuration. This is an illustration
of the flexibility of array design and processing in the spherical harmonics domain.
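
A short sketch can make the robustness of the delay-and-sum weights concrete. It evaluates Eq. (5.42) for an open sphere and inserts the result into the WNG expression of Eq. (5.38), alongside unit weights d_n = 1 for comparison. The values N = 6, Q = 49 and kr = 1 are assumptions for illustration only, and the helper names are hypothetical. Under these assumptions, substituting d_n = |b_n|² into Eq. (5.38) gives Q Σ (2n+1) j_n²(kr), which approaches Q as N grows.

```python
import numpy as np
from scipy.special import spherical_jn

def bn_open(N, kr):
    """Open-sphere b_n(kr) = 4 pi i^n j_n(kr) for n = 0, ..., N."""
    n = np.arange(N + 1)
    return 4 * np.pi * (1j ** n) * spherical_jn(n, kr)

def wng_axis_symmetric(d, b, Q):
    """WNG of an axis-symmetric beamformer, Eq. (5.38), for uniform sampling."""
    d = np.asarray(d, dtype=complex)
    n = np.arange(len(d))
    v = (2 * n + 1) / (4 * np.pi)
    numerator = np.abs(np.sum(d * v)) ** 2
    denominator = (4 * np.pi / Q) * np.sum(np.abs(d / b) ** 2 * v)
    return numerator / denominator

N, Q, kr = 6, 49, 1.0                 # assumed values, with Q >= (N + 1)^2
b = bn_open(N, kr)

d_das = np.abs(b) ** 2                # delay-and-sum weights, Eq. (5.42)
d_unit = np.ones(N + 1)               # unit weights, for comparison

wng_das = wng_axis_symmetric(d_das, b, Q)
wng_unit = wng_axis_symmetric(d_unit, b, Q)
```

For these values `wng_das` is essentially equal to Q, while `wng_unit` is many orders of magnitude smaller because of the division by the very small high-order b_n at kr = 1.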
Another widely used beamformer is characterized by beamforming weights of
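unit value, i.e. $d_n = 1$. Equation (5.43) can be rewritten for this case, by substituting
$d_n = 1$, as
$$
\begin{aligned}
y &= \sum_{n=0}^{N} \sum_{m=-n}^{n} [w_{nm}(k)]^{*}\, p_{nm}(k,r)
   = \sum_{n=0}^{N} \sum_{m=-n}^{n} a_{nm}(k)\, Y_n^m(\theta_l,\phi_l) \\
  &\approx a(k,\theta_l,\phi_l),
\end{aligned} \tag{5.44}
$$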

with the approximation becoming equality as N → ∞. This result suggests that
the array output, y, as a function of look direction, approximates the plane-wave
amplitude density function. In other words, the sound field measured by the array can
now be represented using plane-wave components. For this reason, the beamformer
is termed the “plane-wave decomposition” beamformer [3]. Another name for this
beamformer is a “regular” beamformer (see [1]). In the next chapter, it is shown that
an array with a regular beam pattern achieves the maximum directivity index.

5.6 Beamforming Example

A beamforming example is presented in this section with the aim of illustrating the
way in which sound field composition, sampling, beamforming, and analysis are
formulated and realized using computer simulations. The example is broken down
into stages for clarity.
(i) Sound pressure in free field. Consider a sound field, composed of S harmonic
plane waves with wave number k, arrival directions denoted by (θs , φs ), s =
1, . . . , S, and amplitudes at the origin of the coordinate system, as (k), s =
1, . . . , S. Using Eqs. (2.40) and (2.41), the sound pressure at (r, θ, φ) can be
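written as
$$
\begin{aligned}
p(k,r,\theta,\phi) &= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} p_{nm}(k,r)\, Y_n^m(\theta,\phi) \\
&= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \sum_{s=1}^{S} 4\pi i^{n} j_n(kr)\, a_s(k) \left[ Y_n^m(\theta_s,\phi_s) \right]^{*} Y_n^m(\theta,\phi).
\end{aligned} \tag{5.45}
$$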

This equation is exact. However, when the aim is to generate this sound field
using a computer simulation, an approximation must be applied by constraining
the summation to be finite.
(ii) Finite-order sound field. The finite-order sound field is computed by replacing
the upper summation limit over n with Ñ. The approximation error can still be
small if kr ≪ Ñ (see Sect. 2.3), with r denoting the distance from the origin.
The sound field generated in practice is therefore given by
$$
p(k,r,\theta,\phi) = \sum_{n=0}^{\tilde{N}} \sum_{m=-n}^{n} \sum_{s=1}^{S} 4\pi i^{n} j_n(kr)\, a_s(k) \left[ Y_n^m(\theta_s,\phi_s) \right]^{*} Y_n^m(\theta,\phi). \tag{5.46}
$$

(iii) Sampling by microphones. In the next stage of this simulation example, a
spherical microphone array is introduced into the sound field, centered at the origin.
It is assumed that the array is composed of a rigid sphere of radius $r_a$ with
$Q$ microphones arranged on its surface, following a t-design configuration
(see Sect. 3.4), which allows for aliasing-free sampling up to order $N$.
Equation (5.46) can now be used directly to represent the pressure at the
microphone positions, $(r_a, \theta_q, \phi_q)$, $q = 1, \ldots, Q$. Note that, in this case,
the term $4\pi i^n j_n(kr)$ is replaced by $b_n(kr)$, with $r = r_a$, to represent a
rigid-sphere configuration, as in Eq. (2.62):
\[
\begin{aligned}
p(k, r_a, \theta_q, \phi_q) &= \sum_{n=0}^{\tilde{N}} \sum_{m=-n}^{n} p_{nm}(k r_a)\, Y_n^m(\theta_q, \phi_q) \\
&= \sum_{n=0}^{\tilde{N}} \sum_{m=-n}^{n} \sum_{s=1}^{S} b_n(k r_a)\, a_s(k)
   \left[ Y_n^m(\theta_s, \phi_s) \right]^* Y_n^m(\theta_q, \phi_q),
   \quad q = 1, \ldots, Q.
\end{aligned}
\tag{5.47}
\]

(iv) Spherical Fourier transform. In the next stage, the spherical Fourier transform
of the sound pressure at the sphere surface, $p_{nm}$, is computed directly from the
pressure measurements at the microphones, $p(k, r_a, \theta_q, \phi_q)$, using the spherical
Fourier transform for the nearly-uniform sampling scheme [see Eq. (3.24)], as
\[
p_{nm}(k, r_a) = \frac{4\pi}{Q} \sum_{q=1}^{Q} p(k, r_a, \theta_q, \phi_q)
   \left[ Y_n^m(\theta_q, \phi_q) \right]^*, \quad n \le N.
\tag{5.48}
\]

(v) Alternative computation of the spherical Fourier transform. If the effect of
finite-order sound field approximation and spatial aliasing introduced in stages
(ii)-(iv) needs to be avoided, then the spherical harmonic coefficients can be
simply deduced from Eq. (5.47) to be
\[
p_{nm}(k, r_a) = b_n(k r_a) \sum_{s=1}^{S} a_s(k) \left[ Y_n^m(\theta_s, \phi_s) \right]^*, \quad n \le N.
\tag{5.49}
\]

(vi) Beamforming. Having calculated $p_{nm}$, beamforming such as plane-wave
decomposition can be computed using Eq. (5.44), with
$[w_{nm}(k)]^* = Y_n^m(\theta_l, \phi_l) / b_n(k r_a)$:
\[
\begin{aligned}
y(\theta_l, \phi_l) &= \sum_{n=0}^{N} \sum_{m=-n}^{n} [w_{nm}(k)]^*\, p_{nm}(k, r_a) \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{p_{nm}(k, r_a)}{b_n(k r_a)}\, Y_n^m(\theta_l, \phi_l).
\end{aligned}
\tag{5.50}
\]

It is important to note that the angles (θl , φl ) can be selected at any desired
density over the sphere and are not related to the original sampling set (θq , φq ).
In particular, when plotting y(θl , φl ) over the sphere, a high sampling density
may be desired.
As a numerical example, consider a sound field composed of $S = 3$ harmonic
plane waves with amplitudes $1.0$, $0.7e^{i\pi/3}$ and $0.4e^{i\pi/2}$ and arrival directions
$(90^\circ, 45^\circ)$, $(117^\circ, 90^\circ)$ and $(45^\circ, 270^\circ)$, respectively, at wave number $k$
and radius $r$ satisfying $kr = kr_a = 6$. The pressure on the surface of the rigid-sphere
array is measured by $Q = 84$ microphones, allowing aliasing-free sampling up to order
$N = 6$. The sound pressure at the microphones is computed as in stage (iii) with
$\tilde{N} = 10$. Then, $p_{nm}$ is computed as in stage (iv) and beamforming is applied as in
stage (vi) to produce the plane-wave decomposition, $y(\theta_l, \phi_l)$.
Figure 5.6 shows the normalized magnitude of $y(\theta_l, \phi_l)$ in this case. An
equal-angle grid of 60 × 60 points was used to generate $(\theta_l, \phi_l)$. The figure shows
three peaks, corresponding to the actual arrival directions of the plane waves, marked as
“+” on the figure. Note that with plane-wave decomposition, due to the finite spherical
harmonics order of the beamforming, each plane wave contributes a sinc-like
function to $y$ (see Fig. 1.12), so that $y$ is composed of the weighted summation of
these functions. This may explain effects such as peaks at directions other than the
wave arrival directions, wide peaks around the arrival directions and peaks not
corresponding exactly to arrival directions. Methods to reduce these effects by controlling
the beam pattern of the array are presented in the next chapter.

Fig. 5.6 Normalized magnitude of $y(\theta_l, \phi_l)$, the plane-wave decomposition, computed using
Eq. (5.50) with $p_{nm}$ computed from Eq. (5.48), for $kr = kr_a = 6$. The arrival directions of the
three plane waves are marked with white “+”
Figure 5.7 shows the normalized magnitude of y(θl , φl ); this time with pnm com-
puted directly from Eq. (5.49), therefore avoiding errors due to finite-order and spatial
aliasing. As Figs. 5.6 and 5.7 are relatively similar, it is clear that, in this case, the lim-
ited order and spatial sampling do not produce significant errors. This is reasonable,
because with Ñ = 10, N = 6 and kr = 6, both errors are expected to be small.
In contrast, errors cannot be expected to be small in the next example, where
the computation of y(θl , φl ) is repeated, as in Fig. 5.6, but this time for kr = 10.
Figure 5.8 shows a larger number of peaks away from the plane-wave arrival direc-
tions. These peaks are mostly due to aliasing errors, with the higher orders aliased
to the lower orders, n = 0, . . . , 6.
Fig. 5.7 Normalized magnitude of $y(\theta_l, \phi_l)$, the plane-wave decomposition, computed using
Eq. (5.50) with $p_{nm}$ computed from Eq. (5.49), for $kr = kr_a = 6$. The arrival directions of the
three plane waves are marked with white “+”

Fig. 5.8 Normalized magnitude of $y(\theta_l, \phi_l)$, the plane-wave decomposition, computed using
Eq. (5.50) with $p_{nm}$ computed from Eq. (5.48), for $kr = kr_a = 10$. The arrival directions of the
three plane waves are marked with white “+”

The formulations in this simulation example are presented here in a matrix form,
because this is the form most likely to be employed in practice using computer
programming. First, the pressure at the microphones, Eq. (5.47), is rewritten as
\[
\begin{aligned}
\mathbf{p} &= \tilde{\mathbf{Y}}_q \tilde{\mathbf{B}} \tilde{\mathbf{Y}}_s^H \mathbf{a}_s \\
\mathbf{a}_s &= [a_1(k), a_2(k), \ldots, a_S(k)]^T \\
\tilde{\mathbf{Y}}_s &=
\begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_1, \phi_1) \\
\vdots & \ddots & \vdots \\
Y_0^0(\theta_S, \phi_S) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_S, \phi_S)
\end{bmatrix} \\
\tilde{\mathbf{B}} &= \mathrm{diag}\left( b_0, b_1, b_1, b_1, \ldots, b_{\tilde{N}} \right) \\
\tilde{\mathbf{Y}}_q &=
\begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_1, \phi_1) \\
\vdots & \ddots & \vdots \\
Y_0^0(\theta_Q, \phi_Q) & \cdots & Y_{\tilde{N}}^{\tilde{N}}(\theta_Q, \phi_Q)
\end{bmatrix} \\
\mathbf{p} &= \left[ p(k, r_a, \theta_1, \phi_1), \ldots, p(k, r_a, \theta_Q, \phi_Q) \right]^T,
\end{aligned}
\tag{5.51}
\]

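where the $S \times 1$ vector $\mathbf{a}_s$ holds the plane waves’ amplitudes, the $Q \times 1$ vector
$\mathbf{p}$ holds the sound pressure amplitude at the microphones, the
$(\tilde{N} + 1)^2 \times (\tilde{N} + 1)^2$ diagonal matrix $\tilde{\mathbf{B}}$ holds the values of $b_n(kr)$ for a
rigid sphere with $r = r_a$, the $S \times (\tilde{N} + 1)^2$ matrix $\tilde{\mathbf{Y}}_s$ holds the spherical
harmonics with the plane wave arrival directions and, similarly, the $Q \times (\tilde{N} + 1)^2$
matrix $\tilde{\mathbf{Y}}_q$ holds the spherical harmonics with the microphone positions’ directions.
In the next stage, the spherical harmonic coefficients of the sound pressure on the
sphere, $p_{nm}$, are computed, as in Eq. (5.48):
\[
\begin{aligned}
\mathbf{p}_{nm} &= \frac{4\pi}{Q} \mathbf{Y}_q^H \mathbf{p} \\
\mathbf{p}_{nm} &= \left[ p_{00}, p_{1(-1)}, p_{10}, p_{11}, \ldots, p_{NN} \right]^T,
\end{aligned}
\tag{5.52}
\]
where the $(N + 1)^2 \times 1$ vector $\mathbf{p}_{nm}$ holds coefficients $p_{nm}$, and $\mathbf{Y}_q$ is similar to
$\tilde{\mathbf{Y}}_q$ but with $\tilde{N}$ replaced by $N$. In the final stage, plane-wave decomposition is
computed, as in Eq. (5.50):
\[
\begin{aligned}
\mathbf{y} &= \mathbf{Y}_l \mathbf{B}^{-1} \mathbf{p}_{nm} \\
\mathbf{Y}_l &=
\begin{bmatrix}
Y_0^0(\theta_1, \phi_1) & \cdots & Y_N^N(\theta_1, \phi_1) \\
\vdots & \ddots & \vdots \\
Y_0^0(\theta_L, \phi_L) & \cdots & Y_N^N(\theta_L, \phi_L)
\end{bmatrix},
\end{aligned}
\tag{5.53}
\]
where the $L \times (N + 1)^2$ matrix $\mathbf{Y}_l$ holds the spherical harmonics with the
plane-wave decomposition look directions, and $\mathbf{B}$ is similar to $\tilde{\mathbf{B}}$ but with $\tilde{N}$
replaced by $N$.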
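
The matrix formulation of Eqs. (5.51)-(5.53) may be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the text: the function names are hypothetical, the spherical harmonics follow the common orthonormal convention with the Condon-Shortley phase, b_n on the rigid sphere is evaluated with the spherical Hankel function of the first kind (the other kind yields the same magnitude), and a Fibonacci spiral stands in for the t-design, so the quadrature in Eq. (5.52) is only approximate here.

```python
import numpy as np
from math import factorial, pi, sqrt

def sph_harm_matrix(N, theta, phi):
    """Matrix of Y_n^m(theta, phi): one row per direction, (N+1)^2 columns
    ordered (0,0), (1,-1), (1,0), (1,1), ..., (N,N); theta is the polar angle."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    phi = np.atleast_1d(np.asarray(phi, dtype=float))
    x, s = np.cos(theta), np.sin(theta)
    Y = np.zeros((theta.size, (N + 1) ** 2), dtype=complex)
    pmm = np.ones_like(x)                                # P_0^0
    for m in range(N + 1):
        if m > 0:
            pmm = -(2 * m - 1) * s * pmm                 # P_m^m (Condon-Shortley phase)
        p_prev, p_curr = np.zeros_like(x), pmm
        for n in range(m, N + 1):
            if n > m:                                    # upward recurrence in n
                p_prev, p_curr = p_curr, ((2 * n - 1) * x * p_curr
                                          - (n + m - 1) * p_prev) / (n - m)
            norm = sqrt((2 * n + 1) / (4 * pi) * factorial(n - m) / factorial(n + m))
            Ynm = norm * p_curr * np.exp(1j * m * phi)
            Y[:, n * n + n + m] = Ynm
            Y[:, n * n + n - m] = (-1) ** m * np.conj(Ynm)
    return Y

def rigid_sphere_bn(n_max, x):
    """b_n(x), n = 0..n_max, on the surface of a rigid sphere (x = k r_a), using the
    Wronskian form b_n = 4 pi i^n * i / (x^2 h_n'(x)); upward recurrences are
    adequate only for moderate orders and x of order one or larger."""
    j = [np.sin(x) / x, np.sin(x) / x**2 - np.cos(x) / x]
    y = [-np.cos(x) / x, -np.cos(x) / x**2 - np.sin(x) / x]
    for n in range(1, n_max + 1):
        j.append((2 * n + 1) / x * j[n] - j[n - 1])
        y.append((2 * n + 1) / x * y[n] - y[n - 1])
    h = np.array(j) + 1j * np.array(y)                   # h_0 .. h_{n_max+1}
    n = np.arange(n_max + 1)
    dh = n / x * h[:-1] - h[1:]                          # h_n' = (n/x) h_n - h_{n+1}
    return 4 * pi * 1j ** n * 1j / (x**2 * dh)

def per_mode(bn):
    """Repeat each b_n over its 2n+1 values of m, matching the column order of Y."""
    return np.repeat(bn, 2 * np.arange(bn.size) + 1)

def fibonacci_directions(Q):
    """Nearly-uniform directions (illustrative stand-in for a t-design)."""
    q = np.arange(Q)
    return np.arccos(1 - (2 * q + 1) / Q), (pi * (1 + sqrt(5)) * q) % (2 * pi)

N_sim, N, kr, Q = 10, 6, 6.0, 84
a_s = np.array([1.0, 0.7 * np.exp(1j * pi / 3), 0.4 * np.exp(1j * pi / 2)])
th_s, ph_s = np.radians([90.0, 117.0, 45.0]), np.radians([45.0, 90.0, 270.0])
th_q, ph_q = fibonacci_directions(Q)
bn = rigid_sphere_bn(N_sim, kr)
b_sim, b = per_mode(bn), per_mode(bn[:N + 1])

# Eq. (5.51): pressure at the microphones, simulated up to order N_sim
p = sph_harm_matrix(N_sim, th_q, ph_q) @ (
    b_sim * (sph_harm_matrix(N_sim, th_s, ph_s).conj().T @ a_s))
# Eq. (5.52): coefficients from the samples (approximate for this sampling set)
p_nm = (4 * pi / Q) * sph_harm_matrix(N, th_q, ph_q).conj().T @ p
# Eq. (5.49): coefficients without truncation or aliasing errors
p_nm_exact = b * (sph_harm_matrix(N, th_s, ph_s).conj().T @ a_s)

# Eq. (5.53): plane-wave decomposition on a 60 x 60 equal-angle grid
th_l, ph_l = np.meshgrid(np.radians(np.arange(1.5, 180, 3)),
                         np.radians(np.arange(0, 360, 6)), indexing="ij")
Y_l = sph_harm_matrix(N, th_l.ravel(), ph_l.ravel())
y = Y_l @ (p_nm / b)
y_exact = Y_l @ (p_nm_exact / b)
```

Reshaping `abs(y)` to 60 × 60 and normalizing by its maximum gives a map comparable in kind to Figs. 5.6 and 5.7, although the values will differ from the figures because the sampling set is not the one used in the text.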

5.7 Steering Non Axis-Symmetric Beam Patterns

Although the axis-symmetric beamformer presented in Sect. 5.2 offers simplicity in
design, due to the one-dimensional formulation, in some cases we may be interested
in beam patterns that are not axis-symmetric about the look direction. A situation may
arise where sound sources of interest occupy a wide region in directional space, such
as a stage in an auditorium or a few speakers positioned in proximity. In this case,
the main lobe should be wide over the azimuth and narrow over the elevation, and
so beam patterns that are axis-symmetric about the look direction may not offer the
most suitable solutions. We may then want to revert to the general,
two-dimensional beamformer formulation, as in Eq. (5.8). It may be convenient to present
the beamforming coefficients $w_{nm}$ as a function of $b_n$ and modified beamforming
coefficients $c_{nm}$, as follows:
\[
[w_{nm}(k)]^* = \frac{c_{nm}(k)}{b_n(kr)}.
\tag{5.54}
\]

The array beam pattern, defined as the array output in response to a unit-amplitude
plane wave, can be formulated by substituting Eqs. (5.54) and (5.16) into Eq. (5.8):
\[
y = \sum_{n=0}^{N} \sum_{m=-n}^{n} c_{nm}(k) \left[ Y_n^m(\theta_k, \phi_k) \right]^*.
\tag{5.55}
\]
Beam pattern $y$ and coefficients $c_{nm}(k)$ are therefore related through the spherical
Fourier transform and complex-conjugate operations, i.e. $[y(\theta_k, \phi_k)]^*$ is the spherical
Fourier transform of $c_{nm}$. This provides a simple framework for the calculation of
$c_{nm}$ once a desired beam pattern is available. However, the steering of such a beam
pattern may not be as simple as in the case of the axis-symmetric beam pattern.
Recall that in the case of the axis-symmetric beam pattern, steering was achieved by
substituting a desired look direction, $(\theta_l, \phi_l)$, in Eq. (5.22), without any modification
to the beamforming coefficients, $d_n$. In the case of non axis-symmetric beamforming,
Eq. (5.55), steering will directly change the coefficients, $c_{nm}$. However, steering the
beam pattern is equivalent to rotating function $y(\theta_k, \phi_k)$, and so the rotation operation
of functions on the sphere, as presented in Sect. 1.6, is employed [6].
Let us denote by $y^r(\theta_k, \phi_k) \equiv \Lambda(\alpha, \beta, \gamma)\, y(\theta_k, \phi_k)$ the function on the
sphere, $y$, rotated by Euler angles $(\alpha, \beta, \gamma)$ (see Sect. 1.6 for more details on rotation
using Euler angles). In the case of beamforming, the rotation will steer the beam pattern to
the desired orientation. It is important to note that in the case of a non axis-symmetric
beam pattern, in addition to conventional steering, which is the change in the look
direction, another degree-of-freedom is available; this can be interpreted as a rotation
of the beam pattern itself about the look direction. Such a rotation will only change
the beam pattern if it is non axis-symmetric about the look direction. This explains
the need for three angles, $(\alpha, \beta, \gamma)$, when performing steering of non axis-symmetric
beam patterns.

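Steering is now formulated based on Eq. (1.72) and Sect. 1.6, where a rotation of
a function on the sphere is decomposed into a set of rotations of spherical harmonics,
which, in turn, are formulated using multiplication with the Wigner-D functions [6]:
\[
\begin{aligned}
y^r(\theta_k, \phi_k) &= \Lambda(\alpha, \beta, \gamma)\, y(\theta_k, \phi_k) \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} c_{nm}(k)\, \Lambda(\alpha, \beta, \gamma) \left[ Y_n^m(\theta_k, \phi_k) \right]^* \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} c_{nm}(k) \sum_{m'=-n}^{n}
   \left[ D^n_{m'm}(\alpha, \beta, \gamma) \right]^* \left[ Y_n^{m'}(\theta_k, \phi_k) \right]^* \\
&= \sum_{n=0}^{N} \sum_{m'=-n}^{n} \left( \sum_{m=-n}^{n} c_{nm}(k)
   \left[ D^n_{m'm}(\alpha, \beta, \gamma) \right]^* \right) \left[ Y_n^{m'}(\theta_k, \phi_k) \right]^* \\
&= \sum_{n=0}^{N} \sum_{m'=-n}^{n} c^r_{nm'}(k) \left[ Y_n^{m'}(\theta_k, \phi_k) \right]^*.
\end{aligned}
\tag{5.56}
\]
The rotated beam pattern $y^r$ is generated by a new set of beamforming coefficients,
$c^r_{nm'}$, related to the original coefficients using
\[
c^r_{nm'}(k) = \sum_{m=-n}^{n} c_{nm}(k) \left[ D^n_{m'm}(\alpha, \beta, \gamma) \right]^*.
\tag{5.57}
\]
Substituting Eq. (5.54) into Eq. (5.57), the rotated coefficients $w^r_{nm'}$ can be written in
terms of the original coefficients $w_{nm}$ as
\[
w^r_{nm'}(k) = \sum_{m=-n}^{n} w_{nm}(k)\, D^n_{m'm}(\alpha, \beta, \gamma).
\tag{5.58}
\]
Equation (5.58) can be written in a matrix form as
\[
\mathbf{w}^r_{nm} = \mathbf{D} \mathbf{w}_{nm},
\tag{5.59}
\]
where $\mathbf{w}^r_{nm}$ is the $(N + 1)^2 \times 1$ vector of coefficients of the rotated beam pattern,
$\mathbf{w}_{nm}$ has been defined in Eq. (5.10) and the block-diagonal Wigner-D matrix $\mathbf{D}$ has
been defined in Sect. 1.6.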
Rotations can be applied successively in an ongoing steering process, e.g. successive
rotations $\mathbf{D}_1$ and $\mathbf{D}_2$ can be realized by multiplying the two rotation matrices, i.e.
$\mathbf{D}_2 \mathbf{D}_1$, to produce an equivalent rotation. This can be useful to simplify the steering
process from the current look direction, $(\theta_l, \phi_l)$, to the desired new look direction,
$(\theta_l', \phi_l')$, where $\psi_l$ and $\psi_l'$ represent the rotations about the current and desired look
directions, respectively. First, a rotation of $\Lambda(-\psi_l, -\theta_l, -\phi_l)$ is applied to align the
beam pattern look direction with the positive z-axis direction, without any further
rotation about this direction. Then, a rotation in the direction $\Lambda(\theta_l', \phi_l', \psi_l')$ is applied
to steer the beam pattern to the new direction. This entire process can be realized
using a single rotation matrix, by multiplying the two rotation matrices, as explained
above [6].
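
As a simple worked special case of Eq. (5.57), consider a rotation of the beam pattern about the z-axis only. The following is a sketch under an assumption that is not taken from the text: the rotation is taken to act as $y^r(\theta_k, \phi_k) = y(\theta_k, \phi_k - \alpha)$, and the sign convention of the Euler angles in Sect. 1.6 may differ. Using $Y_n^m(\theta, \phi - \alpha) = e^{-im\alpha}\, Y_n^m(\theta, \phi)$ in Eq. (5.55),
\[
y^r(\theta_k, \phi_k)
 = \sum_{n=0}^{N} \sum_{m=-n}^{n} c_{nm}(k) \left[ Y_n^m(\theta_k, \phi_k - \alpha) \right]^*
 = \sum_{n=0}^{N} \sum_{m=-n}^{n} e^{im\alpha}\, c_{nm}(k) \left[ Y_n^m(\theta_k, \phi_k) \right]^*,
\qquad\text{so that}\qquad
c^r_{nm}(k) = e^{im\alpha}\, c_{nm}(k).
\]
Under this assumption the rotation matrix is diagonal, and a pattern that contains only $m = 0$ terms, i.e. one that is axis-symmetric about the z-axis, is left unchanged. This agrees with the observation above that a rotation about the look direction affects only non axis-symmetric beam patterns.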

References

1. Li, Z., Duraiswami, R.: Flexible and optimal design of spherical microphone arrays for beamforming. IEEE Trans. Audio Speech Lang. Process. 15(2), 702–714 (2007)
2. Meyer, J., Elko, G.W.: A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2002), vol. II, pp. 1781–1784. Orlando (2002)
3. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution. J. Acoust. Soc. Am. 116(4), 2149–2157 (2004)
4. Rafaely, B.: Analysis and design of spherical microphone arrays. IEEE Trans. Speech Audio Process. 13(1), 135–143 (2005)
5. Rafaely, B.: Phase-mode versus delay-and-sum spherical microphone array processing. IEEE Signal Process. Lett. 12(10), 713–716 (2005)
6. Rafaely, B.: Spherical microphone array beam steering using Wigner-D weighting. IEEE Signal Process. Lett. 15, 417–420 (2008)
7. Van Trees, H.L.: Optimum Array Processing (Detection, Estimation, and Modulation Theory, Part IV), 1st edn. Wiley, New York (2002)
Chapter 6
Optimal Beam Pattern Design

Abstract Beamforming with spherical microphone arrays was presented in Chap. 5
as an instrument to achieve directional filtering, characterized by the beam pattern
of the array. It may be desired to control the beam pattern in a more explicit manner
to achieve specific properties. For example, beamformers that achieve maximum
directivity index may be useful to enhance a desired plane wave relative to undesired
plane waves arriving from the entire range of directions. Beamformers that achieve
maximum white noise gain (WNG) may be desired if robustness to system uncertainty
is important. We may also be interested in enhancing a desired plane wave while
guaranteeing a specific reduction level for undesired plane waves in other directions.
This can be achieved by restricting the side-lobe level in the beam pattern using
the Dolph-Chebyshev design. Design objectives can also be combined into a single
objective, or integrated into a more complex constrained optimization formulation. In
summary, this chapter presents methods for beam pattern design formulated explicitly
for spherical arrays, with the aim of providing tools for matching the properties of
the array to specific performance aspects.

6.1 Maximum Directivity Beamformer

The directivity factor has been introduced in Sect. 5.3 to account for the ratio between
the array response in the look direction and the average response across all directions.
It is common in array processing to normalize the response in the look direction by
introducing a distortionless-response constraint [14], such that the average response is
minimized subject to the constraint of a unit response in the look direction. Following
the directivity factor derived in Eq. (5.29), the maximum directivity beamformer is
designed to satisfy
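\[
\begin{aligned}
&\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
&\text{subject to} \quad \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1,
\end{aligned}
\tag{6.1}
\]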

© Springer Nature Switzerland AG 2019
B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8_6

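with
\[
\mathbf{B} = \frac{1}{4\pi}\, \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right)
\tag{6.2}
\]
and with $\mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1$ denoting the distortionless-response constraint. Vectors
$\mathbf{w}_{nm}$ and $\mathbf{v}_{nm}$, of size $(N + 1)^2 \times 1$, are defined as in Chap. 5:
\[
\begin{aligned}
\mathbf{w}_{nm} &= \left[ w_{00}, w_{1(-1)}, w_{10}, w_{11}, \ldots, w_{NN} \right]^T \\
\mathbf{v}_{nm} &= \left[ v_{00}, v_{1(-1)}, v_{10}, v_{11}, \ldots, v_{NN} \right]^T,
\end{aligned}
\tag{6.3}
\]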

with the elements of the steering vector vnm defined in Eq. (5.16).
A solution to the optimization problem in Eq. (6.1) is obtained using the method
of Lagrange multipliers, widely employed in array processing [14]. Note that the
average directivity, denoted by the denominator in Eq. (5.27), is a real quantity, and
so the denominator in Eq. (5.29) derived thereafter, i.e. $\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}$, is also real. The
function to be minimized in Eq. (6.1) is therefore real. Using the method of Lagrange
multipliers, the constrained optimization problem is reduced to an unconstrained one
as follows:
\[
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}
 + \lambda^* \left( \mathbf{w}_{nm}^H \mathbf{v}_{nm} - 1 \right)
 + \lambda \left( \mathbf{v}_{nm}^H \mathbf{w}_{nm} - 1 \right).
\tag{6.4}
\]

Taking the derivative with respect to the complex vector $\mathbf{w}_{nm}$ and setting the result
to zero, gives
\[
\mathbf{w}_{nm}^H \mathbf{B} + \lambda \mathbf{v}_{nm}^H = 0,
\tag{6.5}
\]
which, when satisfied, implies that at the solution point both the quadratic objective
function and the linear constraint function have gradients in the same direction, only
normalized by $\lambda$. The solution therefore satisfies
\[
\mathbf{w}_{nm}^H = -\lambda \mathbf{v}_{nm}^H \mathbf{B}^{-1}.
\tag{6.6}
\]
Multiplying both sides from the right by $\mathbf{v}_{nm}$ and substituting the constraint in
Eq. (6.1), the value of $\lambda$ is given by
\[
\lambda = -\frac{1}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}}.
\tag{6.7}
\]
The optimal value of $\mathbf{w}_{nm}$ can now be written in the final form as
\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{B}^{-1}}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}}.
\tag{6.8}
\]

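Note that matrix $\mathbf{B}$ must be invertible, which amounts to requiring all values of $b_n(kr)$
to be non-zero [see Eq. (6.2) and Chap. 4]. By substituting the elements of matrix $\mathbf{B}$
and $\mathbf{v}_{nm}$ in Eq. (6.8), the elements of $\mathbf{w}_{nm}$ can be expressed as
\[
\begin{aligned}
[w_{nm}(k)]^* &= \frac{[b_n(kr)]^*\, Y_n^m(\theta_k, \phi_k) / |b_n(kr)|^2}
   {\sum_{n=0}^{N} \sum_{m=-n}^{n} \left[ Y_n^m(\theta_k, \phi_k) \right]^* Y_n^m(\theta_k, \phi_k)} \\
&= \frac{\frac{1}{b_n(kr)}\, Y_n^m(\theta_k, \phi_k)}{\sum_{n=0}^{N} \frac{2n+1}{4\pi} P_n(\cos 0)} \\
&= \frac{4\pi}{(N + 1)^2} \frac{1}{b_n(kr)}\, Y_n^m(\theta_k, \phi_k).
\end{aligned}
\tag{6.9}
\]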

Two conclusions can be drawn from this result. First, comparing Eq. (6.9) to
Eq. (5.22), it is clear that the maximum directivity beamformer is axis-symmetric,
with
\[
d_n(k) = \frac{4\pi}{(N + 1)^2}.
\tag{6.10}
\]

It immediately follows that the optimal beamformer in Eq. (6.9) is also a solution
to the axis-symmetric maximum directivity beamformer, with a directivity factor as
defined in Eq. (5.30). The second conclusion follows directly, by noting that Eq. (6.10)
is a normalized version of the plane-wave decomposition array described in Sect. 5.5.
The plane-wave decomposition array therefore achieves maximum directivity. This
is evidence of the following characteristic of the spherical harmonics domain formu-
lation, in particular with axis-symmetric beam patterns: the naive solution of setting
all coefficients to a constant value achieves the best directivity index! An alternative
approach to solve for the maximum directivity beamformer is outlined at the end of
this section.
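The directivity factor of the maximum directivity beamformer is derived next, by
substituting the solution from Eq. (6.8) and the satisfied constraint into Eq. (5.29):
\[
\begin{aligned}
\mathrm{DF}_{\max} &= \frac{\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm}}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}} \\
&= \frac{\mathbf{w}_{nm}^H \mathbf{v}_{nm} \mathbf{v}_{nm}^H \mathbf{w}_{nm}}
        {\left( \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm} \right)^{-1}} \\
&= \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm} \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} \frac{4\pi}{|b_n(kr)|^2}\, [b_n(kr)]^*\, Y_n^m(\theta_k, \phi_k)\,
   b_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^* \\
&= 4\pi \sum_{n=0}^{N} \sum_{m=-n}^{n} \left[ Y_n^m(\theta_k, \phi_k) \right]^* Y_n^m(\theta_k, \phi_k)
 = 4\pi \sum_{n=0}^{N} \frac{2n + 1}{4\pi} P_n(\cos 0) \\
&= (N + 1)^2.
\end{aligned}
\tag{6.11}
\]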

The maximum achievable directivity factor therefore depends on the array order.
Arrays with a high directivity factor require a high order $N$, which, in turn, requires
a large number of microphones. As aliasing-free sampling requires $Q \ge (N + 1)^2$
microphones, it is clear that the maximum achievable directivity
is directly proportional to the number of microphones in the array.
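
The closed-form solution of Eq. (6.8) and the results in Eqs. (6.10) and (6.11) can be checked numerically. The sketch below makes two simplifying assumptions that are not part of the text: the look direction is placed at the north pole, where only the m = 0 spherical harmonics are non-zero and equal sqrt((2n+1)/(4π)), and the values of b_n are arbitrary non-zero random numbers, since the result should not depend on them.

```python
import numpy as np

N = 4
rng = np.random.default_rng(0)
# Hypothetical non-zero b_n values (illustrative only).
bn = rng.normal(size=N + 1) + 1j * rng.normal(size=N + 1)
n = np.arange(N + 1)
n_idx = np.repeat(n, 2 * n + 1)                  # order n of each (n, m) entry

# Spherical harmonics at theta_k = 0: only the m = 0 entries are non-zero.
Y = np.zeros((N + 1) ** 2)
Y[n * n + n] = np.sqrt((2 * n + 1) / (4 * np.pi))

v = bn[n_idx] * np.conj(Y)                       # steering vector, Eq. (5.16)
B = np.diag(np.abs(bn[n_idx]) ** 2) / (4 * np.pi)    # Eq. (6.2)

Binv_v = np.linalg.solve(B, v)
w = Binv_v / np.vdot(v, Binv_v).real             # Eq. (6.8), written for w instead of w^H

# Directivity factor, Eq. (5.29), with A = v v^H
DF = abs(np.vdot(w, v)) ** 2 / np.vdot(w, B @ w).real
# d_n recovered from [w_nm]^* = d_n Y_n^m / b_n at the m = 0 entries
d_n = (np.conj(w) * bn[n_idx])[n * n + n] / Y[n * n + n]

print(DF, (N + 1) ** 2)
print(d_n.real, 4 * np.pi / (N + 1) ** 2)
```

The printed pairs should agree up to rounding, illustrating that the directivity factor depends only on the order N.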
Maximum directivity arrays exhibit a beam pattern that is referred to as
hyper-cardioid [6]. This beam pattern, well known for a directivity of
$\frac{1}{4}(1 + 3\cos\Theta)$ for a first-order array, can also be extended specifically to
spherical arrays by exploiting the maximum directivity solution. The array beam pattern
for an axis-symmetric array, with $d_n = \frac{4\pi}{(N+1)^2}$, can be written using Eq. (5.24) as
\[
\begin{aligned}
y(\Theta) &= \frac{4\pi}{(N + 1)^2} \sum_{n=0}^{N} \frac{2n + 1}{4\pi} P_n(\cos\Theta) \\
&= \frac{P_{N+1}(\cos\Theta) - P_N(\cos\Theta)}{(N + 1)(\cos\Theta - 1)}
\end{aligned}
\tag{6.12}
\]

[see Sect. 1.5, describing the spherical Fourier transform of Ynm (θ, φ)]. Substituting
the expressions for the Legendre polynomials (see Sect. 1.3), Table 6.1 shows the
hyper-cardioid directivity for several array orders and Fig. 6.1 illustrates the beam
patterns for orders N = 1, . . . , 4. The figure shows that improved hyper-cardioid
directivity at high orders comes with reduced side-lobe level, and a narrower main
lobe. In fact, Rafaely [11] showed that for arrays with an order higher than about
N = 4, the width of the main lobe, defined as the angle between the two zeros on
either side of the main lobe, can be approximated by the following simple expression:
\[
2\Theta_0 \approx \frac{2\pi}{N},
\tag{6.13}
\]
with $\Theta_0$ denoting the angle of the main lobe zero. The width of the main lobe is
also related to the ability of the array to spatially separate two plane waves arriving
from different directions. The limit of this separation ability is known in optics as
the Rayleigh resolution [2], such that
\[
\Theta_{\mathrm{Rayleigh}} \approx \frac{\pi}{N}.
\tag{6.14}
\]
For arrays with low orders, the Rayleigh resolution is poor, but as the order
increases, resolution improves in a proportional manner.

Table 6.1 Hyper-cardioid directivity for orders N = 0, . . . , 5, normalized to unit amplitude at
Θ = 0, with corresponding directivity index

Order N   y(Θ)/y(0)                                                                                       DI (dB)
0         $1$                                                                                             0
1         $\frac{1}{4}(3\cos\Theta + 1)$                                                                  6.0
2         $\frac{1}{6}(5\cos^2\Theta + 2\cos\Theta - 1)$                                                  9.5
3         $\frac{1}{32}(35\cos^3\Theta + 15\cos^2\Theta - 15\cos\Theta - 3)$                              12.0
4         $\frac{1}{40}(63\cos^4\Theta + 28\cos^3\Theta - 42\cos^2\Theta - 12\cos\Theta + 3)$             14.0
5         $\frac{1}{96}(231\cos^5\Theta + 105\cos^4\Theta - 210\cos^3\Theta - 70\cos^2\Theta + 35\cos\Theta + 5)$   15.6

Fig. 6.1 Hyper-cardioid beam patterns for orders N = 1, . . . , 4
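
The hyper-cardioid pattern of Eq. (6.12) and the main-lobe approximation of Eq. (6.13) can be explored with a few lines of Python. This is a minimal sketch using only the standard library; the function names are hypothetical, and the zero is located by a simple scan and bisection rather than by any method described in the text.

```python
import math

def legendre(n_max, x):
    """Return [P_0(x), ..., P_{n_max}(x)] using the three-term recurrence."""
    P = [1.0, x]
    for n in range(1, n_max):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P[:n_max + 1]

def hypercardioid(N, Theta):
    """Sum form of Eq. (6.12); equals 1 at Theta = 0."""
    P = legendre(N, math.cos(Theta))
    return sum((2 * n + 1) * P[n] for n in range(N + 1)) / (N + 1) ** 2

def hypercardioid_closed(N, Theta):
    """Closed form of Eq. (6.12); not usable at Theta = 0 (division by zero)."""
    x = math.cos(Theta)
    P = legendre(N + 1, x)
    return (P[N + 1] - P[N]) / ((N + 1) * (x - 1))

def first_zero(N, steps=2000):
    """Angle of the first zero of the pattern for N >= 1 (scan, then bisection)."""
    lo = hi = 0.0
    for i in range(1, steps + 1):
        hi = math.pi * i / steps
        if hypercardioid(N, hi) <= 0.0:
            break
        lo = hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if hypercardioid(N, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for N in range(4, 9):
    # Compare the computed zero with the approximation Theta_0 ~ pi / N.
    print(N, round(first_zero(N), 4), round(math.pi / N, 4))
```

The two printed columns are expected to be close but not identical, since Eq. (6.13) is an approximation.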
An alternative approach to the derivation of the maximum directivity beamformer,
which does not require the Lagrange multiplier, is briefly outlined next. In this
approach the directivity factor is maximized directly, without imposing a distor-
tionless response constraint, after which the solution is normalized to satisfy the
constraint. Maximizing the directivity factor, as in Eq. (5.29), can be written as
\[
\underset{\mathbf{w}_{nm}}{\text{maximize}} \quad \lambda, \qquad
\lambda = \frac{\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm}}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}}.
\tag{6.15}
\]
This equation can be written as
\[
\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm} = \lambda\, \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}.
\tag{6.16}
\]
A solution to this scalar equation can be found by solving the following vector
equation:
\[
\mathbf{A} \mathbf{w}_{nm} = \lambda \mathbf{B} \mathbf{w}_{nm},
\tag{6.17}
\]
because left-multiplication of Eq. (6.17) with $\mathbf{w}_{nm}^H$ recovers Eq. (6.16). Equation
(6.17) is a generalized eigenvalue problem [5], with Eq. (6.15) representing a
generalized Rayleigh quotient. We now use the special structures of matrices $\mathbf{A}$ and
$\mathbf{B}$, as defined in Eq. (5.29), to simplify the generalized eigenvalue problem into a
(standard) eigenvalue problem. First, both sides of the equation are multiplied by the
inverse of matrix $\mathbf{B}$. Second, matrix $\mathbf{A}$ is written as a dyadic or outer product of two
vectors, $\mathbf{v}_{nm} \mathbf{v}_{nm}^H$, such that $\mathbf{B}^{-1} \mathbf{A} = \tilde{\mathbf{v}}_{nm} \mathbf{v}_{nm}^H$, with
$\tilde{\mathbf{v}}_{nm} = \mathbf{B}^{-1} \mathbf{v}_{nm}$. Equation (6.17) can now be rewritten as
\[
\left( \tilde{\mathbf{v}}_{nm} \mathbf{v}_{nm}^H \right) \mathbf{w}_{nm} = \lambda \mathbf{w}_{nm}.
\tag{6.18}
\]
Equation (6.18) is an eigenvalue problem, with the matrix under consideration having
unit rank, as it is composed of the outer product of two vectors. Due to the single
rank, there is only one non-zero eigenvalue, with a corresponding right eigenvector
$\tilde{\mathbf{v}}_{nm}$ [9]. Substituting $\mathbf{w}_{nm} = \tilde{\mathbf{v}}_{nm}$, this becomes a solution, provided
$\lambda = \mathbf{v}_{nm}^H \tilde{\mathbf{v}}_{nm}$.
These are therefore the eigenvector and eigenvalue in this case; the eigenvalue is
the largest, as it is real and positive, and all other eigenvalues are zero. The optimal
beamforming coefficients can therefore be written as
\[
\mathbf{w}_{nm}^H = \mathbf{v}_{nm}^H \mathbf{B}^{-1},
\tag{6.19}
\]
which is a scaled version of the solution derived in Eq. (6.8). Further normalization
can now be applied, as in Eq. (6.8), to satisfy the distortionless-response
constraint.
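
The rank-one eigenvalue argument above can be illustrated numerically. In the sketch below the steering vector and the diagonal matrix B are hypothetical random quantities chosen only to exercise the algebra; they do not represent a particular array.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 9                                                # (N + 1)^2 for N = 2
v = rng.normal(size=M) + 1j * rng.normal(size=M)     # hypothetical steering vector
B = np.diag(rng.uniform(0.5, 2.0, size=M))           # hypothetical positive diagonal B
A = np.outer(v, v.conj())                            # A = v v^H, a rank-one matrix

# Standard eigenvalue problem for B^{-1} A, as in Eq. (6.18)
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(B, A))
k = np.argmax(eigvals.real)
lam, u = eigvals[k], eigvecs[:, k]

v_tilde = np.linalg.solve(B, v)                      # B^{-1} v
lam_expected = np.vdot(v, v_tilde)                   # v^H B^{-1} v

# Normalizing the eigenvector solution restores the distortionless response
w = v_tilde / lam_expected.real
print(lam.real, lam_expected.real, np.vdot(w, v))
```

The dominant eigenvalue should match $\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}$, and the computed eigenvector should be parallel to $\mathbf{B}^{-1} \mathbf{v}_{nm}$ up to a complex scale factor.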

6.2 Maximum WNG Beamformer

WNG was introduced in Sect. 5.4 as a general measure for array robustness. Arrays
that achieve maximum WNG will therefore be most robust to the effect of sensor
noise and other uncertainties in system parameters. This section presents the
derivation of a spherical array with maximum WNG. Similar to the design of maximum
directivity beamformers, we constrain the beam pattern to have unit response at the
look direction, such that $\mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1$, and so the numerator in Eq. (5.35) satisfies
$\mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm} = 1$. Maximum WNG beamformers can therefore be designed by solving
the following optimization problem:
\[
\begin{aligned}
&\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
&\text{subject to} \quad \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1,
\end{aligned}
\tag{6.20}
\]
with
\[
\mathbf{B} = \mathbf{S} \mathbf{S}^H.
\tag{6.21}
\]

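This problem is similar to the maximum directivity problem defined in Eq. (6.1), and
therefore a solution similar to Eq. (6.8) applies, leading to
\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{B}^{-1}}{\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}}.
\tag{6.22}
\]
The maximum WNG in this case is derived by substituting the solution, Eq. (6.22),
in the expression for the WNG, Eq. (5.34), assuming $\mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1$:
\[
\begin{aligned}
\mathrm{WNG}_{\max} &= \frac{\left| \mathbf{w}_{nm}^H \mathbf{v}_{nm} \right|^2}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}}
 = \frac{1}{\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm}} \\
&= \frac{\left( \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm} \right)^2}
        {\mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{B} \mathbf{B}^{-H} \mathbf{v}_{nm}} \\
&= \mathbf{v}_{nm}^H \mathbf{B}^{-1} \mathbf{v}_{nm}.
\end{aligned}
\tag{6.23}
\]
The last line of the derivation requires that $\mathbf{B}$ is Hermitian, which is satisfied because
$\mathbf{B} = \mathbf{S} \mathbf{S}^H$.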
In the special case of uniform or nearly-uniform sampling [see Eq. (5.36)] matrix
$\mathbf{B}$ simplifies to
\[
\mathbf{B} = \mathbf{S} \mathbf{S}^H = \frac{4\pi}{Q} \mathbf{I}.
\tag{6.24}
\]
Substituting Eq. (6.24) in Eqs. (6.22) and (6.23), the expressions for the optimal
weights and the maximum WNG for the case of uniform and nearly-uniform sampling
can be written as
\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H}{\mathbf{v}_{nm}^H \mathbf{v}_{nm}}
\tag{6.25}
\]
and
\[
\mathrm{WNG}_{\max} = \frac{Q}{4\pi} \mathbf{v}_{nm}^H \mathbf{v}_{nm}.
\tag{6.26}
\]
The expression for the maximum WNG can be further simplified using the following
relation [see Eqs. (5.34) and (5.39)]:
\[
\mathbf{v}^H \mathbf{v} = \mathbf{v}_{nm}^H \mathbf{Y}^H \mathbf{Y} \mathbf{v}_{nm} = \frac{Q}{4\pi} \mathbf{v}_{nm}^H \mathbf{v}_{nm}.
\tag{6.27}
\]
Substituting into Eq. (6.26) leads to
\[
\mathrm{WNG}_{\max} = \mathbf{v}^H \mathbf{v} = Q.
\tag{6.28}
\]
The equality to $Q$ is achieved for the case of sensors in free field; in this case, the
steering vector is defined as in Eqs. (5.5) and (5.6), i.e. with elements
$v_q = e^{i\tilde{\mathbf{k}} \cdot \mathbf{r}}$, $\mathbf{r} = (r, \theta_q, \phi_q)$, and so the maximum WNG is equal to $Q$, the
number of sensors. This is a well-known result for the maximum achievable WNG [14].
Substituting Eq. (5.16) in Eqs. (6.25) and (6.26), the solution for the optimal
weights and the maximum WNG can be expressed more explicitly in the spherical
harmonics domain as
\[
\begin{aligned}
[w_{nm}(k)]^* &= \frac{[b_n(kr)]^*\, Y_n^m(\theta_k, \phi_k)}
   {\sum_{n=0}^{N} \sum_{m=-n}^{n} |b_n(kr)|^2\, Y_n^m(\theta_k, \phi_k) \left[ Y_n^m(\theta_k, \phi_k) \right]^*} \\
&= \frac{[b_n(kr)]^*\, Y_n^m(\theta_k, \phi_k)}{\sum_{n=0}^{N} \frac{2n+1}{4\pi} |b_n(kr)|^2}
\end{aligned}
\tag{6.29}
\]
and
\[
\mathrm{WNG}_{\max} = \frac{Q}{4\pi} \sum_{n=0}^{N} \frac{2n + 1}{4\pi} |b_n(kr)|^2.
\tag{6.30}
\]
It is interesting to note that the beamformer achieving maximum WNG is
axis-symmetric [see Eq. (5.22)], such that
\[
d_n(k) = \frac{|b_n(kr)|^2}{\sum_{n=0}^{N} \frac{2n+1}{4\pi} |b_n(kr)|^2}.
\tag{6.31}
\]
Note also that this beamformer is similar to the beamformer presented in Eq. (5.42),
i.e. the delay-and-sum beamformer, when sensors are in a free field. It is therefore
clear that for free field arrays, the maximum WNG beamformer is equivalent to the
delay-and-sum beamformer. This further justifies the popular use of the delay-and-
sum beamformer in the literature, due to its robustness property [14]. Nevertheless,
Eq. (6.31) can be used to design maximum WNG beamformers for general array
configurations, not only for sensors in free fields, e.g. rigid-sphere arrays.
Figure 6.2 shows the WNG for an array of order $N = 3$, in the range $kr \in [0, 3]$,
designed to achieve maximum WNG. The values of the WNG were calculated using
Eq. (6.30), substituting values of $b_n(kr)$ for rigid and open spheres. The open-sphere
array achieves WNG close to $Q$ (about 15 dB), as expected. Only as $kr$ approaches 3
does the value of the WNG slightly reduce, as in this range the contribution of orders
higher than 3 to the sound field becomes significant, and the approximation of the
complex exponential sound field function becomes less accurate. The rigid-sphere
array achieves a WNG slightly higher than $Q$ in the higher frequency range. This
is due to the effect of scattering; however, as discussed in Sect. 5.4, the WNG was
defined for sensors in a free field and hence may not apply directly in the case of
sensors around a rigid sphere. This means that the increase in the WNG is somewhat
artificial.

Fig. 6.2 WNG for an array of order N = 3 with Q = 32 microphones nearly-uniformly arranged
on the surface of rigid and open spheres (curves: Open, Rigid, Q)
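
The open-sphere curve discussed above can be reproduced approximately from Eq. (6.30). The following is a minimal sketch assuming the open-sphere relation $|b_n(kr)|^2 = (4\pi)^2 j_n^2(kr)$; the helper names are hypothetical, and the spherical Bessel functions are computed by a simple upward recurrence that is adequate only for low orders and for kr not much smaller than one.

```python
import math

def spherical_jn(n_max, x):
    """[j_0(x), ..., j_{n_max}(x)] by upward recurrence (low orders, x not too small)."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    for n in range(1, n_max):
        j.append((2 * n + 1) / x * j[n] - j[n - 1])
    return j[:n_max + 1]

def bn_squared_open(N, kr):
    """|b_n(kr)|^2 for an open sphere, n = 0..N."""
    return [(4 * math.pi * j) ** 2 for j in spherical_jn(N, kr)]

def max_wng(N, Q, kr):
    """Maximum WNG of Eq. (6.30) for nearly-uniform sampling."""
    b2 = bn_squared_open(N, kr)
    return Q / (4 * math.pi) * sum((2 * n + 1) / (4 * math.pi) * b2[n]
                                   for n in range(N + 1))

def max_wng_dn(N, kr):
    """Axis-symmetric coefficients d_n of Eq. (6.31)."""
    b2 = bn_squared_open(N, kr)
    denom = sum((2 * n + 1) / (4 * math.pi) * b2[n] for n in range(N + 1))
    return [b / denom for b in b2]

for kr in (0.5, 1.0, 2.0, 3.0):
    # WNG in dB for N = 3 and Q = 32, as in Fig. 6.2
    print(kr, round(10 * math.log10(max_wng(3, 32, kr)), 2))
```

For the open sphere the sum in Eq. (6.30) reduces to $Q \sum_n (2n+1) j_n^2(kr)$, which approaches $Q$ when the truncated orders are negligible; this is consistent with the behaviour described for Fig. 6.2.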

6.3 Example: Directivity Versus WNG

The previous two sections presented two alternatives to the design of spherical arrays,
one that achieves maximum directivity index and the other that achieves maximum
WNG. These two designs are compared in this section by means of an example
[12]. Maximum directivity and maximum WNG beamformers are designed for a
spherical array composed of Q = 36 microphones arranged around an open sphere
and using a nearly-uniform sampling configuration, which achieves aliasing-free
sampling up to and including order N = 4. The directivity index and WNG for these
two beamformers are presented in Fig. 6.3. Several conclusions can be drawn from
this example:
• The directivity index plot clearly shows that the array designed for maximum
directivity does achieve a better directivity index than the array designed for
maximum WNG. The value of the directivity index in this case (for the fourth-order
array) is given by $10 \log_{10} (N + 1)^2 \approx 14$ dB, as illustrated in the figure.
• The WNG plots show that the array designed for maximum WNG does achieve
a better WNG than the array designed for maximum directivity. The value of
the WNG for this delay-and-sum type array is given by $10 \log_{10} Q \approx 15.5$ dB, as
illustrated in the figure.
• The directivity index of the array designed for maximum WNG decreases towards
the low frequencies, achieving DI = 0 dB at $kr = 0$. This is a result of the
requirement introduced in this design to achieve maximum WNG. The required WNG
can only be achieved at low values of $kr$ by allocating low-magnitude weights to
high-order coefficients, as is evident from the solution in this case, i.e. $d_n$ is
proportional to $|b_n(kr)|^2$. For high $n$ and low $kr$ the magnitude of $b_n(kr)$ is small. The
low effective order of the array at low $kr$ produces low directivity index values.
The high orders with their low magnitude present poor SNRs, so that allocating
weights with high gains to these orders would increase the noise and reduce the
WNG.
• For the same reason, the array designed for maximum directivity index achieves
poor WNG at low values of $kr$.
• At $kr = N$, both designs achieve a similar directivity index and WNG. This is
due to the behavior of $b_n(kr)$, $n = 0, \ldots, N$, which have a similar magnitude at
$kr = N$. Arrays designed for narrow-band signals can therefore have both the best
directivity index and the best WNG if designed to operate at $kr = N$.
• The arrays designed for maximum directivity index have poor WNG at frequencies
around which $b_n(kr) = 0$, i.e. the zeros of the spherical Bessel function. As
discussed above, low values of $b_n(kr)$ impose poor WNG when attempting to achieve
a high directivity index. The disadvantage of the open-sphere array regarding
robustness is therefore clearly illustrated in this example.

Fig. 6.3 Directivity index (top) and WNG (bottom) for two arrays of order N = 4, with Q = 36
microphones nearly-uniformly arranged on the surface of an open sphere; one is designed to achieve
maximum directivity index (Max DI) and the other is designed to achieve maximum WNG (Max WNG)
The example presented above clearly shows the inherent trade-off between the
directivity index and WNG. This trade-off calls for a design which takes both the
directivity index and WNG into account. Such design approaches are presented in
the following sections.
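
The trade-off in this example can be sketched numerically for the open sphere. The closed-form expressions in the code below are not quoted from the text; they are obtained here by substituting the weights of Eqs. (6.8) and (6.25) into the ratios $|\mathbf{w}_{nm}^H \mathbf{v}_{nm}|^2 / (\mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm})$, with $\mathbf{B}$ from Eq. (6.2) for the directivity factor and from Eq. (6.24) for the WNG, and assuming $|b_n|^2 = (4\pi)^2 j_n^2(kr)$. The function names are hypothetical.

```python
import math

def spherical_jn(n_max, x):
    """[j_0(x), ..., j_{n_max}(x)] by upward recurrence (low orders, x not too small)."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    for n in range(1, n_max):
        j.append((2 * n + 1) / x * j[n] - j[n - 1])
    return j[:n_max + 1]

def compare_designs(N, Q, kr):
    """Directivity factor and WNG (linear scale) of both designs, open sphere."""
    j2 = [j * j for j in spherical_jn(N, kr)]
    deg = [2 * n + 1 for n in range(N + 1)]
    s2 = sum(d * a for d, a in zip(deg, j2))               # sum (2n+1) j_n^2
    return {
        # maximum-directivity weights, Eq. (6.8)
        "df_max_di": float((N + 1) ** 2),
        "wng_max_di": Q * (N + 1) ** 4 / sum(d / a for d, a in zip(deg, j2)),
        # maximum-WNG weights, Eq. (6.25)
        "df_max_wng": s2 ** 2 / sum(d * a * a for d, a in zip(deg, j2)),
        "wng_max_wng": Q * s2,
    }

def to_db(value):
    return 10 * math.log10(value)

for kr in (0.5, 1.0, 2.0, 4.0, 5.0):
    r = compare_designs(4, 36, kr)
    print(kr, {name: round(to_db(val), 1) for name, val in r.items()})
```

By the Cauchy-Schwarz inequality each design is at least as good as the other in its own measure, and the printed values indicate how far apart the two designs are at low kr.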

6.4 Mixed-Objective Design

Spherical microphone array designs for maximum directivity and maximum WNG
were presented in Sects. 6.1 and 6.2. The design example presented in Sect. 6.3
demonstrated the inherent trade-off between directivity and WNG, i.e. a high
directivity index may come at the expense of robustness. Design of spherical arrays
in practice therefore involves a balance between these two measures of performance.
Spherical arrays with maximum directivity are particularly useful when it is
necessary to reduce the effect of diffuse or spherically isotropic noise fields. On the
other hand, spherical arrays with maximum WNG are particularly useful when it
is necessary to reduce the effect of sensor noise. Therefore, the balance between
directivity and WNG represents the balance between reducing acoustic noise and
reducing sensor noise. Now, by minimizing the overall noise at the array output,
which is composed of both acoustic noise and sensor noise, a natural balance can be
achieved between directivity and WNG [10]. A framework for designing a spherical
array that minimizes the overall noise at the array output is presented in this section.
First, an expression for the overall noise at the array output is formulated and next,
a closed-form expression for the array beamforming coefficients is derived by
minimizing the overall noise at the array output, subject to a distortionless-response
constraint.
Assuming spatially uncorrelated sensor noise with variance $\sigma_s^2$ and following the
derivation in Sect. 5.4, the variance of the sensor noise at the array output, $\sigma_{so}^2$, can
be expressed, as in Eq. (5.33):

\[
\sigma_{so}^2 = \sigma_s^2 \, \mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm}, \qquad \mathbf{A} = \mathbf{S}\mathbf{S}^H, \tag{6.32}
\]

with matrix S dependent on the sampling scheme [see Eqs. (3.41)–(3.43)]. For the
particular case of uniform and nearly-uniform sampling, $\sigma_{so}^2$ reduces to

\[
\sigma_{so}^2 = \sigma_s^2 \, \frac{4\pi}{Q} \, \mathbf{w}_{nm}^H \mathbf{w}_{nm} \tag{6.33}
\]

[see Eq. (6.24)].


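Further, assuming diffuse or spherically isotropic acoustic noise and following
the derivation in Sect. 5.4 with $\sigma_a^2$ representing the variance of the spatial density
of the acoustic noise (or alternatively, $\sigma_a$ representing the amplitude density of the
plane waves composing the noise field), the variance of the acoustic noise at the array
output, $\sigma_{ao}^2$, can be expressed as

\[
\begin{aligned}
\sigma_{ao}^2 &= \int_0^{2\pi}\!\!\int_0^{\pi} |y(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \int_0^{2\pi}\!\!\int_0^{\pi} \left| \sum_{n=0}^{N} \sum_{m=-n}^{n} [w_{nm}(k)]^* \, \sigma_a \, b_n(kr) \left[ Y_n^m(\theta,\phi) \right]^* \right|^2 \sin\theta \, d\theta \, d\phi \\
&= \sigma_a^2 \sum_{n=0}^{N} \sum_{m=-n}^{n} \left| [w_{nm}(k)]^* \, b_n(kr) \right|^2 \\
&= \sigma_a^2 \, \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm},
\end{aligned}
\tag{6.34}
\]

with

\[
\mathbf{B} = \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right), \tag{6.35}
\]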

similar to the expression found in Eq. (5.29). The orthogonality property of the spher-
ical harmonics, as formulated in Eq. (1.23), was used in the derivation to evaluate the
integral. The overall noise at the array output can now be written as a composition
of the acoustic noise and sensor noise:
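
\[
\begin{aligned}
\sigma_o^2 &= \sigma_{so}^2 + \sigma_{ao}^2 \\
&= \sigma_s^2 \, \mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm} + \sigma_a^2 \, \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
&= \mathbf{w}_{nm}^H \mathbf{R} \mathbf{w}_{nm},
\end{aligned}
\tag{6.36}
\]

where

\[
\mathbf{R} = \sigma_s^2 \mathbf{A} + \sigma_a^2 \mathbf{B}. \tag{6.37}
\]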

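Adding a distortionless-response constraint, as in Eq. (6.1), an optimization problem
can be written as

\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mathbf{w}_{nm}^H \mathbf{R} \mathbf{w}_{nm} \\
\text{subject to} \quad & \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1.
\end{aligned}
\tag{6.38}
\]

The solution [see Eq. (6.8)] becomes

\[
\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{R}^{-1}}{\mathbf{v}_{nm}^H \mathbf{R}^{-1} \mathbf{v}_{nm}}. \tag{6.39}
\]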

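A similar formulation for an axis-symmetric beamformer can be derived by
substituting Eq. (5.22) and assuming nearly-uniform sampling, such that Eq. (5.36) holds.
The variance of the sensor noise at the array output can be derived for this case as

\[
\begin{aligned}
\sigma_{so}^2 &= \sigma_s^2 \, \frac{4\pi}{Q} \sum_{n=0}^{N} \sum_{m=-n}^{n} |w_{nm}(k)|^2 \\
&= \sigma_s^2 \, \frac{4\pi}{Q} \sum_{n=0}^{N} \frac{|d_n(k)|^2}{|b_n(kr)|^2} \sum_{m=-n}^{n} \left| Y_n^m(\theta_l,\phi_l) \right|^2 \\
&= \sigma_s^2 \, \frac{1}{Q} \sum_{n=0}^{N} \frac{|d_n(k)|^2}{|b_n(kr)|^2} \, (2n+1) \\
&= \sigma_s^2 \, \mathbf{d}_n^H \mathbf{A} \mathbf{d}_n,
\end{aligned}
\tag{6.40}
\]

with

\[
\mathbf{A} = \frac{1}{Q} \, \mathrm{diag}\left( 1/|b_0|^2, \; 3/|b_1|^2, \; \ldots, \; (2N+1)/|b_N|^2 \right). \tag{6.41}
\]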

The spherical harmonics addition theorem, formulated in Eq. (1.26), was used to
simplify the summation over spherical harmonics.
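The variance of the acoustic noise for the case of an axis-symmetric beamformer
with nearly-uniform sampling can be derived directly from Eq. (6.34) by substituting
Eq. (5.22):

\[
\begin{aligned}
\sigma_{ao}^2 &= \int_0^{2\pi}\!\!\int_0^{\pi} |y(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi
= \sigma_a^2 \sum_{n=0}^{N} \sum_{m=-n}^{n} \left| [w_{nm}(k)]^* \, b_n(kr) \right|^2 \\
&= \sigma_a^2 \sum_{n=0}^{N} |d_n(k)|^2 \sum_{m=-n}^{n} \left| Y_n^m(\theta_l,\phi_l) \right|^2 \\
&= \sigma_a^2 \sum_{n=0}^{N} |d_n(k)|^2 \, \frac{2n+1}{4\pi} \\
&= \sigma_a^2 \, \mathbf{d}_n^H \mathbf{B} \mathbf{d}_n,
\end{aligned}
\tag{6.42}
\]

with

\[
\mathbf{B} = \frac{1}{4\pi} \, \mathrm{diag}\left( 1, 3, 5, \ldots, 2N+1 \right). \tag{6.43}
\]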

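Matrix R in this case has the same form as in Eq. (6.37), i.e. $\mathbf{R} = \sigma_s^2 \mathbf{A} + \sigma_a^2 \mathbf{B}$. An
optimization problem similar to the one in Eq. (6.38) can now be written as

\[
\begin{aligned}
\underset{\mathbf{d}_n}{\text{minimize}} \quad & \mathbf{d}_n^H \mathbf{R} \mathbf{d}_n \\
\text{subject to} \quad & \mathbf{d}_n^T \mathbf{v}_n = 1,
\end{aligned}
\tag{6.44}
\]

where, in this case, the elements of the steering vector, $\mathbf{v}_n$, are $v_n = \frac{2n+1}{4\pi}$,
n = 0, . . . , N [see Eq. (5.31)]. It is assumed in this case that the angle between the
incoming plane wave and the look direction is zero. The solution becomes

\[
\mathbf{d}_n^T = \frac{\mathbf{v}_n^H \mathbf{R}^{-1}}{\mathbf{v}_n^H \mathbf{R}^{-1} \mathbf{v}_n}. \tag{6.45}
\]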

Table 6.2 presents examples of spherical microphone array designs using the
mixed-objective method. In all examples, an optimization problem, as formulated
in Eq. (6.44), was formulated and solved using Eq. (6.45). Then, the values for the
directivity factor and the WNG were computed using Eqs. (5.31) and (5.39), respec-
tively.
The first two rows of the table illustrate two simplified designs, based on a
second-order spherical array in an open configuration, at kr = 2, composed of 12
microphones and using a uniform sampling scheme. The first design, with $\sigma_s^2 = 0$, $\sigma_a^2 = 1$,
reduces to a maximum directivity beamformer. Indeed, DF = 9 is achieved, following
the theoretical upper limit of $(N + 1)^2$ for this case. The second design, with
$\sigma_s^2 = 1$, $\sigma_a^2 = 0$, reduces to the maximum WNG beamformer, achieving a WNG of
11.67, which is just below the upper limit of Q for an array in free field (or open
configuration), which is 12 in this case. This example illustrates that the maximum
directivity and the maximum WNG designs are special cases of the mixed-objective
design.
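
The closed-form solution of Eq. (6.45) is straightforward to evaluate because A and B are diagonal in the axis-symmetric case. The sketch below is a hedged illustration, not the author's implementation: it assumes $b_n(kr) = 4\pi i^n j_n(kr)$ for the open sphere and $b_n(kr) = 4\pi i^n [j_n(kr) - j_n'(kr) h_n(kr) / h_n'(kr)]$ for the rigid sphere, and it assumes that the directivity factor and WNG take the forms $4\pi / (\mathbf{d}_n^H \mathrm{diag}(\mathbf{v}_n) \mathbf{d}_n)$ and $1 / (\mathbf{d}_n^H \mathbf{A} \mathbf{d}_n)$ once the distortionless constraint holds. The function names are hypothetical.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn


def bn_mag2(N, kr, rigid=False):
    """|b_n(kr)|^2 for n = 0..N (assumed open- or rigid-sphere model)."""
    n = np.arange(N + 1)
    jn = spherical_jn(n, kr)
    if not rigid:
        return (4 * np.pi * jn) ** 2
    djn = spherical_jn(n, kr, derivative=True)
    hn = jn + 1j * spherical_yn(n, kr)
    dhn = djn + 1j * spherical_yn(n, kr, derivative=True)
    return np.abs(4 * np.pi * (jn - djn / dhn * hn)) ** 2


def mixed_objective(N, Q, kr, var_s, var_a, rigid=False):
    """Axis-symmetric mixed-objective design; returns (d, DF, WNG)."""
    n = np.arange(N + 1)
    v = (2 * n + 1) / (4 * np.pi)                     # steering vector
    a = (2 * n + 1) / (Q * bn_mag2(N, kr, rigid))     # diagonal of A, Eq. (6.41)
    b = v.copy()                                      # diagonal of B, Eq. (6.43)
    r = var_s * a + var_a * b                         # diagonal of R, Eq. (6.37)
    d = (v / r) / np.sum(v ** 2 / r)                  # Eq. (6.45), R diagonal
    df = 4 * np.pi / np.sum(d ** 2 * v)               # assumed DF expression
    wng = 1.0 / np.sum(d ** 2 * a)                    # assumed WNG expression
    return d, df, wng


if __name__ == "__main__":
    for args in [(2, 12, 2.0, 0.0, 1.0, False), (2, 12, 2.0, 1.0, 0.0, False),
                 (4, 36, 2.0, 1.0, 0.4, True)]:
        _, df, wng = mixed_objective(*args)
        print(args, round(df, 2), round(wng, 2))
```

Under these assumptions the sketch gives DF and WNG values that agree with the corresponding rows of Table 6.2 to two decimal places, which suggests that the assumed forms are consistent with the expressions used in the text.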

Table 6.2 Directivity factor and WNG are shown for several designs, with parameters presented
on the left-hand side of the table

Sphere   N   Q    kr   $\sigma_s^2$   $\sigma_a^2$   DF      WNG
Open     2   12   2    0.0            1.0            9.00    6.58
Open     2   12   2    1.0            0.0            5.97    11.67
Rigid    3   32   3    0.0            1.0            16.00   44.52
Rigid    3   32   3    1.0            0.0            15.31   46.72
Rigid    3   32   3    1.0            1.0            16.00   44.64
Rigid    4   36   2    0.0            1.0            25.00   1.60
Rigid    4   36   2    1.0            0.0            9.35    51.50
Rigid    4   36   2    1.0            0.4            17.78   15.48

The second set of examples is based on a spherical array of order N = 3 in
a rigid-sphere configuration, at kr = 3, composed of 32 microphones and using a
nearly-uniform sampling scheme. The first two rows in this set of designs are similar
to the first two rows in the previous set of designs and represent maximum directivity
and maximum WNG beamformers. The first design achieves $DF = (N + 1)^2 = 16$,
as expected. The second design achieves WNG that is higher than Q (32 in this case).
This is explained by the effect of scattering from the rigid sphere, which can account
for an increase in the values of the WNG, see Sect. 5.4. In the third design, $\sigma_s^2$ and $\sigma_a^2$
were assigned equal weights. It is interesting to note that the directivity factor and
WNG values of all three designs are very similar and are not significantly affected by
the choice of $\sigma_s^2$ and $\sigma_a^2$. This can be explained by the fact that the value of $b_n(kr)$ is
very similar for n = 0, . . . , 3 at kr = 3 (see Fig. 2.9), and so the two extreme designs
of maximum directivity and maximum WNG are very similar in this case. See also
Sect. 6.3. In this case, the mixed-objective design is not very useful, as it produces
a similar design regardless of the choice of $\sigma_s^2$ and $\sigma_a^2$.
The final set of examples is based on a spherical array of order N = 4 in a
rigid-sphere configuration, at kr = 2, composed of 36 microphones and using a
nearly-uniform sampling scheme. The first design achieves a maximum directivity factor
of DF = 25, while the second design achieves a maximum WNG of 51.5, which is
higher than Q (36 in this case), as expected, due to the scattering from the rigid sphere.
The third design, with $\sigma_s^2 = 1$ and $\sigma_a^2 = 0.4$, is an intermediate design, trading
off directivity for robustness. This illustrates the capability of the mixed-objective
method to provide a range of useful optimal beamformers, all having a closed-form
expression for the beamforming coefficients. Furthermore, this useful design offers
an optimal trade-off between directivity and robustness when the variances of the
sensor noise and acoustic noise are known.

6.5 Maximum Front-Back Ratio

The design of microphone arrays with an optimized beam pattern has been presented
in Sect. 6.1, where the ratio between the magnitude of the beam pattern in a single
look direction and the magnitude of the beam pattern averaged over all directions was
maximized. The underlying assumption in this maximum directivity design is that
the desired signal arrives from a single direction. This may not always be the case.
Consider, for example, the recording of live music, with the microphone facing the
stage. In this case the directivity factor should be maximized over a wider directional
range, to capture the sound sources from the entire stage. In addition, low magnitude
of the beam pattern from other directions (e.g. the audience) may be desired. A simple
design objective suitable for this example is to maximize the ratio between the front
and back parts of the beam pattern. Directional microphones with maximum front-
back ratio have been discussed in [4], with optimal solutions derived for differential
microphones. In this section, the maximum front-back ratio solution is derived for
the spherical microphone array.
The measure for the front-back ratio can be written as [4]

\[
F = \frac{\int_0^{2\pi}\!\int_0^{\pi/2} |y(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi}
         {\int_0^{2\pi}\!\int_{\pi/2}^{\pi} |y(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi}. \tag{6.46}
\]

In this formulation, the “front” refers to the upper hemisphere and the “back” to the
lower hemisphere. As the problem is symmetric around the z-axis, the axis-symmetric
beam pattern is employed, as in Eq. (5.24), substituting
$y = \sum_{n=0}^{N} d_n \frac{2n+1}{4\pi} P_n(\cos\theta)$,
omitting the dependence on k. The resulting integral in the numerator of Eq. (6.46)
is evaluated next, denoting $F = F_{NUM} / F_{DEN}$. We first solve for $F_{NUM}$:

\[
\begin{aligned}
F_{NUM} &= \int_0^{2\pi}\!\!\int_0^{\pi/2} |y(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \frac{1}{8\pi} \sum_{n=0}^{N} \sum_{n'=0}^{N} d_n^* (2n+1) \, d_{n'} (2n'+1)
\int_0^{\pi/2} P_n(\cos\theta) P_{n'}(\cos\theta) \sin\theta \, d\theta.
\end{aligned}
\tag{6.47}
\]

The last integral can be evaluated by explicitly writing the Legendre polynomials as
$P_n(z) = \sum_{q=0}^{n} p_q^n z^q$, with $z = \cos\theta$, such that

\[
\begin{aligned}
\int_0^1 P_n(z) P_{n'}(z) \, dz &= \sum_{q=0}^{n} \sum_{l=0}^{n'} p_q^n \, p_l^{n'} \int_0^1 z^{q+l} \, dz \\
&= \sum_{q=0}^{n} \sum_{l=0}^{n'} p_q^n \, p_l^{n'} \, \frac{1}{q+l+1}.
\end{aligned}
\tag{6.48}
\]

Now, $F_{NUM}$ can be written in a matrix form as

\[
F_{NUM} = \mathbf{d}_n^H \mathbf{A} \mathbf{d}_n, \tag{6.49}
\]

where $\mathbf{d}_n = [d_0, d_1, \ldots, d_N]^T$, and the elements of matrix A for n = 0, . . . , N and
n' = 0, . . . , N are given by

\[
A_{nn'} = \frac{1}{8\pi} (2n+1)(2n'+1) \sum_{q=0}^{n} \sum_{l=0}^{n'} \frac{1}{q+l+1} \, p_q^n \, p_l^{n'}. \tag{6.50}
\]

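An expression for the denominator of F, denoted by $F_{DEN}$, can be derived in a similar
way with different limits over the integral, leading to

\[
\begin{aligned}
F_{DEN} &= \int_0^{2\pi}\!\!\int_{\pi/2}^{\pi} |y(\theta,\phi)|^2 \sin\theta \, d\theta \, d\phi \\
&= \frac{1}{8\pi} \sum_{n=0}^{N} \sum_{n'=0}^{N} d_n^* (2n+1) \, d_{n'} (2n'+1)
\int_{\pi/2}^{\pi} P_n(\cos\theta) P_{n'}(\cos\theta) \sin\theta \, d\theta \\
&= \frac{1}{8\pi} \sum_{n=0}^{N} \sum_{n'=0}^{N} d_n^* (2n+1) \, d_{n'} (2n'+1)
\sum_{q=0}^{n} \sum_{l=0}^{n'} \frac{(-1)^{q+l}}{q+l+1} \, p_q^n \, p_l^{n'}.
\end{aligned}
\tag{6.51}
\]

F can now be written in a Rayleigh quotient matrix form as

\[
F = \frac{\mathbf{d}_n^H \mathbf{A} \mathbf{d}_n}{\mathbf{d}_n^H \mathbf{B} \mathbf{d}_n}, \tag{6.52}
\]

where the elements of matrix B for n = 0, . . . , N and n' = 0, . . . , N are given by

\[
B_{nn'} = \frac{1}{8\pi} (2n+1)(2n'+1) \sum_{q=0}^{n} \sum_{l=0}^{n'} \frac{(-1)^{q+l}}{q+l+1} \, p_q^n \, p_l^{n'}. \tag{6.53}
\]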

Fig. 6.4 Super-cardioid beam patterns for order N = 1, . . . , 4, with the corresponding F values in
decibels (four polar plots, one per order)

Matrices A and B are real, symmetric and positive definite, and so the eigenvalues
are positive real and the eigenvectors are real (see also [4]). Writing the Rayleigh
quotient as a generalized eigenvalue problem,

\[
\mathbf{A} \mathbf{d}_n = \lambda \mathbf{B} \mathbf{d}_n, \tag{6.54}
\]

the largest eigenvalue is the value of the maximum front-back ratio, and the
corresponding vector is the solution $\mathbf{d}_n$.
The maximum front-back ratio beam pattern is also known as the super-cardioid
pattern [4]. Figure 6.4 shows examples of the super-cardioid beam pattern for spher-
ical arrays of orders N = 1, . . . , 4. Note that very high front-back ratios can be
achieved with these arrays, as detailed on the figures.
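
The generalized eigenvalue problem of Eq. (6.54) can be solved with standard numerical tools. The following is a minimal sketch, assuming the matrix elements of Eqs. (6.50) and (6.53) and using SciPy's symmetric generalized eigensolver. The function names are hypothetical, and the final scaling of the eigenvector to a unit response in the look direction is an added convenience that the text does not require.

```python
import numpy as np
from numpy.polynomial import legendre
from scipy.linalg import eigh


def front_back_matrices(N):
    """Build A (front) and B (back) from Eqs. (6.50) and (6.53)."""
    # p[n][q] is the coefficient of z**q in the Legendre polynomial P_n(z).
    p = [legendre.leg2poly([0] * n + [1]) for n in range(N + 1)]
    A = np.zeros((N + 1, N + 1))
    B = np.zeros((N + 1, N + 1))
    for n in range(N + 1):
        for m in range(N + 1):                    # m plays the role of n'
            scale = (2 * n + 1) * (2 * m + 1) / (8 * np.pi)
            for q, pq in enumerate(p[n]):
                for l, pl in enumerate(p[m]):
                    term = scale * pq * pl / (q + l + 1)
                    A[n, m] += term
                    B[n, m] += term * (-1) ** (q + l)
    return A, B


def max_front_back(N):
    """Return (F_max, d) for the super-cardioid design of order N."""
    A, B = front_back_matrices(N)
    lam, vecs = eigh(A, B)                        # eigenvalues in ascending order
    d = vecs[:, -1]
    v = (2 * np.arange(N + 1) + 1) / (4 * np.pi)
    return lam[-1], d / np.dot(d, v)              # scale so that y(0) = 1


if __name__ == "__main__":
    for N in range(1, 5):
        F, _ = max_front_back(N)
        print(N, round(10 * np.log10(F), 2), "dB")
```

For N = 1 the maximum of the ratio can be found by hand as $7 + 4\sqrt{3}$ (about 11.4 dB), which gives a quick sanity check of the sketch. For large N the matrix B becomes poorly conditioned, so this direct approach is only intended for low orders.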

6.6 Dolph-Chebyshev Beam Pattern

Beam pattern design often involves some assumptions about the desired signal and
the unwanted noise. For example, in the maximum directivity beamformer design,
the desired signal is a plane wave arriving from the array look direction, while the
noise is composed of waves arriving from all directions, e.g. a diffuse sound field.
However, the noise may be composed of a smaller number of plane waves arriving
from unknown directions. In this case, constraining the level of the beam pattern side
lobes can guarantee a desired level of noise attenuation. A framework for the design
of such beam patterns, called the Dolph-Chebyshev design method, is presented in
this section.

Fig. 6.5 Chebyshev polynomial, $T_8(x)$, showing equal-amplitude ripple in the range x ∈ [0, 1] and
diverging amplitude at x > 1. $(x_0, R) = (1.06, 8.2)$ is also denoted on the figure

In particular, beam patterns with minimal width of the main lobe can be designed
for a given limit on the level of the side lobes, or beam patterns with a minimal level
of side lobes can be designed given a limit on the width of the main lobe. A brief
overview of Dolph-Chebyshev beam patterns is first presented [14], followed by a
derivation of a closed-form Dolph-Chebyshev design method for spherical arrays
[7].
The Dolph-Chebyshev beam pattern is based on the Chebyshev polynomials,
characterized by equal-amplitude oscillations in the range [−1, 1] and rapid divergence
beyond this range. Figure 6.5 shows an example of a Chebyshev polynomial, $T_8(x)$,
illustrating that $|T_8(x)| \le 1$ in the range x ∈ [0, 1] and rapidly increases thereafter.
With the design of a Dolph-Chebyshev beam pattern, the oscillatory part of the
polynomials is transformed into the equal-ripple side-lobe response of the beam pattern,
while the diverging part contributes to the main lobe with a monotonic response. To
set the width of the main lobe and the relative attenuation of the side lobes, a point
$(x_0, R)$ is selected, as shown in Fig. 6.5. The point $x_0$ is to be transformed into the
look direction, or the peak of the main lobe, such that a relative side-lobe attenuation
of 1/R is achieved. Finally, the polynomial undergoes parameter scaling, with
$x = x_0 \cos(\theta/2)$. The fundamental equation describing the Dolph-Chebyshev beam
pattern based on the Chebyshev polynomials is therefore given by

\[
y(\theta) = \frac{1}{R} \, T_M\!\left( x_0 \cos(\theta/2) \right), \tag{6.55}
\]

where $T_M(\cdot)$ is the Chebyshev polynomial of order M, θ ∈ [−π, π] is the signal
arrival direction and $x_0$ controls the width of the main lobe. Due to the division by
R, the peak response at the look direction, θ = 0, is one. Figure 6.6 shows y(θ) for
M = 8 and $(x_0, R) = (1.06, 8.2)$, derived from the polynomial presented in Fig. 6.5.

Fig. 6.6 Function $\frac{1}{R} T_8(x_0 \cos(\theta/2))$, with $x_0 = 1.06$, R = 8.2 and $\theta_0 = 45^\circ$, showing the main
lobe and the equal-level side lobes

In the formal design process, the desired side-lobe level is selected first, by setting
the value of 1/R, after which $x_0$ is calculated by [14]

\[
x_0 = \cosh\!\left( \frac{\cosh^{-1}(R)}{M} \right), \tag{6.56}
\]

with the zero of the main lobe, $\theta_0$, given by

\[
\theta_0 = 2 \cos^{-1}\!\left( \frac{\cos\!\left( \frac{\pi}{2M} \right)}{x_0} \right). \tag{6.57}
\]

Alternatively, the desired zero of the main lobe is set to $\theta_0$, from which $x_0$ and then
R are derived:

\[
R = \cosh\!\left( M \cosh^{-1}(x_0) \right), \tag{6.58}
\]

with

\[
x_0 = \frac{\cos\!\left( \frac{\pi}{2M} \right)}{\cos(\theta_0/2)}. \tag{6.59}
\]
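
The two design routes of Eqs. (6.56)–(6.59) amount to a few lines of code. The sketch below is a direct transcription of these four equations into hypothetical helper functions; the usage example takes M = 8 and θ0 = 45°, the values quoted in the captions of Figs. 6.5 and 6.6.

```python
import numpy as np


def x0_from_R(R, M):
    """Eq. (6.56): x0 from the side-lobe parameter R."""
    return np.cosh(np.arccosh(R) / M)


def theta0_from_x0(x0, M):
    """Eq. (6.57): zero of the main lobe, in radians."""
    return 2 * np.arccos(np.cos(np.pi / (2 * M)) / x0)


def R_from_x0(x0, M):
    """Eq. (6.58): side-lobe parameter R from x0."""
    return np.cosh(M * np.arccosh(x0))


def x0_from_theta0(theta0, M):
    """Eq. (6.59): x0 from the desired main-lobe zero, in radians."""
    return np.cos(np.pi / (2 * M)) / np.cos(theta0 / 2)


if __name__ == "__main__":
    M = 8
    x0 = x0_from_theta0(np.deg2rad(45.0), M)
    R = R_from_x0(x0, M)
    print(round(x0, 3), round(R, 2))   # close to the (1.06, 8.2) point of Fig. 6.5
```

The two routes are mutually consistent: starting from θ0, computing $x_0$ and R, and then applying Eqs. (6.56) and (6.57) returns the same $x_0$ and θ0.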

Spherical arrays can be efficiently designed to achieve a Dolph-Chebyshev beam
pattern, due to the similarity between the Legendre polynomials composing the
axis-symmetric spherical array beam pattern and the Chebyshev polynomials [7].
Following the development in [7], the axis-symmetric beam pattern for the spherical
array, as in Eq. (5.23), is equated to that in Eq. (6.55), with further substitutions of
$z = \cos\theta$, $\cos(\theta/2) = \sqrt{\frac{1+\cos\theta}{2}}$ and M = 2N, leading to

\[
\sum_{n=0}^{N} d_n \, \frac{2n+1}{4\pi} \, P_n(z) = \frac{1}{R} \, T_{2N}\!\left( x_0 \sqrt{\frac{1+z}{2}} \right). \tag{6.60}
\]

The Chebyshev polynomial, $T_{2N}$, for an even order 2N, consists only of even powers
[1], and the polynomial $T_{2N}\!\left( x_0 \sqrt{\frac{1+z}{2}} \right)$ is therefore of order N in z. The polynomial
on the left-hand side of Eq. (6.60) is also of order N in z (see Sect. 1.3) and so the
coefficients of the two polynomials can be equated, leading to a derivation of $d_n$ for
a Dolph-Chebyshev beam pattern. First, both sides of Eq. (6.60) are multiplied by
$2\pi P_m(z)$, m = 0, . . . , N, and then they are integrated over the range z ∈ [−1, 1]. The
left-hand side reduces to $d_m$, due to the orthogonality of the Legendre polynomials
[see Eq. (1.36)] leading to

\[
d_m = \frac{2\pi}{R} \int_{-1}^{1} P_m(z) \, T_{2N}\!\left( x_0 \sqrt{\frac{1+z}{2}} \right) dz, \qquad m = 0, \ldots, N. \tag{6.61}
\]

To solve the integral, both polynomials are written explicitly in an expanded form
as

\[
\begin{aligned}
P_m(z) &= \sum_{s=0}^{m} p_s^m z^s \\
T_{2N}(z) &= \sum_{l=0}^{N} t_{2l}^{2N} z^{2l},
\end{aligned}
\tag{6.62}
\]

where $p_s^m$ and $t_{2l}^{2N}$ denote the coefficients of the Legendre and Chebyshev polynomials,
respectively. Although $T_{2N}(z)$ is of order 2N, it has only N + 1 coefficients, as the
coefficients of the odd powers are zero. Substituting Eq. (6.62) into Eq. (6.61) and
rearranging terms yields

\[
d_m = \frac{2\pi}{R} \sum_{s=0}^{m} \sum_{l=0}^{N} 2^{-l} \, t_{2l}^{2N} \, p_s^m \, x_0^{2l} \int_{-1}^{1} z^s (1+z)^l \, dz. \tag{6.63}
\]

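Further simplification is obtained by substituting the binomial expansion [1]
$(1+z)^l = \sum_{q=0}^{l} \frac{l!}{q!(l-q)!} z^q$, and then solving the integral, with odd powers of z
integrating to zero, leading to

\[
d_m = \frac{2\pi}{R} \sum_{s=0}^{m} \sum_{l=0}^{N} \sum_{q=0}^{l}
\frac{1-(-1)^{q+s+1}}{q+s+1} \, \frac{l!}{q!(l-q)!} \, 2^{-l} \, t_{2l}^{2N} \, p_s^m \, x_0^{2l}. \tag{6.64}
\]

Equation (6.64) can be written in a matrix form as

\[
\mathbf{d} = \frac{2\pi}{R} \, \mathbf{P} \mathbf{A} \mathbf{C} \mathbf{T} \mathbf{x}_0 \tag{6.65}
\]

where

\[
\begin{aligned}
\mathbf{d} &= [d_0, d_1, \ldots, d_N]^T \\
\mathbf{x}_0 &= \left[ 1, x_0^2, x_0^4, \ldots, x_0^{2N} \right]^T \\
\mathbf{P} &= \begin{bmatrix}
p_0^0 & 0 & \cdots & 0 \\
p_0^1 & p_1^1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
p_0^N & p_1^N & \cdots & p_N^N
\end{bmatrix} \\
\mathbf{A} &= \begin{bmatrix}
2 & 0 & \cdots & \frac{1-(-1)^{N+1}}{N+1} \\
0 & \frac{2}{3} & \cdots & \frac{1-(-1)^{N+2}}{N+2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{1-(-1)^{N+1}}{N+1} & \frac{1-(-1)^{N+2}}{N+2} & \cdots & \frac{1-(-1)^{2N+1}}{2N+1}
\end{bmatrix} \\
\mathbf{C} &= \begin{bmatrix}
1 & \frac{1}{2} & \cdots & \frac{1}{2^N} \\
0 & \frac{1}{2} & \cdots & \frac{N}{2^N} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \frac{1}{2^N}
\end{bmatrix} \\
\mathbf{T} &= \mathrm{diag}\left( t_0^{2N}, t_2^{2N}, \ldots, t_{2N}^{2N} \right).
\end{aligned}
\tag{6.66}
\]

All four matrices are of size (N + 1) × (N + 1), with matrix A consisting of
elements (s, q) given by $\frac{1-(-1)^{q+s+1}}{q+s+1}$, and the upper triangular matrix C consisting of
elements (q, l) given by $\frac{l!}{q!(l-q)!} 2^{-l}$ for l ≥ q. The Dolph-Chebyshev beam pattern
for a spherical array can now be designed as follows: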
(i) The array order N is defined.
(ii) Desired side-lobe level 1/R or desired main-lobe width 2θ0 is selected.
(iii) Either Eq. (6.56) or Eq. (6.58) is evaluated to make both x0 and R available.
(iv) Array coefficients are computed using Eq. (6.65).
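
The design steps above can be sketched in a few lines. Note that this sketch does not build the matrices of Eq. (6.66); instead it evaluates the integral of Eq. (6.61) with Gauss-Legendre quadrature, which is exact here because the integrand is a polynomial of order at most 2N in z. The function names are hypothetical, and the side-lobe level is chosen as the design input (step (ii), first option).

```python
import numpy as np
from numpy.polynomial import chebyshev, legendre


def dolph_chebyshev_dn(N, R):
    """Return (d, x0) for array order N and side-lobe parameter R."""
    M = 2 * N
    x0 = np.cosh(np.arccosh(R) / M)                       # Eq. (6.56)
    z, w = legendre.leggauss(N + 1)                       # exact up to degree 2N + 1
    # T_2N evaluated at x0 * sqrt((1 + z) / 2)
    cheb = chebyshev.chebval(x0 * np.sqrt((1 + z) / 2), [0] * M + [1])
    d = np.array([
        2 * np.pi / R * np.sum(w * legendre.legval(z, [0] * m + [1]) * cheb)
        for m in range(N + 1)
    ])                                                    # Eq. (6.61)
    return d, x0


def beam_pattern(d, theta):
    """Axis-symmetric beam pattern: sum_n d_n (2n+1)/(4 pi) P_n(cos theta)."""
    n = np.arange(len(d))
    return legendre.legval(np.cos(theta), d * (2 * n + 1) / (4 * np.pi))


if __name__ == "__main__":
    R = 10 ** (25 / 20)                                   # 25 dB side-lobe attenuation
    d, x0 = dolph_chebyshev_dn(4, R)
    theta = np.linspace(0, np.pi, 721)
    y_db = 20 * np.log10(np.abs(beam_pattern(d, theta)) + 1e-12)
    print(round(y_db[0], 3), round(y_db[theta > np.pi / 2].max(), 2))
```

With R set for 25 dB the resulting pattern has unit response in the look direction and side lobes that do not exceed 1/R, as expected from Eq. (6.55). The matrix form of Eq. (6.65) should give the same coefficients; the quadrature route is used here only because it is shorter to write.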
Figure 6.7 illustrates two examples of Dolph-Chebyshev beam patterns for
spherical arrays of orders N = 4, 9. For both designs $20 \log_{10} R = 25$ dB and in both
designs a side-lobe level of −25 dB is maintained. The high-order array achieves a
narrower main lobe, as is clearly shown in the figure. A second set of design examples
is illustrated in Fig. 6.8, where for both designs $\theta_0 = 45^\circ$, achieving a zero-to-zero
main-lobe width of 90°. The higher-order array achieves a lower side-lobe level, as is
clearly shown in the figure. In summary, the figures illustrate the trade-off in design
between main-lobe width and side-lobe level and further show that a spherical array
with a higher order achieves better performance, either in terms of main-lobe width
or in terms of side-lobe level.

Fig. 6.7 Beam pattern for an axis-symmetric spherical array with a Dolph-Chebyshev design with
R set to achieve a side-lobe level reduction of 25 dB, for arrays of orders N = 4, 9

Fig. 6.8 Beam pattern for an axis-symmetric spherical array with a Dolph-Chebyshev design with
$x_0$ set to achieve a main-lobe with a zero at $\theta_0 = 45^\circ$, for arrays of orders N = 4, 9

6.7 Multiple-Objective Design

In the previous sections of this chapter, various approaches to the design of spherical
microphone array beamformers were presented. Each of these design methods is
based on a different objective, which expresses a desired characteristic of the array.
These objectives include maximum directivity, maximum WNG, minimum side-lobe
level, and minimum main-lobe width, among other objectives. Design methods that
include a single objective, or two objectives as in the case presented in Sect. 6.4,
allowed standard formulations and closed-form solutions. However, in practice, a
design which considers all (or many) of these objectives may be desired, because
all of these objectives relate to important array characteristics. Although multiple-
objective formulations typically do not have a closed-form solution, they can be
integrated into an optimization problem that can be solved numerically, as presented
in recent studies [8, 13, 15].
Two example design methods based on numerical optimization are presented in
this section. Similar formulations that include other mixtures of objectives are also
possible. As a first example, consider the design of a spherical array that maximizes
directivity, but maintains a minimum desired level of robustness by imposing a
lower limit on the WNG. In addition, the beam pattern is designed to maintain a
distortionless-response constraint in the look direction. Using the results presented in
the design for maximum directivity, maximum WNG and the distortionless-response
constraint, as presented in Eqs. (6.1) and (6.20), the following optimization problem
is formulated:

\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
\text{subject to} \quad & \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1 \\
& \mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm} \le \frac{1}{WNG_{min}},
\end{aligned}
\tag{6.67}
\]

where

\[
\begin{aligned}
\mathbf{A} &= \mathbf{S}\mathbf{S}^H \\
\mathbf{B} &= \frac{1}{4\pi} \, \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right)
\end{aligned}
\tag{6.68}
\]

and $WNG_{min}$ is the lower limit on the WNG. Matrix S is dependent on the sampling
scheme [see Eqs. (3.41)–(3.43)]. Due to the special structure of matrices A and B,
these matrices are positive definite, i.e. the matrices are Hermitian and the scalars
$\mathbf{x}^H \mathbf{A} \mathbf{x}$ and $\mathbf{x}^H \mathbf{B} \mathbf{x}$ are positive for all non-zero vectors x. The optimization problem
in Eq. (6.67) is therefore convex and is called a quadratically-constrained quadratic
program (QCQP), having readily available numerical solution methods [3].
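QCQP is a special case of second-order cone programming (SOCP), so that this
optimization problem can also be presented as a SOCP problem:

\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mu \\
\text{subject to} \quad & \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1 \\
& \left\| \mathbf{w}_{nm}^H \mathbf{B}^{\frac{1}{2}} \right\| \le \mu \\
& \left\| \mathbf{w}_{nm}^H \mathbf{S} \right\| \le \frac{1}{\sqrt{WNG_{min}}},
\end{aligned}
\tag{6.69}
\]

with

\[
\mathbf{B}^{\frac{1}{2}} = \frac{1}{\sqrt{4\pi}} \, \mathrm{diag}\left( |b_0|, |b_1|, |b_1|, |b_1|, \ldots, |b_N| \right), \tag{6.70}
\]

and $\| \cdot \|$ denotes the 2-norm (see also [3, 13]).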


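A similar formulation for an axis-symmetric beamformer can be derived for the
multiple-objective design by substituting Eq. (5.22) and assuming uniform or
nearly-uniform sampling, such that Eq. (5.36) holds. In this case, Eq. (6.67) reduces to

\[
\begin{aligned}
\underset{\mathbf{d}_n}{\text{minimize}} \quad & \mathbf{d}_n^H \mathbf{B} \mathbf{d}_n \\
\text{subject to} \quad & \mathbf{d}_n^T \mathbf{v}_n = 1 \\
& \mathbf{d}_n^H \mathbf{A} \mathbf{d}_n \le \frac{1}{WNG_{min}},
\end{aligned}
\tag{6.71}
\]

where

\[
\begin{aligned}
\mathbf{A} &= \frac{4\pi}{Q} \, \mathrm{diag}(\mathbf{v}_n) \times \mathrm{diag}\left( |b_0|^{-2}, |b_1|^{-2}, \ldots, |b_N|^{-2} \right) \\
\mathbf{B} &= \frac{1}{4\pi} \, \mathrm{diag}(\mathbf{v}_n) \\
\mathbf{v}_n &= \frac{1}{4\pi} \, [1, 3, 5, \ldots, 2N+1]^T.
\end{aligned}
\tag{6.72}
\]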

In the next example, a constraint on the maximum side-lobe level of the array
beam pattern is introduced. An array beam pattern, as in Eq. (5.12), is presented
here, explicitly denoting the plane-wave arrival direction by $(\theta_k, \phi_k)$:

\[
y(\theta_k, \phi_k) = \mathbf{w}_{nm}^H \mathbf{v}_{nm}(\theta_k, \phi_k), \tag{6.73}
\]

with

\[
\begin{aligned}
\mathbf{v}_{nm}(\theta_k, \phi_k) &= \left[ v_{00}(\theta_k, \phi_k), v_{1(-1)}(\theta_k, \phi_k), v_{10}(\theta_k, \phi_k), v_{11}(\theta_k, \phi_k), \ldots, v_{NN}(\theta_k, \phi_k) \right]^T \\
v_{nm}(\theta_k, \phi_k) &= b_n(kr) \left[ Y_n^m(\theta_k, \phi_k) \right]^*,
\end{aligned}
\tag{6.74}
\]

similar to the expressions in Eqs. (5.13) and (5.16). Now, as in [13], the entire
directional region is divided into one region denoting the main-lobe directions and a
second region denoting the side-lobe directions. The side-lobe directional region is
denoted by $\Omega_{SL}$, such that the arrival directions within this region satisfy

\[
(\theta_k, \phi_k) \in \Omega_{SL}. \tag{6.75}
\]

Now, the requirement that the magnitude of the side lobes is not larger than a limit
denoted by $l_{SL}$ can be formulated as a constraint on the maximum side-lobe level:

\[
|y(\theta_k, \phi_k)| \le l_{SL}, \qquad (\theta_k, \phi_k) \in \Omega_{SL}. \tag{6.76}
\]

The incorporation of this constraint into a beamforming optimization problem is
facilitated by sampling the region $\Omega_{SL}$, as suggested in [13]. Assuming $\Omega_{SL}$ is sampled
at I discrete directions, the constraint of maximum side-lobe level can be written in
a discrete form as

\[
|y(\theta_i, \phi_i)| \le l_{SL}, \qquad (\theta_i, \phi_i) \in \Omega_{SL}, \qquad i = 1, \ldots, I. \tag{6.77}
\]

It is important to note that the discrete formulation is not equal to the continuous
formulation, because maintaining the constraint is not guaranteed at directions other
than the selected set. However, assuming the beam pattern is order-limited in the
spherical harmonics domain, it cannot facilitate rapid changes along (θ, φ), so that
dense sampling of ΩSL will tend to reduce the error (due to sampling) in maintaining
the constraint [13].
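Equation (6.73) is substituted into Eq. (6.77), forming a discrete formulation of
the side-lobe level constraint that can be integrated into the QCQP optimization. One
possibility is to simply add a side-lobe level constraint such that Eq. (6.67) is written
as

\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mathbf{w}_{nm}^H \mathbf{B} \mathbf{w}_{nm} \\
\text{subject to} \quad & \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1 \\
& \mathbf{w}_{nm}^H \mathbf{A} \mathbf{w}_{nm} \le \frac{1}{WNG_{min}} \\
& \mathbf{w}_{nm}^H \mathbf{B}_i \mathbf{w}_{nm} \le l_{SL}^2, \qquad i = 1, \ldots, I,
\end{aligned}
\tag{6.78}
\]

where

\[
\begin{aligned}
\mathbf{A} &= \mathbf{S}\mathbf{S}^H \\
\mathbf{B} &= \frac{1}{4\pi} \, \mathrm{diag}\left( |b_0|^2, |b_1|^2, |b_1|^2, |b_1|^2, \ldots, |b_N|^2 \right) \\
\mathbf{B}_i &= \mathbf{v}_{nm}(\theta_i, \phi_i) \, \mathbf{v}_{nm}^H(\theta_i, \phi_i), \qquad i = 1, \ldots, I.
\end{aligned}
\tag{6.79}
\]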

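In a similar manner, this formulation can be written as a SOCP optimization
problem:

\[
\begin{aligned}
\underset{\mathbf{w}_{nm}}{\text{minimize}} \quad & \mu \\
\text{subject to} \quad & \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1 \\
& \left\| \mathbf{w}_{nm}^H \mathbf{B}^{\frac{1}{2}} \right\| \le \mu \\
& \left\| \mathbf{w}_{nm}^H \mathbf{S} \right\| \le \frac{1}{\sqrt{WNG_{min}}} \\
& \left| \mathbf{w}_{nm}^H \mathbf{v}_{nm}(\theta_i, \phi_i) \right| \le l_{SL}, \qquad i = 1, \ldots, I.
\end{aligned}
\tag{6.80}
\]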

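A simpler formulation is also available in this case for an axis-symmetric
beamformer with uniform or nearly-uniform sampling:

\[
\begin{aligned}
\underset{\mathbf{d}_n}{\text{minimize}} \quad & \mathbf{d}_n^H \mathbf{B} \mathbf{d}_n \\
\text{subject to} \quad & \mathbf{d}_n^T \mathbf{v}_n = 1 \\
& \mathbf{d}_n^H \mathbf{A} \mathbf{d}_n \le \frac{1}{WNG_{min}} \\
& \mathbf{d}_n^H \mathbf{B}_i \mathbf{d}_n \le l_{SL}^2, \qquad i = 1, \ldots, I,
\end{aligned}
\tag{6.81}
\]

where

\[
\begin{aligned}
\mathbf{A} &= \frac{4\pi}{Q} \, \mathrm{diag}(\mathbf{v}_n) \times \mathrm{diag}\left( |b_0|^{-2}, |b_1|^{-2}, \ldots, |b_N|^{-2} \right) \\
\mathbf{v}_n &= \frac{1}{4\pi} \, [1, 3, 5, \ldots, 2N+1]^T \\
\mathbf{B} &= \frac{1}{4\pi} \, \mathrm{diag}(\mathbf{v}_n) \\
\mathbf{B}_i &= \mathbf{v}_n(\Theta_i) \, \mathbf{v}_n^H(\Theta_i) \\
\mathbf{v}_n(\Theta_i) &= \frac{1}{4\pi} \left[ P_0(\cos\Theta_i), 3P_1(\cos\Theta_i), \ldots, (2N+1)P_N(\cos\Theta_i) \right]^T,
\end{aligned}
\tag{6.82}
\]

with $\Theta_i$ denoting the angle between the array look direction and $(\theta_i, \phi_i)$.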
Design examples using the multiple-objective method are presented next.
Consider a spherical microphone array with 36 microphones nearly-uniformly distributed
on the surface of a rigid sphere. The array order is N = 4, operating at kr = 2.
Axis-symmetric beamformers are designed for this array. Table 6.2 shows that the
maximum directivity beamformer achieved DF = 25 with WNG = 1.6, while the
maximum WNG beamformer achieved WNG = 51.5 with DF = 9.35.
The optimization problems in Eqs. (6.71) and (6.81) are used in the design of
two beamformers. In both designs, a WNG constraint of $WNG_{min} = 10$ is desired,
and a distortionless-response constraint is also introduced. In the first design, using
Eq. (6.71), the directivity factor is maximized while maintaining the two constraints.
In the second design, using Eq. (6.81), an additional constraint of side-lobe level of
−30 dB, or $l_{SL}^2 = 0.001$, is introduced, within the side-lobe range of θ ∈ [60°, 180°].
The side-lobe range was sampled by I = 50 uniformly distributed samples, each
defining an individual constraint.

Fig. 6.9 The magnitude of the beam pattern, |y(Θ)|, for an axis-symmetric spherical array designed
to maximize the directivity factor, while maintaining a constraint of $WNG_{min} = 10$. The design
achieves DF = 19.5, while maintaining exactly the WNG constraint $WNG_{min} = 10$ and achieving
a maximum side-lobe level of −18.4 dB

Figure 6.9 shows the magnitude of the beam pattern for the first design. Both the
WNG and the distortionless-response constraints are maintained. Due to the WNG
constraint, the directivity factor achieved (DF = 19.5) is smaller than the maximum
achievable of DF = 25. A maximum side-lobe level of −18.4 dB is achieved for this
design.
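
For the first design, a general-purpose convex solver is not strictly needed. Because A and B in Eq. (6.72) are diagonal, the optimality conditions of Eq. (6.71) give $\mathbf{d}_n \propto (\mathbf{B} + \mu \mathbf{A})^{-1} \mathbf{v}_n$ with a multiplier μ ≥ 0, which has the same form as the mixed-objective solution of Sect. 6.4. The sketch below is an illustration under that observation and is not the numerical method used in the text: it finds μ by bisection so that the WNG constraint is met with equality. The rigid-sphere expression for $b_n(kr)$ and the function names are assumptions. The side-lobe-constrained problem of Eq. (6.81) does not reduce in this way and still requires a QCQP or SOCP solver.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn


def rigid_bn_mag2(N, kr):
    """|b_n(kr)|^2, n = 0..N, for an assumed rigid-sphere plane-wave model."""
    n = np.arange(N + 1)
    jn = spherical_jn(n, kr)
    djn = spherical_jn(n, kr, derivative=True)
    hn = jn + 1j * spherical_yn(n, kr)
    dhn = djn + 1j * spherical_yn(n, kr, derivative=True)
    return np.abs(4 * np.pi * (jn - djn / dhn * hn)) ** 2


def max_df_with_wng_floor(N, Q, kr, wng_min, iters=80):
    """Solve Eq. (6.71) for diagonal A, B; returns (d, DF, WNG)."""
    n = np.arange(N + 1)
    v = (2 * n + 1) / (4 * np.pi)                   # steering vector
    a = (2 * n + 1) / (Q * rigid_bn_mag2(N, kr))    # diagonal of A, Eq. (6.72)
    b = v / (4 * np.pi)                             # diagonal of B, Eq. (6.72)
    if wng_min > np.sum(v ** 2 / a):                # above the maximum-WNG design
        raise ValueError("WNG constraint cannot be satisfied")

    def design(mu):
        d = v / (b + mu * a)
        d = d / np.sum(d * v)                       # distortionless response
        return d, 1.0 / np.sum(d ** 2 * b), 1.0 / np.sum(d ** 2 * a)

    if design(0.0)[2] >= wng_min:                   # constraint inactive
        return design(0.0)
    lo, hi = 0.0, 1.0
    while design(hi)[2] < wng_min:                  # bracket the multiplier
        hi *= 2.0
    for _ in range(iters):                          # bisection on mu
        mid = 0.5 * (lo + hi)
        if design(mid)[2] < wng_min:
            lo = mid
        else:
            hi = mid
    return design(hi)


if __name__ == "__main__":
    d, df, wng = max_df_with_wng_floor(4, 36, 2.0, 10.0)
    print(round(df, 2), round(wng, 2))
```

For the array considered here (rigid sphere, N = 4, Q = 36, kr = 2, $WNG_{min} = 10$), this sketch yields a directivity factor close to the DF = 19.5 reported for the first design, with the WNG constraint active.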
The aim of the second design is to reduce the maximum side-lobe level, while
maintaining the same WNG constraint and maximizing the directivity factor. A
maximum side-lobe level constraint of −30 dB is introduced using the formulation in
Eq. (6.81) with $l_{SL}^2 = 0.001$.
Figure 6.10 shows the magnitude of the beam pattern for the second design. The
WNG constraint is maintained with WNG = 10 and the maximum side-lobe level
constraint is maintained at −30 dB. Due to the introduction of the side-lobe level
constraint, the directivity factor is further reduced to DF = 18.2.

Fig. 6.10 The magnitude of the beam pattern, |y(Θ)|, for an axis-symmetric spherical array
designed to maximize the directivity factor, while maintaining a constraint of $WNG_{min} = 10$ and
maximum side-lobe level of −30 dB. The design achieves DF = 18.2, while maintaining exactly
the WNG constraint and the side-lobe level constraints

These design examples demonstrated the flexibility of the multiple-objective
approach with a numerical optimization solution, providing beamformer design with
a high level of detail in the specification of performance.

References

1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic Press, San
Diego (2001)
2. Born, M., Wolf, E.: Principles of Optics: Electromagnetic Theory of Propagation, Interference
and Diffraction of Light, 7th edn. Cambridge University Press, Cambridge (1999)
3. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge
(2004)
4. Elko, G.W.: Differential microphone arrays. In: Huang, Y., Benesty, J. (eds.) Audio Signal Pro-
cessing for Next-Generation Multimedia Communication Systems, pp. 11–89. Kluwer Aca-
demic Publishers, Boston (2004)
5. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University
Press, Baltimore (1996)
6. Huang, Y., Benesty, J. (eds.): Audio Signal Processing for Multimedia Communication Sys-
tems. Kluwer Academic Publishers, Boston (2004)
7. Koretz, A., Rafaely, B.: Dolph-Chebyshev beampattern design for spherical arrays. IEEE Trans.
Signal Process. 57(6), 2417–2420 (2009)
8. Li, Z., Duraiswami, R.: Flexible and optimal design of spherical microphone arrays for beam-
forming. IEEE Trans. Audio Speech Lang. Process. 15(2), 702–714 (2007)
9. Osnaga, S.M.: On rank one matrices and invariant subspaces. Balk. J. Geom. Appl. 10(1),
145–148 (2005)
10. Peled, Y., Rafaely, B.: Objective performance analysis of spherical microphone arrays for
speech enhancement in rooms. J. Acoust. Soc. Am. 132(3), 1473–1481 (2012)
11. Rafaely, B.: Plane-wave decomposition of the pressure on a sphere by spherical convolution.
J. Acoust. Soc. Am. 116(4), 2149–2157 (2004)
12. Rafaely, B.: Phase-mode versus delay-and-sum spherical microphone array processing. IEEE
Signal Process. Lett. 12(10), 713–716 (2005)
13. Sun, H., Yan, S., Svensson, U.P.: Robust minimum sidelobe beamforming for spherical micro-
phone arrays. IEEE Trans. Speech Audio Process. 19(4), 1045–1051 (2011)
14. Van Trees, H.L.: Optimum Array Processing (Detection, Estimation, and Modulation Theory,
Part IV), 1st edn. Wiley, New York (2002)
15. Yan, S., Sun, H., Svensson, U.P., Xiaochuan, M., Hovem, J.M.: Optimal modal beamforming
for spherical microphone arrays. IEEE Trans. Speech Audio Process. 19(2), 361–371 (2011)
Chapter 7
Beamforming with Noise Minimization

Abstract Optimal beamformer design, as presented in Chap. 6, may be very useful,
but does not take into account the properties of the specific sound field producing the
signals at the microphones. In this chapter, beamforming in which the beam pattern
is tailored to the actual sound field is presented. This beamforming distinguishes
between the desired signal and the noise and, therefore, potentially achieves improved
performance in real, noisy sound fields. The measured sound field is characterized
by spatial cross-spectrum matrices, typically divided into matrices representing the
desired signals and matrices representing the unwanted noise. Therefore, the first
part of this chapter extends the array equations, in both the space and the spherical
harmonics domains (as presented in Chap. 5) to include noise. In particular, explicit
expressions are developed for designs that consider noise fields that are spatially
white and noise fields that are acoustically diffuse. The second part of the chapter
employs the new models in the development of popular beamformers, such as the
minimum variance distortionless response (MVDR) and the linearly constrained
minimum variance (LCMV). These beamformers are developed for spherical arrays
with explicit formulations in the spherical harmonics domain, emphasizing their
advantages when formulated in this domain. The chapter concludes with design
examples to illustrate the performance of the beamformers under various conditions.

7.1 Beamforming Equations Including Noise

Array equations in the space domain were developed in Sect. 5.1. Typical equations
for array processing also include the effect of noise and disturbing sources [6] and
so, in this section, the equations developed in Sect. 5.1 are extended to include noise.
The sound pressure at the microphones, denoted by p, is now replaced by x, which
includes noise:
$$\mathbf{x} = \mathbf{p} + \mathbf{n}, \tag{7.1}$$

where, similar to Eq. (5.2),

$$\mathbf{p} = \left[p_1(k), p_2(k), \ldots, p_Q(k)\right]^T \tag{7.2}$$

© Springer Nature Switzerland AG 2019 157


B. Rafaely, Fundamentals of Spherical Array Processing, Springer Topics
in Signal Processing 16, https://doi.org/10.1007/978-3-319-99561-8_7
represents the sound pressure at the $Q$ sensors due to the desired sources and, similarly,
$$\mathbf{n} = \left[n_1(k), n_2(k), \ldots, n_Q(k)\right]^T \tag{7.3}$$

represents the noise at the sensors. The array output can now be formulated by
applying array coefficients to the array input:

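$$y = \mathbf{w}^H \mathbf{x}. \tag{7.4}$$

The variance of the array output (assuming zero mean, see discussion below) can
now be computed as follows:
$$E\left[|y|^2\right] = E\left[\mathbf{w}^H \mathbf{x}\mathbf{x}^H \mathbf{w}\right] = \mathbf{w}^H \mathbf{S}_{xx} \mathbf{w}, \tag{7.5}$$

where
$$\mathbf{S}_{xx} = E\left[\mathbf{x}\mathbf{x}^H\right] \tag{7.6}$$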

is the spatial spectral matrix of the array input, in which each element represents
the cross-spectral density at wave number k between the signals at two sensors.
Substituting Eq. (7.1) into Eq. (7.6), the spatial cross-spectral density matrix of the
array input can be written as

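$$\mathbf{S}_{xx} = \mathbf{S}_{pp} + \mathbf{S}_{nn} + \mathbf{S}_{pn} + \mathbf{S}_{np}, \tag{7.7}$$

with
$$\mathbf{S}_{pp} = E\left[\mathbf{p}\mathbf{p}^H\right] \tag{7.8}$$

and
$$\mathbf{S}_{nn} = E\left[\mathbf{n}\mathbf{n}^H\right], \tag{7.9}$$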

representing the spatial cross-spectrum matrices due to the desired pressure signal
and the noise signal, respectively, and Spn , Snp , representing the cross-spectrum
matrices between the signal and the noise. It is common to assume that the desired
pressure signal and the noise signal are independent, as they typically originate
from different, independent sources. Furthermore, in most applications of acoustics
no useful information is contained in the constant component of the time-domain
signals and so it can be removed in practice, if different from zero. The zero mean
in the time domain is transformed into a zero mean in the frequency domain, such
that $E[\mathbf{p}] = E[\mathbf{n}] = \mathbf{0}$, where $\mathbf{0}$ is the zero vector. Therefore, independence between
the desired pressure signal and the noise signal, i.e. $E[\mathbf{p}\mathbf{n}^H] = E[\mathbf{p}] \cdot E[\mathbf{n}]^H$, leads
to a zero cross-spectral density between the desired pressure and the noise signals,
$\mathbf{S}_{pn} = \mathbf{S}_{np} = \mathbf{0}$, in this case, and Eq. (7.7) is rewritten as

$$\mathbf{S}_{xx} = \mathbf{S}_{pp} + \mathbf{S}_{nn}, \tag{7.10}$$


which is a standard result in array processing [6].
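The decomposition in Eq. (7.10) can be checked numerically. The following is a minimal sketch, assuming NumPy, a hypothetical random steering vector `v`, and circular complex Gaussian signals; the sizes `Q` and `T` and the helper names `crandn` and `csm` are illustrative choices and are not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, T = 4, 100_000   # number of sensors and of snapshots (illustrative values)

def crandn(*shape):
    """Zero-mean, unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def csm(a, b):
    """Sample estimate of E[a b^H], averaged over the snapshot axis."""
    return a @ b.conj().T / a.shape[1]

v = np.exp(1j * rng.uniform(0, 2 * np.pi, Q))   # hypothetical steering vector
s = crandn(T)                                    # desired source signal, zero mean
p = np.outer(v, s)                               # desired pressure at the sensors
n = np.sqrt(0.1) * crandn(Q, T)                  # independent zero-mean sensor noise
x = p + n                                        # Eq. (7.1)

S_xx, S_pp, S_nn = csm(x, x), csm(p, p), csm(n, n)
S_pn = csm(p, n)                                 # cross term, expected to vanish

# Eq. (7.7) holds exactly for the sample estimates; Eq. (7.10) holds approximately.
residual = np.linalg.norm(S_xx - S_pp - S_nn)
```

With independent, zero-mean signals the cross term `S_pn` shrinks as the number of snapshots grows, so `residual` becomes small relative to the entries of `S_nn`.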


A common cause for noise at the sensors is the so-called sensor noise, which
typically refers to electrical noise due to the amplifiers connected to the transducers,
e.g. the microphones. Assuming all sensors in the array are identical, the noise signals
can be assumed to be independent and identically distributed (i.i.d.). Together with
the zero-mean assumption argued above, the spatial cross-spectrum matrix of the noise
becomes
$$\mathbf{S}_{nn} = \sigma_n^2 \mathbf{I}, \tag{7.11}$$

where $\mathbf{I}$ is a $Q \times Q$ unit matrix and $\sigma_n^2$ is the variance of the sensor noise.
Another common noise model is due to acoustic noise in the form of a diffuse
sound field. This may represent an environment with a large number of sources
distributed in all directions, e.g. a hall occupied by many speakers, or a highly rever-
berant environment in which the sound field due to late reflections tends to be diffuse
[4]. A diffuse sound field is composed of an infinite number of plane waves having
amplitudes with equal magnitude and random phases, arriving from all directions
with equal distribution. Denoting the sound pressure at the qth microphone due to
the diffuse sound field by n q , the array input can be written in a manner identical
to Eq. (7.1). The spatial cross-spectrum matrix of the noise term in this case, Snn , is
composed of the spatial cross-spectrum between microphone pairs. When the microphones
are positioned in an open-sphere configuration, the spatial correlation is given
by [3]
$$E\left[n_q n_{q'}^*\right] = \sigma_n^2 \, \mathrm{sinc}(k r_{qq'}), \tag{7.12}$$

where $r_{qq'}$ is the distance between microphone $q$ and microphone $q'$; the expectation
operation in this case represents averaging over different realizations of diffuse
fields. For example, consider equal-angle sampling with $4(N + 1)^2$ microphones; the
distance between adjacent microphones on the equator is approximated by dividing
the circumference 2πr by the number of microphones on the equator, 2(N + 1). At
the highest operating frequency of the array, satisfying $kr \approx N$, the distance $r_{qq'}$ between
adjacent microphones satisfies

$$k r_{qq'} \approx \frac{k 2\pi r}{2(N + 1)} \approx \pi \frac{N}{N + 1} \approx \pi. \tag{7.13}$$

Because $\mathrm{sinc}(\pi) = 0$ is the first zero of the sinc function, the correlation between
adjacent microphones is near zero in this case. For microphone pairs that are not
adjacent the distance is larger and the sinc function oscillates, while converging to
zero for $k r_{qq'} \gg \pi$. In this case, matrix $\mathbf{S}_{nn}$ will have larger terms on the diagonal
[where $\mathrm{sinc}(0) = 1$] and terms with a generally decreasing magnitude off the diagonal.
At low frequencies, $k r_{qq'} \ll \pi$ and $\mathrm{sinc}(k r_{qq'}) \to 1$. In this case, matrix $\mathbf{S}_{nn}$ will
have all its elements close to $\sigma_n^2$ and will be close to rank one. It is therefore clear that
the characteristics of matrix $\mathbf{S}_{nn}$, for the case of acoustic noise in the form of a diffuse
field, may change considerably as a function of the operating frequency.
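The frequency dependence of the diffuse-field matrix in Eq. (7.12) can be illustrated with a short numerical sketch. It assumes NumPy, an open sphere of unit radius (so that $k$ equals $kr$), and a Fibonacci-spiral microphone layout, which is only a stand-in for a nearly-uniform sampling scheme and is not a layout taken from the text; the function names are hypothetical.

```python
import numpy as np

def fibonacci_sphere(Q, r=1.0):
    """Q roughly evenly spread points on a sphere of radius r (illustrative layout)."""
    i = np.arange(Q) + 0.5
    theta = np.arccos(1.0 - 2.0 * i / Q)      # elevation measured from the z-axis
    phi = np.pi * (1.0 + 5.0 ** 0.5) * i      # golden-angle azimuth increments
    return r * np.stack([np.sin(theta) * np.cos(phi),
                         np.sin(theta) * np.sin(phi),
                         np.cos(theta)], axis=1)

def diffuse_csm(pos, k, sigma2=1.0):
    """Open-sphere diffuse-field matrix of Eq. (7.12): sigma2 * sinc(k * r_qq')."""
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    # np.sinc(x) computes sin(pi x) / (pi x), so the argument is divided by pi
    return sigma2 * np.sinc(k * d / np.pi)

def dominant_fraction(S):
    """Share of the trace carried by the largest eigenvalue (1 for a rank-one matrix)."""
    ev = np.linalg.eigvalsh(S)
    return ev[-1] / ev.sum()

pos = fibonacci_sphere(36)             # unit radius, Q = 36
S_low = diffuse_csm(pos, k=0.05)       # low frequency: k * r_qq' << pi for all pairs
S_high = diffuse_csm(pos, k=4.0)       # kr = N = 4
```

Comparing `dominant_fraction(S_low)` with `dominant_fraction(S_high)` shows the near rank-one behavior at low frequencies and the more diagonally dominant structure at the upper operating frequency.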
For an array configured around a rigid sphere, the effect of scattering from the rigid
sphere is to slightly reduce the correlation values, so that the correlation function is
sinc-like and slightly compressed along the axis of the argument kr [2].
Array equations in the spherical harmonics domain were presented in Sect. 5.1.
In this section these equations are extended to include the effect of noise. The array
input can therefore be rewritten to include both sound pressure and noise:

$$\mathbf{x}_{nm} = \mathbf{p}_{nm} + \mathbf{n}_{nm}, \tag{7.14}$$

where
$$\mathbf{p}_{nm} = \left[p_{00}(k), p_{1(-1)}(k), p_{10}(k), p_{11}(k), \ldots, p_{NN}(k)\right]^T \tag{7.15}$$

represents the $(N + 1)^2 \times 1$ vector of spherical harmonic coefficients of the sound
pressure due to the desired sources and, similarly,
$$\mathbf{n}_{nm} = \left[n_{00}(k), n_{1(-1)}(k), n_{10}(k), n_{11}(k), \ldots, n_{NN}(k)\right]^T \tag{7.16}$$

represents the $(N + 1)^2 \times 1$ vector of spherical harmonic coefficients of the noise.
The array output can be written, similarly to Eq. (5.9):

$$y = \mathbf{w}_{nm}^H \mathbf{x}_{nm}, \tag{7.17}$$

with $\mathbf{w}_{nm}$ representing the $(N + 1)^2 \times 1$ vector of array coefficients in the spherical
harmonics domain, defined as in Eq. (5.10):
$$\mathbf{w}_{nm} = \left[w_{00}(k), w_{1(-1)}(k), w_{10}(k), w_{11}(k), \ldots, w_{NN}(k)\right]^T. \tag{7.18}$$

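Similar to the space domain formulation, Eq. (7.5), the variance of the array output
can be formulated as
$$E\left[|y|^2\right] = E\left[\mathbf{w}_{nm}^H \mathbf{x}_{nm}\mathbf{x}_{nm}^H \mathbf{w}_{nm}\right] = \mathbf{w}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} \mathbf{w}_{nm}, \tag{7.19}$$

where
$$\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} = E\left[\mathbf{x}_{nm}\mathbf{x}_{nm}^H\right] \tag{7.20}$$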

is the spherical harmonics formulation of the cross-spectrum matrix of the array
input. In this formulation each element in the matrix represents the cross-spectral
density at wave number $k$ between the signals at two spherical harmonic coefficients.
Following the arguments leading to Eq. (7.10), assuming the desired and noise signals
are independent and of zero mean, the cross-spectrum matrix can be represented as

$$\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} = \mathbf{S}_{\mathbf{p}_{nm}\mathbf{p}_{nm}} + \mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}, \tag{7.21}$$


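with
$$\mathbf{S}_{\mathbf{p}_{nm}\mathbf{p}_{nm}} = E\left[\mathbf{p}_{nm}\mathbf{p}_{nm}^H\right] \tag{7.22}$$

and
$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = E\left[\mathbf{n}_{nm}\mathbf{n}_{nm}^H\right], \tag{7.23}$$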

representing the cross-spectrum matrices due to the desired pressure signal and the
noise signal, respectively, in the spherical harmonics domain.
When the noise at the array input is due to sensor noise and using the discrete
formulation of the spherical Fourier transform, as in Eq. (3.40), sensor noise in the
spherical harmonics domain can be written as

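$$\mathbf{n}_{nm} = \mathbf{S}\mathbf{n}, \tag{7.24}$$

with matrix $\mathbf{S}$ dependent on the sampling scheme (see Sect. 3.6). This leads to

$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \mathbf{S}\mathbf{S}_{nn}\mathbf{S}^H = \sigma_n^2 \mathbf{S}\mathbf{S}^H, \tag{7.25}$$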

where it has been assumed that the noise is independent and identically distributed
such that Eq. (7.11) holds. In this case the spatial cross-spectrum matrix of the noise
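depends on the sampling scheme. In the special case of uniform or nearly-uniform
sampling, $\mathbf{S} = \frac{4\pi}{Q}\mathbf{Y}^H$ [see Eqs. (3.43) and (3.39)] such that

$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^2 \frac{4\pi}{Q}\mathbf{Y}^H \mathbf{Y} \frac{4\pi}{Q} = \sigma_n^2 \frac{4\pi}{Q}\mathbf{I}. \tag{7.26}$$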

In this case the cross-spectrum matrix is proportional to a unit matrix, similar to the
space domain formulation.
In the case where the noise originates from a diffuse sound field, $n_{nm}(k)$ can be
represented in a manner similar to Eq. (2.63) as

$$n_{nm}(k) = b_n(kr)\, a_{nm}(k) = b_n(kr) \int_0^{2\pi}\!\!\int_0^{\pi} a(k, \theta_k, \phi_k) \left[Y_n^m(\theta_k, \phi_k)\right]^* \sin\theta_k \, d\theta_k \, d\phi_k. \tag{7.27}$$

In this case, the integral represents a continuum of plane waves, or an infinite number
of plane waves, in which case $a(k, \theta_k, \phi_k)$ is the plane-wave amplitude density
function. For a diffuse sound field it is assumed that $a(k, \theta_k, \phi_k)$ has unit or equal
magnitude in all directions, with random phases, which defines a white noise process
along $(\theta_k, \phi_k)$ satisfying
$$E\left[a(k, \theta_k, \phi_k)\, a(k, \theta_k', \phi_k')^*\right] = \sigma_n^2\, \delta(\cos\theta_k - \cos\theta_k')\, \delta(\phi_k - \phi_k'). \tag{7.28}$$
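Now, $E[n_{nm} n_{n'm'}^*]$ can be derived using Eqs. (7.27) and (7.28) and the orthogonality
property of the spherical harmonics:

$$\begin{aligned}
E\left[n_{nm} n_{n'm'}^*\right] &= b_n(kr) \left[b_{n'}(kr)\right]^* \int_0^{2\pi}\!\!\int_0^{\pi}\!\!\int_0^{2\pi}\!\!\int_0^{\pi} E\left[a(k, \theta_k, \phi_k)\left[a(k, \theta_k', \phi_k')\right]^*\right] \\
&\quad \times \left[Y_n^m(\theta_k, \phi_k)\right]^* Y_{n'}^{m'}(\theta_k', \phi_k') \sin\theta_k \, d\theta_k \, d\phi_k \sin\theta_k' \, d\theta_k' \, d\phi_k' \\
&= b_n(kr) \left[b_{n'}(kr)\right]^* \sigma_n^2 \int_0^{2\pi}\!\!\int_0^{\pi} \left[Y_n^m(\theta_k, \phi_k)\right]^* Y_{n'}^{m'}(\theta_k, \phi_k) \sin\theta_k \, d\theta_k \, d\phi_k \\
&= \sigma_n^2 \left|b_n(kr)\right|^2 \delta_{nn'} \delta_{mm'}.
\end{aligned} \tag{7.29}$$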

This result shows that the noise due to a diffuse sound field is uncorrelated in the
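spherical harmonics domain [7]. Written in matrix form, this is
$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = E\left[\mathbf{n}_{nm}\mathbf{n}_{nm}^H\right] = \sigma_n^2 \mathbf{B}^H \mathbf{B}, \tag{7.30}$$

with $\mathbf{B}$ an $(N + 1)^2 \times (N + 1)^2$ diagonal matrix defined by

$$\mathbf{B} = \mathrm{diag}\left(b_0, b_1, b_1, b_1, \ldots, b_N\right). \tag{7.31}$$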

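The array equation, Eq. (7.14), can also be rewritten with multiplication by the
inverse of matrix $\mathbf{B}$:

$$\tilde{\mathbf{x}}_{nm} = \mathbf{B}^{-1}\mathbf{p}_{nm} + \mathbf{B}^{-1}\mathbf{n}_{nm} = \mathbf{a}_{nm} + \mathbf{B}^{-1}\mathbf{n}_{nm}. \tag{7.32}$$

In this form, the desired signal is $\mathbf{a}_{nm}$, which is the plane-wave amplitude density
function in the spherical harmonics domain, satisfying $a_{nm} = p_{nm}/b_n$ [see Eq. (2.63)].
The cross-spectrum matrix of the array input has a simple form in this case:

$$\mathbf{S}_{\tilde{\mathbf{x}}_{nm}\tilde{\mathbf{x}}_{nm}} = \mathbf{S}_{\mathbf{a}_{nm}\mathbf{a}_{nm}} + \sigma_n^2 \mathbf{I}. \tag{7.33}$$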

This form is particularly useful, because in the case of a diffuse sound field the noise
term is a scaled unit matrix, or spatially white.
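
The step from Eq. (7.30) to the noise term in Eq. (7.33) can be sketched as follows, assuming that $b_n(kr) \neq 0$ for all orders up to $N$, so that $\mathbf{B}$ is invertible, and using the fact that diagonal matrices commute. The symbol $\tilde{\mathbf{n}}_{nm} = \mathbf{B}^{-1}\mathbf{n}_{nm}$ is introduced here only for illustration:

$$\mathbf{S}_{\tilde{\mathbf{n}}_{nm}\tilde{\mathbf{n}}_{nm}} = \mathbf{B}^{-1}\, \mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} \left(\mathbf{B}^{-1}\right)^H = \sigma_n^2\, \mathbf{B}^{-1} \mathbf{B}^H \mathbf{B} \left(\mathbf{B}^H\right)^{-1} = \sigma_n^2 \mathbf{I}.$$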

7.2 Minimum Variance Distortionless Response

Optimal beamformers have been discussed in Chap. 6. In particular, Sect. 6.1 pre-
sented beam patterns that are optimal in attenuating noise due to a diffuse sound
field. However, when the noise field is not perfectly diffuse, this maximum directivity
beam pattern is no longer optimal. In this case an optimal beam pattern, tailored
to the actual measured noise, can be designed. One such design is the minimum
variance distortionless response (MVDR), where the beam pattern is constrained to
be unity in the look direction, while minimizing the variance of the array output.
This beamformer is particularly useful when the desired signal is a plane wave arriv-
ing from the array look direction, with all other contributions to the array output
considered as noise and, therefore, to be minimized.
Consider a desired signal s(k), originating from a distant source at direction
(θk , φk ); the source generates a plane wave at the array position, with a steering
vector v denoting the transfer function from the source s(k) to the array input. The
array also measures noise, such that array input can be written in a manner similar
to Eq. (7.1):

$$\begin{aligned}
\mathbf{x} &= \mathbf{p} + \mathbf{n} \\
&= \mathbf{v}s + \mathbf{n},
\end{aligned} \tag{7.34}$$

where the dependence of s(k) on k has been removed for simplicity and with p and n
denoting the desired pressure signal and noise at the sensors, respectively. Applying
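beamforming, as in Eq. (7.4), the variance of the signal at the array output is given
by
$$\begin{aligned}
E\left[|y|^2\right] &= \mathbf{w}^H \mathbf{S}_{xx} \mathbf{w} \\
&= \mathbf{w}^H \mathbf{S}_{pp} \mathbf{w} + \mathbf{w}^H \mathbf{S}_{nn} \mathbf{w} \\
&= \left|\mathbf{w}^H \mathbf{v}\right|^2 E\left[|s|^2\right] + \mathbf{w}^H \mathbf{S}_{nn} \mathbf{w}.
\end{aligned} \tag{7.35}$$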

The following design objective is now considered:

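$$\begin{aligned}
&\underset{\mathbf{w}}{\text{minimize}} && \mathbf{w}^H \mathbf{S}_{xx} \mathbf{w} \\
&\text{subject to} && \mathbf{w}^H \mathbf{v} = 1.
\end{aligned} \tag{7.36}$$

It is clear that due to the distortionless-response constraint, $\mathbf{w}^H \mathbf{v} = 1$, the contribution
of the desired signal to the output variance is fixed at $E[|s|^2]$ [see Eq. (7.35)], so that
minimization of $\mathbf{w}^H \mathbf{S}_{xx} \mathbf{w}$ leads to a
minimization of $\mathbf{w}^H \mathbf{S}_{nn} \mathbf{w}$, i.e. the variance of the noise at the array output. The result
of this optimization is therefore to deliver the desired signal unchanged to the array
output, while minimizing the noise contribution. The optimization in Eq. (7.36) is
similar to that in Eq. (6.1), and so the solution can be written in a similar manner:

$$\mathbf{w}^H = \frac{\mathbf{v}^H \mathbf{S}_{xx}^{-1}}{\mathbf{v}^H \mathbf{S}_{xx}^{-1} \mathbf{v}}. \tag{7.37}$$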

The optimal solution requires the inversion of $\mathbf{S}_{xx}$, so that this matrix has to be of
full rank. With a desired signal composed of a single plane wave, $\mathbf{S}_{pp}$ has unit rank,
and so the inversion of $\mathbf{S}_{xx}$ requires $\mathbf{S}_{nn}$ to be of full rank or nearly full rank. It is
important to note that the beamformer described here is sometimes referred to as the
minimum power distortionless response (MPDR) beamformer, in which case the term
MVDR is reserved for the beamformer of Eq. (7.37) with $\mathbf{S}_{xx}^{-1}$ replaced by $\mathbf{S}_{nn}^{-1}$:

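$$\begin{aligned}
&\underset{\mathbf{w}}{\text{minimize}} && \mathbf{w}^H \mathbf{S}_{nn} \mathbf{w} \\
&\text{subject to} && \mathbf{w}^H \mathbf{v} = 1,
\end{aligned} \tag{7.38}$$

with a solution
$$\mathbf{w}^H = \frac{\mathbf{v}^H \mathbf{S}_{nn}^{-1}}{\mathbf{v}^H \mathbf{S}_{nn}^{-1} \mathbf{v}}. \tag{7.39}$$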

In the context of this section, with a single plane-wave sound field for the desired
signal and a distortionless response in the same direction, the two forms are equiv-
alent. However, when the desired signal has additional components, for example
due to a reflection from a wall in a room, minimization of Sxx may lead to sig-
nal cancellation, i.e. the reflection component cancels the desired signal from the
look direction, even when the distortionless-response constraint is maintained (see
Sect. 7.4 for examples and further discussion). This can be avoided by directly mini-
mizing Snn , although an estimate of Snn may not always be available separately from
the desired signal.
In the special case of sensor noise, substituting Eq. (7.11), $\mathbf{S}_{nn} = \sigma_n^2 \mathbf{I}$, leads to
$$\mathbf{w} = \frac{\mathbf{v}}{\mathbf{v}^H \mathbf{v}}. \tag{7.40}$$
For sensors in free field, with the steering vector v composed of complex expo-
nentials [see Eq. (5.6)] the solution reduces to that of a delay-and-sum beamformer
(see Sect. 5.5) or a maximum WNG beamformer (see Sect. 6.2) formulated in the
space domain [6]. Indeed, the MVDR beamformer in this case maximizes the signal
to sensor-noise ratio.
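
The space-domain solutions in Eqs. (7.37), (7.39) and (7.40) can be compared with a small numerical sketch. It assumes NumPy and uses two hypothetical unit-modulus steering vectors, `v` and `v1`, that merely stand in for Eq. (5.6); the function name `mvdr_weights`, the array size and the variances are illustrative and are not values from the text.

```python
import numpy as np

def mvdr_weights(S, v):
    """Return w such that w^H = v^H S^{-1} / (v^H S^{-1} v), for Hermitian positive-definite S."""
    Sinv_v = np.linalg.solve(S, v)          # S^{-1} v without forming the inverse
    return Sinv_v / (v.conj() @ Sinv_v)     # normalize so that w^H v = 1

Q = 8
m = np.arange(Q)
v = np.exp(1j * 0.3 * m)     # hypothetical steering vector, look direction
v1 = np.exp(1j * 1.7 * m)    # hypothetical steering vector, uncorrelated interferer

S_white = 0.1 * np.eye(Q)                           # sensor noise only, Eq. (7.11)
S_nn = S_white + 0.5 * np.outer(v1, v1.conj())      # sensor noise plus interferer
S_xx = S_nn + 1.0 * np.outer(v, v.conj())           # Eq. (7.10), single plane-wave signal

w_white = mvdr_weights(S_white, v)   # Eq. (7.40)
w_mvdr = mvdr_weights(S_nn, v)       # Eq. (7.39)
w_mpdr = mvdr_weights(S_xx, v)       # Eq. (7.37)
```

In this sketch `w_mvdr` and `w_mpdr` coincide, as expected for a single plane-wave signal that is uncorrelated with the noise, and `np.vdot(w_mvdr, v1)` gives the (small) response towards the interferer.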
The MVDR beamformer can also be formulated in the spherical harmonics
domain, following the array equations developed in the spherical harmonics domain
in Sect. 7.1. Starting from Eq. (7.14), and using a steering-vector notation as in
Eq. (7.34), the equation can be written as

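$$\begin{aligned}
\mathbf{x}_{nm} &= \mathbf{p}_{nm} + \mathbf{n}_{nm} \\
&= \mathbf{v}_{nm}s + \mathbf{n}_{nm}.
\end{aligned} \tag{7.41}$$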

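Now, following the derivation in Eq. (7.35), the MVDR optimization problem can
be written in the spherical harmonics domain in a way similar to Eq. (7.36) as
$$\begin{aligned}
&\underset{\mathbf{w}_{nm}}{\text{minimize}} && \mathbf{w}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} \mathbf{w}_{nm} \\
&\text{subject to} && \mathbf{w}_{nm}^H \mathbf{v}_{nm} = 1.
\end{aligned} \tag{7.42}$$

Similar to Eq. (7.37), a solution can be written for the spherical harmonics beamforming
coefficients:
$$\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1}}{\mathbf{v}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1} \mathbf{v}_{nm}}. \tag{7.43}$$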

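Also, in a similar manner, MVDR can be distinguished from MPDR by replacing
$\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}$ with $\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}$:
$$\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H \mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}^{-1}}{\mathbf{v}_{nm}^H \mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}^{-1} \mathbf{v}_{nm}}. \tag{7.44}$$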

In the case of sensor noise and a spherical array with a nearly-uniform sampling
scheme configuration, the spatial cross-spectrum matrix of the noise is proportional
to a unit matrix [see Eq. (7.26)] and the solution in this case becomes
$$\mathbf{w}_{nm}^H = \frac{\mathbf{v}_{nm}^H}{\mathbf{v}_{nm}^H \mathbf{v}_{nm}}. \tag{7.45}$$

This result is the same as the maximum WNG beamformer [see Eq. (6.25)], showing
a similar behavior to the space domain formulation.
In the case of noise generated by a diffuse sound field, and using the formulation
as in Eq. (7.32), the spatial cross-spectrum matrix of the noise is proportional to a
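unit matrix. A solution in the form of Eq. (7.45) can be written as
$$\tilde{\mathbf{w}}_{nm}^H = \frac{\tilde{\mathbf{v}}_{nm}^H}{\tilde{\mathbf{v}}_{nm}^H \tilde{\mathbf{v}}_{nm}}, \tag{7.46}$$

with $\tilde{\mathbf{v}}_{nm} = \mathbf{B}^{-1}\mathbf{v}_{nm}$, following the derivation in Eq. (7.32). Now, using the expression
for $\mathbf{v}_{nm}$, as in Eq. (5.16), the elements of $\tilde{\mathbf{v}}_{nm}$ reduce to $\tilde{v}_{nm} = \left[Y_n^m(\theta_k, \phi_k)\right]^*$, where $(\theta_k, \phi_k)$ is the
arrival direction of the desired plane wave. Further, using the spherical harmonics
addition theorem [Eq. (1.26)] to compute $\tilde{\mathbf{v}}_{nm}^H \tilde{\mathbf{v}}_{nm}$, the beamforming coefficients in
Eq. (7.46) reduce to
$$\left[\tilde{w}_{nm}(k)\right]^* = \frac{4\pi}{(N + 1)^2}\, Y_n^m(\theta_k, \phi_k). \tag{7.47}$$

With $w_{nm}^* = \tilde{w}_{nm}^* / b_n$, this result leads to

$$\left[w_{nm}(k)\right]^* = \frac{4\pi}{(N + 1)^2}\, \frac{1}{b_n(kr)}\, Y_n^m(\theta_k, \phi_k), \tag{7.48}$$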

which is equivalent to the maximum directivity beamformer [see Eq. (6.9)], devel-
oped in Sect. 6.1. Indeed, the maximum directivity beamformer maximizes the SNR
in the case where the noise originates from a diffuse field, arriving equally from all
directions.
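
The normalization factor in Eq. (7.47) can be traced through the following worked step. It is a sketch that assumes the addition theorem is evaluated at zero angular separation, where the Legendre polynomial satisfies $P_n(1) = 1$, so that $\sum_{m=-n}^{n} |Y_n^m(\theta, \phi)|^2 = (2n + 1)/(4\pi)$:

$$\tilde{\mathbf{v}}_{nm}^H \tilde{\mathbf{v}}_{nm} = \sum_{n=0}^{N} \sum_{m=-n}^{n} \left|Y_n^m(\theta_k, \phi_k)\right|^2 = \sum_{n=0}^{N} \frac{2n + 1}{4\pi} = \frac{(N + 1)^2}{4\pi}.$$

Dividing $\tilde{\mathbf{v}}_{nm}^H$, whose elements are $Y_n^m(\theta_k, \phi_k)$, by this factor gives the coefficients in Eq. (7.47).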
7.3 Example: MVDR with Sensor Noise and Disturbance

Examples of beam patterns designed using the MVDR method are presented in
this section. Consider a spherical microphone array designed around a rigid sphere,
operating at kr = N , with N = 4. The array is composed of Q = 36 microphones
arranged nearly-uniformly, with sensor noise assumed to be spatially uncorrelated
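and with variance $\sigma_n^2 = 0.1$. In this case, $\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}$ due to the sensor noise can be
written as in Eq. (7.26):

$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^2 \frac{4\pi}{Q}\mathbf{I}. \tag{7.49}$$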

The desired signal is assumed to propagate as a plane wave arriving from direction
$(\theta_0, \phi_0) = (60^\circ, 36^\circ)$, having a variance of $\sigma_0^2 = 1$ at the operating frequency. As the
desired signal and noise are assumed uncorrelated, the solution for the beamforming
weights in the spherical harmonics domain can be calculated from Eq. (7.45), yielding
a maximum WNG beam pattern. The resulting beam pattern is then calculated using
$\mathbf{w}_{nm}$ and Eq. (5.12) as

$$\begin{aligned}
y(\theta, \phi) &= \mathbf{w}_{nm}^H \mathbf{v}_{nm}(\theta, \phi) \\
&= \sum_{n=0}^{N} \sum_{m=-n}^{n} \left[w_{nm}(k)\right]^* b_n(kr) \left[Y_n^m(\theta, \phi)\right]^*.
\end{aligned} \tag{7.50}$$

Figure 7.1 shows the magnitude of the beam pattern for this example. The contour
plot shows that the main lobe is directed at the desired signal, marked by the “+” sign,
while the balloon plot illustrates that the beam pattern is symmetric around the look
direction axis, as expected from the maximum WNG beamformer (see Sect. 6.2).
In the second part of this example, a disturbance is added to the noise signal
in the form of a plane wave arriving from direction $(\theta_1, \phi_1) = (60^\circ, 320^\circ)$, with a
disturbance signal uncorrelated to the desired signal and to the sensor noise signal,
having a variance of $\sigma_1^2 = 0.5$. The spatial spectrum matrix of the noise for this
example, formulated in the spherical harmonics domain, can be written as

$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^2 \frac{4\pi}{Q}\mathbf{I} + \sigma_1^2 \mathbf{v}_{nm1}\mathbf{v}_{nm1}^H, \tag{7.51}$$

where vnm1 is the steering vector in the direction of the disturbance. The optimal
beamforming weights for this example are given by Eq. (7.44) and the resulting
beam pattern by Eq. (7.50). Figure 7.2 illustrates the magnitude of the beam pattern
for this example. The main lobe is directed at the desired signal, as in the first
example. In the direction of the disturbance signal, marked by the dark “+” sign,
the beam pattern has low magnitude, as expected if the array output due to Snnm nnm
is to be minimized. It is interesting to note by comparing the balloon plots of Figs.
7.1 and 7.2 that the first side lobe has been modified and now includes a null in
the direction of the disturbance signal, therefore breaking the axis-symmetry of the
Fig. 7.1 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer with sensor
noise. Upper: contour plot, the arrival direction of the plane wave holding the desired signal is
marked by the white “+”. Lower: balloon plot. In this plot cyan (green-blue) color shades represent
positive values of Re{y(θ, φ)}, while magenta (purple-red) color shades represent negative values
of Re{y(θ, φ)}

beam pattern around the look direction. This example illustrates the advantage of the
MVDR beamformer: the ability to shape the beam pattern to account for uncorrelated
disturbances in the sound field.
Fig. 7.2 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer with sensor
noise and a single disturbance. Upper: contour plot, the arrival direction of the plane wave holding
the desired signal is marked by the white “+” and the arrival direction of the plane wave holding the
disturbance signal is marked by the dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

7.4 Example: MVDR with Correlated Disturbance

In this section, the example presented in Sect. 7.3 is further extended to include a
disturbance signal that is correlated to the desired signal. This may occur in practice,
for example, when the disturbance is the result of the desired signal being reflected
from a nearby surface like a wall in a room. At the operating frequency, the distur-
bance signal is therefore an attenuated and phase-shifted version of the desired signal.
Denoting by $s_0$ the amplitude of the desired signal at the origin of the coordinate
system, the disturbance signal satisfies $s_1 = A s_0$, where $A$ is a complex constant.
The same spherical array as in Sect. 7.3 is also used in this example, i.e. a
rigid-sphere array with nearly-uniform sampling, $Q = 36$, $N = 4$, and $kr = N$.
The desired signal propagates as a plane wave arriving from $(\theta_0, \phi_0) = (60^\circ, 36^\circ)$
with $\sigma_0^2 = 1$ and the disturbance is another plane wave with arrival direction
$(\theta_1, \phi_1) = (60^\circ, 320^\circ)$, with $\sigma_1^2 = |A|^2 \sigma_0^2$ and $A = 0.8 e^{-i\pi/3}$. Sensor noise with
$\sigma_n^2 = 0.1$ is also assumed. The spatial spectrum matrix of the noise, including the
contribution from the disturbance, is given by

$$\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}} = \sigma_n^2 \frac{4\pi}{Q}\mathbf{I} + \sigma_1^2 \mathbf{v}_{nm1}\mathbf{v}_{nm1}^H. \tag{7.52}$$

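Now, recalling that the disturbance is correlated to the desired signal, the spatial
spectrum matrix of the overall input signal is derived:

$$\begin{aligned}
\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} &= \sigma_n^2 \frac{4\pi}{Q}\mathbf{I} + \sigma_0^2 \mathbf{v}_{nm0}\mathbf{v}_{nm0}^H + \sigma_1^2 \mathbf{v}_{nm1}\mathbf{v}_{nm1}^H \\
&\quad + A^* \sigma_0^2 \mathbf{v}_{nm0}\mathbf{v}_{nm1}^H + A \sigma_0^2 \mathbf{v}_{nm1}\mathbf{v}_{nm0}^H,
\end{aligned} \tag{7.53}$$

where it is noted that $E[s_1 s_0^*] = A\sigma_0^2$.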


An MVDR beamformer is designed by minimizing Snnm nnm subject to a
distortionless-response constraint, with the solution given by Eq. (7.44), setting vnm
to vnm0 . Then, the beam pattern is calculated using Eq. (7.50), with its magnitude
presented in Fig. 7.3. Inspection of this beam pattern reveals that it is the same as in
Fig. 7.2. Indeed, the spatial spectrum matrix, Snnm nnm , is the same in both cases, so
that the optimal beamformer is the same. In this sense, the fact that the disturbance
is correlated to the desired signal did not affect the beam pattern.
However, the significant difference between the two examples lies in the ability
to estimate Snnm nnm in practice. In the uncorrelated-disturbance case, it is sufficient
to record the input signal at times when the desired signal is not active, but the
disturbance is active. Alternatively, one could simply record the entire input signal,
because minimizing Snnm nnm or Sxnm xnm leads to the same beamformer. However, in the
case of the correlated disturbance, for example a disturbance that is a reflected version
of the desired signal, both the desired signal and the disturbance appear and disappear
coherently, and so estimating Snnm nnm (that does not include the desired signal, but
does include the disturbance) is typically not possible in practice. Therefore, the
beam pattern shown in Fig. 7.3 cannot be achieved in practice.
To overcome this limitation, it may be possible to employ the MVDR by min-
imizing Sxnm xnm , as in Eqs. (7.42) and (7.43). The resulting beam pattern is pre-
sented in Fig. 7.4. The figure shows that the beam pattern is composed of two sig-
nificant lobes, with look directions at the desired signal and disturbance directions.
This is surprising, because the aim of the beamformer is to attenuate the disturbance.
A more detailed investigation of the beamformer reveals that $\mathbf{w}_{nm}^H \mathbf{v}_{nm0} = 1$,
verifying that the distortionless-response constraint is satisfied. The beamformer also
Fig. 7.3 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer, as presented
in Eq. (7.44), with sensor noise and a single correlated disturbance. Upper: contour plot, arrival
direction of the plane wave holding the desired signal is marked by the white “+” and the arrival
direction of the plane wave holding the disturbance signal is marked by the dark “+”. Lower:
balloon plot. For color scheme see Fig. 7.1

satisfies $|\mathbf{w}_{nm}^H \mathbf{v}_{nm1}| = 1.25$, which shows that the disturbance is not attenuated but
even enhanced! Now, because the desired signal and the disturbance are correlated,
the combined contribution of the desired signal and the disturbance at the array output
is given by
Fig. 7.4 Magnitude of the array beam pattern, |y(θ, φ)|, for the MVDR beamformer, as presented
in Eq. (7.43), with sensor noise and a single correlated disturbance. Upper: contour plot, arrival
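directions of the plane waves holding the desired signal and the disturbance are marked by the white
“+”. Lower: balloon plot. For color scheme see Fig. 7.1

$$\begin{aligned}
|y|^2 &= \left|s_0 \mathbf{w}_{nm}^H \mathbf{v}_{nm0} + s_1 \mathbf{w}_{nm}^H \mathbf{v}_{nm1}\right|^2 \\
&= |s_0|^2 \left|\mathbf{w}_{nm}^H \mathbf{v}_{nm0} + A \mathbf{w}_{nm}^H \mathbf{v}_{nm1}\right|^2 \\
&= 7.5 \times 10^{-6}\, |s_0|^2,
\end{aligned} \tag{7.54}$$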

showing that the overall signal at the array output has been attenuated by more than
50 dB. This phenomenon is referred to as signal cancellation [6], where instead of
keeping the desired signal unchanged and attenuating the disturbance, the beamformer
satisfies the distortionless-response constraint in the look direction, but then
employs the correlated disturbance to cancel the desired signal through the minimization
of $\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}$, which includes contributions from both.
This example shows the limitation of the MVDR method for correlated distur-
bances. One way to overcome this limitation is by designing a null at the direction
of the disturbance through an additional constraint. This is made possible using an
extended method, the LCMV, as detailed in the next section.
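
Signal cancellation can be reproduced qualitatively with a generic numerical sketch. The code below assumes NumPy and replaces the rigid-sphere steering vectors of the example with two hypothetical unit-modulus vectors `v0` and `v1` of length 16, and it omits the $4\pi/Q$ scaling of the sensor noise, so the numbers it produces are not those of Eq. (7.54). Only the reflection factor $A$ and the variances are taken from the text; the function names are illustrative.

```python
import numpy as np

def mvdr_weights(S, v):
    """Return w such that w^H = v^H S^{-1} / (v^H S^{-1} v), for Hermitian positive-definite S."""
    Sinv_v = np.linalg.solve(S, v)
    return Sinv_v / (v.conj() @ Sinv_v)

M = 16
m = np.arange(M)
v0 = np.exp(1j * 0.3 * m)            # stand-in steering vector, desired signal
v1 = np.exp(1j * 1.7 * m)            # stand-in steering vector, reflection
A = 0.8 * np.exp(-1j * np.pi / 3)    # complex factor relating s1 to s0
var0, var_n = 1.0, 0.1
var1 = abs(A) ** 2 * var0

# Noise-only matrix, in the spirit of Eq. (7.52)
S_nn = var_n * np.eye(M) + var1 * np.outer(v1, v1.conj())
# Full input matrix, in the spirit of Eq. (7.53): x = (v0 + A v1) s0 + n
u = v0 + A * v1
S_xx = var_n * np.eye(M) + var0 * np.outer(u, u.conj())

w_mvdr = mvdr_weights(S_nn, v0)      # minimizes the noise variance
w_mpdr = mvdr_weights(S_xx, v0)      # minimizes the total output power

def coherent_gain(w):
    """|w^H v0 + A w^H v1|^2, the factor multiplying |s0|^2 at the array output."""
    return abs(np.vdot(w, v0) + A * np.vdot(w, v1)) ** 2
```

Both weight vectors satisfy the distortionless-response constraint, but `coherent_gain(w_mpdr)` is far below unity while `coherent_gain(w_mvdr)` stays close to one, and `abs(np.vdot(w_mpdr, v1))` is close to $1/|A|$, in line with the behavior described above.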

7.5 Linearly Constrained Minimum Variance

Section 7.2 presented the MVDR design method that aims to minimize the noise
at the array output, while avoiding distortion of the signal by imposing a constraint
in the array look direction. The MVDR method can be extended by introducing
additional constraints to the desired beam pattern. For example, the distortionless-
response constraint, or a similar constraint, can be introduced at directions near the
look direction, thereby improving robustness against errors in the estimation of the
arrival direction of the desired signal. Also, if the noise field is composed of disturbing
sources, then the effect of these can be explicitly removed by constraining the beam
pattern to be zero at these directions. This is referred to as null constraints. In addition,
spatial derivatives of the beam pattern can be employed, for example, with the aim of
controlling the width of the main lobe in the look direction, or the width of the nulls
in the direction of disturbances. The general formulation of the linearly constrained
minimum variance (LCMV) beamformer, incorporating linear constraints within the
beamformer design, is derived in this section, while the following sections present
more specific designs.
The LCMV beamformer is first formulated in the space domain, designed as the
solution to the following optimization problem [6]:

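$$\begin{aligned}
&\underset{\mathbf{w}}{\text{minimize}} && \mathbf{w}^H \mathbf{S}_{xx} \mathbf{w} \\
&\text{subject to} && \mathbf{V}^H \mathbf{w} = \mathbf{c}.
\end{aligned} \tag{7.55}$$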

Matrix $\mathbf{V}$ is of dimensions $Q \times L$, with $L$ representing the number of constraints.
In the simple case, the columns of $\mathbf{V}$ represent steering vectors for a given set of
directions, with the L × 1 vector c holding the beamformer gain at these directions.
The values can be 1, representing a distortionless response, 0, representing a null
constraint, or other values, specifying the desired gain. The same formulation can be
extended to include other constraints, such as a derivative constraint.
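The solution to the problem in Eq. (7.55) can be formulated using Lagrange
multipliers [6] in a manner similar to Eq. (6.4):

$$\underset{\mathbf{w}}{\text{minimize}} \quad \mathbf{w}^H \mathbf{S}_{xx} \mathbf{w} + \boldsymbol{\lambda}^H \left(\mathbf{V}^H \mathbf{w} - \mathbf{c}\right) + \left(\mathbf{w}^H \mathbf{V} - \mathbf{c}^H\right) \boldsymbol{\lambda}, \tag{7.56}$$

where $\boldsymbol{\lambda}$ is an $L \times 1$ vector of Lagrange multipliers. Taking the derivative with respect
to $\mathbf{w}$ and setting the result to zero leads to

$$\mathbf{w}^H \mathbf{S}_{xx} + \boldsymbol{\lambda}^H \mathbf{V}^H = \mathbf{0}, \tag{7.57}$$

with $\mathbf{w}$ satisfying

$$\mathbf{w}^H = -\boldsymbol{\lambda}^H \mathbf{V}^H \mathbf{S}_{xx}^{-1}. \tag{7.58}$$

Multiplying from the right by $\mathbf{V}$ and substituting the constraint in Eq. (7.55), written as
$\mathbf{w}^H \mathbf{V} = \mathbf{c}^H$, $\boldsymbol{\lambda}$ can be written as

$$\boldsymbol{\lambda}^H = -\mathbf{c}^H \left(\mathbf{V}^H \mathbf{S}_{xx}^{-1} \mathbf{V}\right)^{-1}. \tag{7.59}$$

Substituting in Eq. (7.58), the solution to Eq. (7.55) becomes

$$\mathbf{w}^H = \mathbf{c}^H \left(\mathbf{V}^H \mathbf{S}_{xx}^{-1} \mathbf{V}\right)^{-1} \mathbf{V}^H \mathbf{S}_{xx}^{-1}. \tag{7.60}$$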

A similar formulation, distinguishing between LCMV and LCMP (linearly constrained
minimum power), can also be obtained by replacing $\mathbf{S}_{xx}$ with $\mathbf{S}_{nn}$ [6]. In this
case, and assuming sensor noise, with $\mathbf{S}_{nn} = \sigma_n^2 \mathbf{I}$, the solution becomes

$$\mathbf{w}^H = \mathbf{c}^H \left(\mathbf{V}^H \mathbf{V}\right)^{-1} \mathbf{V}^H. \tag{7.61}$$
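
The closed-form solutions in Eqs. (7.60) and (7.61) can be sketched numerically as follows. The code assumes NumPy, a Hermitian positive-definite cross-spectrum matrix, and two hypothetical unit-modulus steering vectors `v0` and `v1`; the function name `lcmv_weights`, the array size and the randomly generated matrix `S_xx` are illustrative and are not taken from the text.

```python
import numpy as np

def lcmv_weights(S, V, c):
    """Return w such that w^H = c^H (V^H S^{-1} V)^{-1} V^H S^{-1}, for Hermitian positive-definite S."""
    Sinv_V = np.linalg.solve(S, V)       # S^{-1} V, shape Q x L
    G = V.conj().T @ Sinv_V              # V^H S^{-1} V, shape L x L
    return Sinv_V @ np.linalg.solve(G, c)

Q = 8
m = np.arange(Q)
v0 = np.exp(1j * 0.3 * m)       # hypothetical steering vector, look direction
v1 = np.exp(1j * 1.7 * m)       # hypothetical steering vector, null direction
V = np.stack([v0, v1], axis=1)  # Q x L constraint matrix with L = 2
c = np.array([1.0, 0.0])        # unit gain towards v0, null towards v1

# Arbitrary Hermitian positive-definite matrix standing in for a measured S_xx
rng = np.random.default_rng(1)
X = rng.standard_normal((Q, 3 * Q)) + 1j * rng.standard_normal((Q, 3 * Q))
S_xx = X @ X.conj().T / X.shape[1] + 0.1 * np.eye(Q)

w = lcmv_weights(S_xx, V, c)               # Eq. (7.60)
w_white = lcmv_weights(np.eye(Q), V, c)    # Eq. (7.61), spatially white noise
```

Evaluating `V.conj().T @ w` returns the constraint vector `c`, and with a single distortionless-response constraint the same function reduces to the MVDR solution of Eq. (7.37).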

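The LCMV beamformer can also be formulated in the spherical harmonics
domain. Adding constraints to the spherical harmonics formulation of the MVDR in
Eq. (7.42), the LCMV can be written as
$$\begin{aligned}
&\underset{\mathbf{w}_{nm}}{\text{minimize}} && \mathbf{w}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}} \mathbf{w}_{nm} \\
&\text{subject to} && \mathbf{V}_{nm}^H \mathbf{w}_{nm} = \mathbf{c}.
\end{aligned} \tag{7.62}$$

The solution can be derived in a similar manner to the derivation of the space-domain
solution, Eqs. (7.56)–(7.60), and is given by
$$\mathbf{w}_{nm}^H = \mathbf{c}^H \left(\mathbf{V}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1} \mathbf{V}_{nm}\right)^{-1} \mathbf{V}_{nm}^H \mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}^{-1}. \tag{7.63}$$

The LCMV in the spherical harmonics domain can also be formulated and solved
by replacing $\mathbf{S}_{\mathbf{x}_{nm}\mathbf{x}_{nm}}$ with $\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}$. Furthermore, in the case of sensor noise and a
spherical array with a nearly-uniform sampling scheme, $\mathbf{S}_{\mathbf{n}_{nm}\mathbf{n}_{nm}}$ is proportional to a
unit matrix [see Eq. (7.26)] and the solution becomes
$$\mathbf{w}_{nm}^H = \mathbf{c}^H \left(\mathbf{V}_{nm}^H \mathbf{V}_{nm}\right)^{-1} \mathbf{V}_{nm}^H. \tag{7.64}$$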
174 7 Beamforming with Noise Minimization

The spatial cross-spectrum matrix of the noise is also proportional to a unit matrix,
when using the array equations as shown in Eq. (7.32) and assuming that the noise
signal is generated by a diffuse sound field. The solution in this case becomes
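$$ \tilde{\mathbf{w}}_{nm}^H = \mathbf{c}^H \left( \tilde{\mathbf{V}}_{nm}^H \tilde{\mathbf{V}}_{nm} \right)^{-1} \tilde{\mathbf{V}}_{nm}^H, \tag{7.65} $$

with $\tilde{\mathbf{V}}_{nm} = \mathbf{B}^{-1} \mathbf{V}_{nm}$ [see Eq. (7.32)] and with the columns of $\tilde{\mathbf{V}}_{nm}$ equal to $Y_n^m(\theta, \phi)$
for the case where $\mathbf{V}_{nm}$ represents steering vectors.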

7.6 Example: LCMV with Beam Pattern Amplitude Constraints

An example of an LCMV design for the spherical harmonics formulation is presented
in this section. The constraints are based on beam pattern amplitude values, such that
$\mathbf{V}_{nm}$ is directly the steering matrix. A distortionless-response constraint is applied at
$(\theta_0, \phi_0) = (60^\circ, 36^\circ)$. In addition, a null constraint is applied at $(\theta_1, \phi_1) = (60^\circ, 320^\circ)$.
All other array parameters and the operating frequency are the same as in the previous
example in Sect. 7.3. The desired signal is assumed to have a variance of $\sigma_0^2 = 1$,
while sensor noise with a variance of $\sigma_n^2 = 0.1$ is also assumed. Matrix $\mathbf{S}_{x_{nm} x_{nm}}$ can
therefore be written as

$$ \mathbf{S}_{x_{nm} x_{nm}} = \sigma_0^2 \mathbf{v}_{nm0} \mathbf{v}_{nm0}^H + \sigma_n^2 \frac{4\pi}{Q} \mathbf{I}. \tag{7.66} $$

The steering matrix includes both the direction of the desired signal and the direction
of the null, and is defined by

$$ \mathbf{V}_{nm} = [\mathbf{v}_{nm0}, \mathbf{v}_{nm1}], \tag{7.67} $$

where $\mathbf{v}_{nm0}$ and $\mathbf{v}_{nm1}$ are the steering vectors corresponding to plane waves arriving
from directions $(\theta_0, \phi_0)$ and $(\theta_1, \phi_1)$, respectively. The constraint vector is given by

$$ \mathbf{c} = [1, 0]^T. \tag{7.68} $$
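
A numerical sketch of this two-constraint design, using the closed-form solution of Eq. (7.63), is given below. Several ingredients are assumptions made only for illustration, since the array parameters of Sect. 7.3 are not repeated here: the order `N = 3`, the simplification $b_n(kr) = 1$, a unit scaling of the noise term, and the helper names `sph_harm_vec`, `steer` and `beam`. The spherical harmonics follow Eq. (7.77), with the Condon-Shortley phase assumed to be part of the associated Legendre function.

```python
import numpy as np
from math import factorial

def sph_harm_vec(N, theta, phi):
    """Stack Y_n^m(theta, phi) for n = 0..N, m = -n..n (theta: elevation, phi: azimuth)."""
    x, s = np.cos(theta), np.sin(theta)
    P = np.zeros((N + 1, N + 1))                 # P[n, m] = P_n^m(cos theta), m >= 0
    P[0, 0] = 1.0
    for m in range(1, N + 1):                    # diagonal terms P_m^m
        P[m, m] = -(2 * m - 1) * s * P[m - 1, m - 1]
    for m in range(N):                           # first off-diagonal terms P_{m+1}^m
        P[m + 1, m] = (2 * m + 1) * x * P[m, m]
    for m in range(N + 1):                       # upward recurrence in n
        for n in range(m + 2, N + 1):
            P[n, m] = ((2 * n - 1) * x * P[n - 1, m]
                       - (n + m - 1) * P[n - 2, m]) / (n - m)
    Y = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            a = abs(m)
            norm = np.sqrt((2 * n + 1) / (4 * np.pi)
                           * factorial(n - a) / factorial(n + a))
            y = norm * P[n, a] * np.exp(1j * a * phi)
            # negative orders follow from Y_n^{-m} = (-1)^m [Y_n^m]^*
            Y.append((-1) ** a * np.conj(y) if m < 0 else y)
    return np.array(Y)

def steer(N, theta_deg, phi_deg):
    """Steering vector b_n [Y_n^m]^*, with b_n = 1 assumed for illustration."""
    return np.conj(sph_harm_vec(N, np.radians(theta_deg), np.radians(phi_deg)))

N = 3                                            # assumed order, not taken from the text
v0, v1 = steer(N, 60, 36), steer(N, 60, 320)     # look direction and null direction
S = 1.0 * np.outer(v0, v0.conj()) + 0.1 * np.eye((N + 1) ** 2)  # cf. Eq. (7.66), unit noise scaling assumed
V = np.column_stack([v0, v1])                    # Eq. (7.67)
c = np.array([1.0, 0.0])                         # Eq. (7.68)

Si_V = np.linalg.solve(S, V)
w = Si_V @ np.linalg.solve(V.conj().T @ Si_V, c) # Eq. (7.63)

def beam(theta_deg, phi_deg):
    """Beam pattern y = w^H v for a unit-amplitude plane wave from the given direction."""
    return np.vdot(w, steer(N, theta_deg, phi_deg))

print(abs(beam(60, 36)), abs(beam(60, 320)))     # response at the look and null directions
```

Evaluating `beam` on a grid of directions would give a pattern of the kind plotted in the figures of this section, although the values depend on the assumed order and on $b_n$.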

The solution to the LCMV optimization problem, as formulated in Eq. (7.62),
is given by Eq. (7.63). The resulting beam pattern is computed using Eq. (7.50).
Figure 7.5 shows the magnitude of the beam pattern for this example. Comparison
with Fig. 7.3 reveals that both beamformers are identical. While in the MVDR design
the null in the direction of the disturbance was achieved through minimization of
$\mathbf{S}_{n_{nm} n_{nm}}$, in the LCMV, the same null was achieved through the inclusion of a null
constraint at the direction of the disturbance, i.e. $\mathbf{v}_{nm1}^H \mathbf{w}_{nm} = 0$. The advantage of
the MVDR design is that the null at the disturbance direction is achieved without
the need to identify the direction-of-arrival of the disturbance, while in the LCMV
approach, the null is achieved by explicitly specifying the null direction in $\mathbf{V}_{nm}$.
However, the advantage of the LCMV design is that the null is achieved regardless
of the type of disturbance signal, while the MVDR design is significantly degraded
if the disturbance is correlated with the desired signal, due to signal cancellation.

Fig. 7.5 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and a single null constraint. Upper: contour plot, arrival direction of the plane wave holding
the desired signal is marked by the white “+” and direction of the null constraint is marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

As discussed above, this LCMV design requires the knowledge of the direction-
of-arrival of the disturbance to set the null constraint. In the case where this direction
is estimated inaccurately, it may be advisable to extend the width of the null in the
beam pattern, so that the disturbance is significantly attenuated even if the arrival
direction of the disturbance is slightly different from the null direction. One way
to achieve this is to introduce additional null constraints at directions close to the
original null, as illustrated in the following example.
In addition to the distortionless-response and the null constraints introduced in
the previous example, two further null constraints are placed at directions
$(\theta_2, \phi_2) = (70^\circ, 290^\circ)$ and $(\theta_3, \phi_3) = (15^\circ, 310^\circ)$. Matrix $\mathbf{S}_{x_{nm} x_{nm}}$ is defined
as in Eq. (7.66), while the steering matrix $\mathbf{V}_{nm}$ is reconstructed to include the new
steering vectors:

$$ \mathbf{V}_{nm} = [\mathbf{v}_{nm0}, \mathbf{v}_{nm1}, \mathbf{v}_{nm2}, \mathbf{v}_{nm3}], \tag{7.69} $$

and, accordingly,

$$ \mathbf{c} = [1, 0, 0, 0]^T. \tag{7.70} $$

The solution is computed as in Eq. (7.63), and the resulting beam pattern is
computed using Eq. (7.50). Figure 7.6 shows the magnitude of the beam pattern,
also denoting the directions of the three nulls. It is clear that compared with Fig. 7.5,
a wider near-zero response of the beam pattern is achieved around the directions
of the nulls, thereby achieving a wider directional region with low magnitude, as
desired. The corresponding balloon plot is also presented in Fig. 7.6.
In the final example of this section, a single null constraint is used, as in the first
example, but now four distortionless-response constraints are added around the array
look direction. This could be useful to extend the width of the main lobe in a case
where the arrival direction of the desired signal is not known with high accuracy.
In this example, the look direction is $(\theta_0, \phi_0) = (60^\circ, 36^\circ)$ and four distortionless-response
constraints are added at $(60^\circ \pm 5^\circ, 36^\circ \pm 5^\circ)$. A null constraint is applied, as
before, at $(\theta_1, \phi_1) = (60^\circ, 320^\circ)$. In this case, the steering matrix $\mathbf{V}_{nm}$ is constructed
as follows:

$$ \mathbf{V}_{nm} = [\mathbf{v}_{nm0}, \mathbf{v}_{nm1}, \mathbf{v}_{nm2}, \mathbf{v}_{nm3}, \mathbf{v}_{nm4}, \mathbf{v}_{nm5}], \tag{7.71} $$

with indices 2–5 denoting the additional distortionless-response constraints and,
accordingly,

$$ \mathbf{c} = [1, 0, 1, 1, 1, 1]^T. \tag{7.72} $$
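
The bookkeeping behind Eqs. (7.71) and (7.72) can be sketched as follows. Reading $(60^\circ \pm 5^\circ, 36^\circ \pm 5^\circ)$ as the four sign combinations is an assumption, as are the variable names; the order in which the four extra directions are listed does not affect the design as long as the columns of the steering matrix and the entries of the constraint vector are ordered consistently.

```python
from itertools import product

look = (60.0, 36.0)     # (theta0, phi0) in degrees
null = (60.0, 320.0)    # (theta1, phi1) in degrees
delta = 5.0             # angular offset of the extra constraints, in degrees

# Four extra distortionless-response directions (assumed to be the sign combinations)
extra = [(look[0] + dt, look[1] + dp)
         for dt, dp in product((-delta, delta), repeat=2)]

directions = [look, null] + extra        # column order of the steering matrix, Eq. (7.71)
c = [1.0, 0.0] + [1.0] * len(extra)      # matching constraint vector, Eq. (7.72)

for d, gain in zip(directions, c):
    print(d, gain)
```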

Figure 7.7 shows the magnitude of the beam pattern and also denotes the directions
of all constraints. It is clear that the null constraint is maintained, while the width of
the main lobe is significantly increased compared to Fig. 7.5, showing the ability of
the LCMV to control the main-lobe width by introducing additional constraints.
Fig. 7.6 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and three null constraints. Upper: contour plot, arrival direction of the plane wave holding the
desired signal is marked by the white “+” and directions of the null constraints are marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

Fig. 7.7 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and several distortionless-response and null constraints. Upper: contour plot, directions of the
distortionless-response constraints are marked by the white “+” and direction of the null constraint
is marked by the dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

7.7 LCMV with Derivative Constraints

The LCMV with amplitude constraints, as presented in the previous section, illustrates
the broadening of the main lobe and the null region by adding constraints at
directions close to the look direction and the null. A similar effect can be achieved
using a more analytical approach by constraining the first (and higher) order derivatives
of the beam pattern to be zero. This derivative constraint can be formulated as
linear constraints in the beam pattern optimization problem, hence directly integrating
into the LCMV framework [6]. The steering vectors in the spherical harmonics
domain are first written more explicitly as a function of the angles. It is important
to note that the following formulation of the derivative constraint has a closed-form
expression, due to the efficient separation of frequency, distance and angles in the
spherical harmonics domain. The steering vectors are written as in Eq. (5.16):

$$ v_{nm}(\theta, \phi) = b_n(kr) \left[ Y_n^m(\theta, \phi) \right]^*, \tag{7.73} $$

with

$$ \mathbf{v}_{nm}(\theta, \phi) = \left[ v_{00}(\theta, \phi), v_{1(-1)}(\theta, \phi), v_{10}(\theta, \phi), v_{11}(\theta, \phi), \ldots, v_{NN}(\theta, \phi) \right]^T. \tag{7.74} $$

A partial derivative of the beam pattern with respect to the azimuth angle, φ, is
derived first. The derivative is written as

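$$ \frac{\partial}{\partial\phi} y(\theta, \phi) = \frac{\partial}{\partial\phi} \mathbf{w}_{nm}^H \mathbf{v}_{nm}(\theta, \phi) = \mathbf{w}_{nm}^H \left[ \frac{\partial}{\partial\phi} \mathbf{v}_{nm}(\theta, \phi) \right], \tag{7.75} $$

where

$$ \frac{\partial}{\partial\phi} \mathbf{v}_{nm}(\theta, \phi) = \left[ \frac{\partial}{\partial\phi} v_{00}(\theta, \phi), \ldots, \frac{\partial}{\partial\phi} v_{NN}(\theta, \phi) \right]^T. \tag{7.76} $$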

Recalling from Eq. (1.9) the definition of the spherical harmonics:
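
$$ Y_n^m(\theta, \phi) \equiv \sqrt{\frac{2n+1}{4\pi} \frac{(n-m)!}{(n+m)!}} \, P_n^m(\cos\theta) \, e^{im\phi}, \tag{7.77} $$

the elements of $\frac{\partial}{\partial\phi} \mathbf{v}_{nm}$ can be derived:

$$ \frac{\partial}{\partial\phi} v_{nm}(\theta, \phi) = -im \, b_n(kr) \left[ Y_n^m(\theta, \phi) \right]^* = -im \, v_{nm}(\theta, \phi). \tag{7.78} $$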

The expression for the second-order derivative follows directly:

$$ \frac{\partial^2}{\partial\phi^2} v_{nm}(\theta, \phi) = -m^2 \, v_{nm}(\theta, \phi). \tag{7.79} $$

Higher-order derivatives can be derived in a similar manner.
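
As a worked step, the azimuth angle enters the steering vector only through the factor $e^{-im\phi}$ of the conjugated spherical harmonic, so each differentiation contributes one factor of $-im$. The symbol $\ell$ for the derivative order is introduced here for illustration and is not part of the original notation:

```latex
\frac{\partial^{\ell}}{\partial\phi^{\ell}} v_{nm}(\theta,\phi)
  = b_n(kr)\, \frac{\partial^{\ell}}{\partial\phi^{\ell}} \left[ Y_n^m(\theta,\phi) \right]^*
  = (-im)^{\ell}\, v_{nm}(\theta,\phi), \qquad \ell = 1, 2, \ldots
```

Setting $\ell = 1$ and $\ell = 2$ recovers Eqs. (7.78) and (7.79), respectively.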


Finally, the constraint of a zero derivative at a given direction, $(\theta_0, \phi_0)$, is
formulated using the newly-derived derivative vector:

$$ \mathbf{w}_{nm}^H \left. \frac{\partial}{\partial\phi} \mathbf{v}_{nm}(\theta, \phi) \right|_{(\theta_0, \phi_0)} = 0. \tag{7.80} $$

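The first-derivative constraint with respect to $\theta$ is derived next. In a similar manner
to the derivation of the derivative with respect to $\phi$, we can write

$$ \frac{\partial}{\partial\theta} y(\theta, \phi) = \frac{\partial}{\partial\theta} \mathbf{w}_{nm}^H \mathbf{v}_{nm}(\theta, \phi) = \mathbf{w}_{nm}^H \left[ \frac{\partial}{\partial\theta} \mathbf{v}_{nm}(\theta, \phi) \right], \tag{7.81} $$

where

$$ \frac{\partial}{\partial\theta} \mathbf{v}_{nm}(\theta, \phi) = \left[ \frac{\partial}{\partial\theta} v_{00}(\theta, \phi), \ldots, \frac{\partial}{\partial\theta} v_{NN}(\theta, \phi) \right]^T \tag{7.82} $$

and

$$ \frac{\partial}{\partial\theta} v_{nm}(\theta, \phi) = b_n(kr) \frac{\partial}{\partial\theta} \left[ Y_n^m(\theta, \phi) \right]^*. \tag{7.83} $$
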
The derivative of the spherical harmonics can be derived from the following
relation [5]:

$$ \frac{\partial}{\partial\theta} Y_n^m(\theta, \phi) = m \cot\theta \, Y_n^m(\theta, \phi) + \sqrt{(n-m)(n+m+1)} \, e^{-i\phi} \, Y_n^{m+1}(\theta, \phi). \tag{7.84} $$

A few notes about this equation are presented next. First, $Y_n^{m+1}(\theta, \phi) = 0$ for
$m + 1 > n$ in general, and for $m = n$ in the context of this equation. Second, for $\theta = 0$
and $\theta = \pi$, the cotangent function diverges, but for these angles $\frac{\partial}{\partial\theta} Y_n^m(\theta, \phi) = 0$.
This is because $Y_n^m(0, \phi) = Y_n^m(\pi, \phi) = 0 \;\; \forall m \neq 0$, while for $m = 0$ the spherical
harmonics reduce to the Legendre polynomials, composed of cosine functions that
have a gradient of zero at $\theta = 0$ and $\theta = \pi$ [1]. The same argument follows for the
$Y_n^{m+1}(\theta, \phi)$ term. Therefore, at these specific angles the first-order derivative with
respect to $\theta$ is zero, and a zero constraint would be satisfied anyway.
In summary, the derivative of the elements of the steering vector with respect to
θ can be written as

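$$ \begin{aligned} \frac{\partial}{\partial\theta} v_{nm}(\theta, \phi) &= g_1 \, v_{nm}(\theta, \phi) + g_2 \, v_{n(m+1)}(\theta, \phi) \\ g_1 &= m \cot\theta \\ g_2 &= \sqrt{(n-m)(n+m+1)} \, e^{i\phi} \\ v_{n(m+1)}(\theta, \phi) &= 0 \quad \forall \, m = n. \end{aligned} \tag{7.85} $$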

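A zero-derivative constraint with respect to $\theta$ at $(\theta_0, \phi_0)$ is now formulated as

$$ \mathbf{w}_{nm}^H \left. \frac{\partial}{\partial\theta} \mathbf{v}_{nm}(\theta, \phi) \right|_{(\theta_0, \phi_0)} = 0. \tag{7.86} $$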

Finally, derivatives with respect to both θ and φ can be set to zero using linear
constraints within the LCMV framework, as follows:

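$$ \mathbf{w}_{nm}^H \left. \left[ \frac{\partial}{\partial\theta} \mathbf{v}_{nm}(\theta, \phi), \; \frac{\partial}{\partial\phi} \mathbf{v}_{nm}(\theta, \phi) \right] \right|_{(\theta_0, \phi_0)} = [0, 0]. \tag{7.87} $$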
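
The closed-form derivative vectors of Eqs. (7.78) and (7.85) can be checked numerically against central finite differences, as in the sketch below. The order `N = 3`, the simplification $b_n(kr) = 1$ and all function names are assumptions made for illustration; the spherical harmonics follow Eq. (7.77) with the Condon-Shortley phase assumed to be included in the associated Legendre function.

```python
import numpy as np
from math import factorial

def sph_harm_vec(N, theta, phi):
    """Stack Y_n^m(theta, phi) for n = 0..N, m = -n..n (theta: elevation, phi: azimuth)."""
    x, s = np.cos(theta), np.sin(theta)
    P = np.zeros((N + 1, N + 1))                 # P[n, m] = P_n^m(cos theta), m >= 0
    P[0, 0] = 1.0
    for m in range(1, N + 1):
        P[m, m] = -(2 * m - 1) * s * P[m - 1, m - 1]
    for m in range(N):
        P[m + 1, m] = (2 * m + 1) * x * P[m, m]
    for m in range(N + 1):
        for n in range(m + 2, N + 1):
            P[n, m] = ((2 * n - 1) * x * P[n - 1, m]
                       - (n + m - 1) * P[n - 2, m]) / (n - m)
    Y = []
    for n in range(N + 1):
        for m in range(-n, n + 1):
            a = abs(m)
            norm = np.sqrt((2 * n + 1) / (4 * np.pi)
                           * factorial(n - a) / factorial(n + a))
            y = norm * P[n, a] * np.exp(1j * a * phi)
            Y.append((-1) ** a * np.conj(y) if m < 0 else y)
    return np.array(Y)

def idx(n, m):
    """Position of element (n, m) in the stacked vector of Eq. (7.74)."""
    return n * n + n + m

def steer(N, theta, phi):
    """Steering vector b_n [Y_n^m]^* in radians, with b_n = 1 assumed."""
    return np.conj(sph_harm_vec(N, theta, phi))

def dsteer_dphi(N, theta, phi):
    """Derivative with respect to phi, Eq. (7.78): each element is scaled by -i m."""
    m = np.concatenate([np.arange(-n, n + 1) for n in range(N + 1)])
    return -1j * m * steer(N, theta, phi)

def dsteer_dtheta(N, theta, phi):
    """Derivative with respect to theta, Eq. (7.85); not valid at theta = 0 or pi."""
    v = steer(N, theta, phi)
    d = np.zeros_like(v)
    for n in range(N + 1):
        for m in range(-n, n + 1):
            d[idx(n, m)] = m / np.tan(theta) * v[idx(n, m)]          # g1 term
            if m < n:                                                # v_{n(m+1)} = 0 for m = n
                g2 = np.sqrt((n - m) * (n + m + 1)) * np.exp(1j * phi)
                d[idx(n, m)] += g2 * v[idx(n, m + 1)]
    return d

N, th, ph, h = 3, np.radians(60), np.radians(90), 1e-6
fd_phi = (steer(N, th, ph + h) - steer(N, th, ph - h)) / (2 * h)
fd_theta = (steer(N, th + h, ph) - steer(N, th - h, ph)) / (2 * h)
print(np.allclose(fd_phi, dsteer_dphi(N, th, ph), atol=1e-6))
print(np.allclose(fd_theta, dsteer_dtheta(N, th, ph), atol=1e-6))

# Columns to append to the constraint matrix, with zeros appended to c, cf. Eq. (7.87)
D = np.column_stack([dsteer_dtheta(N, th, ph), dsteer_dphi(N, th, ph)])
```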
7.8 Example: Robust LCMV with Derivative Constraints

An LCMV design example is presented in this section with the aim of illustrating
the use of derivative constraints. A spherical microphone array with the same configuration
as in the design example in Sect. 7.6 is used in this section. An LCMV
beamformer with a distortionless-response constraint at $(\theta_0, \phi_0) = (60^\circ, 36^\circ)$ and
one null constraint at $(\theta_1, \phi_1) = (60^\circ, 90^\circ)$ is designed, following the formulation
presented in Sect. 7.5. The input signal to the array is assumed to be composed
of a desired signal, carried by a plane wave arriving from the look direction $(\theta_0, \phi_0)$ with
variance $\sigma_0^2 = 1$, and sensor noise with variance $\sigma_n^2 = 0.1$. Figure 7.8 shows a contour
plot and a balloon plot of the beam pattern, clearly illustrating the main lobe at
the look direction and the null at $(60^\circ, 90^\circ)$.
In the next step, a derivative constraint is added to the LCMV design, following
the formulation developed in Sect. 7.7, with a single derivative constraint with respect
to $\phi$ at $(\theta_1, \phi_1)$, i.e. the null direction. Figure 7.9 shows the results for this design.
Comparing with Fig. 7.8, two effects of the derivative constraint on the beam pattern
are observed. First, the width of the low-level magnitude of the beam pattern around
the null constraint in the $\phi$ direction has been increased. This is an expected effect
because now both the beam pattern function and its derivative along $\phi$ are zero. The
advantage of this null-width increase is improved robustness with respect to uncertainty
in the arrival direction of a potential disturbance. However, another change
to the beam pattern is a slight shift in the main lobe, such that its peak value seems
to be slightly to the left of the look direction shown in the contour plot of Fig. 7.9.
This can be regarded as a degradation, as we would like to have the maximum gain
exactly at the look direction. This issue will be discussed towards the end of this
design example.
In the following step, a derivative constraint with respect to θ at the null direction
has been added to the derivative constraint with respect to φ. Figure 7.10 shows
the resulting beam pattern. An increase in the width of the low-magnitude region
around the null constraint along θ is observed when compared with Fig. 7.9. This is
expected, as in this design the derivatives with respect to both θ and φ are set to zero
around the null direction.
In the final step of this design, two additional derivative constraints are included.
These are derivative constraints with respect to both $\theta$ and $\phi$, but this time at the
look direction, $(\theta_0, \phi_0) = (60^\circ, 36^\circ)$. Now, both the look direction and the null
are set to have zero derivatives. The effect on the main lobe is clear, as illustrated in
Fig. 7.11. The peak of the main lobe has shifted back to the look direction, because the
derivative constraints have forced the main lobe to have a local maximum at the look
direction. This has therefore corrected the undesired shift generated by the derivative
constraints at the null direction. However, this correction comes at a cost: with this
complex set of constraints, the LCMV introduces high side lobes at directions away
from the constraints. The overall behavior of this beam pattern may not be attractive,
although all imposed constraints are maintained. This example shows that constraints
have to be introduced with care, as they may come at the expense of reduction of
noise and disturbances arriving from other directions.

Fig. 7.8 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise and a single null constraint. Upper: contour plot, arrival direction of the plane wave holding
the desired signal is marked by the white “+” and direction of the null constraint is marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

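
Collecting the constraints used in the final step of this example, the constraint matrix and response vector can be written as sketched below, with $(\theta_0, \phi_0)$ the look direction and $(\theta_1, \phi_1)$ the null direction. The column ordering is an assumption made here for illustration; any ordering is valid as long as the entries of $\mathbf{c}$ follow the same order.

```latex
\mathbf{V}_{nm} = \left[\, \mathbf{v}_{nm0},\; \mathbf{v}_{nm1},\;
  \left.\frac{\partial \mathbf{v}_{nm}}{\partial\theta}\right|_{(\theta_1,\phi_1)},\;
  \left.\frac{\partial \mathbf{v}_{nm}}{\partial\phi}\right|_{(\theta_1,\phi_1)},\;
  \left.\frac{\partial \mathbf{v}_{nm}}{\partial\theta}\right|_{(\theta_0,\phi_0)},\;
  \left.\frac{\partial \mathbf{v}_{nm}}{\partial\phi}\right|_{(\theta_0,\phi_0)} \,\right],
\qquad
\mathbf{c} = [1, 0, 0, 0, 0, 0]^T
```

Six linear constraints are therefore imposed on the $(N+1)^2$ elements of $\mathbf{w}_{nm}$, which leaves fewer degrees of freedom for minimizing the output variance away from the constrained directions.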

Fig. 7.9 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise, a single null constraint and a derivative constraint with respect to φ at the null direction.
Upper: contour plot, arrival direction of the plane wave holding the desired signal is marked by the
white “+” and direction of the null constraint is marked by the dark “+”. Lower: balloon plot. For
color scheme see Fig. 7.1

Fig. 7.10 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise, a single null constraint and derivative constraints with respect to both θ and φ at the null
direction. Upper: contour plot, arrival direction of the plane wave holding the desired signal is
marked by the white “+” and direction of the null constraint is marked by the dark “+”. Lower:
balloon plot. For color scheme see Fig. 7.1

Fig. 7.11 Magnitude of the array beam pattern, |y(θ, φ)|, for the LCMV beamformer with sensor
noise, a single null constraint and derivative constraints with respect to both θ and φ at the null
direction and at the look direction. Upper: contour plot, arrival direction of the plane wave holding
the desired signal is marked by the white “+” and direction of the null constraint is marked by the
dark “+”. Lower: balloon plot. For color scheme see Fig. 7.1

References

1. Arfken, G., Weber, H.J.: Mathematical Methods for Physicists, 5th edn. Academic, San Diego
(2001)
2. Avni, A., Rafaely, B.: Interaural cross-correlation and spatial correlation in a sound field repre-
sented by spherical harmonics. In: First International Symposium on Ambisonics and Spherical
Acoustics (Ambisonics 2009). Graz, Austria (2009)
3. Cook, R.K., Waterhouse, R.V., Berendt, R.D., Seymour, E., Thompson, M.C.: Measurement
of correlation coefficients in reverberant sound fields. J. Acoust. Soc. Am. 27(6), 1072–1077
(1955)
4. Kinsler, L.E., Frey, A.R., Coppens, A.B., Sanders, J.V.: Fundamentals of Acoustics, 4th edn.
Wiley, New York (1999)
5. Spherical harmonics, low order differentiation with respect to θ (2013).
http://functions.wolfram.com/05.10.20.0001.01
6. Van Trees, H.L.: Optimum Array Processing (Detection, Estimation, and Modulation Theory,
Part IV), 1st edn. Wiley, New York (2002)
7. Yan, S., Sun, H., Svensson, U.P., Xiaochuan, M., Hovem, J.M.: Optimal modal beamforming
for spherical microphone arrays. IEEE Trans. Speech Audio Process. 19(2), 361–371 (2011)
Glossary

Acronyms

LCMP Linearly constrained minimum power
LCMV Linearly constrained minimum variance
MPDR Minimum power distortionless response
MVDR Minimum variance distortionless response
QCQP Quadratically-constrained quadratic program
SNR Signal-to-noise ratio
SOCP Second-order cone programming
WNG White noise gain

Mathematical Operators

‖·‖ 2-norm
(·)∗ Complex conjugate
(·)T Transpose
(·) H Hermitian or complex transpose
(·)† Pseudo matrix inverse
(·)! Factorial
∇ Gradient
∇x2 Laplacian in Cartesian coordinates
∇r2 Laplacian in spherical coordinates
E[ · ] Expectation
Im{·} Imaginary part
κ(·) Condition number of a matrix
Re{·} Real part
Λ(·) Rotation operator

Greek Symbols

αq , αqnm Sampling weights


α Vector of sampling weights
δnm , δn Kronecker delta function


δ(·) Dirac delta function
θ Elevation angle
φ Azimuth angle
Ω Solid angle

Symbols

a(·) Plane-wave decomposition in the space domain


anm Plane-wave decomposition in the spherical harmonics domain
bn (·) Function relating pressure to plane-wave decomposition
DF Directivity factor
DI Directivity index
dn Axis-symmetric beamforming weighting function
$d^n_{mm'}(\cdot)$ Wigner-d function
$D^n_{mm'}(\cdot)$ Wigner-D function
dn Axis-symmetric beamforming weighting vector
F Front-back ratio
$h_n(\cdot)$ Spherical Hankel function of the first kind
$h_n^{(2)}(\cdot)$ Spherical Hankel function of the second kind
I Unit matrix
jn (·) Spherical Bessel function of the first kind
k Wave number
k Wave vector denoting propagation direction
k̃ Wave vector denoting arrival direction
L 2 (·) Space of square-integrable functions
N Order of spherical harmonics
N Set of all natural numbers
n Noise vector in the space domain
nnm Noise vector in the spherical harmonics domain
Pn (·) Legendre polynomial
Pnm (·) Associated Legendre function
p Sound pressure in the space domain
pnm Sound pressure in the spherical harmonics domain
p Sound pressure vector
pnm Sound pressure vector in the spherical harmonics domain
Q Number of samples or microphones
R One-dimensional space of real numbers
R3 Three-dimensional space of real numbers
r Vector of spherical coordinates
Ry Euler rotation matrix for rotations about the y axis
Rz Euler rotation matrix for rotations about the z axis
S2 Unit sphere
S Spherical Fourier transform matrix
Sxx Cross-spectrum matrix in the space domain
Sxnm xnm Cross-spectrum matrix in the spherical harmonics domain


Snn Noise cross-spectrum matrix in the space domain
Snnm nnm Noise cross-spectrum matrix in the spherical harmonics domain
TM (·) Chebyshev polynomial
v Steering vector in the space domain
vnm Steering vector in the spherical harmonics domain
WNG White noise gain
w(·) Beamforming weighting function in the space domain
wnm Beamforming weighting function in the spherical harmonics domain
w Beamforming weighting vector in the space domain
wnm Beamforming weighting vector in the spherical harmonics domain
yn (·) Spherical Bessel function of the second kind
Ynm (·) Spherical harmonics
Y Matrix of spherical harmonics
Z Set of all integers