Sound modeling: source-based approaches
Federico Avanzini
3.1 Introduction
It was 1971 when Hiller and Ruiz envisioned the possibility of using numerical simulations of the wave
equation for sound synthesis applications.
[. . . ] This is a completely new approach to electronic sound synthesis insofar as the starting point is
the physical description of the vibrating object [. . . ]
A decade later McIntyre, Schumacher, and Woodhouse published their classic study on the use of non-
linear maps for modeling the generation of self-sustained oscillations in musical instruments.
[. . . ] a fast minicomputer could produce results at a cycle rate in the audible range. The result would
perhaps have some novelty: an electronic musical instrument based on a mathematical model of an
acoustic instrument [. . . ]
Today the algorithms described by these authors can be easily implemented in real-time on general-
purpose hardware, and it is common practice to use the term physical modeling to refer to sound mod-
eling techniques in which the synthesis algorithms are designed based on a description of the physical
phenomena involved in sound generation.
Direct sound representations, which are merely based on a description of the sound waveform, do not contain information about the way the sound has been generated and processed by the surrounding environment before arriving at the listener's ear. Sampling the sound signal in time does not assume any
underlying structure, or process, or generative model, in sound representation. The symbolic descrip-
tion is extremely poor, and as a consequence very little interaction with the sound representations is
allowed. Although signal processing techniques can provide meaningful modifications (e.g. pitch shift,
time stretching), sampling is basically a static, low-level description of sound.
High level representations of sound signals are necessarily associated with some abstract paradigms
that underlie sound production. As we have seen previously, when trying to develop a taxonomy of
sound synthesis methods a first distinction can be traced between signal models and source models. Any
algorithm which is based on a description of the sound pressure signal and makes no assumptions on the
generation mechanisms belongs to the class of signal models. Additive synthesis is a good example of
a signal model: as already mentioned, one major drawback of this technique is its enormous number of control parameters, since at least one amplitude and one pitch envelope have to be specified for each partial. Moreover, the sound representation does not have a strong semantic interpretation, since these parameters do not have a high-level meaning. Subtractive synthesis, with its source-filter structure, provides in a sense a
more semantic description of sound: in certain cases the two blocks can be given a physical interpretation
in terms of an exciting action and a resonating object, respectively. As an example, in the case of LPC
based speech synthesis the broadband input signal can be interpreted as a glottal source signal, and the
shaping filter represents the action of the vocal tract. However, in many other cases this interpretation
does not hold, and the control parameters in the model (e.g., the filter coefficients) do not have a high-
level meaning.
Source models aim at describing the physical objects and interactions that have generated an acoustic
event rather than the acoustic signal itself. This modeling approach often gives rise to rather complex de-
scriptions, that can lead to computationally expensive numerical algorithms. Several modeling paradigms
and techniques are available in the literature for deriving efficient implementations of such descriptions,
including lumped/distributed modeling, waveguide structures, finite difference methods, and so on. The
following sections describe in detail a few of these approaches. Here it is worth discussing another as-
pect, i.e. that of control. A direct consequence of assuming a source-based approach is that the resulting
control parameters have a straightforward physical interpretation: typical parameters in the models are
associated with masses, hardness/softness characteristics, blowing pressures, lengths: such a semantic
representation can in principle allow more intuitive interaction.
Source-based sound modeling paradigms are often grouped into two broad categories, namely lumped
and distributed models. Generally speaking, distributed models are more often used for describing vibrat-
ing bodies or air volumes where forces and matter depend on (and propagate along) both time and space.
One-, two- and three-dimensional resonators (such as strings, bars, acoustical bores, membranes, plates,
rooms, etc.) can be treated as continuous distributed systems, and mathematically described by means
of Partial Differential Equations (PDEs). One of the most popular distributed modeling approaches is
waveguide modeling, which will be discussed in detail in Sec. 3.4.
Although waveguides are extremely successful in modeling nearly elastic media, where the D'Alembert equation or some of its generalizations hold, they are not equally good at dealing with systems where these hypotheses are not met. As an example, oscillations in a bar are governed by the so-called Euler-Bernoulli equation, for which no traveling-wave schematization can be assumed. One possible approach for dealing with such systems is to use finite difference or finite element methods. These time-domain techniques are based on direct discretization of the PDEs and consequently have high computational
costs. On the other hand, when properly applied they provide stable and very accurate numerical sys-
tems.
As opposed to distributed models, lumped models are used when a physical system can be conve-
niently described without explicitly considering its extension in space. As an example, a mechanical
resonating body may be described in terms of ideal masses or rigid elements, connected to each other
with springs and dampers, and possibly non-linear elements. Similar considerations may apply to elec-
trical circuits and even to certain acoustic systems. The resulting models are naturally described in the
time domain, in terms of Ordinary Differential Equations (ODEs). Sec. 3.5 discusses lumped modeling
approaches, and includes an introduction to modal synthesis. Defining modal synthesis as a lumped mod-
eling approach may be questionable, since the modal formalism incorporates a “spatial” representation
(e.g. it is possible to inject a force in a specific point of a modal resonator, or to measure its displacement
in a specific point). On the other hand, representing a resonator as a combination of a finite number of modes corresponds to approximating the resonator as a mesh of point masses connected with springs and dampers, and in this sense modal synthesis may be regarded as a lumped modeling approach.
3.2.1.1 Oscillators
The simplest physical oscillating system is the damped second-order (or harmonic) oscillator. A generic
oscillator of this kind is described by the following linear differential equation:

ẍ(t) + 2α ẋ(t) + ω0² x(t) = uext(t), (3.1)

where uext is an external driving signal. The general solution of the homogeneous equation (i.e., Eq. (3.1) with uext = 0) is given by
x(t) = a0 e−αt cos(ωr t + ϕ0 ), (3.2)
where ωr = √(ω0² − α²). The parameters a0 and ϕ0 are uniquely determined by the initial conditions
x(0), ẋ(0). In particular the impulse response of the system corresponds to initial conditions x(0) = 0
and ẋ(0) = 1, and is given by h(t) = e−αt sin(ωr t)/ωr .
An electrical system representing a damped harmonic oscillator is the RLC circuit (Fig. 3.1(a)).
L d²i/dt²(t) + R di/dt(t) + (1/C) i(t) = 0, (3.3)

where i is the current in the circuit, and L, R, C are the inductance, resistance, and capacitance of the circuit, respectively. This is an equation of the form (3.1), with α = R/2L, ω0² = 1/LC. Therefore it
has a solution of the form (3.2).
In the mechanical case, an instance of damped harmonic oscillator is the mass-spring-damper system
depicted in Fig. 3.1(b):
mẍ(t) + rẋ(t) + kx(t) = 0, (3.4)
where x is the displacement signal, m, r, k are the mass, mechanical resistance, and spring stiffness.
Again this is an equation of the form (3.1), with α = r/2m, ω02 = k/m. Therefore it has a solution of
the form (3.2).
In certain situations, acoustic systems can also be described in terms of lumped elements that are
equivalent to resistance, capacitance, and inductance. The variables involved in this case are air-flow
(or volume velocity) u(t), measured in m3 /s, and acoustic pressure p(t), measured in Pa. When the
dimensions of an acoustic element are much less than the sound wavelength, then the acoustic pressure,
p can be assumed constant over the element. In this case, the acoustic behavior of the element is, at least
at low frequencies, very simple. In particular, the Helmholtz resonator (depicted in Fig. 3.1(c)) behaves
to a good degree of approximation as a second-order oscillator. We analyze this system in terms of three
main elements: the opening, the neck, and the cavity.
Figure 3.1: Second order electrical, mechanical, and acoustic oscillators; (a) a RLC circuit; (b) a
mass-spring-damper system; (c) a Helmholtz resonator.
Resistive phenomena are observed during the passage of acoustic airflow through the opening, due
to a pressure difference ∆pop (t): the flow behavior is dominated by viscous and thermal losses and it is
reasonably assumed to be in phase with the acoustic pressure. Therefore the relation ∆pop (t) = Ru(t)
holds at the opening where the constant R is termed fluid-dynamic resistance. Simple inertial behaviors
are observed in the cylindrical neck. The air mass inside this tube is m = ρair SL (ρair being the air
density, S the cross-sectional area, and L the length). If a pressure difference ∆ptube (t) is applied at the
tube ends, the enclosed air behaves like a lumped mass driven by the force ∆ptube , and Newton’s law
implies
∆ptube (t) = ρair SL · v̇(t), (3.5)
where the relation u(t) = Sv(t) has been used, and v(t) indicates particle velocity. Finally, the cavity
has an elastic behavior. Consider the volume V (t) of air inside the cavity: the contraction dV (t) caused
by a pressure difference ∆pcav (t) is such that −ρair c2 · dV /V = ∆pcav . As a consequence, a new air
volume −dV can enter the cavity. By definition, this equals the integral of u(t) over time, therefore

−dV(t) = ∫₀ᵗ u(t′)dt′ = V/(ρair c²) · Δpcav(t),   and thus   S Δpcav(t) = − ρair S²c²/V · ∫₀ᵗ v(t′)dt′, (3.6)
which represents a linear spring with stiffness ρair S²c²/V. Both the air mass in the tube and the resistance at the opening impede the same flow u, and are therefore in a "series" connection. This flow u enters the cavity, so that the cavity is in series with the other two elements. The resulting equation for the particle displacement x is
(ρair SL) ẍ(t) + R ẋ(t) + (ρair S²c²/V) x(t) = 0. (3.7)
Again this is an equation of the form (3.1), and therefore it has a solution of the form (3.2).
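To make the correspondence concrete, here is a minimal Matlab/Octave sketch (not taken from the text; the parameter values are assumptions) that computes α, ω0 and ωr for the mass-spring-damper parametrization (3.4) and plots the impulse response of Sec. 3.2.1.1. The same lines apply to the electrical and acoustic cases with α = R/2L, ω0² = 1/LC and α = R/(2ρair SL), ω0² = Sc²/(LV), respectively.

% Sketch: impulse response of the damped oscillator (3.1)-(3.2),
% mass-spring-damper parametrization (assumed example values).
m = 0.01; r = 0.5; k = 1e4;        % mass (kg), resistance (kg/s), stiffness (N/m)
alpha = r/(2*m);                   % decay rate
w0 = sqrt(k/m);                    % undamped angular frequency
wr = sqrt(w0^2 - alpha^2);         % damped oscillation frequency
Fs = 44100; t = 0:1/Fs:0.5;
h = exp(-alpha*t).*sin(wr*t)/wr;   % impulse response h(t) = e^{-alpha t} sin(wr t)/wr
plot(t, h); xlabel('t (s)'); ylabel('h(t)');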
3.2.1.2 Impedance
The examples in the previous section show that in a large class of systems it is possible to construct pairs of variables (often called Kirchhoff variables) with the property that their product has the dimensions of power (kg m²/s³). In electrical systems such a pair of variables is given by (v, i), voltage and current.
Integro-differential relations can be found that relate these two variables, in particular three elementary
relations define the fundamental quantities resistance R, inductance L and capacitance C. In the Laplace
domain, the integro-differential equations are turned into simple algebraic relations:
V(s) = R · I(s),   V(s) = sL · I(s),   V(s) = 1/(sC) · I(s). (3.8)
These are particular examples of a more general relation in linear electric circuits, V(s) = Z(s) · I(s), where the quantity Z(s) is called impedance of the circuit and is defined as the ratio between the Laplace
transforms of voltage and current intensity. The inverse of Z(s) is called admittance, and it is usually
denoted as Γ(s) = Z(s)−1 .
Similar considerations apply to mechanical systems. Force f (Kg m/s2 ) and velocity v (m/s) are the
mechanical Kirchhoff variables, since their product is a power. Again, the ratio of these two variables
in the Laplace domain is defined as (mechanical) impedance, and its inverse is the (mechanical) admit-
tance. In the mechanical oscillator described above we have already introduced the three mechanical
equivalents of resistance, capacitance and inductance. The direct proportionality f (t) = rv(t) defines
ideal linear viscous forces, and by comparison with the first of Eqs. (3.8) r can be regarded as a mechan-
ical resistance. The inertial mass m of a non-relativistic body is defined as the ratio between the total
force acting on it and its acceleration, i.e. f (t) = ma(t) = mv̇(t), or F (s) = msV (s) in the Laplace
domain, and by comparison with the second equation in (3.8) m can be regarded as a mechanical induc-
tance. Finally, in an ideal linear spring the elastic force is proportional to the elongation of the spring: f(t) = kx(t) = k ∫₀ᵗ v(t′)dt′, or F(s) = (k/s) V(s) in the Laplace domain, and by comparison with
the third equation in (3.8) the stiffness k can be regarded as a mechanical capacitance. Therefore the
aggregate impedance Z(s) of a second-order mechanical oscillator is Z(s) = ms + k/s + r.
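As a small numerical illustration (example values assumed, not from the text), the aggregate mechanical impedance can be evaluated over frequency; its magnitude is minimum, and equal to r, at the resonance ω0 = √(k/m), where the oscillator is easiest to set into motion.

% Sketch: magnitude of the mechanical impedance Z(jw) = jw*m + r + k/(jw).
m = 0.01; r = 0.5; k = 1e4;             % assumed example values
w = logspace(1, 5, 1000);               % angular frequency axis (rad/s)
Z = 1i*w*m + r + k./(1i*w);             % series connection of m, r, k
loglog(w, abs(Z)); xlabel('\omega (rad/s)'); ylabel('|Z(j\omega)|');
% |Z| reaches its minimum r at w0 = sqrt(k/m)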
As far as acoustic systems are concerned, acoustic pressure p (Kg/ms2 ) and volume velocity u (m3 /s)
are the acoustic Kirchhoff variables, since their product is a power. Again, the ratio of these two variables
in the Laplace domain is defined as (acoustic) impedance, and its inverse is the (acoustic) admittance.
In the Helmoltz resonator described above we have already introduced the three acoustic equivalents
of resistance, capacitance and inductance. More precisely, fluid-dynamic resistance is associated to
viscous and thermal losses at narrow openings: p(t) = Ru(t). Fluid-dynamic inductance is associated
to short, open tubes: p(t) = ρair L/S · u̇(t), or P(s) = ρair Ls/S · U(s) in the Laplace domain. Fluid-dynamic capacitance is associated with enclosed air volumes: p(t) = ρair c²/V · ∫ u(t′)dt′, or P(s) = ρair c²/(Vs) · U(s) in the Laplace domain.
If we introduce a suitable set of variables q1,2 in place of x1,2 , the above equations can be decoupled, or
diagonalized:
with q1 = x1 + x2 , q2 = x1 − x2 , ω02 = k/m. The normal modes qi (i = 1, 2) are uncoupled and the xi
are linear combinations of the qi .
This simple example can be extended to more complicated systems, composed of N masses coupled
through springs and dampers. One can in general reformulate the system in terms of normal modes
of oscillation, and the oscillation of each point mass can be seen as a linear combination of N normal
modes, each of which obeys the equation of a second-order (damped) harmonic oscillator. We will return to these concepts in Sec. 3.5.2, when discussing modal synthesis.
Figure 3.2: (a) cylindrical coordinates (r, ϕ, z); (b) spherical coordinates (r, θ, ϕ).
two main assumptions are that (i) the infinitesimal string segment dx moves only in the vertical direction,
so that its acceleration can be computed using only the transverse component of the tension as the acting
force; and (ii) the amplitude of the vibrations is very small.
There are interesting cases where acoustic disturbances can be assumed to be one-dimensional up
to a reasonable approximation. Propagation of acoustic pressure in a cylindrical or in a conical tube
is an example. Using cylindrical coordinates (see Fig. 3.2(a)), one can show that for cylindrical bores
one-dimensional longitudinal pressure waves in the z direction are described using Eq. (3.12), with z
in place of x and with y representing acoustic pressure. Using spherical coordinates (see Fig. 3.2(b)),
one can show that for conical bores one-dimensional spherical pressure waves are described through the
equation

1/r² ∂/∂r ( r² ∂R/∂r )(r, t) = 1/c² ∂²R/∂t²(r, t), (3.13)

in which R(r) represents acoustic pressure, and the Laplacian operator is expressed in spherical coordinates as ∇² = 1/r² ∂/∂r(r² ∂/∂r) + 1/(r² sin θ) ∂/∂θ(sin θ ∂/∂θ) + 1/(r² sin²θ) ∂²/∂ϕ². Using the substitution R = R̃/r, it is
easily seen that Eq. (3.13) reduces to the one dimensional D’Alembert equation (3.12) for the variable
R̃.
This is the solution to Eq. (3.12) originally proposed by D’Alembert himself. The two functions y ±
describe waveforms that translate rigidly with velocity c, in the right-going and left-going directions,
respectively. Their shape is determined by the boundary conditions (in space) and the initial conditions
(in time). As an example, if y represents the displacement of a vibrating string the initial conditions are
represented by an initial displacement and an initial velocity:
y0(x) = y(x, 0),   v0(x) = ∂y/∂t(x, 0). (3.16)
Figure 3.3: Boundary conditions and wave reflections; (a) fixed string end and negative wave reflection;
(b) free string end and positive wave reflection.
Boundary conditions impose constraints on the solution at the boundary of its domain. As an example,
if y represents the displacement of a vibrating string boundary conditions impose values for y and its
derivatives at the boundary points x = 0 and x = L. The two most common boundary conditions for
a string are the fixed end condition and the free end condition, which read as follows for the boundary
point x = 0 (similar equations can be written for the boundary point x = L):

y(0, t) = 0  ⇒  y⁻(ct) = −y⁺(ct)  (fixed end),     ∂y/∂x(0, t) = 0  ⇒  y⁻(ct) = y⁺(ct)  (free end). (3.17)
These equations show that boundary conditions imply “reflection” conditions on the traveling waves y ±
(see Fig. 3.3).
s″/s (x) = 1/c² · q̈/q (t)   ⇒   s″(x) = α s(x),   q̈(t) = c²α q(t), (3.18)

for some α ∈ R. This last equation follows from the fact that s″/s is a function of space only, while q̈/q is a function of time only. Therefore these ratios must necessarily be equal to a constant α.
Now look at the spatial equation. In order for the boundary conditions to be satisfied s has necessarily
to be a non-monotonic function and consequently the condition α < 0 must hold, so that s obeys the
equation of a second-order oscillator (otherwise s(x) would be an exponential function). Moreover, since
it has to be s(0) = s(L) = 0, only a numerable set of spatial frequencies are allowed for s:
s(x) = √(2/L) sin(kn x),   with kn = nπ/L, (3.19)

where √(2/L) is just a normalization factor.
Once the spatial equation has been solved, the temporal equation gives
q(t) = A sin(ωn t + ϕ),   with ωn = c kn = nπc/L, (3.20)
where A and ϕ depend on initial conditions. Again, only a numerable set of temporal frequencies ωn =
ckn are allowed. Spatial and temporal frequencies are proportional to each other through the constant c.
In conclusion we have obtained the following stationary waves, or normal modes:
yn(x, t) = √(2/L) sin(ωn t + ϕn) sin(kn x). (3.21)
The general solution to Eq. (3.12) can be expressed as a linear combination of these modes:
y(x, t) = Σ_{n=1}^{+∞} An yn(x, t), (3.22)
where An , ϕn are determined by the initial conditions. This latter equation re-states what we already
know: a periodic signal, such as the one generated in an ideal string with ideal boundary conditions, can
be expressed as a series of harmonically-related sinusoidal signals.
Note that the Fourier solution, expressed in term of normal modes, and the D’Alembert solution,
expressed in terms of traveling waves, are equivalent. In fact a standing wave yn (x, t) can be viewed
as a superposition of sinusoidal traveling waves. More precisely, using the Werner formulas1 a standing
wave can be written as
yn(x, t) = 1/√(2L) { cos[kn(ct − x) + ϕn] − cos[kn(ct + x) + ϕn] }. (3.23)
Therefore a standing wave is the sum of two sinusoidal waves y ± that translate rigidly with velocity c, in
the right-going and left-going directions, respectively. This proves the equivalence of the D'Alembert
and Fourier solutions.
Note however that normal-mode solutions are more general than traveling-wave solutions: already
a simple system like a one-dimensional bar, described by a 4th order PDE, does not admit a solution in
terms of traveling waves while its normal modes can be written analytically.
Figure 3.4: A comb filter; (a) block scheme and (b) magnitude response.
all about, and yet it is structurally simple enough to be discussed in a limited amount of pages. Finally,
from a historical perspective it can be regarded as the first prototype of a waveguide approach to string
modeling: it is true that the original formulation of the algorithm did not contain any physical interpre-
tation. What is unquestionable, however, is that the KS algorithm is structurally identical to the simplest
waveguide models that we are going to examine in the next sections.
will show that choosing one sign or the other corresponds to describing two different boundary conditions
(e.g., an open termination versus a closed termination in an acoustical bore).
M-3.1
Write a function that computes the output of the comb filter of Fig. 3.4, given a desired fundamental frequency f0
and a factor g < 1.
M-3.1 Solution
function y = ks_simplecomb(f0, g)
global Fs;
m = round(Fs/f0);    % length of the delay line
d = dline_init(m);   % create a delay-line object
% (a self-contained sketch of the complete filter is given below)
The input signal x is defined in accordance with the KS algorithm specifications (see next section). Zero padding of x is chosen in such a way that the output signal has time to decay by 60 dB (see Chapter Sound in space). Matlab/Octave are very inefficient at computing long loops, but we use this approach for consistency with the next examples; in particular we have used two auxiliary functions, one that initializes a delay-line structure and one that computes one output sample from it:
function f = dline_init(d)   % (header reconstructed from the surrounding text)
f.x = 0; f.y = 0;
if (floor(d) == d) f.d = d;  % ok, d is a valid integer delay
else error('Not a valid delay');
end
f.in = zeros(1, d);          % create buffer for past input values

function f = dline_compute(f)
% (body reconstructed as a sketch, assuming f.in stores the past d inputs, newest first)
f.y = f.in(end);             % output the input sample delayed by d samples
f.in = [f.x, f.in(1:end-1)]; % push the current input into the buffer
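Since the listing of M-3.1 shows only the beginning of the function, the following is a minimal self-contained sketch of the complete comb filter; the impulse input, the amount of zero padding, and the direct indexing of past output samples (used here instead of the delay-line object, for compactness) are assumptions.

function y = ks_simplecomb_sketch(f0, g)  % sketch, not the book's original listing
global Fs;
m = round(Fs/f0);                   % length of the delay line
x = [1, zeros(1, round(3*Fs))];     % impulse input plus zero padding for the decay
y = zeros(size(x));
for n = 1:length(x)
  if n > m, fb = g*y(n-m); else fb = 0; end  % feedback tap y[n-m]
  y(n) = x(n) + fb;                          % comb filter: y[n] = x[n] + g*y[n-m]
end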
Figure 3.5: Spectrogram of a plucked A2 guitar string. Note the harmonic structure and the decay rates, which increase with increasing frequency.
In the real world a nylon guitar string is one of the closest relatives of an ideal string and exhibits an almost perfectly harmonic spectrum. Figure 3.5 shows the spectrogram of a plucked guitar string: as
expected, a harmonic spectrum can be observed. However another relevant feature is that each harmonic
partial decays at a different rate, with lower partials surviving longer than higher partials.
On the other hand we have just seen that the IIR comb filter produces a spectrum in which all har-
monic peaks have the same magnitude, which means that the associated partials all decay in time at the
same rate. In order to simulate a frequency-dependent decay, one can insert a low-pass filter Hlp into the
feedback loop, as shown in Fig. 3.6(a): we call this structure a low-pass comb filter. Intuitively, at each
passage the high-frequency components are attenuated more strongly than low-frequency components.
The simplest low-pass filter that can be employed is the first-order FIR already examined in Chapter
Fundamentals of digital audio processing:
y[n] = 1/2 (x[n] + x[n − 1])   ⇒   Hlp(z) = 1/2 (1 + z⁻¹). (3.26)
Figure 3.6(b) shows the frequency response of the low-pass comb structure after the insertion of Hlp :
as expected higher resonances are less peaked and have larger bandwidths, because now the filter poles
have frequency-dependent magnitudes.
However the insertion of a low-pass filter in the structure also has a second effect: it introduces an additional half-sample delay, which can be observed by looking at the phase response of Hlp(z) and is qualitatively explained by the fact that this filter averages the current sample with the previous one. A consequence of this additional delay is that the fundamental frequency generated by the low-pass comb structure is now f0 = Fs/(m + 1/2) Hz. Moreover, a closer analysis would also show that the upper partials are no longer integer multiples of f0 = Fs/(m + 1/2), due to the insertion of Hlp in the loop.
These deviations from the harmonic series can also be noticed from the plot in Fig. 3.6(b).
In many cases the deviations introduced by the low-pass filter are very small, especially for the lower
partials and for values of g that are close to 1. However they can still be perceivable. As an example, if
Fs = 44.1 kHz and m = 100, then a half sample delay corresponds to a delay in the order of 10−5 s: in
this case the IIR comb produces a fundamental at Fs /m = 441 Hz, while the low-pass comb produces a
fundamental at Fs /(m + 1/2) ∼ 439 Hz.
M-3.2
Find the response of the complete system given in Fig. 3.6 and plot magnitude and phase responses for various
values of g and m.
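A possible solution sketch for M-3.2 (an assumption, not the book's own listing): the low-pass comb of Fig. 3.6 has transfer function H(z) = 1/(1 − g Hlp(z) z⁻ᵐ), so its denominator coefficients can be written explicitly and passed to freqz.

% Sketch for M-3.2: frequency response of the low-pass comb filter.
Fs = 44100; f0 = 441; g = 0.98;          % assumed example values
m = round(Fs/f0);                        % delay-line length
a = [1, zeros(1, m-1), -g/2, -g/2];      % denominator: 1 - (g/2)(z^-m + z^-(m+1))
[H, w] = freqz(1, a, 4096);
subplot(2,1,1); plot(w, 20*log10(abs(H))); ylabel('|H| (dB)');
subplot(2,1,2); plot(w, unwrap(angle(H))); ylabel('phase (rad)'); xlabel('\omega_d (rad)');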
Figure 3.6: Low-pass comb filter obtained through insertion of a low-pass element into the comb struc-
ture; (a) block scheme and (b) frequency response (the triangles mark the harmonic series lπ/L, l ∈ N).
The low-pass comb structure discussed so far is the core of the KS algorithm. However we have
not yet discussed which input signal should be fed to the filter in order to obtain an output sound. Since the impulse response of the filter is the signal that most closely resembles a plucked string sound, an obvious choice is to feed the filter with an impulse. A second possible choice, originally suggested by Karplus
and Strong, is to impose a random initial state (m past values of y) to the filter: although this choice
has hardly any physical interpretation,2 it has the benefit of providing significant initial excitation in the
high-frequency region, with a consequent perceptual effect of an initial noisy transient followed by a
harmonic steady-state signal.
M-3.3
Write a function that implements the KS algorithm using the low-pass comb of Fig. 3.6, given a desired funda-
mental frequency f0 and a factor g < 1.
M-3.3 Solution
function y = ks_lpcomb(f0, g)
global Fs;
m = round(Fs/f0);    % length of the delay line
d = dline_init(m);   % create a delay-line object
% (a self-contained sketch of the complete algorithm is given below)
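A minimal self-contained sketch of a possible solution is the following; the random initial state implements the Karplus-Strong prescription discussed above, while the signal length and the in-place use of y are choices made here.

function y = ks_lpcomb_sketch(f0, g)   % sketch, not the book's original listing
global Fs;
m = round(Fs/f0);                              % length of the delay line
y = [2*rand(1,m)-1, zeros(1, round(3*Fs))];    % random initial state, then zero input
for n = m+2:length(y)
  y(n) = y(n) + g*( y(n-m) + y(n-m-1) )/2;     % low-pass comb: y[n] = x[n] + g*(y[n-m]+y[n-m-1])/2
end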
2
It would be like imposing initial random displacements to points of a string, as we shall see in the next sections.
We want to design a filter with the same characteristics, i.e. flat magnitude response and linear phase
response (equivalently, with constant and coincident phase and group delays). However we want the
slope of the phase response to be an arbitrary phase delay τph , and not limited to integer values m.
Moreover, since any real delay τph can be written as the sum of an integer delay ⌊τph⌋ and a fractional delay 0 ≤ (τph − ⌊τph⌋) < 1, without loss of generality we restrict our attention to the design of fractional-delay filters Hτph with 0 ≤ τph < 1.
Note that the impulse response of an ideal delay filter is
hτph[n] = 1/2π ∫_{−π}^{+π} e^{−jωd τph} e^{jωd n} dωd = sinc(n − τph). (3.28)
If τph = m ∈ N this reduces to h[n] = δ[n − m]. However, if τph is not integer then this is a non-
causal filter with infinite impulse response, i.e. a non-realizable filter. This remark makes it clear that
we will not be able to find exact realizations of fractional-delay filters, and we will have to look for
approximations.
Hτph(z) = Σ_{k=0}^{N} bk z^{−k}. (3.29)
Starting from this general form, we have to design an Nth-order FIR filter approximating a constant magnitude and linear phase frequency response. Several criteria can be adopted to drive this approximation problem. One approach amounts to minimizing some error distance between the FIR filter (3.29) and the ideal fractional-delay filter defined previously. Possibly the most intuitive realization of this approach is the minimization of the least squared (LS) error function, defined as the L2 norm of the error frequency response E(e^{jωd}) = Hτph(e^{jωd}) − e^{−jωd τph} (i.e. E is the difference between the frequency responses of the FIR filter and the ideal fractional-delay filter).
A different approach, which we describe in somewhat more detail, amounts to setting the error function E(e^{jωd}) and its first N derivatives to zero at ωd = 0:

d^l E/dωd^l (e^{jωd}) |_{ωd=0} = 0,   l = 0, . . . , N. (3.30)
3
In chapter Auditory based processing we will see that pitch perception is a complex phenomenon, and that the perceived pitch
does not necessarily coincide with the fundamental frequency.
Figure 3.7: Linear interpolation filters (N = 1) for τph = 0, 0.1, . . . , 1; (a) amplitude response and (b)
phase delay.
This is called the maximally flat design at ωd = 0, since it tries to make the error function as flat as
possible around the value 0, in the vicinity of zero frequency. Substituting Eq. (3.29) in these latter
N + 1 equations yields
Σ_{k=0}^{N} k^l bk = τph^l,   l = 0, . . . , N   ⇔   V b = τ, (3.31)
where V is a Vandermonde-type matrix with entries k^l, b = (b0, . . . , bN)ᵀ and τ = (1, τph, . . . , τph^N)ᵀ. The solution can be written in closed form as

bk = Π_{l=0, l≠k}^{N} (τph − l)/(k − l),   k = 0, . . . , N. (3.32)
It is interesting to notice that the FIR filter coefficients obtained by this method are equal to those of the
Lagrange interpolation formula for equally spaced abscissas. In other words, the FIR filter determined
by these coefficients estimates the value x[n − τph ] by interpolating a polynomial of order N over the
N + 1 values x[n], x[n − 1], . . . x[n − N ]. This leads to Lagrange interpolation.4 For N = 1 one obtains
simple linear interpolation, b0 = 1 − τph , b1 = τph . For the case τph = 1/2 we reobtain the first-order
FIR low-pass filter.
Plots for N = 1 and different values of τph are shown in Fig. 3.7. The phase delay remains reasonably
constant up to high frequency values (and is exactly constant in the cases τph = 0, 1/2, 1). Note however
that the magnitude response always has a low-pass character. This is a drawback of these FIR filters: high frequencies are attenuated due to the non-flat magnitude response. Using higher orders N makes it possible to keep the magnitude response close to unity and the phase response close to linear in a wider frequency
band. Of course, this is paid in terms of computational complexity.
M-3.4
Implement a fractional delay line using Lagrange interpolation.
M-3.4 Solution
4
We are not interested here in deriving the Lagrange interpolation method, which is reviewed in many textbooks of numerical
analysis.
function f = lagrangedline_init(d, N)  % (header reconstructed; name and signature assumed)
f.x = 0; f.y = 0;
f.d = d;                      % set delay (not necessarily integer)
f.in = zeros(1, floor(d)+2);  % create buffer for past input values
f.b = ones(1, N+1);           % coefficients of the Lagrange interpolator, Eq. (3.32)
tau = d - floor(d);           % fractional delay to be simulated
for k = 1:length(f.b)
  for l = 1:length(f.b)
    if (l~=k); f.b(k) = f.b(k)*(tau-(l-1))/((k-1)-(l-1)); end
  end
end

function f = lagrangedline_compute(f)
% (body reconstructed as a sketch, assuming f.in stores past inputs newest first and N <= 2)
f.y = f.b * f.in(floor(f.d) + (0:length(f.b)-1)).';  % FIR interpolation around the integer delay
f.in = [f.x, f.in(1:end-1)];                         % push the current input into the buffer
These functions can be tested in the KS algorithm (examples M-3.1 and M-3.3) in place of the integer
delay lines.
This is not the transfer function of a generic IIR filter: it is the transfer function of an Nth-order all-pass filter. According to the definition already given in Chapter Fundamentals of digital audio processing, an all-pass filter is a filter with a perfectly flat magnitude response. The filter in the above equation satisfies the property |Hτph(e^{jωd})| ≡ 1 by construction: this property can be proved by noting that, since the numerator polynomial is a mirrored version of the denominator polynomial A, the poles of a stable all-pass filter are located inside the unit circle and its zeros are located outside the unit circle, with the same angles and with the inverse radii of the corresponding poles.
Since the above IIR filter satisfies by construction one of the two properties of an ideal delay filter
(flat magnitude response), we can now focus on the second one (linear phase response). The phase
response of an all-pass filter is found to be
arg[Hτph(e^{jωd})] = −N ωd + 2 arg[1/A(e^{jωd})] = −N ωd + 2 arctan[ Σ_{k=0}^{N} ak sin(kωd) / Σ_{k=0}^{N} ak cos(kωd) ]. (3.34)
Therefore the phase response, the phase delay, and the group delay are all highly non-linear functions of the filter coefficients. This means that one cannot expect design formulas for the all-pass filter coefficients as simple as those available for FIR filters. Instead, essentially only iterative optimization techniques for the minimization of traditional error criteria are available.
Possibly the only design technique that has a closed-form solution is the maximally flat group delay
design. Let us start considering an all-pole low-pass filter with transfer function 1/A(z −1 ). It has been
Figure 3.8: First-order Thiran allpass filters for τph = 0, 0.1, . . . , 1; (a) phase response and (b) phase
delay.
shown that the condition of maximally flat group delay at ωd = 0 for this filter yields the following
analytic solution:
ak = (−1)^k · C(N, k) · Π_{l=0}^{N} (2τph + l)/(2τph + k + l), (3.35)

where C(N, k) = N!/(k!(N−k)!) is the binomial coefficient. When τph > 0 the filter is stable. This result can be applied to
our problem using Eq. (3.34): since the fractional phase delay of Hτph is twice that of 1/A, a maximally flat all-pass filter with coefficients

ak = (−1)^k · C(N, k) · Π_{l=0}^{N} (τph + l)/(τph + k + l), (3.36)
approximates the ideal delay filter with total delay N + τph . This is known as Thiran all-pass filter
approximation.
As an example let us look at the first-order all-pass filter
Hτph(z) = (a1 + z⁻¹)/(1 + a1 z⁻¹), (3.37)

with |a1| < 1 for stability. The plots of its phase response and phase delay are shown in Fig. 3.8. In the
low-frequency region, the phase response can be approximated as follows:
arg[Hτph(e^{jωd})] ≈ − sin ωd/(a1 + cos ωd) + a1 sin ωd/(1 + a1 cos ωd) ≈ −ωd (1 − a1)/(1 + a1), (3.38)
i.e. the phase response is approximately linear with phase and group delay approximately equal to
(1 − a1 )/(1 + a1 ). Therefore given a desired phase delay τph one chooses
a1 = (1 − τph)/(1 + τph). (3.39)
This corresponds to the Thiran approximation with N = 1.
Thiran filters have complementary drawbacks with respect to Lagrange filters: although they provide
flat magnitude response, detuning of higher frequencies occurs due to phase non-linearity. In order to
have phase response approximately linear in a wider frequency range one has to use higher orders, at the
expense of higher complexities.
M-3.5
Implement a fractional delay line using Thiran filters.
M-3.5 Solution
Same approach as before: one function initializes the line, and one computes an output sample from it.
function f = thirandline_init(d, N)  % (header reconstructed; name and signature assumed)
f.x = 0; f.y = 0;
f.d = d;                       % set delay (not necessarily integer)
f.in = zeros(1, floor(d)-N);   % create buffer for past input values
                               % the Thiran filter accounts for the remaining N+(d-floor(d))
f.state = zeros(1, N);         % state of the Thiran filter
f.a = zeros(1, N+1);           % coefficients of the Thiran filter, Eq. (3.36)
tau = d - floor(d);            % fractional delay to be simulated
for k = 0:N
  ak = 1; for l = 0:N; ak = ak * (tau+l)/(tau+k+l); end
  f.a(k+1) = (-1)^k * nchoosek(N,k) * ak;
end

function f = thirandline_compute(f)
% (body reconstructed as a sketch: integer delay buffer followed by the Thiran
%  all-pass, run one sample at a time with Matlab's filter() and its state)
xd = f.in(end);                % sample delayed by floor(d)-N samples
f.in = [f.x, f.in(1:end-1)];   % push the current input into the buffer
[f.y, f.state] = filter(fliplr(f.a), f.a, xd, f.state); % all-pass adds the remaining N+tau
These functions can be tested in the KS algorithm (examples M-3.1 and M-3.3) in place of the integer
delay lines.
So far, only displacement y (for a string) and acoustic pressure p (for a cylindrical bore) have been con-
sidered in the wave equation. However, alternative wave variables can be used in strings and acoustical
Therefore, using this equation, force waves f± can be defined as f± := ±(T/c) ẏ±. On the other hand, the
transversal velocity in the same string is given by
v(x, t) = ∂y/∂t(x, t) = ẏ⁺(ct − x) + ẏ⁻(ct + x). (3.41)
From this, velocity waves v ± are defined as v ± := ẏ ± . As we have seen in Sec. 3.2.1, the force-
velocity variable pair represent the mechanical Kirchhoff variables, in analogy with voltage and current
in electrical systems. From the previous equations it immediately follows that
f±(ct ∓ x) = ±Z0 v±(ct ∓ x),   with Z0 = T/c = √(Tµ). (3.42)
The quantity Z0 takes the name of wave (or characteristic) impedance of the string, and its reciprocal
Γ0 = Z0−1 is termed wave admittance. Note that using Z0 both the force f and the velocity v can be
related to the force waves f ± . Namely, the following relations hold:
f = f⁺ + f⁻,   v = 1/Z0 (f⁺ − f⁻),
f⁺ = (f + Z0 v)/2,   f⁻ = (f − Z0 v)/2, (3.43)
that transform the pair (f, v) into the pair (f + , f − ), and vice versa.
Wave impedance can be defined also in a cylindrical bore. In this case the Kirchhoff variables are
taken to be pressure p and flow u (volume velocity). These can be related through the wave impedance
Z0 : p± (ct ± x) = ±Z0 u± (ct ± x), where Z0 = ρair c/S and S is the constant cross-sectional area of
the bore. For conical geometries, the cross-section S is not constant and the definition of Z0 has to be
generalized. The wave impedance is then defined as a function Z0 (s) such that the relations P ± (r, s) =
±Z0 (s)U ± (r, s) hold in the Laplace domain. It can be seen that Z0 (s) = ρair c/S · [rs/(rs + c)].
In summary, Kirchhoff and wave variables in elastic media obeying the D’Alembert equation are
related through the wave impedance and Eqs. (3.43). This result provides the basis for developing 1-D
waveguide structures.
p(mXs, nTs) = p⁺(ncTs − mXs) + p⁻(ncTs + mXs) = p⁺((n − m)cTs) + p⁻((n + m)cTs), that is, with the compact notation p[m, n] ≜ p(mXs, nTs) and p±[n] ≜ p±(ncTs),

p[m, n] = p⁺[n − m] + p⁻[n + m]. (3.44)
Figure 3.9: Lossless waveguide sections with observation points at position x = 0 and x = mXs = L;
(a) cylindrical section; (b) conical section.
The term p+ [n − m] in Eq. (3.44) can be thought of as the output from a digital delay line of length m,
whose input is p+ [n]. Analogously, the term p− [n + m] can be thought of as the input of a digital delay
line with the same length, whose output is p− [n]. This remark leads to the definition of a waveguide
section as a bidirectional delay line, as depicted in Fig. 3.9(a). The horizontal direction of this struc-
ture has a straightforward physical interpretation: it corresponds to the position x along the axis of the
cylindrical bore. In the example depicted in Fig. 3.9(a), two “observation points” have been chosen at
x = 0 and x = mXs = L. At these points, the pressure signal at time n is reconstructed by summing
the corresponding pressure waves p± .
A very similar structure can be outlined for numerically simulating a pressure distribution in an ideal
lossless conical bore. In this case, propagation is described by the one-dimensional equation (3.13),
whose general solution is given by
R(r, t) = 1/r [R̃⁺(ct − r) + R̃⁻(ct + r)]. (3.45)
The conical waveguide is therefore defined as in Fig. 3.9(b). Observation points can be chosen analo-
gously to the cylindrical case.
At the beginning of this discussion we have assumed for simplicity that L = mXs . However this
quantization of the allowed lengths is too coarse for our purposes: with a sampling rate Fs = 44.1 kHz
and with a wave velocity c = 347 m/s (sound velocity in air at 20 °C), the resulting spatial step is Xs = 7.8 · 10⁻³ m. Length differences of this magnitude produce perceivable pitch variations in a wind instrument. One way to overcome this limitation is to include in the structure a fractional-delay filter (see Sec. 3.3.2) that provides fine tuning of the length of a waveguide section.
Figure 3.10: Ideal waveguide terminations: (a) positive reflection; (b) negative reflection.
Looking at Fig. 3.9 we immediately realize that we still need one element in order to arrive at a computational structure that describes e.g. a string with fixed ends or a cylindrical tube section with open ends: boundary conditions.
In Sec. 3.2.2 we have briefly discussed fixed-end and free-end boundary conditions for the displace-
ment y(x, t)|x=0,L of a vibrating string. These can be immediately turned into reflection conditions for
both velocity waves and force waves. As an example, a fixed-end condition implies that the velocity is 0 at the boundaries, therefore the reflection condition v⁺ = −v⁻ applies at both points. By looking at Eq. (3.43), one also sees that the zero-velocity condition translates into the reflection condition f⁺ = f⁻ at
both points. Therefore wave variables at the boundaries are multiplied by either 1 or −1 (see Fig. 3.10).
More in general, reflection conditions can be derived by formulating boundary conditions for Kirch-
hoff variables and then using Eq. (3.43) to relate Kirchhoff variables to wave variables. A second relevant
example is that of a cylindrical bore of length L, with a closed end at x = 0 and an open end at x = L.
The first condition implies u = u+ + u− = [p+ − p− ]/Z0 = 0 at x = 0 (no flow through a closed end),
which in turn implies the reflection conditions u+ = −u− and p+ = p− . The second condition implies
p = p+ + p− = 0 at x = L (p matches the atmospheric pressure at the open boundary), which in turn
implies the reflection conditions p− = −p+ and u+ = u− .
With these concepts in mind we can now go back to Sec. 3.3.1 and reinterpret the IIR comb structure
used to construct the KS algorithm. The IIR comb can be viewed as a pair of waveguide sections of
length m/2 samples in which traveling waves circulate and reflect at the boundaries according to some
reflection condition. If the coefficient g has a positive sign, as in Eq. (3.24), the corresponding condition
is that of a string fixed at both ends. The signal traveling into the filter can be interpreted either as
a velocity wave (two sign inversions at the boundaries) or as a force wave (no sign inversions at the
boundaries). As a result a harmonic spectrum is generated that contains all the partials. On the other
hand, if the coefficient g has a negative sign, as in Eq. (3.25), the corresponding condition is e.g. that
of a cylindrical bore with one open end and one closed end. The signal traveling into the filter can be
interpreted either as a flow wave or as a pressure wave (both with one sign inversion at the boundaries).
As a result a harmonic spectrum is generated that contains only the odd partials.
Figure 3.11: Waveguide simulation of dissipation and dispersion phenomena through insertion of loss and dispersion filters.
Real wave propagation deviates from the ideal case mainly because of two phenomena: dissipation and dispersion. Both can be accounted for by adding proper time, space, or time-space derivatives of different orders to the ideal wave equation. Correspondingly, the basic waveguide structure is modified by inserting appropriate loss and dispersion filters in the loop, as in Fig. 3.11.
3.4.2.1 Dissipation
Energy dissipation occurs in any real vibrating medium. In an acoustical bore this is due to air viscosity,
thermal conduction and wall losses. Dissipation in a string comes from internal losses related to elastic
properties of the material, energy transfer through terminations, and friction with air. For clarity, con-
sider the pressure distribution in a cylindrical bore. In the simplest approximation, all of the dissipation
phenomena can be incorporated in the D’Alembert equation by including an additional term proportional
to the first time derivative. As an example, a first-order approximation of a string with linear density µ,
tension T , and dissipation is given by the modified D’Alembert equation
µ ∂²p/∂t²(x, t) = T ∂²p/∂x²(x, t) − d1 ∂p/∂t(x, t). (3.46)
In the limit of small d1 , Eq. (3.46) still admits a traveling wave solution, which can be digitized with the
same procedure described in the ideal case:
p(x, t) = e^{−d1 x/2c} p⁺(ct − x) + e^{d1 x/2c} p⁻(ct + x),   then
p[m, n] = g^m p⁺[n − m] + g^{−m} p⁻[n + m],   with g = e^{−d1 Ts/2} < 1. (3.47)
Thus the traveling waves are exponentially damped along the propagation direction, and this phenomenon
can be incorporated in the waveguide structure. In many real-world phenomena, however, losses increase
with frequency. As an example, the dissipative force exerted by the air on a moving string section is,
to a first approximation, directly proportional to the frequency of oscillation. Similar remarks apply to
the effects of internal material losses. A better approximation of dissipation phenomena in a string is
provided by the equation
With these concepts in mind we can go back again to Sec. 3.3.1 and reinterpret the comb structures. In the simple IIR comb filter, the coefficient g < 1 plays the role of the loss factor g^m, and
accordingly introduces equal decay times to all partials. In the low-pass comb filter, the low-pass trans-
fer function Hlp plays the role of the loss filter Hloss (z), and accordingly introduces frequency-dependent
decay times to the partials.
There are many techniques for designing a loss filter Hloss to fit a real object. In this section we outline
a relatively simple approach to fit a lossy waveguide model to a real string sound.
First the sound of the target string has to be recorded and analyzed. This can be done using e.g. the
sinusoidal peak detection/continuation algorithms discussed in Chapter Sound modeling: signal based approaches.
As a result from the analysis stage, the frequencies fk and the decay times τk (k = 1, . . . , N ) of the
first N partials can be estimated. In particular τk is defined as the time required for the amplitude of the kth partial to decay to 1/e of its initial amplitude. A robust way of calculating the τk's
is fitting a line by linear regression on the logarithm of the amplitude envelopes derived from the peak
continuation algorithm.
The estimated parameters fk, τk specify the magnitude of Hloss over a set of N points:

|Hloss(e^{j 2πfk/Fs})| = e^{−k/(fk τk)},   k = 1, . . . , N. (3.49)

A simple loss filter that can be fitted to this specification is the one-pole filter

Hloss(z) = g (1 + α)/(1 + αz⁻¹), (3.50)
with −1 < α < 0 and g < 1. One can show that in this case the approximate analytical formulas for the
decay times are
1/τk ≃ a + b (2πfk/Fs)²,   with a = f0(1 − g),   b = −f0 α/(2(α + 1)²), (3.51)
and where f0 is the fundamental frequency. Therefore the decay rate 1/τk is a second-order polynomial
of fk with even order terms. Consequently a and b can be straightforwardly determined by polynomial
regression from the prescribed decay times, and finally g and α are computed from a and b via the
inverse of Eqs. (3.51). In most cases, the one-pole loss filter yields good results. Nevertheless, when
precise rendering of the partial envelopes is required, higher-order filters have to be used.
M-3.6
Realize a complete loss filter design procedure, to be applied to a guitar sound. Use the spectral analysis tools
to estimate the decay times of the guitar string partials. Use Eq. (3.51) to design the filter (3.50).
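A possible design sketch for M-3.6 is given below. It assumes that the partial frequencies fk and decay times τk have already been estimated with the spectral analysis tools, and it inverts the reconstructed form of Eq. (3.51); the function name and interface are assumptions.

% Sketch: one-pole loss filter design from measured decay times, Eq. (3.49)-(3.51).
function [g, alpha] = lossfilt_design(fk, tauk, f0, Fs)
x = (2*pi*fk/Fs).^2;             % regressor: squared normalized partial frequencies
c = polyfit(x, 1./tauk, 1);      % linear fit 1/tauk ~ a + b*x
b = c(1); a = c(2);
g = 1 - a/f0;                    % invert a = f0*(1-g)
r = roots([2*b, 4*b + f0, 2*b]); % invert b = -f0*alpha/(2*(alpha+1)^2)
alpha = r(r > -1 & r < 0);       % keep the root in (-1,0), as required by Eq. (3.50)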
3.4.2.3 Dispersion
A second important phenomenon in natural wave propagation is dispersion. In a string, dispersion is
introduced by string stiffness, i.e. the phenomenon by which a string opposes resistance to bending.
Such a shearing force can be modeled as a fourth spatial derivative, which is introduced as an additional
term in the D’Alembert equation:
M-3.7
Implement the waveguide structure of Fig. 3.11, including the loss filter (3.50) and an all-pass filter to simulate
dispersion.
In this section we outline one possible approach to the dispersion filter design. The total phase delay
over a waveguide loop of length 2m, with loss and dispersion filters, is

τph(fk) = kFs/fk = 2m + τloss(fk) + τdisp(fk). (3.55)
With everything else known, this equation provides a phase delay specification for the dispersion filter:

τdisp(fk) = kFs/fk − 2m − τloss(fk). (3.56)
Given L estimated partial frequencies {fk }k=1,...,L , one can then design an all-pass filter of order N < L
as follows. First, for each partial compute the quantities
βk = −1/2 [τdisp(fk) − 2Nπfk],   k = 1, . . . , L. (3.57)
Then, filter coefficients are computed by solving the system
Σ_{j=1}^{N} aj sin(βk + 2jπfk) = sin βk,   k = 1, . . . , L. (3.58)
M-3.8
Realize a complete dispersion filter design procedure, to be applied to a piano sound. Use the spectral analysis
tools to estimate the frequencies of the piano partials. Use Eq. (3.58) to design the all-pass filter.
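A possible sketch for M-3.8: given the target phase delays τdisp(fk) from Eq. (3.56), compute the βk of Eq. (3.57) and solve the (generally overdetermined) linear system (3.58) in the least-squares sense. The function name and the normalization of the frequencies fk follow the notation of the text and are assumptions here.

% Sketch: all-pass dispersion filter design by solving Eq. (3.58) in the LS sense.
function a = dispfilt_design(fk, taudisp, N)
beta = -0.5*(taudisp(:) - 2*N*pi*fk(:));   % Eq. (3.57)
M = zeros(length(fk), N);
for j = 1:N
  M(:, j) = sin(beta + 2*j*pi*fk(:));      % system matrix of Eq. (3.58)
end
a = M \ sin(beta);                         % least-squares solution for a_1 ... a_N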
Figure 3.12: Kelly-Lochbaum junction for two cylindrical bores with different areas.
the junction and the air coming out from the other side must be the same). These two requirements lead
to the following conditions at the junction:
u1 + u2 = 0, p1 = p2 = pJ . (3.59)
Using the Kirchhoff analogy p ↔ v (voltage) and u ↔ i (current), Eqs. (3.59) can be regarded as
describing a parallel junction. If pressure wave variables are introduced as in Eq. (3.43) (with p+ and
p− denoting incoming and outgoing waves, respectively), and the junction pressure pJ is used, then the
relation pl⁻ = pJ − pl⁺ (for l = 1, 2) holds. Substitution in the first of Eqs. (3.59) yields

0 = (u1⁺ + u1⁻) + (u2⁺ + u2⁻) = Γ1(p1⁺ − p1⁻) + Γ2(p2⁺ − p2⁻) = Γ1(2p1⁺ − pJ) + Γ2(2p2⁺ − pJ). (3.60)
From this, the junction pressure pJ can be expressed in terms of the incoming pressure waves p1,2⁺ as

pJ = 2 (Γ1 p1⁺ + Γ2 p2⁺)/(Γ1 + Γ2). (3.61)
p1⁻ = pJ − p1⁺ = − (Γ2 − Γ1)/(Γ2 + Γ1) p1⁺ + 2Γ2/(Γ2 + Γ1) p2⁺,
p2⁻ = pJ − p2⁺ = 2Γ1/(Γ2 + Γ1) p1⁺ + (Γ2 − Γ1)/(Γ2 + Γ1) p2⁺. (3.62)
And finally

p1⁻ = −ρ p1⁺ + (1 + ρ) p2⁺,
p2⁻ = (1 − ρ) p1⁺ + ρ p2⁺,   with ρ ≜ (Γ2 − Γ1)/(Γ2 + Γ1). (3.63)
These equations describe the Kelly-Lochbaum junction. The quantity ρ is called the reflection coefficient
of the junction. A scattering diagram is depicted in Fig. 3.12.
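In code, the scattering of Eq. (3.63) is a two-line operation; the sketch below (function name assumed) can be placed at the interface between two waveguide sections. For two cylindrical bores the admittances are Γl = Sl/(ρair c), so that ρ = (S2 − S1)/(S2 + S1).

% Sketch: Kelly-Lochbaum scattering junction, Eq. (3.63).
function [p1m, p2m] = kl_junction(p1p, p2p, rho)
p1m = -rho*p1p + (1+rho)*p2p;   % outgoing wave towards section 1
p2m = (1-rho)*p1p + rho*p2p;    % outgoing wave towards section 2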
This junction has been extensively used in so-called “multitube lossless models” of the vocal tract.
These are articulatory models where the vocal tract shape is approximated as a series of concatenated
cylindrical sections. Pressure wave propagation in each section is then described using digital waveg-
uides, and interconnections are treated as Kelly-Lochbaum junctions. However this very same junction
can be used to describe not only acoustic, but also mechanical structures. As an example, consider two
strings with different densities, connected at one point: this can be thought of as a series junction, since
the physical constraints impose that velocity (i.e., “current”) has to be the same on the left and right
Figure 3.13: Example of use of the Kelly-Lochbaum junction: (a) a parallel junction of two cylindrical
bores; (b) realization with two waveguide sections and a Kelly-Lochbaum junction.
sides, and the sum of forces (i.e., “voltages”) from the two sides must be zero. Analogously to the above
analysis, a series Kelly-Lochbaum junction can be derived in this case.
Terminations of a waveguide model are an interesting particular case of junctions. Consider an ideal
cylindrical bore, closed at one end: this boundary condition corresponds to an infinite impedance Z2 =
∞ (i.e., S2 = 0), and thus to a reflection coefficient ρ = −1. In other words, complete reflection occurs
and the relation p1⁻(0, t) = p1⁺(0, t) holds. Similarly, an ideally open end can be seen to correspond to Z2 = 0 (i.e., S2 = ∞), and thus to ρ = 1: this is a second case where complete reflection occurs, namely the relation p1⁻(0, t) = −p1⁺(0, t) holds. These reflection conditions are identical to those derived
M-3.9
Implement the waveguide structure of Fig. 3.13. Add a loss filter (3.50) to each WG section.
The result expressed in Eq. (3.63) can be readily extended to higher dimensions. Consider a parallel
junction of N acoustic bores. In this case a scattering matrix can be found, and Eq. (3.63) is generalized
to
p− = A · p+ , (3.64)
where p± are N-dimensional vectors whose elements are the incoming and outgoing pressure waves in the N bores. The physical constraints expressed in Eq. (3.59) are also generalized as
p1 = p2 = . . . = pN = pJ ,
(3.65)
u1 + u2 + . . . + uN = 0.
As an example, a 3-dimensional junction can be used to model an acoustic hole in a wind instrument: in this case, two waveguide sections represent the two sides of the acoustic bore with respect to the
hole, and the third one represents the hole itself. Note also that when N = 2 Eq. (3.64) reduces to the
Kelly-Lochbaum equations.
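A sketch of the N-branch parallel junction implied by Eqs. (3.64)-(3.65): the junction pressure follows from the flow constraint exactly as in Eq. (3.61), and the outgoing waves are pl⁻ = pJ − pl⁺. The function name and interface are assumptions.

% Sketch: parallel junction of N waveguide branches.
function pm = parallel_junction(pp, Gamma)
% pp:    vector of incoming pressure waves p_l^+
% Gamma: vector of wave admittances of the N branches
pJ = 2*sum(Gamma.*pp)/sum(Gamma);   % junction pressure (N-branch version of Eq. (3.61))
pm = pJ - pp;                       % outgoing pressure waves p_l^-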
A second relevant extension of the Kelly-Lochbaum junction is the loaded junction, in which an
external signal is injected into the system. A simple example is that of a string that is excited (e.g.
hammered) at a given point. For continuity, the velocity of the string in this contact point will be the
same at both sides. Moreover, during the contact this velocity will be equal to the velocity of the hammer.
Finally, the sum of the forces at the contact point equals the hammer force. The following equations of
continuity are then derived:
v1 = v2 = vJ , f1 + f2 + fJ = 0. (3.67)
With the Kirchhoff analogies this is a series junction with an external load (the “currents” at the junction
are the same, and the potentials at the junction sum to the driving potential). Then
M-3.10
Implement the waveguide structure of Fig. 3.14. Add a loss filter (3.50) to each WG section.
Figure 3.14: Example of a loaded junction: a waveguide structure for a string excited by an external
force signal fJ [n] (e.g. a hammer).
Figure 3.15: Boundary regions for (a) non-convex and (b) convex conical junctions.
by assuming that the transition volume is small and thus pressure is constant inside the volume. Under
this assumption, continuity conditions analogous to (3.59) are imposed and the reflection coefficient ρ is
generalized to a first order filter R(s).
However, a second and more serious problem arises when one looks at the nature of R(s). This
filter turns out to be unstable (non-causal growing exponential) in the case of the convex configuration
depicted in Fig. 3.15(b). While this circumstance is physically consistent (in the continuous-time domain
the scattered waves can grow exponentially only for a limited time because they are cancelled out by
subsequent multiple reflections), in a numerical simulation the system can turn out to be unstable, due to the approximations introduced by the discretization process and to round-off errors introduced by finite-precision arithmetic.
or lip reeds, can be conveniently described using the lumped modeling paradigm. Although these sys-
tems are quite complicated, due to their limited spatial extensions they can be modeled using lumped
elements, and it is widely accepted that such a simplified description captures the basic behavior of
pressure controlled valves. Similar remarks hold for hammers and mallets: during collision, they are de-
formed and subject to internal losses and non-linear restoring forces. However, interactions with strings
and bars have been modeled and efficiently implemented in sound synthesis algorithms by assuming the
hammer/mallet to be a lumped mass.
Γ(s) = B(s)/A(s) = Σ_{k=0}^{M} bk s^{M−k} / Σ_{k=0}^{N} ak s^{N−k}   ⇒   Γ(s) = Σ_{k=1}^{N} Kk/(s − pk), (3.70)
where the pk ’s are the poles of the system. By taking the inverse Laplace transform of this latter equation,
one can see that the impulse response γ(t) is a combination of complex exponentials. This impulse
response is then sampled to obtain its digital counterpart:
γ(t) = Σ_{k=1}^{N} Kk e^{pk t}   ⇒   γd[n] ≜ γ(nTs) = Σ_{k=1}^{N} Kk (e^{pk Ts})^n. (3.71)
Γd(z) = Σ_{k=1}^{N} Kk/(1 − pd,k z⁻¹) = Bd(z)/Ad(z),   with pd,k = e^{pk Ts}. (3.72)
k=1
This equation shows that the transfer function Γd(z) of the discretized system is still rational, with N poles
pd,k uniquely determined by the continuous-time poles pk .
One quality of the method is that stability is guaranteed at any sampling rate: if the continuous-time system is stable, i.e. Re(pk) < 0 for all k, then Eq. (3.72) shows that |pd,k| < 1 for all k, i.e. the discrete-time system is also stable. On the other hand, a drawback of the method is aliasing. Since γd[n] has been
obtained by sampling γ(t), then the discrete-time response Γd is a periodization of Γ:
Γd(e^{jω}) = Σ_{k=−∞}^{+∞} Γ(jω/Ts + j 2kπ/Ts). (3.73)
As a consequence, aliasing can occur in Γd if the bandwidth of Γ exceeds the Nyquist frequency.
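A minimal sketch of the impulse-invariance procedure of Eqs. (3.70)-(3.72): given the continuous-time poles pk and the residues Kk of Γ(s), the discrete-time poles are exp(pk Ts), and residuez (from the Signal Processing Toolbox) collects the resulting parallel one-pole sections into the rational form Bd(z)/Ad(z). The function name is an assumption.

% Sketch: impulse-invariant discretization of a rational admittance Gamma(s).
function [Bd, Ad] = impinv_discretize(K, p, Fs)
Ts = 1/Fs;
pd = exp(p*Ts);                         % discrete-time poles, Eq. (3.72)
[Bd, Ad] = residuez(K(:), pd(:), []);   % sum_k Kk/(1 - pd_k z^-1)  ->  Bd(z)/Ad(z)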
The mapping g1 (z) is known in numerical analysis as the backward Euler method. The adjective “back-
ward” is used because the first derivative of x at time n is estimated through the values of x at time n
and n − 1. Higher-order derivatives can be estimated through iterated application of Eq. (3.74). As an
example, the second derivative is computed as
d²x/dt²(nTs) ≈ 1/Ts [ (x[n] − x[n−1])/Ts − (x[n−1] − x[n−2])/Ts ] = (x[n] − 2x[n−1] + x[n−2])/Ts². (3.75)
Alternatively, a centered estimate is also often used in combination with the backward Euler method. In this case the second derivative is computed as

d²x/dt²(nTs) ≈ (x[n+1] − 2x[n] + x[n−1])/Ts². (3.76)
A second, widely used s-to-z mapping is provided by the bilinear transform. Like the backward Euler method, it can be seen as a finite-difference approximation of the time derivative, but in this case the incremental ratio is assumed to approximate the value of ẋ(t) averaged over the time instants nT_s and (n − 1)T_s, leading to Eq. (3.77).
Figure 3.16: Mapping of the vertical axis s = jω (solid circle lines) and of the left-half s-plane (shaded regions) using the backward Euler method g1 and the bilinear transform g2.
equation of the form x[n] = fd (x[n], x[n − 1], n), in which x[n] depends implicitly on itself through
the function fd . This is a source of problems for the resulting discrete-time system, since the difference
equation is not explicitly computable due to the instantaneous dependence of a variable on itself. Below we briefly discuss this computability problem in the case of linear systems. Note that one advantage of
the centered estimate (3.76) is that when it is applied in conjunction with the Euler method to a second-
order ODE it leads to an explicit difference equation.
A comparison between the first estimate in Eq. (3.77) and that in Eq. (3.74) gives the intuition that the bilinear transform provides a more accurate approximation than the Euler method. A rigorous analysis
would show that the order of accuracy of the bilinear transform is two, while that of the backward Euler
method is one.
Another way of comparing the two techniques consists in studying how the frequency axis s = jω and the left-half plane Re(s) < 0 are mapped by g1,2 into the discrete domain. This provides information on both the stability and the accuracy properties of g1,2. As shown in Fig. 3.16, both methods map the axis s = jω one-to-one onto a circle in the z-plane, therefore no frequency aliasing is introduced. Second, both methods preserve stability, since the left-half s-plane is mapped inside the unit circle by both g1 and g2. However we also see that both mappings introduce frequency warping, i.e. the frequency axis is
distorted. The bilinear transform g2 maps the axis s = jω exactly onto the unit circle z = ejωd , and the
mapping between the continuous frequency ω and the digital frequency ωd can be written analytically:
jω = (2/T_s) (1 − e^{−jω_d}) / (1 + e^{−jω_d}) = (2j/T_s) tan(ω_d/2),  ⇒  ω_d = 2 arctan(ωT_s/2).   (3.78)
At low frequencies ωd increases almost linearly with ω, while higher frequencies are progressively com-
pressed (warped) and ωd → ±π as ω → ±∞. This warping phenomenon is the main drawback of the
bilinear transform.
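A quick numerical check of the warping relation (3.78) can be obtained as follows (a minimal sketch; the sampling rate and the test frequencies are arbitrary choices):
Fs=44100; Ts=1/Fs;                        % sampling rate and period (arbitrary)
f=[1 2 5 10 15 20]*1e3;                   % continuous-time frequencies (Hz)
fd=2*atan(2*pi*f*Ts/2)/(2*pi*Ts);         % digital frequencies mapped by g2, Eq. (3.78), expressed in Hz
disp([f; fd]);                            % higher frequencies are progressively compressed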
Figure 3.17: A linear discrete-time system; (a) delay-free path, (b) equivalent realization with no delay-free paths.
For the Euler method no analytic mapping can be found from ω to ωd. The function g1 "doubly" warps the frequency axis: there is a progressive warping in the direction of increasing frequency (similarly to the bilinear transform), and there is also warping normal to the frequency axis. Figure 3.16 also shows
that the poles of the discrete-time system obtained with g1 are more “squeezed” inside the unit circle than
those obtained with g2 . Furthermore, it can happen that continuous-time poles with positive real-part are
turned by g1 into discrete-time poles with modulus less than unity: in other words g1 can turn unstable
continuous systems into stable discrete systems. This numerical damping is a second major drawback of
the Euler method.
One more relevant aspect to discuss is the computability of the discrete-time systems obtained when discretizing a system of ODEs with either g1 or g2 (or other mappings). As already stated, since these are implicit methods, the resulting difference equations are implicit. In order to clarify this point, let us
consider the simple example depicted in Fig. 3.17(a). This system can be written as
w[n] = w̃[n] + y[n], with w̃ = u2 ,
x[n] = x̃[n] + ay[n], with x̃ = u1 + au2 , (3.79)
y[n] = bx[n], ⇒ y[n] = b[u1 [n] + au2 [n] + ay[n]],
where we have defined tilded variables w̃ and x̃ that only depend on the external inputs u1,2, and are therefore known at each time n.
The signals y and x are connected through a delay-free loop and the resulting set of difference
equations is implicit: in particular the last of Eqs. (3.79) shows that y[n] depends implicitly on itself. It
is easy, however, to rearrange the computation in order to solve this problem: the last of Eqs. (3.79) can
be inverted, yielding
y[n] = b/(1 − ab) · [u1[n] + a u2[n]].   (3.80)
This new equation relates y to the computable vector x̃. Therefore, an equivalent realization of the system
is obtained as shown in Fig. 3.17(b). The key point in this example is that the discrete-time system is
linear, which allows explicit inversion of the last equation in (3.79).
This simple example is an instance of the so-called delay-free loop problem. In the linear case the digital signal processing literature provides techniques for restoring computability by rearrangement of the structure.
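A minimal numerical sketch of the rearrangement of Fig. 3.17 and Eq. (3.80) follows (the coefficient values and input signals are arbitrary, chosen only so that ab ≠ 1):
a=0.5; b=0.8;                       % arbitrary coefficients, with a*b ~= 1
u1=randn(1,16); u2=randn(1,16);     % arbitrary input signals
y=b/(1-a*b)*(u1+a*u2);              % explicit realization of Fig. 3.17(b), Eq. (3.80)
x=u1+a*(u2+y);                      % recover x from w = u2 + y, as in Fig. 3.17(a)
max(abs(y-b*x))                     % verifies y[n] = b x[n] up to round-off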
F(s) = Z(s)V(s),  ⇒  F^−(s) = R(s) F^+(s),  with R(s) ≜ (Z(s) − Z_0) / (Z(s) + Z_0).   (3.82)
The circuit element can then be visualized as a black box with a port consisting of two terminals, with
a port voltage applied across them, and an associated flowing current, as in Fig. 3.18(a). A linear sys-
tem can then be modeled through series and parallel connections of one-port elements: as an example,
Fig. 3.18(b) visualizes a series connection of two ports representing the mechanical system
The second step in WDF design is the discretization of R(s). The equivalent wave digital filter is
obtained using the bilinear transform as R(g2 (z)). Note that since the reference impedance Z0 can be
given any value, this provides an additional degree of freedom in the design. In particular, Z0 can be
chosen such that the WDF has no delay-free paths from input to output, therefore guaranteeing com-
putability when connecting more than one element. As an example, consider the three elementary me-
chanical impedances Zmass (s) = ms, Zspring (s) = k/s, Zloss (s) = r. For the mass, the reflectance is
Rmass (s) = (ms − Z0 )/(ms + Z0 ), therefore the equivalent WDF is
Therefore choosing Z0 = 2Fs m leads to the interesting result that no delay-free path is present in the
corresponding WDF. Similarly, one can prove that Rspring (z) = z −1 with Z0 = k/2Fs , and Rloss = 0
with Z0 = r.
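The adapted reflectances above can be checked directly by applying the bilinear substitution s = g2(z) = 2Fs(1 − z^−1)/(1 + z^−1); a minimal sketch, assuming the Symbolic Math Toolbox is available:
syms z Fs m k r Z0                                        % symbolic variables
s = 2*Fs*(1-1/z)/(1+1/z);                                 % bilinear transform g2(z)
Rmass  = simplify(subs((m*s-Z0)/(m*s+Z0), Z0, 2*Fs*m))    % evaluates to -1/z, i.e. -z^{-1}
Rspring= simplify(subs((k/s-Z0)/(k/s+Z0), Z0, k/(2*Fs)))  % evaluates to  1/z, i.e.  z^{-1}
Rloss  = simplify(subs((r-Z0)/(r+Z0),     Z0, r))         % evaluates to  0
In all three cases the resulting reflectance has no instantaneous (z^0) term, which is the adaptation property discussed above.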
This brief section has shown that WDFs can be used to digitize lumped element networks using
wave variables and adapted impedances in such a way that delay-free computational loops are avoided in
the resulting numerical structure. We have shown a single example of a series connection between two
elements. The concept of connection is generalized in WDF theory with the concept of adaptors, which
are N -port elements that model interconnection between arbitrary numbers of elements.
Figure 3.18: (a) one-port element with port variables f(t), v(t) and reflectance R(s); (b) series connection of two one-port elements, with reflectances R1(s) = (ms − Z_{0,1})/(ms + Z_{0,1}) and R2(s) = (k/s − Z_{0,1})/(k/s + Z_{0,1}).
In Sec. 3.2.1 we have studied the simple example of two coupled mechanical oscillators, and we have
seen that the resulting system can be viewed as the combination of two uncoupled oscillators, whose
frequencies depend on those of the original ones. This approach can be extended to a generic network of
N linear undamped oscillators:
M ÿ(t) + Ky(t) = f ext (t). (3.85)
In this equation y is a vector containing the displacements of the N points of the network, while M is
the mass matrix: typically (but not necessarily) it is diagonal and contains the masses m_l (l = 1, . . . , N) of
each point of the network. K is the stiffness matrix and is in general not diagonal because the points are
coupled through springs.
Now we consider the homogeneous equation (f ext ≡ 0) and look for a factorized solution of the
form y(t) = s · sin(ωt + ϕ). By substituting this into Eq. (3.85), one finds
Ks = ω² M s.   (3.86)
This is a generalized eigenvalue problem for the matrix K: more precisely, ω 2 is an eigenvalue of
M −1 K and s is the associated eigenvector. In general one will find N distinct eigenvalues and eigen-
vectors ωi2 and si (for simplicity we consider normalized si ’s). The key property of these eigenvectors
is that they are orthogonal with respect to the mass and the stiffness matrix:
where mi and ki = ωi2 mi are real positive scalars. The orthogonality condition also implies that the
modal shapes si are linearly independent. The si ’s can be used to define a modal transformation, i.e. a
change of spatial coordinates that transforms system (3.85) into a set of N uncoupled oscillators:
By virtue of the orthogonality property, the matrices Mq and Kq are diagonal and contain the elements mi and ki on their diagonals, respectively. Therefore this is a system of uncoupled oscillators with frequencies ωi; the quantities mi and ki represent the masses and the stiffnesses of these modes.
The matrix S T of the modal shapes defines how a driving force f ext acts on the modes: as a
particular case, consider a scalar force acting only on the lth point of the network, i.e. f ext (t) =
[0, . . . , fext (t), . . . , 0]T (where the only non-null element is in the lth index). This force is applied to
the generic ith mode, scaled by the factor si,l , i.e. the shape of the ith mode at the lth point of the net-
work. If this factor is 0, i.e. if the ith mode has a node at the lth point of the network, then no force is
transmitted to the mode.
The oscillation y_l(t) of the system at the lth spatial point will be the sum of the modal oscillations weighted by the modal shapes, according to Eq. (3.88): y_l(t) = ∑_{i=1}^{N} s_{i,l} q_i(t). Again, if the ith mode
has a node at the lth point of the network, that mode will not be “heard” in this point. In conclusion the
motion of the network is determined by the motion of N second-order mechanical oscillators and by the
transformation matrix S.
This formalism can be extended to systems that include damping, i.e. where we add a term Rẏ in
Eq. (3.85).
where the term s′_n(x) s_n(x)|_0^L is identically zero for fixed (or even free) boundary conditions. Therefore the equation for the nth mode is that of a second-order oscillator with mass m_n = µ ∫_0^L s_n²(x) dx and stiffness k_n = T ∫_0^L [s′_n(x)]² dx. For the ideal string the modal shapes are simply s_n(x) = sin(nπx/L), therefore m_n = µL/2 and k_n = T n²π²/(2L).
The shape also defines how a driving force acts on the mode. As a particular case, consider a force
density that is ideally concentrated in a single point xin of the string, i.e. fext (x, t) = δD (x − xin )u(t)
(where the function δD (·) is the Dirac delta): then the force acting on the nth mode is sn (xin )u(t), and
if xin is a node of the mode then no force is transmitted to it. We already know that the oscillation
y(x_out, t) of the system at the spatial point x_out will be the sum of the modal oscillations weighted by the modal shapes: y(x_out, t) = ∑_{n=1}^{+∞} s_n(x_out) q_n(t). Again, if the nth mode has a node at the point x_out,
that mode will not be “heard” in this point.
This analysis can be extended to include dispersion and dissipation. As an example, we know that
for a dissipative string we have to add the terms d1 ∂y/∂t − d2 ∂/∂t(∂ 2 y/∂x2 ) on the left-hand side of
Eq. (3.90). Again, by substituting the mode y_n(x, t) in the equation, and then multiplying by s_n(x) and integrating over the string length, one finds that the term [ d1 ∫_0^L s_n²(x) dx + d2 ∫_0^L [s′_n(x)]² dx ] q̇_n(t) has to be added to Eq. (3.92), which represents a viscous damping term for the second-order oscillator.
M-3.11
Compute modal parameters for a string with linear dissipation.
M-3.11 Solution
function [omega,alpha,m,s]=modal_string(L,T,mu,d1,d2,N,M);
xstep=L/(M-1); xpoints=0:xstep:L;
Parameters are functions of the string tension T , linear density µ, loss factors d1,2 . One can choose
the number N of modes to compute, and the number M of spatial points for the shape computation.
Two remarks. First, we are using the ideal spatial shapes: this is not correct for the dissipative string,
but is acceptable for small d1,2 values. Second, we are computing the integrals numerically, although
for the ideal string shapes these have analytical solutions: in Sec. 3.5.3 we will examine less trivial
shapes.
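A minimal sketch of how the function above could be completed, under the assumptions stated in the remarks (ideal modal shapes and numerical integration); as an additional assumption, the decay rates are taken as α_n = (d1 ∫ s_n² + d2 ∫ [s′_n]²)/(2 m_n), consistent with the damping term derived above and with the poles e^{(−α ± jω_r)T_s} of Eq. (3.94):
function [omega,alpha,m,s]=modal_string(L,T,mu,d1,d2,N,M)
xstep=L/(M-1); xpoints=0:xstep:L;        % M spatial points along the string
n=(1:N)';                                % mode indices
s=sin(n*pi*xpoints/L);                   % N x M matrix of ideal modal shapes
ds=(n*pi/L).*cos(n*pi*xpoints/L);        % spatial derivatives s_n'(x)
m=mu*trapz(xpoints,s.^2,2);              % modal masses  m_n = mu*int(s_n^2)
k=T*trapz(xpoints,ds.^2,2);              % modal stiffnesses k_n = T*int(s_n'^2)
r=d1*trapz(xpoints,s.^2,2)+d2*trapz(xpoints,ds.^2,2);   % modal loss coefficients
omega=sqrt(k./m);                        % undamped modal frequencies (rad/s)
alpha=r./(2*m);                          % modal decay rates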
In conclusion the modal representation of continuous systems described by PDEs is in strict analogy
with that of discrete systems described as networks of masses and springs. Here we have obtained similar
equations, where the discrete spatial index l = 1 . . . N indicating the points of the network (3.85) has
become a continuous spatial variable x, sums over l have become integrals over x, and a numerable
infinity of modes has been found instead of a finite set of N modes. These strict analogies reflect the fact that continuous systems can be seen as the limit of discrete systems when the number of masses becomes infinite. As an example, a string can be approximated with the discrete network of Fig. 3.19(a), made of N masses and N + 1 springs. Figure 3.19(b) shows that for a given N the system has N modes, whose shapes closely resemble those of the first N modes of the continuous string. Moreover the approximation grows closer and closer as N increases. One could also show that the modal frequencies of the discrete system underestimate those of the continuous system, due to the spatial discretization.
Figure 3.19: Analogies between continuous and discrete systems: (a) approximation of an ideal string with a mass-spring network; (b) modes of the discrete system for different numbers N of masses.
The frequency ω0 = √(k/m) and the loss factor α = r/(2m) depend on the geometry and the material of the object. The force fmode that is "felt" by a single mode depends on the modal shape and on the spatial force distribution, and is scaled by the modal mass m. The displacement y(x, t) at a certain point x of the structure is a linear combination of the modes q(t), where the coefficients of the linear combination are the modal shapes s(x) at the point x. This is true whether we have a discrete set of points x_i or a continuous domain, although in practice the spatial domain will always be discretized.
In order to construct a modal synthesizer, the first step to perform is to construct a discrete-time
equivalent of the second order oscillator (3.93). We can discretize the differential equation with the
numerical methods examined previously in Sec. 3.5.1. The impulse invariant method yields:
H(z) = [ (T_s e^{−αT_s} / (m ω_r)) sin(ω_r T_s) z^{−1} ] / [ 1 − 2 e^{−αT_s} cos(ω_r T_s) z^{−1} + e^{−2αT_s} z^{−2} ],  with ω_r = √(ω_0² − α²).   (3.94)
The presence of a z^{−1} factor in the numerator indicates that this is an explicit numerical scheme (there is no instantaneous dependence on the input). The backward Euler method yields
H(z) = [ 1/(m(F_s² + 2αF_s + ω_0²)) ] / [ 1 − (2F_s(α + F_s)/(F_s² + 2αF_s + ω_0²)) z^{−1} + (F_s²/(F_s² + 2αF_s + ω_0²)) z^{−2} ].   (3.95)
Figure 3.20: Amplitude responses of a second order oscillator with constant mass and quality factor, and ω0 = 2, 4, 8 kHz: continuous-time responses (solid lines) and discrete-time responses (dashed lines) with (a) impulse invariant method, (b) backward Euler method, (c) backward Euler method with centered scheme, (d) bilinear transform.
This instead is an implicit numerical scheme. The backward Euler method with centered scheme yields
H(z) = (T_s²/m) z^{−1} / [ 1 + (ω_0² T_s² + 2αT_s − 2) z^{−1} + (1 − 2αT_s) z^{−2} ].   (3.96)
As in the impulse invariant case, this is an explicit numerical scheme. By looking at the poles of this discrete-time system one can see that it can become unstable depending on the mechanical parameters and on the sampling period: the scheme is not unconditionally stable. Finally, the bilinear transform
yields
H(z) = [ 1/(m(4F_s² + 4αF_s + ω_0²)) ] (1 + 2z^{−1} + z^{−2}) / [ 1 + (2(ω_0² − 4F_s²)/(4F_s² + 4αF_s + ω_0²)) z^{−1} + ((4F_s² − 4αF_s + ω_0²)/(4F_s² + 4αF_s + ω_0²)) z^{−2} ].   (3.97)
As in the case of the backward Euler method, this is an implicit numerical scheme.
The resulting amplitude responses are shown in Fig. 3.20. As expected, the impulse invariant method exhibits aliasing, the Euler method exhibits warping and numerical damping, the Euler method with centered scheme tends to become unstable for high ω0 values, and the bilinear transform exhibits warping (but not numerical damping).
M-3.12
Write a function that computes the filter coefficients of the mechanical oscillator discretized with (a) the impulse
invariant method, (b) the Euler method g1 (z), (c) Euler method with the centered estimate (3.76), and (d) the
bilinear transform. Compare the frequency responses of the resulting discrete-time systems.
M-3.12 Solution
function [B,A]=modal_oscillator(m,alpha,omega,method)
global Fs; Ts=1/Fs;               % sampling rate (Hz) and sampling period (s)
switch method
case 'impinv'
  omegar=sqrt(omega^2-alpha^2); eaTs=exp(-alpha*Ts);
  B= [0 Ts*eaTs/(m*omegar)*sin(omegar*Ts) 0];
  A= [1 -2*eaTs*cos(omegar*Ts) eaTs^2];
case 'euler'
  delta= Fs^2+2*alpha*Fs+omega^2;
  B= [1/(m*delta) 0 0];
  A= [1 -2*Fs*(alpha+Fs)/delta Fs^2/delta];
case 'eulercenter'
  B= [0 Ts^2/m 0];
  A= [1 (omega^2*Ts^2 +2*alpha*Ts -2) (1-2*alpha*Ts)];
case 'bilin'
  delta=4*Fs^2 +4*alpha*Fs +omega^2;
  B= [1/(m*delta) 2/(m*delta) 1/(m*delta)];
  A= [1 2*(omega^2 -4*Fs^2)/delta (4*Fs^2 -4*alpha*Fs +omega^2)/delta];
otherwise
  error('unknown numerical method');
end
M-3.13
Write a function that computes the output of a modal resonator given an input force signal.
M-3.13 Solution
function y = modal_synth(x,omega,alpha,m,s,in,out,method);
global Fs;
N=length(omega);% it must be size(omega)=size(alpha)=size(m)=N
% it must be size(s,1)=N; 0<in<size(s,2); 0<out<size(s,2);
y=zeros(1,length(x));
for i= 1:N
[B,A]=modal_oscillator(m(i),alpha(i),omega(i),method);
y_i=filter(B,A,s(i,in)*x);
y=y +s(i,out)*y_i;
end
We have assumed that the force distribution is concentrated in a single point, represented by the index in. We "pick up" the resonator signal at another point, represented by the index out (as if we were using a contact microphone attached to the object at the point out).
The input modal parameters can be chosen to match those of an arbitrary object. Moreover, morphing between different shapes and materials can be obtained by designing appropriate trajectories for these parameters.
M-3.14
Synthesize the sound of a dissipative string using the modal approach.
M-3.14 Solution
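A minimal sketch of a possible solution, assuming the modal_string function sketched in Example M-3.11 and the modal_synth function of Example M-3.13; all parameter values are illustrative only:
global Fs; Fs=44100;
L=0.65; T=60; mu=1.1e-3; d1=0.5; d2=1e-5;     % illustrative string parameters
N=40; M=200;                                  % number of modes and of spatial points
[omega,alpha,m,s]=modal_string(L,T,mu,d1,d2,N,M);
x=zeros(1,2*Fs); x(1)=1;                      % idealized impulsive excitation ("pluck/strike")
in=round(0.2*M); out=round(0.8*M);            % excitation and pick-up points
y=modal_synth(x,omega,alpha,m,s,in,out,'impinv');
soundsc(y,Fs);                                % listen to the result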
∂²y/∂t² (x, t) = −(EK²/ρ) ∂⁴y/∂x⁴ (x, t),   (3.98)
where E is the Young modulus of the material, K is the radius of gyration,7 and ρ is the volume density.
Note that the fourth-order term is the one that we used to describe a stiff (and dispersive) string. The
modal solutions y(x, t) = s(x)q(t) are in this case
y(x, t) = [A cosh(kx) + B sinh(kx) + C cos(kx) + D sin(kx)] cos(ωt + ϕ),  with k = ω/c,   (3.99)
7
This would need some explanation. In short: K² = (1/S) ∫ z² dS, where S = ∫ dS is the total cross-section of the bar and z is the distance from the neutral axis, i.e. the axis along the bar which does not change its length when the bar is bent (at one side of the neutral axis there is elongation, at the other side there is compression).
Figure 3.21: Modal description of the ideal bar: (a) ideal bar with free-free, clamped-free, and hinged-hinged boundary conditions and (b) corresponding modes.
and where c² = ωK√(E/ρ). This modal solution cannot be interpreted in terms of traveling waves, therefore waveguide methods fall short here, while modal synthesis can be successfully employed.
The constants A, B, C, D as well as the allowed frequencies are determined depending on four
boundary conditions (two at each end). The conditions for a free end are ∂ 2 y/∂x2 = ∂ 3 y/∂x3 = 0
(no torque and no shearing force); those for a supported (hinged) end are y = ∂ 2 y/∂x2 = 0 (no dis-
placement and no torque); and those for a clamped end are y = ∂y/∂x = 0 (no displacement and zero
slope). Three notable examples are shown in Fig. 3.21(a). For these cases, numerical solution of the
equations resulting from boundary conditions yields
(free-free)      {ω_n} = (π²K/(4L²)) √(E/ρ) [3.011², 5², 7², . . . , (2n + 1)², . . .],
(clamped-free)   {ω_n} = (π²K/(4L²)) √(E/ρ) [1.194², 2.988², 5², . . . , (2n − 1)², . . .],   (3.100)
(hinged-hinged)  {ω_n} = (π²K/L²) √(E/ρ) n².
Note that in the first two cases the frequencies are strongly inharmonic, while in the third case they are harmonically related: the corresponding lowest modes are shown in Fig. 3.21(b). Mallet percussion instruments most typically use bars with (approximately) free-free conditions. However in many cases their bars do not have constant cross-sections; instead they are cut with an arch on the underside, in such a way that the theoretical partials of the free-free series in Eq. (3.100) are shifted and aligned to an almost harmonic series.
M-3.15
Compute modal parameters for a bar with the three boundary conditions examined here, and with linear dissipa-
tion.
M-3.15 Solution
Like Example M-3.11, but using the modal shapes of the ideal bar.
Figure 3.22: Modal description of ideal membranes: (a) ideal rectangular membrane with fixed ends and (b) corresponding modes; (c) ideal circular membrane with fixed ends and (d) corresponding modes.
where z is the membrane vertical displacement and the constants T and σ are the membrane surface tension (in N/m) and surface density (in kg/m²). The symbol ∇² = ∂²/∂x² + ∂²/∂y² stands here for the 2-dimensional Laplacian operator. Modal solutions z(x, y, t) = s^(x)(x) s^(y)(y) q(t) are found with the same procedure used for the ideal string:
z_{n,m}(x, y, t) = √(2/L_x) √(2/L_y) sin(k_n^(x) x) sin(k_m^(y) y) cos(ω_{n,m} t + ϕ_{n,m}),
with k_n^(x) = nπ/L_x,  k_m^(y) = mπ/L_y,  ω_{n,m} = c √( [k_n^(x)]² + [k_m^(y)]² ).   (3.102)
Note that the modal frequencies ω_{n,m} are not harmonically related in this case. The modal shapes s_{n,m}(x, y) = s_n^(x)(x) s_m^(y)(y) have straight nodal lines: the lowest modes are shown in Fig. 3.22(b).
A second example, even more relevant for musical applications, is the circular membrane with fixed ends, like the one in Fig. 3.22(c). In this case the 2-D D'Alembert equation is more conveniently written in circular coordinates x = r sin θ and y = r cos θ, and the Laplacian becomes ∇² = ∂²/∂r² + (1/r)(∂/∂r) + (1/r²)(∂²/∂θ²). Accordingly, one looks for modal solutions of the form z(r, θ, t) = s^(r)(r) s^(θ)(θ) q(t).
Substituting this into the 2-D D'Alembert equation results in two differential equations for s^(r) and s^(θ). One finds the angular shapes s_m^(θ)(θ) = cos(mθ). Then for each m, the radial shapes are s_n^(r)(r) = J_m(k_{m,n}^(r) r), i.e. they are first-kind Bessel functions of order m, with radial frequencies k_{m,n}^(r). The allowed values for k_{m,n}^(r) are found as usual by imposing that s_n^(r) = 0 at the fixed boundary, and are therefore determined by the nth zero of J_m. In conclusion the (m, n) mode has m nodal diameters (determined by the function s_m^(θ)) and n nodal circles (determined by the function s_n^(r)). The lowest modal frequencies ω_{n,m} are
{ω_{n,m}} = (2.405 c / a) [1, 1.594, 2.136, 2.296, 2.653, 2.918, 3.156, 3.501, 3.6, 3.652, 4.06, 4.154],   (3.103)
and are highly inharmonic. The corresponding modes are shown in Fig. 3.22(d).
M-3.16
Compute modal parameters for a rectangular and a circular membrane with fixed boundary conditions, and with linear dissipation.
M-3.16 Solution
Like Example M-3.11, but using the modal shapes of the ideal rectangular and circular membrane.
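A minimal sketch for the rectangular case, computing frequencies and shapes directly from Eq. (3.102) and assuming c = √(T/σ); the function name and output layout are arbitrary choices, and losses can be treated as in Example M-3.11:
function [omega,m,s]=modal_membrane_rect(Lx,Ly,T,sigma,N,M)
c=sqrt(T/sigma);                                   % transverse wave velocity
[x,y]=meshgrid(linspace(0,Lx,M),linspace(0,Ly,M)); % M x M spatial grid
omega=zeros(N,N); s=zeros(M,M,N,N);
for n=1:N
  for mm=1:N
    omega(n,mm)=c*sqrt((n*pi/Lx)^2+(mm*pi/Ly)^2);            % Eq. (3.102)
    s(:,:,n,mm)=sqrt(2/Lx)*sqrt(2/Ly)*sin(n*pi*x/Lx).*sin(mm*pi*y/Ly);
  end
end
m=sigma*ones(N,N);   % with these normalized shapes all modal masses equal sigma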
where the oscillations actually take place (an acoustic bore, a string, a bar, etc) and is therefore related
to such sound attributes as pitch, spectral envelope, and so on. The exciter controls the way energy is
injected into the system, thus initiating and possibly sustaining the oscillations, and relates in partic-
ular to properties of attack transients. A simple yet striking demonstration of the effectiveness of the
exciter/resonator schematization is provided by mounting a clarinet mouthpiece on a flute.8 The bore
boundary conditions are changed from open-open to closed-open, so that it plays one octave lower, and the resulting instrument is perceived as a bad-sounding clarinet. In other words, the excitation mechanism defines sound identity (“it’s a clarinet”), while the resonator is mostly associated with sound quality (“it’s a bad clarinet”).
The interaction between the two blocks is a two-way interaction, where the state of each block influ-
ences the other. As an example, the impact force between a hammer and a string depends on the displace-
ments and velocities of both hammer and string, and affects both. Clearly there are also examples where
non-linearities in the excitation are negligible: plucked string instruments can be conveniently treated
as linear systems (strings and instrument body), where the “pluck” is described as a non-equilibrium
initial condition (i.e., the pluck gives a string a non-zero displacement distribution and a null velocity
distribution).
Finally, note that non-linearities are not necessarily related to excitation mechanisms only: even resonators, which are assumed to be linear in a first approximation, can exhibit non-linear behaviors. As an example, when a string vibrates outside the limit of small oscillations its length can no longer be assumed to be constant, but varies (together with the string tension) during an oscillation cycle: this length- and tension-modulation mechanism can produce perceivable pitch glides in the sound. Similar considerations apply to other systems (e.g. non-linear circuit elements).
Consider the well known Chua-Felderhoff electrical circuit: this is an RLC circuit, made of a series connection of a resistor R, an inductor L and a capacitor C. The elements R and L are constant, while this is not the case for C. More precisely, the characteristic of the capacitance is a function of the voltage v,
8
The author has enjoyed a live demonstration with such a “flarinet”, performed by Joe Wolfe while giving a seminar in
Venice, 2000.
Figure 3.24: Non-linear behavior of (a) capacitance C(v) and (b) charge q(v) in the Chua-Felderhoff circuit.
The variable q(t) stands for the charge on the capacitor, and ve (t) is an applied voltage. Note that
C(v) ∼ C0 when v → 0, i.e. the system is a linear RLC circuit in the limit of small oscillations.
However, for larger voltage v this approximation does not hold, and C(v), q(v) behave as depicted in
Fig. 3.24(a) and (b), respectively. There is no easy way to translate the non-linear relation (3.104) into
the Laplace domain, because the definition of impedance given in Sec. 3.2.1 assumes linearity of the
circuit elements.
The Chua-Felderhoff circuit has been extensively studied and is one of the classical systems used to exemplify the transition to chaotic behavior: when the peak amplitude of the applied voltage is increased, the behavior of the charge q(t) on the capacitor undergoes successive bifurcations.
Figure 3.25: The non-linear impact model (3.106): (a) phase portrait of a point mass hitting a hard surface; (b) the corresponding non-linear force during impact.
In a less ideal impact model one would assume that the impact force is non-null over a finite duration of time (the contact time between the colliding objects) and takes finite values. The force magnitude is related to the impact energy (e.g. the impact velocity of the hammer hitting the resonator), while the contact time is related to the hardness of the impact. A simple signal model of the impact force is the following:
f(t) = { (f_max/2) [1 − cos(2πt/τ)],   0 ≤ t ≤ τ,
       { 0,                            otherwise,      (3.105)
where τ is the contact time and fmax is the maximum force value.
More complex models must take into account other effects. There is dissipation of energy during
contact. The contact force itself is a function of the relative compression x(t) between the two contacting
objects (which may be thought as the difference between the displacements of the two objects during the
contact), and also of the compression velocity v(t) = ẋ(t). Accordingly, a more physically-based model
of the impact force is the following:
f(x(t), v(t)) = { k x(t)^α + λ x(t)^α v(t),   x > 0,
               { 0,                           otherwise,      (3.106)
where k is the force stiffness, λ is the force damping weight, and the exponent α depends on the local geometry around the contact area. As an example, according to the Hertz theory of contact an ideal impact between two spherical objects obeys this equation with α = 3/2 and λ = 0.
Figure 3.25(a) depicts the simulation of a point mass hitting a rigid surface with the impact model (3.106): the phase portrait shows that due to dissipation the mass velocity after the impact is always lower in magnitude than the initial impact velocity, and converges to a limit value. Figure 3.25(b) shows the corresponding impact force: it has a non-linear characteristic that depends on the exponent α, and it exhibits a hysteresis effect that is associated with the dissipative component λx^α v. This plot qualitatively resembles what one would observe by measuring the contact force during a real impact of a small mass against a rigid surface.
M-3.17
Simulate a modal oscillator excited by the non-linear impact force (3.106).
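A minimal sketch of a possible solution: a lumped hammer of mass m_h strikes a single modal oscillator, discretized with the impulse-invariant coefficients of Example M-3.12 so that the loop remains explicit (the force at each step depends only on past samples); all parameter values are illustrative only.
global Fs; Fs=44100; Ts=1/Fs;
mres=0.01; omega0=2*pi*1000; alphares=10;      % modal mass, frequency, decay rate of the resonator
[B,A]=modal_oscillator(mres,alphares,omega0,'impinv');
mh=0.005; kimp=1e8; lambda=5e7; expalpha=1.5;  % hammer mass and contact parameters of Eq. (3.106)
Nsamp=round(0.05*Fs); y=zeros(1,Nsamp); f=zeros(1,Nsamp);
xh=-1e-5; vh=1; xprev=xh;                      % hammer starts just below, moving toward the resonator
for n=3:Nsamp
  y(n)=-A(2)*y(n-1)-A(3)*y(n-2)+B(2)*f(n-1);   % resonator displacement (explicit update)
  vh=vh-Ts*f(n-1)/mh; xh=xh+Ts*vh;             % hammer dynamics (semi-implicit Euler)
  xc=xh-y(n); vc=(xc-xprev)/Ts; xprev=xc;      % compression and compression velocity
  if xc>0, f(n)=kimp*xc^expalpha+lambda*xc^expalpha*vc; else f(n)=0; end   % Eq. (3.106)
end
plot((0:Nsamp-1)/Fs,f); xlabel('time (s)'); ylabel('contact force (N)');   % cf. Fig. 3.25(b)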
Figure 3.26: Stick-slip friction: (a) example of parametrization of a kinetic (static) friction curve; (b) Helmholtz motion resulting from stick-slip ideal string-bow interaction.
f (v(t)) = (3.107)
0, otherwise,
where fc, fs are the Coulomb force and the stiction (short for static friction) force respectively, while vs is called the Stribeck velocity. The Coulomb force and the stiction force are related to the normal force through the equations fs = µs fN and fc = µd fN, where µs and µd are the static and dynamic friction coefficients. If fN ≤ 0 there is no contact. The dependence of the friction force on velocity, as given in Eq. (3.107), is shown in Fig. 3.26(a).
When two objects in relative motion interact through a friction force of this kind, a stick-slip phe-
nomenon is generated in which the two objects remain in static contact for a certain amount of time (the
“stick” phase) and suddenly detach (the “slip” phase). Sound generation occurs when this alternation of stick and slip phases occurs in an almost periodic fashion and at an audio rate, typically locked to some of the natural resonance frequencies of the interacting objects.
An example of stick-slip interaction is the Helmholtz motion occurring in an ideal, rigidly terminated, bowed string (see Fig. 3.26(b)). Assuming the bow to be perfectly rigid and to be in contact with the string at a single point, the string motion at the contact point is a sawtooth signal in which the string remains stuck to the bow hair for a considerable fraction of each vibratory cycle, and slips back abruptly when its displacement becomes large enough, to begin the next cycle. In normal playing conditions the resulting frequency of oscillation is almost coincident with the first-mode frequency of the string. Further
analysis of this Helmholtz motion would reveal that at every instant the shape of the string consists of two line segments joined by a corner, and the corner travels on an envelope composed of two parabolas.
M-3.18
Simulate a modal oscillator excited by the non-linear friction force (3.107).
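A minimal sketch of a possible solution, using a single modal oscillator bowed at constant velocity. Since Eq. (3.107) is not reproduced here, the kinetic friction curve below is an assumed parametrization (a Coulomb term plus an exponential Stribeck term) chosen only to match the qualitative shape of Fig. 3.26(a); all parameter values are illustrative only.
global Fs; Fs=44100; Ts=1/Fs;
mres=0.01; omega0=2*pi*440; alphares=5;          % modal mass, frequency, decay rate
[B,A]=modal_oscillator(mres,alphares,omega0,'impinv');
fN=1; mus=0.4; mud=0.2; vS=0.1; vb=0.2;          % normal force, friction coefficients, Stribeck and bow velocities
fric=@(v) sign(v).*(mud*fN+(mus-mud)*fN*exp(-(v/vS).^2));   % assumed kinetic friction curve
Nsamp=Fs; y=zeros(1,Nsamp); f=zeros(1,Nsamp);
for n=3:Nsamp
  y(n)=-A(2)*y(n-1)-A(3)*y(n-2)+B(2)*f(n-1);     % explicit update: depends on past force only
  v=(y(n)-y(n-1))/Ts;                            % oscillator velocity estimate
  f(n)=fric(vb-v);                               % friction force driven by the relative velocity
end
soundsc(y,Fs);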
More refined, dynamic friction models include some “memory”. The dependence of friction on the
relative sliding velocity is modeled using a differential equation. These models are able to take into
account presliding behavior, where the friction force increases gradually for small displacement val-
ues. Static and dynamic friction models have the same behavior at high or stationary relative velocities,
but dynamic models provide more accurate simulation of transients, which is particularly relevant for
realistic sound synthesis.
by finding the modes of Eq. (3.108). As an example, in the simple case where d2 = E = 0 in Eq. (3.108),
one can find the modes
q̈_i(t) + d1 q̇_i(t) + [ i² ω_0² + ω_1² ∑_{l=1}^{+∞} l² q_l²(t) ] q_i(t) = 0,   i = 1, . . . , +∞,   (3.111)
that can be interpreted as mechanical oscillators in which the frequency of oscillation depends on the
modal displacement. In the general case of Eq. (3.108) including dispersion and frequency-dependent
dissipation, a similar modal description can still be found.
where pm is the pressure inside the performer’s mouth, p is the (oscillating) acoustic pressure inside
the instrument bore, k is the effective reed stiffness, Sd is an effective driving surface on which the
pressure ∆p acts, and ka = k/Sd is the stiffness per unit area. Equation (3.112) is called a quasi-static
approximation since it can be determined experimentally in static conditions where a constant pressure
difference δp is injected into the system and the corresponding constant displacement yL is measured after an initial transient.
Figure 3.29: Quasi-static approximation of a single reed; (a) u versus ∆p and (b) rotated mapping p+ versus p−.
As far as aerodynamics is concerned, the relation between the reed opening h(t), the airflow u(t)
through the slit, and the pressure drop ∆p(t) can be approximated through the equation
∆p(t) = f(u(t), h(t)) = sgn[u(t)] (ρ_air/2) ( |u(t)| / (w h(t)) )²,   (3.113)
where w is the reed width. This equation is derived from the Bernoulli law.9 Using Eq. (3.112), the
reed opening h is computed as h = ymax − yL = ymax − ∆p/ka , and by substituting this relation into
Eq. (3.113) one finds
u(t) = { w sgn[∆p(t)] (y_max − ∆p(t)/k_a) √(2|∆p(t)|/ρ_air),   ∆p < k_a y_max,
       { 0,                                                    otherwise.      (3.114)
Figure 3.29(a) shows the plot of the resulting relation between u and ∆p. For low ∆p values, u increases
until a maximum is reached at ∆p = ka ymax /3. For higher ∆p values, the flow starts to drop due to reed
closure, and reaches the value u = 0 at ∆p = ka ymax . Beyond this value the reed is completely closed.
This non-linear map can be used to construct a quasi-static reed model. If wave variables p± are introduced in the cylindrical bore, i.e. p = p+ + p− and u = p+ − p−, then these relations can be substituted into Eq. (3.114). As a consequence this non-linearity can be turned into a new one in which p+ depends on p− through a non-linear reflection function Rnl, i.e. p+ = Rnl(p−). This is depicted in Fig. 3.29(b).
Despite its simplicity, the quasi-static model is able to capture the basic non-linear mechanisms
of self-sustained oscillations in a single reed instrument. Due to its compactness and low number of
parameters, this model has been also used for sound synthesis purposes.
9
The Bernoulli law holds for incompressible non-viscous fluids in stationary conditions, and states the relation u = A · x · ∆p^{1/2} sgn(∆p) between the flow u and the pressure difference ∆p through an aperture of width x. Some authors adopt for the single reed the generalized equation u = [A · x ∆p^{1/2} sgn(∆p)]^{1/α}, with an experimentally determined value α = 3/2.
M-3.19
Write a function that computes the pressure wave p+ [n] reflected into the bore from the wave p− [n] arriving from
the bore, according to the quasi-static model (3.114). Implement a quasi-static clarinet model in which the quasi-
static reed is coupled to a waveguide cylindrical bore and driven by a mouth pressure signal pm . The bell can
be modeled as a low-pass filter that radiates frequencies above its cut-off (typically around 1500 Hz) and reflects
low frequencies back inside the bore.
M-3.19 Solution
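A minimal sketch of the first part of the exercise: substituting the wave variables into Eq. (3.114) gives the implicit relation p+ − p− = u(pm − p+ − p−), which is solved here sample by sample with fzero (in a real-time implementation one would rather precompute Rnl as a lookup table); the function name and the value of the air density are illustrative assumptions.
function pplus = reed_reflect(pminus,pm,ka,ymax,w)
rho_air=1.2;                                     % air density (kg/m^3), illustrative value
ufun=@(dp) (dp<ka*ymax).*w.*sign(dp).*(ymax-dp/ka).*sqrt(2*abs(dp)/rho_air);  % Eq. (3.114)
g=@(pp) (pp-pminus)-ufun(pm-pp-pminus);          % implicit relation u = p+ - p-
pplus=fzero(g,pminus);                           % outgoing pressure wave p+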
Further refinements to this model should include propagation losses, fractional-delay filters in order
to allow for fine tuning of the bore length, and acoustic holes modeled as scattering filters connected
through 3-port junctions to the main waveguide structure.
M-3.20
Implement a dynamic clarinet model in which the dynamic reed is coupled to a waveguide cylindrical bore and
driven by a mouth pressure signal pm .
Finally, additional degrees of freedom must be taken into account when simulating other types of
reeds. Double reeds (such as those found in oboe and bassoon) are composed of two reeds that oscillate
independently, and even if one assumes perfect symmetry of oscillation the flow model differs from the
one examined previously, due to the smallness of the aperture. In so-called lip reeds the role of the reed is taken by the performer’s lips, which are constrained by the mouthpiece and vibrate at the fundamental frequency: at least two degrees of freedom are needed to simulate lip vibration.
Figure 3.30: The original function f and the sheared functions h for k = 1.5, 1.2, 0.9, 0.6, 0.3.
where f is a non-linear function, and x̃[n] is a vector of variables that are known at time n. The variables y[n] depend instantaneously on themselves in the above equation. If one could turn this implicit dependence into a new explicit dependence y[n] = h(x̃[n]), this would solve the delay-free loop problem. This is achieved using the implicit mapping theorem. Define the function g as
and assume that there is a point (x̃0 , y 0 ) such that g(x̃0 , y 0 ) = 0. Moreover, assume that the following
condition holds
det[ J_y(g)(x̃_0, y_0) ] = det[ ∂g_i/∂y_j (x̃_0, y_0) ]_{i,j} ≠ 0,   (3.119)
where Jy (·) denotes the Jacobian matrix with respect to the y variables. From the definition of g, it is
seen that Jy (g) = Jx (f )K − I. Therefore, condition (3.119) implies that the matrix [Jx (f )K − I]
must be non-singular at the point (x̃0 , y 0 ). If these conditions are fulfilled, then the implicit mapping
theorem states that a function h(x̃) exists locally (i.e. for points x̃ in a neighborhood of x̃0 ), with the
properties
h(x̃0 ) = y 0 and g(x̃, h(x̃)) = 0. (3.120)
If the above conditions are fulfilled globally rather than in a neighborhood of (x̃0 , y 0 ), then h is defined
globally.
A few geometrical considerations can help in understanding the shape of the new function h. Consider the coordinate transformation
[ x̃ ]   [ I  −K ] [ x ]
[ y  ] = [ 0   I  ] [ y ].   (3.121)
This defines a shear that leaves the y axis unchanged and distorts the x axis into the x̃ axis. The plot of the function y = f(x) "lives" in the (x, y) space. Then the plot of y = h(x̃) is obtained by applying the coordinate transformation (3.121), and is therefore a sheared version of the former.
In order to understand this shear effect, consider the following example with a scalar function f :
R → R:
y[n] = f(x[n]) = e^{−(x[n])²},  with x[n] = x̃[n] + k y[n].   (3.122)
Condition (3.119) translates in this case into the condition f′(x) ≠ 1/k, which has a straightforward geometrical interpretation: the shear transformation defined in Eq. (3.121) is such that the vector [x, y]^T = [k, 1]^T (i.e. a point with tangent 1/k) is transformed into the vector [x̃, y]^T = [0, 1]^T (i.e. a point with vertical tangent). This explains why the derivative of f cannot equal 1/k.
Figure 3.30 shows the original function f (x), together with the sheared one h(x̃), for various k val-
ues. It can be seen that the horizontal coordinate is distorted when applying the shearing transformation.
Moreover, note that for k = 1.5 the new function h(x̃) cannot be defined globally, because the condition
f ′ (x) ̸= 1/k is not fulfilled globally in this case.
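A minimal sketch reproducing the construction of Fig. 3.30 for the example of Eq. (3.122): the sheared function h is obtained simply by plotting f(x) against the sheared coordinate x̃ = x − k y.
x=linspace(-2,2,400); y=exp(-x.^2);    % original function y = f(x), Eq. (3.122)
k=0.6;                                 % shear amount (with k = 1.5 h is not defined globally)
xtilde=x-k*y;                          % sheared horizontal coordinate, Eq. (3.121)
plot(x,y,'--',xtilde,y,'-'); legend('f(x)','h(xtilde)'); xlabel('x, xtilde');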
M-3.21
Simulate a modal oscillator excited by the non-linear impact force f(x(t)) = kx(t)^α (i.e. the impact model (3.106) with λ = 0) as follows: use an implicit numerical scheme (e.g. the bilinear transform), find the implicit dependence in the form (3.117), and construct the corresponding sheared non-linear function.
the topic of higher dimensional (2- and 3-D) waveguide structures: seminal ideas were first presented by
van Duyne and Smith III [1993].
About lumped modeling approaches. Numerical and computational aspects: most of the techniques
described in Sec. 3.5.1 are found in DSP textbooks: see e.g. [Mitra, 2005]. In the field of numerical
analysis, a comprehensive discussion on numerical methods for ordinary differential equations is given
by Lambert [1993]. The example illustrated in Fig. 3.17 about delay-free computational paths in linear
systems is adapted from [Mitra, 2005, Sec. 6.1.3, Fig. 6.5]. A classic reference on the theory of Wave Digital Filters (WDF) is [Fettweis, 1986].
Finite difference schemes have also been applied to the explicit numerical simulation of partial differential equations, e.g. for modeling idiophones [Chaigne and Doutaut, 1997] and single reed systems [Stewart and Strong, 1980]. A recent book about the applications of finite difference methods to numer-
ical sound synthesis is [Bilbao, 2009], which discusses the fundamentals of finite differences and shows
how they can be employed to simulate strings, bars, plates, membranes, acoustic tubes. Among other
lumped modeling approaches, in the early nineties Cadoz and coworkers have introduced the CORDIS-
ANIMA model [Florens and Cadoz, 1991], which describes vibrating bodies as a set of interconnected
mass-spring-damper cells.
Modal synthesis. A classic presentation of modal synthesis techniques is [Adrien, 1991]. Cook
[1997] developed a series of “physically-informed” approaches to the modeling of percussion sounds,
which are based on a modal description. The use of modal sound synthesis in virtual reality applications is discussed in [van den Doel and Pai, 2004]. A corpus of relevant contributions in this field has
been provided by Rabenstein and coworkers [Trautmann and Rabenstein, 2003], who have proposed
the so-called functional transformation method (FTM): in essence, the method exploits the existence of
an analytical form of the modal parameters for a set of relevant multidimensional differential systems,
including strings and membranes with various boundary conditions. Our examples of modal analysis for simple 1-D and 2-D shapes are based on [Fletcher and Rossing, 1991, Ch. 2-3]. The same book also
shows experimental results of modal analysis on several musical instruments, including modal shapes
and Chladni patterns. In addition to linear prediction techniques and partial tracking methods, already
discussed in Chapter Sound modeling: signal based approaches, a method for high-resolution estimate of modal
parameters from sound analysis has been proposed in [Esquef et al., 2003].
About non-linear physical models. The non-linear impact model of Eq. (3.106) was first proposed by Hunt and Crossley [1975]. Concerning stick-slip friction models, an overview of traditional models
in the context of sound synthesis applications (bowed strings) is provided by Serafin [2004]. More com-
plex dynamic stick-slip models, typically used in the literature of automatic control, have been recently
applied to sound synthesis by Avanzini et al. [2005]. We have seen that the reed mechanism is that of
pressure-controlled valves: a classic paper on the topic is [Fletcher, 1993]. The quasi-static single reed
examined in Sec. 3.6.3 was first studied by Schumacher [1981] and has been used extensively in the
literature. Other types of reeds: for the double reed see [Guillemain, 2004], for the lip reed see [Adachi and Sato, 1996]. Lip reeds have some similarities with vocal fold functioning: a classic example of a vocal fold model applied to voice synthesis is [Ishizaka and Flanagan, 1972].
vocal fold model applied to voice synthesis is [Ishizaka and Flanagan, 1972].
We have seen that new problems are encountered when non-linear elements are present in the delay-
free computational path: Borin et al. [2000] provides a discussion of these issues, together with a pro-
posed non-iterative solution (in brief, a set of hypotheses and techniques to pre-compute a “sheared”
non-linear function that makes the numerical scheme computable), and applications to the simulation of
acoustic systems.
References
Seiji Adachi and Masa-aki Sato. Trumpet Sound Simulation Using a Two-dimensional Lip Vibration Model. J. Acoust. Soc. Am., 99(2):1200–1209, Feb. 1996.
Jean-Marie Adrien. The missing link: Modal synthesis. In Giovanni De Poli, Aldo Piccialli, and Curtis Roads, editors,
Representations of Musical Signals, pages 269–297. MIT Press, Cambridge, MA, 1991.
Federico Avanzini, Stefania Serafin, and Davide Rocchesso. Interactive simulation of rigid body interaction with friction-
induced sound generation. IEEE Trans. Speech Audio Process., 13(6):1073–1081, Nov. 2005.
Balázs Bank. Physics-based Sound Synthesis of String Instruments Including Geometric Nonlinearities. PhD thesis, Budapest
University of Technology and Economics, Dep. of Measurement and Information Systems, Budapest, 2006.
Julien Bensa, Stefan Bilbao, Richard Kronland-Martinet, and Julius O. Smith III. The simulation of piano string vibration:
From physical models to finite difference schemes and digital waveguides. J. Acoust. Soc. Am., 114(2):1095–1107, Aug.
2003.
Stefan Bilbao. Numerical sound synthesis - Finite difference schemes and simulation in musical acoustics. John Wiley & Sons,
Chichester, 2009.
Gianpaolo Borin, Giovanni De Poli, and Davide Rocchesso. Elimination of delay-free loops in discrete-time models of nonlin-
ear acoustic systems. IEEE Trans. Speech Audio Process., 8(5):597–606, Sep. 2000.
Antoine Chaigne and Vincent Doutaut. Numerical Simulations of Xylophones. I. Time-domain Modeling of the Vibrating Bar.
J. Acoust. Soc. Am., 101(1):539–557, Jan. 1997.
Perry R. Cook. Physically informed sonic modeling (PhISM): Synthesis of percussive sounds. Computer Music J., 21(3):
38–49, 1997.
Giovanni De Poli. A Tutorial on Digital Sound Synthesis Techniques. In Curtis Roads, editor, The Music Machine, pages
429–447. MIT Press, 1991.
Giovanni De Poli and Davide Rocchesso. Physically Based Sound Modelling. Organized Sound, 3(1):61–76, Apr. 1998.
John R Deller, John G. Proakis, and John. H.L. Hansen. Discrete-Time Processing of Speech Signals. Macmillan, New York,
1993.
Paulo A. A. Esquef, Matti Karjalainen, and Vesa Välimäki. Frequency-zooming arma modeling for analysis of noisy string
instrument tones. EURASIP Journal on Applied Signal Processing, 2003(10):953–967, 2003.
Alfred Fettweis. Wave Digital Filters: Theory and Practice. Proceedings of the IEEE, 74(2):270–327, Feb. 1986.
Neville H. Fletcher. Autonomous Vibration of Simple Pressure-Controlled Valves in Gas Flows. J. Acoust. Soc. Am., 93(4):
2172–2180, Apr. 1993.
Neville H. Fletcher and Thomas D. Rossing. The physics of musical instruments. Springer-Verlag, New York, 1991.
Jean Luc Florens and Claude Cadoz. The physical model: modeling and simulating the instrumental universe. In Giovanni
De Poli, Aldo Piccialli, and Curtis Roads, editors, Representations of Musical Signals, pages 227–268. MIT Press, Cam-
bridge, MA, 1991.
Federico Fontana and Federico Avanzini. Computation of delay-free nonlinear digital filter networks. Application to chaotic
circuits and intracellular signal transduction. IEEE Trans. Sig. Process., 56(10):4703–4715, Oct. 2008.
Philippe Guillemain. A digital synthesis model of double-reed wind instruments. EURASIP Journal on Applied Signal Pro-
cessing, 2004(1):990–1000, Jan. 2004.
Lejaren Hiller and Paul Ruiz. Synthesizing Musical Sounds by Solving the Wave Equation for Vibrating Objects: Part I. J.
Audio Eng. Soc., 19(6):462–470, June 1971a.
Lejaren Hiller and Paul Ruiz. Synthesizing Musical Sounds by Solving the Wave Equation for Vibrating Objects: Part II. J.
Audio Eng. Soc., 19(7):542–551, July 1971b.
Kenneth H. Hunt and F. R. Erskine Crossley. Coefficient of restitution interpreted as damping in vibroimpact. ASME J. Applied
Mech., 42:440–445, June 1975.
Kenzo Ishizaka and James L. Flanagan. Synthesis of voiced sounds from a two-mass model of the vocal cords. Bell Syst. Tech.
J., 51:1233–1268, 1972.
Kevin Karplus and Alexander Strong. Digital Synthesis of Plucked String and Drum Timbres. Computer Music J., 7(2):43–55,
1983.
John L. Kelly and Carol C. Lochbaum. Speech synthesis. In Proc. 4th Int. Congr. Acoustics, pages 1–4, Copenhagen, Sep.
1962.
Timo I. Laakso, Vesa Välimäki, Matti Karjalainen, and Unto K. Laine. Splitting the Unit Delay Tools for Fractional Delay
Filter Design. IEEE Signal Processing Magazine, 13(1):30–60, Jan. 1996.
John D. Lambert. Numerical Methods for Ordinary Differential Systems. John Wiley & Sons, 1993.
Michael E. McIntyre, Robert T. Schumacher, and James Woodhouse. On the Oscillations of Musical Instruments. J. Acoust.
Soc. Am., 74(5):1325–1345, Nov. 1983.
Robert T. Schumacher. Ab Initio Calculations of the Oscillations of a Clarinet. Acustica, 48(2):71–85, 1981.
Stefania Serafin. The sound of friction: real-time models, playability and musical applications. PhD thesis, Stanford University,
Center for Computer Research in Music and Acoustics, Stanford, 2004.
Julius O. Smith III. A new approach to digital reverberation using closed waveguide networks. In Proc. Int. Computer Music
Conf. (ICMC’85), pages 47–53, Vancouver, 1985.
Julius O. Smith III. Viewpoints on the History of Digital Synthesis. In Proc. Int. Computer Music Conf. (ICMC’91), pages
1–10, Montreal, Oct. 1991.
Julius O. Smith III. Principles of digital waveguide models of musical instruments. In Mark Kahrs and Karl-Heinz Brandenburg,
editors, Applications of Digital Signal Processing to Audio and Acoustics, pages 417–466. Kluwer Academic Publishers,
New York, Mar. 1998.
Julius O. Smith III. Virtual acoustic musical instruments: Review and update. Journal of New Music Research, 33(3):283–304,
Autumn 2004.
Julius O. Smith III. Physical Audio Signal Processing: for Virtual Musical Instruments and Digital Audio Effects, December
2008 Edition. http://ccrma.stanford.edu/~jos/pasp/, 2008. Accessed 15/12/2008.
Stephen E. Stewart and William J. Strong. Functional Model of a Simplified Clarinet. J. Acoust. Soc. Am., 68(1):109–120, July
1980.
Lutz Trautmann and Rudolf Rabenstein. Digital Sound Synthesis by Physical Modeling Using the Functional Transformation
Method. Kluwer Academic, New York, 2003.
Vesa Välimäki, Jyri Pakarinen, Cumhur Erkut, and Matti Karjalainen. Discrete-time modelling of musical instruments. Rep.
Prog. Phys., 69(1):1–78, 2006.
Kees van den Doel and Dinesh K. Pai. Modal Synthesis for Vibrating Objects. In Ken Greenebaum, editor, Audio Anecdotes.
AK Peters, Natick, MA, 2004.
S. A. van Duyne and J. O. Smith III. The 2-D Digital Waveguide Mesh. In Proc. IEEE Workshop on Applications of Sig.
Process. to Audio and Acoustics (WASPAA’93), pages 177–180, New Paltz (NY), Oct. 1993.