The Technology of Computer Music 1969 PDF
The Technology of Computer Music
MAX V. MATHEWS
with the collaboration of Joan E. Miller, F. R. Moore, J. R. Pierce, and J. C. Risset
1. Fundamentals
Introduction
Numerical Representation of Functions of Time
Sampling and Quantizing
Foldover Errors
*Mathematical Analysis of Sampling
*Alternative Analysis of Sampling
Bounding Sampling Errors
*Sample and Hold Analysis
*Analysis of Quantizing Errors
Digital-to-Analog and Analog-to-Digital Converters
Smoothing-Filter Design
Digital Data Storage and Retrieval for Sound
Fundamental Programming Problems for Sound Synthesis
Overview of Sound-Synthesis Program: Music V
Annotated References by Subject
Problems for Chapter 1
3. Music V Manual
1. Introduction
2. Description of Pass I
3. Operation Codes and Corresponding Data Statements
4. Definition of Instruments
5. Unit Generators
6. Special Discussion of OSC Unit Generator
7. Input-Output Routines for Pass I and Pass II
8. PLF Subroutines
9. General Error Subroutine
10. Description of Pass II
11. WRITE2
12. CON: Function Evaluator for Line-Segment Functions
13. SORT and SORTFL
14. PLS Routines
15. CONVT: Convert Subroutine
16. Description of Pass III
17. I and IP Data Arrays in Pass III
18. Note Parameters
19. Instrument Definition
20. FORSAM
21. SAMGEN
22. SAMOUT
23. SAMOUT for Debugging
24. Acoustic-Sample Output Program: FROUT
25. GEN: Pass III Function-Generating Subroutines
Introduction
This book is intended for people who plan to use computers for
sound processing. Present users range from engineers and physicists
concerned with speech and acoustics to musicians and phoneticians
concerned with sound synthesis and speech production and perception.
The widely varied technical and mathematical background of this
audience makes it hard to select a technical level for this presentation.
Some experience with a computer language such as FORTRAN will be
assumed, though it could be obtained at the time this material is studied.
Occasionally a satisfactory explanation of some point requires
mathematics at the level of a graduate curriculum in electrical engineer-
ing. These mathematical sections have been quarantined and marked
with an asterisk. Although the mathematical material adds essential
understanding of sound processing, the rest of the book is intended to
be comprehensible without it. The implications of the mathematics are
usually given in elementary terms in other sections. Also, Appendix B
lists the main relationships required for mathematical background.
Chapter 1 covers some fundamentals that are basic to all computer
sound processing: the representation of sounds as numbers, the underlying
processes of sampling and quantizing a sound wave, the approximations
and errors that are inherent in sampling and quantizing, the
operation of digital-to-analog and analog-to-digital converters, the
[Figure: two panels, (a) and (b), each with time in seconds on the horizontal axis; panel (a) is marked from .001 to .007 seconds.]
In the past most sounds have originated from the vibrations and
movements of natural objects: human vocal cords, violin strings,
colliding automobiles. The nature of these sounds is determined by and
limited by the particular objects. However, in the last 50 years the
loudspeaker has been developed as a general sound source. It produces
a pressure function by means of the vibrations of a paper cone actuated
by a coil of wire in a magnetic field. The movement of the cone as a
function of time, and hence the resulting pressure function, are deter-
mined by the electric voltage (as a function of time) applied to the coil.
Loudspeakers are not perfect: they distort all sounds slightly, and some
sounds are hard to produce. However, the almost universal range of
sounds they generate in a satisfactory way is demonstrated by the range
of sounds that can be played on phonograph records and on radios.
Loudspeakers are sound sources of almost unlimited richness and
potential.
To drive a loudspeaker and produce a desired pressure function, an
electric voltage function of time must be applied to its coil. Exchanging
the problem of generating a pressure function for generating a voltage
function might seem to offer little gain. However, very versatile methods
exist for producing electric functions.
[Figure: block diagram with the elements: computer memory holding the numbers 6, 13, 16, 12, 11, 15, ...; digital-to-analog converter; smoothing filter, 0 to 5 KHz; loudspeaker. An accompanying plot shows the sample values against time in milliseconds.]
Fig. 3. Example of various sampling rates: (a) high sampling rate; (b) low sampling rate. [Both panels plot pressure against time.]
We can now begin to appreciate the huge task facing the computer.
For each second of high-fidelity sound, it must supply 30,000 numbers
to the digital-to-analog converter. Indeed, it must put out numbers
steadily at a rate of 30,000 per second. Modern computers are capable
of this performance, but only if they are expertly used. We can also
begin to appreciate the inherent complexity of pressure functions
producing sound. We said such a pressure could not be described by
one number; now it is clear that a few minutes of sound require
millions of numbers.
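The arithmetic behind "millions of numbers" can be sketched in a few lines. This is only an illustration; the function name `samples_needed` is hypothetical, and the 30,000-per-second rate is the figure quoted above.

```python
SAMPLING_RATE_HZ = 30_000  # samples per second, the rate quoted in the text


def samples_needed(duration_seconds, rate_hz=SAMPLING_RATE_HZ):
    """Return how many numbers must be supplied for a sound of this length."""
    return int(round(duration_seconds * rate_hz))


one_second = samples_needed(1)
three_minutes = samples_needed(3 * 60)
print(one_second, three_minutes)
```

At this rate one second needs 30,000 numbers and three minutes needs 5,400,000.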
The second approximation is called quantizing. The numbers in
computers contain only a certain number of digits. The numbers in the
Fig. 2 computer have only two digits. Thus, for example, all the pulse
amplitudes between 12.5 and 13.5 must be represented by the number
13. Of course we could build a larger computer that could handle
three-digit numbers. This machine could represent 12.5 exactly. How-
ever, it would have to approximate all the amplitudes between 12.45
and 12.55 by 12.5. Furthermore, the more digits, the more expensive
will be the computer.
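A minimal sketch of the rounding described above, assuming that amplitudes are rounded to the nearest available level and that exact halves round upward (the text does not say which way ties go). The name `quantize` and the `step` parameter are illustrative only.

```python
import math


def quantize(amplitude, step=1.0):
    """Round an amplitude to the nearest multiple of `step` (ties round up)."""
    return math.floor(amplitude / step + 0.5) * step


# Two-digit machine: everything between 12.5 and 13.5 becomes 13.
coarse = [quantize(x) for x in (12.5, 12.7, 13.0, 13.4)]

# Three-digit machine: 12.5 is exact, but 12.47 and 12.53 both become 12.5.
fine = [round(quantize(x, step=0.1), 1) for x in (12.5, 12.47, 12.53)]

print(coarse, fine)
```

In both cases the error never exceeds half a step, which is the quantity used in the signal-to-noise estimate that follows.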
The quantizing errors are closely equivalent to the noise and distor-
tion that are produced by phonographs, tape recorders, amplifiers, and
indeed all sound-generating equipment. Their magnitude can be
estimated in terms of signal-to-noise ratios or percentage distortions.
The approximate signal-to-noise ratio inherent in a given number of
digits equals
$$\frac{\text{Maximum number expressible with the digits}}{\text{Maximum error in representing any number}}$$
For example, with two decimal digits, the maximum number is 99 and
the maximum error is .5. The signal-to-noise ratio is
$$\frac{99}{.5} \approx 200 \text{ or } 46\ \text{dB}$$
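The estimate can be checked numerically. The sketch below assumes the usual amplitude convention of 20 log10(ratio) for decibels, which reproduces the 46 dB quoted above; the function name `snr_from_digits` is hypothetical.

```python
import math


def snr_from_digits(num_decimal_digits):
    """Return (ratio, decibels) for numbers with the given count of decimal digits."""
    max_number = 10 ** num_decimal_digits - 1  # 99 for two digits
    max_error = 0.5                            # half of one quantizing step
    ratio = max_number / max_error
    return ratio, 20 * math.log10(ratio)


ratio, decibels = snr_from_digits(2)
print(ratio, round(decibels, 1))
```

For two digits the ratio is 198, which the text rounds to 200, and the level is just under 46 dB. Each additional decimal digit adds about 20 dB.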
Foldover Errors
The generation of voltage functions from quantized samples is a
practical, powerful, and useful method when coupled to modern com-
puters. Most of this book is concerned with applications of this method.
In order to use the method, its errors and limitations must be under-
stood and avoided. A mathematical analysis of the errors is given later
[Figure: panels (a) and (b), each showing a voltage function and its sample pulses against time.]
waveform poorly. This happens when the voltage function has frequencies
higher than R/2 Hz, where R is the sampling rate. This is the
case for the voltage functions and sampling rates shown in Fig. 4.
When the voltage function contains frequencies higher than R/2 Hz,
these higher frequencies are reduced, and the resulting sound is heard
somewhere in the range 0 to R/2 Hz. For example, if the sampling rate
is 30,000 Hz and we generate samples of a sine wave at a frequency of
25,000 Hz
$$\sin(2\pi \cdot 25{,}000 \cdot t)$$
the resulting voltage function out of the low-pass filter (smoothing
filter, Fig. 2) will be a sine wave at 5000 Hz
$$\sin(2\pi \cdot 5000 \cdot t)$$
More generally, if we generate samples of a sine wave at F Hz, where F
is greater than R/2, the resulting frequency will be
$$F_{\text{fold}} = R - F$$
The frequency F is reflected or folded by the sampling frequency; hence
the term foldover.
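The relation can be written as a one-line function. This sketch covers only the case stated above, a frequency F between R/2 and R; the name `folded_frequency` is illustrative.

```python
def folded_frequency(f_hz, rate_hz):
    """Frequency heard when a sine at f_hz (with R/2 < f_hz < R) is sampled at rate_hz."""
    if not (rate_hz / 2 < f_hz < rate_hz):
        raise ValueError("this sketch only handles R/2 < F < R")
    return rate_hz - f_hz


heard = folded_frequency(25_000, 30_000)
print(heard)
```

With R = 30,000 Hz and F = 25,000 Hz the function returns 5000 Hz, the value in the example.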
Why does foldover occur? Some physical feeling is suggested by
Fig. 5. Here we have diagrammed the example discussed above, of a
25,000-Hz sine wave sampled at 30,000 Hz. The samples of the 25,000-Hz
wave are shown as points, and the actual numbers are
1, .5, -.5, -1, -.5, .5, 1, .5, ...
[Fig. 5: the waves sin(2π·25,000·t) and sin(2π·5000·t) plotted against time in microseconds, with the sample points marked. Sampling rate = 30,000 Hz; sampling time = 33⅓ μsec.]
A 5000-Hz sine wave is also shown, and it also passes through the same
sample points. In other words, the 5000-Hz wave will have the identical
samples and therefore the identical numbers as the 25,000-Hz wave.
When the pulses produced by these numbers are put into the low-pass
filter, a 5000-Hz wave will come out, because the low-pass filter passes
low frequencies and attenuates high frequencies.
The essential point in the example is the identity of the samples of the
25,000-Hz and 5000-Hz waves. Hence from the samples there is no way
to distinguish between these frequencies. No computer program or
electric filter or other device can separate identical objects. For practical
purposes, the digital-to-analog converter and smoothing filter will
always be designed to interpret the samples as a 5000-Hz wave, that is, a
wave between 0 and R/2 Hz. Thus one must be willing to accept this
frequency in the sound, or one must avoid generating samples of a
25,000-Hz wave (in general, a wave with frequencies greater than
R/2 Hz).
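The identity of the two sets of samples is easy to confirm numerically. The sketch below uses cosine phase, an assumption made so that the output matches the sample values 1, .5, -.5, -1, ... listed earlier; with sine phase the two waves would agree except for a sign inversion. The function name `cosine_samples` is hypothetical.

```python
import math

RATE_HZ = 30_000  # sampling rate used in the example


def cosine_samples(freq_hz, n_samples, rate_hz=RATE_HZ):
    """Return n_samples of cos(2*pi*freq*t) taken at t = i / rate_hz."""
    return [math.cos(2 * math.pi * freq_hz * i / rate_hz) for i in range(n_samples)]


high = cosine_samples(25_000, 8)
low = cosine_samples(5_000, 8)
largest_difference = max(abs(a - b) for a, b in zip(high, low))
print([round(x, 3) for x in high])
print(largest_difference)
```

The largest difference is at the level of floating-point rounding, so nothing downstream of the samples can tell the two waves apart.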
The example chosen was simple in order that the graph could be
easily seen and the numbers easily computed. But the relation
$F_{\text{fold}} = R - F$ holds for sine waves generally. More complex periodic
waves can be decomposed into individual harmonics and the foldover
frequency calculated separately for each harmonic.
Foldover also occurs from multiples of the sampling rate. Components
of ±R ± F, ±2R ± F, ±3R ± F, etc., are produced by the
digital-to-analog converter. However, in most cases only R - F is
troublesome.
We will next illustrate the sound of foldover with two examples.
Suppose a sine wave with continuously increasing frequency (glissando)
is sampled. What will be heard? As the frequency increases from 0 to
15,000 Hz, an increasing frequency going from 0 to 15,000 Hz will be
heard. But as the frequency increases from 15,000 to 30,000 Hz, a
decreasing frequency (30,000 - F) will be heard, going from 15,000 to
0 Hz. This is usually a shock! If we persist in raising the frequency and
proceed from 30,000 to 45,000 Hz, the resulting sound will go upward
from 0 to 15,000 Hz (-30,000 + F).
If we generate a complex tone with a high pitch and many harmonics,
the higher harmonics will fold over and appear at unwanted frequencies.
For example, the fifth harmonic of a 3000-Hz tone will occur at 15,000
Hz. That is the highest frequency that is not folded at a 30,000-Hz
sampling rate. The sixth harmonic (18,000 Hz) will be generated at
12,000 Hz and thus add to the fourth harmonic. The ninth harmonic
(27,000 Hz) will appear at the fundamental frequency, 3000 Hz.
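Both examples follow from one rule: reduce the frequency modulo the sampling rate, then reflect anything above R/2. The sketch below is one way to write that rule, assuming an ideal smoothing filter that passes only 0 to R/2; the name `heard_frequency` is illustrative.

```python
def heard_frequency(f_hz, rate_hz):
    """Frequency in the range 0..R/2 that results from sampling a sine at f_hz."""
    f = f_hz % rate_hz                      # shifts by multiples of R are indistinguishable
    return rate_hz - f if f > rate_hz / 2 else f


RATE_HZ = 30_000
harmonics = [heard_frequency(n * 3000, RATE_HZ) for n in range(1, 10)]
glissando = [heard_frequency(f, RATE_HZ) for f in (10_000, 20_000, 40_000)]
print(harmonics)
print(glissando)
```

The harmonic list reproduces the text: the fifth harmonic stays at 15,000 Hz, the sixth lands on 12,000 Hz, and the ninth on 3000 Hz. In the glissando, 20,000 Hz is heard as 10,000 Hz, and so is 40,000 Hz.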
$p(t)$, $-\infty < t < \infty$, is sampled. The analog-to-digital converter produces
a sequence of numbers $p(iT)$, $i = \ldots, -1, 0, 1, 2, \ldots$, equal to $p(t)$ at
the sampling times $iT$. The sampling interval is $T$, and the sampling
rate $R = 1/T$.
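$$z(t) = \sum_{i=-\infty}^{\infty} \delta(t - iT)\, p(iT) \tag{1}$$
Fig. 7. Typical frequency-limited spectrum. [$P(\omega)$ plotted against $\omega$ in rad/sec.]
$$m(t) = \sum_{i=-\infty}^{\infty} \delta(t - iT)$$
Fig. 8. Sampling impulses. [$m(t)$ plotted against $t$ in seconds, with impulses at ..., -2T, -T, 0, T, 2T, 3T, ...]
$$M(\omega) = \frac{2\pi}{T} \sum_{n=-\infty}^{+\infty} \delta(\omega - n\omega_0) \tag{3}$$
as shown in Fig. 9.
[Figure: spectrum plotted against $\omega$ in rad/sec, with the spacing $\omega_0$ marked.]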
1 This spectrum may be formally derived from the Fourier series analysis of
$m(t)$, which yields
$$m(t) = \frac{1}{T} + \frac{2}{T} \sum_{n=1}^{\infty} \cos n\omega_0 t$$
The spectrum of $\cos n\omega_0 t$ is
$$\pi[\delta(\omega - n\omega_0) + \delta(\omega + n\omega_0)]$$
Hence the spectrum of $m(t)$ may be computed as the sum of spectrums of $\cos n\omega_0 t$
terms
The spectrum $P^*(\omega)$ of the output $p^*(t)$ is $Z(\omega)$ times the product of the
amplification $T$ and the transfer function $F(\omega)$ of the smoothing filter
$$P^*(\omega) = F(\omega) \sum_{n=-\infty}^{+\infty} P(\omega - n\omega_0) \tag{6}$$
Equation 6 is the basic result and holds for both frequency-limited and
frequency-nonlimited $P(\omega)$'s. It says that $P^*(\omega)$ contains the sum of
$P(\omega)$ spectra which have been shifted by $n\omega_0$. Let us examine $P^*(\omega)$
for the frequency-limited case.
Figure 10 shows a sketch of $T \cdot Z(\omega)$ for the $P(\omega)$ shown in Fig. 7.
Since $P(\omega) = 0$ for $|\omega| \ge \omega_0/2$, the sum of shifted $P(\omega)$ spectra gives
copies of the $P(\omega)$ spectra centered at $\ldots, -\omega_0, 0, \omega_0, 2\omega_0, \ldots$ rad/sec.
If the smoothing transfer function $F(\omega)$ is such that $F(\omega) = 1$ for
$|\omega| < \omega_0/2$, and $F(\omega) = 0$ for $|\omega| \ge \omega_0/2$ as shown in Fig. 10, then
$P^*(\omega)$ is simply the center hump of $T \cdot Z(\omega)$. Geometrically it is easy to
see that $P^*(\omega) = P(\omega)$ and therefore that $p(t) = p^*(t)$.
[Fig. 10: horizontal axis $\omega$ in rad/sec.]
Thus we have established our main claim and shown how a faithful
replication of any frequency-limited function can be generated from
samples.
What errors are produced if $P(\omega)$ is not frequency-limited? Figure 11
shows such a case. $P(\omega)$ is nonzero until $\omega$ equals $.9\omega_0$. The summation
specified by Eq. 5 causes the tail ($P(\omega)$, $\omega_0/2 < \omega < \omega_0$) to add energy
to $Z(\omega)$ in the frequency region $0 < \omega < \omega_0/2$. The tail is said to be
folded around $\omega_0/2$, and hence the distortion is called foldover. Energy
in $P(\omega)$ at frequencies $\omega$ appears in $P^*(\omega)$ at frequencies $\omega_0 - \omega$. This
distortion is produced by the terms $P(\omega - \omega_0)$ and $P(\omega + \omega_0)$ in
Eq. 5. If $P(\omega)$ contains even higher frequencies, distortions with
frequency shifts of $2\omega_0 - \omega$ will be introduced by the $P(\omega - 2\omega_0)$ and
$P(\omega + 2\omega_0)$ terms, and so forth.
Fig. 11. Spectrum of $T \cdot Z(\omega)$ with function having a too wide frequency
spectrum. [Horizontal axis: $\omega$ in rad/sec.]
In addition to foldover, errors are also introduced by the smoothing
filter. The transfer function $F(\omega)$ is one term in Eq. 6. Realizable filters
cannot achieve the ideal transfer function of unity for $|\omega| < \omega_0/2$ and
zero for $|\omega| \ge \omega_0/2$. A typical function is sketched in Fig. 11. Two
types of errors are caused. Departures of the amplitude from unity for
$|\omega| < \omega_0/2$ distort $P^*(\omega)$ within the band of interest and produce
in-band distortion. These distortions are typical of errors in other
electronic equipment and are often measured in decibels of departure
from unity or "flatness." Flatness within ±1 dB is typical and easy to
produce.
Departures of the amplitude from zero for $|\omega| \ge \omega_0/2$ add high-frequency
energy to $P^*(\omega)$. For example, if $F(\omega_0) \ne 0$, a tone with a
pitch equal to the sampling frequency will be heard. Gains as small as
1/100 or 1/1000 are not hard to achieve for $|\omega| \ge \omega_0/2$. In many cases
the ear is not sensitive to the high frequencies and hence they are not
objectionable. At a sampling rate of 30,000 Hz, all high-frequency
distortions are at frequencies greater than 15,000 Hz and hence are
almost inaudible.
One other limitation of realizable filters must be taken into account.
They require a certain frequency band to change gain from unity to
zero. In Fig. 11, the transition occurs between $\omega_c$ and $\omega_0/2$. Large
distortions occur in this band; therefore it cannot contain useful
components in $P^*(\omega)$. $\omega_c$ is effectively an upper limit for the usable
frequency of $P^*(\omega)$, which is less than the theoretical maximum $\omega_0/2$.
Typically $\omega_c = .8\omega_0/2$.
The spectrum $P^*(\omega)$ and hence $p^*(t)$ can be computed from Eq. 6 for
any smoothing filter $F(\omega)$ and any $P(\omega)$. Thus the error $p(t) - p^*(t)$ can
be computed. The calculation is complicated and is usually not worth
carrying out. Instead, either a physical feeling for the error is obtained
from a sketch such as Fig. 11 or bounds are computed for the error.
[Figure: sampling process, with inputs labeled "Voltage function p(t)" and "Impulses at rate R per second".]
2 Suggested by J. R. Pierce. This analysis is briefer than the preceding one and
may be easier to understand.
[Figure: spectrum plotted against frequency in Hz.]
p(t) contains frequencies higher than R/2, that is, if A(f) is not zero for
f larger than R/2, the sideband lying below the sampling rate R will
fall partly within the frequency range from 0 to R/2. The higher
frequencies of p(t) will have been folded over into the frequency range
from 0 to R/2.
Let us return to Fig. 12, which illustrates the sampling process. Here
we show the sampler (multiplier) followed by an amplifier of gain 1/R
and a smoothing filter whose purpose is to remove frequencies above
R/2 Hz.
Suppose first that p(t) contains no frequencies above R/2, and second
that the smoothing filter has zero loss for all frequencies below R/2
and infinite loss for all frequencies above R/2. Then from the preceding
analysis the output of the system should be exactly p(t).
That ideal performance can fail in two ways.
The voltage function p(t) may contain frequencies higher than R/2.
In that case, folded-over frequencies will appear in the frequency range
0 to R/2, even though the smoothing filter is ideal.
The voltage function p(t) may contain no frequencies higher than
R/2, but the smoothing filter may pass frequencies higher than R/2.
In that case, some folded-over frequencies above R/2 Hz will pass
through the smoothing filter.
In practice, we cannot make ideal smoothing filters. Rather, we count
on using frequencies only up to some cutoff frequency $f_c$, which is somewhat
less than R/2, and try to make the smoothing filter loss increase
rapidly enough with frequency above $f_c$ so that it passes little energy of
frequency above R/2.
Fig. 14. Constants for bounding the error of the sampling process: (a)
spectrum of signal; (b) transfer function of filter. [Panel (a) plots $|P(\omega)|$; panel (b) plots $|F(\omega)|$, with $\omega_c$, $\omega_0/2$, and $\omega_0$ marked on the frequency axis.]
Fig. 15. Examples of p(t) functions with differing foldover. [Panels (a) through (d), each plotted against time.]
The usable frequency range is from 0 to $\omega_c$; hence $\omega_c$ should approach
$\omega_0/2$.
Filter design and construction is a highly developed art. Typical
values that are easy to obtain in specially designed filters are c = .1
(1 dB in-band deviation), b = 1/1000 (60 dB out-of-band attenuation)
and $\omega_c = .8\omega_0/2$. General purpose filters or adjustable filters are not as
good but are more convenient to buy and use. It is always desirable to
have a flat in-band filter (c small). The importance of the out-of-band
attenuation depends on the sampling rate. At low rates (10,000 Hz),
Fig. 16. Sample and hold circuit: (a) impulse response; (b) frequency
function for D = T. [Panel (a) is plotted against t in seconds; panel (b) against $\omega$ in rad/sec.]
modulator is held for D seconds, thus producing a finite pulse. The
transfer function $H(\omega)$ of this filter can be written
$$H(\omega) = \int_0^D \frac{1}{D}\, e^{-j\omega t}\, dt$$
The amplitude of the impulse response is taken as $1/D$ to normalize the
low frequency gain of $H(\omega)$ to unity. Carrying out the integration,
$H(\omega)$ is evaluated as
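A worked sketch of that integration, derived directly from the integral defined above rather than taken from the source's own typeset result, assuming a rectangular pulse of height $1/D$ lasting from $t = 0$ to $t = D$:

$$H(\omega) = \frac{1}{D}\int_0^D e^{-j\omega t}\,dt = \frac{1 - e^{-j\omega D}}{j\omega D} = e^{-j\omega D/2}\,\frac{\sin(\omega D/2)}{\omega D/2}$$

The magnitude is $|\sin(\omega D/2)/(\omega D/2)|$, which equals 1 at $\omega = 0$ as the normalization requires. For $D = T$ the gain at the band edge $\omega = \omega_0/2 = \pi/T$ is $2/\pi \approx .64$, so under these assumptions a full-width hold attenuates the highest usable frequencies by nearly 4 dB.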
Fig. 17. Quantizing process: (a) function being quantized; (b) quantizing
error. [Both panels are plotted against time in T units.]
quantizing levels 0, 1, and 2 are clear. The exact values $p(iT)$ of $p(t)$
at the sampling times are indicated by open circles. The analog-to-digital
converter approximates these by the nearest quantizing level
shown by the black dots $p_q(iT)$. The difference $e_i$ where
$$e_i = p(iT) - p_q(iT) \tag{8}$$
is the quantizing error.
$$p^*(t) = T \sum_{i=-\infty}^{\infty} p(iT)\, f(t - iT)$$
where $f(t)$ is the impulse response of the smoothing filter and is related
to the filter frequency function by
$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} F(\omega)\, e^{j\omega t}\, d\omega$$
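For the ideal smoothing filter (unity for $|\omega| < \omega_0/2$, zero elsewhere) the impulse response works out to $T f(t) = \sin(\pi t/T)/(\pi t/T)$, so the sum above becomes a sum of shifted sinc pulses. The sketch below evaluates a truncated version of that sum; the truncation length `n_terms`, the function names, and the test signal are all assumptions made for illustration, and the result is only approximate because the true sum is infinite.

```python
import math


def sinc(x):
    """sin(pi x) / (pi x), with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(p, rate_hz, t, n_terms=2000):
    """Approximate p*(t) from samples p(iT) using the ideal-filter impulse response."""
    T = 1.0 / rate_hz
    center = round(t / T)                   # sum over samples near the time of interest
    total = 0.0
    for i in range(center - n_terms, center + n_terms + 1):
        total += p(i * T) * sinc((t - i * T) / T)
    return total


def tone(t):
    """A 1000-Hz sine, well below half of the 10,000-Hz sampling rate used here."""
    return math.sin(2 * math.pi * 1000 * t)


t_between_samples = 0.00013
error = abs(reconstruct(tone, 10_000, t_between_samples) - tone(t_between_samples))
print(error)
```

The reconstructed value between sample times agrees with the original sine to within the truncation error, which shrinks as more terms are included.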
If the quantized samples $p_q(iT)$ are used as input to the impulse modulator,
then the output $p_q^*(t)$ is
$$e_q(t) = T \sum_{i=-\infty}^{\infty} \{p(iT) - p_q(iT)\}\, f(t - iT) \tag{9}$$
$$e_q(t) = T \sum_{i=-\infty}^{\infty} e_i\, f(t - iT)$$
and is
(10)
$$\overline{e^2} = \int_{-1/2}^{1/2} x^2\, dx = \frac{1}{12}$$
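The value 1/12 can be checked by quantizing a large set of evenly spread amplitudes and averaging the squared errors. This is a sketch under the assumption that the error is uniformly distributed between -1/2 and 1/2 of a quantizing step; the function name and the choice of test amplitudes are illustrative.

```python
import math


def mean_square_quantizing_error(n_points=10_000, span=10.0):
    """Average squared rounding error over amplitudes spread evenly across `span` levels."""
    total = 0.0
    for k in range(n_points):
        value = (k + 0.5) * span / n_points     # evenly spaced test amplitudes
        quantized = math.floor(value + 0.5)     # nearest quantizing level
        error = value - quantized
        total += error * error
    return total / n_points


measured = mean_square_quantizing_error()
print(measured, 1 / 12)
```

The measured average agrees with 1/12 to several decimal places.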
For the ideal smoothing filter, $F(\omega) = 1$ for $|\omega| < \omega_0/2$ and $F(\omega) = 0$
for $|\omega| \ge \omega_0/2$, the energy in $\Phi_q(\omega)$ is uniformly distributed over the
frequency band $-\omega_0/2$ to $\omega_0/2$. The mean-square quantizing error
$$= \int_{-\omega_0/2}^{\omega_0/2} \frac{T}{2\pi}\, \overline{e^2}\, d\omega \tag{12}$$
At the input to the converter, the five digits that make up the number
are represented by the voltages on five lines going to the switch controls
$S_4 \ldots S_0$. A "1" is represented by a positive voltage and "0" by a
negative voltage. The switch controls close their attached switch if they
have a positive input and open it with a negative input.
[Figure: digital-to-analog converter, with digital input and analog output.]
The resistor network embodies the sum given above. The resistors
are chosen to be inversely proportional to powers of 2. If $F_i$ is a
switching function that is 0 if $S_i$ is open, and 1 if $S_i$ is closed, then
$$I = E_R \{F_4 \cdot 16 + F_3 \cdot 8 + F_2 \cdot 4 + F_1 \cdot 2 + F_0 \cdot 1\}$$
Thus I is the analog equivalent of the digital input. The constant of
proportionality is determined by the reference voltage $E_R$. The current-to-voltage
amplifier generates an output voltage Eo which is proportional
to I.
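The weighted sum can be sketched directly in code. The function name `dac_current` and the default reference value are assumptions for illustration; the weights 16, 8, 4, 2, 1 are those of the five-digit converter described above.

```python
WEIGHTS = (16, 8, 4, 2, 1)  # contributions of switches S4, S3, S2, S1, S0


def dac_current(switch_states, e_ref=1.0):
    """Return I for switch states given most significant first (1 = closed, 0 = open)."""
    if len(switch_states) != len(WEIGHTS):
        raise ValueError("expected five switch states, S4 through S0")
    return e_ref * sum(f * w for f, w in zip(switch_states, WEIGHTS))


full_scale = dac_current([1, 1, 1, 1, 1])
example = dac_current([1, 0, 1, 1, 0])
print(full_scale, example)
```

With a unit reference, the digits 10110 give a current of 22 and all ones give the full-scale value 31. Adding more digits only means extending the list of weights.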
In an actual converter, the switches would be transistors, the switch
controls would be flip-flop registers, the current-to-voltage amplifier
would be an operational amplifier, and the resistors would have values
measured in thousands of ohms. Higher accuracy and more digits are
obtained simply by adding more switches and resistors. Thus an
actual converter is not much more complicated than the simple device
we have described.
An analog-to-digital converter is more complicated. Most involve a
digital-to-analog converter plus a feedback mechanism. The exact
operation differs for different converters, but one widely used pro-
cedure is sketched in Fig. 19. The digital-to-analog converter that it
contains can be made in the way that has been described. The compli-
cated part is the programmer, which is effectively a small computer. A
conversion is made in a sequence of steps. The analog voltage to be
converted is applied to the analog input terminal. The programmer
initially sets all the digits $S_4 \ldots S_0$ equal to zero. Digit $S_4$ is set to "1"
Fig. 19. Analog-to-digital converter. [Block diagram with the elements: analog input, digital-to-analog converter (output E2), comparer, programmer, digits S4 S3 S2 S1 S0, digital output.]
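One common form of such a feedback procedure is successive approximation. The sketch below is an assumption about how the steps proceed, not a transcription of the programmer in Fig. 19: starting from all zeros, each digit from the most significant downward is tried as "1" and kept only if the digital-to-analog output does not exceed the analog input. The function name and parameters are hypothetical.

```python
def successive_approximation(analog_in, n_bits=5, e_ref=1.0):
    """Return the largest n_bits code whose converter output does not exceed analog_in."""
    code = 0                                   # all digits start at zero
    for bit in reversed(range(n_bits)):        # S4 first, S0 last
        trial = code | (1 << bit)              # tentatively set this digit to 1
        if e_ref * trial <= analog_in:         # comparer: converter output vs. input
            code = trial                       # keep the 1; otherwise it stays 0
    return code


result = successive_approximation(22.3)
print(result, format(result, "05b"))
```

Five comparisons are enough to settle a five-digit number, one per digit, which is why this kind of converter needs only a small programmer.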
change state. If the most significant digit is slightly faster than the other
digits, the actual sequence will be 0111 1111 1000. The analog output
resulting from the correct and erroneous sequence is shown in Fig. 20.
It is clear that a large error is made momentarily. The error is difficult
to observe because it depends on the signal, that is, it depends on
transitions between particular levels, and it occurs very briefly.
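The size of the momentary error can be illustrated with a small model in which only the most significant digit switches early. The function name `transition_levels` is hypothetical, and the model ignores how long the intermediate state lasts.

```python
def transition_levels(old_code, new_code, n_bits=4):
    """Return [old, intermediate, new] when the top digit changes before the others."""
    msb = 1 << (n_bits - 1)
    # The top digit already has its new value; the lower digits still hold old values.
    intermediate = (old_code & ~msb) | (new_code & msb)
    return [old_code, intermediate, new_code]


levels = transition_levels(0b0111, 0b1000)
print(levels, [format(v, "04b") for v in levels])
```

A step from 7 to 8 passes briefly through 15, so the momentary error is seven times larger than the intended change.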
[Figure: analog output against time, with levels 0, 8, and 16 on the vertical axis and the codes 0111 and 1111 marked.]
The error can be avoided in two ways. The switches can be carefully
adjusted to have the same operating speed. A good commercial con-
verter is usually satisfactory in this respect, whereas converters
assembled from computer cards may need adjustment. Secondly, a
[Figure: digital-to-analog converter output and an On/Off signal plotted against time.]
Smoothing-Filter Design
Filter design and construction is a highly developed science and art.
Satisfactory smoothing filters can be either built or purchased. They can
be of special design or of a standard type, or they can be variable with
knob-controlled cutoff frequency. Consulting a filter expert is the best
way to get just the right filter for a particular application. However, we
will give instructions for building one smoothing filter that has been
used for several years and is not too complicated.
The filter transfer function and circuit are shown on Fig. 22. 3 The
version shown is intended for a 20-KHz sampling rate. It has less than
1 dB loss over the band 0 to 8 KHz. It has 60 dB or greater loss for all
frequencies above 10 KHz. The filter is not corrected for phase and will
distort the waveform of some signals. The phase change is less than that
introduced by any tape recorder and is almost always inaudible.
In constructing the filter, the components should be adjusted to be
within 1 percent of the values shown. An impedance bridge is used for
the adjustment. Capacitors can be adjusted by obtaining one that is
just under the desired value and adding a small capacitor in parallel.
Inductors can be adjusted by obtaining an inductor just larger than the
desired value and unwinding a few turns of wire. High-Q inductors of
good quality should be used, for example, those with toroidal or ferrite
cores. The resistors are part of the source and load impedances and are
usually not built into the filter.
[Figure: filter gain in dB, from 5 down to -80, plotted against frequency in KHz.]
Fig. 22. Smoothing-filter circuit and transfer function. The filter has a
dc gain of 1/2, which is not shown on the curve. Element values in KΩ, H,
and μF.
Filters for other sampling rates can be built from this design by
changing the values of the inductors and capacitors according to the
equations
$$C' = C \cdot 20{,}000 / f_s$$
$$L' = L \cdot 20{,}000 / f_s$$
where C and L stand for the element values in the original design, C'
and L' stand for the element values in the frequency-scaled design, and
$f_s$ is the new sampling rate. For example, a 10-KHz sampling rate is
accommodated by doubling all inductors and capacitors.
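The scaling rule is the same for every inductor and capacitor, so it can be sketched as a single function. The element values passed in below are placeholders, not values from Fig. 22, and the name `scale_element` is hypothetical.

```python
DESIGN_RATE_HZ = 20_000  # sampling rate of the original design


def scale_element(value, new_rate_hz, design_rate_hz=DESIGN_RATE_HZ):
    """Scale one inductor or capacitor value for a different sampling rate."""
    return value * design_rate_hz / new_rate_hz


# Placeholder element value of 1.0 (in whatever unit the design uses).
at_10_khz = scale_element(1.0, 10_000)
at_40_khz = scale_element(1.0, 40_000)
print(at_10_khz, at_40_khz)
```

Halving the sampling rate doubles every element, as the text states, and doubling the rate halves them. The resistors are left unchanged, since only L and C appear in the equations.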
As is shown on the circuit, the filter is designed to be driven by a
5-KΩ source impedance and to drive a 5-KΩ load. These impedances
are not critical. The source impedance may vary from 2 KΩ to 5 KΩ,
Fig. 23. Sample of digital magnetic tape showing record gaps. [Labels: record gap; record of data.]
heads are initially positioned at the first record gap. The tape is started,
one record of data is transmitted, and the tape is stopped with the
4 Magnetic disk recording is also possible but has little advantage over tape
since the sound samples are in such an orderly sequence.
[Figure: block diagram with the elements: digital-to-analog converter, analog output, start-stop control, "put out a sample" signal, sampling-rate oscillator.]
The design must take these facts into consideration: larger buffers cost
more, longer records yield higher maximum rates because of fewer
record gaps, the tape must be started soon enough to avoid emptying the
buffer at the highest sampling rate, the buffer must be large enough
never to overfill at the lowest sampling rate. A design is a proper com-
promise between these factors. Although we have only discussed a
digital-to-analog conversion system, the analog-to-digital process
requires the same buffer and works in a basically similar manner.
The digital tape controller that we have described is rather expensive
and complicated to build. Often the computer itself makes a more
attractive tape controller. A schematic diagram is shown in Fig. 25.
[Figure: schematic with the elements: external data connection, computer plus program, digital-to-analog converter, sampling-rate oscillator.]
programs for analysis have been developed. Rather, many different pro-
grams have been written for particular tasks. For synthesis, one
program, which developed through five stages, Music I through Music V, has
proved generally useful. Hence we will present here the fundamental
considerations that led to Music V, and in the next chapters details
intended to teach a user of Music V. However, the material should be of
value not only to users of Music V, but to anyone writing a sound-
synthesis program.
The two fundamental problems in sound synthesis are (1) the vast
amount of data needed to specify a pressure function (hence the
necessity of a very fast and efficient computer program) and (2) the
need for a simple, powerful language in which to describe a complex
sequence of sounds. Our solution to these problems involves three
principles: (1) stored functions to speed computation, (2) unit-generator
building blocks for sound-synthesizing instruments to provide great
flexibility, and (3) the note concept for describing sound sequences. Let
us next consider sound synthesis from the computer's and the
composer's standpoints to see the importance of these principles.
To specify a pressure function at a sampling rate of 30 KHz, one
number is needed every 33 μsec. That speed strains even the fastest
computers. A useful measure of computation is the time scale, which is
defined as
$$\text{Time scale} = \frac{\text{time to compute samples of a sound}}{\text{duration of the sound}}$$
Various possibilities exist at various time scales. If the time scale is
equal to 1 or less, a digital-to-analog converter can be attached directly
to the computer and sound can be synthesized in real time. This allows
improvising on the computer, hearing the sound as one pushes the
computer keys in the same way that one hears sound from a piano.
Fast current computers add two numbers in about 3 μsec and multiply
two numbers in about 30 μsec. Hence the computations for each
sample for real-time synthesis must be few indeed. However, real-time
synthesis is a powerful way of adjusting sound parameters to achieve
a particular timbre or effect. In addition, it allows the computer to be
used as a performing instrument. Hence it is an important objective.
Time scales greater than 1 necessitate recording the samples on a
digital magnetic tape, rewinding the tape, and playing the tape through
the converter. A delay equal to or greater than the sound duration is
inherent in the process. Time scales from 1 to 50 are eminently usable.
At 50, a delay of an hour is needed to compute one minute of sound.
An hour seems long if you personally are waiting for the computer; it is
nothing if you are at home sleeping while the night shift runs the
problem. At a scale of 50, 1600 μsec are available to compute each
sample. Fifty multiplications or several hundred additions can be
carried out in that time. Although much can be done, that number of
computations does not represent a copious supply, and it must be used
effectively.
Time scales from 50 to 1000 become so time consuming and expensive
that even the most reckless experimenter pauses to consider whether
the value of his sounds justifies the time and money. At a scale of
1000, 20 minutes of computer time are needed for each second of sound.
It must be a remarkable second to make this effort seem worth while.
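The figures quoted for the various time scales follow from the definition. The sketch below assumes the 30-KHz sampling rate used in this discussion; the function names are illustrative. The exact results are slightly different from the rounded values in the text.

```python
RATE_HZ = 30_000  # sampling rate assumed in this discussion


def compute_time_seconds(sound_seconds, time_scale):
    """Computer time needed for a sound of the given duration."""
    return sound_seconds * time_scale


def microseconds_per_sample(time_scale, rate_hz=RATE_HZ):
    """Computation time available for each sample at a given time scale."""
    return time_scale * 1e6 / rate_hz


minutes_for_one_minute_at_50 = compute_time_seconds(60, 50) / 60
usec_per_sample_at_50 = microseconds_per_sample(50)
minutes_for_one_second_at_1000 = compute_time_seconds(1, 1000) / 60
print(minutes_for_one_minute_at_50, usec_per_sample_at_50, minutes_for_one_second_at_1000)
```

At a scale of 50, one minute of sound takes 50 minutes (roughly the hour mentioned above) and each sample has about 1667 μsec, which the text rounds to 1600. At a scale of 1000, one second of sound takes about 17 minutes, which the text rounds to 20.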
One way of speeding the effective computation is to store samples in
the computer memory, when possible, and to read these samples from
memory rather than recompute them. Reading from memory is rapid.
The process works only for samples or factors of samples that are
repetitive. Fortunately, many sounds have highly repetitive com-
ponents. For example, an oscillator repeats the same waveform each
cycle. The shape of a cycle's wave may be very complicated, but once it
is computed and stored, it can be read out as rapidly as any simple
function. Many other factors can be reduced to repetitive stored
functions.
The cost of stored functions is memory space. In Music V a typical
function is stored as 512 samples, and the largest part of the memory
is used for storing functions. The cost is more than justified by the time
saved.
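The stored-function idea can be sketched as a table-lookup oscillator. This is not the Music V oscillator itself; it is a minimal illustration that assumes a 512-entry table, truncation of the phase to the nearest lower table entry, and a sine as the stored waveform. All names are hypothetical.

```python
import math

TABLE_SIZE = 512  # typical function length mentioned in the text


def make_sine_table(size=TABLE_SIZE):
    """Compute one cycle of the waveform once and store it."""
    return [math.sin(2 * math.pi * k / size) for k in range(size)]


def oscillator(table, freq_hz, rate_hz, n_samples, amplitude=1.0):
    """Generate samples by stepping through the stored cycle instead of recomputing it."""
    increment = freq_hz * len(table) / rate_hz   # table entries advanced per sample
    phase = 0.0
    samples = []
    for _ in range(n_samples):
        samples.append(amplitude * table[int(phase)])  # a memory read, no sine call
        phase = (phase + increment) % len(table)       # wrap around at the end of the cycle
    return samples


table = make_sine_table()
output = oscillator(table, freq_hz=3000, rate_hz=30_000, n_samples=10)
print([round(x, 3) for x in output])
```

Each output sample costs one addition, one wraparound, and one memory read, however complicated the stored cycle is. The price is a small error from truncating the phase, bounded here by one table step.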
We have considered sound synthesis from the position of the
computer and it has led us to stored functions. Now let us look from
the composer's standpoint. He would like to have a very powerful and
flexible language in which he can specify any sequence of sounds. At
the same time he would like a very simple language in which much can
be said in a few words, that is, one in which much sound can be
described with little work. The most powerful and universal possibility
would be to write each of the millions of samples of the pressure wave
directly. This is unthinkable. At the other extreme, the computer
could operate like a piano, producing one and only one sound each time
one of 88 numbers was inserted. This would be an expensive way to
build a piano. The unit-generator building blocks make it possible for
the composer to have the best of both of these extremes.
With unit generators the composer can construct, with a simple
procedure, his own sound-synthesizing program. In Music V it is called
card are inserted into the appropriate instrument and the instrument is
turned on for the duration of the note.
To summarize, the complete program with three passes, stored
functions, unit generators, and instruments was evolved over several
years. It is not a unique way of synthesizing sound samples; other
equivalent programs could be written. However, it does provide great
speed and great flexibility by the careful use of a general compiling
language (F0RTRAN) plus certain machine language subroutines.
Sampling
5. A waveform p(t) where
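\[
\begin{aligned}
p(t) ={} & 100 \sin(2\pi \cdot 2100\,t) + 50 \sin(2\pi \cdot 4200\,t) \\
& + 33 \sin(2\pi \cdot 6300\,t) + 25 \sin(2\pi \cdot 8400\,t) \\
& + 20 \sin(2\pi \cdot 10{,}500\,t) + 17 \sin(2\pi \cdot 12{,}600\,t) \\
& + 14 \sin(2\pi \cdot 14{,}700\,t) + 12 \sin(2\pi \cdot 16{,}800\,t) \\
& + 11 \sin(2\pi \cdot 18{,}900\,t) + 10 \sin(2\pi \cdot 21{,}000\,t)
\end{aligned}
\]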
is subjected to a sampling and desampling process as shown in Fig. 6.
The sampling rate is 19 kHz. The desampled output is p*(t).
(a) What is the highest frequency component that can be faithfully
reproduced in p*(t) at this sampling rate? Call this component and all
lower components the desired components. Give the amplitudes and
frequencies of components of p*(t) with
(b) no smoothing filter
(c) a filter with the frequency function
[Figure: filter gain in dB (axis ticks +3, 0, -20, -40) versus frequency in kHz (axis ticks 0, 8, 9, 20).]
(d) Also give the amplitudes and frequencies of the components of p(t).
(e) What is the lowest frequency component in p(t)? What is p(t)'s
period? What is its pitch?
(f) What is the lowest frequency component in p*(t) with no smoothing
filter? What is p*(t)'s period?
(g) How much are the desired components in p*(t) changed by the
filter? Which desired component is most changed?
(h) Which distortion components in the "range of perception" (0-15
kHz) are reduced by the filter? Which are relatively unaffected?
(i) What distortion components that are folded about 38 kHz fall in the
range of perception?
(j) With no filter, what is the maximum frequency that can be reproduced
without causing a distortion component in the range of perception?
(k) With the filter, what is the maximum frequency that can be repro-
duced without causing a distortion component in the range of perception
(assume filter has infinite attenuation for frequencies greater than 9 kHz)?
6. Samples at a 20-kHz rate are computed for the waveform
. [21T • (60,000.
f( t) = sme 30 t) . t ]
. [2 (30,000(20
+ sme 1T. 30 - t») . t]
The sound is desampled with an impulse desampler and no filter. Describe
the amplitudes and frequencies of the components that fall within the
range of perception (0 to 15 kHz); t goes from 0 to 30 sec.
Analog-Digital Conversion
7. Calculate the tolerance on the resistors in the digital-to-analog
converter shown in Fig. 18 so that the maximum error due to any one
resistor is to of one quantizing level. Give tolerance in terms of both
absolute accuracy and percent accuracy. Which resistor must have the best
percent accuracy?
Introduction
This chapter is intended to provide a training course in the use of
Music V by discussing a series of examples ranging from simple to
complex sound synthesis. It is written from the point of view of the
user of Music V. Details of operation of the programs will be suppressed
as much as possible. These can be found in Chapter 3. Because the
programs will not be described here, many of the conventions of the
computer score will seem arbitrary and must be temporarily accepted
on faith.
For concreteness we will also arbitrarily assume values for certain
parameters of the program, for example, a sampling rate of R =
20,000 Hz. Other parameters will be introduced as required. For the
student's benefit, the parameters of the training orchestra are listed at
the beginning of the problems for Chapter 2.
The material assumes that the student has a working knowledge of
F0RTRAN programming. The programming examples will be written
in F0RTRAN IV. It is also assumed that the student understands the
general functioning of a computer (arithmetic, memory, input-output,
and program). If necessary, these skills can be learned from books cited
in the references at the end of Chapter 2.
This chapter is intended as training material and not as a reference
44 CHAPTER TWO
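[Fig. 27, panels (a)-(c): conventional score; block diagram of an 0SC with inputs P5 and P6 and function F2, feeding 0UT through block B2; waveform F2 plotted over locations 0 to 511.]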
1 INS 0 1 ;
2 0SC P5 P6 B2 F2 P30 ;
3 0UT B2 B1 ;
4 END;
5 GEN 0 1 2 0 0 .999 50 .999 205 -.999 306 -.999 461 0 511 ;
6 N0T 0 1 .50 125 8.45 ;
7 N0T .75 1 .17 250 8.45 ;
8 N0T 1.00 1 .50 500 8.45 ;
9 N0T 1.75 1 .17 1000 8.93 ;
10 N0T 2.00 1 .95 2000 10.04 ;
11 N0T 3.00 1 .95 1000 8.45 ;
12 N0T 4.00 1 .50 500 8.93 ;
13 N0T 4.75 1 .17 500 8.93 ;
14 N0T 5.00 1 .50 700 8.93 ;
15 N0T 5.75 1 .17 1000 13.39 ;
16 N0T 6.00 1 1.95 2000 12.65 ;
17 TER 8.00 ;
(d)
Fig. 27. Elementary orchestra and score: (a) conventional score; (b)
instrument block diagram; (c) waveform; (d) computer score.
46 CHAPTER TWO
I-0 Blocks
I-0 blocks are short for unit generator input-output blocks. They
can be used as storage locations for either inputs or outputs for unit
generators, hence the designation input-output blocks. Blocks are
designated B1 through B10 in the training orchestra. Block B1 has the
special function of storing the numbers that will be sent to the digital-
to-analog converter. All other blocks are equivalent in mono. (In
stereo, blocks B1 and B2 are both reserved for output.)
The size of the block is a parameter of the orchestra. In the training
orchestra, it has been set at 512. The maximum size of numbers in the
I-0 blocks is another program parameter. In the training orchestra it
has been set at ±2047, which is appropriate for a 12-bit digital-to-analog
converter.
AD2 Generator
The simplest generator is the two-input adder, AD2. Its function is to
combine two numbers by addition. It has two inputs and one output
as shown in Fig. 28a. The equation of operation is
\[ O_i = \mathrm{I1}_i + \mathrm{I2}_i \]
where I1 and I2 are the two inputs, O is the output, and i is the index of
samples that starts at 0 at time t = 0. We must quickly add that this
equation is computed only for those samples during which the
instrument with AD2 is playing a note.
In the score, AD2 is put in an instrument by a statement such as
AD2 B2 B4 B3 ;
This example says: take the numbers stored in block B2, add them to
those stored in block B4, and put the sum in block B3. The relation
between sample index i and the numbers in a given block at a given time
need not worry the user; it is treated automatically by the program.
Fig. 28. Four simple unit generators: (a) AD2; (b) 0UT; (c) MLT;
(d) 0SC.
AD3 and AD4 also exist and form a sum of three and four inputs,
respectively. The score statement evoking AD4 would be
AD4 B2 B3 B4 B5 B6 ;
where B2 through B5 are inputs and B6 the output.
0UT Generator
The 0UT generator takes the numbers from an instrument and
places them in the special I-0 block B1 for subsequent outputting
through the digital-to-analog converter. 0UT also combines the
numbers with any other instrument simultaneously being played. 0UT
is diagrammed in Fig. 28b. It is shown with one input. The output to
B1 is not shown; it always goes to this block. The equation of operation
is, in F0RTRAN-like notation,
Acoustic output_i = acoustic output_i + I1_i
This equation says: I1 is added to anything previously in the acoustic
output block; by this simple means any number of instruments may be
combined. The operation of addition is perfectly equivalent to the
A SEQUENCE OF TUTORIAL EXAMPLES 49
MLT Generator
The MLT generator multiplies two numbers together in a manner
exactly analogous to the addition done by AD2. It is diagrammed in
Fig. 28c. The equation of operation is
\[ O_i = \mathrm{I1}_i \cdot \mathrm{I2}_i \]
where I1 and I2 are the two inputs and O is the output. In the score
MLT B2 B3 B4 ;
associates I1 with B2, I2 with B3, and O with B4. In general, the order
of listing generator descriptions on the score is: inputs, outputs, special
parameters.
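The three generators described so far can be sketched as operations on blocks of samples. The Python below is a minimal illustration, assuming blocks are plain lists; the function names and the four-sample blocks are hypothetical conveniences, and the real program handles block size and sample indexing itself.

```python
def ad2(i1, i2):
    """AD2: O_i = I1_i + I2_i."""
    return [a + b for a, b in zip(i1, i2)]


def mlt(i1, i2):
    """MLT: O_i = I1_i * I2_i."""
    return [a * b for a, b in zip(i1, i2)]


def out(i1, b1):
    """0UT: add the input into the output block B1, in place."""
    for i, sample in enumerate(i1):
        b1[i] += sample


# Two hypothetical instruments playing at once, four samples per block.
b1 = [0.0, 0.0, 0.0, 0.0]
out(mlt([1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 10.0]), b1)
out(ad2([1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]), b1)
```

Because `out` accumulates rather than overwrites, any number of instruments can write to the same output block, which is how simultaneous voices are combined.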
0SC Generator
By far the most important generator is the oscillator 0SC. It is the
most frequently used and the most difficult to understand of the simple
generators. Its importance is based on the prominence of oscillations in
musical sounds and on its nature as a source of numbers. The generators
previously described modify or output numbers that have been created
elsewhere; 0SC is one of the few units that actually produce numbers.
The diagram of 0SC is presented in Fig. 28d. As will be shown,
three quantities determine the output O: I1 controls the amplitude of the
oscillation; I2 controls the frequency; and Fn, a stored function, is the
waveform. Fn is exactly one cycle of the 0SC output; the purpose of
the 0SC can be looked upon as repeating Fn at the desired frequency
and amplitude.
Fn may be thought of as a continuous function of time, but in the
computer it must be represented by a block of samples. In the training
orchestra each function is represented by 512 samples. Figure 29 shows
an example of a stored function F3. The waveform is a square wave
with slightly slanted sides. The 512 points, F3(k), k = 0 ... 511, are
The simplest 0SC program would simply repeat the 511 numbers in
F3, one after the other: F3(0), F3(1), ... , F3(511), F3(1), .... This
would produce an oscillation whose peak amplitude would be 1 and
whose frequency would be 20,000/511 = 39.14 Hz. That frequency is
too low for most purposes. By repeating every other sample, F3(1),
F3(3), ... , F3(511), F3(2), ... , one could produce a higher frequency,
78.28 Hz. In general, by repeating every nth sample of F3, one obtains
a frequency of
\[ \frac{20{,}000}{511} \cdot n \ \text{Hz} \]
This is the fundamental relation between the frequency of 0SC and I2.
It can be written
Freq = 39.14 · I2
I2 = .02555 · freq
More generally,
I2 = NF · freq / R
P6 = (511 / 20,000) · freq
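The relation between frequency and increment can be tried out with a small table-lookup oscillator. The Python below is a sketch under stated assumptions, not the Music V 0SC: it truncates the running sum to pick a table entry, wraps the sum at 511, and uses a sine table as a stand-in waveform. The names `increment`, `osc`, and `SINE` are hypothetical.

```python
import math

R = 20000.0  # sampling rate assumed in the text
NF = 511     # period of the stored function in samples


def increment(freq, rate=R, nf=NF):
    """I2 for a desired frequency: I2 = NF * freq / R."""
    return nf * freq / rate


def osc(amplitude, incr, table, nsamples, phase=0.0):
    """Read `table` at a running sum advanced by `incr` per sample."""
    samples = []
    for _ in range(nsamples):
        samples.append(amplitude * table[int(phase)])
        phase += incr
        while phase >= NF:  # wrap so the function repeats
            phase -= NF
    return samples, phase


# Stand-in stored function: one sine cycle over locations 0 to 511.
SINE = [math.sin(2 * math.pi * k / NF) for k in range(NF + 1)]

one_second, _ = osc(1.0, increment(1000.0), SINE, 20000)
```

With an increment of 1 the oscillator steps through every stored sample and repeats every 511 samples, giving the 39.14-Hz case; larger increments skip samples and raise the frequency in proportion.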
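[Fig. 31, panels (a)-(d): block diagram with inputs P5, P6, and P7; envelope function F1 with breakpoints at 0, 20, 491, and 511; waveform function F2; conventional score at quarter note = 60.]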
1 INS 0 1 ;
2 0SC P5 P6 B2 Fl P30 ;
3 0SC B2 P7 B2 F2 P29 ;
4 0UT B2 B1 ;
5 END;
6 GEN 0 1 1 0 0 .99 20 .99 491 0 511 ;
7 GEN 0 1 2 0 0 .99 50 .99 205 -.99 306 -.99 461 0 511 ;
8 N0T 0 1 2 1000 .0128 6.70 ;
9 N0T 2 1 1 1000 .0256 8.44 ;
10 TER 3 ;
(e)
Fig. 31. Instrument with attack and decay: (a) block diagram; (b) envelope
function; (c) waveform function; (d) conventional score; (e) computer
score; (f) pianolike envelope; (g) brasslike envelope.
1 INS 0 2 ;
2 0SC P5 P6 B2 Fl P30 ;
3 0SC P8 P9 B3 F3 P29 ;
4 AD2 P7 B3 B3 ;
5 0SC B2 B3 B2 F2 P28 ;
6 0UT B2 B1 ;
7 END;
8 GEN 0 1 1 0 0 .99 20 .99 491 0 511 ;
9 GEN 0 1 2 0 0 .99 50 .99 205 -.99 306 -.99 461 0 511 ;
10 GEN 0 2 3 1 1 ;
11 N0T 0 2 2 1000 .0128 6.70 .067 .205 ;
12 N0T 2 2 1 1000 .0256 8.44 .084 .205 ;
13 TER 3 ;
10' GEN 0 1 3 0 0 .999 511 ;
11' N0T 0 2 2 1000 .0128 6.70 4.55 .0128 ;
12' N0T 2 2 1 1000 .0256 11.25 0 .0256 ;
(d)
Fig. 32. Instruments with vibrato or glissando: (a) block diagram; (b) F3
and score for vibrato; (c) F3 and score for glissando; (d) computer score.
With a change in F3, and the meaning of P7, P8, and P9, the same
instrument can also be used for glissando. An F3 consisting of a
straight "interpolating" function appropriate for glissando is shown in
Fig. 32c. P9 now becomes
P9 = .0255 / duration of note
and causes 0SC #2 to produce one cycle per note (the same as 0SC #1).
P7 is set at
P7 = .0255 x initial note frequency
and P8 at
P8 = .0255 x (final note frequency - initial note frequency)
The action of AD2 and 0SC #2 with F3 is such that at the beginning of
the note B3 will contain .0255 x initial note frequency, and at the end
of the note it will contain .0255 x final note frequency.
Substitution of cards 10', 11', and 12' into the score in place of
cards 10, 11, and 12 will produce the glissando sample shown. Note
that for the second note (A440), which has a constant frequency, P8
is equal to 0 since the initial and final frequencies are the same. P6 and
P9 have the same values, and hence P9 could be eliminated if the
instrument were redefined.
The glissando obtained in this way has a linear change of frequency in
hertz. This means that the musical intervals will change faster at the
beginning of the slide than at the end. Although a linear change of
musical intervals might be preferable, this glissando has been much
used and seems perfectly satisfactory. During most slides, listeners are
insensitive to the precise time course of the pitch.
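The glissando parameters, and the remark that a linear sweep in hertz covers musical intervals unevenly, can be checked numerically. The Python below is a sketch assuming a 20,000-Hz sampling rate and the unrounded constant 511/20,000 (the text rounds it to .0255); the function names are hypothetical. The 262-to-440-Hz slide corresponds to card 11' of the score.

```python
import math

R = 20000.0
NF = 511.0
K = NF / R  # 0.02555; the text rounds this to .0255


def glissando_parameters(f_initial, f_final, duration):
    """Return (P7, P8, P9) for a slide from f_initial to f_final."""
    p7 = K * f_initial
    p8 = K * (f_final - f_initial)
    p9 = K / duration
    return p7, p8, p9


def interval_in_semitones(f_from, f_to):
    """Musical interval between two frequencies, in equal-tempered semitones."""
    return 12.0 * math.log2(f_to / f_from)


p7, p8, p9 = glissando_parameters(262.0, 440.0, 2.0)

# Halfway through the note a linear sweep has reached the midpoint in hertz.
midpoint = 262.0 + 0.5 * (440.0 - 262.0)
first_half = interval_in_semitones(262.0, midpoint)
second_half = interval_in_semitones(midpoint, 440.0)
```

For this upward slide the first half of the note covers a larger interval than the second half, which is the unevenness the text describes.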
[Figure: block diagram of instrument 3 with inputs P5, P7, P6, P7, and P8; functions F3 and F4 are plotted over locations 0 to 511.]
1 INS 0 3 ;
2 0SC P5 P7 B2 F3 P30 ;
3 0SC P6 P7 B3 F4 P29 ;
4 AD2 B2 B3 B2 ;
5 0SC B2 P8 B2 F2 V1 ;
6 0UT B2 B1 ;
7 END;
8 GEN 0 1 3 .999 0 0 511 ;
9 GEN 0 1 4 0 0 .999 511 ;
10 GEN 0 1 2 0 0 .99 50 .99 205 -.99 306 -.99 461 0 511 ;
11 N0T 0 3 2 0 2000 .0128 6.70 ;
12 N0T 2 3 1 2000 0 .0256 6.70 ;
13 TER 3 ;
written in P5 and the final amplitude in P6. 0SC #1 and 0SC #2 both
generate one cycle per note of waveforms F3 and F4, respectively. F3
goes linearly from 1 to 0 over the course of a note and is multiplied by
the initial amplitude in 0SC #1. Similarly, F4 goes from 0 to 1 and is
multiplied by the final amplitude. Thus the output of AD2 will proceed
linearly from the initial amplitude to the final amplitude.
Records 11 and 12 in the score play what amounts to a single note
made up of two notes tied together. The first note swells from 0 to
maximum amplitude, the second decays back to zero. Amplitude
controls in P5 and P6 are obvious. P7 is set to produce one cycle per
note in both 0SC #1 and 0SC #2.
[Figure: block diagram of instrument 4.]
1 INS 0 4 ;
2 0SC P5 P7 B2 F3 P30 ;
3 0SC P6 P7 B3 F4 P29 ;
4 AD2 B2 B3 B2 ;
5 MLT B2 V1 B3 ;
6 AD2 B3 V2 B3 ;
7 MLT P8 V3 B4 ;
8 0SC B4 V4 B4 F5 P28 ;
9 AD2 P8 B4 B4 ;
10 AD2 B4 V5 B5 ;
11 0SC B3 B5 B5 F2 V7 ;
12 0SC B2 B4 B4 F1 V8 ;
13 MLT B2 B4 B4 ;
14 MLT B4 V6 B4 ;
15 0UT B4 B1 ;
16 END;
SUBR0UTINE C0NVT
C0MM0N IP, P, G
DIMENSI0N IP(10), P(100), G(1000)
IF (P(1) - 1.0) 102, 100, 102 2
100 IF (P(3) - 1.0) 102, 101, 102 3
101 P(5) = 10.0 ** (P(5)/20.0) 4
P(7) = 511.0 * P(6)/G(4)
P(6) = 511.0/(P(4) * G(4))
IP(1) = 7 5
102 RETURN
END
Notes
1. The data-record parameters P1-P100 have been placed by Pass II
in P(1)-P(100). The IP array contains some pertinent fixed-point
2 Equations relating to programs will usually be written in a F0RTRAN-like
notation.
With this C0NVT function the score lines to play the two notes on
Fig. 31d (equivalent to lines 8 and 9 on Fig. 31e) are
N0T 0 1 2 60 262 ;
N0T 2 1 1 60 330 ;
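As a check on the conversion, the same arithmetic can be carried out in Python. The sketch below mirrors the three assignments of the C0NVT listing for instrument 1, assuming G(4) holds a sampling rate of 20,000 Hz; the function name and argument names are hypothetical.

```python
def convt_instrument_1(duration, amplitude_db, frequency_hz, rate=20000.0):
    """Convert (seconds, decibels, hertz) into the inputs P5, P6, P7."""
    p5 = 10.0 ** (amplitude_db / 20.0)   # decibels to linear amplitude
    p7 = 511.0 * frequency_hz / rate     # waveform 0SC increment
    p6 = 511.0 / (duration * rate)       # envelope 0SC: one cycle per note
    return p5, p6, p7


first = convt_instrument_1(2.0, 60.0, 262.0)
second = convt_instrument_1(1.0, 60.0, 330.0)
```

The results agree, to the rounding used in the score, with the hand-computed values 1000, .0128, 6.70 and 1000, .0256, 8.44 on lines 8 and 9 of the Fig. 31 computer score.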
Note    P6    P7
        -1     0
        -1     1
        -1     2
        ...
        -1    11
         0     0
         0     1
        ...
The vibrato controls will be eliminated from the N0T record. In-
stead, we will assign two Pass II variables, G(50) to control the percent
frequency variation and G(51) the rate of vibrato.
The equations which must be programmed into C0NVT are
Frequency = 262.0 * (2 ** (P6 + P7/12.0))
P5 = 10.0 ** (P5/20.0)
P6 = 511.0/(P4 * sampling rate)
P7 = 511.0 * frequency/sampling rate
P8 = 511.0 * frequency * G(50)/(sampling rate * 100)
P9 = 511.0 * G(51)/sampling rate
Most of the equations are self-explanatory. The note frequency is com-
puted in hertz from the logarithmic scales embodied in P6 and P7 by
the first relation. The factor 100 is put in the denominator of P8
because G(50) is a percentage.
Vibrato control is a good example of the use of Pass II memory in a
composition. Except for the first few variables, numbers in the G array
may be used for any purpose desired by the composer. Numbers are
placed in the array by an SV2 record, which is analogous to the SV3
record that was previously used to set a Pass III variable. Thus
SV2 0 50 .5 6 ;
would set G(50) = .5 and G(51) = 6 at t = O.
The program to carry out the computations follows.
Text Notes
SUBR0UTINE C0NVT
C0MM0N IP, P, G
DIMENSI0N IP(10), P(100), G(1000)
IF (P(1) - 1.0) 102, 100, 102
100 IF (P(3) - 2.0) 102, 101, 102
101 P(5) = 10.0 ** (P(5)/20.0)
P(7) = 511.0 * 262.0 * (2.0 ** (P(6) + P(7)/12.0))/G(4) 1
P(6) = 511.0/(P(4) * G(4))
P(8) = P(7) * G(50)/100.0 2
P(9) = G(51) * 511.0/G(4) 3
IP(2) = 9
102 RETURN
END
Notes
1. This statement calculates frequency, multiplies it by the appropriate
constant of proportionality, and stores it in P(7).
2. This statement computes the maximum vibrato deviation. The
properly scaled frequency is already available in P(7) and hence must
only be multiplied by G(50)/100.0.
3. This statement sets the rate of vibrato. The constant of propor-
tionality, 511.0/G(4), is the same as for any other 0SC frequency
control. G(51) will be the vibrato frequency in hertz.
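The pitch and vibrato equations can be exercised in Python. This is a sketch of the same arithmetic, assuming a 20,000-Hz sampling rate in G(4); the function name, the argument names `octave` and `step` (standing for the score fields P6 and P7), and the dictionary of results are hypothetical.

```python
def convt_vibrato(duration, amplitude_db, octave, step, g50, g51, rate=20000.0):
    """Compute instrument inputs P5-P9 from octave, step, and G(50), G(51)."""
    frequency = 262.0 * 2.0 ** (octave + step / 12.0)
    return {
        "P5": 10.0 ** (amplitude_db / 20.0),
        "P6": 511.0 / (duration * rate),
        "P7": 511.0 * frequency / rate,
        "P8": 511.0 * frequency * g50 / (rate * 100.0),  # g50 is a percentage
        "P9": 511.0 * g51 / rate,                        # g51 is a rate in hertz
    }


# Octave 0, step 4 (four semitones above 262 Hz), 1 percent vibrato at 6 Hz.
note = convt_vibrato(1.0, 60.0, 0, 4, g50=1.0, g51=6.0)
```

Step 4 in octave 0 comes out near 330 Hz, and P8 is exactly one hundredth of P7 when G(50) is 1.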
A score for this instrument to replace lines 11 and 12 on Fig. 32 is
SV2 0 50 1 6;
N0T 0 2 2 60 0 0 ;
N0T 2 2 1 60 0 4 ;
Once G(50) and G(51) are set, any number of notes may be written
with the same vibrato. On the other hand, the vibrato constants may be
changed at any time by a subsequent SV2 record. For example, the
deviation could be reduced and the rate increased at t = 15 sec by the
record
SV2 15 50 .5 8 ;
One may ask, why use Pass II variables in C0NVT rather than Pass
III variables or Pass I variables which also exist? The answer is,
C0NVT is a Pass II subroutine and can only make use of information
available in Pass II.
A final example of a C0NVT subroutine will provide a convenient
score language for the glissando instrument in Fig. 32. As shown, the
initial frequency of a note must be written in P7 and the (final - initial)
frequency in P8. We shall eliminate the arithmetic to calculate
(final - initial). Instead the score card will have
P5 = amplitude in decibels
and
P6 = final frequency in hertz
The initial frequency of each note will be defined as the final frequency
of the preceding note. The initial frequency of the first note will be read
into the program with an SV2 card into G(50).
In order to use this simple form the program must remember the
final frequency of each note. G(50) will also be used for this purpose.
The program to achieve these objectives follows.
Text Notes
SUBR0UTINE C0NVT
C0MM0N IP, P, G
DIMENSI0N IP(10), P(100), G(1000)
IF (P(1) - 1.0) 102, 100, 102
100 IF (P(3) - 2.0) 102, 101, 102
101 P(5) = 10.0 ** (P(5)/20.0)
P(7) = G(50) * 511.0/G(4) 1
P(8) = (P(6) - G(50)) * 511.0/G(4) 2
G(50) = P(6) 3
P(6) = 511.0/(P(4) * G(4)) 4
P(9) = P(6)
IP(2) = 9
102 RETURN
END
Notes
1. This statement sets the initial frequency, which was stored in
G(50).
2. This statement computes the (final - initial) frequency.
3. This statement stores the final frequency in G(50) to become the
initial frequency of the next note.
4. This and the following statement set the frequency inputs of 0SC #1
and #2 to 1 cycle per note.
Figure 35 shows a brief score for the instrument. Only the N0T
cards and the SV2 cards are shown. Record 1 sets the initial frequency
1 SV2 0 50 262 ;
2 N0T 0 2 2 60 440 ;
3 N0T 2 2 1 60 330 ;
4 N0T 3 2 1 60 330 ;
5 SV2 4.5 50 440 ;
6 N0T 5 2 1 60 400 ;
7 N0T 6 2 2 60 262 ;
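The way G(50) carries the final frequency of one note forward to become the initial frequency of the next can be traced in Python using the records listed above. This is a sketch, not the C0NVT routine: records are assumed to arrive already sorted by action time, as Pass II would deliver them, and the tuple layout and function name are hypothetical.

```python
def glissando_endpoints(records):
    """Return (initial, final) frequency in hertz for each N0T record."""
    g50 = None
    endpoints = []
    for record in records:
        if record[0] == "SV2":
            g50 = record[3]              # SV2 time location value
        elif record[0] == "N0T":
            final = record[5]            # P6 = final frequency in hertz
            endpoints.append((g50, final))
            g50 = final                  # remembered for the next note
    return endpoints


SCORE = [
    ("SV2", 0, 50, 262),
    ("N0T", 0, 2, 2, 60, 440),
    ("N0T", 2, 2, 1, 60, 330),
    ("N0T", 3, 2, 1, 60, 330),
    ("SV2", 4.5, 50, 440),
    ("N0T", 5, 2, 1, 60, 400),
    ("N0T", 6, 2, 2, 60, 262),
]

endpoints = glissando_endpoints(SCORE)
```

The third note starts and ends at 330 Hz, so it is held at constant pitch, and the SV2 record at t = 4.5 resets the starting frequency before the next slide.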
This will produce an independent number for each sample and will
achieve a cutoff of R/2, which is the highest frequency representable by
R samples per second.
A low-frequency function with I2_i = 64 and a cutoff of 1250 Hz is
shown in Fig. 36c. It is clearly the smoothest of the three functions.
A score record to evoke RAN is
RAN P5 P6 B2 P30 P29 P28 ;
where P5 is the I1 input, P6 the I2 input, B2 the output; P30 is an unused
note parameter for the storage of the sum; and P29 and P28 are two
other temporary storage locations.
and
P8 = 512 * bandwidth / R
P8 = P9 = .0075 * P7
to give 3/4-percent periodic and 3/4-percent random variation
V1 = 8 * 511/R
Envelope Generator-ENV
The use of 0SC as an envelope generator is satisfactory in some
applications, but it makes the attack and decay times proportional to
the total note duration. Important aspects of timbre depend on the
absolute attack time. With 0SC, these will change from long notes to
short notes. The difference may be enough to give the impression of
two different kinds of instruments.
A special generator ENV has been programmed to sweep away this
limitation. It allows separate control of attack time, steady-state dura-
tion, and decay time. In order for ENV to be effective, a special
C0NVT function must be written for ENV. The computations in
C0NVT are at least as complex as those in ENV.
An instrument using ENV is illustrated in Fig. 39. ENV has four
inputs I1-I4 and requires one function. I1 determines the amplitude of
the output, and I2, I3, and I4 the attack time, the steady-state time, and
the decay time, respectively. The function F1 is divided into four equal
sections, the first determining the shape of the attack, the second the
shape of the steady state, and the third the shape of the decay. The last
section is not used and should be zero to allow for any round-off error
involved in scanning the first three parts.
The output O_i may be written
O_i = I1_i * function (scanned according to I2, I3, and I4)
The first quarter of the function is scanned at a rate of I2 locations per
sample, the second quarter at a rate of I3 locations per sample, and the
third quarter at a rate of I4 locations per sample. Consequently,
C0NVT should compute
I2 = 128/(attack time * sampling rate)
I3 = 128/(steady-state time * sampling rate)
and
I4 = 128/(decay time * sampling rate)
1 INS 0 4 ;
2 ENV P5 F1 B2 P9 P10 P11 P30 ;
3 0SC P8 V1 B3 F3 P29 ;
4 AD2 P7 B3 B3 ;
5 0SC B2 B3 B2 F2 P28 ;
6 0UT B2 B1 ;
7 END;
8 GEN 0 1 1 0 0 96 1 128 .7 150 1 175 .6 200 1 225 .7 256 1
320 .3 384 0 511 0 ;
9 SV2 0 50 .050 .100 ;
10 SV3 0 1 .15 ;
11 N0T 0 4 .1 54 349 ;
12 N0T .2 4 .1 54 392 ;
13 N0T .5 4 .13 54 440 ;
14 N0T .6 4 .2 54 349 ;
15 N0T 0 4 .8 54 262 ;
Fig. 39. Envelope generator ENV for attack and decay. Instrument #4.
The attack time AT and decay time DT will in general be constants.
C0NVT calculates the steady-state time SS as the duration P(4)
minus the attack and decay times
SS = P(4) - AT - DT
Thus the steady-state time varies with duration. For short notes there
may be no steady state, and the attack and decay times may have to be
shortened so that their sum does not exceed the duration. All this must
be done by C0NVT.
The data record to evoke ENV is
ENV I1, F, O, I2, I3, I4, S ;
S is a sum that must be assigned temporary storage in some unused note
parameter.
In the example shown in Fig. 39, Pass II variables V50 and V51
(G(50) and G(51) in C0NVT) contain the attack and decay times,
respectively. These are set with the SV2 record. The vibrato rate is
kept in Pass III V1 and is set with an SV3 record. The attack and decay
function F1 is computed with GEN1. The
attack portion has a slight overshoot for added sharpness. The steady-
state portion has two cycles of quaver. The decay portion has two line
segments to approximate an exponential.
The N0T records contain amplitude in decibels in P5 and the
frequency in hertz in P6.
The instrument requires inputs P5 and P7 through P11, as shown on
the diagram. The C0NVT program to compute these inputs from the
N0T record is listed and annotated below.
Text Notes
SUBR0UTINE C0NVT
C0MM0N IP, P, G
DIMENSI0N IP(10), P(100), G(1000)
IF (P(1) - 1.0) 105, 100, 105
100 IF (P(3) - 4.0) 105, 101, 105
101 C0R = 1.0 1
SS = P(4) - G(50) - G(51)
IF (SS) 102, 103, 103 2
102 C0R = P(4)/(G(50) + G(51)) 3
P(10) = 128.
G0 T0 104
103 P(10) = 128./(G(4) * SS) 4
104 P(9) = 128./(G(4) * G(50) * C0R) 5
P(11) = 128./(G(4) * G(51) * C0R)
P(5) = 10.0 ** (P(5)/20.0) 6
P(7) = 511.0 * P(6)/G(4)
P(8) = .0075 * P(7)
IP(1) = 11
105 RETURN
END
Notes
1. C0R will correct attack and decay times for short notes where the
steady state does not exist.
2. Checks to see if steady state time SS is positive.
3. Steady state is negative. C0R is set to reduce attack and decay times
proportionally so that
AT + DT = duration
P(10) is set at 128 so that steady-state time will equal one sample,
which is the minimum possible steady state.
4. Computation of P(10) for positive steady-state times.
5. Computation of P(9) and P(11) for either positive or zero steady-
state times. C0R will be less than 1 for zero steady-state times.
6. The usual computation of amplitude and frequency control. Vibrato
amplitude is set at 3/4 percent of center frequency.
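The short-note correction is easier to follow with numbers. The Python below is a sketch of the same computation, assuming a 20,000-Hz sampling rate and the attack and decay times .050 and .100 sec set by the SV2 record of Fig. 39. The function names are hypothetical. One deliberate difference is labeled in the code: the sketch treats a steady state of exactly zero like a negative one, whereas the FORTRAN listing branches to the correction only when SS is negative.

```python
def env_inputs(duration, attack, decay, rate=20000.0):
    """Return the scanning rates (I2, I3, I4) for the ENV generator."""
    cor = 1.0
    ss = duration - attack - decay
    if ss <= 0.0:
        # Assumption of this sketch: zero is handled here too, which
        # avoids dividing by a zero steady-state time.
        cor = duration / (attack + decay)
        i3 = 128.0            # steady state lasts one sample
    else:
        i3 = 128.0 / (rate * ss)
    i2 = 128.0 / (rate * attack * cor)
    i4 = 128.0 / (rate * decay * cor)
    return i2, i3, i4


def scan_samples(i2, i3, i4):
    """Samples needed to scan the three 128-location quarters of F1."""
    return 128.0 / i2 + 128.0 / i3 + 128.0 / i4


long_note = env_inputs(0.8, 0.050, 0.100)
short_note = env_inputs(0.1, 0.050, 0.100)
```

For the long note the three quarters take 16,000 samples in all, which is .8 sec at 20,000 Hz. For the short note the attack and decay shrink in proportion and fill the .1 sec, with a single sample of steady state between them.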
The N0T records (11-15) play the five notes sketched on the staff.
Two additional capabilities of the program are inherent in these records.
Instrument #4 is used to play up to three voices simultaneously. The
second voice is a sustained C262. The third voice occurs because the
slurred notes overlap slightly, with the note from record 13 extending into
the beginning of the note from record 14. Pass III can play multiple
simultaneous voices on any instrument. As many as 30 voices can be
played in the training orchestra.
The score records are not written in ascending sequence of action
times, in that the C262 is written last and starts at t = 0. The order of
these records is immaterial, since they will be sorted into the proper
ascending sequence of action times in Pass II.
Filter-FLT
One of the more difficult sound-processing operations is filtering. A
unit generator that operates as a band-pass filter is shown in Fig. 40.
The filter may be used to introduce formants or energy peaks at specified
frequencies into sound waves. Such formants are characteristic of many
instruments.
The filter is calculated by means of a difference equation. In terms of
the diagram shown in Fig. 40, the equation is
\[ O_i = \mathrm{I1}_i + \mathrm{I2}_i \cdot O_{i-1} - \mathrm{I3}_i \cdot O_{i-2} \]
I1 is the input to the filter, O the output, and I2 and I3 determine the
frequency and bandwidth of the passband.
More specifically, the difference equation approximates a 2-pole
filter with a pole pair at -D ± jF Hz on the complex frequency plane
as shown in Fig. 40. The approximate gain of the filter is also shown as a
function of frequency in Fig. 40, where the peak occurs at F Hz and the
bandwidth at the half-power points is 2D Hz. The approximation holds
for F >> D. I2 and I3 are determined as functions of F and D by the
relations
\[ \mathrm{I2} = 2 e^{-2\pi D/R} \cos(2\pi F/R) \]
and
\[ \mathrm{I3} = e^{-4\pi D/R} \]
where R is the sampling rate. C0NVT may be conveniently used to
compute I2 and I3 from F and D.
Fig. 40. Band-pass filter-FLT. (a) Diagram: I2 = 2 e^{-2πD/R} cos(2πF/R);
I3 = e^{-4πD/R}; (b) poles in complex plane; (c) curve of gain vs. frequency.
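The filter relations can be tried numerically. The Python below is a sketch assuming a 20,000-Hz sampling rate: it computes I2 and I3 from F and D, applies the difference equation to a list of samples, and evaluates the gain of that recursion on the unit circle. The function names are hypothetical, and the gain formula is the standard magnitude response of this two-term recursion rather than anything quoted from the text.

```python
import cmath
import math


def flt_coefficients(f_hz, d_hz, rate=20000.0):
    """I2 and I3 for a passband centered at F with half-bandwidth D."""
    i2 = 2.0 * math.exp(-2.0 * math.pi * d_hz / rate) * math.cos(2.0 * math.pi * f_hz / rate)
    i3 = math.exp(-4.0 * math.pi * d_hz / rate)
    return i2, i3


def flt(samples, i2, i3):
    """Apply O_i = I1_i + I2 * O_(i-1) - I3 * O_(i-2) with constant I2, I3."""
    o1 = o2 = 0.0
    output = []
    for x in samples:
        o = x + i2 * o1 - i3 * o2
        output.append(o)
        o1, o2 = o, o1
    return output


def gain(freq_hz, i2, i3, rate=20000.0):
    """Magnitude response of the recursion at freq_hz."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / rate)  # z^-1 on the unit circle
    return 1.0 / abs(1.0 - i2 * z1 + i3 * z1 * z1)


I2, I3 = flt_coefficients(1000.0, 50.0)
impulse_response = flt([1.0, 0.0, 0.0], I2, I3)
relative_gain_at_edge = gain(1050.0, I2, I3) / gain(1000.0, I2, I3)
```

With F = 1000 Hz and D = 50 Hz the gain at F + D comes out close to 0.707 of the gain at F, consistent with a half-power bandwidth of about 2D.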
Composing Subroutines-PLF
Our tutorial discussion of what might be called the basics of sound
generation is now complete. We are ready to take up compositional
subroutines that will permit the generation of note parameters by the
computer. These are some of the most interesting but difficult directions
in which computer sound generation can be developed. Advanced
applications point toward complete pieces composed by computer.
However, long before these goals are achieved, PLF subroutines will be
useful in saving the human composer from much routine work.
So far, for each note to be played, the composer has had to write a
line of score starting with N0T.... PLF subroutines will now be
developed which write these N0T records. Furthermore, one score
record that evokes a PLF subroutine can generate many N0T's.
1 SV1 0 10 0 1 52 0 ;
2 SV1 0 20 1 1 56 .167 ;
3 SV1 0 30 2 1 60 .333 ;
4 SV1 0 40 3 1 56 0 ;
5 PLF 0 1 10 40 0 1 .583 4 ;
6 PLF 0 1 10 40 4 1 .750 4 ;
7 PLF 0 1 10 40 4 1 1.167 5 ;
8 PLF 0 1 10 40 8 .5 .417 4 ;
9 PLF 0 1 10 10 10 2 .583 4 ;
(c)
Fig. 41. PLF note-generating example: (a) pattern; (b) conventional score;
(c) computer score.
P1 through P3 have the same significance for all PLF routines. The
rest of the P's depend entirely on the particular subroutine to be written.
The conventional score for the notes produced by records 5 through 9
is shown in Fig. 41 with the notes coming from a given record identified.
Record 5 produces the first four notes in which the pattern is shifted up
by a fifth. Records 6 and 7 produce two copies of the pattern playing in
fourths. The upper voice is played on instrument 5 which is assumed to
yield a staccato timbre. Record 8 plays the pattern at double speed.
Record 9 plays the first note of the pattern at half speed.
In order to write a PLF program we will have to know something of
the operation of Pass I. It reads the score records in the order in which
they appear in the score. The SV1 records cause data to be stored in the
D(2000) memory. A N0T record would simply cause the record to be
sent on to Pass II. This is accomplished by placing the N0T data in the
P(100) array and calling a communication routine WRITE1, which
writes out the P array on a file that will later be read by Pass II. For
bookkeeping purposes, the number of parameters in the record is kept
in another Pass I location IP(1) and is automatically written out by
WRITEl. The function of the PLF routine is to generate N0T records
and to write them out exactly as Pass I would have done with a record
in the score.
How is the PLF routine brought into action? When Pass I reads a
PLF score record it calls a subroutine PLFn, in which n is in P3.
The rest of the data on the score record is in the P array where it can
be used by the subroutine.
The annotated PLF1 routine to perform the computations we have
described follows.
Text Notes
SUBR0UTINE PLF1
C0MM0N IP, P, D 1
DIMENSI0N IP(10), P(100), D(2000)
NS = P(4) 2
NE = P(5)
TS = P(6)
DS = P(7)
FS = P(8)
IP(1) = 6 3
P(1) = 1.0 4
P(3) = P(9) 5
D0 100 I = NS, NE, 10 6
P(2) = TS + DS * D(I) 7
P(4) = DS * D(I + 1) 8
P(5) = D(I + 2) 9
P(6) = (2.0 ** (D(I + 3) + FS» * 262.0 10
CALL WRITE1(10) 11
100 C0NTINUE
RETURN
END
Notes
1. This C0MM0N and DIMENSI0N statement locates the three
essential arrays, IP, P, and D for PLF1. The Pass I definition of
these arrays must agree with the definition in the subroutine.
2. These statements take parameters P4-P8 from the PLF data record
and store them in the PLF subroutine. Since the P(100) array will
be used to output N0T records, the PLF parameters must be
removed from it.
3. The word count of the N0T records is set at 6. We will assume that
we are generating notes for an instrument of a type shown in Fig.
39. The six fields are
P1 N0T
P2 Action time in seconds
P3 Instrument number
P4 Duration in seconds
P5 Amplitude in decibels
P6 Note frequency in hertz
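The note-generating loop can be mirrored in Python to see what one PLF record produces. This is a sketch, not the PLF1 subroutine: the D memory is modeled as a dictionary keyed by location, notes are returned as tuples instead of being written with WRITE1, and the names `store_sv1` and `plf1` are hypothetical. The pattern data and the two calls are taken from records 1 through 5 and record 8 of the Fig. 41 computer score.

```python
def store_sv1(d, location, values):
    """Mimic an SV1 record: store values at consecutive D locations."""
    for offset, value in enumerate(values):
        d[location + offset] = value


def plf1(d, ns, ne, ts, ds, fs, instrument):
    """Generate one note per pattern entry, stepping 10 locations at a time."""
    notes = []
    for i in range(ns, ne + 1, 10):
        notes.append((
            ts + ds * d[i],                    # P2 action time
            instrument,                        # P3 instrument number
            ds * d[i + 1],                     # P4 duration
            d[i + 2],                          # P5 amplitude in decibels
            2.0 ** (d[i + 3] + fs) * 262.0,    # P6 frequency in hertz
        ))
    return notes


D = {}
store_sv1(D, 10, [0, 1, 52, 0])
store_sv1(D, 20, [1, 1, 56, 0.167])
store_sv1(D, 30, [2, 1, 60, 0.333])
store_sv1(D, 40, [3, 1, 56, 0])

fifth_up = plf1(D, 10, 40, 0, 1, 0.583, 4)         # score record 5
double_speed = plf1(D, 10, 40, 8, 0.5, 0.417, 4)   # score record 8
```

A frequency shift of .583 octave is seven semitones, so the first generated note lands near 392 Hz, a fifth above 262 Hz. Halving the duration scale both shortens the notes and packs their starting times closer together.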
An example is shown in Fig. 42, where the first theme, the second
theme, and the product are written in musical notation. Typically, theme
2 is short and compact in frequency range. However, this is not a
requirement. We could also form the product
Theme 2 x theme 1
Our multiplication algorithm is not commutative, and
Theme 1 x theme 2 ≠ theme 2 x theme 1
1 SV1 0 10 0 2 50 .417 ;
2 SV1 0 20 2 2 53 .750 ;
3 SV1 0 30 4 1.5 56 .583 ;
4 SV1 0 40 5.5 .5 59 .167 ;
5 SV1 0 50 6 2 62 .417 ;
6 SV1 0 60 0 1.08 4 0 ;
7 SV1 0 70 1.5 .375 2 .167 ;
8 SV1 0 80 2 .75 0 -.083 ;
9 SV1 0 90 3 .75 -2 0 ;
10 PLF 0 2 10 50 60 90 0 4 ;
(d)
Fig. 42. Multiplication of two themes: (a) theme 1 times (b) theme 2 gives
(c) product via PLF2; (d) computer score.
The score for the example is also shown in Fig. 42. Lines 1-5 define
theme 1, lines 6-9 define theme 2, and line 10 calls PLF2 to generate the
product. The calling sequence is
P4 D location of first note of first theme
P5 D location of last note of first theme
P6 D location of first note of second theme
P7 D location of last note of second theme
P8 Starting time of product theme
P9 Instrument number
We will assume that the D array arrangement and the instrument are
the same as were used for PLF1.
The annotated F0RTRAN program follows.
Text Notes
SUBR0UTINE PLF2
C0MM0N IP, P, D
DIMENSI0N IP(10), P(100), D(2000)
NB1 = P(4) 1
NE1 = P(5)
NB2 = P(6)
NE2 = P(7)
TS = P(8)
IP(1) = 6
P(1) = 1.0
P(3) = P(9)
D0 101 I = NB1, NE1, 10 2
START = TS + D(I) 3
DS = D(I + 1)/(D(NE2) + D(NE2 + 1))
D0 100 J = NB2, NE2, 10 2
P(2) = START + DS * D(J) 4
P(4) = DS * D(J + 1)
P(5) = D(J + 2) + D(I + 2)
P(6) = (2.0 ** (D(J + 3) + D(I + 3))) * 262.0
CALL WRITE1(10)
100 C0NTINUE
101 C0NTINUE
RETURN
END
Notes
1. This group of statements moves the PLF parameters from the P(100)
array into the subroutine and sets the unchanging parts of the N0T
parameters in P(100) in preparation for writing N0T records.
2. The program contains two nested D0 loops. The outer loop is
executed once for each note in theme 1, the inner D0 cycles for each
note in theme 2.
3. These two statements compute the starting time shift and the
duration scaling for a repetition of theme 2. START is the beginning
time of a note in theme 1. DS is computed so that the last note in
theme 2 will end at the ending time of the note in theme 1.
4. This and the following three statements compute the starting times,
durations, amplitudes, and frequencies for the notes in theme 2
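The arithmetic of PLF2 can be restated outside F0RTRAN. The following Python sketch is not Music V code; it assumes each note is held as a tuple (start, duration, amplitude, logarithmic pitch) in place of the four D-array values, and the names multiply_themes, THEME1, and THEME2 are ours. The note values are copied from the Fig. 42 score.

```python
# Sketch of the PLF2 theme product (illustrative, not part of Music V).
# A note is (start, duration, amplitude_db, log_pitch).

def multiply_themes(theme1, theme2, ts=0.0, base_hz=262.0):
    """Replace each note of theme1 by a scaled, shifted copy of theme2."""
    last_start, last_dur = theme2[-1][0], theme2[-1][1]
    theme2_length = last_start + last_dur       # D(NE2) + D(NE2 + 1)
    product = []
    for start1, dur1, amp1, pitch1 in theme1:
        start = ts + start1                     # START = TS + D(I)
        ds = dur1 / theme2_length               # duration scale DS
        for start2, dur2, amp2, pitch2 in theme2:
            product.append((
                start + ds * start2,            # P2: action time
                ds * dur2,                      # P4: duration
                amp1 + amp2,                    # P5: amplitudes add
                base_hz * 2.0 ** (pitch1 + pitch2),  # P6: hertz
            ))
    return product

THEME1 = [(0, 2, 50, .417), (2, 2, 53, .750), (4, 1.5, 56, .583),
          (5.5, .5, 59, .167), (6, 2, 62, .417)]
THEME2 = [(0, 1.08, 4, 0), (1.5, .375, 2, .167),
          (2, .75, 0, -.083), (3, .75, -2, 0)]

notes = multiply_themes(THEME1, THEME2)
```

Each of the five notes of theme 1 yields four product notes, and the last note of each copy of theme 2 ends where the corresponding note of theme 1 ends. Swapping the arguments gives a different list, which is the non-commutativity mentioned earlier.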
Compositional Functions
The note-generating subroutines that have just been demonstrated
can be greatly strengthened by defining information in certain ways
which we call compositional functions. Compositional functions can be
used to provide a new language to describe sounds, called a graphic
score. Although graphic scores can be used to represent the notes in a
conventional score, the notation is completely different. It is more
powerful in the sense that many sounds that are impossible to notate
conventionally can be readily described by a graphic score. Moreover,
in the synthesis of sounds, the graphic scores can be "read" easily by
note-generating subroutines. This section can only lay the foundation
for graphic scores, but further information is given in the references.
A compositional function is a function defined over an entire section
of a composition. It is used to control some always-present parameter
such as loudness or tempo. Compositional functions should not be
confused with the stored functions used to describe waveshape or
envelope. The stored functions are generated, stored, and used in
Pass III. Compositional functions are described and used in the first
two passes. Both their mode of description and their use differ from
those of stored functions.
Metronome Function
Let us start by considering the metronome function which is built
into Pass II and can be evoked if desired. So far, the starting times and
durations of notes have been written in numbers which were interpreted
as seconds. Thus a note
N0T 2 4 1 54 .167 ;
starts at 2 sec from the beginning of the section and lasts for 1 sec.
With a metronome function, P2 and P4 are interpreted in beats; the
note starts at the beginning of the second beat of the section and lasts
for one beat. The relation between beats and seconds is given by the
metronome function, which is in standard metronome marking of beats
per minute. Thus, for example, if the metronome function is 180, each
beat lasts 1/3 sec; the note would start 2/3 sec from the beginning of
the section and would last 1/3 sec.
The metronome function need not be constant, but can change
abruptly or gradually during a section to introduce accelerandos or
retards. The operation can be illustrated by an example shown in
Fig. 43.
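The relation between beats and seconds can be sketched numerically. The Python below is one plausible reading, not the Pass II routine: it assumes the elapsed time is the integral of 60/M(b) over beats, where M is the line-segment metronome function, and that the breakpoints have strictly increasing beat values covering the beat requested. The function name beats_to_seconds is ours.

```python
import math

def beats_to_seconds(beat, metronome):
    """Seconds elapsed at `beat`; metronome is [(beat, beats_per_minute), ...]."""
    seconds = 0.0
    for (b0, m0), (b1, m1) in zip(metronome, metronome[1:]):
        if beat <= b0:
            break
        end = min(beat, b1)
        # Tempo at `end`, by linear interpolation along the segment.
        m_end = m0 + (m1 - m0) * (end - b0) / (b1 - b0)
        if m_end == m0:
            seconds += 60.0 * (end - b0) / m0       # constant tempo
        else:
            slope = (m_end - m0) / (end - b0)
            # Integral of 60 / M(b) db over a linear tempo ramp.
            seconds += 60.0 / slope * math.log(m_end / m0)
    return seconds

steady = [(0, 180), (14, 180)]
start = beats_to_seconds(2, steady)
length = beats_to_seconds(3, steady) - start
```

With the constant marking of 180, start and length come out at 2/3 sec and 1/3 sec, in agreement with the example in the text. Pass II may well approximate a changing tempo differently.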
The conventional score for 14 quarter notes lasting 14 beats is shown
at the top, together with tempo marking. The N0T cards to encode this
are as follows.
N0T 0 4 .8 60 0
N0T 1 4 .8 60 .167
N0T 2 4 .8 60 .333
N0T 3 4 .8 60 0
N0T 4 4 .8 60 0
N0T 5 4 .8 60 .167
N0T 6 4 .8 60 .333
N0T 7 4 .8 60 .417
N0T 8 4 .8 60 .583
N0T 9 4 .8 60 .750
N0T 10 4 .8 60 .917
N0T 11 4 .8 60 .583
N0T 12 4 .8 60 .583
N0T 13 4 .8 60 .750
[Fig. 43 graphic not reproduced: panel (b) is plotted against beats, panel (c) against time in seconds]
Fig. 43. Metronome function: (a) music score; (b) metronome marking
function; (c) graphic score.
The first record describes the function, and P4 gives the initial abscissa
(0), P5 the ordinate at that abscissa (60), P6 the next abscissa (4), P7
the next ordinate (60), etc. The abscissa is in beats and the ordinate in
metronome marking-beats per minute. Successive points on the
function are connected by straight lines, as shown in Fig. 43. As many
segments as desired may be used by putting more points into the SV2
function. The abscissa points need not be uniformly arranged.
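A line-segment function of this kind is easy to evaluate. The Python sketch below is an illustration, not the C0N routine: it assumes the function is held as a flat list of abscissa-ordinate pairs in the order they appear on the card, and it holds the end values outside the defined range, which is our assumption. The breakpoint values in the example are hypothetical.

```python
def con(points, x):
    """Evaluate a line-segment function stored as [x0, y0, x1, y1, ...]."""
    pairs = list(zip(points[0::2], points[1::2]))
    if x <= pairs[0][0]:
        return pairs[0][1]                 # before the first point
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x <= x1:
            # Straight-line interpolation between neighbouring points.
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return pairs[-1][1]                    # after the last point

# Hypothetical marking: 60 for four beats, then a rise to 120 by beat 8.
marking = [0, 60, 4, 60, 8, 120]
```

Here con(marking, 2) gives 60 and con(marking, 6) gives 90, halfway up the rising segment.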
The second record tells the Pass II program that a metronome function
is being used and that it starts in variable 50, that is to say, in G(50).
The graphic score in Fig. 43 shows the notes resulting from the
metronome function being applied to the score. The pitch of each note,
plotted against the time it occurs, is shown by the horizontal bars.
Pitch is given on a logarithmic scale, 0 = middle C, +1 = C512. Time
[Fig. 44 graphic not reproduced: four panels, (a) through (d), plotted against time in seconds]
Fig. 44. Graphic score and resulting notes: (a) duration; (b) duty factor;
(c) pitch; and (d) amplitude.
Duration: SV1 0 50 0 .5 3 2 4 .3 8 1 12 .5 13 .5
Duty factor: SV1 0 65 0 .5 3 .5 4 1 6.7 1 7 -.25 8 -.25
8.3 1 13 1 ;
Pitch: SV1 0 85 0 0 3 1 7.5 -1 13 0
Amplitude: SV1 0 95 0 40 7.5 60 13 40 ;
SUBR0UTINE PLF3
C0MM0N IP, P, D
DIMENSI0N IP(10), P(100), D(2000)
TS = P(4)
END = P(5)
NA = P(6)
NP = P(7)
NDR = P(8)
NDF = P(9)
P(1) = 1.0 2
P(3) = P(10)
IP(1) = 6
T = 0.0 3
Notes
1. These statements extract the essential information for the PLF3
from the P array.
2. These statements set the constant parts of the P array in preparation
for writing out N0T records, and they set the word count.
3. T is the starting time of the next note to be generated (not including
the time shift TS). It is set initially at zero and computed as a
running variable and is increased by the interval between successive
notes after each note is generated. T is also the variable used to
specify abscissa values in C0N.
4. DR is the interval between successive notes as obtained by C0N
from the duration function.
5. This statement checks to see whether the starting time of the next
note is greater than the ending time, END. If so, the current note is
not generated and PLF3 is terminated.
6. The time shift TS is added to T to obtain the starting time of the
note.
7. The duration of the note is computed as the interval times the duty
factor.
8. This statement checks for a rest. If the duration is zero or negative,
owing to a zero or negative duty factor, no N0T is written out and
the program proceeds to the next note.
9. These statements compute the rest of the N0T parameters and
write out the N0T record.
10. This statement adds T to the starting time of the next note.
11. This statement transfers control in order to generate the next note.
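Notes 3 through 11 describe a simple loop, which can be sketched as follows. This Python is a reading of the notes, not a transcription of PLF3: the helper con, the tuple layout of a note, the 262-Hz reference for pitch (borrowed from PLF1 and PLF2), and the guard against a non-positive interval are all assumptions of ours. The four functions in the example are hypothetical.

```python
def con(points, x):
    """Line-segment function stored as [x0, y0, x1, y1, ...]."""
    pairs = list(zip(points[0::2], points[1::2]))
    if x <= pairs[0][0]:
        return pairs[0][1]
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return pairs[-1][1]

def plf3(ts, end, duration_f, duty_f, pitch_f, amp_f, base_hz=262.0):
    """Return notes as (start, duration, amplitude_db, hertz)."""
    notes = []
    t = 0.0                                # note 3: running start time
    while True:
        dr = con(duration_f, t)            # note 4: interval to next note
        if t > end:                        # note 5: past END, stop
            break
        if dr <= 0.0:
            raise ValueError("duration function must stay positive")
        dur = dr * con(duty_f, t)          # note 7: interval times duty factor
        if dur > 0.0:                      # note 8: otherwise a rest
            notes.append((ts + t,          # note 6: add the time shift
                          dur,
                          con(amp_f, t),
                          base_hz * 2.0 ** con(pitch_f, t)))
        t += dr                            # note 10: advance to next note
    return notes

# Hypothetical functions: one note per second, with a rest at t = 2.
duty = [0, .5, 1.5, .5, 1.6, -1, 2.5, -1, 2.6, .5, 10, .5]
result = plf3(0.0, 3.0, [0, 1, 10, 1], duty, [0, 0, 10, 0], [0, 40, 10, 40])
```

The call produces notes at 0, 1, and 3 sec; the negative duty factor at 2 sec yields a rest, as described in note 8.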
notes. The next six notes are legato, with no silent intervals between
notes. Two possible notes have been omitted to form a rest.
In terms of the number of notes generated, PLF3 is very efficient.
One PLF3 call produced 16 notes. It could just as well have produced
1600. In contrast to conventional scores, the notation for duration has
the advantage that a second's worth of fast notes requires no more
effort to describe than a second's worth of slow notes. Also ritardandos
and accelerandos are easy to describe by lines with increasing or
decreasing slopes, as illustrated from 4 to 8 and from 8 to 13 sec. Such
tempo changes can have striking acoustical effect.
[Figure not reproduced: generated pitch (solid bars) and quantized pitch (dashed bars) shown against the scale tones, over 0 to 13 seconds]
PLF are shown as solid horizontal bars, and the adjusted pitches are
shown as dashed bars. The adjustment may be either up or down,
depending on which scale step is closest. The process of adjustment is
called pitch quantizing.
Note that the first and sixth notes happened to fall exactly on a
scale step and require no quantizing. Also the last two pairs of notes
become pairs of repeated notes as a result of quantizing. Quantizing
tends to produce repeated notes if the scale steps are large and the change
in pitch between successive notes is small.
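Quantizing to the closest scale step takes only a line or two. The Python sketch below is illustrative; the scale values are those of the SV2 record quoted later in this section, and the name quantize_pitch is ours. A tie goes to the lower step because min keeps the first of equal candidates.

```python
# Scale steps on the logarithmic pitch scale (0 = middle C).
SCALE = [-1.0, -0.8, -0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

def quantize_pitch(pitch, scale=SCALE):
    """Return the scale step closest to `pitch`."""
    return min(scale, key=lambda step: abs(pitch - step))

melody = [0.0, .167, .333, .417, .583]
quantized = [quantize_pitch(p) for p in melody]
```

The pitches .333 and .417 both fall on the step .4, which illustrates how quantizing produces repeated notes when the steps are large compared with the change in pitch.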
In order to write a PLS routine, it is necessary to understand a few
details of the operation of Pass II. All the data records in a section are
read into a large array D(10,000), 10,000 locations long in the training
orchestra. An array I(1000) is computed by sorting so that
I(1) = the address in D of the beginning of "first" data record
where "first" means smallest action time
I(2) = the address in D of the beginning of "second" data
record
etc.
N0T 19 2 4 60 .167 ;
I(13) = 109
and
D(109) = 6 (Word count)
D(110) = 1 (N0T == 1)
D(111) = 19
D(112) = 2
D(113) = 4
D(114) = 60
D(115) = .167
After I(1000) is computed, the program goes through the data
records in order of increasing action times, executing any PLS routines,
storing any SV2 data in the G(1000) Pass II data array, and writing out
N0T records with the aid of the C0NVT subroutine.
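The D and I arrays can be imitated in a few lines. This Python sketch is not the Pass II code: it uses zero-based addresses where the F0RTRAN uses one-based ones, assumes each record is stored as a word count followed by that many fields, and relies on Python's stable sort (whether the S0RT routine preserves the order of equal action times is not stated here). The name build_index is ours.

```python
def build_index(d):
    """Return record start addresses in `d`, ordered by action time."""
    addresses = []
    addr = 0
    while addr < len(d):
        word_count = int(d[addr])
        addresses.append(addr)
        addr += word_count + 1        # skip the count word and its fields
    # The action time is the second field, two words past the count.
    return sorted(addresses, key=lambda a: d[a + 2])

d = [6, 1, 19, 2, 4, 60, .167,        # N0T 19 2 4 60 .167 ;
     4, 10, 0, 1, 100,                # PLS 0 1 100 ;
     6, 1, 5, 2, 4, 60, 0]            # N0T 5 2 4 60 0 ;
index = build_index(d)
```

The index lists the PLS record first (action time 0), then the note at 5, then the note at 19, although the notes were entered in the opposite order.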
A PLS function can modify any N0T records with action times
greater than the action time of the PLS function. It cannot affect N0T
records with action times less than the PLS function, since these will
already have been written before PLS is executed. The Pass II memory
G(1000) will contain the numbers from any SV2 cards with action
times less than the action time on the PLS function; it will not contain
any data from SV2 cards whose action times are greater than on the
PLS function.
The scale will be stored in Pass II memory by a SV2 statement giving
the number of steps in the scale, followed by the pitches of these steps.
Thus the scale used in Fig. 45 is inserted in the memory by the record
SV2 0 100 11 -1 -.8 -.6 -.4 -.2 0 .2 .4 .6 .8 1 ;
These data will go into G memory at time 0.
The PLS1 function will be called at action time 0 by the statement
PLS 0 1 100 ;
where P3 = 1 indicates PLS1 and the 100 gives the starting point of the
scale in the G array.
An annotated program to carry out the pitch quantizing follows.
Text Notes
SUBR0UTINE PLS1
C0MM0N IP, P, G, I, T, D 1
DIMENSI0N IP(10), P(100), G(1000), I(1000),
T(1000), D(10000)
I1 = IP(2) 2
IN = IP(3) 3
NQ = D(I1 + 4) 4
NB = NQ + 1
NL = NQ + IFIX(G(NQ))
D0 103 J = 1, IN 5
ID = I(J)
IF(D(ID + 1) - 1.0) 103, 100, 103 6
100 FREQ = D(ID + 6) 7
MIN = 1000000.0 8
D0 102 K = NB, NL
IF(ABS(FREQ - G(K)) - MIN) 101, 102, 102
101 MIN = ABS(FREQ - G(K))
QFREQ = G(K)
102 C0NTINUE
D(ID + 6) = QFREQ 9
103 C0NTINUE
RETURN
END
Notes
1. This common statement and the subsequent dimension statement
describe the main data arrays in Pass II and must agree with the
corresponding statements in the Pass II main program. IP gives
certain miscellaneous constants, P is the communication array
from which data records are read and written, G is general variable
storage, I indexes the D array in action-time order, T contains
action times and is primarily used in the sorting process, D contains
the data records.
2. When PLS is called, IP(2) contains the address in the D array at
which the PLS data record is located. In this case if
IP(2) = 27
then
D(27) = 4 (Word count)
D(28) = 10 (PLS == 10)
D(29) = 0
D(30) = 1
D(31) = 100
The PLS routines tend to be both longer and logically more compli-
cated than the PLF routines. The steps in the example just discussed are
typical. Actually, they were not all necessary for the problem at hand.
The pitches could have been quantized by the PLF routine as they were
generated. Even if the quantizing were done in Pass II, it would not
have been necessary to go through the D array in order of action times.
However, for slightly more complicated operations, such as quantizing
the intervals between voices, all the Pass II steps are essential.
Another simplification in the program consists in writing out the
scale for all the octaves in which it is to be used. In many cases, only
one octave is written out; the actual pitches are translated to this
octave before being quantized; and the quantized pitches are translated
back to their original octave. The possibilities open to the composer
are almost endless.
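The one-octave variant mentioned above can be sketched as follows. The Python is illustrative only: the steps are the C major pitches of the even-tempered table, the upper C (1.0) is added so that pitches just below an octave boundary can round upward, and the name quantize_in_octave is ours.

```python
import math

# C major within one octave on the logarithmic scale, plus the upper C.
ONE_OCTAVE = [0.0, 0.167, 0.333, 0.417, 0.583, 0.75, 0.917, 1.0]

def quantize_in_octave(pitch, steps=ONE_OCTAVE):
    """Translate to the reference octave, quantize, and translate back."""
    octave = math.floor(pitch)             # whole octaves above middle C
    within = pitch - octave                # position inside the octave
    nearest = min(steps, key=lambda s: abs(within - s))
    return octave + nearest

examples = [quantize_in_octave(p) for p in (1.17, -0.8, 0.97)]
```

The three examples land on 1.167 (D an octave up), -0.833 (D an octave down), and 1.0 (the C above middle C).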
[Figure graphic not reproduced. Surviving labels: panels (a), (b), (c); inputs V3 V4 V5 V6 P5 B2 P9 P10 P11 B3 P7 V1 V2; LSG; instruments #1, #2, #3; time in seconds]
3 As of February 21, 1968, this feature was not yet programmed in Music V.
However, it seems both desirable and easy to insert.
developed in Fig. 39, and it uses the C0NVT function for that instru-
ment. The additional amplitude function B2 is multiplied by the normal
amplitude input P5. The continuous amplitude-control function is
written in decibels (as shown in Fig. 46c), and B2 is the exponential
transformation
B2 = 10 ** (continuous amplitude function / 20)
Thus, the decibels of the normal amplitude function and the continuous
function are additive. If the continuous function is 10 dB and the normal
function is 50 dB, the resulting sound will be at 60 dB.
The normal frequency input P7 is multiplied by the additional
frequency-control function B3. The continuous pitch function (also
shown in Fig. 46c) will be written in our standard logarithmic scale, and
B3 will be the exponential transformation
B3 = 2 ** continuous pitch function
Thus a continuous pitch function of 0 produces no change in pitch, a
continuous pitch function of 1 produces a one-octave upward shift, and
so forth. The computation of V3-V6 to achieve both the exponential
conversions and the proper increments will be done by a PLF4
subroutine.
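The two exponential transformations are short enough to check by hand. In the Python sketch below the division by 20 in the decibel conversion is taken from the PLF4 listing; the function names are ours.

```python
def db_to_factor(db):
    """Linear amplitude multiplier for a level given in decibels."""
    return 10.0 ** (db / 20.0)

def pitch_to_factor(log_pitch):
    """Frequency multiplier for a shift on the logarithmic pitch scale."""
    return 2.0 ** log_pitch

# Multiplying the factors adds the decibels: 10 dB and 50 dB give 60 dB.
combined = db_to_factor(10.0) * db_to_factor(50.0)
octave_up = pitch_to_factor(1.0)
```

Here combined equals db_to_factor(60.0) to within rounding, and octave_up is exactly 2.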
Input VI specifies the proportion of frequency shift in the vibrato,
proportionality being controlled by a multiplier. Such Pass III multi-
plication is essential rather than multiplication by the C0NVT function,
because frequency can vary over a note.
The annotated PLF4 program is given below. The pitch and ampli-
tude functions will be stored as Pass I variables in the usual notation.
The functions shown in Fig. 46c are stored by the statements
SV1 0 50 0 0 4 20 8 0 12 20 19.99 0 20 20 24 0 ;
SV1 0 70 0 .583 12 .583 14 1.167 16 .333 19.99 .583 20 0
24 0 ;
The calling record for PLF4 is
PLF 0 4 TS END FA FP ;
where TS is the starting time of the control functions, END is the
duration of the control functions, FA is the starting variable of the
amplitude function, and FP is the starting variable of the pitch function.
For the example
PLF 0 4 0 24 50 70 ;
is the specific calling record.
Text Notes
SUBR0UTINE PLF4
C0MM0N IP, P, D
DIMENSI0N IP(10), P(100), D(2000)
TS = P(4) 1
END = P(5)
NA = P(6)
NP = P(7)
I = NA 2
IP(1) = 5
P(1) = 4.0
P(3) = 3.0
100 P(4) = 10.0 ** (D(I + 1)/20.0) 3
P(5) = (10.0 ** (D(I + 3)/20.0) - P(4))/((D(I + 2)
- D(I)) * D(4))
P(2) = TS + D(I) 4
CALL WRITE1(10)
IF (D(I + 2) - END) 101, 102, 102 5
101 I = I + 2 6
G0 T0 100
102 I = NP 7
P(3) = 5.0
103 P(4) = 2.0 ** D(I + 1)
P(5) = ((2.0 ** D(I + 3)) - P(4))/((D(I + 2)
- D(I)) * D(4))
P(2) = TS + D(I)
CALL WRITE1(10)
IF(D(I + 2) - END) 104, 105, 105
104 I = I + 2
G0 T0 103
105 IP(1) = 4 8
P(1) = 1.0
P(2) = TS
P(3) = 1.0
P(4) = END
CALL WRITE1(10)
P(3) = 2.0
CALL WRITE1(10)
RETURN
END
Notes
1. These statements extract the essential information for PLF4 from
the P array.
2. These statements prepare to write SV3 records for V3 and V4.
P(1) is 4 for SV3. P(3) = 3.0 designates V3 as the first variable.
One pair of V3 and V4 values will be written for each segment of the
amplitude function. I = NA will set the initial value of the equations
starting at 100 for the first segment.
3. This and the subsequent line calculate the initial value and slope for
the first segment. The slope is in units per sample. D(4) is the
sampling rate.
4. The time of the SV3 card is the beginning time of the first segment
plus TS.
5. This statement terminates the amplitude function at the end of the
current segment if D(I + 2) ≥ END.
6. I is incremented by 2 and control is transferred to 100 to continue
with the next segment.
7. These statements write out SV3 records for variables 5 and 6 to
produce the pitch control. The process is exactly analogous to
amplitude control.
8. The rest of the program writes out two N0T records
N0T TS 1 END
N0T TS 2 END
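The conversion of a line-segment function into one initial value and one slope per segment (notes 2 through 6) can be sketched in Python. This is an illustration, not PLF4 itself: the record is returned as a tuple instead of being written by WRITE1, the sampling rate of 10,000 samples per second stands in for D(4) and is an assumption, and the names are ours.

```python
def segment_records(points, ts, end, rate, to_linear, first_var):
    """Return (action_time, first_var, initial_value, slope_per_sample)."""
    pairs = list(zip(points[0::2], points[1::2]))
    records = []
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        initial = to_linear(y0)
        # Change per sample across a segment lasting (x1 - x0) seconds.
        slope = (to_linear(y1) - initial) / ((x1 - x0) * rate)
        records.append((ts + x0, first_var, initial, slope))
        if x1 >= end:                 # note 5: stop after this segment
            break
    return records

# Amplitude function of the example, in decibels against seconds.
AMPLITUDE = [0, 0, 4, 20, 8, 0, 12, 20, 19.99, 0, 20, 20, 24, 0]
amp_records = segment_records(AMPLITUDE, ts=0.0, end=24.0, rate=10000.0,
                              to_linear=lambda db: 10.0 ** (db / 20.0),
                              first_var=3)
```

The amplitude function has six segments, so six records result. The first starts at a multiplier of 1 (0 dB) and climbs by 0.000225 per sample to reach 10 (20 dB) after 4 sec. The pitch function would be handled the same way with to_linear=lambda p: 2.0 ** p and first_var=5.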
The score records to produce the Fig. 46 output are given below. The
definition of the instruments and the Pass III stored functions are
omitted since they are completely standard.
SV1 0 50 0 0 4 20 8 0 12 20 19.99 0 20 20 24 0 ;
SV1 0 70 0 .583 12 .583 14 1.167 16 .333 19.99 .583 20 0
24 0 ;
PLF 0 4 0 24 50 70 ;
N0T 0 3 11.8 40 262 ;
N0T 12 3 7.8 40 262 ;
N0T 20 3 .8 40 392 ;
N0T 21 3 .8 40 349 ;
N0T 22 3 .8 40 330 ;
N0T 23 3 .8 40 294 ;
[Fig. 47 graphic not reproduced: amplitude and pitch curves plotted against time in seconds, 0 to 24]
Fig. 47. Graphic score with continuous changes in amplitude and pitch.
The graphic score of the resulting sound is shown in Fig. 47. Beginnings
and ends of individual notes are indicated by short vertical bars on the
amplitude and pitch curves. An attack or decay will be produced by the
envelope generator at these times. Amplitude and pitch changes occur
continuously and independently of note boundaries.
Even-Tempered Scale
Note  Frequency in hertz  Logarithmic pitch
C 262 0
C# 277 .083
D 294 .167
D# 311 .250
E 330 .333
F 349 .417
F# 370 .500
G 392 .583
G# 415 .667
A 440 .750
A# 466 .833
B 494 .917
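The table can be regenerated from two formulas. In the Python sketch below the logarithmic pitch of the kth semitone is k/12, and the frequencies are tuned from A = 440 Hz; the latter is an inference from the rounded values in the table, not something the text states. (Computing 262 * 2 ** (k/12) instead gives nearly the same numbers, but 278 rather than 277 for C#.)

```python
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def even_tempered_table(a_hz=440.0):
    """Return (note, hertz rounded to an integer, logarithmic pitch)."""
    table = []
    for k, name in enumerate(NAMES):
        hertz = a_hz * 2.0 ** ((k - 9) / 12.0)   # A lies 9 semitones above C
        table.append((name, round(hertz), round(k / 12.0, 3)))
    return table

table = even_tempered_table()
```

The first row is ("C", 262, 0.0) and the last is ("B", 494, 0.917).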
more than one note at a time. The limit in the training orchestra is 30
simultaneous voices.)
(c) F6, a sine wave of peak amplitude 1 (write only the first 20 samples)
This frequency is much greater than half the sampling rate. What is the
apparent period of B2(1) ... B2(20)? This period (about 6 samples) results
from foldover.
5. Instrument 1 shown in Fig. 27 uses the symmetrical square wave of
problem 2a for its stored function. Write the output samples S0, S1, ...,
S60,000 resulting from the following score. Abbreviate your answer by
designating blocks of zero samples by ...
N0T 0 1 .0005 1000 60 ;
N0T .5 1 .0006 500 200 ;
N0T 1 1 .0002 100 10 ;
N0T 2 1 .001 500 70 ;
N0T 2.0002 1 .0004 500 100
6. Instrument 2 shown below uses F4 function of problem 2a. It plays
the note
N0T 0 2 .001 500 80 3 100 ;
Plot the samples
B2(1) ... B2(20)
B3(1) ... B3(20)
B4(1) ... B4(20)
[Diagram of Instrument 2 not reproduced; surviving input labels: P5 P6 P7 P8]
Simple Instruments
7. Score the instrument diagrammed here.
(e) Write score records for Fl, an attack and decay function; F2, a
vibrato function; and F3, a modified square-wave waveform.
(f) Write the score for the following passage
[Musical passage not reproduced]
8. Diagram, score, and write functions and a note for an instrument
that has attack and decay in amplitude and a frequency attack. The
frequency of each note should start 10% low and rise linearly to the final
frequency of the note within the first 10% of the note's duration.
9. Diagram and score an instrument with attack and decay in amplitude,
with vibrato, and with attack and decay on the vibrato.
10. Diagram an instrument that uses four 0SC's to change the wave-
form of a note as a function of both amplitude and frequency. The com-
position of the output waveform should be
A·[{1000 - f}{(1 - A)·0SC1 + A·0SC2}
+ {f}{(1 - A)·0SC3 + A·0SC4}]
where A is an amplitude control going from 0 to 1 and f is frequency in
hertz.
C0NVT Functions
11. Write a C0NVT function for the instrument shown in Fig. 32 which
will process a note record of the form
N0T T 2 D A F ;
where A is amplitude in decibels and F is frequency in hertz. V50 is the
proportion of vibrato. For each N0T record, C0NVT should write out
three records to produce a three-note chord, the highest voice having a
frequency of F Hz, the middle voice F/2 Hz, and the lowest voice F/4 Hz.
12. Write a C0NVT function for the instrument shown in Fig. 33 which
reads a N0T record of the form
N0T T 3 D A1 A2 ... An Freq
where A1 ... An is a sequence of amplitudes in decibels and Freq is frequency
in hertz. The C0NVT function outputs n + 1 successive notes of equal
duration, whose total duration is D. The first note starts at amplitude 0
(linear scale) and ends at A1 (dB), the second goes from A1 to A2, ..., the
last goes from An (dB) to 0 (linear scale).
Composing Subroutines
18. Write a set of PLF routines that will process note data in Pass I
memory. Assume that the note data are stored in the Pass I D array in the
manner used for the Fig. 41 example, and that notes will be written for the
instrument shown in Fig. 39. Write the following subroutines:
(a) PLF1 rewrites n notes in the D array, multiplying all logarithmic pitch
intervals by S, adding a constant K to the logarithmic pitch intervals, and
changing the tempo by a factor T.
(b) PLF2 substitutes a new note for note n in the array.
(c) PLF3 makes a copy of n notes starting at D(m) and stores the copied
notes at D(p), overwriting anything that was previously at D(p).
(d) PLF4 divides each of n notes starting at D(m) into k notes of equal
length whose total duration equals that of the note they replace. The new
notes are written starting at D(p).
(e) PLF5 writes N0T records for n notes starting at D(m). The starting
times of all notes are shifted by T sec.
Use these subroutines to compute a composition.
Graphic Scores
19. Write a subroutine PLF1 that will generate pitch and amplitude
functions as the computed functions
Pitch(t) = f1(t) * f2(t) + f3(t) * f4(t)
Amplitude(t) = f5(t) * f6(t) + f7(t) * f8(t)
where f1(t) through f8(t) are functions stored in the D array. Compute the
starting and stopping times of notes as the positive-going zero crossings and
the negative-going zero crossings, respectively, of a function
Notes(t) = f9(t) * f10(t) + (1 - f9(t)) * f12(t)
where f9(t), f10(t), f12(t) are stored in the D array. Let f10(t) and f12(t)
correspond to the rhythmic sequence of two well-known melodies. What
notes will be generated when f9(t) = 1? when f9(t) = 0? when f9(t) has
some intermediate value? Follow the general procedures used in the Fig.
44 example.
Pitch Quantizing
20. Write PLS1, a pitch-quantizing routine which will quantize a voice
for instrument 1 into the closest note in the C major scale. Assume that
voices for instruments 2 and 3 produce notes in synchrony with instrument
1. Adjust these voices to harmonize instrument 1 according to the following
rules.
(a) Harmonize C and E with the chord CEG.
(b) Harmonize F and A with the chord FAC.
(c) Harmonize B and D with GBD.
(d) Harmonize G with CEG if it starts on a multiple of four beats and
with GBD if it starts on any other beat.
Use a minimum adjustment of the other voices to achieve these chords.
Interconnected Instruments
21. Define an orchestra and an appropriate C0NVT function so that
the output of an instrument is the sum of two 0SC's, the proportion of
each being determined by two separate instruments I1 and I2. The
proportion will change continuously and frequently during the course of the notes
to add interest to the sound quality. Use LSG unit generators and follow
the general procedures of the Fig. 46 example.
3 Music V Manual
M. V. Mathews, Joan E. Miller,
F. R. Moore, and J. C. Risset
1. Introduction
This chapter contains a detailed description of the operation and
structure of the Music V program. It provides reference material for
users of Music V and source material for those who desire intimate
knowledge of a sound-generating program in order to write their
own.
Music V is the direct descendant of Music IV, a program that was
widely used for five years and has been described in the literature. 1
Music V had to be rewritten to change from a second to a third genera-
tion computer (the IBM 7094 to the GE 645). However, in the process
certain improvements were made, especially changes that made the
program more easily adapted to other computers. It may be helpful to
list these changes for the benefit of users of Music IV.
Overview of Music V
A block diagram of the over-all operation of the programs is shown
in Fig. 48. The main programs, the principal subroutines, the flow of
control, and the flow of data are indicated. The few basic machine-
language programs are especially marked.
Pass I causes the score to be read by the READ1 subroutine. The
score may be thought of as a sequence of data cards prepared by the
user, although the actual medium could also be a computer-connected
typewriter, a graphic computer, or a data file.
[Fig. 48 key: flow of control in program; flow of data; optional branch; F0RTRAN IV subroutine; machine-language subroutine]
Cards are processed by Pass I in the order in which they occur in the
score. Data are grouped into data statements which are terminated by a
semicolon; a data statement need not correspond to a single card. The
first field of the data statement specifies an operation code, and the
second field specifies an action time when the operation is to be done.
This time is measured from the beginning of each section of the music.
The other fields may vary depending on the particular operation code.
The total number of fields may vary; no more than necessary need be
used.
The principal operations are to
its subroutines and then proceeds to the other passes. The actual
F0RTRAN programs are, of course, the ultimate and best description
of Music V; they should be read along with the manual.
2. Description of Pass I
The purpose of Pass I is to read the input data (score) and translate
it into a form acceptable to the subsequent passes. The operation is
diagrammed in Fig. 49.
READ1 are necessary for computers of different word length and for
different modes of input (see Section 7 for details).
The input data comprise a series of data statements punched in free
format in columns 1 through 72 of cards. A data statement need not
correspond to a single card.
A data statement begins with an operation code and is terminated by
a semicolon. Other fields of information in the statement are separated
by blanks (any number) or commas. Null fields, i.e., those denoted by
successive commas, are assumed to have the value 0. With the exception
of statements used in instrument definitions (see Section 4), the fields
of a data statement are referred to as P fields since they load sequentially
into the P array located in C0MM0N storage in Pass I. 2 The operation
code, written as a three-letter mnemonic (see Section 3), is converted to a
numerical equivalent and goes into P(1); the second field, containing
an action time that specifies when the operation corresponding to the
code is to be performed, goes into P(2). The other fields are interpreted
according to the specifications of the various operation codes. If a
field other than the OP code is written as an asterisk (*), the value
stored in the corresponding position of the P array will be the value
previously stored there. This feature can be employed to advantage
when parameters remain constant over a sequence of data statements.
The input data are terminated with the data statement having the
operation code of TER. Failure to provide this statement will result in
an error comment.
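The free-format rules above can be collected in a small reader. The Python sketch below is not READ1: it ignores card columns and error checking, returns tuples instead of filling C0MM0N storage, and knows only the numerical codes quoted elsewhere in this book for N0T, INS, and PLS (any other mnemonic is given the placeholder code 0). A trailing comma before the semicolon is treated as one more null field, which is our assumption.

```python
OP_CODES = {"N0T": 1, "INS": 2, "PLS": 10}   # only codes quoted in the text

def read_statements(text, op_codes=OP_CODES):
    """Return (mnemonic, field_count, p_values) for each data statement."""
    p = [0.0] * 100                    # stands in for the P(100) array
    out = []
    for raw in text.split(";"):        # a semicolon ends each statement
        if not raw.strip():
            continue
        fields = []
        for chunk in raw.split(","):   # commas or blanks separate fields
            parts = chunk.split()
            fields.extend(parts if parts else [None])   # None: null field
        op = fields[0][:3]             # only three characters are scanned
        if op == "C0M":
            continue                   # comment: printed, not processed
        if op == "TER":
            break                      # end of the input data
        p[0] = float(op_codes.get(op, 0))
        for n, field in enumerate(fields[1:], start=1):
            if field is None:
                p[n] = 0.0             # null field has the value 0
            elif field != "*":
                p[n] = float(field)    # "*" keeps the previous value
        out.append((op, len(fields), p[:len(fields)]))
    return out

score = "N0TE 0 4 .8 60 0 ; N0T 1 * * * .167 ; C0MMENT hello ; N0T 2,,1 ; TER 3 ;"
statements = read_statements(score)
```

The second statement inherits P3 through P5 from the first through its asterisks, and the field count returned with each statement plays the part of IP(1).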
The input program makes certain checks on the data statements and
when errors are detected the value of IP(2), located in C0MM0N
storage, is set to 1. Since this location is initially 0, Pass I can verify at
its conclusion whether or not errors have been detected and, if so, the
run is terminated without proceeding to Pass II. Termination is accom-
plished by calling a nonexistent subroutine named HARVEY.
As the data cards are read, they are printed, and any error comments
are printed out after the offending statement. Data statements begin-
ning with operation code C0M result only in printing and are not
processed further. Such statements may be used to annotate the input
data with comments.
In addition to establishing the appropriate values in the P fields,
READ1 counts the number of P fields in the data statement and sets
IP(1) (in C0MM0N storage) to this count. Pass I is then able to process
the data statement as is required by the operation code and to write
2 C0MM0N storage in Pass I is arranged according to the statement,
C0MM0N IP(10), P(100), D(2000)
Numerical Value  Mnemonic  Purpose
a This code number is used only by READ1. A data statement beginning with
C0M is printed but is not processed further.
Remarks
1. Only the first three characters of the operation code mnemonic
are scanned; thus a user may write N0TE, INSTRUMENT, GEN-
ERATE, SECTI0N, TERMINATE, or C0MMENT in place of the
three-letter codes if he prefers.
2. Integer-valued P fields may be written with or without decimal
points.
4. Definition of Instruments
An instrument definition begins with the data statement "INS t n ;"
where t specifies the time at which instrument n is to be defined.
Subsequent data statements indicate the unit generators used in the
instrument and their associated parameters. The data statement
"END ;" terminates the definition.
The unit generators that are recognized by name (i.e., three-letter
mnemonic) by READ1 follow.
Name  Parameters  Type Number  Purpose
Data statements that specify unit generators may begin with the
three-letter mnemonic name or with the type number. READ1 recog-
nizes the 12 types listed in the table above by name, 3 and makes a
check on the proper number of parameters. If, for example, four or
six parameters are listed for 0SC, which requires five parameters, an
error condition will result, causing the job to terminate at the conclusion
of Pass I after all input cards have been scanned. Since unit generators
may be labeled by type number as well as name, it is possible to add
units to the subroutines F0RSAM (coded in F0RTRAN IV) or
SAMGEN (coded in basic machine language) used in Pass III without
the need for modifying READ1. Data statements referring to these new
units by type number will be accepted by READ1, but no check will be
made for proper number of parameters.
The notation for these parameters used on the data statement is as
follows:
Pn refers to nth P field on note card
Vn refers to nth location in variable storage of Pass III
Fn refers to nth stored function
Bn refers to nth I-O block used by units
For example, instrument No. 3 would be defined at t = 10 by the
following data statements:
INS 10 3 ;
0SC P5 P6 B2 F1 P30 ;
AD2 P7 V1 B3 ;
0SC B2 B3 B2 F2 P29 ;
0UT B2 B1;
END;
READ1 translates each mnemonic data statement into an all-numerical
data statement as follows:
(1) In all data statements, PI contains 2, the numerical equivalent
of INS, and P2 contains the action time (10 in the example).
(2) P3 contains the instrument number (3) in the first data statement.
(3) In the second through the last data statements, P3-Pn contains
the numerical equivalent of the mnemonic data statement fields
P1 ... Plast, respectively. The name equivalents for the unit
generators are their type numbers listed above. The equivalents
of the P's, V's, etc., are as follows:
3 The "named" generators change frequently. The table describes the state of
affairs in April 1968 at Bell Laboratories.
5. Unit Generators
0UT: Output Unit (Numerical equivalent = 1)
Diagram: [not reproduced]
B1 is often used as the output block. The location of the output block
must be compiled into IP(10) (see Section 17).
RAN
MUSIC V MANUAL 129
+1
o ~~-~-----::l~t-------+---'--- Samples
-I
etc.
The first quarter of Fj gives the attack shape, the second quarter of Fj
gives the steady state, the third quarter of Fj gives the decay shape, the
fourth quarter is unused and should be zero.
Specifically, the sections of Fj and the scanning rates are shown in
Fig. 51.
[Fig. 51: attack, steady-state, and decay sections of Fj, scanned at rates
of I2, I3, and I4 locations/sample, respectively.]
In a typical use
P6, P7, and P8 determine attack, steady state, and decay times, respec-
tively. P5 determines the maximum amplitude. P9 determines the
frequency. F1 determines the envelope and F2 the oscillator waveshape.
Typically P6, P7, and P8 are computed by an elaborate C0NVT
function (see Chapter 2, section on Additional Unit Generators, ENV).
and
$$S_{i+1} = S_i + I_i$$
where
$O_i$ = the ith output sample
$A_i$ = the ith amplitude input
$I_i$ = the ith increment input (controls frequency)
$F$ = a stored function (controls waveshape)
$S_i$ = the ith sum of increments
$FL$ = the length of the stored function (in samples)
Assume for a moment that the stored function is a representation of
a sine wave occupying 101 computer locations, F(0), F(1), ... , F(100).
[Figure omitted: samples of the stored sine function, F(0) through F(100),
ranging between +1 and -1.]
The value of F(0) is sin (0/100 * 2π), F(1) is sin (1/100 * 2π), F(2) is
sin (2/100 * 2π), etc. Since 0 ≤ |sin x| ≤ 1.0, we may multiply the
values of the function by any amplitude A to produce output samples
in the desired range, 0 ≤ |Oi| ≤ A.
How does the oscillator reproduce this sine wave at any frequency?
Assume that we have fixed the sampling rate at 10,000 samples per
second. This means that the digital-to-analog converter will convert
10,000 samples into sound every second, and each sample number we
output represents 1/10,000 second of sound. If we multiply the stored
function shown above by an appropriate amplitude and output it
directly, then each period of the wave will contain 100 samples, and
it will be heard 10,000/100 or 100 times per second. This corresponds
to a frequency of 100 Hz. Since the sampling rate is fixed, to double the
frequency of the sound we must halve the number of samples per period
of the wave. We do this simply by referring to every other value of the
stored function.
[Figure omitted: the stored function sampled at every other value, I (= 2).]
Thus the output samples will be given by the relations
O(1) = F(0) * A (S = 0)
O(2) = F(2) * A (S = 2)
O(3) = F(4) * A (S = 4)
etc.
The output wave then has 50 samples per period and is heard at
10,000/50 or 200 Hz. To obtain the output Oi in this case, the
independent variable in the function F(S) is incremented by 2 each time the
function is referred to. If the increment used was 4, we would output
100/4 = 25 samples per period, or a sine wave of 10,000/25 or 400 Hz.
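The lookup procedure just described can be sketched in a few lines of Python. This is an illustration only, not the Music V code: the names make_sine_table, oscillator, and frequency_hz are hypothetical, the table is assumed to hold one period in exactly 100 locations (the repeated end point F(100) is left out), and S is simply truncated to select a table entry.

```python
import math


def make_sine_table(length=100):
    """One period of a sine wave stored in `length` locations (assumed layout)."""
    return [math.sin(2 * math.pi * k / length) for k in range(length)]


def oscillator(table, amplitude, increment, n_samples):
    """Return samples O_i = A * F(S_i), where S_{i+1} = S_i + I, wrapped at the table end."""
    out = []
    s = 0.0
    for _ in range(n_samples):
        out.append(amplitude * table[int(s)])  # truncate S_i to a table index
        s = (s + increment) % len(table)       # start a new period when the table is exhausted
    return out


def frequency_hz(sampling_rate, table_length, increment):
    """Frequency in hertz = sampling rate * increment / function length."""
    return sampling_rate * increment / table_length
```

With a sampling rate of 10,000 samples per second and a 100-location table, frequency_hz gives 100, 200, and 400 Hz for increments of 1, 2, and 4, matching the figures worked out in the text.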
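In general then
$$\text{Frequency in hertz} = \frac{\text{sampling rate}}{\text{samples per period}}$$
and
$$\text{Samples per period} = \frac{\text{function length}}{\text{increment}}$$
therefore,
$$\text{Frequency in hertz} = \frac{\text{sampling rate} \times \text{increment}}{\text{function length}}$$
and
$$\text{Increment} = \frac{\text{frequency in hertz} \times \text{function length}}{\text{sampling rate}}$$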
The table shows the results of computing 500 values of sine x, using
various methods and stored function lengths. The table entries are the
percentage rms error.
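A comparison of this kind can be reproduced approximately with the sketch below. It is not the procedure used for the original table, whose entries are not reproduced here. The three methods (truncation, rounding, and linear interpolation), the random choice of the 500 test points, and the expression of the error as a percentage of a peak amplitude of 1 are all assumptions, and the function names are hypothetical.

```python
import math
import random


def make_table(n):
    """Sine table with n steps per period; entry n repeats entry 0."""
    return [math.sin(2 * math.pi * k / n) for k in range(n + 1)]


def lookup(table, x, method):
    """Approximate sin(x) from the table by the named method."""
    n = len(table) - 1
    pos = (x / (2 * math.pi)) % 1.0 * n   # position in table locations, 0 <= pos < n
    k = int(pos)
    if method == "truncate":
        return table[k]
    if method == "round":
        return table[int(pos + 0.5)]
    if method == "interpolate":
        return table[k] + (table[k + 1] - table[k]) * (pos - k)
    raise ValueError(method)


def rms_error_percent(n, method, n_values=500, seed=0):
    """Percentage rms error over n_values test points (relative to amplitude 1)."""
    rng = random.Random(seed)
    table = make_table(n)
    total = 0.0
    for _ in range(n_values):
        x = rng.uniform(0.0, 2 * math.pi)
        err = lookup(table, x, method) - math.sin(x)
        total += err * err
    return 100.0 * math.sqrt(total / n_values)
```

Calling rms_error_percent for several table lengths and each method shows the expected trend: interpolation gives the smallest error for a given length, and every method improves as the table grows.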
[Flowchart fragment: convert the BCD field from the L pointer to the next
blank into a floating-point number.]
the operation code and action time appear only on the first data record.
READ1 takes the operation code (= 2) and the action time from the
first data record and stores these in P(1) and P(2) in all subsequent
records connected with instrument definition. The value of P(3) is the
type number of the unit, and the remaining fields are interpreted and
converted to floating point and stored starting in P(4). Word (i.e., field)
count for the statement is established in IP(1).
The conversion from BCD to floating point is done by a subroutine
(at 70) which finds the position of the decimal point in the field of
characters (or supplies it at the end if missing) and then multiplies the
characters, which are expressed as integers, by the appropriate power of
10 and sums over all characters in the field.
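The conversion described above can be sketched as follows. This is a minimal Python illustration, not a transcription of the routine at statement 70: the function name field_to_float is hypothetical, the field is assumed to be already isolated as a string of characters, and the handling of a leading sign is an assumption that the text does not describe.

```python
def field_to_float(field):
    """Convert a field of digit characters, with an optional decimal point, to a float."""
    sign = 1.0
    if field and field[0] in "+-":          # sign handling is an assumption
        if field[0] == "-":
            sign = -1.0
        field = field[1:]
    point = field.find(".")
    if point < 0:
        point = len(field)                  # supply the decimal point at the end if missing
    digits = field.replace(".", "", 1)
    total = 0.0
    for pos, ch in enumerate(digits):
        # each character, taken as an integer, times the appropriate power of 10
        total += int(ch) * 10.0 ** (point - 1 - pos)
    return sign * total
```

For the field "12.5" the decimal point is found at position 2, so the digits 1, 2, and 5 are weighted by 10, 1, and 0.1 before being summed.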
Any errors that are detected cause an error comment to be printed
below the printout of the data statement in which the error occurred.
In cases other than that of an incorrect operation code, the entire
statement is scanned so that all errors will be detected. Incorrect
operation codes, however, prevent proper interpretation of the remain-
ing fields in the data statement. When errors occur, a flag is set in
C0MM0N storage (namely IP(2) is set to 1) so that Pass I may ter-
minate the job at its conclusion. Furthermore, when errors are de-
tected, the data statement is not returned to Pass I but control returns
to the entry point of READ1 to obtain the next data statement.
It will be noted that the input array for the card data is named CARD
which is (F0RTRAN) equivalent to ICAR. Also, IBCD is equivalent
to BCD. This equivalence is necessary because the characters, when read
in with format A1, require a floating-point designation. However, for
purposes of comparison, the data must be regarded as in integer form.
Hence, the characters must be right adjusted (moved to the right end
of the computer word). Similarly, when the organized data statement is
to be printed out, it must be put back into left-adjusted form so that it
may be printed out in A1 format. Consequently, this routine uses two
subroutines, which must be written in machine language and, therefore,
supplied by the user. READ1 (and READ0) makes the following calls:
The action time TA is the same as the action time for the instrument
definition. Inspection is done by the first statement in READ1. END
equals 1 at the end of an instrument definition and equals 0 otherwise.
SNA8 equals 1 if the mono-stereo mode is changed and equals 0
otherwise. STER = 0 if the last out box is 0UT, and STER = 1 if
the box is STR. Music V is assumed initially to be in the monaural
mode.
If the program is to be run on a machine of different word length,
the F0RTRAN DATA statements for arrays IBC, IVT, and L0P
must be changed. These contain right-adjusted BCD characters for the
break characters used in delimiting the input, the parameter types P,
V, F, and B used in specifying unit generators, and the characters used
in the three-letter mnemonic names for operation codes. In a 36-bit
machine such data are entered as 6H00000X, in a 24-bit machine as
4H000X. If the input data are to be read in from any medium other
than cards, the two "READ" statements at 15 and under READ0
need to be changed as required. The number of characters obtained by
executing a "READ" command is a variable NC, established in this
version of the program as 72. The arrays that hold these data have been
dimensioned to accept a maximum value of NC equal to 128.
The break characters delimiting the fields of input data are the
blank, comma, and semicolon; NBC, the number of break characters,
is equal to 3. (If typewriter input is to be substituted, an additional
break character may be needed equal to the carriage return.)
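The field-delimiting rule can be illustrated with a short Python sketch. It is not the READ1 scanning code: the name split_fields is hypothetical, and two behaviors are assumptions made for the example, namely that consecutive break characters produce no empty fields and that the semicolon ends the data statement.

```python
BREAK_CHARACTERS = " ,;"   # blank, comma, and semicolon (NBC = 3)


def split_fields(statement):
    """Split one data statement into its fields at the break characters."""
    fields = []
    current = ""
    for ch in statement:
        if ch in BREAK_CHARACTERS:
            if current:
                fields.append(current)
                current = ""
            if ch == ";":
                break                      # assumed: the semicolon terminates the statement
        else:
            current += ch
    if current:
        fields.append(current)
    return fields
```

Adding a carriage return for typewriter input would amount to appending one more character to BREAK_CHARACTERS.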
The most frequent change in READ1 is the addition of other OP
codes or unit-generator names. The following steps will accomplish
this change:
(1) Add the three-character mnemonic to the end of the L0P array.
(The size of the L0P array may or may not need to be increased;
the word count on the L0P DATA statement must be increased.)
(2) Increase N0PS by 1.
(3) Put another branch at the end of the G0 T0 at 29.
(4) Write appropriate code for the branch. The code for branches
201-210 or 300-1200 will usually serve as a model.
A few of the variables in the program are
NUMU Normally equals 0. It is set to 1 to disable checking the
number of fields in a unit generator.
NPW The number of fields expected in a unit generator, not
counting the name. For example, AD3 P1 P2 P3 B1 ;
would have NPW = 4.
L Scanning index for the IBCD array. It normally points
to the character just ahead of the next field to be pro-
cessed. L is changed by many parts of the program,
including the BCD to floating-point converter.
I Scanning index for the ICAR array.
J Scanning index to store characters in the IBCD array.
The calling sequences are the same as those of READ0 and READ1,
namely
CALL READ0
and
CALL READ1
Debug READ0 does nothing.
Debug READ1 reads one statement into the P array according to the
F0RTRAN statement
READ 1, K, (P(J), J = 1, K)
where the format statement 1 is
1 F0RMAT(I6, 11F6.0/(12F6.0))
Thus twelve numbers are read from the first 72 columns of each card.
8. PLF Subroutines
A data statement of the type
PLF 0 n D4 D5 ... Dm ;
will cause the following call to take place during Pass I at the time the
data statement is read
CALL PLFn
where n is some integer between 1 and 5. PLFn is a subroutine which
must be supplied by the user. These subroutines can perform any
function desired by the user. Usually they will generate data statements
for Pass II or manipulate Pass I memory (the D(2000) array).
The information of the data statement PLF, 0, n, D4, ... , Dm will be
placed in the P(100) array in P(1) - P(m) at the time PLFn is called.
The P, D, and IP arrays are kept in common storage and hence are
available to the PLFn routine. The dimension and common statements
in the PLF routine and in Pass I must, of course, agree. For examples
and a further discussion of PLF subroutines, see Chapter 2, section on
Composing Subroutines-PLF.
Pass I errors
  Nonexistent OP code on data statement                    10
  Nonexistent PLF subroutine called                        11
    (i.e., in call PLFn, n < 1 or n > 5)
Pass II errors
  Too many notes in section, D or I array full             20
  Incorrect OP code in Pass II                             21
  Incorrect OP code in Pass II                             22
  Nonexistent PLS subroutine called                        23
    (i.e., in call PLSn, n < 1 or n > 5)
Pass III errors
  Incorrect OP code in Pass III
    (code is < 1 or > 12)
  Too many voices simultaneously playing (a)               2
  Too many voices simultaneously playing                   3
(a) The maximum number of voices must be equal to or less than the number of
note parameter blocks (see Section 16).
[Fig. 53. Pass II block diagram, Music V: initialize section; read section
of data (CALL READ2); CALL S0RT.]
After a section has been processed, the next section is read. The
section-reading sequence is terminated by a TER card via a flag IEND
which is set to 1 when TER is encountered. IEND is checked after each
section is processed.
The error comments produced by Pass II are printed by ERR0R and
are discussed in Section 9.
Pass II contains a general-purpose memory, G(1000), which is
primarily used by the PLS subroutines and by the C0NVT function.
Blocks of locations starting at G(n) may be set with a SV2 AT n x ... ;
record. The setting occurs at the action time, AT, relative to the other
data records.
Certain locations in the G array have special functions:
11. WRITE2
Pass II calls WRITE2(11) in order to:
(1) Invoke the optional metronome operations described below
(2) Produce the optional Pass II report on the printer
(3) Call C0NVT to modify data record parameters
(4) Write (N, (P(J), J = 1, N)) on data file 11 for subsequent use by
Pass III
In order to utilize the metronome operations available in Pass II, a
nonzero value must be stored in the array location G(2). This value is
the beginning subscript in the G array of a tempo function such as the
one shown in Fig. 54. This is a function constructed of any number of
[Fig. 54. Tempo function: metronome marking M (vertical axis, 0 to 240)
versus time in beats (horizontal axis), with breakpoints such as
B0 = 0, B1 = 10, B2 = 15 and markings such as M0 = 60, M4 = 160, M5 = 126.]
$$T_i = T_{i-1} + (B_i - B_{i-1}) \cdot \frac{60}{F(B_i)}$$
where
$T_i$ is current time in seconds, which replaces the value in P(2)
$B_i$ is current beat number, the value found in P(2)
$F(B_i)$ is the value of the tempo function at beat $B_i$ and
$T_{i-1}$ and $B_{i-1}$ are the time and beat of the previous data record
and
$$D_i = L_i \cdot \frac{60}{F(B_i)}$$
where
$D_i$ is the duration in seconds
$L_i$ is the duration of note in beats and
$F(B_i)$ is as above
The tempo function itself may be placed into the G array via an
SV2 instruction. The function shown in Fig. 54, for example, could be
placed in the G array beginning at G(30) by the data records:
SV2, 0, 2, 30 ;
SV2, 0, 30, 0, 60, 10, 60, 15, 120, 45, 40, 56, 160, 63, 126
These metronome operations can be turned off at any time by setting
G(2) at 0. If the metronome operations are so turned off, P(2) and P(4)
are not affected by WRITE2 and are assumed to be in seconds.
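The metronome scaling can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the WRITE2 code: the conversion factor is taken to be 60/F(B), i.e., seconds per beat at a metronome marking of F(B) beats per minute; the tempo function is assumed to be interpolated linearly between the (beat, marking) pairs stored in the G array; and the names tempo_at, beats_to_seconds, and duration_seconds are hypothetical.

```python
def tempo_at(beat, breakpoints):
    """Evaluate a line-segment tempo function given as b0, m0, b1, m1, ..."""
    pairs = list(zip(breakpoints[0::2], breakpoints[1::2]))
    if beat <= pairs[0][0]:
        return pairs[0][1]
    for (b0, m0), (b1, m1) in zip(pairs, pairs[1:]):
        if beat <= b1:
            return m0 + (m1 - m0) * (beat - b0) / (b1 - b0)
    return pairs[-1][1]                     # hold the last marking past the final breakpoint


def beats_to_seconds(beats, breakpoints):
    """Apply T_i = T_(i-1) + (B_i - B_(i-1)) * 60 / F(B_i) to successive action times."""
    times = []
    t_prev, b_prev = 0.0, 0.0
    for b in beats:
        t_prev = t_prev + (b - b_prev) * 60.0 / tempo_at(b, breakpoints)
        b_prev = b
        times.append(t_prev)
    return times


def duration_seconds(length_beats, beat, breakpoints):
    """Apply D_i = L_i * 60 / F(B_i) to a note duration given in beats."""
    return length_beats * 60.0 / tempo_at(beat, breakpoints)


# The tempo function stored by the SV2 record in the text, starting at G(30).
TEMPO = [0, 60, 10, 60, 15, 120, 45, 40, 56, 160, 63, 126]
```

With this function, data records at beats 0, 10, and 15 receive action times of 0, 10, and 12.5 seconds: the first ten beats pass at 60 beats per minute, and the step from beat 10 to beat 15 is scaled by the marking of 120 found at beat 15.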
The Pass II report is printed automatically by WRITE2 if G(1) = 0.
The Pass II report may be suppressed by setting G(1) ≠ 0 with an SV2
instruction (e.g., SV2, 0, 1, 1 ;). It consists of each data statement
printed in order of ascending action times. Each data statement is
shown exactly as it is presented to Pass III (if the data statements do
not exceed 10 fields, they are printed one per line; longer data
statements are continued on the next line). In addition, if the metronome
function is in use, P(2) and P(4) will have been converted into seconds, and
the original values of these parameters (in beats) are printed to the
right of each print line.
6 Durations are given on N0T cards only. P(4) is affected if and only if P(1) = 1
(play a note).
WRITE2 calls C0NVT immediately before it returns. G(5) and G(6)
contain the original values of P(2) and P(4) if metronomic scaling was
used.
where n is some integer between 1 and 5. The call is carried out at action
time AT relative to the processing of other data statements in Pass II.
PLSn is a subroutine that must be supplied by the user. It can perform
any desired function, but a typical use would be to change a note
parameter, such as pitch, according to some composing rule. For more
information on the use of PLS routines, see the tutorial examples in
Chapter 2.
The data statement PLS AT n D4 . . . is stored in numerical form in
the Pass II D(10,000) array at the time the call to PLSn takes place.
The arrangement is
D(M) = word count
D(M + 1) = 10 (the numerical equivalent of PLS)
D(M + 2) = AT
etc.
M = IP(2)
Thus, for example, in order to find D5 from the data statement, PLS
must look up M at IP(2) and then look in D(M + 5). Such a roundabout
procedure is necessary because of the sorting.
The dimension and common statements in Pass II and PLSn must,
of course, be identical.
starting time and duration in beats, if the metronomic scaling has been
used.
C0NVT may perform complicated logical functions. It may increase
or decrease the number of parameters, changing IP(1) accordingly.
For more information, see the tutorial examples described in Chapter 2.
Over-all Operation
The over-all operation of Pass III is diagrammed in Fig. 56. The
program is started by reading a few constants from IP, including the
sampling rate IP(3) and the scale factor for variables IP(12).
A section is started by resetting the "played to" time T(1) to zero,
since time is measured from the beginning of each section.
The main loop of Pass III consists simply of reading a data statement
into the P array. As in previous passes, the P array is used exclusively
for reading and processing data statements. The operation code always
appears in P(1) and the action time in P(2). Samples of the acoustic
output are generated until the "played to" time equals the action time.
Then the operation code is interpreted and executed. The next data
statement is then read and processed.
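The main loop can be outlined in Python as below. This is a simplified sketch, not the Pass III program: it ignores the termination of notes between action times, and the names run_section, generate, and execute are hypothetical callbacks standing in for sample generation and operation-code interpretation.

```python
def run_section(statements, sampling_rate, generate, execute):
    """Process data statements in order of action time.

    Each statement is a sequence whose first two entries are the operation
    code, P(1), and the action time in seconds, P(2).
    """
    played_to = 0.0                         # time is measured from the start of the section
    for stmt in statements:
        op_code, action_time = stmt[0], stmt[1]
        n_samples = int(round((action_time - played_to) * sampling_rate))
        if n_samples > 0:
            generate(n_samples)             # play up to the action time
            played_to = action_time
        execute(op_code, stmt)              # then interpret the operation code
```

The number of samples generated before each statement is the elapsed time multiplied by the sampling rate, which is the same product that appears in the Pass III flow diagram as ISAM.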
Instrument Definitions
If the operation code defines an instrument, the definition is entered
in the I array starting with the first empty location in the table for
instrument definitions. The location of the beginning of this instrument
definition is recorded in the location table for instrument definitions.
Different instruments are designated by being numbered.
[Flowchart fragments, panels (a) and (b): ISAM = [T(3) - T(1)] * [sampling
rate]; T(1) = T(3); remove TMIN from TI and remove note parameters in I;
go through note-parameter blocks in the I array; go through the
unit-generator list for the given instrument.]
Note-parameter blocks in the I array are kept intact for the duration of
a note. Hence certain quantities that must be continuous throughout
the note, particularly SUM in the oscillator, should be kept in the
note-parameter block.
Input-output blocks for unit generators must not be incorrectly
overwritten inside an instrument. The same block may be used as
input and output to a given unit generator, since the input is read before
the output is written. However, a block cannot be used simultaneously
for two different purposes, for example, as two inputs to a unit generator.
That is, it should be kept in mind that an input-output block may
contain only one set of values at a time (see Fig. 58).
Fig. 58. Examples of (a) an incorrect and (b) a correct input-output block.
IP(7) -->  Note parameters
IP(5) -->  Location table for instrument definitions.
           The definition of instrument n begins at I(I(IP(5) + n))
IP(4) -->  Instrument definition table
I(n) = P(m)
The following constants are compiled into the IP(n) array. The array
is constructed by a BL0CK DATA subprogram and is stored in labeled
C0MM0N memory, labeled PARM.
IP(1)  Number of operation codes in Pass III
IP(2)  Beginning subscript of functions
IP(3)  Standard (default) sampling rate
IP(4)  Beginning subscript of instrument definitions
IP(5)  Beginning subscript of location table for instrument
       definitions
IP(6)  Length of a function
IP(7)  Beginning subscript of blocks of note-parameter storage
IP(8)  Length of a block of note-parameter storage
IP(9)  Number of blocks of note parameters (equals the maximum
       number of voices that can play simultaneously)
IP(10) Subscript of unit-generator input-output block which is
       reserved for storage of samples of the acoustic output
       waveform. SAM0UT puts out samples from this block
IP(11) Sound zero. This is an integer with the decimal point at the
       right end of the word
IP(12) Scale factor for unit-generator variables (input-outputs,
       etc.)
IP(13) Subscript of beginning of unit-generator input-output
       blocks
IP(14) Length of a unit-generator input-output block
IP(15) Scale factor for functions
S1, ..., Sn  Subscripts and parameters pertaining to first unit
             generator
I(m)         Type of second unit generator
             Pointer to third unit generator
  .
  .
I(r)   0     Terminates description of instrument
The Si's that specify inputs, outputs, and functions for the unit
generators have the following meaning:
If Si < 0, then |Si| is the subscript in I which specifies the beginning
of a function or of a unit generator input-output block.
If 1 ≤ Si ≤ 262,144, then Si is the number of a note-card parameter.
If 262,144 < Si, then Si - 262,144 is the subscript in I of a variable.
The number of the variable is Si - 262,144 - 100.
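The three cases can be summarized in a short Python sketch. The names decode_s and variable_number are hypothetical, and note_block_start stands for the subscript in I at which the parameters of the note being played begin; the arithmetic follows the rules stated above, and a zero value is not covered by those rules, so its treatment here is an assumption.

```python
VARIABLE_OFFSET = 262144        # 2**18


def decode_s(s, note_block_start):
    """Classify one S value and return (kind, subscript in the I array)."""
    if s < 0:
        # |S| is itself the subscript of a function or input-output block
        return ("block_or_function", -s)
    if s <= VARIABLE_OFFSET:
        # S is the number of a note-card parameter (S = 0 falls here by assumption)
        return ("note_parameter", s + note_block_start - 1)
    # otherwise S - 262,144 is the subscript of a variable
    return ("variable", s - VARIABLE_OFFSET)


def variable_number(s):
    """Number of the variable addressed by S, when S > 262,144."""
    return s - VARIABLE_OFFSET - 100
```

For example, with the note-parameter block starting at subscript 500, an S of 5 selects I(504), and an S of 262,247 selects I(103), which is variable number 3.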
20. F0RSAM
F0RSAM is a subroutine that contains unit generators written in
F0RTRAN. These may be used either separately or together with
SAMGEN which contains unit generators written in basic machine
language.
F0RSAM is called in Pass III by the statement
CALL F0RSAM
The call causes F0RSAM to compute NSAM (= I(5)) samples of the
[Flowchart fragment: common initialization of addresses and parameters]
      SUBROUTINE FORSAM
      DIMENSION I(15000),P(100),IP(20),L(8),M(8)
      COMMON I,P/PARM/IP
      EQUIVALENCE(M1,M(1)),(M2,M(2)),(M3,M(3)),(M4,M(4)),(M5,M(5)),(M6,M
     1(6)),(M7,M(7)),(M8,M(8)),(L1,L(1)),(L2,L(2)),(L3,L(3)),(L4,L(4)),(
     2L5,L(5)),(L6,L(6)),(L7,L(7)),(L8,L(8))
C     COMMON INITIALIZATION OF GENERATORS
      N1=I(6)+2
      N2=I(N1-1)-1
      DO 204 J1=N1,N2
      J2=J1-N1+1
      IF(I(J1))200,201,201
  200 L(J2)=-I(J1)
      M(J2)=1
      GO TO 204
  201 M(J2)=0
      IF(I(J1)-262144)202,202,203
  202 L(J2)=I(J1)+I(3)-1
      GO TO 204
  203 L(J2)=I(J1)-262144
  204 CONTINUE
      NSAM=I(5)
      N3=I(N1-2)
      NGEN=N3-100
  205 GO TO (101,300,300),NGEN
C     UNIT GENERATOR 101- INTERPOLATING OSCILLATOR
  101 SFU=IP(12)
      SFF=IP(15)
      SFUI=1./SFU
      SFFI=1./SFF
      SFUFI=SFU/SFF
      SUM=FLOAT(I(L5))*SFUI
      IF(M1)210,210,211
  210 AMP=FLOAT(I(L1))*SFUI
  211 IF(M2)212,212,213
  212 FREQ=FLOAT(I(L2))*SFUI
  213 XNFUN=IP(6)-1
      DO 223 J3=1,NSAM
      J4=INT(SUM)+L4
      FRAC=SUM-AINT(SUM)
  216 F1=FLOAT(I(J4))
      F2=FLOAT(I(J4+1))
  217 F3=F1+(F2-F1)*FRAC
      IF(M2)218,218,219
  218 SUM=SUM+FREQ
      GO TO 220
  219 J4=L2+J3-1
      SUM=SUM+FLOAT(I(J4))*SFUI
  220 IF(SUM-XNFUN)215,214,214
  214 SUM=SUM-XNFUN
  215 J5=L3+J3-1
      IF(M1)221,221,222
  221 I(J5)=IFIX(AMP*F3*SFUFI)
      GO TO 223
  222 J6=L1+J3-1
      I(J5)=IFIX(FLOAT(I(J6))*F3*SFFI)
  223 CONTINUE
      I(L5)=IFIX(SUM*SFU)
  300 RETURN
      END
J5 = J3 + L3 - 1
The particular unit generator is an oscillator that interpolates between
adjacent values of the function (see Section 6 for discussion of why
interpolation is useful). Computations are carried out in floating-point
arithmetic. Since the input data are fixed-point numbers, they must be
floated and scaled by appropriate constants. Scale factors for I-O
blocks and for functions are given in IP(12) and IP(15), respectively.
The necessary scaling constants are computed at 101.
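The floating and scaling can be followed more easily in a short Python sketch of the same computation. It is an illustration under stated assumptions, not a replacement for F0RSAM: only the case of constant amplitude and increment is covered, the running sum is kept as a floating-point number rather than being stored back in fixed point, and the names interpolating_oscillator, sfu, and sff (standing for the scale factors held in IP(12) and IP(15)) are hypothetical.

```python
def interpolating_oscillator(function, amp, freq, phase, n_samples, sfu, sff):
    """Interpolating table-lookup oscillator on fixed-point data.

    function : integers scaled by sff; the last entry repeats the first
    amp, freq: integers scaled by sfu (amplitude and increment)
    phase    : starting sum of increments, in table locations (float)
    Returns (samples as integers scaled by sfu, final sum of increments).
    """
    xnfun = len(function) - 1               # period of the function in locations
    a = amp / sfu                           # float and unscale the inputs
    inc = freq / sfu
    s = phase
    out = []
    for _ in range(n_samples):
        k = int(s)
        frac = s - k
        # linear interpolation between adjacent stored values
        f3 = function[k] + (function[k + 1] - function[k]) * frac
        s += inc
        if s >= xnfun:                      # wrap the sum at the end of the function
            s -= xnfun
        out.append(int(a * f3 * sfu / sff)) # rescale and truncate to fixed point
    return out, s
```

The sketch assumes that the increment is smaller than the function length, so that a single subtraction is enough to bring the sum back into range.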
21. SAMGEN
SAMGEN is one of the few basic machine language programs in
Music V. Consequently it must be written specifically for the particular
machine on which it is to be used. The Bell Laboratories program is
written in GMAP for a General Electric 635 computer. A few com-
ments about the program may be of use in designing programs for
other machines.
SAMGEN includes the unit generators of type numbers less than
100. The computation of the actual acoustic samples, which is the
preponderance of the computation in Music V, is done by SAMGEN.
The general form of SAMGEN is shown in Fig. 60.
SAMGEN is written in such a way that one procedure can be used to
set the parameters in all of its unit generators. This procedure accesses
the I array in common storage during Pass III in order to find out
I(3) the subscript in the I array of the note parameters for the
note being played
I(5) the number of samples to generate and
I(6) the subscript in the I array for the instrument definition table
of the unit generator being played.
The procedure then reads through the instrument description for the
unit generator being played. (See instrument description, Section 19.)
For each unit generator, the procedure expects a certain number of
inputs (Si's) in a certain order, e.g., if unit type = 2 (oscillator), then
[Fig. 60 flowchart fragment: initialize unit generator being played]
22. SAM0UT
SAM0UT is another GMAP subroutine called by Pass III which
(1) scales samples which are ready to be output, and (2) calls FR0UT
to output these samples onto magnetic tape. Samples (Si) are scaled
according to
$$S_i = S_i / 2^{18} + 2048$$
The calling sequence is
CALL SAM0UT (IARRAY, N)
where IARRAY = address of first sample to be output, and N is the
number of samples to be output.
Other routines used by SAM0UT are
FR0UT4
No common storage is used.
At the end of the composition, Pass III calls FR0UT with the call
CALL FR0UT3
FR0UT3 completes the output buffer, if it was only partially filled,
with zero-voltage samples, empties this last buffer onto tape, and
writes an end-of-file mark.
Packing of samples can be accomplished by machine-language
shifting instructions and buffering. Acoustic sample tapes typically are
unlabeled and unblocked, and use fixed-length records.
FR0UT3 prints a statement giving the number of samples out of
range in the file which has just been terminated.
3 Action Function T I N1
time No (j)
GEN2
GEN2 is a F0RTRAN subroutine to generate a function composed
of sums of sinusoids. The calling sequence is
CALL GEN2
Data are supplied by the P(n), I(n), and IP(n) arrays.
The jth function Fj(i) is generated according to the relation
3 Action 2 Function Al ±N
time No (j)
GEN3
General description:
GEN3 is a F0RTRAN subroutine which generates a stored function
according to a list of integers of arbitrary length. These integers specify
the relative amplitude at equally spaced points along a continuous
periodic function. The first and last points are considered to be the
same when the function is used periodically (e.g., by an oscillator).
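A function of this kind can be sketched in Python as follows. The sketch is not the GEN3 subroutine: the name gen3 is hypothetical, the values between the listed points are assumed to be joined by straight lines, and scaling of the result to the program's function scale factor is left out.

```python
def gen3(points, length):
    """Fill `length` locations from equally spaced points joined by straight lines (assumed)."""
    n_seg = len(points) - 1                 # number of intervals between listed points
    out = []
    for k in range(length):
        pos = k * n_seg / (length - 1)      # position measured in intervals
        j = min(int(pos), n_seg - 1)        # keep the final location inside the last interval
        frac = pos - j
        out.append(points[j] + (points[j + 1] - points[j]) * frac)
    return out
```

For instance, the five points 0, 8, 10, 8, 0 spread over nine locations give 0, 4, 8, 9, 10, 9, 8, 4, 0.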
Calling sequence:
CALL GEN3
Other routines used by GEN3 :
none
Data statement:
GEN, action time, 3, stored function number, P1, P2, ..., Pn ;
Examples:
The following Pi's will generate the functions shown below.
(1) 0, 1, -1
will generate
[Figure omitted]
(2) 0, 8, 10, 8, 0, -8, -10, 0
[Figure omitted]
Loudness
The perceived loudness of a sound depends on many factors in
addition to its intensity. For example, in order for a pure tone or
sinusoid at 100 Hz to be heard, its sound intensity must be 1000 times
greater than that of a pure tone at 3000 Hz. For most of the musical
range the perceived loudness increases as the 0.6 power of the sound
pressure (Stevens, 1961). The perceived loudness increases more slowly
with sound pressure for 3000-Hz tones than it does for very low fre-
quencies, say, 100 Hz; and in the uncomfortably loud range, tones of
equal power are about equally loud. This means that as we turn the
volume control up or down, the balance of loudness among frequency
components changes slightly.
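The 0.6-power relation can be made concrete with a small numerical sketch. It applies only to the part of the musical range where that exponent holds, and the function names are hypothetical.

```python
def loudness_ratio(pressure_ratio, exponent=0.6):
    """Ratio of perceived loudness for a given ratio of sound pressures."""
    return pressure_ratio ** exponent


def pressure_ratio_for_loudness(wanted_loudness_ratio, exponent=0.6):
    """Ratio of sound pressures needed for a given ratio of perceived loudness."""
    return wanted_loudness_ratio ** (1.0 / exponent)
```

Under this law a tenfold increase in sound pressure raises the perceived loudness by a factor of about 4, and doubling the loudness requires a little more than a threefold increase in pressure.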
Pitch
Transient Phenomena
Textbooks give harmonic analyses of the sounds of various musical
instruments, but if we synthesize a steady tone according to such a
formula it sounds little like the actual instrument. Steady synthesized
vowels do not sound like speech if their duration is long.
Temporal changes such as attack, decay, vibrato, and tremolo,
whether regular or irregular, have a strong effect on sound quality. A
rapid attack followed by a gradual decay gives a plucked quality to any
waveform. Also, the rate at which various partials rise with time and
the difference in the relative intensity of partials with loudness are
essential to the quality of the sound (Risset, 1965). Indeed it is at least
in part the difference in relative intensity of partials that enables us to
tell a loud passage from a soft passage regardless of the setting of the
volume control. This clue is lost in electronic music if the tones employed
have a constant relative strength of partials, independent of volume.
The "warmth" of the piano tone has been shown to be due to the
fact that the upper partials are not quite harmonically related to the
fundamental (Fletcher et al., 1962).
Consonance
Observers with normal hearing but without musical training find
pairs of pure tones consonant if the frequencies are separated by more
than the critical bandwidth (Plomp, 1966), or if the frequencies coincide
or are within a few hertz of one another (in this case beats are heard).
Pairs of tones are most dissonant when they are about a quarter of a
critical bandwidth apart. For frequencies above 600 Hz, this is about a
twentieth of an octave.
Excluding bells, gongs, and drums, the partials of musical instruments
are nearly harmonic. When this is so, for certain ratios of the frequencies
of fundamentals, the partials of two tones either coincide or are well
separated. These ratios of fundamentals are 2:1 (the octave), 3:2 (the
fifth), 4:3 (the fourth), 5:4 (the major third), and 6:5 (the minor third).
Normal observers find pairs of tones with these ratios of fundamentals
to be more pleasant, and intervening ratios less pleasant (Plomp, 1966).
Musical consonance and dissonance depend on many factors in
addition to frequencies of partials. For example, unlike nonmusicians,
classically trained musicians describe pairs of pure tones with these
simple numerical ratios of frequency as consonant and intervening
ratios as dissonant. The only reasonable explanation is that trained
musicians are able to recognize familiar intervals and have learned to
think of these intervals only as consonant.
Plomp (1966) has pointed out that, in order for complex tones to
attain a given degree of consonance, low tones must be separated by a
larger fraction of an octave than high tones, and he has observed that
composers follow this principle.
If the partials of a tone are regularly arranged but not harmonic, the
ratios of frequencies of the fundamental (or first partial) that lead to
consonance are not the conventional ones (Pierce, 1966).
Combination Tones
When we listen to a pure tone of frequency f1 and another tone of
somewhat higher frequency f2' we hear a combination tone of lower
Reverberation
Reverberation is important to musical quality; music recorded in
an organ loft sounds like a bad electronic organ. The reverberation for
speech should be as short as possible; for music about 2 sec is effective.
Music sounds dry in a hall designed for speech. Reverberation is not the
only effect in architectural acoustics. Our understanding of architectural
acoustics is far from satisfactory (Schroeder, 1966).
retrograde (in the words of Tovey) for the eye only? Transpositions
certainly are psychologically close, but what about augmentations and
inversions? What about changes in rhythm? What about manipulations
of the tone row?
References
Flanagan, J. L., and N. Guttman, "On the Pitch of Periodic Pulses," J. Acoust.
Soc. Amer. 32, 1308 (October 1960).
Fletcher, H., E. D. Blackham, and R. Stratton, "Quality of Piano Tones,"
J. Acoust. Soc. Amer. 34, 749 (June 1962).
Gardner, M., "Comparison of Lateral Localization and Distance for Single- and
Multiple-Source Speech Signals," J. Acoust. Soc. Amer. 41, 1592 (June 1967),
Abstract.
Goldstein, J. L., "Auditory Nonlinearity," J. Acoust. Soc. Amer. 41, 676-689
(March 1967).
von Helmholtz, H. L. F., Die Lehre von der Tonempfindungen als physiologische
Grundlage fur die Theorie der Musik, 1863. On the Sensations of Tone as a
Physiological Basis for the Theory of Music (Dover, New York, 1954).
Levelt, W. J. M., J. P. van de Geer, and R. Plomp, "Triadic Comparisons of
Musical Intervals," Brit. J. Math. Statist. Psychol. 19 (Part 2), 163-179
(November 1966).
Licklider, J. C. R., "Basic Correlates of the Auditory Stimulus," in Handbook of
Experimental Psychology, S. S. Stevens, Ed. (John Wiley & Sons, New York,
N.Y., 1951).
Mathews, M. V., "The Digital Computer as a Musical Instrument," Science 142,
553 (November 1963).
Miller, G. A., "The Magical Number Seven, Plus or Minus Two," Psychol. Rev.
63, 81 (1956).
Pierce, J. R., "Attaining Consonance in Arbitrary Scales," J. Acoust. Soc. Amer.
40, 249 (July 1966).
Pierce, J. R., and E. E. David, Man's World of Sound (Doubleday, Garden City,
N.Y., 1958).
Plomp, R., Experiments on Tone Perception (Institute for Perception RVO-TNO,
Soesterberg, The Netherlands, 1966).
Plomp, R., "Pitch of Complex Tones," J. Acoust. Soc. Amer. 41, 1526-1533
(June 1967).
Risset, J. C., "Computer Study of Trumpet Tones," J. Acoust. Soc. Amer. 38,
912 (November 1965), Abstract.
Schroeder, M. R., "Architectural Acoustics," Science 151, 1355 (March 1966).
Shepard, R. N., "Circularity in Judgments of Relative Pitch," J. Acoust. Soc.
Amer. 36, 2346 (December 1964).
Stevens, S. S., "Procedure for Calculating Loudness: Mark VI," J. Acoust. Soc.
Amer. 33, 1577-1585 (1961).
Wallach, H., E. B. Newman, and M. R. Rosenzweig, "The Precedence Effect in
Sound Localization," Amer. J. Psychol. 52, 315-336 (1949).
Appendix B Mathematics
In the body of this text an effort has been made to minimize the number
and difficulty of mathematical expressions. In certain places some
computations characteristic of signal theory must be done. This
appendix lists the relations that are required by the text. No proofs are
given, and the conditions under which the relations are true are not
spelled out. They hold in a useful (and widely used) way for almost all
real signals. We apologize for the strong MIT and EE accent in the
mathematical language. If one has something to say, it is better to
speak with an accent than to remain silent.
Fourier Series
A "not too discontinuous" function f(x) with period T can be
represented almost everywhere by the series
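$$f(x) = \frac{a_0}{2} + a_1 \cos\frac{2\pi x}{T} + a_2 \cos\frac{4\pi x}{T} + \cdots + b_1 \sin\frac{2\pi x}{T} + b_2 \sin\frac{4\pi x}{T} + \cdots$$
where
$$a_i = \frac{2}{T}\int_0^T f(x)\cos\frac{2\pi i x}{T}\,dx$$
and
$$b_i = \frac{2}{T}\int_0^T f(x)\sin\frac{2\pi i x}{T}\,dx$$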
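The coefficient integrals can be checked numerically. The sketch below is a hedged illustration: it assumes the usual definitions of the cosine coefficients a_i and sine coefficients b_i as 2/T times the integral over one period of f(x) times cos or sin of 2πix/T, evaluates them by the midpoint rule, and uses a square wave as a test function; the function names are hypothetical.

```python
import math


def fourier_coefficients(f, period, i, n_steps=10000):
    """Return (a_i, b_i) for f with the given period, by the midpoint rule."""
    dx = period / n_steps
    a = 0.0
    b = 0.0
    for k in range(n_steps):
        x = (k + 0.5) * dx
        w = 2 * math.pi * i * x / period
        a += f(x) * math.cos(w) * dx
        b += f(x) * math.sin(w) * dx
    return 2 * a / period, 2 * b / period


def square_wave(x, period=1.0):
    """+1 during the first half of each period, -1 during the second half."""
    return 1.0 if (x % period) < period / 2 else -1.0
```

For the square wave the cosine coefficients vanish and the odd sine coefficients come out close to 4/(πi), the familiar result for this waveform.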
Fourier Transform
A "not too discontinuous" function f(x) for which the integral of
$f^2(x)$ exists may be transformed and inverse transformed according to
the relations
$$P(\omega) = \int_{-\infty}^{+\infty} p(t)\,e^{-j\omega t}\,dt$$
$$p(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} P(\omega)\,e^{j\omega t}\,d\omega$$
$P(\omega)$ is called the Fourier transform of $p(t)$; $P(\omega)$ is also called the
amplitude spectrum of $p(t)$.
where $h(x)$ is called the impulse response of the system. For realizable
systems, $h(x) = 0$ for $x < 0$. The transform of $h(x)$ is called the
transfer function $H(\omega)$ of the linear system and is written
$$H(\omega) = \int_{-\infty}^{+\infty} h(t)\,e^{-j\omega t}\,dt$$
The Fourier transform of the output $O(\omega)$ and the Fourier transform
of the input $I(\omega)$ are related by the simple equation
$$O(\omega) = H(\omega)\,I(\omega)$$
Convolution Theorem
The three time functions, $x(t)$, $y(t)$, and $z(t)$, have as their respective
Fourier transforms $X(\omega)$, $Y(\omega)$, and $Z(\omega)$. If z is the product of x and y
$$z(t) = x(t) \cdot y(t)$$
then
$$Z(\omega) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} X(\alpha)\,Y(\omega - \alpha)\,d\alpha$$
$$\int_{-\infty}^{+\infty} \delta(t)\,dt = 1$$
$$= \cos \omega_0 t$$
More generally
$$\varphi(\tau) = E\langle p(t)\,p(t + \tau)\rangle$$
where $E\langle\ \rangle$ is defined in some way that makes sense for the random
function $p(t)$. The power spectrum $\Phi(\omega)$ is the Fourier transform of
$\varphi(\tau)$. Thus
cp(1"). Thus
Note that the $2\pi$ factor is in the transform rather than the inverse
transform.
Mean-Square Function
If $p(t)$ is a random function with autocorrelation function $\varphi(\tau)$ and
power spectrum $\Phi(\omega)$, then
$$E\langle p(t)^2\rangle = \varphi(0) = \int_{-\infty}^{+\infty} \Phi(\omega)\,d\omega$$
Index