Aist2010 03 Analysis

The document discusses Fourier analysis and its applications in audio signal processing. It introduces the Fourier transform which decomposes a signal into its constituent sinusoids. The discrete Fourier transform (DFT) is used to analyze digital audio signals by summing a finite number of sinusoids. The short-time Fourier transform (STFT) further breaks the analysis into frames to show how frequencies change over time in a spectrogram. Window functions are used to avoid spectral leakage when applying the Fourier transform to short segments.

Uploaded by

wingkitcwk

AUDIO ANALYSIS AND VISUALIZATION
AIST2010 Lecture 3
Fourier Analysis · Spectral Visualization · MATLAB Programming

OUTLINE
AIST2010 L3 — AUDIO ANALYSIS AND VISUALIZATION 2
SUMMATION OF WAVES
Practically any continuous function, e.g. an audio signal, can be expressed as a sum
of (infinitely many) sinusoidal waves
- First proposed by the French scientist and mathematician Jean-Baptiste Joseph Fourier (1768–1830)
- Each sinusoidal wave has its own amplitude, frequency, and phase

Image from: Fund. of Music Processing, p.70



FUNDAMENTAL FREQUENCY AND HARMONICS
Image from: https://en.wikipedia.org/wiki/Harmonic

In particular, some waveforms sound “better”
- They contain only (or mainly) frequency components that are integer multiples of a common frequency
- That common frequency (the GCD of the components) is often called the fundamental frequency $f_0$, and its multiples are the harmonics $f_k$
  $f_k = k f_0$
- Harmonics are sometimes also called partials or overtones, but these may be numbered differently!
Read: https://en.wikipedia.org/wiki/Harmonic#Partials,_overtones,_and_harmonics
Figure: Harmonics of multiple relationship
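The GCD relationship above can be tried out with a few lines of Python. This is only a sketch: it assumes the component frequencies are whole numbers of Hz, and the helper names `fundamental` and `harmonic_numbers` are made up for illustration.

```python
from functools import reduce
from math import gcd


def fundamental(freqs_hz):
    """Return the GCD of a list of integer frequencies (in Hz)."""
    return reduce(gcd, freqs_hz)


def harmonic_numbers(freqs_hz):
    """Return k for each component, so that f_k = k * f_0."""
    f0 = fundamental(freqs_hz)
    return [f // f0 for f in freqs_hz]


# Hypothetical spectrum with three components.
components = [220, 330, 440]
f0 = fundamental(components)        # common divisor of all components
ks = harmonic_numbers(components)   # which harmonic each component is
```

Note that in this example the fundamental itself is not one of the components; only its 2nd, 3rd and 4th harmonics are present.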



Image from: Comp. Music Instruments, p.63

OTHER WAVEFORMS
Sawtooth wave
- Sum of all harmonics, with the amplitude of harmonic $k$ decreasing as $1/k$
Square wave
- Sum of odd harmonics only, with the amplitude also decreasing as $1/k$
Triangle wave
- Sum of odd harmonics only, with a negative sign for alternating odd harmonics, and the amplitude decreasing as $1/k^2$
Figure: Decomposing a square wave
Some more animations of the square wave decomposition here: http://bilimneguzellan.net/fuyye-serisi/
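The square wave decomposition can be checked numerically. The sketch below assumes the standard Fourier series of a unit-amplitude square wave, $\frac{4}{\pi}\sum_{k\,\text{odd}} \frac{\sin(2\pi k f_0 t)}{k}$; the function name `square_partial_sum` is hypothetical.

```python
import math


def square_partial_sum(t, f0, n_terms):
    """Sum the first n_terms odd harmonics of a unit square wave at time t."""
    total = 0.0
    for i in range(n_terms):
        k = 2 * i + 1  # odd harmonic numbers: 1, 3, 5, ...
        total += math.sin(2 * math.pi * k * f0 * t) / k
    return 4 / math.pi * total


# At a quarter of the period the ideal square wave equals +1,
# and at three quarters it equals -1.
one_term = square_partial_sum(0.25, 1.0, 1)       # fundamental only, overshoots
many_terms = square_partial_sum(0.25, 1.0, 1000)  # close to +1
```

Adding more odd harmonics brings the sum closer to the flat top of the square wave, which is what the animation linked above illustrates.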



MATHEMATICAL REPRESENTATION
A sinusoidal wave can be represented as…
$g(t) := A \sin(2\pi f t + \varphi)$
Sometimes you may see this as well:
$g(t) := A \cos(2\pi f t + \varphi)$
Sometimes you may see this: $\omega = 2\pi f$ (the angular frequency, in rad/s)
where
- $A$ = amplitude, which relates to the loudness of the sound
- $f$ = frequency (in Hz), which relates to the pitch of the sound
  - Note: period $T = 1/f$, in seconds
- $\varphi$ = phase (in radians, where $2\pi$ rad = 360°), i.e. the relative position of an oscillation within its cycle
  - Note: a phase shift by $\varphi + 2\pi$ has the same effect as a phase shift by $\varphi$
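As a rough illustration of the formula, the following Python sketch evaluates and samples a sinusoid. The function names and the sample rate are assumptions chosen for the example, not part of the lecture.

```python
import math


def sinusoid(amplitude, freq_hz, phase_rad, t):
    """Evaluate g(t) = A * sin(2*pi*f*t + phi) at time t (in seconds)."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t + phase_rad)


def sample_sinusoid(amplitude, freq_hz, phase_rad, sample_rate, n_samples):
    """Sample g(t) at t = n / sample_rate for n = 0 .. n_samples - 1."""
    return [sinusoid(amplitude, freq_hz, phase_rad, n / sample_rate)
            for n in range(n_samples)]


# One period of a 440 Hz tone lasts T = 1/f seconds.
period = 1 / 440
samples = sample_sinusoid(0.5, 440, 0.0, 44100, 100)
```

The cosine form is the same wave with the phase advanced by $\pi/2$, since $\cos x = \sin(x + \pi/2)$.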



FOURIER ANALYSIS
Transformation from the time domain (amplitude vs. time) into the frequency domain (magnitude vs. frequency)

$$\mathcal{F}\{g(t)\} = \hat{g}(f) = \int_{t \in \mathbb{R}} g(t) \, e^{-2\pi i f t} \, dt$$

where $e^{it} = \cos t + i \sin t$
- You may view it as measuring how strongly each frequency occurs in the waveform
Yet, this formula, defined for continuous $f$ and $t$, cannot be applied directly to digital signals, which are sampled at discrete times!

Read: https://betterexplained.com/articles/an-interactive-guide-to-the-fourier-transform/



Video from: https://youtu.be/spUNpyF58BY



DISCRETE FOURIER TRANSFORM (DFT)
Since the input values (samples) are equally spaced, the Fourier Transform for sound samples is discrete
- A sum of a finite series of sinusoidal waves
From a sequence of $N$ (complex) samples
$x_n := x_0, x_1, \ldots, x_{N-1}$
into a sequence of $N$ complex numbers

$$X_k := \sum_{n=0}^{N-1} x_n \, e^{-\frac{2\pi i k n}{N}}$$

for $0 \le k \le N - 1$, where $e^{it} = \cos t + i \sin t$
Image from: https://blog.revolutionanalytics.com/2014/01/the-fourier-transform-explained-in-one-sentence.html
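The DFT sum can be written almost literally in Python. This is a direct $O(N^2)$ sketch for illustration only; real programs would normally call an FFT routine from a library instead.

```python
import cmath


def dft(x):
    """Direct evaluation of X_k = sum_n x_n * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


# Four samples of one cycle of a cosine: 1, 0, -1, 0.
X = dft([1, 0, -1, 0])
```

For this input only bins 1 and 3 are nonzero (both equal to 2): bin 1 is the cosine's frequency and bin 3 is its mirror image, which always appears for real-valued input.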
DISCRETE FOURIER TRANSFORM (DFT)
$$X_k := \sum_{n=0}^{N-1} x_n \, e^{-\frac{2\pi i k n}{N}}$$

- The series $X_k$ is called the DFT coefficients (of $N$ frequency bins)
  - Magnitude: $|X_k| = \sqrt{\mathrm{Re}(X_k)^2 + \mathrm{Im}(X_k)^2}$
  - Phase: $\arg(X_k) = \arctan(\mathrm{Im}(X_k) / \mathrm{Re}(X_k))$, taking the quadrant of $X_k$ into account
  - Bin frequency: $f_k = f_s \cdot \frac{k}{N}$, where $f_s$ is the sampling rate

DFT is a very popular tool for digital signal processing
- Usually implemented as the Fast Fourier Transform (FFT)
  - The ordinary DFT is $O(N^2)$ while the FFT is $O(N \log N)$
- Luckily, you can often use the FFT simply as a black box in programming libraries, without understanding the math behind it!
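To make the three quantities concrete, here is a small Python sketch that lists bin frequency, magnitude and phase for a test tone. The sampling rate, frame length and the helper name `analyze` are assumptions for this example; `cmath.phase` is the quadrant-aware arctangent.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT; a library FFT would be used in practice."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


def analyze(x, sample_rate):
    """Return (bin frequency in Hz, magnitude, phase in rad) for every bin."""
    N = len(x)
    return [(sample_rate * k / N, abs(Xk), cmath.phase(Xk))
            for k, Xk in enumerate(dft(x))]


fs, N = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(N)]  # 1000 Hz sine
bins = analyze(tone, fs)
# Look for the strongest bin among the non-mirrored half (0 .. N/2).
peak_freq, peak_mag, peak_phase = max(bins[:N // 2 + 1], key=lambda b: b[1])
```

With these values the bin spacing is $f_s / N = 125$ Hz, so the 1000 Hz tone falls exactly on bin 8, with magnitude $N/2$ and a phase of $-\pi/2$ (a sine is a cosine delayed by a quarter cycle).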
SHORT-TIME FOURIER TRANSFORM (STFT)
The DFT can only show the overall “histogram” of frequencies
- The appearance of frequencies in the whole analyzed sound, with no indication of when they occur
The STFT breaks the process into multiple DFTs/FFTs over short time segments
- Analysis frames
- The result is a spectrogram
  - Magnitude vs. frequency vs. time
  - Magnitude is often represented in the colour dimension
[Figure: the waveform (time domain, amplitude vs. time) is split into frames; the DFT of each frame gives the DFT coefficients that form one column of the spectrogram (frequency domain, frequency vs. time)]
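A bare-bones version of this idea can be sketched in Python as follows. It is a simplification under stated assumptions: frames are placed back to back with no overlap and no window function, and the names `dft_magnitudes` and `stft` are illustrative only.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0 .. N/2 of one frame."""
    N = len(frame)
    return [
        abs(sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)))
        for k in range(N // 2 + 1)
    ]


def stft(signal, frame_size):
    """Split the signal into back-to-back frames and analyse each one."""
    return [
        dft_magnitudes(signal[start:start + frame_size])
        for start in range(0, len(signal) - frame_size + 1, frame_size)
    ]


fs, frame_size = 8000, 64
# A tone that jumps from 500 Hz to 2000 Hz halfway through.
signal = [math.sin(2 * math.pi * (500 if n < 256 else 2000) * n / fs)
          for n in range(512)]
spectrogram = stft(signal, frame_size)
# Strongest bin in each frame, i.e. one value per spectrogram column.
peaks = [max(range(len(col)), key=col.__getitem__) for col in spectrogram]
```

A single DFT of the whole signal would show both frequencies without saying when each occurs; the per-frame peaks move from bin 4 (500 Hz) to bin 16 (2000 Hz) halfway through.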



WINDOW FUNCTION
For a step-by-step Fourier analysis, a window function is needed
- Its value is nonzero only for a short time (1 for a rectangular window), and 0 otherwise
Image from: https://commons.wikimedia.org/wiki/File:Mplwp_window-functions-symmetric.svg

The shape of the window function will affect the frequency response
- E.g. the sharp edges of a rectangular window will result in spurious high-frequency components
- Usual choices to reduce spectral leakage: Hamming window, Hann window, …
Figure: Comparison of windows
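The effect of the window shape can be seen with a tone that does not fit a whole number of cycles into the frame. The sketch below compares no window (rectangular) with a Hann window; it assumes the periodic form of the Hann window, $w_n = 0.5 - 0.5\cos(2\pi n / N)$, and the helper names are made up.

```python
import cmath
import math


def hann(N):
    """Periodic Hann window of length N."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]


def magnitudes(frame):
    """Magnitudes of DFT bins 0 .. N/2 of one frame."""
    N = len(frame)
    return [
        abs(sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)))
        for k in range(N // 2 + 1)
    ]


def leakage_ratio(frame, far_bin):
    """Magnitude at a bin far from the tone, relative to the peak."""
    mags = magnitudes(frame)
    return mags[far_bin] / max(mags)


N = 64
# 10.5 cycles per frame: the tone sits halfway between bins 10 and 11.
tone = [math.sin(2 * math.pi * 10.5 * n / N) for n in range(N)]
rect_leak = leakage_ratio(tone, 25)
hann_leak = leakage_ratio([s * w for s, w in zip(tone, hann(N))], 25)
```

Far away from the tone (bin 25 here), the Hann-windowed frame leaks far less energy than the unwindowed one, at the cost of a somewhat wider main peak.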
FREQUENCY BINS
Usual window size: powers of two to facilitate FFT, e.g. 1024, …
- Often with an overlap of 50% to compensate for the data attenuated by windowing
- 1024-point FFT = 1024 time samples = 1024 frequency bins (for real-valued audio, only the first 513 are unique; the rest mirror them)
- The more samples in the window, the higher the frequency resolution
  - The results fall into narrower frequency bins (bin width = $f_s / N$)
- The higher the frequency resolution, the lower the time resolution
  - Basically it is a trade-off between time and frequency
[Figure: overlapping windows along the waveform (amplitude vs. time), each producing one column of DFT coefficients (frequency vs. time); the hop size is the distance between the starts of consecutive windows]
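The trade-off is easy to put into numbers. The following sketch assumes a 44.1 kHz sampling rate (not stated on the slide) and uses a hypothetical helper `stft_resolution` to derive the bin width and durations from the window size and overlap.

```python
def stft_resolution(sample_rate, window_size, overlap=0.5):
    """Derive frequency and time resolution figures for one STFT setting."""
    hop_size = int(window_size * (1 - overlap))
    return {
        "bin_width_hz": sample_rate / window_size,       # frequency resolution
        "window_duration_s": window_size / sample_rate,  # time span of a frame
        "hop_size": hop_size,                            # samples between frames
        "hop_duration_s": hop_size / sample_rate,
        "unique_bins": window_size // 2 + 1,             # for real-valued input
    }


short = stft_resolution(44100, 1024)  # coarser in frequency, finer in time
long_ = stft_resolution(44100, 4096)  # finer in frequency, coarser in time
```

At this sampling rate a 1024-sample window gives bins about 43 Hz wide over about 23 ms, while a 4096-sample window gives bins about 11 Hz wide but spans about 93 ms.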
HOW TO READ THE PLOTS?

INVERSE OF THE FOURIER TRANSFORM
Rebuilding the audio signal from the Fourier analysis data
Inverse DFT
- From the frequency domain back to the time domain
- Can easily be expressed in terms of the DFT

$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-\frac{2\pi i k n}{N}} \qquad x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k \, e^{\frac{2\pi i k n}{N}}$$

Inverse STFT
- Overlap-add (OLA) method
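Both steps can be sketched in Python. The example below is one possible setup, not the only one: it assumes a periodic Hann analysis window with 50% overlap and no synthesis window, a combination in which the overlapping windows add up to exactly 1. All function names are illustrative.

```python
import cmath
import math


def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X):
    """Inverse DFT: x_n = (1/N) * sum_k X_k * exp(+2*pi*i*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def hann(N):
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]


def stft_frames(signal, N, hop):
    """DFT of each Hann-windowed frame, frames starting every `hop` samples."""
    w = hann(N)
    return [dft([signal[s + n] * w[n] for n in range(N)])
            for s in range(0, len(signal) - N + 1, hop)]


def overlap_add(frames, N, hop):
    """Inverse-transform each frame and add it back at its original position."""
    out = [0.0] * (hop * (len(frames) - 1) + N)
    for i, X in enumerate(frames):
        for n, value in enumerate(idft(X)):
            out[i * hop + n] += value.real
    return out


signal = [math.sin(2 * math.pi * 3 * n / 64) + 0.5 * math.cos(2 * math.pi * 7 * n / 64)
          for n in range(64)]
rebuilt = overlap_add(stft_frames(signal, 16, 8), 16, 8)
```

Samples covered by two overlapping frames are reconstructed exactly (up to rounding); the first and last half-frames are covered by only one window and remain attenuated.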
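CEPSTRUM
What would happen if a Fourier transform is applied in the frequency domain?
- Cepstrum: the result of taking a further (inverse) Fourier transform of the log-magnitude spectrum; it shows the periodic patterns found in the spectrum
- Quefrency: the independent variable of the cepstrum, a measure of time (its step is set by the sampling rate in the time domain)
- Lifter: a filter in the cepstrum (quefrency) domain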
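A common use of the cepstrum is pitch estimation, sketched below in Python. The example assumes the usual "real cepstrum" definition (inverse DFT of the log-magnitude spectrum), adds a small constant `eps` to avoid taking the log of zero, and uses a synthetic test tone whose parameters are chosen purely for illustration.

```python
import cmath
import math


def dft(x, sign=-1):
    """DFT for sign=-1; unnormalised inverse DFT for sign=+1."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]


def real_cepstrum(x, eps=1e-6):
    """Inverse DFT of the log-magnitude spectrum of x."""
    N = len(x)
    log_mag = [math.log(abs(Xk) + eps) for Xk in dft(x)]
    return [c.real / N for c in dft(log_mag, sign=+1)]


fs, N, f0 = 8000, 400, 100
# Harmonic test tone: harmonics 1..39 of 100 Hz with amplitudes 1/k.
x = [sum(math.sin(2 * math.pi * k * f0 * n / fs) / k for k in range(1, 40))
     for n in range(N)]
cep = real_cepstrum(x)
# Search quefrencies of 20..119 samples (pitches between about 67 and 400 Hz).
peak_q = max(range(20, 120), key=cep.__getitem__)
pitch_hz = fs / peak_q
```

The harmonics are spaced 100 Hz apart in the spectrum; that regular spacing shows up as a cepstral peak at a quefrency of 80 samples, i.e. a period of 10 ms.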
Image from: https://sethares.engr.wisc.edu/vocoders/phasevocoder.html

PHASE VOCODER METHOD

A special kind of FFT analysis is the Phase Vocoder method
- Phase information is used to compensate for the inadequate frequency resolution
- Mimics the analog method of using “filter banks”
- Makes spectral edits and resynthesis possible
Especially good for the analysis of harmonic sounds
- Using an appropriate frequency bin size to fit the harmonics
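The first point, using phase to refine frequency, can be sketched as follows. This shows only the analysis step commonly used in phase vocoders (comparing the phase of one bin in two frames a hop apart), not a complete vocoder; the parameter values and function names are assumptions for the example.

```python
import cmath
import math


def hann(N):
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]


def dft_bin(frame, k):
    """A single DFT coefficient X_k of the frame."""
    N = len(frame)
    return sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))


def principal_angle(phi):
    """Wrap an angle into the interval [-pi, pi)."""
    return (phi + math.pi) % (2 * math.pi) - math.pi


def refine_frequency(signal, sample_rate, N, hop, k):
    """Estimate the frequency near bin k from the phase advance over one hop."""
    w = hann(N)
    phase_a = cmath.phase(dft_bin([signal[n] * w[n] for n in range(N)], k))
    phase_b = cmath.phase(dft_bin([signal[hop + n] * w[n] for n in range(N)], k))
    expected = 2 * math.pi * k * hop / N  # advance of a tone exactly at bin k
    deviation = principal_angle(phase_b - phase_a - expected)
    return (k / N + deviation / (2 * math.pi * hop)) * sample_rate


fs, N, hop = 8000, 256, 64
signal = [math.cos(2 * math.pi * 440 * n / fs) for n in range(N + hop)]
k = round(440 * N / fs)        # nearest bin: 14
bin_centre = k * fs / N        # 437.5 Hz, as precise as the magnitude alone gets
estimate = refine_frequency(signal, fs, N, hop, k)
```

The bins are 31.25 Hz apart, so the magnitude spectrum alone can only say "about 437.5 Hz"; the phase difference between the two frames recovers a value very close to the true 440 Hz.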



ANALYSIS ON THE HARMONIC SERIES
For harmonic sounds (e.g. musical instruments), a series of peaks at integer multiples of the fundamental can be found in the spectrum: $f_k = k f_0$
Timbre: tone colour
- The quality that distinguishes different musical instruments or human voices

[Figure: harmonic spectra of four instruments; x-axis: integer multiples of $f_0$ (1 to 19)]

Instrument       # sig. har.   Bandwidth   Density of sig. har.
Piano (Pn)       6             8           0.75
Guitar (Gt)      6             9           0.67
Harp (Hp)        1             1           1
Xylophone (Xy)   3             7           0.43


ALTERNATIVES TO STFT
Image from: http://ataspinar.com/2018/12/21/a-guide-for-using-the-wavelet-transform-in-machine-learning/

The STFT has drawbacks, such as the fixed trade-off between time and frequency resolution
There are alternatives, such as
- Wavelet Transform (WT)
- Constant-Q Transform (CQT)
The main aim is to reduce the frequency resolution at higher frequencies, in exchange for better time resolution there
- Frequency bins get wider at the high end
Figure: Time series and various transforms
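For the Constant-Q Transform, the widening of the bins follows from keeping the ratio $Q = f / \Delta f$ the same for every bin. The sketch below only computes the bin layout, not the transform itself; it assumes geometrically spaced centre frequencies $f_k = f_{\min} \cdot 2^{k/B}$ with $B$ bins per octave, and the function name and parameter values are illustrative.

```python
def cqt_bins(f_min, bins_per_octave, n_bins):
    """Return Q and a list of (centre frequency, bandwidth) pairs in Hz."""
    q = 1 / (2 ** (1 / bins_per_octave) - 1)  # constant ratio f / bandwidth
    bins = []
    for k in range(n_bins):
        centre = f_min * 2 ** (k / bins_per_octave)
        bins.append((centre, centre / q))
    return q, bins


# Two octaves starting at 55 Hz, one bin per semitone.
q, bins = cqt_bins(55.0, 12, 25)
low_centre, low_width = bins[0]
high_centre, high_width = bins[24]
```

Every octave doubles both the centre frequency and the bin width, whereas the STFT keeps the same bin width $f_s / N$ from the lowest to the highest bin.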



LECTURE REVIEW
The lecture is half-way done… with these discussed:
- How sounds can be represented mathematically
- The transform between the time and frequency domains
  - Continuous vs. discrete transforms
- Different settings of the FFT
- Further possible analysis based on the FFT

In the next half of this lecture, we will learn basic MATLAB programming!
READ FURTHER
Chapter 7, “Frequency-Domain Techniques”, Computer Music Instruments
Chapter 2, “Fourier Analysis of Signals”, Fundamentals of Music Processing
