D2 Report 2022JTM2399

ELL720: Advanced Digital Signal Processing

Design Problem 2
Pitch Detection

Pitch, or fundamental frequency, is the lowest frequency component of a periodic signal. The pitch period, the inverse of the fundamental frequency, is the smallest repeating unit of the signal; one such period describes the periodic signal (the voiced part of speech) completely. For example, a pitch of 200 Hz corresponds to a pitch period of 1/200 s = 5 ms.

Extracting Pitch in the Time Domain

Most time-domain pitch period estimation techniques use the auto-correlation function (ACF).

Auto-correlation function

The basic idea of correlation-based pitch tracking is that the correlation signal will have a peak of large magnitude at a lag corresponding to the pitch period.

The ACF of a windowed signal s[m], with N samples in the window, is computed as

R[k] = Σ_{m=0}^{N-1-k} s[m] · s[m+k]

where
N is the total number of samples in the window, and
k is the lag index.
N should be as small as possible so that the estimate can follow the time variation of the signal, yet large enough to cover at least two pitch periods so that the periodicity is captured by R[k].

Properties of R[k]:
• R[k] has the same periodicity as s[m].
• Its maximum value is at k = 0, and R[0] equals the energy of the (deterministic) signal.
• If s[m] is periodic with period P samples, R[k] has maxima at k = 0, ±P, ±2P, …
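
As an illustration of the peak-picking idea above, a minimal MATLAB sketch is given below. It assumes a single voiced frame x (a column vector spanning at least two pitch periods) at sampling rate fs, and a pitch-search range of roughly 60-700 Hz; these names and bounds are assumptions made for illustration, not the exact code behind the results reported later.

% Estimate pitch from one voiced frame x by locating the largest ACF peak
% at a non-zero lag inside an assumed 60-700 Hz pitch range.
x = x - mean(x);                         % remove DC so the ACF peak reflects periodicity
R = xcorr(x);                            % auto-correlation, length 2N-1
R = R(length(x):end);                    % keep lags k = 0, 1, ..., N-1 (R(1) is k = 0)

minLag = floor(fs/700);                  % smallest lag searched (highest pitch)
maxLag = ceil(fs/60);                    % largest lag searched (lowest pitch)
[~, idx] = max(R(minLag+1 : maxLag+1));  % peak among lags minLag..maxLag
pitchLag = minLag + idx - 1;             % lag (in samples) of the chosen peak
pitch_Hz = fs / pitchLag;                % pitch period in samples -> pitch in Hz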
Auto-correlation method

The original signal s4.wav is taken here and plotted in the time domain against sample number.

• As seen from the above plot, the speech sample contains spoken words with silence in between, over different ranges.
• So the spoken ranges 17000 to 20000, 7000 to 10000, 3000 to 6000, and 12000 to 15000 (sample numbers) are taken for analysis.
• A low-pass filter is also applied before processing the speech signal; one possible pre-processing chain is sketched below.
• The filtered signal is shown above.
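
The report does not specify the filter design, so the following is only a hedged illustration: the cut-off frequency (900 Hz), the filter order, and the use of zero-phase filtering are assumptions. The filtered segment x can then be fed to the ACF peak-picking sketch shown earlier.

% Illustrative pre-processing: load s4.wav, low-pass filter it, and cut out
% one of the voiced ranges listed above (the filter design is an assumption).
[s, fs] = audioread('s4.wav');           % speech samples and sampling rate
fc = 900;                                % assumed low-pass cut-off in Hz
[b, a] = butter(4, fc/(fs/2));           % 4th-order Butterworth low-pass filter
s_filt = filtfilt(b, a, s);              % zero-phase filtering, no group delay
x = s_filt(17000:20000);                 % voiced segment (samples 17000 to 20000)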

Speech segment from sample 17000 to 20000

pitch_Hz_17k_20k = 666.6667 Hz (without filtering)

pitch_Hz_17k_20k = 571.4286 Hz (with filtering)


Speech segment from sample 7000 to 10000

pitch_Hz_7k_10k = 533.3333 Hz (without filtering)


pitch_Hz_7k_10k = 533.3333 Hz (with filtering)
Speech segment from sample 3000 to 6000

pitch_Hz_3k_6k = 470.882 Hz (without filtering)

pitch_Hz_3k_6k = 470.882 Hz (with filtering)

Speech segment from sample 12000 to 15000

pitch_Hz_12k_15k = 533.3333 Hz (without filtering)

pitch_Hz_12k_15k = 533.3333 Hz (with filtering)


Now the same procedure is applied to the s3.wav file:
pitch_Hz_3.5k_5.5k = 615.3846 Hz
pitch_Hz_8.2k_10.2k = 444.4444 Hz

pitch_Hz_13k_15k = 615.3846 Hz
pitch_Hz_22k_24k = 400 Hz

For the s1.wav file:

The method above estimates the pitch only for three or four manually chosen ranges of the spoken signal. Another approach is to estimate the pitch for every 10 ms slot across the whole 3-second speech signal and then average the results, as sketched below.
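
A hedged MATLAB sketch of that frame-by-frame procedure follows. The search range (limited here to about 150-700 Hz, since a 10 ms frame cannot hold longer pitch periods), the voicing threshold of 0.3, and the variable names are assumptions made for illustration, not the exact code behind the averages reported below.

% Framewise ACF pitch over the whole file, averaged over frames judged voiced.
[s, fs]  = audioread('s1.wav');
frameLen = round(0.010 * fs);                    % 10 ms frames
nFrames  = floor(length(s) / frameLen);
minLag   = floor(fs/700);                        % highest pitch considered
maxLag   = min(ceil(fs/150), frameLen - 1);      % lowest pitch a 10 ms frame allows
pitches  = [];

for i = 1:nFrames
    frame = s((i-1)*frameLen + 1 : i*frameLen);
    frame = frame - mean(frame);
    R = xcorr(frame);
    R = R(frameLen:end);                         % non-negative lags, R(1) is lag 0
    [peakVal, idx] = max(R(minLag+1 : maxLag+1));
    % crude voicing check: the ACF peak must be a sizeable fraction of R(0)
    if R(1) > 0 && peakVal > 0.3 * R(1)
        pitches(end+1) = fs / (minLag + idx - 1);   % lag -> pitch in Hz
    end
end

average_pitch = mean(pitches)                    % average pitch over voiced frames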

Average pitch of s1.wav signal = 349.3121 Hz


For the s2.wav file:

Average pitch of s2.wav signal = 305.9340 Hz


For the s3.wav file:

Average pitch of s3.wav signal = 356.8377 Hz


For the s4.wav file:

Average pitch of s4.wav signal = 287.8709 Hz
Cepstral Method

Pitch Detection using the Cepstral Method

Pitch detection is often done in the cepstral domain because the cepstrum exposes periodic structure in the logarithmic magnitude spectrum of a signal. The cepstrum is formed by taking the FFT (or IFFT) of the log magnitude spectrum of the signal. The FFT and IFFT can be used interchangeably here because one simply gives a reversed version of the other, so either is equally valid for the processing we wish to do.
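
Written compactly (with c[q] and the quefrency index q introduced here only as notation), the cepstrum described above is

c[q] = IFFT{ log | FFT{ s[n] } | }

where q, the quefrency, is measured in samples of lag.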
Once in the cepstral domain, the pitch can be estimated by picking the peak of the resulting signal within a certain range. The cepstrum is given in terms of "quefrency" which, besides being a terrible name, represents pitch lag. Therefore, the lag at which there is the most energy corresponds to the dominant periodicity of the log magnitude spectrum, thereby giving the pitch.
There are, of course, some caveats to this approach. First, pitch and fundamental frequency are not actually the same thing, so depending on which peak the algorithm picks, you may get F0 (the fundamental) or F1 (one of the formants). Second, the cepstrum is time-shift variant, so this method cannot be applied blindly: the time-domain windows need to be lined up so that they start and stop exactly over a voiced speech segment. This is not a trivial task, as most VADs (voice activity detectors) make errors, and the cepstrum will then suffer from phase ambiguity.

Steps used:

1. Load the .wav file.
2. Define the frame length, number of samples, and sampling frequency.
3. Evaluate the FFT of every frame.
4. Evaluate the magnitude spectrum of every frame.
5. Find the log magnitude of the above spectrum.
6. Find the IFFT of the above log magnitude: this is the cepstrum we want.
7. As the cepstrum is symmetric, use only one half of this array.
8. Apply "high-time liftering" to get the pitch frequency.
9. Compute the high-time-liftered cepstrum:
10. multiply the half_cepstrum element-wise with the liftering window.


11. Apply a simple voice activity detection (VAD) check on each frame:

if mean(power_spectrum) >= 1   % an experimental value, most likely to fail on other inputs
    voiced_pitch_freq(length(voiced_pitch_freq)+1) = pitch_frequency;   % record frames identified as voiced
end
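
Putting steps 1-11 together, a hedged MATLAB sketch of the whole procedure is given below. The 30 ms frame length, the Hamming window, and the 60-500 Hz quefrency search range are assumptions made for illustration; the energy threshold of 1 is the experimental value quoted in the snippet above, and the averaging of voiced-frame pitches mirrors the final_pitch_freq values reported next.

% Cepstral pitch detection per frame with high-time liftering and the simple
% energy-based VAD from step 11 (illustrative sketch, not the exact code used).
[s, fs]  = audioread('s1.wav');
frameLen = round(0.030 * fs);                     % assumed 30 ms frames
nFrames  = floor(length(s) / frameLen);
minQ = floor(fs/500); maxQ = ceil(fs/60);         % assumed 60-500 Hz pitch range
voiced_pitch_freq = [];

for i = 1:nFrames
    frame = s((i-1)*frameLen + 1 : i*frameLen);
    spec     = fft(frame .* hamming(frameLen));   % step 3: FFT of the frame
    mag_spec = abs(spec);                         % step 4: magnitude spectrum
    power_spectrum = mag_spec.^2;                 % power spectrum, used by the VAD below
    log_mag  = log(mag_spec + eps);               % step 5: log magnitude spectrum
    cep      = real(ifft(log_mag));               % step 6: cepstrum
    half_cep = cep(1:floor(frameLen/2));          % step 7: keep one symmetric half

    % Steps 8-10: the high-time lifter keeps only quefrencies in the pitch range,
    % then the largest remaining cepstral peak gives the pitch lag.
    lifter = zeros(size(half_cep));
    lifter(minQ+1 : min(maxQ+1, numel(half_cep))) = 1;
    liftered = half_cep .* lifter;
    [~, q] = max(liftered);
    pitch_frequency = fs / (q - 1);               % quefrency (samples) -> pitch in Hz

    % Step 11: simple VAD with the experimental threshold from the report.
    if mean(power_spectrum) >= 1
        voiced_pitch_freq(length(voiced_pitch_freq)+1) = pitch_frequency;
    end
end

final_pitch_freq = mean(voiced_pitch_freq)        % average over voiced frames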

s1.wav file:

Pitch frequency = 382.0938 Hz (without VAD condition)
final_pitch_freq = 239.0619 Hz (with VAD condition)

s2.wav file:

Pitch frequency = 1.5472e+03 Hz (without VAD condition)
final_pitch_freq = 346.1053 Hz (with VAD condition)

s3.wav file:

Pitch frequency = 471.3674 Hz (without VAD condition)
final_pitch_freq = 250.8600 Hz (with VAD condition)

s4.wav file:

Pitch frequency = 1.2335e+03 Hz (without VAD condition)
final_pitch_freq = 309.1556 Hz (with VAD condition)


Comparison of the above two methods:

Pitch detection is a common task in digital signal processing that involves identifying the fundamental frequency of a sound signal. There are several methods for pitch detection, including the auto-correlation method and the cepstral method. Here is a brief comparison of these two methods:

Auto-correlation method:

In the auto-correlation method, the pitch of a sound signal is estimated by calculating the auto-correlation function of the signal. The auto-correlation function measures the similarity between a signal and a time-delayed version of itself. The pitch of the signal is then estimated by identifying the delay that maximizes the auto-correlation function.
Pros:
The auto-correlation method is simple to implement and computationally efficient.
It is a widely used method for pitch detection and can work well for many types of
sounds.
Cons:
The autocorrelation method can be sensitive to noise and harmonics that are not
related to the fundamental frequency.
It may not work well for complex signals with multiple sources and non-harmonic
components.

Cepstral method:

The cepstral method involves transforming the sound signal into the cepstral domain,
which is a logarithmic representation of the power spectrum of the signal. The
fundamental frequency can be estimated by analyzing the peaks in the cepstral
spectrum.
Pros:
The cepstral method is less sensitive to noise and harmonic interference compared
to the auto-correlation method.
It can work well for complex signals with multiple sources and non-harmonic
components.
Cons:
The cepstral method is computationally more expensive than the auto-correlation
method.
It may not work well for signals with low-frequency content, as the cepstral method
is based on logarithmic representation.
In summary, both the auto-correlation and cepstral methods have their strengths
and weaknesses, and the choice of method depends on the characteristics of the
sound signal and the specific application requirements. The auto-correlation method
is a good choice for simple signals with few sources and harmonic components,
while the cepstral method is more suitable for complex signals with multiple sources
and non-harmonic components.
