Biomedical Signal Processing

Prof. Sudipta Mukhopadhyay

Department of Electronics and Electrical Communication Engineering
Indian Institute of Technology, Kharagpur

Lecture - 65
Tutorial - V (Contd.)

In the third experiment of the fifth set, we are given a signal ‘safety.wav’.

(Refer Slide Time: 00:26)

It is an utterance of the word ‘safety’ by a male speaker, sampled at 8 kilohertz. The
signal also has some background noise. The first part of the task is to segment the signal
into voiced, unvoiced and silent parts, and for that we have to use the short-time RMS
value, the turns count, or the zero-crossing rate.

Next, we have to compute the PSD of each segment and study its characteristics.
Computing the RMS value, turns count and zero-crossing rate we have already done, so we
will not repeat that part. We assume that we have those routines with us and apply them
here to segment the signal into three parts: voiced, unvoiced, and whatever falls into
neither of them, which is the silence.

(Refer Slide Time: 01:57)

So, first we collect the signal ‘safety.wav’ and the MATLAB code that reads it,
‘safety.m’, and we keep both of them in the MATLAB working directory.

(Refer Slide Time: 02:21)

Now, let us read the signal for processing. Here we use the command ‘audioread’ to read
the wav file, and the sampling frequency is 8 kilohertz. First we compute the number of
samples, ‘slen’, using the command ‘length’ on the variable sound ‘x’, where we have
stored the contents of ‘safety.wav’. Using that, we compute the time axis for the plot:
the vector one colon ‘slen’ gives only the sample numbers, and we multiply it by 1 by fs
to get the time in seconds. Then we plot the signal against time and label the two axes:
the x-axis is the time in seconds and the y-axis is the sound.
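
The time axis computation can be sketched as follows. This is only an illustration in
Python rather than the MATLAB used in the lecture; the helper name `time_axis` is
hypothetical, and it simply mirrors the expression (1:slen) * (1/fs).

```python
def time_axis(num_samples, fs):
    """Return the time in seconds of samples 1..num_samples, i.e. (1:slen) * (1/fs)."""
    return [n / fs for n in range(1, num_samples + 1)]

fs = 8000                 # sampling frequency stated in the lecture (8 kHz)
t = time_axis(4, fs)      # only 4 samples, to keep the example small
print(t)                  # times of the first four samples in seconds
```

Because the sample numbers start at 1, the first sample sits at 1/fs seconds and not at 0.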

From the book, we get the boundaries of the different phonemes: ‘S’ is from 0.2 to 0.35
second, ‘E’ from 0.4 to 0.7 second, ‘F’ from 0.75 to 0.95 second, ‘T’ from 1.087 to 1.11
second, and ‘I’ from 1.11 to 1.2 second. In between there are silences; there is a small
silence at the beginning and also a silence at the end.
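
To relate these boundaries to the samples of the signal, we can convert the times to
sample indices. The sketch below is in Python, not the lecture's MATLAB; the names
`BOUNDARIES` and `to_sample_range` are hypothetical, while the times and the 8 kHz rate
are the ones quoted above.

```python
FS = 8000  # Hz, sampling frequency stated in the lecture

# Phoneme boundaries in seconds, as quoted from the book in the lecture.
BOUNDARIES = {
    "S": (0.2, 0.35),
    "E": (0.4, 0.7),
    "F": (0.75, 0.95),
    "T": (1.087, 1.11),
    "I": (1.11, 1.2),
}

def to_sample_range(start_s, end_s, fs=FS):
    """Convert a (start, end) interval in seconds to rounded sample indices."""
    return round(start_s * fs), round(end_s * fs)

sample_ranges = {ph: to_sample_range(*b) for ph, b in BOUNDARIES.items()}
for phoneme, (first, last) in sample_ranges.items():
    print(phoneme, first, last)
```

Note how short ‘T’ is compared with the others: it spans only 184 samples at this rate.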

We need to keep in mind that these boundaries are what we get from the book, not what we
have computed. Here the plot is given, with red lines showing those boundaries. The
initial part is silence, and at the end we again have silence. In between we have the
phoneme S, then E, then F, then the small part that is T, and then I.

Out of these, E and I are voiced sounds, or vowels; the other three, S, F and T, are
consonants, or unvoiced sounds. So, our first task is to segment these phonemes; then we
will be able to compute the PSD and compare the different kinds of sound. The parts that
we call silent are not exactly silent, because the background noise is present there. So,
we need to compare those three kinds of segment, voiced, unvoiced and silent, in terms of
their PSD.

(Refer Slide Time: 06:57)

So, first we give the plot of the signal along with the RMS value, turns count and
zero-crossing rate. In tutorial 4.3 we have already shown how to compute these three, so
we simply make use of them rather than explaining them once more.

When we look at the RMS value, we note that it is high for the vowel sounds E and I, and
small for both the consonants and the silent period. The turns count is high for the
consonants, small for the silent period, and intermediate for the vowel sounds. Between
the voiced and the unvoiced parts, the zero-crossing rate gives a better differentiation:
it is low for the vowels and high for the unvoiced sounds.

We note these points, and we have to make use of all three measures to decide which part
is voiced and which part is unvoiced.
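
For readers who do not have the routines of tutorial 4.3 at hand, the three measures can
be sketched as below. This is a Python sketch under our own assumptions, not the code of
the tutorial: the frames are non-overlapping, the frame length of 80 samples is arbitrary,
and the turns count is taken simply as the number of slope sign changes, without any
amplitude threshold. All function names are hypothetical.

```python
import math

def frames(x, frame_len):
    """Split x into consecutive non-overlapping frames, dropping a short tail."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, frame_len)]

def rms(frame):
    """Root mean square value of one frame."""
    return math.sqrt(sum(v * v for v in frame) / len(frame))

def zero_crossings(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))

def turns_count(frame):
    """Count slope sign changes, i.e. local maxima and minima."""
    d = [b - a for a, b in zip(frame, frame[1:])]
    return sum(1 for p, q in zip(d, d[1:]) if p * q < 0)

# One period of a smooth 100 Hz wave versus a weak, rapidly alternating signal,
# both 80 samples long at 8 kHz.
smooth = [0.5 * math.sin(2 * math.pi * 100 * n / 8000) for n in range(80)]
jagged = [0.01 if n % 2 == 0 else -0.01 for n in range(80)]
for name, sig in (("smooth", smooth), ("jagged", jagged)):
    for fr in frames(sig, 80):
        print(name, round(rms(fr), 4), turns_count(fr), zero_crossings(fr))
```

The smooth wave behaves like the vowels in the plot (high RMS, few zero crossings), and
the alternating signal behaves like the consonants (low RMS, many turns and crossings).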

(Refer Slide Time: 09:14)

So, let us proceed with that. First we observe the signal and find a couple of
thresholds. The RMS value is the vector that we plotted earlier. What we find is that if
the RMS value is more than 0.042 and the zero-crossing rate is less than 10, then it is a
voiced sound.

For the unvoiced sound, the RMS value is low, with a threshold of 0.0665; in addition,
the turns count is more than 4, and the threshold on the zero-crossing rate is 8. Why do
we use these conditions together? To make sure that the silent period does not get
included in the unvoiced sound. If we look at the RMS value alone, there is a chance that
we take the silent period also as unvoiced sound. So, now we have separated these two
parts, and the two variables ‘voicedSig’ and ‘unvoicedSig’ capture the result as 1 or 0.
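
The thresholding rule can be sketched as follows, in Python rather than MATLAB. The
numerical thresholds are the ones quoted in the lecture, but the direction of two of the
unvoiced conditions is our assumption: we read the RMS value as below 0.0665 and the
zero-crossing rate as above 8. The function name `classify` and the sample values are
hypothetical.

```python
RMS_VOICED_MIN = 0.042      # lecture: RMS more than 0.042 for voiced
ZCR_VOICED_MAX = 10         # lecture: zero-crossing rate less than 10 for voiced
RMS_UNVOICED_MAX = 0.0665   # lecture value; "below" is our assumption
TURNS_UNVOICED_MIN = 4      # lecture: turns count more than 4 for unvoiced
ZCR_UNVOICED_MIN = 8        # lecture value; "above" is our assumption

def classify(rms_vals, turns_vals, zcr_vals):
    """Return two 0/1 lists marking the voiced and the unvoiced positions."""
    voiced, unvoiced = [], []
    for r, tc, z in zip(rms_vals, turns_vals, zcr_vals):
        voiced.append(int(r > RMS_VOICED_MIN and z < ZCR_VOICED_MAX))
        unvoiced.append(int(r < RMS_UNVOICED_MAX
                            and tc > TURNS_UNVOICED_MIN
                            and z > ZCR_UNVOICED_MIN))
    return voiced, unvoiced

# Three made-up measurements: vowel-like, consonant-like, silence-like.
voiced_sig, unvoiced_sig = classify([0.10, 0.02, 0.005], [3, 12, 2], [5, 30, 3])
print(voiced_sig, unvoiced_sig)
```

The silence-like entry passes the RMS test for unvoiced sound but fails the other two, so
it is marked 0 in both outputs, which is the reason for combining the conditions. Under
this reading the two rules are not mutually exclusive, so a value between the RMS
thresholds could in principle be marked in both.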

How do we see their span? First, we create the pane with the command ‘figure’ and plot
the sound with respect to time. Then we hold the plot, which means that we want to draw
over the same plot, and we plot the voiced signal. We apply a small scaling to place it at
an appropriate level, and we use green color to draw the span of the voiced signal.
Wherever the voiced signal is present, it is drawn as a rectangle; in the rest of the plot
it is 0.

In a similar way, wherever the unvoiced signal is present, a rectangle is drawn with
green color. The x-axis is the time in seconds, the y-axis is the sound, and we have three
legend entries for the three signals: input signal, voiced, and unvoiced. With this, we go
on to the plot.

(Refer Slide Time: 12:48)

Here, we first show the actual signal and the ground truth, and then we draw the voiced
and the unvoiced parts. Going back and forth between the two, we see that for ‘S’ the
segmentation is more or less accurate, and for ‘E’ also it is close. On the other hand,
for the phoneme ‘F’, compared with the actual boundary, the segment we got is much
smaller.

Let us look at the other two phonemes, ‘T’ and ‘I’. The segments we have created are
close to what is given in the book. So, out of the 5 phonemes, we have good segments for
all except ‘F’. If we have to segment it in a better way, we need to fine-tune those
parameters. However, with the given rule this is the segmentation we get, so we will go
ahead with these segments for further analysis.

(Refer Slide Time: 14:49)

So, first we plot the unvoiced sound. For that, we find the transitions. We take the
difference of the unvoiced signal and look at wherever it is not 0; those points are the
transitions. The first two transitions give us the duration of the phoneme ‘S’.
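
Finding the transitions from the difference of a 0/1 signal can be sketched as below.
This is a Python illustration with a hypothetical helper `segments_from_mask` and made-up
data; the 0/1 list plays the role of ‘unvoicedSig’.

```python
def segments_from_mask(mask):
    """Return (start, end) index pairs, end exclusive, for each run of 1s."""
    padded = [0] + list(mask) + [0]          # so runs touching the edges are closed
    diff = [b - a for a, b in zip(padded, padded[1:])]
    starts = [i for i, d in enumerate(diff) if d == 1]    # 0 -> 1 transitions
    ends = [i for i, d in enumerate(diff) if d == -1]     # 1 -> 0 transitions
    return list(zip(starts, ends))

unvoiced_mask = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]            # made-up example
x = [0.0, 0.1, 0.5, -0.4, 0.3, 0.0, 0.1, -0.2, 0.2, 0.0]  # made-up samples

segments = segments_from_mask(unvoiced_mask)
first_start, first_end = segments[0]     # first two transitions: the first segment
s_wave = x[first_start:first_end]
print(segments, s_wave)
```

The first pair of transitions delimits the first unvoiced run, which in the word ‘safety’
is the phoneme ‘S’; the later pairs give the remaining unvoiced segments.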

Next, we pick up the corresponding part from the variable sound ‘x’ and plot it as
‘sWave’ against the time axis. So, we get the time-domain plot of ‘S’. It looks like a
random, jagged signal, and its amplitude is not very high. We will handle the other waves
in a similar way.

(Refer Slide Time: 16:54)

Before that, we look at the power spectrum. We have already seen how to get the power
spectrum: we take the FFT, take the square of the absolute value of each coefficient,
normalize it by the length of the signal, and plot it in the dB scale. For that we also
need the frequency axis, and that is what is done here.

Here we get the spectrum of ‘S’. It is a more or less flat kind of spectrum; there are
undulations, but no prominent peaks. Compared to the DC value, the peaks that we get here
are about 20 to 30 dB below, so they are not actually peaks.
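
The steps of the power spectrum computation can be sketched as below. To keep the Python
example self-contained, a direct DFT sum stands in for the FFT of the lecture; it gives
the same coefficients but is slow for long signals. The function name
`power_spectrum_db`, the small floor inside the logarithm, and the test tone are our
assumptions.

```python
import cmath
import math

def power_spectrum_db(x, fs):
    """Return (frequencies in Hz, power in dB) for bins 0 .. fs/2."""
    n = len(x)
    freqs, psd_db = [], []
    for k in range(n // 2 + 1):
        # DFT coefficient of bin k (a direct sum in place of the FFT).
        coeff = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        power = abs(coeff) ** 2 / n          # squared magnitude, normalized by length
        freqs.append(k * fs / n)             # frequency axis
        psd_db.append(10 * math.log10(power + 1e-12))  # floor avoids log of zero
    return freqs, psd_db

fs = 8000
tone = [math.sin(2 * math.pi * 1000 * m / fs) for m in range(64)]  # 1 kHz test tone
freqs, psd_db = power_spectrum_db(tone, fs)
peak_bin = max(range(len(psd_db)), key=lambda k: psd_db[k])
print(freqs[peak_bin])
```

For this test tone the energy sits in a single bin at 1000 Hz, which is the kind of
prominent peak that is absent in the spectrum of ‘S’.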

In the same way, we can get the spectrum of the other phonemes.

(Refer Slide Time: 18:22)

Here we show it for the different phonemes. First, we look at the voiced signal ‘E’. The
amplitude is higher in this case, and when we look at the spectrum, we find a couple of
very prominent peaks. That is the specialty of the phoneme ‘E’.

(Refer Slide Time: 19:07)

Now, let us move forward to the next phoneme, ‘F’. For ‘F’ the signal amplitude is again
low. The corresponding PSD has no very prominent peak; though it is jagged, we do not get
any high peak.

(Refer Slide Time: 19:41)

Next, we go to the phoneme ‘T’, which is again an unvoiced signal. ‘T’ comes with an
abrupt change: the signal appears suddenly and then goes down. When we look at the PSD,
we do not get any sharp peak here.

(Refer Slide Time: 20:21)

Now, the next phoneme is ‘I’. Here we get a higher amplitude and a more or less regular
shape in the time domain. In the spectral domain, we again find a couple of peaks. That is
the specialty of the voiced phoneme ‘I’.

(Refer Slide Time: 20:57)

Next, we look at the silent portion. We have taken one silent portion; in the time domain
it is really random, and the PSD is very jagged, with no pattern at all in it. With that
we complete our observations of the different kinds of spectrum. Now we conclude upon
what we have seen.

(Refer Slide Time: 21:34)

The first thing we find is that voiced and unvoiced signals can be segmented by
thresholding the RMS value, turns count and zero-crossing rate. Using those three, we can
separate the voiced from the unvoiced signal. For the voiced sounds ‘E’ and ‘I’, when we
look at the spectrum, we find peaks at certain frequencies. In fact, for both of them we
get a repeated waveform in the time domain, and that gives rise to the concentration of
energy at certain frequencies in the PSD.

On the other hand, for the unvoiced signal we get a random kind of time-domain waveform.
Here we have three, S, F and T, and in all three cases there is nothing specific in the
time domain. In the same way, in the frequency domain we do not get any peak in the PSD.
That is the signature of the unvoiced sounds, or the consonants.
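
This contrast can be checked on synthetic data. The Python sketch below is not part of
the lecture: the peak-to-median ratio is our own illustrative measure, the two signals
are made up (a repeated waveform and seeded pseudo-random noise), and a direct DFT sum
again stands in for the FFT.

```python
import cmath
import math
import random

def power_spectrum(x):
    """Power of DFT bins 1 .. N/2 (DC skipped), normalized by the signal length."""
    n = len(x)
    return [abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n)
                    for m in range(n))) ** 2 / n
            for k in range(1, n // 2 + 1)]

def peak_to_median_db(x):
    """Ratio of the largest bin to the median bin, in dB (illustrative measure)."""
    p = sorted(power_spectrum(x))
    median = p[len(p) // 2]
    return 10 * math.log10((p[-1] + 1e-12) / (median + 1e-12))

n = 64
# Repeated waveform: two harmonically related sinusoids, as a stand-in for a vowel.
periodic = [math.sin(2 * math.pi * 4 * m / n) + 0.5 * math.sin(2 * math.pi * 8 * m / n)
            for m in range(n)]
# Random waveform: seeded noise, as a stand-in for an unvoiced sound.
rng = random.Random(0)
noise = [rng.uniform(-1, 1) for _ in range(n)]

print(round(peak_to_median_db(periodic), 1), round(peak_to_median_db(noise), 1))
```

The repeated waveform concentrates its energy in a few bins, so its ratio is far larger
than that of the noise, whose energy is spread over all the bins.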

Thank you.
