Speech Processing

This document provides an overview and introduction to a course on speech processing. It outlines the following key points:
- The course will cover fundamental concepts in speech production, perception, analysis, recognition, synthesis and modification. It will introduce mathematical foundations and computational methods for processing speech signals.
- Students will learn to analyze, visualize and manipulate speech signals, as well as build a complete speech recognition system.
- The course will meet three times a week and include homework assignments, exams and a group project. Grading will be based on homework, a project, a midterm and a final exam.
- Topics will include speech analysis techniques, speech and speaker recognition using Hidden Markov Models, and speech synthesis.

Uploaded by

selvaraj

L1: Course introduction

Course introduction
Course logistics
Course contents

Introduction to Speech Processing | Ricardo Gutierrez-Osuna | CSE@TAMU

Course introduction
What is speech processing?
The study of speech signals and their processing methods
Speech processing encompasses a number of related areas

Speech recognition: extracting the linguistic content of the speech signal

Speaker recognition: recognizing the identity of speakers by their voice
Speech coding: compression of speech signals for telecommunication
Speech synthesis: computer-generated speech (e.g., from text)
Speech enhancement: improving intelligibility or perceptual quality of
speech signals
Example utterance (shown on the slide with its phonetic transcription): "The music carried on until after midnight and then the drummers became tired and the dancers became cold."

Applications of speech processing

Human computer interfaces (e.g., speech I/O, affective)

Telecommunication (e.g., speech enhancement, translation)
Assistive technologies (e.g., blindness/deafness, language learning)
Audio mining (e.g., diarization, tagging)
Security (e.g., biometrics, forensics)

Related disciplines

Digital signal processing

Natural language processing
Machine learning
Phonetics
Human computer interaction
Perceptual psychology

The course objectives are to familiarize students with

Fundamental concepts of speech production and speech perception
Mathematical foundations of signal processing and pattern
recognition
Computational methods for speech analysis, recognition, synthesis,
and modification

As outcomes, students will be able to

Manipulate, visualize, and analyze speech signals
Perform various decompositions, codifications, and modifications of
speech signals
Build a complete speech recognition system using state-of-the-art tools

Course logistics
Class meetings
MWF 9:10-10:00am
HRBB 126

Course prerequisites
ECEN 314 or equivalent, or permission of the instructor
Basic knowledge of signals and systems, linear algebra, and probability
and statistics
Programming experience in a high-level language is required

Textbook
The course will not have an official textbook and instead will be based
on lecture slides developed by the instructor from several sources
Additional course materials may be found on the course website
http://courses.cs.tamu.edu/rgutier/csce689_s11/

Recommended references
J. Holmes & W. Holmes, Speech Synthesis and Recognition, 2nd Ed,
CRC Press, 2001 (available online at TAMU libraries)
P. Taylor, Text-to-speech synthesis, Cambridge University Press, 2009
L. R. Rabiner and R. W. Schafer, Introduction to Digital Speech
Processing, Foundations and Trends in Signal Processing 1(1-2), 2007
B. Gold and N. Morgan, Speech and Audio Signal Processing:
Processing and perception of speech and music, Wiley, 2000
T. Dutoit and F. Marques, Applied signal processing, a Matlab-based
proof-of-concept, Springer, 2009
J. Benesty, M. M. Sondhi, and Y. Huang (Eds.), Springer Handbook of
Speech Processing, 2008 (available online at TAMU libraries)
X. Huang, A. Acero and H.-W. Hon, Spoken Language Processing,
Prentice Hall, 2001

Grading
Homework assignments
Three assignments, roughly every 2-3 weeks
Emphasis on implementation of material presented in class
Must be done individually

Tests
Midterm and final exam
Closed-book, closed-notes (cheat sheet allowed)

Project
Team-based, in groups of up to 3 people
Three types: application of existing tools, development of new tools,
design of new algorithms

Component     Weight (%)
Homework      40
Project       30
Midterm       15
Final exam    15
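
As a rough illustration of how these weights combine, the sketch below computes a weighted course grade. The weights come from the grading table; the function name `course_grade` and the example scores are hypothetical, and the slides do not say how component scores are normalized, so a 0-100 scale per component is assumed here.

```python
# Weights (%) from the grading table; everything else is illustrative.
WEIGHTS = {"homework": 40, "project": 30, "midterm": 15, "final_exam": 15}


def course_grade(scores):
    """Return the weighted course grade on a 0-100 scale.

    `scores` maps each component name in WEIGHTS to a score out of 100
    (an assumed convention, not stated in the slides).
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly: " + ", ".join(WEIGHTS))
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student scores, for illustration only.
example = {"homework": 90, "project": 85, "midterm": 70, "final_exam": 80}
print(course_grade(example))  # 84.0
```

Homework and the project together carry 70% of the grade, so in this example a weak midterm costs far fewer points than the same drop on the homework would.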

Course contents

Introduction (3 lectures)
Course introduction
Speech production and perception
Organization of speech sounds

Mathematical foundations (4 lectures)
Signals and transforms
Digital filters
Probability, statistics and estimation theory
Pattern recognition principles

Speech analysis and coding (4 lectures)
Short-time Fourier analysis and synthesis
Linear prediction of speech
Source estimation
Cepstral analysis

Speech and speaker recognition (6 lectures)
Template matching
Hidden Markov models
Refinements for HMMs
Large vocabulary continuous speech recognition
The HTK speech recognition system
Speaker recognition

Speech synthesis and modification (4 lectures)
Text-to-speech front-end
Text-to-speech back-end
Prosodic modification of speech
Voice conversion

Tentative schedule*

Week  Date  Classroom meeting
1     1/17  No class (MLK day)
1     1/19  Course introduction
2     1/24  Speech production and perception
2     1/26  Organization of speech sounds
3     1/31  Signals and transforms
3     2/2   Digital filters
4     2/7   Short-time Fourier analysis and synthesis
4     2/9   Linear prediction of speech
5     2/14  Source estimation
5     2/16  Cepstral analysis
6     2/21  Probability, statistics, and estimation theory
6     2/23  Pattern recognition principles
7     2/28  Template matching
7     3/2   Hidden Markov models
8     3/7   Review/catch-up day
8     3/9   Midterm exam
9     3/14  Spring Break
9     3/16  Spring Break
10    3/21  Refinements for HMMs
10    3/23  Large vocabulary continuous speech recognition
11    3/28  HTK speech recognition system
11    3/30  Speaker recognition
12    4/4   Speech synthesis (front-end)
12    4/6   Speech synthesis (back end)
13    4/11  Review/catch-up day
13    4/13  Proposal presentations
14    4/18  Prosodic modification of speech
14    4/20  Voice conversion
15    4/25  Review/catch-up day
15    4/27  Final exam
16    5/2   Prep day (no class)
16    5/4   Reading day (no class)
17    5/9   Project presentations (8:00AM - 10:00PM)

Materials due (in order over the term)
HW1 assigned
HW1 due
HW2 assigned
HW2 due
HW3 assigned
HW3 due
Project proposal
Project report

*This timeline assumes MW meeting times
