Discrete Time Processing of Speech Signa

Uploaded by

Ananthakrishnan K

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF or read online on Scribd

0% found this document useful (0 votes)

71 views12 pages

Discrete Time Processing of Speech Signa

Uploaded by

Ananthakrishnan K

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF or read online on Scribd

You are on page 1/ 12

TK7862 S65 D4y 2000 Discrete-Time Processing of Speech Signals John R. Deller, Jr. Michigan State University John H. L. Hansen University of Colorado at Boulder John G. Proakis Northeastern University IEEE Signal Processing Society, Sponsor IEEE The institute of Electrical and Electronics Engineers, Inc., New York One: INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION: New York + Chichester - Weinheim + Brisbane ~ Singapore * TorontoContents Preface to the IEEE Edition xvii Preface xix Acronyms and Abbreviations rodii 1 Signal Processing Background 1 Propaedeutic 3 1.0 hl 1.2 Preamble 3 1.0.1 The Purpose of Chapter! = 3 1.0.2 Please Read This Note on Notation 4 1.0.3 For People Who Never Read Chapter I {and Those Who Do) 5 Review of DSP ‘Concepts and Notation 6 1.1.1 “Normalized Time and Frequency” 6 Singularity Signals 9 Energy and Power Signals 9 ‘Transforms and a Few Related Concepts 10 Windows and Frames 16 Discrete-Time Systems 20 Minimum, Maximum, and Mixed-Phase Signals and Systems 24 Review of Probability and Stochastic Processes 29 1.2.1 Probability Spaces 30 1.2.2 Random Variables 33 1.2.3 Random Processes 42 1.2.4 Vector-Valued Random Processes 52 Topics in Statistical Pattern Recognition 55 1.3.1 Distance Measures 56 1.3.2. The Euclidean Metric and “Prewhitening” of Features 58 UabWavbh L i 1 VAs 11 vilviii, Contents 1.3.3 Maximum Likelihood Classification 63 1.3.4 Feature Selection and Probablistic Separability Measures 66 13.5 Clustering Algorithms 70 1.4. Information and Entropy 73 1.4.1 Definitions 73 1.4.2. Random Sources 77 1.4.3 Entropy Concepts in Pattern Recognition 78 1.5 Phasors and Steady-State Solutions 79 1.6 Onward to Speech Processing 81 1.7 Problems 85 Appendices: Supplemental Bibliography 90 1.A Example Textbooks on Digital Signal Processing, 90 1.B Example Textbooks on Stochastic Processes 90 1.C Example Textbooks on Statistical Pattern Recognition 91 1.D Example Textbooks on Information Theory 91 1.E Other Resources on Speech Processing 92 LE. Textbooks 92 1.E.2 Edited Paper Collections 92 1.E.3 Journals 92 1,E.4 Conference Proceedings 93 1.F Example Textbooks on Speech and Hearing Sciences 93 1.G Other Resources on Artificial Neural Networks 94 1.G.1 Textbooks and Monographs 94 1.G.2 Journals 94 1.G.3 Conference Proceedings 95 ll Speech Production and Modeling 2 Fundamentals of Speech Science 99 2.0 Preamble 99 2.1. Speech Communication 100 2.2 Anatomy and Physiology of the Speech Production System 101 2.2.1 Anatomy 10]Contents ix 2.2.2 The Role of the Vocal Tract and Some Elementary Acoustical Analysis . 104 2.2.3 Excitation of the Speech System and the Physiology of Voicing 110 2.3 Phonemics and Phonetics 115 2.3.1 Phonemes Versus Phones 115 2.3.2 Phonemic and,Phonetic Transcription TI6 2.3.3 Phonemic and Phonetic Classification 117 2.3.4 Prosodic Features and Coarticulation 137 2.4 Conclusions 146 2.5 Problems 146 3 Modeling Speech Production 151 3.0 Preamble 151 3.1 Acoustic Theory of Speech Production 151 3.1.1 History i51 3.1.2 Sound Propagation 156 3.1.3 Source Excitation Model 159 3.1.4 Vocal-Tract Modeling — 166 3.1.5 Models for Nasals and Fricatives 186 3.2 Discrete-Time Modeling 187 3.2.1. General Discrete-Time Speech Model 187 3.2.2 A Discrete-Time Filter Model for Speech Production 192 3.2.3 Other Specch Models 197 3.3 Conclusions 200 3.4 Problems 201 3.A Single Lossless Tube Analysis 203 3.A.1 Open and Closed Terminations 203 3.A.2 Impedance Analysis, T-Network, and Two-Post Network 206 3.B Two-Tube Lossless Model of the Vocal Tract 211 3.C Fast Discrete-Time Transfer Function Calculation 217 lll Analysis Techniques 4 Short-Term Processing of Speech 225 4.1 Introduction 225x Contents 4.2 Short-Term Measures from Long-Term Concepts 226 4.2.1 Motivation 226 4.2.2 “Frames” of Speech 227 4.2.3 Approach 1 to the Derivation of a Short-Term Feature and its Two Computational Forms 227 4.2.4 Approach 2 to the Derivation of a Short-Term Feature and Its Two Computational Forms 237 4.2.5 On the Role of “1/N” and Related Issues 234 4.3 Example Short-Term Features and Applications 236 4.3.1 Short-Term Estimates of Autocorrelation 236 4.3.2 Average Magnitude Difference Function 244 4.3.3 Zero Crossing Measure 245 4.3.4 Short-Term Power and Energy Measures 246 4.3.5 Short-Term Fourier Analysis 257 4.4 Conclusions 262 4.5 Problems 263 5 Linear Prediction Analysis 266 5.0 Preamble 266 5.1 Long-Term LP Analysis by System Identification 267 5.1.1 The All-Pole Model 267 5.1.2 Identification of the Model 270 5.2 How Good Is the LP Model? 280 5.2.1 The “Ideal” and “Almost Ideal” Cases 280 5.2.2 “Nonideal” Cases 281 5.2.3, Summary and Further Discussion 287 5.3 Short-Term LP Analysis 290 5.3.1 Autocorrelation Method 290 5.3.2 Covariance Method 292 5.3.3 Solution Methods 296 5.3.4 Gain Computation 325 5.3.5 A Distance Measure for LP Coefficients 327 5.3.6 Preemphasis of the Speech Waveform 329 5.4 Alternative Representations of the LP Coefficients 331 5.4.1 The Line Spectrum Pair = 33] 5.4.2 Cepstral Parameters 333 5.5 Applications of LP in Speech Analysis 333 5.5.1 Pitch Estimation 333 5.5.2 Formant Estimation and Glottal Waveform Deconvolution 336Contents xi 5.6 Conclusions 342 5.7 Problems 343 5.A Proof of Theorem 5.1 348 5.B The Orthogonality Principle 350 6 Cepstral Analysis 352 6.1 Introduction 352 6.2 “Real” Cepsirum 355 6.2.1 Long-Term Real Cepstrum 355 6.2.2 Short-Term Real Cepstrum —_ 364 6.2.3 Example Applications of the stRC to Speech Analysis and Recognition 366 6.2.4 Other Forms and Variations on the stRC Parameters 380 6.3. Complex Cepstrum 386 6.3.1. Long-Term Complex Cepstrum 386 6.3.2 Short-Term Complex Cepstrum 393 6.3.3 Exampie Application of the stCC to Speech Analysis 394 6.3.4 Variations on the Complex Cepstrum. 397 6.4 A Critical Analysis of the Cepstrum and Conclusions 397 6.5 Problems 401 IV Coding, Enhancement and Quality Assessment 7 Speech Coding and Synthesis 409 7.1 Introduction 410 7.2 Optimum Scalar and Vector Quantization 410 7.2.1 Scalar Quantization 417 7.2.2 Vector Quantization 425 7.3. Waveform Coding 434 7.3.1 Introduction 434 73.2 Time Domain Waveform Coding 435 7.3.3 Frequency Domain Waveform Coding 451 7.3.4 Vector Waveform Quantization 457 74 Vocoders 459 7.4.1 The Channel Vocoder 460 7.4.2 The Phase Vocoder 462 74.3 The Cepstral (Homomorphic) Vocoder 462xii Contents i) 76 V7 TA 7.4.4 Formant Vocoders 469 7.4.5 Linear Predictive Coding 471 7.4.6 Vector Quantization of Model Parameters 485 Measuring the Quality of Speech Compression Techniques 488 Conclusions 489 Problems 490 Quadrature Mirror Filters 494 8 Speech Enhancement 501 8.1 8.2 8.3 8.4 8.5 8.6 Introduction 501 Classification of Speech Enhancement Methods 504 Short-Term Spectral Amplitude Techniques 506 8.3.1 Introduction 506 8.3.2 Spectral Subtraction 506 8.3.3 Summary of Short-Term Spectral Magnitude Methods 516 Speech Modeling and Wiener Filtering 517 8.4.1 Introduction 517 8.4.2 Iterative Wiener Filtering 517 8.4.3 Speech Enhancement and All-Pole Modeling = 527 8.4.4 Sequential Estimation via EM Theory 524 8.4.5 Constrained Iterative Enhancement 525 8.4.6 Further Refinements to Iterative Enhancement 527 8.4.7 Summary of Speech Modeling and Wiener Filtering 528 Adaptive Noise Canceling 528 8.5.1 Introduction 528 8.5.2 ANC Formalities and the LMS Algorithm 530 8.5.3 Applications of ANC 534 8.5.4 Summary of ANC Methods $41 Systems Based on Fundamental Frequency Tracking S4t 8.6.1 Introduction 54] 8.6.2 Single-Channel ANC 542 8.6.3 Adaptive Comb Filtering 545 8.6.4 Harmonic Selection 549 8.6.5 Summary of Systems Based on Fundamental Frequency Tracking 551Contents xii 8.7 Performance Evaluation 552 8.7.1 Introduction 552 8.7.2 Enhancement and Perceptual Aspects of Speech 552 8.7.3 Speech Enhancement Algorithm Performance 554 8.8 Conclusions 556 8.9 Problems 557 8A The INTEL System 561 8.B Addressing Cross-Talk in Dual-Channel ANC 565 9 Speech Quality Assessment 9.1 Introduction 568 9.1.1 The Need for Quality Assessment 568 9.1.2 Quality Versus Intelligibility 570 9.2 Subjective Quality Measures 570 9.2.1 Intelligibility Tests 572 9.2.2 Quality Tests 575 9.3 Objective Quality Measures 580 9.3.1 Articulation Index 582 9.3.2 Signal-to-Noise Ratio 584 9.3.3 Itakura Measure 587 9.3.4 Other Measures Based on LP Analysis 588 9.3.5 Weighted-Spectral Slope Measures 589 9.3.6 Global Objective Measures 590 9.3.7 Example Applications 59] 9.4 Objective Versus Subjective Measures 593 9.5 Problems 595 V Recognition 10 The Speech Recognition Problem 10.1 Introduction 601 10.1.1 The Dream and the Reality 601 10.1.2 Discovering Our Ignorance 604 10.1.3 Circumventing Our Ignorance 605 10.2. The “Dimensions of Difficulty” 606 10.2.1 Speaker-Dependent Versus Speaker-Independent Recognition 607 10.2.2 Vocabulary Size 607 568 601xiv Contents 10.2.3 Isolated-Word Versus Continuous-Speech Recognition 608 10.2.4 Linguistic Constraints 6/4 10.2.5 Acoustic Ambiguity and Confusability 6/9 10.2.6 Environmental Noise 620 10.3. Related Problems and Approaches 620 10.3.1 Knowledge Engineering 620 10.3.2 Speaker Recognition and Verification 621 10.4 Conclusions 621 10.5 Problems 621 11° Dynamic Time Warping 11.1 Introduction 623 11.2 Dynamic Programming 624 11.3. Dynamic Time Warping Applied to IWR = 634 11.3.1. DTW Problem and Its Solution Using DP 634 11.3.2 DTW Search Constraints 638 11.3.3 Typical DTW Algorithm: Memory and Computational Requirements 649 11.4 DTW Applied to CSR 651 11.4.1 Introduction 657 114.2 Level Building 652 11.4.3 The One-Stage Algorithm 660 11.4.4 A Grammar-Driven Connected-Word Recognition System 669 11.4.5 Pruning and Beam Search 670 11.4.6. Summary of Resource Requirements for DTW Algorithms 671 11.5 Training Issues in DTW Algorithms 672 11.6 Conclusions 674 11.7 Problems 674 12 The Hidden Markov Model 12.1 Introduction 677 12.2 Theoretical Developments 679 12.2.1 Generalities 679 12.2.2 The Discrete Observation HMM = 684 12.2.3 The Continuous Observation HMM = 705 12.2.4 Inclusion of State Duration Probabilities in the Discrete Observation HMM = 709 12.2.5 Scaling the Forward-Backward Algorithm 7/5 623 67712.3 12.4 12.5 Contents xv 12.2.6 Training with Multiple Observation Sequences 718 12.2.7 Alternative Optimization Criteria in the Training of HMMs 720 12.2.8 A Distance Measure for HMMs 722 Practical Issues 723 12.3.1 Acoustic Observations 723 12.3.2 Model Structure and Size 724 12.3.3 Training with Insufficient Data 728 12.3.4 Acoustic Units Modeled by HMMs 730 First View of Recognition Systems Based on HMMs 734 12.4.1 Introduction 734 12.4.2 IWR Without Syntax 735 12.4.3 CSR by the Connected-Word Strategy Without Syntax 738 12.4.4 Preliminary Comments on Language Modeling Using HMMs 740 Problems 740 13 Language Modeling 745 13.1 13.2 13.3 13.4 13.5 13.6 13.7 13.8 13.9 Introduction 745 Formal Tools for Linguistic Processing 746 13.2.1 Formal Languages 746 13.2.2 Perplexity of a Language 749 13.2.3 Bottom-Up Versus Top-Down Parsing 757 HMMs, Finite State Automata, and “Regular Grammars 754 A “Bottom-Up” Parsing Example 759 Principles of “Top-Down” Recognizers 764 13.5.1 Focus on the Linguistic Decoder 764 13.5.2. Foens on the Acoustic Decoder 770 13.5.3 Adding Levels to the Linguistic Decoder 772 13.5.4. Training the Continuous-‘Speech Recognizer 775 Other Language Models 779 13.6.1 N-Gram Statistical Models 779 13.6.2 Other Formal Grammars 785 FWR As “CSR” —- 789 Standard Databases for Speech-Recognition Research 790 A Survey of Language-Model-Based Systems 791xvi Contents 13,40 Conclusions 801 13.11 Problems 801 14 «The Artificial Neural Network 805 14. 14.2 14.3 14.4 14.5 14.6 Index Introduction 805 The Artificial Neuron 808 Network Principles and Paradigms 813 14.3.1 Introduction 873 14.3.2 Layered Networks: Formalities and Definitions 875 14.3.3 The Multilayer Perceptron 819 14,3.4 Learning Vector Quantizer 834 Applications of ANNs in Speech Recognition 837 14.4.1, Presegmented Speech Material 837 14.4.2, Recognizing Dynamic Speech 839 14.4.3, ANNs and Conventional Approaches 841 14.4.4 Language Modeling Using ANNs 845 14.4.5 Integration of ANNs into the Survey Systems of Section 13.9 845 Conchusions 846 Problems 847 899Sample Iustration af, Matrices A,B @) M, Matrices A,B, Observation probabilities “tied” b(K11) = b(kI2) Wk. {b) Interpolated model A=eA +(e) A, A=eB,+(1-2)B, © FIGURE 12.15. {a) A three-state HMM trained in the conventional (e.g., F-B algorithm) manner. (b) The “same” three-state HMM with tied states. (c) An “interpolated” mode! derived from the models of {a) and (b).

Unit 2 - Speech and Video Processing (SVP) - 1
No ratings yet
Unit 2 - Speech and Video Processing (SVP) - 1
23 pages
(Thomas F. Quatieri) Discrete Time Speech Signal P (BookFi - Org) 2 PDF
100% (3)
(Thomas F. Quatieri) Discrete Time Speech Signal P (BookFi - Org) 2 PDF
800 pages
Lawrence R. Rabiner, Digital Processing of Speech Signals
100% (1)
Lawrence R. Rabiner, Digital Processing of Speech Signals
527 pages
Digital Processing of Speech Signals (Rabiner & Schafer 1978) PDF
100% (2)
Digital Processing of Speech Signals (Rabiner & Schafer 1978) PDF
265 pages
Linear Prediction of Speech: D. Markel A. H. Gray, JR
No ratings yet
Linear Prediction of Speech: D. Markel A. H. Gray, JR
299 pages
A Microphone Array System For Speech Source Localization, Denoising, and Dereverberation
No ratings yet
A Microphone Array System For Speech Source Localization, Denoising, and Dereverberation
163 pages
Dokumen - Pub Discrete Time Speech Signal Processing Principles and Practice Low Price Ed Lpe 013242942x 9780132429429 9788177587463 8177587463
No ratings yet
Dokumen - Pub Discrete Time Speech Signal Processing Principles and Practice Low Price Ed Lpe 013242942x 9780132429429 9788177587463 8177587463
802 pages
Towards Neurocomputational Speech and So
No ratings yet
Towards Neurocomputational Speech and So
279 pages
Fundamental of Speech Enhencements
No ratings yet
Fundamental of Speech Enhencements
112 pages
Pub - Digital Processing of Speech Signals PDF
No ratings yet
Pub - Digital Processing of Speech Signals PDF
265 pages
Rabiner & Juang - Fundamentals of Speech Recognition
100% (2)
Rabiner & Juang - Fundamentals of Speech Recognition
277 pages
Post-Processing Method For Single Channel Speech Enhancement Systems 1
No ratings yet
Post-Processing Method For Single Channel Speech Enhancement Systems 1
74 pages
4. Human Speech Communication
No ratings yet
4. Human Speech Communication
44 pages
Discrete-Time Processing of Speech Signals (IEEE Press Classic Reissue) PDF
No ratings yet
Discrete-Time Processing of Speech Signals (IEEE Press Classic Reissue) PDF
919 pages
Self Learning Speaker Identification A System For PDF
No ratings yet
Self Learning Speaker Identification A System For PDF
185 pages
Multi-Band Excitation Vocoder: RLE Technical Report No. 524
No ratings yet
Multi-Band Excitation Vocoder: RLE Technical Report No. 524
140 pages
Chapter 2 - Speech Signal Processing
No ratings yet
Chapter 2 - Speech Signal Processing
60 pages
TEST-1
No ratings yet
TEST-1
77 pages
Prentice Hall - Digital Processing of Speech Signals - 1978 PDF
No ratings yet
Prentice Hall - Digital Processing of Speech Signals - 1978 PDF
265 pages
7.0 Speech Signals and Front-End Processing: References: 1. 3.3, 3.4 of Becchetti
No ratings yet
7.0 Speech Signals and Front-End Processing: References: 1. 3.3, 3.4 of Becchetti
50 pages
svp (1-5)units notes 4th yr csm
No ratings yet
svp (1-5)units notes 4th yr csm
35 pages
Lecture 1
No ratings yet
Lecture 1
48 pages
Speech Recognition Using Matrix Comparison: Vishnupriya Gupta
No ratings yet
Speech Recognition Using Matrix Comparison: Vishnupriya Gupta
3 pages
Lecture 7 - Automatic Speech Recognition
No ratings yet
Lecture 7 - Automatic Speech Recognition
58 pages
Speech Recognition Using DSP PDF
No ratings yet
Speech Recognition Using DSP PDF
32 pages
Basic Course Material Winter 2015
100% (1)
Basic Course Material Winter 2015
19 pages
Audio Signal Processing Audio Signal Processing
No ratings yet
Audio Signal Processing Audio Signal Processing
31 pages
Linear Prediction
No ratings yet
Linear Prediction
18 pages
Speech Analisys
No ratings yet
Speech Analisys
56 pages
Prepared By: Mamatha.K.S M.Tech (S.P) 1 Sem Guided By: Mr. Satish.M.N
No ratings yet
Prepared By: Mamatha.K.S M.Tech (S.P) 1 Sem Guided By: Mr. Satish.M.N
21 pages
Speech and Audio Coding
No ratings yet
Speech and Audio Coding
16 pages
Spectral Analysis in Speech Processing Techniques: Prof. Vijaya Sugandhi
No ratings yet
Spectral Analysis in Speech Processing Techniques: Prof. Vijaya Sugandhi
3 pages
Speech Processing Based On A Sinusoidal Model: Mcaulay and T.F. Quatieri
No ratings yet
Speech Processing Based On A Sinusoidal Model: Mcaulay and T.F. Quatieri
16 pages
10212EC122 -Signal Processing Techniques for Speech Recognition Syllabus
No ratings yet
10212EC122 -Signal Processing Techniques for Speech Recognition Syllabus
3 pages
A Tutorial On Speech Synthesis Models
No ratings yet
A Tutorial On Speech Synthesis Models
8 pages
Theory and Application of Digital Speech Processing by L. R. Rabiner and R. W. Schafer
No ratings yet
Theory and Application of Digital Speech Processing by L. R. Rabiner and R. W. Schafer
35 pages
Algorithms For Speech Processing
No ratings yet
Algorithms For Speech Processing
18 pages
Voice Signal Processing For Speech Synthesis: June 2006
No ratings yet
Voice Signal Processing For Speech Synthesis: June 2006
6 pages
Voice Signal Processing For Speech Synthesis: June 2006
No ratings yet
Voice Signal Processing For Speech Synthesis: June 2006
6 pages
Use of Spectral Autocorrelation in Spectral Envelope Linear Prediction For Speech Recognition
No ratings yet
Use of Spectral Autocorrelation in Spectral Envelope Linear Prediction For Speech Recognition
31 pages
Speaker Recognition System
No ratings yet
Speaker Recognition System
7 pages
Speech Recognition Using A DSP: Lunds Universitet
No ratings yet
Speech Recognition Using A DSP: Lunds Universitet
12 pages
Super Listener: 2. Signal Processing
No ratings yet
Super Listener: 2. Signal Processing
4 pages
Text, speech and phono
No ratings yet
Text, speech and phono
2 pages
Speech Recognition (Dr. M. Sabarimalai Manikandan
No ratings yet
Speech Recognition (Dr. M. Sabarimalai Manikandan
2 pages
A Review On Feature Extraction and Noise Reduction Technique
No ratings yet
A Review On Feature Extraction and Noise Reduction Technique
5 pages
Advanced Topics in Speech Processing (IT60116) : K Sreenivasa Rao School of Information Technology IIT Kharagpur
No ratings yet
Advanced Topics in Speech Processing (IT60116) : K Sreenivasa Rao School of Information Technology IIT Kharagpur
17 pages
Lab 9a. Linear Predictive Coding For Speech Processing: Vocal Tract Parameters Pitch Period Voiced/Unvoiced Speech Switch
No ratings yet
Lab 9a. Linear Predictive Coding For Speech Processing: Vocal Tract Parameters Pitch Period Voiced/Unvoiced Speech Switch
5 pages
Reconocimiento de Voz - MATLAB
No ratings yet
Reconocimiento de Voz - MATLAB
5 pages

Discrete Time Processing of Speech Signa

Uploaded by

Discrete Time Processing of Speech Signa

Uploaded by

You might also like