Speech Enhancement Through Elimination of Impulsive Disturbance Using Log MMSE Filtering

The document discusses speech enhancement through the removal of impulsive noise using log minimum mean square error filtering. It introduces the topic, discusses preprocessing of audio signals including pre-emphasis and signal segmentation. It then explains the use of mean-square error log-spectral amplitude filtering to minimize error and enhance the speech component. Performance is evaluated using signal-to-noise ratio and other quality metrics.

www.ijecs.in
International Journal Of Engineering And Computer Science ISSN:2319-7242
Volume 2 Issue 12, Dec.2013 Page No. 3435-3438

Speech Enhancement through Elimination Of Impulsive Disturbance Using Log MMSE Filtering
D. Koti Reddy¹, T. Jayanandan² and C. L. Vijay Kumar³
¹ Student, Prathyusha Institute Of Technology and Management, Thiruvallur, TamilNadu, India
[email protected]
² Assistant Professor, Prathyusha Institute Of Technology and Management, Thiruvallur, TamilNadu, India
[email protected]
³ Assistant Professor, Prathyusha Institute Of Technology and Management, Thiruvallur, TamilNadu, India
[email protected]

Abstract:
This project enhances speech by removing impulsive disturbance from noisy speech using a log minimum mean-square error (log-MMSE) filtering approach. Impulsive noise can degrade the performance and reliability of a speech signal. To recover the speech component from the impulsive disturbance, we apply pre-emphasis, signal segmentation and log-MMSE filtering. Preprocessing of the audio signal starts with pre-emphasis, a process designed to increase the magnitude of some frequencies with respect to the magnitude of other frequencies in order to improve the overall signal-to-noise ratio. The signal samples are then segmented into a fixed number of frames, and the samples of each frame are weighted with Hamming window coefficients. The minimum mean-square error log-spectral amplitude (log-MMSE) estimator, which minimizes the mean-square error of the log-spectra, is obtained as a weighted geometric mean of the gains associated with the speech signal. The performance of the filtering is measured with signal-to-noise ratio, Perceptual Evaluation of Speech Quality (PESQ) and correlation.
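The preprocessing chain named in the abstract (pre-emphasis, segmentation into frames, Hamming weighting) can be illustrated with a minimal sketch. The paper does not state its parameter values, so the pre-emphasis coefficient `alpha = 0.97`, the frame length and the hop size below are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def pre_emphasis(samples, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is an assumed, commonly used value; the paper gives none.
    """
    if not samples:
        return []
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out


def hamming_window(length):
    """Hamming coefficients w[n] = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def segment(samples, frame_len, hop):
    """Split the signal into fixed-length frames and weight each frame
    with Hamming coefficients. Trailing samples that do not fill a
    complete frame are dropped (an assumption of this sketch)."""
    window = hamming_window(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames
```

A typical call would be `segment(pre_emphasis(x), frame_len=256, hop=128)` for a list of samples `x`; the 50% overlap in that call is again only an illustrative choice.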

Index Terms—Inventory-style speech enhancement, modified imputation, uncertainty-of-observation techniques.

I. INTRODUCTION

The fundamental purpose of speech is communication, i.e., the transmission of messages. A message represented as a sequence of discrete symbols can be quantified by its information content in bits, and the rate of transmission of information is measured in bits/second (bps). In speech production, as well as in many human-engineered electronic communication systems, the information to be transmitted is encoded in the form of a continuously varying (analog) waveform that can be transmitted, recorded, manipulated, and ultimately decoded by a human listener. In the case of speech, the fundamental analog form of the message is an acoustic waveform, which we call the speech signal. Speech signals can be converted to an electrical waveform by a microphone, further manipulated by both analog and digital signal processing, and then converted back to acoustic form by a loudspeaker, a telephone handset or headphone, as desired. Signals are usually corrupted by noise in the real world. To reduce the influence of noise, two

research topics, speech enhancement and speech recognition in noisy environments, have arisen. For speech enhancement, i.e., the extraction of a signal buried in noise, adaptive noise cancellation (ANC) provides a good solution. In contrast to other enhancement techniques, its great strength lies in the fact that no a priori knowledge of the signal or noise is required. This advantage is gained with the aid of a secondary input that measures the noise source. The cancellation operation is based on the following principle. Since the desired signal is corrupted by the noise, if the noise can be estimated from the noise source, this estimated noise can then be subtracted from the primary channel, resulting in the desired signal. Traditionally, this task is done by linear filtering. In real situations, the corrupting noise is a nonlinearly distorted version of the source noise, so a nonlinear filter should be a better choice.

In typical speech enhancement methods based on the short-time Fourier transform (STFT), only the magnitude spectrum is modified and the phase spectrum is kept unchanged. It was believed that the magnitude spectrum includes most of the information of the speech, and that the phase spectrum contains little of it. Furthermore, the human auditory system is phase deaf. For these reasons, typical speech enhancement algorithms, such as spectral subtraction (SS), MMSE-STSA or the MAP algorithm, operate on the spectral magnitude component only and keep the phase component unchanged.

II. WAVELET BASED DENOISING

Wavelets are mathematical functions defined over a finite interval and having an average value of zero that transform data into different frequency components, representing each component with a resolution matched to its scale. The basic idea of the wavelet transform is to represent any arbitrary function as a superposition of a set of such wavelets or basis functions. These basis functions, or baby wavelets, are obtained from a single prototype wavelet, called the mother wavelet, by dilations or contractions (scaling) and translations (shifts). They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes. Many new wavelet applications, such as image compression, turbulence, human vision, radar, and earthquake prediction, have been developed in recent years.

In the wavelet transform the basis functions are wavelets. Wavelets tend to be irregular and asymmetric. All wavelet functions, w(2^k t - m), are derived from a single mother wavelet, w(t). This wavelet is a small wave or pulse like the one shown in Fig. 1.

Fig. 1 Mother wavelet w(t)

Normally it starts at time t = 0 and ends at t = T. The shifted wavelet w(t - m) starts at t = m and ends at t = m + T. The scaled wavelets w(2^k t) start at t = 0 and end at t = T/2^k. Their graphs are w(t) compressed by the factor of 2^k, as shown in Fig. 2. For example, when k = 1, the wavelet is shown in Fig. 2(a). If k = 2 and 3, they are shown in (b) and (c), respectively.

(a) w(2t) (b) w(4t) (c) w(8t)
Fig. 2 Scaled wavelets

The wavelets are called orthogonal when their inner products are zero. The smaller the scaling factor is, the wider the wavelet is. Wide wavelets are comparable to low-frequency sinusoids and narrow wavelets are comparable to high-frequency sinusoids. The reconstruction of the signal is achieved by the inverse discrete wavelet transform (IDWT). The values are first upsampled and then passed to the filters.

Fig. 3 Wavelet Reconstruction

The wavelet analysis involves filtering and downsampling, whereas the wavelet reconstruction process consists of upsampling and filtering. Upsampling is the process of lengthening a signal component by inserting zeros between samples, as shown in Fig. 4.

Fig. 4 Reconstruction using up sampling.
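The analysis (filtering, then downsampling) and reconstruction (upsampling, then filtering) steps described above can be sketched for a single decomposition level. The paper does not name the wavelet it uses, so the Haar filter pair below is an assumption made only because it is the shortest orthogonal wavelet; the function names are hypothetical and the input length is assumed to be even.

```python
import math

# Haar filter coefficient (assumed wavelet; the paper does not specify one).
S = 1.0 / math.sqrt(2.0)


def analyze(signal):
    """One analysis level: low-pass and high-pass filtering followed by
    downsampling by 2. Returns (approximation, detail) coefficients."""
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append(S * (a + b))   # low-pass output, every second sample
        detail.append(S * (a - b))   # high-pass output, every second sample
    return approx, detail


def upsample(coeffs):
    """Lengthen a component by inserting a zero after every sample."""
    out = []
    for c in coeffs:
        out.extend([c, 0.0])
    return out


def synthesize(approx, detail):
    """One reconstruction level: upsample both bands, pass them through
    the two-tap Haar synthesis filters, and add the results."""
    up_a, up_d = upsample(approx), upsample(detail)
    out = []
    for i in range(len(up_a)):
        prev_a = up_a[i - 1] if i > 0 else 0.0
        prev_d = up_d[i - 1] if i > 0 else 0.0
        # Low-pass synthesis taps (S, S); high-pass synthesis taps (S, -S).
        out.append(S * (up_a[i] + prev_a) + S * (up_d[i] - prev_d))
    return out
```

With an orthogonal filter pair such as this one, `synthesize(*analyze(x))` returns the original samples of `x` up to rounding error, which is the perfect-reconstruction property that the IDWT relies on.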

Wavelet denoising is considered a non-parametric method. Thus, it is distinct from parametric methods, in which parameters must be estimated for a particular model that must be assumed a priori.

X(t) = S(t) + N(t)

Assume that the observed data X(t) contains the true signal S(t) with additive noise N(t), as functions of time t to be sampled. Let W(·) and W^{-1}(·) denote the forward and inverse wavelet transform operators. Let D(·, λ) denote the denoising operator with soft threshold λ. We intend to wavelet-denoise X(t) in order to recover Ŝ(t) as an estimate of S(t).

III. LOG MMSE FILTERING AND SIGNAL SEGMENTATION

The problem is discussed in more generality than in many other expositions. Specifically, we allow for general filter delays (to accommodate the pitch filtering problem, for instance) and cover both the stochastic case and block-based analyses with a single formalism. For mean-square error computations, we will only need to use at most second-order statistical properties (correlations and means). For the case of stochastic signals, these notes look at the derivation of the correlation values required for a minimum mean-square error solution. We also examine systems which involve cyclostationary signals (an interpolation filter, for instance).

The important linear prediction problem is examined in detail. This includes the setup for non-equally spaced delay values. For the equally spaced delay case, we can develop a rich set of results. For the least-squares problem, these notes give a generalized view of windowing: windowing the data and/or windowing the error. This view subsumes the traditional special cases, viz. the autocorrelation and covariance methods. These notes present a number of examples based on “real” signals. With the background developed, the results are obtained with relatively straightforward MATLAB scripts. The results illustrate the useful insights that can be obtained when minimum mean-square error theory is appropriately fleshed out.

Fig. 5 Block Diagram

Consider a filter with an input x[n] and an output y[n] given by

y[n] = \sum_{k=0}^{M-1} w_k \, x[n - D_k],

where the w_k values weight the samples of the input signal at different delays D_k. We require that the delays be distinct.

A conventional causal FIR filter would simply have D_k = k for k = 0, ..., M − 1. We will keep the delays general until later in this document, when it becomes useful to specialize them to get further results. The goal is to find a set of filter coefficients that minimize the squared error between the output of the filter y[n] and a desired signal d[n].

First we write the filtering operation in vector form,

y[n] = \mathbf{w}^T \mathbf{x}[n],

where

\mathbf{w} = [w_0, w_1, \ldots, w_{M-1}]^T, \quad \mathbf{x}[n] = [x[n - D_0], x[n - D_1], \ldots, x[n - D_{M-1}]]^T.

To be able to handle both the stochastic case and block-based squared-error cases with a single formulation, we define an “averaging” operation with a bar over an expression. For the case of ensemble averaging, this is the expectation operator. For other cases, it will signify a sum of squares. In many of the cases we

study, the averaging operation will remove the dependency on n. For the case of wide-sense stationary processes, results are reviewed in Appendix B.

The error is

e[n] = d[n] - y[n].

IV. SIMULATION RESULTS

We analyzed the performance of the proposed method within a variety of noise scenarios. Results were compared to four established reference techniques. Two of these reference techniques were log-MMSE enhancers after Ephraim and Malah. One of these methods, referred to as log-MMSE(MS), employed the Minimum Statistics technique developed by Martin [11] to estimate the underlying noise power. The other method, referred to as log-MMSE(RA), employed the same VAD-supported Recursive Averaging that was also used in the “log-MMSE Filter” block of Fig. 5. The performance gains between the log-MMSE(RA) method and the proposed method are, therefore, directly attributable to the inventory search and the subsequent cepstral smoothing. As a third reference method we chose the Multiband Spectral Subtraction (MBSS) technique proposed by Kamath and Loizou [2], and lastly, we also implemented a slightly modified version of the inventory-style baseline system.

Fig. 6 Noisy Speech Signal

Fig. 7 Denoised Signal

V. CONCLUSION

This project presented an enhancement of the speech signal by removal of impulsive disturbance, based on a log-spectral gain filtering approach. Here, the mean-square error log-spectral amplitude estimator was used to minimize the mean-square error of the log-spectra; it is obtained as a weighted geometric mean of the gains associated with the speech signal. It provided better results than prior methods in terms of performance parameters, processing time and speech signal quality. In future work, this system will be enhanced with a modified filtering method to restore signals with better accuracy than the log-spectral approach.

VI. REFERENCES

[1] P. Vary and R. Martin, Digital Speech Transmission: Enhancement, Coding and Error Concealment. Chichester, U.K.: Wiley, 2006.
[2] P. C. Loizou, Speech Enhancement—Theory and Practice. Boca Raton, FL, USA: CRC, Taylor and Francis, 2007.
[3] X. Xiao and R. M. Nickel, “Speech enhancement with inventory style speech resynthesis,” IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 6, pp. 1243–1257, Aug. 2010.
[4] J. Ming, R. Srinivasan, and D. Crookes, “A corpus-based approach to speech enhancement from nonstationary noise,” IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 4, pp. 822–836, May 2011.
[5] J. Ming, R. Srinivasan, and D. Crookes, “A corpus-based approach to speech enhancement from nonstationary noise,” in Proc. INTERSPEECH, Makuhari, Japan, Sep. 2010, pp. 1097–1100.
[6] R. M. Nickel and R. Martin, “Memory and complexity reduction for inventory-style speech enhancement systems,” in Proc. EUSIPCO, Barcelona, Spain, Sep. 2011, pp. 196–200.
