AWH Khong - Tasl.2009.a Class of Sparseness-Controlled Algorithms For Echo Cancellation
Abstract—In the context of acoustic echo cancellation (AEC), it is shown that the level of sparseness in acoustic impulse responses can vary greatly in a mobile environment. When the response is strongly sparse, convergence of conventional approaches is poor. Drawing on techniques originally developed for network echo cancellation (NEC), we propose a class of AEC algorithms that can not only work well in both sparse and dispersive circumstances, but also adapt dynamically to the level of sparseness using a new sparseness-controlled approach. Simulation results, using white Gaussian noise (WGN) and speech input signals, show improved performance over existing methods. The proposed algorithms achieve these improvements with only a modest increase in computational complexity.

Index Terms—Acoustic echo cancellation (AEC), network echo cancellation (NEC), sparse impulse responses, adaptive algorithms.

I. INTRODUCTION

ECHO cancellation in telephone networks comprising mixed packet-switched and circuit-switched components requires the identification and compensation of echo systems with various levels of sparseness. The network echo response in such systems is typically of length 64–128 ms, characterized by a bulk delay dependent on network loading, encoding, and jitter buffer delays [1]. This results in an “active” region in the range of 8–12 ms duration and consequently, the impulse response is dominated by “inactive” regions where coefficient magnitudes are close to zero, making the impulse response sparse. The echo canceller must be robust to this sparseness [2]. This network echo cancellation (NEC) issue is particularly important in legacy networks comprising packet-switched and circuit-switched components, whereas in pure packet-switched networks NEC is not normally required.

Traditionally, adaptive filters have been deployed to achieve NEC by estimating the network echo response using algorithms such as the normalized least-mean-square (NLMS) algorithm. Several approaches have been proposed over recent years to improve the performance of the standard NLMS algorithm in various ways for NEC. These include Fourier [3] and wavelet [4] based adaptive algorithms, variable step-size (VSS) algorithms [5]–[7], data reusing techniques [8], [9], partial update adaptive filtering techniques [10], [11] and subband adaptive filtering (SAF) schemes [12]. These approaches aim to address issues in echo cancellation including the performance with colored input signals, time-varying echo paths and computational complexity, to name but a few. In contrast to these approaches, sparse adaptive algorithms have been developed specifically to address the performance of adaptive filters in sparse system identification. For sparse echo systems, the NLMS algorithm suffers from slow convergence [13].

One of the first sparse adaptive filtering algorithms for NEC is proportionate NLMS (PNLMS) [2], in which each filter coefficient is updated with an independent step-size that is linearly proportional to the magnitude of that estimated filter coefficient. It is well known that PNLMS has very fast initial convergence for sparse impulse responses, after which its convergence rate reduces significantly, sometimes resulting in a slower overall convergence than NLMS. In addition, PNLMS suffers from slow convergence when estimating dispersive impulse responses [13], [14]. To address the latter problem, subsequent improved versions, such as PNLMS++ [14], were proposed. The PNLMS++ algorithm achieves improved convergence by alternating between NLMS and PNLMS for each sample period. However, as shown in [15], the PNLMS++ algorithm only performs best in the cases when the impulse response is sparse or highly dispersive.

An improved PNLMS (IPNLMS) [15] algorithm was proposed to exploit the “proportionate” idea by introducing a controlled mixture of proportionate (PNLMS) and non-proportionate (NLMS) adaptation. A sparseness-controlled IPNLMS algorithm was proposed in [16] to improve the robustness of IPNLMS to the sparseness variation in impulse responses. Composite PNLMS and NLMS (CPNLMS) [17] adaptation was proposed to control the switching of PNLMS++ between the NLMS and PNLMS algorithms. For sparse impulse responses, CPNLMS performs the PNLMS adaptation to update the large coefficients and subsequently switches to NLMS, which has better performance for the adaptation of the remaining small taps. The $\mu$-law PNLMS (MPNLMS) [18] algorithm was proposed to address the uneven convergence rate of PNLMS during the estimation process. As proposed in [18], MPNLMS uses optimal step-size control factors to achieve faster overall convergence until the adaptive filter reaches its steady state.

With the development of hands-free mobile telephony in recent years, another type of echo, acoustic echo, seriously degrades user experience due to the coupling between the loudspeaker and microphone. For this reason, effective acoustic echo cancellation (AEC) [19] is important to maintain usability and to improve the perceived voice quality of a call. Although sparse adaptive filtering algorithms, such as those described above,

Manuscript received September 09, 2008; revised May 29, 2009. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Jingdong Chen.
P. Loganathan and P. A. Naylor are with the Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, U.K.
A. W. H. Khong is with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798.
Digital Object Identifier 10.1109/TASL.2009.2025903
1558-7916/$26.00 © 2009 IEEE
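The contrast drawn in the Introduction between NLMS (one common step-size) and PNLMS (a step-size per coefficient, proportional to the magnitude of the current estimate) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `proportionate_gains` and `adapt`, the gain floor `rho`, the regularization constants, and the normalization of the gains to unit mean are all assumptions made for illustration.

```python
import numpy as np

def proportionate_gains(h_hat, rho=0.01, delta_p=0.01):
    """PNLMS-style per-coefficient gains (hypothetical helper).

    Each gain grows with |h_hat[l]|; the floor (assumed form) keeps
    small or zero coefficients from stalling.
    """
    mags = np.abs(h_hat)
    floor = rho * max(delta_p, mags.max())
    gamma = np.maximum(floor, mags)
    return gamma / gamma.mean()  # normalize so the mean gain is 1

def adapt(x, d, L, mu=0.3, delta=1e-3, proportionate=False):
    """Identify an L-tap echo path from input x and echo signal d.

    proportionate=False gives plain NLMS (all gains equal to 1);
    proportionate=True applies the per-coefficient gains above.
    """
    h_hat = np.zeros(L)
    x_buf = np.zeros(L)  # x_buf[k] holds x[n - k]
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        e = d[n] - h_hat @ x_buf  # a priori error
        q = proportionate_gains(h_hat) if proportionate else np.ones(L)
        h_hat = h_hat + mu * q * x_buf * e / (x_buf @ (q * x_buf) + delta)
    return h_hat

# Usage: a sparse two-tap system driven by white Gaussian noise.
rng = np.random.default_rng(0)
h_true = np.zeros(64)
h_true[10], h_true[12] = 1.0, -0.5
x_in = rng.standard_normal(4000)
d_out = np.convolve(x_in, h_true)[:len(x_in)]
h_nlms = adapt(x_in, d_out, L=64)
h_pnlms = adapt(x_in, d_out, L=64, proportionate=True)
```

Setting all gains to one recovers NLMS, which makes the two algorithms directly comparable within a single update loop.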
1592 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 17, NO. 8, NOVEMBER 2009
(1)
(2)
The IPNLMS [15] algorithm was originally developed for NEC and was further developed for the identification of acoustic room impulse responses [20]. It employs a combination of proportionate (PNLMS) and non-proportionate (NLMS) adaptation, with the relative significance of each controlled by a factor $\alpha_{\mathrm{IP}}$ such that the diagonal elements of $\mathbf{Q}(n)$ are given as

$$q_l(n) = \frac{1-\alpha_{\mathrm{IP}}}{2L} + \frac{(1+\alpha_{\mathrm{IP}})\,|\hat{h}_l(n)|}{2\,\|\hat{\mathbf{h}}(n)\|_1 + \delta_{\mathrm{IP}}}, \quad l = 0, 1, \ldots, L-1 \qquad (9)$$

where $\|\cdot\|_1$ is defined as the $l_1$-norm and the first and second terms are the NLMS and the proportionate terms, respectively. It can be seen that IPNLMS behaves like NLMS when $\alpha_{\mathrm{IP}} = -1$ and PNLMS when $\alpha_{\mathrm{IP}} = 1$. Use of a higher weighting for NLMS adaptation, such as $\alpha_{\mathrm{IP}} = 0$ or $\alpha_{\mathrm{IP}} = -0.5$, is a favorable choice for most AEC/NEC applications [15]. It has been

where and , then . On the other hand, when , then . In reality, the impulse response and hence its sparseness is time-varying and depends on factors such as temperature, pressure and reflectivity [21]. As explained in Section I, the sparseness of AIRs varies with the location of the receiving device in an open or enclosed environment. We show below how the sparseness can also vary with the loudspeaker–microphone distance in an enclosed space.

Consider an example case where the distance between a fixed-position loudspeaker and the talker using a wireless microphone is varying. Fig. 3 shows two AIRs, generated using the method of images [24], [25] with 1024 coefficients using room dimensions of 8 × 10 × 3 m and 0.57 as the reflection coefficient. The loudspeaker is fixed at 4 × 9.1 × 1.6 m in the LRMS while the microphone is positioned at 4 × 8.2 × 1.6 m and 4 × 1.4 × 1.6 m, giving the impulse responses shown in Fig. 3(a) and (b), respectively.
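The IPNLMS gain rule described above, a weighted mixture of a uniform (NLMS) term and a proportionate (PNLMS) term, can be sketched as follows. The sketch assumes the form commonly cited in the IPNLMS literature, (1 − α)/(2L) + (1 + α)|ĥ_l| / (2‖ĥ‖₁ + ε); the function name `ipnlms_gains`, the argument names, and the small constant `eps` are hypothetical and not taken from the paper.

```python
import numpy as np

def ipnlms_gains(h_hat, alpha=-0.5, eps=1e-8):
    """Per-coefficient IPNLMS gains (illustrative sketch).

    alpha = -1 gives uniform gains (NLMS-like behavior);
    alpha = +1 gives gains proportional to |h_hat| (PNLMS-like behavior).
    eps is an assumed regularization constant that avoids division by zero.
    """
    h_hat = np.asarray(h_hat, dtype=float)
    L = h_hat.size
    uniform_term = (1.0 - alpha) / (2.0 * L)
    proportionate_term = (1.0 + alpha) * np.abs(h_hat) / (2.0 * np.sum(np.abs(h_hat)) + eps)
    return uniform_term + proportionate_term

# Usage: gains for a small example estimate with one dominant tap.
example_estimate = np.array([0.0, 0.8, 0.0, -0.2])
example_gains = ipnlms_gains(example_estimate, alpha=0.0)
```

Because the two terms sum to approximately one over all coefficients for any `alpha`, changing `alpha` redistributes adaptation effort between taps without changing the overall step-size budget.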
Fig. 5. Convergence of the PNLMS for different values of $\rho$ using WGN input signal. Impulse responses in Fig. 3(a) and (b) are used as sparse and dispersive AIRs, respectively. [$\mu = 0.3$, SNR = 20 dB].
Fig. 4. Sparseness measure against the distance between loudspeaker and microphone, a. The impulse responses are obtained from the image model using fixed room dimensions of 8 × 10 × 3 m.
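As a hedged illustration of how a sparseness measure such as the one plotted in Fig. 4 can be computed, the sketch below assumes the ℓ1/ℓ2-norm-based measure widely used in the sparse adaptive filtering literature, ξ = L/(L − √L) · (1 − ‖h‖₁ / (√L ‖h‖₂)). The function name `sparseness` is hypothetical, and the definition may differ in detail from the one used to produce the figure.

```python
import numpy as np

def sparseness(h):
    """Sparseness of an impulse response, between 0 and 1 (assumed l1/l2 form).

    Returns 1 for a single nonzero coefficient and 0 when all
    coefficients have equal magnitude.
    """
    h = np.asarray(h, dtype=float)
    L = h.size
    l1 = np.sum(np.abs(h))
    l2 = np.sqrt(np.sum(h ** 2))
    if L < 2 or l2 == 0.0:
        raise ValueError("need at least two coefficients and a nonzero response")
    return L / (L - np.sqrt(L)) * (1.0 - l1 / (np.sqrt(L) * l2))

# Usage: a quickly decaying response is sparser than a slowly decaying one.
taps = np.arange(1024)
xi_fast_decay = sparseness(np.exp(-0.05 * taps))
xi_slow_decay = sparseness(np.exp(-0.002 * taps))
```

A faster decay concentrates the energy in fewer coefficients, so `xi_fast_decay` exceeds `xi_slow_decay`; this mirrors the qualitative difference between a sparse and a dispersive AIR.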
TABLE I
SPARSENESS-CONTROLLED ALGORITHMS
(14)
TABLE II
COMPLEXITY OF ALGORITHMS’ COEFFICIENTS UPDATE—ADDITION
(A), MULTIPLICATION (M), DIVISION (D), LOGARITHM (LOG),
AND COMPARISON (C)
Fig. 9. Time to reach −20-dB normalized misalignment level for different values of $\lambda$ in SC-PNLMS using WGN input signal. Impulse responses in Fig. 3(a) and (b) are used as sparse AIR and dispersive AIR, respectively. [$\mu = 0.3$, SNR = 20 dB].
Fig. 10. Convergence of the SC-PNLMS for different values of $\lambda$ using WGN input signal with an echo path change at 3.5 s. Impulse response is changed from Fig. 3(a) to (b) and $\mu = 0.3$, SNR = 20 dB.
Fig. 15. Relative convergence of NLMS, IPNLMS, and SC-IPNLMS using WGN input signal with an echo path change at 3.5 s. Impulse response is changed from that shown in Fig. 3(a) to that in Fig. 3(b) and $\mu_{\mathrm{NLMS}} = \mu_{\mathrm{IPNLMS}} = 0.3$, $\mu_{\mathrm{SC\text{-}IPNLMS}} = 0.7$, SNR = 20 dB.
Fig. 17. Time to reach the −20-dB normalized misalignment against different sparseness measures of eight systems for NLMS, PNLMS, SC-PNLMS, IPNLMS, and SC-IPNLMS.
Fig. 18. Time to reach the −20-dB normalized misalignment against different sparseness measures of eight systems for NLMS, MPNLMS, and SC-MPNLMS.
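The figures above report the time taken to reach a normalized misalignment of −20 dB. A minimal sketch of that metric is given below, assuming the standard definition 10 log10(‖h − ĥ‖² / ‖h‖²). The function names and the sampling rate `fs` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def normalized_misalignment_db(h, h_hat):
    """Normalized misalignment between true and estimated responses, in dB."""
    h = np.asarray(h, dtype=float)
    h_hat = np.asarray(h_hat, dtype=float)
    return 10.0 * np.log10(np.sum((h - h_hat) ** 2) / np.sum(h ** 2))

def time_to_reach(misalignment_db, threshold_db=-20.0, fs=8000):
    """Time in seconds at which a misalignment curve first reaches the threshold.

    misalignment_db holds one value per sample; fs is an assumed sampling
    rate. Returns None if the threshold is never reached.
    """
    below = np.flatnonzero(np.asarray(misalignment_db) <= threshold_db)
    return None if below.size == 0 else below[0] / fs

# Usage: evaluate an estimate whose single nonzero tap is off by 10%.
mis_db = normalized_misalignment_db([1.0, 0.0], [0.9, 0.0])
```

Reporting the first crossing time rather than the whole curve gives one number per algorithm and per impulse response, which is what allows convergence to be plotted against sparseness.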
Pradeep Loganathan received the M.Eng. degree in information systems engineering from Imperial College London, London, U.K., in 2007.
Since 2007, he has been with Imperial College London as a Research Postgraduate. His research interests are mainly in the area of adaptive algorithms, both in time and frequency domains, with applications to single-channel acoustic echo cancellation.

research was mainly on partial-update and selective-tap adaptive algorithms with applications to mono- and multichannel acoustic echo cancellation for hands-free telephony. He has also published works on acoustic blind channel identification for speech dereverberation. His other research interests include speech enhancement and blind deconvolution algorithms.