IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, VOL. 26, NO. 7, JULY 2022 2929

An ECG Signal Denoising Method Using Conditional Generative Adversarial Net

Xiaoyu Wang, Bingchu Chen, Ming Zeng, Yuli Wang, Hui Liu, Ruixia Liu, Member, IEEE, Lan Tian, and Xiaoshan Lu

Abstract—In this paper, a novel denoising method for electrocardiogram (ECG) signals is proposed to improve performance and availability under multiple noise cases. The method is based on the framework of the conditional generative adversarial network (CGAN), which we improved for ECG denoising. The proposed framework consists of two networks: a generator composed of an optimized convolutional auto-encoder (CAE), and a discriminator composed of four convolution layers and one fully connected layer. Because the convolutional layers of the CAE preserve spatial locality and the neighborhood relations in the latent higher-level feature representations of the ECG signal, and the skip connections facilitate gradient propagation during denoising training, the trained denoising model has good performance and generalization ability. Extensive experimental results on the MIT-BIH databases show that, for single noise and mixed noises, the average signal-to-noise ratio (SNR) of the denoised ECG signal is above 39 dB, which is better than that of the state-of-the-art methods. Furthermore, the classification results on denoised signals for four cardiac diseases show that the average accuracy increased by more than 32% under multiple noises at SNR = 0 dB. Thus, the proposed method removes noise effectively while keeping the details of the ECG signal features.

Index Terms—Convolutional auto-encoder (CAE), generative adversarial network (GAN), ECG signal denoising.

Manuscript received April 29, 2021; revised October 31, 2021, January 12, 2022, and March 8, 2022; accepted April 18, 2022. Date of publication April 21, 2022; date of current version July 4, 2022. This work was supported in part by the Natural Science Foundation of Shandong Province under Grants ZR2021ZD40 and ZR2021MF065, in part by the Research Project for Graduate Education and Teaching Reform, Shandong University, China under Grant XYJG2020108, and in part by the Innovation Pilot Project for Integration of Science, Education and Industry, Qilu University of Technology (Shandong Academy of Sciences), China under Grant 2020KJC-ZD03. (Xiaoyu Wang, Bingchu Chen, and Ming Zeng are co-first authors.) (Corresponding authors: Lan Tian; Xiaoshan Lu.)
Xiaoyu Wang, Bingchu Chen, Ming Zeng, and Lan Tian are with the School of Microelectronics, Shandong University, Jinan 250100, China, and also with the Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).
Xiaoshan Lu is with the School of Information Science and Engineering, Shandong University, Qingdao 266237, China (e-mail: luxiaoshan@126.com).
Yuli Wang, Hui Liu, and Ruixia Liu are with the Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China (e-mail: [email protected]; [email protected]; [email protected]).
Digital Object Identifier 10.1109/JBHI.2022.3169325
2168-2194 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

ELECTROCARDIOGRAM (ECG) plays an irreplaceable role in the recognition [1], diagnosis, and classification [2] of cardiac diseases (e.g., atrial fibrillation [3], premature ventricular and atrial beats). With the development of telemedicine, remote ECG monitoring provides important auxiliary information for the automatic diagnosis of cardiac diseases [4]. However, ECG signals are often polluted by noise from electrode motion (EM), baseline wander (BW), and muscle artifacts (MA) [5], which degrades the accuracy of ECG recognition and diagnosis [6]. For this reason, noise interference should be removed from noisy ECG signals.

There are many traditional denoising methods for ECG signals, for instance, adaptive filtering, Empirical Mode Decomposition (EMD), S-Transform (ST), Wavelet Transform (WT), and Fourier decomposition. Rahman et al. [7] denoised remote ECG signals with a number of adaptive recurrent filters. However, adaptive filtering often requires reference noise as an input signal, which is hard to obtain in an ECG signal acquisition system. Kabir et al. [8] removed the noise from the initial intrinsic mode functions (IMFs) through a windowing method integrated with EMD. This method preserved the QRS complex and produced comparatively pure ECG signals. The EMD method uses the Hilbert transform, but the Hilbert transform cannot separate signals of similar frequency, so it is easy to accidentally filter out the P-waves and T-waves of ECG signals. ST was used in [9] to represent the noisy ECG signals; masking and filtering techniques were then used to reduce the noise in the time-frequency domain. However, the spectra of MA noise and ECG signals overlap, so the T-wave is attenuated and the QRS peak becomes smaller. Gokhale et al. [10] proposed a discrete WT (DWT) method for removing 50 Hz power line interference (PLI) noise, and Smital et al. [11] proposed an adaptive wavelet Wiener filtering method for reducing broadband electromyographic (EMG) noise in ECG signals. However, the ST-wave and BW noise, as well as the QRS wave group and MA noise, are difficult to distinguish completely based on frequency domain features. Adaptive Fourier Decomposition (AFD) was used in [12] for denoising. This method decomposed the

signal in accordance with the energy distribution and separated the ECG signal from noise that has the same frequency range but a different energy distribution. However, noise with the same energy distribution cannot be removed. To sum up, with the above traditional denoising methods based on different transformations, it is difficult to distinguish ECG signals from noise when the spectra or energy distributions of the noise and the ECG signals overlap. Therefore, they cannot effectively remove noise from ECG signals under these conditions.

Cardiac monitoring systems and a large amount of telemedicine data provide databases for ECG research, and denoising methods using deep learning have also been studied. Suranai et al. [13] proposed a denoising method based on a wavelet neural network (WNN), which integrated the adaptive learning ability of the neural network and the multi-resolution feature of the wavelet, but it only removed high-frequency noise. In later work [14], the same author proposed an adaptive filtering method for ECG signal denoising based on DWT and a neural network. This method can remove EM, MA, BW, and mixed noises. The stacked contractive denoising auto-encoder (CDAE) proposed by Peng et al. [15] and its improved version [16], the denoising auto-encoder (DAE), can also remove EM, MA, BW, and mixed noises in ECG signals. The DAE can capture as much information as possible from each given sample when the sample is a simple corruption of the input distribution. Therefore, the more similar the waveform of the training sample is to the test sample, the better the denoising effect. So, this method relies on sample selection to achieve a better denoising effect.

However, the learning method based on generative adversarial networks (GAN) is different from that of the DAE. It uses an adversarial training mode to update the generator (G) so that it generates data that meets the requirements of the learning purpose. In this way, G can automatically learn the complex distribution of raw ECG signals. GAN was introduced by Ian Goodfellow et al. in [17] for generating samples via an adversarial process. They trained two sub-networks: a generator G to generate a realistic sample, and a discriminator D to estimate the probability that a sample came from the real data rather than from G. The network corresponded to a minimax two-player game. Wang et al. [18] proposed an ECG denoising method based on an adversarial method. They used a GAN model to remove single noise and mixed noise effectively, and this method achieved the highest SNR of the denoised signal. But we think that this structure may achieve a better denoising effect after adding a condition to direct the data generation process. In 2020, Pratik et al. [19] proposed a convolutional neural network (CNN) based on the GAN model, which can carry out end-to-end noise reduction for EM, BW, and MA noises. However, the noise reduction effect on mixed noise was not studied, and the influence of the input variables z on the denoising task was not explored. In 2021, Xu et al. [20] used a combination of GAN and a residual network for ECG denoising. Their denoising results were about 29 dB, 32 dB, and 60 dB when SNR = 0 dB, 1.25 dB, and 5 dB, respectively. We found that this method is not ideal at low SNR. In the above GAN-based methods, the conditional generative adversarial network (CGAN) is not explored for ECG denoising.

CGAN was proposed by Mirza et al. [21] in 2014 as a method of conditioning on additional information to control the data generation process in a supervised manner. CGAN has witnessed great progress in the field of image processing, such as style transfer [22], super-resolution [23], and image inpainting [24]. It is worth mentioning that CGAN is also used in image enhancement tasks, for instance, image de-raining [25], image dehazing [26], and infrared image enhancement [27]. Qin et al. [28] proposed a model based on a one-dimensional CNN with CGAN: CGAN was used for sample augmentation of plant electrical signals, and then the one-dimensional CNN was used for classification. Inspired by this, we apply CGAN to our ECG denoising task. The convolutional auto-encoder (CAE) is also adopted in our method.

Based on the above analysis, we propose a new CGAN based on CAE (i.e., CAE-CGAN). The CAE-CGAN is a GAN-based framework composed of a generator and a discriminator. The generator is constructed with an optimized CAE, which is used to generate the denoised ECG signals. The discriminator is an auxiliary network that helps the generator produce denoised ECG signals that are more similar to the raw signals. Once training is done, noisy ECG signals can be denoised by forward propagation through the trained generator. Our experimental results show that the proposed method clearly improves ECG denoising performance.

To sum up, the contributions of this paper are as follows:
• An end-to-end ECG denoising method based on CAE-CGAN is proposed. We conducted a comparative experimental study on how CGAN is used in the ECG noise reduction task, and obtained the optimized CGAN structure.
• The CAE structure is applied as the generator of the CGAN framework. The optimized CAE structure is determined by optimizing the parameters and designing the structure. The experimental results show that the optimized structure has a better noise reduction effect than other structures.
• Extensive experiments are conducted on the MIT-BIH datasets. Compared with the existing methods, our performance metrics are significantly improved for various cases of single noise and mixed noise. For multiple noise cases at SNR = 0 dB, the average accuracy of the classification results on denoised signals for four cardiac diseases increased by more than 32%.
• The denoising results for new records and a new lead show that our method has good generalization ability.

The organization of this paper is as follows. The related knowledge of GAN is introduced in Section II. The details of the proposed method are presented in Section III. Section IV analyzes the experimental results, and Section V concludes the work.

II. THE RELATED KNOWLEDGE OF GAN

GAN is a generative model proposed by Ian Goodfellow [17]. It uses an adversarial training mode to update the generator (G) so that it generates data that meets the requirements of specific purposes. The discriminator (D) is usually a binary classifier, whose task is to


distinguish the true samples of the training data from the false samples generated by G. G learns the training data distribution and imitates the training data to generate false samples. G is committed to generating more realistic sample data to fool D. This adversarial learning process is modeled as a minimax game between G and D. That is, the following objective function is optimized:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \tag{1}$$

where p_data(x) is the distribution of the real data x, and p_z(z) is the prior distribution of the input variables z.

After the emergence of GAN, a variety of improved versions have been proposed, among which one of the most important is the CGAN proposed by Mirza et al. [21]. GAN can be extended to a conditional version by adding additional condition information y, which can be any additional information such as class labels or data from other distributions. The objective function of CGAN is:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x, y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z, y), y))] \tag{2}$$

In the denoising task, y is set to the noisy signal x̃.

To overcome disadvantages such as gradient vanishing, the least squares GAN (LSGAN) method [29] replaces the cross-entropy loss with the least squares loss. The LSGAN objective functions of D and G are as follows:

$$\min_D V_{LSGAN}(D) = \frac{1}{2}\mathbb{E}_{x \sim p_{data}(x)}[(D(x, y) - 1)^2] + \frac{1}{2}\mathbb{E}_{z \sim p_z(z)}[(D(G(z, y), y))^2] \tag{3}$$

$$\min_G V_{LSGAN}(G) = \frac{1}{2}\mathbb{E}_{z \sim p_z(z)}[(D(G(z, y), y) - 1)^2] \tag{4}$$

III. PROPOSED METHOD

In this section, the architecture of the proposed network, including the generator and the discriminator, and the loss functions are analyzed in turn.

A. The Overall Structure

The overall architecture of CAE-CGAN is shown in Fig. 1. The left side shows the noise reduction process of the ECG signal, and the right side shows the training process of the model. On the right, the proposed CAE-CGAN contains a generator G and a discriminator D. We adopt a convolutional auto-encoder (CAE) to build a generator that contains an encoding stage and a decoding stage. Its input is the noisy ECG signal x̃, and its output is the denoised signal x̂. The discriminator receives a pair of signals as input, composed of either raw data and noisy data (x, x̃) or denoised data and noisy data (x̂, x̃). The discriminator updates its parameters through the discriminator loss function (Loss D), formed from its own output. The generator updates its parameters through the generator loss function (Loss G), composed of its own output and the discriminator's output. When training is finished, the generator model is saved. On the left side of Fig. 1, we slice the noisy signal into samples of M sampling points each and send them to the saved generator model for denoising, which outputs the denoised signal.

Fig. 1. The overall structure of CAE-CGAN for ECG denoising.

The adversarial training process of G and D is shown in Fig. 2. The training process is divided into two steps. First, D is trained. We input (x, x̃) and (x̂, x̃) into D, and then feed their outputs back to D through the loss function to update D's parameters. During this process, the parameters of G are frozen, as shown in Fig. 2(a) and (b). Second, the generator is trained. We fix the parameters of D, input (x̂, x̃) into D, and feed the result back to G through the loss function. The parameters of G are then updated, as shown in Fig. 2(c). G and D are trained alternately until Nash equilibrium is reached [30], at which point training is complete.
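The least squares objectives in Eqs. (3) and (4) can be checked numerically with a minimal sketch. The code below is illustrative only: it works on plain lists of discriminator scores, replaces each expectation with a batch average, and uses hypothetical function names (`lsgan_d_loss`, `lsgan_g_loss`) that do not come from the paper.

```python
def lsgan_d_loss(d_real, d_fake):
    """Eq. (3): 0.5 * E[(D(x, y) - 1)^2] + 0.5 * E[D(G(z, y), y)^2].

    d_real: discriminator scores for (raw, condition) pairs.
    d_fake: discriminator scores for (generated, condition) pairs.
    Expectations are approximated by batch averages (an assumption).
    """
    real_term = sum((s - 1.0) ** 2 for s in d_real) / len(d_real)
    fake_term = sum(s ** 2 for s in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_g_loss(d_fake):
    """Eq. (4): 0.5 * E[(D(G(z, y), y) - 1)^2]."""
    return 0.5 * sum((s - 1.0) ** 2 for s in d_fake) / len(d_fake)


# A discriminator that scores real pairs as 1 and generated pairs as 0
# has zero loss, while the generator is penalised for those same scores.
print(lsgan_d_loss([1.0, 1.0], [0.0, 0.0]))
print(lsgan_g_loss([0.0, 0.0]))
```

In the two-step schedule described above, the first function would drive the update of D while G is frozen, and the second would drive the update of G while D is fixed.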


Fig. 2. Adversarial training for ECG denoising. Dashed lines represent the flow of gradient back-prop.

B. Generator and Its Loss Function

We adopt a convolutional auto-encoder (CAE) to build the generator. Convolutional layers can preserve spatial locality and the neighborhood relations of the input in their latent higher-level feature representations [31].

GANs are generative models that learn a mapping from input variables z to real data x (G : z ⇒ x). In contrast, conditional GANs learn a mapping from condition y and input variables z to real data x (G : {z, y} ⇒ x). In [32], CGAN with input variables z was used for image-to-image translation. In the image enhancement task [27], by contrast, the input variables z were removed. In order to explore whether the input variables z are needed for the ECG denoising task based on CGAN, we design two models. The first model is shown in Fig. 3(a). In the encoding process, the features of the input data are extracted and compressed through the convolutional layers. The compressed features are called the thought vector c, which is concatenated with the input variables z. The second model, shown in Fig. 3(b), removes the input variables z.

Fig. 3. Two generator models. (a) generator with z. (b) generator without z.

Experiments are carried out in Section IV.B to compare the denoising performance of the two models in Fig. 3(a) and Fig. 3(b). The experimental results confirm that the model without input variables z not only improves the denoising performance but also retains more waveform details. Therefore, we selected the model without input variables z.

The network architecture of the generator is shown in Fig. 4. The encoder is composed of 7 convolutional layers. We conduct an optimization experiment on the dimensions (see Section IV.C for details); the resulting dimensions for each layer are 512 × 1, 512 × 16, 512 × 32, 512 × 64, 256 × 128, 128 × 256, 64 × 512, and 32 × 1024. The decoder mirrors the encoder, and corresponding layers have the same filter widths and numbers. A PReLU activation layer is placed between consecutive convolution layers. In addition, in order to facilitate gradient propagation, we introduce a skip connection between each encoding layer and its corresponding decoding layer. Because pooling loses details, there is no pooling layer in the generator.

Fig. 4. Network architecture for generator of CAE-CGAN.

Starting from the LSGAN objective function, the input variables z are removed, and the distance function l_dist and the maximum local difference l_max [18] are added. l_dist measures the difference between the denoised signal x̂ and the raw signal x: the smaller l_dist is, the better the quality of the denoised signal. l_max tracks the maximum difference between the denoised and raw signals: the smaller l_max is, the more details of the ECG signal are preserved, and the greater the medical value of the denoised signal. l_dist and l_max are calculated as follows:

$$l_{dist} = \sum_{n=1}^{N} |\hat{x}_n - x_n| \tag{5}$$

$$l_{max} = \max(|\hat{x}_1 - x_1|, |\hat{x}_2 - x_2|, \ldots, |\hat{x}_n - x_n|, \ldots, |\hat{x}_N - x_N|) \tag{6}$$

where N represents the number of samples, x̂_n represents the n-th sample of the denoised signal, and x_n represents the n-th sample of the raw signal.

The loss function of G is as follows:

$$\min_G V(G) = \mathbb{E}_{\tilde{x} \sim p_{noisy}(\tilde{x})}[(D(G(\tilde{x}), \tilde{x}) - 1)^2] + \lambda_1 l_{dist} + \lambda_2 l_{max} \tag{7}$$

where p_noisy(x̃) is the distribution of the noisy data x̃. λ1 and λ2 are coefficients that weight l_dist and l_max in the objective function; they are set experimentally to 0.7 and 0.2, respectively.

C. Discriminator and Its Loss Function

The discriminator has a two-channel input. In other words, the discriminator receives a pair of signals as input. This pair may be a raw signal and a noisy signal (x, x̃) or a denoised signal and a noisy signal (x̂, x̃). The noisy signal in the input is the additional conditional information used to direct the data generation process. As shown in Fig. 5, the


discriminator uses four convolution layers and one fully connected layer, and each convolution layer is followed by a virtual batch normalization (VBN) layer and a LeakyReLU activation layer. After the final fully connected layer, the sigmoid function outputs a probability value between 0 and 1.

Fig. 5. Network architecture for discriminator of CAE-CGAN.

At the beginning of training, the discriminator is trained, as shown in Fig. 2(a) and (b). The loss function of the discriminator is used to update the discriminator so that it considers (x, x̃) as true and (x̂, x̃) as false.

The modified loss function is as follows:

$$\min_D V(D) = \mathbb{E}_{x \sim p_{data}(x)}[(D(x, \tilde{x}) - 1)^2] + \mathbb{E}_{\tilde{x} \sim p_{noisy}(\tilde{x})}[D(G(\tilde{x}), \tilde{x})^2] \tag{8}$$

IV. EXPERIMENTAL RESULTS AND DISCUSSION

A. Experimental Setup

1) Performance Metrics: Root mean square error (RMSE) and signal-to-noise ratio (SNR) are used as performance metrics, and the formulas are as follows:

$$RMSE = \sqrt{\frac{1}{N}\sum_{n=1}^{N} (\hat{x}_n - x_n)^2} \tag{9}$$

$$SNR = 10\log_{10}\frac{\sum_{n=1}^{N} x_n^2}{\sum_{n=1}^{N} (\hat{x}_n - x_n)^2} \tag{10}$$

where x represents the raw signal and x̂ represents the denoised signal. These metrics are the key evaluation parameters of denoising methods. The higher the SNR and the smaller the RMSE, the better the denoising effect.

2) Database and Experimental Samples: Raw data are obtained from the MIT-BIH Arrhythmia Database [33]. Each record lasts 30 minutes and is sampled at 360 Hz. Noise data are obtained from the MIT-BIH Noise Stress Test Database [34]; the BW, EM, and MA noise records are used. The following two types of data sets are used in this article.

(I) Modeling experimental data set
In order to compare the proposed method with Improved DAE [16] and Adversarial Method [18], raw data are obtained from the same 10 records, numbered 103, 105, 111, 116, 122, 205, 213, 219, 223, and 230 in the MIT-BIH Arrhythmia Database [33]. The data of these records are from modified limb lead II (MLII).

(II) Generalization data set (see Section IV.F for more details)
In order to explore the generalization ability of this method for different test data, we conduct experiments on the following two new data sets.

(a) New records data set
Another 10 new records are selected to explore the effects of the model on different data sets. The 10 new records are 100, 101, 106, 112, 117, 121, 123, 209, 220, and 228. The data of these records are also from MLII.

(b) New lead data set
Lead V1 is not used in the previous experiments, so it is selected to explore the effects of the model on data from a different lead. Since record 103 does not have lead V1, we chose record 101 instead. The other nine records are the same, so the 10 records are 101, 105, 111, 116, 122, 205, 213, 219, 223, and 230.

For each of the above data sets, 80% of the data were used for training and 20% for testing.

The experimental samples are constructed according to the following steps:

(1) Following [18], the noisy signals used in our experiments are as follows:

$$\tilde{x}_n = x_n + n_n \tag{11}$$

where x̃_n represents the noisy signal, x_n represents the raw signal in the MIT-BIH Arrhythmia Database, and n_n represents the noise. The SNR of the noisy signal x̃ is calculated as follows:

$$SNR = 10\log_{10}\frac{\sum_{n=1}^{N} x_n^2}{\sum_{n=1}^{N} n_n^2} \tag{12}$$

By setting the SNR to different values, such as 0, 1, 2, 3, 4, and 5 dB, different noisy signals can be calculated.

(2) The methods in [15], [16], and [18] use one heartbeat as a training sample, which requires locating the position of the R peak. But accurate R peak detection is difficult in an ECG signal with strong noise. Therefore, we propose a more practical way of slicing ECG data: a sample is segmented by a fixed number of sampling points, as follows.

Assume that the sampling frequency of the database is F_s, the number of sample points is M, and the heartbeat period is T. Generally, the heart rate is about 50 to 120 beats per minute, so the heartbeat period is about 0.5 to 1.2 seconds. In order to ensure that each sample (M) covers at least one heartbeat (T) while the dimensions of the denoising model do not become too large, we select a suitable sample length according to the following formula:

$$M \geq F_s \cdot T_{max} \tag{13}$$

Here, F_s is 360 Hz for the MIT-BIH database and T_max is 1.2 seconds, so M should be at least 432. In order to facilitate the construction of the G network, we set M to 512.

(3) In order to balance the signal strength across samples, accelerate convergence, and improve the stability of the model, minimum-maximum normalization is applied as follows:

$$Normalized(x_n) = \frac{x_n - x_{min}}{x_{max} - x_{min}} \tag{14}$$

where x_max and x_min represent the maximum and minimum values in each sample, respectively.

3) The Computing Platform: We conduct all training, testing, and validation of the experiments on an NVIDIA Tesla V100 server. The code is implemented in Python 3.6 and PyTorch 1.1.

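The loss terms defined in Section III, Eqs. (5) to (8), are simple enough to check by hand before any training run. The following pure-Python sketch evaluates them for a single sample. It assumes that Eq. (5) is a plain sum of absolute differences, treats the discriminator outputs as given numbers, and uses hypothetical names that are not taken from the authors' code; only the weights 0.7 and 0.2 come from the text.

```python
LAMBDA_DIST = 0.7  # lambda_1 in Eq. (7), value reported in Section III
LAMBDA_MAX = 0.2   # lambda_2 in Eq. (7), value reported in Section III


def l_dist(denoised, raw):
    """Eq. (5): sum of absolute point-wise differences over one sample."""
    return sum(abs(d - r) for d, r in zip(denoised, raw))


def l_max(denoised, raw):
    """Eq. (6): largest absolute point-wise difference in one sample."""
    return max(abs(d - r) for d, r in zip(denoised, raw))


def generator_loss(d_score_fake, denoised, raw):
    """Eq. (7) for one sample: adversarial term plus weighted l_dist, l_max.

    d_score_fake stands for D(G(x~), x~), supplied here as a plain float.
    """
    adversarial = (d_score_fake - 1.0) ** 2
    return (adversarial
            + LAMBDA_DIST * l_dist(denoised, raw)
            + LAMBDA_MAX * l_max(denoised, raw))


def discriminator_loss(d_score_real, d_score_fake):
    """Eq. (8) for one pair: push D(x, x~) toward 1 and D(G(x~), x~) toward 0."""
    return (d_score_real - 1.0) ** 2 + d_score_fake ** 2


# A perfect reconstruction that fully fools D gives zero generator loss.
print(generator_loss(1.0, [0.5, 0.5], [0.5, 0.5]))
```

Because l_dist grows with the number of points while l_max does not, the two weights are not directly comparable in scale, which is one reason they would need to be tuned experimentally.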

The network parameters of both G and D are optimized using the RMSprop optimizer with a learning rate of 0.0001. The batch size is 256.

B. Model Comparison With and Without the Input Variables z

The input variables z are retained in [19] and [21]. In order to further study the impact of the input variables z on the ECG denoising task, we conduct a model comparison experiment. The generator model with z is shown in Fig. 3(a). We add the input variables z between the encoder and the decoder. The input variables z are directly concatenated with the thought vector c, and the concatenated vector is decoded to obtain the denoised signal. As shown in Fig. 3(b), the generator model without z directly compresses the noisy signals into the thought vector c at the encoding stage, and then decodes it into the denoised signals at the decoding stage.

We select the EM noise and add it to the raw data at different SNRs (0 dB, 1 dB, 2 dB, 3 dB, 4 dB, and 5 dB) to get the noisy data. Then, we randomly select 80% of the data for training and the remaining data for testing. There are 61020 samples in the training set and 15120 samples in the test set.

Fig. 6 compares the average SNR and the average RMSE of the two models. The model without z has higher SNR and smaller RMSE for each record under the different SNR conditions. By removing the input variables z, the average SNR is increased by about 4 dB, and the average RMSE is decreased by about 0.002. Thus, removing the input variables z gives better performance in terms of SNR and RMSE.

Fig. 6. Comparison of denoising results of models with z and without z.

As Fig. 7(e) and (f) show, both models can generate denoised signals close to the raw signals, indicating that both models can effectively remove noise. Comparing the waveforms in Fig. 7(g) and (h) confirms that the model without the input variables z retains more details of the waveforms, which gives the denoised signals higher medical value. Therefore, the input variables z of CGAN should be removed in the ECG denoising task.

Fig. 7. Comparison of denoised waveforms using different models.

C. Dimensions Tuning

In order to explore the influence of the dimensions of G on denoising performance, we build five models of various dimensions with the same number of layers. As shown in Fig. 8, the dimensions of the five models are all the same after the first convolutional layer, and the number of feature maps doubles layer by layer from the second layer until it reaches 1024. In terms of the number of samples, the samples of the five models are halved layer by layer starting from the second, third, fourth, fifth, and sixth layers, respectively. As shown in the red box in Fig. 8, after 7 convolutional layers, the five models reach dimensions of 8 × 1024, 16 × 1024, 32 × 1024, 64 × 1024, and 128 × 1024 in the form of samples × feature maps, respectively. We use the dimensions of this layer to represent the different models.

Fig. 8. Generator models of different coding sizes (dimensions).

Fig. 9 shows the denoising results of the five models with different dimensions. It can be seen from Fig. 9 that when the dimensions are set to 32 × 1024, SNR reaches the maximum and RMSE


reaches the minimum. To sum up, the resulting dimensions for each layer are 512 × 1, 512 × 16, 512 × 32, 512 × 64, 256 × 128, 128 × 256, 64 × 512, and 32 × 1024. The decoder of G mirrors the encoder, and corresponding layers have the same filter width and number of filters.

Fig. 9. Denoising results of five models with different dimensions.

D. Structure Designing

To determine the appropriate generator structure, we tried building the network with different structures. We define symmetrical coding to mean that the dimensions of the encoder are the same as those of the decoder. We design the following five model structures:
(1) symmetrical coding with fully connected deep neural networks (Sy-FCDNN);
(2) symmetrical coding with recurrent neural networks (Sy-RNN);
(3) symmetrical coding with convolutional neural network (Sy-CNN);
(4) asymmetrical coding with convolutional neural network and skip connection (Asy-CNN+SC);
(5) symmetrical coding with convolutional neural network and skip connection (Sy-CNN+SC).

As shown in Fig. 10, by comparing the results of Sy-FCDNN, Sy-RNN, and Sy-CNN, we found that the Sy-CNN structure, which adopts the convolutional auto-encoder, performs better. By comparing the results of Sy-CNN and Sy-CNN+SC, we found that a better result can be obtained by introducing a skip connection between each encoding layer and its corresponding decoding layer, since skip connections facilitate gradient propagation. By comparing Asy-CNN+SC and Sy-CNN+SC, we found that the symmetrical encoder gives the better result. Sy-CNN+SC is the best of the five methods. Finally, we adopted Sy-CNN+SC as the generator in our CAE-CGAN.

Fig. 10. Results of different model structures.

E. Noise Reduction for Different Noise Types

We add different noises (i.e., EM, BW, and MA noises) to the raw signals at different SNRs (i.e., 0 dB, 1 dB, 2 dB, 3 dB, 4 dB, and 5 dB) to get noisy signals. In addition, considering that mixed noises are more common in practice, we also denoise mixed noises (i.e., EM+BW, MA+BW, MA+EM, and MA+EM+BW noises).

Fig. 11 shows the results of removing the BW, EM, and MA noises with our method. In each subfigure, from top to bottom, are the raw signal, the noisy signal, and the denoised signal. The signals denoised by the proposed method are very close to the raw signals, indicating that this method can effectively remove the three kinds of single noise: BW, EM, and MA.

Fig. 11. Denoising results in removing the single noise.

Fig. 12 shows the noise reduction results for the mixed noises with our method. Subfigures a, b, c, and d show the denoising results for EM+BW, MA+BW, MA+EM, and MA+EM+BW, respectively. In each group of subfigures, from top to bottom, are the raw signal, the noisy signal, and the denoised signal. As can be seen from these figures, the denoised ECG signals are very similar to the raw ECG signals, so the proposed method can also remove the mixed noises effectively.

Table I shows the denoising results of the proposed method for single noise and mixed noises on the test set. On the test set, for a single type of noise the denoised SNR is above 39.09 dB, and for mixed noises the denoised SNR is above 39.49 dB, indicating that this method has outstanding denoising effects for both single noise and mixed noise.

F. Evaluation of Generalization Ability

The practicality of the denoising method should be evaluated by the generalization ability of the model, that is, the predictive ability of the model on unseen data. In order to explore the generalization ability of this method for different test data, we conduct experiments on the two new data sets (i.e., new records and new lead) under different input SNRs (i.e., 0 dB, 1.25 dB, and 5 dB). The average denoising results are shown in Table II and Table III.

From Table II, after denoising, the average SNR of the new data set with single noise (i.e., EM, MA, and BW) still reaches

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY SRINAGAR. Downloaded on October 20,2022 at 11:27:54 UTC from IEEE Xplore. Restrictions apply.

TABLE I
DENOISING RESULTS OF PROPOSED METHOD

TABLE II
AVERAGE DENOISING RESULTS OF NEW RECORDS DATA SET

TABLE III
AVERAGE DENOISING RESULTS OF NEW LEAD DATA SET
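
Section E builds its noisy inputs by adding a noise record to a raw signal at a chosen input SNR, and the tables report the SNR and RMSE of the result. The sketch below illustrates that arithmetic under stated assumptions: SNR and RMSE follow their usual definitions (signal power over error power, and root mean squared error), and the sine waves are toy stand-ins for an ECG segment and a baseline-wander record. The function names are hypothetical and are not taken from the paper.

```python
import math

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(clean, noise, target_db):
    """Scale `noise` so that clean + gain * noise has the requested input SNR."""
    # SNR_dB = 10 * log10(P_clean / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (target_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]

def snr_db(reference, estimate):
    """SNR in dB of `estimate`, measured against `reference`."""
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(sum(r * r for r in reference) / error_energy)

def rmse(reference, estimate):
    """Root mean squared error between two equal-length sequences."""
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)

# Toy 512-point segment (the sample length used by the proposed method).
clean = [math.sin(2 * math.pi * 5 * t / 512) for t in range(512)]
# Slow drift standing in for baseline wander (an assumption, not real data).
noise = [math.sin(2 * math.pi * 0.5 * t / 512 + 0.3) for t in range(512)]

for target in (0, 1, 2, 3, 4, 5):
    noisy = mix_at_snr(clean, noise, target)
    print(target, round(snr_db(clean, noisy), 3), round(rmse(clean, noisy), 4))
```

Mixed noise cases such as EM+BW can be handled the same way by summing the noise records first and scaling the sum, although the paper does not state which order it uses.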

41 dB, the average SNR of the two-noise mixed cases (i.e., MA+EM, EM+BW, and MA+BW) reaches 40 dB, and the average SNR of the three-noise mixed case (i.e., MA+EM+BW) reaches 39 dB. This shows that the method is also applicable to other records in the MIT-BIH Arrhythmia dataset.

As shown in Table III, for the new lead test data, the denoised average SNRs are over 39 dB, 44 dB, and 44 dB under single noise (i.e., EM, MA, and BW), the two-noise mixed cases (i.e., MA+EM, EM+BW, and MA+BW), and the three-noise mixed case (i.e., MA+EM+BW), respectively.

G. Computational Load and Energy Consumption

In this section, we estimate the computational load and energy consumption. The trained generator is used in the noise reduction process, so we mainly focus on the generator. The total number of parameters in the generator model is 50.31 M, the saved generator model takes up 191 MB, and the computational cost of denoising each sample is 4.08243 GFLOP (giga floating point operations). The proposed model runs on an NVIDIA Tesla V100 server whose maximum power consumption is 250 W and whose maximum neural network computing ability is about

WANG et al.: ECG SIGNAL DENOISING METHOD USING CONDITIONAL GENERATIVE ADVERSARIAL NET 2937

TABLE IV
DENOISING RESULTS FOR REMOVING THE BW NOISE

TABLE V
DENOISING RESULTS FOR REMOVING THE EM NOISE
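
The per-sample energy estimate in Section G follows from three figures quoted in the text: 4.08243 GFLOP per sample, a peak throughput of about 112 TFLOPS, and a maximum power of 250 W. The sketch below reproduces that arithmetic as a reading aid. It assumes the GPU runs at peak throughput and maximum power for the whole computation, which is how the quoted figure appears to have been obtained; the measured-latency bound at the end is an added illustration and is not a number from the paper.

```python
# Figures quoted in Section G.
FLOP_PER_SAMPLE = 4.08243e9   # 4.08243 GFLOP per denoised sample
PEAK_THROUGHPUT = 112e12      # about 112 TFLOPS peak
MAX_POWER_W = 250.0           # maximum power consumption in watts
MEASURED_LATENCY_S = 18e-3    # about 18 ms measured per sample

# Ideal case: the computation runs at peak throughput.
ideal_time_s = FLOP_PER_SAMPLE / PEAK_THROUGHPUT
ideal_energy_j = ideal_time_s * MAX_POWER_W      # energy = power * time

# Pessimistic case (assumption): full power drawn for the measured latency.
upper_bound_j = MEASURED_LATENCY_S * MAX_POWER_W

print(f"ideal compute time : {ideal_time_s * 1e6:.2f} microseconds")
print(f"ideal energy       : {ideal_energy_j * 1e3:.2f} millijoules")
print(f"upper bound energy : {upper_bound_j:.2f} joules")
```

Under these assumptions the ideal-throughput figure comes out at roughly 9.1 × 10^-3 J per sample. The gap between the ideal compute time and the measured 18 ms suggests that the real cost on a given device depends heavily on utilization, which is consistent with the caveat about portable low-power devices.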

112 TFLOPS. Based on this estimate, the energy consumption for processing each sample is about 9.1 mJ. In our tests, denoising each sample takes about 18 milliseconds. The above energy consumption cannot represent the power consumption of the method once it is transplanted to portable low-power devices in the future.

H. Comparison With Other Methods

We compare the performance metrics of Improved DAE [16], Adversarial Method [18], and the proposed method (CAE-CGAN) by experiments. It should be noted that the experimental samples in the three methods are different. The sample lengths are 101, 310, and 512, respectively. The numbers of training samples are 30000, 54000, and 61020, respectively. The numbers of test samples are 2000, 5940, and 15120, respectively. In addition, the proposed method takes every 512 sampling points as a sample, while the other two methods need to use a heartbeat as a sample.

Tables IV, V, and VI respectively show the SNR and RMSE after noise reduction for ten records containing BW, EM, and MA noise. It can be seen that our method has the best performance in SNR and RMSE for the different noises. For BW and MA, the average SNRs are over 39 dB; for EM, the average SNRs are over 40 dB.

Our proposed CAE-CGAN adds additional information in the structure as a condition to direct the data generation process, and the adversarial training of CAE-CGAN does not require the selection of samples. Therefore, the denoised signal is closer to the raw signal. The above experiments also confirm that this method avoids the shortcomings of the previous methods and has a better noise reduction effect.

As shown in Fig. 13, in the seven noise cases, the denoising results of our method all exceed the Adversarial Method [18]


TABLE VI
DENOISING RESULTS FOR REMOVING THE MA NOISE

TABLE VII
EXPERIMENTAL RESULTS OF CLASSIFICATION
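
The data preparation for the classification experiment in Section I (a 250-point window per detected beat, four beat classes, 50 beats per class) can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that R-peak positions and beat labels are already available (for example from a Pan-Tompkins detector and the database annotations), that each window is centred on the R peak, and that the first 50 beats of each class are kept; the text specifies none of these details. The wavelet feature extraction and SVM steps are left out, and all function names are hypothetical.

```python
from collections import defaultdict

BEAT_LEN = 250                  # samples per heartbeat window (from the text)
PER_CLASS = 50                  # beats kept per class (from the text)
CLASSES = ("N", "V", "A", "L")  # beat types used in the experiment

def cut_beats(signal, r_peaks, labels, beat_len=BEAT_LEN):
    """Return (window, label) pairs, one per usable beat.

    Windows are centred on the R peak (an assumption); beats whose window
    would run past either end of the signal are skipped.
    """
    half = beat_len // 2
    beats = []
    for pos, lab in zip(r_peaks, labels):
        start = pos - half
        stop = start + beat_len
        if lab in CLASSES and start >= 0 and stop <= len(signal):
            beats.append((signal[start:stop], lab))
    return beats

def balance(beats, per_class=PER_CLASS):
    """Keep the first `per_class` beats of each class (selection rule assumed)."""
    kept = defaultdict(list)
    for window, lab in beats:
        if len(kept[lab]) < per_class:
            kept[lab].append(window)
    return dict(kept)

# Toy data: a flat signal with evenly spaced, cyclically labelled "beats".
signal = [0.0] * 40000
r_peaks = list(range(200, 39800, 150))
labels = [CLASSES[i % len(CLASSES)] for i in range(len(r_peaks))]

balanced = balance(cut_beats(signal, r_peaks, labels))
print({lab: len(windows) for lab, windows in balanced.items()})
```

Each balanced window would then be passed to the wavelet decomposition and the classifier described in the text.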

significantly, and it is increased by 35.81% at most. The average SNR of seven kinds of noises is increased by 25.55%. It is obvious that the proposed method is better than the Adversarial Method [18].

I. Classification Experiments

Besides the metrics RMSE and SNR, we also report a further metric, the classification accuracy of cardiac diseases. The cardiac disease classification experiment is carried out as follows. Firstly, the Pan-Tompkins (PT) algorithm [35] is used to locate the QRS position of the signal, and then 250 points are intercepted as a heartbeat period based on the QRS position. According to the annotation of the MIT-BIH database, four types of heartbeats are saved: normal beat (N), premature ventricular contraction (V), atrial premature beat (A), and left bundle branch block beat (L). In order to balance the data distribution of the different types of heartbeats, the same number of heartbeats is selected for each type. In this experiment, the number is 50. Secondly, a 5-level wavelet decomposition is performed on each heartbeat. The Daubechies 6 (DB6) wavelet is used as the wavelet function, and the approximation coefficients of the wavelet transform are taken to represent the characteristics of each heartbeat. Finally, a Support Vector Machine (SVM) is used to classify the heartbeats into four categories (i.e., N, V, A, and L).

The classification results are shown in Table VII. The accuracy of the denoised signals is significantly improved compared with that of the noisy signals and is very close to that of the raw signals. Since EM can mimic the appearance of ectopic beats [34], it has a great impact on classification accuracy. For the noisy signals with EM (i.e., EM, EM+MA, EM+BW, and EM+BW+MA), the improvement in accuracy is the most obvious, reaching above 43%. This shows that the denoising effect for EM is greatly


Fig. 12. Denoising results in removing the mixed noise.

Fig. 13. Comparison with the Adversarial Method [18].

improved by the proposed method. For the seven noise cases under SNR = 0 dB, the disease classification accuracy of the denoised signals can reach more than 92%, and the average accuracy increased by more than 32%. The results show that the denoised signal retains the medical features of the raw data.

Taking record 205 as an example, the raw signal, the noisy signal, and the denoised signal are shown in Fig. 14(a). We can see that the denoised signal maintains the rhythm irregularity of the raw data. Taking the segment containing atrial fibrillation (AF) in record 219 as an example, the raw signal, the noisy signal, and the denoised signal are shown in Fig. 14(b). The red line represents the rhythm annotation of AF (“(AFIB” in the MIT-BIH database). This also shows that the denoised signals can retain beneficial medical information.

Fig. 14. The denoising result of record 205 and 219 by the proposed method.

J. Summary

CGAN is mainly used for generating tasks (such as image generation, audio generation, etc.). These tasks generate different outputs from the input variables z, whereas image enhancement tasks do not require the input variables z. We introduce CGAN into the ECG denoising task, so it is necessary to explore the role of the input variables z in the ECG denoising task. The experimental results confirm that removing the input variables z can not only improve the denoising performance but also retain more ECG waveform details.

After the structure design experiment, we chose a convolutional auto-encoder as our generator. We also performed dimension tuning experiments on the generator to achieve the best denoising effect. Our method achieves a better denoising effect on both single noise and mixed noise, indicating that the selected model (Sy-CNN+SC) adapts better to removing different types of noise.

V. CONCLUSION

We propose a denoising method for ECG signals based on CGAN in this paper. Our proposed CAE-CGAN adds additional information in the structure as a condition to direct the data generation process, and this method avoids the shortcomings of the previous methods. For single noise (i.e., EM, BW, and MA), the average SNRs can reach more than 40.61 dB. For the two-noise mixed cases (i.e., MA+EM, EM+BW, and MA+BW), the average SNRs can reach more than 41.78 dB. For the three-noise mixed case (i.e., MA+EM+BW), the average SNR can reach 41.2 dB. The


average SNR of seven kinds of noises was increased by 25.05% compared with the state-of-the-art method. These results show that the proposed method has outstanding denoising effects. In addition, the method is also applicable to other records and leads in the MIT-BIH Arrhythmia dataset; the denoising results also show that our method has good generalization ability. We also carried out the cardiac disease classification experiment. For the seven noise cases under SNR = 0 dB, the disease classification accuracy of the denoised signals can reach more than 92%, and the average accuracy increased by more than 32%. All the above results indicate that this method is promising in ECG denoising.

Although the denoising effect of our method has improved noticeably, we found that the denoising results on several records (103, 105, and 205) were still not ideal under the single MA or BW noise case. The problem might be caused by a variety of factors, such as noise type, noise intensity, and sample state. In addition, the data used in the current study were derived only from modified limb lead II (MLII) and lead V1, so 12-lead data need to be explored in the future. These will be the subject of our further studies.

REFERENCES

[1] K. Li, W. Pan, Y. Li, Q. Jiang, and G. Liu, “A method to detect sleep apnea based on deep neural network and hidden Markov model using single-lead ECG signal,” Neurocomputing, vol. 294, pp. 94–101, Jun. 2018.
[2] Y. Li, Y. Pang, J. Wang, and X. Li, “Patient-specific ECG classification by deeper CNN from generic to dedicated,” Neurocomputing, vol. 314, pp. 336–346, Nov. 2018.
[3] S. Hong, Y. Zhou, J. Shang, C. Xiao, and J. Sun, “Opportunities and challenges of deep learning methods for electrocardiogram data: A systematic review,” Comput. Biol. Med., vol. 122, Jul. 2020, Art. no. 103801.
[4] D. G. Katritsis, G. C. Siontis, and A. J. Camm, “Prognostic significance of ambulatory ECG monitoring for ventricular arrhythmias,” Prog. Cardiovasc. Dis., vol. 56, no. 2, pp. 133–142, Sep. 2013.
[5] G. Friesen, T. Jannett, M. Jadallah, S. Yates, S. Quint, and H. Nagle, “A comparison of the noise sensitivity of nine QRS detection algorithms,” IEEE Trans. Biomed. Eng., vol. 37, no. 1, pp. 85–98, Jan. 1990.
[6] M. Z. U. Rahman, R. A. Shaik, and D. R. K. Reddy, “Efficient sign based normalized adaptive filtering techniques for cancelation of artifacts in ECG signals: Application to wireless biotelemetry,” Signal Process., vol. 91, no. 2, pp. 225–239, Feb. 2011.
[7] M. Z. U. Rahman, R. A. Shaik, and D. V. R. K. Reddy, “Efficient and simplified adaptive noise cancelers for ECG sensor based remote health monitoring,” IEEE Sensors J., vol. 12, no. 3, pp. 566–573, Mar. 2012.
[8] M. A. Kabir and C. Shahnaz, “Denoising of ECG signals based on noise reduction algorithms in EMD and wavelet domains,” Biomed. Signal Process. Control, vol. 7, no. 5, pp. 481–489, Sep. 2012.
[9] S. Ari, M. K. Das, and A. Chacko, “ECG signal enhancement using s-transform,” Comput. Biol. Med., vol. 43, no. 6, pp. 649–660, Jul. 2013.
[10] M. Gadaleta and A. Giorgio, “A method for ventricular late potentials detection using time-frequency representation and wavelet denoising,” ISRN Cardiol., vol. 2012, pp. 1–9, Aug. 2012.
[11] L. Smital, M. Vítek, J. Kozumplík, and I. Provazník, “Adaptive wavelet wiener filtering of ECG signals,” IEEE Trans. Biomed. Eng., vol. 60, no. 2, pp. 437–445, Feb. 2013.
[12] Z. Wang, F. Wan, C. M. Wong, and L. Zhang, “Adaptive Fourier decomposition based ECG denoising,” Comput. Biol. Med., vol. 77, pp. 195–205, Oct. 2016.
[13] S. Poungponsri and X.-H. Yu, “Electrocardiogram (ECG) signal modeling and noise reduction using wavelet neural networks,” in Proc. IEEE Int. Conf. Automat. Logistics, 2009, pp. 394–398.
[14] S. Poungponsri and X.-H. Yu, “An adaptive filtering approach for electrocardiogram (ECG) signal noise reduction using neural networks,” Neurocomputing, vol. 117, pp. 206–213, Oct. 2013.
[15] P. Xiong, H. Wang, M. Liu, F. Lin, Z. Hou, and X. Liu, “A stacked contractive denoising auto-encoder for ECG signal denoising,” Physiol. Meas., vol. 37, no. 12, pp. 2214–2230, Nov. 2016.
[16] P. Xiong, H. Wang, M. Liu, S. Zhou, Z. Hou, and X. Liu, “ECG signal enhancement based on improved denoising auto-encoder,” Eng. Appl. Artif. Intell., vol. 52, pp. 194–202, Jun. 2016.
[17] I. Goodfellow et al., “Generative adversarial nets,” Adv. Neural Inf. Process. Syst., vol. 27, pp. 2672–2680, 2014.
[18] J. Wang et al., “Adversarial de-noising of electrocardiogram,” Neurocomputing, vol. 349, pp. 212–224, Jul. 2019.
[19] P. Singh and G. Pradhan, “A new ECG denoising framework using generative adversarial network,” IEEE/ACM Trans. Comput. Biol. Bioinf., vol. 18, no. 2, pp. 759–764, Mar. 2021.
[20] B. Xu, R. Liu, M. Shu, X. Shang, and Y. Wang, “An ECG denoising method based on the generative adversarial residual network,” Comput. Math. Methods Med., vol. 2021, pp. 1–23, Apr. 2021.
[21] M. Mirza and S. Osindero, “Conditional generative adversarial nets,” Nov. 2014, Accessed: Mar. 09, 2022. [Online]. Available: http://arxiv.org/abs/1411.1784
[22] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in Proc. IEEE Eur. Conf. Comput. Vis., 2016, pp. 694–711.
[23] C. Ledig et al., “Photo-realistic single image super-resolution using a generative adversarial network,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 4681–4690.
[24] R. A. Yeh, C. Chen, T. Y. Lim, A. G. Schwing, M. Hasegawa-Johnson, and M. N. Do, “Semantic image inpainting with deep generative models,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 5485–5493.
[25] H. Zhang, V. Sindagi, and V. M. Patel, “Image de-raining using a conditional generative adversarial network,” IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 11, pp. 3943–3956, Nov. 2020.
[26] R. Li, J. Pan, Z. Li, and J. Tang, “Single image dehazing via conditional generative adversarial network,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2018, pp. 8202–8211.
[27] X. Kuang, X. Sui, Y. Liu, Q. Chen, and G. Gu, “Single infrared image enhancement using a deep convolutional neural network,” Neurocomputing, vol. 332, pp. 119–128, Mar. 2019.
[28] X.-H. Qin et al., “Using a one-dimensional convolutional neural network with a conditional generative adversarial network to classify plant electrical signals,” Comput. Electron. Agriculture, vol. 174, Jul. 2020, Art. no. 105464.
[29] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley, “Least squares generative adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2794–2802.
[30] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, “Improved techniques for training gans,” in Proc. Adv. Neural Inf. Process. Syst., vol. 29, 2016, pp. 481–489.
[31] J. Masci, U. Meier, D. Cireşan, and J. Schmidhuber, “Stacked convolutional auto-encoders for hierarchical feature extraction,” in Lecture Notes Comput. Sci., 2011, pp. 52–59.
[32] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2017, pp. 1125–1134.
[33] G. B. Moody and R. G. Mark, “The impact of the MIT-BIH arrhythmia database,” IEEE Eng. Med. Biol. Mag., vol. 20, no. 3, pp. 45–50, 2001.
[34] G. B. Moody, W. K. Muldrow, and R. G. Mark, “Noise Stress Test for Arrhythmia Detectors,” Comput. Cardiol., vol. 11, pp. 381–384, 1984.
[35] J. Pan and W. J. Tompkins, “A real-time QRS detection algorithm,” IEEE Trans. Biomed. Eng., vol. BME-32, no. 3, pp. 230–236, Mar. 1985.
