A Deep Biometric Recognition and Diagnosis Network

This article presents a deep learning approach for arrhythmia screening using electrocardiogram (ECG) recordings, featuring three convolutional neural network (CNN) models designed for improved feature extraction. The best-performing model, MSF-CNN architecture B, achieved an average accuracy of 98.00%, indicating its potential for clinical application in diagnosing arrhythmias. The study emphasizes the significance of automated detection and classification of arrhythmia heartbeats to aid cardiologists in clinical settings.


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2020.3016938, IEEE Access

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS. 2017. Doi Number

A deep biometric recognition and diagnosis network with residual learning for arrhythmia screening using electrocardiogram recordings
HAO DANG¹˒², YARU YUE¹˒², DANQUN XIONG³, XIAOGUANG ZHOU¹˒², XIANGDONG XU³, and XINGXIANG TAO¹˒²
¹ School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
² Engineering Research Center of Information Network, Ministry of Education, Beijing 100876, China
³ Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medical and Health Sciences, Shanghai 201800, China

Corresponding author: Xiaoguang Zhou ([email protected]), Xiangdong Xu ([email protected]), and Xingxiang Tao
([email protected])

This work was supported in part by the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing
University of Posts and Telecommunications) under Grant SKLNST-2018-1-18, and in part by the Foundation of Jiading Health Science
under Grant 2018-KY-04.

ABSTRACT Arrhythmia is one of the most persistent chronic heart diseases in the elderly and is associated
with high morbidity and mortality such as stroke, cardiac failure, and coronary artery diseases. It is significant
for patients with arrhythmias to automatically detect and classify arrhythmia heartbeats using
electrocardiogram (ECG) signals. In this paper, we develop three robust deep convolutional neural network
(DCNN) models, including a plain-CNN network and two MSF(multi-scale fusion)-CNN architectures (A
and B), to aid in better feature extraction for the detection of arrhythmia and thus significantly improve the
performance metrics. The proposed models are trained and tested with a public MIT-BIH arrhythmia database
on five types of signals. Six groups of ablation experiments are conducted to analyze the performance of the
models. The accuracy, sensitivity, and specificity obtained from MSF-CNN architecture A are higher than
those from the plain-CNN model, demonstrating that the different parallel group convolution blocks (1 × 3,
1 × 5, and 1 × 7) dramatically improve a model’s performance. Additionally, the best model MSF-CNN
architecture B achieves an average accuracy, sensitivity, and specificity of 98.00%, 96.17%, and 96.38%,
respectively. This illustrates the method with residual learning and concatenation group convolution blocks
has a profound effect on the feature learning of the model. The results of ablation experiments show that our
proposed biometric recognition and diagnosis network with residual learning (MSF-CNN B) achieves a rapid
and reliable diagnosis approach on ECG signal classification, which has the potential for introduction into
clinical practice as an excellent tool for aiding cardiologists in reading ECG heartbeat signals.

INDEX TERMS Heartbeat, Arrhythmia, Deep Learning, Convolutional Neural Network, Electrocardiogram Signal

I. INTRODUCTION
Arrhythmias are an important group of cardiovascular diseases that are characterized by slow, fast, or irregular heartbeats [1,2]. They may occur alone or in conjunction with other cardiovascular diseases. Some serious arrhythmias may also occur suddenly and lead to sudden death, stroke, cardiac failure, and coronary artery diseases [3].
The electrocardiogram (ECG) is a noninvasive, inexpensive, and reliable diagnostic tool that reflects the specific changes in electrical signal activity over time. It is an important standard in the diagnosis of arrhythmias [4]. ECG signals include important morphological information and are usually obtained by ECG inspection equipment, such as electrocardiographs, 24-hour Holter monitors, and wireless wearable devices [5]. They are widely used in the analysis of cardiac function. Cardiac arrhythmias are currently diagnosed by manual interpretation of the ECG signal. To automatically diagnose arrhythmias through ECG records, monitoring

VOLUME XX, 2019

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

equipment must be able to analyze the morphological characteristics of ECG signals [6] as well as the correlation between heartbeats, and finally detect abnormal heartbeats and determine their types.
According to the standard from the Association for the Advancement of Medical Instrumentation (AAMI) [7], ECG signals can be divided into five categories: normal beat (N), supraventricular ectopic beat (S), ventricular ectopic beat (V), fusion beat (F), and unknown beat (Q). The AAMI standard focuses on the detection of ventricular ectopic beats (VEBs) and non-VEBs, and each category includes several types of heartbeats. The specific classification is shown in Table 1, where each heartbeat type represents a different cardiac activity pattern. Under different cardiac activity states, each ECG signal has a different implication and requires different targeted treatments [8]. At present, visual evaluation by cardiologists is an important standard of diagnosis. It requires numerous well-trained specialists to correctly identify the type of signal, which not only leads to deviation between subjective judgment and the actual situation [9], but also consumes considerable time and energy. Therefore, it is of utmost importance for cardiologists to be able to automatically identify abnormal heart rhythms before clinical treatment.

TABLE 1 MAPPING OF THE MIT-BIH ARRHYTHMIA DATABASE HEARTBEAT TYPES TO THE AAMI STANDARD
AAMI heartbeat type    MIT-BIH heartbeat types
N                      NOR, LBBB, AE, RBBB, Nodal (junctional)
S                      AP, aAP, NP, SP
V                      PVC, VE
F                      fVN
Q                      P, fPN, U
Abbreviations: Heartbeat types: N: Any heartbeat not in the S, V, F, Q classes; S: Supraventricular ectopic beat; V: Ventricular ectopic beat; F: Fusion beat; Q: Unknown beat; NOR: Normal beat; LBBB: Left bundle branch block beat; AE: Atrial escape beat; RBBB: Right bundle branch block beat; AP: Atrial premature beat; aAP: Aberrated atrial premature beat; PAC: Premature atrial contraction beat; NP: Nodal (junctional) premature beat; SP: Supraventricular premature beat; PVC: Premature ventricular contraction; VE: Ventricular escape beat; fVN: Fusion of ventricular and normal beat; P: Paced beat; fPN: Fusion of paced and normal beat; U: Unclassified beat.

Over the past decades, ECG signal recognition and classification have become an established technique that can effectively assist physicians in clinical diagnosis [4]. The relevant automatic recognition models mainly rely on traditional pattern matching methods. These methods have achieved great progress, but the complex feature extraction process consumes considerable computing resources [4]. In recent years, deep learning has become a mainstream pattern recognition method. It is an end-to-end learning approach that does not require a complex process of hand-crafted feature extraction. Moreover, great achievements have been obtained in the fields of image classification [10-14], object detection [15-17], and image segmentation [18-21]. Therefore, in this paper, we introduce deep learning into the study of one-dimensional signals and propose a more accurate, rapid, and robust discriminant model for the classification of ECG signals.
This paper is organized as follows: Section II introduces literature related to the classification of ECG signals, including data pre-processing, machine learning methods, and deep learning methods. The database is then described in Section III. In Section IV, we propose a plain-CNN network and two MSF-CNN architectures (A and B) and analyze the configuration parameters of the three network architectures in depth. In Section V, the experimental results are shown in detail, and the performance is compared with recent popular algorithms. Finally, we conclude our work and propose future research directions in Section VI.

II. RELATED WORK
In this section, we survey related literature on traditional machine learning approaches and recent popular deep learning methods for the detection and classification of ECG signals. In general, traditional machine learning methods consist of three steps for the classification of arrhythmias: data preprocessing, feature extraction and selection, and feature classification. The deep learning approach, in contrast, is an end-to-end model that has the capacity to self-learn from the input ECG signal segments.

A. DATA PRE-PROCESSING
The pre-processing of ECG signals mainly includes denoising and segmentation. First, ECG signals are contaminated by various kinds of noise and artefacts [22]. Because ECG signals are low-amplitude, low-frequency signals, diverse noise can lead physicians to an incorrect assessment and reduce the accuracy of diagnosis. Therefore, the denoising of ECG signals is a significant baseline [23] of data pre-processing. The goal is to reduce noise and artefacts and determine the points of interest, which helps to extract effective waveform features from ECG signals. Many scholars have proposed different preprocessing methods. In general, they can be divided into four categories: filtering methods, transformation filtering methods, statistical methods, and a combination of these methods [24-28]. Additionally, ECG signal segmentation is also necessary. It divides the whole signal record into a large number of heartbeats or RR intervals, and the heartbeats or RR intervals belonging to the same class are grouped together according to the annotations of the expert.
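The segmentation-and-grouping step described above, together with the Table 1 mapping, can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes the single-character PhysioNet beat annotation symbols for the MIT-BIH types (the paper lists abbreviations, not symbols), uses a hypothetical helper named `segment_and_group`, and takes its default window of 60 samples before and 190 samples after the R-peak from the 251-sample segments described in Section III.

```python
import numpy as np

# Assumption: PhysioNet annotation symbols for the MIT-BIH beat types in Table 1.
AAMI_CLASSES = {
    "N": ["N", "L", "R", "e", "j"],  # NOR, LBBB, RBBB, AE, nodal (junctional) escape
    "S": ["A", "a", "J", "S"],       # AP, aAP, NP, SP
    "V": ["V", "E"],                 # PVC, VE
    "F": ["F"],                      # fVN
    "Q": ["/", "f", "Q"],            # P, fPN, U
}
SYMBOL_TO_AAMI = {sym: cls for cls, syms in AAMI_CLASSES.items() for sym in syms}


def segment_and_group(signal, r_peaks, symbols, before=60, after=190):
    """Cut one window per annotated R-peak and group the windows by AAMI class.

    Hypothetical helper: `r_peaks` are sample indices, `symbols` the matching
    annotation symbols. Each window has before + 1 + after samples.
    """
    signal = np.asarray(signal, dtype=float)
    groups = {cls: [] for cls in AAMI_CLASSES}
    for peak, sym in zip(r_peaks, symbols):
        cls = SYMBOL_TO_AAMI.get(sym)
        start, stop = peak - before, peak + after + 1
        if cls is None or start < 0 or stop > len(signal):
            continue  # skip non-beat annotations and windows that leave the record
        groups[cls].append(signal[start:stop])
    return groups
```

With a real record, `r_peaks` and `symbols` would come from the annotation file; here they are plain lists so the sketch stays independent of any particular reader library.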


B. MACHINE LEARNING METHODS
In recent decades, traditional machine learning algorithms have been widely used in the classification of arrhythmia signals and have made remarkable achievements. Machine learning methods involve the complex processing steps of feature extraction, feature selection, and feature learning.
1) FEATURE EXTRACTION AND SELECTION
Feature extraction and selection are a pivotal part of the classification of ECG signals in traditional machine learning methods. They help to obtain the most essential features of the signals and provide accurate features for the final classification. The main features of ECG signals include time-domain features (also known as waveform features), frequency-domain features, and statistical features [22].
Time-domain features mainly refer to physical parameters reflecting the activity regularity of the ECG signal, including the frequency and amplitude of each waveform, such as the P-wave, Q-wave, R-wave, S-wave, and T-wave, and interval information, such as the PR-interval, QT-interval, and RR-interval. The QRS-complex and RR-interval features of ECG signals are significant in the time domain; they mainly reflect the position, duration, amplitude, and shape of a specific waveform or deflection in the signals [29-30]. In addition, digital filters [31], neural networks [32], high-order moments [33], and phasor transforms [34] have also been used for detecting the QRS-complex.
Frequency-based approaches are among the most popular feature extraction techniques for representing ECG signals [22]. Many researchers claim the wavelet transform is the best approach for feature extraction and selection from ECG signals [35]. Among wavelet transforms, the discrete wavelet transform (DWT) is the most widely used in ECG signal classification. In addition to the DWT, the continuous wavelet transform (CWT) is also used to extract features from ECG signals, which overcomes the representation coarseness and instability of the DWT [36].
The main statistical features are the expectation, variance, maximum, minimum, standard deviation, and high-order moments of the ECG signal [24]. In general, these features provide an effective method for analyzing the complexity and distribution of waves on any time series. Therefore, in the case of ECG recordings, these features help to distinguish the variation process of particular patients and diseases [22].
In general, the above feature extraction and selection methods are implemented in machine learning classification algorithms. In this work, we introduce the deep learning approach into 1-D ECG signal classification. It is an end-to-end model with self-learning. The features are automatically extracted from the ECG signals by the convolutional neural network, so the hand-crafted feature extraction and selection process is unnecessary.
2) FEATURE LEARNING METHODS
These methods are summarized according to different types of classifiers, including statistical methods [37], decision tree classification models [38-39], neural network methods [39-40], and support vector machines (SVMs) [40].
For example, Li and Zhou [38] presented an approach to classify ECG signals using wavelet packet entropy (WPE) and random forests (RF) following the recommendations from AAMI. The experimental results have shown that the WPE and RF methods are superior to several state-of-the-art competitive methods. A. M. Alqudah [37] introduced a novel method to model cardiac-related biological signals (ECG and PPG) based on Gaussian mixture waves. The proposed method has been applied to the MICIC and MIT-BIH arrhythmia databases.
Moreover, A. M. Alqudah et al. [39] utilized two classifier techniques, the probabilistic neural network (PNN) algorithm and the random forest (RF) algorithm, together with Gaussian mixture and wavelet features, to classify ECG beats into six classes: normal beat (N), left bundle branch block beat (LBBB), right bundle branch block beat (RBBB), premature ventricular contraction (PVC), atrial premature beat (APB), and aberrated atrial premature beat (AAP).
Hammad et al. [40] employed four support vector machines (SVMs), two neural networks (NNs), and a k-nearest neighbor (KNN) classifier to classify ECG signals. These algorithms extracted 13 features from each ECG segment and used them as the input of the proposed classifier. All the records of the MIT-BIH arrhythmia database were used to validate these algorithms.
In general, although the above methods have shown favorable classification performance, they also have numerous shortcomings. First, these automatic ECG signal classification models mainly depend on machine learning and pattern recognition. In the process, ECG signal segments are regarded as a sequence of stochastic patterns, and the hand-crafted feature extraction process requires burdensome computational resources and time. Second, in terms of classification algorithms and training datasets, the robustness of classification models is still limited because they fail to handle large intra-class variations. In addition, the above algorithms are often subject to overfitting and show poor performance when validated on different datasets. Furthermore, the classifier algorithms do not perform well in practical applications on the varied ECG signals of different patients, which shows a common disadvantage of inconsistent performance when classifying a new ECG record. This makes them less reliable clinically or in practice. Finally, recent ECG monitoring models require well-established cardiologists for diagnosis, which also consumes a lot of time and energy.

C. DEEP LEARNING METHODS
Deep learning is a new technology that has become the mainstream in computer vision and pattern recognition. In the past few years, deep learning has been widely used in the fields of image classification [10-14], object detection [15-17], and image segmentation [18-21]. In recent years, deep learning-based methods have been successfully applied to analyze ECG signals to overcome the challenges of traditional machine learning-based methods.
For example, Kiranyaz et al. [41] presented a fast and accurate patient-specific ECG classification system for recognizing two types of signals, supraventricular ectopic beats (S) and ventricular ectopic beats (V). The model used three convolutional layers and two multi-layer perceptrons to obtain the experimental result.
In addition, Jun et al. [42] proposed a deep neural network for the classification of premature ventricular contraction (PVC) beats. Acharya et al. [43] developed a 9-layer CNN model to automatically classify five classes of heartbeats. Murugesan et al. [6] also implemented three robust deep neural networks (DNNs) (CNN, LSTM, and CNN-LSTM) to detect two types of beats, premature ventricular contraction (PVC) and premature atrial contraction (PAC). The results showcased the potential of the network as a feature extractor for ECG signal classification.
Moreover, in [44], a CNN was transferred to carry out automatic ECG arrhythmia diagnostics after employing higher-order spectral algorithms. Transfer learning strategies were applied to pre-trained convolutional neural networks, namely AlexNet and GoogleNet, to carry out the final classification.
Compared with traditional machine learning methods, the most critical feature of deep learning is that it does not require the processes of feature extraction and feature selection. Deep learning approaches have the ability to self-learn from input signals. In other words, the previous processes of feature extraction and selection in machine learning are embedded in the deep learning model, which can continuously learn features from input data. However, the above deep learning methods also showed some imperfections. The work in [41], [42], and [6] addressed a two-class problem, which is simpler than the five-class problem in this work. Furthermore, [37] and [40] presented plain CNN models to extract features from ECG signals. The structure of the plain model was not conducive to the extraction of features from deep layers. Moreover, [43] proposed a 9-layer model, which is sufficient for feature extraction, but the model did not fully consider the imbalance between data classes, which may lead to overfitting. Additionally, the influence of different input signal lengths and of the unbalanced original class distribution on model performance has not been fully considered.
Broadly speaking, the fundamental disadvantages and challenges of existing machine learning methods for ECG signal detection and classification lie in hand-crafted feature extraction, which not only greatly affects the accuracy of the algorithm, but also consumes a lot of calculation time and cost. The deep convolutional neural network is essentially realized by stacking automatic encoders. Its considerable feature representational power effectively reveals unknown abstract features of input signals, and it can achieve self-learning through end-to-end model design. Meanwhile, the radical problem of both kinds of methods is that they only focus on how to propose a better model, but do not pay attention to data processing issues such as data denoising, data augmentation, and multi-scale data training and testing. The data preprocessing of signals deserves attention because signals and images are different data types.
Hence, in this work, inspired by these previous efforts, a more accurate, comprehensive, and robust method based on deep learning is proposed to identify five different types of arrhythmia signals. The proposed approach not only pays attention to the superiority of model design but also emphasizes the importance of data processing. The final results also show that ECG signal classification using the convolutional neural network is reliable. The deep learning architecture outperforms the hand-crafted feature extractors assembled by machine learning models in terms of classification accuracy, sensitivity, specificity, and confusion matrix.
The contributions of this work are as follows:
(1) We propose an end-to-end plain-CNN architecture and two MSF-CNN architectures (A and B) to replace additional hand-crafted feature extraction, selection, and classification using machine learning methods. The plain-CNN is a baseline model, and MSF-CNN A and B are implemented on top of this baseline network. This significantly enhances performance against recent state-of-the-art studies.
(2) Moreover, the signal processing problems are fully considered. We first design multi-scale input signals, including 251 samples (named set A) and 361 samples (named set B). This design can improve the generalization ability of the model by extracting multi-scale signal features. Signal denoising and data augmentation are also implemented in this paper. The data augmentation strategy is a major innovation in this paper; this problem has not received much attention in most previous ECG signal research papers.
(3) In particular, we present six sets of detailed ablation experiments on ECG signal classification and achieve excellent performance metrics. We also compare the results from our model with recent state-of-the-art methods. Additionally, detailed analysis and comparison are presented in this paper.

III. ECG DATABASE DESCRIPTION AND PRE-PROCESSING
It is crucial to acquire and process the research data in our work. In this section, we first introduce the MIT-BIH Arrhythmia Database in detail, and then we fully illustrate the data pre-processing, including denoising, data segmentation, and data augmentation.

A. THE DESCRIPTION OF DATABASE
The MIT-BIH Arrhythmia Database (MITDB) [45] is an open-source PhysioBank database that is widely used to research the detection and classification of ECG signals. The


database consists of 48 half-hour ECG records obtained from 47 subjects, and each ECG record contains two leads (lead II and lead V) originating from different electrodes. Figure 1 shows an example of signals from the MITDB. Each ECG record duration is approximately 30 minutes, and the signal sampling frequency is 360 Hz. The subjects comprise 25 men aged 32 to 89 and 22 women aged 23 to 89. The Arrhythmia database is divided into 25 subjects with normal ECG recordings and 23 subjects with abnormal ECG recordings.
Figure 1. A 10 s signal example of MLII and V5 from MITDB. Each ECG record is approximately 30 minutes, which includes two leads. The MLII of Figure 1 denotes the signal of lead II, and V5 describes the lead V.
In this paper, two-lead signals (lead II or MLII) are used to train, validate, and test the algorithm. In addition, all the signal records are independently annotated by at least two cardiologists. A total of 109,454 heartbeats are extracted in this work (shown in Table 2). The data directory contains the entire MIT-BIH arrhythmia data, which uses a custom format to save file length and storage space. An ECG record consists of three parts: a header file (.hea), a data file (.dat), and an annotation file (.atr).

B. DATA PRE-PROCESSING
We process the original raw data from the MIT-BIH arrhythmia database through a series of approaches such as denoising, data segmentation, and data augmentation to form the new data sets, and finally train a network with stronger robustness and better generalization ability. The specific processes are as follows:
1) DENOISING
The main function of denoising is to eliminate power-line interference and baseline wandering caused by patient respiration or movement, which lead to several problems in detecting heart diseases. Baseline wandering is a low-frequency noise signal, and the median filtering method is adopted to remove this kind of noise. Power-line interference is an interfering voltage whose frequency is an integer multiple of 50 Hz and that completely masks the ECG waveform [4]. Power-line interference and high-frequency noise are usually removed by a low-pass filter. Considering this feature, the wavelet transform multi-resolution theory is first leveraged to decompose the noisy signal. Then, we take advantage of the different distribution of signal and noise on the spectrum to remove the detail components on the scales of the wavelet decomposition that directly correspond to the noise. Finally, the inverse wavelet transform is used to reconstruct the signals, which effectively removes the noise in the signal component.
2) DATA SEGMENTATION
The denoised ECG signals are classified into five classes: normal (N), supraventricular ectopic beat (S), ventricular ectopic beat (V), fusion beat (F), and unknown beat (Q) according to the annotations from cardiologists, and these signals are fed into the classification network. A complete normal heartbeat is shown in Figure 2, including an integrated rhythm from P-wave onset to T-wave offset (or U-wave onset). Considering that ECG signals of different lengths contain different amounts of feature information, data segmentation follows two strategies: 251 samples and 361 samples. The denoised ECG signals are segmented into a large number of heartbeats centered around the R-peak, excluding the first and last heartbeats of each record. Each heartbeat consists of 251 samples (60 samples before the R-peak and 190 samples after the R-peak), including an integrated P-, Q-, R-, S-, and T-peak. We refer to these 251-sample signals as set A. Likewise, the denoised signals are also segmented into heartbeats of 361 samples (120 samples before the R-peak and 240 samples after the R-peak). We refer to these 361-sample signals as set B.
3) DATA AUGMENTATION
Data augmentation is an important part of this work. Its main purpose is to balance the number of samples in the five classes (N, S, V, F, Q), which is more conducive to feature learning in deep neural networks. A total

It should be noted that data augmentation is a process that

generates new samples as a supplement to real data, which is
applied only to the training processes. In testing, we leverage
the original data without augmentation.

IV. NETWORK ARCHITECTURE

In this section, we first introduce the model structure of the
most popular convolutional neural networks. Then, three
different architectures, a plain-CNN, and two MSF-CNN
models (A and B), are proposed. The primary idea of the
network is to build a robust MSF-CNN-based feature
extraction to derive features from ECG signals. The network
would also be easily adaptable to multiple datasets by transfer
learning.
Figure 2. A complete normal heartbeat. A complete heartbeat is a section
of rhythm ranging from P onset to T offset (or U onset), consisting of P- A. CONVOLUTIONAL NEURAL NETWORK
wave, PR-interval, Q-wave, R-wave, S-wave, T-wave, QT-interval and U Convolutional neural networks (CNNs) are one of the most
wave. Each waveform corresponds to the physiological process of frequently used in the field of artificial neural networks [46].
cardiac excitement. The total duration of a heartbeat is approximately 0.8 Since AlexNet [47] won first place in the ImageNet
s. competition in 2012 by using a 7-layer CNN, CNN has been
of five types of ECG signals are considered in this work. As widely used in the fields of image classification, semantic
seen in Table 2, the number of samples in each category is segmentation, video recognition, and speech recognition and
different. The number of F signals is the lowest before data has also achieved great success. The standard architecture of
augmentation. Although unbalanced data distribution is more CNNs includes six parts: the convolutional layer, pooling
common in practical applications, the large difference in the layer, rectified linear activation function, batch normalization,
number of categories is not beneficial to train the network fully connected layer, and softmax function.
model. 1) CONVOLUTIONAL LAYER
Therefore, the data augmentation approaches are leveraged Each convolutional layer is composed of several
to balance the types of signals. Additionally, the unbalanced convolutional units, and all the parameters are optimized by
data distribution is modestly maintained in this paper. the back-propagation algorithm. The main function of the
Specifically, the number of segmentations in the N class convolution operation is to map the input to the hidden layer
remains invariable because they are the most adequate. The number of remaining classes (S, V, F, Q) is augmented to match the number in the N class. In this paper, three methods are leveraged to implement the data augmentation strategy. The first method is time shift augmentation, which randomly shifts the signal by rolling it along the time sequence. The second method is noise augmentation: we add random white noise with a damping coefficient of 0.4 to the original signal. The third method combines two signals proportionally to obtain new signals in the same category.

TABLE 2 THE DATA DISTRIBUTION OF HEARTBEATS IN THE MIT-BIH ARRHYTHMIA DATABASE

Classification | Number of instances (without augmentation) | Number of instances (with augmentation)
N (Normal) | 90,595 | 90,595
S (Supraventricular ectopic beat) | 2,781 | 55,620
V (Ventricular ectopic beat) | 7,235 | 72,350
F (Fusion beat) | 802 | 32,080
Q (Unknown beat) | 8,041 | 80,410
Total | 109,454 | 331,055

feature space so as to extract different features from the input signal. The shallow layers can only extract low-level local features such as edges, lines, and angles, while the deeper layers iteratively extract more detailed features from the outputs of the earlier layers. The convolution operation is computed by equation (1).

$$y_n = \sum_{k=0}^{N-1} x_k f_{n-k} \qquad (1)$$

where $x$ denotes the input signal, $f$ represents the convolution kernel, and $N$ is the number of elements in the input signal $x$. The output vector is denoted by $y$.

2) POOLING LAYER
The pooling layer, also known as down-sampling, aims to reduce the size of the feature maps, which lowers the computational cost by reducing the number of network parameters. The common pooling operations are max-pooling and average-pooling. Max-pooling outputs only the maximum value in each kernel window, thus reducing the size of the feature maps while retaining the local features. Average-pooling outputs the mean value in each kernel window, thus aggregating the global feature information. Both follow equation (2).

$$x_i = \max_{r \in R}\left[x_{i-1}(n \times s + r)\right] \quad \text{or} \quad x_i = \underset{r \in R}{\operatorname{mean}}\left[x_{i-1}(n \times s + r)\right] \qquad (2)$$

where max and mean denote max-pooling and average-pooling, respectively, $s$ is the stride, $n$ is the element index of a feature map, and $r$ indexes the positions within the pooling window $R$. In this study, max-pooling is implemented in the shallow layers, and mean-pooling is leveraged in the deep layers. Thus, this configuration retains both global and local features.
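Equations (1) and (2) can be illustrated with a minimal pure-Python sketch. The function names `conv1d` and `pool1d`, the zero treatment of kernel taps outside the kernel (which gives a "full" output), and the toy signal are assumptions made for illustration; they are not taken from the authors' implementation.

```python
def conv1d(x, f):
    """Discrete convolution y[n] = sum_k x[k] * f[n - k], as in equation (1).

    Assumption: kernel taps outside 0..len(f)-1 are treated as zero, which
    yields the 'full' output of length len(x) + len(f) - 1.
    """
    y = [0.0] * (len(x) + len(f) - 1)
    for n in range(len(y)):
        for k in range(len(x)):
            if 0 <= n - k < len(f):
                y[n] += x[k] * f[n - k]
    return y


def pool1d(x, size, stride, mode="max"):
    """Pooling as in equation (2): window n covers x[n*stride + r], r = 0..size-1."""
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")
    out = []
    for start in range(0, len(x) - size + 1, stride):
        window = x[start:start + size]
        out.append(max(window) if mode == "max" else sum(window) / size)
    return out


# Toy signal (hypothetical values, not ECG data).
signal = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]
smoothed = conv1d(signal, [0.5, 0.5])                          # two-tap averaging kernel
local_peaks = pool1d(signal, size=2, stride=2)                 # max-pooling
local_means = pool1d(signal, size=2, stride=2, mode="mean")    # average-pooling
```

Max-pooling keeps the strongest response in each window, while mean-pooling summarizes the whole window, which matches the local versus global distinction drawn above.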
3) RECTIFIED LINEAR ACTIVATION FUNCTION
The rectified linear activation function applies a nonlinear mapping to the output of the convolutional layer, realizing the nonlinear transformation between the input and output of the neuron. Nair et al. [48] have reported that faster convergence and higher accuracy can be obtained using ReLU. Hence, ReLU is used as the activation function in this paper. It converges quickly and mitigates the vanishing gradient. ReLU is computed by equation (3).

$$\operatorname{ReLU}(x) = \begin{cases} x, & x > 0 \\ 0, & x \le 0 \end{cases} \qquad (3)$$

4) BATCH NORMALIZATION
Training a CNN is complicated by the fact that the distribution of each layer's inputs changes during training, because the parameters of the previous layers change with each gradient update. This makes models very difficult to train and requires lower learning rates and careful parameter initialization. This phenomenon is called internal covariate shift. To overcome the problem, Ioffe et al. [49] proposed a method called Batch Normalization (BN), motivated by the fact that network training converges faster if its inputs are whitened (linearly transformed to have zero mean and unit variance).

5) FULLY CONNECTED LAYER
The fully connected layer plays the role of a classifier in the deep neural network. It implements a weighted sum of the features from the previous layers. The feature space is mapped to the sample label space by a linear transformation.

6) SOFTMAX FUNCTION
The softmax function is often used in the last layer of a convolutional neural network as the output layer for multi-class classification. It maps multiple scalars to a probability distribution in which each value lies in the range (0, 1), following equation (4).

$$\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}} \quad \text{for } j = 1, \dots, K \qquad (4)$$

The output of the softmax function is a $K$-dimensional vector, where $K$ is the number of classes. In this work, there are five classes (N, S, V, F, and Q).

B. RESIDUAL LEARNING NETWORK

[Figure 3 diagram: the input x passes through a convolution layer, ReLU, and a second convolution layer to give F(x); a shortcut connection carries x around these layers; the sum H(x) := F(x) + x is followed by ReLU.]
Figure 3. A building block of the residual learning network.

A residual learning network was first proposed in [10] for image classification, where it resolves the degradation problem of deep networks. The degradation problem appears as the network gets deeper: the accuracy saturates and then decreases rapidly with increasing network depth. The residual learning network is implemented by identity shortcut connections. As shown in Figure 3, a shortcut directly skips one or more convolutional layers, so that the output of the earlier layers is introduced into the input of the following layers. Introducing the residual learning block into one-dimensional signal analysis is also a vital innovation of this paper.

The main reason that the residual network addresses the degradation problem is that the identity shortcut connections let each group of stacked layers fit a residual mapping instead of directly fitting the desired underlying mapping. Formally, the desired underlying mapping is denoted $H(x)$, and we let the stacked nonlinear layers fit $F(x) := H(x) - x$. The original mapping is recast as $F(x) + x$, which is implemented by a feedforward neural network with shortcut connections (Figure 3). Thus, the residual network optimizes the residual function $F(x) := H(x) - x$ instead of $H(x)$. Although both forms of the objective function can approximate the required function in principle, the difficulty of optimization differs, as a large number of experiments have confirmed. If the optimal function is closer to the identity mapping than to the zero mapping, it is much easier for the solver to push the residual function toward zero than to fit an identity mapping with nonlinear layers.

In detail, the residual learning block is divided into two parts: identity mapping and residual mapping. As shown in Figure 4, the shortcut connection (the curve on the right) is the identity mapping, and $F(x)$ is the residual mapping, which is composed of two convolutional layers in our work. In the network model, the number of feature maps at the input and at the output may differ, so there are two representations of the residual learning block, given by equations (5) and (6).

$$H(x) = F(x) + x \qquad (5)$$
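The building block of Figure 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `residual_block`, `residual_fn`, and `shortcut_fn` are hypothetical names, the residual mapping F is passed in as an arbitrary function rather than as two trained convolution layers, and the optional `shortcut_fn` stands in for the case where the input and output feature-map counts differ.

```python
def relu(values):
    """Element-wise ReLU, as in equation (3)."""
    return [max(0.0, v) for v in values]


def residual_block(x, residual_fn, shortcut_fn=None):
    """Compute ReLU(F(x) + shortcut(x)).

    residual_fn plays the role of F(x); in the paper F consists of two
    convolution layers, here it is any callable (illustrative assumption).
    shortcut_fn is the identity by default, which gives H(x) = F(x) + x.
    """
    fx = residual_fn(x)
    sx = list(x) if shortcut_fn is None else shortcut_fn(x)
    if len(fx) != len(sx):
        raise ValueError("F(x) and the shortcut output must have the same length")
    return relu([a + b for a, b in zip(fx, sx)])


# If F(x) is driven to zero, the block reduces to the identity mapping
# (for non-negative inputs, because of the final ReLU).
x = [0.5, 1.0, 2.0]
identity_like = residual_block(x, lambda v: [0.0] * len(v))
```

The example shows why fitting an identity mapping is easy for such a block: the stacked layers only need to output zero.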


Equation (5) is the representation of residual learning when the number of feature maps at the input and at the output is the same. If the numbers differ, a 1 × 1 convolution is used to increase or decrease the dimension.

$$H(x) = F(x) + h(x) \qquad (6)$$

where $h(x)$ is a 1 × 1 convolution operation added in the shortcut connection.

In addition to solving the degradation problem by optimizing the residual function, residual learning can also effectively reduce gradient dispersion. When the network becomes deep, the gradient back propagation is as follows.

$$\frac{\partial Loss}{\partial x_1} = \frac{\partial F_N(X_{L_N}, W_{L_N}, b_{L_N})}{\partial X_L} * \cdots * \frac{\partial F_2(X_{L_2}, W_{L_2}, b_{L_2})}{\partial X_1} \qquad (7)$$

During the backpropagation of this gradient, if $N$ is large, the gradient value decreases as it propagates to the first few layers, and the gradient may vanish in a very deep neural network. However, residual learning solves this problem at the level of the network structure. When residual learning is used in the model, the gradient back propagation is as follows.

$$\frac{\partial Loss}{\partial x_1} = \frac{\partial \left(X_L + F(X_L, W_L, b_L)\right)}{\partial X_L} = 1 + \frac{\partial F_2(X_L, W_L, b_L)}{\partial X_L} \qquad (8)$$

Hence, even with deep network layers, gradient dispersion will be effectively contained.

C. THE PROPOSED NETWORK ARCHITECTURE
The design of the network mainly relies on the six computing units mentioned above. In this work, we design three network architectures (plain-CNN, MSF-CNN A, and MSF-CNN B) with highly modularized blocks, inspired by the idea of VGG, published as a conference paper at ICLR 2015 [50]. VGG is a mature deep neural network that has been proven to effectively solve various problems in the field of computer vision.

As shown in Figure 4 (a), the plain-CNN, a baseline network, is a simple CNN architecture used to verify the processing ability of a 1-D CNN for ECG signals. It includes three convolution layers, two fully connected layers, and the corresponding nonparametric layers (pooling layer, batch normalization layer, ReLU layer, and softmax layer). The input signals of set A and set B are fed directly into the convolution layer. Each of the first two convolution layers is followed by a max-pooling layer, a batch normalization (BN) layer, and a ReLU layer. The last convolution layer is followed by global average pooling. Each fully connected layer is followed by a BN layer, a ReLU layer, and a dropout layer. The plain-CNN is an ordinary multi-layer convolution network.

In addition, we propose a multi-scale fusion CNN architecture A (MSF-CNN A, in Figure 4 (b)) that integrates different spatial features by using one parallel group convolutional block (1×7, 1×5, and 1×3). The MSF-CNN A is an upgraded network based on the plain-CNN, used to verify the processing ability of three parallel convolution kernels for ECG signals. As shown in Figure 4 (b), the network mainly includes one parallel group convolutional block, three convolution layers, two max-pooling layers, one global average-pooling layer, two full convolutional layers, and the corresponding BN, ReLU, and dropout. The datasets are first divided into two subsets (set A and set B) according to the different lengths of the ECG signals and fed into three different parallel convolution kernels (1×7, 1×5, 1×3). The three outputs are then concatenated. This strategy enables the network to learn hierarchical feature information from different spaces and to obtain a more continuous and better representation. The concatenation is followed by the BN and ReLU layers: BN relieves overfitting, and ReLU increases nonlinear expression. The first two convolutional blocks contain a convolutional layer, max-pooling, BN, and ReLU, and the last convolutional block is connected to a global average-pooling layer. The two fully connected layers are followed by BN, ReLU, and dropout operations. The MSF-CNN A mainly introduces three parallel convolution kernels to fully extract features from set A and set B.

Finally, we design another multi-scale fusion CNN architecture B (MSF-CNN B, in Figure 4 (c)) based on the MSF-CNN A, which is inspired by VGGNets [50] and ResNet [10]. The MSF-CNN B is an upgraded network based on the MSF-CNN A, used to verify the processing ability of the concatenation group convolution blocks and residual learning blocks for ECG signals. The architecture includes one parallel group convolutional block (1×7, 1×5, and 1×3) as in the MSF-CNN A, 7 convolution layers, two residual learning blocks, two max-pooling layers, one global average pooling layer, and two fully connected layers. The parallel group convolution block is the same as in the MSF-CNN A. The difference between networks A and B is that two or three convolutional layers (named the concatenation group convolution block) are grouped together in the deep layers of MSF-CNN B, sharing the same number of filters, and the concatenation group convolution blocks are separated by max-pooling layers. Therefore, one parallel group convolutional block and two concatenation group convolutional blocks constitute the convolutional part of MSF-CNN B, and the global average pooling layer is behind the third concatenation group convolutional block. Most importantly, we implement the residual learning block to avoid the degradation problem described above. The concatenation group convolution blocks and residual learning blocks are a vital innovation of this model.

In training, the operation of the fully connected layer is replaced by a full convolutional layer in the network, since the output of the convolutional layer maintains the spatial locality between the feature signals and the input size of the ECG signals is then not limited. Additionally, this conversion greatly reduces the number of parameters that need to be trained, and it can also provide a better effect. The corresponding function is shown in equation (9).


[Figure 4 diagram: the ECG signal is pre-processed (denoising, data segmentation) into set A (input 1×250) and set B (input 1×360); each set feeds one of three layer stacks that end in a target class: (a) plain-CNN, (b) MSF-CNN A, (c) MSF-CNN B. The layer sequences are listed in Table 3.]

Figure 4. Example network architecture. (a): the plain network as a reference. (b): the MSF-CNN architecture A. (c): the MSF-CNN architecture B. Table 3 shows more details and other variants.
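The parallel group convolutional block in Figure 4 (b) and (c) applies 1×3, 1×5, and 1×7 kernels to the same input and concatenates the results. The sketch below illustrates only this data flow. It assumes zero padding so that all branches keep the input length (otherwise the outputs could not be stacked), it uses one fixed averaging kernel per branch instead of the 64 learned feature maps of the paper, and the names `conv1d_same` and `parallel_block` are hypothetical.

```python
def conv1d_same(x, kernel):
    """Slide an odd-length kernel over x with zero padding, keeping len(x) outputs."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(x))
    ]


def parallel_block(x, kernels):
    """Apply every kernel to the same input and stack the outputs as channels."""
    return [conv1d_same(x, k) for k in kernels]


# Illustrative averaging kernels of width 3, 5, and 7 (not trained weights).
kernels = [[1.0 / n] * n for n in (3, 5, 7)]
beat = [1.0] * 250                      # stand-in for one set A heartbeat
features = parallel_block(beat, kernels)  # 3 channels, each of length 250
```

Each branch sees the same heartbeat at a different temporal scale, and the concatenated channels are what the following BN and ReLU layers operate on.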


$$y_j = f\left(\sum_{i \in M} k_{ij} * x_i + b_j\right) \qquad (9)$$

where $x$ and $y$ are the input and output of the network, respectively. $M$ is the convolution kernel size, $j$ denotes the index of convolution kernels, and $i$ denotes the index of input feature maps. $k_{ij}$ describes the convolution kernel for the $i$-th input map and the $j$-th output map.

In the plain-CNN, the number of convolution kernels is 64 in the first convolutional layer and then increases by a factor of two after each max-pooling layer until it reaches 256. In the MSF-CNN A, the number of convolution kernels is also 64 in the parallel group convolutional block, as in the plain-CNN; however, it then increases by a factor of two after each max-pooling layer until it reaches 512. In the MSF-CNN B, the configuration of convolution kernels is the same as in MSF-CNN A: the number of convolution kernels is 64 in the parallel group convolutional block and then increases by a factor of two after each max-pooling layer until it reaches 512 in the concatenation group convolution blocks. The detailed configuration of the three network architectures evaluated in this paper is described in Table 3.

V. ABLATION EXPERIMENTS
In this section, we first briefly describe the implementation details of the experiment and then introduce the performance metrics of the three models. Finally, we carry out detailed experiments and a performance comparison. Additionally, we discuss the advantages and limitations of the proposed model.

A. IMPLEMENTATION DETAILS
The network is designed with a fixed input of 251 (set A) and 361 (set B) samples, and the output is the probability of five categories. The outline of the model is presented in Algorithm 1. Taking set B as an example, the pre-processed data (set B) is first divided into trainSet and testSet. Then, trainSet is divided into 10 equal parts for cross-validation. By comparing the results $r_t$ of the 10 cross-validation rounds, the model $m$ with the best performance is obtained. Finally, the testSet is loaded to evaluate the model.

The network model optimizes the cross-entropy function with the Adam optimizer, using a mini-batch size of 128 on 4 NVIDIA TITAN Xp GPUs. Adam is used in this paper to update the parameters of the proposed network structure. It has been observed to let the network converge at a fast rate, thus improving the efficiency of the training process. The mini-batch size of 128 is chosen to trade off two considerations: it results in a short convergence time by reducing the variance of training, and it gives the Adam optimizer more power to jump out of shallow minima in training. According to the experiments, the learning rate starts from 0.001 and is divided by 10 when the error plateaus. The decay rate is set to 0.0001. The initial momentum is 0.5, and it is gradually annealed to 0.9 over multiple epochs.

Algorithm 1 MSF-CNN B
Input:
    SetA/SetB is the dataset;
    10 is cross-validation times;
    T is test data;
    optim Algorithm is Adam;
    D is pre-trained model;
    N is heartbeat classes
Output:
    The predicted probability p(·);
1: (trainSet; testSet) ← split(SetA/SetB)
2: S ← (split trainSet in equal parts of 10)
3: for each round t = 1, 2, ..., 10 do
4:     {verify; train} ← {S_t; S – S_t}
5:     (tf; vf) ← (generate spam feature of train and verify)
6:     m_t ← modelFit(Adam; tf)
7:     r_t ← modelEvaluate(m_t; vf)
8: end for
9: m ← bestModel((m_t; r_t) | t = 1, 2, ..., 10)
10: test ← (generate spam feature of testSet)
11: res ← modelEvaluate(m; test)

In the fully connected layer, the dropout operation is adopted to reduce overfitting and improve generalization ability. Considering the one-dimensional signals and the number of neurons, the dropout parameter is set to 0.3. The cross-entropy loss function for the five-class classification problem is given by equation (10).

$$L(X, y) = \frac{1}{n} \sum_{i=1}^{n} \log p(y \mid x) \qquad (10)$$

where $X$ is the input ECG signal, $y$ is the ground truth of each input ECG signal, and $p(\cdot)$ is the predicted probability.

In addition, 10-fold cross-validation is leveraged to evaluate model performance. The original dataset is randomly divided into 10 equal-sized subsets. Nine subsets are used for training, and the remaining subset is used to test the proposed model. The process is repeated for each of the 10 rounds. The performance metrics (specificity, sensitivity, and accuracy) are evaluated in each epoch. Finally, the classification results of each validation are obtained and averaged to estimate the performance of the model on the whole dataset.

We find that gradient explosion and overfitting may occur in the comparative experiments. Therefore, to avoid these problems, regularization is introduced into our proposed model. In the experiment, the L2 norm of the model parameters (equation (11)) is used to relieve these problems. Specifically, the threshold is set to 0.5 to stabilize the training process.

$$l(x) = L(X) + \sigma \sum_{i=1}^{3} \lVert w_i \rVert_2 \qquad (11)$$

where $l(x)$ is the loss function with L2 regularization and $L(X)$ is the cross-entropy loss function from equation (10). $\sigma$ denotes a penalty factor, which balances the goal of achieving better training results against keeping the parameter values small. Thus, the regularization can avoid overfitting effectively by shrinking all the parameters.
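The regularized objective of equation (11) can be sketched as follows. Two points are assumptions rather than statements from the paper: the cross-entropy term uses the conventional negative log-likelihood sign, and the penalty uses the squared L2 norm of each layer's weights, which is the common form of L2 regularization. The function names and the example numbers are hypothetical.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class of each sample.

    Assumption: the usual negative sign is applied so that lower is better.
    """
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def l2_regularised_loss(probs, labels, layer_weights, penalty):
    """Cross-entropy plus penalty * sum of squared weights over all layers.

    Assumption: the squared L2 norm is used for each layer's weights.
    """
    l2_term = sum(w * w for layer in layer_weights for w in layer)
    return cross_entropy(probs, labels) + penalty * l2_term


# Two samples, five classes (N, S, V, F, Q); all values are made up.
probs = [[0.7, 0.1, 0.1, 0.05, 0.05], [0.2, 0.6, 0.1, 0.05, 0.05]]
labels = [0, 1]
weights = [[0.5, -0.5], [1.0], [0.1, 0.2]]   # three layers, as in equation (11)
loss = l2_regularised_loss(probs, labels, weights, penalty=0.01)
```

A larger penalty factor pulls the weights toward zero at the expense of the fit to the training data, which is the balance described above.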


TABLE 3 THE DETAILED CONFIGURATION OF THE PROPOSED NETWORKS

Layers | Plain-CNN network | MSF-CNN architecture A | MSF-CNN architecture B
1 | Conv (kernel size 1×3, feature map 64); Max-pooling (stride 1) | Convolution block (kernel size 1×3, 1×5, 1×7, feature map 64); Concatenation | Convolution block (kernel size 1×3, 1×5, 1×7, feature map 64); Concatenation
2 | Batch Normalization; ReLU | Batch Normalization; ReLU | Batch Normalization; ReLU
3 | Conv (kernel size 1×3, feature map 128); Max-pooling (stride 1) | Conv (kernel size 1×3, feature map 128); Max-pooling (stride 1) | Conv (kernel size 1×3, feature map 128)
4 | Batch Normalization; ReLU | Batch Normalization; ReLU | Conv (kernel size 1×3, feature map 128); Max-pooling (stride 1)
5 | Conv (kernel size 1×5, feature map 256); Global average pooling (stride 2) | Conv (kernel size 1×3, feature map 256); Max-pooling (stride 1) | Batch Normalization; ReLU
6 | Batch Normalization; ReLU | Batch Normalization; ReLU | Conv (kernel size 1×3, feature map 256)
7 | Fully connected layer (512); Batch Normalization | Conv (kernel size 1×5, feature map 512); Global average pooling (stride 2) | Conv (kernel size 1×3, feature map 256); Max-pooling (stride 1)
8 | ReLU; Dropout (0.3) | Batch Normalization; ReLU | Batch Normalization; ReLU
9 | Fully connected layer (1024); Batch Normalization | Fully connected layer (1024); Batch Normalization | Conv (kernel size 1×3, feature map 512)
10 | ReLU; Dropout (0.3) | ReLU; Dropout (0.3) | Conv (kernel size 1×3, feature map 512); Global average pooling (stride 2)
11 | Softmax (5 classes) | Fully connected layer (1024); Batch Normalization | Batch Normalization; ReLU
12 | - | ReLU; Dropout (0.3) | Fully connected layer (1024); Batch Normalization
13 | - | Softmax (5 classes) | ReLU; Dropout (0.3)
14 | - | - | Fully connected layer (1024); Batch Normalization
15 | - | - | ReLU; Dropout (0.3)
16 | - | - | Softmax (5 classes)

$w_i$ describes the weight of the $i$-th layer.

B. EVALUATION METRICS
For the evaluation, four standard metrics, namely accuracy, sensitivity (also known as recall), specificity (also known as the true negative rate), and the confusion matrix, are used to evaluate the classification performance of the plain-CNN, MSF-CNN A, and MSF-CNN B. Accuracy is defined as the ratio of the number of correct predictions (positive samples classified as positive and negative samples classified as negative) to the total number of predictions. Sensitivity is the proportion of all positive cases that are correctly identified, which measures the model's ability to detect positives accurately. Specificity is the proportion of all negative cases that are correctly identified, which measures the model's ability to detect negatives accurately. Sensitivity and specificity are two commonly used judgment standards in medical classification tasks. These metrics are defined in equations (12), (13), and (14):

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \qquad (12)$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (13)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (14)$$

TP (true positive) refers to the number of samples that are correctly identified as positive, TN (true negative) refers to the number of samples that are correctly identified as negative, FP (false positive) refers to the number of negative samples that are mistaken for positive samples, and FN (false negative) refers to the number of positive samples that are mistaken for negative samples. Because of the large differences between categories, sensitivity and specificity are more relevant performance criteria in arrhythmia detection than accuracy.

In addition, the confusion matrix is leveraged to validate the performance of the proposed model; it is an important standard for judging the performance of a multi-classification model. In the confusion matrix, the greater the number of true positive and true negative cases, the better the model's performance. Likewise, the fewer false positive


examples and false negative examples, the better the overall performance of the model is.

C. PERFORMANCE COMPARISON AND DISCUSSION
In this section, we implement six groups of ablation experiments to analyze the performance of the model. First, we carry out a set of experiments to compare the effects of different signal lengths (set A and set B) on our models' performance. Second, we show how the performance changes when data augmentation is used in the training process. Third, we conduct a set of experiments to demonstrate the effect of denoising in the pre-processing of the data. Fourth, we design an experiment to verify the effect of the residual learning network. Fifth, a convergence analysis experiment validates our models' convergence ability. Finally, the confusion matrix is used to analyze the performance on each class of signals. The detailed discussion of the six groups of experiments is as follows.

1) SET A VS. SET B
In the first phase, we design a set of experiments to verify the effect of set A and set B on the three models. Every heartbeat includes 251 samples in set A and 361 samples in set B. Figure 5 presents the performance trends of the two datasets on the three models. According to Figure 5, the accuracy curves of the three models (plain-CNN, MSF-CNN A, MSF-CNN B) indicate that the accuracy on set B is slightly better than on set A, mainly because each heartbeat from set B includes more samples than in set A, so the models can learn more abundant feature information. The overall average classification performances (accuracy, sensitivity, and specificity) for set A and set B on the three models are shown in Table 4. In set A, the average accuracies of the three networks are 83.15%, 86.40%, and 89.17%, respectively. The result of MSF-CNN A is 3.25% higher than that of the plain-CNN in set A. Additionally, the result of MSF-CNN B without residual learning is 2.77% higher than that of MSF-CNN A in set A. In set B, the performances of the three models also differ by 4.42% and 2.78%, respectively. In addition, a sensitivity and specificity of 75.90% and 87.64% are obtained in this experiment from set B. These are lower than the metrics from the plain-CNN network and MSF-CNN A in set B without residual learning. However, they are higher than the metrics from the three models in set A. Data imbalance may lead to this problem: in Table 2, the number of instances of each category without data augmentation is quite different. Overall, the results also suggest that the parallel group convolutional block in MSF-CNN A and B and the concatenation group convolution block in MSF-CNN B without residual learning have an important effect on the performance improvement of the proposed models. In theory, longer ECG records cover more heartbeat rhythm information, which will lead to better classification performance. Thus, in the following experiments, we use the data from set B for the ablation experiment analysis.

Figure 5. The accuracy plot of set A and set B. (a): the result of the plain-CNN. (b): the result of MSF-CNN A. (c): the result of MSF-CNN B.

TABLE 4. THE AVERAGE CLASSIFICATION RESULTS FOR SET A AND SET B ON THE PROPOSED THREE MODELS

Network | Set A (251 samples) Acc. (%) | Set A Se. (%) | Set A Sp. (%) | Set B (361 samples) Acc. (%) | Set B Se. (%) | Set B Sp. (%)
Plain-CNN | 83.15 | 65.14 | 85.08 | 85.23 | 87.41 | 79.50
MSF-CNN A | 86.40 | 77.69 | 81.54 | 89.65 | 83.96 | 88.67
MSF-CNN B (w/o residual learning) | 89.17 | 68.79 | 74.63 | 92.43 | 75.90 | 87.64

2) DATA AUGMENTATION VS. WITHOUT DATA AUGMENTATION
In the second phase, we set up a set of experiments to analyze the impact of data augmentation on the model. The data used in this experiment are from set B. The data augmentation strategy is implemented in accordance with the description in Section III.B, and the total number of heartbeats increases to 331,055 after data augmentation (shown in Table 2). In Figure 6, we compare the performances of the three proposed network architectures with and without data augmentation on set B. As seen in Figure 6, the models with data augmentation perform dramatically better than the models without data augmentation. Table 5 shows detailed evaluation metrics of the model predictions. The average accuracies on set B are 92.81%, 95.48%, and 95.96% with data augmentation on the three models. These results are 7.58%, 5.83%, and 3.53% higher than those of the three models without data augmentation. In addition, with data augmentation, MSF-CNN B without residual learning reaches a sensitivity and specificity of 96.58% and 92.67%, respectively. It is better than the metrics from the plain-CNN and MSF-CNN A with data augmentation. Additionally, the performances are superior to the results of the three models without data augmentation. The experiment confirms that data augmentation dramatically improves the classification performance of ECG signals, which is also beneficial to data balancing in the dataset. Therefore, we adopt set B with data augmentation to perform the following experiments.

Figure 6. The accuracy plot of set B. (a): the result of the plain-CNN. (b): the result of MSF-CNN A. (c): the result of MSF-CNN B (w/o residual learning).

TABLE 5. THE AVERAGE CLASSIFICATION RESULTS FOR SET B WITH DATA AUGMENTATION ON THE PROPOSED THREE MODELS

Set B, 361 samples. "Without" and "With" refer to data augmentation.

Network | Without: Acc. (%) | Without: Se. (%) | Without: Sp. (%) | With: Acc. (%) | With: Se. (%) | With: Sp. (%)
Plain-CNN | 85.23 | 87.41 | 79.50 | 92.81 | 95.84 | 93.92
MSF-CNN A | 89.65 | 83.96 | 88.67 | 95.48 | 96.53 | 87.74
MSF-CNN B (w/o residual learning) | 92.43 | 75.90 | 87.64 | 95.96 | 96.58 | 92.67
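The three augmentation methods behind the augmented set B (time shift by rolling the signal, additive white noise with a damping coefficient of 0.4, and proportional mixing of two signals of the same class) can be sketched as below. This is a minimal illustration, not the authors' code: the function names are hypothetical, the damping coefficient is interpreted as a scale factor on unit-variance Gaussian noise, and the mixing ratio of 0.5 is an arbitrary example value.

```python
import random


def time_shift(signal, shift):
    """Circularly roll the signal by `shift` samples (positive shifts move right)."""
    shift %= len(signal)
    return list(signal[-shift:]) + list(signal[:-shift]) if shift else list(signal)


def add_white_noise(signal, damping=0.4, rng=random):
    """Add Gaussian white noise scaled by `damping`.

    Assumption: the damping coefficient multiplies unit-variance noise.
    """
    return [s + damping * rng.gauss(0.0, 1.0) for s in signal]


def mix(signal_a, signal_b, ratio=0.5):
    """Blend two same-class signals: ratio * a + (1 - ratio) * b."""
    return [ratio * a + (1.0 - ratio) * b for a, b in zip(signal_a, signal_b)]


rng = random.Random(0)                      # seeded for repeatability
beat = [0.0, 0.1, 1.0, 0.2, 0.0, -0.1]      # toy heartbeat, not real ECG data
shifted = time_shift(beat, rng.randrange(len(beat)))
noisy = add_white_noise(beat, damping=0.4, rng=rng)
blended = mix(beat, shifted, ratio=0.5)
```

Applying such transforms only to the minority classes (S, V, F, Q) is what brings their counts closer to the N class in Table 2.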

3) DENOISING VS. WITHOUT DENOISING
In this experiment, we analyze the impact of denoising on the model. The data used in this experiment are from set B with data augmentation. As shown in Figure 7, the models with denoising perform slightly better than the models without denoising. The detailed classification measures are reported in Table 6. The average accuracies on set B are 93.41%, 96.38%, and 97.03% with denoising on the three models without residual learning, respectively. These results are 0.60%, 0.90%, and 1.07% higher than those of the three models without denoising. Moreover, compared with all the other models, very high sensitivity (94.43%) and specificity (96.41%) are obtained in this experiment. It is necessary to emphasize that the data augmentation strategy is

Figure 7. The accuracy plot of set B. (a): the result of the plain-CNN. (b): the result of MSF-CNN A. (c): the result of MSF-CNN B (w/o residual learning).

TABLE 6 THE AVERAGE CLASSIFICATION RESULTS FOR SET B WITH DATA AUGMENTATION AND DENOISING ON THE PROPOSED THREE MODELS

Set B, 361 samples, with augmentation. "Without" and "With" refer to denoising.

Network | Without: Acc. (%) | Without: Se. (%) | Without: Sp. (%) | With: Acc. (%) | With: Se. (%) | With: Sp. (%)
Plain-CNN | 92.81 | 95.84 | 93.92 | 93.41 | 87.16 | 89.73
MSF-CNN A | 95.48 | 96.53 | 87.74 | 96.38 | 91.82 | 92.58
MSF-CNN B (w/o residual learning) | 95.96 | 96.58 | 92.67 | 97.03 | 94.43 | 96.41
MSF-CNN B (w/ residual learning) | - | - | - | 98.00 | 96.17 | 96.38

implemented in this experiment. It is clear that the denoising technique has an influence on the performance of the models.

4) RESIDUAL LEARNING VS. WITHOUT RESIDUAL LEARNING
Next, we evaluate the effect of the residual learning block on MSF-CNN B with augmentation and denoising on set B. The baseline network is the same MSF-CNN B as above, without the residual learning block. The MSF-CNN B with residual learning adds a shortcut connection to each pair of 1×3 convolutional layers, as in Figure 4 (c). We make two major observations from Table 6 (the last row) and Figure 8. First, accuracy improves with residual learning: the MSF-CNN B with residual learning is better than the same network without it (by 0.97%). Most importantly, sensitivity and specificity are also high and stable. This indicates that the residual learning block dramatically enhances the optimization efficiency by providing faster convergence at the early stage.

Figure 8. The accuracy plot of set B with denoising. The blue solid line denotes MSF-CNN B without residual learning, and the red solid line denotes MSF-CNN B with residual learning.

5) CONVERGENCE ANALYSIS
Then, we obtain the loss details during the training and validation processes. Figure 9 illustrates the loss curve of set B on MSF-CNN B without the residual learning block, and Figure 10 shows the result of set B on MSF-CNN B with the residual learning block. As shown in Figures 9 and 10, the convergence of the model with residual learning is better than that of the model without residual learning. In addition, these results show that the model converges after between 60 and 100 epochs during training and between 80 and 100 epochs during validation. Hence, 100 epochs are used in this experiment to ensure full convergence of the model and reduce overfitting. Moreover, the model with residual learning converges faster.

Figure 9. Training and validation loss function of set B on MSF-CNN B without residual learning over the epochs.

Figure 10. Training and validation loss function of set B on MSF-CNN B with residual learning over the epochs.

6) CONFUSION MATRIX ANALYSIS
Finally, in addition to evaluating the per-class performance of the model with the residual learning block, we also assessed confusion matrices of ECG heartbeats (Tables 7 and 8). They show the accuracy, sensitivity, and specificity of each class. Table 7 shows the confusion matrix from the MSF-CNN B without a residual learning block. Table 8 shows the confusion matrix from the model with the residual learning block. According to Table 7, on average less than 1.12% of the ECG heartbeats are wrongly classified across all 10 folds when the model does not use a residual learning block. Likewise, for the model with the residual learning block, less than 1.00% of the ECG heartbeats are wrongly classified across all 10 folds. The minimal sensitivities recorded for both models are attributed to the detection of class F and are 92.25% and 92.32%, respectively. The minimal specificity for the model without the residual learning block is attributed to the detection of class Q and is 95.33%. The minimal specificity for the model with the residual learning block is 96.81% and is attributed to the detection of class V. The results also demonstrate that the residual learning block has a positive impact on the performance of the model.

TABLE 7. A CONFUSION MATRIX OF ECG HEARTBEATS WITHOUT RESIDUAL LEARNING BLOCK ACROSS ALL 10 FOLDS

True \ Predicted | N | S | V | F | Q | Acc (%) | Sen (%) | Spe (%)
N | 87904 | 511 | 1036 | 926 | 218 | 98.49 | 97.03 | 98.05
S | 36 | 54579 | 427 | 105 | 473 | 99.98 | 98.13 | 97.24
V | 678 | 236 | 69815 | 1231 | 390 | 98.42 | 96.50 | 96.96
F | 803 | 176 | 924 | 29617 | 560 | 98.44 | 92.25 | 99.11
Q | 721 | 93 | 243 | 367 | 78986 | 99.05 | 98.23 | 95.33

Acc=accuracy, Sen=sensitivity, Spe=specificity.

TABLE 8. A CONFUSION MATRIX OF ECG HEARTBEATS WITH RESIDUAL LEARNING BLOCK ACROSS ALL 10 FOLDS

True \ Predicted | N | S | V | F | Q | Acc (%) | Sen (%) | Spe (%)
N | 88837 | 267 | 48 | 526 | 917 | 99.46 | 98.06 | 97.68
S | 43 | 54930 | 26 | 497 | 124 | 97.52 | 98.76 | 99.68
V | 544 | 103 | 70903 | 267 | 533 | 99.41 | 98.00 | 96.81
F | 97 | 233 | 307 | 31438 | 5 | 99.32 | 92.32 | 99.46
Q | 86 | 279 | 108 | 299 | 79638 | 99.28 | 99.04 | 98.37

Acc=accuracy, Sen=sensitivity, Spe=specificity.
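
As an illustration of how per-class figures of this kind can be derived from a confusion matrix, the following is a minimal sketch, not the authors' evaluation code. It assumes the standard one-vs-rest definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = (TP+TN)/total), with rows as true classes and columns as predicted classes, and applies them to the pooled counts of Table 8. The function name `per_class_metrics` is hypothetical. This section does not state how the reported values were averaged over the 10 folds, so the output of this sketch need not reproduce the Acc, Sen, and Spe columns exactly.

```python
def per_class_metrics(cm):
    """Return one-vs-rest accuracy, sensitivity, and specificity per class.

    cm[i][j] is the number of beats with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        tn = total - tp - fn - fp
        results.append({
            "acc": (tp + tn) / total,
            "sen": tp / (tp + fn),
            "spe": tn / (tn + fp),
        })
    return results


# Pooled counts from Table 8 (rows: true N, S, V, F, Q; columns: predicted).
TABLE_8 = [
    [88837, 267, 48, 526, 917],
    [43, 54930, 26, 497, 124],
    [544, 103, 70903, 267, 533],
    [97, 233, 307, 31438, 5],
    [86, 279, 108, 299, 79638],
]
LABELS = ["N", "S", "V", "F", "Q"]

metrics = per_class_metrics(TABLE_8)
for label, m in zip(LABELS, metrics):
    print(f"{label}: acc={m['acc']:.4f} sen={m['sen']:.4f} spe={m['spe']:.4f}")
```

Averaging the per-class values gives macro-averaged metrics; other conventions, such as per-fold averaging, would give slightly different numbers.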

Recent advances and representative techniques for arrhythmia classification are summarized in Table 9; these works also yield high-performance results. However, compared to these recent advances, the benefits of our proposed MSF-CNN B are as follows:
(1) Compared with most of the literature, the evaluation metrics reported for our proposed model (accuracy, sensitivity, specificity, and confusion matrix) are comprehensive, and the model outperforms most of the recent advances. Moreover, our proposed MSF-CNN is an end-to-end deep learning model, which replaces the additional hand-crafted feature extraction used in traditional machine learning.
(2) Even though the performance of our model is slightly lower than that of [61], our proposed model deals with a multi-class classification problem, rather than the two-class problem studied in [61].
(3) We implemented 10-fold cross-validation for the proposed models, thus improving the robustness of the reported results.
In addition, even though the average accuracy reported in [59] is higher than that of our model, the performance metrics of our paper (accuracy, sensitivity, and specificity) are more comprehensive than those of [59] (accuracy only). The STFT-based spectrogram deep learning method of [59] also provides a new idea for future work. Furthermore, the CNN and the RNN (recurrent neural network) are two popular deep learning methods for processing time-series data. Although the performance of the LSTM-based auto-encoder network in [62] is superior to that of our models, our model is more lightweight and less computationally expensive. The LSTM is a replacement for the traditional RNN. It is a bidirectional model, which extracts bidirectional information from the forward model and the backward model at the same time. This advantage undoubtedly comes at a considerable computational cost. Most importantly, we think the LSTM-based auto-encoder (AE) network [62] is a promising strategy that can effectively extract the characteristic information of time-series signals. We will fully consider the optimization methods of [62] in our future work.

VI. CONCLUSION AND FUTURE WORK

In this study, three end-to-end network models, including a plain-CNN and two MSF-CNN architectures (A and B), are presented to automatically identify and classify the five
different types of ECG heartbeats. The plain-CNN is a baseline network with multiple convolution layers; it is a simple CNN architecture used to verify the processing ability of a 1-D CNN for ECG signals. The MSF-CNN A is proposed to improve the learning ability of the plain-CNN. It is an upgraded network based on the baseline network that verifies the processing ability of three parallel convolution kernels for ECG signals by adding a parallel group convolution block (including three different convolution kernels of sizes 1×7, 1×5, and 1×3). Finally, the MSF-CNN B improves on the MSF-CNN A by implementing a residual learning block with three concatenation group convolution blocks to promote the performance of the model. It is an upgraded network based on the MSF-CNN A that verifies the processing ability of the concatenation group convolution blocks and residual learning blocks for ECG signals.
The three proposed models are trained and tested with the public MIT-BIH arrhythmia database on five types of signals: N, S, V, F, and Q. Six groups of ablation experiments are also conducted to analyze the performance of these models. The best model, MSF-CNN B with residual learning and group convolution blocks (including the parallel and concatenation group convolution blocks), achieves an average accuracy, sensitivity, and specificity of 98.00%, 96.17%, and 96.38%, respectively, in set B. In addition, the strategies of multi-scale data, data augmentation, and denoising also have an important effect on the training of the three models in our experiments.
Therefore, our proposed deep neural network algorithm (MSF-CNN B) shows the potential of a deep learning-based approach for feature extraction on the MIT-BIH arrhythmia database. As is evident from these results, the proposed approach is an efficient automatic cardiac arrhythmia classification method and provides a reliable recognition system based on well-established CNN architectures instead of training a deep CNN from scratch. It has the potential to provide accurate ECG signal classification in clinical practice.
In future work, we would like to introduce more clinical diagnosis data to test the proposed model. Additionally, the temporal (heartbeat) and spatial (spectrogram) signal features will be combined to improve the performance metrics of the models. We would also like to determine the severity grades of patients with chronic heart diseases by the detection and classification of ECG signals, which may represent normal, abnormal, and potentially life-threatening cardiac electrical activity conditions.
Specifically, compared with the self-organizing structural size methods [63-65], it is difficult to quickly determine the optimal structure of a deep convolutional neural network for a specific application. Hence, we will propose a new method that combines self-organizing maps and convolutional neural networks for ECG signal research in future work. Moreover, we will try our best to propose a new method that combines the optimization approaches [66-68] and convolutional neural networks for ECG signal research in the future. This new method will focus on the following aspects:
(1) Real-world constraints must be considered in the new model. We will apply the theoretical research results to a specific field or a specific product.
(2) It is worthwhile to design an adaptive parameter system to improve the robustness of the optimization model.
(3) We will consider the imbalanced data classification problem and sufficient prior knowledge. The dendritic neuron model [69] and the evolutionary cost-sensitive approach [70] will provide new ideas for future work.
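
The multi-scale idea recapped in this section (three parallel 1-D convolution kernels of widths 7, 5, and 3 whose outputs are concatenated, combined with a residual shortcut) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation. The fixed moving-average kernels, the zero "same" padding, the 1×1 fusion weights, and the helper names `conv1d_same`, `relu`, and `multi_scale_residual_block` are all assumptions chosen for illustration; the actual models use learned filters with many channels.

```python
def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def multi_scale_residual_block(signal, kernels, fuse_weights):
    # Parallel branches: one feature map per kernel width (7, 5, and 3 below).
    branches = [relu(conv1d_same(signal, k)) for k in kernels]
    # The list `branches` plays the role of channel-wise concatenation; a 1x1
    # convolution across channels reduces it back to a single channel.
    fused = [
        sum(w * branch[t] for w, branch in zip(fuse_weights, branches))
        for t in range(len(signal))
    ]
    # Residual shortcut: add the block input to the fused output.
    return relu([x + f for x, f in zip(signal, fused)])


# Hypothetical moving-average kernels stand in for learned filters.
kernels = [[1 / 7] * 7, [1 / 5] * 5, [1 / 3] * 3]
fuse_weights = [1 / 3, 1 / 3, 1 / 3]

beat = [0.0, 0.1, 0.0, -0.2, 1.0, -0.3, 0.0, 0.2, 0.1, 0.0]  # toy heartbeat segment
output = multi_scale_residual_block(beat, kernels, fuse_weights)
print(len(output))  # prints 10
```

With all fusion weights set to zero, the block reduces to the shortcut path alone. This is the property that lets a residual block fall back to an identity-like mapping when the convolution branches contribute nothing.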


TABLE 9. A SUMMARY OF SELECTED WORKS FOR AUTOMATIC ARRHYTHMIA CLASSIFICATION OF ECG SIGNALS FROM THE DATABASE OF MIT-BIH ARRHYTHMIA

Literature and Time | Main Work | Database | Approach | Accuracy (Correct recognition rate) (%) | Sensitivity (Recall) (%) | Specificity (positive predictivity) (%)
2017 [51] | Five classification (N, S, V, F, Q) | MIT-BIH arrhythmia database | ML and DL: MFSWT; DNN | 97.50 | / | /
2018 [52] | Two classification (S, V) | MIT-BIH arrhythmia database | ML: PSO optimized least-square twin SVM | 89.90 | 80.80(S), 82.20(V) | 96.70(S), 99.00(V)
2019 [53] | Ten classification | Chinese Cardiovascular Disease Database | ML: PPNN | 74.16 | 75.23 | 73.92
2019 [54] | The ventricular ectopic beat detection | MIT-BIH arrhythmia database | DL: 1D-CNN | 95.50 | 85.80 | 64.50
2019 [55] | Five classification (N, S, V, F, Q) | MIT-BIH arrhythmia database | DL: DRNNs based on BGRU | 98.40 | / | /
2019 [56] | Seven classification | Chinese Cardiovascular Disease Database | DL: Parallel GRU RNN | 95.98 | / | /
2019 [57] | Six classification (Normal, L, R, V, A, P) | MIT-BIH arrhythmia database | ML: KNN | 97.70 | / | /
2019 [58] | Two-classification | MIT-BIH arrhythmia database | DL: CNN | 94.70 | 77.30(S), 93.70(V) | 97.70(S), 98.80(V)
2019 [59] | Five classification | MIT-BIH arrhythmia database | DL: CNN of STFT-Based Spectrogram | 99.00 | / | /
2020 [60] | Multi-classification | Chinese Cardiovascular Disease Database | DL: MTGBi-LSTM | 88.86 | 94.19 | /
2020 [61] | Two-classification | Personal Wearable Devices | DL: LSTM-RNN | 99.20(VEB), 98.30(SVEB) | 93.00(VEB), 66.90(SVEB) | 99.80(VEB), 99.80(SVEB)
2020 [62] | Five classification | MIT-BIH arrhythmia database | ML and DL: LSTM, SVM | 99.45 | 98.63 | 99.66
This paper | Five classification (N, S, V, F, Q) | MIT-BIH arrhythmia database | DL: Plain-CNN | 93.41 | 87.61 | 89.73
This paper | Five classification (N, S, V, F, Q) | MIT-BIH arrhythmia database | DL: MSF-CNN A | 96.38 | 91.82 | 92.58
This paper | Five classification (N, S, V, F, Q) | MIT-BIH arrhythmia database | DL: MSF-CNN B | 98.00 | 96.17 | 96.38

Abbreviations:
Heartbeat types: S: Supraventricular ectopic beat; V: Ventricular ectopic beat; F: Fusion beat; Q: Unknown beat; N: any heartbeat not in the S, V, F, Q classes or normal beat; PVC: Premature ventricular contraction beat; PAC: Premature atrial contraction beat; L: Left bundle branch block; R: Right bundle branch block; V: Premature ventricular contractions; A: Atrial premature beats; P: Paced beats; VEB: Ventricular ectopic beats; SVEB: Supraventricular ectopic beats.
Approaches: ML: Machine learning; DL: Deep learning; SVM: Support vector machine; DNN: Deep neural network; CNN: Convolutional neural network; MFSWT: Modified frequency slice wavelet transform; PSO: Particle swarm optimization; PPNN: Probabilistic process neural network; DRNNs: Deep recurrent neural networks; BGRU: Bidirectional gated recurrent unit; KNN: k-Nearest Neighbor; MTG: Multi-Task Group.

REFERENCES
[1] National Heart Lung and Blood Institute, Types of Arrhythmias, 2011. [Online]. Available: https://www.nhlbi.nih.gov/health/health-topics/topics/arr/types. (Accessed 5 July 2017).
[2] U.R. Acharya, J.S. Suri, J.A.E. Spaan, S.M. Krishnan, Advances in Cardiac Signal Processing, 2007.
[3] Arrhythmia irregular heartbeat center, Heart Disease and Abnormal Heart Rhythm (Arrhythmia), 2017. [Online]. Available: https://www.medicinenet.com/arrhythmia_irregular_heartbeat/article.htm.
[4] S.M. Mathews, K. Chandra, K.E. Barner, “A novel application of deep learning for single-lead ECG classification,” Computers in Biology and Medicine, vol. 99, pp. 53-62, Jun. 2018.
[5] S. Preejith, R. Dhinesh, J. Joseph, and M. Sivaprakasam, “Wearable ECG platform for continuous cardiac monitoring,” in Engineering in Medicine and Biology Society (EMBC), 2016 IEEE 38th Annual International Conference of the. IEEE, pp. 623-626, Oct. 2016.
[6] B. Murugesan, V. Ravichandran, and K. Ram, “ECGNet: Deep Network for Arrhythmia Classification,” IEEE Instrumentation and Measurement Society, pp. 623-626, Jun. 2018.
[7] American National Standards Institute, Testing and Reporting Performance Results of Cardiac Rhythm and ST Segment Measurement Algorithms, 2012.
[8] R.J. Martis, U.R. Acharya, H. Adeli, “Current methods in electrocardiogram characterization,” Comput. Biol. Med., vol. 48, no. 1, pp. 133-149, May 2014.
[9] U.R. Acharya, S.L. Oh, Y. Hagiwara, “A Deep Convolutional Neural Network Model to Classify Heartbeats,” Computers in Biology and Medicine, vol. 89, no. 1, pp. 389-396, Oct. 2017.
[10] K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun, “Deep Residual Learning for Image Recognition,” in CVPR, 2017.


[11] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. Presented at CVPR, 2015. [Online]. Available: arxiv.org/abs/1409.1556.
[12] Gao Huang, Zhuang Liu, Kilian Q. Weinberger. Densely Connected Convolutional Networks. Presented at CVPR, 2016. [Online]. Available: arxiv.org/abs/1608.06993.
[13] G. Cai, Y. Wang, L. He, and M. Zhou, “Unsupervised Domain Adaptation with Adversarial Residual Transform Networks,” IEEE Transactions on Neural Networks and Learning Systems, DOI: TNNLS.2019.2935384, online 2019.
[14] T. D. Pham, K. Wardell, A. Eklund and G. Salerud, “Classification of short time series in early Parkinson's disease with deep learning of fuzzy recurrence plots,” IEEE/CAA Journal of Automatica Sinica, vol. 6, no. 6, pp. 1306-1317, November 2019.
[15] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Presented at CVPR, 2015. [Online]. Available: arXiv:1506.01497.
[16] J. Dai, Y. Li, K. He, and J. Sun. R-FCN: Object Detection via Region-based Fully Convolutional Networks. Presented at NIPS, 2016. [Online]. Available: arXiv:1605.06409.
[17] Y. Tian, X. Li, K. Wang and F. Wang, “Training and testing object detectors with virtual images,” IEEE/CAA Journal of Automatica Sinica, vol. 5, no. 2, pp. 539-546, Mar. 2018.
[18] Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., Wei, Y. Deformable Convolutional Networks. Presented at CVPR, 2017. [Online]. Available: arXiv:1703.06211.
[19] Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam. Rethinking Atrous Convolution for Semantic Image Segmentation. Presented at CVPR, 2017. [Online]. Available: arXiv:1706.05587.
[20] Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia. Pyramid Scene Parsing Network. Presented at CVPR, 2017. [Online]. Available: arXiv:1612.01105, 2017.
[21] K. He, G. Gkioxari, P. Dollar, and R. Girshick. Mask R-CNN. Presented at ICCV, 2017. [Online]. Available: arXiv:1703.06870.
[22] S.K. Berkaya, A. K. Uysal, E.S. Gunal, “A survey on ECG analysis,” Biomedical Signal Processing and Control, vol. 43, pp. 216-235, May 2018.
[23] M.A. Awal, S.S. Mostafa, M. Ahmad, M.A. Rashid, “An adaptive level dependent wavelet thresholding for ECG denoising,” Biocybern. Biomed. Eng., vol. 34, no. 4, pp. 238-249, Mar. 2014.
[24] L.Q. Wang, “Research on ECG Waveform Detection and Arrhythmia Classification[D],” Hebei University of Technology, Tianjin, 2014.
[25] N. Sen, C. Chandrakar, “Development of a Novel ECG signal Denoising System Using Extended Kalman Filter,” IJAREEIE, vol. 3, pp. 6896-6901, 2014.
[26] B.S. Gayal, F.I. Shaikh, “Denoising of ECG signal using undecimated wavelet transform,” IJAREEIE, vol. 3, pp. 7200-7208, 2014.
[27] R. Rodrigues, P. Couto, “A Neural Network Approach to ECG Denoising,” Available online: arXiv:1212.5217, 2012.
[28] Md. A. Kabir, C. Shahnaz, “Denoising of ECG signals based on noise reduction algorithms in EMD and wavelet domains,” Biomedical Signal Processing and Control, vol. 7, pp. 481-489, 2012.
[29] Y.C. Yeh, W.J. Wang, C.W. Chiou, “Cardiac arrhythmia diagnosis method using linear discriminant analysis on ECG signals,” Measurement, vol. 42, no. 5, pp. 778-789, Jun. 2009.
[30] Y.C. Yeh, W.J. Wang, C.W. Chiou, “Feature selection algorithm for ECG signals using Range-Overlaps Method,” Expert Syst. Appl., vol. 37, no. 4, pp. 3499-3512, Apr. 2010.
[31] V.X. Afonso, W.J. Tompkins, T.Q. Nguyen, S. Luo, “ECG beat detection using filter banks,” IEEE Trans. Biomed. Eng., vol. 46, pp. 192-202, 1999.
[32] B. Abibullaev, H.D. Seo, “A new QRS detection method using wavelets and artificial neural networks,” J. Med. Syst., vol. 35, pp. 683-691, 2011.
[33] M. Korurek, B. Dogan, “ECG beat classification using particle swarm optimization and radial basis function neural network,” Expert Syst. Appl., vol. 37, pp. 7563-7569, 2010.
[34] A. Martínez, R. Alcaraz, J.J. Rieta, “Application of the phasor transform for automatic delineation of single-lead ECG fiducial points,” Physiol. Meas., vol. 31, pp. 1467-1485, 2010.
[35] Y. Kutlu, D. Kuntalp, “Feature extraction for ECG heartbeats using higher order statistics of WPD coefficients,” Comput. Method Program Biomed., vol. 105, no. 3, pp. 257-267, 2012.
[36] E. J. da S. Luz, W. R. Schwartz, G. C. Chavez, et al., “ECG-based heartbeat classification for arrhythmia detection: A survey,” Computer Methods and Programs in Biomedicine, vol. 127, pp. 144-164, 2015.
[37] A. M. Alqudah, “An enhanced method for real-time modelling of cardiac related biosignals using Gaussian mixtures,” Journal of medical engineering & technology, vol. 41, no. 8, pp. 600-611, Oct. 2017.
[38] T.Y. Li, and M. Zhou, “ECG Classification Using Wavelet Packet Entropy and Random Forests,” Entropy, vol. 18, no. 8, pp. 285-300, Aug. 2016.
[39] A. M. Alqudah, I. Abuqasmieh, A. Badarneh and H. Alquran, “Developing of robust and high accurate ECG beat classification by combining Gaussian mixtures and wavelets features,” Australasian physical & engineering sciences in medicine, vol. 42, no. 1, pp. 149-157, Jan. 2019.
[40] M. Hammad, A. Maher, K. Q. Wang, F. Jiang, and M. Amrani, “Detection of Abnormal Heart Conditions Based on Characteristics of ECG Signals,” Measurement, vol. 125, pp. 634-644, Sep. 2018.
[41] S. Kiranyaz, T. Ince, M. Gabbouj, “Real-time patient-specific ECG classification by 1-D convolutional neural networks,” IEEE Transactions on Biomedical Engineering, vol. 63, no. 3, pp. 664-675, Mar. 2016.
[42] T. J. Jun, H. J. Park, Y. H. Kim, “Premature ventricular contraction beat detection with deep neural networks,” in 15th IEEE International Conference on Machine Learning and Applications, pp. 859-864, 2016.
[43] U. R. Acharya, S. L. Oh, Y. Hagiwara, et al., “A Deep Convolutional Neural Network Model to Classify Heartbeats,” Computers in Biology and Medicine, vol. 89, pp. 389-396, 2017.
[44] H. Alquran, A. M. Alqudah, I. Abu-Qasmieh, Al-Badarneh, S. Almashaqbeh, “ECG classification using higher order spectral estimation and deep learning techniques,” Neural Network World, vol. 29, no. 4, pp. 207-219, Aug. 2019.
[45] A.L. Goldberger, “PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals,” Circulation 101 (23) (2000) e215-e220.
[46] J. Schmidhuber, “Deep Learning in neural networks: an overview,” Neural Netw., vol. 61, pp. 85-117, Jan. 2015.
[47] A. Krizhevsky, I. Sutskever, Geoffrey E. Hinton, “ImageNet Classification with Deep Convolutional Neural Network,” NIPS, Curran Associates Inc., 2012.
[48] V. Nair and G. E. Hinton, “Rectified linear units improve restricted boltzmann machines,” Proceedings of the 27th international conference on machine learning (ICML-10), pp. 807-814, 2010.
[49] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,” in CVPR, 2015.
[50] K. Simonyan, A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in ICLR, 2015.
[51] K. Luo, J. Li, Z. Wang, A. Cuschieri, “Patient-specific deep architectural model for ECG classification,” J. Healthcare Eng., vol. 2017, May 2017.
[52] S. Raj, K. C. Ray, “Sparse representation of ECG signals for automated recognition of cardiac arrhythmias,” Expert Systems with Applications, vol. 105, pp. 49-64, Sep. 2018.
[53] N. D. Feng, S. H. Xu, Y. Q. Liang, K. Liu, “A Probabilistic Process Neural Network and Its Application in ECG Classification,” IEEE Access, vol. 7, pp. 50431-50439, Apr. 2019.
[54] A. A. S. León, J. R. N. Alvarez, “1D Convolutional Neural Network for Detecting Ventricular Heartbeats,” IEEE Latin America Transactions, vol. 17, no. 12, pp. 1970-1977, Dec. 2019.
[55] H. M. Lynn, S. B. Pan, P. Kim, “A Deep Bidirectional GRU Network Model for Biometric Electrocardiogram Classification Based on Recurrent Neural Networks,” IEEE Access, vol. 7, pp. 145395-145405, Sep. 2019.
[56] S. H. Xu, J. J. Li, K. Liu, L. Wu, “A Parallel GRU Recurrent Network Model and Its Application to Multi-Channel Time-Varying Signal Classification,” IEEE Access, vol. 7, pp. 118739-118748, Sep. 2019.
[57] H. Yang, Z. Q. Wei, “Arrhythmia Recognition and Classification Using Combined Parametric and Visual Pattern Features of ECG Morphology,” IEEE Access, vol. 8, pp. 47103-47117, Mar. 2019.
[58] S. S. S. Xu, M. W. Mak, C. C. Cheung, “Towards End-to-End ECG Classification with Raw Signal Extraction and Deep Neural Networks,” IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 4, pp. 1574-1584, Jul. 2019.
[59] J. Huang, B. Chen, B. Yao and W. He, “ECG Arrhythmia Classification Using STFT-Based Spectrogram and Convolutional Neural Network,” IEEE Access, vol. 7, pp. 92871-92880, 2019.
[60] Q. J. Lv, H. Y. Chen, W. B. Zhong, Y. Y. Wang, J. Y. Song, S. D. Guo, L. X. Qi, C. Y.C. Chen, “A Multi-Task Group Bi-LSTM Networks
Application on Electrocardiogram Classification,” IEEE Journal of Translational Engineering in Health and Medicine, vol. 8, pp. 1900111-1900121, Feb. 2020.
[61] S. Saadatnejad, M. Oveisi, M. Hashemi, “LSTM-Based ECG Classification for Continuous Monitoring on Personal Wearable Devices,” IEEE Journal of Biomedical and Health Informatics, vol. 24, no. 2, pp. 515-523, Feb. 2020.
[62] B. Hou, J. Yang, P. Wang and R. Yan, “LSTM-Based Auto-Encoder Model for ECG Arrhythmias Classification,” IEEE Transactions on Instrumentation and Measurement, vol. 69, no. 4, pp. 1232-1240, April 2020.
[63] G. M. Wang, J. F. Qiao, J. Bi, W. J. Li, M. C. Zhou, “TL-GDBN: Growing deep belief network with transfer learning,” IEEE Transactions on Automation Science and Engineering, vol. 16, no. 2, pp. 874-885, 2019.
[64] W. A. Khan, S. H. Chung, H. L. Ma, et al., “A novel self-organizing constructive neural network for estimating aircraft trip fuel consumption,” Transportation Research Part E: Logistics and Transportation Review, vol. 132, pp. 72-96, 2019.
[65] E. J. Palomo, E. López-Rubio, “The growing hierarchical neural gas self-organizing neural network,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 9, pp. 2000-2009, 2016.
[66] Y. Yu, S. Gao, Y. Wang, and Y. Todo, “Global optimum-based search differential evolution,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 2, pp. 379-394, Mar. 2019.
[67] Q. Kang, X. Song, M. Zhou, and L. Li, “A Collaborative Resource Allocation Strategy for Decomposition-Based Multiobjective Evolutionary Algorithms,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 49, no. 12, pp. 2416-2423, Dec. 2019.
[68] K. Z. Gao, Z. G. Cao, L. Zhang, Z. H. Chen, Y. Y. Han, and Q. K. Pan, “A review on swarm intelligence and evolutionary algorithms for solving flexible job shop scheduling problems,” IEEE/CAA J. Autom. Sinica, vol. 6, no. 4, pp. 875-887, July 2019.
[69] S. Gao, M. Zhou, Y. Wang, J. Cheng, H. Yachi, and J. Wang, “Dendritic neuron model with effective learning algorithms for classification, approximation and prediction,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 2, pp. 601-614, Feb. 2019.
[70] G. S. Hong, “A Cost-Sensitive Deep Belief Network for Imbalanced Classification,” in IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 1, pp. 109-122, Jan. 2019.

HAO DANG received the M.S. degree in pattern recognition and intelligent systems from the Henan University of Technology, Zhengzhou, China, in 2016. He is currently pursuing the Ph.D. degree in control science and engineering with the School of Automation, Beijing University of Posts and Telecommunications, Beijing, China. His research interests include pattern recognition, intelligent systems, machine learning, and so on.

YARU YUE received the M.S. degree in detection technology and automatic equipment from the Beijing Information Science and Technology University, Beijing, China, in 2018. He is currently pursuing the Ph.D. degree in pattern recognition and intelligent systems with the School of Automation, Beijing University of Posts and Telecommunications, Beijing, China. His research interests include machine learning, computer vision, neural networks, and so on.

DANQUN XIONG received the B.S. degree from the Department of Clinical Medicine, Nanchang University, Nanchang, China, in 2013, and the M.S. degree in internal medicine from the School of Medicine, Tongji University, Shanghai, China, in 2016. He is currently an attending doctor in the Department of Cardiology of Jiading District Central Hospital Affiliated Shanghai University of Medical and Health Sciences. His research focuses on the detection and diagnosis of arrhythmia.

XIANGDONG XU received the B.S. degree from the Department of Clinical Medicine, Nanchang University, Nanchang, China, in 1997. He is currently a Professor, a Master Supervisor, and the Director of the Department of Cardiology of Jiading District Central Hospital Affiliated Shanghai University of Medical and Health Sciences. He has published more than 20 articles. His research focuses on the usage and challenges of innovative technology in general practice medicine, such as machine learning, sequencing, and big data.

XIAOGUANG ZHOU received the M.S. degree from the Department of Precision Instrument, Tsinghua University, in 1984, and the Ph.D. degree in engineering from the Tokyo University of Agriculture and Technology, Japan. He was a Visiting Professor with the Tokyo University of Agriculture and Technology from 2001 to 2002, and a JSPS Researcher with Tokyo University from 2013 to 2014. He is currently a Professor and a Doctoral Supervisor with the School of Automation, Beijing University of Posts and Telecommunications. He also serves as the Director of the Engineering Research Center of Information Networks, Ministry of Education. He is the author of over 10 books, over 100 articles, and over 16 inventions. His research interests include control theory and its application in engineering, deep learning, computer vision, Internet of Things and automated logistics systems, and mechatronics technology. He is a permanent member of the Chinese Association of Automation/Manufacturing Technology Committee and the China Institute of Communications/Equipment Manufacturing Technical Committee.

XINGXIANG TAO received the M.S. degree in logistics engineering from the School of Information, Beijing Wuzi University, Beijing, China, in 2017. He is currently pursuing the Ph.D. degree in control science and engineering with the School of Automation, Beijing University of Posts and Telecommunications, Beijing, China. His research interests include machine learning, pattern recognition, and intelligent systems.


This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

13 pages
Automated Arrhythmia Detection From Electrocardiogram Signal Using Stacked Restricted Boltzmann Machine Model
No ratings yet
Automated Arrhythmia Detection From Electrocardiogram Signal Using Stacked Restricted Boltzmann Machine Model
10 pages
A-novel-deep-neural-network-heartbeats-classi_2023_International-Journal-of-
No ratings yet
A-novel-deep-neural-network-heartbeats-classi_2023_International-Journal-of-
10 pages
ECG Heartbeat Classification: A Deep Transferable Representation
No ratings yet
ECG Heartbeat Classification: A Deep Transferable Representation
5 pages
2206.14200v3
No ratings yet
2206.14200v3
14 pages
Bilal Scopus
No ratings yet
Bilal Scopus
15 pages