Obstructive Sleep Apnea Detection Based On Electrocardiogram Signal Using One-Dimensional Convolutional Neural Network
Rahmat Widadi¹, Achmad Rizal², Sugondo Hadiyoso³, Hilman Fauzi², Ziani Said⁴
¹ Department of Biomedical Engineering, Faculty of Telecommunication and Electrical Engineering, Institut Teknologi Telkom
Purwokerto, Purwokerto, Indonesia
² School of Electrical Engineering, Telkom University, Bandung, Indonesia
³ School of Applied Science, Telkom University, Bandung, Indonesia
⁴ Department of Health Technologies Engineering, Research Group in Biomedical Engineering and Pharmaceutical Sciences, ENSAM,
Mohammed V University, Rabat, Morocco
Corresponding Author:
Sugondo Hadiyoso
School of Applied Science, Telkom University
Bandung, Indonesia
Email: [email protected]
1. INTRODUCTION
Obstructive sleep apnea (OSA) is a sleep disorder [1]. OSA occurs when the intake of air stops for
10 seconds or more during inspiration in sleep [1]. The obstruction arises because the body's muscles
relax, causing the airway to collapse [2]. When the airway closes, the patient wakes up during sleep or
experiences a sudden sleep transition. OSA is associated with several diseases, including hypertension
and cardiovascular diseases such as heart failure or myocardial ischemia [3].
Several methods are used to diagnose OSA, depending on the signal observed. Polysomnography (PSG) is
one of the standards for OSA observation. Its disadvantage is that it is time-consuming, costly, and
impractical, so it is performed only on patients with a specific level of severity [4]. Video processing
can detect OSA from patient movement [5]. Speech processing methods can analyze snoring to detect OSA
[6]. Speech signal analysis techniques are also frequently applied to respiratory signals to analyze
sleep apnea [7]. Voice activity detection (VAD) is a popular algorithm for detecting the presence or
absence of respiration during sleep [8].
Numerous studies have examined the relationship between electrocardiogram (ECG) signals and
respiration, known as electrocardiogram-derived respiration (EDR) [9]. Respiration influences ECG signals
through several mechanisms. One is the change in thorax impedance resulting from changes in lung volume
[10]. Another is the alteration of the heart vector caused by shifts or changes in heart orientation
relative to the ECG electrodes [11], [12].
OSA classification using single-lead ECG has been explored by many researchers using various
methods. The OSA ECG classification process has two essential parts: feature extraction and
classification. For feature extraction, heart rate variability (HRV) parameters are among the top
choices for many researchers because they can be extracted directly from the ECG signal. The median, mean,
skewness (third moment), kurtosis, minimum, and range of the R-R intervals are calculated to obtain the
HRV parameters. Another parameter derived from the R-R interval is the standard deviation of successive
differences between adjacent R–R intervals (SDSD).
The number of differences between successive R-R intervals that exceed a specific time, such as 50 ms
or 20 ms, is also extracted, giving parameters such as NN50, NN20, pNN50, and pNN20 [13]. Features can
also be taken from the frequency domain. The frequency components usually represent the activity of
thermoregulation mechanisms, sympathetic activity, and parasympathetic activity [11]. The associated
features include the very low-frequency (VLF) band, low-frequency (LF) band, and high-frequency (HF)
band. Another approach decomposes the signal first and then extracts features from the components.
For example, Zarei and Asl [14] used wavelet transforms and entropy for OSA ECG classification. Another
study used multiscale entropy as a feature [15]. This method is based on multiscale signal complexity,
which is considered to be related to OSA ECG.
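The time-domain HRV parameters mentioned above (SDSD, NN50, NN20, pNN50, pNN20, and basic statistics) can be illustrated with a minimal sketch. It assumes the R-R intervals have already been detected and are given in milliseconds; the function name `hrv_time_features`, the use of the sample standard deviation, and the example intervals are illustrative assumptions, not taken from the cited studies.

```python
import math


def hrv_time_features(rr_ms):
    """Time-domain HRV features from a list of R-R intervals in milliseconds."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three R-R intervals")
    # Successive differences between adjacent R-R intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    mean_diff = sum(diffs) / len(diffs)
    # SDSD: standard deviation of the successive differences
    # (sample standard deviation is an assumption made here).
    sdsd = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (len(diffs) - 1))
    # NN50 / NN20: number of successive differences larger than 50 ms / 20 ms.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    nn20 = sum(1 for d in diffs if abs(d) > 20)
    return {
        "mean": sum(rr_ms) / len(rr_ms),
        "min": min(rr_ms),
        "range": max(rr_ms) - min(rr_ms),
        "sdsd": sdsd,
        "nn50": nn50,
        "nn20": nn20,
        # pNN50 / pNN20: the same counts as a percentage of all differences.
        "pnn50": 100.0 * nn50 / len(diffs),
        "pnn20": 100.0 * nn20 / len(diffs),
    }


# Made-up R-R intervals, for illustration only.
features = hrv_time_features([800, 830, 790, 860, 855, 800])
print(features)
```

Frequency-domain features (VLF, LF, HF) would need a spectral estimate of the R-R series and are outside this sketch.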
The next stage in OSA ECG classification is the classifier. The support vector machine
(SVM) is one of the most widely used machine learning classifiers [16]. More recently, deep learning has
become a common choice because it can perform feature extraction automatically [17]. Even so, some
researchers still combine feature extraction with deep learning. Singh and Majumder [18] used a scalogram
and a convolutional neural network (CNN) to reach an accuracy of 86.2%. Bahrami and Forouzanfar [19] used
long short-term memory (LSTM) and obtained 80.67% accuracy, 75.04% sensitivity, and 84.13% specificity.
Meanwhile, another group of researchers using LSTM reported the highest accuracy of 99.8% with the RR
interval as a feature [20]. The RR interval combined with a multiscale dilation attention (MSDA)
one-dimensional convolutional neural network (1D-CNN) resulted in a highest accuracy of 89.4% [21]. The
studies described above fall into two groups: those that use a feature extraction method and those that
do not. They also differ in how the data from the dataset are processed or cut. To assess the performance
of deep learning for OSA classification from single-lead ECG without feature extraction, it is necessary
to apply deep learning directly, without a feature extraction process or further trimming of the ECG
signal.
In this study, normal and OSA ECG signals were classified using a 1D-CNN. Long ECG recordings
were cut into one-minute segments and used as input to the 1D-CNN. No feature extraction was carried
out; the clipped ECG signal was used directly. The CNN was optimized to find the configuration that
produces the highest accuracy. The proposed method can be an alternative for ECG signal processing,
especially for OSA signal classification.
2.1. Dataset
This study used the PhysioNet ECG sleep apnea dataset [22], [23]. There are 70 records, with 35
belonging to the training set and 35 to the testing set. Each recording consists of a digital ECG signal
annotated by an expert [24]. The ECG recordings range from seven to ten hours in length. Some recordings
also contain additional signals such as respiration and oxygen saturation, but these were not used in
this study. The ECG signal is cut into segments of 6,000 samples. Each segment corresponds to one minute
of signal because the sampling frequency is 100 Hz. This process produces 17,010 segments for training
and testing the proposed method. Normal and OSA ECG signals are shown in Figure 2. The OSA ECG signal in
Figure 2(b) has a different orientation in the spike area compared with the normal ECG signal in
Figure 2(a).
Figure 2. Normal and OSA ECG signals: (a) normal ECG, one-minute recording and (b) OSA event in the ECG
of the same subject
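The segmentation described in this subsection can be sketched as follows. The 100 Hz sampling frequency and the 6,000-sample segment length come from the text; the function name `segment_ecg`, the choice to drop a trailing partial segment, and the dummy recording are assumptions made for illustration.

```python
FS_HZ = 100               # sampling frequency stated in the text
SEGMENT_LEN = 60 * FS_HZ  # one minute of signal = 6,000 samples


def segment_ecg(signal, segment_len=SEGMENT_LEN):
    """Split a 1-D ECG recording into consecutive, non-overlapping segments.

    A trailing partial segment is dropped (an assumption, not stated in the text).
    """
    n_segments = len(signal) // segment_len
    return [signal[i * segment_len:(i + 1) * segment_len] for i in range(n_segments)]


# Hypothetical 2.5-minute recording of zeros, used only to show the shapes.
recording = [0.0] * (2 * SEGMENT_LEN + 3000)
segments = segment_ecg(recording)
print(len(segments), len(segments[0]))
```

Each resulting segment would then be paired with the expert annotation for the same minute before being passed to the network.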
The output of the convolution process is passed to the pooling process to reduce the signal size [27].
Two types of pooling are used in this study: average pooling and max pooling. Average pooling takes the
average value of the signal within a window of length 𝑓. This window, or filter, is shifted along the
signal with a step of 𝑠 (the stride). Max pooling works in the same way, except that the maximum value
of the window is taken instead of the average. Figure 4 shows an example of the max pooling process.
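The two pooling operations can be illustrated with a short sketch on a single one-dimensional signal, using the window length f and stride s from the description above. The function name `pool1d` and the example values are assumptions for illustration, and windows that would run past the end of the signal are simply skipped.

```python
def pool1d(signal, f, s, mode="max"):
    """Pool a 1-D signal with window length f and stride s."""
    if mode not in ("max", "average"):
        raise ValueError("mode must be 'max' or 'average'")
    pooled = []
    # Slide the window over the signal in steps of s; incomplete windows are skipped.
    for start in range(0, len(signal) - f + 1, s):
        window = signal[start:start + f]
        if mode == "max":
            pooled.append(max(window))
        else:
            pooled.append(sum(window) / f)
    return pooled


x = [1, 3, 2, 8, 4, 6]
max_pooled = pool1d(x, f=2, s=2, mode="max")      # [3, 8, 6]
avg_pooled = pool1d(x, f=2, s=2, mode="average")  # [2.0, 5.0, 5.0]
print(max_pooled, avg_pooled)
```

With f = 2 and s = 2 the signal length is halved in both cases; only the value kept from each window differs.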
The next operation after the convolution layers is flattening. The output of the pooling operation has
more than one dimension. To be used as input for the MLP stage, it needs to be converted into one
dimension. Figure 5 illustrates the flatten process.
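As a small illustration of flattening, the sketch below treats the pooled output as a list of feature maps, one per filter, and concatenates them into a single vector. The function name `flatten`, the example values, and the filter-by-filter ordering are assumptions; deep learning frameworks may order the flattened values differently.

```python
def flatten(feature_maps):
    """Concatenate a list of 1-D feature maps into one 1-D vector."""
    return [value for feature_map in feature_maps for value in feature_map]


# Two hypothetical pooled feature maps (two filters, three values each).
pooled = [[3, 8, 6], [1, 0, 2]]
flat = flatten(pooled)  # [3, 8, 6, 1, 0, 2]
print(flat)
```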
The flattened signal can be used as the MLP input. The MLP is also known as a fully connected
layer, because each neuron in one layer is connected to all neurons in the next layer. Figure 6 shows
an example of an MLP with an input layer, one hidden layer, and an output layer. There are seven
neurons: 𝑥1 and 𝑥2 in the input layer, ℎ1, ℎ2, and ℎ3 in the hidden layer, and 𝑜1 and 𝑜2 in the output
layer. The output of the neurons in the hidden layer is determined using (2).
$y = f(w^{T} x)$ (2)
The inner product of the weight vector w and the input x is passed to the activation function f, which
makes the output non-linear. The activation function used in this study is ReLU, defined in (3) [28].
$y = \max(0, w^{T} x)$ (3)
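Equations (2) and (3) can be traced through a small network with the same shape as the example in the text (two inputs, three hidden neurons, two outputs). This is only a sketch: the weight values are made up, biases are omitted because (2) has none, and applying ReLU at the output layer as well is an assumption.

```python
def relu(z):
    """ReLU activation: max(0, z), as in (3)."""
    return max(0.0, z)


def dense(inputs, weights, activation):
    """Fully connected layer: each row of `weights` is the weight vector of one neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs))) for row in weights]


# Hypothetical weights for a 2-3-2 network.
W_hidden = [[0.5, -1.0], [1.0, 1.0], [-0.5, 0.25]]   # h1, h2, h3
W_output = [[1.0, 0.5, -1.0], [-1.0, 1.0, 2.0]]      # o1, o2

x = [2.0, 1.0]
h = dense(x, W_hidden, relu)   # [0.0, 3.0, 0.0]
o = dense(h, W_output, relu)   # [1.5, 3.0]
print(h, o)
```

The third hidden neuron shows the effect of ReLU: its weighted sum is -0.75, so its output is clipped to 0.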
Figure 7. Training and validation loss on the 1D-CNN using a filter length of 7, 15 filters, and max
pooling (horizontal axis: epoch)
The classification performance of the model is calculated from the confusion matrix of the testing
stage. The 1D-CNN was evaluated using 3,402 test segments. The model evaluated here uses max pooling
with a filter length of 7 and 15 filters, at the 50th epoch. The confusion matrix obtained from the
evaluation is shown in Figure 8.
Furthermore, calculating the accuracy from the values of the confusion matrix gives a system
classification accuracy of 90.98%.
$\mathrm{Accuracy} = \dfrac{1{,}127 + 1{,}968}{1{,}127 + 1{,}968 + 166 + 141} \times 100\% = 90.98\%$
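The accuracy calculation can be reproduced from the four confusion-matrix counts used in the equation above. In this sketch the two larger counts are treated as correctly classified segments and the two smaller ones as misclassified segments; which class each count belongs to is not assumed, and the function name `accuracy_percent` is illustrative.

```python
def accuracy_percent(correct_counts, error_counts):
    """Accuracy in percent: correctly classified samples over all samples."""
    correct = sum(correct_counts)
    total = correct + sum(error_counts)
    return 100.0 * correct / total


# Counts taken from the confusion matrix reported in the text.
acc = accuracy_percent(correct_counts=[1127, 1968], error_counts=[166, 141])
print(f"{acc:.2f}%")
```

Sensitivity and specificity could be computed the same way once the counts are assigned to the OSA and normal classes according to Figure 8.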
The accuracy obtained with max pooling is considered good. The accuracy results for max pooling
are given in Table 1. The highest average accuracy by filter length is 90%, at a filter length of 9.
The highest average accuracy by number of filters is 88.50%, at 20 filters. For individual combinations
of filter length and number of filters with max pooling, the highest accuracy is 91%, obtained at
filter lengths of 7 and 10 with 15 and 20 filters.
Figure 1. Training and validation loss on the 1D-CNN using a filter length of 9, 20 filters, and average
pooling (axes: loss versus epoch)
The advantage of this method is that it does not require a separate feature extraction process, since
a CNN can perform feature extraction automatically. The accuracy depends on the configuration of the
designed CNN. The weakness of using a CNN is that it requires a sufficiently large dataset for training.
In addition, it requires repeated optimization to find the configuration that produces the highest
accuracy. Experiments with other deep learning methods are an interesting topic for future research.
4. CONCLUSION
Based on the system performance measured with two types of pooling, max pooling and average pooling,
the average pooling type produces better performance. This is evidenced by the accuracy across filter
settings, the loss across epochs, and the classification accuracy from the confusion matrix, all of
which show that the system with average pooling is better than the one with max pooling. In terms of
accuracy and the ability of the system to classify data, average pooling provides higher values by
1.69% and 2.1%, respectively. In addition, although both systems show a similar trend in which the loss
decreases as the number of epochs increases, the gap between the training and validation loss is smaller
in the system with average pooling than in the system with max pooling. This shows that a system with
average pooling, a filter length of nine, and 20 filters at the 50th epoch can detect OSA through ECG
signal processing with excellent accuracy and stability.
Obstructive sleep apnea detection based on electrocardiogram signal … (Rahmat Widadi)
4136 ISSN: 2252-8938
REFERENCES
[1] J. R. Tietjens et al., “Obstructive sleep apnea in cardiovascular disease: A review of the literature and proposed multidisciplinary
clinical management strategy,” Journal of the American Heart Association, vol. 8, no. 1, 2019, doi: 10.1161/JAHA.118.010440.
[2] J. S. -Soler, B. F. Giraldo, J. A. Fiz, and R. Jane, “Relationship between heart rate excursion and apnea duration in patients with
obstructive sleep apnea,” Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology
Society, EMBS, pp. 1539–1542, 2017, doi: 10.1109/EMBC.2017.8037129.
[3] A. Rizal, F. D. A. A. Siregar, and H. T. Fauzi, “Obstructive sleep apnea (OSA) classification based on heart rate variability (HRV)
on electrocardiogram (ECG) signal using support vector machine (SVM),” Traitement du Signal, vol. 39, no. 2, pp. 469–474, 2022,
doi: 10.18280/ts.390208.
[4] O. Faust, U. R. Acharya, E. Y. K. Ng, and H. Fujita, “A review of ECG-based diagnosis support systems for obstructive sleep
apnea,” Journal of Mechanics in Medicine and Biology, vol. 16, no. 1, 2016, doi: 10.1142/S0219519416400042.
[5] N. Febriana, A. Rizal, and E. Susanto, “Sleep monitoring system based on body posture movement using Microsoft Kinect sensor,”
AIP Conference Proceedings, vol. 2092, 2019, doi: 10.1063/1.5096680.
[6] O. Elisha, A. Tarasiuk, and Y. Zigel, “Detection of obstructive sleep apnea using speech signal analysis,” Models and Analysis of
Vocal Emissions for Biomedical Applications - 7th International Workshop, MAVEBA 2011, pp. 13–16, 2011.
[7] N. Selvaraj and R. Narasimhan, “Detection of sleep apnea on a per-second basis using respiratory signals,” Annual International
Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pp. 2124–2127, 2013, doi: 10.1109/EMBC.2013.6609953.
[8] H. Huang and F. Lin, “A speech feature extraction method using complexity measure for voice activity detection in WGN,” Speech
Communication, vol. 51, no. 9, pp. 714–723, 2009, doi: 10.1016/j.specom.2009.02.004.
[9] C. Varon, A. Caicedo, D. Testelmans, B. Buyse, and S. V. Huffel, “A novel algorithm for the automatic detection of sleep apnea from single-
lead ECG,” IEEE Transactions on Biomedical Engineering, vol. 62, no. 9, pp. 2269–2278, 2015, doi: 10.1109/TBME.2015.2422378.
[10] P. Langley, E. J. Bowers, and A. Murray, “Principal component analysis as a tool for analyzing beat-to-beat changes in ECG
features: Application to ECG-derived respiration,” IEEE Transactions on Biomedical Engineering, vol. 57, no. 4, pp. 821–829,
2010, doi: 10.1109/TBME.2009.2018297.
[11] M. Bahrami and M. Forouzanfar, “Sleep apnea detection from single-lead ECG: a comprehensive analysis of machine learning and
deep learning algorithms,” IEEE Transactions on Instrumentation and Measurement, vol. 71, 2022, doi: 10.1109/TIM.2022.3151947.
[12] R. P. -Areny, J. C. -Balagué, and F. J. Rosell, “The effect of respiration-induced heart movements on the ECG,” IEEE Transactions
on Biomedical Engineering, vol. 36, no. 6, pp. 585–590, 1989, doi: 10.1109/10.29452.
[13] A. P. Razi, Z. Einalou, and M. Manthouri, “Sleep apnea classification using random forest via ECG,” Sleep and Vigilance, vol. 5,
no. 1, pp. 141–146, 2021, doi: 10.1007/s41782-021-00138-4.
[14] A. Zarei and B. M. Asl, “Automatic detection of obstructive sleep apnea using wavelet transform and entropy-based features from
single-lead ECG signal,” IEEE Journal of Biomedical and Health Informatics, vol. 23, no. 3, pp. 1011–1021, 2019, doi:
10.1109/JBHI.2018.2842919.
[15] A. Rizal, U. R. Iman, and H. Fauzi, “Classification of sleep apnea using multi scale entropy on electrocardiogram signal,” International
Journal of Online and Biomedical Engineering (iJOE), vol. 17, no. 14, pp. 79–89, Dec. 2021, doi: 10.3991/ijoe.v17i14.25905.
[16] L. Almazaydeh, K. Elleithy, and M. Faezipour, “Obstructive sleep apnea detection using SVM-based classification of ECG signal
features,” Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS,
pp. 4938–4941, 2012, doi: 10.1109/EMBC.2012.6347100.
[17] Jondri and A. Rizal, “Classification of premature ventricular contraction (PVC) based on ECG signal using convolutional neural network
(CNN),” Indonesian Journal of Electrical Engineering and Informatics, vol. 8, no. 3, pp. 494–499, 2020, doi: 10.11591/ijeei.v8i3.1530.
[18] S. A. Singh and S. Majumder, “A novel approach OSA detection using single-lead ECG scalogram based on deep neural network,”
Journal of Mechanics in Medicine and Biology, vol. 19, no. 4, 2019, doi: 10.1142/S021951941950026X.
[19] M. Bahrami and M. Forouzanfar, “Detection of sleep apnea from single-lead ECG: Comparison of deep learning algorithms,” 2021 IEEE
International Symposium on Medical Measurements and Applications, MeMeA 2021, 2021, doi: 10.1109/MeMeA52024.2021.9478745.
[20] O. Faust, R. Barika, A. Shenfield, E. J. Ciaccio, and U. R. Acharya, “Accurate detection of sleep apnea with long short-term memory
network based on RR interval signals,” Knowledge-Based Systems, vol. 212, 2021, doi: 10.1016/j.knosys.2020.106591.
[21] Q. Shen, H. Qin, K. Wei, and G. Liu, “Multiscale deep neural network for obstructive sleep apnea detection using RR interval from
single-lead ECG signal,” IEEE Transactions on Instrumentation and Measurement, vol. 70, 2021, doi: 10.1109/TIM.2021.3062414.
[22] T. Penzel, G. B. Moody, R. G. Mark, A. L. Goldberger, and J. H. Peter, “The apnea-ECG database,” in Computers in Cardiology
2000, 2018, vol. 11, no. 1, pp. 255–258, doi: 10.1109/CIC.2000.898505.
[23] A. L. Goldberger et al., “PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex
physiologic signals,” Circulation, vol. 101, no. 23, 2000, doi: 10.1161/01.cir.101.23.e215.
[24] T. Paul et al., “ECG and SpO2 signal-based real-time sleep apnea detection using feed-forward artificial neural network,” AMIA
Annual Symposium proceedings, AMIA Symposium, vol. 2022, pp. 379–385, 2022.
[25] Y. Sun, “The neural network of one-dimensional convolution-an example of the diagnosis of diabetic retinopathy,” IEEE Access,
vol. 7, pp. 69657–69666, 2019, doi: 10.1109/ACCESS.2019.2916922.
[26] A. Anton, N. F. Nissa, A. Janiati, N. Cahya, and P. Astuti, “Application of deep learning using convolutional neural network (CNN)
method for women’s skin classification,” Scientific Journal of Informatics, vol. 8, no. 1, pp. 144–153, 2021, doi: 10.15294/sji.v8i1.26888.
[27] A. Zafar et al., “A comparison of pooling methods for convolutional neural networks,” Applied Sciences, vol. 12, no. 17, 2022, doi:
10.3390/app12178643.
[28] B. Daróczy, “Gaussian perturbations in ReLU networks and the arrangement of activation regions,” Mathematics, vol. 10, no. 7,
2022, doi: 10.3390/math10071123.
[29] K. Z. -Mokhtar and J. M. -Saleh, “An oil fraction neural sensor developed using electrical capacitance tomography sensor data,”
Sensors, vol. 13, no. 9, pp. 11385–11406, 2013, doi: 10.3390/s130911385.
[30] A. Rizal, S. Hadiyoso, H. Fauzi, and R. Widadi, “Obstructive sleep apnea detection based on ECG signal using statistical features
of wavelet subband,” International Journal of Electrical and Computer Engineering Systems, vol. 13, no. 10, pp. 877–884, 2022,
doi: 10.32985/ijeces.13.10.13.
BIOGRAPHIES OF AUTHORS
Hilman Fauzi received his master’s degree from STEI Institut Teknologi Bandung,
Indonesia in 2013. He received Ph.D. from Malaysia-Japan International Institute of
Technology, Universiti Teknologi Malaysia (UTM), Malaysia in 2020. Currently, he is a
researcher and lecturer at Telkom University Indonesia. His research interests include biosignal
processing, artificial intelligence, and biomedical engineering. He can be contacted at email:
[email protected].