A Deep Learning Model Based on the Combination of Convolutional and Recurrent Neural Networks to Enhan

This document presents a study on a deep learning model combining convolutional and recurrent neural networks (CNN-RNN) to enhance the classification of sleep stages in children with obstructive sleep apnea using pulse oximetry signals. The model achieved an accuracy of 86.0% and a Cohen's kappa of 0.743, demonstrating its potential to automate sleep staging and improve diagnosis efficiency compared to traditional polysomnography methods. The research highlights the clinical relevance of using low-cost pulse oximeters for accessible sleep disorder assessments in pediatric patients.

2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) | 979-8-3503-2447-1/23/$31.00 ©2023 IEEE | DOI: 10.1109/EMBC40787.2023.10341100

A deep learning model based on the combination of convolutional and recurrent neural networks to enhance pulse oximetry ability to classify sleep stages in children with sleep apnea
Fernando Vaquerizo-Villar, Daniel Álvarez*, Member, IEEE, Gonzalo C. Gutiérrez-Tobal, Member,
IEEE, Félix del Campo, David Gozal, Leila Kheirandish-Gozal, Thomas Penzel, Senior Member,
IEEE, Roberto Hornero, Senior Member, IEEE

Abstract: Characterization of sleep stages is essential in the diagnosis of sleep-related disorders but relies on manual scoring of overnight polysomnography (PSG) recordings, which is onerous and labor-intensive. Accordingly, we aimed to develop an accurate deep-learning model for sleep staging in children suffering from pediatric obstructive sleep apnea (OSA) using pulse oximetry signals. For this purpose, pulse rate (PR) and blood oxygen saturation (SpO2) from 429 childhood OSA patients were analyzed. A CNN-RNN architecture fed with PR and SpO2 signals was developed to automatically classify wake (W), non-Rapid Eye Movement (NREM), and REM sleep stages. This architecture was composed of: (i) a convolutional neural network (CNN), which learns stage-related features from raw PR and SpO2 data; and (ii) a recurrent neural network (RNN), which models the temporal distribution of the sleep stages. The proposed CNN-RNN model showed a high performance for the automated detection of W/NREM/REM sleep stages (86.0% accuracy and 0.743 Cohen’s kappa). Furthermore, the total sleep time estimated for each child using the CNN-RNN model showed high agreement with that manually derived from PSG (intra-class correlation coefficient = 0.747). These results were superior to previous works using CNN-based deep-learning models for automatic sleep staging in pediatric OSA patients from pulse oximetry signals. Therefore, the combination of CNN and RNN makes it possible to obtain additional information from raw PR and SpO2 data related to sleep stages, thus being useful to automatically score sleep stages in pulse oximetry tests for children evaluated for suspected OSA.

Clinical Relevance: This research establishes the usefulness of a CNN-RNN architecture to automatically score sleep stages in pulse oximetry tests for pediatric OSA diagnosis.

I. INTRODUCTION

Characterization of the sleep macro-structural changes (i.e., sleep stages) is essential in the diagnosis of sleep-related disorders [1]. Overnight polysomnography (PSG) is the gold standard approach, which involves the recording of a wide range of neurophysiological and cardiorespiratory signals [2]. After the test and following the rules of the American Academy of Sleep Medicine (AASM), sleep technicians visually inspect the electroencephalogram (EEG), electrooculogram (EOG), and submental electromyogram (EMG) channels to assign each 30-s non-overlapping epoch to a sleep stage: wake (W), three levels of non-Rapid Eye Movement (non-REM) sleep (N1, N2, and N3), and REM sleep [2]. However, PSG is costly, complex, highly intrusive, and scarcely available, thus delaying the diagnosis of sleep disorders [3]. Furthermore, manual sleep scoring can take hours per sleep study and suffers from considerable inter-rater variability [4], which may alter the accuracy of the diagnosis.

To overcome these limitations, multiple studies have proposed automated approaches for sleep scoring from a minimum number of signals [5]. A large proportion of these studies have focused on automated sleep staging in patients with obstructive sleep apnea (OSA), a highly prevalent sleep disorder that affects nearly 1 billion people around the globe [6]. OSA diagnosis is based on the apnea-hypopnea index (AHI: number of apneas and hypopneas per sleep hour), so the scoring of sleep stages and the calculation of the total sleep time (TST) are imperative in this context [2].

Among others, EEG, EOG, electrocardiogram, actigraphy, airflow, and pulse oximetry signals have been employed for automatic sleep staging in OSA cohorts [5]. In this respect, pulse oximetry signals have been frequently proposed for sleep scoring and diagnosing sleep disorders, as they can be recorded at the patient’s home with low-cost portable pulse oximeters [3], thus being an accessible and simplified alternative to PSG [7], [8]. Pulse oximeters record the photoplethysmography (PPG) signal, which is used to derive both blood oxygen saturation (SpO2) and pulse rate (PR) signals [9].

The dynamics of PPG and PPG-derived PR and SpO2 change across sleep stages [7], [8], [10]. This relationship,

This work was supported by 'Ministerio de Ciencia e Innovación/Agencia Estatal de Investigación/10.13039/501100011033/’, ERDF A way of making Europe, and NextGenerationEU/PRTR under projects PID2020-115468RB-I00 and PDC2021-120775-I00, and by ‘CIBER -Consorcio Centro de Investigación Biomédica en Red-’ (CB19/01/00012) through ‘Instituto de Salud Carlos III’.
Daniel Álvarez was supported by a "Ramón y Cajal" Grant RYC2019-028566-I funded by MCIN/AEI/ 10.13039/501100011033 and by “ESF Investing in your future”. L. Kheirandish-Gozal and D. Gozal were supported by the Leda J. Sears Foundation for Pediatric Research, by a Tier 2 grant from the University of Missouri and National Institutes of Health grant AG061824.
F. Vaquerizo-Villar, D. Álvarez, G. C. Gutiérrez-Tobal, F. del Campo, and R. Hornero are with the Biomedical Engineering Group, Universidad de Valladolid (e-mail: [email protected]) and CIBER-BBN (ISCIII), Spain.
D. Álvarez, and F. del Campo are with the Hospital Universitario Río Hortega of Valladolid, Spain (e-mail: [email protected]) and CIBER-BBN (ISCIII), Spain.
L. Kheirandish-Gozal and D. Gozal are with the Department of Child Health, The University of Missouri School of Medicine, Columbia, Missouri, USA (email: [email protected]).
T. Penzel is with the Interdisciplinary Center of Sleep Medicine, Charité-Universitätsmedizin Berlin, Germany (e-mail: [email protected]).

Authorized licensed use limited to: ANNA UNIVERSITY. Downloaded on August 10,2024 at 05:32:27 UTC from IEEE Xplore. Restrictions apply.
together with the recent advances in deep-learning methodologies, has led to several studies applying deep-learning algorithms to automatically score sleep stages in adult OSA subjects from pulse oximetry signals [7], [8], [11]. Conversely, only two conference papers developed by our own group have approached sleep staging in pediatric OSA patients [10], [12], which present distinguishing etiological, diagnostic, and treatment considerations, as well as less profound and recurrent desaturations (SpO2) and bradycardia/tachycardia (PR) patterns when compared to adult subjects [2], [13]. In these two preliminary studies, a convolutional neural network (CNN) was applied to detect sleep stages from raw PPG [12], and raw PR and SpO2 data [10], respectively. Despite their usefulness to learn stage-related features from pulse oximetry signals, CNNs do not consider the temporal distribution of sleep stages during sleep. Instead, recurrent neural networks (RNNs) learn the temporal dependency of the data [14], which has been shown to be useful for learning the temporal distribution of the sleep stages [5], [7].

Based on these considerations, we hypothesized that a deep-learning architecture based on the combination of a CNN and an RNN (CNN-RNN) could extract additional information from the PR and SpO2 signals able to improve the automated detection of sleep stages in childhood OSA patients. Consequently, our main objective is to design and assess a CNN-RNN deep-learning architecture to identify W, NREM, and REM stages from PR and SpO2 recordings in children with suspected OSA.

II. MATERIALS AND METHODS

A. Subjects and signals

The baseline dataset from the semi-public Childhood Adenotonsillectomy Trial (CHAT) database was used in this study [15], [16]. The clinical trial identifier of the CHAT database is NCT00560859 and its full research protocol can be found in the supplementary material of Marcus et al. [15]. The CHAT-baseline dataset is composed of PSG recordings from 453 children aged 5 to 10 years old suffering from OSA, who were randomized to a strategy of watchful waiting or early adenotonsillectomy treatment [16]. Each sleep study contains annotations of sleep stages and apnea/hypopnea events, which were scored using the AASM 2007 rules [17].

This dataset provided valid PR and SpO2 signals from 429 pediatric subjects. The data, originally recorded during PSG using sampling rates (fs) from 1 to 512 Hz, were resampled to a common fs of 1 Hz [8], [10]. Then, a subject-based standardization was performed to normalize PR and SpO2 baseline levels among different children. PR and SpO2 signals were finally divided into consecutive 30-second epochs, with each epoch classified as W, NREM, or REM according to the annotations provided by sleep technicians [10].

The data were split into three sets: training (257 first children, 60%), used to train the CNN-RNN model; validation set (85 following children, 20%), employed to monitor the convergence of the CNN-RNN; and test set (last 87 children, 20%), used for performance assessment. Table I shows clinical and demographic data from the population under study.

TABLE I. CLINICAL AND DEMOGRAPHIC DATA OF THE CHILDREN IN THE STUDY

Variable | All | Training set | Validation set | Test set
Subjects (n) | 429 | 257 | 85 | 87
Age (years) | 6 [5, 8] | 6 [5, 8] | 6 [5, 7] | 6 [5, 7]
Males (n) | 208 (48.5%) | 127 (49.4%) | 35 (41.2%) | 46 (52.9%)
BMI (kg/m2) | 17.2 [15.4, 22.0] | 17.1 [15.6, 21.7] | 18.5 [15.2, 23.4] | 16.5 [15.2, 22.3]
AHI (e/h) | 4.7 [2.7, 8.7] | 4.6 [2.6, 8.8] | 4.6 [2.5, 8.5] | 5.1 [3.2, 9.4]
Wake (n) | 133891 (25.4%) | 79814 (25.1%) | 27685 (27.0%) | 26392 (25.0%)
NREM (n) | 319038 (60.6%) | 193547 (60.9%) | 61419 (59.9%) | 64072 (60.6%)
REM (n) | 73405 (14.0%) | 44724 (14.1%) | 13464 (13.1%) | 15217 (14.4%)
TRT (min) | 608 [557, 658] | 617 [563, 661] | 590 [539, 652] | 607 [562, 640]
TST (min) | 466 [429, 494] | 472 [440, 497] | 447 [420, 482] | 461 [423, 500]

Data are presented as median [interquartile range], n or %. BMI: Body Mass Index; AHI: Apnea-Hypopnea Index; e/h: events per hour; REM: Rapid Eye Movement; NREM: Non-REM; TRT: Total Recording Time; TST: Total Sleep Time. Wake, NREM, and REM rows give the number of 30-s epochs.

B. CNN-RNN architecture

Figure 1 shows the main components of the CNN-RNN architecture employed in this study. Adapted from the CNN-RNN proposed by Korkalainen et al. (2019) to detect sleep stages in adults from PPG data [7], the proposed CNN-RNN receives as input a sequence of 100 consecutive 30-s epochs of the PR and SpO2 signals (100x30x2 samples). First, each epoch is processed separately through a time-distributed layer that contains a CNN. The CNN is composed of 5 convolutional blocks (conv block), which are intended to automatically learn the features of each epoch of the PR and SpO2 signals (30x2 samples) related to W/NREM/REM stages. Each conv block consists of: (i) a convolutional layer, which extracts the feature maps from PR and SpO2 data using 32 filters of size 5x2; (ii) a batch normalization layer that normalizes the feature maps; (iii) a Rectified Linear Unit (ReLU) activation function that introduces nonlinearity to the normalized feature maps; and (iv) a dropout operation that minimizes overfitting by randomly removing node connections with a probability of 0.1 [14]. After the last conv block, the 3D feature maps are reshaped into 1D data using a flattening operation.

The output of the time-distributed CNN is then processed by an RNN to learn the temporal distribution of the sleep stages in the sequence. First, a dropout layer with a rate of 0.3 is used to minimize overfitting [7], [14]. Next, a bidirectional Gated Recurrent Unit (GRU) layer is applied to model the temporal dependence of the input sequence, deciding which information is retained and which is forgotten by the network [14]. GRU was chosen instead of Long Short-Term Memory (LSTM) as it provided similar results with a lower computational cost [14]. This layer contains 64 units with a dropout probability of 0.3 in the forward step and 0.5 in the recurrent step [7]. Finally, a time-distributed layer containing a softmax activation function is employed to obtain the probability of belonging to the W, NREM, and REM stages for each epoch of the input sequence.

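The final softmax step can be sketched in the same spirit. The snippet below converts hypothetical per-epoch output scores (logits) into W/NREM/REM probabilities and then assigns each epoch to the most probable stage. The choice of the highest probability as the decision rule is an assumption here, since the text only states that probabilities are obtained; the logits are made-up numbers.

```python
import math

STAGES = ("W", "NREM", "REM")

def softmax(logits):
    """Numerically stable softmax over one epoch's three output scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_sequence(logits_per_epoch):
    """Return (stage, probabilities) for every epoch of a sequence.

    Assumption: each epoch is assigned to the stage with the highest
    probability.
    """
    scored = []
    for logits in logits_per_epoch:
        probs = softmax(logits)
        stage = STAGES[probs.index(max(probs))]
        scored.append((stage, probs))
    return scored

# Hypothetical logits for three consecutive epochs.
example_logits = [[2.0, 0.5, -1.0], [0.1, 1.7, 0.3], [-0.5, 0.2, 1.9]]
for stage, probs in score_sequence(example_logits):
    print(stage, [round(p, 3) for p in probs])
```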
Figure 1. Overview of the proposed deep-learning architecture based on the combination of a CNN and an RNN (CNN-RNN). Each convolutional block (conv block) includes a convolutional layer, batch normalization, a ReLU activation function, and dropout.

The CNN-RNN architecture was implemented using the TensorFlow library and trained with the following configuration [14]: He-normal method to initialize network weights; categorical cross-entropy as the loss function; batch size of 128 with a random data shuffling strategy; the Adam method with an initial learning rate of 0.0001 to optimize network weights; early stopping after 30 training steps without improvement in the validation loss; and 500 as the maximum number of training steps.

C. Statistical analysis

The overall performance of the CNN-RNN for automatic sleep staging was assessed by means of confusion matrices (3-class), which were used to compute the 3-class accuracy (Acc), Cohen’s kappa index (kappa), macro F1-score (MF1), and per-class F1-score (F1). Additionally, the TST was computed for each patient based on the sleep stages scored by the CNN-RNN model (TSTCNN-RNN) and compared with the TST from standard PSG (TSTPSG). Bland-Altman plots and the intra-class correlation coefficient (ICC) were used to assess the estimated TST agreement.

III. RESULTS

A. CNN-RNN model performance

Figure 2 shows the confusion matrix of the CNN-RNN model obtained in the test set for automatic sleep staging (W/NREM/REM). The CNN-RNN model fed with sequences of 100 epochs of PR and SpO2 signals correctly classified 86.1% of the 30-s epochs (91240/106029), with a kappa of 0.743, a MF1 of 0.820, and F1-scores of 0.847, 0.901, and 0.711 for W, NREM, and REM sleep stages, respectively.

Figure 2. Confusion matrix of the CNN-RNN architecture in the test set. This matrix compares the sleep stages manually scored from PSG with the corresponding automatic assignment using the CNN-RNN model.

B. Estimation of the TST

Figure 3 shows the Bland-Altman plot comparing the TST calculated from automatic CNN-RNN scoring (TSTCNN-RNN) with the TST derived from PSG (TSTPSG) in the test set. ICC is also shown. TSTCNN-RNN slightly overestimated TSTPSG, as reported by their mean difference (16.1 min) and confidence interval (from -52.6 to 84.8 min). Additionally, TSTCNN-RNN showed an ICC of 0.747 with TSTPSG.

Figure 3. Bland-Altman plot comparing TSTCNN-RNN with TSTPSG in the test set.

IV. DISCUSSION

In this work, we propose a CNN-RNN architecture to enhance the automatic scoring of wake, NREM, and REM sleep stages from pulse oximetry signals (PR and SpO2) in childhood OSA patients. To our knowledge, the application of a deep-learning model based on the combination of a CNN and

an RNN is novel in the framework of automated sleep staging in pediatric subjects.

The proposed CNN-RNN architecture reached a high performance, with 86.1% Acc and 0.743 kappa for W/NREM/REM sleep classification. Particularly, the kappa value obtained by the CNN-RNN model (in the range 0.61-0.80) indicates that there is a substantial agreement between our automatic deep-learning model and manual scoring from PSG [18]. Hence, our proposal could provide sleep stage annotations in at-home pulse oximetry tests for the screening of childhood OSA [3]. The TST derived from the CNN-RNN architecture also showed a high concordance with the TST from PSG (TSTPSG), with an ICC of 0.747, a mean difference of 16.1 min, and a confidence interval of -52.6 to 84.8 min. The slight overestimation of TSTPSG can be explained by the slight trend of the CNN-RNN to classify W epochs (15%) as NREM (see Figure 2), with NREM being the majority class in the data. Additionally, the obtained ICC value (in the range 0.50-0.75) indicates a moderate agreement [18], highlighting the usefulness of our proposal to derive the TST in oximetry tests [3], [7], [8].

Two preliminary studies performed by our research group have shown the usefulness of CNN-based deep-learning methodologies for pediatric sleep staging, reporting a performance superior to previous feature-based approaches [10], [12]. In Vaquerizo et al. [12], we reported 78.3% Acc and 0.57 kappa for the detection of W/NREM/REM from raw PPG data, and an ICC of 0.59 for the estimation of the TST. In Vaquerizo et al. [10], 83.1% Acc and 0.68 kappa were obtained for W/NREM/REM classification from raw PR and SpO2 data, whereas an ICC of 0.677 was obtained for the calculation of the TST. In this work, which has used the same database as in the two previous studies [10], [12], a higher performance was obtained with a CNN-RNN fed with PR and SpO2 data: 86.1% Acc, 0.743 kappa, and 0.747 ICC. Thus, the information about the temporal distribution of the data provided by the RNN makes it possible to improve the detection of sleep stages.

It is important to note some limitations of our study. First, although the sample size is considerably large (429 subjects), the database only contains children suffering from OSA (AHI ≥ 1 e/h). Thus, additional pediatric datasets that include, among others, healthy control subjects would be desirable. Another limitation is the computational load of the RNN, which may hinder its implementation in portable devices. In this respect, novel deep-learning methods with a lower computational cost than RNNs (e.g., transformers), as well as novel strategies addressing imbalance between sleep stages, should be assessed in future studies. Finally, another interesting future goal could be to design and assess an automatic deep-learning model that simultaneously scores sleep stages and apnea/hypopnea events, thus providing a complete diagnosis of childhood OSA from pulse oximetry signals.

V. CONCLUSION

In summary, a deep-learning architecture based on the combination of a CNN and an RNN has shown usefulness to automatically score wake, NREM, and REM sleep stages from raw PR and SpO2 data in childhood OSA patients, with a higher performance than that reported by previous studies. In addition, we showed that the CNN-RNN model can provide a reliable estimation of the TST in pulse oximetry tests. Thus, we conclude that CNN-RNN architectures can be used to extract additional information on the temporal distribution of sleep stages from pulse oximetry recordings in children being evaluated for suspected OSA.

REFERENCES

[1] M. J. Sateia, “International Classification of Sleep Disorders-Third Edition,” Chest, vol. 146, no. 5, pp. 1387–1394, Nov. 2014.
[2] R. B. Berry, R. Brooks, C. E. Gamaldo, S. M. Harding, C. L. Marcus, and B. V. Vaughn, “The AASM Manual for the Scoring of Sleep and Associated Events,” Am. Acad. Sleep Med., vol. 53, no. 9, pp. 1689–1699, 2018.
[3] F. del Campo, A. Crespo, A. Cerezo-Hernández, G. C. Gutiérrez-Tobal, R. Hornero, and D. Álvarez, “Oximetry use in obstructive sleep apnea,” Expert Rev. Respir. Med., vol. 12, no. 8, pp. 665–681, 2018.
[4] A. Malhotra et al., “Performance of an automated polysomnography scoring system versus computer-assisted manual scoring,” Sleep, vol. 36, no. 4, pp. 573–582, 2013.
[5] O. Faust, H. Razaghi, R. Barika, E. J. Ciaccio, and U. R. Acharya, “A review of automated sleep stage scoring based on physiological signals for the new millennia,” Comput. Methods Programs Biomed., vol. 176, pp. 81–91, 2019.
[6] A. V. Benjafield et al., “Estimation of the global prevalence and burden of obstructive sleep apnoea: a literature-based analysis,” Lancet Respir Med, vol. 7, no. 8, pp. 687–698, 2020.
[7] H. Korkalainen et al., “Deep learning enables sleep staging from photoplethysmogram for patients with suspected sleep apnea,” Sleep, vol. 43, no. 11, pp. 1–10, 2020.
[8] R. Casal, L. E. Di Persia, and G. Schlotthauer, “Temporal convolutional networks and transformers for classifying the sleep stage in awake or asleep using pulse oximetry signals,” J. Comput. Sci., vol. 59, no. December 2021, 2022.
[9] E. D. Chan, M. M. Chan, and M. M. Chan, “Pulse oximetry: Understanding its basic principles facilitates appreciation of its limitations,” Respir. Med., vol. 107, no. 6, pp. 789–799, 2013.
[10] F. Vaquerizo-Villar et al., “A convolutional neural network to classify sleep stages in pediatric sleep apnea from pulse oximetry signals,” MELECON 2022 - IEEE Mediterr. Electrotech. Conf. Proc., pp. 108–113, 2022.
[11] M. Radha et al., “A deep transfer learning approach for wearable sleep stage classification with photoplethysmography,” npj Digit. Med., vol. 4, no. 1, pp. 1–11, 2021.
[12] F. Vaquerizo-Villar et al., “Automatic Sleep Staging in Children with Sleep Apnea using Photoplethysmography and Convolutional Neural Networks,” in 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC 2021), 2021, pp. 216–219.
[13] C. L. Rosen, L. D’Andrea, and G. G. Haddad, “Adult criteria for obstructive sleep apnea do not identify children with serious obstruction,” Am Rev Respir Dis, vol. 146, no. 5 Pt 1, pp. 1231–1234, 1992.
[14] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[15] C. L. Marcus et al., “A Randomized Trial of Adenotonsillectomy for Childhood Sleep Apnea,” N. Engl. J. Med., vol. 368, no. 25, pp. 2366–2376, 2013.
[16] S. Redline et al., “The Childhood Adenotonsillectomy Trial (CHAT): Rationale, Design, and Challenges of a Randomized Controlled Trial Evaluating a Standard Surgical Procedure in a Pediatric Population,” Sleep, vol. 34, no. 11, pp. 1509–1517, 2011.
[17] C. Iber, S. Ancoli-Israel, A. Chesson, and S. F. Quan, “The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specification,” J. Clin. Sleep Med., vol. 3, no. 7, p. 752, 2007.
[18] M. L. McHugh, “Interrater reliability: the kappa statistic,” Biochem. Medica, vol. 22, no. 3, pp. 276–282, 2012.
