
Early Prediction of Poststroke Rehabilitation Outcomes Using Wearable Sensors

Running Head: Wearable Sensors to Predict Stroke Outcomes

Article Type: Original Research


TOC Category: Special Issue: Rehab Tech Advances
Submitted Date: January 15, 2023

Downloaded from https://academic.oup.com/ptj/advance-article/doi/10.1093/ptj/pzad183/7505420 by guest on 11 January 2024
Revised Date: November 13, 2023
Accepted Date: December 3, 2023

Authors: Megan K. O’Brien, PhD¹,²*; Francesco Lanotte, PhD¹,²*; Rushmin Khazanchi, BA³*; Sung Yul Shin, PhD¹,²; Richard L. Lieber, PhD²,⁴,⁵; Roozbeh Ghaffari, PhD⁴,⁶; John A. Rogers, PhD⁴,⁶,⁷,⁸; Arun Jayaraman, PT, PhD¹,²†

* These authors should be considered co-first authors
† Corresponding Author

¹Max Nader Lab for Rehabilitation Technologies and Outcomes Research, Shirley Ryan AbilityLab, Chicago, IL, USA
²Department of Physical Medicine and Rehabilitation, Northwestern University, Chicago, IL, USA
³Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
⁴Department of Biomedical Engineering, Northwestern University, Evanston, IL, USA
⁵Shirley Ryan AbilityLab, Chicago, IL, USA
⁶Querrey Simpson Institute for Bioelectronics, Northwestern University, Evanston, IL, USA
⁷Departments of Materials Science and Engineering, Chemistry, Mechanical Engineering, Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, USA
⁸Department of Neurological Surgery, Northwestern University Feinberg School of Medicine, Northwestern University, Chicago, IL, USA

The Author(s) 2024. Published by Oxford University Press on behalf of the American Physical Therapy Association.

Address all correspondence to Dr Jayaraman at: [email protected].

Keywords: Outcome Assessment (Health Care); Gait; Balance; Biomedical Engineering; Decision Making: Computer-Assisted; Technology Assessment: Biomedical; Inpatients; Patient Care Planning; Rehabilitation; Prognosis

Abstract

Objectives. Inpatient rehabilitation represents a critical setting for stroke treatment, providing intensive, targeted therapy and task-specific practice to minimize a patient’s functional deficits and facilitate their reintegration into the community. However, impairment and recovery vary greatly after stroke, making it difficult to predict a patient’s future outcomes or response to treatment. In this study, the authors examined the value of early-stage wearable sensor data to predict 3 functional outcomes (ambulation, independence, and risk of falling) at rehabilitation discharge.

Methods. Fifty-five individuals undergoing inpatient stroke rehabilitation participated in this study. Supervised machine learning classifiers were retrospectively trained to predict discharge outcomes using data collected at hospital admission, including patient information, functional assessment scores, and inertial sensor data from the lower limbs during gait and/or balance tasks. Model performance was compared across different data combinations and benchmarked against a traditional model trained without sensor data.

Results. For patients who were ambulatory at admission, sensor data improved predictions of ambulation and risk of falling (with weighted F1-scores increasing by 19.6% and 23.4%, respectively), and maintained similar performance for predictions of independence, compared to a benchmark model without sensor data. The best-performing sensor-based models predicted discharge ambulation (community vs. household), independence (high vs. low), and risk of falling (normal vs. high) with accuracies of 84.4%, 68.8%, and 65.9%, respectively. Most misclassifications occurred with admission or discharge scores near the classification boundary. For patients who were non-ambulatory at admission, sensor data recorded during simple balance tasks did not offer predictive value over the benchmark models.

Conclusion. These findings support the continued investigation of wearable sensors as an accessible, easy-to-use tool to predict functional recovery after stroke.

Impact. Accurate, early prediction of poststroke rehabilitation outcomes from wearable sensors would improve our ability to deliver personalized, effective care and discharge planning in the inpatient setting and beyond.

Introduction

Stroke is a leading cause of disability worldwide.¹ Following initial treatment, many stroke survivors are admitted to an inpatient rehabilitation facility (IRF) for ongoing medical care and targeted, intensive, multidisciplinary therapy in the early stages after stroke. A primary goal of IRF rehabilitation is to maximize neural and functional recovery to help patients reintegrate into the community upon discharge.² However, not all individuals have the same potential for recovery. Patients achieve widely varying levels of function after initial treatment, with some returning to pre-morbid function and others retaining severe deficits that require additional short- or long-term care.³

Starting at IRF admission, clinicians must plan when the patient will be discharged from the hospital, where they can be safely discharged (ie, to their home with or without caregiver assistance, or to a skilled nursing facility for ongoing rehabilitative care), and how to structure therapy to optimize a patient’s overall discharge disposition. In the US, the average IRF length of stay has decreased to 12.9 days for patients with Medicare,⁴ giving clinicians, patients, and families only a brief window to design short-term care strategies and post-discharge plans suited to the patient’s needs (eg, seeking and training caregivers, making home modifications or alternative living arrangements, ordering assistive devices). Early, objective, and accurate predictions of a patient’s functional recovery would help clinicians, patients, and families plan appropriate treatment and reintegration strategies based on the expected discharge disposition.


Numerous research models have been proposed to predict stroke recovery.⁵,⁶ Many of these models rely exclusively on information available from electronic medical records (EMRs), including patient demographics and clinical information.⁷⁻⁹ While such models lend themselves to simple and relatively undemanding clinical implementation, their resolution may not detect subtle differences between patients, leading more often to rules of thumb about recovery rather than predictions of specific patient outcomes. Conversely, high-resolution metrics, such as those from transcranial magnetic stimulation or brain imaging, could improve prediction resolution and accuracy,¹⁰⁻¹³ but these measures are costly and not often available in rehabilitation settings, posing barriers to clinical uptake.

Non-invasive wearable sensors show promise for capturing biomarkers of disease and recovery by mining patterns from continuous, high-resolution physiological or behavioral data.¹⁴,¹⁵ We previously demonstrated that data from inertial measurement units (IMUs), recorded during a brief walking bout within a week of IRF admission, improved prediction of ambulation ability at discharge compared to traditional functional assessments and other patient descriptors.¹⁶ However, a patient’s discharge disposition depends on different abilities, such as navigating their home environment and performing activities of daily living safely and independently. Therefore, we propose 3 functional outcomes for prediction models that may be considered broadly representative of these attributes: the 10-Meter Walk Test (10MWT; ambulation), Functional Independence Measure score (FIM; independence, specifically related to motor tasks), and the Berg Balance Scale (BBS; risk of falling). To enhance the clinical value of model predictions, we used clinically significant cut-off scores to classify outcomes as signifying none-to-mild or moderate-to-severe impairment. Finally, while in our previous work we used sensor data solely from walking tasks, here the recorded activities also encompassed simple balance tasks. Consequently, incorporating a non-ambulatory population into our approach expands our insights into the potential of sensor-based prediction models for a broader range of patients and IMU data.

The objectives of the present study were to expand our early-stage prognostic models to predict 3 poststroke functional outcomes (ambulation, independence, and risk of falling) at IRF discharge for both patients who are ambulatory and patients who are non-ambulatory using data recorded at admission, and to evaluate the ability of IMU data to predict each of these 3 outcomes. We hypothesized that incorporating lower-limb IMU data would improve the prediction of discharge outcomes relative to models trained on clinician-scored functional assessments and demographic and clinical patient information alone.

Materials and Methods

Participants

Fifty-five patients were recruited from the inpatient rehabilitation unit of the Shirley Ryan AbilityLab (Chicago, IL, USA). Inclusion criteria were having a primary diagnosis of stroke, being aged at least 18 years, and being able and willing to give consent and follow study directions. Exclusion criteria were having a known neurodegenerative pathology; being pregnant or nursing; or utilizing a powered, implanted cardiac device for monitoring or supporting heart function. Medical clearance was obtained from the primary physician prior to participation. All individuals (or a proxy) provided written informed consent, and the study was approved by the Institutional Review Board of Northwestern University (Chicago, IL; STU00205532).


Experimental Protocol

Data were collected from patients at 2 timepoints: within one week of IRF admission and within one week prior to discharge. At each timepoint, participants completed a series of standardized functional assessments, including the 10MWT, BBS, 6-Minute Walk Test (6MWT), and Timed “Up & Go” test (TUG). FIM scores were extracted from the patient’s EMR at each timepoint. All assessments were administered and scored by a licensed physical therapist. Assessments that could not be completed were scored as zero. Patient information (including demographics, pre-morbid activity level, and stroke characteristics) was obtained from the EMR and a study intake form.

Sensor data were collected from 3 flexible, wireless IMUs (BioStampRC; MC10, Inc., Cambridge, MA) during the functional assessments. These devices were attached to the lumbar region (L4-L5 level) and each ankle (proximal to the lateral malleolus, along the mid-sagittal line) using an adhesive film (Tegaderm; 3M, St. Paul, MN, USA). They recorded triaxial signals from an accelerometer (sensitivity ±4 g) and a gyroscope (sensitivity ±2000°/s) sampled at 31.25 Hz.

Selection of Sensor Data for Model Training

We divided participants into 2 groups based on their walking status at IRF admission. Patients who were ambulatory (N = 43) were individuals who could complete at least one walking assessment at admission (10MWT, 6MWT, or TUG) with no more than moderate assistance from a physical therapist. Patients who could not complete any of the walking assessments at admission were considered non-ambulatory (N = 12).


To establish a simple yet inclusive set of physical activities to capture potential biomarkers of recovery across these 2 groups, we narrowed the sensor analysis to a single walking task that could be completed by most participants who are ambulatory, and a series of non-ambulatory tasks that could be completed by most participants regardless of ambulatory status.

For the walking task, we selected a single trial of the 10MWT at self-selected velocity, which we previously found to be predictive of ambulation discharge outcomes among individuals who are ambulatory.¹⁶ In our present dataset, 33 patients who were ambulatory had IMU data during the 10MWT (Fig. 1).

For the non-ambulatory tasks, we selected the first 4 items of the BBS (standing unsupported for up to 2 minutes, sitting unsupported for up to 2 minutes, stand-to-sit transition, and sit-to-stand transition), which are among the least demanding and had a high completion rate among all patients (Suppl. Fig. 1). In our dataset, 8 patients who were non-ambulatory and 42 patients who were ambulatory had IMU data during these 4 tasks (Fig. 1).

Annotated sensor data for each task were cleaned by removing duplicate timestamps and resampling to the expected sampling frequency (31.25 Hz) using spline interpolation. Data processing, filtering, and subsequent feature extraction were completed in MATLAB R2017b (MathWorks, Inc., Natick, MA, USA).


Feature extraction

Features are measurable, independent variables used as input to a machine learning algorithm to make predictions. Three feature categories were defined in this study: patient information (PI), functional assessment scores (FA), and wearable sensor (IMU) data. To reduce dimensionality of the feature space and increase robustness to sensor placement, IMU features were computed from the Euclidean norm of the triaxial accelerometer and gyroscope signals. IMU features for the BBS were supplemented with measures of postural sway, computed from the mediolateral and anteroposterior axes of the lumbar sensor (Suppl. Tab. 1). We applied one-hot encoding to categorical variables to avoid imposing an artificial ordering on them. Supplementary Table 2 summarizes characteristics of the PI and FA features for our different training and testing datasets.
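
As an illustration of the Euclidean-norm step described above, the following minimal Python sketch computes a per-sample magnitude from three axes. The function name `euclidean_norm` and the sample values are hypothetical, and the study's own feature extraction was done in MATLAB, so this only illustrates the norm computation and is not the authors' code.

```python
import math

def euclidean_norm(x, y, z):
    """Return the per-sample magnitude of a triaxial signal."""
    if not (len(x) == len(y) == len(z)):
        raise ValueError("all three axes must have the same number of samples")
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]

# Hypothetical accelerometer samples (in g) for the three axes.
acc_x = [0.0, 3.0, 1.0]
acc_y = [0.0, 4.0, 2.0]
acc_z = [1.0, 0.0, 2.0]

acc_norm = euclidean_norm(acc_x, acc_y, acc_z)
print(acc_norm)  # [1.0, 5.0, 3.0]
```

Because the magnitude is unchanged when the sensor is rotated, features computed from it are less sensitive to how the device is oriented on the body, and the three axes collapse into a single signal per sensor.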

Combinations of these feature categories were used to train prediction models, creating 3 different types of models for comparison: a benchmark model (PI+FA, no sensor data) including both patient information and functional assessments, a streamlined sensor model (PI+IMU) combining easily obtained patient information with sensor data, and a comprehensive model (PI+FA+IMU) including all feature types. The PI+FA benchmark served as a comparative point of reference to determine the impact of sensor data on predicting each discharge outcome.

Model Architecture and Training

We trained separate supervised learning classifiers to predict 3 different discharge outcomes: ambulation, independence, and risk of falling (Fig. 2). For each outcome, we defined 2 classes of patient function at discharge: household vs. community ambulators (based on 10MWT score¹⁷,¹⁸), low vs. high independence (based on FIM motor sub-score¹⁹,²⁰), and high vs. normal risk of falling (based on BBS score²¹).


Classifiers were developed using the Scikit-Learn (0.23.2) library in Python 3.8.8. We selected L1-penalized logistic regression (L1-LR) given its ability to handle high dimensionality, a relatively small sample size, and varying degrees of class imbalance. L1-LR also requires few hyperparameters and provides feature importance scores, simplifying the training and interpretation processes for more direct comparison between the models. Models were trained and tested to predict the 3 discharge outcomes for the ambulatory and non-ambulatory populations using nested leave-one-subject-out cross validation (Suppl. Fig. 2).
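
The splitting logic behind nested leave-one-subject-out cross validation can be sketched as follows. The exact procedure used in the study is given in Suppl. Fig. 2, which is not reproduced here, so this sketch assumes that the inner loop is also leave-one-subject-out and that each subject contributes a single, uniquely identified record. The function names and subject IDs are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (train_ids, held_out_id) pairs, holding out each subject once."""
    for held_out in subject_ids:
        train = [s for s in subject_ids if s != held_out]
        yield train, held_out

def nested_loso(subject_ids):
    """Yield (outer_test, inner_splits) pairs.

    The inner splits are built only from the outer training subjects,
    so the outer test subject never appears in them.
    """
    for outer_train, outer_test in leave_one_subject_out(subject_ids):
        inner_splits = list(leave_one_subject_out(outer_train))
        yield outer_test, inner_splits

subjects = ["S01", "S02", "S03", "S04"]  # hypothetical subject IDs
for outer_test, inner_splits in nested_loso(subjects):
    # Inner splits: would be used to choose hyperparameters such as the
    # strength of the L1 penalty.
    # Outer test subject: would be used only to score the tuned model.
    leaked = any(outer_test in train or outer_test == val
                 for train, val in inner_splits)
    print(outer_test, len(inner_splits), "leak" if leaked else "no leak")
```

Keeping the held-out subject out of every inner split is what prevents hyperparameter tuning from seeing the data used for the final performance estimate.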

Models predicting discharge outcomes for patients who were ambulatory were trained and tested exclusively on the 32 patients who were ambulatory and had IMU data available for both the 10MWT and BBS (Fig. 1). To determine the most predictive sensor tasks for patients who were ambulatory, we compared model performance when training with IMU features from BBS only (IMUBBS), 10MWT only (IMU10MWT), and BBS and 10MWT combined (IMU10MWT+BBS).


Models predicting discharge outcomes for patients who were non-ambulatory were trained using data from the combined ambulatory and non-ambulatory populations to maximize the availability of the BBS IMU data. We refer to these models as non-ambulatory models because they were tested on, and intended exclusively for, the 8 patients who were non-ambulatory. This combined training was adopted to increase the sample size and heterogeneity of discharge outcomes for model learning compared to the non-ambulatory cohort alone.


Model Interpretation

The primary performance metric was the weighted F1 score (WF1), defined as the harmonic mean of the precision and recall, computed separately for each class $j$ of the $L$ classes and weighted by the number of samples $n_j$ within each class, with the highest possible value of 1.0 indicating perfect precision and recall:²²

$$WF1 = \frac{\sum_{j=1}^{L} 2 \cdot \dfrac{\mathrm{precision}_j \cdot \mathrm{recall}_j}{\mathrm{precision}_j + \mathrm{recall}_j} \cdot n_j}{\sum_{j=1}^{L} n_j}$$
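
To make the definition concrete, the following minimal sketch computes the weighted F1 score directly from true and predicted labels. The function name `weighted_f1` and the example labels are hypothetical and are not study data. The convention of assigning zero to an undefined precision, recall, or F1 is an assumption of this sketch.

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights n_j (true samples in class j)."""
    total = 0.0
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        n_c = tp + fn  # number of true samples in class c
        total += f1 * n_c
    return total / len(y_true)

# Hypothetical imbalanced example: 3 household and 5 community ambulators.
y_true = ["household"] * 3 + ["community"] * 5
y_pred = ["household", "community", "community",
          "community", "community", "community", "community", "household"]

wf1 = weighted_f1(y_true, y_pred)
print(round(wf1, 3))
```

In this example the minority class has an F1 of 0.4 and the majority class an F1 of 8/11, so the weighted score sits closer to the majority-class value, which is the intended effect of weighting by class size.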

Secondary performance metrics were accuracy and log-loss scores. Accuracy is the ratio of correct predictions to the total number of samples, with the highest value of 1.0:

$$\mathrm{Accuracy} = \frac{tp + tn}{tp + fp + tn + fn}$$

where $tp$, $tn$, $fp$, and $fn$ are the numbers of true positives, true negatives, false positives, and false negatives, respectively. Positive classes were household ambulation ability, low independence, and high risk of falling.

Log-loss measures the discrepancy between prediction probabilities and true classes, wherein lower values indicate greater certainty about the predictions.²³ Given a true label $y_i$ and the prediction probability $p_i = \Pr(y_i = 1)$ for each of $N$ samples, log-loss is computed as:

$$\mathrm{LogLoss} = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \cdot \ln(p_i) + (1 - y_i) \cdot \ln(1 - p_i) \right)$$
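
The difference between accuracy and log-loss can be seen in a short sketch. The function names, the 0.5 decision threshold, the probability clipping, and the example values are all assumptions made for illustration. They are not taken from the study.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def log_loss(y_true, p_pos, eps=1e-15):
    """Binary log-loss; p_pos[i] is the predicted probability that y_true[i] == 1."""
    total = 0.0
    for y, p in zip(y_true, p_pos):
        p = min(max(p, eps), 1 - eps)  # clip to avoid ln(0); a choice of this sketch
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

def to_labels(probs, threshold=0.5):
    """Convert probabilities to 0/1 labels (threshold is an assumption)."""
    return [1 if p >= threshold else 0 for p in probs]

y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.9, 0.1]  # hypothetical confident model
hesitant = [0.6, 0.4, 0.6, 0.4]   # hypothetical less certain model

print(accuracy(y_true, to_labels(confident)), log_loss(y_true, confident))
print(accuracy(y_true, to_labels(hesitant)), log_loss(y_true, hesitant))
```

Both hypothetical models classify every sample correctly, yet the second has the higher log-loss. This is why two models with identical accuracy or WF1 can still be ranked by how certain their predictions are.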

Confusion matrices were generated for the best performing models to examine misclassifications for each outcome and patient group. These were also compared to the benchmark PI+FA model. Parameter importance was determined from coefficients fit in the best model, taking the median coefficient value and the 25th and 75th percentiles across all participants. Parameters with median, 25th, and 75th percentile values equal to zero were discarded.
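
The coefficient summary described above can be sketched as follows. The sketch assumes one fitted coefficient per feature for each held-out participant, and it uses the "inclusive" quantile method from Python's `statistics` module, which is an assumption because the study does not state its percentile convention. The function name, feature names, and coefficient values are hypothetical.

```python
import statistics

def summarize_coefficients(coefs_by_feature):
    """Map feature -> (25th percentile, median, 75th percentile).

    Features whose three summary values are all zero are dropped.
    """
    summary = {}
    for name, values in coefs_by_feature.items():
        q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
        if q1 == 0 and med == 0 and q3 == 0:
            continue  # the L1 penalty zeroed this feature in most fits
        summary[name] = (q1, med, q3)
    return summary

# Hypothetical coefficients from five cross-validation fits.
coefs = {
    "age": [0.5, 0.4, 0.6, 0.5, 0.5],
    "lumbar_gyro_range": [0.0, 0.0, 0.0, 0.0, 0.3],
}

print(summarize_coefficients(coefs))
```

In this example the second feature is non-zero in only one of five fits, so its median and both quartiles are zero and it is discarded.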

Role of the Funding Source: The funders played no role in the design, conduct, or reporting of this study.

Results

Classification performance for each model and feature set is presented in the Table and summarized below.

Ambulation

For patients who were ambulatory, the benchmark PI+FA ambulation model had a WF1 of 0.709. Gait-based IMU features, either alone or combined with balance features, improved performance in both the streamlined and comprehensive sensor model configurations by 19.6%. Balance-based IMU features alone did not affect ambulation predictions.


The gait-based streamlined sensor model, PI+IMU10MWT, was selected as the best model for patients who were ambulatory, given its simple configuration and highest WF1 (Fig. 3A). The streamlined sensor model outperformed the benchmark, correctly identifying more patients who were household (4 vs. 1 patients) and community (23 vs. 21 patients) ambulators at discharge (Fig. 3B). The PI+IMU10MWT model also correctly identified 27 of 29 patients who did not change ambulation category from IRF admission to discharge, though it misclassified 3 patients who improved from household to community ambulators (Fig. 3C). Eleven features were selected for the PI+IMU10MWT model, including lesion location, activity lifestyle, and IMU features from all sensor locations (Fig. 3D).
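
The counts above can be checked against the 84.4% accuracy reported for the best ambulation model in the Abstract. The short sketch below assumes the test set is the 32 ambulatory patients with IMU data for both the 10MWT and BBS, as stated in Model Architecture and Training. The variable names are illustrative.

```python
correct_household = 4   # household ambulators correctly identified (from the text)
correct_community = 23  # community ambulators correctly identified (from the text)
n_patients = 32         # ambulatory patients with both 10MWT and BBS sensor data

accuracy = (correct_household + correct_community) / n_patients
print(f"{accuracy:.1%}")  # 84.4%
```

The same total of 27 correct predictions matches the 27 correctly identified patients whose ambulation category did not change.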

For patients who were non-ambulatory, the comprehensive model trained on the combined dataset was the best ambulation model, achieving a WF1 of 0.859 (Suppl. Fig. 3A). The PI+FA+IMUBBS model correctly classified 1 of 2 individuals who were non-ambulatory and progressed to community ambulators, as well as all 6 individuals who were non-ambulatory and were discharged as household ambulators (Suppl. Fig. 3B, 3C). Notably, 2 individuals remained non-ambulatory at discharge, with 1 completing the 6MWT but unable to complete the 10MWT. Among the 28 features selected for the comprehensive model, the admission 10MWT score and IMU balance features were the most important predictors of community and household ambulation, respectively (Suppl. Fig. 3D).

Independence

For patients who were ambulatory, the benchmark PI+FA independence model had a WF1 of 0.685. Gait-based IMU features yielded a similar WF1, while balance features performed slightly worse in both the streamlined (−14.2%) and comprehensive (−9.2%) sensor models. Combining gait and balance IMU features further decreased WF1 (up to −17.8%) for both sensor models.

The gait-based comprehensive model, PI+FA+IMU10MWT, was the best-performing model according to WF1 (Fig. 4A). Compared to the benchmark, the comprehensive model correctly classified more individuals who were ambulatory and were discharged with low independence (11 vs. 9 patients), though with fewer correct predictions for individuals with high independence (11 vs. 13 patients) (Fig. 4B). Misclassifications were more frequent among participants with discharge FIM motor scores close to the class threshold. The PI+FA+IMU10MWT model correctly identified 10 of the 16 patients who transitioned from low to high independence (Fig. 4C). Fourteen features were selected for this model, including gyroscope features from the lumbar and unaffected-side ankle. Participant age was the most discriminative feature for low independence at discharge, while 10MWT and BBS admission scores indicated high independence (Fig. 4D).

For patients who were non-ambulatory, independence predictions achieved the same WF1 of 0.933 across models, with the least uncertainty in the comprehensive model (Suppl. Fig. 4A-4C). We selected the benchmark as the best model, which used simple features such as age, admission 6MWT, and admission BBS scores to differentiate between the 2 levels of discharge independence (Suppl. Fig. 4D).

Risk of falling

For patients who were ambulatory, the benchmark risk-of-falling model had a WF1 of 0.534 (Fig. 5A). Balance-based IMU features decreased performance in the streamlined sensor model to 0.347, but slightly increased performance in the comprehensive model to 0.566. Gait-based IMU features improved performance relative to the benchmark model in both the streamlined (23.4%) and comprehensive (17.6%) sensor models. Combined gait and balance IMU features did not increase performance further in either model configuration.


The gait-based streamlined sensor model, PI+IMU10MWT, was selected as the best risk-of-falling model. Compared to the benchmark, the streamlined sensor model correctly classified more individuals who were ambulatory and were discharged with high risk (12 vs. 9 patients) and with normal risk (9 vs. 8 patients) (Fig. 5B). Incorrect predictions were more likely when the BBS discharge score was near the cut-off value. The PI+IMU10MWT model correctly predicted 5 of the 8 patients who transitioned from high to normal risk (Fig. 5C). Of the 23 features selected for this model, various IMU and demographic features had similar average importance in distinguishing individuals with high and normal risk of falling (Fig. 5D).

For patients who were non-ambulatory, risk of falling predictions were perfectly accurate for the benchmark and comprehensive models, whereas the streamlined sensor model exhibited marginally lower performance (Suppl. Fig. 5A). Both the benchmark and the comprehensive models identified all individuals who were non-ambulatory with high risk of falling (Suppl. Fig. 5B, 5C). The benchmark was selected as the best model, utilizing the simplest set of 4 features with relatively low uncertainty. Lifestyle and left-side hemiparesis were markers for high fall risk, whereas BBS admission score had the highest importance to identify individuals with normal risk (Suppl. Fig. 5D).


Discussion

For patients who were ambulatory at admission, we found that IMU sensor data recorded from the lumbar and ankles during walking tasks improved early predictions of poststroke inpatient rehabilitation outcomes (ambulation, independence, risk of falling) compared to benchmark predictions derived from EMR-based patient information and standardized functional assessment scores. For ambulation and risk of falling, IMU features extracted during a 10-m walking bout increased the WF1 in a streamlined sensor model (PI+IMU10MWT), while FA features from admission further improved predictions for independence. Similar to our previous work, including sensor data improved predictions of discharge ambulation compared to a benchmark model.¹⁶ This finding was repeatable across the 2 studies despite using different modeling approaches and algorithms (Random Forest vs. L1-LR). A streamlined PI+IMU10MWT model improved ambulation predictions by 19.6% over the benchmark performance, achieving 84.4% accuracy in predicting community/household ambulators based on the 10MWT score. A comprehensive PI+FA+IMU10MWT model performed similarly to the benchmark independence performance, scoring an accuracy of 68.8% in classifying high/low independence based on the FIM motor subscore. A streamlined PI+IMU10MWT model improved the benchmark risk-of-falling performance by 23.4%, achieving 65.9% accuracy in classifying normal/high risk based on the BBS score. Most misclassifications occurred when patients had admission or discharge scores near the class boundary (Fig. 3-5C).

For patients who were non-ambulatory at admission, incorporating IMU data from simple balance tasks added less value to predicting discharge ambulation function. The comprehensive models were as accurate as the benchmark models for independence and risk of falling outcomes, with lower log-loss values indicating less uncertainty due to better convergence between prediction probabilities and actual classes. Interpretation of the non-ambulatory models is challenging given the small, imbalanced sample size and similar discharge outcomes for these patients, which likely limited the model’s ability to learn from the available non-ambulatory patient data.

323
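
To illustrate why two models with the same accuracy can differ in log-loss, the sketch below compares two hypothetical sets of predicted probabilities for the same labels. The labels and probabilities are invented for illustration and are not study data; `log_loss` here is a plain binary cross-entropy written for this example.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; p_pred is the predicted probability of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


# Hypothetical labels and predicted probabilities (not study data).
y_true = [1, 1, 0, 0]
hesitant = [0.6, 0.6, 0.4, 0.4]    # every case correct, but close to 0.5
confident = [0.9, 0.9, 0.1, 0.1]   # every case correct, close to the true class

print("accuracy:", accuracy(y_true, hesitant), accuracy(y_true, confident))
print("log-loss:", log_loss(y_true, hesitant), log_loss(y_true, confident))
```

Both hypothetical models classify every case correctly, yet the confident one has the lower log-loss because its probabilities sit closer to the actual classes.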

A growing body of research focuses on developing and testing early prediction tools after stroke. Stinear et al [24] provide a detailed review of models predicting functional and motor-related outcomes, enumerating the strengths and limitations of methods published up to 2019. Several previous studies have developed predictive models for IRF discharge, with most incorporating functional assessments and therapist evaluations obtained at admission [25-27]. For example, Bland et al [25] use the BBS and FIM walk scores at admission to predict ambulation at IRF discharge according to the 10MWT, with greater sensitivity (91%–94%, household ambulation) but lower specificity (60%–65%, community ambulation) compared to our findings. We have previously developed sensor-free regression models to predict discharge scores using similar PI+FA features and 50 participants from this study, with mean absolute errors of 0.3 m/s, 9.5 points, and 7.4 points for the 10MWT, FIM, and BBS, respectively [9]. The TWIST model [28] is another promising approach for predictions outside of the IRF setting, utilizing age, BBS, and knee extension grade at 1 week poststroke to predict independent walking according to Functional Ambulation Categories at 4, 6, 9, 16, or 26 weeks after stroke, with 83% accuracy across all timepoints. Only recently has the research community begun investigating the predictive value of wearable sensor data for similar prognostic applications [14,15]. However, the utility of sensor data for regression models or long-term outcomes remains uncertain.

Accurately predicting expected post-treatment outcomes early in rehabilitation would improve discharge planning for clinicians, patients, families, and insurance companies by providing a roadmap of the patient’s care needs after leaving the hospital. In this study, sensor features were important predictors for individuals discharged with limited ambulation ability and high risk of falling, providing quantitative measures of movement symmetry (eg, the skewness) and repeatability (eg, sample entropy) for treatment monitoring [9,29,30]. Sensor models could replace or reduce reliance on functional assessment scores, as less time is needed to collect the data. Consumer-grade devices and an approximately 5-minute sequence of simple physical activities (brief walking bout, standing, stand-to-sit, sitting, sit-to-stand) would enable quicker and more frequent evaluations than the longer and more complex standardized functional assessments. The assessments considered in this study (10MWT, 6MWT, BBS, TUG, FIM) are typically collected at IRF admission in the US for clinical evaluation and insurance reporting. However, completing these assessments upon admission can be challenging due to time limitations during intake/treatment and varied patient impairments, including fatigability and physical or cognitive deficits.
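
As a rough illustration of the two sensor features named above, the sketch below computes skewness and sample entropy for simple synthetic signals. The embedding dimension `m = 2` and tolerance `r = 0.2 × SD` are common textbook defaults assumed here, not parameters confirmed by this study, and the signals are synthetic rather than IMU recordings.

```python
import math
import random


def skewness(x):
    """Third standardized moment (population form); 0 for a symmetric distribution."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy: ln(B / A), where B and A count template pairs that match
    within tolerance r at lengths m and m + 1 (self-matches excluded)."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * sd  # assumed tolerance: a fraction of the signal SD

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return math.log(b / a)


# Synthetic signals for illustration only.
regular = [0.0, 1.0, 0.0, -1.0] * 10          # perfectly repeating, symmetric
right_skewed = [0.0, 0.0, 0.0, 3.0] * 10      # asymmetric amplitude distribution
rng = random.Random(0)
irregular = [rng.gauss(0.0, 1.0) for _ in range(200)]

print("skewness:", skewness(regular), skewness(right_skewed))
print("sample entropy:", sample_entropy(regular), sample_entropy(irregular))
```

In this toy setting, the perfectly repeating signal yields a sample entropy of zero, while the random signal yields a larger value, matching the interpretation of sample entropy as a measure of (ir)regularity.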

Our results should be considered in the context of previous findings for clinical machine learning models, namely, that appropriate choices of target population [15], activities [14], sensor modalities [14], and prediction outcomes [5] are paramount to designing a successful model [6]. For instance, for patients who were ambulatory, IMU data from the 10MWT and BBS were less impactful for predicting discharge independence, as defined by the FIM motor subscore. This is unsurprising, considering the FIM motor assessment evaluates a breadth of functional activities (including walking, stair climbing, transfers, dressing, bathing, grooming, toileting, and bowel or bladder management), and some of these activities may not be well-characterized by gait or balance movements at IRF admission. Sensor features from other physical activities may better capture biomarkers of motor independence according to the FIM. Similarly, predictions for patients who were non-ambulatory did not significantly benefit from sensor data, revealing the need for alternative modeling approaches for patients with severe gait impairment.

Limitations

The number of incorrect predictions is a primary limitation of the models presented in this study. Indeed, a naïve model predicting no change in outcome classification from admission to discharge would generally perform well for this study sample, since only a fraction of the patients changed classes in our study (ie, 9%–50% of patients who were ambulatory, or 0%–25% of patients who were non-ambulatory, depending on the outcome). However, such a model will always fail to identify individuals who improved in functional class, who are arguably the most difficult and clinically meaningful cases to predict. In contrast, our models could identify some individuals who improved in the independence (10 out of 16) and risk of falling (5 out of 8) functional classes. The small and unbalanced populations in our single-site study may limit the sensitivity, generalizability, and utility of the proposed models, with a potential risk of overfitting in these high-dimensional feature sets. Larger sample sizes, particularly for patients who are non-ambulatory at admission and achieve heterogeneous discharge outcomes, will be crucial to further train and validate sensor-based prediction models.
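
The behavior of the naïve "no change" model described above can be sketched with a toy example. The class labels below are hypothetical and chosen only to show the effect; they are not study data, and the helper `evaluate` is an illustrative name.

```python
def evaluate(admission, discharge, predicted):
    """Return (overall accuracy, fraction of improvers correctly identified)."""
    n = len(discharge)
    accuracy = sum(p == d for p, d in zip(predicted, discharge)) / n
    improvers = [i for i in range(n) if discharge[i] > admission[i]]
    found = sum(predicted[i] == discharge[i] for i in improvers)
    recall = found / len(improvers) if improvers else float("nan")
    return accuracy, recall


# Hypothetical class labels (0 = lower functional class, 1 = higher class).
admission = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
discharge = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]  # the last two patients improved

naive = list(admission)  # predict that nobody changes class
acc, rec = evaluate(admission, discharge, naive)
print(f"naive accuracy = {acc:.2f}, improvers identified = {rec:.2f}")
```

In this toy sample the naïve prediction is right for 8 of 10 patients, yet it identifies none of the patients who improved, which is the limitation the paragraph above describes.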

Future work should also investigate sensor regression models that predict continuous outcome scores at discharge rather than classification models that predict categories based on a cut-off score. Regression models may offer greater clinical utility by removing reliance on predefined classification boundaries and providing higher-resolution discharge predictions, though possibly with greater sensitivity to error [6,14]. Alternative clinical outcomes (eg, Fugl-Meyer Assessment), sensor placements (eg, upper limbs), and functional abilities (eg, endurance) should also be considered in these models for various clinical applications.

We did not evaluate other machine learning algorithms, which may outperform L1-penalized logistic regression. Rather, we sought to understand the relative value of sensor data using a single, well-performing and interpretable algorithm for each of these outcomes and patient groups. Alternative algorithms and extended hyperparameter tuning could improve the prediction performance shown here.

A potential disadvantage of models trained to predict outcomes at hospital discharge is the use of hospital- and care-specific data. Because treatment strategies and patient characteristics can vary nationally and internationally, a model trained using data from one location may not generalize to others. For example, the PREP2 model [31,32], which demonstrated 75% accuracy in New Zealand for categorizing 3-month upper limb function at 1 week poststroke, had drastically lower accuracy for patients in the US and Europe [29,30]. This highlights the necessity for additional testing and external validation to determine whether site-specific training data are essential for prediction models, or whether combined training data from multiple sites would broaden generalization across IRFs.

Conclusions

This study affirms that motion-based measures from wearable sensors can be beneficial for predicting certain patient outcomes following acute poststroke rehabilitation. We have highlighted the potential and open challenges of moving these machine learning algorithms into clinical practice to inform tailored and effective rehabilitation therapies. While sensor-based models may increase predictive performance, additional research is needed to refine and validate these models for new patients and IRF settings.

Author Contributions

(I) Conception and design: M.K.O., R.L.L., A.J.
(II) Administrative support: R.L.L., A.J.
(III) Provision of study materials or patients: R.L.L., R.G., J.A.R., A.J.
(IV) Collection and assembly of data: M.K.O.
(V) Data analysis and interpretation: M.K.O., F.L., R.K., S.Y.S.
(VI) Manuscript writing: All authors
(VII) Final approval of manuscript: All authors

Acknowledgments

The authors thank Nsude Okeke Ewo, Alexander J. Boe, Marco Hidalgo-Araya, Sara Prokup, Matthew Giffhorn, Kelly McKenzie, Kristen Hohl, and Matthew McGuire for their help in patient recruitment and data collection.

Ethics Approval

All individuals (or a proxy) provided written informed consent, and the study was approved by the Institutional Review Board of Northwestern University (Chicago, IL; STU00205532).

Funding

This work was supported by the Shirley Ryan AbilityLab, with partial support from the National Institutes of Health under institutional training grants at Northwestern University (T32HD007418 to M.K.O.), a center grant to establish the Center for Smart Use of Technology to Assess Real-world Outcomes (C-STAR, P2CHD101899 to R.L.L.), and the National Institute on Aging of the NIH (R43AG067835 to R.L.L.). This work was also supported in part by a Research Career Scientist Award from the United States Department of Veterans Affairs Rehabilitation R&D Service (IK6 RX003351 to R.L.L.). The funders played no role in the design, conduct, or reporting of this study.

Disclosure

The authors completed the ICMJE Form for Disclosure of Potential Conflicts of Interest and reported no conflicts of interest.

References

[1] Centers for Disease Control and Prevention (CDC). Prevalence and Most Common Causes of Disability Among Adults --- United States, 2005. MMWR Morb Mortal Wkly Rep 2009; 58: 421–426.
[2] Le Danseur M. Stroke Rehabilitation. Crit Care Nurs Clin North Am 2020; 32: 97–108.
[3] Brandstater ME, Shutter LA. Rehabilitation Interventions During Acute Care of Stroke Patients. Top Stroke Rehabil 2002; 9: 48–56.
[4] Report to the Congress: Medicare Payment Policy – MedPAC. Washington, DC, https://www.medpac.gov/document/march-2022-report-to-the-congress-medicare-payment-policy/ (March 2022, accessed 11 September 2022).
[5] Kwah LK, Herbert RD. Prediction of Walking and Arm Recovery after Stroke: A Critical Review. Brain Sciences 2016; 6: 53.
[6] Campagnini S, Arienti C, Patrini M, et al. Machine learning methods for functional recovery prediction and prognosis in post-stroke rehabilitation: a systematic review. Journal of NeuroEngineering and Rehabilitation 2022; 19: 1–22.
[7] Harvey RL. Predictors of Functional Outcome Following Stroke. Physical Medicine and Rehabilitation Clinics of North America 2015; 26: 583–598.
[8] Stinear CM, Smith MC, Byblow WD. Prediction Tools for Stroke Rehabilitation. Stroke 2019; 50: 3314–3322.
[9] Harari Y, O’Brien MK, Lieber RL, et al. Inpatient stroke rehabilitation: Prediction of clinical outcomes using a machine-learning approach. J Neuroeng Rehabil 2020; 17: 1–10.
[10] Piron L, Piccione F, Tonin P, et al. Clinical correlation between motor evoked potentials and gait recovery in poststroke patients. Arch Phys Med Rehabil 2005; 86: 1874–1878.
[11] Stinear CM, Barber PA, Petoe M, et al. The PREP algorithm predicts potential for upper limb recovery after stroke. Brain 2012; 135: 2527–2535.
[12] Rondina JM, Filippone M, Girolami M, et al. Decoding post-stroke motor function from structural brain imaging. Neuroimage Clin 2016; 12: 372–380.
[13] Stinear CM, Byblow WD, Ackerley SJ, et al. PREP2: A biomarker-based algorithm for predicting upper limb function after stroke. Ann Clin Transl Neurol 2017; 4: 811–820.
[14] Adans-Dester C, Hankov N, O’Brien A, et al. Enabling precision rehabilitation interventions using wearable sensors and machine learning to track motor recovery. npj Digital Medicine 2020; 3: 1–10.
[15] Lee SI, Adans-Dester CP, Obrien AT, et al. Predicting and Monitoring Upper-Limb Rehabilitation Outcomes Using Clinical and Wearable Sensor Data in Brain Injury Survivors. IEEE Trans Biomed Eng 2021; 68: 1871–1881.
[16] O’Brien MK, Shin SY, Khazanchi R, et al. Wearable Sensors Improve Prediction of Post-Stroke Walking Function Following Inpatient Rehabilitation. IEEE J Transl Eng Health Med; 10. Epub ahead of print 2022. DOI: 10.1109/JTEHM.2022.3208585.
[17] Perry J, Garrett M, Gronley JK, et al. Classification of walking handicap in the stroke population. Stroke 1995; 26: 982–989.
[18] Bowden MG, Balasubramanian CK, Behrman AL, et al. Validation of a speed-based classification system using quantitative measures of walking performance poststroke. Neurorehabil Neural Repair 2008; 22: 672–675.
[19] Alexander MP. Stroke rehabilitation outcome: A potential use of predictive variables to establish levels of care. Stroke 1994; 25: 128–134.
[20] Teasell R, Foley N. Managing the Stroke Rehabilitation Triage Process. Evidence Based Review of Stroke Rehabilitation, http://www.ebrsr.com/evidence-review/4-managing-stroke-rehabilitation-triage-process (2008, accessed 22 March 2021).
[21] Berg K, Wood-Dauphinee S, Williams JI. The balance scale: Reliability assessment with elderly residents and patients with an acute stroke. Scand J Rehabil Med 1995; 27: 27–36.
[22] Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Inf Process Manag 2009; 45: 427–437.
[23] Murphy KP. Machine learning: a probabilistic perspective. Cambridge, MA: MIT Press, 2012.
[24] Stinear CM, Smith MC, Byblow WD. Prediction Tools for Stroke Rehabilitation. Stroke 2019; 50: 3314–3322.
[25] Bland MD, Sturmoski A, Whitson M, et al. Prediction of Discharge Walking Ability From Initial Assessment in a Stroke Inpatient Rehabilitation Facility Population. Arch Phys Med Rehabil 2012; 93: 1441–1447.
[26] Scrutinio D, Lanzillo B, Guida P, et al. Development and validation of a predictive model for functional outcome after stroke rehabilitation: the Maugeri model. Stroke 2017; 48: 3308–3315.
[27] Henderson CE, Fahey M, Brazg G, et al. Predicting Discharge Walking Function With High-Intensity Stepping Training During Inpatient Rehabilitation in Nonambulatory Patients Poststroke. Arch Phys Med Rehabil 2022; 103: S189–S196.
[28] Smith M-C, Barber AP, Scrivener BJ, et al. The TWIST Tool Predicts When Patients Will Recover Independent Walking After Stroke: An Observational Study. Neurorehabilitation and Neural Repair 2019; 2022: 461–471.
[29] Barth J, Waddell KJ, Bland MD, et al. Accuracy of an Algorithm in Predicting Upper Limb Functional Capacity in a United States Population. Arch Phys Med Rehabil 2022; 103: 44–51.
[30] Lundquist CB, Nielsen JF, Arguissain FG, et al. Accuracy of the Upper Limb Prediction Algorithm PREP2 Applied 2 Weeks Poststroke: A Prospective Longitudinal Study. Neurorehabil Neural Repair 2021; 35: 68–78.
[31] Stinear CM, Byblow WD, Ackerley SJ, et al. PREP2: A biomarker-based algorithm for predicting upper limb function after stroke. Ann Clin Transl Neurol 2017; 4: 811–820.
[32] Smith MC, Ackerley SJ, Barber PA, et al. PREP2 Algorithm Predictions Are Correct at 2 Years Poststroke for Most Patients. Neurorehabil Neural Repair 2019; 33: 635–642.

Table. Performance Metrics for Ambulatory and Non-Ambulatory Patients for Each Prediction Model and Feature Set^a

                                         Benchmark: PI+FA                         Streamlined: PI+IMU          Comprehensive: PI+FA+IMU
Patient Group           Prediction Model WF1      Accuracy  Log-Loss  IMU Task    WF1      Accuracy  Log-Loss  WF1      Accuracy  Log-Loss
Ambulatory (N = 32)     Ambulation       0.709    0.719     0.904     BBS         0.688    0.688     0.594     0.688    0.688     0.484
Ambulatory (N = 32)     Ambulation       0.709    0.719     0.904     10MWT^b     0.848^b  0.844     0.546     0.848^b  0.844     0.545
Ambulatory (N = 32)     Ambulation       0.709    0.719     0.904     BBS+10MWT   0.838    0.844     0.292     0.838    0.844     0.294
Ambulatory (N = 32)     Independence     0.685    0.688     1.236     BBS         0.588    0.594     1.016     0.622    0.625     1.044
Ambulatory (N = 32)     Independence     0.685    0.688     1.236     10MWT^b     0.657    0.656     1.211     0.688^b  0.688     1.346
Ambulatory (N = 32)     Independence     0.685    0.688     1.236     BBS+10MWT   0.563    0.563     1.422     0.622    0.625     1.081
Ambulatory (N = 32)     Risk of falling  0.534    0.531     1.380     BBS         0.347    0.344     1.116     0.566    0.563     0.727
Ambulatory (N = 32)     Risk of falling  0.534    0.531     1.380     10MWT^b     0.659^b  0.656     0.974     0.628    0.625     0.877
Ambulatory (N = 32)     Risk of falling  0.534    0.531     1.380     BBS+10MWT   0.597    0.594     1.788     0.658    0.656     0.785
Non-ambulatory (N = 8)  Ambulation       0.643    0.750     0.787     BBS^b       0.300    0.250     0.916     0.859^b  0.875     0.392
Non-ambulatory (N = 8)  Independence     0.933^b  0.875     0.782     BBS^b       0.933    0.875     0.284     0.933^b  0.875     0.407
Non-ambulatory (N = 8)  Risk of falling  1.000^b  1.000     0.246     BBS^b       0.933    0.875     0.294     1.000^b  1.000     0.184

^a 10MWT = 10-Meter Walk Test; BBS = Berg Balance Scale; FA = functional assessments; IMU = inertial measurement unit; PI = patient information; WF1 = weighted F1 score.
^b Highest WF1 for each model and patient group, indicating the best-performing parameter sets and IMU tasks to predict discharge outcomes.
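
The Table reports both accuracy and WF1. As a minimal sketch of how a weighted F1 score can differ from accuracy on imbalanced classes, the example below assumes the common support-weighted average of per-class F1 scores (the convention used, for instance, by scikit-learn's `f1_score(average="weighted")`); the exact formula used in the study is not restated in this section, and the labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average of per-class F1 scores, weighted by each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n_label in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n_label - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_label / total) * f1
    return score


# Hypothetical, imbalanced labels for 8 patients (not study data).
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 0]

acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
wf1 = weighted_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, weighted F1 = {wf1:.3f}")
```

With one minority-class case misclassified, the weighted F1 falls below the accuracy, because the per-class F1 of the smaller class is penalized more heavily than the overall hit rate suggests.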

Figure 1. Inpatient dataset available for model training and testing. Data were collected from 55 individuals undergoing poststroke inpatient rehabilitation at admission and discharge. Training sets for prediction models were determined based on ambulatory status at admission and the availability of IMU data from gait and balance tasks. For patients who were ambulatory at admission, we utilized their IMU data recorded during the 10MWT and BBS (N = 32). For patients who were non-ambulatory at admission, we combined IMU BBS data for both patients who were ambulatory and non-ambulatory (N = 50) and tested only on those who were non-ambulatory (N = 8). All models were tested using a leave-one-subject-out approach. 6MWT = 6-Minute Walk Test; 10MWT = 10-Meter Walk Test; Adm = admission; Amb = ambulatory; BBS = Berg Balance Scale; FIM = Functional Independence Measure; IMU = inertial measurement unit; Ind = independence; Non-amb = non-ambulatory.
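
The leave-one-subject-out testing mentioned in the Figure 1 caption can be sketched as follows. This is an illustrative split generator under assumed inputs, not the authors' code: `subject_ids` and the optional `eval_subjects` argument are hypothetical names introduced for this example.

```python
def leave_one_subject_out(subject_ids, eval_subjects=None):
    """Yield (held_out_subject, train_indices, test_indices).

    Each held-out subject's rows form the test set, and all other rows form
    the training set. If eval_subjects is given, only those subjects are held
    out in turn, while training still uses every other subject's rows.
    """
    pool = set(subject_ids) if eval_subjects is None else set(eval_subjects)
    for held_out in sorted(pool):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical subject labels, one entry per recording (not study data).
subject_ids = ["S01", "S01", "S02", "S03", "S03", "S03"]

splits = list(leave_one_subject_out(subject_ids))
for held_out, train, test in splits:
    print(held_out, "train:", train, "test:", test)

# Hold out only a subset of subjects while training on everyone else.
subset_splits = list(leave_one_subject_out(subject_ids, eval_subjects={"S03"}))
```

Restricting `eval_subjects` to a subgroup loosely mirrors the setup described for patients who were non-ambulatory, where models were trained on the combined group and tested only on the non-ambulatory subset.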

Figure 2. Data pipeline for prediction models. Data collected at inpatient rehabilitation facility (IRF) admission (PI, FA, and IMU signals) were combined in different feature sets and input into an L1-penalized logistic regression model. The model was trained to predict functional outcomes at IRF discharge, related to the classification of ambulation, independence, and risk of falling. 6MWT = 6-Minute Walk Test; 10MWT = 10-Meter Walk Test; Acc = accelerometer; BBS = Berg Balance Scale; FA = functional assessments; FIM = Functional Independence Measure; Gyr = gyroscope; IMU = inertial measurement unit; ML = machine learning; PI = patient information; TUG = Timed “Up & Go” test; X1, X2, X3, XN = example features extracted from admission data.

Figure 3. Prediction models for ambulation at discharge (ambulatory at admission). (A) WF1, accuracy, and log-loss for the benchmark model (PI+FA), streamlined sensor model (PI+IMU10MWT), and comprehensive model (PI+FA+IMU10MWT). (B) Confusion matrices. (C) 10MWT score at admission (circles) and discharge (crosses) timepoints. Values at discharge are marked in blue if correctly predicted by the best-performing model (simplest model with the highest WF1), or in red if incorrectly predicted. (D) Median and interquartile ranges of the coefficients fit to the most important features for the best-performing model. 10MWT = 10-Meter Walk Test; Acc = accelerometer; ȧ = derivative of acceleration; 𝜔̇ = derivative of gyroscope; Adm = admission; Amb = ambulatory; AoM = amount of motion; AS = affected side; Dis = discharge; FA = functional assessments; Gyr = gyroscope; IMU = inertial measurement unit; PI = patient information; PSD = power spectral density; SampEn = sample entropy; US = unaffected side; WF1 = weighted F1 score.

Figure 4. Prediction models for independence at discharge (ambulatory at admission). (A) WF1, accuracy, and log-loss for the benchmark model (PI+FA), streamlined sensor model (PI+IMU10MWT), and comprehensive model (PI+FA+IMU10MWT). (B) Confusion matrices. (C) 10MWT score at admission (circles) and discharge (crosses) timepoints. Values at discharge are marked in dark green if correctly predicted by the best-performing model (simplest model with the highest WF1), or in red if incorrectly predicted. (D) Median and interquartile ranges of the coefficients fit to the most important features for the best-performing model. 6MWT = 6-Minute Walk Test; 10MWT = 10-Meter Walk Test; 𝜔̇ = derivative of rotational velocity (from gyroscope); Acc = accelerometer; Adm = admission; Amb = ambulatory; AS = affected side; BBS = Berg Balance Scale; Dis = discharge; FA = functional assessments; FIM = Functional Independence Measure; Gyr = gyroscope; IMU = inertial measurement unit; Ind = independence; PI = patient information; PSD = power spectral density; US = unaffected side; WF1 = weighted F1 score.

Figure 5. Prediction models for risk of falling at discharge (ambulatory at admission). (A) WF1, accuracy, and log-loss for the benchmark model (PI+FA), streamlined sensor model (PI+IMU10MWT), and comprehensive model (PI+FA+IMU10MWT). (B) Confusion matrices. (C) 10MWT score at admission (circles) and discharge (crosses) timepoints. Values at discharge are marked in blue if correctly predicted by the best-performing model (simplest model with the highest WF1), or in red if incorrectly predicted. (D) Median and interquartile ranges of the coefficients fit to the most important features for the best-performing model. 10MWT = 10-Meter Walk Test; ȧ = derivative of acceleration; 𝜔̇ = derivative of rotational velocity (from gyroscope); a(fmax) = amplitude at maximum frequency; Acc = accelerometer; Adm = admission; Amb = ambulatory; AS = affected side; BBS = Berg Balance Scale; Dis = discharge; FA = functional assessments; Gyr = gyroscope; IMU = inertial measurement unit; PI = patient information; RMS = root mean square; SampEn = sample entropy; US = unaffected side; WF1 = weighted F1 score.

