ICACE 2020 Presentation

The document describes a study that aimed to develop models using multiple linear regression and feedforward neural networks to predict hospital admissions in Dhaka, Bangladesh due to chest diseases. Daily indoor patient data from a hospital was used as the output variable, with air quality and meteorological data from monitoring stations as input variables. Both linear regression and neural network models were implemented and compared based on their ability to predict admissions, with the neural network models showing slightly better performance than linear regression.

Uploaded by

Ishmam Shahid

2020

Predicting Hospital Admissions in Dhaka due to Chest Diseases using Multiple Linear Regression and Feed Forward Neural Network

R.A. Rafsan1*, Z.S. Ishmam2, T. Ahammed3


1Department of Civil Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh,
email: [email protected]
2Department of Civil Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh,
email: [email protected]
3Department of Civil Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh,
email: [email protected]
*Corresponding Author
Air Pollution: The Invisible Killer

• Lung cancer: 29% of deaths from lung cancer
• Stroke: 24% of deaths from stroke
• Heart disease: 25% of deaths from heart disease
• Lung disease: 43% of deaths from lung disease

Not always visible, but deadly!

Source: https://blog.liftshare.com/liftshare/air-pollution-the-invisible-killer


Air Pollution in Dhaka City
• Ranked worst in terms of AQI multiple times
• Pollutant concentration levels are high throughout the year, specifically during dry periods
• Accounts for the risk of 17.6% of the respiratory problems in Bangladesh (Source: Siddiqui et al. (2020))

Photo Courtesy: The Daily Star

Objective
• Develop models to predict hospital admission due to chest diseases using air quality levels and meteorological parameters as predictors

• Models: Multiple Linear Regression (MLR) and Multi-layer Perceptron (MLP), a type of feedforward Artificial Neural Network (ANN)

• Compare the model performances

Methodology

Study Area
Dhaka City

Data Collection
• Hospital Admission: Daily indoor patients of National Institute of Diseases of the Chest and Hospital (NIDCH), Dhaka
• Air Quality: From the 3 CAMS of Department of Environment (DoE)
• Meteorological Parameters: From Bangladesh Meteorological Department (BMD)

Time Period
January 2013 – December 2018

[Figure: Overall Process]

Dataset Summary

[Figure: box plots of the dataset. Boxes show the median, quartiles and range; whiskers show the range for all variables.]

Output/Response Variable
Daily Indoor Patient Data, NIDCH (2013-2018)

[Figure: time series of No. of Indoor Patients (y-axis, 0 to 120) against Date (x-axis, 1/1/2013 to 10/1/2018)]

Data Preprocessing: Missing Value Imputation

Variables Containing Missing Values
SO2, NO2, CO, O3, PM2.5, PM10, Solar Radiation, No. of Indoor Patients

Imputation Technique
• Replaced a missing value of a specific chronology in a year with the average value of the previous and next year's data of the same chronology
• The patient data was kept unchanged as it was the output variable in our models

Summary

Variable      Unprocessed Data (%)          After Imputation (%)
              CAMS 1   CAMS 2   CAMS 3      CAMS 1   CAMS 2   CAMS 3
SO2           78.5     21.2     12.1        7.3      4.1      4.5
NO2           84.1     42.0     12.8        8.4      2.6      1.6
CO            54.7     14.6     24.6        10.2     3.1      0.2
O3            65.3     11.4     17.5        7.1      3.0      1.5
PM2.5         43.8     35.9     5.0         7.6      7.9      0.4
PM10          52.6     46.6     5.2         5.5      6.8      0.3
Solar Rad.    36.7     34.5     2.6         5.2      2.5      0.2

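The imputation rule on this slide can be sketched roughly as follows. This is an illustration only, not the authors' code: the function name `impute_by_adjacent_years`, the dictionary layout keyed by `(year, day_of_year)`, and the choice to leave a gap unfilled when either neighbouring year is also missing are all assumptions.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, int]  # (year, day_of_year); this layout is an assumption


def impute_by_adjacent_years(
    series: Dict[Key, Optional[float]],
) -> Dict[Key, Optional[float]]:
    """Fill each missing value with the mean of the same day in the
    previous and the next year. A gap stays missing (None) when either
    neighbouring value is unavailable (an assumption of this sketch)."""
    filled = dict(series)
    for (year, day), value in series.items():
        if value is not None:
            continue
        # Look up the original (unfilled) neighbours, so imputed values
        # never feed into other imputations.
        prev_val = series.get((year - 1, day))
        next_val = series.get((year + 1, day))
        if prev_val is not None and next_val is not None:
            filled[(year, day)] = (prev_val + next_val) / 2.0
    return filled


# Hypothetical PM2.5 readings for day 15 of four consecutive years.
pm25 = {
    (2014, 15): 180.0,
    (2015, 15): None,   # both neighbours present, so it can be filled
    (2016, 15): 200.0,
    (2017, 15): None,   # no 2018 value in this toy series, so it stays missing
}
imputed = impute_by_adjacent_years(pm25)
```

A rule of this kind cannot fill every gap, which is consistent with the non-zero percentages that remain in the "After Imputation" columns.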
Data Preprocessing: Data Cleaning and Data Scaling

Data Cleaning
Uniform datasets were prepared by ensuring no sample in the datasets contained any missing value

Feature Scaling
Z-score Normalization

$x_{new} = \frac{x - \mu}{\sigma}$

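The two preprocessing steps on this slide could look roughly like the sketch below. It is not taken from the study: the helper names `drop_incomplete` and `zscore_columns`, the use of `None` as the missing-value marker, and the use of the population standard deviation for σ are assumptions.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence


def drop_incomplete(rows: Sequence[Sequence[Optional[float]]]) -> List[List[float]]:
    """Data cleaning: keep only samples (rows) with no missing value."""
    return [list(r) for r in rows if all(v is not None for v in r)]


def zscore_columns(rows: List[List[float]]) -> List[List[float]]:
    """Feature scaling: x_new = (x - mu) / sigma, applied per column."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]  # population sigma: an assumption
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


# Hypothetical samples: [PM2.5, temperature]; None marks a missing value.
raw = [[100.0, 20.0], [None, 25.0], [200.0, 30.0], [300.0, 40.0]]
clean = drop_incomplete(raw)      # the second sample is dropped
scaled = zscore_columns(clean)    # each column now has mean 0 and sigma 1
```

A constant column would give σ = 0 and a division by zero, so a real pipeline would need to guard against that case.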
Multiple Linear Regression
Implementation Framework: Scikit-learn (version 0.23.1), on Python

$y = \alpha + \sum_{i=1}^{11} \beta_i x_i$

Here,
$y$ = value of response variable (daily no. of indoor patients),
$\alpha$ = unknown regression bias,
$\beta_i$ = unknown regression coefficients,
$x_i$ = values of independent variables.

Artificial Neural Network (MLP) Architecture
Implementation Framework: PyTorch (version 0.23.1), on Python

[Figure: Schematic Diagram of the feedforward neural network architecture]

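The multiple linear regression model on this slide can be illustrated with a small least-squares fit. The study reports using Scikit-learn; the dependency-free sketch below is not the authors' implementation. The function names `fit_mlr` and `predict` are hypothetical, and the toy data uses two predictors rather than the eleven in the slide's equation.

```python
from typing import List, Sequence, Tuple


def fit_mlr(X: Sequence[Sequence[float]], y: Sequence[float]) -> Tuple[float, List[float]]:
    """Ordinary least squares via the normal equations (A^T A) w = A^T y,
    where A is X with a leading column of ones for the bias alpha."""
    A = [[1.0, *row] for row in X]
    p = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(p)]
        + [sum(a[i] * t for a, t in zip(A, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * pv for v, pv in zip(M[r], M[col])]
    w = [M[i][p] / M[i][i] for i in range(p)]
    return w[0], w[1:]


def predict(alpha: float, betas: Sequence[float], x: Sequence[float]) -> float:
    """Evaluate y = alpha + sum_i beta_i * x_i for one sample."""
    return alpha + sum(b * xi for b, xi in zip(betas, x))


# Hypothetical data generated exactly from y = 5 + 2*x1 - 3*x2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [5 + 2 * a - 3 * b for a, b in X]
alpha, betas = fit_mlr(X, y)  # recovers alpha near 5 and betas near [2, -3]
```

The normal equations are used here only to keep the sketch self-contained; a library solver is usually more numerically robust when predictors are strongly correlated.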
Model Training and Testing
• Datasets: Data from the 3 CAMS, Meteorological Data
• Models: MLR; MLP with loss functions MSE, MAE and Huber Loss
• Training and Testing: Training with Training subset, Final Prediction with Test subset

Result Comparison (RMSE - Root Mean Squared Error)


MLP MLP MLP
Station MLR
(MSE-trained) (MAE-trained) (Huber Loss-trained)

Sangshad (CAMS-1) 13.321 13.413 13.447 13.133

Farmgate (CAMS-2) 13.400 13.548 13.450 13.380

Darussalam (CAMS-3) 12.524 12.803 12.879 12.978
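For reference, the three MLP training losses and the RMSE metric named on this slide can be written out as below. This is a plain-Python sketch using the standard textbook definitions, not the study's code. The Huber threshold `delta = 1.0` is an assumption (the slides do not state the value used), and the sample numbers are hypothetical.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def huber(y_true: Sequence[float], y_pred: Sequence[float], delta: float = 1.0) -> float:
    """Huber loss: quadratic for |error| <= delta, linear beyond it."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, the comparison metric in the table."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical daily admission counts and model predictions.
observed = [40.0, 55.0, 90.0]
predicted = [42.0, 52.0, 70.0]
```

MSE weights the large miss on the third day quadratically, while MAE and Huber weight it linearly, which is one reason the choice of loss can matter for data with surge events.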


Study Findings
Optimum Model: MLP trained with MSE-loss using air pollution data from CAMS-3
▷ Artificial Neural Networks (ANN) → better prediction performance
▷ These models can identify trends but are not good at predicting extreme values
▷ On weekly and government holidays, the patient records are usually low, followed by a surge event on the next working day
▷ Unpredictable nature of the dependent variable, due to different surge events

[Figure: No. of hospital admissions for chest diseases predicted by the optimum model (weekly moving average)]

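The weekly moving average used in the prediction plot could be computed along these lines. This is a minimal sketch, assuming a trailing 7-day window; whether the authors used a trailing or a centred window is not stated. The function name `moving_average` and the sample values are hypothetical.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int = 7) -> List[float]:
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical daily admissions: a low holiday count (20) followed by a
# surge on the next working day (80).
daily = [50.0, 52.0, 48.0, 51.0, 49.0, 20.0, 80.0, 50.0, 51.0]
weekly = moving_average(daily)
```

In this toy series the holiday dip and the following surge largely cancel within one window, which illustrates why a weekly average shows the trend more clearly than the raw daily counts.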
Scopes for Future Research

• Including the effect of Seasonal Variability

• Training with a larger dataset with fewer missing values

• Better handling of missing values

• Other Neural Network architectures, specifically RNN-based LSTM networks

• Study on multiple hospitals

Thank You!
Do you have any questions?

Contact:
Rizvan Ahmed Rafsan. Email: [email protected]