A Pneumonia Detection Method Based On Improved Convolutional Neural Network
Abstract—For traditional machine learning and image processing methods, feature extraction is difficult, and the quality of the extracted features directly affects classification accuracy. Another problem is that a large difference between the source dataset and the target dataset in transfer learning leads to unsatisfactory results. In addition, the original convolutional network model is too shallow and its recognition rate is not high. Therefore, this paper presents an improved convolutional neural network method for pneumonia detection based on a deep learning model. First, the images in the original datasets are resized to a fixed size and an appropriate batch size is used as the input of the network. Then, convolution layers and pooling layers are added on the basis of the LeNet-5 model. Finally, a feature integration layer is added to construct an optimal model for accurate classification. The proposed method not only avoids the complex feature extraction process but also has fewer parameters than other classical convolutional neural networks. Using two public datasets, the second of which was provided by the RSNA® and Kaggle medical image pneumonia recognition competition, a series of experiments on a single dataset and on the fusion of the two datasets achieved final training accuracies of 98.83% and 98.44% respectively, and test accuracies of 97.26% and 91.41% respectively. Compared with existing transfer learning, GoogleNet Inception V3+Data Augmentation (GIV+DA) and GIV3+RF models, this model shows clear improvements.

Keywords—image processing; deep learning; pneumonia detection; convolutional neural network

I. INTRODUCTION

Pneumonia is a common lung disease. At present, the methods used to detect pneumonia mainly include X-ray, CT and so on. As auxiliary means of medical diagnosis[1], these methods have always played an important role in clinical medicine and the diagnosis of disease. However, using these methods to diagnose diseases is a highly challenging task, which requires doctors to have excellent professional knowledge and rich experience. In recent years, with the continuous development of medicine and computer science, more and more computer-aided diagnosis systems have been applied in various areas of medicine, and satisfactory results have been achieved. This not only reduces the burden on professional doctors but also improves the accuracy of identification.

In pursuit of higher accuracy and better recognition results, many researchers at home and abroad are committed to the study of medical images. Liang Zhang[2] of South China University of Technology proposed a method based on LBP feature extraction, which used a BP neural network to detect the extracted pneumonia features; compared with current convolutional networks, the BP neural network still seems somewhat inadequate, its recognition rate is not high, and it is prone to over-fitting and a series of other problems. Yang Li[3] of Shanghai Jiao Tong University also took the LBP information of pneumonia images as the classification feature and combined it with a Support Vector Machine (SVM) classifier to identify pneumonia; however, when the training set is too large, the amount of computation and the number of parameters increase exponentially, so the resulting model is not particularly ideal. Wenbo Xiang[4] of Harbin University of Science and Technology used a CNN algorithm to automatically extract the characteristics of pneumonia and finally obtained a better recognition result. Xinyu He[5] of Wuhan University of Science and Technology adopted a pre-trained model, used the GoogleNet Inception V3 network trained on the ImageNet dataset to extract features, combined it with a random forest[6] classifier for classification, and finally achieved an accuracy rate of 96.77%. It can be seen that, compared with some traditional image processing and machine learning methods, deep learning has the advantages of automatic feature extraction and higher accuracy. In order to achieve a greater breakthrough in pneumonia detection, this paper proposes an improved convolutional neural network based on the LeNet network and successfully applies it to a pneumonia detection model.

The main research problem in this paper is the detection of pneumonia: given chest X-ray images, the proposed algorithm detects whether an image shows pneumonia or not. The main contributions are as follows:

(a) According to an analysis of the dataset, appropriate images are obtained and the input layer of the network is designed to be 64*64 in size.

(b) The main task of the pneumonia detection model is to distinguish two categories: pneumonia images and normal images. Therefore, the output layer of the traditional LeNet-5 model is changed from 10 neurons to 2 neurons.

(c) To address the problem that a convolutional neural network needs a large number of training images, the original dataset (1) is preprocessed and expanded to 8 times the original data, and dataset (1) and dataset (2) are fused to further increase the amount of data, so as to prevent network overfitting from leading to poor test results.

(d) On the basis of the original LeNet-5 network, convolution layers and pooling layers are added. In addition, a feature integration layer is added at the end of the network to make the obtained features more abstract, make the network deeper, and improve the recognition accuracy.
II. PNEUMONIA IMAGE DETECTION SYSTEM FRAMEWORK

A. Neural Network Model

Traditional image classification algorithms usually require manual participation in feature extraction. However, when the features are relatively complex, researchers need rich background knowledge, which costs a lot of manpower and material resources and results in models with low generalization ability and poor practical performance. The simple neuron structure of a traditional neural network is shown in figure 1. It is mainly composed of an input layer, hidden layers and an output layer, where a1-a6 are neuron nodes and a3-a5 are hidden-layer neurons.

Figure. 1. Traditional neural network model
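To make the structure in figure 1 concrete, the following minimal NumPy sketch performs a forward pass through such a network (two inputs a1-a2, three hidden neurons a3-a5, one output a6); the sigmoid activation and the random weights are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
a_in = rng.standard_normal(2)                        # inputs a1, a2
W1, b1 = rng.standard_normal((3, 2)), np.zeros(3)    # input -> hidden layer (a3-a5)
W2, b2 = rng.standard_normal((1, 3)), np.zeros(1)    # hidden -> output neuron (a6)

hidden = sigmoid(W1 @ a_in + b1)                     # hidden activations a3, a4, a5
output = sigmoid(W2 @ hidden + b2)                   # output a6
print(output)
```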
The emergence of deep learning technology, proposed by Hinton in 2006, has opened up a new research direction in modern computer science. In essence, it is supervised training: a complex feature extraction process is carried out through a series of pooling layers and convolutional layers in a convolutional neural network, using a large amount of labelled data[7]. This process is completely automatic, without manual participation, and finally a mapping from the input to the output is obtained. Compared with traditional research methods[8], many complicated steps are eliminated, the obtained model has better generalization ability and wider applicability, and deep learning provides better learning ability and better recognition results. A convolutional neural network has the characteristics of local receptive fields, weight sharing and pooling. The local receptive field extracts block features layer by layer; weight sharing prevents an exponential growth of parameters when there are many convolution layers, reducing both the parameters between neurons and the time required for training; pooling further reduces the number of parameters in the network.

B. Convolutional Neural Network

The model of a convolutional neural network consists of convolution layers, pooling layers and full connection layers. The convolution layers and pooling layers are stacked alternately. After the full connection layers, a softmax layer is connected to map the probability of each category to the output of the network.

(i). Convolution Layer

For a convolutional neural network, the convolutional layer is the most important layer and is usually called its core. It is composed of different feature planes, and each feature plane is composed of multiple neurons. Usually the first convolution layer takes the input of the network model. In a convolutional neural network, the input of each node is a small block of the previous layer, processed by what is called a convolution kernel or filter. Commonly used convolution kernel sizes are 1*1, 3*3, 5*5, 7*7 and 11*11; of course, the size of the convolution kernel can also be changed according to the requirements of the network. Because the convolution layer depends on the size of the convolution kernel, the result of the convolution operation also differs. Stacking multiple small convolution kernels is generally better than using a single large convolution kernel: with the connectivity unchanged, the number of parameters and the computational complexity of the network model are greatly reduced. The formula for the convolution layer is:

$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\right)$  (1)

where $l$ is the current layer, $M_j$ is the convolution window (the set of input maps) corresponding to the $j$-th convolution kernel, $k$ is the convolution kernel, $b$ is the bias term of the current layer, and the activation function $f$ is generally sigmoid, ReLU, etc. ReLU is used as the activation function in this paper.
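As an illustration of formula (1), the minimal NumPy sketch below computes one output feature map from a set of input feature maps, using the usual CNN cross-correlation convention, a bias term and ReLU; the array sizes are chosen only for the example and are not taken from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def conv_feature_map(inputs, kernels, bias):
    """Formula (1) for a single output map j: sum of 2-D (cross-correlation)
    convolutions of the selected input maps with their kernels, plus a bias,
    followed by ReLU.
    inputs:  list of 2-D arrays x_i^{l-1} (the maps in M_j)
    kernels: list of 2-D arrays k_ij^l, one per input map
    bias:    scalar b_j^l
    """
    h, w = inputs[0].shape
    kh, kw = kernels[0].shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for x, k in zip(inputs, kernels):
        for r in range(out.shape[0]):
            for c in range(out.shape[1]):
                out[r, c] += np.sum(x[r:r + kh, c:c + kw] * k)
    return relu(out + bias)

# Example: two 8x8 input maps and 3x3 kernels (sizes chosen for illustration only).
rng = np.random.default_rng(0)
maps = [rng.standard_normal((8, 8)) for _ in range(2)]
kernels = [rng.standard_normal((3, 3)) for _ in range(2)]
print(conv_feature_map(maps, kernels, bias=0.1).shape)  # (6, 6)
```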
(ii). Pooling Layer

The pooling layer is often referred to as the sampling layer. It does not change the depth of the three-dimensional feature volume, but it reduces the spatial size of the matrix. The pooling operation can be regarded as transforming an image of higher resolution into an image of lower resolution. Through pooling, the number of parameters of the final full connection layer can be further reduced, which reduces the parameters of the whole neural network and also plays a certain role in preventing the network from overfitting.
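A minimal sketch of the max-pooling operation used later in this paper (3*3 window with stride 2); padding is omitted here for brevity and the input size is illustrative only.

```python
import numpy as np

def max_pool(x, size=3, stride=2):
    """Max pooling over a single 2-D feature map. Applied per feature map,
    it leaves the depth unchanged and only shrinks the spatial size."""
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            window = x[r * stride:r * stride + size, c * stride:c * stride + size]
            out[r, c] = window.max()
    return out

feature_map = np.arange(64.0).reshape(8, 8)   # illustrative 8x8 map
print(max_pool(feature_map).shape)            # (3, 3): spatial size reduced
```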
(iii). Full Connection Layer

The full connection layer is generally located at the end of the convolutional neural network and integrates the trained features. After training, a classifier is obtained; in this paper, a softmax classifier is used to obtain the final result.

C. Improved Convolutional Neural Network Structure

The LeNet-5 network[9] structure was first proposed by Yann LeCun in 1998 in experiments on handwritten digit recognition. This paper improves the original network structure: the two convolution layers are increased to six convolution layers, a pooling operation is carried out after each convolution layer, and max pooling is selected. A feature integration layer is added after the full connection layer to further abstract the original features. The feature integration layer is shown in figure 2:
Figure. 2. Feature integration layer (labels in the figure: early feature extraction, CNN, feature integration layer; fully connected sizes 2048, 64 and 2 with dropout)

In order to obtain a better detection result, the CNN model designed in this paper is shown in figure 3. It includes 6 convolution layers, 6 pooling layers, and a feature integration layer composed of 3 full connection layers and 3 dropout layers. Experiments show that a 3*3 convolution kernel is the most suitable size, so the size of the convolution kernel is unified as 3*3 in this paper. If the stride is too large, important features are easily lost in the process of convolution, so the stride of the convolution is set to 1*1 and 'SAME' padding is used. The filter size of the pooling layer is 3*3 and the stride is 2*2; max pooling is adopted uniformly, and a pooling layer is connected after each convolution layer to downsample the convolution layer information, so as to reduce the parameters of the model and improve its generalization ability. In this CNN model, convolutional layers 1-3 are used to extract relatively low-level features and convolutional layers 4-6 are used to extract relatively high-level features. Because the relatively low-level features are comparatively simple to extract, the numbers of convolution kernels used in the first three layers of the improved LeNet network are 64, 128 and 256 respectively, while 512, 1024 and 2048 convolution kernels are used in the subsequent layers 4-6 to extract the higher-level features. Finally, the probability of the final binary classification is obtained in the feature integration layer composed of three full connection layers and three dropout layers.
Figure. 3. Structure of the improved convolutional neural network (input 64x64; C1 64x64x64, S1 32x32x64, C2 32x32x128, S2 16x16x128, C3 16x16x256, …, S6; feature integration layer: 2048, 64, 64, 2)
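The following tf.keras sketch shows one way the improved network described above could be assembled: six 3*3 convolution layers with 64-2048 kernels, stride 1 and 'SAME' padding, each followed by 3*3/stride-2 max pooling, and a feature integration layer ending in a 2-way softmax. The fully connected sizes (2048, 64, 64) are read from figures 2 and 3, and the dropout rate is an assumption; this is a sketch, not the authors' original implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_improved_lenet(input_shape=(64, 64, 1), dropout_rate=0.5):
    """Sketch of the improved LeNet-style CNN described in the text.
    dropout_rate is an assumption; the paper does not state its value."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    # Six conv blocks: 3x3 kernels, stride 1, 'SAME' padding, ReLU,
    # each followed by 3x3 max pooling with stride 2.
    for filters in (64, 128, 256, 512, 1024, 2048):
        x = layers.Conv2D(filters, 3, strides=1, padding="same",
                          activation="relu")(x)
        x = layers.MaxPooling2D(pool_size=3, strides=2, padding="same")(x)
    # Feature integration layer: three fully connected layers with dropout,
    # ending in the 2-way softmax (pneumonia vs. normal).
    x = layers.Flatten()(x)
    for units in (2048, 64, 64):
        x = layers.Dense(units, activation="relu")(x)
        x = layers.Dropout(dropout_rate)(x)
    outputs = layers.Dense(2, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs, name="improved_lenet")

model = build_improved_lenet()
model.summary()
```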
In this paper, ReLU[10] ($f(x) = \max(0, x)$) is used as the activation function after each convolutional layer to introduce non-linearity and improve the non-linear expression ability of the model. The cross-entropy loss function, given in formula (2), and the Adam gradient descent algorithm are combined to carry out back propagation; after each iteration, the Adam algorithm is used to adjust the parameters in the feature integration layer. The output of the feature integration (fusion) is computed as shown in (3).

$H(p, q) = -\sum_{x} p(x) \log(q(x))$  (2)

where $p(x)$ is the real sample distribution probability, $q(x)$ is the probability estimate calculated by the convolutional neural network model from the input data, and $H(p, q)$ is the loss between the two.

$X_{out} = f(W X_{in} + b)$  (3)

where $X_{out}$ is the output of the feature integration layer, $X_{in}$ is the input of the full connection layer, $W$ and $b$ are the weight and bias learned during the training process, and $f(\cdot)$ is the ReLU activation function. The parameter update formulas in the feature integration layer are shown in (4) and (5):

$W_{ij}^{l} = W_{ij}^{l} - \beta \dfrac{\partial J}{\partial W_{ij}^{l}}$  (4)

$b_{i}^{l} = b_{i}^{l} - \beta \dfrac{\partial J}{\partial b_{i}^{l}}$  (5)

where $\beta$ is the learning rate (1e-5 in this paper), $W_{ij}^{l}$ is the weight updated between neurons, and $b_{i}^{l}$ is the bias. Formulas (4) and (5), together with the Adam gradient descent algorithm[11], are used to update the parameters of the whole network and of the feature integration layer.
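As a sketch of how formulas (2), (4) and (5) fit together, the TensorFlow 2-style snippet below performs one gradient step on a single dense layer using the cross-entropy loss. It uses plain gradient descent with β = 1e-5 to mirror the update equations (the paper itself uses Adam, which additionally rescales the raw gradients with moment estimates), and the layer and batch sizes are illustrative.

```python
import tensorflow as tf

beta = 1e-5                                   # learning rate from the text
layer = tf.keras.layers.Dense(2)              # a stand-in for one FC layer
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

x = tf.random.normal((8, 64))                 # illustrative batch of features
y = tf.one_hot(tf.random.uniform((8,), maxval=2, dtype=tf.int32), depth=2)

with tf.GradientTape() as tape:
    logits = layer(x)
    loss = loss_fn(y, logits)                 # formula (2): H(p, q)

grads = tape.gradient(loss, layer.trainable_variables)
# Formulas (4) and (5): W <- W - beta * dJ/dW, b <- b - beta * dJ/db
for var, grad in zip(layer.trainable_variables, grads):
    var.assign_sub(beta * grad)
```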
III. EXPERIMENTAL TEST AND RESULT ANALYSIS

A. Experimental Environment

In order to verify the feasibility of the pneumonia detection algorithm based on the improved convolutional neural network proposed in this paper, the experimental environment is shown in Table I:
TABLE I. EXPERIMENTAL ENVIRONMENT

Category            Parameter
Operating system    Windows 10, 64-bit
CPU                 Core i5-9400F @2.90GHz
GPU                 Nvidia GeForce GTX 1660
Memory              8 GB DDR4 2666MHz
Python              3.5.2
TensorFlow          1.14
B. Dataset

Dataset (1)

The first dataset used in this paper is the public pneumonia dataset (Chest X-ray Images) released in 2018 by Daniel Kermany et al.[12] at the University of California, San Diego. This dataset contains two categories, normal and pneumonia; after integrating the original data, there were 1575 normal images and 4265 pneumonia images. To address the problem of insufficient data for training a convolutional neural network model, the dataset was expanded in this experiment: by increasing brightness, contrast and sharpness, applying Gaussian blur, and reducing brightness, contrast and sharpness, the dataset was expanded to 8 times its original size[13], as shown in figure 4. Finally, all the data were integrated and randomly shuffled, and the expanded dataset was divided into a training set and a test set at a ratio of 7:3, as shown in Table II:

TABLE II. EXPANDED DATASET

Dataset    Normal    Pneumonia    Total
Training   8820      23890        32710
Test       3780      10238        14018

Figure. 4. Dataset expansion (the original image and seven transformations: increased/reduced contrast, sharpness and brightness, plus Gaussian blur)
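A possible implementation of the eight-fold expansion shown in figure 4, using Pillow; the enhancement factors (1.5 for "increase", 0.5 for "reduce"), the blur radius and the file name are illustrative assumptions, since the paper does not state them.

```python
from PIL import Image, ImageEnhance, ImageFilter

def expand_image(img):
    """Return the original image plus seven variants (8x expansion, cf. figure 4).
    Enhancement factors and blur radius are assumptions for illustration."""
    variants = {"original": img}
    for name, enhancer in (("brightness", ImageEnhance.Brightness),
                           ("contrast", ImageEnhance.Contrast),
                           ("sharpness", ImageEnhance.Sharpness)):
        variants[f"increase_{name}"] = enhancer(img).enhance(1.5)
        variants[f"reduce_{name}"] = enhancer(img).enhance(0.5)
    variants["gaussian_blur"] = img.filter(ImageFilter.GaussianBlur(radius=2))
    return variants

# Usage: resize a chest X-ray to the 64*64 network input and expand it.
img = Image.open("chest_xray_sample.jpeg").convert("L").resize((64, 64))  # hypothetical file name
for name, variant in expand_image(img).items():
    variant.save(f"expanded_{name}.png")
```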
Dataset (2)

To further verify the robustness of the proposed algorithm, another public dataset is used in this paper. This dataset was provided for the September 2018 medical image pneumonia recognition competition sponsored by the Radiological Society of North America (RSNA®) in association with Kaggle. After screening, a total of 12500 normal samples and 4300 pneumonia samples were collected. Because both datasets contain normal and pneumonia categories, the Chest X-ray Images and RSNA® datasets were fused into a new dataset, and all images were uniformly resized to 64*64 during the experiment. Because of the large number of samples in the fused dataset, no expansion was performed. After fusion, there were 14075 normal samples and 8565 pneumonia samples. Similarly, the samples were randomly divided into a training set and a test set at a ratio of 7:3. The data samples are shown in Table III:

TABLE III. DATASET AFTER FUSION

Dataset    Normal    Pneumonia    Total
Training   9852      5995         15847
Test       4223      2570         6793

C. The Experimental Process

First, dataset (1) is processed into the input of the improved LeNet model: all images are resized to 64*64. Appropriately increasing the batch size and decreasing the learning rate promotes smooth convergence of the model. After repeatedly adjusting the learning rate, batch size and number of iterations, the batch size was finally set to 256, the learning rate to 1e-5, and the number of iterations to 10000, at which point the model converged smoothly. The final training accuracy is 98.83% with a loss of 0.04, and the test accuracy is 97.26% with a loss of 0.12; the results are shown in figures 5 and 6, 2.06% better than the best existing model. This shows that the pneumonia detection algorithm based on the improved convolutional neural network proposed in this experiment performs well.
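The experimental configuration described above could be expressed as follows with tf.keras, reusing the build_improved_lenet() sketch from section II-C; the x_train/y_train and x_test/y_test variables are assumed to hold the 64*64 images and 0/1 labels after the 7:3 split, and the use of model.fit with the epoch arithmetic is an assumed way to realize "10000 iterations at batch size 256", not the authors' original training script.

```python
import tensorflow as tf

# Assumes build_improved_lenet() from the earlier sketch and pre-split data arrays.
model = build_improved_lenet(input_shape=(64, 64, 1))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

batch_size = 256
iterations = 10000
steps_per_epoch = len(x_train) // batch_size
epochs = max(1, iterations // steps_per_epoch)   # spread 10000 updates over epochs

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))
```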
Figure. 6. Training loss

Secondly, we also conducted several experiments on dataset (2) (the fused dataset described above) without changing the network structure. The hyperparameters were adjusted many times, especially the batch size and learning rate. We find that when the learning rate is too low, the accuracy curve converges too slowly and the training time is too long; when the learning rate is too high, the accuracy curve converges more quickly than with a low learning rate but fluctuates greatly. The learning rate was finally adjusted to 1e-4. The model again converges smoothly after 10000 iterations; the final training accuracy is 98.44% with a loss of 0.06, and the test accuracy is 91.41% with a loss of 0.22. The results are shown in figures 7 and 8. Although the test results are not as good as on dataset (1), they are still close to the existing research results.

Figure. 7. Training accuracy of the fusion dataset

Figure. 8. Training loss of the fusion dataset
On dataset (1), domestic and foreign scholars have done a great deal of research. The accuracy obtained using transfer learning[12] is 92.80%; the GoogleNet Inception V3 + Data Augmentation (GIV+DA) model[14] adopted by Vinicius Pavanelli Vianna obtained 95.56%; Xinyu He of Wuhan University of Science and Technology (hereinafter the GIV3+RF model)[5] obtained 96.77%; and the improved CNN proposed in this paper reaches 98.83%. Compared with the existing models, the proposed convolutional network model is also more concise and efficient in terms of complexity, number of network parameters and generalization ability. The accuracies are shown in Table IV, and the experimental results on the two different datasets used in this paper are shown in Table V:

TABLE IV. COMPARISON WITH EXISTING RESEARCH RESULTS

Methods            Accuracy
Transfer Learning  92.80%
GIV+DA             95.51%
GIV3+RF            96.77%
Improved LeNet     98.83%

TABLE V. ACCURACY OF DIFFERENT DATASETS

Dataset    Training Accuracy    Test Accuracy
Dataset1   98.83%               97.26%
Dataset2   98.44%               91.41%
IV. CONCLUSIONS

This paper first introduced the models and results proposed by previous researchers and briefly reviewed some basic knowledge of convolutional neural networks. Then, on the basis of the related research, a convolutional neural network based on an improved LeNet was proposed to detect pneumonia images. The model in this paper builds on the original classical LeNet-5 model by adding convolutional layers and pooling layers as well as a feature integration layer, so that the obtained features are further abstracted. Excellent results were finally obtained on two public datasets, not only on the training sets but also on the test sets, which shows that the proposed model has good robustness. In future research work, our team will further study the detection of pneumonia types, so that pneumonia can be detected according to its type. At the same time, we will use a convolutional neural network to segment the lung area and locate the lesion area, use CapsNet[15] to reconstruct blurred images of the lesion area (CapsNet reconstructions have been shown to have useful properties such as smoothing noise), and then use appropriate neural networks for detection. In this way, a computerized pneumonia diagnosis system developed in the near future could quickly detect pneumonia types and locate lesion
areas, so as to assist doctors in prescribing appropriate medicine, shortening the cure time of pneumonia and improving its cure rate.

ACKNOWLEDGMENT

This research work was supported by the Guangxi Key Laboratory Fund of Embedded Technology and Intelligent System (Guilin University of Technology) under Grant No. 2017-2-5.

REFERENCES

[1] Rinaldi P, Menchini L, Martinelli M, et al. Computer-aided diagnosis[J]. Rays, 2003, 28(1):103-108.
[2] Zhang Liang, Wang Qikai. Classification of pneumonia based on BP neural network[J]. Journal of South China University of Technology (Natural Science), 2015, 42(1):72-76.
[3] Zhao G, Ahonen T, Matas J, et al. Rotation-Invariant Image and Video Description With Local Binary Pattern Features[J]. IEEE Transactions on Image Processing, 2011, 21(4):1465-1477.
[4] Xiang Wenbo. Classification of Pneumonia Type Image Based on Convolution Neural Network[D]. Harbin University of Science and Technology, 2017.
[5] He Xinyu, Zhang Xiaolong. Pneumonia image recognition model based on deep neural network[J]. Journal of Computer Applications, 2019, 39(06):1680-1684.
[6] Feigelson E D. Random Forests: Finding Quasars - Commentary[J]. Statistical Challenges in Astronomy, 2003.
[7] Mou Duoduo, Liu Lei. Comparative Study of ELM and SVM in Hyperspectral Image Supervision Classification[J]. Remote Sensing Technology and Application, 2019(1).
[8] McDonnell M D, Vladusich T. Enhanced image classification with a fast-learning shallow convolutional neural network[C]// 2015 International Joint Conference on Neural Networks (IJCNN). IEEE, 2015.
[9] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324.
[10] Wang G, Giannakis G B, Chen J. Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization[J]. IEEE Transactions on Signal Processing, 2019, 67(9):2357-2370.
[11] Nayak A, Das S, Nayak D, et al. Nanoquinacrine sensitizes 5-FU-resistant cervical cancer stem-like cells by down-regulating Nectin-4 via ADAM-17 mediated NOTCH deregulation[J]. Cellular Oncology, 2019:1-15.
[12] Kermany D S, Goldbaum M, Cai W, et al. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning[J]. Cell, 2018, 172(5):1122-1131.
[13] Hu Xuemi, Chen Qin, Yang Li, Yu Jin, Tong Xiuchi. Abnormal crowd behavior detection and localization based on deep spatial-temporal convolution neural networks[J/OL]. Application Research of Computers:1-7.
[14] Vianna V P. Study and development of a Computer-Aided Diagnosis system for classification of chest x-ray images using convolutional neural networks pre-trained for ImageNet and data augmentation[J]. 2018.
[15] Sabour S, Frosst N, Hinton G E. Dynamic Routing Between Capsules[J]. 2017.