Research Article
Improved UNet Deep Learning Model for Automatic Detection of
Lung Cancer Nodules
Received 10 June 2022; Revised 26 July 2022; Accepted 8 August 2022; Published 30 January 2023
Copyright © 2023 Vinay Kumar et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Uncontrolled cell growth in the two spongy lung organs in the chest is the most prevalent kind of cancer. When cells from the lungs spread to other tissues and organs, this is referred to as metastasis. This work uses image processing, deep learning, and metaheuristics to identify cancer in its early stages. To this end, a new convolutional neural network is constructed. The marine predator technique has the potential to improve the network architecture and accuracy. Deep learning identified lung cancer spinal metastases; as the energy level increased, the detection rate on CT readings for lung cancer bone metastases decreased. Qualified physicians discovered 71.14 and 74.60 percent of targets at energies of 140 and 60 keV, respectively, whereas the proposed model gives 76.51 and 81.58 percent, respectively. Expert physicians' detection rate of 74.60 percent was lower than deep learning's detection rate of 81.58 percent. The proposed method has the highest accuracy, sensitivity, and specificity (93.4, 98.4, and 97.1 percent, respectively), as well as the lowest error rate (1.6 percent). Finally, in lung segmentation, the proposed model outperforms the CNN model. High-intensity energy-spectral CT images are more difficult to segment than low-intensity energy-spectral CT images.
vast number of radiographic images, and the complex and uneven structure may make an accurate diagnosis difficult. CADAS is a computer-based system that assists clinicians in diagnosing medical problems [1, 2]. These tools provide images of potentially dangerous situations to help radiologists make the best diagnosis. It is best to use a machine to increase sensitivity and decrease false positives. Several relevant works are reviewed here.

2. Related Work

A lung cancer diagnostic tool was built using MSE, MFE, RCMFE, and MPE. As an algorithmic approach, multiscale fuzzy entropy was applied. The standard deviation of the most accurate MFE-based texture properties was 1.95E 50. The simulation results revealed that RCMFE measures excelled their rivals when it came to researching lung cancer dynamics. They developed an algorithm for detecting lung cancer in CT scans. They analysed lung CT data using LDA and a deep neural network. An LDR was used to reduce the size of the CT lung imaging features [3]. When the data was collected, it was classified as benign or malignant. Accuracy was boosted by applying a modified gravitational search approach (MGSA) to the CT images. It leverages images to construct a system for quickly recognising lung cancer with the least amount of human intervention. This technique retained discriminative blocks while effectively illuminating deep features [4]. A global WSI description was generated after collecting characteristics and choosing context-aware blocks. Then, it was categorised using a random forest classifier. The outcomes of the investigation proved the method's efficacy. It was a unique method for detecting lung cancer. Its purpose was to decrease misclassifications. After decreasing noise with weighted mean histogram equalization, enhanced profuse clustering was used to increase image quality (IPCT). Deep learning predicted lung cancer by collecting spectral data from the study region [5]. The simulation results indicate that the suggested strategy is effective and efficient, but with certain limitations. There are several methods for detecting lung cancer in the literature. Each has advantages and disadvantages. This study demonstrates how deep learning and metaheuristics might improve lung cancer detection systems.

Since the circulatory system is involved in the process of bone metastatic spread, lung cancer may spread to the bones. This is a symptom of advanced cancer [6–8]. The osteolytic disease affects 10–15% of lung cancer patients. Spinal cancer may spread in a variety of ways. Back discomfort and neurological impairment are caused when CSF tumour cells enter the thoracic spine from the back or the cervicothoracic junction [9]. After spiral and multislice CT, energy-spectral CT is a multiparameter imaging technology. It is used in vascular imaging to reduce metal artefacts and expose fine structures [10, 11]. Deep learning employs artificial neural networks (ANNs) to train computers to think and learn in the same way that people do [12]. Text is recognised automatically because, owing to memory, parameter sharing, and Turing completeness, recurrent neural networks may learn nonlinear sequences. Frame-by-frame, the extraction of RNN and CNN saves time and money [13]. Most image segmentation algorithms use two layers: receptive field constriction and feature map expansion. Computed tomography (CT), particularly dual-energy spectral computed tomography, has grown in popularity in recent years. It is an excellent resource for collecting knowledge that is both generic and specific. A dual-energy CT scan was first proposed in 1973, but it did not become widely used for several decades due to methodological and technological obstacles. The creation of the first dual-source CT system occurred in 2006. This device, which employs two unique X-ray energy spectra, may be useful in distinguishing between different types of materials [14–16]. An energy-spectral CT scan may be used to detect lung cancer spinal metastases. As a result, SNR and contrast were used to verify its accuracy. The detection rate was used to compare the results of clinicians with the suggested model. The study's goal was to develop a clinical standard for lung cancer bone metastases. Figure 1 shows the segmentation approach used by the Improved UNet model [17].

The model may be narrowed or broadened. Both channels are symmetrical and gather and analyse data. Data characteristics are derived using contractions and expansions. In the contraction approach, which comes after 2 × 2 pooling, the expansion route is upsampled, while the contraction path is mirror mapped. The model combines upsampled and mirror-mapped image data. This enhanced visual quality, however, cuts the feature channel count in half. The feature vector is submitted to the network output layer (or output feature map). The convolution block employs data characteristics and an activation function with a 256-pixel input resolution. ReLu employs a hyperparameter dilation interval as the activation function to estimate dilation size. The contraction approach makes use of a maximum pooling of two. The number of feature channels is doubled when the image is downscaled. Three convolutional blocks and a one-to-one convolutional layer were used. A triple convolution structure is used to quadruple image resolution while halving feature channels. A mirror map joins the high and low information levels. A data layer with several channels encourages nonlinearity. Here are some of the study's key findings: images from lung CT scans may be used to diagnose lung cancer. For cancer diagnosis, convolutional neural networks need a certain structure. The marine predator's approach is a unique metaheuristic that was used to improve how well the convolutional neural network worked.

3. Proposed Model

CNNs are likely to become one of the most frequently utilised medical imaging technologies. CNNs do most deep learning computations in cancer screening. These deep learning algorithms take an image as input and assign relevance (learnable weights and biases) to each object/aspect inside the image, enabling them to be identified. CNN processing is, therefore, quicker than other categorised techniques. With enough practice, the CNN can recognise and recall these human-created filters and specifications.
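The contraction and expansion behaviour of the Improved UNet described above (2 × 2 pooling on the way down, upsampling plus a mirror-mapped skip connection on the way up, with channel counts doubling and halving) can be sketched in outline with NumPy. The array shapes, channel counts, and helper names below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def max_pool_2x2(x):
    # Contraction step: (H, W, C) -> (H/2, W/2, C) by taking a 2x2 maximum.
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample_2x(x):
    # Expansion step: (H, W, C) -> (2H, 2W, C) by nearest-neighbour repetition.
    return x.repeat(2, axis=0).repeat(2, axis=1)

# A 64x64 feature map with 16 channels on the contracting path...
skip = np.random.rand(64, 64, 16)
down = max_pool_2x2(skip)  # (32, 32, 16); a convolution would then double C

# ...and on the expanding path the upsampled map is concatenated with the
# mirror-mapped (skip) features, which doubles the channel count again.
up = upsample_2x(np.random.rand(32, 32, 16))   # (64, 64, 16)
merged = np.concatenate([up, skip], axis=-1)   # (64, 64, 32)
```

In a real UNet the pooled and upsampled maps would pass through learned convolutions; this sketch only shows the shape and channel arithmetic the text describes.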
Computational Intelligence and Neuroscience 3
The arrangement of the brain's "visual cortex" during network development influenced human neural network connection patterns. The "receptive field" of the visual field is the area where each neuron responds to stimuli. These fields are arranged in rows and columns to fill the visual field. In this research, convolutional neural networks were used to detect lung cancer. The preferred strategy is shown in Figure 2.

Figure 2: The graphical overview of the recommended technique (Step 1: input image; Step 2: preprocessing; Step 3: convolution neural network; Step 4: dilated convolution; Step 5: diagnosis; Step 6: healthy/cancerous).

In a nutshell, the preprocessed images are sent into a CNN that has been trained using the image data. Various lighting artefacts and noises must be deleted before processing the lung images; they make anticipating the accuracy of the final classifier difficult, and a low-pass filter reduces the effect of high-frequency pixels. It is difficult to reduce noise in medical imaging. It is crucial that the image borders stay intact during noise reduction to obtain optimal image clarity. The median filter is a low-pass filter. The average brightness of the surrounding pixels is used to calculate the brightness of each output pixel [17–19]. The value of a pixel is computed by averaging pixels in the target region. To get rid of outliers, use the median filter, which is less sensitive to such values. Light fluctuation is decreased while edge form and location are preserved [20–23]. This filter replaces the centre pixel of an m × n window with the value obtained after arranging the surrounding pixels in ascending order. Using the median filter amounts to going through the image pixel by pixel and replacing each value with the median value of the pixels immediately adjacent to it. The "window" is a pattern of close-together pixels that gradually progresses across the image [19]. This filter was applied to the images utilised in the research.

A CNN (convolutional neural network) processes a large number of similar-sized images [24–27]. Therefore, before being passed to the CNN, all images were reduced to 227 by 227 pixels. Figures 3 and 4 show noise reduction on lung images using median filtering and a preprocessed CT image.

Figure 3: Example of noise reduction on lung images using median filtering.

Figure 4: CT image that has been preprocessed.
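The m × n median-filtering step described above can be sketched as follows; the 3 × 3 window size and the edge-replicating border handling are illustrative assumptions rather than settings reported in the paper.

```python
import numpy as np

def median_filter(img, k=3):
    """Slide a k x k window over the image and replace each pixel with the
    median of its neighbourhood, which removes outliers while keeping edges."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")  # replicate borders so edges survive
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + k, j:j + k])
    return out

# A single salt-noise pixel is removed while the flat background is untouched.
noisy = np.zeros((5, 5))
noisy[2, 2] = 255.0
clean = median_filter(noisy)
```

Unlike a mean filter, the median keeps a sharp step edge in place, which is the edge-preservation property the text emphasises.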
Properly trained networks have a lower error function. The purpose is to optimise the network's free parameters [28–31]. The study made use of supervised training. Under this design, a supervisor controls and leads the network. It has a limited number of inputs and outputs [21]. The network output is compared with the target, and the magnitude of the error is measured; the parameters are then picked in order to reduce this value. This can be done sequentially or in batches. Sequential training is the most common; it utilises less RAM but is less reliable, since it focuses on various network aspects. The second way is more reliable, but it takes more RAM to maintain the settings. As a result, we finished the remaining jobs in batch mode. To train the database images, we used a 32-batch training approach. Before exploring more resources, make the most of the ones you already have.
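The 32-batch training scheme described above can be sketched as a simple shuffled mini-batch generator; the array shapes and the RNG seed below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def minibatches(X, y, batch_size=32, seed=0):
    """Yield shuffled (X, y) mini-batches; the last batch may be smaller."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        take = order[start:start + batch_size]
        yield X[take], y[take]

# 180 images in 32-sample batches gives five full batches plus one of 20.
X = np.random.rand(180, 227, 227)
y = np.random.randint(0, 2, size=180)
sizes = [len(xb) for xb, _ in minibatches(X, y)]
```

Batch mode trades the extra RAM for a gradient estimate averaged over 32 samples, which is the reliability advantage the text mentions.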
This does not imply that our programme will use this information while it is running, but rather that our software will use this knowledge to learn [32–35], followed by the data collection from the previous phase, with an emphasis on detecting patterns. At this stage, a few hypotheses may be tested. The basic blocks are three convolutional layers and three pooling levels in a deep neural network.

The nucleus of this layer is a 3D mass of neurons in the middle. Convolutional algorithms are used to process neural inputs and reduce the depth; three convolutional layers are proposed, with filter widths of 64, 32, and 128. A pooling layer was inserted after the convolutional layer to minimise the depth. This decreases the number of parameters while improving network performance, and it reduces the number of output layers. It is a two-way filter. The given image is subsampled to save memory and network traffic; the smaller the input image, the lower the sensitivity. The pooling layer, like the convolutional layer, links the outputs of many neurons [35–38]. Using a pooling layer when sampling may result in a smaller dataset while increasing processing performance. The image is gathered in a 2 × 2 window in this experiment. Figure 5 displays the suggested CNN model, which involves shifting one of the window's four pixels up a layer from its previous placement.

Nonlinear operations should be included after each convolutional layer. ReLu layers speed up training while maintaining accuracy. Figures 6 and 7 show the max pooling and ReLu operations, respectively. Each patch of each feature map has been assigned the greatest possible value, also known as the maximum value. This number was discovered by using a pooling method called maximal pooling, sometimes known as just max pooling [25, 26]. Feature maps, which may be constructed with downsampled or pooled samples, are used to highlight the most distinguishing characteristics of a location. In contrast to the pooling technique, which emphasises the feature's general occurrence, this strategy emphasises the feature's uniqueness. The ReLu layer oversees decreasing negative activations. This layer accentuates nonlinear properties while leaving the convolutional layers alone.

During training, the "dropout" layer may cause certain neurons to be eliminated from the network. The outputs of certain neurons become zero. This permits access to a different network and only employs powerful capabilities. Overfitting is avoided using the dropout approach [23]. In completely connected deep networks, convergence is more probable. A dropout layer was employed to decrease parameter values. The dropout layer approach is shown in Figure 8.

4. Results

These layers provide big data sets with small axes. With enough practice, the network will be able to classify all images. The system searches for the best unknown parameters as part of the training process. Flatten, convolutional, and RMSprop layers are used in weight optimization. The activation function of an optimization function is assessed. It is possible to compare the RMSprop optimizer to a technique known as gradient descent with momentum. Both methods are used to determine the best option. The RMSprop optimizer is responsible for determining the maximum extent to which the oscillations can move in either direction. As a result of this capacity to speed up the learning process, our algorithm can make larger horizontal jumps and settle on solutions more quickly. Nontraining images are used to evaluate the network's performance. The layer output is used to build the image feature vector. The feature vector and matrix are then compared to each data point. Probabilities must be assessed prior to categorization. Softmax, a common function, may be used to normalise probabilities (0 to 1). The optimizer RMSprop was used to optimise each variable. Deep learning algorithms in medical research uncover essential characteristics in a difficult dataset. The suggested approach uses 80 percent of the images in the dataset for training and 20 percent for testing, with no overlap. Using 32-batch data, the deep neural network is trained over 200 iterations. The suggested approach extracts high-level characteristics in addition to employing sequential training. Table 1 compares the recommended technique to the other choices considered. The diagnosis accuracy curve for cancer is shown in Figure 9.

Deep learning is rapidly being used for image classification, object recognition, and segmentation. Deep neural networks maintained in databases may also be used to recognise images, increasing accuracy. Deep learning and machine vision have been widely researched for cancer diagnosis. Science has made major advances in this area. Lung cancer was discovered using a convolutional neural network, and the results of these networks are compared. First, the traditional optimizer RMSprop and metaheuristic-based techniques were used; they worked together to create the final product. The suggested MPA method was the most accurate (93.4 percent). Each image was preprocessed before being reduced to 126 × 126 pixels in size. The study comprised 36 lung cancer patients who had five energy-spectral CT scans. The 180 images were split into training and validation sets (45 validation images). There were three types of data used: training, testing, and validation. It was constructed using derivatives, and the final image was compared to the original. Every experiment requires data collection and image processing. To determine which focus had the largest layer, the biggest-layer entire tumour area approach was used. Focusing on the centre of the lesion, which was surrounded by bone fragments, calcification, and necrosis, reduced damage. Each ROI's CT value was utilised to generate the focus' energy-spectral curve. Figure 10 shows how training sessions have been shortened. Little new knowledge was retained after just 24 hours; after 20 repetitions, this rate dropped to zero.

In both validation and training sets, it outperforms other networks in terms of loss function and dice coefficient, showing that it is more effective. The validation set loss function is greater than the training set loss function. The dice coefficients in both groups were comparable (see Table 2).

For our research, we employed the Improved UNet. Threshold-based, boundary-based, and theory-based approaches are often used for lung CT segmentation [24, 25].
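The RMSprop behaviour discussed in this section (damping oscillations by dividing each step by a running average of squared gradients) can be sketched as follows. The learning rate, decay factor, and the quadratic test function are illustrative assumptions, not values from the paper.

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSprop update: keep an exponential moving average of grad**2 and
    scale the step by its square root, which damps oscillating directions."""
    cache = decay * cache + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Minimise f(w) = w0**2 + 10 * w1**2, whose gradient is (2*w0, 20*w1).
# The steep w1 direction would oscillate under plain gradient descent;
# RMSprop normalises it so both coordinates make similar-sized steps.
w = np.array([3.0, 2.0])
cache = np.zeros_like(w)
for _ in range(1000):
    grad = np.array([2.0 * w[0], 20.0 * w[1]])
    w, cache = rmsprop_step(w, grad, cache)
```

This per-coordinate normalisation is what lets the optimizer take the "larger horizontal jumps" the text describes while keeping the steep direction under control.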
Figure 5: The suggested CNN model (Improved UNet), with down-sampling and up-sampling paths across layers 1–14, built from dilated convolution, convolution, ReLu, max pooling, fully connected, and softmax layers.

Figures 6 and 7: The max pooling and ReLu operations. A 2 × 2 maximum is taken over each block of a 4 × 4 patch; for example, the patch [[7, 2, 6, 3], [6, 9, 5, 4], [1, 5, 4, 1], [7, 2, 1, 2]] pools to [[9, 6], [7, 4]].

Even though the lung was not apparent in the DC-U-Net images, blood vessels impacted the segmentation border.
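The 2 × 2 max pooling example recoverable from Figures 6 and 7 can be reproduced directly, together with the ReLu operation the paper pairs with it:

```python
import numpy as np

def max_pool(patch, k=2):
    # Reduce (H, W) to (H/k, W/k) by taking the maximum of each k x k block.
    h, w = patch.shape
    return patch.reshape(h // k, k, w // k, k).max(axis=(1, 3))

def relu(x):
    # Zero out negative activations and keep positive ones unchanged.
    return np.maximum(x, 0)

patch = np.array([[7, 2, 6, 3],
                  [6, 9, 5, 4],
                  [1, 5, 4, 1],
                  [7, 2, 1, 2]])
pooled = max_pool(patch)  # the 2x2 result shown in the figure
```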
while minimising ray hardening artifacts. K-edge imaging, with its multienergy spectrum properties, minimizes radiation and contrast agent use while boosting soft-tissue contrast. Soft and hard tissues with the same light absorption coefficient have become more contrasted in low-energy areas. Intervals are used by DC to widen the system's vision. DC-U-Net increases information extraction without adjusting image parameters [15, 16].

Figure 9: The diagnosis accuracy curve for lung cancer.

We wanted to see how quickly deep learning could detect lung cancer spinal metastases. To generate the final DC-U-Net models, energy-spectral CT images of lung cancer patients were used. Then, we looked at several CT
Table 3: Comparison of the cancer detection rate achieved by the proposed model and doctors' report.

Energy level (keV)   Doctors' report                              The proposed model result
                     Cases identified / Total / Success (%)       Cases identified / Total / Success (%)
140                  212 / 298 / 71.14                            228 / 298 / 76.51
60                   235 / 315 / 74.60                            257 / 315 / 81.58
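The success rates in Table 3 follow directly from cases identified over totals; the figures match if the table truncates (rather than rounds) the percentage to two decimals, which is an assumption made in this quick check:

```python
def success_rate(identified, total):
    # Truncate the percentage to two decimal places, matching Table 3.
    return int(identified / total * 10000) / 100

doctors = {140: (212, 298), 60: (235, 315)}
model = {140: (228, 298), 60: (257, 315)}

doctor_rates = {kev: success_rate(*counts) for kev, counts in doctors.items()}
model_rates = {kev: success_rate(*counts) for kev, counts in model.items()}
```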
relationship to potential therapeutic targets," Cancer Treatment Reviews, vol. 40, no. 4, pp. 558–566, 2014.
[16] C. Gerecke, S. Fuhrmann, S. Strifer, H. Einsele, and S. Knop, "The diagnosis and treatment of multiple myeloma," Deutsches Arzteblatt International, vol. 113, no. 27-28, pp. 470–476, 2016.
[17] X. Fan, X. Zhang, Z. Zhang, and Y. Jiang, "Deep learning-based identification of spinal metastasis in lung cancer using spectral CT images," Scientific Programming, vol. 2021, Article ID 2779390, 7 pages, 2021.
[18] W. Zuo, F. Zhou, and Y. He, "Automatic classification of lung nodule candidates based on a novel 3D convolution network and knowledge transferred from a 2D network," Medical Physics, vol. 46, no. 12, pp. 5499–5513, 2019.
[19] D. Wang, Y. Luo, D. Shen, H. Y. Liu, and Y. Q. Che, "Clinical features and treatment of patients with lung adenocarcinoma with bone marrow metastasis," Tumori Journal, vol. 105, no. 5, pp. 388–393, 2019.
[20] X. P. Wang, B. Wang, P. Hou, and J. B. Gao, "Screening and comparison of polychromatic and monochromatic image reconstruction of abdominal arterial energy spectrum CT," Journal of Biological Regulators & Homeostatic Agents, vol. 31, no. 1, pp. 189–194, 2017.
[21] Z. Wang, Y. Ni, Y. Zhang, Q. Xia, and H. Wang, "Laparoscopic varicocelectomy: virtual reality training and learning curve," Journal of the Society of Laparoendoscopic Surgeons, vol. 18, no. 3, Article ID e2014.00258, 2014.
[22] R. Nair, S. Vishwakarma, M. Soni, T. Patel, and S. Joshi, "Detection of COVID-19 cases through X-ray images using hybrid deep neural network," World Journal of Engineering, vol. 19, no. 1, pp. 33–39, 2021.
[23] T. Sharma, R. Nair, and S. Gomathi, "Breast cancer image classification using transfer learning and convolutional neural network," IJMORE, vol. 2, no. 1, pp. 8–16, 2022.
[24] R. Kashyap, "Evolution of histopathological breast cancer images classification using stochastic dilated residual ghost model," Turkish Journal of Electrical Engineering and Computer Sciences, vol. 29, no. SI-1, pp. 2758–2779, 2021.
[25] R. Kashyap, "Breast cancer histopathological image classification using stochastic dilated residual ghost model," International Journal of Information Retrieval Research, vol. 12, no. 1, pp. 1–24, 2022.
[26] R. Kashyap, "Object boundary detection through robust active contour based method with global information," International Journal of Image Mining, vol. 3, no. 1, Article ID 10014063, 22 pages, 2018.
[27] P. Sharma, Y. P. S. Berwal, and W. Ghai, "Performance analysis of deep learning CNN models for disease detection in plants using image segmentation," Information Processing in Agriculture, vol. 7, no. 4, pp. 566–574, 2020.
[28] R. Nair and A. Bhagat, "An introduction to clustering algorithms in big data," Encyclopedia of Information Science and Technology, pp. 559–576, 2021.
[29] H. Bhardwaj, P. Tomar, A. Sakalle, D. Acharya, T. Badal, and A. Bhardwaj, "A DeepLSTM model for personality traits classification using EEG signals," IETE Journal of Research, pp. 1–9, 2022.
[30] A. Sakalle, P. Tomar, H. Bhardwaj, and M. A. Alim, "A modified LSTM framework for analyzing COVID-19 effect on emotion and mental health during pandemic using the EEG signals," Journal of Healthcare Engineering, Article ID 8412430, 8 pages, 2022.
[31] A. Sakalle, P. Tomar, H. Bhardwaj et al., "Genetic programming-based feature selection for emotion classification using EEG signal," Journal of Healthcare Engineering, vol. 2022, Article ID 8362091, 6 pages, 2022.
[32] A. Sakalle, P. Tomar, H. Bhardwaj, D. Acharya, and A. Bhardwaj, "An analysis of machine learning algorithm for the classification of emotion recognition," in Soft Computing for Problem Solving, pp. 399–408, Springer, Singapore, 2021.
[33] N. R. Navadia, G. Kaur, H. Bhardwaj et al., "A critical survey of autonomous vehicles," in Cyber-Physical, IoT, and Autonomous Systems in Industry 4.0, pp. 235–254, CRC Press, Florida, FL, USA, 2021.
[34] A. A. Alnuaim, M. Zakariah, C. Shashidhar et al., "Speaker gender recognition based on deep neural networks and ResNet50," Wireless Communications and Mobile Computing, vol. 2022, Article ID 4444388, pp. 1–13, 2022.
[35] P. K. Pareek, C. Sridhar, R. Kalidoss et al., "IntOPMICM: intelligent medical image size reduction model," Journal of Healthcare Engineering, Article ID 5171016, 2022.
[36] C. Sridhar, P. K. Pareek, R. Kalidoss, S. S. Jamal, P. K. Shukla, and S. J. Nuagah, "Optimal medical image size reduction model creation using recurrent neural network and GenPSOWVQ," Journal of Healthcare Engineering, Article ID 2354866, 2022.
[37] S. K. Bharti, R. K. Gupta, P. K. Shukla, W. A. Hatamleh, H. Tarazi, and S. J. Nuagah, "Multimodal sarcasm detection: a deep learning approach," Wireless Communications and Mobile Computing, Article ID 1653696, 2022.
[38] P. Rani, P. N. Singh, S. Verma, N. Ali, P. K. Shukla, and M. Alhassan, "An implementation of modified blowfish technique with honey bee behavior optimization for load balancing in cloud system environment," Wireless Communications and Mobile Computing, vol. 2022, Article ID 3365392, 2022.