AI and Big Data Department, Endicott College, Woosong University, Daejeon 34606, Republic of Korea
* Correspondence: [email protected]
Abstract: Background: This research delves into deep learning technologies within the realm of
medical imaging, with a specific focus on the detection of anomalies in medical pathology, emphasiz-
ing breast cancer. It underscores the critical importance of segmentation techniques in identifying
diseases and addresses the challenges of scarce labelled data in Whole Slide Images. Additionally,
the paper provides a review, cataloguing 61 deep learning architectures identified during the study.
Objectives: The aim of this study is to present and assess a novel quantitative approach utilizing
specific deep learning architectures, namely the Feature Pyramid Network and the Linknet model,
both of which integrate a ResNet34 layer encoder to enhance performance. The paper also seeks
to examine the efficiency of a semi-supervised training regimen using a dual model architecture,
consisting of ‘Teacher’ and ‘Student’ models, in addressing the issue of limited labelled datasets.
Methods: Employing a semi-supervised training methodology, this research enables the ‘Student’
model to learn from the ‘Teacher’ model’s outputs. The study methodically evaluates the models’
stability, accuracy, and segmentation capabilities, employing metrics such as the Dice Coefficient
and the Jaccard Index for comprehensive assessment. Results: The investigation reveals that the
Linknet model exhibits good performance, achieving an accuracy rate of 94% in the detection of
breast cancer tissues utilizing a 21-seed parameter for the initialization of model weights. It further
excels in generating annotations for the ‘Student’ model, which then achieves a 91% accuracy with
minimal computational demands. Conversely, the Feature Pyramid Network model demonstrates
a slightly lower accuracy of 93% in the Teacher model but exhibits improved and more consistent
results in the ‘Student’ model, reaching 95% accuracy with a 42-seed parameter. Conclusions: This
study underscores the efficacy and potential of the Feature Pyramid Network and Linknet models in
the domain of medical image analysis, particularly in the detection of breast cancer, and suggests
their broader applicability in various medical segmentation tasks related to other pathology disorders.
Furthermore, the research enhances the understanding of the pivotal role that deep learning
technologies play in advancing diagnostic methods within the field of medical imaging.
Keywords: deep learning; segmentation; breast cancer; feature pyramid network; linknet; student
and teacher architecture
Citation: Schutte, S.; Uddin, J. Deep Segmentation Techniques for Breast Cancer Diagnosis. BioMedInformatics 2024, 4, 921–945. https://doi.org/10.3390/biomedinformatics4020052
1. Introduction
Humanity faces many medical problems, one of the most prominent being breast cancer, which is
detected through various imaging techniques, whether Magnetic Resonance Imaging (MRI),
computed tomography (CT) scans, or others. Humans have reached a point where they can detect
medical disorders more effectively than before.
The main part of this research introduces breast cancer as one of the diseases that can be detected
via segmentation and deep learning models. To do this, however, an understanding of both the
disease and of how models are trained is needed, so that this intelligence can be used to further
improve disease detection in the medical field.
To distinguish between healthy bodily tissues and those prone to disease or other
abnormalities, segmentation can be used. Segmentation involves taking tiles of images
from high-definition scans, as well as using masks or, rather, labeled data that identify
regions of disease, to train the model to recognize what constitutes a disease or irregular
tissue. This training is conducted via a deep learning model that performs millions of
calculations to classify data points within a complex network of neurons.
2. Literature Review
In this section, the focus is on deep learning models and their understanding, specifi-
cally CNN and its influence on artificial intelligence. The methods these neural networks
use to learn, including backpropagation and gradient descent, are explored. Additionally,
the applications of CNNs in imaging, such as analyzing Whole Slide Images and segment-
ing images, are discussed. Lastly, the complexities involved in interpreting data using these
models are addressed. However, as this paper focuses on breast cancer, this issue is explored in
more detail below.
Table 1. Breast cancer incidence (new cases) and mortality (deaths) in 2020 by world region and
Human Development Index level [3].
These statistics underscore the critical importance of breast cancer and the necessity of
adopting any possible measures to combat this disease. Early detection can significantly
improve survival rates, highlighting the potential of deep learning models as a vital solution
in this fight.
continuous improvement of the model's predictions by minimizing the error between the
predicted and actual outcomes.
2.4. Algorithm and Equations
(I) Initialization: Start by selecting values for the weights, typically chosen randomly.
(II) Compute Gradient: Employ backpropagation to calculate the gradient of the loss
function with respect to each weight. (III) Update Weights: Adjust the weights in the opposite
direction to the gradient:
Weight_new = Weight_old − η × Gradient (1)
where η is the learning rate, a small positive scalar determining the step size. (IV) Iterate:
Repeat steps (II) and (III) until the loss function reaches its minimum, indicating convergence [11].
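To make the update rule in Equation (1) concrete, the following is a minimal sketch that applies it to a one-parameter least-squares problem; the data, learning rate, and iteration count are arbitrary illustrative choices, not values used in this study.

```python
import numpy as np

# Toy data: fit y = w * x with a single weight.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

w = np.random.randn()    # (I) Initialization: random starting weight
eta = 0.01               # learning rate (step size)

for step in range(200):  # (IV) Iterate until the loss stops improving
    y_pred = w * x
    loss = np.mean((y_pred - y) ** 2)      # mean squared error
    grad = np.mean(2 * (y_pred - y) * x)   # (II) gradient of the loss w.r.t. w
    w = w - eta * grad                     # (III) Weight_new = Weight_old - eta * Gradient

print(f"fitted weight: {w:.3f}, final loss: {loss:.4f}")
```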
2.5. Combining Backpropagation and Gradient Descent
In the context of neural networks, backpropagation and gradient descent (Figure 1C) work
together; they are two components that collaborate to allow the model to learn from its mistakes
effectively. Backpropagation calculates the gradients, providing the direction in which the weights
need to be modified to minimize the loss. Subsequently, gradient descent utilizes these gradients
to update the weights throughout iterations or epochs until the desired level of accuracy is
reached [10].
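In a framework such as PyTorch, these two components map directly onto loss.backward() (backpropagation) and optimizer.step() (gradient descent). The small network and synthetic data below are purely illustrative:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # plain gradient descent
criterion = nn.MSELoss()

x = torch.randn(64, 10)  # synthetic inputs
y = torch.randn(64, 1)   # synthetic targets

for epoch in range(20):
    optimizer.zero_grad()            # clear gradients from the previous iteration
    loss = criterion(model(x), y)
    loss.backward()                  # backpropagation: compute gradients of the loss
    optimizer.step()                 # gradient descent: update the weights
```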
2.6. Whole Slide Images (WSIs)
A lot of medical research is conducted using visual computing, and much of this research uses
WSIs. WSIs are high-resolution images that have been extracted from glass slides, and they have
become essential in pathology research. WSIs are used in medical research, disease diagnosis, and
education. One of the reasons for this is that WSIs capture entire tissue samples, which gives
pathologists the capability to examine and cross-reference the whole tissue, much like an improved
version of microscopy [12,13].
WSIs are extensively used with CNN deep learning models; in this context, WSIs have been used
widely for segmentation, classification, and detection. The reason for this is the size to which WSIs
can extend, often over gigabytes per image with a high level of specific detail, which brings
challenges in computation and in handling such large data effectively. Additionally, annotation of
the data is time-consuming due to the size of the images and their variability, which stems from
differences in staining, slide preparation, and scanning across different labs [14–16].
An illustration of how WSI images are produced in a medical context outlines the steps from
preparing the tissue sample to scanning and the creation of the digital image.
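Because a single WSI can reach several gigabytes, it is typically processed as fixed-size tiles rather than as one image. A minimal tiling sketch, assuming the openslide-python package and a hypothetical slide path; the tile size is an arbitrary illustrative choice:

```python
import openslide

TILE = 512                                         # tile edge length in pixels (illustrative)
slide = openslide.OpenSlide("example_slide.svs")   # hypothetical WSI file
width, height = slide.dimensions                   # full-resolution size at level 0

tiles = []
for y in range(0, height - TILE + 1, TILE):
    for x in range(0, width - TILE + 1, TILE):
        # read_region returns an RGBA PIL image for the requested tile
        tile = slide.read_region((x, y), 0, (TILE, TILE)).convert("RGB")
        tiles.append(((x, y), tile))   # in practice tiles would be saved, not kept in memory

slide.close()
print(f"extracted {len(tiles)} tiles of {TILE}x{TILE} px")
```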
2.7. Segmentation
Segmentation is defined as “the act of dividing something into different
parts” [17]. Image segmentation, accordingly, involves dividing images into segments
or ‘tiles’ that are considered to be similarly related. Although the segmentation process
depends on a low-level representation of data, which could relate to lighting, layer weight,
and texture, segmentation cannot explicitly identify regions that may or may not belong to
certain categories [18].
Semantic segmentation is distinct in that it classifies each pixel within an image, which
is relevant because it allows for a more precise understanding of individual objects rather
than just regions or segments. This method enables objects to be distinguished based
on consistent patterns associated with them [18]. Semantic segmentation offers superior
analysis compared to traditional segmentation methods. The patterns discerned in the
images can provide insights into various problems encountered in the physical world.
the model’s capacity to detect breast cancer at various sizes, a feature crucial for identifying
smaller, previously overlooked lesions, thereby advancing early detection efforts [27].
The selection of the LinkNet and Feature Pyramid Network (FPN) model designs
is based on their ability to go beyond the limitations of current approaches. This allows
for efficiency and flexibility. These models excel in handling variations in lesion sizes,
which helps in creating thorough methods for detecting breast cancer. Moreover, this
research aims to utilize a teaching strategy known as teacher–student learning to enhance
the training process of these models.
In this teaching framework, a sophisticated 'Teacher' model imparts knowledge to a simpler
'Student' model, as discussed in this paper. The Student model then learns to imitate the
performance of the Teacher model, improving its capabilities while keeping computational
requirements low. This method is especially useful when complex models are not feasible due to
resource limitations. Through this approach, the Student model can efficiently and accurately
detect breast cancer while maintaining effectiveness.
This study focuses on assessing how well the LinkNet and FPN models perform in
terms of efficiency when using the teacher–student approach compared to established
models. By exploring this teaching technique, we hope to discover strategies for enhancing
detection methods and advancing breast cancer imaging further.
The combination of LinkNet and FPN, along with the teacher–student teaching approach, shows
potential for bringing insights and significant benefits which could accelerate advancements in
identifying and treating breast cancer.
3. Methodology
In this section, the focus is on a training approach for segmentation models that handle
data annotations. A dual-model architecture is employed, consisting of a ‘Teacher’ model
(TM) and a ‘Student’ model (SM). The TM undergoes training on annotated WSIs using
binary cross entropy and dice loss metrics to optimize performance. This model then
generates soft pseudo annotations for WSIs to enhance the dataset. Subsequently, the SM is
trained using both authentic and pseudo annotations, with a process that continues until
maximum stability or convergence is achieved. Furthermore, the importance of evaluating
the accuracy and overlap between the model’s predictions and actual data is highlighted,
using metrics such as the Dice Coefficient and Jaccard Index. These metrics are crucial in
ensuring precision in medical imaging segmentation by aiming for similarity and complete
overlap between predicted results and ground truth data.
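The combined training objective mentioned above can be written down concretely. A minimal sketch of a combined BCE and Dice loss, assuming PyTorch; the equal weighting and the smoothing constant are illustrative choices rather than values reported in this study:

```python
import torch
import torch.nn.functional as F

def bce_dice_loss(logits, target, smooth=1.0):
    """Binary cross-entropy plus soft Dice loss for a binary segmentation mask.

    logits and target are (N, 1, H, W) float tensors; target holds 0/1 labels.
    """
    bce = F.binary_cross_entropy_with_logits(logits, target)

    probs = torch.sigmoid(logits)
    intersection = (probs * target).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = (2.0 * intersection + smooth) / (union + smooth)   # soft Dice per image
    return bce + (1.0 - dice.mean())                          # lower is better for both terms
```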
The best performing model in validation is then selected as the best Teacher model (Figure 2).
This model will then generate 'soft' pseudo annotations for the unlabeled WSI image data. It should
be noted that, although this method is not as trustworthy as true annotations, it enriches the
dataset by a huge margin [30].
The next step in this process builds on this expanded dataset. The SM is then trained on both
authentic and pseudo annotations in its learning paths. The validation set again serves as the
benchmark for selecting the better Student model, which, upon surpassing its teacher counterpart
in performance, assumes the role of the teacher in subsequent iterations [31]. This cycle is repeated
until the model's accuracy finds its highest stability, also known as a state of convergence. This
iterative refinement takes advantage of the predictive accuracy of the TM to improve the SM,
which overcomes the problem of the lack of labeled data and increasingly improves the model's
performance with each iteration [32].
In this approach, the validation procedure plays a crucial role in ensuring the efficiency and
dependability of segmentation models. After the training phase, each model—both TM and
SM—undergoes a validation stage using a set of images not utilized during training. Throughout
this stage, the models' performance is assessed on their capability to accurately segment images,
which is vital for evaluating their ability to generalize. Performance is measured using the Dice
Coefficient and Jaccard Index, which gauge the overlap between the models' predictions and the
actual annotations. These validation metrics are essential as they offer an assessment of how the
models' predicted segmentation aligns with ground truth data. A higher score signifies a close
match between the models' output and the real data, which is critical in medical imaging where
precision is key. The validation stage aids not only in choosing the best-performing models, but
also in adjusting their parameters for optimal performance. By tracking these metrics, we can
identify when a Student model has outperformed its teacher in terms of accuracy and
generalization, guiding the training process towards generating more refined and accurate
segmentation models. This methodical validation process guarantees that the ultimate model put
into use is robust, trustworthy, and suitable for real-world use in diagnosis and treatment planning.
3.1.1. Dice Coefficient (Sørensen–Dice Index)
The Dice coefficient, also known as the Sørensen–Dice index or Dice Similarity Coefficient (DSC),
is a statistical tool used to measure the similarity between two sets of data. In this research, the
DSC is used to measure the similarity of WSI segmentations. This is performed by comparing the
pixel-wise agreement between the ground-truth segmentation and the segmentation produced by
the deep learning model [33,34].
The Dice coefficient is computed as:
DSC = 2|X ∩ Y| / (|X| + |Y|) (2)
where X is the predicted set of pixels belonging to the cancer class and Y is the ground truth [34].
To reiterate, a higher DSC indicates greater similarity between the prediction and the ground truth;
in this research, the measurement is in the range of 0 to 1 [34].
3.1.2. Jaccard Index (Intersection over Union)
The Jaccard index, also known as the Intersection over Union (IoU), is another tool used to
measure similarity when evaluating the performance of the segmentation models. The formula is
fairly simple: the size of the intersection divided by the size of the union of the given sets:
JI = |X ∩ Y| / |X ∪ Y| (3)
In this research, the Jaccard index has been used to quantify the percentage of overlap between a
WSI's given mask and the prediction output of these models. The range used in this research is
again from 0 to 1, with 0 representing no overlap and 1 representing complete overlap. Complete
overlap is the goal [35,36].
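Equations (2) and (3) translate directly into a few lines of array arithmetic on binary masks. A minimal sketch using NumPy; the example masks are invented for illustration:

```python
import numpy as np

def dice_coefficient(pred, truth):
    """DSC = 2|X intersect Y| / (|X| + |Y|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return 2.0 * intersection / total if total else 1.0

def jaccard_index(pred, truth):
    """JI = |X intersect Y| / |X union Y| for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return intersection / union if union else 1.0

# Example: two 4x4 masks of three pixels each that overlap on two pixels.
a = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
b = np.array([[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]])
print(dice_coefficient(a, b), jaccard_index(a, b))   # prints about 0.667 and 0.5
```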
3.2. About the Dataset
The data used in this study, which are described in the Kexin et al. [37] publication, consist of a
collection of breast ultrasound images in .png format (Figure 3) and the corresponding masked
images in .mat format (Figure 4). The images are focused on breast cancer only, and their use was
approved by the institutional review board of Changhua Christian Hospital in Taiwan. This
retrospective, cross-sectional study includes 20,000 image tiles and 1,448,522 annotated nuclei. The
total storage space required for this dataset is 6.8 GB, and it is licensed under the CC BY license.
The imaging process used a GE Voluson 700 machine (manufactured by GE Healthcare, based in
Zipf, Austria). This machine ensured high-resolution captures with dimensions of 960 × 720 pixels
in RGB mode (Figure 3). Each participant included in the study was between the ages of 35 and 75
and had been diagnosed with a malignant tumor through biopsy techniques. Each participant
contributed two images taken from different scan plane angles [37].
In addition to the images themselves, this dataset also contains information about treatments,
including therapy methods, histology reports, and radiography results. Furthermore,
corresponding masks (Figure 4) are provided in .mat format.
Figure 4. A view of the dataset labels [37].
To facilitate ease of access for research purposes, and to enable breast tumor characteristics to be
studied effectively through imaging techniques, all images and masks have been organized into a
single folder. Furthermore, to make this research more effective with the algorithms and code used,
the .mat files have been converted into .png files and reduced in size.
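The conversion of the .mat masks into smaller .png files mentioned above can be sketched as follows; the folder layout, the 'mask' key, and the target size are hypothetical assumptions rather than details from the original pipeline:

```python
from pathlib import Path

import numpy as np
from PIL import Image
from scipy.io import loadmat

src, dst = Path("dataset/masks_mat"), Path("dataset/masks_png")   # hypothetical folders
dst.mkdir(parents=True, exist_ok=True)

for mat_file in src.glob("*.mat"):
    data = loadmat(mat_file)
    mask = data["mask"]                          # assumed key holding the label array
    mask = (mask > 0).astype(np.uint8) * 255     # collapse labels to a binary mask
    img = Image.fromarray(mask, mode="L")
    img = img.resize((480, 360), Image.NEAREST)  # reduce size; nearest keeps hard labels
    img.save(dst / f"{mat_file.stem}.png")
```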
3.3. Deep Learning Models Being Compared
In this section, we will compare both deep learning models. To begin with, the layers
of both models have been positioned side by side for an overall view. The model on the
left is LinkNet (Figure 5A), and the model on the right is FPN (Figure 5B).
Figure 5. (A) An illustration of the LinkNet architecture [26]. (B) An illustration of the FPN
architecture [38].
3.3.1. LinkNet
The introduction of the model known as LinkNet (Figure 5A) represents a significant
advancement in the field of deep learning, specifically in efficiently segmenting images
based on their semantic content. LinkNet is highly regarded for its ability to strike a
balance between accuracy and computational requirements, making it an excellent choice
for tasks that demand real-time performance. This efficiency can be attributed to its
architecture, which utilizes trained encoders for extracting features and a decoder module
for classification [26].
A noteworthy aspect of LinkNet is its design, which maintains a connection between
the encoder and decoder modules. This connection plays a crucial role in mitigating the
loss of resolution during down-sampling, enabling LinkNet to preserve information across
the network with fewer parameters, leading to faster training without compromising
performance [39].
LinkNet finds applications in various fields, such as medical imaging, where it aids
in segmenting organs or lesions in radiographic images. It also excels at identifying road
networks in aerial images. However, despite its versatility, LinkNet does have limitations.
For instance, it may not perform as well as more complex models in handling fine details—
especially critical in medical image segmentation where precision is paramount. Moreover,
the effectiveness of LinkNet heavily relies on the quality and diversity of the training dataset,
which is a common limitation in machine learning models. Another challenge arises from
the transferability of pre-trained models to medical images from non-medical domains,
due to significant variations in image characteristics. Furthermore, although LinkNet
is designed for real-time segmentation, it might encounter challenges when processing
high-resolution imaging, which is typically found in this field [39].
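Both architectures compared in this work are available off the shelf. One common way to instantiate them with a ResNet34 encoder is the segmentation_models_pytorch package, sketched below as an assumption about tooling rather than a statement of the authors' exact implementation:

```python
import segmentation_models_pytorch as smp

common = dict(
    encoder_name="resnet34",      # ResNet34 layer encoder, as used in this study
    encoder_weights="imagenet",   # pre-trained encoder weights
    in_channels=3,                # RGB image tiles
    classes=1,                    # single binary mask: cancerous tissue vs. background
)

linknet = smp.Linknet(**common)   # encoder-decoder with skip "links" (Figure 5A)
fpn = smp.FPN(**common)           # Feature Pyramid Network decoder (Figure 5B)
```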
Figure 6. An illustration of the ResNet34 Layer Encoder [43].
3.4. Summary
In this section, the focus is on the research methodology that revolves around the training strategy
for segmentation models, specifically handling images that have incomplete data. Our approach
involves a semi-supervised training scheme using a dual-model architecture, consisting of a
'Teacher' model (TM) and a 'Student' model (SM). Initially, the TM is trained using WSIs that have
annotated data, optimizing binary cross-entropy (BCE) and Dice loss (DICE) as performance
metrics. The high-performing TM then generates soft pseudo annotations for the unlabeled WSI
data, complementing the dataset even though these pseudo annotations are not as reliable as true
annotations. Subsequently, the SM is trained using both authentic and pseudo annotations, with
the validation set determining the better-performing Student model. This iterative process
continues until our model reaches its highest stability or state of convergence, effectively
addressing the challenge of limited labeled data while continuously enhancing our model's
performance.
Moreover, in the research, two evaluation metrics are employed to assess the perfor-
mance of our segmentation models: the Dice coefficient and the Jaccard index. The Dice
coefficient measures the similarity between the predicted segmentation and the ground
truth on a scale from 0 to 1, with a higher value indicating greater similarity. Similarly,
the Jaccard index quantifies the extent of overlap between the provided mask and the
model’s output prediction. The goal is to achieve complete overlap for optimal perfor-
mance. These statistical techniques are crucial in evaluating the precision and efficiency of
the segmentation models when it comes to processing and analyzing WSIs.
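Putting the pieces together, the semi-supervised scheme summarized above can be sketched as a short refinement loop in which the Teacher's sigmoid outputs act as soft pseudo annotations for the Student. The loaders, epoch count, and promotion rule below are illustrative placeholders, not the authors' exact code:

```python
import copy
import torch

def train_one_epoch(model, loader, optimizer, loss_fn, device="cpu"):
    model.train()
    for images, masks in loader:              # masks may be true or pseudo labels
        images, masks = images.to(device), masks.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(images), masks)  # e.g. the BCE + Dice loss sketched earlier
        loss.backward()
        optimizer.step()

@torch.no_grad()
def pseudo_label(teacher, unlabeled_loader, device="cpu"):
    """Teacher produces 'soft' pseudo annotations for the unlabeled WSI tiles."""
    teacher.eval()
    pairs = []
    for images in unlabeled_loader:
        images = images.to(device)
        pairs.append((images.cpu(), torch.sigmoid(teacher(images)).cpu()))
    return pairs

def teacher_student_round(teacher, student, labeled_loader, unlabeled_loader,
                          combined_loader_fn, optimizer, loss_fn, validate, epochs=10):
    """One refinement round: pseudo-label the unlabeled data, train the Student,
    and promote the Student to Teacher if it validates better (Dice/Jaccard)."""
    pseudo = pseudo_label(teacher, unlabeled_loader)
    student_loader = combined_loader_fn(labeled_loader, pseudo)   # authentic + pseudo labels
    for _ in range(epochs):
        train_one_epoch(student, student_loader, optimizer, loss_fn)
    if validate(student) > validate(teacher):   # held-out validation set as the benchmark
        teacher = copy.deepcopy(student)        # Student assumes the Teacher role
    return teacher, student
```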
4. Results Analysis
In this section, the focus is on meticulously comparing the performance of deep
learning models such as the LinkNet and Feature Pyramid Network (FPN), all built on
the ResNet34 framework. The analysis thoroughly assesses both Teacher and Student
models under each architecture, evaluating their stability, accuracy, and image segmentation
capabilities. This examination includes an exploration of seed values, training loss levels,
validation Dice coefficients, and Jaccard indices. Through this evaluation, an understanding
of the strengths and limitations of these architectures in the field of medical image analysis
is gained.
It should be highlighted that out of the 20,000 images, 5%, which corresponds to 1000
images, has been exclusively allocated for validation. Consequently, these 1000 images are
excluded from the training dataset of the model. Despite the possibility of increasing this
number, it was considered superfluous. This applies to both the LinkNet and FPN deep
learning models. Additionally, both models were trained and validated on the same data,
to keep a fair comparison.
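The fixed 1000-image validation split and the per-run weight-initialization seeds can be reproduced along the following lines; the split seed and the bookkeeping shown are illustrative assumptions, not the authors' scripts:

```python
import random

import numpy as np
import torch

def set_seed(seed):
    """Fix all relevant random number generators for one training run."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

# 5% of the 20,000 tiles (1000 images) held out once and reused for every model.
indices = np.arange(20_000)
rng = np.random.default_rng(0)    # split seed kept separate from the model seeds
rng.shuffle(indices)
val_idx, train_idx = indices[:1000], indices[1000:]

for seed in (21, 42, 84):         # the three weight-initialization seeds studied
    set_seed(seed)
    # ... build and train the Teacher/Student models on train_idx, validate on val_idx ...
```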
Figure 7. Results of the 'Teacher' model Linknet at seed 84.
However, further evaluating the Validation Jaccard index (Intersection over Union (IoU)), scores
of 0.8926 to 0.8972 were produced. The IoU provides insight into the segmentation performance
and indicates the accuracy of the predicted against the actual segmentations. Although the scores
are high, 89% would not necessarily be firm enough for real-world scenarios.
Further, the findings showed that the optimum epochs for the Teacher model were as follows: for
seed 21, the optimum epoch over 100 epochs was 92; seed 42 peaked at epoch 98; and seed 84
peaked at epoch 85.
4.1.2. Student Model Findings
The Student model is also based on the LinkNet architecture with ResNet34, trained with the same
seeds as the Teacher model. The Student model showed markedly lower loss than the Teacher
model, at 0.0004 to 0.0066 (Table 3, Figure 8); this suggests that the learning process was much
more efficient or that there is potentially overfitting involved.
The Validation Dice coefficient for the Student model shows a significant range from 0.8084 to
0.9529, with the highest score being achieved with seed 84. Similarly, the Validation Jaccard index
varied from 0.6784 to 0.9100, with an average of 0.7988 and the highest score also associated with
seed 84. This suggests that the initialization with seed 84 may have been particularly effective for
the Student model.
Figure 8. Results of the 'Student' model Linknet at seed 84.
However, the Student model did not exhibit the same level of stability as the Teacher model, as
evidenced by the fluctuation in the Validation Jaccard. This inconsistency could be attributed to
initialization variance, wherein the initial weights may have converged to a less optimal region.
This phenomenon is known to occur due to the initialization process or data shuffling, which are
influenced by
the random seed assigned. These factors affect the learning process of the model, including
the random sampling of data points or mini-batches.
Further looking into the optimum epoch for the Student model, the finding showed
that the model reached much earlier than the teacher model, at epoch 9 for all seeds. This
could imply that the Student model learned the necessary patterns more quickly than that
of the Teacher model or that it may have begun to overfit after this point.
In summary, both the Teacher and Student models demonstrated strong performance
on the given tasks (Table 4), with the Teacher model showing consistency across metrics
and the Student model exhibiting rapid learning but with more variability in performance,
displaying a lack of consistency in the Jaccard index. However, the use of different seed
values also provided insights into the stability of the model’s training process, with seed
84 yielding the best results for the Student model.
To better understand the results of the LinkNet architecture with the ResNet34 encoder, Table 4 is
presented, comparing the Teacher and Student model results side by side. It is noticeable that the
VD of the Teacher model is quite consistent, whereas the Student model shows some inconsistency,
with seed 84 demonstrating the most promising accuracy.
Table 4. Linknet Architecture with ResNet34 Base Comparison.
              Teacher                      Student
Seeds     TL       VD       VJ         TL       VD       VJ
21        0.0102   0.9430   0.893      0.0061   0.893    0.8081
42        0.0098   0.9425   0.8926     0.0066   0.8084   0.6784
84        0.0114   0.9450   0.8972     0.004    0.9529   0.9100
Average   0.0104   0.9435   0.8942     0.0055   0.8847   0.7988
Total Loss (TL), Validation Dice (VD), Validation Jaccard (VJ).
The Validation Dice coefficient varied significantly, ranging from 0.8956 to 0.9544, with
the highest accuracy achieved with seed 42. Similarly, the Validation Jaccard index ranged
from 0.8109 to 0.9128, again showing a notable range in segmentation performance. The
most effective seed for the Student model appeared to be 42, based on these metrics.
In terms of optimum epoch, the Student model reached its peak performance much
earlier than the Teacher model, at epoch 2 for seeds 21 and 84, and epoch 83 for seed 42.
This suggests a quicker learning curve for the Student model, although it may also indicate
a tendency to overfit beyond these points.
In summary, both the Teacher and Student models under the FPN architecture with a
ResNet34 base encoder layer demonstrated robust performance, with the Teacher model
showing more consistency across metrics and the Student model exhibiting rapid learning,
albeit with greater variability in its performance. The different seed values provided
valuable insights into the stability and efficiency of the model’s training process.
To better understand the results of the FPN architecture with ResNet34, Table 7 is
presented, comparing the Teacher and Student model results side by side. It is noticeable
that the VD of the Teacher model is consistent at 93%, whereas the Student model shows some
inconsistency across seeds 21, 42, and 84, with seed 42 demonstrating the most promising accuracy.
Table 7. FPN Architecture with ResNet34 Base Comparison.
              Teacher                      Student
Seeds     TL       VD       VJ         TL       VD       VJ
21        0.0173   0.9331   0.0173     0.0183   0.8956   0.8109
42        0.0154   0.9345   0.0154     0.0113   0.9544   0.9128
84        0.0160   0.9326   0.0160     0.0227   0.9378   0.8830
Average   0.0162   0.9334   0.0162     0.0174   0.9292   0.8689
Total Loss (TL), Validation Dice (VD), Validation Jaccard (VJ).
4.3. Discussion
The primary objective of this study was to evaluate and compare the performance
of two distinct deep learning architectures in analyzing Whole Slide Images (WSIs) with
a focus on breast cancer segmentation. The results indicated that all models performed
commendably overall. However, there were noticeable variations in terms of effectiveness
and accuracy. Some models appeared to be overfitting the data, potentially limiting their
generalizability. In contrast, Teacher models were generally more stable and accurate, while
Student models learned quickly but exhibited more variation, resulting in a slightly lower
quality compared to Teacher models.
The findings suggest that the LinkNet model shows strong performance in breast
cancer detection, with its results being particularly notable with the seed 21 parameter.
Although the FPN (Feature Pyramid Network) model did not match the performance of
LinkNet, it still demonstrated sufficient utility in breast cancer detection as well as a more
stable Jaccard index. This study highlights the potential of using different deep learning
architectures for medical imaging tasks, with a focus on the effectiveness and efficiency of
these models in detecting and analyzing breast cancer tissues.
It should be emphasized that these models have been developed using a teacher–
student framework. In scenarios where ample data are available, the Teacher model
suffices, eliminating the necessity for a Student model. However, in cases of data scarcity,
employing a Student model is recommended. According to the findings of this research,
the Student model based on FPN performed satisfactorily. However, despite the varying
outcomes where seeds 21 and 84 were tested, seed 42 exhibited inferior results, as previously
discussed. This indicates that a Student model utilizing FPN can be effectively paired with
the LinkNet teacher model. Nonetheless, it is important to consider that testing additional
seed parameters could potentially modify the average performance metrics for a LinkNet-
based Student model.
As seen in the images (Figures 11 and 12), the detection of breast cancer by each of the models was
notably effective, demonstrating their capability to detect breast cancer in unseen data. This
suggests that each model could be effectively utilized in detecting breast cancer and potentially in
segmenting other pathological disorders in WSIs.
Figure 11. Image results detecting cancer with the Student model Linknet at seed 84.
Figure 12. Image results detecting cancer with the Student model FPN at seed 42.
4.4. Implications of the Study
In summary, this research emphasizes the significance of selecting and training models in deep
learning tasks, setting important benchmarks for future advancements in image analysis
technologies. Both individuals and organizations can greatly benefit from the increased accuracy
and efficiency of these models, particularly in fields like medical imaging where precise image
analysis is essential. These findings offer valuable guidance for organizations utilizing AI-based
image analysis, aiding them in choosing and training
Finally, the use of Teacher and Student models demonstrated a strong approach to
further train models on annotated data derived from the teacher model. This effectiveness
was evident through the segmentation tools and the performance of the Jaccard and Dice
evaluations.
5. Conclusions
This research explores the performance of LinkNet and FPN architectures, all based on
ResNet34, in image segmentation tasks for detecting breast cancer in Whole Slide Images
(WSIs). The study provides critical insights into the stability of the training, accuracy levels,
and segmentation capabilities of these models, making a significant contribution to the
field of deep learning in medical image analysis.
The study highlights the importance of selecting the appropriate model and seed
value to optimize performance while addressing the challenges associated with overfit-
ting, particularly in student models. These aspects are especially crucial in applications
requiring precise image analysis, such as in the diagnosis of diseases like breast cancer or
in environmental monitoring.
From a practical standpoint, these findings are invaluable in guiding model selection
and refining training methodologies. Consequently, they have the potential to enhance the
accuracy and efficiency of AI systems in healthcare, offering more effective tools for breast
cancer detection and other medical imaging tasks.
Furthermore, this research demonstrates that deep learning architectures like LinkNet
and FPN (Feature Pyramid Network) can accurately detect breast cancer in Whole Slide
Imaging (WSI). Among these, LinkNet is highlighted as a recommended model due to its
superior performance in this specific application. This investigation not only improves our
understanding of deep learning models in image segmentation, but also underscores their
potential to enhance diagnostic procedures in medical practice.
In summary, this study marks a significant advancement in the use of deep learning
models for image analysis, particularly within the vital healthcare sector. It paves the way
for more precise and reliable AI-powered diagnostic tools.
Author Contributions: Conceptualization, S.S.; methodology, S.S. and J.U.; software, S.S.; vali-
dation, S.S.; formal analysis, J.U.; investigation, S.S. and J.U.; resources, S.S.; data curation, S.S.;
writing—original draft preparation, S.S.; writing—review and editing, S.S. and J.U.; visualization,
S.S.; supervision, J.U.; project administration, J.U.; funding acquisition, S.S.; All authors have read
and agreed to the published version of the manuscript.
Funding: This research is funded by Woosong University Academic Research 2024.
Institutional Review Board Statement: Not applicable.
Appendix A
This table presents a compilation of deep learning models reviewed during the course
of this study. It is included to assist future researchers exploring deep learning models for
image-based pathology detection.
No.   Year   Model (Architecture)   Description
Combines a basic Convolutional Neural Network (CNN) with a Conditional
1 2014 CRF Deep Learning Random Field (CRF) for improved image segmentation, enhancing boundary
delineation by refining the CNN’s output with the CRF [44].
Fully Convolutional Pioneering architecture in semantic segmentation that uses convolutional layers to
2 2014
Networks (FCN) process images of any size and outputs segmentation maps [45].
A highly effective network for medical image segmentation, featuring a U-shaped
3 2015 U-Net
architecture that excels in tasks where data are limited [19].
A model based on Recurrent Neural Networks (RNNs) and FCN, designed for
4 2015 ReSeg semantic image segmentation, leveraging the sequential nature of RNNs for
improved segmentation [46].
Uses deconvolutional layers to perform up-sampling of feature maps, enabling
5 2015 Deconvolution Network
precise localization in semantic segmentation tasks [47].
Incorporates dilated convolutions to expand the receptive field without reducing
6 2015 Dilated ConvNet resolution, enhancing performance in dense prediction tasks like semantic
segmentation [48].
Enhances FCNs by adding global context to improve segmentation accuracy,
7 2015 ParseNet
focusing on understanding the whole scene context [49].
SegNet was designed for road scene understanding in the context of autonomous
8 2015 SegNet
driving [50].
DeepLabv1 and its successive versions (v2, v3, v3+, and v4) made significant
9 2016 DeepLab contributions in semantic segmentation, incorporating dilated convolutions,
atrous spatial pyramid pooling, and encoder–decoder structures [51].
10 2016 PSPNet Proposed Pyramid Scene Parsing Network for scene parsing tasks [52].
This approach to segmentation is designed to not only classify pixels but also
Instance-Aware differentiate between separate instances of the same class in the image. It is
11 2016
Segmentaiton commonly used in scenarios where identifying individual objects (instances) is
crucial, such as in instance segmentation tasks [53].
An adaptation of the U-Net model for volumetric (3D) medical image
segmentation. It is particularly effective for tasks like organ segmentation in 3D
12 2016 V-Net
medical scans, using a similar architecture to U-Net but extended to three
dimensions [54].
A lightweight and efficient network designed for real-time semantic segmentation,
particularly in mobile or low-power devices. It achieves a good balance between
13 2016 ENet
accuracy and speed, making it suitable for applications where computational
resources are limited [55].
Utilizes a multi-path refinement network for high-resolution semantic
14 2016 RefineNet
segmentation [56].
This is also known as The One Hundred Layers Tiramisu; it utilizes DenseNets for
15 2017 Tiramisu
semantic segmentation [57].
An extension of Faster R-CNN, Mask R-CNN is effective for instance
16 2017 Mask R-CNN
segmentation tasks [22].
Combines the principles of DenseNets (densely connected convolutional
17 2017 FC-DenseNet networks) with FCNs, leading to efficient and accurate semantic segmentation,
especially in medical imaging [57].
Designed for semantic segmentation, this network uses large kernels and global
18 2017 Global Convolutional Net convolutional layers to handle both classification and localization tasks
effectively [58].
An advanced version of DeepLab, it employs atrous convolutions and spatial
19 DeepLab V3
pyramid pooling to effectively segment objects at multiple scales [59].
A versatile architecture used in both object detection and segmentation, it builds a
FPN—Feature Pyramid
20 2017 multi-scale feature pyramid from a single input image for efficient and accurate
Network
detection at multiple scales [38].
Utilizes an encoder–decoder architecture for fast and accurate semantic
21 2017 LinkNet
segmentation [26].
22 2018 ICNet Designed for real-time semantic segmentation tasks [60].
23 2018 Attention U-Net Incorporates attention mechanisms into the U-Net architecture [61].
24 2018 Nested U-Net A U-Net architecture with nested and dense skip pathways [60].
25 2018 Panoptic Segmentation A unified model that simultaneously performs semantic segmentation [62].
A deep learning model that combines semantic segmentation, direction prediction,
26 2018 Mask-Lab
and instance center prediction for instance segmentation tasks [63].
Path Aggregation Enhances feature hierarchy and representation capability for object detection by
27 2018
Network enabling efficient multi-scale feature aggregation [64].
A network that integrates dense connections and atrous spatial pyramid pooling
28 2018 Dense-ASSP
for robust semantic image segmentation [65].
A model that fuses semantic and boundary information at multiple levels to
29 2018 Excuse
enhance feature representation and segmentation accuracy [63].
Context Encoding Focuses on capturing global context information for semantic segmentation, often
30 2018
Network using a context encoding module to improve performance [66].
A framework that combines the Feature Pyramid Network (FPN) with panoptic
31 2019 Panoptic FPN segmentation, effectively handling both object detection and segmentation
tasks [41].
A semantic segmentation network with a gated shape stream that focuses on
32 2019 Gated-SCNN
capturing shape information alongside the usual texture features [67].
A unified panoptic segmentation network that effectively combines instance and
33 2019 UPS-Net
semantic segmentation tasks into a single, coherent framework [68].
A dense prediction model for instance segmentation that uses structured 4D
34 2019 TensorMask
tensors to represent masks, enabling precise spatial understanding [69].
Maintains high-resolution representations through the network, enhancing
35 2019 HRNet
performance in tasks like semantic segmentation and object detection [70].
CC-Net: CrissCross Employs criss-cross attention to capture long-range contextual information in a
36 2019
Attention computationally efficient manner for semantic segmentation [71].
Integrates position and channel attention mechanisms to capture rich contextual
37 2017 Dual Attention Network
dependencies for improved scene segmentation [72].
38 2019 Fast-SCNN A fast and efficient network design for semantic segmentation on road scenes [73].
39 2020 DPT Vision transformer-based architecture for segmentation tasks [74].
Another Vision Transformer-based method for segmentation shows the
40 2020 SETR
effectiveness of transformers in dense prediction tasks [75].
Aims at rendering fine-grained detail in segmentation through iterative
41 2020 PointRend
subdivision [74].
42 2020 EfficientPS Combines semantic segmentation and object detection efficiently [76].
43 2019 FasterSeg An architecture search-based approach for real-time semantic segmentation [77].
44 2018 MAnet Utilizes multi-head attention mechanisms for semantic segmentation [60].
FasterSeg is an AI-designed segmentation network that outperforms traditional
45 2020 FasterSeg models in speed and accuracy by using advanced neural architecture search and
collaborative frameworks [78].
A novel single-shot instance segmentation method that represents object masks in
46 2020 PolarMask,
a polar co-ordinate system; simplifies the instance segmentation process [79].
An efficient anchor-free instance segmentation model that extends the CenterNet
47 2020 CenterMask
object detector by adding a spatial attention-guided mask branch [80].
Stands for “Semantic-Context Neural Architecture Search”. It is a network
48 2020 SC-NAS architecture search method designed to optimize semantic segmentation networks
by considering the semantic context of the task [81].
This combines EfficientNet, a scalable and efficient network architecture, with
NAS-FPN (Neural Architecture Search Feature Pyramid Network), a method for
49 2020 EffientNet + NAS-FPN automatically designing feature pyramid architectures for object detection tasks.
This combination aims to optimize both efficiency and accuracy in detection
models [82].
Multi-scale Adaptive Multi-scale Adaptive Feature Fusion Network for Semantic Segmentation in
50 2020
Feature Fusion Network Remote Sensing Images [83].
51 2021 TUNet TransUNet [84].
52 2021 SUnet Swin-Unet, Swin-Transformer [85].
53 2021 Segm Segmenter [86].
54 2021 MedT Medical Transformer [87].
55 2021 BEiT BERT Image Transformers [88].
56 2023 CrossFormer A Hybrid Transformer Architecture for Semantic Segmentation [89]
57 2022 MLP-Mixer Semantic Segmentation with Transformer and MLP-Mixer [90].
Transformer-Powered Transformer-Powered Semantic Segmentation with Large-Scale Instance
58 2022
Semantic Segmentation Discrimination [91].
59 2023 Adaptive Context Fusion Semantic Segmentation with Adaptive Context Fusion [89].
Multi-Scale Vision
60 2023 Semantic Segmentation with Multi-Scale Vision Transformers [92]
Transformers
Hiformer: Hierarchical
61 2023 Semantic Segmentation with Hierarchical Vision Transformers [93]
multi-scale
References
1. Rosenblatt, F. The Perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65,
386–408. [CrossRef] [PubMed]
2. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017,
60, 84–90. [CrossRef]
3. Arnold, M.; Morgan, E.; Rumgay, H.; Mafra, A.; Singh, D.; Laversanne, M.; Vignat, J.; Gralow, J.R.; Cardoso, F.; Sisesling, S.; et al.
Current and future burden of breast cancer: Global statistics for 2020 and 2040. Breast 2022, 66, 15–23. [CrossRef] [PubMed]
4. Pacal, İ. Deep learning approaches for classification of breast cancer in ultrasound (US) images. J. Inst. Sci. Technol. 2022, 12,
1917–1927. [CrossRef]
5. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach.
Intell. 2013, 35, 1798–1828. [CrossRef] [PubMed]
6. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [CrossRef]
7. Ananthachari, P.; Schutte, S. Big Data Tools, Deep Learning & 3D Objects in the Metaverse. In Digitalization and Management
Innovation II: Proceedings of DMI 2023; IOS Press: Amsterdam, The Netherlands, 2023; Volume 376, p. 236.
8. Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.;
Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8,
1–74. [CrossRef]
9. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
[CrossRef]
10. Cui, N. Applying gradient descent in convolutional neural networks. In Proceedings of the 2nd International Conference on
Machine Vision and Information Technology (CMVIT 2018), Hong Kong, China, 23–25 February 2018; Volume 1004, p. 012027.
11. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2016, arXiv:1609.04747.
12. Kumar, N.; Gupta, R.; Gupta, S. Whole slide imaging (WSI) in pathology: Current perspectives and future directions. J. Digit.
Imaging 2020, 33, 1034–1040. [CrossRef]
13. Pantanowitz, L.; Farahani, N.; Parwani, A. Whole slide imaging in pathology: Advantages, limitations, and emerging perspectives.
Pathol. Lab. Med. Int. 2015, 7, 22–23. [CrossRef]
14. Campanella, G.; Hanna, M.G.; Geneslaw, L.; Miraflor, A.; Werneck Krauss Silva, V.; Busam, K.J.; Brogi, E.; Reuter, V.E.; Klimstra,
D.S.; Fuchs, T.J. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med.
2019, 25, 1301–1309. [CrossRef] [PubMed]
15. Janowczyk, A.; Madabhushi, A. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use
cases. J. Pathol. Inform. 2016, 7, 29. [CrossRef] [PubMed]
16. Litjens, G.; Sánchez, C.I.; Timofeeva, N.; Hermsen, M.; Nagtegaal, I.; Kovacs, I.; Hulsbergen-van de Kaa, C.; Bult, P.; van Ginneken, B.; van der Laak, J. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci. Rep. 2016, 6, 26286. [CrossRef] [PubMed]
17. Oxford Learner’s Dictionaries. Segmentation Noun—Definition, Pictures, Pronunciation and Usage Notes. Available online:
https://www.oxfordlearnersdictionaries.com/us/definition/english/segmentation (accessed on 20 December 2023).
18. Chowdhary, C.L.; Acharjya, D.P. Segmentation and feature extraction in medical imaging: A systematic review. Procedia Comput.
Sci. 2020, 167, 26–36. [CrossRef]
19. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of
the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich,
Germany, 5–9 October 2015; pp. 234–241.
20. Protonotarios, N.E.; Katsamenis, I.; Sykiotis, S.; Dikaios, N.; Kastis, G.A.; Chatziioannou, S.N.; Metaxas, M.; Doulamis, N.;
Doulamis, A. A few-shot U-Net deep learning model for lung cancer lesion segmentation via PET/CT imaging. Biomed. Phys.
Eng. Express 2022, 8, 025019. [CrossRef] [PubMed]
21. Falk, T.; Mai, D.; Bensch, R.; Çiçek, Ö.; Abdulkadir, A.; Marrakchi, Y.; Böhm, A.; Deubner, J.; Jäckel, Z.; Seiwald, K.; et al. U-Net:
Deep Learning for Cell Counting, Detection, and Morphometry. Nat. Methods 2019, 16, 67–70. [CrossRef] [PubMed]
22. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask r-cnn. In Proceedings of the IEEE International Conference on Computer Vision,
Venice, Italy, 22–29 October 2017; pp. 2961–2969.
23. Soleimani, P.; Farezi, N. Utilizing deep learning via the 3D U-net neural network for the delineation of brain stroke lesions in MRI
image. Sci. Rep. 2023, 13, 19808. [CrossRef] [PubMed]
24. Srinivasan, S.; Durairaju, K.; Deeba, K.; Mathivanan, S.K.; Karthikeyan, P.; Shah, M.A. Multimodal Biomedical Image Segmentation using Multi-Dimensional U-Convolutional Neural Network. BMC Med. Imaging 2024, 24, 38. [CrossRef]
25. Yi, X.; Walia, E.; Babyn, P. Generative adversarial network in medical imaging: A review. Med. Image Anal. 2019, 58, 101552.
[CrossRef]
26. Chaurasia, A.; Culurciello, E. Linknet: Exploiting encoder representations for efficient semantic segmentation. In Proceedings of
the 2017 IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, FL, USA, 10–13 December 2017; pp. 1–4.
27. Zhang, H.; Xu, Z.; Yao, D.; Zhang, S.; Chen, J.; Lukasiewicz, T. Multi-Head Feature Pyramid Networks for Breast Mass Detection.
In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP),
Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
28. Wang, Y.; Ahsan, U.; Li, H.; Hagen, M. A comprehensive review of modern object segmentation approaches. Found. Trends
Comput. Graph. Vis. 2022, 13, 111–283. [CrossRef]
29. Fathi, A.; Wojna, Z.; Rathod, V.; Wang, P.; Oh Song, H.; Guadarrama, S.; Murphy, K.P. Semantic instance segmentation via deep
metric learning. arXiv 2017, arXiv:1703.10277.
30. Qiu, Z.; Gan, H.; Shi, M.; Huang, Z.; Yang, Z. Self-training with dual uncertainty for semi-supervised medical image segmentation.
arXiv 2022, arXiv:2304.04441v2.
31. Lin, Q.; Ng, H.T. A Semi-Supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues.
In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2022; Volume 36,
pp. 11011–11019. [CrossRef]
32. Sun, Z.; Fan, C.; Sun, X.; Meng, Y.; Wu, F.; Li, J. Neural semi-supervised learning for text classification under large-scale pretraining.
arXiv 2020, arXiv:2011.08626.
33. Bertels, J.; Eelbode, T.; Berman, M.; Vandermeulen, D.; Maes, F.; Bisschops, R.; Blaschko, M.B. Optimizing the DICE score
and Jaccard index for medical image segmentation: Theory and practice. In Proceedings of the Medical Image Computing
and Computer Assisted Intervention—MICCAI 2019, 22nd International Conference, Shenzhen, China, 13–17 October 2019.
[CrossRef]
34. Zou, K.H.; Warfield, S.K.; Bharatha, A.; Tempany, C.M.C.; Kaus, M.R.; Haker, S.J.; Wells, W.M.; Jolesz, F.A.; Kikinis, R. Statistical
Validation of Image Segmentation Quality Based on a Spatial Overlap Index. Acad. Radiol. 2004, 11, 178–189. [CrossRef] [PubMed]
35. Crum, W.R.; Camara, O.; Hill, D.L.G. Generalized overlap measures for evaluation and validation in medical image analysis.
IEEE Trans. Med. Imaging 2006, 25, 1451–1461. [CrossRef] [PubMed]
36. Taha, A.A.; Hanbury, A. Metrics for Evaluating 3D Medical Image Segmentation: Analysis, Selection, and Tool. BMC Med.
Imaging 2015, 15, 29. [CrossRef]
37. Ding, K.; Zhou, M.; Wang, H.; Gevaert, O.; Metaxas, D.; Zhang, S. A Large-Scale Synthetic Pathological Dataset for Deep
Learning-Enabled Segmentation of Breast Cancer. Sci. Data 2023, 10, 231. [CrossRef]
38. Lin, T.-Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July
2017; pp. 936–944. [CrossRef]
39. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [CrossRef]
40. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1–9. [CrossRef]
41. Kirillov, A.; Girshick, R.; He, K.; Dollár, P. Panoptic Feature Pyramid Networks. In Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 6399–6408.
42. Guo, C.; Fan, B.; Zhang, Q.; Xiang, S.; Pan, C. AugFPN: Improving Multi-Scale Feature Learning for Object Detection. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 12595–
12604.
43. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [CrossRef]
44. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Semantic image segmentation with deep convolutional nets
and fully connected crfs. arXiv 2014, arXiv:1412.7062.
45. Shelhamer, E.; Long, J.; Darrell, T. Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
2017, 39, 640–651. [CrossRef] [PubMed]
46. Visin, F.; Romero, A.; Cho, K.; Matteucci, M.; Ciccone, M.; Kastner, K.; Bengio, Y.; Courville, A. ReSeg: A recurrent neural
network-based model for semantic segmentation. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 41–48. [CrossRef]
47. Noh, H.; Hong, S.; Han, B. Learning deconvolution network for semantic segmentation. In Proceedings of the 2015 IEEE
International Conference on Computer Vision (ICCV), Santiago, Chile, 13–16 December 2015. [CrossRef]
48. Yu, F.; Koltun, V. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv 2016, arXiv:1511.07122.
49. Liu, W.; Rabinovich, A.; Berg, A.C. ParseNet: Looking Wider to See Better. arXiv 2015, arXiv:1506.04579.
50. Qiao, J.-J.; Cheng, Z.-Q.; Wu, X.; Li, W.; Zhang, J. Real-time semantic segmentation with parallel multiple views feature
augmentation. In Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 10–14 October 2022;
pp. 6300–6308. [CrossRef]
51. Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848. [CrossRef] [PubMed]
52. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the 2017 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890. [CrossRef]
53. Li, Y.; Qi, H.; Dai, J.; Ji, X.; Wei, Y. Fully Convolutional Instance-Aware Semantic Segmentation. In Proceedings of the 2017 IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4438–4446. [CrossRef]
54. Milletari, F.; Navab, N.; Ahmadi, S.-A. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In
Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA, 25–28 October 2016; pp. 565–571.
[CrossRef]
55. Paszke, A.; Chaurasia, A.; Kim, S.; Culurciello, E. ENet: A Deep neural network architecture for real-time semantic segmentation.
arXiv 2016, arXiv:1606.02147. [CrossRef]
56. Lin, G.; Milan, A.; Shen, C.; Reid, I. RefineNet: Multi-path refinement networks for high-resolution semantic segmentation. In
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July
2017; pp. 5168–5177. [CrossRef]
57. Jegou, S.; Drozdzal, M.; Vazquez, D.; Romero, A.; Bengio, Y. The one hundred layers tiramisu: Fully convolutional DenseNets for
Semantic segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops
(CVPRW), Honolulu, HI, USA, 21–26 July 2017. [CrossRef]
58. Peng, C.; Zhang, X.; Yu, G.; Luo, G.; Sun, J. Large kernel matters—Improve semantic segmentation by global convolutional
network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA,
21–26 July 2017; pp. 1743–1751. [CrossRef]
59. Chen, L.-C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking atrous convolution for semantic Image segmentation. arXiv 2017,
arXiv:1706.05587. [CrossRef]
60. Zhao, H.; Qi, X.; Shen, X.; Shi, J.; Jia, J. Icnet for real-time semantic segmentation on high-resolution images. In Proceedings of the
European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 405–420.
61. Han, G.; Zhang, M.; Wu, W.; He, M.; Liu, K.; Qin, L.; Liu, X. Improved U-Net based insulator image segmentation method based
on attention mechanism. Energy Rep. 2021, 7, 210–217. [CrossRef]
62. Xiong, Y.; Liao, R.; Zhao, H.; Hu, R.; Bai, M.; Yumer, E.; Urtasun, R. Upsnet: A unified panoptic segmentation network. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019;
pp. 8818–8826.
63. Zhang, G.; Lu, X.; Tan, J.; Li, J.; Zhang, Z.; Li, Q.; Hu, X. Refinemask: Towards high-quality instance segmentation with fine-
grained features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June
2021; pp. 6861–6869.
64. Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
65. Eerapu, K.K.; Ashwath, B.; Lal, S.; Dell’Acqua, F.; Dhan, A.N. Dense refinement residual network for road extraction from aerial
imagery data. IEEE Access 2019, 7, 151764–151782. [CrossRef]
66. Gu, Z.; Cheng, J.; Fu, H.; Zhou, K.; Hao, H.; Zhao, Y.; Zhang, T.; Gao, S.; Liu, J. Ce-net: Context encoder network for 2d medical
image segmentation. IEEE Trans. Med. Imaging 2019, 38, 2281–2292. [CrossRef] [PubMed]
67. Takikawa, T.; Acuna, D.; Jampani, V.; Fidler, S. Gated-scnn: Gated shape cnns for semantic segmentation. In Proceedings of the
IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 5229–5238.
68. Cui, B.; Fei, D.; Shao, G.; Lu, Y.; Chu, J. Extracting raft aquaculture areas from remote sensing images via an improved U-net with
a PSE structure. Remote Sens. 2019, 11, 2053. [CrossRef]
69. Chen, X.; Girshick, R.; He, K.; Dollár, P. Tensormask: A foundation for dense object segmentation. In Proceedings of the IEEE/CVF
International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 2061–2069.
70. Seong, S.; Choi, J. Semantic segmentation of urban buildings using a high-resolution network (HRNet) with channel and spatial
attention gates. Remote Sens. 2021, 13, 3087. [CrossRef]
71. Huang, Z.; Wang, X.; Huang, L.; Huang, C.; Wei, Y.; Liu, W. Ccnet: Criss-cross attention for semantic segmentation. In Proceedings
of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019;
pp. 603–612.
72. Nam, H.; Ha, J.W.; Kim, J. Dual attention networks for multimodal reasoning and matching. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 299–307.
73. Poudel, R.P.; Liwicki, S.; Cipolla, R. Fast-scnn: Fast semantic segmentation network. arXiv 2019, arXiv:1902.04502.
74. Chen, Z.; Zhu, Y.; Zhao, C.; Hu, G.; Zeng, W.; Wang, J.; Tang, M. Dpt: Deformable patch-based transformer for visual recognition.
In Proceedings of the 29th ACM International Conference on Multimedia, Virtual, 20–24 October 2021; pp. 2899–2907.
75. Zheng, S.; Lu, J.; Zhao, H.; Zhu, X.; Luo, Z.; Wang, Y.; Fu, Y.; Feng, J.; Xiang, T.; Torr, P.H.S.; et al. Rethinking semantic segmentation
from a sequence-to-sequence perspective with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition, Virtual, 19–25 June 2021; pp. 6881–6890.
76. Corradetti, M. A Point-Based Rendering Approach for On-Board Instance Segmentation of Non-Cooperative Resident Space
Objects. Master’s Thesis, Politecnico di Milano, Milan, Italy, 6 October 2022. Available online: https://hdl.handle.net/10589/195413 (accessed on 6 October 2022).
77. Mohan, R.; Valada, A. Efficientps: Efficient panoptic segmentation. Int. J. Comput. Vis. 2021, 129, 1551–1579. [CrossRef]
78. Chen, W.; Gong, X.; Liu, X.; Zhang, Q.; Li, Y.; Wang, Z. Fasterseg: Searching for faster real-time semantic segmentation. arXiv
2019, arXiv:1912.10917.
79. Xie, E.; Sun, P.; Song, X.; Wang, W.; Liu, X.; Liang, D.; Shen, C.; Luo, P. Polarmask: Single shot instance segmentation with polar
representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 14–19 June
2020; pp. 12193–12202.
80. Lee, Y.; Park, J. Centermask: Real-time anchor-free instance segmentation. In Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, Virtual, 14–19 June 2020; pp. 13906–13915.
81. Song, Y.; Sha, E.H.-M.; Zhuge, Q.; Xu, R.; Xu, X.; Li, B.; Yang, L. Hardware-aware neural architecture search for stochastic
computing-based neural networks on tiny devices. J. Syst. Archit. 2023, 135, 102810. [CrossRef]
82. Ghiasi, G.; Cui, Y.; Srinivas, A.; Qian, R.; Lin, T.Y.; Cubuk, E.D.; Le, Q.V.; Zoph, B. Simple copy-paste is a strong data augmentation
method for instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
Virtual, 19–25 June 2021; pp. 2918–2928.
83. Shang, R.; Zhang, J.; Jiao, L.; Li, Y.; Marturi, N.; Stolkin, R. Multi-scale adaptive feature fusion network for semantic segmentation
in remote sensing images. Remote Sens. 2020, 12, 872. [CrossRef]
84. Nguyen, V.A.; Nguyen, A.H.; Khong, A.W. Tunet: A block-online bandwidth extension model based on transformers and
self-supervised pretraining. In Proceedings of the ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 161–165.
85. Fan, C.M.; Liu, T.J.; Liu, K.H. SUNet: Swin transformer UNet for image denoising. In Proceedings of the 2022 IEEE International
Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, 27 May–1 June 2022; pp. 2333–2337.
86. Strudel, R.; Garcia, R.; Laptev, I.; Schmid, C. Segmenter: Transformer for semantic segmentation. In Proceedings of the IEEE/CVF
International Conference on Computer Vision, Virtual, 19–25 June 2021; pp. 7262–7272.
87. Valanarasu, J.M.J.; Oza, P.; Hacihaliloglu, I.; Patel, V.M. Medical transformer: Gated axial-attention for medical image segmenta-
tion. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International
Conference, Strasbourg, France, 27 September–1 October 2021; pp. 36–46.
88. Bao, H.; Dong, L.; Piao, S.; Wei, F. Beit: Bert pre-training of image transformers. arXiv 2021, arXiv:2106.08254.
89. He, A.; Wang, K.; Li, T.; Du, C.; Xia, S.; Fu, H. H2Former: An efficient hierarchical hybrid transformer for medical image
segmentation. IEEE Trans. Med. Imaging 2023, 42, 2763–2775. [CrossRef]
90. Lai, H.P.; Tran, T.T.; Pham, V.T. Axial attention mlp-mixer: A new architecture for image segmentation. In Proceedings of the
2022 IEEE Ninth International Conference on Communications and Electronics (ICCE), Nha Trang, Vietnam, 27–29 July 2022;
pp. 381–386.
91. Razumovskaia, E.; Glavas, G.; Majewska, O.; Ponti, E.M.; Korhonen, A.; Vulic, I. Crossing the conversational chasm: A primer on
natural language processing for multilingual task-oriented dialogue systems. J. Artif. Intell. Res. 2022, 74, 1351–1402. [CrossRef]
92. Mkindu, H.; Wu, L.; Zhao, Y. 3D multi-scale vision transformer for lung nodule detection in chest CT images. Signal Image Video
Process. 2023, 17, 2473–2480. [CrossRef]
93. Heidari, M.; Kazerouni, A.; Soltany, M.; Azad, R.; Aghdam, E.K.; Cohen-Adad, J.; Merhof, D. Hiformer: Hierarchical multi-scale
representations using transformers for medical image segmentation. In Proceedings of the IEEE/CVF Winter Conference on
Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 6202–6212.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.