
This article has been accepted for publication in Computerized Medical Imaging and Graphics.

DOI: https://doi.org/10.1016/j.compmedimag.2023.102259

arXiv:2307.05804v1 [eess.IV] 11 Jul 2023

Highlights

Improving Segmentation and Detection of Lesions in CT Scans Using Intensity Distribution Supervision

Seung Yeon Shin, Thomas C. Shen, Ronald M. Summers

• Intensity values in CT scans, i.e., Hounsfield units, convey important information.

• A method to incorporate the intensity information of a target lesion in training is proposed.

• An intensity distribution of a target lesion is used to define an auxiliary task.

• It informs the network about possible lesion locations based on intensity values.

• A relative improvement of 2.4% (16.9%) was obtained in segmenting (detecting) kidney tumors.
Improving Segmentation and Detection of Lesions in CT Scans Using Intensity Distribution Supervision

Seung Yeon Shin∗, Thomas C. Shen, Ronald M. Summers

Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Radiology and Imaging Sciences, Clinical Center, National Institutes of Health, Bethesda, MD 20892, USA

Abstract
We propose a method to incorporate the intensity information of a target lesion on CT scans in training segmentation and detection networks. We first build an intensity-based lesion probability (ILP) function from an intensity histogram of the target lesion. It is used to compute the probability of being the lesion for each voxel based on its intensity. Finally, the computed ILP map of each input CT scan is provided as additional supervision for network training, which aims to inform the network about possible lesion locations in terms of intensity values at no additional labeling cost. The method was applied to improve the segmentation of three different lesion types, namely, small bowel carcinoid tumor, kidney tumor, and lung nodule. The effectiveness of the proposed method on a detection task was also investigated. We observed improvements of 41.3% → 47.8%, 74.2% → 76.0%, and 26.4% → 32.7% in segmenting small bowel carcinoid tumor, kidney tumor, and lung nodule, respectively, in terms of per case Dice scores. An improvement of 64.6% → 75.5% was achieved in detecting kidney tumors in terms of average precision. The results of different usages of the ILP map and the effect of varied amounts of training data are also presented.
Keywords: Lesion segmentation, lesion detection, supervision, intensity
distribution, Hounsfield unit, computed tomography, carcinoid tumor,
kidney tumor, lung nodule


∗Corresponding author: [email protected]

1. Introduction
Identification and quantification of abnormalities are the first objectives of medical image acquisition (Eisenhauer et al., 2009; Shin et al., 2019; Ryu et al., 2021). Depending on the findings, immediate treatment can be initiated or follow-up studies can be triggered for surveillance (Cai et al., 2021).
Abnormalities can arise at various locations in the body, such as in different tissues and organs. Finding a particular type of lesion is often coupled with segmentation of an organ/tissue that can contain the lesion, e.g., liver and liver tumor segmentation (Ayalew et al., 2021). These two related tasks can be considered either sequentially or jointly. In the first case, organ segmentation can benefit the subsequent lesion identification by restricting the region of interest and thus enabling more detailed inspection within it (Shin et al., 2015; Kamble et al., 2020). Meanwhile, the latter facilitates joint optimization of the relevant tasks and thus can improve both. In the work of Tang et al. (2020), liver and liver tumor segmentation are performed jointly using a single shared network to utilize the correlation between them. Features learned for organ segmentation would be relevant to contained lesions since they reside within the organ.
Nevertheless, it is not always possible to train organ segmentation together with target lesion segmentation, since it requires additional ground-truth (GT) segmentations of the organ. Indeed, a clinician typically aims to find and mark abnormalities, not an entire organ, which hinders the mentioned joint or sequential modeling in most cases.
There have been many attempts to boost lesion segmentation without requiring additional supervision (Weninger et al., 2020; Liu et al., 2021; Ma et al., 2020). Weninger et al. (2020) utilized a multi-task learning network, which performs an image reconstruction task in parallel with tumor segmentation, for brain tumors in magnetic resonance imaging (MRI) scans. By having a shared encoder and separate decoders for each task, training part of the network, i.e., the auto-encoder part, using scans without annotations is enabled. In the work of Liu et al. (2021), lesion edge prediction is used as an auxiliary task to help the segmentation of skin lesions. The two tasks interact with each other within the network and boost each other's performance. As a noticeable trend, incorporating distance transform maps of the GT into segmentation networks has been tried in many different works in past years, which were summarized in the work of Ma et al. (2020). Interestingly, most of the benchmark methods showed no or only minor improvement over the baseline for tumor segmentation, while they performed better for organ segmentation. Challenges of tumor segmentation compared to organ segmentation, such as various locations, shapes, and sizes, are mentioned as a potential reason.
In terms of lesion detection, Jaeger et al. (2020) developed the Retina U-Net architecture based on the Retina Net (Lin et al., 2017b). The decoding part of the Retina Net is augmented with additional high-resolution feature levels, and semantic lesion segmentation is performed on top of them to boost the detection task. Despite its effectiveness, it assumes that segmentation annotations are available together with detection annotations, which is not always the case.
In computed tomography (CT) scans, intensity values, i.e., Hounsfield units (HU), convey important information on the substance of each region, e.g., air, fat, and bone (Buzug, 2011). Therefore, they can be used in identifying a particular organ, tissue, or lesion (Petersenn et al., 2015; Lin et al., 2016; Phan et al., 2019; Summers et al., 2006). In the work of Petersenn et al. (2015), a HU threshold of 13 or 21 is suggested to discriminate malignant adrenal tumors from benign ones in unenhanced CT scans. In the work of Lin et al. (2016), a combined use of lesion morphology and HU values improved the diagnostic accuracy in differentiating benign and malignant incidental breast lesions on contrast-enhanced chest CT scans. In the work of Phan et al. (2019), a specific threshold range of [40, 90] is used to determine areas of hemorrhage on brain CT scans.
In this paper, we propose a method to incorporate the intensity information of a target lesion on CT scans in training segmentation and detection networks. Instead of using hard thresholds as in the previous works (Petersenn et al., 2015; Lin et al., 2016; Phan et al., 2019), an intensity distribution of a target lesion is first built and used to effectively locate regions where the lesions are possibly situated. The intensity distribution can be obtained by investigating intensity values within available GT lesion segmentations or can be provided as prior information. More specifically, an intensity-based lesion probability (ILP) function constructed from an intensity histogram is used to compute the probability of being a lesion for each voxel, and this soft label map is provided for network training as an auxiliary task. It informs the network about our region of interest, which could contain target lesions, based on the intensity. Compared to organ segmentation trained jointly with lesion identification tasks, our new task can be understood as a soft and possibly disconnected surrogate of organ segmentation, and it requires no additional annotation cost.
We demonstrate the effectiveness of the proposed method by conducting experiments on three different datasets: 1) an in-house small bowel carcinoid tumor dataset, 2) the KiTS21 dataset (Heller et al., 2021) for kidney tumors, and 3) the LNDb dataset for lung nodules (Pedrosa et al., 2019). The main contributions of our work are as follows. (1) We extend the idea of our previous paper (Shin et al., 2023) to the segmentation of different lesions at different body locations to verify its generalizability. (2) We further investigate the effectiveness of the proposed method in several aspects, namely with varied amounts of training data, in comparison to joint organ segmentation, and even on a detection task.

2. Datasets
2.1. Small Bowel Carcinoid Tumor Dataset
Carcinoid tumor is a rare neoplasm (small bowel neoplasms, including carcinoid tumors, account for 0.5% of all cancers in the United States (Jasti and Carucci, 2020)) and is found predominantly within the gastrointestinal tract (50 − 71.4%), especially in the small bowel (24 − 44%) (Hughes et al., 2016). These tumors are often less than a centimeter in size (Hughes et al., 2016).
Our carcinoid tumor dataset is composed of 24 preoperative abdominal CT scans collected at the National Institutes of Health Clinical Center. Each scan is from a unique patient who underwent surgery and had at least one carcinoid tumor within the small bowel. We note that creating a large dataset for small bowel carcinoid tumors is more difficult than for other more prevalent diseases.
All scans are intravenous and oral contrast-enhanced. Volumen was used as the oral contrast agent. Each patient has both arterial and venous phase scans, and either of them was selectively used according to the relevant description in the corresponding radiology report (18 arterial and 6 venous phase scans). The scans were acquired with slice thicknesses of 0.5, 1, or 2 mm. All scans were cropped manually along the z-axis to include from the diaphragm through the pelvis. We will call this the SBCT dataset.
To obtain GT segmentations of tumors, we used the "Segment Editor" module in 3D Slicer (Fedorov et al., 2012). The corresponding radiology report and an available ¹⁸F-DOPA PET scan were referred to for help in locating tumors. In total, 88 tumors were annotated. We use five-fold cross-validation for this dataset.

2.2. The KiTS21 Dataset


Kidney cancer is the sixth and the ninth most common cancer for men and women, respectively, in the United States (Cancer.Net, 2022). The KiTS21 dataset aims to accelerate the development of automatic segmentation tools for renal tumors and surrounding anatomy. The KiTS21 cohort includes patients who underwent nephrectomy for suspected renal malignancy (Heller et al., 2021). Preoperative CT scans of these patients were collected to compose the dataset. The official training set comprises 300 CT scans from 300 unique patients who had at least one kidney tumor.

All scans are contrast-enhanced and were acquired in the late arterial phase. Every scan has corresponding GT segmentations of the kidney, tumor, and cyst. In this work, we focus on segmenting tumors while leaving cysts unattended, since cysts are benign and clinically less relevant than tumors. We refer the reader to the dataset description paper (Heller et al., 2021) for more information. For experiments, we divide the dataset into training/validation/test sets at a ratio of 7:1:2.

2.3. The LNDb Dataset


Lung cancer is the leading cause of cancer death, accounting for almost 25% of all cancer deaths (American Cancer Society, 2022). Being a possible indicator of lung cancer, lung nodules show various shapes and characteristics. Thus, their identification and characterization are not trivial and are prone to high inter-observer variability (Pedrosa et al., 2019).

The LNDb dataset includes 294 intravenous contrast-enhanced CT scans from 294 unique patients. Fifty-eight scans among them were withheld by the organizers for the test set, and the remaining 236 scans are available. Among all identified lesions (nodule ≥ 3 mm, nodule < 3 mm, non-nodule), only nodules that are greater than or equal to 3 mm were segmented during the annotation process, and they are the ones segmented in this work. Thirty-five scans were excluded because they have an empty segmentation map for the above-mentioned reason, resulting in 201 remaining scans (236 − 35 = 201).
In this work, we especially focus on improving the segmentation of non-solid nodules since they are more likely to be malignant than solid nodules and are more difficult to identify due to their fuzzy appearance and lower incidence (Diederich, 2009). We utilized the nodule texture ratings (1 − 5) provided in the dataset, where 1 denotes closer to non-solid and 5 denotes closer to solid. According to these ratings, we classified each segmented nodule into two groups, namely, non-solid (≤ 2) or solid (> 2), as sketched below. Nineteen scans were identified to have at least one non-solid nodule after manual inspection. We note that these 19 scans can also contain solid nodules. Finally, they are used for two-fold cross-validation, while the remaining 182 (= 201 − 19) scans are included as training images in every fold.
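For illustration only, the grouping rule could be written as in the following sketch; the function name is our assumption, not part of the LNDb tooling or the authors' code.

```python
# Illustrative sketch of the texture-based grouping rule described above.
# 'rating' is a segmented nodule's texture rating (1-5) from the LNDb annotations.
def nodule_group(rating: int) -> str:
    return "non-solid" if rating <= 2 else "solid"
```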

3. Methods
3.1. Intensity Distribution Supervision
Figure 1 presents the intensity histogram of the target lesions of each dataset. The histograms were computed by aggregating intensity values within the GT lesion segmentations of each dataset. Images were smoothed using anisotropic diffusion (Black et al., 1998) before the histogram construction. To make a smooth, evaluable function from the discontinuous histogram, we perform kernel density estimation (Parzen, 1962), a method that estimates the probability density function using kernels as basis functions. We used the 'gaussian_kde' function of the SciPy Python library, which uses Gaussian kernels with automatic bandwidth determination. The resulting function is then rescaled to have a maximum value of 1. While it could be less precise, the ILP function can also be provided by a user as prior information.
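As an illustration, the construction of the ILP function could be sketched as follows, assuming the CT volumes and GT masks are available as NumPy arrays; the function name and array layout are our assumptions, and the anisotropic diffusion smoothing is omitted for brevity.

```python
import numpy as np
from scipy.stats import gaussian_kde

def build_ilp_function(ct_volumes, gt_masks):
    """Estimate the ILP function from intensities inside the GT lesion masks."""
    # Aggregate HU values of all lesion voxels across the training scans.
    lesion_hu = np.concatenate(
        [ct[mask > 0] for ct, mask in zip(ct_volumes, gt_masks)]
    )
    # Kernel density estimation with Gaussian kernels and automatic bandwidth.
    kde = gaussian_kde(lesion_hu)
    # Rescale so the function has a maximum value of 1; the peak is
    # approximated on a dense grid over the observed intensity range.
    grid = np.linspace(lesion_hu.min(), lesion_hu.max(), 1000)
    peak = kde(grid).max()
    return lambda hu: kde(hu) / peak
```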
The resulting ILP functions are superimposed on their corresponding histograms in Figure 1. They enable faster calculation of the ILP for a large set of voxels (a whole CT scan) than using the histogram. The ILP function, $f^{\mathrm{ILP}}$, is used to compute the probability of being part of the target lesion for each voxel according to its intensity value. Given an input image volume $X = \{x_i\}_{i=1}^{N}$, the corresponding ILP volume $Y^{\mathrm{ILP}}$ is defined as:

$$Y^{\mathrm{ILP}} = \{y_i^{\mathrm{ILP}}\}_{i=1}^{N} = \{f^{\mathrm{ILP}}(x_i)\}_{i=1}^{N}, \quad (1)$$

where $N$ is the number of voxels. An example of the computed ILP volume is visualized in Figure 2. It is then provided to a network as the label map of an auxiliary task. It informs the network about our region of interest, which could contain target lesions, especially in terms of intensity values. Compared to organ segmentation trained jointly with lesion identification tasks, our new task can be understood as a soft and possibly disconnected surrogate of organ segmentation, and it requires no additional labeling effort.


Figure 1: Intensity histogram (gray) and intensity-based lesion probability (ILP) function (black) of the target lesions for each dataset. (a) Small bowel carcinoid tumors of the SBCT dataset. (b) Kidney tumors of the KiTS21 dataset. (c) Non-solid lung nodules of the LNDb dataset. (d) Solid lung nodules of the LNDb dataset, provided for comparison to the non-solid nodules. In each sub-figure, the intensity histogram and the ILP function have different scales, so the left and right y-axes should be read accordingly. Refer to the text for an explanation of the ILP function.

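A minimal sketch of computing the ILP volume of Eq. (1) for one scan, given an ILP function like the one above, could use a precomputed lookup table for speed; the names are illustrative, not the authors' implementation.

```python
import numpy as np

def compute_ilp_volume(ct_volume, f_ilp, num_bins=4096):
    """Evaluate f_ILP for every voxel of a CT volume via a lookup table."""
    lo, hi = float(ct_volume.min()), float(ct_volume.max())
    # Evaluating the KDE once on a grid and interpolating is much faster
    # than evaluating it voxel by voxel on a whole CT scan.
    grid = np.linspace(lo, hi, num_bins)
    table = f_ilp(grid)
    return np.interp(ct_volume, grid, table).astype(np.float32)
```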

3.2. Network Training


3.2.1. Lesion Segmentation Network
The proposed intensity distribution supervision can be easily used for a lesion segmentation network. Figure 2 visualizes the data used for the network training. Given an input image volume $X$, the corresponding ILP volume $Y^{\mathrm{ILP}}$ is generated using the ILP function $f^{\mathrm{ILP}}$. Then, it is used as supervision for network training together with the segmentation GT, $Y^{\mathrm{segm}}$. For simplicity, we use a network with two output channels, which is similar to the one for joint liver and liver tumor segmentation (Tang et al., 2020). However, the second output channel of our network predicts the ILP in place of organ segmentation. The generated GT ILP volume $Y^{\mathrm{ILP}}$ is used as supervision for this channel.

Figure 2: Augmenting a lesion segmentation network with the intensity distribution supervision. Data involved in training (examples from the small bowel carcinoid tumor dataset) are visualized. Given a network that is usually trained using pairs of an input image volume $X$ and corresponding GT segmentation $Y^{\mathrm{segm}}$, it is augmented using an additional supervision $Y^{\mathrm{ILP}}$, which represents the probability of being the target lesion for each voxel according to its intensity value. The GT ILP map $Y^{\mathrm{ILP}}$ for training is generated from the input volume $X$ using the ILP function $f^{\mathrm{ILP}}$ of Figure 1, at no additional labeling cost. The network now predicts two outputs, namely, lesion segmentation and ILP. They are compared against the GT labels $Y^{\mathrm{segm}}$ and $Y^{\mathrm{ILP}}$ to compute their respective losses $L_{\mathrm{segm}}$ and $L_{\mathrm{ILP}}$.

A new loss term for the added task, $L_{\mathrm{ILP}}$, is incorporated into training accordingly, as shown in Figure 2. Cross-entropy loss is used to measure the dissimilarity between the GT and the prediction of the ILP. Finally, the overall loss function for training the lesion segmentation network is defined as:

$$L = L_{\mathrm{segm}} + \lambda L_{\mathrm{ILP}}, \quad (2)$$

where $L_{\mathrm{segm}}$ is the segmentation loss and $\lambda$ is the relative weight for the ILP loss $L_{\mathrm{ILP}}$. We use the generalized Dice loss (Sudre et al., 2017) for $L_{\mathrm{segm}}$.
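For concreteness, a minimal PyTorch sketch of Eq. (2) is given below; it assumes a two-channel network output and simplifies the generalized Dice loss of Sudre et al. (2017) to a standard soft Dice, so it is an illustration under our assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def segmentation_loss(seg_logits, ilp_logits, gt_seg, gt_ilp, lam=0.1, eps=1e-6):
    """L = L_segm + lambda * L_ILP, with soft Dice standing in for generalized Dice."""
    seg_prob = torch.sigmoid(seg_logits)
    inter = (seg_prob * gt_seg).sum()
    l_segm = 1 - (2 * inter + eps) / (seg_prob.sum() + gt_seg.sum() + eps)
    # Cross-entropy between the soft ILP label map and the ILP prediction.
    l_ilp = F.binary_cross_entropy_with_logits(ilp_logits, gt_ilp)
    return l_segm + lam * l_ilp
```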

Figure 3: The use of intensity distribution supervision for lesion detection networks. The Retina Net (Lin et al., 2017b) and Retina U-Net (Jaeger et al., 2020) are exemplified here. The solid line part represents the Retina Net architecture. 'cls' and 'bb' denote classification and box regression, respectively. The Retina U-Net is implemented by stacking additional high-resolution feature levels onto the decoding part of the Retina Net and performing lesion segmentation on top of them. For this, GT segmentations are assumed to be available together with detection GTs. Our ILP map $Y^{\mathrm{ILP}}$, generated from each input volume $X$ using the ILP function $f^{\mathrm{ILP}}$, can effectively replace the GT segmentation.

3.2.2. Lesion Detection Network


Our intensity distribution supervision can also be used to enhance detectors that are based on feature pyramid networks (FPNs) (Lin et al., 2017a), such as the Retina Net (Lin et al., 2017b), with the same philosophy as the Retina U-Net (Jaeger et al., 2020). Figure 3 explains the concept. In the Retina U-Net, to exploit available GT segmentations of lesions, the decoding part of the Retina Net is augmented with additional high-resolution feature levels, and semantic lesion segmentation is performed on top of them. Despite its effectiveness, it is not feasible if the GT segmentation is unavailable.

Our ILP map $Y^{\mathrm{ILP}}$, which is generated from each input image volume $X$ using the ILP function $f^{\mathrm{ILP}}$, can replace the GT segmentation. The same ILP loss $L_{\mathrm{ILP}}$ is applied as in the segmentation network. Finally, the overall loss function for training the lesion detection network is defined as:

$$L = L_{\mathrm{det}} + \lambda L_{\mathrm{ILP}}, \quad (3)$$

where $L_{\mathrm{det}}$ is the typical detection loss for classification and box regression, and $\lambda$ is the relative weight for $L_{\mathrm{ILP}}$.
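Under the same assumptions as before, the detection loss of Eq. (3) could be sketched as follows; plain binary cross-entropy and smooth L1 losses stand in for the focal and box regression losses of the actual Retina Net heads.

```python
import torch.nn.functional as F

def detection_loss(cls_logits, cls_targets, box_preds, box_targets,
                   ilp_logits, gt_ilp, lam=0.003):
    """L = L_det + lambda * L_ILP for an FPN-based detector."""
    l_det = (F.binary_cross_entropy_with_logits(cls_logits, cls_targets)
             + F.smooth_l1_loss(box_preds, box_targets))
    # The ILP map supervises the auxiliary high-resolution head in place
    # of the GT lesion segmentation used by the Retina U-Net.
    l_ilp = F.binary_cross_entropy_with_logits(ilp_logits, gt_ilp)
    return l_det + lam * l_ilp
```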

3.3. Evaluation Details


3.3.1. Lesion Segmentation
We first used our own version of the 3D U-Net (Çiçek et al., 2016) to have more control over the training/test procedures and thus verify the pure impact of using the proposed intensity distribution supervision. Then, we further combined it with the self-configuring nnU-Net (Isensee et al., 2021) to achieve more optimized performance. Within this framework, the '3D' full-resolution 'U-Net' was used again but with higher complexity in terms of network size, data augmentation, and test-time methods. We note that the proposed method of using intensity information can be used with any other segmentation network.
The ILP functions in Figure 1 were used for each dataset. For the LNDb dataset in particular, we used the function of non-solid nodules (Figure 1(c)) to emphasize them more during training, since our goal is to improve their segmentation. Their distribution is wider than that of small bowel carcinoid tumors or kidney tumors because they are located in the lung parenchyma. Nevertheless, it is distinguishable from that of solid nodules (Figure 1(d)).
Hyperparameters related to each method and each dataset are summarized in Table 1. The learning rates and λ were chosen through grid search for both methods. While the other values were also chosen through grid search by ourselves for the 3D U-Net, they were chosen automatically for the self-configuring nnU-Net. We used the AdamW optimizer (Loshchilov and Hutter, 2019) for the 3D U-Net. SGD with a momentum of 0.99 was used for the nnU-Net. In all of the implemented networks, 3×3×3 convolution kernels are used, except 1×1×1 kernels for the final inference layer.

For data augmentation, the various geometric and photometric augmentation methods available in the nnU-Net implementation (https://github.com/MIC-DKFZ/nnUNet) were used as is for the nnU-Net. The whole set of photometric augmentations was turned on or off in its entirety to check their relevance for each dataset. Meanwhile, selective sets of augmentations were used for the 3D U-Net after investigating the effect of each method on each dataset. While only image rotation was used for the SBCT and the LNDb datasets, image scaling and elastic deformations were used as well for the KiTS21 dataset. At test time, a test-time augmentation method of image mirroring was used for the nnU-Net.

                 3D U-Net (Çiçek et al., 2016)                          nnU-Net (Isensee et al., 2021)
                 SBCT              KiTS21              LNDb             KiTS21            LNDb
Learning rate    3 × 10⁻⁴          10⁻³                3 × 10⁻⁵         10⁻²              10⁻³
Weight decay     5 × 10⁻⁴          5 × 10⁻⁴            5 × 10⁻⁴         3 × 10⁻⁵          3 × 10⁻⁵
λ                1                 0.1                 0.01             0.1               0.003
# channels       {8, 16, 32, 64}   {32, 64, 128, 256}  {8, 16, 32, 64}  {32, 64, 128, 256, 320, 320}
Spacing          1 × 1 × 1 mm³     2 × 2 × 2 mm³       1 × 1 × 1 mm³    1 × 1 × 1 mm³     1 × 1 × 1 mm³
Patch size       224 × 224 × 224   112 × 112 × 112     224 × 224 × 224  160 × 112 × 128   128 × 128 × 128
Batch size       1                 1                   1                2                 2

Table 1: Summary of the hyperparameters of the lesion segmentation networks for each dataset. λ is the relative weight for $L_{\mathrm{ILP}}$ in Eq. 2. For each dataset, training is conducted using image patches of the given size, sampled from image volumes resampled to the given resolution (spacing). Refer to the text for an explanation of how the values were chosen.

For evaluation, we use per case and per lesion Dice scores. The per case
Dice score denotes an average Dice score per scan. In calculating the per
lesion Dice scores, tight local image volumes around each tumor were taken
into account. Paired t-tests are conducted to show the statistical significance
of the proposed method. We used an NVIDIA Tesla V100 32GB GPU to
conduct experiments.
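As an illustration, the per case Dice score could be computed as in the following sketch; the per lesion variant would restrict the computation to a tight local volume around each GT lesion. The function name is ours.

```python
import numpy as np

def per_case_dice(pred, gt, eps=1e-6):
    """Dice score for one scan; the per case score averages this over scans."""
    pred, gt = pred > 0, gt > 0
    inter = np.logical_and(pred, gt).sum()
    return (2 * inter + eps) / (pred.sum() + gt.sum() + eps)
```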

3.3.2. Lesion Detection


For all compared methods, the same backbone FPN (Lin et al., 2017a) based on a ResNet50 (He et al., 2016) was used. We used the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 10⁻⁴. A value of 0.003 was used for λ in Eq. 3. Training was conducted using image patches of size 96 × 96 × 64, which were sampled from scans resampled to isotropic voxels of 2 × 2 × 2 mm³. A batch size of 8 was used. For data augmentation, image scaling, rotation, mirroring, and elastic deformations were used.

For experiments, we used the KiTS21 dataset, which has the largest number of scans. We report average precision (AP) with an intersection-over-union threshold of 0.1, following the method of Jaeger et al. (2020).

Method                         Per case               Per lesion             Per lesion (≥ 125 mm³)
                               Dice (%)     p-value   Dice (%)     p-value   Dice (%)     p-value
3D U-Net (Çiçek et al., 2016)  41.3 ± 27.2  0.0022    30.0 ± 36.7  0.0398    37.7 ± 36.4  0.1002
3D U-Net + PP                  36.2 ± 25.3  0.0001    24.7 ± 32.5  0.0005    30.5 ± 31.3  0.0026
3D U-Net + ILP(in)             41.6 ± 29.2  0.0860    32.8 ± 39.0  0.1885    37.9 ± 38.3  0.2133
3D U-Net + ILP                 47.8 ± 29.6  -         35.9 ± 40.0  -         42.6 ± 39.5  -
3D U-Net + ILP(shifted)        41.1 ± 30.4  0.0084    30.1 ± 38.3  0.0296    32.4 ± 37.7  0.0025

Table 2: Results of segmentation methods that use the ILP in different ways on the SBCT
dataset. ‘3D U-Net’ denotes the baseline method that performs only lesion segmentation;
‘3D U-Net + PP’ denotes applying post-processing (PP) that is based on the ILP to
the results of ‘3D U-Net’; ‘3D U-Net + ILP(in)’ denotes using the ILP volume as an
additional input channel instead of as an additional supervision; ‘3D U-Net + ILP’ denotes
the proposed method; ‘3D U-Net + ILP(shifted)’ denotes the proposed method but using
a shifted ILP function, which would be irrelevant with the target tumor. Dice scores
were calculated at two different subject levels, namely, per case and per lesion. Refer to
the text for an explanation of each of the metrics. Mean and standard deviation values
are presented together. P-values are computed by conducting paired t-tests between the
proposed method and the others with the Dice scores.

4. Results
4.1. Lesion Segmentation
4.1.1. Experiments on the SBCT Dataset
Quantitative Results. Table 2 presents quantitative results of segmentation methods that differ in the way they use the intensity distribution information. We used the 3D U-Net (Çiçek et al., 2016) to verify the pure effect of the different usages of the intensity distribution information.
Applying post-processing to the prediction of the segmentation network, where the ILP volume $Y^{\mathrm{ILP}}$ is multiplied with the network-predicted probability map, rather worsened the performance ('3D U-Net + PP'). This post-processing could too simply rule out lesions that have intensity values deviating from the built intensity distribution. We also tried using the ILP volume as an additional input channel instead of as additional supervision ('3D U-Net + ILP(in)'). It can be another way to highlight our region of interest at the input level. However, it performed merely on par with the baseline that does not use this additional information.
On the other hand, the proposed method, '3D U-Net + ILP', showed clear improvements for all types of Dice scores when compared to the baseline. The proposed method of using the intensity distribution supervision does not entail any additional labeling effort. The ILP function can be constructed and included in training by looking up already available CT scans and the corresponding GT tumor segmentations.
Figure 4: Example segmentation results on the SBCT dataset. Each row represents a different case. The columns, from left, represent the input CT scan, a zoomed view of the red box in the CT scan (carcinoid tumors are indicated by the red arrows), the corresponding GT segmentation, the result of the baseline method ('3D U-Net' in Table 2), and the result of the proposed method ('3D U-Net + ILP' in Table 2), respectively.

We also investigated the effect of having a precise intensity model of the target. '3D U-Net + ILP(shifted)' is the proposed method but uses another ILP function that is shifted by +100 HU from the original one. The shifted function no longer reflects the actual intensity distribution of the target. It performed worse than the baseline.
All methods, including the proposed one, showed higher Dice scores for relatively larger tumors (≥ 125 mm³, which is approximately ≥ 6 mm in diameter) than for all tumors.

Qualitative Results. Figure 4 presents example segmentation results for small bowel carcinoid tumors. Compared to the baseline method, which is trained without the intensity distribution supervision, the proposed method segments more tumors (first and second rows). The last row shows a failure case, where the proposed method missed a blurry small tumor.

Method                          Per case               Per lesion
                                Dice (%)     p-value   Dice (%)     p-value
3D U-Net (Çiçek et al., 2016)   62.0 ± 34.8  0.0216    66.7 ± 36.0  0.0934
3D U-Net + organ                68.3 ± 30.5  0.3713    71.6 ± 31.4  0.5727
3D U-Net + ILP                  69.2 ± 28.5  -         71.2 ± 30.8  -
nnU-Net (Isensee et al., 2021)  74.2 ± 26.8  0.1494    77.6 ± 27.1  0.1742
nnU-Net + photo aug.            74.2 ± 29.8  0.1072    76.9 ± 31.0  0.0887
nnU-Net + organ                 80.4 ± 23.7  0.9857    80.3 ± 27.4  0.7983
nnU-Net + ILP                   76.0 ± 27.1  -         79.1 ± 27.9  -
Zhao et al. (2021)              74.9 ± 27.0  0.1264    81.3 ± 22.3  0.1716
Zhao et al. (2021) + ILP        77.0 ± 23.3  -         83.0 ± 18.1  -

Table 3: Results of different segmentation methods on the KiTS21 dataset. The first three methods are based on the 3D U-Net and the next four on the nnU-Net. The last two are based on the winning method of the KiTS21 challenge. For the self-configuring nnU-Net, the '3D' full-resolution 'U-Net' is used again, but with higher complexity in terms of network size, data augmentation, and test-time methods. '+ organ' denotes performing organ segmentation jointly with tumor segmentation; '+ ILP' denotes the proposed method; '+ photo aug.' denotes utilizing photometric augmentations as well as geometric augmentations in training.


4.1.2. Experiments on the KiTS21 Dataset


Quantitative Results. Table 3 presents quantitative results of different segmentation methods on the KiTS21 dataset. We first used different versions of the 3D U-Net that were augmented using different additional supervision, and again used the nnU-Net for more optimized performance. We also incorporated the proposed intensity distribution supervision in training the winning method of the KiTS21 challenge (Zhao et al., 2021).
In the 3D U-Net based comparison, the proposed method ('3D U-Net + ILP') outperformed the baseline ('3D U-Net'). We further compared it against a multi-task learning network that performs organ (kidney) segmentation together with lesion (kidney tumor) segmentation, which is '3D U-Net + organ' in Table 3. While our ILP map informs the network about our region of interest, which could contain target lesions, in terms of intensity values, organ segmentation supervision could do the same in a stricter way, i.e., kidney tumors can exist within the kidney. The proposed method performed on par with the organ-segmentation-augmented method, which requires additional labeling effort while the proposed method does not.
The proposed method ('nnU-Net + ILP') still outperformed the baseline ('nnU-Net') when the nnU-Net was used. The test-time augmentation method of the nnU-Net could decrease the performance gap by benefiting an under-performing method more. We note that the proposed method may not harmonize well with photometric augmentations, since they randomly distort the original voxel values and thus can change the physical meaning that each voxel originally has on CT scans. We found in this dataset that photometric augmentations do not really help in improving performance even for the baseline method ('nnU-Net + photo aug.'). Thus, only geometric augmentations were used. 'nnU-Net + organ' showed better performance than the proposed method, but it used an additional annotation of the kidney.
We also incorporated the proposed intensity distribution supervision in training the winning method of the KiTS21 challenge (Zhao et al., 2021). The method is composed of three steps (networks): coarse kidney segmentation, fine kidney segmentation, and tumor segmentation. It therefore uses GT segmentations of the kidney to train the first and second networks, and of the tumor to train the last network. Since there is no publicly available code, we used our own implementation for the experiment. When the intensity distribution supervision was incorporated in the last tumor segmentation step at no additional labeling cost, better performance was again achieved ('Zhao et al. (2021) + ILP').
Figure 5 shows the segmentation performances on the KiTS21 dataset depending on the number of training images. Given the original training set of 210 images, 90, 120, 150, or 180 images were randomly sampled to conduct the experiments. The same validation and test sets of 30 and 60 images, respectively, were used for all training set sizes. The proposed method consistently outperformed the baseline in all experiments, with a margin of around 2%.

Qualitative Results. Figure 6 shows example segmentation results on the KiTS21 dataset. The proposed method segments tumors more precisely (first and second rows) by utilizing the intensity distribution supervision when compared to the baseline. The last row shows a failure case, where the tumor was missed by both the baseline and proposed methods.


Figure 5: Segmentation performances of the baseline nnU-Net and the proposed method
depending on the number of training images on the KiTS21 dataset. Per case Dice scores
are reported.

4.1.3. Experiments on the LNDb Dataset


Quantitative Results. Table 4 presents quantitative segmentation results on the LNDb dataset. As mentioned in Section 2, we aimed to improve the segmentation of non-solid nodules in this work by incorporating their intensity distribution information into network training. For both the 3D U-Net and nnU-Net, the inclusion of the intensity distribution supervision ('+ ILP' in Table 4) helped in segmenting non-solid nodules better, thus also improving the performance for all nodules, except for the per lesion Dice scores of the 3D U-Net. The per lesion Dice score, by definition, does not take into account false positives (FPs) that are apart from GT lesions. Instead, it focuses on segmentation quality around GT lesions. Therefore, false negatives (FNs) are weighted more heavily than FPs in its calculation. The proposed method with the 3D U-Net reduced FPs but induced FNs, which led to increased per case Dice scores but decreased per lesion Dice scores. Nevertheless, the added intensity distribution supervision on non-solid nodules helped in segmenting them while overcoming their fuzzy appearance and underrepresentation in the dataset.

Figure 6: Example segmentation results on the KiTS21 dataset. Each row represents a different case. The columns, from left, represent the input CT scan (kidney tumors are indicated by the red arrows), the corresponding GT segmentation, the result of the baseline method ('nnU-Net' in Table 3), and the result of the proposed method ('nnU-Net + ILP' in Table 3), respectively.

Qualitative Results. Figure 7 presents example segmentation results on the LNDb dataset. Compared to the baseline method, the proposed method segments more nodules (first and second rows).

4.2. Lesion Detection


4.2.1. Quantitative Results
Table 5 presents quantitative results of detection methods that differ in how they augment the baseline Retina Net (Lin et al., 2017b) on the KiTS21 dataset. The network architecture of each method is explained in Figure 3 and the corresponding text. The Retina U-Net (Jaeger et al., 2020), which exploits lesion segmentation supervision assumed to be available together with detection GTs, outperformed the Retina Net, as suggested in the work of Jaeger et al. (2020). When the lesion segmentation supervision was replaced with the proposed ILP supervision, it outperformed the baseline Retina Net again and further outperformed the Retina U-Net. While the ILP function could be less precise, it can be constructed using a small number of GT lesion segmentations or can even be provided by a user as prior information. Also, while the Retina Net was used as the baseline here, the proposed method can be used to enhance any detector that is based on FPNs in the same manner.

Method                          Nodule     Per case               Per lesion
                                texture    Dice (%)     p-value   Dice (%)     p-value
3D U-Net (Çiçek et al., 2016)   all        21.6 ± 21.7  0.3022    38.1 ± 36.4  0.9573
                                non-solid  10.9 ± 17.1  0.0782    26.0 ± 33.6  0.7471
3D U-Net + ILP                  all        24.0 ± 26.7  -         32.5 ± 34.4  -
                                non-solid  17.6 ± 24.8  -         22.5 ± 30.4  -
nnU-Net (Isensee et al., 2021)  all        26.4 ± 27.4  0.0588    28.0 ± 36.5  0.0283
                                non-solid  10.4 ± 21.9  0.1537    13.8 ± 27.4  0.0587
nnU-Net + ILP                   all        32.7 ± 25.2  -         33.8 ± 36.6  -
                                non-solid  14.0 ± 23.6  -         21.0 ± 31.2  -

Table 4: Results of different segmentation methods on the LNDb dataset. The proposed methods that use the ILP supervision (+ ILP) are compared with the baseline 3D U-Net and nnU-Net. For each method, the performances on all nodules and on only non-solid nodules are presented.

Method                              Average precision (%)
Retina Net (Lin et al., 2017b)      64.6
Retina U-Net (Jaeger et al., 2020)  72.5
Retina U-Net - Segm + ILP           75.5

Table 5: Results of different detection methods on the KiTS21 dataset. Network archi-
tecture of each method is visually explained in Figure 3. ‘Retina U-Net - Segm + ILP’
denotes the proposed method that uses the ILP supervision in place of lesion segmenta-
tion, which is used for the Retina U-Net.


4.2.2. Qualitative Results


Figure 8 shows example detection results on the KiTS21 dataset. The incorporated intensity distribution information helped in locating a tumor (first row) and eliminating a false positive (second row). The bottom two rows represent failure cases. In the third row, a false positive was detected by the proposed method on the heterogeneous stomach, which resembles a kidney with a tumor in appearance. In the last row, the tumor that has similar intensity values to the rest of the kidney was missed by the proposed method.

Figure 7: Example segmentation results on the LNDb dataset. Each row represents a different case. The columns, from left, represent the input CT scan (lung nodules, which are all non-solid in these examples, are indicated by the red arrows), the corresponding GT segmentation, the result of the baseline method ('nnU-Net' in Table 4), and the result of the proposed method ('nnU-Net + ILP' in Table 4), respectively.


5. Discussion
We have presented a method to incorporate the intensity information of a target lesion on CT scans in training segmentation and detection networks. An ILP function constructed from an intensity histogram of a target lesion is used to effectively locate regions where the lesions are possibly situated. The ILP map of each input CT scan is provided as additional supervision for network training. It aims to inform the network about our region of interest, which could contain target lesions, especially in terms of intensity values. It requires no additional labeling effort.

Figure 8: Example detection results on the KiTS21 dataset. Each row represents a different case. The columns, from left, represent the input CT scan (the available kidney tumor segmentation is overlaid to locate the tumors), the result of the baseline method ('Retina Net' in Table 5), and the result of the proposed method ('Retina U-Net - Segm + ILP' in Table 5), respectively.

The method has been applied to improve the segmentation of three different lesion types, namely, small bowel carcinoid tumors, kidney tumors, and lung nodules. The effectiveness of the proposed method on a detection task has also been investigated for kidney tumors. Our findings from the experiments are: 1) The proposed method of using the ILP as additional supervision performs better than other usages of it, such as for post-processing and as an additional input channel (Table 2). 2) Having a precise and generalizable intensity distribution is important for the success of the method (Table 2). 3) It can be effectively used with the nnU-Net for more optimized performance (Table 3). 4) It performs favorably against a method that exploits another supervision such as organ segmentation (Table 3). 5) Consistent performance gains can be expected over varying training set sizes (Figure 5). 6) It can be considered to boost the performance of an underrepresented lesion type (Table 4). 7) It can be used to enhance a detector such as the Retina Net (Lin et al., 2017b) (Table 5).
Carcinoid tumors in our SBCT dataset are small (Figure 4). Lung nodules in the LNDb dataset are also small (mostly less than a centimeter (Pedrosa et al., 2019)), as exemplified in Figure 7. Even small numbers of false positive and false negative voxels have a big impact on the Dice score of small lesions. Nevertheless, the proposed method showed clear improvements compared to the baseline. We also note that we segmented nodules from an entire CT scan, whereas in the LNDb challenge, segmentation is conducted with nodule centroids given for each scan. Our task is more challenging, which again makes achieving high Dice scores difficult.
For the KiTS21 dataset, we also incorporated the proposed intensity distribution supervision in training the challenge-winning method (Zhao et al., 2021). Although the efficacy of the proposed method was verified, our result is not directly comparable with theirs since they used more training images (240 vs. 210). While they divided the dataset into only training and validation sets (a separate test set was available during the challenge period), we divided it into training/validation/test sets to enable a strict evaluation within the available data. Also, we did not use their post-processing method of counting the number of voxels in each connected component and thresholding the components by size, since such a heuristic is not always relevant.
In terms of network training, typical segmentation and detection losses are used together with the ILP loss for the segmentation and detection tasks, respectively (Eq. 2 and Eq. 3). The proposed method provides an additional opportunity to consider the intensity information of the target lesion in an explicit way, while retaining learning about other aspects through the typical loss terms. A lesion that is not distinct according to the ILP model can still be identified by the other aspects. For example, in the third row of Figure 8, the tumor that is not distinct from the kidney by intensity values was still detected by the proposed method.
In this work, for each target lesion, the experiments were conducted on a single dataset acquired using a particular imaging protocol. The proposed method would be less applicable across datasets acquired using different imaging protocols, since the intensity distribution of the target lesion can become diffuse and incoherent. Also, we adopted a relatively simple implementation for incorporating the intensity information of the target lesion into network training. A better approach for the same objective could be explored.
In future work, we plan to study the effect of incorporating unannotated images into training, since the proposed intensity distribution supervision enables training on them. The proposed method can also be applied to further target lesions.

Data Availability
The code is available at https://github.com/rsummers11/CADLab/tree/master/intensity_distribution_supervision

Acknowledgment
This research was supported by the Intramural Research Program of the National Institutes of Health, Clinical Center. The research used the high-performance computing facilities of the NIH Biowulf cluster.

Conflicts of Interest
Potential financial interest: Author Ronald M. Summers receives royalties
from iCAD, Philips, Scan Med, PingAn, and Translation Holdings and has
received research support from Ping An (CRADA).

References
American Cancer Society, 2022. Key statistics for lung cancer. URL: https://www.cancer.org/content/dam/CRC/PDF/Public/8703.00.pdf.

Ayalew, Y.A., Fante, K.A., Mohammed, M.A., 2021. Modified U-Net for liver cancer segmentation from computed tomography images with a new class balancing method. BMC Biomedical Engineering 3, 1–13.

Black, M., Sapiro, G., Marimont, D., Heeger, D., 1998. Robust anisotropic diffusion. IEEE Transactions on Image Processing 7, 421–432. doi:10.1109/83.661192.

Buzug, T.M., 2011. Computed tomography, in: Springer Handbook of Medical Technology. Springer, pp. 311–342.

Cai, J., Tang, Y., Yan, K., Harrison, A.P., Xiao, J., Lin, G., Lu, L., 2021. Deep lesion tracker: Monitoring lesions in 4D longitudinal imaging studies, in: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 15154–15164. doi:10.1109/CVPR46437.2021.01491.

Cancer.Net, 2022. Kidney cancer: Statistics. URL: https://www.cancer.net/cancer-types/kidney-cancer/statistics.

Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O., 2016. 3D U-Net: Learning dense volumetric segmentation from sparse annotation, in: Ourselin, S., Joskowicz, L., Sabuncu, M.R., Unal, G., Wells, W. (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, Springer International Publishing, Cham. pp. 424–432.

Diederich, S., 2009. Pulmonary nodules: do we need a separate algorithm for non-solid lesions? Cancer Imaging 9, S126.

Eisenhauer, E.A., Therasse, P., Bogaerts, J., Schwartz, L.H., Sargent, D., Ford, R., Dancey, J., Arbuck, S., Gwyther, S., Mooney, M., et al., 2009. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). European Journal of Cancer 45, 228–247.

Fedorov, A., Beichel, R., Kalpathy-Cramer, J., Finet, J., Fillion-Robin, J.C., Pujol, S., Bauer, C., Jennings, D., Fennessy, F., Sonka, M., Buatti, J., Aylward, S., Miller, J.V., Pieper, S., Kikinis, R., 2012. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magnetic Resonance Imaging 30, 1323–1341. doi:10.1016/j.mri.2012.05.001.

He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. doi:10.1109/CVPR.2016.90.

Heller, N., Isensee, F., Maier-Hein, K.H., Hou, X., Xie, C., Li, F., Nan, Y., Mu, G., Lin, Z., Han, M., Yao, G., Gao, Y., Zhang, Y., Wang, Y., Hou, F., Yang, J., Xiong, G., Tian, J., Zhong, C., Ma, J., Rickman, J., Dean, J., Stai, B., Tejpaul, R., Oestreich, M., Blake, P., Kaluzniak, H., Raza, S., Rosenberg, J., Moore, K., Walczak, E., Rengel, Z., Edgerton, Z., Vasdev, R., Peterson, M., McSweeney, S., Peterson, S., Kalapara, A., Sathianathen, N., Papanikolopoulos, N., Weight, C., 2021. The state of the art in kidney and kidney tumor segmentation in contrast-enhanced CT imaging: Results of the KiTS19 challenge. Medical Image Analysis 67, 101821. doi:10.1016/j.media.2020.101821.

Hughes, M.S., Azoury, S.C., Assadipour, Y., Straughan, D.M., Trivedi, A.N., Lim, R.M., Joy, G., Voellinger, M.T., Tang, D.M., Venkatesan, A.M., et al., 2016. Prospective evaluation and treatment of familial carcinoid small intestine neuroendocrine tumors (SI-NETs). Surgery 159, 350–357.

Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H., 2021. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods 18, 203–211.

Jaeger, P.F., Kohl, S.A.A., Bickelhaupt, S., Isensee, F., Kuder, T.A., Schlemmer, H.P., Maier-Hein, K.H., 2020. Retina U-Net: Embarrassingly simple exploitation of segmentation supervision for medical object detection, in: Dalca, A.V., McDermott, M.B., Alsentzer, E., Finlayson, S.G., Oberst, M., Falck, F., Beaulieu-Jones, B. (Eds.), Proceedings of the Machine Learning for Health NeurIPS Workshop, PMLR. pp. 171–183. URL: https://proceedings.mlr.press/v116/jaeger20a.html.

Jasti, R., Carucci, L.R., 2020. Small bowel neoplasms: A pictorial review. RadioGraphics 40, 1020–1038. doi:10.1148/rg.2020200011. PMID: 32559148.

Kamble, B., Sahu, S.P., Doriya, R., 2020. A review on lung and nodule segmentation techniques, in: Kolhe, M.L., Tiwari, S., Trivedi, M.C., Mishra, K.K. (Eds.), Advances in Data and Information Sciences, Springer Singapore, Singapore. pp. 555–565.

Kingma, D.P., Ba, J., 2015. Adam: A method for stochastic optimization, in: Bengio, Y., LeCun, Y. (Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings. URL: http://arxiv.org/abs/1412.6980.

Lin, T.Y., Dollar, P., Girshick, R., He, K., Hariharan, B., Belongie, S., 2017a. Feature pyramid networks for object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollar, P., 2017b. Focal loss for dense object detection, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

Lin, Y.P., Hsu, H.H., Ko, K.H., Chu, C.M., Chou, Y.C., Chang, W.C., Chang, T.H., 2016. Differentiation of malignant and benign incidental breast lesions detected by chest multidetector-row computed tomography: Added value of quantitative enhancement analysis. PLOS ONE 11, 1–11. doi:10.1371/journal.pone.0154569.

Liu, L., Tsui, Y.Y., Mandal, M., 2021. Skin lesion segmentation using deep learning with auxiliary task. Journal of Imaging 7. doi:10.3390/jimaging7040067.

Loshchilov, I., Hutter, F., 2019. Decoupled weight decay regularization, in: International Conference on Learning Representations. URL: https://openreview.net/forum?id=Bkg6RiCqY7.

Ma, J., Wei, Z., Zhang, Y., Wang, Y., Lv, R., Zhu, C., Gaoxiang, C., Liu, J., Peng, C., Wang, L., Wang, Y., Chen, J., 2020. How distance transform maps boost segmentation CNNs: An empirical study, in: Arbel, T., Ben Ayed, I., de Bruijne, M., Descoteaux, M., Lombaert, H., Pal, C. (Eds.), Proceedings of the Third Conference on Medical Imaging with Deep Learning, PMLR. pp. 479–492. URL: https://proceedings.mlr.press/v121/ma20b.html.

Parzen, E., 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065–1076. doi:10.1214/aoms/1177704472.

Pedrosa, J., Aresta, G., Ferreira, C., Rodrigues, M., Leitão, P., Carvalho, A.S., Rebelo, J., Negrão, E., Ramos, I., Cunha, A., Campilho, A., 2019. LNDb: A lung nodule database on computed tomography. URL: https://arxiv.org/abs/1911.08434, doi:10.48550/ARXIV.1911.08434.

Petersenn, S., Richter, P.A., Broemel, T., Ritter, C.O., Deutschbein, T., Beil, F.U., Allolio, B., Fassnacht, M., Group, G.A.S., 2015. Computed tomography criteria for discrimination of adrenal adenomas and adrenocortical carcinomas: analysis of the German ACC registry. European Journal of Endocrinology 172, 415–422.

Phan, A.C., Vo, V.Q., Phan, T.C., 2019. A Hounsfield value-based approach for automatic recognition of brain haemorrhage. Journal of Information and Telecommunication 3, 196–209. doi:10.1080/24751839.2018.1547951.

Ryu, H., Shin, S.Y., Lee, J.Y., Lee, K.M., Kang, H.J., Yi, J., 2021. Joint segmentation and classification of hepatic lesions in ultrasound images using deep learning. European Radiology 31, 8733–8742.

Shin, S.Y., Lee, S., Yun, I.D., Jung, H.Y., Heo, Y.S., Kim, S.M., Lee, K.M., 2015. A novel cascade classifier for automatic microcalcification detection. PLOS ONE 10, 1–22. doi:10.1371/journal.pone.0143725.

Shin, S.Y., Lee, S., Yun, I.D., Kim, S.M., Lee, K.M., 2019. Joint weakly and semi-supervised deep learning for localization and classification of masses in breast ultrasound images. IEEE Transactions on Medical Imaging 38, 762–774. doi:10.1109/TMI.2018.2872031.

Shin, S.Y., Shen, T.C., Wank, S.A., Summers, R.M., 2023. Improving small lesion segmentation in CT scans using intensity distribution supervision: application to small bowel carcinoid tumor, in: Iftekharuddin, K.M., Chen, W. (Eds.), Medical Imaging 2023: Computer-Aided Diagnosis, International Society for Optics and Photonics, SPIE. p. 124651S. doi:10.1117/12.2651979.

Sudre, C.H., Li, W., Vercauteren, T., Ourselin, S., Jorge Cardoso, M., 2017. Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations, in: Cardoso, M.J., Arbel, T., Carneiro, G., Syeda-Mahmood, T., Tavares, J.M.R., Moradi, M., Bradley, A., Greenspan, H., Papa, J.P., Madabhushi, A., Nascimento, J.C., Cardoso, J.S., Belagiannis, V., Lu, Z. (Eds.), Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer International Publishing, Cham. pp. 240–248.

Summers, R.M., Huang, A., Yao, J., Campbell, S.R., Dempsey, J.E., Dwyer, A.J., Franaszek, M., Brickman, D.S., Bitter, I., Petrick, N., Hara, A.K., 2006. Assessment of polyp and mass histopathology by intravenous contrast-enhanced CT colonography. Academic Radiology 13, 1490–1495. doi:10.1016/j.acra.2006.09.051.

Tang, Y., Tang, Y., Zhu, Y., Xiao, J., Summers, R.M., 2020. E²Net: An edge enhanced network for accurate liver and tumor segmentation on CT scans, in: Martel, A.L., Abolmaesumi, P., Stoyanov, D., Mateus, D., Zuluaga, M.A., Zhou, S.K., Racoceanu, D., Joskowicz, L. (Eds.), Medical Image Computing and Computer Assisted Intervention – MICCAI 2020, Springer International Publishing, Cham. pp. 512–522.

Weninger, L., Liu, Q., Merhof, D., 2020. Multi-task learning for brain tumor segmentation, in: Crimi, A., Bakas, S. (Eds.), Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, Springer International Publishing, Cham. pp. 327–337.

Zhao, Z., Chen, H., Wang, L., 2021. A coarse-to-fine framework for the 2021 kidney and kidney tumor segmentation challenge. URL: https://openreview.net/forum?id=6Py5BNBKoJt.
