3-Dimensional Deep Learning With Spatial Erasing for Unsupervised Anomaly Segmentation in Brain MRI
1 Introduction
In this paper, we propose to learn from entire 3D MRI volumes instead of sin-
gle 2D MRI slices using 3D instead of 2D unsupervised deep learning, shown
in Figure 1. Also, we extend the concept of spatial input erasing for regular-
ization. To this end, we provide an extensive comparison of variational au-
toencoders (VAE) with 3D and 2D convolutions and propose several different
3D spatial erasing strategies during training. For our experiments, we use a
training data set with brain MRI scans of 2008 healthy patients and evaluate
our methods on two publicly available brain segmentation data sets. We focus
on T1-weighted MRI data, which is widely used in clinics [10, 1], providing a
good starting point for anomaly detection. Moreover, we provide an analysis
of the impact and the importance of the training data set size, especially in
combination with our 3D approach.
For training, we consider a data set with anonymized T1-weighted MRI vol-
umes of 2008 healthy subjects from 22 scanners from different vendors. The
resolutions in the axial direction vary from 0.39 mm to 1.25 mm, with a majority of 1310 samples with 1 mm. The slice thickness lies between 0.90 mm and 2.40 mm, with a majority of 906 samples with 1 mm. 1506 samples were acquired with a field strength of 1.5 T, 446 samples with 3 T, and 56 with 1 T. Data on all scanners was acquired during clinical routine with a standard
3D gradient echo sequence. All scans were sent to jung diagnostics GmbH for
image analysis.
Fig. 1: Our approach for unsupervised anomaly segmentation using 3D deep learning combined with spatial input erasing. For the 2D network, only a single 2D slice xs is used as input x and volumetric spatial context remains unexploited. Instead, our novel 3D approach receives an entire volume xv as input x and learns combined features from all spatial dimensions. Also, we propose 3D spatial input erasing, where parts of the input are missing and the network is trained to restore missing image parts. Note, x̂s and x̂v refer to the network's reconstructions in 2D and 3D, respectively.

For evaluation, we use two publicly available data sets. First, we consider the Multimodal Brain Tumor Segmentation Challenge 2019 (BraTS 2019) data set [15, 2, 3] with T1-weighted image volumes of 335 subjects with the corresponding ground truth segmentation of the tumor. The
slice thickness of the BraTS 2019 data set varies from 1 mm up to 5 mm. Sec-
ond, we use the Anatomical Tracings of Lesions After Stroke (ATLAS) data
set [13], which provides T1-weighted image volumes of 304 subjects with cor-
responding ground truth segmentations of stroke regions. The slice thickness
of the ATLAS data set varies from 1 mm up to 3 mm.
For all image volumes, we apply the following preprocessing. First, we resam-
ple all scans to the same isotropic resolution of 1 mm × 1 mm × 1 mm using
cubic interpolation. Then, we follow the preprocessing of previous studies with
2D deep learning methods for UAD, which include skull stripping, denoising,
and standardization [4]. Next, we crop excessive background by using brain
masks of the MRI scans and zero-pad all MRI scans to the largest volume
resolution in our data set of 191 × 158 × 163. Last, we downsample all vol-
umes to a size of 64 × 64 × 64 to keep the computational complexity of 3D deep learning manageable. Regarding our data split for
training, we consider 1807 healthy images for training and 201 images for val-
idation of our reconstruction performance. We split our data randomly and
stratified by scanners. Considering the images of the BraTS 2019 data set,
we randomly sample 133 images for validation and 202 for testing. Using the
ATLAS data set, we randomly sample 121 and 183 images for validation and
testing, respectively.
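To make the preprocessing pipeline concrete, the following is a minimal sketch in Python/NumPy under stated assumptions: skull stripping and denoising are assumed to be done beforehand (a precomputed brain mask is passed in), SciPy's zoom is used for the cubic resampling, and padding is applied at the end of each axis only. It is an illustration of the described steps, not the exact implementation used in our study.

```python
import numpy as np
from scipy.ndimage import zoom

PAD_SHAPE = (191, 158, 163)      # largest brain-cropped volume in our data set
TARGET_SHAPE = (64, 64, 64)      # final network input size

def preprocess(volume, spacing_mm, brain_mask):
    """Sketch: resample to 1 mm isotropic, standardize, crop to the brain,
    zero-pad, and downsample. `brain_mask` comes from skull stripping (assumed given)."""
    # 1) Resample to 1 mm x 1 mm x 1 mm using cubic interpolation.
    volume = zoom(volume, zoom=spacing_mm, order=3)
    brain_mask = zoom(brain_mask.astype(float), zoom=spacing_mm, order=0) > 0.5

    # 2) Standardize intensities inside the brain (denoising omitted here).
    volume = volume * brain_mask
    volume = (volume - volume[brain_mask].mean()) / (volume[brain_mask].std() + 1e-8)

    # 3) Crop excessive background using the brain mask.
    coords = np.argwhere(brain_mask)
    lo, hi = coords.min(axis=0), coords.max(axis=0) + 1
    volume = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

    # 4) Zero-pad to the largest volume resolution in the data set (191 x 158 x 163).
    pad = [(0, max(0, PAD_SHAPE[i] - volume.shape[i])) for i in range(3)]
    volume = np.pad(volume, pad)

    # 5) Downsample to 64 x 64 x 64 for computational efficiency.
    factors = [t / s for t, s in zip(TARGET_SHAPE, volume.shape)]
    return zoom(volume, zoom=factors, order=3)
```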
Our general backbone network is shown in Figure 2. For the adaptation to 2D MRI slices or 3D MRI volumes, we employ 2D or 3D operations in the network, e.g., we use 2D or 3D convolutions. In this way, the architecture details remain the same for 2D and 3D, e.g., the number of layers and feature maps remains the same, and only the dimensionality of the network's operations is changed. Based on our validation set performance, we choose a latent space size of z ∈ R^{128} and z ∈ R^{512} for our 2D and 3D VAE, respectively.
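The following PyTorch sketch illustrates how the 2D and 3D variants share the same architecture details and differ only in the dimensionality of the operations. The overall structure loosely follows Figure 2 (the input of size 64 per axis is encoded down to 8 per axis before the fully connected layers); the exact number of convolutions per scale, the channel counts, the kernel size, and the activation are assumptions.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Minimal 2D/3D VAE backbone sketch; dim=2 or dim=3 selects the operations.
    Channel counts, kernel size, and activation are illustrative assumptions."""

    def __init__(self, dim=3, nz=512, channels=(32, 64, 128), input_size=64):
        super().__init__()
        Conv = {2: nn.Conv2d, 3: nn.Conv3d}[dim]
        ConvT = {2: nn.ConvTranspose2d, 3: nn.ConvTranspose3d}[dim]

        # Encoder: stride-2 convolutions, 64 -> 32 -> 16 -> 8 per axis.
        enc, in_ch = [], 1
        for ch in channels:
            enc += [Conv(in_ch, ch, kernel_size=5, stride=2, padding=2), nn.LeakyReLU()]
            in_ch = ch
        self.encoder = nn.Sequential(*enc)

        self.spatial = input_size // 2 ** len(channels)              # 8
        feat = channels[-1] * self.spatial ** dim
        self.fc_mu, self.fc_logvar = nn.Linear(feat, nz), nn.Linear(feat, nz)
        self.fc_dec = nn.Linear(nz, feat)
        self.dec_shape = (channels[-1],) + (self.spatial,) * dim

        # Decoder: mirrored transposed convolutions back to the input size.
        dec, in_ch = [], channels[-1]
        for ch in list(reversed(channels[:-1])) + [1]:
            dec += [ConvT(in_ch, ch, kernel_size=5, stride=2, padding=2, output_padding=1)]
            if ch != 1:
                dec.append(nn.LeakyReLU())
            in_ch = ch
        self.decoder = nn.Sequential(*dec)

    def forward(self, x):
        h = self.encoder(x).flatten(1)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)      # reparameterization
        h = self.fc_dec(z).view(x.size(0), *self.dec_shape)
        return self.decoder(h), mu, logvar
```

In this sketch, VAE(dim=2, nz=128) corresponds to the 2D VAE operating on single slices and VAE(dim=3, nz=512) to the 3D VAE operating on entire volumes.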
We study and extend the concept of cutout [9] and context-autoencoders [16],
which were proposed for 2D images. The main motivation behind our approach
is to further enhance the usage of global image context, especially in combi-
nation with 3D methods. Therefore, we propose and evaluate the following
different erasing methods for 2D and 3D, which are shown in Figure 3. Note,
we only erase the regions in the input image and not in the ground-truth im-
age that is used for optimization, hence our networks are enforced to solve an
in-painting task for abnormal regions.
First, we mask out a single randomly located and sized patch or cube within an input image, covering between 1% and 25% of the input size; we call this method patch for 2D and cube for 3D.
Second, we extend this approach and split a single patch or cube into multiple
ones. To this end, we mask-out up to ten randomly located and sized patches
or cubes within an input image, while the overall erasing size remains in the
limit of 1% up to 25% of the input size. We call this method multiple-patch
or multiple-cube for 2D and 3D, respectively.
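A minimal NumPy sketch of the multiple-cube erasing is given below. The sampling of locations and sizes is simplified (the total budget is split into roughly equal cubes), and the noise fill is assumed to be standard Gaussian, since the exact noise distribution is not restated here; the unmodified image remains the reconstruction target.

```python
import numpy as np

def erase_multi_cube(volume, max_regions=10, min_frac=0.01, max_frac=0.25,
                     fill="noise", rng=np.random):
    """Mask out up to `max_regions` randomly located and sized cubes such that the
    overall erased volume stays within [min_frac, max_frac] of the input size."""
    erased = volume.copy()
    budget = rng.uniform(min_frac, max_frac) * volume.size      # total voxels to erase
    n_regions = rng.randint(1, max_regions + 1)
    for _ in range(n_regions):
        # Sample a cube whose volume is roughly an equal share of the budget.
        side = int(round((budget / n_regions) ** (1.0 / 3.0)))
        side = max(1, min(side, min(volume.shape)))
        corner = [rng.randint(0, s - side + 1) for s in volume.shape]
        region = tuple(slice(c, c + side) for c in corner)
        if fill == "noise":
            erased[region] = rng.normal(0.0, 1.0, size=erased[region].shape)
        else:
            erased[region] = 0.0                                  # zero masking
    return erased
```

The 2D multiple-patch variant works analogously on a single slice with square patches, and the single patch/cube method corresponds to a single region.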
Fig. 2: Our backbone 3D VAE architecture receives an input volume x ∈ R^{64×64×64} and encodes it to the lower-dimensional latent variable z ∈ R^{nz}; afterwards, the decoder reconstructs the output x̂ ∈ R^{64×64×64}. The numbers above the boxes refer to the spatial sizes; the numbers below the boxes refer to the numbers of feature maps. We use convolutions and transposed convolutions in the encoder and decoder, respectively. Note, the first convolution in the encoder downsamples the input from 64 × 64 × 64 to 32 × 32 × 32.
Third, we erase entire brain sides, based on the idea of stimulating the networks to exploit the symmetry of the brain. Hence, we randomly erase the right or left side of the brain in the input slice. Similarly, for 3D we randomly erase the right or left side of the brain in 1 up to 32 sequential input slices. We refer to this method as half-slice for 2D and half-volume for 3D.
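The half-volume erasing can be sketched as follows. The assumptions here are that the slices are taken along the last axis and that the left/right brain halves correspond to the two halves of the first image axis; the actual axis conventions may differ.

```python
import numpy as np

def erase_half_volume(volume, max_slices=32, fill="noise", slice_axis=2, rng=np.random):
    """Erase the right or left brain half in 1 up to `max_slices` sequential slices.
    `slice_axis` and the left/right split along axis 0 are assumptions of this sketch."""
    erased = volume.copy()
    n_slices = rng.randint(1, max_slices + 1)
    start = rng.randint(0, volume.shape[slice_axis] - n_slices + 1)
    half = volume.shape[0] // 2
    side = slice(0, half) if rng.rand() < 0.5 else slice(half, None)

    index = [slice(None)] * volume.ndim
    index[slice_axis] = slice(start, start + n_slices)   # the affected sequential slices
    index[0] = side                                      # the erased brain half
    index = tuple(index)
    if fill == "noise":
        erased[index] = rng.normal(0.0, 1.0, size=erased[index].shape)
    else:
        erased[index] = 0.0
    return erased
```

The 2D half-slice variant correspondingly erases one brain half of the single input slice.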
We follow the idea of VAEs, hence we optimize our networks with respect
to the reconstruction loss between the original input image and the network
output reconstruction combined with the constraint that the latent variables
follow a multivariate normal distribution. Hence, our loss function is based on
the l1-distance between our input and output combined with the distribution-
matching Kullback–Leibler divergence for regularization. We train our net-
works with a batch size of 32 using Adam for optimization with a learning
rate of 0.001. We individually tune the number of training epochs of the net-
works using the reconstruction performance on our validation set with images
of healthy subjects.
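In code, the objective can be sketched as below (PyTorch). The weighting of the Kullback–Leibler term is an assumed hyperparameter, as its exact value is not stated here; note that the reconstruction target is always the unmodified image, even when the input has been erased.

```python
import torch
import torch.nn.functional as F

def vae_loss(x_hat, x_target, mu, logvar, kl_weight=1.0):
    """l1 reconstruction loss plus KL divergence of q(z|x) to N(0, I).
    `kl_weight` is an assumed hyperparameter; `x_target` is the unerased image."""
    recon = F.l1_loss(x_hat, x_target, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl

# Training step sketch (model and erase refer to the sketches above, which are
# illustrative assumptions; loader yields batches of healthy images, batch size 32):
#
#   optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
#   for x in loader:
#       x_in = erase(x)                         # one of the erasing strategies, or identity
#       x_hat, mu, logvar = model(x_in)
#       loss = vae_loss(x_hat, x, mu, logvar)   # target is the original, unerased x
#       optimizer.zero_grad(); loss.backward(); optimizer.step()
```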
For all evaluations, we employ the following post-processing steps. First, we
multiply each residual image by a slightly eroded brain mask to account for er-
rors occurring at sharp brain-mask boundaries. Next, we remove small outliers
Fig. 3: Our 3D spatial input erasing methods. In each row, sectional planes of a volume
with erasing are shown. Top row: We erase a single 3D cube with random location and size
(Cube). Middle row: We erase multiple 3D cubes with random location and size (Multi-
Cube). Bottom row: We erase an entire brain side in a subvolume (Half-Volume).
3 Results
Qualitative example results for our best performing method 3D-Cube-n are shown in Figure 7. Notably, the ground truth segmentations are highlighted in all difference images, while errors also appear at further regions.
Moreover, we use our best performing 2D and 3D methods trained on T1-weighted MRI data and evaluate them on T1ce-weighted MRI data from the BraTS 2019 data set to study the effect of using additional image information, see Table 2. Here, we observe immediate performance improvements compared to T1-weighting for both 2D and 3D, with relative improvements of 13.61% and 21.82% for 2D and 3D, respectively, considering DICED.
Last, we evaluate our baseline and best performing methods with respect to
slice-wise anomaly detection, see Figure 6. Here, our best performing method
achieves an AUPRC of 71.2%. Also for this task using 3D information and
erasing turns out to be beneficial, improving the AUPRC by approximately
4% compared to the 2D VAE.
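For illustration, the following sketch derives a per-slice anomaly score from the reconstruction residuals and computes the AUPRC with scikit-learn. The choice of the mean absolute residual inside the brain mask as the slice score is an assumption, since the exact scoring function is not restated here.

```python
import numpy as np
from sklearn.metrics import average_precision_score

def slice_scores(volume, reconstruction, brain_mask, slice_axis=2):
    """Per-slice anomaly score: mean absolute residual within the brain mask."""
    residual = np.abs(volume - reconstruction) * brain_mask
    other_axes = tuple(i for i in range(volume.ndim) if i != slice_axis)
    return residual.sum(axis=other_axes) / (brain_mask.sum(axis=other_axes) + 1e-8)

def slice_wise_auprc(volumes, reconstructions, masks, slice_labels):
    """`slice_labels`: per-slice binary ground truth (1 if the slice contains an anomaly)."""
    scores = np.concatenate([slice_scores(v, r, m)
                             for v, r, m in zip(volumes, reconstructions, masks)])
    return average_precision_score(np.concatenate(slice_labels), scores)
```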
4 Discussion
We also evaluate 2D and 3D input erasing for regularization and train the
networks to restore missing image parts conditioned on its surroundings. Our
results in Table 1 demonstrate that input erasing allows for further perfor-
mance improvements both for our 2D and 3D VAE. Regarding the method
for masking-out a region, previous works in 2D mostly mask out input regions with zeros [22, 16, 9]. However, our results demonstrate that using noise
for masking-out a region in the input works slightly better, indicating that the
increased variance during training is advantageous for regularization.
We also consider different strategies such as erasing multiple patches or an en-
tire brain side. While all erasing strategies are beneficial, there is no clear win-
ner between the different strategies considering our results on both data sets.
Furthermore, one could argue that our input erasing leads to brain anatomy
that deviates from normal, which is in slight contrast to the idea of only pro-
viding healthy brain anatomy as input. However, our ground-truth image that
is used for optimization remains unmodified, hence our networks are enforced
to solve an in-painting task for abnormal regions. Our results demonstrate
that this leads to an improved segmentation performance.
Table 1: Results for our 2D and 3D VAE combined with our spatial erasing methods
evaluated on the BraTS 2019 and ATLAS (Stroke) data set. The abbreviations for input
and erasing refer to the input/VAE dimension, erasing strategy and value used for masking-
out a region, e.g., 2D-Patch-0 and 2D-Patch-n stand for a 2D VAE with patch erasing, while
the first refers to masking-out a region with zeros and the second refers to masking-out a
region with noise. DICED denotes the Dice score computed over the voxels of the entire data set. DICES (µ ± σ) refers to the mean and standard deviation of the subject-wise Dice scores.
All metrics are in percent.
BraTS 2019
Input & Erasing DICED DICES (µ ± σ) AUPRC
2D-None 26.80 25.30 ± 12.37 21.19
3D-None 28.14 26.93 ± 12.40 24.69
2D-Patch-0 27.96 26.52 ± 13.42 22.53
2D-Patch-n 27.99 26.58 ± 13.27 22.54
3D-Cube-0 29.24 27.90 ± 13.57 26.18
3D-Cube-n 30.10 28.80 ± 13.74 27.85
2D-Multi-Patch-0 28.10 26.44 ± 12.89 22.54
2D-Multi-Patch-n 28.51 27.24 ± 13.14 22.81
3D-Multi-Cube-0 28.88 27.67 ± 13.22 25.82
3D-Multi-Cube-n 29.52 28.33 ± 13.42 26.18
2D-Half-Slice-0 26.86 25.44 ± 12.42 21.77
2D-Half-Slice-n 27.97 26.45 ± 13.22 22.84
3D-Half-Volume-0 28.49 27.51 ± 13.17 25.47
3D-Half-Volume-n 28.99 27.92 ± 13.24 26.07
ATLAS (Stroke)
Input & Erasing DICED DICES (µ ± σ) AUPRC
2D-None 24.72 11.23 ± 13.66 16.86
3D-None 30.68 14.42 ± 16.06 23.74
2D-Patch-0 27.68 12.23 ± 13.67 18.65
2D-Patch-n 27.42 12.36 ± 14.61 18.20
3D-Cube-0 31.50 15.59 ± 17.02 23.47
3D-Cube-n 32.68 15.53 ± 17.30 25.11
2D-Multi-Patch-0 26.99 11.82 ± 14.29 18.72
2D-Multi-Patch-n 28.06 12.88 ± 15.21 19.49
3D-Multi-Cube-0 31.83 15.23 ± 16.64 24.51
3D-Multi-Cube-n 32.37 14.99 ± 17.31 25.13
2D-Half-Slice-0 27.54 11.05 ± 13.70 18.60
2D-Half-Slice-n 28.99 12.13 ± 14.79 20.37
3D-Half-Volume-0 31.00 15.21 ± 17.00 23.14
3D-Half-Volume-n 33.05 15.27 ± 17.21 25.58
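To make the difference between the two reported Dice variants concrete, the following sketch computes DICED once over the pooled voxels of the entire data set and DICES as the mean and standard deviation of subject-wise scores; the thresholding of the residual maps into binary predictions is assumed to have happened beforehand.

```python
import numpy as np

def dice(pred, target, eps=1e-8):
    """Dice coefficient for binary arrays."""
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

def dice_dataset(preds, targets):
    """DICE_D: computed once over the concatenated voxels of all subjects."""
    return dice(np.concatenate([p.ravel() for p in preds]),
                np.concatenate([t.ravel() for t in targets]))

def dice_subjects(preds, targets):
    """DICE_S: mean and standard deviation of the subject-wise Dice scores."""
    scores = np.array([dice(p, t) for p, t in zip(preds, targets)])
    return scores.mean(), scores.std()
```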
Table 2: Results for additional image information considering the BraTS 2019 data set.
DICED denotes the Dice score computed over the voxels of the entire data set. DICES (µ ± σ) refers to the mean and standard deviation of the subject-wise Dice scores. All metrics are
in percent.
Fig. 4: Subject-wise DICES over lesion size. Lesion size refers to the number of annotated
pixels for the lesion. Results for the BraTS 2019 data set and ATLAS data set are shown
left and right, respectively. (Top) Comparing 2D VAE with and without erasing; (Middle)
Comparing 3D VAE with and without erasing; (Bottom) Comparing 2D and 3D VAE with
erasing. Transparent dots refer to the subject-wise DICES scores. Solid lines are derived by
a polynomial regression of order three.
To gain further insights, we study the performance with respect to the le-
sion size in Figure 4. While providing consistent performance improvements,
erasing turns out to be especially valuable for larger lesions. This might be
attributed to the fact that with erasing, networks are enforced to solve an
additional in-painting task, making them suited to handle inputs with large
anomalies. Also, our results in Figure 4 further emphasize the value of 3D
information, especially for smaller lesions considering the ATLAS data set.
Fig. 5: Impact of the data set size on the UAD performance for 2D-None, 2D-Patch-n, 3D-None, and 3D-Cube-n. We train our methods with 10%, 20%, 60%, and 100% of the training data; shown is the average AUPRC over our two test data sets (BraTS 2019, ATLAS).
Fig. 6: Slice-wise anomaly detection for our baseline and best performing methods. Shown is the AUPRC on the combination of our test sets (BraTS 2019, ATLAS): 68.5% and 69.3% for the 2D VAE without and with erasing, and 70.8% and 71.2% for the 3D VAE without and with erasing. 2D VAE without and with erasing refer to 2D-None and 2D-Patch-n, respectively; 3D VAE without and with erasing refer to 3D-None and 3D-Cube-n, respectively.
Next, we study the effect of the training data set size. As expected, the data
set size has a notable impact on the performance, see Figure 5. It stands out
that our 3D methods trained with only 20% of the training data even out-
perform the 2D methods trained with 100% of the data. This indicates that
increasing the spatial context during training is even more important than
increasing the data set size. This is an interesting observation, as one could
assume that, due to the increased number of parameters, 3D models require more data compared to their 2D counterparts. We believe that this counter-intuitive behaviour could be explained by the increased complexity of the task and the larger input image for the 3D approach. The learning task of the 3D model can be considered more complex since an entire volume must be processed and reconstructed at once, while the 2D model is only trained to process a single slice. Also, for 3D the input image is larger (a volume) compared to 2D (a single slice). Note, if the input image is larger, then a network might need more
expressive power to capture the patterns in the input image, as shown in [20].

Fig. 7: Four example test cases using our best performing method 3D-Cube-n. From left to right: input image, output image, difference image, heat-map difference image, and ground truth segmentation. The first two rows contain examples from the BraTS 2019 data set and the two bottom rows contain examples from the ATLAS data set.
Considering our erasing approach in combination with the data set size suggests that solving the additional in-painting task requires sufficient training data to provide effective regularization. However, with only 60% of the training data, our models with our regularization approach already achieve higher performance than a model without regularization trained on the full data set. We argue this demonstrates the effectiveness of our regularization approach, as less data is required to achieve similar or better performance compared to a model without regularization. Still, increasing the data set size is valuable, as the performance of our model with erasing continues to improve with a larger training data set.
Comparing our novel 3D method with input erasing (3D-Cube-n) with the previous 2D approach (2D-None) demonstrates a relative DICED improvement of 12.31% and 32.20% on the BraTS 2019 and ATLAS data set, respectively. A comparable
work evaluating UAD performance on the same ATLAS data set achieves
a mean subject-wise DICE score of 12 ± 12% with their best performing
method [8]. Notably, this 2D method is restoration-based and involves sig-
nificantly increased computational complexity. Our 3D approach with input
erasing leads to a mean subject-wise DICE score of 15.53 ± 17.30%, improving
the UAD state-of-the-art on this data set. This demonstrates the effective-
ness of our approach. Comparing our results on the BraTS 2019 data set with
other works that utilize additional image information, e.g. T2-weighted data
[8, 22], highlights the advantage of additional image information. Similarly, we observe an immediate performance improvement for our methods when evaluated on T1ce-weighted data, despite the domain shift from T1, see Table
2. Also, other studies that use multiple MRI sequences [4, 5] achieve higher
performance metrics, however, a direct comparison is difficult due to different
data sets and settings. Notably, multiple MRI sequences are beneficial but not
always available [10, 1], imposing an additional challenge on UAD.
Putting UAD into perspective with supervised methods shows that the unsupervised segmentation performance is in a moderate range. Considering the BraTS
2019 data set, supervised methods achieve a mean subject-wise DICE score of
around 90% [11] utilizing all available MRI sequences (T1, T1ce, T2, FLAIR).
Considering the ATLAS data set, supervised methods achieve mean subject-
wise DICE scores in the range of 32.92% up to 53.49% [10]. While UAD is
notably more challenging than supervised segmentation, the overall UAD per-
formance on these supervised data sets might also be limited, as the annotation
focuses on pre-specified lesions and not all anomalies in the images might be
labeled. This is also demonstrated in Figure 7, where, e.g., the segmentation
focuses only on the tumor and not on all brain regions that deviate from nor-
mal. Also, the domain shifts between different data sets might be challenging,
which is also pointed out in previous works [4, 22].
Considering these challenges we also evaluate our methods with respect to
slice-wise anomaly detection, see Figure 6. Here, we observe significantly in-
creased performance compared to segmentation with an AUPRC of 71.2%
for our best performing method. The slice-wise detection performance indicates that UAD can be helpful for red-flagging suspicious MRI data in clinical routine, especially with T1-weighted MRI data. Also, we believe that unsupervised segmentation gives additional cues to the reader as to where an anomaly may be located and is thus helpful to quickly localize a potential anomaly or lesion. In this regard, our work constitutes a valuable contribution by demonstrating the benefits and emphasizing the use of 3D models with spatial erasing for voxel-wise and slice-wise UAD.
For future work, our findings could be extended to more complex deep learn-
ing methods for UAD, such as GANs [18]. In particular, combining our 3D
approach with restoration-based methods [8] might improve the overall per-
formance. However, this approach also leads to significantly increased runtime
and computational efforts, e.g., the restoration quickly accumulates to multiple minutes for a single MRI volume [4], which is particularly challenging for clinical routine.
5 Conclusion
References
1. Akkus, Z., Galimzianova, A., Hoogi, A., Rubin, D.L., Erickson, B.J.: Deep learning for
brain mri segmentation: state of the art and future directions. Journal of digital imaging
30(4), 449–459 (2017)
2. Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J.S., Freymann, J.B.,
Farahani, K., Davatzikos, C.: Advancing the cancer genome atlas glioma mri collections
with expert segmentation labels and radiomic features. Scientific data 4, 170117 (2017)
3. Bakas, S., Reyes, M., et al., Menze, B.: Identifying the best machine learning algorithms
for brain tumor segmentation, progression assessment, and overall survival prediction
in the brats challenge. arXiv preprint arXiv:1811.02629 (2018)
4. Baur, C., Denner, S., Wiestler, B., Navab, N., Albarqouni, S.: Autoencoders for un-
supervised anomaly segmentation in brain mr images: A comparative study. Medical
Image Analysis p. 101952 (2020)
5. Baur, C., Wiestler, B., Albarqouni, S., Navab, N.: Deep autoencoding models for unsu-
pervised anomaly segmentation in brain mr images. In: International MICCAI Brain-
lesion Workshop, pp. 161–169. Springer (2018)
6. Bruno, M.A., Walker, E.A., Abujudeh, H.H.: Understanding and confronting our mis-
takes: the epidemiology of error in radiology and strategies for error reduction. Radio-
graphics 35(6), 1668–1676 (2015)
7. Chen, X., Konukoglu, E.: Unsupervised detection of lesions in brain mri using con-
strained adversarial auto-encoders. In: International Conference on Medical Imaging
with Deep Learning (2018)
8. Chen, X., You, S., Tezcan, K.C., Konukoglu, E.: Unsupervised lesion detection via image
restoration with a normative prior. Medical image analysis 64, 101713 (2020)
9. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks
with cutout. arXiv preprint arXiv:1708.04552 (2017)
10. Ito, K.L., Kim, H., Liew, S.L.: A comparison of automated lesion segmentation ap-
proaches for chronic stroke t1-weighted mri data. Human brain mapping 40(16), 4669–
4685 (2019)
11. Jiang, Z., Ding, C., Liu, M., Tao, D.: Two-stage cascaded u-net: 1st place solution to
brats challenge 2019 segmentation task. In: International MICCAI Brainlesion Work-
shop, pp. 231–241. Springer (2019)
12. Kingma, D.P., Welling, M.: Auto-encoding variational bayes. arXiv preprint
arXiv:1312.6114 (2013)
13. Liew, S.L., Anglin, J.M., Banks, N.W., Sondag, M., Ito, K.L., Kim, H., Chan, J., Ito,
J., Jung, C., Khoshab, N., Lefebvre, S., Nakamura, W., Saldana, D., Schmiesing, A.,
Tran, C., Vo, D., Ard, T., Heydari, P., Kim, B., Aziz-Zadeh, L., Cramer, S., Liu, J.,
Soekadar, S., Nordvik, J.E., Westlye, L., Wang, J., Winstein, C., Yu, C., Ai, L., Koo,
B., Craddock, R., Milham, M., Lakich, M., Pienta, A., Stroud, A.: A large, open source
dataset of stroke anatomical brain images and manual lesion segmentations. Scientific
data 5, 180011 (2018)
14. Lundervold, A.S., Lundervold, A.: An overview of deep learning in medical imaging
focusing on mri. Zeitschrift für Medizinische Physik 29(2), 102–127 (2019)
15. Menze, B., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahaniy, K., Kirby, J., Burren,
Y., Porz, N., Slotboomy, J., Wiest, R., Lancziy, L., Gerstnery, E., Webery, M.A., Arbel,
T., Avants, B., Ayache, N., Buendia, P., Collins, L., Cordier, N., Van Leemput, K.: The
multimodal brain tumor image segmentation benchmark (brats). IEEE Transactions
on Medical Imaging 99 (2014)
16. Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders:
Feature learning by inpainting. In: Proceedings of the IEEE conference on computer
vision and pattern recognition, pp. 2536–2544 (2016)
17. Sato, D., Hanaoka, S., Nomura, Y., Takenaga, T., Miki, S., Yoshikawa, T., Hayashi, N.,
Abe, O.: A primitive study on unsupervised anomaly detection with an autoencoder in
emergency head ct volumes. In: Medical Imaging 2018: Computer-Aided Diagnosis, vol.
10575, p. 105751P. International Society for Optics and Photonics (2018)
18. Schlegl, T., Seeböck, P., Waldstein, S.M., Langs, G., Schmidt-Erfurth, U.: f-anogan: Fast
unsupervised anomaly detection with generative adversarial networks. Medical image
analysis 54, 30–44 (2019)
19. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a
simple way to prevent neural networks from overfitting. The journal of machine learning
research 15(1), 1929–1958 (2014)
20. Tan, M., Le, Q.: Efficientnet: Rethinking model scaling for convolutional neural net-
works. In: International Conference on Machine Learning, pp. 6105–6114. PMLR (2019)
21. Vernooij, M.W., Ikram, M.A., Tanghe, H.L., Vincent, A.J., Hofman, A., Krestin, G.P.,
Niessen, W.J., Breteler, M.M., van der Lugt, A.: Incidental findings on brain mri in the
general population. New England Journal of Medicine 357(18), 1821–1828 (2007)
22. Zimmerer, D., Kohl, S.A., Petersen, J., Isensee, F., Maier-Hein, K.H.: Context-encoding
variational autoencoder for unsupervised anomaly detection. In: International Confer-
ence on Medical Imaging with Deep Learning (2019)