Learning How to MIMIC: Using Model Explanations to Guide Deep Learning Training

Matthew Watson, Bashar Awwad Shiekh Hasan, Noura Al Moubayed


Durham University
Durham, UK
{matthew.s.watson,bashar.awwad-shiekh-hasan,noura.al-moubayed}@durham.ac.uk

Abstract

Healthcare is seen as one of the most influential applications of Deep Learning (DL). Increasingly, DL models have been shown to achieve high levels of performance on medical diagnosis tasks, in some cases achieving levels of performance on par with medical experts. Yet, very few are deployed into real-life scenarios. One of the main reasons for this is the lack of trust in those models by medical professionals, driven by the black-box nature of the deployed models. Numerous explainability techniques have been developed to alleviate this issue by providing a view on how the model reached a given decision. Recent studies have shown that those explanations can expose the models' reliance on areas of the feature space that have no justifiable medical interpretation, widening the gap with the medical experts. In this paper we evaluate the deviation of saliency maps produced by DL classification models from radiologists' eye gaze while they study the MIMIC-CXR-EGD images, and we propose a novel model architecture that utilises model explanations during training only (i.e. not during inference) to improve the overall plausibility of the model explanations. We substantially improve the similarity between the model's explanations and radiologists' eye-gaze data, reducing Kullback-Leibler Divergence by 90% and increasing Normalised Scanpath Saliency by 216%. We argue that this significant improvement is an important step towards building more robust and interpretable DL solutions in healthcare.

1. Introduction

Applications of Deep Learning (DL) to healthcare have been growing rapidly in a wide range of medical scenarios, ranging from critical care [24] and diabetes risk prediction [1] to the diagnosis of chest x-rays (CXRs) [28]. This is partly driven by the rising accuracy of such models, with some beginning to achieve performance on par with (or even exceeding) that of medical professionals [22]. However, despite these developments we are yet to see a similar growth in the number of DL models being deployed into real-world medical scenarios [2]. This is down to numerous limiting factors; most notably, before such techniques can become established in the medical field, they must be ethical in their decision-making, trustworthy, transparent and explainable [5, 12].

It is in these areas that many DL models perform poorly. In particular, many models fail to accurately capture the causal relationships between input features and the output classification, relying instead on task-irrelevant features. For example, a wide-ranging study on the use of Machine Learning (ML) and DL techniques for COVID-19 prediction from chest x-rays (CXRs) [17] has shown that many models rely on spurious correlations, leaving them unable to generalise accurately. Furthermore, recent studies on the robustness of DL models have shown that changes to training hyperparameters can greatly affect the learned features [26]; this damages the trust between clinicians and DL techniques, as it highlights just how sensitive the models are to small changes, even when those changes are independent of the medical questions the model is trying to answer.

Thus, the gold standard for any ML model is to achieve high levels of performance whilst learning the concrete causal relationships present in the data. Unfortunately, the presence of learned causal features is extremely difficult to verify due to a lack of useful data supporting the task. Following practices in pedagogy, experts' Eye Gaze Data (EGD) can be used as a proxy for causal relationships [23, 19]. The release and initial analysis of the MIMIC-CXR-EGD dataset [15] showed that even current state-of-the-art CXR classification models fail to learn the same set of features as used by radiologists in their diagnoses.

In this paper, we present a novel deep learning architecture that learns a more consistent feature set than previous techniques.
Using the MIMIC-CXR-EGD dataset, which to the best of our knowledge is the only large-scale image dataset with accompanying expert eye-gaze data, we compare the similarity between explanations computed from DL models and the EGD from radiologists. We report a significantly larger overlap between explanations from our proposed technique and the EGD than from any other model architecture tested (increasing from -0.4634 to 0.5410 when measured by Normalised Scanpath Saliency, and improving from 9.1233 to 0.8398 when measured by Kullback-Leibler Divergence), including current state-of-the-art methods specifically designed to combat this issue. We also show that our proposed architecture produces more consistent explanations than previous models, increasing explanation consistency [26] from 0.1785 to 0.5333, with no cost to model performance nor the need for specialists' EGD at inference time.

2. Related Work

In order to explain the decisions made by DL models, numerous explainability techniques have been developed with the aim of "opening up" the black-box architectures. In this paper we focus on two post-hoc techniques [13] that are designed to explain deep learning models; our aim is to compare the explanations from a variety of established architectures (as well as our novel models), and so the techniques used must be model-agnostic and easy to apply. SHAP [16] is a permutation-based approach which has theoretical groundings in game theory. Grad-CAM [18] is a gradient-based approach which uses the gradient of any target concept flowing into the final convolutional layer of a network to produce a saliency map. We focus on these two techniques in this paper as not only are they the current de-facto standards, but they can also both be applied to a wide range of model architectures, allowing for the easy comparison of explanations from varying model types.

Previous work has used these explainability techniques to investigate the robustness and adaptability of DL models [26, 8], finding that even small changes to the training procedure can result in significant changes to the learned features. These results, coupled with many networks' susceptibility to issues such as adversarial attacks [10] and shortcut learning [9], suggest that many modern DL architectures are not necessarily learning causal relationships in the data to achieve high performance and might be relying on spurious correlations. It can be extremely difficult to verify that the learned features are indeed causal - there are only a limited number of mostly toy datasets that include descriptions of their causal relationships [3].

In the absence of such data, recent work has used the EGD of experts making decisions on a visual task as a proxy for concrete causal relationships [15]. Such data can be used to determine whether models are learning features that domain experts would use in their assessment of the data - this use case has groundings in real-world applications, with similar techniques being used pedagogically in fields such as radiology [25]. The MIMIC-CXR-EGD dataset [15] is a subset of MIMIC-CXR [14], containing 1,083 CXR images from three classes (Pneumonia, Congestive Heart Failure and Normal). Accompanying the images are aligned EGD from a trained radiologist. Both raw eye gaze information and calculated fixation points are available for this EGD - we refer readers interested in the EGD collection process to [15]. Alongside the release of the dataset, the authors also show that explanations from traditional classification models do not significantly overlap with the radiologist's EGD. They propose a multi-task UNet model which uses EGD at train time to learn to both classify the CXR image and reproduce the ground-truth EGD, in order to improve the similarity between model explanations and EGD. However, the results are not very convincing and the study lacked a verifiable method of comparing their model explanations and the EGD. Additionally, this technique requires the use of expert EGD during training, which is costly and difficult to collect, especially in the medical domain. We compare our method against both the baseline models and the improved UNet architecture using static EGD heatmaps proposed in [15], resulting in a significantly higher degree of similarity between model explanations and EGD across all tested metrics.

3. Method

Our proposed architecture consists of an ensemble architecture M made up of S sub-models (of any architecture) and a discriminator, D. We begin by describing the architecture of our model, and then detail its training procedure.

We define an explanation ensemble model as M : X → Y, where X is the set of inputs, and Y the outputs. M consists of S sub-models m_0, ..., m_S, where S ∈ N, each of which has the same architecture, suited to the task. In our proposed network, each m_i is trained with a different hyperparameter setup - i.e. with different random seeds, or training data order. Architecture hyperparameters, such as layer size and learning rate, are kept constant. The final output of the explanation ensemble is the average output of all sub-models:

M(x) = (1/S) · Σ_{i ∈ [0, S]} m_i(x)        (1)

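In code, Eq. 1 is a simple averaging wrapper around the sub-models. The following is a minimal PyTorch sketch for illustration only; the class and variable names are ours and not taken from the authors' released implementation (linked in Section 4):

```python
import torch
import torch.nn as nn

class ExplanationEnsemble(nn.Module):
    """Minimal sketch of M: S sub-models sharing one architecture, averaged as in Eq. 1."""

    def __init__(self, sub_models):
        super().__init__()
        self.sub_models = nn.ModuleList(sub_models)   # m_0, ..., m_S, e.g. UNet classifiers

    def forward(self, x):
        # M(x) = (1/S) * sum_i m_i(x): average the sub-model outputs
        return torch.stack([m(x) for m in self.sub_models], dim=0).mean(dim=0)
```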
The network also adds a discriminator, D : E → S, where E is the set of model explanations (calculated via any feature importance attribution method) and S = [0, S]. We denote the explanations of sub-model m_i on the inputs x as E_i(x). The discriminator is trained on the explanations produced by each of the S sub-models, with the aim of learning to identify which of the sub-models a given explanation originated from. As the task of the discriminator has been shown to be easily learned [26], the architecture of D should be chosen carefully, ensuring it is not so complex that M drastically overfits.

The S sub-models and discriminator D are all trained together, optimising the loss function in Equation 2, where CELoss(·, ·) is cross-entropy loss and β ∈ [0, ∞) is a hyperparameter weighting D's contribution during a training epoch. The subtraction of the discriminator loss in Eq. 2 ensures that each sub-model m_i "fools" the discriminator by learning to produce explanations that are similar to those of the other sub-models in the ensemble.

loss = Σ_i [ CELoss(m_i(x), y) − β · CELoss(D(E_i(x)), i) ]        (2)

Every α epochs (where α is another tunable hyperparameter), the discriminator D is updated with respect to the loss function CELoss(D(E_i(x)), i), without back-propagating through the sub-models, allowing D to learn how to effectively classify the explanations. This only needs to be done every α epochs due to the ease of the task [26]. This equates to the S sub-models and D being updated in a two-player minimax game - the goal of D is to learn to separate the sub-models' explanations, whereas the sub-models are aiming to fool the discriminator, all whilst also optimising m_0, ..., m_S on the downstream task. The result is a set of S sub-models that produce similar explanations. The assumption here is that this learnt explanation is closer to representing the causal relationships and less reliant on spurious correlations.

Training of this model can be unstable - this is a direct consequence of the discriminator and ensemble sub-models having opposing goals. For example, if each sub-model gives each feature of the input equal weight, then the loss of the discriminator will be maximised, reducing Eq. 2. However, this would also result in the sub-models predicting the same class for every input. Training stability is linked to a "good" choice of α. This can be optimised like any hyperparameter (e.g. through a grid search or random search), although we have found empirically that α = 2 provides stable training.

To summarise, the intuition behind our architecture is to train a discriminator D which encourages each of the S sub-models in an ensemble to learn a similar set of features. As each of the sub-models is trained with a different hyperparameter setup, they will each learn a slightly different set of features. As training progresses, D will learn to use the noisy features of each sub-model to (correctly) classify which sub-model its explanations originate from - and in turn, the sub-models will learn to use different features for their classification, in order to fool D. The final result is an ensemble model that has learned to "ignore" a wide range of spurious features, with each of the sub-models only using features which all m_i agree are important. As multiple models must agree that any given feature is important for it to be used, it is more likely that these features are causally related to the target, and thus more likely to be included in an expert's eye-gaze data.

4. Experimental Setup

All experiments¹ are carried out on the MIMIC-CXR-EGD dataset [15]. The models are trained on the same 3-label classification task: given a CXR image, predict its diagnosis (Pneumonia, Congestive Heart Failure or Normal). We train three architectures to compare our explanation ensemble against: 1) baseline: a standard UNet architecture trained with a learning rate (LR) of 0.003 with the Adam optimiser, batch size 32, and pre-trained EfficientNet-b0 [21] as the encoder and bottleneck layers; 2) improved UNet: the modified UNet architecture [15] using static heatmaps during training to both classify and reproduce the EGD given a CXR, using identical hyperparameters; and 3) standard ensemble: an ensemble architecture consisting of 10 UNet architectures identical to 2), trained with LR=0.003 using the Adam optimiser and batch size 4 [15]. A reduced batch size was used due to memory constraints. Each experiment allows us to compare our results against a different standard of model: 1) is a standard classification model and used as a baseline, 2) is the SOTA for similarity between model explanations and EGD, and 3) confirms that our results are not just a result of utilising an ensemble architecture (and instead are inherent to our proposed architecture and training procedure). UNet was chosen to allow for direct comparison with the current state-of-the-art model on the MIMIC-CXR-EGD dataset in [15]. We also experimented with Vision Transformers [7]; however, due to the small size of MIMIC-CXR-EGD they are unable to reach levels of performance matching those of our baseline, and so we do not include their results in this paper. Across all experiments the same 80/20 train-test split is used for the MIMIC-CXR-EGD dataset.

¹ Code to reproduce these experiments can be found at: https://github.com/mattswatson/learning-to-mimic

We train our proposed explanation ensembles using standard UNet with a classification head as our sub-models. A batch size of 4 and a learning rate of 0.00001 with the Adam optimiser are used. We use a CNN for our discriminator, with two convolution layers. Max pooling (with kernel size and stride of 2) and ReLU activations are used after each convolution layer. We set β = 0.2 to ensure the two parts of the main loss function are of the same order of magnitude. We use 10 sub-models per Explanation Ensemble (see the Supplementary Material for results on different numbers of sub-models). We report the accuracy (across all three labels) for all models as a performance metric.

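The sketch below ties Eq. 2 to the settings above (β = 0.2, a two-convolution-layer CNN discriminator, updates to D every α = 2 epochs). The layer widths, the global average pooling, and the loop structure are our own assumptions for illustration, not the authors' exact implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """Small CNN D that predicts which sub-model a single-channel explanation map came from."""

    def __init__(self, num_sub_models, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.MaxPool2d(2, 2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.MaxPool2d(2, 2), nn.ReLU(),
        )
        self.classifier = nn.Linear(32, num_sub_models)

    def forward(self, e):
        h = self.features(e).mean(dim=(2, 3))   # global average pool over spatial dims
        return self.classifier(h)

def ensemble_step(sub_models, discriminator, explain_fn, x, y, beta=0.2):
    """One sub-model update against Eq. 2; only the sub-models' optimiser should step on this loss."""
    loss = x.new_zeros(())
    for i, m in enumerate(sub_models):
        ce_task = F.cross_entropy(m(x), y)                       # CELoss(m_i(x), y)
        e_i = explain_fn(m, x)                                   # E_i(x); must be differentiable w.r.t. m
        target = torch.full((x.size(0),), i, dtype=torch.long, device=x.device)
        ce_disc = F.cross_entropy(discriminator(e_i), target)    # CELoss(D(E_i(x)), i)
        loss = loss + ce_task - beta * ce_disc                   # subtracting the D term "fools" D
    return loss

def discriminator_step(sub_models, discriminator, explain_fn, x):
    """Every alpha (= 2) epochs: update D on detached explanations, so no gradient reaches the sub-models."""
    loss = x.new_zeros(())
    for i, m in enumerate(sub_models):
        e_i = explain_fn(m, x).detach()
        target = torch.full((x.size(0),), i, dtype=torch.long, device=x.device)
        loss = loss + F.cross_entropy(discriminator(e_i), target)
    return loss
```

In a full loop, the sub-models would step on `ensemble_step` every batch, while `discriminator_step` (with its own optimiser) runs only every α = 2 epochs, giving the two-player minimax dynamic described in Section 3.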
In order to allow for direct comparison with [15], we compute the explanations for all models using Grad-CAM [18] on the final convolution layer. We sampled examples from the test set for inspection. We compare the similarity of these explanations to EGD heatmaps generated from the eye-gaze fixations, which gives us scalar values of importance for each pixel based on the radiologist's eye gaze [15]. To measure similarity to the EGD heatmaps we follow the standard practice for comparing saliency maps [4]; we report both the Kullback-Leibler Divergence (KLD) as a distribution-based metric, and the Normalised Scanpath Saliency (NSS) as a location-based metric. KLD is an information-theoretic measure of the difference between one probability distribution and another; importantly, note that it is a divergence metric, meaning smaller values indicate better similarity. NSS is designed to compare saliency maps with a ground truth, and is the normalised saliency at fixation locations. We note that metrics such as Intersection over Union (IoU) are not suited to comparing EGD and saliency heatmaps [4], as one must consider how much importance is placed on each pixel (by both the model and the expert), rather than treating explanations/EGD as binary heatmaps.

It is known that NSS is sensitive to false positives; however, that is desirable here - we hypothesise that the (non-explanation ensemble) models are learning many noisy features which are not necessarily causally linked to the output, and we want to penalise the models if this is indeed the case. Negative NSS values indicate negative correlation, with chance at 0 and positive values indicating positive correlation.

Explanation consistency [26] measures the change in model explanations under different hyperparameter settings perpendicular to the task. Higher consistency is linked to explanations that are more robust to spurious correlations [26]. We would expect our explanation ensemble model to achieve higher explanation consistency than the other models tested. For each architecture, 10 models are trained with different random seeds. The Grad-CAM explanations are generated on the test set for these 10 models, with these explanations also being used to calculate the explanation consistency C for each architecture. Following the methods of [26], we use a binary logistic regression classifier to measure the separability of two sets of explanations.

Furthermore, we confirm our results on Grad-CAM by repeating these experiments with SHAP. This confirms that our results are not limited to one explanation technique; if both explainability methods agree on the outcome, then we can conclude with increased certainty that the model is indeed learning "better" (i.e. similar, causal) features.

5. Results and Discussion

Table 1 reports the best model performance as well as summary statistics for both the KLD and NSS metrics used to compare the similarity between the models' Grad-CAM explanations and the EGD. Table 1 in the Supplementary Material reports the results for each training hyperparameter setup used. The performance of both the Baseline and Improved UNet models is equal to the results reported in [15], confirming that these models are behaving as expected. Furthermore, both ensembling techniques perform better than these two models; this is to be expected given that they are ensemble architectures [6]. Importantly, our Explanation Ensemble architecture is shown to improve upon the performance of the baseline models by 3.39%, indicating that the models are not sacrificing model performance for improved explanations. Given that the explanations from Explanation Ensembles are shown to better align with radiologist EGD, this also suggests that the features used by radiologists are better for disease classification than those learned by the baseline model.

Both Table 1 and Figure 1 report the Kullback-Leibler Divergence and Normalised Scanpath Saliency between the Grad-CAM explanations from each model architecture and the radiologist's EGD heatmaps (for details on EGD heatmap generation, see [15]). From Figure 1 we can see that our Explanation Ensemble model produces explanations that are more similar to the EGD than all other architectures tested, when measured by both a distribution-based measure (KLD) and a location-based metric (NSS). To confirm that these conclusions are statistically sound, we perform a paired t-test at the α = 0.05 significance level between the similarity metrics from the baseline and Explanation Ensemble models. Our null and alternative hypotheses are the same for both KLD and NSS: H0: µd = 0, H1: µd ≠ 0, where µd is the mean of the differences between the KLD/NSS values for the two architectures. The distributions of the differences were confirmed to be normal before carrying out the t-test. Table 2 reports both the test statistics and p-values for each of our hypothesis tests. Given that all p-values are significantly less than α, we can conclude that our explanation ensemble architecture produces explanations that are statistically more similar to radiologist EGD than both baseline and current state-of-the-art techniques. Significantly, all models except explanation ensembles achieve negative NSS scores, showing anti-correspondence with the EGD [4] and making our explanation ensemble architecture the only method tested to use features that are positively correlated with those used by experts. This is further highlighted by the large reduction in KLD from our method when compared with the baseline models tested; this underlines how significantly different the features used by current state-of-the-art models and medical experts are (and follows results suggesting that many networks suffer from shortcut learning [9] and spurious correlations [27]), and shows that our proposed method is a significant improvement. While we have focused on Explanation Ensembles of size 10 in this paper, the effect of changing the number of sub-models is explored in Figure 1 of the Supplementary Material. These experiments show that as the number of sub-models increases, so does the agreement between model explanations and the EGD; however, it is important to note the trade-off between training cost and increased performance as the Explanation Ensemble size increases.
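All of the explanation-based comparisons above start from Grad-CAM [18] heatmaps taken from the final convolution layer, as described at the start of this section. A minimal, library-free sketch is shown below; the upsampling and normalisation choices are our assumptions rather than the authors' exact implementation:

```python
import torch
import torch.nn.functional as F

def grad_cam(model, conv_layer, x, class_idx):
    """Grad-CAM heatmap for `class_idx` from the activations of `conv_layer` (x: one image, shape (1, C, H, W))."""
    acts, grads = {}, {}
    h1 = conv_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    try:
        logits = model(x)                                       # (1, num_classes)
        model.zero_grad()
        logits[0, class_idx].backward()
        weights = grads["g"].mean(dim=(2, 3), keepdim=True)     # average gradients over spatial dims
        cam = F.relu((weights * acts["a"]).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
        cam = cam / (cam.max() + 1e-8)                          # rescale to [0, 1]
    finally:
        h1.remove()
        h2.remove()
    return cam[0, 0]                                            # (H, W) saliency map
```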

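Table 1 below summarises these comparisons using KLD and NSS. A small NumPy sketch of the two metrics as defined above, computed between a model saliency map and an EGD heatmap (the exact normalisation used by the authors may differ):

```python
import numpy as np

def kld(saliency, egd, eps=1e-8):
    """Kullback-Leibler divergence from the EGD heatmap to the model saliency map (lower is better)."""
    p = egd / (egd.sum() + eps)            # treat both heatmaps as probability distributions
    q = saliency / (saliency.sum() + eps)
    return float(np.sum(p * np.log(eps + p / (q + eps))))

def nss(saliency, fixations):
    """Normalised Scanpath Saliency: mean of the z-scored saliency map at fixated pixels (higher is better)."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-8)
    return float(s[fixations > 0].mean())
```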
Table 1. The performance of the best-performing model for each architecture, alongside the similarity between the model Grad-CAM explanations and the EGD. Note that KLD is a divergence metric, and so smaller is better. Grad-CAM explanation consistency was calculated across all 10 training hyperparameter setups for each architecture.

Model | Accuracy | KLD Mean (± std. dev) | KLD Median (± IQR) | NSS Mean (± std. dev) | NSS Median (± IQR) | Consistency
Baseline | 75.55% | 14.4041 ± 7.6886 | 13.4535 ± 10.5240 | −0.8579 ± 1.2345 | −1.0391 ± 1.4737 | 0.1785
Improved UNet | 76.51% | 9.9371 ± 6.4179 | 9.1221 ± 8.4260 | −0.3244 ± 1.5237 | −0.4634 ± 1.9781 | 0.1596
Normal Ensemble | 79.86% | 3.8839 ± 3.2510 | 2.7740 ± 4.0799 | −0.1646 ± 1.5721 | −0.1307 ± 2.0840 | 0.3042
Explanation Ensemble (Ours) | 78.94% | 0.8196 ± 0.1273 | 0.8398 ± 0.1658 | 0.6757 ± 1.1178 | 0.5410 ± 1.5653 | 0.5333

Figure 1. Boxplots of mean (a) NSS and (b) KLD between model Grad-CAM explanations and radiologist EGD, across each of the 10
training random seeds tested. Note that KLD is a divergence metric meaning smaller values are better.
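The Consistency column of Table 1 follows [26], which measures how separable the explanations of differently-seeded models are; Section 4 states that a binary logistic-regression classifier is used for this separability step. The sketch below shows only that step (the mapping from separability to the final consistency score C is defined in [26] and not reproduced here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def separability(expl_a, expl_b, seed=0):
    """Accuracy of a binary logistic-regression classifier asked to tell apart two sets of
    flattened explanations; an accuracy near 0.5 means the two models explain alike."""
    X = np.concatenate([expl_a.reshape(len(expl_a), -1), expl_b.reshape(len(expl_b), -1)])
    y = np.concatenate([np.zeros(len(expl_a)), np.ones(len(expl_b))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed, stratify=y)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)
```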

In addition to improved similarity with expert EGD, explanation consistency (Table 1) is also significantly improved in our explanation ensemble model. This can also be seen in the significantly smaller range of NSS and KLD of the explanations from the explanation ensembles (as reported in Figure 1) when compared with the other architectures tested. This inherently increases trust in the model, as it shows that our architecture is more robust than the others tested. It also further highlights how our network learns "better" (i.e. similar to those in EGD) features than the baseline models - our model is learning fewer noisy/spurious features and instead placing more importance on the features that have a higher probability of being causally related to the task.

We also investigate the similarity between SHAP values and the EGD data; this is shown in Figure 2. Similarly to the Grad-CAM results, we see that our proposed Explanation Ensemble architecture improves the similarity over all other model architectures tested. Similar patterns can be seen between all 4 architectures tested across the KLD and NSS values on the Grad-CAM and SHAP results, with the boxplots highlighting that the level of improvement of our explanation ensemble architecture is at the same scale regardless of the explainability technique used. As the results of Grad-CAM and SHAP agree, we can conclude that our proposed model is learning to use features similarly to a radiologist. These results can also be seen from a visual comparison of explanations: Figure 3 shows example CXRs and their corresponding EGD and explanations from all models tested, showing that our explanation ensemble places much more importance on regions similar to the expert radiologist (i.e. around the lungs and heart) than both the baseline and current state-of-the-art models. Notice how columns 2 (baseline Grad-CAM) and 3 (Improved UNet Grad-CAM) in Figure 3 show how much of the feature attribution is placed on spuriously correlated features (such as the top-left corner and the image borders). On the other hand, our explanation ensemble architecture learns a significantly different set of features (using features around the lungs and heart, with these areas much more closely matching the areas shown in the EGD heatmap in the first column), further showing that our training technique has a notable effect on the representations learned by the model.

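Figure 2 below repeats the comparison using SHAP values. The paper does not state which SHAP explainer was used; one plausible way to obtain pixel-level SHAP attributions for a PyTorch CXR classifier is shap's GradientExplainer, sketched here under that assumption:

```python
import shap
import torch

def shap_heatmaps(model, background, images, class_idx):
    """Per-pixel SHAP attributions for `class_idx`, summed over input channels.

    `background` is a small batch of training images used as the reference distribution.
    """
    model.eval()
    explainer = shap.GradientExplainer(model, background)
    shap_values = explainer.shap_values(images)
    # shap's return format varies by version: list with one array per class, or one stacked array
    sv = shap_values[class_idx] if isinstance(shap_values, list) else shap_values[..., class_idx]
    attr = torch.as_tensor(sv)                 # (N, C, H, W)
    return attr.abs().sum(dim=1)               # (N, H, W) saliency maps
```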
Figure 2. Boxplots showing the mean (a) NSS and (b) KLD between model SHAP explanations and radiologist EGD, across each of the
10 training random seeds tested. Note that KLD is a divergence metric meaning smaller values are better.

This is desirable, as it highlights how our model is learning to use features similar to those used by experts, making it less likely that our model is over-reliant on spurious features.

Figure 4 shows how the learned features of our explanation ensemble model change as training progresses. Note that this figure shows only the most important pixels of each model - when showing the importance of all pixels, the heatmaps become difficult to analyse by eye. In particular, Figure 4 highlights how our training process (i.e. the discriminator and our loss function in Equation 2) encourages the sub-models of our ensemble to learn similar features as training progresses, despite the sub-models starting with vastly different sets of explanations. This verifies that our intuitive understanding of our explanation ensemble architecture, and most importantly our understanding of why it produces explanations closer to experts' EGD, is correct.

Table 2. Test statistics t and p-values for the paired t-test performed between the Explanation Ensembles and Baseline (top) and the Explanation Ensembles and Improved UNet (bottom) models.

Explanation Ensemble vs. Baseline:
Metric | Test Statistic | p-value
KLD | 18.005 | 6.8698 × 10^-34
NSS | −9.9137 | 5.7567 × 10^-17

Explanation Ensemble vs. Improved UNet:
Metric | Test Statistic | p-value
KLD | 14.4617 | 7.5950 × 10^-27
NSS | −5.8058 | 3.5764 × 10^-8
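The statistics in Table 2 come from paired t-tests on the per-image KLD and NSS values of two architectures evaluated on the same test images. With SciPy this reduces to a single call (the function and variable names here are illustrative, not the authors' code):

```python
import numpy as np
from scipy import stats

def paired_t_test(metric_baseline, metric_ours):
    """Paired t-test (H0: mean difference = 0) between per-image metric values
    of two architectures evaluated on the same test images."""
    t_stat, p_value = stats.ttest_rel(np.asarray(metric_baseline), np.asarray(metric_ours))
    return t_stat, p_value
```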

Figure 3. 3 samples from the MIMIC-CXR-EGD dataset, overlaid with the radiologist’s EGD and Grad-CAM explanations from the
baseline, improved UNet and Explanation Ensemble models.

6. Conclusion

Through the use of two explainability techniques and both distribution- and location-based metrics, we have shown that our Explanation Ensemble technique improves upon baseline models in terms of both performance and explanation similarity to EGD on the MIMIC-CXR-EGD dataset. Furthermore, we have shown that the Explanation Ensemble architecture also improves upon the current state-of-the-art models which share learned features with radiologists' EGD. In addition to improving agreement between model explanations and expert EGD, our proposed model architecture also improves classification performance and explanation consistency when compared with current state-of-the-art techniques. Qualitative analysis of our results shows that our proposed architecture is a highly significant improvement upon current models, and whilst we do not claim that our results are yet perfect, they are a huge improvement in what is a very difficult task. Furthermore, unlike the previous state-of-the-art technique [15], our proposed architecture does not require EGD heatmaps during training - due to the cost of collecting EGD (especially in fields such as medicine, where expert knowledge is required), we believe this is a significant advantage over previously proposed methods.

In future work, it would be interesting to perform an in-depth causal analysis of the learned features of our model and compare this with a causal analysis of the learned features of baseline models. The improved performance, increased explanation consistency and improved agreement with expert EGD suggest that our architecture may be learning more causal features than the baseline models, with the baseline models possibly relying more on spurious features. We hypothesise this as one would only expect causal features to be those that are learned consistently across multiple variations of a well-performing model. Furthermore, the increased agreement with expert radiologists (whom one would expect to use causal features in their diagnoses) further supports this conclusion. However, to fully verify this hypothesis, an extensive causal analysis of the trained models, and their learned features, must be undertaken (using techniques such as those used in [20] and [11]), and so we leave this for future work.

Due to its increased similarity with a medical professional's decision-making process, we believe that more trust will be placed in our model by clinicians than in current state-of-the-art techniques. We hope that these results encourage the use of our architecture in other areas of medical practice and other sensitive fields, as well as the release of further datasets similar to MIMIC-CXR-EGD which can facilitate this type of research.

Acknowledgements

This work is supported by grant 25R17P01847 from the European Regional Development Fund and Cievert Ltd.

References

[1] Zakhriya Alhassan, Matthew Watson, David Budgen, Riyad Alshammari, Ali Alessa, and Noura Al Moubayed. Improving current glycated hemoglobin prediction in adults: Use of machine learning algorithms with electronic health records. JMIR Med Inform, 9(5):e25237, May 2021.
[2] Stan Benjamens, Pranavsingh Dhunnoo, and Bertalan Meskó. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. npj Digital Medicine, 3(1):118, Sep 2020.

Figure 4. Average GradCAM values (across the validation split) of each sub-model of our Explanation Ensemble model, as training
progresses. To aid with visualisation, only the most important 50% of pixels are shown. Sub-models start training with vastly different
learned features, and as training progresses our training procedure encourages the sub-models to learn similar features. A fully animated
version of this figure, and code to reproduce it on other models, will be released upon publication.

[3] Bradley Butcher, Vincent S. Huang, Christopher Robinson, Jeremy Reffin, Sema K. Sgaier, Grace Charles, and Novi Quadrianto. Causal datasheet for datasets: An evaluation guide for real-world data analysis and data collection design using Bayesian networks. Frontiers in Artificial Intelligence, 4, 2021.
[4] Zoya Bylinskii, Tilke Judd, Aude Oliva, et al. What do different evaluation metrics tell us about saliency models? IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(3):740–757, 2018.
[5] D. S. Char, M. D. Abràmoff, and C. Feudtner. Identifying ethical considerations for machine learning healthcare applications. Am J Bioeth, 20(11):7–17, 2020.
[6] Xibin Dong, Zhiwen Yu, Wenming Cao, Yifan Shi, and Qianli Ma. A survey on ensemble learning. Frontiers of Computer Science, 14(2):241–258, 2020.
[7] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
[8] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. The Journal of Machine Learning Research, 17(1):2096–2030, 2016.
[9] Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard S. Zemel, Wieland Brendel, Matthias Bethge, and Felix A. Wichmann. Shortcut learning in deep neural networks. CoRR, abs/2004.07780, 2020.
[10] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.
[11] Yash Goyal, Amir Feder, Uri Shalit, and Been Kim. Explaining classifiers with causal concept effect (CaCE). arXiv preprint arXiv:1907.07165, 2019.
[12] Joshua James Hatherley. Limits of trust in medical AI. Journal of Medical Ethics, 46(7):478–481, 2020.
[13] Andreas Holzinger, Chris Biemann, Constantinos S. Pattichis, and Douglas B. Kell. What do we need to build explainable AI systems for the medical domain? CoRR, abs/1712.09923, 2017.
[14] Alistair E. W. Johnson, Tom J. Pollard, Seth J. Berkowitz, Nathaniel R. Greenbaum, Matthew P. Lungren, Chih-ying Deng, Roger G. Mark, and Steven Horng. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data, 6(1):317, Dec 2019.
[15] Alexandros Karargyris, Satyananda Kashyap, Ismini Lourentzou, et al. Creation and validation of a chest x-ray dataset with eye-tracking and report dictation for AI development. Scientific Data, 8(1):92, Mar 2021.
[16] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett, editors, Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pages 4765–4774, 2017.
[17] Michael Roberts, Derek Driggs, Matthew Thorpe, Julian Gilbey, Michael Yeung, Stephan Ursprung, Angelica I. Aviles-Rivero, Christian Etmann, Cathal McCague, Lucian Beer, Jonathan R. Weir-McCall, Zhongzhao Teng, Effrossyni Gkrania-Klotsas, Alessandro Ruggiero, Anna Korhonen, Emily Jefferson, Emmanuel Ako, Georg Langs, Ghassem Gozaliasl, Guang Yang, Helmut Prosch, Jacobus Preller, Jan Stanczuk, Jing Tang, Johannes Hofmanninger, Judith Babar, Lorena Escudero Sánchez, Muhunthan Thillai, Paula Martin Gonzalez, Philip Teare, Xiaoxiang Zhu, Mishal Patel, Conor Cafolla, Hojjat Azadbakht, Joseph Jacob, Josh Lowe, Kang Zhang, Kyle Bradley, Marcel Wassin, Markus Holzer, Kangyu Ji, Maria Delgado Ortet, Tao Ai, Nicholas Walton, Pietro Lio, Samuel Stranks, Tolou Shadbahr, Weizhe Lin, Yunfei Zha, Zhangming Niu, James H. F. Rudd, Evis Sala, Carola-Bibiane Schönlieb, and AIX-COVNET. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nature Machine Intelligence, 3(3):199–217, Mar 2021.
[18] Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 618–626, 2017.
[19] Shyamli Sindhwani, Gregory Minissale, Gerald Weber, Christof Lutteroth, Anthony Lambert, Neal Curtis, and Elizabeth Broadbent. A multidisciplinary study of eye tracking technology for visual intelligence. Education Sciences, 10(8), 2020.
[20] Sumedha Singla, Stephen Wallace, Sofia Triantafillou, and Kayhan Batmanghelich. Using causal analysis for conceptual deep learning explanation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 519–528. Springer, 2021.
[21] Mingxing Tan and Quoc Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR, 2019.
[22] Eric J. Topol. High-performance medicine: the convergence of human and artificial intelligence. Nature Medicine, 25(1):44–56, Jan 2019.
[23] A. van der Gijp, C. J. Ravesloot, H. Jarodzka, M. F. van der Schaaf, I. C. van der Schaaf, J. P. J. van Schaik, and Th. J. Ten Cate. How visual search relates to visual diagnostic performance: a narrative systematic review of eye-tracking research in radiology. Adv. Health Sci. Educ. Theory Pract., 22(3):765–787, Aug 2017.
[24] Alfredo Vellido, Vicent Ribas, Carles Morales, Adolfo Ruiz Sanmartín, and Juan Carlos Ruiz Rodríguez. Machine learning in critical care: state-of-the-art and a sepsis case study. BioMedical Engineering OnLine, 17(1):135, Nov 2018.
[25] Stephen Waite, Arkadij Grigorian, Robert G. Alexander, Stephen L. Macknik, Marisa Carrasco, David J. Heeger, and Susana Martinez-Conde. Analysis of perceptual expertise in radiology – current knowledge and a new perspective. Frontiers in Human Neuroscience, 13, 2019.
[26] Matthew Watson, Bashar Awwad Shiekh Hasan, and Noura Al Moubayed. Agree to disagree: When deep learning models with identical architectures produce distinct explanations. CoRR, abs/2105.06791, 2021.
[27] Yao-Yuan Yang and Kamalika Chaudhuri. Understanding rare spurious correlations in neural networks, 2022.
[28] Erdi Çallı, Ecem Sogancioglu, Bram van Ginneken, Kicky G. van Leeuwen, and Keelin Murphy. Deep learning for chest x-ray analysis: A survey. Medical Image Analysis, 72:102125, 2021.
