Explaining Predictions by Characteristic Rules
{amak2,bostromh,mvaz}@kth.se
1 Introduction
Machine learning algorithms that reach state-of-the-art performance, in domains
such as medicine, biology, and chemistry [6], often produce non-transparent
(black-box) models. However, understanding the rationale behind predictions is,
in many domains, a prerequisite for users to place trust in the models. This
can be achieved by employing algorithms that produce interpretable (white-box)
models, such as decision trees and generalized linear models, but in many cases
at the cost of a substantial loss of predictive performance [30]. As a consequence, ex-
plainable machine learning has gained significant attention as a means to obtain
transparency without sacrificing performance [7].
Explanation techniques are either model-agnostic, i.e., they allow for explain-
ing any underlying black-box model [16], or model-specific, i.e., they exploit
properties of the underlying black-box model to generate the explanations, tar-
geting e.g., random forests [13,14] or deep neural networks [19,20]. Along another
dimension, the explanation techniques can be divided into local and global ap-
proaches [7]. Local approaches, such as LIME [1] and SHAP [2], aim to explain
a single prediction of a black-box model [1,2], while global approaches, such as
SP-LIME [1] and MAME [9], aim to provide an understanding of how the model
behaves in general [7]. Many explanation techniques produce explanations in the
form of (additive) feature importance scores. Such explanations do not, however,
directly lend themselves to verification, due to the lack of an established, general,
and objective way of determining whether the scores (or the rankings they impose)
are correct [29]. In contrast, some techniques, such as Anchors [3],
produce explanations in the form of rules. Since each such rule can be used to
make predictions, the agreement (fidelity) of the rule to the underlying black-box
model can be measured, e.g., using independent test instances. However, in some
cases, the produced rules may be very specific [8], which in practice precludes
verification due to the limited coverage of the rules.
Setzu et al. proposed GLocalX [10] as a solution to the above problem, by
which multiple local explanations (rules) are merged into fewer, more general
(global) rules. Similar to all local explanation techniques that output rules, GLo-
calX produces discriminative rules, which, according to Fürnkranz [17], provide
a quick and easy way to distinguish one class from the others using a small num-
ber of features. Characteristic rules, on the other hand, capture properties that
are common for objects belonging to a specific class, rather than highlighting
the differences (only) between objects belonging to different classes. See Figure 1
for an illustration of discriminative and characteristic rules. Although most rule
learning approaches have targeted the former type of rule, a few approaches for
characteristic rule learning have also been developed [17,21]. Characteristic rules
could hence be a potentially useful format also for explanations (cf. [17, p. 871]).
Until now, however, the use of characteristic rules for explaining predictions has,
to the best of our knowledge, not been considered.
Fig. 1: Discriminative rules distinguish one class from the others using a few
features, while characteristic rules capture all the features that characterize the
class.

The main contributions of this study are:

– CEGA, a method that aggregates local explanations into general characteristic
rules and is agnostic to the local explanation technique;
– a large-scale empirical evaluation in which CEGA is compared to Anchors and
GLocalX with respect to fidelity, number of rules, and coverage;
– an ablation study of CEGA in which the format of the rules is changed from
characteristic to standard discriminative rules and in which three different
options for the local explanation technique are considered: LIME, SHAP, and
Anchors.
2 Related Work
In this section, we start out by discussing model-agnostic explanation techniques
that work with any algorithm for generating the black-box model. We then
continue with model-specific explanation techniques that are explicitly designed
for certain underlying models. We also cover some related work on rule learning.
Explainable machine learning is a research area that has gained quite some
attention recently, in particular following the introduction of the popular lo-
cal explanation technique LIME (Local Interpretable Model-agnostic Explana-
tions) [1]. In addition to the original algorithm, a variant called SP (Submodu-
lar Pick) LIME was proposed, which produces more general explanations. LIME
trains a white-box model using perturbed instances, which are weighted by prox-
imity to the instance of interest. The trained white-box model acts as a faithful
explainer locally but not globally. SP-LIME, on the other hand, uses a set of
representative instances to generate explanations with high coverage of the model's behavior.
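For concreteness, a typical invocation of LIME on tabular data is sketched below; the synthetic data, the random forest standing in for a black-box model, and all names are illustrative assumptions rather than part of the original setup.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-ins for a dataset and a black-box model (illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
model = RandomForestClassifier().fit(X_train, y_train)

explainer = LimeTabularExplainer(X_train,
                                 feature_names=[f"f{i}" for i in range(5)],
                                 class_names=["bad", "good"],
                                 discretize_continuous=True)

# Explain a single prediction; the output is a list of (condition, weight) pairs.
explanation = explainer.explain_instance(X_train[0], model.predict_proba,
                                         num_features=5)
print(explanation.as_list())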
Algorithm 1 CEGA
1: Input: A set of objects X, a local explainer L, a black-box model B, a confidence
   threshold c, a minimum rule support s, and class labels {C1, . . . , Ck}
2: E ← Generate-Explanation-Itemsets(X, L, B)
3: R ← Find-Association-Rules(E, s, c)
4: R ← Filter-Rules(R, {C1, . . . , Ck})
5: Output: General characteristic rules R
Within the Generate-Explanation-Itemsets function, each local explanation is
converted into an itemset containing the features that support the predicted
class, together with the predicted class label. Categorical features are binarized
using one-hot encoding, and the binarized feature names that appear in an
explanation are added to the itemset (e.g., feature_A_V, where feature_A is
the name and V is the value); the same preprocessing step is applied to the data
at prediction time. Continuous features are discretized using equal-width binning
with five bins. As a side note, we extract more than
one explanatory itemset per example, one per class, where each feature is added
to an itemset with the class label it supports. Therefore, we obtain the same
number of itemsets per class, which is particularly useful for highly imbalanced
datasets to avoid extracting explanations solely for the dominant class. One final
preprocessing step, for binary classification tasks only, is to add the binarized
categorical feature to the itemset of the opposite class if the feature value is zero,
which is motivated by the explainer associating the absence of one feature with
the predicted class. Consequently, the presence of the same feature will favor
the other class. This preprocessing step may hence result in multiple values
of a categorical feature appearing in characteristic explanations for the same
class. In this study, we will consider binary classification problems only; there-
fore, the ranking of the features is with respect to the predicted class label. To
avoid including features with negligible effect on a prediction, a threshold will
be employed to filter out low-ranking features, which also reduces the computa-
tional cost when later performing association rule mining on the corresponding
itemsets. Now the explanation itemsets can be used as input to the association
rule mining algorithm.
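A minimal sketch of this preprocessing is given below, assuming feature-score explanations (such as those produced by SHAP), a binary task, and the 0.01 threshold used in Section 4; the function and variable names are ours, and the special handling of zero-valued binarized features described above is omitted.

from typing import Dict, List

def explanation_to_itemsets(scores: Dict[str, float],
                            predicted: str,
                            other: str,
                            threshold: float = 0.01) -> Dict[str, List[str]]:
    # scores maps binarized/discretized feature names (e.g., "feature_A_V")
    # to their importance for the predicted class; positive scores support
    # the predicted class, negative scores support the other class.
    itemsets: Dict[str, List[str]] = {predicted: [], other: []}
    for item, score in scores.items():
        if abs(score) < threshold:   # drop features with negligible effect
            continue
        label = predicted if score > 0 else other
        itemsets[label].append(item)
    return itemsets

# Example: one itemset per class is produced for every explained instance.
itemsets = explanation_to_itemsets(
    {"savings_high": 0.31, "housing_rent": -0.12, "job_skilled": 0.004},
    predicted="good", other="bad")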
An association rule mining algorithm is applied to the explanation itemsets
within the Find-Association-Rules function, using the specified confidence and
support thresholds. It should be noted that CEGA is again agnostic to which
algorithm is used for conducting this step.
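For illustration, this step could be carried out with any off-the-shelf implementation; the sketch below uses the Apriori implementation in the mlxtend library (our choice for illustration, with placeholder transactions and thresholds).

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each transaction is one explanation itemset extended with its class label item.
transactions = [
    ["savings_high", "housing_own", "class=good"],
    ["savings_high", "job_skilled", "class=good"],
    ["credit_history_critical", "class=bad"],
]

encoder = TransactionEncoder()
onehot = pd.DataFrame(encoder.fit(transactions).transform(transactions),
                      columns=encoder.columns_)

# The minimum support and confidence correspond to CEGA's s and c thresholds.
frequent = apriori(onehot, min_support=0.3, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.8)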
Using the Filter-Rules function, CEGA aims to find a set of rules that char-
acterize each class. The characteristic rules can be obtained from the set of
discovered association rules by keeping only the rules in which a single class
label appears in the antecedent (condition part) and some set of conditions
appears in the consequent. Moreover, to simplify the resulting set of rules, they
are processed in the following way: if two rules have the same class label, the
conditions of one rule are a subset of the conditions of the other, and the former
rule has higher confidence, then the latter rule is removed. This last step is
motivated by the fact that it is likely to reduce the complexity and the number
of conditions of the resulting rules while increasing the coverage of the resulting
rule set. As discussed in [17], this does not, however, necessarily mean that the
resulting rules are indeed more interpretable.
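A minimal sketch of this pruning step is shown below, assuming a rule is represented by its class label, a set of conditions, and a confidence value; the names are ours and not necessarily those used in the CEGA implementation.

from typing import FrozenSet, List, NamedTuple

class Rule(NamedTuple):
    label: str                  # class label (antecedent of a characteristic rule)
    conditions: FrozenSet[str]  # conditions (consequent of a characteristic rule)
    confidence: float

def prune_rules(rules: List[Rule]) -> List[Rule]:
    # A rule is dropped if a more general rule (same class label, proper
    # subset of its conditions) has higher confidence.
    kept = []
    for r in rules:
        subsumed = any(o.label == r.label
                       and o.conditions < r.conditions
                       and o.confidence > r.confidence
                       for o in rules)
        if not subsumed:
            kept.append(r)
    return kept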
While the confidence of a discriminative rule corresponds to the probability of the
class given the conditions (Equation 1), the confidence of a characteristic rule
corresponds to the probability that an object satisfies the conditions given that it
belongs to the class (Equation 2). It should be noted that by just a slight modification
of the function Filter-Rules, concerning whether the conditions and the label should
appear in the antecedent or the consequent (or vice versa), CEGA will produce
discriminative instead of characteristic rules.
confidence(Conditions → Class) = P(Class | Conditions) = P(Conditions, Class) / P(Conditions)    (1)

confidence(Class → Conditions) = P(Conditions | Class) = P(Conditions, Class) / P(Class)    (2)
The confidence in Equation 2 for a characteristic rule measures how characteristic a
set of features is for a given class. For example, a rule with 100% confidence means
that the items (features) are shared by all objects of the class.
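As a purely hypothetical numerical illustration: suppose that 200 explanation itemsets
carry the label good and that 150 of them contain the item savings_high, which occurs
in 250 itemsets overall. The characteristic rule good → savings_high then has confidence
150/200 = 0.75 (Equation 2), whereas the corresponding discriminative rule
savings_high → good has confidence 150/250 = 0.6 (Equation 1).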
4 Empirical Evaluation
In this section, we compare CEGA to Anchors and GLocalX, two state-of-the-art ap-
proaches for explaining predictions with discriminative rules. The methods will be
compared with respect to fidelity (agreement with the explained black-box model) and
complexity (number of rules). We then conduct an ablation study, in which we change
the rule format of CEGA from characteristic to discriminative rules (as explained in
the previous section) and also the technique used to generate the local explanations.
Each dataset was split into a training set, a development set, and a test set, with the
rules evaluated on the third set. All datasets concern binary classification tasks except
for Compas², which originally contains three classes (Low, Medium, and High); these
were reduced to two by merging Low and Medium into one class. The black-box models are generated
by XGBoost [11]. Some of its hyperparameters (learning rate, number of estimators,
and the regularization parameter lambda) were tuned by grid search using 5-fold cross-
validation on the training set. CEGA requires two additional hyperparameters: support
and confidence. The former was set to 10 in the case of LIME and SHAP and to 4 in
the case of Anchors, while the confidence of the characteristic rules was tuned based
on the fidelity as measured on the development set. In the second experiment, where
CEGA was also used to produce discriminative rules, the confidence was set to 100%
for this rule type to keep the number of generated rules within a reasonable limit.
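A sketch of this kind of tuning is shown below; the parameter grid is illustrative only,
since the exact candidate values are not specified here, and X_train, y_train stand for
the training split.

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Illustrative grid over the hyperparameters mentioned above; the values
# searched in the actual experiments are not stated in the text.
param_grid = {
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 300, 500],
    "reg_lambda": [0.1, 1.0, 10.0],
}

search = GridSearchCV(XGBClassifier(), param_grid, cv=5)
# search.fit(X_train, y_train)  # X_train, y_train: the training split (placeholders)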
In the first experiment, Anchors, GLocalX, and CEGA are compared, where the
latter two use Anchors as the local explanation technique. Anchors is used with the
default hyperparameters, and the confidence threshold has been set to 0.9. GLocalX
is also tested with the default values, except for the alpha hyperparameter, for which
values between 50 and 95 were tested in steps of 5, and the best result is reported.
In a second experiment, we also consider using SHAP and LIME as local explanation
techniques for CEGA. As described in Section 3.1, the output of these techniques
requires some preprocessing to turn it into itemsets; the threshold to exclude low-
ranking features has here been set to 0.01. In all experiments, CEGA employs the
Apriori algorithm [4] for association rule mining³.
² https://github.com/propublica/compas-analysis
³ CEGA is available at: https://github.com/amrmalkhatib/CEGA
Table 1: Fidelity, number of rules and coverage for Anchors, GLocalX and CEGA.
Fig. 2: Comparing average ranks with respect to fidelity measured by AUC (lower
rank is better) of Anchors, GLocalX, and CEGA, where the critical difference
(CD) represents the largest difference that is not statistically significant.
Fig. 3: Comparing average ranks of Anchors, GLocalX, and CEGA with respect
to the number of rules.
Table 2: The top 11 characteristic rules output by CEGA for the German Credit
dataset when using SHAP as the local explainer.
Table 3: The top 11 discriminative rules output by CEGA for the German Credit
dataset when using SHAP as the local explainer.
5 Concluding Remarks
We have proposed CEGA, a method to aggregate local explanations into general char-
acteristic explanatory rules. CEGA is agnostic to the local explanation technique and
can work with local explanations in the form of rules or feature scores, given that they
are properly converted to itemsets. We have presented results from a large-scale em-
pirical evaluation, comparing CEGA to Anchors and GLocalX, with respect to three
fidelity metrics (accuracy, AUC and F1 score), number of rules and coverage. CEGA
was observed to significantly outperform GLocalX with respect to fidelity and Anchors
with respect to the number of generated rules. We also investigated changing the rule
format of CEGA to discriminative rules and using SHAP, LIME, or Anchors as the
local explanation technique. The main conclusion of the former investigation is that
the rule format indeed has a significant effect; the characteristic rules result in higher
fidelity and fewer rules than the discriminative rules. The results from the second
investigation showed that CEGA combined with either SHAP or Anchors generally
results in rules with higher fidelity than when using LIME as the local explanation
technique.
One direction for future work would be to complement the functionally-grounded
(objective) evaluation of the quality of the explanations with user-grounded evalua-
tions, e.g., asking users to indicate whether they actually can follow the logic behind
the predictions or solve some tasks using the output rules.
Another direction for future work concerns investigating additional ways of form-
ing itemsets from which general (characteristic or discriminative) rules are formed.
This could for example include combining the output of multiple local explanation
techniques. Another important direction concerns quantifying the uncertainty of the
generated rules, capturing to what extent one can expect a rule to agree with the
output of a black-box model. The investigated applications may also include datasets
beyond regular tabular data, e.g., text documents and images.
Acknowledgement This work was partially supported by the Wallenberg AI, Au-
tonomous Systems and Software Program (WASP) funded by the Knut and Alice Wal-
lenberg Foundation. HB was partly funded by the Swedish Foundation for Strategic
Research (CDA, grant no. BD15-0006).
References
1. Ribeiro, M., Singh, S., Guestrin, C.: "Why Should I Trust You?": Explaining the
Predictions of Any Classifier. Proceedings Of The 22nd ACM SIGKDD Interna-
tional Conference On Knowledge Discovery And Data Mining, San Francisco, CA,
USA, August 13-17, 2016. pp. 1135-1144 (2016)
2. Lundberg, S., Lee, S.: A Unified Approach to Interpreting Model Predictions. Ad-
vances In Neural Information Processing Systems 30. pp. 4765-4774 (2017)
3. Ribeiro, M., Singh, S., Guestrin, C.: Anchors: High-Precision Model-Agnostic Ex-
planations. AAAI Conference On Artificial Intelligence (AAAI). (2018)
4. Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules in Large
Databases. Proceedings Of The 20th International Conference On Very Large
Data Bases. pp. 487-499 (1994)
5. Kohavi, R., Becker, B., Sommerfield, D.: Improving Simple Bayes. European Con-
ference On Machine Learning. (1997)
25. Deng, H.: Interpreting tree ensembles with inTrees. International Journal Of Data
Science And Analytics. 7 pp. 277-287 (2018)
26. Friedman, M.: A Correction: The use of ranks to avoid the assumption of nor-
mality implicit in the analysis of variance. Journal Of The American Statistical
Association. 34, 109-109 (1939)
27. Nemenyi, P.: Distribution-free multiple comparisons. (Princeton University,1963)
28. Wilcoxon, F.: Individual comparisons by ranking methods. Biometrics Bulletin.
1(6), 80-83 (1945), http://www.jstor.org/stable/3001968
29. Slack, D., Hilgard, S., Jia, E., Singh, S., Lakkaraju, H.: Fooling LIME and SHAP:
Adversarial Attacks on Post hoc Explanation Methods. AAAI/ACM Conference
On AI, Ethics, And Society (AIES). (2020)
30. Loyola-González, O.: Black-Box vs. White-Box: Understanding Their Advantages
and Weaknesses From a Practical Point of View. IEEE Access. 7 pp. 154096-
154113 (2019)
31. Fürnkranz, J., Gamberger, D., Lavrac, N.: Foundations of Rule Learning.
(Springer, 2012), http://dx.doi.org/10.1007/978-3-540-75197-7
32. Michalski, R.: A theory and methodology of inductive learning. Artificial
Intelligence. 20, 111-161 (1983),
https://www.sciencedirect.com/science/article/pii/0004370283900164