An Introduction to Machine Learning Interpretability
An Applied Perspective on Fairness, Accountability, Transparency, and Explainable AI
An Introduction to Machine Learning Interpretability
1. Cynthia Rudin, “Please Stop Explaining Black Box Models for High-Stakes Decisions,” arXiv:1811.10154, 2018, https://arxiv.org/pdf/1811.10154.pdf.
This report introduces machine learning interpretability techniques and discusses predictive modeling and machine learning from an applied perspective, focusing on the common challenges of business adoption, internal model documentation, governance, validation requirements, and external regulatory mandates. We’ll also discuss an applied taxonomy for debugging, explainability, fairness, and interpretability techniques and outline the broad set of available software tools for using these methods. Some general limitations and testing approaches for the outlined techniques are addressed, and finally, a set of open source code examples is presented.
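
As a small preview of that tooling, here is a minimal sketch of one widely used workflow: generating local Shapley value explanations for a gradient boosting model with the open source shap package. The simulated data, model settings, and printed outputs are illustrative assumptions, not the report’s own example code.

# A minimal sketch (an illustrative assumption, not the report's example set)
# of local Shapley value explanations for a tree ensemble using shap.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Simulated binary-outcome data with a handful of numeric inputs
X, y = make_classification(n_samples=1000, n_features=5, random_state=42)
model = GradientBoostingClassifier(random_state=42).fit(X, y)

# TreeExplainer computes per-row Shapley contributions for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Each row's contributions, plus the expected value, reconstruct the model's
# output for that row, giving a local, per-prediction explanation
print(shap_values[0])
print(explainer.expected_value)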
4. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin, “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM (2016): 1135–1144, https://oreil.ly/2OQyGXx.
5. Scott M. Lundberg and Su-In Lee, “A Unified Approach to Interpreting Model Predictions,” in I. Guyon et al., eds., Advances in Neural Information Processing Systems 30 (Red Hook, NY: Curran Associates, Inc., 2017): 4765–4774, https://oreil.ly/2OWsZYf.
6. Daniel W. Apley, “Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models,” arXiv:1612.08468, 2016, https://arxiv.org/pdf/1612.08468.pdf.
7. Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning, Second Edition (New York: Springer, 2009), https://oreil.ly/31FBpoe.
8. Alex Goldstein et al., “Peeking Inside the Black Box: Visualizing Statistical Learning with Plots of Individual Conditional Expectation,” Journal of Computational and Graphical Statistics 24, no. 1 (2015), https://arxiv.org/pdf/1309.6392.pdf.
9. Osbert Bastani, Carolyn Kim, and Hamsa Bastani, “Interpreting Blackbox Models via Model Extraction,” arXiv:1705.08504, 2017, https://arxiv.org/pdf/1705.08504.pdf.
10. Joel Vaughan et al., “Explainable Neural Networks Based on Additive Index Models,” arXiv:1806.01933, 2018, https://arxiv.org/pdf/1806.01933.pdf.
11. Hongyu Yang, Cynthia Rudin, and Margo Seltzer, “Scalable Bayesian Rule Lists,” in Proceedings of the 34th International Conference on Machine Learning (ICML), 2017, https://arxiv.org/pdf/1602.08610.pdf.
12. Berk Ustun and Cynthia Rudin, “Supersparse Linear Integer Models for Optimized Medical Scoring Systems,” Machine Learning 102, no. 3 (2016): 349–391, https://oreil.ly/31CyzjV.
13. Microsoft Interpret GitHub Repository: https://oreil.ly/2z275YJ.
14. Christoph Molnar, Giuseppe Casalicchio, and Bernd Bischl, “Quantifying Interpretability of Arbitrary Machine Learning Models Through Functional Decomposition,” arXiv:1904.03867, 2019, https://arxiv.org/pdf/1904.03867.pdf.
15. Debugging Machine Learning Models: https://debug-ml-iclr2019.github.io.
19. Kevin Eykholt et al., “Robust Physical-World Attacks on Deep Learning Visual Classification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018): 1625–1634, https://oreil.ly/2yX8W11.
20. Patrick Hall, “Proposals for Model Vulnerability and Security,” O’Reilly.com (Ideas), March 20, 2019, https://oreil.ly/308qKm0.
21. Rebecca Wexler, “When a Computer Program Keeps You in Jail,” New York Times, June 13, 2017, https://oreil.ly/2TyHIr5.
Commercial Motivations
Companies and organizations use machine learning and predictive models for a wide variety of revenue- or value-generating applications: facial recognition, lending decisions, hospital release decisions, parole release decisions, and customized recommendations for new products or services, to name just a few. Many principles of applied machine learning are shared across industries, but the practice of machine learning at banks, insurance companies, healthcare providers, and other regulated companies is often quite different from machine learning as conceptualized in popular blogs, the news and technology media, and academia. It’s also somewhat different from the practice of machine learning in the technologically advanced and less regulated digital, ecommerce, FinTech, and internet verticals.
In commercial practice, concerns about machine learning algorithms are often overshadowed by talent acquisition, data engineering, data security, hardened deployment of machine learning apps and systems, managing and monitoring an ever-increasing number of predictive models, modeling process documentation, and regulatory compliance. Successful entities in both traditional enterprise and in the newer digital, ecommerce, FinTech, and internet verticals have learned to balance these competing business interests. Many digital, ecommerce, FinTech, and internet companies, operating outside most regulatory oversight and often with direct access to web-scale (and sometimes unethically sourced) data stores, have made web data and machine learning products central to their business. Larger, more established companies tend to practice statistics, analytics, and data mining at the margins of their business to optimize revenue or the allocation of other valuable assets. For all these reasons, commercial motivations for interpretability vary across industry verticals, but they tend to center around improved margins.
22. Patrick Hall, Wen Phan, and Katie Whitson, The Evolution of Analytics (Sebastopol, CA: O’Reilly Media, 2016), https://oreil.ly/2Z3eBxk.
Regulatory compliance
Interpretable, fair, and transparent models are simply a legal mandate in certain parts of the banking, insurance, and healthcare industries. Because of increased regulatory scrutiny, these more traditional companies typically must use techniques, algorithms, and models that are simple and transparent enough to allow for detailed documentation of internal system mechanisms and in-depth analysis by government regulators. Some major regulatory statutes currently governing these industries include the Civil Rights Acts of 1964 and 1991, the Americans with Disabilities Act, the Genetic Information Nondiscrimination Act, the Health Insurance Portability and Accountability Act, the Equal Credit Opportunity Act (ECOA), the Fair Credit Reporting Act (FCRA), the Fair Housing Act, Federal Reserve SR 11-7, and the European Union (EU) General Data Protection Regulation (GDPR).
24. Andrew Burt, “How Will the GDPR Impact Machine Learning?” O’Reilly.com (Ideas), May 16, 2018, https://oreil.ly/304nxDI.
Reducing risk
No matter what space you operate in as a business, hacking of prediction APIs or other model endpoints and discriminatory model decisions can be costly, both to your reputation and to your bottom line. Interpretable models, model debugging, explanation, and fairness tools can mitigate both of these risks. While direct hacks of machine learning models still appear to be rare, there are numerous documented hacking methods in the machine learning security literature, and several simpler insider attacks can change your model outcomes to benefit a malicious actor or deny service to legitimate customers. You can use explanation and debugging tools in white-hat hacking exercises to assess your vulnerability to adversarial example, membership inference, and model stealing attacks. You can use fair (e.g., learning fair representations, LFR) or private (e.g., private aggregation of teacher ensembles, PATE) models as an active measure to prevent many attacks. Also, real-time disparate impact monitoring can alert you to data poisoning attempts that could make your model’s outcomes discriminatory.
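
For instance, a very simple white-hat exercise is to probe a trained model for adversarial examples by perturbing an input row until the decision flips. The sketch below does this with random perturbations against an illustrative scikit-learn model; the data, model, and perturbation budget are assumptions for demonstration, not a hardened attack or the report’s own code.

# A minimal white-hat sketch (illustrative assumptions throughout): search for
# a small perturbation of a single row that flips a classifier's decision, a
# crude stand-in for more rigorous adversarial example testing.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

row = X[0].copy()
original_pred = model.predict(row.reshape(1, -1))[0]

rng = np.random.default_rng(0)
for _ in range(5000):
    # Try small random perturbations within a tight budget
    candidate = row + rng.normal(scale=0.25, size=row.shape)
    if model.predict(candidate.reshape(1, -1))[0] != original_pred:
        print("Decision flipped by perturbation:", candidate - row)
        break
else:
    print("No flip found within the perturbation budget.")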
27. Marco Barreno et al., “The Security of Machine Learning,” Machine Learning 81, no. 2 (2010): 121–148, https://oreil.ly/31JwoLL.
28. Reza Shokri et al., “Membership Inference Attacks Against Machine Learning Models,” IEEE Symposium on Security and Privacy (SP), 2017, https://oreil.ly/2Z22LHI.
29. Nicholas Papernot, “A Marauder’s Map of Security and Privacy in Machine Learning: An Overview of Current and Future Research Directions for Making Machine Learning Secure and Private,” in Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security, ACM, 2018, https://arxiv.org/pdf/1811.01134.pdf.
32. Patrick Hall, Wen Phan, and SriSatish Ambati, “Ideas on Interpreting Machine Learning,” O’Reilly.com (Ideas), March 15, 2017, https://oreil.ly/2H4aIC8.
33. Riccardo Guidotti et al., “A Survey of Methods for Explaining Black Box Models,” ACM Computing Surveys (CSUR) 51, no. 5 (2018): 93, https://arxiv.org/pdf/1802.01933.pdf.
34. Zachary C. Lipton, “The Mythos of Model Interpretability,” arXiv:1606.03490, 2016, https://arxiv.org/pdf/1606.03490.pdf.
Fairness
As we discussed earlier, fairness is yet another important facet of interpretability, and a necessity for any machine learning project whose outcome will affect humans. Traditional checks for fairness, often called disparate impact analysis, typically include assessing model predictions and errors across sensitive demographic segments.
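
As a concrete illustration, the sketch below computes a basic disparate impact check: the adverse impact ratio of favorable decision rates across two demographic segments, along with error rates per segment. The sample data, group labels, and the four-fifths threshold mentioned in the comments are illustrative assumptions, not the report’s own example.

# A minimal sketch of disparate impact analysis, assuming a DataFrame with
# model decisions, true outcomes, and a sensitive attribute. All names and
# the sample data are hypothetical.
import pandas as pd

scores = pd.DataFrame({
    "segment":  ["a", "a", "a", "a", "b", "b", "b", "b"],
    "decision": [1,   1,   0,   1,   0,   1,   0,   0],
    "actual":   [1,   0,   0,   1,   1,   1,   0,   0],
})

# Rate of favorable decisions per segment
acceptance = scores.groupby("segment")["decision"].mean()

# Adverse impact ratio: protected segment rate / reference segment rate;
# values below roughly 0.8 (the "four-fifths rule") often trigger review
air = acceptance["b"] / acceptance["a"]
print("Acceptance rates:\n", acceptance)
print("Adverse impact ratio (b vs. a):", round(air, 3))

# Misclassification rates per segment, another common fairness check
error = (scores.assign(err=scores["decision"] != scores["actual"])
               .groupby("segment")["err"].mean())
print("Error rates:\n", error)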
39. Reza Shokri et al., “Membership Inference Attacks Against Machine Learning Models,” IEEE Symposium on Security and Privacy (SP), 2017, https://oreil.ly/2Z22LHI.
40. Florian Tramèr et al., “Stealing Machine Learning Models via Prediction APIs,” in 25th USENIX Security Symposium (USENIX Security 16) (2016): 601–618, https://oreil.ly/2z1TDnC.
41. Patrick Hall, “Guidelines for the Responsible and Human-Centered Use of Explainable Machine Learning,” arXiv:1906.03533, 2019, https://arxiv.org/pdf/1906.03533.pdf.
44. Leo Breiman, “Statistical Modeling: The Two Cultures (with Comments and a Rejoinder by the Author),” Statistical Science 16, no. 3 (2001): 199–231, https://oreil.ly/303vbOJ.
Because of the convex nature of the error surface for linear models, there is basically only one best model, given a relatively stable set of inputs and a prediction target. The model associated with the error surface displayed in Figure 1-7 would be said to have strong model locality. Even if we retrained the model on new or updated data, the relative weighting of income versus interest rate would likely remain mostly stable in the pictured error function and its associated linear model. Explanations of how the function made decisions about loan defaults based on those two inputs would probably be stable as well, and so would the results of disparate impact testing.
Figure 1-8 depicts a nonconvex error surface that is representative of the error function for a machine learning function with two inputs (for example, a customer’s income and a customer’s interest rate) and an output, such as the same customer’s probability of defaulting on a loan. This nonconvex error surface, with no obvious global minimum, implies there are many different ways a complex machine learning algorithm could learn to weigh a customer’s income and interest rate to make an accurate decision about when they might default. Each of these different weightings would create a different function for making loan default decisions, and each of these different functions would have different explanations and fairness characteristics! This would likely become especially obvious upon updating the training data and refitting a similar machine learning model.
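
To make the contrast concrete, the sketch below refits a linear model and a gradient boosting model on bootstrap resamples of the same simulated two-input credit data and compares how stable their learned weightings are. The data, models, and the use of permutation importance as a rough proxy for explanations are illustrative assumptions, not the report’s own example.

# A minimal sketch of model locality, assuming simulated data with two
# inputs ("income", "interest_rate") and a binary default outcome.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
income = rng.normal(size=n)
interest_rate = rng.normal(size=n)
# Simulated default probability driven by both inputs
p = 1 / (1 + np.exp(-(1.5 * interest_rate - 1.0 * income)))
y = rng.binomial(1, p)
X = np.column_stack([income, interest_rate])

for seed in range(3):
    # Refit both models on a bootstrap resample of the same data
    idx = rng.integers(0, n, size=n)
    Xb, yb = X[idx], y[idx]

    lin = LogisticRegression().fit(Xb, yb)
    gbm = GradientBoostingClassifier(random_state=seed).fit(Xb, yb)
    imp = permutation_importance(gbm, Xb, yb, n_repeats=5, random_state=seed)

    # Linear coefficients tend to stay nearly identical across refits (strong
    # locality); the GBM's importances may vary more from refit to refit.
    print("refit", seed,
          "linear coefs:", np.round(lin.coef_[0], 3),
          "gbm importances:", np.round(imp.importances_mean, 3))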