9 Evaluating the use of artificial intelligence and big data in policy making
Unpacking black boxes and testing white boxes
Frans L. Leeuw

Background
Given the widespread diffusion of artificial intelligence/big data (AI/BD)1—recently conceptualized as 'the newest system technology' and compared in
magnitude with the introduction of the steam engine, electricity, and the com-
puter (WRR, 2021)—it is important that evaluators address the question of how
to evaluate the expected, unexpected, and adverse effects of the use of AI/BD
when designing, developing, and implementing interventions (of any kind): pol-
icies, programmes, regulation, therapies, drugs, and others.
Indicators of the societal role AI and BD play are manifold. They include
mobile health and the Quantified Self movement, legal scholars and practitioners
using machine learning in analyzing and drafting legal texts, law enforcement
agencies predicting crime rates and patterns, medical professionals diagnosing
and developing therapies and drugs, and policymakers designing and implement-
ing programmes and policies (Dwivedi et al., 2021; Leeuw, 2021; Rajkomar et
al., 2019; York & Bamberger, 2020; Zuiderwijk et al., 2021). Added to this are the recently introduced artificial intelligence chatbots from big tech companies, such as ChatGPT.
At the same time, it is known that there are various problematic, complicated,
and probably adverse effects of living in a 'society of algorithms' (Burrell &
Fourcade, 2021). Examples are the bias problem (data collection only or largely
includes subjects using apps, smartphones, or desktops), the legacy problem
(working with ‘old’, sometimes biased data), the big data hubris problem (the
belief that BD is a substitute for everything else, making theories and causal
analysis obsolete), validity issues (Lazer et al., 2021), and the lack of transpar-
ency and explainability of what is happening when AI/BD are applied in devel-
oping and implementing interventions. Often this lack of transparency is related
to the black-box problem of AI/BD—the focus of this essay.
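
To see how quickly opacity arises even in a small model, consider the following minimal sketch (assuming scikit-learn; the data are synthetic and the numbers purely illustrative): the trained ensemble consists of a hundred trees whose combined split rules resist direct reading, while a post-hoc probe such as permutation importance indicates which inputs matter but not how they interact.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic data and model, for illustration only.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

print(len(model.estimators_))  # 100 trees: thousands of split rules in total
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print(result.importances_mean.round(3))  # which inputs matter, but not how they interact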
Evaluating black boxes is familiar territory for evaluators; they attempt to
uncover and test assumptions about contexts, mechanisms, and outcomes of
activities (Astbury & Leeuw, 2010; Leeuw, 2020; Lemire et al., 2020; Nielsen et al., 2021; Pawson, 2013). However, rarely have realist, theory-driven evalu-
ators applied this approach to black boxes when BD/AI are involved in domains
including health, law enforcement, labor, and education, among others.2 This
essay attempts to outline an approach to do that.

Unpacking black boxes of AI/BD

The AI black box3 refers to the phenomenon that with most AI and AI-based
tools one does not know how they do what they do. This problem has several
characteristics:

• Although AI-based tools are often clear about the input (the question or data the AI tool starts with), and that usually also applies to the output (the answer), it often remains unclear how the input is turned into the output and what the respective roles of algorithms and (big) data are. As
Price and Rai (2021) mention, ‘even when human field experts are given full
access to the learning algorithm, training data, training process, and resulting
model, the models can be difficult to parse because they are often complex
and nonintuitive’ (p. 779). They distinguish between two related layers of
what is called opacity: ‘the opacity of the system being studied, and the opac-
ity of the research tool (machine learning) being deployed to study it’ (p.
778).
• Another characteristic, next to opacity and the related lack of explainability of AI/BD, is plasticity, which means that the algorithms change in response to new data (a minimal code illustration follows this list). Price (2018) mentions, for the field of medicine, that 'this form of frequent updating is relatively common in software but relatively rare in the context of other medical interventions, such as drugs, that are identified, verified, and then used for years, decades, or even centuries' (p. 1).
• The third characteristic is that the different actors engaged in using AI/BD in organizations like hospitals, governments, and companies often have different levels of practical expertise and experience. Some clinicians, poli-
cymakers, administrators, or managers are more ‘into’ AI and BD than oth-
ers. Probably this also applies to auditors, inspectors, and other oversight
officials. Ranerup and Henriksen (2020, p. 1) studied the introduction of
robotic process automation (RPA) into the world of (governmental) social
services and what it did to the civil servants. Apart from positive effects,
they find ‘that a human–technology hybrid actor [RPA] redefines social
assistance practices. Simplifications are needed to unpack the automated
decision-making process because of the technological and theoretical
complexities’.

Black boxes are not a ‘given’; they can be unpacked and made into white boxes,
that next need (external) validation (or testing). The question is: Does working
Artificial intelligence and big data in policy making 85

with AI/BD contribute to the effectiveness of interventions? The next section


offers a six-step approach to help find that out.

A six-step approach to unpack AI/BD black boxes into white boxes and test them

This section discusses several steps evaluators could follow when they start to think about unpacking black boxes in the world of artificial intelligence.

Step 1

The first step is to specify the goals or the contributions to be achieved when applying AI/BD in designing, developing, and implementing interventions (see Table 9.1).

Table 9.1 Examples of goals and contributions to be achieved

Digital alternative dispute resolution: An 'intelligent agent' is the AI application. These agents can be a tool for the adjudicator (reviewing all documents; researching similar cases) or they can be the arbitrator. The goal is to increase efficiency, trust, and effectiveness in and of the dispute resolution.

High-performance medicine: Algorithms are used (often together with human intelligence) with the goal to detect pneumonia, read medical scans, diagnose skin cancer, and check eye conditions (Topol, 2019).

Education: Computerized adaptive tests are a form of machine-learned assessment, used with the goal to optimize summative assessments in high-stakes selection processes.

Insurance: AI is used for more effective and efficient illness and disability claim prediction and for fraud detection.

Step 2

This step concerns the identification of assumptions that underlie the processes
when interventions are designed, developed, and implemented using AI/BD.
Assumptions are sometimes ‘hidden’ (Pawson, 2008), which means that they
have to be articulated. Bennett Moses and Chan (2018) did that for predictive
policing, Mitchell et al. (2021) for ‘algorithmic fairness’, Kempeneer (2021) for
the ‘big data state of mind’, and Domingos (2015) for theory-families (‘tribes’)
existing in AI.4 This step tells us at a minimum that—contrary to the big data hubris claim—theories are important, and they can and will differ (as do the criteria that can be used to 'judge' or 'test' them). Put differently: AI/BD black boxes have to become white boxes. In order to do that, a framework is needed that specifies dimensions of AI/BD and their use. Based on earlier studies, Figure 9.1 visualizes such a framework. It combines the context-mechanism-outcome model from realist evaluations with components and characteristics of the data and algorithms and the types of machine learning applied.

The central piece of the framework is the use of AI/BD in decision making and its (assumed) contribution to the effectiveness of interventions. A core assumption is that the more AI/BD plays a role in these processes, the more likely the interventions are to be effective, compared with situations in which AI/BD plays no such role. This is believed to happen because the design, production, and implementation processes are:

• free from human failures, including fatigue, cognitive and computational restrictions, and personal biases
• always up to date (using diverse types of data, including real-time data)
• precise in their focus
• free from implementation failures

[Figure 9.1 Framework for specifying dimensions of AI/BD and their use. At the centre: (evaluating the) use of BD/AI in policies/programmes/interventions. Around it, three clusters of assumptions: (1) assumptions regarding technical characteristics and components of big data/analytics/AI and types of machine learning; (2) assumptions regarding contexts and their characteristics: accountability, explainability, transparency, privacy, security, fairness and trust of BD/AI, risks of biases; (3) assumptions regarding the working/impact of BD/AI on behavior/decision making, including its mechanisms and outcomes/consequences.]

Circle 1 includes assumptions on the type, quality, and relevance of data and its data ecosystem, including the types of machine learning used (like reinforcement learning, deep learning, and so on). Examples of these types of assumptions include:

1. If data are used, they will accurately reflect reality.
2. The future is like the past (or: the best predictor of the future is the past).
3. Data analytics does not unjustly discriminate in terms of gender (for example).
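
Assumptions of this kind can be operationalized and checked. The sketch below (a hypothetical example, assuming pandas and invented column names `gender`, `y_true`, and `y_pred`) shows how an evaluator might probe assumption 3 by comparing selection rates and error rates across groups:

```python
import pandas as pd

def group_fairness_report(df: pd.DataFrame, group_col: str) -> pd.DataFrame:
    """Compare selection rates and error rates across groups.

    Column names `y_true` (observed outcome) and `y_pred` (model decision)
    are hypothetical and assumed to be 0/1 encoded.
    """
    rows = {}
    for name, g in df.groupby(group_col):
        rows[name] = {
            "n": len(g),
            "selection_rate": g["y_pred"].mean(),  # demographic-parity check
            "false_positive_rate": g.loc[g["y_true"] == 0, "y_pred"].mean(),
            "false_negative_rate": 1.0 - g.loc[g["y_true"] == 1, "y_pred"].mean(),
        }
    return pd.DataFrame.from_dict(rows, orient="index")

# Invented toy data, for illustration only.
decisions = pd.DataFrame({
    "gender": ["f", "f", "f", "m", "m", "m"],
    "y_true": [1, 0, 1, 1, 0, 0],
    "y_pred": [1, 0, 0, 1, 1, 0],
})
report = group_fairness_report(decisions, "gender")
ratio = report["selection_rate"].min() / report["selection_rate"].max()
print(report)
print(f"selection-rate ratio across groups: {ratio:.2f}")
```

A large gap between groups does not settle the fairness question by itself—Mitchell et al. (2021) show that fairness admits competing definitions—but it turns an implicit assumption into an inspectable quantity.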

Circle 2 regards assumptions on the (societal) contexts in which AI/BD is used when interventions are designed, developed, and implemented. Pawson et al. (2005) suggest that contextual factors must be considered at four different levels: individual capacities of the key actors and stakeholders, such as interests and attitudes; interpersonal relationships required to support the intervention, such as lines of communication; institutional settings in which the intervention is implemented, such as the culture of organizations5; and the wider (infra-)structural systems of a society. An example is given by Price (2018), who distinguishes between working with AI in medicine 'in high-resource contexts, such as academic medical centers versus in low-resource settings such as community health centers or rural providers in less-developed countries'.
Examples of these types of assumptions include:

4. If the data point to implementing certain (policy) actions, then it is assumed that 'perfect implementation' will take place.
5. If the goal is to realize social acceptability (and impact), attention must be paid to transparency and accountability of the working processes, fairness, explainability, nonmaleficence, responsibility, security, privacy, reliability, and trust. This also requires putting human values at the core of AI systems.
6. If this goal is to be achieved, organizations should implement general (information technology) controls to ensure that their IT systems are reliable and ethically sound.

Circle 3 looks ‘inside’ the operations when algorithms and big data are used
and addresses assumptions underlying these operations. They include assump-
tions on the actors and their behavior that work with AI/BD and how they ‘deal’
with the challenges attached to issues like opacity and plasticity: Who are the
actors? What are their perspectives on, knowledge of, and attitudes toward AI/
BD? Which stakes are involved? The evaluator also has to look for indicators of
impact/behavioral and social consequences, respective of the costs and benefits.6
Pedersen and Johansen (2020, p. 520 ff) introduced the concept of behavioral
AI (BAI). Studying BAI is believed to open up the ‘link between AI-behavior
and AI-inference by describing how to study AI behavior’. Of particular impor-
tance in BAI are:

• The relation—similarities and differences—between human cognition and algorithmic processing
• The relation between human learning and algorithmic (machine) learning
• The process of inferring knowledge from data, thus arriving at valid and
reliable judgments, made by an AI system compared to how humans make
judgments.7

Examples of these types of assumptions include:

7. If one works with AI/algorithms, then (sometimes) it is assumed that the decision making will be far more efficient and fair than humans could ever achieve.
8. If the human factor is eradicated from decision making, the one-sided focus on efficiency—and the use of computational analyses for control, surveillance, and prevention—could lead to a more critical attitude toward assumption number 7.
9. Algorithms are (always) neutral.
10. Algorithms (often) have a serious degree of plasticity, changing in response to new data (that is, frequent updating).

Step 3

The assumptions that have been surfaced are now a step closer to a white box,
but that does not guarantee validity or truth; they need—as is always the case
with (small-t and capital-T) theories—to be tested. Questions from a realist eval-
uator’s perspective include:

• How relevant, valid, and reliable are these assumptions?
• Are they valid and reliable in a general way (that is, for 'all' BD/AI working processes) or only given certain contexts?
• Do they specify and articulate which mechanisms are involved in BD/AI
operations in practice?
• Are the outcomes readily available or directly operational, measurable, and
explainable?

Step 4

Assessing the validity of the (articulated) assumptions can be done first by using
existing empirical evidence from (interdisciplinary) research in which similar or
look-alike AI/BD tools and cases are investigated. These studies can be found
in the social and behavioral sciences (like Bennett Moses & Chan, 2018), in
behavioral computer sciences, and in computational social sciences and stud-
ies dealing with machine–human interactions (Lazer et al., 2021; Bowser et al.,
2021). They can help by transferring that knowledge to one or more look-alike
AI/BD cases to help make predictions about the probable validity of the AI/BD
used. Sometimes this approach is called 'subsuming interventions or cases under general theories' (Leeuw, 2012; Pawson, 2002a, 2002b); Foy et al. (2011, p. 454) frame it as 'generalization through theory'.

Step 5

Predicting the probable validity of a white box based on existing research on look-alikes is oftentimes not enough. New, primary research is needed. In the literature, a distinction is made between in silico studies—evaluating the algorithms and data operations as such (that is, operating within and between computers)—and behavioral evaluations—evaluating the implementation and contributions of the algorithms and big data in practical, real-life situations. We do not discuss in silico evaluations here and focus only on the second type of evaluations.
Focusing on the world of medicine, Price (2018) described three activities
that need to be done in such an evaluation.

• Step 5.1. The first activity is ‘ensuring that algorithms are developed accord-
ing to well-vetted techniques and trained on high-quality data’ (p. 2).
• Step 5.2. The second concerns reliability: ‘demonstrating that an algorithm
reliably finds patterns in data. This type of validation depends on what the
algorithm is trying to do. Some algorithms are trained to measure what we
already know about the world, just more quickly, cheaply, or accurately than
current methods… Showing that this type of algorithm performs at the desired
level is relatively straightforward… Other algorithms optimize based purely
on patient data and self-feedback without developers providing a “correct”
answer, such as an insulin pump programme that measures patient response
to insulin and self-adjusts over time. This type of algorithm cannot be vali-
dated with test datasets’ (p. 2).
• Step 5.3. The third activity ‘applies to all sorts of black-box algorithms:
they should be continuously validated by tracking successes and failures as
they are actually implemented in health-care settings’ (Price, 2018, p. 2).
For performance one can also read: impact or effects of the AI-based inter-
vention when dealing with patients/clients in real life. Park and Han (2018,
pp. 806–807) add this: ‘With a computerized decision-support system such
as artificial intelligence, not only its technical analytic capability but also
the way in which the computerized results are presented to, interpreted by,
and acted on by human practitioners in the clinical workflow could affect
the ultimate usefulness of the computerized algorithm'. They suggest using
randomized controlled trials to sort this out, but they are also open to other
designs. Vijayakumar and Cheung (2021) add that checking replicability of
AI/machine learning-based results is strongly recommended. An application
to the world outside medicine is presented by Choenni et al. (2021) for the
field of using AI/BD in smart cities.8
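
For Step 5.3, continuous validation can be as simple as keeping a rolling record of predictions against subsequently observed outcomes and raising a flag when performance degrades. A minimal sketch (the window size and accuracy threshold are illustrative assumptions, not prescriptions from the literature cited above):

```python
from collections import deque

class PerformanceMonitor:
    """Track an algorithm's rolling hit rate as real-world outcomes come in."""

    def __init__(self, window: int = 100, min_accuracy: float = 0.8):
        self.outcomes = deque(maxlen=window)  # rolling record of hits/misses
        self.min_accuracy = min_accuracy

    def record(self, predicted: int, observed: int) -> None:
        self.outcomes.append(predicted == observed)

    def rolling_accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else float("nan")

    def alert(self) -> bool:
        """True once a full window of outcomes falls below the agreed threshold."""
        return (len(self.outcomes) == self.outcomes.maxlen
                and self.rolling_accuracy() < self.min_accuracy)

monitor = PerformanceMonitor(window=50, min_accuracy=0.85)
# In practice, `predicted` comes from the live system and `observed` from
# follow-up data collection; each pair is fed in as it arrives:
# monitor.record(predicted=1, observed=0)
# if monitor.alert(): escalate to human review and re-validation.
```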

Step 6

This step concerns the transfer of the findings to experts, other professionals, and
society at large. The goal is to inform parties and society about the validity of the
approach, which is intended to help explain how BD/AI has been applied in the
process, show the transparency of that process, and increase its social acceptance.

Conclusions
This essay outlined the relevance of thinking in line with realist (theory-driven)
evaluations to unpack and test AI/BD black boxes. It included a six-step
approach. Because human–machine interaction is involved—together with a
continuous flow of data, plasticity of algorithms, and different types of machine
learning—this is not an easy task.
If the statement ‘practice makes perfect’ is correct, then that is the way to go.
This should include learning from what is already happening in other worlds,
like in medicine.
All this may and probably will help increase the relevance of evaluating AI/
BD-driven interventions and policies and contribute to an effective, ethical, and
socially acceptable ‘Algorithmic Society’.

Notes
1 For readers not familiar with the concepts of big data, AI, and machine learning, see Janev (2020), www.linkedin.com/pulse/intelligent-things-its-all-machine-learning-roger-attick/ and www.zendesk.com/blog/machine-learning-and-deep-learning/.
2 See Bamberger (2016); York & Bamberger (2020); https://datapopalliance.org/lwl-27-the-role-of-big-data-and-ai-in-monitoring-and-evaluation-me/; and Rathinam et al. (2020).
3 Sometimes one refers to ‘alchemy’ or ‘black art’ when characterizing AI black boxes
(Campolo & Crawford, 2020, 7 ff).
4 Examples are the symbolists, the evolutionaries, and the Bayesians; he described their characteristics, including the assumptions they work with.
5 Sometimes reference is made to ‘cultural metacognitions’ that exist in organizations.
They regard the knowledge of and control over thinking and learning activities in
organizations, like the awareness of different contexts, analyzing them, and develop-
ing plans of actions for different cultural contexts.
6 One example is Ranerup and Henriksen (2020, p. 5) investigating ‘Trelleborg, the
first municipality in Sweden to use automated decision making for social assistance.
The Trelleborg Model is a management model now used in many other municipalities
in Sweden’. A second example is a study of AI adoption in public sector organiza-
tions, comparing three cases in three countries (van Noordt & Misuraca, 2022).
7 They add this: ‘Behavioral Artificial Intelligence (BAI) would study the artificial
inferences inherent in, and the manifested behavior of, artificial intelligent systems
in the same way as the social sciences have studied human cognition, inference and
behavior’.
8 Contribution analysis may also be an interesting approach to apply. The main rea-
son is that AI/BD are not alone in making and implementing policy programmes/
interventions; they always act in combination with human intelligence, experiences,
prior individual knowledge, and so on. An empirical investigation would therefore probably be most relevant if it tries to sort out what the contribution of AI (in interaction with humans) has been in developing and implementing programmes and interventions.

References
Astbury, B., & Leeuw, F. (2010). Unpacking black boxes: Mechanisms and theory
building in evaluation. American Journal of Evaluation, 31(3), 363–381. https://doi.org/10.1177/1098214010371972
Bamberger, M. (2016). Integrating big data into the monitoring and evaluation of
development programmes. Global Pulse.
Bennett Moses, L., & Chan, J. (2018). Algorithmic prediction in policing: Assumptions,
evaluation, and accountability. Policing and Society, 28(7), 806–822.
Bowser, A., Carmona, A., & Fordyce, A. (2021). Unpacking transparency to support
ethical AI. Science and Technology Innovation Program, Wilson Center.
Burrell, J., & Fourcade, M. (2021). The society of algorithms. Annual Review of Sociology, 47, 23.1–23.25.
Campolo, A., & Crawford, K. (2020). Enchanted determinism: Power without
responsibility in artificial intelligence. Engaging Science, Technology, and Society, 6,
1–19. https://doi.org/10.17351/ests2020.277
Choenni, R., et al. (2021). Exploiting big data for smart government: Facing the
challenges. In J. C. Augusto (Ed.), Handbook of smart cities (pp. 1–23). Springer.
https://doi.org/10.1007/978-3-030-15145-4_82-1
Domingos, P. (2015). The master algorithm. Basic Books.
Dwivedi, Y., et al. (2021). Artificial intelligence: Multidisciplinary perspectives on
emerging challenges, opportunities, and agenda for research, practice, and policy.
International Journal of Information Management, 57(April), 1–47. https://doi.org/10.1016/j.ijinfomgt.2019.08.002
Foy, R., et al. (2011). The role of theory in research to develop and evaluate the
implementation of patient safety practices. BMJ Quality & Safety, 20(5), 453–459.
Janev, V. (2020). Ecosystem of big data. In V. Janev, D. Graux, H. Jabeen, & E. Sallinger
(Eds.), Knowledge graphs and big data processing (pp. 3–19). Springer. https://doi.org/10.1007/978-3-030-53199-7_1
Kempeneer, S. (2021). A big data state of mind: Epistemological challenges to
accountability and transparency in data-driven regulation. Government Information
Quarterly, 38(3), 1–8. https://doi.org/10.1016/j.giq.2021.101578
Lazer, D., et al. (2021). Meaningful measures of human society in the twenty-first
century. Nature, 595, 189–196.
Leeuw, F. (2012). Linking theory-based evaluation and contribution analysis: Three
problems and a few solutions. Evaluation, 18(3), 348–363.
Leeuw, F. (2020). Program evaluation B: Evaluation, big data, and artificial intelligence:
Two sides of one coin. In E. Vigoda-Gadot & D. R. Vashdi (Eds.), Handbook of
research methods in public administration, management and policy (pp. 277–297).
EE Publishers. https://doi.org/10.4337/9781789903485
Leeuw, F. (2021). Big data, artificial intelligence, and the future of evaluation.
Background report to a presentation given at the Seminar of the Evaluation Network
of DG Regional and Urban Policy, July 1, 2021.
Lemire, S., Kwako, A., Nielsen, S. B., Christie, C. A., Donaldson, S. I., & Leeuw, F.
(2020). What is this thing called a mechanism? Findings from a review of realist
evaluations. Causal Mechanisms in Program Evaluation, 2020(167), 73–86.
Mitchell, S., Potash, E., Barocas, S., D’Amour, A., & Lum, K. (2021). Algorithmic
fairness: Choices, assumptions, and definitions. Annual Review of Statistics and Its
Application, 8, 141–163.
Nielsen, S., Lemire, S., & Tangsig, S. (2021). Unpacking context in realist evaluations:
Findings from a comprehensive review. Evaluation, 28(1), 91–112.
Park, S. H., & Han, K. (2018). Methodologic guide for evaluating clinical performance
and effect of AI technology for medical diagnoses and prediction. Radiology, 286(3),
800–809.
Pawson, R. (2002a). Evidence-based policy: The promise of ‘realist synthesis’.
Evaluation, 8(3), 340–358.
Pawson, R. (2002b). Evidence and policy and naming and shaming. Policy Studies, 23(3),
211–230. https://doi.org/10.1080/0144287022000045993
Pawson, R. (2008). Invisible mechanisms. Evaluation Journal of Australasia, 8(2), 3–13.
https://doi.org/10.1177/1035719X0800800202
Pawson, R. (2013). The science of evaluation: A realist manifesto. Sage.
Pawson, R., Greenhalgh, T., Harvey, G., & Walshe, K. (2005). Realist review—A new
method of systematic review designed for complex policy interventions. Journal of
Health Services Research & Policy, 10(1), Suppl 1, 21–34. https://doi.org/10.1258/1355819054308530
Pedersen, T., & Johansen, C. (2020). Behavioural artificial intelligence: An agenda for
systematic empirical studies of artificial inference. AI & Society, 35(3), 519–532.
https://doi.org/10.1007/s00146-019-00928-5
Price, W. (2018). Big data and black-box medical algorithms. Science Translational
Medicine, 10(47). https://doi.org/10.1126/scitranslmed.aao5333
Price, W. N., & Rai, A. K. (2021). Clearing opacity through machine learning. Iowa Law
Review, 106, 775–812.
Rajkomar, A., Dean, J., & Kohane, I. (2019). Machine learning in medicine. New England
Journal of Medicine, 380(14), 1347–1358.
Ranerup, A., & Henriksen, H. (2020). Digital discretion: Unpacking human and
technological agency in automated decision making in Sweden’s social services.
Social Science Computer Review, 40(2), 445–461. https://doi.org/10.1177/0894439320980434
Rathinam, F., Khatua, S., Siddiqui, Z., Malik, M., Duggal, P., Watson, S., & Vollenweider,
X. (2020). Using big data for evaluating development outcomes: A systematic map.
CEDIL Methods Working Paper 2. Centre of Excellence for Development Impact and
Learning.
Topol, E. (2019). High-performance medicine: The convergence of human and artificial
intelligence. Nature Medicine, 25(January), 44–56.
Van Noordt, C., & Misuraca, G. (2022). Exploratory insights on artificial intelligence for
government in Europe. Social Science Computer Review, 40(2), 426–444. https://doi.org/10.1177/0894439320980449
Vijayakumar, R., & Cheung, M. (2021). Assessing replicability of machine learning results:
An introduction to methods on predictive accuracy in social sciences. Social Science
Computer Review, 39(5), 768–801. https://doi.org/10.1177/0894439319888445
WRR. (2021). Mission AI: The new system technology. Netherlands Scientific Council
for Government Policy.
York, P., & Bamberger, M. (2020). Measuring results and impacts in an age of big data:
The nexus of evaluation, analytics, and digital technology. Rockefeller Foundation.
Zuiderwijk, A., Chen, Y., & Salem, F. (2021). Implications of the use of artificial
intelligence in public governance: A systematic literature review and research agenda.
Government Information Quarterly, 38(3), 1–19.
