Osonde A. Osoba
ABSTRACT
Military decision-making institutions are sociotechnical systems. They feature interactions among people applying technologies to enact roles within mission-oriented collectives. As sociotechnical systems, military institutions can be examined through the lens of
complex adaptive systems (CAS) theory. This discussion applies
the CAS perspective to reveal implications of integrating newer
artificial intelligence (AI) technologies into military decision-
making institutions. I begin by arguing that military adoption of
AI is well-incentivised by the current defence landscape. Given
these incentives, it would be useful to try to understand the likely
effects of AI integration into military decision-making. Direct
examinations of the new affordances and risks of new AI models
are a natural mode of analysis for this. I discuss some low-hanging
fruit in this tradition. However, I also maintain that such
examinations can miss systemic impacts of AI reliance in decision-
making workflows. By taking a complex systems view of AI
integration, it is possible to glean non-intuitive insights,
including, for example, that common policy concerns like
preventing human deskilling or requiring algorithmic
transparency may be overblown or counterproductive.
KEYWORDS: Artificial intelligence (AI); machine learning; complex adaptive systems; complexity; decision-making
Introduction
Large Language Models (LLMs), a recent iteration in advanced artificial intelligence (AI),
have gained prominence for impressive feats of general intelligence. This raises a natural
question for the security-minded: does the availability of such broadly capable, advanced
AI systems impact the effectiveness of security actors?
A recent short report (Mouton, Lucas, and Guest 2023) illustrates this concern. The
authors discuss results from ‘red-teaming’ exercises that aim to identify novel security
risks arising from the open deployment of advanced AI tools like OpenAI’s ChatGPT.
Red-teaming (Rehberger 2020) is a standard exercise developed in cybersecurity practices
in which experts attempt to abuse or defeat a new system to gather evidence about the
system’s weaknesses. In this particular exercise, red-teamers adopt the perspective of
an adversary, perhaps a non-state actor looking to attack a better-resourced opponent.
They aim to plan and execute asymmetric warfare operations using bioweapons in an
urban environment. The study convened panels of biosecurity experts to assess
whether these new LLMs can improve an adversary’s effectiveness at planning such bioweapon attacks. Shockingly yet unsurprisingly, red-teamers were able to get LLMs to
discuss practical aspects and make useful recommendations for increasing the lethality
of potential bioweapon attacks … How do we anticipate such military uses of artificial
intelligence (AI)? And what are the implications for deterrence?
AI and machine learning (ML) technologies have become genuinely useful and effective, and these tools are likely to be adopted widely in military decision making. Military adoption of useful technology is normal; consider the adaptation of automobile and aviation technologies for military purposes. The flow of innovation also goes in reverse. Much technological
innovation is the direct product of wartime demands. Privacy-enhancing technologies like
digital cryptography have roots in wartime code-cracking efforts and military espionage.
Despite this natural connection, the integration of AI/ML technologies into military
decision making raises special concerns for attempts at meaningful arms control.
Levers for AI ‘arms’ verification and control are still underspecified. On verification: establishing whether an adversary is using AI/ML in war or military decision making is a difficult problem, one that will likely feature significant deceptive actions by adversaries.
Parties would be well motivated to try to hide the full capabilities of their AI in software
(think, for instance, about Volkswagen using software switches to evade proper emissions testing (Schiermeier 2015)). We would also need answers to questions like: what
kinds of tests are needed to verify the level of intelligence embedded in a military
system? What is a natural metric of artificial intelligence capabilities?
For AI ‘arms’ control, effective control requires that (international) regulating parties
can credibly intervene on the key enablers of AI technologies. But access to key enablers
of these technologies (data, algorithms, compute, tech talent) is ‘democratised,’ making it
difficult for regulating bodies to constrain their use by adversaries. Furthermore, the primary momentum around AI/ML technologies lies in the hands of private and often multinational commercial interests that are not bound by the same norms
and responsibilities that traditional nation states have. Finally, public discussions of AI
already express fears about uncontrollability and economic disruption. Military use
heightens these fears.
To add more colour and detail to concerns about the risks of employing AI in military decision making, I focus on providing a considered response to the following
question: what systemic risks and opportunities accompany the integration of AI into
military decision-making ecosystems?
A natural response for a technologist tackling this kind of question is to dissect the
nature of the technological artefacts1 (Winner 1980) to generate inferences about likely
risks and benefits in use. One could focus, for example, on compute or training data needs (as in Matheny’s (2023) recommendations for AI regulation) or on
differences in AI perceptual abilities (as I do below). These kinds of artefact-level analyses
are important but can be incomplete. They do not clearly connect to insights about how
the institutional culture of military decision-making may adapt to the shock of AI integration. We need to develop a more systemic frame to better contextualise the effects of
broad AI adoption in military decision-making culture. Vold’s (2024) conception of ‘AI-
enabled cognitive enhancements’ is one such fruitful systemic frame. My later discussion
here of AI as artefacts in Complex Adaptive Systems (CAS) is another exploration of the
systemic frame. Drawing insights from other complex sociotechnical systems, that later discussion highlights systemic effects of AI integration that artefact-level analyses can miss.
In sum, the threat landscape and operating environments are increasingly complex.
Warfighting institutions need to adapt to this increase in complexity. AI-based tools offer
an accessible, plausible, and enticing mode of adaptation. I contend that the operating
incentives described make widespread AI integration into military decision-making institutions practically inevitable. Given this probable trajectory, what new capabilities and shortcomings do these tools introduce? The next section discusses some responses to this question by examining technical aspects of AI systems.
AI-augmented perception?
In resort-to-force decision making, the actor typically starts with (or is cued in because
of) information flows from tactical intelligence, surveillance, and reconnaissance (Tactical ISR or Tac-ISR) operations applied to observations or perceptions. Open source
intelligence (OSINT) data flows are one input source of Tac-ISR workflows.2 With the
rise of cheaper and more commercially available space infrastructure, space assets are another
input source for enabling robust and timely military Tac-ISR. AI systems show great
promise as facilitators of perception tasks (Yang et al. 2020) and they demonstrate
near-human performance on some visual perception tasks. Osoba et al. (2023) discuss
how AI tools can be applied to data verification tasks in Tac-ISR data supply chains.
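To make the data-verification idea concrete, the short Python sketch below shows one generic corroboration pattern: a machine-extracted claim is accepted only when enough independent sources support it above a confidence threshold. This is an illustrative assumption on my part, not the specific framework described by Osoba et al. (2023); the function names, thresholds, and scoring interface are hypothetical.

from typing import Callable, List

def corroborated(claim: str,
                 sources: List[str],
                 supports: Callable[[str, str], float],
                 min_sources: int = 2,
                 min_support: float = 0.7) -> bool:
    # Accept a claim only if enough independent sources support it above a
    # confidence threshold; `supports` is any scoring model of the analyst's
    # choosing (illustrative placeholder, not a real API).
    votes = sum(1 for s in sources if supports(claim, s) >= min_support)
    return votes >= min_sources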
The performance of AI tools at these tasks is somewhat of a distraction though.
Achieving human-level perceptual processing is noteworthy, but that is not the most
compelling reason for broad adoption. The main benefit of AI in this part of the decision
cycle is scale. AI-augmented perception would enable a significant increase in the amount
of information that can be processed: for example, the ability to ingest vast amounts of audio, visual, and textual inputs and then cue up the pieces that are most relevant to the
decision maker. This kind of cuing workflow is an operational tactic for leveraging the
cognitive diversity (Hernández-Orallo and Vold 2019) in human-machine teams to
great effect. The workflow preserves the limited attention of human operators and
targets human attention more efficiently. This workflow structure is valuable even
when the algorithm is not a perfect perceiver. This scaling up in perception is going to
be pivotal for parsing multi-sourced data flows to enable responsive and robust Tac-
ISR for resort-to-force decision making.
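The cueing workflow described above can be sketched in a few lines of Python. The sketch is a minimal illustration under assumed names (Observation, relevance_score, the 0.8 threshold); it is not a description of any fielded system. A model scores a large stream of observations and surfaces only a short, attention-sized list for human review.

from dataclasses import dataclass
from typing import Callable, Iterable, List

@dataclass
class Observation:
    source: str        # e.g. 'satellite imagery', 'OSINT text'
    payload: str       # raw content or a reference to it
    relevance: float = 0.0

def cue_for_review(stream: Iterable[Observation],
                   relevance_score: Callable[[Observation], float],
                   threshold: float = 0.8,
                   max_items: int = 20) -> List[Observation]:
    # Score everything at machine scale, keep only high-relevance items,
    # and return a short list sized for limited human attention.
    kept = []
    for obs in stream:
        obs.relevance = relevance_score(obs)
        if obs.relevance >= threshold:
            kept.append(obs)
    return sorted(kept, key=lambda o: o.relevance, reverse=True)[:max_items]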
The promise of AI-augmented perception comes with a new kind of risk: adversarial
examples or adversarial manipulations of machine perceptions.3 AI perception systems
can be deceived to give mistaken inferences via both digital (Kurakin, Goodfellow, and
Bengio 2016) or physical (Song et al. 2018) manipulations of the subject under observation. Deception is the norm in warfare, and this norm incentivises the routine deployment of adversarial examples in countersurveillance and combat operations. Furthermore, automation
bias can atrophy review processes for lower-level machine outputs, rendering such AI
misperceptions more systemically dangerous.
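As a rough illustration of the digital manipulations cited above, the sketch below implements a perturbation in the style of the fast gradient sign method studied in the line of work represented by Kurakin, Goodfellow, and Bengio (2016). It assumes an arbitrary differentiable PyTorch classifier; the epsilon value and function name are illustrative choices, not parameters of any operational system.

import torch
import torch.nn.functional as F

def fgsm_perturb(model: torch.nn.Module,
                 image: torch.Tensor,
                 true_label: torch.Tensor,
                 epsilon: float = 0.03) -> torch.Tensor:
    # Compute the loss gradient with respect to the input image and step in
    # the direction that most increases the loss, producing an image that
    # looks nearly unchanged to a human but can flip the model's prediction.
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()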
In summary, an artefact-level examination suggests the following:
• AI technologies can already increase the capabilities of military organisations that are
properly resourced to take advantage of them.
• The use of these technologies in human military institutions represents a concrete
form of cognitive diversity. These artefacts literally perceive the world differently
from humans. But their use requires paying closer attention to the trustworthiness
of AI-sourced and filtered perceptions that feed into resort-to-force decision
making pipelines. How should a decision maker adapt when they cannot fully trust
the testimony of their ‘eyes’?
Briefly, a complex system is an ensemble of agents and structures that interact in nonlinear ways, leading to the emergence of hard-to-predict macroscopic behaviours.
Complex systems that are also adaptive typically feature additional signature characteristics like nested subsystems, self-organisation, nonlinear effects, adaptation, memory,
and emergence. Most organisations and policy systems qualify as CASs involving adaptive heterogeneous agents (Davis et al. 2021). ‘Adaptive’ refers to the agents’ capacity for
learning and behavioural change or evolution. ‘Heterogeneous’ refers to the diversity in
the kinds of agents interacting in the system. For example, public health systems feature
adaptive individuals, firms (hospitals and insurers), infectious pathogens, transport infrastructure, etc. States and, more specifically, their national security institutions responsible
for resort-to-force decision-making processes would also qualify as CASs.
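As a toy illustration of this vocabulary (and only that), the following sketch simulates a small ensemble of heterogeneous, adaptive agents with nonlinear local interactions and reports a macroscopic observable. The agent types, update rule, and parameters are arbitrary assumptions chosen to make the ideas concrete, not a model of any real institution.

import random

class Agent:
    def __init__(self, kind: str, learning_rate: float):
        self.kind = kind                  # heterogeneity: different agent types
        self.learning_rate = learning_rate
        self.state = random.random()      # a behavioural disposition in [0, 1]

    def interact(self, other: 'Agent') -> None:
        # Nonlinear local interaction: the pull toward a neighbour depends on
        # how far apart the two agents already are; repeated interactions are
        # the adaptation mechanism.
        gap = other.state - self.state
        self.state += self.learning_rate * gap * abs(gap)

def simulate(steps: int = 1000) -> float:
    agents = ([Agent('human', 0.1) for _ in range(20)] +
              [Agent('machine', 0.5) for _ in range(5)])
    for _ in range(steps):
        a, b = random.sample(agents, 2)
        a.interact(b)
    # A macroscopic (emergent) observable: dispersion of states in the ensemble.
    mean = sum(a.state for a in agents) / len(agents)
    return sum((a.state - mean) ** 2 for a in agents) / len(agents)

print('final state variance:', simulate())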
The integration of AI technologies adds another dimension of complexity to institutions responsible for resort-to-force decision processes. Exploring other complex adaptive systems can offer valuable insights into how to adapt to AI-imposed complexity, for
example, by highlighting properties common to well-functioning CASs.
One such property is near decomposability (Simon 1996): in a nearly decomposable system, each subsystem interacts appreciably with at most a small fraction of other subsystems. Modular or nearly decomposable systems are easier to manage (e.g. swap out failing subsystems with limited side-effects) and understand (e.g. easier to trace observed outcomes back to responsible sub-systems).
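A hypothetical sketch of what ‘nearly decomposable’ can mean operationally follows: most interactions stay within a subsystem, and only a small fraction cross subsystem boundaries. The module labels and links below are placeholders of my own, not a description of any actual decision architecture.

from typing import Dict, List, Tuple

def cross_boundary_fraction(interactions: List[Tuple[str, str]],
                            module_of: Dict[str, str]) -> float:
    # Fraction of interactions linking components in different subsystems;
    # lower values indicate a more modular, easier-to-manage architecture.
    if not interactions:
        return 0.0
    crossing = sum(1 for a, b in interactions if module_of[a] != module_of[b])
    return crossing / len(interactions)

# Placeholder example: an ISR-to-planning pipeline with two loose modules.
modules = {'sensor': 'collection', 'ai_filter': 'collection',
           'analyst': 'assessment', 'planner': 'assessment'}
links = [('sensor', 'ai_filter'), ('ai_filter', 'analyst'),
         ('analyst', 'planner')]
print(cross_boundary_fraction(links, modules))  # only 1 of 3 links crosses modules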
Sienknecht (2024) introduces the term ‘proxy responsibility’ to describe the concept of reaching back through the technological artefacts to root responsibility squarely in the (legally & socially) responsive humans and
organisations deploying these artefacts. This concept is similar to Floridi’s concepts of
distributed morality and distributed faultless responsibility (Floridi 2016; 2020) (if we
drop the ‘faultless’ aspect).
For example, we have some trust in the veracity of an esoteric academic paper published by an unknown author in a reputable journal. This trust is not because we
always have enough deep expertise to critically evaluate the paper. Our knowledge of
peer-review processes and the embedded incentives anchors our calibration of the trustworthiness of the paper. This suggests that there may be value in designing processes around the generation and consumption of AI-augmented resort-to-force decisions that are more effective at providing compelling justifications.
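One speculative way to operationalise such process design is to attach a structured, auditable justification record to every AI-augmented recommendation, so that trust is anchored in the surrounding process much as peer review anchors trust in a paper. The sketch below is an assumption-laden illustration; every field and method name is hypothetical.

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

@dataclass
class DecisionRecord:
    recommendation: str          # what the AI-augmented pipeline proposed
    model_id: str                # which model/version produced it
    evidence_refs: List[str]     # references to the inputs that were used
    reviewers: List[str] = field(default_factory=list)  # accountable humans
    approved: bool = False
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def sign_off(self, reviewer: str) -> None:
        # Record the human who takes responsibility for acting on this output.
        self.reviewers.append(reviewer)
        self.approved = True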
Conclusion
My aim in this piece has been to highlight the utility of two different frames for under-
standing the implications of AI integration into military decision-making institutions: a
frame focusing on the technical capabilities of the AI artefacts and a frame focused on the
systems into which these AI artefacts will be integrated. From those explorations, the following insights are worth reiterating.
The artefact-level analyses reveal that AI systems perceive the world differently from
humans. That difference in perception can be exploited by adversaries but also provides
useful cognitive diversity. Deceptive threat actors seeking to hide their activities from
monitoring satellites may be able to craft AI-targeted camouflage to confuse large-
scale surveillance workflows. Threat actors can also take advantage of current AI tools
to generate novel tactical plans and courses of action. Increased variety in adversary
tactics can slow or muddy resort-to-force decision making.
The systems-level analysis points to two additional conclusions. First, the concern
over deskilling human actors may not be a central one. Specialisation is likely essential
for efficient and scalable resort-to-force decision making processes, especially if that
specialisation makes wise use of the cognitive diversity in human-machine teams.
Second, achieving a decision system architecture that is responsive to societal norms
requires the ability to link decisions to accountable non-machine agents. Algorithmic
introspection (transparency and explanation) centres algorithms in our search for
accountability in decision making. But algorithms are not capable of bearing responsibility in any useful way. Perfecting our algorithmic introspection capabilities would not
bridge accountability gaps. The focus should be on accountability structures rather
than algorithmic introspection.
Notes
1. ‘Artefacts’ are simply man-made objects. I use this term typically to refer to man-made
elements that play roles in a sociotechnical system. I intend my use of this to be similar
to Langdon Winner’s use of the term in his 1980 essay ‘Do artifacts have politics?’
(Winner 1980) in which he scopes artefacts to include ‘machines, structures, and systems
of modern material culture.’
2. These were identified by Mick Ryan (2023) in his contribution to the ‘Anticipating the
Future of War: AI, Automated Systems, and Resort-to-Force Decision Making’ workshop.
3. The adversarial example concern can be parsed as a mirror image of the problem of ‘deepfakes.’ The deepfake concern involves the production of AI-generated artefacts designed to
fool humans into believing falsehoods.
4. There is a secondary question about the degree to which the skill of human actors is necessary for maintaining compliance with international rules and norms around the use of force.
If the primary requirement for compliance is meaningful human oversight, then there is
research and design work to do on how to organise coordination between human actors and AI systems to achieve compliance.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes on contributor
Osonde Osoba (Ph.D.) is a researcher and practitioner in the field of Responsible AI (RAI). Over
the past decade, he has worked on RAI by applying AI to policy problems (at RAND) and by examining questions of fairness and equity in the use of AI for decision-making (at RAND & now at
LinkedIn).
References
Bryson, Joanna J., Mihailis E. Diamantis, and Thomas D. Grant. 2017. “Of, for, and by the People:
The Legal Lacuna of Synthetic Persons.” Artificial Intelligence and Law 25: 273–291. https://doi.
org/10.1007/s10506-017-9214-9.
Davis, Paul K., Tim McDonald, Ann Pendleton-Jullian, Angela O’Mahony, and Osonde Osoba.
2021. “A Complex-Systems Agenda for Influencing Policy Studies.” In Proceedings of the
2019 International Conference of The Computational Social Science Society of the Americas,
edited by Zining Yang and Elizabeth Von Briesen, 277–296. Springer.
Deeks, Ashley. 2024. “Delegating War Initiation to Machines.” Anticipating the Future of War: AI,
Automated Systems, and Resort-to-Force Decision Making, Special Issue of Australian Journal of
International Affairs 78 (2), (this issue).
Delmas, Magali, and Ivan Montiel. 2009. “Greening the Supply Chain: When is Customer Pressure
Effective?” Journal of Economics & Management Strategy 18 (1): 171–201.
Erskine, Toni. 2024. “Before Algorithmic Armageddon: Anticipating Immediate Risks to Restraint
When AI Infiltrates Decisions to Wage War.” Anticipating the Future of War: AI, Automated
Systems, and Resort-to-Force Decision Making, Special Issue of Australian Journal of
International Affairs 78 (2), (this issue).
Floridi, Luciano. 2016. “Faultless Responsibility: On the Nature and Allocation of Moral
Responsibility for Distributed Moral Actions.” Philosophical Transactions of the Royal Society
A: Mathematical, Physical and Engineering Sciences 374: 20160112. https://doi.org/10.1098/
rsta.2016.0112.
Floridi, Luciano. 2020. “Distributed Morality in an Information Society.” In The Ethics of
Information Technologies, edited by Keith Miller and Mariarosaria Taddeo, 63–79. Milton
Park, UK: Routledge.
Hernández-Orallo, José, and Karina Vold. 2019. “AI Extenders: The Ethical and Societal
Implications of Humans Cognitively Extended by AI.” Proceedings of the 2019 AAAI/ACM
Conference on AI, Ethics, and Society.
Hoehn, A., and T. Shanker. 2023. Age of Danger: Keeping America Safe in an Era of New
Superpowers, New Weapons, and New Threats. Hachette Books.
Kurakin, Alexey, Ian Goodfellow, and Samy Bengio. 2016. “Adversarial Machine Learning at
Scale.” arXiv preprint arXiv:1611.01236.
Lakkaraju, Himabindu, and Osbert Bastani. 2020. “‘How do I Fool you?’ Manipulating User Trust
Via Misleading Black Box Explanations.” In Proceedings of the AAAI/ACM Conference on AI,
Ethics, and Society, edited by Annette Markham, Julia Powles, Toby Walsh, and Anne L.
Washington, 79–85. New York, NY: Association for Computing Machinery.
Matheny, Jason. 2023. “Advancing Trustworthy Artificial Intelligence.” Testimony presented before
the U.S. House Committee on Science, Space, and Technology, 22 June 2023. Santa Monica, CA:
RAND Corporation.
Mouton, Christopher A., Caleb Lucas, and Ella Guest. 2023. The Operational Risks of AI in Large-
Scale Biological Attacks: A Red-Team Approach. Santa Monica, CA: RAND Corporation.
Navabi, Shiva, and Osonde A. Osoba. 2021. “A Generative Machine Learning Approach to Policy
Optimization in Pursuit-Evasion Games.” Paper presented at the 60th IEEE Conference on
Decision and Control (CDC), Austin, Texas, United States. December 2021.
Osoba, Osonde, George Nacouzi, Jeff Hagen, Jonathan Tran, Li Ang Zhang, Marissa Herron,
Christopher M Lynch, Mel Eisman, and Charlie Barton. 2023. The Resilience Assessment
Framework: Assessing Commercial Contributions to US Space Force Mission Resilience. Santa
Monica, CA: RAND Corporation.
Rehberger, Johann. 2020. Cybersecurity Attacks–Red Team Strategies: A Practical Guide to Building
a Penetration Testing Program Having Homefield Advantage. Birmingham, UK: Packt
Publishing Ltd.
Ryan, Mick. 2023. “Meshed Civil-Military Sensor Systems: Opportunities and Challenges of AI-
Enabled Battlespace Transparency.” Paper presented at the Anticipating the Future of War:
AI, Automated Systems and Resort-to-Force Decision Making Workshop. Canberra,
Australia, 28–29 June.
Schiermeier, Quirin. 2015. “The Science Behind the Volkswagen Emissions Scandal.” Nature 9: 24.
Sienknecht, Mitja. 2024. “Proxy Responsibility: Addressing Responsibility Gaps in Human-
Machine Decision Making on the Resort to Force.” Anticipating the Future of War: AI,
Automated Systems, and Resort-to-Force Decision Making, Special Issue of Australian Journal
of International Affairs 78 (2), (this issue).
Simon, Herbert A. 1996. The Sciences of the Artificial. Cambridge, MA, United States: MIT Press.
Song, Dawn, Kevin Eykholt, Ivan Evtimov, Earlence Fernandes, Bo Li, Amir Rahmati, Florian
Tramer, Atul Prakash, and Tadayoshi Kohno. 2018. “Physical Adversarial Examples for
Object Detectors.” Paper Presented at the 12th USENIX Workshop on Offensive
Technologies (WOOT 18), Baltimore, MD, United States, August 2018.
Sutton, Richard S., and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction.
Cambridge, MA: MIT Press.
Vold, Karina. 2024. “Human-AI Cognitive Teaming: Using AI to Support State-Level Decision
Making on the Use of Force.” Anticipating the Future of War: AI, Automated Systems, and
Resort-to-Force Decision Making, Special Issue of Australian Journal of International Affairs
78 (2), (this issue).
Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, and
Denny Zhou. 2022. “Chain-of-thought Prompting Elicits Reasoning in Large Language
Models.” Advances in Neural Information Processing Systems 35: 24824–24837.
Winner, Langdon. 1980. “Do Artifacts Have Politics?” Daedalus 109 (1): 121–136.
Yang, Jiachen, Chenguang Wang, Bin Jiang, Houbing Song, and Qinggang Meng. 2020. “Visual
Perception Enabled Industry Intelligence: State of the Art, Challenges and Prospects.” IEEE
Transactions on Industrial Informatics 17 (3): 2204–2219.