About the Author: Yoshua Bengio is professor of computer science at the Université de Montréal, founder and scientific director of Mila–Quebec Artificial Intelligence Institute, and senior fellow and codirector of the Learning in Machines and Brains program at the Canadian Institute for Advanced Research. He won the 2018 A.M. Turing Award (with Geoffrey Hinton and Yann LeCun).
The potential dangers of AI have garnered widespread public attention. In this essay, the
author reviews the threats to democracy posed by the possibility of “rogue AIs,”
dangerous and powerful AIs that would execute harmful goals, irrespective of whether
the outcomes are intended by humans. To mitigate the risk that rogue AIs present
to democracy and geopolitical stability, the author argues that research into safe and
defensive AIs should be conducted by a multilateral, international network of research
laboratories.
In March 2023, an open letter that I cosigned along with numerous experts in the field of AI called for a temporary halt in
the development of even more potent AI systems to allow more time for scrutinizing
the risks that they could pose to democracy and humanity, and to establish regulatory
measures for ensuring the safe development and deployment of such systems. Two
months later, Geoffrey Hinton and I, who together with Yann LeCun won the 2018
Turing Award for our seminal contributions to deep learning, joined CEOs of AI labs,
top scientists, and many others to endorse a succinct declaration: “Mitigating the risk
of extinction from AI should be a global priority alongside other societal-scale risks
such as pandemics and nuclear war.” LeCun, who works for Meta, publicly disagreed with this declaration.
There are serious risks that could come from the construction of increasingly powerful
AI systems. And progress is speeding up, in part because developing these
transformative technologies is lucrative—a veritable gold rush that could amount to
quadrillions of dollars (see the calculation in Stuart Russell’s book on pp. 98–
99). Since deep learning transitioned, about a decade ago, from a purely academic endeavor to one that also attracts strong commercial interest, questions have arisen about
the ethics and societal implications of AI—in particular, who develops it, for whom
and for what purposes, and with what potential consequences? These concerns led to
the development in 2017 of the Montreal Declaration for the Responsible
Development of AI and the drafting of the Asilomar AI Principles, both of which I
was involved with, followed by many more, including the OECD AI Principles (2019)
and the UNESCO Recommendation on the Ethics of Artificial Intelligence (2021). 5
Modern AI systems are trained to perform tasks in a way that is consistent with
observed data. Because those data will often reflect social biases, these systems
themselves may discriminate against already marginalized or disempowered
groups. Awareness of such issues has created subfields of research (for example, work on algorithmic fairness). 6
Whoever controls superhuman AIs would accrue a level of power never before seen in human history, in blatant contradiction with the very principle of democracy and posing a major threat to it.
The development of and broad access to very large language models such as ChatGPT
have raised serious concerns among researchers and society as a whole about the
possible social impacts of AI. The question of whether and how AI systems can be
controlled, that is, guaranteed to act as intended, has been asked for many years, yet
there is still no satisfactory answer—although the AI safety-research community is
currently studying proposals. 8
To understand the dangers associated with AI systems, especially the more powerful
ones, it is useful to first explain the concept of misalignment. If Party A relies on Party
B to achieve an objective, can Party A concisely express its expectations of Party B?
Unfortunately, the answer is generally “no,” given the manifold circumstances in
which Party A might want to dictate Party B’s behavior. This situation has been well studied both in economics (in contract theory, where Parties A and B could be corporations) and in the field of AI, where Party A represents a human and Party B an AI system. If Party B is highly competent, it might fulfill Party A’s instructions, adhering strictly to “the letter of the law,” contract, or directive, but still leave Party A unsatisfied by violating “the spirit of the law” or finding a loophole in the contract. This disparity between Party A’s intention and what Party B is actually optimizing is referred to as a misalignment.