Lecture 1

Artificial Intelligence

AI Transformation
It is hard these days to escape hearing about AI: in the news, on social
media, in cafe conversations. We see reports of triumphs of superhuman
performance both in games such as Jeopardy! (IBM Watson, 2011) and Go
(DeepMind's AlphaGo, 2016), and on benchmark tasks such as reading
comprehension, speech recognition, face recognition, and medical imaging
(though it is important to realize that these are results on one
benchmark, which is a far cry from solving the general problem).
AI Speculation
We also see speculation about the future: that it will bring about
sweeping societal change due to automation, resulting in
massive job loss, not unlike the industrial revolution, or that AI
could even surpass human-level intelligence and seek to take
control.
While media hype is real, it is also true that both
companies and governments are heavily
investing in AI. Both see AI as an integral part of
their competitive strategy.
The AI Index is an effort to track the progress of AI over time. In 2017, the AI Index published a report showing
that essentially all of its curves go up and to the right.
• Why might one even think that it is possible to capture this rich behavior?
• While AI is a relatively young field, one can trace some of its roots back to
Aristotle, who formulated a system of syllogisms that captures the reasoning
process: how one can mechanically apply syllogisms to derive new conclusions.

• Alan Turing, who laid the conceptual foundations of computer science,
developed the Turing machine, an abstract model of computation, which, based
on the Church-Turing thesis, can implement any computable function.

• In the 1940s, devices that could actually carry out these computations
started emerging. So perhaps one might be able to capture intelligent behavior
via a computer. But how do we define success?
• Can machines think? This is a question that has occupied philosophers since
Descartes. But even the definitions of "thinking" and "machine" are not clear. Alan
Turing, the renowned mathematician and code breaker who laid the foundations of
computing, posed a simple test to sidestep these philosophical concerns.

• In the test, an interrogator converses with a man and a machine via a text-based
channel. If the interrogator fails to guess which one is the machine, then the
machine is said to have passed the Turing test. (This is a simplification but it suffices
for our present purposes.)

• Although the Turing test is not without flaws (e.g., failure to capture visual and
physical abilities, emphasis on deception), the beauty of the Turing test is its
simplicity and objectivity. It is only a test of behavior, not of the internals of the
machine. It doesn’t care whether the machine is using logical methods or neural
networks. This decoupling of what to solve from how to solve is an important
theme in this class.
• AI started out with a bang. People were ambitious and tried to develop things like
the General Problem Solver that could solve anything. Despite some successes, certain
tasks such as machine translation were complete failures, which led to the cutting of
funding and the first AI winter. It happened again in the 1980s, this time with expert
systems, though the aims were scoped more towards industrial impact. But again,
expectations exceeded reality, leading to another AI winter. During these AI winters,
people eschewed the phrase "artificial intelligence" so as not to be labeled
hype-driven lunatics.

• In the latest rebirth, we have new machine learning techniques, tons of data, and tons
of computation. So each cycle, we are actually making progress. Will this time be
different?

• We should be optimistic and inspired about the potential impact that advances in AI can
bring. But at the same time, we need to be grounded and not be blown away by hype.
This class is about providing that grounding, showing how AI problems can be treated
rigorously and mathematically. After all, this class is called "Artificial Intelligence:
Principles and Techniques".
• There are two ways to look at AI philosophically.

• The first is what one would normally associate with AI: the science and
engineering of building ”intelligent” agents. The inspiration of what constitutes
intelligence comes from the types of capabilities that humans possess: the ability
to perceive a very complex world and make enough sense of it to be able to
manipulate it.

• The second views AI as a set of tools. We are simply trying to solve problems in
the world, and AI techniques happen to be quite useful for that.

• While both views boil down to many of the same day-to-day activities (e.g.,
collecting data and optimizing a training objective), the philosophical differences
do change the way AI researchers approach and talk about their work. Moreover,
conflating the two can generate a lot of confusion.
• The same computer vision techniques used to recognize objects can be used to tackle
social problems. Poverty is a huge problem, and even identifying the areas of need is
difficult due to the difficulty of getting reliable survey data. Recent work has shown
that one can take satellite images (which are readily available) and predict various
poverty indicators.
• Machine learning can also be used to optimize the energy efficiency of datacenters,
which, given the hunger for compute these days, makes a big difference. Some recent work
from DeepMind shows how to significantly reduce Google's energy footprint by using
machine learning to predict the power usage effectiveness from sensor measurements such
as pump speeds, and using that to drive recommendations.
• Other applications, such as self-driving cars and authentication, are high-stakes:
errors could be much more damaging than getting the wrong movie recommendation. These
applications present a set of security concerns.

• One can generate so-called adversarial examples: by putting stickers on a stop sign,
one can trick a computer vision system into misclassifying it as a speed limit sign.
You can also purchase special glasses that fool a system into thinking that you're a
celebrity.

• Even more fundamentally, these examples show that current methods are clearly not
learning "the right thing" as defined by the human visual system.
• A more subtle case is the issue of bias. One might naively think that since machine
learning algorithms are based on mathematical principles, they are somehow
objective. However, machine learning predictions come from the training data, and
the training data comes from society, so any biases in society are reflected in the
data and propagated to predictions. The issue of bias is a real concern when
machine learning is used to decide whether an individual should receive a loan or
get a job.

• Unfortunately, the problem of fairness and bias is as much a philosophical one as it
is a technical one. There is no obvious "right thing to do", and it has even been shown
mathematically that it is impossible for a classifier to simultaneously satisfy three
reasonable fairness criteria (Kleinberg et al., 2016).
• How should we actually solve these AI tasks? The real world is complicated. At the end of the day, we need to write
some code (and possibly build some hardware too). But there is a huge chasm.
• In this class, we will adopt the modeling-inference-learning paradigm to help us navigate the solution space. In reality, the
lines are blurry, but this paradigm serves as an ideal and a useful guiding principle.
• The first pillar is modeling. Modeling takes messy real-world problems and packages
them into neat formal mathematical objects called models, which can be subjected to
rigorous analysis and which computers can operate on. However, modeling is lossy: not
all of the richness of the real world can be captured, and therefore there is an art
to modeling: what does one keep versus ignore? (An exception to this is games such as
Chess or Go or Sudoku, where the real world is identical to the model.)

• As an example, suppose we're trying to have an AI that can navigate through a busy
city. We might formulate this as a graph where nodes represent points in the city and
edges represent the roads connecting them.
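
To make this concrete, here is a minimal sketch of such a model in Python (the place
names and costs are made up for illustration, not from the lecture):

```python
# A toy city model: nodes are points in the city, and each weighted edge
# is a road segment labeled with its travel cost.
city = {
    "home":    {"cafe": 2, "library": 5},
    "cafe":    {"library": 1, "office": 7},
    "library": {"office": 3},
    "office":  {},
}
```

Everything about the real city that this model leaves out (traffic, weather,
construction) is exactly the lossy part of modeling.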
• The second pillar is inference. Given
a model, the task of inference is to
answer questions with respect to the
model. For example, given the model
of the city, one could ask questions
such as: what is the shortest path?
what is the cheapest path?

• For some models, computational complexity can be a concern (games such as Go), and
usually approximations are needed.
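
As a sketch of what inference looks like on the toy city graph above, here is
Dijkstra's algorithm answering "what is the cheapest path?" (the function name and the
graph are ours, chosen for illustration):

```python
import heapq

def cheapest_path_cost(graph, start, goal):
    """Dijkstra's algorithm: inference on the model, not on the real city."""
    frontier = [(0, start)]                  # (cost so far, node)
    best = {start: 0}
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue                         # stale queue entry
        for neighbor, edge_cost in graph[node].items():
            new_cost = cost + edge_cost
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(frontier, (new_cost, neighbor))
    return float("inf")                      # goal unreachable

print(cheapest_path_cost(city, "home", "office"))  # 6, via cafe and library
```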
• But where does the model come
from? Remember that the real
world is rich, so if the model is to
be faithful, the model has to be
rich as well. But we can’t possibly
write down such a rich model
manually.
• The idea behind (machine)
learning is to instead get it from
data. Instead of constructing a
model, one constructs a skeleton
of a model (more precisely, a
model family), which is a model
without parameters. And then if
we have the right type of data, we
can run a machine learning
algorithm to tune the parameters
of the model.
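
As a minimal sketch of this idea (the model family and the data are invented for
illustration): suppose the skeleton of the model says that travel time on a road is w
times its distance, with the single parameter w unknown. Learning tunes w from
observed (distance, time) pairs, here by gradient descent on the squared error.

```python
# Model family: predicted_time = w * distance, with unknown parameter w.
# The observations below are hypothetical, not from the lecture.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (distance, observed time)

w = 0.0                                      # the parameter to be tuned
for _ in range(100):
    # gradient of sum over data of (w * d - t)^2 with respect to w
    grad = sum(2 * (w * d - t) * d for d, t in data)
    w -= 0.01 * grad                         # gradient descent step

print(round(w, 2))  # about 2.04: the learned parameter fills in the skeleton
```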
• Supporting all of these models is machine learning, which has been arguably the most
crucial ingredient powering recent successes in AI. Conceptually, machine learning allows
us to shift the information complexity of the model from code to data, which is much
easier to obtain (either naturally occurring or via crowdsourcing).

• The main conceptually magical part of learning is that if done properly, the trained
model will be able to produce good predictions beyond the set of training examples. This
leap of faith is called generalization, and is, explicitly or implicitly, at the heart of any
machine learning algorithm. This can even be formalized using tools from probability and
statistical learning theory.
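
One standard way to make this precise (a classical result from statistical learning
theory; the bound below is not derived in the lecture itself): for a finite model
family H trained on n i.i.d. examples, with probability at least 1 - delta, every
model h in H satisfies

```latex
\mathrm{TestError}(h) \;\le\; \mathrm{TrainError}(h)
  + \sqrt{\frac{\log\lvert\mathcal{H}\rvert + \log(1/\delta)}{2n}}
```

so the gap between training and test error shrinks as the number of examples grows.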
• A reflex-based model simply performs a fixed sequence of computations on a given
input. Examples include most models found in machine learning, from simple linear
classifiers to deep neural networks. The main characteristic of reflex-based models is
that their computations are feed-forward; one doesn't backtrack and consider alternative
computations. Inference is trivial in these models because it is just running the fixed
computations, which makes these models appealing.
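
A minimal sketch of a reflex-based model (the features and weights are made up): a
linear classifier maps an input to a prediction in one fixed feed-forward pass, so
inference is just evaluating the function.

```python
# A linear classifier as a reflex-based model: fixed weights are applied to
# input features in one pass, with no search or backtracking.
weights = {"contains_free": 2.0, "contains_meeting": -1.5}

def predict(features):
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return "spam" if score > 0 else "not spam"

print(predict({"contains_free": 1}))      # spam
print(predict({"contains_meeting": 1}))   # not spam
```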
• Reflex-based models are too simple for tasks that require more forethought
(e.g., in playing chess or planning a big trip). State-based models overcome this
limitation.

• The key idea is, at a high level, to model the state of the world and the transitions
between states, which are triggered by actions. Concretely, one can think of states as
nodes in a graph and transitions as edges. This reduction is useful because we understand
graphs well and have a lot of efficient algorithms for operating on graphs.
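
As a sketch of this reduction (the scenario and interface names are ours, for
illustration): a state-based model can be specified by a start state, a goal test, and
a successor function whose (action, next state, cost) triples are exactly the outgoing
edges of a node in the graph.

```python
# A toy state-based model: travel from block 1 to block N by walking
# (position + 1, cost 1) or taking a tram (position * 2, cost 2).
class TransportationProblem:
    def __init__(self, N):
        self.N = N
    def start_state(self):
        return 1
    def is_goal(self, state):
        return state == self.N
    def successors(self, state):
        # the outgoing edges of `state`: (action, next state, cost)
        edges = []
        if state + 1 <= self.N:
            edges.append(("walk", state + 1, 1))
        if 2 * state <= self.N:
            edges.append(("tram", 2 * state, 2))
        return edges

print(TransportationProblem(10).successors(3))  # [('walk', 4, 1), ('tram', 6, 2)]
```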
• Search problems are adequate models when you are operating in an environment that has
no uncertainty. However, in many realistic settings, there are other forces at play.

• Markov decision processes handle tasks with an element of chance (e.g., Blackjack),
where the distribution of randomness is known (reinforcement learning can be employed if
it is not); a small sketch follows this list.

• Adversarial games, as the name suggests, handle tasks where there is an opponent who
is working against you (e.g., chess).
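
To make the "element of chance" concrete, here is a minimal sketch of an MDP
transition model for a made-up dice game (the game and its numbers are ours, in the
spirit of tasks like Blackjack): instead of a single successor, each (state, action)
pair yields a known distribution over outcomes.

```python
# A made-up dice game as an MDP: in state "in", you may quit (reward 10,
# game ends) or stay (reward 4, and with probability 1/3 the game ends).
# The known outcome distribution is what makes this an MDP.
def transitions(state, action):
    """Return a list of (probability, next_state, reward) triples."""
    if state == "in" and action == "stay":
        return [(2 / 3, "in", 4), (1 / 3, "end", 4)]
    if state == "in" and action == "quit":
        return [(1.0, "end", 10)]
    return []
```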
• In state-based models, solutions are procedural: they specify step-by-step
instructions on how to go from A to B. In many applications, the order in which things
are done isn't important.
• Constraint satisfaction problems are variable-based models where we only have hard
constraints. For example, in scheduling, we can’t have two people in the same place at
the same time.
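
A minimal sketch of such a hard constraint (the variable names are illustrative): an
assignment maps each person to a (place, time) slot, and the constraint simply rejects
any assignment that puts two people in the same place at the same time.

```python
# Hard constraint for scheduling: no two people share a (place, time) slot.
def no_conflicts(assignment):
    slots = list(assignment.values())
    return len(slots) == len(set(slots))    # all slots must be distinct

print(no_conflicts({"alice": ("room1", 9), "bob": ("room2", 9)}))   # True
print(no_conflicts({"alice": ("room1", 9), "bob": ("room1", 9)}))   # False
```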

• Bayesian networks are variable-based models where the variables are random variables
that are dependent on each other. For example, the true location of an airplane H_t and
its radar reading E_t are related, as are the location H_t and the location at the
previous time step H_{t-1}. The exact dependency structure is given by the graph
structure and formally defines a joint probability distribution over all the variables.
This topic is studied thoroughly in probabilistic graphical models.
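
Written out for the airplane example, the dependency structure above is that of a
hidden Markov model, and the joint distribution it defines factorizes as

```latex
P(H_{1:T}, E_{1:T}) \;=\; p(H_1)\,\prod_{t=2}^{T} p(H_t \mid H_{t-1})\,\prod_{t=1}^{T} p(E_t \mid H_t)
```

with one factor per variable given its parents in the graph.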
• Our last stop on the tour is logic. Even more so than variable-based models, logic
provides a compact language for modeling, which gives us more expressivity.

• It is interesting that historically, logic was one of the first things that AI
researchers started with in the 1950s. While logical approaches were in a way
quite sophisticated, they did not work well on complex real-world tasks with noise
and uncertainty. On the other hand, methods based on probability and machine
learning naturally handle noise and uncertainty, which is why they presently
dominate the AI landscape. However, they have yet to be applied successfully to
tasks that require really sophisticated reasoning.

• In this course, we will appreciate the two as not contradictory, but simply
tackling different aspects of AI — in fact, in our schema, logic is a class of models
which can be supported by machine learning. An active area of research is to
combine the richness of logic with the robustness and agility of machine learning.
