Chapter 1
Artificial intelligence :
• AI can be described as the effort to automate intellectual tasks normally
performed by humans.
• As such, AI is a general field that encompasses machine learning and deep
learning, but that also includes many more approaches that may not
involve any learning.
• Symbolic AI :
▪ Artificial intelligence achieved by having programmers handcraft a
sufficiently large set of explicit rules for manipulating knowledge
stored in explicit databases
◦ ex: early chess programs
▪ Although symbolic AI proved suitable to solve well-defined, logical
problems, such as playing chess, it turned out to be intractable to
figure out explicit rules for solving more complex, fuzzy problems,
such as image classification, speech recognition, or natural language
translation
Machine Learning :
• A machine learning system is trained rather than explicitly programmed.
It’s presented with many examples relevant to a task, and it finds statistical
structure in these examples that eventually allows the system to come up
with rules for automating the task.
• Machine learning is related to mathematical statistics, but it differs from
statistics in several important ways
• Unlike statistics, machine learning tends to deal with large, complex
datasets (such as a dataset of millions of images, each consisting of tens
of thousands of pixels) for which classical statistical analysis such as
Bayesian analysis would be impractical.
• To do machine learning we need three things :
a. Input data points—For instance, if the task is speech recognition,
these data points could be sound files of people speaking. If the
task is image tagging, they could be pictures.
b. Examples of the expected output—In a speech-recognition task,
these could be human-generated transcripts of sound files. In an
image task, expected outputs could be tags such as “dog,” “cat,”
and so on.
c. A way to measure whether the algorithm is doing a good job—This
is necessary in order to determine the distance between the
algorithm’s current output and its expected output. The
measurement is used as a feedback signal to adjust the way the
algorithm works. This adjustment step is what we call learning.
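The three ingredients above can be sketched concretely. This is a minimal, illustrative toy task (predicting y = 2·x); the example data, the candidate rule, and the `mean_squared_error` helper are all assumptions made up for illustration, not anything from the notes.

```python
# a. Input data points (here: plain numbers standing in for sound files or images)
inputs = [1.0, 2.0, 3.0, 4.0]

# b. Examples of the expected output (the true mapping is y = 2 * x)
targets = [2.0, 4.0, 6.0, 8.0]

# c. A way to measure whether the algorithm is doing a good job:
#    mean squared error between the algorithm's output and the expected output.
def mean_squared_error(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Score a candidate rule ("multiply by 1.5") with the feedback signal.
# A learning system would use this score to adjust the rule.
predictions = [1.5 * x for x in inputs]
score = mean_squared_error(predictions, targets)
```

The score is the feedback signal: a lower value means the candidate rule's outputs are closer to the expected outputs.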
• A machine learning model transforms its input data into meaningful
outputs, a process that is “learned” from exposure to known examples of
inputs and outputs. Therefore, the central problem in machine learning
and deep learning is to meaningfully transform data: in other words, to
learn useful representations of the input data at hand—representations
that get us closer to the expected output
• What’s a representation? At its core, it’s a different way to look at data—to
represent or encode data.
• Machine learning models are all about finding appropriate representations
for their input data—transformations of the data that make it more
amenable to the task at hand.
• Learning, in the context of machine learning, describes an automatic
search process for data transformations that produce useful
representations of some data, guided by some feedback signal—
representations that are amenable to simpler rules solving the task at hand
• Machine learning algorithms aren’t usually creative in finding these
transformations; they’re merely searching through a predefined set of
operations, called a hypothesis space.
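A toy illustration of a useful representation, under assumed data: points on a plane labeled by whether they lie above the line y = x. No threshold on the raw x or y coordinate alone separates the classes, but the transformed feature d = y − x does. The transformation is handcrafted here; a learning algorithm would instead search a hypothesis space of such transformations, guided by a feedback signal.

```python
# Illustrative data: (x, y) points, labeled 1 if the point lies above y = x.
points = [(1.0, 3.0), (2.0, 0.5), (0.0, 2.0), (3.0, 1.0)]
labels = [1, 0, 1, 0]

def representation(point):
    # A different way to encode the data: distance above the line y = x.
    x, y = point
    return y - x

# With this representation, a very simple rule (d > 0) solves the task.
predictions = [1 if representation(p) > 0 else 0 for p in points]
```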
Deep Learning :
• Deep learning is a specific subfield of machine learning: a new take on
learning representations from data that puts an emphasis on learning
successive layers of increasingly meaningful representations.
• The “deep” in “deep learning” isn’t a reference to any kind of deeper
understanding achieved by the approach; rather, it stands for this idea of
successive layers of representations.
• The number of layers contributing to a model of the data is called the
depth of the model
• In deep learning, these layered representations are learned via models
called neural networks, structured in literal layers stacked on top of each
other
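The idea of literal stacked layers can be sketched with NumPy. Everything here is illustrative: the shapes, the random values, and the choice of a ReLU nonlinearity are assumptions, not a real trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(inputs, weights, bias):
    # One layer: an affine transform followed by a nonlinearity (ReLU).
    # Each layer re-encodes its input into a new representation.
    return np.maximum(0.0, inputs @ weights + bias)

x = rng.normal(size=(1, 4))                       # input representation
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)     # first layer's weights
w2, b2 = rng.normal(size=(8, 2)), np.zeros(2)     # second layer's weights

h = dense(x, w1, b1)     # representation produced by layer 1
out = dense(h, w2, b2)   # representation produced by layer 2: depth = 2
```

Each layer's output is the next layer's input, which is exactly the "successive layers of representations" the notes describe.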
Understanding how deep learning works, in three figures :
• The specification of what a layer does to its input data is stored in the
layer’s weights, which in essence are a bunch of numbers
• Weights are also sometimes called the parameters of a layer
• In this context, learning means finding a set of values for the weights of all
layers in a network, such that the network will correctly map example
inputs to their associated targets.
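"Learning means finding a set of values for the weights" can be made concrete with a deliberately tiny search. This sketch assumes a one-weight layer y = w·x and made-up example data; it brute-forces candidate weight values and keeps the one that best maps inputs to targets (real networks have far too many weights for this, which is why gradient-based optimizers are used instead).

```python
# Illustrative examples: the true mapping is y = 3 * x.
inputs = [1.0, 2.0, 3.0]
targets = [3.0, 6.0, 9.0]

def total_error(w):
    # How badly the weight value w maps example inputs to their targets.
    return sum((w * x - t) ** 2 for x, t in zip(inputs, targets))

# "Learning" as search: try candidate weight values 0.0, 0.1, ..., 5.0
# and keep the one with the lowest error.
candidates = [i / 10 for i in range(0, 51)]
best_w = min(candidates, key=total_error)
```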
• Loss function (or objective function, or cost function): measures how far
this output is from what you expected.
• The fundamental trick in deep learning is to use this score (from Loss
function ) as a feedback signal to adjust the value of the weights a little, in
a direction that will lower the loss score for the current example
• This adjustment is the job of the optimizer, which implements what’s called
the Backpropagation algorithm
• Initially, the weights of the network are assigned random values, so
the network merely implements a series of random transformations
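The full loop described above can be sketched for a single weight: start from a random value, measure the loss, and let the optimizer nudge the weight in the direction that lowers it. This sketch uses the analytic gradient of squared error directly (real deep learning computes gradients for all weights via backpropagation); the data and learning rate are made-up assumptions.

```python
import random

random.seed(0)

# Illustrative examples: the true mapping is y = 2 * x.
inputs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0, 4.0, 6.0, 8.0]

# Initially the weight is random, so the "network" implements a
# meaningless transformation and the loss is high.
w = random.uniform(-1.0, 1.0)
learning_rate = 0.01

for _ in range(200):
    # Loss: mean of (w*x - t)^2. Its gradient with respect to w tells the
    # optimizer which direction lowers the loss for these examples.
    grad = sum(2 * (w * x - t) * x for x, t in zip(inputs, targets)) / len(inputs)
    w -= learning_rate * grad  # adjust the weight a little

# After repeated small adjustments, w approaches the value (2.0)
# that correctly maps example inputs to their targets.
```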