
Chapter 1

The document introduces machine learning as a method for analyzing vast amounts of data by detecting patterns to make predictions or decisions. It categorizes machine learning into supervised, unsupervised, and reinforcement learning, with supervised learning further divided into classification and regression tasks. Real-world applications of these methods include document classification, image recognition, and stock market predictions.

Uploaded by

Saman
Copyright
© All Rights Reserved

Chapter 1:- Introduction

Machine Learning: What and Why?


We are drowning in information and starving for knowledge. — John Naisbitt.
We are entering the era of big data. For example, there are about 1 trillion web pages; one
hour of video is uploaded to YouTube every second, amounting to 10 years of content every
day; the genomes of thousands of people, each of which has a length of 3.8 × 10^9 base pairs,
have been sequenced by various labs; Walmart handles more than 1M transactions per hour
and has databases containing more than 2.5 petabytes of information; and so on.
This deluge of data calls for automated methods of data analysis, which is what machine
learning provides. In particular, we define machine learning as a set of methods that can
automatically detect patterns in data, and then use the uncovered patterns to predict future
data, or to perform other kinds of decision making under uncertainty (such as planning how
to collect more data!).
Types of Machine Learning
Machine learning is usually divided into two main types. In the
predictive or supervised learning approach, the goal is to learn a mapping from inputs x to
outputs y, given a labeled set of input-output pairs D = {(x_i, y_i)}_{i=1}^N. Here D is called the
training set, and N is the number of training examples. In the simplest setting, each training
input x_i is a D-dimensional vector of numbers (where D now denotes the number of
dimensions, not the training set), representing, say, the height and weight of a
person. These are called features, attributes or covariates. In general, however, x_i could be
a complex structured object, such as an image, a sentence, an email message, a time series,
a molecular shape, a graph, etc.
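
As a minimal sketch of the notation above, the training set can be held as a list of feature vectors and a parallel list of labels. The height/weight values and the labels below are made up for illustration only; the names `X`, `y` and `training_set` are assumptions, not part of the original text.

```python
# Hypothetical training set: each input x_i is a D-dimensional feature vector
# (height in cm, weight in kg) and each y_i is a label. All values are invented.
X = [
    [170.0, 65.0],
    [182.0, 90.0],
    [158.0, 52.0],
    [175.0, 80.0],
]
y = [0, 1, 0, 1]

N = len(X)      # number of training examples
D = len(X[0])   # number of features per example (the dimension of each x_i)

# The labeled pairs (x_i, y_i) for i = 1..N.
training_set = list(zip(X, y))

for i, (x_i, y_i) in enumerate(training_set, start=1):
    print(f"i={i}: x_i={x_i}, y_i={y_i}")
print(f"N={N}, D={D}")
```

Here each row of `X` plays the role of one x_i, so `X` as a whole is an N × D table of numbers.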

Similarly, the form of the output or response variable can in principle be anything, but most
methods assume that y_i is a categorical or nominal variable from some finite set, y_i ∈
{1,...,C} (such as male or female), or that y_i is a real-valued scalar (such as income level).
When y_i is categorical, the problem is known as classification or pattern recognition, and
when y_i is real-valued, the problem is known as regression. Another variant, known as
ordinal regression, occurs where the label space Y has some natural ordering, such as grades A–F.
The second main type of machine learning is the descriptive or unsupervised learning
approach. Here we are only given inputs, D = {x_i}_{i=1}^N, and the goal is to find “interesting
patterns” in the data. This is sometimes called knowledge discovery. This is a much less
well-defined problem, since we are not told what kinds of patterns to look for, and there is
no obvious error metric to use (unlike supervised learning, where we can compare our
prediction of y for a given x to the observed value).
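
One common way to look for patterns in unlabeled inputs is clustering. The text does not name a specific algorithm, so the sketch below swaps in plain k-means as an illustration; the data points, the starting centroids and the function names are all assumptions.

```python
def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
    return dists.index(min(dists))


def kmeans(points, centroids, n_iters=10):
    """Plain k-means on 2-D points; `centroids` holds the initial guesses."""
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(m[0] for m in members) / len(members),
             sum(m[1] for m in members) / len(members)) if members else c
            for c, members in zip(centroids, clusters)
        ]
    # Final cluster index for every input point.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Made-up inputs with no labels attached: two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(labels)
```

Note that nothing in the data says these groups are the "right" answer, which is exactly the lack of an obvious error metric described above.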
There is a third type of machine learning, known as reinforcement learning, which is
somewhat less commonly used. This is useful for learning how to act or behave when given
occasional reward or punishment signals. (For example, consider how a baby learns to
walk.)
Semi-supervised learning lies between supervised learning (with labelled training data) and
unsupervised learning (with no labelled training data): it uses a combination of labelled and
unlabelled data during training.

Figure 1.1: Left: Some labeled training examples of colored shapes, along with 3 unlabeled
test cases. Right: Representing the training data as an N × D design matrix. Row i represents
the feature vector x_i. The last column is the label, y_i ∈ {0, 1}.

Classification

In this section, we discuss classification. Here the goal is to learn a mapping from inputs x to
outputs y, where y ∈ {1,...,C}, with C being the number of classes. If C = 2, this is called
binary classification (in which case we often assume y ∈ {0, 1}); if C > 2, this is called
multiclass classification. If the class labels are not mutually exclusive (e.g., somebody may
be classified as tall and strong), we call it multi-label classification, but this is best viewed as
predicting multiple related binary class labels (a so-called multiple output model). When we
use the term “classification”, we will mean multiclass classification with a single output,
unless we state otherwise.
One way to formalize the problem is as function approximation. We assume y = f(x) for
some unknown function f, and the goal of learning is to estimate the function f given a
labeled training set, and then to make predictions using ŷ = f̂(x). (We use the hat symbol to
denote an estimate.) Our main goal is to make predictions on novel inputs, meaning ones
that we have not seen before (this is called generalization), since predicting the response on
the training set is easy (we can just look up the answer).
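
The difference between looking up a training answer and generalizing can be sketched as follows. The 1-D data are made up, and the nearest-neighbour rule is only one possible choice of estimate f̂, used here as an assumption for illustration rather than as the method of the text.

```python
# Hypothetical 1-D training inputs and their binary class labels.
train_x = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
train_y = [0, 0, 0, 1, 1, 1]

# Memorization: a table from seen inputs to their labels.
lookup = dict(zip(train_x, train_y))


def predict_lookup(x):
    """Return the stored label, or None if x was never seen in training."""
    return lookup.get(x)


def predict_1nn(x):
    """One possible f_hat: copy the label of the closest training input."""
    nearest_i = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest_i]


print(predict_lookup(2.0))  # 0    (seen in training, so lookup works)
print(predict_lookup(2.5))  # None (novel input, lookup has no answer)
print(predict_1nn(2.5))     # 0    (generalizes from nearby training points)
print(predict_1nn(7.4))     # 1
```

Both predictors are perfect on the training inputs; only the second one says anything about inputs it has not seen.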
Example:- As a simple toy example of classification, consider the problem illustrated in
Figure 1.1(a). We have two classes of object which correspond to labels 0 and 1. The inputs
are colored shapes. These have been described by a set of D features or attributes, which
are stored in an N × D design matrix X, shown in Figure 1.1(b). The input features x can be
discrete, continuous or a combination of the two. In addition to the inputs, we have a vector
of training labels y. In Figure 1.1, the test cases are a blue crescent, a yellow circle and a blue
arrow. None of these have been seen before. Thus we are required to generalize beyond
the training set. A reasonable guess is that the blue crescent should be y = 1, since all blue
shapes are labeled 1 in the training set. The yellow circle is harder to classify, since some
yellow things are labeled y = 1 and some are labeled y = 0, and some circles are labeled y = 1
and some y = 0. Consequently it is not clear what the right label should be in the case of the
yellow circle. Similarly, the correct label for the blue arrow is unclear.
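
The reasoning in this example can be sketched as a tally of which labels each feature value has appeared with. The actual contents of Figure 1.1 are not reproduced here, so the training rows below are hypothetical and were chosen only to agree with the description above (every blue shape has label 1; yellow shapes and circles appear with both labels).

```python
# Hypothetical training rows: (colour, shape, label). Not the figure's real data.
train = [
    ("blue", "square", 1),
    ("blue", "circle", 1),
    ("yellow", "circle", 0),
    ("yellow", "star", 1),
    ("red", "star", 0),
]


def labels_seen(feature_index, value):
    """Sorted list of labels that co-occur with a feature value in training."""
    return sorted({row[2] for row in train if row[feature_index] == value})


# The three test cases named in the text.
test_cases = [("blue", "crescent"), ("yellow", "circle"), ("blue", "arrow")]
for colour, shape in test_cases:
    print(colour, shape,
          "| labels seen with this colour:", labels_seen(0, colour),
          "| labels seen with this shape:", labels_seen(1, shape))
```

For the yellow circle both tallies contain 0 and 1, which mirrors why its label is unclear. The tally only records co-occurrence; it does not by itself decide the cases the text calls unclear.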

Here are some examples of real-world classification applications.

• Document classification and email spam filtering
• Classifying flowers
• Image classification and handwriting recognition
• Face detection and recognition

Regression

Regression is just like classification except the response variable is continuous. The figure shows a simple
example: we have a single real-valued input x_i ∈ R, and a single real-valued response y_i ∈ R. We
consider fitting two models to the data: a straight line and a quadratic function.

(a) Linear regression on some 1d data. (b) Same data with polynomial regression (degree 2).
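
Fitting a straight line and a quadratic to the same 1-D data can be sketched as below. The text does not say how the fits in the figure were computed; this sketch assumes ordinary least squares solved through the normal equations, and the data values and function names are made up for illustration.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [w0, w1, ...] of w0 + w1*x + w2*x^2 + ...
    """
    n = degree + 1
    # Normal equations M w = b, built from power sums of the inputs.
    M = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (b[r] - s) / M[r][r]
    return w


def predict(w, x):
    """Evaluate the fitted polynomial at x."""
    return sum(wk * x ** k for k, wk in enumerate(w))


def mean_squared_error(w, xs, ys):
    """Average squared difference between predictions and responses."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up 1-D data that roughly follows a curve.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 1.9, 5.2, 9.8, 17.1, 26.0]

line = fit_polynomial(xs, ys, degree=1)
quad = fit_polynomial(xs, ys, degree=2)
mse_line = mean_squared_error(line, xs, ys)
mse_quad = mean_squared_error(quad, xs, ys)
print("line coefficients:", line, "training MSE:", mse_line)
print("quadratic coefficients:", quad, "training MSE:", mse_quad)
```

On the training data the quadratic can never do worse than the line, because a line is the special case of a quadratic with a zero x^2 coefficient. A lower training error does not by itself show that the model predicts novel inputs better.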

Here are some examples of real-world regression problems.

• Predict tomorrow’s stock market price given current market conditions and other possible
  side information.
• Predict the age of a viewer watching a given video on YouTube.
• Predict the location in 3d space of a robot arm end effector, given control signals (torques)
  sent to its various motors.
• Predict the temperature at any location inside a building using weather data, time, door
  sensors, etc.
