MAchineLearningNotes
o Image Classification: You train a model on labeled images. Later, you give it a
new image and expect it to recognize the object in that image.
o Market Prediction/Regression: You train a model on historical market data and
ask it to predict future prices.
Unsupervised learning: No labels are given to the learning algorithm, leaving it on its
own to find structure in its input. It is used, for example, to cluster populations into
different groups. Unsupervised learning can also be a goal in itself (discovering hidden
patterns in data).
o Clustering: You ask the computer to group similar data points into clusters. This
is essential in research and science.
o High-Dimension Visualization: Use the computer to help us visualize
high-dimensional data.
o Generative Models: Once a model has captured the probability distribution of your
input data, it can generate more data from that distribution. The generated data can
be used, for example, to make a classifier more robust.
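To make the clustering idea concrete, here is a minimal sketch of k-means, one common clustering algorithm (the notes above do not name a specific one). It uses plain Python and no labels. The function names, the toy points, and the choice to start from the first k points as centroids are illustrative assumptions, not a recommended setup.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids)."""
    # Assumption for this sketch: use the first k points as starting centroids.
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = mean_point(members)
    return labels, centroids


# Toy data: two visibly separate groups, with no labels attached.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The algorithm never sees a label; the cluster indices 0 and 1 are arbitrary names for the groups it found.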
A simple diagram that clarifies the difference between supervised and unsupervised
learning is shown below:
As the diagram shows, the data in supervised learning is labeled, whereas the data in
unsupervised learning is unlabeled.
Semi-supervised learning: Problems where you have a large amount of input data but
only some of it is labeled are called semi-supervised learning problems. These
problems sit between supervised and unsupervised learning. An example is a photo
archive where only some of the images are labeled (e.g. dog, cat, person) and the
majority are unlabeled.
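One simple way to use the unlabeled majority is self-training: repeatedly give the most confident unlabeled item a pseudo-label and then treat it as labeled. The sketch below assumes each photo has already been reduced to a single hypothetical numeric feature, and it uses distance to the nearest labeled item as the confidence measure. Both are simplifications chosen for illustration, not a method taken from these notes.

```python
def self_train(labeled, unlabeled):
    """Pseudo-label 1-D values by nearest labeled neighbour, closest first.

    labeled:   list of (value, label) pairs
    unlabeled: list of values without labels
    Returns a dict mapping each unlabeled value to its pseudo-label.
    """
    labeled = list(labeled)      # copy so the caller's list is not modified
    remaining = list(unlabeled)
    pseudo_labels = {}
    while remaining:
        best_value, best_label, best_dist = None, None, float("inf")
        # Find the unlabeled value that lies closest to any labeled value.
        for value in remaining:
            for known_value, known_label in labeled:
                dist = abs(value - known_value)
                if dist < best_dist:
                    best_value, best_label, best_dist = value, known_label, dist
        # Accept that pseudo-label and let it guide later decisions.
        pseudo_labels[best_value] = best_label
        labeled.append((best_value, best_label))
        remaining.remove(best_value)
    return pseudo_labels


# Two labeled photos and four unlabeled ones (feature values are made up).
labeled_photos = [(1.0, "cat"), (9.0, "dog")]
unlabeled_photos = [2.0, 3.5, 8.0, 6.5]
print(self_train(labeled_photos, unlabeled_photos))
```

Note that 6.5 is closer to the labeled dog at 9.0 than to the labeled cat at 1.0, but only by a small margin. It is the pseudo-labeled point at 8.0 that makes the "dog" decision clear, which is the sense in which unlabeled data helps.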
Reinforcement learning: A computer program interacts with a dynamic environment in
which it must achieve a certain goal (such as driving a vehicle or playing a game against
an opponent). The program receives feedback in the form of rewards and punishments as
it navigates its problem space.
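The reward-and-punishment loop can be sketched with a deliberately tiny environment: two possible actions, one always punished (-1) and one always rewarded (+1). The program below uses an epsilon-greedy rule, which mostly picks the action with the best estimated reward and occasionally explores at random. The environment, the function name, and the parameter values are assumptions made for this illustration; real tasks such as driving or game playing also involve states that change over time.

```python
import random


def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Learn action values by trial and error with an epsilon-greedy policy.

    rewards: fixed reward (or punishment) for each action, hidden from the
             learner until it tries that action.
    Returns (estimates, counts) for each action.
    """
    rng = random.Random(seed)
    n_actions = len(rewards)
    estimates = [0.0] * n_actions  # current guess of each action's reward
    counts = [0] * n_actions       # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: random action
        else:
            # exploit: action with the highest estimated reward so far
            action = max(range(n_actions), key=lambda a: estimates[a])
        reward = rewards[action]               # feedback from the environment
        counts[action] += 1
        # Incremental average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Action 0 is punished, action 1 is rewarded.
estimates, counts = run_bandit([-1.0, 1.0])
print(estimates, counts)
```

After a few steps the program should settle on action 1, trying action 0 only when it explores.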
2. The two most common use cases of supervised learning are:
Classification: Inputs are divided into two or more classes, and the learner must produce
a model that assigns unseen inputs to one of these classes, or to more than one in the
case of multi-label classification. In other words, the model predicts whether or not
something belongs to a particular class. This is typically tackled in a supervised way.
Classification models can be categorized into two groups: binary classification and
multiclass classification. Spam filtering is an example of binary classification, where
the inputs are email (or other) messages and the classes are “spam” and “not spam”.
Regression: This is also a supervised learning problem, but the model predicts a numeric
value, so the outputs are continuous rather than discrete. An example is predicting stock
prices using historical data.
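The contrast between the two use cases can be sketched in a few lines of plain Python. The classifier below is a 1-nearest-neighbour rule on two hypothetical email features (number of links, number of exclamation marks), and the regressor is an ordinary least-squares line fitted to a made-up price series. The features, data, and function names are assumptions for illustration only; practical spam filters and price models are far more involved.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def classify_nearest(train, query):
    """Classification: return the label of the closest training example."""
    _, label = min(train, key=lambda item: squared_distance(item[0], query))
    return label


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Classification: the output is a discrete class.
# Hypothetical features: (number of links, number of exclamation marks).
emails = [
    ((5, 7), "spam"), ((6, 4), "spam"), ((4, 6), "spam"),
    ((0, 1), "not spam"), ((1, 0), "not spam"), ((0, 0), "not spam"),
]
print(classify_nearest(emails, (5, 5)))  # spam

# Regression: the output is a continuous number.
# Made-up prices for days 1..5, then a prediction for day 6.
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = fit_line(days, prices)
print(slope * 6 + intercept)  # 20.0
```

Both functions learn from labeled examples; the only difference is whether the label is a category or a number.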
An example of classification and regression on two different datasets is shown below: