Ch4

Uploaded by

Siddhesh Yevale
Copyright
© All Rights Reserved
Unit – 4

Machine Learning Essentials


4.1. Introduction
Machine learning is a rapidly developing field of technology that allows computers to learn
automatically from past data. It employs a variety of algorithms to build mathematical models and
make predictions based on historical data or information. It is currently used for a variety of
tasks, including speech recognition, email filtering, auto-tagging on Facebook, recommender
systems, and image recognition.
4.2. Definition of Machine Learning
Machine learning (ML) is a type of Artificial Intelligence (AI) that allows computers to learn without
being explicitly programmed. It involves feeding data into algorithms that can then identify patterns
and make predictions on new data.
More formally (Tom Mitchell's definition): a computer program is said to learn from experience E
with respect to some class of tasks T and performance measure P if its performance at tasks in T,
as measured by P, improves with experience E.
Examples:
I. Handwriting recognition learning problem
✓ Task T: Recognizing and classifying handwritten words within images
✓ Performance P: Percent of words correctly classified
✓ Training experience E: A dataset of handwritten words with given classifications

II. A robot driving learning problem


✓ Task T: Driving on highways using vision sensors
✓ Performance P: Average distance traveled before an error
✓ Training experience E: A sequence of images and steering commands recorded
while observing a human driver

4.3. How does Machine Learning work?


✓ A machine learning system builds prediction models by learning from previous data, and predicts
the output for new data whenever it receives it. In general, the more representative data the
system is trained on, the better the model and the more accurate its predicted output.
✓ Let's say we have a complex problem in which we need to make predictions. Instead of writing
the rules by hand as code, we feed the data to generic algorithms, which build the logic from the
data and predict the output. Machine learning has changed the way we approach such problems.
[Figure: block diagram of the machine learning algorithm's operation]
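The idea of letting an algorithm build the logic from data can be sketched as follows. This is a minimal illustration, not a method described in these notes: the data values, the single "hours studied" feature, and the function names are all hypothetical. Instead of hard-coding a pass/fail cut-off, the cut-off is chosen from labeled examples.

```python
# Hypothetical training data: (hours studied, outcome). Values are made up.
training_data = [(0, "fail"), (1, "fail"), (2, "fail"),
                 (6, "pass"), (8, "pass"), (9, "pass")]

def learn_threshold(data):
    """Pick the cut-off that misclassifies the fewest training examples."""
    values = sorted(x for x, _ in data)
    # Candidate cut-offs: midpoints between consecutive feature values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        # An example is wrong when "x > t" disagrees with its "pass" label.
        return sum((x > t) != (label == "pass") for x, label in data)

    return min(candidates, key=errors)

def predict(threshold, x):
    """Apply the learned rule to a new input."""
    return "pass" if x > threshold else "fail"

threshold = learn_threshold(training_data)  # derived from data, not hand-coded
print(threshold, predict(threshold, 7))
```

If the training data changes, the learned cut-off changes with it and no code has to be rewritten, which is the shift in perspective the section describes.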

4.4. Features of Machine Learning:


1. Machine learning uses data to detect various patterns in a given dataset.
2. It can learn from past data and improve automatically.
3. It is a data-driven technology.
4. Machine learning is similar to data mining in that it also deals with huge amounts of data.

4.5. Importance of Machine Learning:


1. Rapid growth in the production of data
2. Solving complex problems that are difficult for humans
3. Decision making in various sectors, including finance
4. Finding hidden patterns and extracting useful information from data.

4.6. Classification of Machine Learning


At a broad level, machine learning can be classified into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
4.6.1. Supervised Learning
✓ In supervised learning, labeled sample data are provided to the machine learning system for
training, and the system then predicts the output based on what it learned from the training data.
✓ The system uses the labeled data to build a model that captures the relationship between the
inputs and their labels. After training is done, we test the model with sample data to see
whether it can accurately predict the output.
✓ The objective of supervised learning is to map input data to output data. Supervised learning
depends on supervision, in the same way that a student learns under the supervision of a
teacher. Spam filtering is an example of supervised learning.
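The train-then-test workflow described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the two features (number of links, number of exclamation marks), the data values, and the choice of a 1-nearest-neighbour model are hypothetical and are not taken from these notes.

```python
# Hypothetical labeled emails: (links, exclamation marks) -> label.
train = [((0, 0), "not spam"), ((1, 0), "not spam"),
         ((7, 5), "spam"), ((9, 4), "spam")]
# Held-out labeled samples used only for testing.
test = [((1, 1), "not spam"), ((8, 6), "spam")]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def predict(train_set, x):
    """1-nearest-neighbour: return the label of the closest training example."""
    _, label = min(train_set, key=lambda pair: squared_distance(pair[0], x))
    return label

def accuracy(train_set, test_set):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(predict(train_set, x) == y for x, y in test_set)
    return correct / len(test_set)

test_accuracy = accuracy(train, test)
print(test_accuracy)
```

Keeping the test samples separate from the training samples is what makes the accuracy figure a check on new data rather than on memorized data.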

Supervised learning can be further divided into two categories of algorithms:


a. Classification
b. Regression

Classification:
✓ The Classification algorithm is a Supervised Learning technique used to identify the
category of new observations on the basis of training data. In Classification, a program learns
from the given dataset or observations and then assigns each new observation to one of a number
of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog. Classes are also
called targets, labels, or categories.
✓ Unlike regression, the output variable of Classification is a category, not a numeric value,
for example "Green or Blue" or "fruit or animal". Because Classification is a Supervised
Learning technique, it takes labeled input data, meaning each input comes with its
corresponding output.
✓ A well-known example of an ML classification algorithm is an email spam detector.
✓ The main goal of a Classification algorithm is to identify the category of a given observation;
these algorithms are mainly used to predict the output for categorical data.
✓ Classification can be pictured as a diagram with two classes, Class A and Class B. Members of
the same class have features that are similar to each other and dissimilar to those of the other
class.

The algorithm that implements classification on a dataset is known as a classifier. There are two
types of classifiers:
1. Binary Classifier: If the classification problem has only two possible outcomes, the
classifier is called a Binary Classifier.

Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
2. Multi-class Classifier: If the classification problem has more than two outcomes, the
classifier is called a Multi-class Classifier.
Examples: Classification of types of crops, classification of types of music.
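A multi-class classifier can be sketched with a nearest-centroid rule. This is one simple technique chosen for illustration, not one named in these notes, and the feature values and class names below are hypothetical. With only two classes in the data, the same code behaves as a binary classifier.

```python
from collections import defaultdict

# Hypothetical training data: (feature vector, class label), three classes.
samples = [
    ((1.0, 1.0), "A"), ((1.0, 2.0), "A"),
    ((5.0, 5.0), "B"), ((6.0, 5.0), "B"),
    ((9.0, 1.0), "C"), ((9.0, 2.0), "C"),
]

def fit_centroids(data):
    """Average the feature vectors of each class to get one centroid per class."""
    grouped = defaultdict(list)
    for features, label in data:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def classify(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], x))
    return min(centroids, key=dist)

centroids = fit_centroids(samples)
print(classify(centroids, (2.0, 1.0)), classify(centroids, (8.0, 1.0)))
```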

Regression:
✓ Regression analysis is a statistical method for modeling the relationship between a dependent
(target) variable and one or more independent (predictor) variables.
✓ More specifically, Regression analysis helps us understand how the value of the dependent
variable changes with one independent variable when the other independent variables are held
fixed. It predicts continuous/real values such as temperature, age, salary, price, etc.
Example: Suppose there is a marketing company A, which runs various advertisements every year
and earns sales from them.
[Table: advertisement spend by the company in each of the last 5 years and the corresponding sales]

✓ Now, the company wants to spend $200 on advertisement in the year 2019 and wants to predict
the sales for that year. To solve this type of prediction problem in machine learning, we need
regression analysis.

✓ Regression is a supervised learning technique that helps in finding the correlation between
variables and enables us to predict a continuous output variable based on one or more
predictor variables. It is mainly used for prediction, forecasting, time series modeling, and
studying cause-and-effect relationships between variables.

✓ In Regression, we fit a line or curve to the given datapoints; using this fit, the machine
learning model can make predictions about the data. In simple words, "Regression finds a line
or curve on the target-predictor graph such that the vertical distance between the datapoints
and the regression line is minimized." The line does not have to pass through every datapoint.
The distance between the datapoints and the line tells whether the model has captured a strong
relationship or not.
Some examples of regression are:

o Prediction of rain using temperature and other factors


o Determining Market trends
o Prediction of road accidents due to rash driving.
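The "vertical distance" between datapoints and a candidate line can be made concrete by summing the squared distances. Using the sum of squares is a common convention assumed here, not something these notes specify, and the datapoints and the two candidate lines below are hypothetical.

```python
# Hypothetical (x, y) observations.
points = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]

def sum_squared_residuals(intercept, slope, data):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in data)

# Two candidate lines: y = 2x follows the points closely, y = x does not.
good_fit = sum_squared_residuals(0.0, 2.0, points)
poor_fit = sum_squared_residuals(0.0, 1.0, points)
print(good_fit < poor_fit)
```

The line with the smaller total is the better fit; regression algorithms search for the line that makes this total as small as possible.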

4.6.2. Unsupervised Learning


Unsupervised learning is a learning method in which a machine learns without any supervision.

The machine is trained on a set of data that has not been labeled, classified, or
categorized, and the algorithm needs to act on that data without any supervision. The goal of
unsupervised learning is to restructure the input data into new features or groups of objects with
similar patterns.

In unsupervised learning, we don't have a predetermined result. The machine tries to find useful
insights in a huge amount of data. It can be further classified into two categories of algorithms:

o Clustering
o Association

Clustering:

✓ Clustering, or cluster analysis, is a machine learning technique that groups an unlabeled
dataset. It can be defined as "a way of grouping the data points into different clusters, each
consisting of similar data points. Objects that are similar to each other remain in one group,
which has few or no similarities with another group."
✓ It does this by finding similar patterns in the unlabeled dataset, such as shape, size, color,
or behavior, and dividing the data according to the presence or absence of those patterns.
✓ It is an unsupervised learning method; hence no supervision is provided to the algorithm, and it
deals with an unlabeled dataset.
✓ After the clustering technique is applied, each cluster or group is given a cluster ID.
An ML system can use this ID to simplify the processing of large and complex datasets.
✓ The clustering technique is commonly used for statistical data analysis.
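Grouping unlabeled points and assigning cluster IDs can be sketched with k-means. That is one common clustering algorithm picked here for illustration; these notes do not name a specific algorithm. The one-dimensional data and the starting centroids below are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Plain k-means on 1-D data with caller-supplied starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    # The cluster ID of a value is the index of its nearest final centroid.
    cluster_ids = [min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
                   for v in values]
    return centroids, cluster_ids

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]   # unlabeled data
centroids, cluster_ids = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids, cluster_ids)
```

No labels are supplied at any point; the two groups emerge only from how close the values are to each other.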
Association:

✓ Association rule learning is a type of unsupervised learning technique that checks for the
dependency of one data item on another data item and maps them accordingly, so that the
relationship can be exploited, for example to increase profit.
✓ It tries to find interesting relations or associations among the variables of a dataset, using
different rules to discover these relations between variables in the database.
✓ Association rule learning is one of the important concepts of machine learning, and it
is employed in Market Basket analysis, Web usage mining, continuous production, etc.
✓ Market basket analysis is a technique used by big retailers to discover the
associations between items. For example, in a supermarket, products that are frequently
purchased together are placed close to each other.
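The dependency of one item on another in market basket analysis is commonly quantified with two measures, support and confidence. These measures are standard in association rule learning but are not introduced in these notes, and the baskets below are hypothetical.

```python
# Hypothetical shopping baskets (one set of items per transaction).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing antecedent, the fraction that also contain consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

# Rule under test: "customers who buy bread also buy butter".
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(round(rule_support, 2), round(rule_confidence, 2))
```

In this made-up data, bread and butter appear together in 3 of 5 baskets, and 3 of the 4 baskets with bread also contain butter.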

4.6.3. Reinforcement Learning


Reinforcement learning is a feedback-based learning method in which a learning agent gets a reward
for each right action and a penalty for each wrong action. The agent learns automatically from this
feedback and improves its performance. In reinforcement learning, the agent interacts with the
environment and explores it. The goal of the agent is to collect the most reward points, and hence
it improves its performance. A robotic dog that automatically learns how to move its limbs is
an example of reinforcement learning.
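The reward-and-penalty loop can be sketched with a very small agent. Everything here is a hypothetical illustration rather than a method from these notes: the two actions, the reward values, the epsilon-greedy exploration rule, and the parameter names are assumptions.

```python
import random

def run_agent(steps=200, epsilon=0.1, learning_rate=0.5, seed=0):
    """Agent that learns action values from rewards and penalties."""
    rng = random.Random(seed)
    # Hypothetical environment: one action is rewarded, the other penalized.
    feedback = {"forward": 1.0, "backward": -1.0}
    value = {"forward": 0.0, "backward": 0.0}   # the agent's current estimates
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(value))     # explore: try a random action
        else:
            action = max(value, key=value.get)   # exploit: best known action
        reward = feedback[action]
        # Move the estimate for this action toward the reward just received.
        value[action] += learning_rate * (reward - value[action])
        total_reward += reward
    return value, total_reward

estimates, total_reward = run_agent()
print(estimates["forward"] > estimates["backward"])
```

The agent is never told which action is right; it ends up preferring the rewarded action only because of the feedback it collects.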

4.7. Linear Regression:


✓ Linear regression is one of the easiest and most popular Machine Learning algorithms. It is a
statistical method that is used for predictive analysis. Linear regression makes predictions for
continuous/real or numeric variables such as sales, salary, age, product price, etc.
✓ The linear regression algorithm models a linear relationship between a dependent variable (y) and
one or more independent variables (x), hence the name linear regression. Because the relationship is
linear, the model shows how the value of the dependent variable changes with the value of the
independent variable.
✓ The linear regression model provides a sloped straight line representing the relationship between the
variables.
[Figure: linear regression line]

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,

y = Dependent variable (target variable)
x = Independent variable (predictor variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
ε = Random error

The observed values of the x and y variables form the training dataset used to fit the linear regression model.
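The coefficients a0 and a1 can be estimated from training data with the ordinary least-squares formulas. This is a minimal sketch: least squares is the usual fitting method but is not stated in these notes, and the data below is hypothetical and noise-free (ε = 0), generated from y = 1 + 2x.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (a0, a1) for y = a0 + a1*x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a0 = mean_y - a1 * mean_x
    return a0, a1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]      # hypothetical data, exactly y = 1 + 2x
a0, a1 = fit_simple_linear_regression(xs, ys)
prediction = a0 + a1 * 6.0           # predict y for a new x
print(a0, a1, prediction)
```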

Types of Linear Regression


Linear regression can be further divided into two types of algorithm:

o Simple Linear Regression:
If a single independent variable is used to predict the value of a numerical dependent variable,
the algorithm is called Simple Linear Regression.
o Multiple Linear Regression:
If more than one independent variable is used to predict the value of a numerical dependent
variable, the algorithm is called Multiple Linear Regression.
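With more than one independent variable, the model extends to y = a0 + a1x1 + a2x2 + ... + ε. One way to estimate the coefficients is to solve the least-squares normal equations, sketched below in plain Python. The solver, the function names, and the data (generated from y = 1 + 2·x1 + 3·x2 without noise) are hypothetical, and the sketch assumes the independent variables are not perfectly collinear.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_multiple_linear_regression(rows, ys):
    """Least-squares coefficients [a0, a1, ..., ak] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]          # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
ys = [1.0 + 2.0 * x1 + 3.0 * x2 for x1, x2 in rows]   # hypothetical, noise-free
coefficients = fit_multiple_linear_regression(rows, ys)
print([round(c, 6) for c in coefficients])
```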

4.8. Logistic Regression:

✓ Logistic regression is one of the most popular Machine Learning algorithms and comes
under the Supervised Learning technique. It is used for predicting a categorical dependent
variable from a given set of independent variables.
✓ Because logistic regression predicts the output of a categorical dependent variable, the
outcome must be a categorical or discrete value such as Yes or No, 0 or 1, True or False.
However, instead of giving the exact value 0 or 1, it gives a probability that lies
between 0 and 1.
✓ Logistic Regression is very similar to Linear Regression; the two differ in how they are used.
Linear Regression is used for solving regression problems, whereas Logistic Regression is
used for solving classification problems.
✓ In Logistic Regression, instead of fitting a regression line, we fit an "S"-shaped logistic
function, whose output is bounded by the two limiting values 0 and 1.
✓ The curve of the logistic function indicates the likelihood of something, such as whether
cells are cancerous or not, or whether a mouse is obese or not based on its weight.
✓ Logistic Regression is a significant machine learning algorithm because it can
provide probabilities and classify new data using both continuous and discrete datasets.
✓ Logistic Regression can be used to classify observations using different types of data
and can easily determine which variables are most effective for the classification.
[Figure: the logistic function]
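The S-shaped logistic function and its use for classification can be sketched as follows. The coefficients a0 and a1 are hypothetical values picked by hand rather than fitted to data, and the 0.5 decision threshold is a common default assumed here, not something these notes specify.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z to a value strictly between 0 and 1."""
    # Adequate for moderate z; very large negative z would overflow math.exp.
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, a0, a1):
    """Probability of class 1 for input x, using a linear score a0 + a1*x."""
    return sigmoid(a0 + a1 * x)

def classify(x, a0, a1, threshold=0.5):
    """Turn the probability into a discrete 0/1 outcome."""
    return 1 if predict_probability(x, a0, a1) >= threshold else 0

a0, a1 = -4.0, 1.0   # hypothetical coefficients, not fitted
for x in (0.0, 4.0, 8.0):
    print(x, round(predict_probability(x, a0, a1), 3), classify(x, a0, a1))
```

The linear part a0 + a1x is the same form as in linear regression; passing it through the logistic function is what turns the output into a probability.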
