Unit 4: Machine Learning
Vinay S. Prabhavalkar
Application of Machine Learning
2. Product Recommendations
3. Image Recognition
4. Sentiment Analysis
• Sentiment analysis is one of the most widely used applications of machine
learning. It is a real-time machine learning application that determines the
emotion or opinion of a speaker or writer. For instance, if someone has
written a review or an email (or any other form of document), a sentiment
analyzer will instantly identify the underlying thought and tone of the text.
Sentiment analysis can be used in review-based websites, decision-making
applications, etc.
5. Virtual Personal Assistants
As the name suggests, Virtual Personal Assistants assist in finding useful
information when asked via text or voice. A few of the major applications of
Machine Learning here are:
• Speech Recognition
• Speech to Text Conversion
• Natural Language Processing
• Text to Speech Conversion
6. Self Driving Cars
• This is one of the coolest applications of Machine Learning, and it is
already in use. Machine Learning plays a very important role in self-driving
cars. Tesla, the leader in this business, builds its current Artificial
Intelligence on hardware from NVIDIA, based on an unsupervised learning
algorithm.
• NVIDIA stated that they did not explicitly train their model to detect
people or any particular object. The model works on Deep Learning and it
crowdsources data from all of its vehicles and drivers. It uses internal and
external sensors which are part of the IoT. According to data gathered by
McKinsey, automotive data will hold a tremendous value of around $750
billion.
7. Entertainment
Companies such as Netflix, Amazon, YouTube, and Spotify provide relevant
movie, song, and video recommendations to enhance their customer
experience.
This is all thanks to Deep Learning. Based on a person's browsing history,
interests, and behavior, online streaming companies give suggestions to help
them make product and service choices.
Deep learning techniques are also used to add sound to silent movies
and generate subtitles automatically.
What is Machine Learning
• Definition of Machine Learning: Arthur Samuel (1959) described Machine
Learning as the field of study that gives computers the ability to learn
without being explicitly programmed. Tom Mitchell's more formal definition:
a computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E.
Terminology in Machine Learning
• An Algorithm is a set of rules that a machine follows to achieve a particular goal.
• Machine Learning is a set of methods that allow computers to learn from data to
make and improve predictions (for example cancer, weekly sales, credit default).
• A Learner or Machine Learning Algorithm is the program used to learn a
machine learning model from data. Another name is "inducer" (e.g. "tree
inducer").
• A Machine Learning Model is the learned program that maps inputs to
predictions. A trained machine is also called a Model.
• A Dataset is a table with the data from which the machine learns. An Instance is
a row in the dataset. A feature is a column in the dataset.
• The Target is the information the machine learns to predict. In mathematical
formulas, the target is usually called y.
• The Prediction is what the machine learning model "guesses" the target
value should be, based on the given features.
• A Training set is used to train a machine.
• A Test Set is used to test an already trained machine.
• A validation set is used to verify a trained machine.
https://christophm.github.io/interpretable-ml-book/terminology.html
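A minimal sketch in Python (with a small hypothetical dataset) showing how these terms map to code:

import pandas as pd

# A Dataset: each row is an Instance, each column (except the target) is a Feature.
dataset = pd.DataFrame({
    "age": [25, 47, 35, 52],        # feature
    "income": [40, 90, 60, 120],    # feature (hypothetical values, in thousands)
    "default": [0, 1, 0, 1],        # Target (y): what the learner tries to predict
})

X = dataset[["age", "income"]]      # features
y = dataset["default"]              # target
print(X.shape, y.shape)             # (4, 2) (4,)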
• Bias is the gap between the values predicted by the model and the actual
(target) values.
• Variance tells how scattered the predicted values are.
https://christophm.github.io/interpretable-ml-book/terminology.html
Types of Machine Learning
1. Supervised learning:
It is applicable when a machine has sample data, i.e., input as well as output
data with correct labels. These correct labels are used to check the
correctness of the model's predictions.
The supervised learning technique helps us to predict future events with the
help of past experience and labeled examples. Initially, it analyses the known
training dataset, and then it derives an inferred function that makes
predictions about output values. It also measures its errors during this
learning process and corrects them through the learning algorithm.
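A minimal supervised-learning sketch in Python with scikit-learn (the tiny dataset below is hypothetical): the learner is given inputs together with their correct labels, fits a model, and then predicts labels for new inputs.

from sklearn.tree import DecisionTreeClassifier

X_train = [[0, 0], [1, 1], [1, 0], [0, 1]]   # input features
y_train = [0, 1, 1, 0]                        # correct labels supplied with the data

model = DecisionTreeClassifier()              # the learner / inducer
model.fit(X_train, y_train)                   # learn from the labeled examples

print(model.predict([[1, 1], [0, 0]]))        # predictions for new inputs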
2. Unsupervised Learning:
Here, we take unlabeled input data, which means it is not categorized and
the corresponding outputs are also not given. This unlabeled input data is
fed to the machine learning model in order to train it. The model first
interprets the raw data to find the hidden patterns in the data and then
applies suitable algorithms such as k-means clustering, hierarchical
clustering, etc.
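A minimal unsupervised-learning sketch in Python (hypothetical points): k-means groups the unlabeled inputs into clusters using only the structure of the data, with no labels given.

from sklearn.cluster import KMeans

X = [[1.0, 1.1], [1.2, 0.9], [8.0, 8.2], [7.9, 8.1]]   # unlabeled data

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)         # cluster assignments discovered from the data

print(labels)                          # e.g. [0 0 1 1]
print(kmeans.cluster_centers_)         # the two cluster centres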
3. Reinforcement Learning:
In reinforcement learning, an agent learns by interacting with an
environment and observing the feedback of its own actions.
For each good action, the agent gets a positive reward, and for each bad
action, it gets a negative reward.
Since there is no labeled data, the agent is bound to learn from its
experience only.
An RL problem can best be explained through games. Let's take the game of
Pac-Man, where the goal of the agent (Pac-Man) is to eat the food in the grid
while avoiding the ghosts on its way. In this case, the grid world is the
interactive environment in which the agent acts. The agent receives a reward
for eating food and a punishment if it gets killed by a ghost (loses the game).
The states are the locations of the agent in the grid world, and maximizing
the total cumulative reward corresponds to the agent winning the game.
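A minimal tabular Q-learning sketch in Python on a hypothetical toy environment (not actual Pac-Man): after every action the agent updates its value estimate for that state-action pair using the reward it received.

import random

n_states, n_actions = 5, 2
Q = [[0.0] * n_actions for _ in range(n_states)]   # value estimate for each (state, action)
alpha, gamma, epsilon = 0.1, 0.9, 0.2              # learning rate, discount factor, exploration rate

def step(state, action):
    """Hypothetical environment: returns (next_state, reward)."""
    next_state = (state + 1) % n_states
    reward = 1.0 if action == 1 else -0.1          # positive reward for a "good" action, negative otherwise
    return next_state, reward

state = 0
for _ in range(1000):
    # epsilon-greedy: mostly exploit the best known action, sometimes explore
    if random.random() < epsilon:
        action = random.randrange(n_actions)
    else:
        action = Q[state].index(max(Q[state]))
    next_state, reward = step(state, action)
    # Q-learning update: move the estimate toward reward + discounted best future value
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

print(Q)   # action 1 ends up with the higher estimated value in every state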
4. Semi-supervised Learning:
Semi-supervised learning falls between supervised and unsupervised
learning: the model is trained on a small amount of labeled data together
with a large amount of unlabeled data.
Machine Learning Problem Categories
1. Classification:
2. Binary Classification: It refers to those classification tasks that have two
class labels.
Examples include:
i. Email spam detection (spam or not).
ii. Churn prediction (churn or not).
iii. Conversion prediction (buy or not).
• Typically, binary classification tasks involve one class that is the normal
state and another class that is the abnormal state.
• A Bernoulli probability distribution is typically used to model such a
problem: the model predicts the probability of an example belonging to the
abnormal (positive) class.
• Popular algorithms that can be used for binary classification include:
▪ Logistic Regression
▪ k-Nearest Neighbors
▪ Decision Trees
▪ Support Vector Machine
▪ Naive Bayes
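A minimal binary-classification sketch in Python (hypothetical spam-like features): logistic regression predicts a Bernoulli probability for the positive class.

from sklearn.linear_model import LogisticRegression

X = [[0.1, 3], [0.9, 20], [0.2, 1], [0.8, 15], [0.7, 18], [0.05, 2]]   # features
y = [0, 1, 0, 1, 1, 0]                                                  # 0 = normal (not spam), 1 = abnormal (spam)

clf = LogisticRegression()
clf.fit(X, y)

print(clf.predict([[0.85, 17]]))         # predicted class label (0 or 1)
print(clf.predict_proba([[0.85, 17]]))   # probability for each of the two classes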
3. Multi-Class Classification: It refers to those classification tasks that have
more than two class labels.
Examples include:
i. Face classification.
ii. Plant species classification.
iii. Optical character recognition.
• The number of class labels may be very large on some problems. For
example, a model may predict a photo as belonging to one among
thousands or tens of thousands of faces in a face recognition system.
• A Multinoulli (categorical) probability distribution is typically used to
model such a problem: the model predicts the probability of an example
belonging to each of the class labels.
• Popular algorithms that can be used for multi-class classification include:
i. k-Nearest Neighbors.
ii. Decision Trees.
iii. Naive Bayes.
iv. Random Forest.
v. Gradient Boosting.
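A minimal multi-class sketch in Python (hypothetical measurements labeled with three plant species): the model predicts one label out of several and a probability for each class.

from sklearn.ensemble import RandomForestClassifier

X = [[5.1, 3.5], [4.9, 3.0], [6.7, 3.1], [6.3, 2.5], [7.6, 3.0], [7.2, 3.6]]
y = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

print(clf.predict([[6.0, 3.0]]))         # one label out of the three classes
print(clf.predict_proba([[6.0, 3.0]]))   # probability for each of the three classes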
4. Multi-Label Classification: It refers to those classification tasks that have
two or more class labels, where one or more class labels may be
predicted for each example.
• Specialized multi-label versions of the algorithms given below are used
for this type of classification:
i. Multi-label Decision Trees
ii. Multi-label Random Forests
iii. Multi-label Gradient Boosting
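A minimal multi-label sketch in Python (hypothetical image-tagging data, e.g. "contains person" / "contains car"): each example can receive several 0/1 labels at once, here by wrapping a decision tree in scikit-learn's MultiOutputClassifier.

from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier

X = [[0.2, 0.7], [0.9, 0.1], [0.8, 0.8], [0.1, 0.2]]
Y = [[1, 0],   # person, no car
     [0, 1],   # car, no person
     [1, 1],   # both labels present
     [0, 0]]   # neither label

clf = MultiOutputClassifier(DecisionTreeClassifier())
clf.fit(X, Y)

print(clf.predict([[0.85, 0.75]]))   # one 0/1 prediction per label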
5. Imbalanced Classification: It refers to classification tasks where the
number of examples in each class is unequally distributed.
• Typically, imbalanced classification tasks are binary classification tasks
where the majority of examples in the training dataset belong to the
normal class and a minority of examples belong to the abnormal class.
Examples include:
i. Fraud detection.
ii. Outlier detection.
iii. Medical diagnostic tests
• Specialized techniques may be used to change the composition of samples in
the training dataset by under-sampling the majority class or oversampling
the minority class.
Examples include:
i. Random Under-sampling
ii. SMOTE Oversampling
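A minimal re-sampling sketch in Python, assuming the imbalanced-learn package is installed: SMOTE synthesizes new minority-class examples so the training data becomes roughly balanced.

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Hypothetical imbalanced dataset: roughly 95% normal class, 5% abnormal class.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print("before:", Counter(y))

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("after: ", Counter(y_res))    # the two classes are now roughly equal in size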
• Specialized modeling algorithms may be used that pay more attention to
the minority class when fitting the model on the training dataset, such as
cost-sensitive machine learning algorithms.
Examples include:
i. Cost-sensitive Logistic Regression
ii. Cost-sensitive Decision Trees
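A minimal cost-sensitive sketch in Python: scikit-learn's class_weight parameter makes errors on the minority class more costly during fitting (the 1:10 weighting below is illustrative only).

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)

# Penalize mistakes on class 1 (the rare, abnormal class) ten times more than on class 0.
clf = LogisticRegression(class_weight={0: 1, 1: 10})
clf.fit(X, y)

print(clf.predict(X[:5]))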
2. Clustering:
As an example of how a clustering algorithm works, a collection of different
fruits can be divided into several groups with similar properties.
a. Partitioning Clustering:-
d. Hierarchical Clustering:-
e. Fuzzy Clustering:-
• Fuzzy clustering is a type of soft clustering method in which a data object
may belong to more than one group or cluster.
• Each data point has a set of membership coefficients, which depend on its
degree of membership in each cluster.
• The Fuzzy C-means algorithm is an example of this type of clustering; it is
sometimes also known as the Fuzzy k-means algorithm.
Clustering Algorithms:-
• K-Means algorithm: The k-means algorithm is one of the most popular clustering algorithms. It
classifies the dataset by dividing the samples into different clusters of equal variances. The
number of clusters must be specified in this algorithm. It is fast with fewer computations
required, with the linear complexity of O(n).
• Mean-shift algorithm: Mean-shift algorithm tries to find the dense areas in the smooth density
of data points. It is an example of a centroid-based model, that works on updating the
candidates for centroid to be the center of the points within a given region.
• DBSCAN Algorithm: It stands for Density-Based Spatial Clustering of Applications with Noise. It
is an example of a density-based model similar to the mean-shift, but with some remarkable
advantages. In this algorithm, the areas of high density are separated by the areas of low
density. Because of this, the clusters can be found in any arbitrary shape.
• Expectation-Maximization Clustering using GMM: This algorithm can be used as an
alternative to the k-means algorithm, or for cases where k-means can fail. In GMM, it
is assumed that the data points are Gaussian distributed.
• Agglomerative Hierarchical algorithm: The Agglomerative hierarchical algorithm performs the
bottom-up hierarchical clustering. In this, each data point is treated as a single cluster at the
outset and then successively merged. The cluster hierarchy can be represented as a tree-
structure.
• Affinity Propagation: It is different from other clustering algorithms as it does not
require the number of clusters to be specified. In this algorithm, each data point sends
messages between pairs of data points until convergence. Its O(N²T) time complexity is
the main drawback of this algorithm.
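A minimal sketch in Python contrasting two of the algorithms above on the same hypothetical points: k-means needs the number of clusters up front, while DBSCAN discovers it from the density of the data.

import numpy as np
from sklearn.cluster import DBSCAN, KMeans

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),     # dense blob around (0, 0)
               rng.normal(5, 0.3, (50, 2))])    # dense blob around (5, 5)

print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)[:5])
print(DBSCAN(eps=0.8, min_samples=5).fit_predict(X)[:5])    # label -1 would mark noise points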
3. Regression:
• Regression analysis is a statistical method to model the relationship between a
dependent (target) variable and one or more independent (predictor) variables. More
specifically, regression analysis helps us to understand how the value of the dependent
variable changes corresponding to an independent variable when the other independent
variables are held fixed. It predicts continuous/real values such as temperature, age,
salary, price, etc.
• In regression, we plot a graph between the variables which best fits the given data
points; using this plot, the machine learning model can make predictions about the
data. In simple words, "Regression shows a line or curve that passes through all the
data points on the target-predictor graph in such a way that the vertical distance
between the data points and the regression line is minimum." The distance between the
data points and the line tells whether the model has captured a strong relationship or
not.
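A minimal regression sketch in Python (hypothetical salary-vs-experience data): the model fits a line through the points and then predicts a continuous value for a new input.

from sklearn.linear_model import LinearRegression

X = [[1], [2], [3], [4], [5]]     # years of experience (predictor)
y = [30, 35, 42, 48, 55]          # salary in thousands (continuous target)

reg = LinearRegression()
reg.fit(X, y)

print(reg.coef_, reg.intercept_)  # slope and intercept of the fitted line
print(reg.predict([[6]]))         # predicted salary for 6 years of experience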
Terminologies Related to the Regression Analysis:
• Dependent Variable: The main factor in regression analysis which we want to predict
or understand is called the dependent variable. It is also called the target variable.
• Independent Variable: The factors which affect the dependent variable, or which are
used to predict its values, are called independent variables, also known as predictors.
• Outliers: An outlier is an observation with either a very low or a very high value in
comparison to the other observed values. An outlier may hamper the results, so it
should be avoided.
• Multicollinearity: If the independent variables are highly correlated with each other,
this condition is called multicollinearity. It should not be present in the dataset,
because it creates problems when ranking the most influential variables.
• Underfitting and Overfitting: If our algorithm works well with the training dataset but
not well with the test dataset, the problem is called overfitting. And if our algorithm
does not perform well even with the training dataset, the problem is called
underfitting.
4. Optimization:
• Optimization is the problem of finding a set of inputs to an objective
function that results in a maximum or minimum function evaluation.
• Optimization, in simple terms, is a mechanism to make something better or
define a context for a solution that makes it the best.
• Consider a production scenario. Assume there are two machines that
produce the desired product:
– one machine requires more energy for high-speed production but less
raw material;
– the other requires more raw material and less energy to produce the
same output in the same time.
• It is important to understand the patterns in the output based on the
variation in inputs; a combination that gives the highest profits would
probably be the one the production manager would want to know.
• As an analyst, one needs to identify the best possible way to distribute the
production between the machines that gives them the highest profit.
• When profit is plotted for the various options for distributing production
between the two machines, the curve shows a point of highest profit.
Identifying this point is the goal of this technique.
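A minimal sketch in Python of the two-machine problem posed as a linear program with SciPy (all coefficients below are hypothetical): maximize profit subject to limited energy and raw-material budgets by choosing how many units each machine produces.

from scipy.optimize import linprog

profit = [-3.0, -2.5]     # negated profit per unit for machines 1 and 2, because linprog minimizes
A_ub = [[4.0, 1.0],       # energy used per unit by machine 1 and machine 2
        [1.0, 3.0]]       # raw material used per unit by machine 1 and machine 2
b_ub = [100.0, 90.0]      # total energy and raw material available

res = linprog(c=profit, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)              # production split between the two machines with the highest profit
print(-res.fun)           # the maximum profit itself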
Data and Inconsistencies in Machine Learning
• Signal: It refers to the true underlying pattern of the data that helps the
machine learning model to learn from the data.
• Variance: If the machine learning model performs well with the training
dataset but does not perform well with the test dataset, the model has high
variance.
• Under-fitting: Underfitting occurs when our machine learning model is
not able to capture the underlying trend of the data.
• Example: when a simple linear regression model is fitted to data with a
clearly non-linear trend, the fitted line is unable to capture the data points
present in the plot.
• Overfitting: Overfitting occurs when our machine learning model tries to
cover all the data points or more than the required data points present in
the given dataset.
• Because of this, the model starts capturing the noise and inaccurate values
present in the dataset, and all these factors reduce the efficiency and
accuracy of the model.
• The overfitted model has low bias and high variance.
Train Test Split
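A minimal train/test split sketch in Python with scikit-learn (using the built-in Iris data): the model learns only from the training portion and is then evaluated on the held-out test portion.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data for testing; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))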