Day 4 Content
Practical Applications
Contents
problems.
4. Feedback session.
What is Machine Learning?
Machine Learning is the science of getting computers to learn and act like
humans do, and to improve their learning over time autonomously, by feeding
them data and information in the form of observations and real-world
interactions.
Applying machine learning to
real-world problems
Recommendation Engines
Eg: Netflix Viewing Suggestions
Application Area: Media + Entertainment + Shopping
Self-Driving Cars
Eg: Tesla cars use ML to understand their surroundings
Application Area: Automotive + Transportation
Gamified Learning and Education
Eg: Duolingo’s Mobile Application
Application Area: Language Learning
E-Commerce Websites
Eg: Ajio
Application Area: Fashion E-Commerce
Medical Diagnosis
Eg: Orderly Health
Application Area: Healthcare
Getting Your Right Answers
Eg: Quora’s Super-Specific Answer Ranking
Application Area: Search
Supervised algorithm : Support Vector Machine (SVM) Algorithm
● SVM is used for both Classification and Regression problems, but it is
primarily used for Classification problems in Machine Learning.
● The goal is to find the best line or decision boundary that separates
n-dimensional space into classes, so that a new data point can be placed in
the correct category. This best decision boundary is called a hyperplane.
Figure: two different categories classified using a decision boundary
(hyperplane).
Types of SVM:
● Linear SVM: used for linearly separable data. If a dataset can be split
into two classes by a single straight line, the data is called linearly
separable and the classifier is called a Linear SVM classifier.
● Non-linear SVM: used for data that is not linearly separable. If a dataset
cannot be split into classes by a straight line, the data is called
non-linear and the classifier is called a Non-linear SVM classifier.
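The difference between the two cases can be seen on a tiny example. The sketch below is illustrative only: the 1-D toy data, the helper `separable_by_threshold`, and the squared feature are assumptions made for this example and are not taken from the course notebooks. It shows data that no single cut point can separate, and how adding a feature (here x squared) makes one cut point sufficient, which is one common idea behind non-linear SVMs.

```python
# Hypothetical 1-D toy data: class +1 sits near the origin, class -1 far from it.
points = [-3.0, -2.5, -0.5, 0.0, 0.8, 2.7, 3.2]
labels = [-1, -1, 1, 1, 1, -1, -1]


def separable_by_threshold(values, labels):
    """Return True if one threshold on `values` splits the two classes."""
    ordered = [label for _, label in sorted(zip(values, labels))]
    # Walking along the axis, a single threshold allows at most one label change.
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return changes <= 1


# On the raw feature x, no single cut point separates the classes.
linear_ok = separable_by_threshold(points, labels)

# After mapping x -> x**2, one cut point is enough.
squared = [x * x for x in points]
mapped_ok = separable_by_threshold(squared, labels)

print("separable on x:", linear_ok)
print("separable on x**2:", mapped_ok)
```

In 1-D a "straight line" boundary is just a threshold, so the check above plays the role of asking whether a Linear SVM could separate the data.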
Hyperplane:
● There can be multiple lines or decision boundaries that separate the
classes in n-dimensional space, but we need to find the one that best
classifies the data points. This best boundary is known as the hyperplane of
SVM.
● The dimension of the hyperplane depends on the number of features in the
dataset: with 2 features (as shown in the image), the hyperplane is a
straight line, and with 3 features it is a 2-dimensional plane.
● We always choose the hyperplane with the maximum margin, that is, the
maximum distance between the hyperplane and the nearest data points of each
class.
Support Vectors:
● The data points or vectors that are closest to the hyperplane and that
affect its position are called support vectors. They are so named because
they support the hyperplane.
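The margin and the support vectors can be computed directly for a given hyperplane. The following is a minimal sketch, not an SVM trainer: the six 2-D points and the two candidate boundaries (`w_a, b_a` and `w_b, b_b`) are made-up values for illustration. For each candidate it measures the distance from every point to the hyperplane w·x + b = 0, reports the smallest distance as the margin, and lists the points at that distance as the support vectors.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin_and_support_vectors(w, b, points, labels, tol=1e-9):
    """Return (margin, support vectors) for a hyperplane that separates the data."""
    # Multiplying by the label (+1 / -1) makes correctly classified points positive.
    dists = [y * signed_distance(w, b, x) for x, y in zip(points, labels)]
    margin = min(dists)
    if margin <= 0:
        raise ValueError("this hyperplane does not separate the two classes")
    support = [x for x, d in zip(points, dists) if abs(d - margin) <= tol]
    return margin, support


# Hypothetical toy data with 2 features, so the hyperplane is a straight line.
points = [(3.0, 3.0), (4.0, 3.0), (4.0, 4.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [1, 1, 1, -1, -1, -1]

# Two candidate boundaries that both separate the classes.
w_a, b_a = (1.0, 1.0), -4.0   # the line x + y = 4
w_b, b_b = (1.0, 0.0), -2.0   # the line x = 2

margin_a, support_a = margin_and_support_vectors(w_a, b_a, points, labels)
margin_b, support_b = margin_and_support_vectors(w_b, b_b, points, labels)

print("candidate A:", margin_a, support_a)
print("candidate B:", margin_b, support_b)
```

Both lines classify every point correctly, but candidate A leaves more room between the boundary and the nearest points, so it is the one a maximum-margin criterion would prefer.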
Figures: Linear SVM and Non-linear SVM.
Supervised algorithm : SVM basics
SVM Basics:
https://colab.research.google.com/drive/1rhvbJxSaOCRsAzwJBqtW7vor5lxo_BPi?usp=sharing
Supervised algorithm : Naive Bayes
Dec, 2023 18
● Multinomial: The Multinomial Naïve Bayes classifier is used when the data
follows a multinomial distribution. It is primarily used for document
classification problems, that is, deciding which category (such as Sports,
Politics, or Education) a particular document belongs to.
The classifier uses word frequencies as the predictors.
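To make the role of word frequencies concrete, here is a small from-scratch sketch of a multinomial Naive Bayes classifier. It is an illustration under stated assumptions, not the code from the linked notebook: the six training sentences, the function names `train_multinomial_nb` and `predict`, and the add-one (Laplace) smoothing value `alpha=1.0` are all chosen for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Count words per class and turn the counts into log-probabilities."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocab.update(words)

    model = {"log_prior": {}, "log_likelihood": {}, "vocab": vocab}
    for label, n_docs in class_docs.items():
        model["log_prior"][label] = math.log(n_docs / len(docs))
        # alpha is added to every count so unseen words never get probability 0.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest log-posterior score."""
    words = [w for w in text.lower().split() if w in model["vocab"]]
    scores = {
        label: prior + sum(model["log_likelihood"][label][w] for w in words)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Hypothetical miniature training set.
docs = [
    "the team won the match",
    "great goal in the final match",
    "the election results are in",
    "parliament passed the new bill",
    "students sat the final exam",
    "the school opened a new library",
]
labels = ["Sports", "Sports", "Politics", "Politics", "Education", "Education"]

model = train_multinomial_nb(docs, labels)
print(predict(model, "who won the match"))
print(predict(model, "new bill passed by parliament"))
```

Words that never appeared in training (such as "who" or "by") are simply ignored at prediction time in this sketch.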
https://colab.research.google.com/drive/1FBph3Bg-He2hIz3p1jE2_LMqRrnIBbMB?usp=sharing
Supervised algorithm : KNN
● During the training phase, the KNN algorithm just stores the dataset. When
it gets new data, it classifies that data into the category it is most
similar to.
● Example: Suppose we have an image of a creature that looks similar to both
a cat and a dog, and we want to know whether it is a cat or a dog. We can use
the KNN algorithm for this identification, as it works on a similarity
measure. Our KNN model compares the features of the new image with those of
the cat and dog images and, based on the most similar features, puts it in
either the cat or the dog category.
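The idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the course notebooks: the two numeric features per animal, the sample values, and the function name `knn_predict` are hypothetical, and Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # "Training" is just keeping the data; all the work happens at query time.
    ranked = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: (body weight in kg, body length in cm).
train_points = [
    (4.0, 25.0), (3.5, 23.0), (5.0, 24.0), (4.5, 26.0),      # cats
    (20.0, 50.0), (25.0, 60.0), (18.0, 45.0), (30.0, 55.0),  # dogs
]
train_labels = ["cat"] * 4 + ["dog"] * 4

small_animal = knn_predict(train_points, train_labels, (5.5, 28.0))
large_animal = knn_predict(train_points, train_labels, (22.0, 52.0))
print(small_animal, large_animal)
```

With real data the features usually need to be put on comparable scales first, since the distance is dominated by whichever feature has the largest range.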
● There is no fixed rule for determining the best value of "K", so we need
to try several values and keep the one that performs best. A commonly used
starting value is K = 5.
● A very low value of K, such as K=1 or K=2, can be noisy and makes the
model sensitive to outliers.
● Larger values of K smooth out noise, but they make predictions slower and,
if K is too large, can blur the boundary between classes.
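Trying several values of K can be automated. The sketch below is one possible approach, assuming leave-one-out evaluation on a tiny made-up 1-D dataset that contains a single mislabelled-looking point at 9.8; the helper names and the candidate values (1, 3, 5) are illustrative choices.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    ranked = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(points, labels, k):
    """Predict each point from all the others and return the fraction correct."""
    correct = 0
    for i, (query, truth) in enumerate(zip(points, labels)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_points, rest_labels, query, k) == truth:
            correct += 1
    return correct / len(points)


# Hypothetical data: class A on the left, class B on the right,
# plus one outlier labelled A sitting inside the B region (9.8).
points = [(1.0,), (2.1,), (3.3,), (4.6,), (5.0,),
          (8.0,), (9.1,), (10.3,), (11.6,), (12.0,),
          (9.8,)]
labels = ["A"] * 5 + ["B"] * 5 + ["A"]

scores = {k: leave_one_out_accuracy(points, labels, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)

for k, accuracy in scores.items():
    print(f"K={k}: accuracy={accuracy:.2f}")
print("best K:", best_k)
```

On this toy set K=1 is hurt by the outlier, while K=5 starts pulling in points from the wrong side, so the best value here differs from the usual starting point of 5. On a real dataset the outcome may be different, which is why the values have to be tried.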
Supervised Learning Projects: email_spam_detection
Email providers like Gmail, Outlook, and Yahoo invest heavily in technology
that keeps their users secure. One such method is automatically separating
out spam emails to help prevent phishing attacks. This project demonstrates
the use of Machine Learning in the cyber-security domain: the ML model
classifies emails into spam and non-spam categories based on their text
content. It uses the KNN classifier for this task.
Code:
https://colab.research.google.com/drive/1BYZU0V94QYa3LRQpYGGWmGWLJwYXvEST?usp=sharing
Dataset:
https://drive.google.com/drive/folders/15_CgVHH6bP_zDbh28lnjQ0yqG4Zb9TXc?usp=sharing
Unsupervised Learning Projects: personality prediction
Dataset:
https://drive.google.com/file/d/1B5plNmEFu81Wvf7GlZkdxynbDbGZdG_k/view?usp=sharing
Unsupervised Learning : PCA (Principal Component Analysis)
● It is an algebraic technique for converting a set of observations of
possibly correlated variables into a set of values of linearly uncorrelated
variables.
Eigenvector: A nonzero vector whose direction is unchanged (it stays
parallel to itself) when it is multiplied by the matrix. Suppose V is an
eigenvector of dimension R of a matrix K with dimension R × R. Then KV and V
are parallel, and we solve KV = PV, where both the vector V and the scalar P
(the eigenvalue) are unknown, to find the eigenvectors and eigenvalues.
Eigenvalue: Also known as a "characteristic root". In PCA it measures the
amount of variance in the variables that is accounted for by that factor.
The ratio of the eigenvalues gives the relative explanatory importance of
the factors with respect to the variables. A factor with a low eigenvalue
contributes little to explaining the variables.
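The relation KV = PV and the meaning of the eigenvalues can be checked numerically on a small case. The sketch below is an illustration under stated assumptions: the ten 2-D sample points are made up, K is taken to be their 2 × 2 covariance matrix, and the helper names are hypothetical. It uses the closed-form eigenvalues of a symmetric 2 × 2 matrix rather than a library routine, then reports the share of total variance that each eigenvalue accounts for.

```python
import math


def covariance_matrix(points):
    """Sample covariance matrix K of 2-D points (divides by n - 1)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def eigen_symmetric_2x2(K):
    """Eigenvalues (largest first) and unit eigenvectors of a symmetric 2x2 K."""
    a, b, c = K[0][0], K[0][1], K[1][1]
    if b == 0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        if a >= c:
            return [a, c], [(1.0, 0.0), (0.0, 1.0)]
        return [c, a], [(0.0, 1.0), (1.0, 0.0)]
    middle = (a + c) / 2
    spread = math.hypot((a - c) / 2, b)
    values = [middle + spread, middle - spread]
    vectors = []
    for P in values:
        v = (b, P - a)                 # satisfies K v = P v when b != 0
        length = math.hypot(*v)
        vectors.append((v[0] / length, v[1] / length))
    return values, vectors


# Hypothetical, strongly correlated 2-D observations.
data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
        (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

K = covariance_matrix(data)
values, vectors = eigen_symmetric_2x2(K)

# Share of the total variance accounted for by each component.
ratios = [P / sum(values) for P in values]

for P, V, ratio in zip(values, vectors, ratios):
    print(f"eigenvalue P={P:.4f}  eigenvector V={V}  variance share={ratio:.3f}")
```

Because the two variables are strongly correlated, most of the variance falls on the first component, which is why PCA can drop the second one with little loss of information.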
PCA (Principal Component Analysis)
https://colab.research.google.com/drive/1VN6bAgGRQ8j5JSdDHHhtf5e1_XJZE61E?usp=sharing
Unsupervised Learning Projects: image compression
Dataset:
https://drive.google.com/drive/folders/1dMSHY3U8ltK-2eCjeuhJi64f6nD6Arir?usp=sharing
Quiz Time
https://forms.gle/kemAxpxHxjwF4dnD9
Group work on conceptualising a machine learning project.
1. Linear Regression:
https://colab.research.google.com/drive/11Y0CjJR4bmTBxtnVvSnEJ6HNxtijB9uY?usp=sharing
2. Logistic Regression:
https://colab.research.google.com/drive/1WJ3kuM2D-d-Qpob9Mb5KMWgxhwdgebES?usp=sharing
Dataset:
https://drive.google.com/file/d/1EcH07uEBs9oad2xWhB6bmYHiuUfCdGgz/view?usp=sharing
1. KNN:
https://colab.research.google.com/drive/1NTcj_jozaNvMwZrFtaUaFQ_57fzNVubH?usp=sharing
2. Naive Bayes:
https://colab.research.google.com/drive/168zYbuHiyd2YvX5kaEEjCO1iYrMQPSUC?usp=sharing
THANKS