Unit 3 Unsupervised Learning & Neural Network
In the previous topic, we learned about supervised machine learning, in which models are trained on labeled data under the supervision of a training dataset. But there are many cases in which we do not have labeled data and need to find the hidden patterns in a given dataset. To solve such cases, we need unsupervised learning techniques.
As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using a training dataset. Instead, the model itself finds the hidden patterns and insights in the given data. It can be compared to the learning that takes place in the human brain while learning new things. It can be defined as:
Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision.
Example: Suppose an unsupervised learning algorithm is given an input dataset containing images of different types of cats and dogs. The algorithm is never trained on the given dataset, which means it has no idea about the features of the dataset. The task of the unsupervised learning algorithm is to identify the image features on its own. It will perform this task by clustering the image dataset into groups according to the similarities between images.
o Unsupervised learning is helpful for finding useful insights from the data.
o Unsupervised learning is much like how a human learns to think through their own experiences, which makes it closer to real AI.
o Unsupervised learning works on unlabeled and uncategorized data, which makes it all the more important.
o In the real world, we do not always have input data with corresponding outputs, so to solve such cases we need unsupervised learning.
Here, we have taken unlabeled input data, which means it is not categorized and corresponding outputs are not given. This unlabeled input data is fed to the machine learning model in order to train it. First, the model interprets the raw data to find the hidden patterns, and then a suitable algorithm such as k-means clustering or hierarchical clustering is applied.
Once the suitable algorithm is applied, it divides the data objects into groups according to the similarities and differences between the objects.
The unsupervised learning algorithm can be further categorized into two types of problems:
o Clustering: Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in one group and have little or no similarity with the objects of another group. Cluster analysis finds the commonalities between data objects and categorizes them according to the presence and absence of those commonalities.
o Association: An association rule is an unsupervised learning method used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules make marketing strategies more effective: for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical application of association rules is Market Basket Analysis.
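As a rough illustration of support and confidence, the two standard measures behind such rules, here is a minimal Python sketch over a made-up list of market-basket transactions (all items and numbers are purely illustrative):

transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]

def support(itemset):
    # Fraction of transactions that contain every item in the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    # P(consequent | antecedent) = support(both) / support(antecedent).
    return support(antecedent | consequent) / support(antecedent)

# Rule "bread -> butter": how often do bread buyers also buy butter?
print(support({"bread", "butter"}))       # 0.6  (3 of 5 baskets)
print(confidence({"bread"}, {"butter"}))  # 0.75 (3 of 4 bread baskets)

A real market-basket miner, such as the Apriori algorithm listed below, searches for all rules whose support and confidence exceed chosen thresholds.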
Some popular unsupervised learning algorithms are listed below:
o K-means clustering
o KNN (k-nearest neighbors)
o Hierarchical clustering
o Anomaly detection
o Neural Networks
o Principal Component Analysis
o Independent Component Analysis
o Apriori algorithm
o Singular value decomposition
K-Means Clustering-
Step-01:
Choose the number of clusters K.
Step-02:
Randomly select any K data points as the initial cluster centers.
Step-03:
Calculate the distance between each data point and each cluster center.
The distance may be calculated either by using the given distance function or by using the Euclidean distance formula.
Step-04:
Assign each data point to the cluster whose center is nearest to it.
Step-05:
Re-compute the center of each newly formed cluster by taking the mean of all the points contained in it.
Step-06:
Keep repeating the procedure from Step-03 to Step-05 until any of the following stopping criteria is met-
Centers of the newly formed clusters do not change
Data points remain in the same cluster
Maximum number of iterations is reached
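A minimal Python/NumPy sketch of this procedure is given below. The distance function is pluggable; the Manhattan distance defined here matches the distance function given in Problem-01 below, and the sketch assumes no cluster ever becomes empty.

import numpy as np

def manhattan(points, center):
    # Distance function of the form |x2 - x1| + |y2 - y1|.
    return np.abs(points - center).sum(axis=1)

def kmeans(points, centers, distance=manhattan, max_iter=100):
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)   # Step-01/02: K given centers
    labels = None
    for _ in range(max_iter):                    # Step-06: repeat until stable
        # Step-03: distance between each data point and each cluster center
        dists = np.stack([distance(points, c) for c in centers], axis=1)
        # Step-04: assign each point to the cluster with the nearest center
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                                # points stayed in their clusters
        labels = new_labels
        # Step-05: new center = mean of all points contained in that cluster
        centers = np.array([points[labels == k].mean(axis=0)
                            for k in range(len(centers))])
    return centers, labels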
Advantages-
Point-01:
K-Means clustering is simple to implement and computationally efficient, which makes it scale well to large datasets.
Point-02:
It is guaranteed to converge, although possibly only to a local optimum that depends on the initial cluster centers.
Disadvantages-
Point-01:
The number of clusters K must be specified in advance, and the result is sensitive to the choice of initial cluster centers.
Point-02:
It does not handle noisy data and outliers well, and it struggles to identify clusters with non-convex shapes.
Problem-01:
Cluster the following eight points (with (x, y) representing locations) into three clusters:
A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)
Initial cluster centers are: A1(2, 10), A4(5, 8) and A7(1, 2).
The distance function between two points a = (x1, y1) and b = (x2, y2) is defined as-
Ρ(a, b) = |x2 – x1| + |y2 – y1|
Use K-Means Algorithm to find the three cluster centers after the second iteration.
Solution-
Iteration-01:
We calculate the distance of each point from each of the three cluster centers.
The distance is calculated by using the given distance function.
The following illustration shows the calculation of the distance between point A1(2, 10) and each of the three cluster centers-
Ρ(A1, C1) = |2 – 2| + |10 – 10| = 0
Ρ(A1, C2) = |5 – 2| + |8 – 10| = 3 + 2 = 5
Ρ(A1, C3) = |1 – 2| + |2 – 10| = 1 + 8 = 9
In a similar manner, we calculate the distance of the other points from each of the three cluster centers.
Next, we draw a table showing all the results. Using the table, we decide which point belongs to which cluster: a given point belongs to the cluster whose center is nearest to it.
Given points   Distance from C1(2, 10)   Distance from C2(5, 8)   Distance from C3(1, 2)   Point belongs to cluster
A1(2, 10)      0                         5                        9                        C1
A2(2, 5)       5                         6                        4                        C3
A3(8, 4)       12                        7                        9                        C2
A4(5, 8)       5                         0                        10                       C2
A5(7, 5)       10                        5                        9                        C2
A6(6, 4)       10                        5                        7                        C2
A7(1, 2)       9                         10                       0                        C3
A8(4, 9)       3                         2                        10                       C2
Cluster-01:
First cluster contains points A1(2, 10).
Cluster-02:
Second cluster contains points A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A8(4, 9).
Cluster-03:
Third cluster contains points A2(2, 5), A7(1, 2).
Now,
We re-compute the new cluster centers.
The new cluster center is computed by taking the mean of all the points contained in that cluster.
For Cluster-01:
We have only one point A1(2, 10) in Cluster-01.
So, cluster center remains the same.
For Cluster-02:
Center of Cluster-02
= ((8 + 5 + 7 + 6 + 4)/5, (4 + 8 + 5 + 4 + 9)/5)
= (6, 6)
For Cluster-03:
Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
Iteration-02:
We calculate the distance of each point from each of the three new cluster centers.
The distance is calculated by using the given distance function.
The following illustration shows the calculation of the distance between point A1(2, 10) and each of the three cluster centers-
Ρ(A1, C1) = |2 – 2| + |10 – 10| = 0
Ρ(A1, C2) = |6 – 2| + |6 – 10| = 4 + 4 = 8
Ρ(A1, C3) = |1.5 – 2| + |3.5 – 10| = 0.5 + 6.5 = 7
In a similar manner, we calculate the distance of the other points from each of the three cluster centers.
Next, we draw a table showing all the results. Using the table, we decide which point belongs to which cluster: a given point belongs to the cluster whose center is nearest to it.
Given points   Distance from C1(2, 10)   Distance from C2(6, 6)   Distance from C3(1.5, 3.5)   Point belongs to cluster
A1(2, 10)      0                         8                        7                            C1
A2(2, 5)       5                         5                        2                            C3
A3(8, 4)       12                        4                        7                            C2
A4(5, 8)       5                         3                        8                            C2
A5(7, 5)       10                        2                        7                            C2
A6(6, 4)       10                        2                        5                            C2
A7(1, 2)       9                         9                        2                            C3
A8(4, 9)       3                         5                        8                            C1
Cluster-01:
First cluster contains points A1(2, 10), A8(4, 9).
Cluster-02:
Second cluster contains points A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4).
Cluster-03:
Third cluster contains points A2(2, 5), A7(1, 2).
Now,
We re-compute the new cluster centers.
The new cluster center is computed by taking the mean of all the points contained in that cluster.
For Cluster-01:
Center of Cluster-01
= ((2 + 4)/2, (10 + 9)/2)
= (3, 9.5)
For Cluster-02:
Center of Cluster-02
= ((8 + 5 + 7 + 6)/4, (4 + 8 + 5 + 4)/4)
= (6.5, 5.25)
For Cluster-03:
Center of Cluster-03
= ((2 + 1)/2, (5 + 2)/2)
= (1.5, 3.5)
Thus, after the second iteration, the three cluster centers are C1(3, 9.5), C2(6.5, 5.25) and C3(1.5, 3.5).
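Assuming the kmeans sketch given after Step-06 above, the following snippet reproduces this result by running exactly two iterations:

points  = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5), (6, 4), (1, 2), (4, 9)]
centers = [(2, 10), (5, 8), (1, 2)]

final_centers, labels = kmeans(points, centers, distance=manhattan, max_iter=2)
print(final_centers)   # [[3.  9.5] [6.5  5.25] [1.5  3.5]]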
Problem-02:
Use the K-Means Algorithm to cluster the following five points (with (x, y) representing locations) into two clusters:
A(2, 2), B(3, 2), C(1, 1), D(3, 1), E(1.5, 0.5)
Initial cluster centers are: C1 = A(2, 2) and C2 = C(1, 1). Use the Euclidean distance formula.
Solution-
We calculate the distance of each point from each of the two cluster centers.
The distance is calculated by using the Euclidean distance formula:
Ρ(a, b) = sqrt[ (x2 – x1)² + (y2 – y1)² ]
The following illustration shows the calculation of the distance between point A(2, 2) and each of the two cluster centers-
Ρ(A, C1) = sqrt[ (2 – 2)² + (2 – 2)² ] = sqrt[ 0 + 0 ] = 0
Ρ(A, C2) = sqrt[ (1 – 2)² + (1 – 2)² ] = sqrt[ 1 + 1 ] = sqrt[ 2 ] = 1.41
In a similar manner, we calculate the distance of the other points from each of the two cluster centers.
Next, we draw a table showing all the results. Using the table, we decide which point belongs to which cluster: a given point belongs to the cluster whose center is nearest to it.
Given points   Distance from C1(2, 2)   Distance from C2(1, 1)   Point belongs to cluster
A(2, 2)        0                        1.41                     C1
B(3, 2)        1                        2.24                     C1
C(1, 1)        1.41                     0                        C2
D(3, 1)        1.41                     2                        C1
E(1.5, 0.5)    1.58                     0.71                     C2
Cluster-01:
First cluster contains points A(2, 2), B(3, 2), D(3, 1).
Cluster-02:
Second cluster contains points C(1, 1), E(1.5, 0.5).
Now,
We re-compute the new cluster centers.
The new cluster center is computed by taking the mean of all the points contained in that cluster.
For Cluster-01:
Center of Cluster-01
= ((2 + 3 + 3)/3, (2 + 2 + 1)/3)
= (2.67, 1.67)
For Cluster-02:
Center of Cluster-02
= ((1 + 1.5)/2, (1 + 0.5)/2)
= (1.25, 0.75)
Thus, after the first iteration, the two cluster centers are C1(2.67, 1.67) and C2(1.25, 0.75).
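The same kmeans sketch from earlier reproduces this result once the Euclidean distance is swapped in:

import numpy as np

def euclidean(points, center):
    # sqrt[ (x2 - x1)^2 + (y2 - y1)^2 ]
    return np.sqrt(((points - center) ** 2).sum(axis=1))

points  = [(2, 2), (3, 2), (1, 1), (3, 1), (1.5, 0.5)]
centers = [(2, 2), (1, 1)]

final_centers, labels = kmeans(points, centers, distance=euclidean, max_iter=1)
print(final_centers)   # [[2.67 1.67] [1.25 0.75]]  (rounded)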
Neural Networks
Neural networks are composed of a collection of nodes. The nodes are spread out across at least three layers. The three layers are:
• An input layer
• A "hidden" layer
• An output layer
These three layers are the minimum. Neural networks can have more than one hidden layer, in
addition to the input layer and output layer.
No matter which layer it is part of, each node performs some sort of processing task or function
on whatever input it receives from the previous node (or from the input layer). Essentially, each
node contains a mathematical formula, with each variable within the formula weighted
differently. If the output of applying that mathematical formula to the input exceeds a certain
threshold, the node passes data to the next layer in the neural network. If the output is below the
threshold, no data is passed to the next layer.
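As a rough illustration of this thresholding behaviour, here is a single node in Python; the inputs, weights and threshold are made-up values, not a trained network:

def node(inputs, weights, threshold=1.0):
    # Each input is weighted differently, then summed.
    activation = sum(w * x for w, x in zip(weights, inputs))
    # Pass data forward only if the result exceeds the threshold.
    return activation if activation > threshold else 0.0

print(node(inputs=[0.5, 0.8, 0.2], weights=[1.0, 0.5, 2.0]))  # 1.3 -> fires
print(node(inputs=[0.1, 0.2, 0.1], weights=[1.0, 0.5, 2.0]))  # 0.0 -> silent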
Imagine that the Acme Corporation has an accounting department with a strict hierarchy. Acme
accounting department employees at the manager level approve expenses below $1,000,
directors approve expenses below $10,000, and the CFO approves any expenses that exceed
$10,000. When employees from other departments of Acme Corp. submit their expenses, they
first go to the accounting managers. Any expense over $1,000 gets passed to a director, while
expenses below $1,000 stay at the managerial level — and so on.
The accounting department of the Acme Corp. functions somewhat like a neural network. When
employees submit their expense reports, this is like a neural network's input layer. Each manager
and director is like a node within the neural network.
And, just as one accounting manager may ask another manager for assistance in interpreting an
expense report before passing it along to an accounting director, neural networks can be
architected in a variety of ways. Nodes can communicate in multiple directions.
There is no limit on how many nodes and layers a neural network can have, and these nodes can interact in almost any way. Because of this, the list of types of neural networks is ever-expanding. But they can roughly be sorted into these two categories:
• Shallow neural networks usually have only one hidden layer
• Deep neural networks have multiple hidden layers
Shallow neural networks are fast and require less processing power than deep neural networks,
but they cannot perform as many complex tasks as deep neural networks.
Below is an incomplete list of the types of neural networks that may be used today:
Perceptron neural networks are simple, shallow networks with an input layer and an output
layer.
Multilayer perceptron neural networks add complexity to perceptron networks, and include a
hidden layer.
Feed-forward neural networks only allow their nodes to pass information to a forward node.
Recurrent neural networks can go backwards, allowing the output from some nodes to impact
the input of preceding nodes.
Modular neural networks combine two or more neural networks in order to arrive at the output.
Radial basis function neural network nodes use a specific kind of mathematical function called
a radial basis function.
Liquid state machine neural networks feature nodes that are randomly connected to each other.
Residual neural networks allow data to skip ahead via a process called identity mapping,
combining the output from early layers with the output of later layers.
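As a rough illustration of identity mapping, the sketch below adds a residual block's unchanged input to its output, letting the data "skip ahead"; the ReLU layer and weights are made-up stand-ins for a real trained layer:

import numpy as np

def layer(x, w):
    # An ordinary hidden-layer transform (linear step + ReLU).
    return np.maximum(0.0, w @ x)

def residual_block(x, w):
    # Identity mapping: the block's input is added back to its output.
    return layer(x, w) + x

x = np.array([1.0, -2.0, 0.5])
w = np.eye(3) * 0.1                    # illustrative weights
print(residual_block(x, w))            # [ 1.1  -2.    0.55]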
Generalization
Generalization in machine learning refers to the ability of a trained model to accurately make
predictions on new, unseen data. The purpose of generalization is to equip the model to
understand the patterns and relationships within its training data and apply them to previously
unseen examples from within the same distribution as the training set. Generalization is
foundational to the practical usefulness of machine learning and deep learning algorithms
because it allows them to produce models that can make reliable predictions in real-world
scenarios.
Generalization is important because the true test of a model's effectiveness is not how well it
performs on the training data, but rather how well it generalizes to new and unseen data. If a
model fails to generalize, it may exhibit high accuracy on the training set but will likely perform
poorly on real-world examples. This limitation renders the model impractical and unreliable in
practical applications.
A spam email classifier is a great example of generalization in machine learning. Suppose you
have a training dataset containing emails labeled as either spam or not spam and your goal is to
build a model that can accurately classify incoming emails as spam or legitimate based on their
content.
During the training phase, the machine learning algorithm learns from the set of labeled emails,
extracting relevant features and patterns to make predictions. The model optimizes its parameters
to minimize the training error and achieve high accuracy on the training data.
Now, the true test of the model's effectiveness lies in its ability to generalize to new, unseen
emails. When new emails arrive, the model needs to accurately classify them as spam or
legitimate without prior exposure to their content. This is where generalization comes in.
In this case, generalization enables the model to identify the underlying patterns and
characteristics that distinguish spam from legitimate emails. It allows the model to generalize its
learned knowledge beyond the specific examples in the training set and apply it to unseen data.
Without generalization, the model may become too specific to the training set, memorizing
specific words or phrases that were common in the training data and failing to understand new
examples. As a result, the model could incorrectly classify legitimate emails as spam or fail to
detect new spam patterns.
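A minimal scikit-learn sketch of this train-versus-unseen distinction is shown below; the six-email corpus is made up, and the score on the held-out split, emails the model never saw during training, is the generalization estimate:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = ["win a free prize now", "meeting at noon tomorrow",
          "free money click here", "project report attached",
          "claim your free reward", "lunch with the team today"]
labels = [1, 0, 1, 0, 1, 0]            # 1 = spam, 0 = legitimate

X_train, X_test, y_train, y_test = train_test_split(
    emails, labels, test_size=0.33, random_state=0)

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)

# Training accuracy flatters the model; test accuracy measures generalization.
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy: ", model.score(X_test, y_test))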
Have you ever noticed that your model makes false predictions on your testing data? Even though you have trained your model with enough data, you still get false negatives or false positives on your test data. Why is that?
Either your model is underfitting or it is overfitting to your training data. Generalization is a measure of how well your model performs at predicting unseen data, so it is important to come up with the best-generalized model to give better performance against future data. Let us first understand what underfitting and overfitting are, and then see the best practices for training a generalized model.
(Figure: A: Underfitting, B: Generalized, C: Overfitting)
What is Underfitting?
Underfitting is a state where the model cannot model the training data, and is also not able to generalize to new data. You can notice it with the help of the loss function during training: a simple rule of thumb is that if both the training loss and the cross-validation loss are high, then your model is underfitting.
Lack of data, too few features, lack of variance in the training data, or a high regularization rate can cause underfitting. A simple solution is to add more shuffled data to your training set. Depending on what is causing your model to underfit, you can try introducing more meaningful features, feature crosses, or higher-order polynomial features, or reducing the regularization rate if you are using regularization. In some cases, trying a different training algorithm will work fine.
What is Overfitting?
Overfitting is a situation where your model learns the variance in the training data too closely; experts describe it as the model starting to memorize all the noise instead of learning the signal. A simple rule of thumb to identify overfitting is that if your training loss is low and your cross-validation loss is high, then your model is overfitting.
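Both rules of thumb can be wrapped in a small diagnostic helper; the loss thresholds below are made-up illustrative values, not universal constants:

def diagnose(train_loss, val_loss, high=1.0, gap=0.5):
    if train_loss > high and val_loss > high:
        return "underfitting: both losses are high"
    if train_loss < high and val_loss - train_loss > gap:
        return "overfitting: low training loss, high validation loss"
    return "roughly generalized"

print(diagnose(train_loss=1.8, val_loss=1.9))  # underfitting
print(diagnose(train_loss=0.1, val_loss=1.2))  # overfitting
print(diagnose(train_loss=0.3, val_loss=0.4))  # roughly generalized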
Unclean data, too few training steps, or high model complexity (due to large weights) can cause overfitting. It is always recommended to preprocess data and create a good data pipeline. Select only necessary and meaningful features with good variance, and reduce the complexity of the model using a good regularization algorithm (the L1 norm or L2 norm).
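As a rough sketch of that last suggestion, the snippet below fits scikit-learn's Ridge (L2 norm) and Lasso (L1 norm) on synthetic data in which only the first of 20 features matters; the L1 penalty typically zeroes out most of the irrelevant weights:

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))          # 20 features, 19 of them irrelevant
y = X[:, 0] * 3.0 + rng.normal(scale=0.1, size=100)

l2 = Ridge(alpha=1.0).fit(X, y)         # L2 norm: shrinks all weights
l1 = Lasso(alpha=0.1).fit(X, y)         # L1 norm: drives weights to exactly zero

print("nonzero Ridge weights:", np.sum(l2.coef_ != 0))  # typically all 20
print("nonzero Lasso weights:", np.sum(l1.coef_ != 0))  # typically just a few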
Competitive Learning
Competitive learning is a concept in machine learning where models are trained to improve their
performance in competitive environments, such as online coding competitions, gaming, and
multi-agent systems. This approach enables models to adapt and learn from interactions with
other agents, users, or systems, balancing exploration for learning and competition for resources
or users.
One of the key challenges in competitive learning is finding the right balance between
exploration and exploitation. Exploration involves making suboptimal choices to acquire new
information, while exploitation focuses on making the best choices based on the current
knowledge. In competitive environments, learning algorithms must consider not only their own
performance but also the performance of other competing agents.
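As a rough illustration of this trade-off, the sketch below runs an epsilon-greedy strategy on a made-up two-action problem: with probability epsilon it explores a random action, otherwise it exploits its current knowledge.

import random

true_rates = {"A": 0.3, "B": 0.7}       # hidden true win rates (made up)
estimates = {"A": 0.0, "B": 0.0}        # the agent's current knowledge
counts = {"A": 0, "B": 0}
epsilon = 0.1                           # explore 10% of the time

for _ in range(1000):
    if random.random() < epsilon:       # exploration: a deliberately suboptimal pick
        action = random.choice(list(true_rates))
    else:                               # exploitation: best action so far
        action = max(estimates, key=estimates.get)
    reward = 1.0 if random.random() < true_rates[action] else 0.0
    counts[action] += 1
    # Incremental average keeps a running estimate of each action's reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)                        # estimates approach the true rates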
Recent research in competitive learning has explored various aspects of the field, such as
accelerating graph quantization, learning from source code competitions, and understanding the
impact of various parameters on learning processes in online coding competitions. These studies
have provided valuable insights into the nuances and complexities of competitive learning, as
well as the current challenges faced by researchers and practitioners.
Practical applications of competitive learning can be found in various domains, such as:
1. Online coding competitions: Competitive learning can help improve the performance of
participants by analyzing their behavior, approach, emotions, and problem difficulty levels.
2. Multi-agent systems: In settings where multiple agents interact and compete, competitive
learning can enable agents to adapt and cooperate more effectively.
3. Gaming: Competitive learning can be used to train game-playing agents to improve their
performance against human or AI opponents.
A company case study in competitive learning is the CodRep Machine Learning on Source Code
Competition, which aimed to create a common playground for machine learning and software
engineering research communities. The competition facilitated interaction between researchers
and practitioners, leading to advancements in the field.