ML End-Sem
Unsupervised learning
It is a type of machine learning where algorithms are used to uncover patterns or hidden
structures in unlabeled data. Unlike supervised learning, where the algorithm learns from labeled
data (input-output pairs), unsupervised learning deals with input data that doesn't have
corresponding output labels.
There are several approaches to unsupervised learning, each serving different purposes:
1. Clustering: Clustering algorithms aim to partition data points into groups or clusters based on
similarities in their features. Some popular clustering algorithms include:
• K-means: Divides data into K clusters, where each data point belongs to the cluster with the
nearest mean (a minimal implementation is sketched after this list).
• Hierarchical clustering: Builds a hierarchy of clusters by either merging or splitting them
based on distance metrics.
• DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Identifies clusters as
high-density regions separated by low-density regions, and labels points in sparse areas as
noise rather than forcing them into a cluster.
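The sketch below implements the K-means loop in plain NumPy; the function name, the fixed iteration cap, and the two-blob toy data are illustrative choices, not something prescribed by these notes:

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins the cluster with the nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Toy usage: two well-separated blobs should be recovered as two clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids)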
Applications:
• Market Basket Analysis: Understanding which items are frequently bought together to
drive product placement, marketing strategies, or bundle offerings.
• Recommendation Systems: Generating recommendations by analyzing user-item
interactions and suggesting items based on co-occurrence patterns.
Tabular difference (supervised vs. unsupervised learning):
Aspect | Supervised Learning | Unsupervised Learning
Training data | Labeled (input-output pairs) | Unlabeled (inputs only)
Goal | Learn a mapping from inputs to known outputs | Uncover hidden patterns or structure
Typical tasks | Classification, regression | Clustering, dimensionality reduction
Evaluation | Compared against ground-truth labels | No ground truth; judged by measures such as cluster cohesion
Ensemble methods
Ensemble methods in machine learning are techniques that combine the predictions of
multiple individual models to produce a stronger, more accurate predictive model. These methods
aim to improve overall performance and robustness compared to using any single model. Common
families include bagging (e.g., random forests), boosting (e.g., AdaBoost, gradient boosting), and
stacking, which trains a meta-model on the base models' outputs.
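As a concrete illustration, here is a small hard-voting ensemble built with scikit-learn; the synthetic dataset and the choice of three base learners are arbitrary assumptions for the example:

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, split into train and test sets.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Combine three different base learners by majority (hard) vote.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)
ensemble.fit(X_tr, y_tr)
print("ensemble accuracy:", ensemble.score(X_te, y_te))

Hard voting takes the majority class across the base models; soft voting (voting="soft") averages their predicted probabilities instead, which often works better when the base models are well calibrated.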
Reinforcement Learning (RL)
It is a type of machine learning paradigm where an agent learns to make sequential decisions by
interacting with an environment to achieve a specific goal. In RL, the agent learns through a
trial-and-error process, receiving feedback in the form of rewards or penalties based on its actions.
Key components of reinforcement learning (a toy interaction loop is sketched after this list):
1. Agent: The learner or decision-maker that interacts with the environment. It observes the
environment, takes actions, and receives feedback.
2. Environment: The external system with which the agent interacts. It responds to the actions
taken by the agent and provides feedback in the form of rewards or penalties.
3. Actions: Choices made by the agent that influence the state of the environment.
4. State: Represents the current situation or configuration of the environment, which the agent
perceives before taking actions.
5. Rewards: Feedback signals provided by the environment to the agent after each action.
Rewards guide the agent toward maximizing cumulative reward over time, aligning with its
goal.
6. Policy: The strategy or set of rules that the agent uses to decide actions in different states.
7. Value Function: Estimates the expected cumulative reward an agent can obtain from a
particular state or action, helping the agent make better decisions.
8. Learning Process: The agent learns by interacting with the environment, using experiences
(state, action, reward) to update its policy or value function to make better decisions over
time.
9. Exploration vs. Exploitation: Balancing between exploring new actions and exploiting known
actions to maximize rewards while learning.
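To make these components concrete, here is a minimal epsilon-greedy multi-armed bandit loop; the five-armed setup, the epsilon value, and the step count are invented for illustration. A bandit has no state transitions, so it illustrates actions, rewards, value estimates, and the exploration-exploitation trade-off; the TD sketch further below adds states.

import numpy as np

rng = np.random.default_rng(0)
true_means = rng.normal(0, 1, size=5)  # the environment: hidden mean payout per arm
Q = np.zeros(5)                        # value estimate for each action (arm)
counts = np.zeros(5)
epsilon = 0.1                          # fraction of steps spent exploring

for step in range(1000):
    # Exploration vs. exploitation: random arm with probability epsilon, else greedy.
    if rng.random() < epsilon:
        a = int(rng.integers(5))
    else:
        a = int(np.argmax(Q))
    reward = rng.normal(true_means[a], 1.0)  # environment responds with a noisy reward
    counts[a] += 1
    Q[a] += (reward - Q[a]) / counts[a]      # incremental mean update of the estimate

print("best arm:", int(np.argmax(true_means)), "agent's choice:", int(np.argmax(Q)))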
Temporal Difference (TD) Learning
TD learning methods update value estimates after every step using other learned estimates
(bootstrapping), rather than waiting for the final outcome of an episode.
Types of TD learning:
• SARSA (State-Action-Reward-State-Action): TD learning algorithm that updates value
estimates based on the current state-action pair and the action taken next (on-policy
method).
• Q-learning: TD learning algorithm that updates value estimates using the reward plus the
maximum estimated value over actions in the next state, regardless of which action is actually
taken next (off-policy method); both update rules appear in the sketch below.
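A minimal tabular sketch of the two update rules on a toy chain environment; the five-state chain, learning rate, and episode counts are assumptions made up for this example:

import numpy as np

# Toy chain: states 0..4, actions 0 (left) / 1 (right); reward 1 for reaching state 4.
N_STATES, N_ACTIONS = 5, 2
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s2 = max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)
    return s2, float(s2 == N_STATES - 1)

def eps_greedy(Q, s):
    # Explore with probability epsilon; break value ties randomly so the
    # untrained agent performs a random walk instead of getting stuck.
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    best = np.flatnonzero(Q[s] == Q[s].max())
    return int(rng.choice(best))

Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(300):                        # episodes
    s = 0
    for _ in range(200):                    # step cap per episode
        a = eps_greedy(Q, s)
        s2, r = step(s, a)
        # Q-learning (off-policy): bootstrap from the best next-state action.
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        # SARSA (on-policy) would instead bootstrap from the action a2 actually
        # taken in s2: Q[s, a] += alpha * (r + gamma * Q[s2, a2] - Q[s, a])
        s = s2
        if s == N_STATES - 1:
            break

print("greedy policy (1 = move right):", np.argmax(Q, axis=1))

After training, the greedy policy should choose "right" in every non-terminal state, since that is the shortest path to the reward; the only difference for SARSA is the target shown in the comment, which makes its updates follow the behavior policy rather than the greedy one.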