unit-3

The document discusses dimensionality reduction techniques, including feature selection and feature extraction, to address the challenges posed by high-dimensional data, known as the curse of dimensionality. It highlights methods such as Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) for reducing complexity while retaining essential information. Additionally, it outlines the benefits and limitations of feature selection methods and compares PCA and LDA in terms of their applications and characteristics.


Contents


Dimensionality reduction: The curse of dimensionality

Principal component analysis

Feature selection

Discriminant analysis: Fisher linear discriminant, multiple discriminant analysis
Dimensionality Reduction
Dimensionality reduction is a technique used to reduce the number of features in a
dataset while retaining as much of the important information as possible.
Dimensionality reduction: The curse of dimensionality
• It is a process of transforming high-dimensional data into a lower-dimensional
space that still preserves the essence of the original data.
• In machine learning, high-dimensional data refers to data with a large number of
features or variables.
• Dimensionality reduction can help to mitigate the problems caused by high dimensionality (such as overfitting, data sparsity, and high computational cost) by reducing the complexity of the model and improving its generalization performance.
• There are two main approaches to dimensionality reduction: feature selection
and feature extraction.
1. Feature Selection:
• Feature selection involves selecting a subset of the original features that are most
relevant to the problem at hand.
• The goal is to reduce the dimensionality of the dataset while retaining the most
important features.
2. Feature Extraction:
• Feature extraction involves creating new features by combining or transforming
the original features.
• The goal is to create a set of features that captures the essence of the original
data in a lower-dimensional space.
Differences between Feature selection and Feature extraction

• Feature Selection selects a subset of relevant features from the original set of features; Feature Extraction extracts a new set of features that are more informative and compact.
• Feature Selection reduces the dimensionality of the feature space and simplifies the model; Feature Extraction captures the essential information from the original features and represents it in a lower-dimensional feature space.
• Feature Selection methods can be categorized into filter, wrapper, and embedded methods; Feature Extraction methods can be categorized into linear and nonlinear methods.
• Feature Selection may lose some information and introduce bias if the wrong features are selected; Feature Extraction may introduce some noise and redundancy if the extracted features are not informative.
Curse of Dimensionality
• Curse of Dimensionality refers to a set of problems that arise when working with
high-dimensional data.
• A dataset with a large number of attributes, generally of the order of a hundred or more, is referred to as high-dimensional data.
• The Curse of Dimensionality refers to the various challenges and complications
that arise when analyzing and organizing data in high-dimensional spaces (often
hundreds or thousands of dimensions).
• Dimensions refer to the features or attributes of data. For instance, in a dataset of houses, the dimensions could include the house's price, size, number of bedrooms, and location.
• The curse of dimensionality occurs mainly because adding more features or dimensions tends to increase the complexity of the data without necessarily increasing the amount of useful information.
• In high-dimensional spaces, most data points are at the "edges" or "corners,"
making the data sparse.
• The primary solution to the curse of dimensionality is "dimensionality reduction."
It's a process that reduces the number of random variables under consideration
by obtaining a set of principal variables.
• By reducing the dimensionality, we can retain the most important information in
the data while discarding the redundant or less important features.
• The difficulties related to training machine learning models on high-dimensional data are referred to as the 'Curse of Dimensionality'; the short sketch below illustrates how distances behave as the number of dimensions grows.
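To make the sparsity point above concrete, here is a small NumPy sketch on synthetic uniform data; the sample size of 500 and the dimensions tried are arbitrary illustrative choices, not values from the slides. As the number of dimensions grows, the nearest and farthest points from a query point become almost equally distant.

```python
import numpy as np

rng = np.random.default_rng(0)

# As dimensionality grows, the nearest and farthest points from a query point
# end up almost equally far away: the data becomes sparse and distance-based
# reasoning degrades.
for d in (2, 10, 100, 1000):
    X = rng.random((500, d))                      # 500 points in the unit hypercube
    dists = np.linalg.norm(X[1:] - X[0], axis=1)  # distances from the first point
    print(f"d={d:4d}  min/max distance ratio = {dists.min() / dists.max():.3f}")
```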
Domains of curse of dimensionality
1) Anomaly Detection
• Anomaly detection is used for finding unforeseen items or events in the dataset.
• In high-dimensional data, anomalies often exhibit a remarkable number of attributes that are irrelevant in nature, and certain objects occur more frequently in neighbor lists than others.
2) Combinations
• As the number of possible input combinations increases, the complexity grows rapidly, and the curse of dimensionality occurs.
How to Mitigate Curse of Dimensionality
• To mitigate the problems associated with high-dimensional data, a suite of techniques generally referred to as 'dimensionality reduction techniques' is used.
• Dimensionality reduction techniques fall into one of two categories: 'feature selection' or 'feature extraction'.
Feature selection techniques
• The attributes are tested for their worthiness and then selected or eliminated.
• In the Low Variance filter technique, the variance in the distribution of each attribute in the dataset is compared, and attributes with very low variance are eliminated.
• In the High Correlation filter technique, the pairwise correlation between attributes is determined; one attribute from each pair that shows very high correlation is eliminated and the other is retained. A small sketch of both filters follows below.
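Below is a minimal Python sketch of both filters, assuming scikit-learn and pandas are available; the toy columns ('x1', 'x2', 'constant', 'x3') and the thresholds (0.01 for variance, 0.95 for correlation) are illustrative choices, not values from the slides.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
# Toy dataset: "constant" has almost no variance, "x2" is nearly a copy of "x1".
df = pd.DataFrame({
    "x1": rng.normal(size=200),
    "constant": np.full(200, 5.0) + rng.normal(scale=1e-3, size=200),
    "x3": rng.normal(size=200),
})
df["x2"] = df["x1"] * 0.99 + rng.normal(scale=0.05, size=200)

# Low Variance filter: drop attributes whose variance falls below a threshold.
vt = VarianceThreshold(threshold=0.01)
vt.fit(df)
low_var_kept = df.columns[vt.get_support()]
print("kept after low-variance filter:", list(low_var_kept))

# High Correlation filter: drop one attribute from each highly correlated pair.
corr = df[low_var_kept].corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [c for c in upper.columns if (upper[c] > 0.95).any()]
print("dropped by high-correlation filter:", to_drop)
```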
Feature extraction techniques
• The high dimensional attributes are combined in low dimensional components
(PCA or ICA) or factored into low dimensional factors (FA).
Principal Component Analysis (PCA)
• PCA is a dimensionality-reduction technique in which high-dimensional correlated data is transformed into a lower-dimensional set of uncorrelated components, referred to as principal components.
Factor Analysis (FA)
• Factor analysis is based on the assumption that all the observed attributes in a
dataset can be represented as a weighted linear combination of latent factors.
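As a rough illustration of both extraction styles, the following sketch assumes scikit-learn is available; the synthetic data driven by two latent factors, the sizes, and the noise level are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(0)
# Toy high-dimensional data: 10 correlated attributes driven by 2 latent factors.
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 10)) + rng.normal(scale=0.1, size=(200, 10))

# PCA: transform correlated attributes into uncorrelated principal components.
Z_pca = PCA(n_components=2).fit_transform(X)

# Factor analysis: model observed attributes as weighted linear combinations
# of a small number of latent factors.
Z_fa = FactorAnalysis(n_components=2).fit_transform(X)

print(Z_pca.shape, Z_fa.shape)   # (200, 2) (200, 2)
```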
Feature Selection Techniques
• Feature selection is a way of selecting the subset of the most relevant features
from the original features set by removing the redundant, irrelevant or noisy
features.
• Feature selection is one of the important concepts of machine learning, which
highly impacts the performance of the model.
• It is the process of selecting some attributes from a given collection of
prospective features and then discarding the rest of the attributes that were
considered
• A feature is an attribute that has an impact on a problem or is useful for the
problem, and choosing the important features for the model is known as feature
selection.
• Selecting the best features helps the model to perform well.
• Suppose we want to create a model that automatically decides which car should be crushed for spare parts.
• The dataset contains the model of the car, the year, the owner's name, and the mileage.
• The owner's name does not contribute to the model's performance, as it does not determine whether the car should be crushed or not.
• Remove the owner column and select the rest of the features (columns) for model building, as in the sketch below.
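A minimal pandas sketch of this manual selection step; the column names and values are hypothetical stand-ins for the car dataset described above.

```python
import pandas as pd

# Hypothetical toy dataset mirroring the car example above.
cars = pd.DataFrame({
    "model": ["Alto", "Swift", "i20"],
    "year": [2004, 2012, 2018],
    "owner_name": ["A", "B", "C"],   # irrelevant to the crush/keep decision
    "mileage_km": [210000, 90000, 30000],
})

# Feature selection by hand: drop the column that carries no predictive signal.
features = cars.drop(columns=["owner_name"])
print(features.columns.tolist())   # ['model', 'year', 'mileage_km']
```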
Benefits of Feature selection method
1) Reduce overfitting, as less redundant data means less chance of making decisions based on noise.
2) Improve accuracy by removing misleading and unimportant data
3) Reduce training time since data with fewer columns mean faster training
4) It helps in avoiding the curse of dimensionality.
5) It helps in the simplification of the model so that it can be easily interpreted by
the researchers.
Shortcomings of Feature Selection Method
1) Feature selection methods are hard to apply to high-dimensional data.
2) The more features that are present, the longer feature selection takes to complete.
3) There is a risk of overfitting when there are not enough observations.
The two types of Feature Selection techniques
• Supervised Feature Selection technique
Supervised Feature selection techniques consider the target variable and can be
used for the labeled dataset.
• Unsupervised Feature Selection technique
Unsupervised Feature selection techniques ignore the target variable and can be
used for the unlabeled dataset.
Three techniques under supervised feature Selection
1) Wrapper Methods
• In wrapper methodology, selection of features is done by considering it as a
search problem, in which different combinations are made, evaluated, and
compared with other combinations.
• It trains the learning algorithm iteratively using different subsets of features.
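As an illustration of a wrapper method, the sketch below uses scikit-learn's Recursive Feature Elimination (RFE) on the built-in breast cancer dataset; the choice of estimator and of keeping 5 features is arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_scaled = StandardScaler().fit_transform(X)   # scale so the linear model converges

# Wrapper method: RFE treats selection as a search, repeatedly fitting the
# estimator and discarding the weakest features until only 5 remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=5)
rfe.fit(X_scaled, y)
print(list(X.columns[rfe.support_]))
```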
2) Filter Methods
• In the Filter Method, features are selected on the basis of statistical measures.
• This method does not depend on the learning algorithm and chooses the features
as a pre-processing step.
• The filter method filters out irrelevant features and redundant columns from the model by ranking them with different metrics.
• Filter methods remove features that have low correlation with the target variable before training the final ML model.
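A minimal filter-method sketch with scikit-learn's SelectKBest, which ranks features by an ANOVA F-test against the target; the dataset and k = 5 are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Filter method: rank features by a statistical score (ANOVA F-test) computed
# against the target, independently of any learning algorithm.
selector = SelectKBest(score_func=f_classif, k=5).fit(X, y)
print(list(X.columns[selector.get_support()]))
```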
3) Embedded methods
• Embedded feature selection approaches incorporate feature selection into the machine learning algorithm as an integral component of the learning process.
• This allows for simultaneous classification and feature selection to take place
within the method.
• A few examples of common embedded approaches are the LASSO feature
selection algorithm, the random forest feature selection algorithm, and the
decision tree feature selection algorithm.
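A short embedded-method sketch using LASSO together with scikit-learn's SelectFromModel; the regularization strength alpha=0.01 is an arbitrary illustrative value.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_scaled = StandardScaler().fit_transform(X)

# Embedded method: LASSO learns the model and performs selection at the same
# time, since the L1 penalty drives the coefficients of weak features to zero.
lasso = Lasso(alpha=0.01).fit(X_scaled, y)
selected = SelectFromModel(lasso, prefit=True).get_support()
print(list(X.columns[selected]))
```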
PCA
• Karl Pearson was the first person to come up with this idea.
• PCA is based on the idea that when data from a higher-dimensional space is mapped to a lower-dimensional space, the variance of the data in the lower-dimensional space should be as large as possible.
• Principal component analysis (PCA) is a way to get important variables from a
large set of variables in a data set.
• PCA is more useful when you have data with three or more dimensions.
Merits of Dimensionality Reduction
• It helps to compress data, which reduces the amount of space needed to store it
and the amount of time it takes to process it.
• If there are any redundant features, it also helps to get rid of them.
Limitations of Dimensionality Reduction
• Some information may be lost.
• PCA fails when the mean and covariance are not enough to describe a dataset.
Steps Involved in the PCA
Step 1: Standardize the dataset.
Step 2: Calculate the covariance matrix for the features in the dataset.
Step 3: Calculate the eigenvalues and eigenvectors for the covariance matrix.
Step 4: Sort eigenvalues and their corresponding eigenvectors.
Step 5: Pick the top k eigenvalues and form a matrix from their corresponding eigenvectors.
Step 6: Transform the original matrix.
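The six steps can be sketched directly in NumPy as follows; the synthetic 100 x 5 dataset and the choice of k = 2 components are illustrative, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # toy dataset: 100 samples, 5 features
k = 2                                    # number of principal components to keep

# Step 1: standardize the dataset.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 2: covariance matrix of the standardized features.
cov = np.cov(X_std, rowvar=False)

# Step 3: eigenvalues and eigenvectors of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: the covariance matrix is symmetric

# Step 4: sort eigenvalues (and their eigenvectors) in decreasing order.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Step 5: pick the top-k eigenvectors to form the projection matrix.
W = eigvecs[:, :k]

# Step 6: transform the original (standardized) data into the new subspace.
X_pca = X_std @ W
print(X_pca.shape)                       # (100, 2)
```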
Linear Discriminant Analysis (LDA)
• Linear Discriminant Analysis is a linear classification method that is recommended when there are more than two classes.
• Linear Discriminant analysis is one of the most popular dimensionality reduction
techniques used for supervised classification problems in machine learning.
• To separate two or more classes having multiple features efficiently, the Linear
Discriminant Analysis model is considered the most common technique to solve
such classification problems.
• LDA is used to solve classification problems where the output variable is categorical.
• LDA can also be used in data pre-processing to reduce the number of features,
just as PCA, which reduces the computing cost significantly.
• LDA is also used in face detection algorithms
• LDA is used to extract useful data from different faces.
• LDA is used to minimize the number of features to a manageable number before
going through the classification process.
• Face recognition is the popular application of computer vision, where each face is
represented as the combination of a number of pixel values.
• LDA has a great application in classifying a patient's disease as mild, moderate, or severe on the basis of various parameters of the patient's health and the medical treatment that is under way. This classification helps the doctors in either increasing or decreasing the pace of the treatment.
• LDA can also be used for making predictions and hence in decision making. For example, "will you buy this product?" gives a predicted result in one of two possible classes: buying or not buying.
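A brief scikit-learn sketch showing LDA used both as a classifier for a categorical output and as supervised dimensionality reduction; the built-in Iris dataset and the train/test split are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# LDA as a classifier for a categorical output variable (3 iris classes).
lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
print("test accuracy:", lda.score(X_test, y_test))

# LDA as supervised dimensionality reduction: at most (n_classes - 1) = 2
# discriminant axes that best separate the classes.
X_2d = LinearDiscriminantAnalysis(n_components=2).fit_transform(X_train, y_train)
print("reduced shape:", X_2d.shape)
```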
Differences between PCA and LDA
1) PCA is an unsupervised method; LDA is a supervised method.
2) PCA is commonly used when there are no class labels; LDA finds the linear combinations of features that best separate two or more classes.
3) PCA is used for data compression, noise reduction, and visualization; LDA is used for classification tasks and as preprocessing for algorithms like Logistic Regression, SVM, and Neural Networks.
4) PCA is sensitive to outliers; LDA is slightly less sensitive to outliers than PCA.
5) PCA is well-suited for general dimensionality reduction, data exploration, and noise reduction; LDA is ideal for classification-related preprocessing.
