unit-3
Dimensionality reduction: The curse of dimensionality
Principal component analysis
Feature selection
Discriminant analysis: Fisher linear discriminant, multiple linear discriminant
Dimensionality Reduction
Dimensionality reduction is a technique used to reduce the number of features in a
dataset while retaining as much of the important information as possible.
Dimensionality reduction: The curse of dimensionality
• It is a process of transforming high-dimensional data into a lower-dimensional
space that still preserves the essence of the original data.
• In machine learning, high-dimensional data refers to data with a large number of
features or variables.
• As the number of features grows, the data becomes increasingly sparse and models
become harder to train and more prone to overfitting; this is known as the curse of
dimensionality.
• Dimensionality reduction can help to mitigate these problems by reducing the
complexity of the model and improving its generalization performance.
• There are two main approaches to dimensionality reduction: feature selection
and feature extraction.
1. Feature Selection:
• Feature selection involves selecting a subset of the original features that are most
relevant to the problem at hand.
• The goal is to reduce the dimensionality of the dataset while retaining the most
important features.
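As an illustrative sketch (not part of the notes), one simple filter-style feature selection method is to keep only the k features with the largest variance across the dataset; the function name and the toy data below are my own assumptions for illustration:

```python
import numpy as np

def select_top_k_by_variance(X, k):
    """Keep the k features with the largest variance (a simple filter method).

    Illustrative sketch: real pipelines often use supervised criteria
    (e.g. correlation with the target) instead of raw variance.
    """
    variances = X.var(axis=0)                    # per-feature variance
    keep = np.sort(np.argsort(variances)[-k:])   # indices of the k most variable features
    return X[:, keep], keep

# Toy example: feature 0 is constant, so it carries no information.
X = np.array([[1.0, 0.0, 5.0, 2.0],
              [1.0, 1.0, 7.0, 4.0],
              [1.0, 2.0, 9.0, 6.0]])
X_reduced, kept = select_top_k_by_variance(X, 2)
print(kept)  # the two highest-variance feature indices
```

Note that unlike feature extraction, the selected columns are original features, so they remain directly interpretable.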
2. Feature Extraction:
• Feature extraction involves creating new features by combining or transforming
the original features.
• The goal is to create a set of features that captures the essence of the original
data in a lower-dimensional space.
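The steps above can be sketched with PCA, the feature extraction method discussed below. This is a minimal numpy implementation via the SVD, assuming centered data and synthetic inputs of my own choosing:

```python
import numpy as np

def pca_transform(X, n_components):
    """Project X onto its top principal components (a feature extraction sketch)."""
    Xc = X - X.mean(axis=0)                         # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T                 # new features = linear combinations

# Synthetic data: 100 samples with 5 correlated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = pca_transform(X, 2)
print(Z.shape)  # (100, 2): each new feature combines all 5 originals
```

Each extracted feature is a linear combination of all original features, which is what distinguishes this approach from feature selection.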
Differences between Feature selection and Feature extraction
2) PCA is commonly used when there are no class labels, whereas discriminant
analysis works by finding the linear combinations of features that best separate two
or more classes.