
International Islamic University, Islamabad

Faculty of Engineering & Technology

Department of Electrical Engineering

Machine Learning LAB

Experiment No. 8: Linear Discriminant Analysis in Python

Name of Student: ……………………………………

Registration No.: …………………………………….

Date of Experiment: …………………………………

Submitted To: ………………………………………


Linear Discriminant Analysis, or LDA, is a dimensionality reduction technique. It is used
as a pre-processing step in machine learning and pattern classification applications.
The goal of LDA is to project features from a higher-dimensional space onto a lower-
dimensional space in order to avoid the curse of dimensionality and reduce computational
cost, while retaining the information that separates the classes.

The original technique was developed in 1936 by Ronald A. Fisher and was named the
Linear Discriminant or Fisher's Discriminant Analysis. The original Linear Discriminant
was described as a two-class technique. It was later generalized to the multi-class case
by C. R. Rao as Multiple Discriminant Analysis. All of these variants are now simply
referred to as Linear Discriminant Analysis.

LDA is a supervised technique and is often part of the toolkit used when building
competitive machine learning models. This category of dimensionality reduction is used
in areas such as image recognition and predictive analytics in marketing.

Code
Let’s see how we could go about implementing Linear Discriminant Analysis from scratch using
Python. To start, import the following libraries.
from sklearn.datasets import load_wine
import pandas as pd
import numpy as np
np.set_printoptions(precision=4)
from matplotlib import pyplot as plt
import seaborn as sns
sns.set()
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
In the following tutorial, we'll be working with the wine dataset, which can be obtained from the
UCI machine learning repository. Fortunately, the scikit-learn library provides a wrapper
function for loading it.

wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = pd.Categorical.from_codes(wine.target, wine.target_names)

The dataset contains 178 rows and 13 columns.


X.shape

The features are composed of various characteristics such as the magnesium and alcohol content
of the wine.
X.head()

There are 3 different kinds of wine.


wine.target_names

We create a DataFrame containing both the features and classes.


df = X.join(pd.Series(y, name='class'))

Linear Discriminant Analysis can be broken up into the following steps:

1. Compute the within class and between class scatter matrices

2. Compute the eigenvectors and corresponding eigenvalues for the scatter matrices

3. Sort the eigenvalues and select the top k


4. Create a new matrix containing eigenvectors that map to the k eigenvalues

5. Obtain the new features (i.e. LDA components) by taking the dot product of the data and the
matrix from step 4

Within Class Scatter Matrix

We calculate the within class scatter matrix using the following formula:

S_W = \sum_{i=1}^{c} S_i

where c is the total number of distinct classes, and

S_i = \sum_{x \in D_i} (x - m_i)(x - m_i)^T, \qquad m_i = \frac{1}{n_i} \sum_{x \in D_i} x

where x is a sample (i.e. row), D_i is the set of samples belonging to class i, m_i is the mean
vector of class i, and n_i is the total number of samples in that class.

For every class, we create a vector with the means of each feature.
class_feature_means = pd.DataFrame(columns=wine.target_names)
for c, rows in df.groupby('class'):
    # Mean of every feature within this class (drop the label column first)
    class_feature_means[c] = rows.drop('class', axis=1).mean()
class_feature_means
Then, we plug the mean vectors m_i into the equation from before in order to obtain the within
class scatter matrix.
within_class_scatter_matrix = np.zeros((13, 13))
for c, rows in df.groupby('class'):
    rows = rows.drop(['class'], axis=1)
    s = np.zeros((13, 13))
    for index, row in rows.iterrows():
        x, mc = row.values.reshape(13, 1), class_feature_means[c].values.reshape(13, 1)
        s += (x - mc).dot((x - mc).T)
    within_class_scatter_matrix += s
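
As a quick sanity check (not part of the original listing), the resulting matrix should be
13 x 13 and, like any scatter matrix, symmetric:

print(within_class_scatter_matrix.shape)  # expected: (13, 13)
# Scatter matrices are sums of outer products, so they are symmetric.
print(np.allclose(within_class_scatter_matrix, within_class_scatter_matrix.T))  # expected: True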

Between Class Scatter Matrix

Next, we calculate the between class scatter matrix using the following formula:

S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T

where m is the overall mean of the data, and m_i and n_i are the mean vector and the number of
samples of class i, respectively.

feature_means = df.drop('class', axis=1).mean()
between_class_scatter_matrix = np.zeros((13, 13))
for c in class_feature_means:
    n = len(df.loc[df['class'] == c].index)
    mc, m = class_feature_means[c].values.reshape(13, 1), feature_means.values.reshape(13, 1)
    between_class_scatter_matrix += n * (mc - m).dot((mc - m).T)

Then, we solve the generalized eigenvalue problem for the matrix

S_W^{-1} S_B

to obtain the linear discriminants.


eigen_values, eigen_vectors = np.linalg.eig(
    np.linalg.inv(within_class_scatter_matrix).dot(between_class_scatter_matrix))

The eigenvectors with the highest eigenvalues carry the most information about the distribution of
the data. Thus, we sort the eigenvalues from highest to lowest and select the first k eigenvectors.
In order to ensure that the eigenvalue maps to the same eigenvector after sorting, we place them in
a temporary array.
pairs = [(np.abs(eigen_values[i]), eigen_vectors[:, i]) for i in range(len(eigen_values))]
pairs = sorted(pairs, key=lambda x: x[0], reverse=True)
for pair in pairs:
    print(pair[0])
Just looking at the values, it’s difficult to determine how much of the variance is explained by
each component. Thus, we express it as a percentage.
eigen_value_sums = sum(eigen_values)
print('Explained Variance')
for i, pair in enumerate(pairs):
    print('Eigenvector {}: {}'.format(i, (pair[0] / eigen_value_sums).real))
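
A useful check, not part of the original walkthrough: the between class scatter matrix is built
from only c = 3 class means, so its rank is at most c - 1 = 2, and only the first two eigenvalues
should be meaningfully different from zero.

# Rank of S_B is at most c - 1, so at most two discriminant directions carry real information.
print(np.linalg.matrix_rank(between_class_scatter_matrix))  # expected: 2 (up to numerical tolerance)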

First, we create a matrix W with the first two eigenvectors.


w_matrix = np.hstack((pairs[0][1].reshape(13, 1), pairs[1][1].reshape(13, 1))).real

Then, we save the dot product of X and W into a new matrix Y:

Y = X W

where X is an n × d matrix with n samples and d dimensions, and Y is an n × k matrix with n samples
and k (k < d) dimensions. In other words, Y is composed of the LDA components; said another way, it
is the new feature space.
X_lda = np.array(X.dot(w_matrix))

matplotlib can’t handle categorical variables directly. Thus, we encode every class as a number so
that we can incorporate the class labels into our plot.
le = LabelEncoder()
y = le.fit_transform(df['class'])

Then, we plot the data as a function of the two LDA components and use a different color for
each class.
plt.xlabel('LD1')
plt.ylabel('LD2')
plt.scatter(
    X_lda[:, 0],
    X_lda[:, 1],
    c=y,
    cmap='rainbow',
    alpha=0.7,
    edgecolors='b'
)
Rather than implementing the Linear Discriminant Analysis algorithm from scratch every time,
we can use the predefined LinearDiscriminantAnalysis class made available to us by the scikit-
learn library.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis()
X_lda = lda.fit_transform(X, y)

We can access the following property to obtain the variance explained by each component.
lda.explained_variance_ratio_

Just like before, we plot the two LDA components.


plt.xlabel('LD1')
plt.ylabel('LD2')
plt.scatter(
    X_lda[:, 0],
    X_lda[:, 1],
    c=y,
    cmap='rainbow',
    alpha=0.7,
    edgecolors='b'
)

Next, let’s take a look at how LDA compares to Principal Component Analysis or PCA. We start
off by creating and fitting an instance of the PCA class.
from sklearn.decomposition import PCA

# PCA is unsupervised, so the class labels are not used when fitting.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

We can access the explained_variance_ratio_ property to view the percentage of the variance
explained by each component.
pca.explained_variance_ratio_

As we can see, PCA selected the components which would result in the highest spread (retain the
most information) and not necessarily the ones which maximize the separation between classes.
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.scatter(
    X_pca[:, 0],
    X_pca[:, 1],
    c=y,
    cmap='rainbow',
    alpha=0.7,
    edgecolors='b'
)

Next, let’s see whether we can create a model to classify the wines using the LDA components as
features. First, we split the data into training and testing sets.

X_train, X_test, y_train, y_test = train_test_split(X_lda, y, random_state=1)
Then, we build and train a Decision Tree. After predicting the category of each sample in the test
set, we create a confusion matrix to evaluate the model’s performance.
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
y_pred = dt.predict(X_test)
confusion_matrix(y_test, y_pred)

As we can see, the Decision Tree classifier correctly classified everything in the test set.
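
If you prefer a single summary number instead of reading the confusion matrix, the accuracy on the
test split can be computed as well (this snippet is not part of the original listing):

from sklearn.metrics import accuracy_score

# Fraction of test samples assigned to the correct wine class.
print(accuracy_score(y_test, y_pred))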

What are the limitations of Logistic Regression?

Logistic Regression is a simple and powerful linear classification algorithm. However, it
has some disadvantages which have led to alternative classification algorithms such as LDA.
Some of the limitations of Logistic Regression are as follows:

 Two-class problems – Logistic Regression is traditionally used for two-class (binary)
classification problems. Although it can be extended to multi-class classification, this
is rarely done in practice. Linear Discriminant Analysis, on the other hand, is considered
a better choice whenever multi-class classification is required, while for binary
classification both logistic regression and LDA are applied (a short comparison sketch
follows this list).
 Unstable with Well-Separated classes – Logistic Regression can lack stability
when the classes are well-separated. This is where LDA comes in.
 Unstable with few examples – If there are few examples from which the
parameters are to be estimated, logistic regression becomes unstable. However,
Linear Discriminant Analysis is a better option because it tends to be stable even in
such cases.
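
As a rough illustration of the multi-class point above, here is a minimal sketch (not part of the
original manual) that fits scikit-learn's LogisticRegression and LinearDiscriminantAnalysis on the
same wine features and compares their mean cross-validated accuracy; the exact numbers will vary
with the solver, the random splits, and whether the features are scaled.

from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Both classifiers handle the 3-class wine problem; this simply compares
# their average 5-fold cross-validated accuracy on the raw features.
log_reg = LogisticRegression(max_iter=5000)
lda_clf = LinearDiscriminantAnalysis()

print('Logistic Regression:', cross_val_score(log_reg, X, y, cv=5).mean())
print('LDA:', cross_val_score(lda_clf, X, y, cv=5).mean())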
Real-Life Applications of LDA

Some of the practical applications of LDA are listed below:

 Face Recognition – LDA is used in face recognition to reduce the number of attributes to
a more manageable number before the actual classification. The dimensions that are
generated are linear combinations of pixel values that form a template. These are called
Fisher faces.
 Medical – You can use LDA to classify a patient's disease as mild, moderate or severe.
The classification is based on the various parameters of the patient and their medical
history.
 Customer Identification – You can obtain the features of customers by performing a
simple question-and-answer survey. LDA helps in identifying and selecting the features
that describe the group of customers who are most likely to buy a particular item in a
shopping mall.
