Machine Learning Lab Manual 8
The original technique was developed in 1936 by Ronald A. Fisher and was named the Linear Discriminant or Fisher's Discriminant Analysis. The original Linear Discriminant was described as a two-class technique. The multi-class version was later generalized by C. R. Rao as Multiple Discriminant Analysis. All of these are now simply referred to as Linear Discriminant Analysis.
Code
Let’s see how we could go about implementing Linear Discriminant Analysis from scratch using
Python. To start, import the following libraries.
from sklearn.datasets import load_wine
import pandas as pd
import numpy as np
np.set_printoptions(precision=4)
from matplotlib import pyplot as plt
import seaborn as sns
sns.set()
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
In the following tutorial, we'll be working with the wine dataset, which can be obtained from the UCI machine learning repository. Fortunately, the scikit-learn library provides a wrapper function for downloading it.
wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = pd.Categorical.from_codes(wine.target, wine.target_names)
The features are composed of various characteristics such as the magnesium and alcohol content
of the wine.
X.head()
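The code below groups the samples by class using a DataFrame named df, which is never defined in the manual; a minimal reconstruction that joins the class labels onto the features:
# Combine the features and the class labels into a single DataFrame
df = X.join(pd.Series(y, name='class'))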
To implement LDA from scratch, we follow these steps:
1. Compute the within class and between class scatter matrices
2. Compute the eigenvectors and corresponding eigenvalues for the scatter matrices
3. Sort the eigenvalues and select the top k eigenvectors
4. Create a new matrix containing the eigenvectors that map to the k largest eigenvalues
5. Obtain the new features (i.e. LDA components) by taking the dot product of the data and the
matrix from step 4
We calculate the within class scatter matrix using the following formula:

S_W = \sum_{i=1}^{c} S_i, \quad S_i = \sum_{x \in D_i} (x - m_i)(x - m_i)^T

where x is a sample (i.e. row), m_i is the mean vector of class i, and n is the total number of samples with a given class.
For every class, we create a vector with the means of each feature.
class_feature_means = pd.DataFrame(columns=wine.target_names)
for c, rows in df.groupby('class'):
    # Drop the label column so only the 13 numeric features are averaged
    class_feature_means[c] = rows.drop('class', axis=1).mean()
class_feature_means
Then, we plug the mean vectors (m_i) into the equation from before in order to obtain the within class scatter matrix.
within_class_scatter_matrix = np.zeros((13,13))
for c, rows in df.groupby('class'):
    rows = rows.drop(['class'], axis=1)
    s = np.zeros((13,13))
    # Accumulate the outer product of each sample's deviation from its class mean
    for index, row in rows.iterrows():
        x, mc = row.values.reshape(13,1), class_feature_means[c].values.reshape(13,1)
        s += (x - mc).dot((x - mc).T)
    within_class_scatter_matrix += s
Next, we calculate the between class scatter matrix using the following formula:

S_B = \sum_{i=1}^{c} N_i (m_i - m)(m_i - m)^T

where m is the overall mean of the data, and m_i and N_i are the sample mean and size of class i respectively.
# Drop the label column so the overall mean covers only the 13 numeric features
feature_means = df.drop('class', axis=1).mean()

between_class_scatter_matrix = np.zeros((13,13))
for c in class_feature_means:
    n = len(df.loc[df['class'] == c].index)
    mc, m = class_feature_means[c].values.reshape(13,1), feature_means.values.reshape(13,1)
    between_class_scatter_matrix += n * (mc - m).dot((mc - m).T)
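The manual omits the eigendecomposition itself, although the eigen_values and eigen_vectors it produces are used directly below. A minimal sketch of this step (step 2 above), using the standard LDA formulation as a generalized eigenvalue problem:
# Solve the eigenvalue problem for inv(S_W) * S_B to obtain the LDA directions (step 2)
eigen_values, eigen_vectors = np.linalg.eig(
    np.linalg.inv(within_class_scatter_matrix).dot(between_class_scatter_matrix))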
The eigenvectors with the highest eigenvalues carry the most information about the distribution of
the data. Thus, we sort the eigenvalues from highest to lowest and select the first k eigenvectors.
In order to ensure that the eigenvalue maps to the same eigenvector after sorting, we place them in
a temporary array.
pairs = [(np.abs(eigen_values[i]), eigen_vectors[:,i]) for i in range(len(eigen_values))]
pairs = sorted(pairs, key=lambda x: x[0], reverse=True)
for pair in pairs:
    print(pair[0])
Just looking at the values, it’s difficult to determine how much of the variance is explained by
each component. Thus, we express it as a percentage.
eigen_value_sums = sum(eigen_values)

print('Explained Variance')
for i, pair in enumerate(pairs):
    print('Eigenvector {}: {}'.format(i, (pair[0]/eigen_value_sums).real))
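The manual also skips steps 4 and 5, but the X_lda plotted below depends on them. A minimal sketch, assuming we keep the two leading eigenvectors:
# Stack the top two eigenvectors column-wise to form the projection matrix (step 4)
w_matrix = np.hstack((pairs[0][1].reshape(13,1), pairs[1][1].reshape(13,1))).real

# Project the original data onto the two linear discriminants (step 5)
X_lda = np.array(X.dot(w_matrix))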
matplotlib can’t handle categorical variables directly. Thus, we encode every class as a number so
that we can incorporate the class labels into our plot.
le = LabelEncoder()
y = le.fit_transform(df['class'])
Then, we plot the data as a function of the two LDA components and use a different color for
each class.
plt.xlabel('LD1')
plt.ylabel('LD2')
plt.scatter(
X_lda[:,0],
X_lda[:,1],
c=y,
cmap='rainbow',
alpha=0.7,
edgecolors='b'
)
Rather than implementing the Linear Discriminant Analysis algorithm from scratch every time,
we can use the predefined LinearDiscriminantAnalysis class made available to us by the scikit-learn library.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis()
X_lda = lda.fit_transform(X, y)
We can access the following property to obtain the variance explained by each component.
lda.explained_variance_ratio_
Next, let’s take a look at how LDA compares to Principal Component Analysis or PCA. We start
off by creating and fitting an instance of the PCA class.
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
# Note: PCA is unsupervised; the y passed here is accepted but ignored
X_pca = pca.fit_transform(X, y)
We can access the explained_variance_ratio_ property to view the percentage of the variance
explained by each component.
pca.explained_variance_ratio_
As we can see, PCA selected the components that result in the highest spread (retaining the most information), and not necessarily the ones that maximize the separation between classes.
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.scatter(
X_pca[:,0],
X_pca[:,1],
c=y,
cmap='rainbow',
alpha=0.7,
edgecolors='b'
)
Next, let's see whether we can create a model to classify the wines using the LDA components as features. First, we split the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(X_lda, y, random_state=1)
Then, we build and train a Decision Tree. After predicting the category of each sample in the test
set, we create a confusion matrix to evaluate the model’s performance.
dt = DecisionTreeClassifier()
dt.fit(X_train, y_train)
y_pred = dt.predict(X_test)
confusion_matrix(y_test, y_pred)
As we can see, the Decision Tree classifier correctly classified everything in the test set.
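As a quick sanity check (not part of the original walkthrough), we can also print the overall accuracy, which should be 1.0 when the confusion matrix has no off-diagonal entries:
from sklearn.metrics import accuracy_score

# 1.0 means every test sample was classified correctly
print(accuracy_score(y_test, y_pred))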
Finally, here are a few real-world applications of LDA:
Face Recognition – LDA is used in face recognition to reduce the number of attributes to a more manageable number before the actual classification. The dimensions that are generated are linear combinations of pixels that form a template. These are called Fisher's faces.
Medical – You can use LDA to classify a patient's disease as mild, moderate, or severe. The classification is based on the various parameters of the patient and their medical history.
Customer Identification – You can obtain the features of customers by performing a simple question-and-answer survey. LDA helps identify and select the features that describe the properties of the group of customers most likely to buy a particular item in a shopping mall.