Multiple Discriminant Analysis

Multiple discriminant analysis is a statistical technique used when the dependent variable is categorical and the independent variables are metric. It forms linear combinations of the independent variables to discriminate between the categories of the dependent variable. The analysis derives discriminant functions that maximize differences between categories and minimize differences within categories. Researchers use it to understand group differences, classify units accurately, and identify important predictor variables. It requires the assumptions of multivariate normality, equal variance/covariance, and linear relationships between variables.
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter 5

Multiple Discriminant Analysis

Discriminant Analysis Defined

Multiple discriminant analysis is an appropriate technique when the dependent variable is categorical (nominal or nonmetric) and the independent variables are metric. The single dependent variable can have two, three, or more categories.

Examples:
• Gender – Male vs. Female
• Heavy Users vs. Light Users
• Purchasers vs. Non-purchasers
• Good Credit Risk vs. Poor Credit Risk
• Member vs. Non-Member
• Attorney, Physician or Professor

WHAT IS MULTIPLE DISCRIMINANT ANALYSIS?

• Discriminant analysis is a dependence technique that forms variates (linear combinations of metric independent variables), which are used to predict the classification of a categorical dependent variable.

• Classification is accomplished by a statistical procedure that derives discriminant functions, or variates of the predictor variables, which maximize the between-group variance and minimize the within-group variance on the discriminant function score(s).

• The null hypothesis is that the two or more group means are equal on the discriminant function(s); a statistically significant model therefore indicates that the group means are not equal.

• Researchers use multiple discriminant analysis to help them understand:
– group differences on a set of independent variables
– the ability to correctly classify statistical units into groups or classes
– the relative importance of independent variables in the classification process

Multiple discriminant analysis may be considered a type of profile analysis or an analytical predictive technique. It is most appropriate when there is a single categorical dependent variable and multiple metric independent variables.
Common applications include:
• assessing credit risk
• predicting failures (firm, product, etc.)
• profiling market segments

• The discriminant function, or variate, is derived so as to maximize the between-group variance and minimize the within-group variance.

• From the linear function a Z score is calculated for each observation. Averaging these scores within a group gives the centroid, or group mean.
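
The derivation of Z scores and centroids described above can be sketched for the simplest case: two groups and two independent variables. This is a minimal illustration using Fisher's classical two-group solution, not necessarily the exact procedure a statistics package follows. The data values and the function names (`fisher_weights`, `z_scores`) are made up for the example, and the weights are left unscaled with no constant term.

```python
from statistics import mean

def centroid(rows):
    """Mean of each of the two variables for one group."""
    return [mean(r[0] for r in rows), mean(r[1] for r in rows)]

def within_sscp(rows, m):
    """Sums of squares and cross-products about the group mean (2x2, symmetric)."""
    s11 = sum((r[0] - m[0]) ** 2 for r in rows)
    s12 = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    s22 = sum((r[1] - m[1]) ** 2 for r in rows)
    return s11, s12, s22

def fisher_weights(group_a, group_b):
    """Weights proportional to W^-1 (mean_a - mean_b), W = pooled within-group SSCP."""
    ma, mb = centroid(group_a), centroid(group_b)
    a11, a12, a22 = within_sscp(group_a, ma)
    b11, b12, b22 = within_sscp(group_b, mb)
    w11, w12, w22 = a11 + b11, a12 + b12, a22 + b22
    det = w11 * w22 - w12 * w12
    d1, d2 = ma[0] - mb[0], ma[1] - mb[1]
    # Closed-form inverse of a 2x2 matrix applied to the mean difference
    return [(w22 * d1 - w12 * d2) / det, (w11 * d2 - w12 * d1) / det]

def z_scores(rows, weights):
    """Discriminant Z score for each observation (no constant term)."""
    return [weights[0] * r[0] + weights[1] * r[1] for r in rows]

# Made-up observations on X1 and X2 for groups A and B
group_a = [(4.0, 2.0), (5.0, 3.0), (6.0, 2.5), (5.0, 4.0)]
group_b = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.5), (2.0, 0.5)]

weights = fisher_weights(group_a, group_b)
centroid_a = mean(z_scores(group_a, weights))  # group mean of the Z scores
centroid_b = mean(z_scores(group_b, weights))
print(weights, centroid_a, centroid_b)
```

With weights built this way, the centroid of group A always lies above the centroid of group B on the Z axis, which is what makes a single cutting score usable for classification.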
Graphic Illustration of Two-Group Discriminant Analysis

[Figure: axes X1 and X2; groups A and B; A’ and B’ on the discriminant function Z]

Stage 1: Objectives of Discriminant Analysis

1. Determine if statistically significant differences exist between the two (or more) a priori defined groups.

2. Identify the relative importance of each of the independent variables in predicting group membership.

3. Develop procedures for classifying objects (individuals, firms, products, etc.) into groups, and then examine the predictive accuracy (i.e., hit ratio) of the discriminant function to see if it is acceptable (> 25% increase over chance).
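
The hit-ratio check in objective 3 amounts to a small piece of arithmetic. The sketch below assumes that the "> 25% increase" is measured relative to the accuracy expected by chance. The labels and the function names are hypothetical.

```python
def hit_ratio(actual, predicted):
    """Share of observations whose predicted group equals the actual group."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

def exceeds_chance_by_quarter(ratio, chance_accuracy):
    """True when the hit ratio is more than 25% above the chance accuracy."""
    return ratio > 1.25 * chance_accuracy

# Made-up actual and predicted group memberships
actual = ["A", "A", "B", "B", "A", "B", "A", "B"]
predicted = ["A", "A", "B", "A", "A", "B", "B", "B"]

ratio = hit_ratio(actual, predicted)
# With two equal-sized groups, chance accuracy is taken as 0.5 here
acceptable = exceeds_chance_by_quarter(ratio, chance_accuracy=0.5)
print(ratio, acceptable)
```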

Stage 2: Research Design for Discriminant Analysis

• Selection of dependent and independent variables.

• Sample size (total & per variable).

• Sample division for validation.

Selection of dependent and independent variables

• The dependent variable must be nonmetric, representing groups of objects that are expected to differ on the independent variables.
• Choose a dependent variable that:
– best represents group differences of interest,
– defines groups that are substantially different, and
– minimizes the number of categories while still meeting the research objectives.
• In converting metric variables to a nonmetric scale for use as the dependent variable, consider using extreme groups to maximize the group differences.
• Independent variables must identify differences between at least two groups to be of any use in discriminant analysis.
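
The extreme-groups suggestion above can be illustrated with a short sketch. It assumes a simple rule, keeping the lowest and highest thirds of a metric variable and dropping the middle, which is only one possible way to form extreme groups. The usage scores, the group labels, and the function name are made up.

```python
def polar_extreme_groups(scores, parts=3):
    """Label the lowest and highest 1/parts of cases; middle cases are dropped."""
    ranked = sorted(scores, key=scores.get)  # case ids ordered by metric score
    k = len(ranked) // parts
    if k == 0:
        raise ValueError("too few cases to form extreme groups")
    groups = {case: "light" for case in ranked[:k]}
    groups.update({case: "heavy" for case in ranked[-k:]})
    return groups

# Made-up weekly usage counts per respondent
usage = {"r1": 2, "r2": 15, "r3": 7, "r4": 1, "r5": 9,
         "r6": 20, "r7": 8, "r8": 3, "r9": 18}

groups = polar_extreme_groups(usage)
print(groups)
```

The resulting labels would serve as the nonmetric dependent variable, at the cost of discarding the middle cases from the analysis.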
Sample Size
• Overall sample:
– Small samples increase sampling error; very large samples may make all differences statistically significant.
– Strive for 5-20 cases per independent variable.
– Have a large enough sample to divide it into an estimation sample and a holdout sample, each meeting the above requirements.

• Sample size for each category:
– Each category should exceed 20 observations. Wide variation in group sizes can affect estimation.
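
The sample-size guidelines above translate directly into a quick check. This sketch uses the lower bound of 5 cases per independent variable and the rule that each category exceeds 20 observations. The function name, parameter names, and group sizes are assumptions for illustration.

```python
def check_sample_size(group_sizes, n_independent_vars,
                      min_cases_per_var=5, min_group_size=20):
    """Return a list of guideline violations (empty when none are found)."""
    issues = []
    total = sum(group_sizes.values())
    if total / n_independent_vars < min_cases_per_var:
        issues.append("fewer than %d cases per independent variable"
                      % min_cases_per_var)
    for name, n in group_sizes.items():
        # Each category should EXCEED the minimum, so n equal to it is flagged
        if n <= min_group_size:
            issues.append("group %s does not exceed %d observations"
                          % (name, min_group_size))
    return issues

print(check_sample_size({"A": 30, "B": 25}, n_independent_vars=5))
print(check_sample_size({"A": 30, "B": 15}, n_independent_vars=5))
```

The same check would be applied separately to the estimation and holdout samples, since each is expected to meet the requirements.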

Stage 3: Assumptions of Discriminant Analysis

Key Assumptions
• Multivariate normality of the independent variables.
• Equal variance and covariance for the groups.

Other Assumptions
• Minimal multicollinearity among independent variables.
• Linear relationships.
• Elimination of outliers.

Stage 4: Estimation of the Discriminant Model and Assessing Overall Fit

Three stages of discriminant analysis:
• Deriving the discriminant function
• Calibration
• Interpretation

Stage 4: Estimation of the Discriminant Model and Assessing Overall Fit

Step 1: Select an estimation method
1. Simultaneous estimation – all independent variables are considered at the same time.
2. Stepwise estimation – independent variables are entered into the discriminant function one at a time.

Step 2: Assess the statistical significance of the function
Criteria: Wilks' lambda, Hotelling's trace, Pillai's criterion, Roy's greatest characteristic root, Mahalanobis' distance, and Rao's V measures.
Significance level: The conventional criterion of .05 or beyond is most often used.
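
Of the criteria listed in Step 2, Wilks' lambda is the easiest to compute by hand: it is the ratio of the determinant of the within-group sums-of-squares-and-cross-products (SSCP) matrix to the determinant of the total SSCP matrix. The sketch below covers only the two-variable case so that the determinants can be written out directly. The data and function names are made up, and the conversion of lambda to a significance level is not shown.

```python
from statistics import mean

def centroid(rows):
    """Mean of each of the two variables."""
    return (mean(x for x, _ in rows), mean(y for _, y in rows))

def sscp(rows, center):
    """2x2 sums of squares and cross-products about a given center."""
    s11 = sum((x - center[0]) ** 2 for x, _ in rows)
    s12 = sum((x - center[0]) * (y - center[1]) for x, y in rows)
    s22 = sum((y - center[1]) ** 2 for _, y in rows)
    return s11, s12, s22

def wilks_lambda(groups):
    """det(W) / det(T): values near 0 indicate well-separated group means."""
    all_rows = [row for rows in groups for row in rows]
    t11, t12, t22 = sscp(all_rows, centroid(all_rows))
    w11 = w12 = w22 = 0.0
    for rows in groups:
        g11, g12, g22 = sscp(rows, centroid(rows))
        w11 += g11
        w12 += g12
        w22 += g22
    return (w11 * w22 - w12 ** 2) / (t11 * t22 - t12 ** 2)

# Made-up observations on two variables for two groups
group_a = [(4.0, 2.0), (5.0, 3.0), (6.0, 2.5), (5.0, 4.0)]
group_b = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.5), (2.0, 0.5)]

lam = wilks_lambda([group_a, group_b])
print(round(lam, 4))
```

A lambda of 1 means the group centroids coincide, so smaller values correspond to stronger group separation.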
Step 3: Construct the classification matrix to assess the overall fit of the discriminant function
– Calculating discriminant Z scores for each observation (see p. 259)
– Evaluating group differences on the discriminant Z scores
– Assessing group membership prediction accuracy:
• The statistical and practical rationale for developing classification matrices
• The cutting score determination
• Construction of the classification matrices
• Standards for assessing classification accuracy
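
The cutting score and classification matrix in Step 3 can be sketched as follows for two groups. The sketch assumes the commonly cited size-weighted cutting score, which reduces to the midpoint between the centroids when the groups are equal in size. The Z scores and the function names are made up for illustration.

```python
from statistics import mean

def cutting_score(centroid_a, centroid_b, n_a, n_b):
    """Size-weighted cutting score; the midpoint when n_a == n_b."""
    return (n_a * centroid_b + n_b * centroid_a) / (n_a + n_b)

def classify(z, cut, centroid_a, centroid_b):
    """Assign an observation to the group on its side of the cutting score."""
    if centroid_a > centroid_b:
        return "A" if z > cut else "B"
    return "A" if z < cut else "B"

def classification_matrix(actual, predicted, labels=("A", "B")):
    """Rows are actual groups, columns are predicted groups."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

# Made-up discriminant Z scores for each group
z_a = [1.2, 0.8, 1.5, -0.1]
z_b = [-1.0, -0.6, 0.3, -1.3]

ca, cb = mean(z_a), mean(z_b)
cut = cutting_score(ca, cb, len(z_a), len(z_b))
actual = ["A"] * len(z_a) + ["B"] * len(z_b)
predicted = [classify(z, cut, ca, cb) for z in z_a + z_b]
matrix = classification_matrix(actual, predicted)
print(cut, matrix)
```

The diagonal of the matrix holds the correctly classified cases, so the hit ratio is the diagonal total divided by the sample size.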

Step 4: Compare the hit ratio with a benchmark to ascertain the predictive accuracy of the model:
a) chance criteria
b) Press's Q statistic
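
Both benchmarks in Step 4 are short formulas. The sketch below computes the maximum chance criterion (the share of the largest group), the proportional chance criterion (the sum of squared group proportions), and Press's Q, which is compared with a chi-square critical value with 1 degree of freedom. The group sizes and the count of correct classifications are made up.

```python
def maximum_chance(group_sizes):
    """Accuracy obtained by assigning every case to the largest group."""
    return max(group_sizes) / sum(group_sizes)

def proportional_chance(group_sizes):
    """Sum of squared group proportions."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)

def press_q(n_total, n_correct, n_groups):
    """Press's Q = [N - (n * K)]^2 / [N * (K - 1)]."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))

sizes = [30, 20]   # made-up group sizes
n_correct = 40     # made-up number of correctly classified cases

c_max = maximum_chance(sizes)
c_pro = proportional_chance(sizes)
q = press_q(sum(sizes), n_correct, len(sizes))

CHI_SQUARE_1DF_01 = 6.63  # critical value for 1 degree of freedom at .01
print(c_max, c_pro, q, q > CHI_SQUARE_1DF_01)
```

Press's Q grows with sample size, so a significant value on a large sample does not by itself show that the hit ratio is practically useful.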
Stage 5: Interpretation of the Results

Three Methods . . .
1. Standardized discriminant weights,
2. Discriminant loadings (structure
correlations), and
3. Partial F values.
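
As an illustration of the second method, a discriminant loading is the simple correlation between an independent variable and the discriminant Z scores. The sketch below computes these correlations over the whole sample for simplicity; statistical packages typically report pooled within-group correlations instead, so the numbers would differ. The data, the weights, and the function names are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def discriminant_loadings(rows, weights):
    """Correlation of each independent variable with the Z scores."""
    z = [sum(w * v for w, v in zip(weights, row)) for row in rows]
    return [pearson([row[j] for row in rows], z) for j in range(len(weights))]

# Made-up observations (X1, X2) and hypothetical discriminant weights
rows = [(4.0, 2.0), (5.0, 3.0), (6.0, 2.5), (5.0, 4.0),
        (1.0, 1.0), (2.0, 2.0), (3.0, 1.5), (2.0, 0.5)]
weights = [0.68, 0.27]

loadings = discriminant_loadings(rows, weights)
print(loadings)
```

Unlike the weights, loadings are bounded between -1 and 1, which makes them easier to compare across variables.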

Interpretation of the Results

Two or More Functions . . .
1. Rotation of discriminant functions
2. Potency index
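
The potency index combines a variable's loadings across functions into one number. In the usual formulation, each squared loading is weighted by the function's relative eigenvalue (its eigenvalue divided by the sum of eigenvalues of the significant functions), and the products are summed per variable. The loadings and eigenvalues below are made up, and only significant functions are assumed to be passed in.

```python
def potency_index(loadings, eigenvalues):
    """loadings[j][i] is the loading of variable i on function j."""
    total = sum(eigenvalues)
    relative = [ev / total for ev in eigenvalues]  # relative eigenvalues
    n_vars = len(loadings[0])
    return [
        sum(loadings[j][i] ** 2 * relative[j] for j in range(len(eigenvalues)))
        for i in range(n_vars)
    ]

# Two significant functions, two variables (made-up values)
eigenvalues = [3.0, 1.0]
loadings = [
    [0.8, 0.2],  # loadings on function 1
    [0.1, 0.9],  # loadings on function 2
]

potency = potency_index(loadings, eigenvalues)
print(potency)
```

Here the first variable ranks higher because its large loading sits on the function that carries most of the discriminating power.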

Stage 6: Validation of the Results

• Utilizing a Holdout Sample
• Cross-Validation
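
Both validation approaches come down to how the sample is partitioned. The sketch below shows a holdout split that preserves group proportions, and a leave-one-out generator as one common form of cross-validation. The 50/50 split, the seed, and the function names are assumptions, and the model estimation step itself is omitted.

```python
import random

def stratified_split(cases_by_group, holdout_fraction=0.5, seed=0):
    """Split each group separately so both samples keep the group proportions."""
    rng = random.Random(seed)
    estimation, holdout = [], []
    for label, cases in cases_by_group.items():
        shuffled = list(cases)
        rng.shuffle(shuffled)
        n_holdout = int(len(shuffled) * holdout_fraction)
        holdout += [(label, c) for c in shuffled[:n_holdout]]
        estimation += [(label, c) for c in shuffled[n_holdout:]]
    return estimation, holdout

def leave_one_out(cases):
    """Yield (training cases, held-out case), holding out each case once."""
    for i in range(len(cases)):
        yield cases[:i] + cases[i + 1:], cases[i]

# Made-up case ids for two groups
data = {"A": list(range(10)), "B": list(range(100, 106))}
estimation, holdout = stratified_split(data)
print(len(estimation), len(holdout))

folds = list(leave_one_out([1, 2, 3]))
print(folds)
```

In either scheme the discriminant function is estimated only on the training portion, and the hit ratio is computed on the cases that were held out.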
