MANOVA

The document describes analyzing a dataset containing measurements of plant heights and canopy volumes for different plant varieties using MANOVA. It is found that: 1) Plant varieties have a statistically significant association with both plant height and canopy volume based on a significant Pillai's Trace test statistic and a large effect size. 2) Linear discriminant analysis indicates that varieties C and D are significantly different from each other based on the two variables, while varieties A and B are more similar. 3) Assumptions of MANOVA like normality, homogeneity of variance-covariance matrices, absence of outliers, linearity, and no multicollinearity are met based on various statistical tests conducted on the dataset.


MANOVA example dataset

Suppose we have a dataset of various plant varieties (plant_var) and their associated phenotypic measurements for plant height (height) and canopy volume (canopy_vol). We want to see whether plant height and canopy volume are associated with plant variety using MANOVA.

For MANOVA, the dataset should contain more observations per group of the independent variable than the number of dependent variables. This is particularly important for testing the homogeneity of the variance-covariance matrices using Box’s M test.
Load dataset,
library(tidyverse)
df=read_csv("https://reneshbedre.github.io/assets/posts/ancova/manova_data.csv")
head(df, 2)
# output
plant_var height canopy_vol
<chr> <dbl> <dbl>
1 A 20 0.7
2 A 22 0.8
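A quick way to verify that design rule on the example data (a sketch using the df loaded above) is to count the observations per group; each variety here has 10 observations against 2 dependent variables:

```r
# Observations per group of the independent variable; each count should
# exceed the number of dependent variables (2 here)
df %>% count(plant_var)   # n = 10 per variety in this dataset
```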

Summary statistics and visualization of dataset

Get summary statistics for each dependent variable,

# summary statistics for dependent variable height
df %>% group_by(plant_var) %>% summarise(n = n(), mean = mean(height), sd = sd(height))
# output
plant_var n mean sd
<chr> <int> <dbl> <dbl>
1 A 10 18.9 2.92
2 B 10 16.5 1.92
3 C 10 3.05 1.04
4 D 10 9.35 2.11

# summary statistics for dependent variable canopy_vol
df %>% group_by(plant_var) %>% summarise(n = n(), mean = mean(canopy_vol), sd = sd(canopy_vol))
# output
plant_var n mean sd
<chr> <int> <dbl> <dbl>
1 A 10 0.784 0.121
2 B 10 0.608 0.0968
3 C 10 0.272 0.143
4 D 10 0.474 0.0945

Visualize dataset,
library(gridExtra)
p1 <- ggplot(df, aes(x = plant_var, y = height, fill = plant_var)) +
geom_boxplot(outlier.shape = NA) + geom_jitter(width = 0.2) +
theme(legend.position="top")
p2 <- ggplot(df, aes(x = plant_var, y = canopy_vol, fill = plant_var)) +
geom_boxplot(outlier.shape = NA) + geom_jitter(width = 0.2) +
theme(legend.position="top")
grid.arrange(p1, p2, ncol=2)
Perform one-way MANOVA
dep_vars <- cbind(df$height, df$canopy_vol)
fit <- manova(dep_vars ~ plant_var, data = df)
summary(fit)
# output
Df Pillai approx F num Df den Df Pr(>F)
plant_var 3 1.0365 12.909 6 72 7.575e-10 ***
Residuals 36

# get effect size
library(effectsize)
effectsize::eta_squared(fit)
# output
Parameter | Eta2 (partial) | 95% CI
-----------------------------------------
plant_var | 0.52 | [0.36, 1.00]

The Pillai’s Trace test statistic is statistically significant [Pillai’s Trace = 1.03, F(6, 72) = 12.90, p < 0.001] and indicates that plant variety has a statistically significant association with the combination of plant height and canopy volume.

The measure of effect size (partial eta squared; ηp²) is 0.52, suggesting that plant variety has a large effect on both plant height and canopy volume.
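As a hand check of this number (a sketch, assuming the usual convention that for Pillai’s trace V, partial eta squared is V / s, where s is the smaller of the number of dependent variables and the hypothesis degrees of freedom):

```r
# Partial eta squared recovered from Pillai's trace (assumed relation:
# eta2_p = V / s, with s = min(number of DVs, hypothesis df))
V <- 1.0365        # Pillai's trace from summary(fit)
s <- min(2, 3)     # 2 dependent variables, 3 hypothesis df
round(V / s, 2)    # 0.52, matching effectsize::eta_squared(fit)
```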

Post-hoc test

The MANOVA results suggest that there are statistically significant (p < 0.001) differences between plant varieties, but they do not tell us which groups differ from each other. To find out which groups are significantly different, a post-hoc test needs to be carried out.



To test the between-group differences, a univariate ANOVA could be run on each dependent variable, but this would not be appropriate here, as it discards the information that the multiple dependent variables carry jointly.
Here we will perform linear discriminant analysis (LDA) to see the differences between each group. LDA discriminates the groups using information from both dependent variables.
library(MASS)
post_hoc <- lda(df$plant_var ~ dep_vars, CV=F)
post_hoc
# output
Call:
lda(df$plant_var ~ dep_vars, CV = F)

Prior probabilities of groups:


A B C D
0.25 0.25 0.25 0.25

Group means:
dep_vars1 dep_vars2
A 18.90 0.784
B 16.54 0.608
C 3.05 0.272
D 9.35 0.474

Coefficients of linear discriminants:


LD1 LD2
dep_vars1 -0.4388374 -0.2751091
dep_vars2 -1.3949158 9.3256280

Proportion of trace:
LD1 LD2
0.9855 0.0145

# plot
plot_lda <- data.frame(df[, "plant_var"], lda = predict(post_hoc)$x)
ggplot(plot_lda) + geom_point(aes(x = lda.LD1, y = lda.LD2, colour = plant_var), size = 4)
The LDA scatter plot separates the plant varieties based on the two dependent variables. Varieties C and D are well separated from A and B, while varieties A and B overlap and are more similar to each other. Overall, LDA discriminated between the plant varieties.
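To back up this reading of the plot, a quick check (a sketch reusing the post_hoc model fitted above) is to cross-tabulate the actual varieties against the classes LDA assigns:

```r
# Cross-tabulate actual varieties vs. LDA-predicted classes; counts
# concentrated on the diagonal indicate well-separated groups
table(actual = df$plant_var, predicted = predict(post_hoc)$class)
```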
Test MANOVA assumptions

Assumptions of multivariate normality

Multivariate normality is not always easy to test directly, as a dedicated test may not be available in all statistical software packages. You can initially check univariate normality for each combination of the independent and dependent variables. If any of these tests fails (significant p value), multivariate normality may also be violated.

Note: as per the multivariate central limit theorem, if the sample size is large (say n > 20) for each combination of the independent and dependent variable, we can assume that the multivariate normality assumption holds.
library(rstatix)
df %>% group_by(plant_var) %>% shapiro_test(height, canopy_vol)
plant_var variable statistic p
<chr> <chr> <dbl> <dbl>
1 A canopy_vol 0.968 0.876
2 A height 0.980 0.964
3 B canopy_vol 0.882 0.137
4 B height 0.939 0.540
5 C canopy_vol 0.917 0.333
6 C height 0.895 0.194
7 D canopy_vol 0.873 0.109
8 D height 0.902 0.231

As the p value is non-significant (p > 0.05) for each combination of independent and dependent variable, we fail to reject the null hypothesis and conclude that the data follow univariate normality.

If the sample size is large (say n > 50), visual approaches such as a QQ-plot and a histogram are better suited for assessing the normality assumption.
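Such a visual check can be sketched as follows (assuming df is loaded as above; the same idea applies to canopy_vol):

```r
# QQ-plots of height per variety; points falling along the reference
# line support the normality assumption
library(ggplot2)
ggplot(df, aes(sample = height)) +
  stat_qq() + stat_qq_line() +
  facet_wrap(~ plant_var)
```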
Now, let’s check for multivariate normality using Mardia’s
Skewness and Kurtosis test,
library(mvnormalTest)
mardia(df[, c("height", "canopy_vol")])$mv.test
# output
Test Statistic p-value Result
1 Skewness 2.8598 0.5815 YES
2 Kurtosis -0.9326 0.351 YES
3 MV Normality <NA> <NA> YES

As the p value is non-significant (p > 0.05) for both Mardia’s skewness and kurtosis tests, we fail to reject the null hypothesis and conclude that the data follow multivariate normality.

Both the skewness and kurtosis p values should be > 0.05 to conclude multivariate normality.

Homogeneity of the variance-covariance matrices

We will use Box’s M test to assess the homogeneity of the variance-covariance matrices. Null hypothesis: the variance-covariance matrices are equal for each combination formed by each group in the independent variable.
library(heplots)
boxM(Y = df[, c("height", "canopy_vol")], group = df$plant_var)
# output
Box M-test for Homogeneity of Covariance Matrices

data: df[, c("height", "canopy_vol")]
Chi-Sq (approx.) = 21.048, df = 9, p-value = 0.01244

Because Box’s M test is highly sensitive, a stricter significance level of 0.001 is commonly used for it. As the p value (0.012) is greater than 0.001, we fail to reject the null hypothesis and conclude that the variance-covariance matrices are equal for each combination of the dependent variables formed by each group in the independent variable.

If this assumption fails, it is good practice to check the homogeneity-of-variance assumption for each dependent variable separately using Bartlett’s or Levene’s test, to identify which variable violates the equal-variance assumption.
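A sketch of that per-variable check using rstatix::levene_test (assuming df as loaded above):

```r
# Levene's test per dependent variable; a significant p value flags
# the variable that violates homogeneity of variance
library(rstatix)
df %>% levene_test(height ~ plant_var)
df %>% levene_test(canopy_vol ~ plant_var)
```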

Multivariate outliers

MANOVA is highly sensitive to outliers, which may produce Type I or Type II errors. Multivariate outliers can be detected using the Mahalanobis distance test: the larger the Mahalanobis distance, the more likely an observation is an outlier.
library(rstatix)
# get distance
mahalanobis_distance(data = df[, c("height", "canopy_vol")])$is.outlier
# output
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[18] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[35] FALSE FALSE FALSE FALSE FALSE FALSE

From the results, there are no multivariate outliers (all is.outlier = FALSE, p > 0.001) in the dataset. If is.outlier = TRUE for an observation, that observation is a multivariate outlier.

Linearity assumption
The linearity assumption can be checked by visualizing a pairwise scatterplot of the dependent variables for each group. The data points should fall roughly along a straight line to meet the linearity assumption. Violation of the linearity assumption reduces statistical power.
library(gridExtra)
p1 <- df %>% filter(plant_var == "A") %>% ggplot(aes(x = height, y = canopy_vol)) + geom_point() + ggtitle("Variety: A")
p2 <- df %>% filter(plant_var == "B") %>% ggplot(aes(x = height, y = canopy_vol)) + geom_point() + ggtitle("Variety: B")
p3 <- df %>% filter(plant_var == "C") %>% ggplot(aes(x = height, y = canopy_vol)) + geom_point() + ggtitle("Variety: C")
p4 <- df %>% filter(plant_var == "D") %>% ggplot(aes(x = height, y = canopy_vol)) + geom_point() + ggtitle("Variety: D")
grid.arrange(p1, p2, p3, p4, ncol=2)
The scatterplots indicate that the dependent variables have a linear relationship within each group of the independent variable.

Multicollinearity assumption
Multicollinearity can be checked via the correlation between the dependent variables. If you have more than two dependent variables, you can use a correlation matrix or the variance inflation factor to assess multicollinearity.

The correlation between the dependent variables should not be > 0.9, nor too low. If the correlation is very low, it may be preferable to run a separate univariate ANOVA for each dependent variable.
cor.test(x = df$height, y = df$canopy_vol, method = "pearson")$estimate
# output
cor
0.8652548

As the correlation coefficient between the dependent variables is < 0.9, there is no multicollinearity.

References

1. Warne RT. A primer on multivariate analysis of variance (MANOVA) for behavioral scientists. Practical Assessment, Research & Evaluation. 2014 Nov 1;19.
2. Mardia KV. Measures of multivariate skewness and kurtosis with applications. Biometrika. 1970 Dec 1;57(3):519-30.
3. French A, Macedo M, Poulsen J, Waterson T, Yu A. Multivariate analysis of variance (MANOVA).
