100% found this document useful (1 vote)

238 views25 pages

Factor Hair Revised PDF

The passage discusses the challenges of summarizing long documents in a concise manner. It notes that automatically generating summaries requires identifying the most important concepts and events while removing unnecessary details. The goal is to produce a high-level overview that conveys the key topics and storyline in just a few sentences.

Uploaded by

Sampada Halve

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

238 views25 pages

Factor Hair Revised PDF

Uploaded by

Sampada Halve

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 25

8/30/2019

Project Report: Advance

Statistics
Factor Hair Revised

Submitted By: Sampada Sumant

Table of Content

A. Factor Hair Revised Case Study Background.

B. Questions to be solved.

1. Perform exploratory data analysis on the dataset. Showcase some

chart, graphs. Check for Outliers and missing value.

2. Is there evidence of multicollinearity? Showcase your analysis.

3. Perform simple linear regression for the dependent variable with

every independent variable.

4. Perform PCA/Factor Analysis by extracting 4 factors. Interpret the

output and name the factors.

5. Perform multiple linear regression with customer satisfaction as

dependent variable and the four factor as independent variables.

Comment on the model output and validity. Your remark should

make it meaningful for everybody.

1. Perform exploratory data analysis on the dataset. Showcase some
chart, graphs. Check for Outliers and missing value.
setwd("C:/Users/Sampada/Desktop/BABI/Assignment/Factor-Hair-Revised")

factor_hair = read.csv("Factor-Hair-Revised.csv", header = TRUE)

summary(factor_hair)
ID ProdQual Ecom TechSup CompRes
Min. : 1.00 Min. : 5.000 Min. :2.200 Min. :1.300 Min. :2.600
1st Qu.: 25.75 1st Qu.: 6.575 1st Qu.:3.275 1st Qu.:4.250 1st Qu.:4.600
Median : 50.50 Median : 8.000 Median :3.600 Median :5.400 Median :5.450
Mean : 50.50 Mean : 7.810 Mean :3.672 Mean :5.365 Mean :5.442
3rd Qu.: 75.25 3rd Qu.: 9.100 3rd Qu.:3.925 3rd Qu.:6.625 3rd Qu.:6.325
Max. :100.00 Max. :10.000 Max. :5.700 Max. :8.500 Max. :7.800
Advertising ProdLine SalesFImage ComPricing WartyClaim
Min. :1.900 Min. :2.300 Min. :2.900 Min. :3.700 Min. :4.100
1st Qu.:3.175 1st Qu.:4.700 1st Qu.:4.500 1st Qu.:5.875 1st Qu.:5.400
Median :4.000 Median :5.750 Median :4.900 Median :7.100 Median :6.100
Mean :4.010 Mean :5.805 Mean :5.123 Mean :6.974 Mean :6.043
3rd Qu.:4.800 3rd Qu.:6.800 3rd Qu.:5.800 3rd Qu.:8.400 3rd Qu.:6.600
Max. :6.500 Max. :8.400 Max. :8.200 Max. :9.900 Max. :8.100
OrdBilling DelSpeed Satisfaction
Min. :2.000 Min. :1.600 Min. :4.700
1st Qu.:3.700 1st Qu.:3.400 1st Qu.:6.000
Median :4.400 Median :3.900 Median :7.050
Mean :4.278 Mean :3.886 Mean :6.918
3rd Qu.:4.800 3rd Qu.:4.425 3rd Qu.:7.625
Max. :6.700 Max. :5.500 Max. :9.900
attach(factor_hair)

names(factor_hair)
[1] "ID" "ProdQual" "Ecom" "TechSup" "CompRes" "Advert
ising"
[7] "ProdLine" "SalesFImage" "ComPricing" "WartyClaim" "OrdBilling" "DelSp
eed"
[13] "Satisfaction"

dim(factor_hair)
[1] 100 13
str(factor_hair)
'data.frame': 100 obs. of 13 variables:
$ ID : int 1 2 3 4 5 6 7 8 9 10 ...
$ ProdQual : num 8.5 8.2 9.2 6.4 9 6.5 6.9 6.2 5.8 6.4 ...
$ Ecom : num 3.9 2.7 3.4 3.3 3.4 2.8 3.7 3.3 3.6 4.5 ...
$ TechSup : num 2.5 5.1 5.6 7 5.2 3.1 5 3.9 5.1 5.1 ...
$ CompRes : num 5.9 7.2 5.6 3.7 4.6 4.1 2.6 4.8 6.7 6.1 ...
$ Advertising : num 4.8 3.4 5.4 4.7 2.2 4 2.1 4.6 3.7 4.7 ...
$ ProdLine : num 4.9 7.9 7.4 4.7 6 4.3 2.3 3.6 5.9 5.7 ...
$ SalesFImage : num 6 3.1 5.8 4.5 4.5 3.7 5.4 5.1 5.8 5.7 ...
$ ComPricing : num 6.8 5.3 4.5 8.8 6.8 8.5 8.9 6.9 9.3 8.4 ...
$ WartyClaim : num 4.7 5.5 6.2 7 6.1 5.1 4.8 5.4 5.9 5.4 ...
$ OrdBilling : num 5 3.9 5.4 4.3 4.5 3.6 2.1 4.3 4.4 4.1 ...
$ DelSpeed : num 3.7 4.9 4.5 3 3.5 3.3 2 3.7 4.6 4.4 ...
$ Satisfaction: num 8.2 5.7 8.9 4.8 7.1 4.7 5.7 6.3 7 5.5 ...
head(factor_hair)
ID ProdQual Ecom TechSup CompRes Advertising ProdLine SalesFImage ComPricing Wart
ycl aim
1 1 8.5 3.9 2.5 5.9 4.8 4.9 6.0 6.8
4.7
2 2 8.2 2.7 5.1 7.2 3.4 7.9 3.1 5.3
5.5
3 3 9.2 3.4 5.6 5.6 5.4 7.4 5.8 4.5
6.2
4 4 6.4 3.3 7.0 3.7 4.7 4.7 4.5 8.8
7.0
5 5 9.0 3.4 5.2 4.6 2.2 6.0 4.5 6.8
6.1
6 6 6.5 2.8 3.1 4.1 4.0 4.3 3.7 8.5
5.1
OrdBilling DelSpeed Satisfaction
1 5.0 3.7 8.2
2 3.9 4.9 5.7
3 5.4 4.5 8.9
4 4.3 3.0 4.8
5 4.5 3.5 7.1
6 3.6 3.3 4.7
tail(factor_hair)
ID ProdQual Ecom TechSup CompRes Advertising ProdLine SalesFImage ComPricing Wart
yClaim OrdBilling DelSpeed Satisfaction
95 95 9.3 3.8 4.0 4.6 4.7 6.4 5.5 7.4
5.3 3.6 3.4 7.7
96 96 8.6 4.8 5.6 5.3 2.3 6.0 5.7 6.7
5.8 4.9 3.6 7.3
97 97 7.4 3.4 2.6 5.0 4.1 4.4 4.8 7.2
4.5 4.2 3.7 6.3
98 98 8.7 3.2 3.3 3.2 3.1 6.1 2.9 5.6
5.0 3.1 2.5 5.4
99 99 7.8 4.9 5.8 5.3 5.2 5.3 7.1 7.9
6.0 4.3 3.9 6.4
100 100 7.9 3.0 4.4 5.1 5.9 4.2 4.8 9.7
5.7 3.4 3.5 6.4

anyNA(factor_hair)
[1] FALSE
boxplot(factor_hair[,-1], las=2)

From box plot it can be viewed that outliers are present in E-

Commerce, Sales Force Image, Order & Billing & Delivery Speed.

library(reshape2)
> dat.m = melt(factor_hair, id.vars = "ID", measure.vars = names(factor_hair[,-1]))
> ggplot(dat.m)+geom_boxplot(aes(x=ID, y=value, color=variable))

corrplot(cor(factor_hair[2:12]))
hist(factor_hair$Satisfaction, breaks = (0:10), labels = TRUE, main = "Histogram of sa
tisfaction", col = "Brown")

par(mfrow = c(3,4))
hist(factor_hair$Satisfaction, breaks = (0:10), labels = TRUE, main = "Histogram of sa
tisfaction", col = "Brown")

hist(factor_hair$ProdQual, breaks = (0:10), labels = TRUE, main = "Histogram of Produc

t Quality", col = "Brown")

> hist(factor_hair$Ecom, breaks = (0:10), labels = TRUE, main = "Histogram of Ecommerc

e", col = "Brown")

> hist(factor_hair$TechSup, breaks = (0:10), labels = TRUE, main = "Histogram of Techn

ial Support", col = "Brown")

> hist(factor_hair$CompRes, breaks = (0:10), labels = TRUE, main = "Histogram of Compl

aint Resoultion", col = "Brown")

> hist(factor_hair$Advertising, breaks = (0:10), labels = TRUE, main = "Histogram of A

dvertising", col = "Brown")

> hist(factor_hair$ProdLine, breaks = (0:10), labels = TRUE, main = "Histogram of Prod

uct Line", col = "Brown")

> hist(factor_hair$SalesFImage, breaks = (0:10), labels = TRUE, main = "Histogram of S

ales Force Image", col = "Brown")

> hist(factor_hair$ComPricing, breaks = (0:10), labels = TRUE, main = "Histogram of Co

mpetitive Pricing", col = "Brown")

> hist(factor_hair$WartyClaim, breaks = (0:10), labels = TRUE, main = "Histogram of Wa

rranty & Claims", col = "Brown")

> hist(factor_hair$OrdBilling, breaks = (0:10), labels = TRUE, main = "Histogram of Or

der & Billing", col = "Brown")

> hist(factor_hair$DelSpeed, breaks = (0:10), labels = TRUE, main = "Histogram of Deli

very Speed", col = "Brown")
par(mar = rep(2,4))
> plot(ProdQual, Satisfaction, main = "Plot of Product Quality against Satisfaction",
col = "Red")
> plot(Ecom, Satisfaction, main = "Plot of E-commerce against Satisfaction", col = "Re
d")
> plot(TechSup, Satisfaction, main = "Plot of Technical Support against Satisfaction",
col = "Red")
> plot(CompRes, Satisfaction, main = "Plot of Complaint Resolution against Satisfactio
n", col = "Red")
> plot(Advertising, Satisfaction, main = "Plot of Advertising against Satisfaction", c
ol = "Red")
> plot(ProdLine, Satisfaction, main = "Plot of Product Line against Satisfaction", col
= "Red")
> plot(SalesFImage, Satisfaction, main = "Plot of Sales Force Image against Satisfacti
on", col = "Red")
> plot(ComPricing, Satisfaction, main = "Plot of Competitive Pricing against Satisfact
ion", col = "Red")
> plot(WartyClaim, Satisfaction, main = "Plot of Warranty & Claim against Satisfaction
", col = "Red")
> plot(OrdBilling, Satisfaction, main = "Plot of Order & Billing against Satisfaction"
, col = "Red")
> plot(DelSpeed, Satisfaction, main = "Plot of Delivery Speed against Satisfaction", c
ol = "Red")
2. Is there evidence of multicollinearity? Showcase your analysis.
Fh.core = round(cor(myfactor_hair[,1:12]),2)

Fh.core

myfactor_hair = data.frame(factor_hair[,-1])
summary(myfactor_hair)
ProdQual Ecom TechSup CompRes Advertising P
rodLine SalesFImage ComPricing
Min. : 5.000 Min. :2.200 Min. :1.300 Min. :2.600 Min. :1.900 Min.
:2.300 Min. :2.900 Min. :3.700
1st Qu.: 6.575 1st Qu.:3.275 1st Qu.:4.250 1st Qu.:4.600 1st Qu.:3.175 1st
Qu.:4.700 1st Qu.:4.500 1st Qu.:5.875
Median : 8.000 Median :3.600 Median :5.400 Median :5.450 Median :4.000 Medi
an :5.750 Median :4.900 Median :7.100
Mean : 7.810 Mean :3.672 Mean :5.365 Mean :5.442 Mean :4.010 Mean
:5.805 Mean :5.123 Mean :6.974
3rd Qu.: 9.100 3rd Qu.:3.925 3rd Qu.:6.625 3rd Qu.:6.325 3rd Qu.:4.800 3rd
Qu.:6.800 3rd Qu.:5.800 3rd Qu.:8.400
Max. :10.000 Max. :5.700 Max. :8.500 Max. :7.800 Max. :6.500 Max.
:8.400 Max. :8.200 Max. :9.900
For further analysis out of 13 variables, we have nullified ID column
which is not required for the analysis

fhmodel = lm(Satisfaction~.,myfactor_hair)

summary(fhmodel)

Call:
lm(formula = Satisfaction ~ ., data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-1.43005 -0.31165 0.07621 0.37190 0.90120
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.66961 0.81233 -0.824 0.41199
ProdQual 0.37137 0.05177 7.173 2.18e-10 ***
Ecom -0.44056 0.13396 -3.289 0.00145 **
TechSup 0.03299 0.06372 0.518 0.60591
CompRes 0.16703 0.10173 1.642 0.10416
Advertising -0.02602 0.06161 -0.422 0.67382
ProdLine 0.14034 0.08025 1.749 0.08384 .
SalesFImage 0.80611 0.09775 8.247 1.45e-12 ***
ComPricing -0.03853 0.04677 -0.824 0.41235
WartyClaim -0.10298 0.12330 -0.835 0.40587
OrdBilling 0.14635 0.10367 1.412 0.16160
DelSpeed 0.16570 0.19644 0.844 0.40124
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5623 on 88 degrees of freedom

Multiple R-squared: 0.8021, Adjusted R-squared: 0.7774
F-statistic: 32.43 on 11 and 88 DF, p-value: < 2.2e-16

fhmodel
Call:
lm(formula = Satisfaction ~ ., data = myfactor_hair)
Coefficients:
(Intercept) ProdQual Ecom TechSup CompRes Advertising ProdLine SalesFImage
-0.66961 0.37137 -0.44056 0.03299 0.16703 -0.02602 0.14034 0.80611
ComPricing WartyClaim OrdBilling DelSpeed
-0.03853 -0.10298 0.14635 0.16570

vif(fhmodel)
ProdQual Ecom TechSup CompRes Advertising ProdLine SalesFImage ComPricing
1.635797 2.756694 2.976796 4.730448 1.508933 3.488185 3.439420 1.635000
WartyClaim OrdBilling DelSpeed
3.198337 2.902999 6.516014

The above analysis shows the low VIF value in respect of the variables.
However, the VIF value for Delivery Speed is 6.51 which can be
considered for this analysis
3. Perform simple linear regression for the dependent variable with
every independent variable.
#simple linear regression of dependent variable with independent
variable
#SLM for Product quality

plot(ProdQual,Satisfaction,col="Red", abline(lm(Satisfaction~ProdQual),col="Blue"))

> cor(Satisfaction, ProdQual)

[1] 0.486325
> Model1 = lm(Satisfaction~ProdQual, data = myfactor_hair)
> summary(Model1)

Call:
lm(formula = Satisfaction ~ ProdQual, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-1.88746 -0.72711 -0.01577 0.85641 2.25220
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.67593 0.59765 6.151 1.68e-08 ***
ProdQual 0.41512 0.07534 5.510 2.90e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.047 on 98 degrees of freedom

Multiple R-squared: 0.2365, Adjusted R-squared: 0.2287
F-statistic: 30.36 on 1 and 98 DF, p-value: 2.901e-07
#SLM For E-Commerce

plot(Ecom,Satisfaction,col="Red", abline(lm(Satisfaction~Ecom),col="Blue"))
> cor(Satisfaction, Ecom)
[1] 0.282745
> Model2 = lm(Satisfaction~Ecom, data = myfactor_hair)
> summary(Model2)

Call:
lm(formula = Satisfaction ~ Ecom, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-2.37200 -0.78971 0.04959 0.68085 2.34580

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.1516 0.6161 8.361 4.28e-13 ***
Ecom 0.4811 0.1649 2.918 0.00437 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.149 on 98 degrees of freedom
Multiple R-squared: 0.07994, Adjusted R-squared: 0.07056
F-statistic: 8.515 on 1 and 98 DF, p-value: 0.004368
#SLM for Technical Support

plot(TechSup,Satisfaction,col="Red", abline(lm(Satisfaction~TechSup),col="Blue"))

> cor(Satisfaction, TechSup)

[1] 0.1125972

> Model3 = lm(Satisfaction~TechSup, data = myfactor_hair)

> summary(Model3)
Call:
lm(formula = Satisfaction ~ TechSup, data = myfactor_hair)
Residuals:
Min 1Q Median 3Q Max
-2.26136 -0.93297 0.04302 0.82501 2.85617

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.44757 0.43592 14.791 <2e-16 ***
TechSup 0.08768 0.07817 1.122 0.265
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.19 on 98 degrees of freedom

Multiple R-squared: 0.01268, Adjusted R-squared: 0.002603
F-statistic: 1.258 on 1 and 98 DF, p-value: 0.2647
#SLM for Complaint Resolution

plot(CompRes,Satisfaction,col="Red", abline(lm(Satisfaction~CompRes),col="Blue"))

> cor(Satisfaction, CompRes)

[1] 0.6032626
> Model4 = lm(Satisfaction~CompRes, data = myfactor_hair)
> summary(Model4)

Call:
lm(formula = Satisfaction ~ CompRes, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-2.40450 -0.66164 0.04499 0.63037 2.70949

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.68005 0.44285 8.310 5.51e-13 ***
CompRes 0.59499 0.07946 7.488 3.09e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.9554 on 98 degrees of freedom
Multiple R-squared: 0.3639, Adjusted R-squared: 0.3574
F-statistic: 56.07 on 1 and 98 DF, p-value: 3.085e-11
#SLM for Advertising

plot(Advertising,Satisfaction,col="Red", abline(lm(Satisfaction~Advertising),col="Blue
"))

> cor(Satisfaction, Advertising)

[1] 0.3046695

> Model5 = lm(Satisfaction~Advertising, data = myfactor_hair)

> summary(Model5)

Call:
lm(formula = Satisfaction ~ Advertising, data = myfactor_hair)
Residuals:
Min 1Q Median 3Q Max
-2.34033 -0.92755 0.05577 0.79773 2.53412

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.6259 0.4237 13.279 < 2e-16 ***
Advertising 0.3222 0.1018 3.167 0.00206 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.141 on 98 degrees of freedom

Multiple R-squared: 0.09282, Adjusted R-squared: 0.08357
F-statistic: 10.03 on 1 and 98 DF, p-value: 0.002056

#SLM For Product Line

plot(ProdLine,Satisfaction,col="Red", abline(lm(Satisfaction~ProdLine),col="Blue"))

> cor(Satisfaction, ProdLine)

[1] 0.5505459
> Model6 = lm(Satisfaction~ProdLine, data = myfactor_hair)
> summary(Model6)
Call:
lm(formula = Satisfaction ~ ProdLine, data = myfactor_hair)
Residuals:
Min 1Q Median 3Q Max
-2.3634 -0.7795 0.1097 0.7604 1.7373

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.02203 0.45471 8.845 3.87e-14 ***
ProdLine 0.49887 0.07641 6.529 2.95e-09 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1 on 98 degrees of freedom

Multiple R-squared: 0.3031, Adjusted R-squared: 0.296
F-statistic: 42.62 on 1 and 98 DF, p-value: 2.953e-09
#SLM for Sales Force Image
plot(SalesFImage,Satisfaction,col="Red", abline(lm(Satisfaction~SalesFImage),col="Blue
"))

> cor(Satisfaction, SalesFImage)

[1] 0.5002053
> Model7 = lm(Satisfaction~SalesFImage, data = myfactor_hair)
> summary(Model7)

Call:
lm(formula = Satisfaction ~ SalesFImage, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-2.2164 -0.5884 0.1838 0.6922 2.0728
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.06983 0.50874 8.000 2.54e-12 ***
SalesFImage 0.55596 0.09722 5.719 1.16e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.037 on 98 degrees of freedom

Multiple R-squared: 0.2502, Adjusted R-squared: 0.2426
F-statistic: 32.7 on 1 and 98 DF, p-value: 1.164e-07

#SLM for Competitive Pricing

plot(ComPricing,Satisfaction,col="Red", abline(lm(Satisfaction~ComPricing),col="Blue")
)

> cor(Satisfaction, ComPricing)

[1] -0.2082957
> Model8 = lm(Satisfaction~ComPricing, data = myfactor_hair)
> summary(Model8)

Call:
lm(formula = Satisfaction ~ ComPricing, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-1.9728 -0.9915 -0.1156 0.9111 2.5845
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.03856 0.54427 14.769 <2e-16 ***
ComPricing -0.16068 0.07621 -2.108 0.0376 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.172 on 98 degrees of freedom

Multiple R-squared: 0.04339, Adjusted R-squared: 0.03363
F-statistic: 4.445 on 1 and 98 DF, p-value: 0.03756

#SLM for Warranty Claim

plot(WartyClaim,Satisfaction,col="Red", abline(lm(Satisfaction~WartyClaim),col="Blue")
)

> cor(Satisfaction, WartyClaim)

[1] 0.1775448
> Model9 = lm(Satisfaction~WartyClaim, data = myfactor_hair)
> summary(Model9)
Call:
lm(formula = Satisfaction ~ WartyClaim, data = myfactor_hair)
Residuals:
Min 1Q Median 3Q Max
-2.36504 -0.90202 0.03019 0.90763 2.88985

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.3581 0.8813 6.079 2.32e-08 ***
WartyClaim 0.2581 0.1445 1.786 0.0772 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.179 on 98 degrees of freedom

Multiple R-squared: 0.03152, Adjusted R-squared: 0.02164
F-statistic: 3.19 on 1 and 98 DF, p-value: 0.0772
#SLM for Order & Billing

plot(OrdBilling,Satisfaction,col="Red", abline(lm(Satisfaction~OrdBilling),col="Blue")
)

> cor(Satisfaction, OrdBilling)

[1] 0.5217319
> Model10 = lm(Satisfaction~OrdBilling, data = myfactor_hair)
> summary(Model10)

Call:
lm(formula = Satisfaction ~ OrdBilling, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-2.4005 -0.7071 -0.0344 0.7340 2.9673
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.0541 0.4840 8.377 3.96e-13 ***
OrdBilling 0.6695 0.1106 6.054 2.60e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.022 on 98 degrees of freedom

Multiple R-squared: 0.2722, Adjusted R-squared: 0.2648
F-statistic: 36.65 on 1 and 98 DF, p-value: 2.602e-08

#SLM for Delivery Speed

plot(DelSpeed,Satisfaction,col="Red", abline(lm(Satisfaction~DelSpeed),col="Blue"))

> cor(Satisfaction, DelSpeed)

[1] 0.5770423
> Model11 = lm(Satisfaction~DelSpeed, data = myfactor_hair)
> summary(Model11)

Call:
lm(formula = Satisfaction ~ DelSpeed, data = myfactor_hair)

Residuals:
Min 1Q Median 3Q Max
-2.22475 -0.54846 0.08796 0.54462 2.59432
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.2791 0.5294 6.194 1.38e-08 ***
DelSpeed 0.9364 0.1339 6.994 3.30e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9783 on 98 degrees of freedom

Multiple R-squared: 0.333, Adjusted R-squared: 0.3262
F-statistic: 48.92 on 1 and 98 DF, p-value: 3.3e-10

4. Perform PCA/Factor Analysis by extracting 4 factors. Interpret the

output and name the factors.
#Initial Data Analysis
myfactorhair_pca = as.data.frame(myfactor_hair)
myfactorhair_pca$Satisfaction = NULL
Since we will not require dependent variable Satisfaction for the
PCA/ factor analysis, so we have nullified it.

summary(myfactorhair_pca)
ProdQual Ecom TechSup CompRes Advertising P
rodLine
Min. : 5.000 Min. :2.200 Min. :1.300 Min. :2.600 Min. :1.900 Min.
:2.300
1st Qu.: 6.575 1st Qu.:3.275 1st Qu.:4.250 1st Qu.:4.600 1st Qu.:3.175 1st
Qu.:4.700
Median : 8.000 Median :3.600 Median :5.400 Median :5.450 Median :4.000 Medi
an :5.750
Mean : 7.810 Mean :3.672 Mean :5.365 Mean :5.442 Mean :4.010 Mean
:5.805
3rd Qu.: 9.100 3rd Qu.:3.925 3rd Qu.:6.625 3rd Qu.:6.325 3rd Qu.:4.800 3rd
Qu.:6.800
Max. :10.000 Max. :5.700 Max. :8.500 Max. :7.800 Max. :6.500 Max.
:8.400
SalesFImage ComPricing WartyClaim OrdBilling DelSpeed
Min. :2.900 Min. :3.700 Min. :4.100 Min. :2.000 Min. :1.600
1st Qu.:4.500 1st Qu.:5.875 1st Qu.:5.400 1st Qu.:3.700 1st Qu.:3.400
Median :4.900 Median :7.100 Median :6.100 Median :4.400 Median :3.900
Mean :5.123 Mean :6.974 Mean :6.043 Mean :4.278 Mean :3.886
3rd Qu.:5.800 3rd Qu.:8.400 3rd Qu.:6.600 3rd Qu.:4.800 3rd Qu.:4.425
Max. :8.200 Max. :9.900 Max. :8.100 Max. :6.700 Max. :5.500

anyNA(myfactorhair_pca)
[1] FALSE

dim(myfactorhair_pca)
[1] 100 11

str(myfactorhair_pca)
'data.frame': 100 obs. of 11 variables:
$ ProdQual : num 8.5 8.2 9.2 6.4 9 6.5 6.9 6.2 5.8 6.4 ...
$ Ecom : num 3.9 2.7 3.4 3.3 3.4 2.8 3.7 3.3 3.6 4.5 ...
$ TechSup : num 2.5 5.1 5.6 7 5.2 3.1 5 3.9 5.1 5.1 ...
$ CompRes : num 5.9 7.2 5.6 3.7 4.6 4.1 2.6 4.8 6.7 6.1 ...
$ Advertising: num 4.8 3.4 5.4 4.7 2.2 4 2.1 4.6 3.7 4.7 ...
$ ProdLine : num 4.9 7.9 7.4 4.7 6 4.3 2.3 3.6 5.9 5.7 ...
$ SalesFImage: num 6 3.1 5.8 4.5 4.5 3.7 5.4 5.1 5.8 5.7 ...
$ ComPricing : num 6.8 5.3 4.5 8.8 6.8 8.5 8.9 6.9 9.3 8.4 ...
$ WartyClaim : num 4.7 5.5 6.2 7 6.1 5.1 4.8 5.4 5.9 5.4 ...
$ OrdBilling : num 5 3.9 5.4 4.3 4.5 3.6 2.1 4.3 4.4 4.1 ...
$ DelSpeed : num 3.7 4.9 4.5 3 3.5 3.3 2 3.7 4.6 4.4 ...

#Check for normality

shapiorfactorhair = sapply(myfactorhair_pca,shapiro.test)
> shapiorfactorhair
ProdQual Ecom TechSup
statistic 0.9497214 0.9585167 0.9862636
p.value 0.000795287 0.003156537 0.3900381
method "Shapiro-Wilk normality test" "Shapiro-Wilk normality test" "Shapiro-Wilk no
rmality test"
data.name "X[[i]]" "X[[i]]" "X[[i]]"
CompRes Advertising ProdLine
statistic 0.9864587 0.9762572 0.9869227
p.value 0.4022672 0.06769457 0.4324464
method "Shapiro-Wilk normality test" "Shapiro-Wilk normality test" "Shapiro-Wilk no
rmality test"
data.name "X[[i]]" "X[[i]]" "X[[i]]"
SalesFImage ComPricing WartyClaim
statistic 0.974033 0.9675841 0.9909419
p.value 0.04533896 0.01448414 0.7403655
method "Shapiro-Wilk normality test" "Shapiro-Wilk normality test" "Shapiro-Wilk no
rmality test"
data.name "X[[i]]" "X[[i]]" "X[[i]]"
OrdBilling DelSpeed
statistic 0.9740518 0.9816082
p.value 0.04549239 0.1770483
method "Shapiro-Wilk normality test" "Shapiro-Wilk normality test"
data.name "X[[i]]" "X[[i]]"

#Run barlett sphericity test for checking the possibility of data

dimension reduction
cortest.bartlett(myfactorcor, nrow(myfactorhair_pca))
$chisq
[1] 619.2726
$p.value
[1] 1.79337e-96
$df
[1] 55

P value is low can be process for dimension reduction

#kaiser meyer test
myfactor.kmo = KMO(cor(myfactorhair_pca))
> myfactor.kmo
Kaiser-Meyer-Olkin factor adequacy
Call: KMO(r = cor(myfactorhair_pca))
Overall MSA = 0.65
MSA for each item =
ProdQual Ecom TechSup CompRes Advertising ProdLine SalesFImage C
omPricing
0.51 0.63 0.52 0.79 0.78 0.62 0.62
0.75
WartyClaim OrdBilling DelSpeed
0.51 0.76 0.67
> A = eigen(myfactorcor)
> ev = A$values
> ev
[1] 3.42697133 2.55089671 1.69097648 1.08655606 0.60942409 0.55188378 0.40151815 0.24
695154
[9] 0.20355327 0.13284158 0.09842702
> plot(ev, type = "l",main = "Screeplot", xlab = "Principal Components", ylab = "Eigen
Values")

Factor Analysis using method PA

myfactor.solution1 = fa(r = cor(myfactorhair_pca), nfactors = 4, rotate = "none",
fm = "pa")

myfactor.solution1
Factor Analysis using method = pa
Call: fa(r = cor(myfactorhair_pca), nfactors = 4, rotate = "none",
fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
PA1 PA2 PA3 PA4 h2 u2 com
ProdQual 0.20 -0.41 -0.06 0.46 0.42 0.576 2.4
Ecom 0.29 0.66 0.27 0.22 0.64 0.362 2.0
TechSup 0.28 -0.38 0.74 -0.17 0.79 0.205 1.9
CompRes 0.86 0.01 -0.26 -0.18 0.84 0.157 1.3
Advertising 0.29 0.46 0.08 0.13 0.31 0.686 1.9
ProdLine 0.69 -0.45 -0.14 0.31 0.80 0.200 2.3
SalesFImage 0.39 0.80 0.35 0.25 0.98 0.021 2.1
ComPricing -0.23 0.55 -0.04 -0.29 0.44 0.557 1.9
WartyClaim 0.38 -0.32 0.74 -0.15 0.81 0.186 2.0
OrdBilling 0.75 0.02 -0.18 -0.18 0.62 0.378 1.2
DelSpeed 0.90 0.10 -0.30 -0.20 0.94 0.058 1.4
PA1 PA2 PA3 PA4
SS loadings 3.21 2.22 1.50 0.68
Proportion Var 0.29 0.20 0.14 0.06
Cumulative Var 0.29 0.49 0.63 0.69
Proportion Explained 0.42 0.29 0.20 0.09
Cumulative Proportion 0.42 0.71 0.91 1.00

Mean item complexity = 1.9

Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are 55 and the objective function was 6.5
5
The degrees of freedom for the model are 17 and the objective function was 0.33

The root mean square of the residuals (RMSR) is 0.02

The df corrected root mean square of the residuals is 0.03

Fit based upon off diagonal values = 1

Measures of factor score adequacy
PA1 PA2 PA3 PA4
Correlation of (regression) scores with factors 0.98 0.97 0.95 0.88
Multiple R square of scores with factors 0.96 0.95 0.91 0.78
Minimum correlation of possible factor scores 0.92 0.90 0.82 0.56
summary(myfactor.solution1)

Factor analysis with Call: fa(r = myfactorhair_pca, nfactors = 4, rotate = "none",

fm = "pa")

Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the model is 17 and the objective function was 0.33
The number of observations was 100 with Chi Square = 30.27 with prob < 0.024
The root mean square of the residuals (RMSA) is 0.02
The df corrected root mean square of the residuals is 0.03
Tucker Lewis Index of factoring reliability = 0.921
RMSEA index = 0.096 and the 10 % confidence intervals are 0.032 0.139
BIC = -48.01

fa.diagram(myfactor.solution1)

Explore loading if factor can be balanced

myfactor.solution2 = fa(r = myfactorhair_pca, nfactors = 4, rotate = "varimax", fm = "
pa")
myfactor.solution2
Factor Analysis using method = pa
Call: fa(r = myfactorhair_pca, nfactors = 4, rotate = "varimax",
fm = "pa")
Standardized loadings (pattern matrix) based upon correlation matrix
PA1 PA2 PA3 PA4 h2 u2 com
ProdQual 0.02 -0.07 0.02 0.65 0.42 0.576 1.0
Ecom 0.07 0.79 0.03 -0.11 0.64 0.362 1.1
TechSup 0.02 -0.03 0.88 0.12 0.79 0.205 1.0
CompRes 0.90 0.13 0.05 0.13 0.84 0.157 1.1
Advertising 0.17 0.53 -0.04 -0.06 0.31 0.686 1.2
ProdLine 0.53 -0.04 0.13 0.71 0.80 0.200 1.9
SalesFImage 0.12 0.97 0.06 -0.13 0.98 0.021 1.1
ComPricing -0.08 0.21 -0.21 -0.59 0.44 0.557 1.6
WartyClaim 0.10 0.06 0.89 0.13 0.81 0.186 1.1
OrdBilling 0.77 0.13 0.09 0.09 0.62 0.378 1.1
DelSpeed 0.95 0.19 0.00 0.09 0.94 0.058 1.1

PA1 PA2 PA3 PA4

SS loadings 2.63 1.97 1.64 1.37
Proportion Var 0.24 0.18 0.15 0.12
Cumulative Var 0.24 0.42 0.57 0.69
Proportion Explained 0.35 0.26 0.22 0.18
Cumulative Proportion 0.35 0.60 0.82 1.00
Mean item complexity = 1.2
Test of the hypothesis that 4 factors are sufficient.
The degrees of freedom for the null model are 55 and the objective function was 6.5
5 with Chi Square of 619.27
The degrees of freedom for the model are 17 and the objective function was 0.33

The root mean square of the residuals (RMSR) is 0.02

The df corrected root mean square of the residuals is 0.03

The harmonic number of observations is 100 with the empirical chi square 3.19 with
prob < 1
The total number of observations was 100 with Likelihood Chi Square = 30.27 with p
rob < 0.024
Tucker Lewis Index of factoring reliability = 0.921
RMSEA index = 0.096 and the 90 % confidence intervals are 0.032 0.139
BIC = -48.01
Fit based upon off diagonal values = 1
Measures of factor score adequacy
PA1 PA2 PA3 PA4
Correlation of (regression) scores with factors 0.98 0.99 0.94 0.88
Multiple R square of scores with factors 0.96 0.97 0.88 0.78
Minimum correlation of possible factor scores 0.93 0.94 0.77 0.55

summary(myfactor.solution2)
Factor analysis with Call: fa(r = myfactorhair_pca, nfactors = 4, rotate = "varimax",
fm = "pa")
Test of the hypothesis that 4 factors are sufficient.
The degrees of freedom for the model is 17 and the objective function was 0.33
The number of observations was 100 with Chi Square = 30.27 with prob < 0.024

The root mean square of the residuals (RMSA) is 0.02

The df corrected root mean square of the residuals is 0.03

Tucker Lewis Index of factoring reliability = 0.921

RMSEA index = 0.096 and the 10 % confidence intervals are 0.032 0.139
BIC = -48.01
#solution 2 is giving better balanced loading.
str(myfactor_hair)
'data.frame': 100 obs. of 12 variables:
$ ProdQual : num 8.5 8.2 9.2 6.4 9 6.5 6.9 6.2 5.8 6.4 ...
$ Ecom : num 3.9 2.7 3.4 3.3 3.4 2.8 3.7 3.3 3.6 4.5 ...
$ TechSup : num 2.5 5.1 5.6 7 5.2 3.1 5 3.9 5.1 5.1 ...
$ CompRes : num 5.9 7.2 5.6 3.7 4.6 4.1 2.6 4.8 6.7 6.1 ...
$ Advertising : num 4.8 3.4 5.4 4.7 2.2 4 2.1 4.6 3.7 4.7 ...
$ ProdLine : num 4.9 7.9 7.4 4.7 6 4.3 2.3 3.6 5.9 5.7 ...
$ SalesFImage : num 6 3.1 5.8 4.5 4.5 3.7 5.4 5.1 5.8 5.7 ...
$ ComPricing : num 6.8 5.3 4.5 8.8 6.8 8.5 8.9 6.9 9.3 8.4 ...
$ WartyClaim : num 4.7 5.5 6.2 7 6.1 5.1 4.8 5.4 5.9 5.4 ...
$ OrdBilling : num 5 3.9 5.4 4.3 4.5 3.6 2.1 4.3 4.4 4.1 ...
$ DelSpeed : num 3.7 4.9 4.5 3 3.5 3.3 2 3.7 4.6 4.4 ...
$ Satisfaction: num 8.2 5.7 8.9 4.8 7.1 4.7 5.7 6.3 7 5.5 ...
dim(myfactor_hair)
[1] 100 12
print(myfactor.solution2$scores, digits = 2)

myfactordata = cbind(myfactor_hair[,12], myfactor.solution2$scores)

summary(myfactordata)
V1 PA1 PA2 PA3 PA4
Min. :4.700 Min. :-2.55956 Min. :-2.0373 Min. :-2.20200 Min. :-1.42
620
1st Qu.:6.000 1st Qu.:-0.61566 1st Qu.:-0.4663 1st Qu.:-0.73427 1st Qu.:-0.83
402
Median :7.050 Median : 0.07914 Median :-0.2038 Median : 0.09067 Median : 0.03
373
Mean :6.918 Mean : 0.00000 Mean : 0.0000 Mean : 0.00000 Mean : 0.00
000
3rd Qu.:7.625 3rd Qu.: 0.74181 3rd Qu.: 0.5719 3rd Qu.: 0.56502 3rd Qu.: 0.70
675
Max. :9.900 Max. : 1.99193 Max. : 2.8326 Max. : 2.08285 Max. : 2.15
737

head(myfactordata)
PA1 PA2 PA3 PA4
[1,] 8.2 -0.1338871 0.9175166 -1.719604873 0.09135411
[2,] 5.7 1.6297604 -2.0090053 -0.596361722 0.65808192
[3,] 8.9 0.3637658 0.8361736 0.002979966 1.37548765
[4,] 4.8 -1.2225230 -0.5491336 1.245473305 -0.64421384
[5,] 7.1 -0.4854209 -0.4276223 -0.026980304 0.47360747
[6,] 4.7 -0.5950924 -1.3035333 -1.183019401 -0.9591357

colnames(myfactordata) = c("Satisfaction", "Sales Support", "Sales & Marketing", "Post

Sales Service", "Quality Check")

head(myfactordata)
Satisfaction Sales Support Sales & Marketing Post Sales Service Quality Check
[1,] 8.2 -0.1338871 0.9175166 -1.719604873 0.09135411
[2,] 5.7 1.6297604 -2.0090053 -0.596361722 0.65808192
[3,] 8.9 0.3637658 0.8361736 0.002979966 1.37548765
[4,] 4.8 -1.2225230 -0.5491336 1.245473305 -0.64421384
[5,] 7.1 -0.4854209 -0.4276223 -0.026980304 0.47360747
[6,] 4.7 -0.5950924 -1.3035333 -1.183019401 -0.95913571

5. Perform multiple linear regression with customer satisfaction as

dependent variable and the four factor as independent variables.
Comment on the model output and validity. Your remark should
make it meaningful for everybody.
write.csv(myfactordata, "myfactordata.csv")

mymldata = read.csv("myfactordata.csv")
attach(mymldata)
cor(myfactordata)
Satisfaction SalesSupport Sales&Marketing PostSalesService QualityCheck
Satisfaction 1.00000000 0.49682808 0.49823377 0.07472834 0.44430972
Sales Support 0.49682808 1.00000000 0.01268190 -0.00169571 0.02848518
Sales & Marketing 0.49823377 0.01268190 1.00000000 0.01404754 -0.04695702
Post Sales Service0.07472834-0.00169571 0.01404754 1.00000000 0.05184746
Quality Check 0.44430972 0.02848518 -0.04695702 0.05184746 1.00000000
mlmodel = lm(Satisfaction~Sales.Support+Sales...Marketing+Post.Sales.Service+Quality.C
heck, data = mymldata)

summary(mlmodel)

Call:
lm(formula = Satisfaction ~ Sales.Support + Sales...Marketing +
Post.Sales.Service + Quality.Check, data = mymldata)
Residuals:
Min 1Q Median 3Q Max
-1.7125 -0.4708 0.1024 0.4158 1.3483

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.91800 0.06696 103.317 < 2e-16 ***
Sales.Support 0.57963 0.06857 8.453 3.32e-13 ***
Sales...Marketing 0.61978 0.06834 9.070 1.61e-14 ***
Post.Sales.Service 0.05692 0.07173 0.794 0.429
Quality.Check 0.61168 0.07656 7.990 3.16e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6696 on 95 degrees of freedom

Multiple R-squared: 0.6971, Adjusted R-squared: 0.6844
F-statistic: 54.66 on 4 and 95 DF, p-value: < 2.2e-16
vif(mlmodel)
Sales.Support Sales...Marketing Post.Sales.Service Quality.Check
1.001021 1.002683 1.002981 1.005848
For any given level of Sales Support, Sales & Marketing & Post Sales
Service, improving one point of quality check satisfaction will be
increased by 0.61. Correspondingly improving sales support, Post Sales
Service and quality check improving sales & marketing by one point
will increase the satisfaction by 0.61. Likewise Sales Support, Sales &
marketing & quality check, improving one point of Post Sales Service,
will increase satisfaction level by 0.05 which is very low. Also
improving Sales Support will increase satisfaction level by 0.57.
The adjusted R-squared is 0.68, which is considerably high. Also the
multiple R-squared is 0.69. Whereas F stat is 54.66 and P value is
highly significant
.

Decision Making: Submitted By-Ankita Mishra
No ratings yet
Decision Making: Submitted By-Ankita Mishra
20 pages
Research Methodology Module
No ratings yet
Research Methodology Module
121 pages
Kendall's Tau
No ratings yet
Kendall's Tau
49 pages
Introduccion A La HPLC Quattrocchi PDF
No ratings yet
Introduccion A La HPLC Quattrocchi PDF
406 pages
Hair Salon
33% (3)
Hair Salon
11 pages
Thera Bank - Project
100% (4)
Thera Bank - Project
34 pages
Telecom Churn Report
No ratings yet
Telecom Churn Report
66 pages
Machine Learning-1 BUSINESS REPORT
No ratings yet
Machine Learning-1 BUSINESS REPORT
122 pages
3-Preparing The Data-10-01-2024
No ratings yet
3-Preparing The Data-10-01-2024
127 pages
Advanced Statistics - Project Report
100% (5)
Advanced Statistics - Project Report
14 pages
B) " Data - Frame".: Outliers and Missing Values (8 Marks) Answer
100% (1)
B) " Data - Frame".: Outliers and Missing Values (8 Marks) Answer
23 pages
MA Pro Aug 2024
No ratings yet
MA Pro Aug 2024
64 pages
Project 5 PDF
100% (1)
Project 5 PDF
48 pages
Probability M
No ratings yet
Probability M
80 pages
Upsell Model Case PDF
No ratings yet
Upsell Model Case PDF
48 pages
Sampling Distribution-1
No ratings yet
Sampling Distribution-1
17 pages
Data Science Slides
No ratings yet
Data Science Slides
57 pages
Unit 4
No ratings yet
Unit 4
42 pages
Beer Sales With Analysis
No ratings yet
Beer Sales With Analysis
84 pages
Paper 2
No ratings yet
Paper 2
36 pages
PSUnit I Lesson 2 Constructing Probability Distributions
No ratings yet
PSUnit I Lesson 2 Constructing Probability Distributions
13 pages
Market Segmentation Practical
No ratings yet
Market Segmentation Practical
20 pages
Data Mining Project - Abinaya John
No ratings yet
Data Mining Project - Abinaya John
42 pages
Key Ingredients of PM
No ratings yet
Key Ingredients of PM
16 pages
PCA Business Report - Part 1
No ratings yet
PCA Business Report - Part 1
31 pages
How To Develop Quantitative Analysis Model
No ratings yet
How To Develop Quantitative Analysis Model
36 pages
How To Develop Quantitative Analysis Model
No ratings yet
How To Develop Quantitative Analysis Model
36 pages
Husserl - Philosophy As A Rigorous Science (New Translation)
84% (19)
Husserl - Philosophy As A Rigorous Science (New Translation)
47 pages
DSBA - Exploratory Data Analysis v2
No ratings yet
DSBA - Exploratory Data Analysis v2
22 pages
Master of Science in Data Science and Analytics (MSC Dsa)
No ratings yet
Master of Science in Data Science and Analytics (MSC Dsa)
8 pages
Business Analytics
No ratings yet
Business Analytics
16 pages
Assignment 2 - Factor Hair
No ratings yet
Assignment 2 - Factor Hair
39 pages
Pratik Zanke Factor Hair Revised
No ratings yet
Pratik Zanke Factor Hair Revised
37 pages
Factor-Hair-Revised: Salma Mohiuddin 27/08/2019 Setting Up The Working Directoryy
No ratings yet
Factor-Hair-Revised: Salma Mohiuddin 27/08/2019 Setting Up The Working Directoryy
37 pages
Project 2 Factor Hair Revised Case Study
No ratings yet
Project 2 Factor Hair Revised Case Study
25 pages
DATA MINING Project Report
No ratings yet
DATA MINING Project Report
28 pages
Отчет ВЭК перевод
No ratings yet
Отчет ВЭК перевод
67 pages
Data Mining Assignment-Clustering Data-Ads 24x7 Summary
No ratings yet
Data Mining Assignment-Clustering Data-Ads 24x7 Summary
12 pages
Standardizing Six Sigma Green BELT Training
No ratings yet
Standardizing Six Sigma Green BELT Training
19 pages
Introduction To Statistics-Part II
No ratings yet
Introduction To Statistics-Part II
35 pages
DSBA - Exploratory Data Analysis v2
No ratings yet
DSBA - Exploratory Data Analysis v2
22 pages
R Notebook
No ratings yet
R Notebook
8 pages
Monika Sree 08-06-2024
No ratings yet
Monika Sree 08-06-2024
36 pages
Lecture - 04 - Data Understanding and Preparation
No ratings yet
Lecture - 04 - Data Understanding and Preparation
59 pages
The Interpersonal Theory of Suicide
No ratings yet
The Interpersonal Theory of Suicide
49 pages
Business Analytics
No ratings yet
Business Analytics
14 pages
7 Prompts For Quantitative Data Analysis
No ratings yet
7 Prompts For Quantitative Data Analysis
9 pages
MiniProject3-AdvancedStat SaurabhMudgal
No ratings yet
MiniProject3-AdvancedStat SaurabhMudgal
22 pages
Laser-Induced Breakdown Spectroscopy and Hyperspectral Imaging Data Fusion For Improved Mineralogical Analysis of Copper Concentrates
No ratings yet
Laser-Induced Breakdown Spectroscopy and Hyperspectral Imaging Data Fusion For Improved Mineralogical Analysis of Copper Concentrates
14 pages
Iii Las 11 1
No ratings yet
Iii Las 11 1
11 pages
R Copy 4
No ratings yet
R Copy 4
14 pages
1 SM PDF
No ratings yet
1 SM PDF
18 pages
Pca Anova PDF
No ratings yet
Pca Anova PDF
21 pages
Sample Exam Answers CMA
No ratings yet
Sample Exam Answers CMA
9 pages
Assignment 2 PDF
No ratings yet
Assignment 2 PDF
25 pages
Practical 16
No ratings yet
Practical 16
9 pages
BRM Report
No ratings yet
BRM Report
16 pages
Advance Stats Assignment
No ratings yet
Advance Stats Assignment
18 pages
Ash Hair Salon DM-word
No ratings yet
Ash Hair Salon DM-word
6 pages
Coding Project Hair
No ratings yet
Coding Project Hair
8 pages
Pratik Zanke Source Codes
No ratings yet
Pratik Zanke Source Codes
20 pages
Advanced Statistics-Project
No ratings yet
Advanced Statistics-Project
16 pages
Howxtre
No ratings yet
Howxtre
8 pages
Factor Hair Revised Project Report PDF
No ratings yet
Factor Hair Revised Project Report PDF
23 pages
Unit 3-BA
No ratings yet
Unit 3-BA
31 pages
Data Mining
No ratings yet
Data Mining
24 pages
S2R XRF 82 Main Limestone Components
100% (1)
S2R XRF 82 Main Limestone Components
4 pages
Quantitative and Qualitative Data Analysis Techniques
No ratings yet
Quantitative and Qualitative Data Analysis Techniques
3 pages
Workflow Slides JSLong 110410 PDF
No ratings yet
Workflow Slides JSLong 110410 PDF
14 pages
Time Symmetry and Asymmetry in Quantum Mechanics and Quantum Cosmology
No ratings yet
Time Symmetry and Asymmetry in Quantum Mechanics and Quantum Cosmology
32 pages
Bakeev Web Article - Proof
No ratings yet
Bakeev Web Article - Proof
5 pages
Department of Business Administration: Final Exam (Take Home) Summer - 2020
No ratings yet
Department of Business Administration: Final Exam (Take Home) Summer - 2020
9 pages
DAV Practical 2
No ratings yet
DAV Practical 2
6 pages
Anova
No ratings yet
Anova
3 pages
DSE 2151 24 Sep 2022
No ratings yet
DSE 2151 24 Sep 2022
5 pages
Data Mining - Business Report: Clustering Clean - Ads
100% (4)
Data Mining - Business Report: Clustering Clean - Ads
24 pages
Principlesofworkflowin Dataanalysis: Scottlong
No ratings yet
Principlesofworkflowin Dataanalysis: Scottlong
14 pages
Project 2: Submitted By: Sumit Sinha Program & Group: Pgpbabionline May19 - A
No ratings yet
Project 2: Submitted By: Sumit Sinha Program & Group: Pgpbabionline May19 - A
17 pages
Standard For Australian Survey Control Network - v2.1
No ratings yet
Standard For Australian Survey Control Network - v2.1
13 pages
1 Introduction To Research
No ratings yet
1 Introduction To Research
31 pages
Mini Project - Factor Hair Analysis: Sravanthi.M
100% (2)
Mini Project - Factor Hair Analysis: Sravanthi.M
24 pages
KJWDH
No ratings yet
KJWDH
4 pages
DISS - Mod1 - Introduction To Social Sciences With Natural Sciences and Humanities
No ratings yet
DISS - Mod1 - Introduction To Social Sciences With Natural Sciences and Humanities
28 pages
Market Segmentation - Product Service Management
No ratings yet
Market Segmentation - Product Service Management
16 pages
Economics Assignment - Group B
No ratings yet
Economics Assignment - Group B
2 pages
Project Report - Advanced - Stats - Final PDF
No ratings yet
Project Report - Advanced - Stats - Final PDF
25 pages
SOP Sample 2
No ratings yet
SOP Sample 2
2 pages
Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst
From Everand
Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst
Dean Abbott
No ratings yet
Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection
From Everand
Fraud Analytics Using Descriptive, Predictive, and Social Network Techniques: A Guide to Data Science for Fraud Detection
Bart Baesens
No ratings yet

Factor Hair Revised PDF

Uploaded by

Factor Hair Revised PDF

Uploaded by

8/30/2019

Project Report: Advance

Submitted By: Sampada Sumant

A. Factor Hair Revised Case Study Background.

1. Perform exploratory data analysis on the dataset. Showcase some

chart, graphs. Check for Outliers and missing value.

2. Is there evidence of multicollinearity? Showcase your analysis.

3. Perform simple linear regression for the dependent variable with

every independent variable.

4. Perform PCA/Factor Analysis by extracting 4 factors. Interpret the

output and name the factors.

5. Perform multiple linear regression with customer satisfaction as

dependent variable and the four factor as independent variables.

Comment on the model output and validity. Your remark should

make it meaningful for everybody.

factor_hair = read.csv("Factor-Hair-Revised.csv", header = TRUE)

From box plot it can be viewed that outliers are present in E-

hist(factor_hair$ProdQual, breaks = (0:10), labels = TRUE, main = "Histogram of Produc

> hist(factor_hair$Ecom, breaks = (0:10), labels = TRUE, main = "Histogram of Ecommerc

> hist(factor_hair$TechSup, breaks = (0:10), labels = TRUE, main = "Histogram of Techn

> hist(factor_hair$CompRes, breaks = (0:10), labels = TRUE, main = "Histogram of Compl

> hist(factor_hair$Advertising, breaks = (0:10), labels = TRUE, main = "Histogram of A

> hist(factor_hair$ProdLine, breaks = (0:10), labels = TRUE, main = "Histogram of Prod

> hist(factor_hair$SalesFImage, breaks = (0:10), labels = TRUE, main = "Histogram of S

> hist(factor_hair$ComPricing, breaks = (0:10), labels = TRUE, main = "Histogram of Co

> hist(factor_hair$WartyClaim, breaks = (0:10), labels = TRUE, main = "Histogram of Wa

> hist(factor_hair$OrdBilling, breaks = (0:10), labels = TRUE, main = "Histogram of Or

> hist(factor_hair$DelSpeed, breaks = (0:10), labels = TRUE, main = "Histogram of Deli

Residual standard error: 0.5623 on 88 degrees of freedom

> cor(Satisfaction, ProdQual)

Residual standard error: 1.047 on 98 degrees of freedom

> cor(Satisfaction, TechSup)

> Model3 = lm(Satisfaction~TechSup, data = myfactor_hair)

Residual standard error: 1.19 on 98 degrees of freedom

> cor(Satisfaction, CompRes)

> cor(Satisfaction, Advertising)

> Model5 = lm(Satisfaction~Advertising, data = myfactor_hair)

Residual standard error: 1.141 on 98 degrees of freedom

#SLM For Product Line

> cor(Satisfaction, ProdLine)

Residual standard error: 1 on 98 degrees of freedom

> cor(Satisfaction, SalesFImage)

Residual standard error: 1.037 on 98 degrees of freedom

#SLM for Competitive Pricing

> cor(Satisfaction, ComPricing)

Residual standard error: 1.172 on 98 degrees of freedom

#SLM for Warranty Claim

> cor(Satisfaction, WartyClaim)

Residual standard error: 1.179 on 98 degrees of freedom

> cor(Satisfaction, OrdBilling)

Residual standard error: 1.022 on 98 degrees of freedom

#SLM for Delivery Speed

> cor(Satisfaction, DelSpeed)

Residual standard error: 0.9783 on 98 degrees of freedom

4. Perform PCA/Factor Analysis by extracting 4 factors. Interpret the

#Check for normality

#Run barlett sphericity test for checking the possibility of data

P value is low can be process for dimension reduction

Factor Analysis using method PA

Mean item complexity = 1.9

The root mean square of the residuals (RMSR) is 0.02

Fit based upon off diagonal values = 1

Factor analysis with Call: fa(r = myfactorhair_pca, nfactors = 4, rotate = "none",

Test of the hypothesis that 4 factors are sufficient.

Explore loading if factor can be balanced

PA1 PA2 PA3 PA4

The root mean square of the residuals (RMSR) is 0.02

The root mean square of the residuals (RMSA) is 0.02

Tucker Lewis Index of factoring reliability = 0.921

myfactordata = cbind(myfactor_hair[,12], myfactor.solution2$scores)

colnames(myfactordata) = c("Satisfaction", "Sales Support", "Sales & Marketing", "Post

5. Perform multiple linear regression with customer satisfaction as

Residual standard error: 0.6696 on 95 degrees of freedom

You might also like