SOLUTIONS MANUAL
FOR
Korosteleva, O. (2018). Advanced Regression Models with SAS and R, CRC Press
By
OLGA KOROSTELEVA
Department of Mathematics and Statistics
California State University, Long Beach

TABLE OF CONTENTS
CHAPTER 1
CHAPTER 2
CHAPTER 3
CHAPTER 4
CHAPTER 5
CHAPTER 6
CHAPTER 7
CHAPTER 8
CHAPTER 9
CHAPTER 10

CHAPTER 1

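EXERCISE 1.1. Show that the normal distribution belongs to the exponential family of distributions.

$$f(y, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(y-\mu)^2}{2\sigma^2}\right\} = \exp\left\{-\frac{1}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\left(y^2 - 2y\mu + \mu^2\right)\right\}.$$

Let $\theta = \mu$ and $\phi = \sigma^2$. Then, we can write

$$f(y, \theta, \phi) = \exp\left\{-\frac{1}{2}\ln(2\pi\phi) - \frac{1}{2\phi}\left(y^2 - 2y\theta + \theta^2\right)\right\} = \exp\left\{\frac{y\theta - \theta^2/2}{\phi} - \frac{1}{2}\ln(2\pi\phi) - \frac{y^2}{2\phi}\right\} = \exp\left\{\frac{y\theta - c(\theta)}{\phi} + h(y, \phi)\right\},$$

where $c(\theta) = \frac{\theta^2}{2}$, and

$$h(y, \phi) = -\frac{1}{2}\ln(2\pi\phi) - \frac{y^2}{2\phi}.$$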
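The derivation above can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names c.fun, h.fun, and expfam.density, as well as the evaluation points and parameter values, are chosen here purely for illustration. It evaluates the exponential-family form and compares it with R's built-in dnorm().

```r
# Numerical check of the exponential-family form of the normal density.
# theta = mu (location parameter), phi = sigma^2 (dispersion parameter).
# All names and test values below are illustrative assumptions.
c.fun <- function(theta) theta^2 / 2
h.fun <- function(y, phi) -0.5 * log(2 * pi * phi) - y^2 / (2 * phi)
expfam.density <- function(y, theta, phi) {
  exp((y * theta - c.fun(theta)) / phi + h.fun(y, phi))
}

y <- c(-1.3, 0, 0.7, 2.5)   # arbitrary evaluation points
mu <- 0.4                   # arbitrary mean
sigma2 <- 1.8               # arbitrary variance

# Largest absolute difference from the built-in normal density
max.abs.diff <- max(abs(expfam.density(y, mu, sigma2) -
                        dnorm(y, mean = mu, sd = sqrt(sigma2))))
print(max.abs.diff)
```

The printed difference should be at the level of floating-point rounding error, which confirms that the two expressions for the density agree.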

EXERCISE 1.2. (a) Verify normality of the response variable, then fit the linear regression model
to the data. State the fitted model. Give estimates for all parameters.
In SAS:

data weightloss;
input drug$ age gender$ EWL @@;
cards;
A 49 F 14.2 A 54 M 25.4 A 37 F 14.1 A 43 F 20.0 A 57 M 11.7 A 48 M 16.6
A 34 F 15.9 A 51 F 17.4 A 54 F 22.8 A 45 F 16.7 A 36 M 12.7 A 57 M 15.0
A 44 M 8.4 A 56 M 11.2 A 44 M 17.3 A 47 M 20.5 A 44 F 6.7 B 52 F 29.4
B 51 M 21.9 B 44 F 23.6 B 53 F 23.8 B 55 M 7.4 B 30 F 23.1 B 47 M 16.8
B 26 M 14.1 B 56 F 24.6 B 28 F 17.8 B 34 M 27.8 B 43 M 10.6 B 55 M 26.8
B 52 F 15.7 B 54 F 23.7
;

/*running normality check*/

proc univariate;
var EWL;
histogram/normal;
run;

Goodness-of-Fit Tests for Normal Distribution
Test Statistic p Value
Kolmogorov-Smirnov D 0.10216310 Pr > D >0.150
Cramer-von Mises W-Sq 0.05103595 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.28788730 Pr > A-Sq >0.250

Based on the large p-values of the normality tests and the histogram, we do not reject normality and
conclude that the response variable can be treated as normally distributed.

/*fitting general linear model*/

proc genmod;
class drug(ref="A") gender;
model EWL = drug age gender / dist=normal link=identity;
run;

Log Likelihood -98.4395

Analysis Of Maximum Likelihood Parameter Estimates

Parameter            DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept             1    9.2146          5.3301         -1.2322         19.6614             2.99      0.0838
drug B                1    4.8103          1.8697          1.1456          8.4749             6.62      0.0101
drug A                0    0.0000          0.0000          0.0000          0.0000                .           .
age                   1    0.1102          0.1067         -0.0988          0.3192             1.07      0.3015
gender F              1    2.7235          1.8664         -0.9346          6.3815             2.13      0.1445
gender M              0    0.0000          0.0000          0.0000          0.0000                .           .
Scale                 1    5.2451          0.6556          4.1054          6.7012

The fitted model is 𝐸 (𝐸𝑊𝐿) = 9.2146 + 4.8103 ∙ 𝑑𝑟𝑢𝑔𝐵 + 0.1102 ∙ 𝑎𝑔𝑒 + 2.7235 ∙ 𝑓𝑒𝑚𝑎𝑙𝑒 ,
and 𝜎 = 5.2451.

In R:
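#stringsAsFactors=TRUE is needed in R 4.0 and later so that relevel() receives factors
weightloss.data<- read.csv(file="C:/./Exercise1.2Data.csv", header = TRUE, sep =
",", stringsAsFactors = TRUE)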

#running normality check

library(rcompanion)
plotNormalHistogram(weightloss.data$EWL)

shapiro.test(weightloss.data$EWL)

Shapiro-Wilk normality test

W = 0.97424, p-value = 0.6234

#specifying reference levels

drug.rel<- relevel(weightloss.data$drug, ref="A")
gender.rel<- relevel(weightloss.data$gender, ref="M")

#fitting general linear model

summary(fitted.model<- glm(EWL ~ drug.rel + age + gender.rel, data =
weightloss.data, family=gaussian(link=identity)))

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.2146 5.6981 1.617 0.1171
drug.relB 4.8103 1.9988 2.407 0.0229
age 0.1102 0.1140 0.967 0.3420
gender.relF 2.7235 1.9952 1.365 0.1831

#outputting estimated sigma

sigma(fitted.model)

5.607257
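The value returned by sigma() differs from the Scale estimate 5.2451 in the SAS output because the two programs use different divisors: the SAS Scale is the maximum likelihood estimate (residual sum of squares divided by n), while sigma() divides by the residual degrees of freedom n - p. The sketch below, with n and p counted from the data step and the coefficient table above, converts one estimate into the other. The variable names are illustrative only.

```r
# Values copied from the output above
sigma.unbiased <- 5.607257   # sigma(fitted.model) in R
n <- 32                      # number of observations in the data set
p <- 4                       # estimated regression coefficients, including the intercept

# The maximum likelihood estimate divides the residual sum of squares by n
# instead of n - p, so the two estimates differ by the factor sqrt((n - p) / n)
sigma.mle <- sigma.unbiased * sqrt((n - p) / n)
print(round(sigma.mle, 4))
```

The result agrees with the SAS Scale estimate to four decimal places, so the two fits are identical and only the reported estimator of 𝜎 differs.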

(b) Which regression coefficients turn out to be significant at the 5% level? Discuss goodness of fit of the
model.

Drug B is the only significant predictor in the model at the 5% significance level since the
corresponding p-value is the only one under 0.05.

In SAS:

/*checking model fit*/

proc genmod;
model EWL = / dist=normal link=identity;
run;

Log Likelihood -102.6326

data deviance_test;
deviance = -2*(-102.6326 - (-98.4395));
pvalue = 1 - probchi(deviance,3);
run;

proc print noobs;

run;

deviance pvalue
8.3862 0.038669

The p-value for the deviance test is less than 0.05, so the fitted model fits significantly better than
the intercept-only model, which indicates a good fit of the model. The R code and output are:
#checking model fit
null.model<- glm(EWL ~ 1, data=weightloss.data, family=gaussian(link=identity))
print(deviance<- -2*(logLik(null.model)-logLik(fitted.model)))
8.386158

print(p.value<- pchisq(deviance, df=3, lower.tail=FALSE))

0.03867005
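The deviance test can also be reproduced directly from the two log-likelihoods reported by SAS, without refitting either model. The helper below is a sketch under that assumption; the name deviance.test is hypothetical and not part of the original solution.

```r
# Deviance (likelihood-ratio) test of a fitted model against the null model.
# deviance.test is a hypothetical helper name introduced for illustration.
deviance.test <- function(loglik.null, loglik.fitted, df) {
  deviance <- -2 * (loglik.null - loglik.fitted)
  p.value <- pchisq(deviance, df = df, lower.tail = FALSE)
  c(deviance = deviance, p.value = p.value)
}

# Exercise 1.2: log-likelihoods copied from the SAS output;
# df = 3 is the number of estimated slopes (drug B, age, gender F)
result <- deviance.test(loglik.null = -102.6326, loglik.fitted = -98.4395, df = 3)
print(round(result, 4))
```

The degrees of freedom equal the number of regression coefficients that the fitted model estimates beyond the intercept.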

(c) Is one of the drugs more efficient for weight loss than the other? Interpret all estimated significant
coefficients.
The estimated average EWL for subjects taking drug B is 4.8103 percentage points higher than that for
subjects taking drug A, keeping all the other predictors fixed. This means that drug B is more efficient
than drug A.

(d) According to the model, what is the predicted percent decrease in excess body weight for a 35-
year old male who is taking drug A?

The predicted percent decrease in excess body weight for a 35-year old male who is taking drug A is
computed by hand as: 𝐸𝑊𝐿 = 9.2146 + 0.1102 ∙ 35 = 13.0716.
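The hand calculation can be written as a sum of coefficients times predictor values, with the indicator variables for the reference levels (drug A, male) set to zero. This is a minimal sketch with illustrative names; because it uses the coefficients rounded to four decimals, it can differ from the software prediction in the fourth decimal place.

```r
# Rounded coefficients from the fitted model
coefs <- c(intercept = 9.2146, drugB = 4.8103, age = 0.1102, female = 2.7235)

# Predictor values for a 35-year-old male on drug A:
# drug A and male are reference levels, so their indicators are 0
x <- c(intercept = 1, drugB = 0, age = 35, female = 0)

predicted.EWL <- sum(coefs * x)
print(predicted.EWL)
```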

In SAS:

/*using fitted model for prediction*/

data predict;
input drug$ age gender$;
cards;
A 35 M
;

data weightloss;
set weightloss predict;
run;

proc genmod;
class drug gender;
model EWL = drug age gender / dist=normal link=identity;
output out=outdata p=pEWL;
run;

proc print data=outdata (firstobs=33) noobs;

var pEWL;
run;

pEWL
13.0718

In R:
#using fitted model for prediction
print(predict(fitted.model, data.frame(drug.rel="A", age=35, gender.rel="M")))

13.07178

EXERCISE 1.3. (a) Reduce the car price by a factor of 1000. Check that the distribution of the
price is normal. Fit a general linear regression model to predict the price of a car. Write down the
fitted model, specifying all estimated parameters.

In SAS:
data carsales;
input bodystyle$ 1-9 country$ hwy doors leather$ price @@;
priceK=price/1000;
cards;
coupe USA 26 4 no 17445 coupe USA 40 4 no 23500
coupe USA 35 2 no 19600 coupe Germany 37 4 no 23400
coupe Germany 25 4 no 24100 coupe Germany 24 2 no 12400
coupe Japan 26 2 no 13300 coupe Japan 27 4 no 15550
coupe Japan 20 4 yes 29345 hatchback USA 30 2 no 12540
hatchback USA 39 4 no 17595 hatchback USA 38 2 no 17300
hatchback Germany 38 4 no 17800 hatchback Germany 32 4 no 22500
hatchback Germany 34 4 no 20300 hatchback Japan 38 4 yes 27300
hatchback Japan 38 2 yes 23300 hatchback Japan 38 2 yes 29300
sedan USA 29 4 no 32000 sedan USA 25 2 yes 34200
sedan USA 33 4 yes 33395 sedan Germany 40 4 no 22850
sedan Germany 23 2 yes 36000 sedan Germany 25 4 no 19900
sedan Japan 40 4 yes 36700 sedan Japan 35 4 yes 31600
sedan Japan 37 4 no 24600
run;

/*running normality check*/

proc univariate;
var priceK;
histogram/normal;
run;

Goodness-of-Fit Tests for Normal Distribution

Test Statistic p Value
Kolmogorov-Smirnov D 0.11287889 Pr > D >0.150
Cramer-von Mises W-Sq 0.05867848 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.37263698 Pr > A-Sq >0.250

The p-values for the normality tests all exceed 0.05, so normality is not rejected. The
histogram also displays a distribution close to bell-shaped.
/*fitting general linear model*/
proc genmod;
class bodystyle(ref="hatchback") country(ref="Japan") leather(ref="no");
model priceK=bodystyle country hwy doors leather/dist=normal link=identity;
run;

Log Likelihood -67.2613

Analysis Of Maximum Likelihood Parameter Estimates

Parameter            DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept             1    5.1353          4.6900         -4.0570         14.3276             1.20      0.2735
bodystyle coupe       1    2.2698          1.6836         -1.0301          5.5696             1.82      0.1776
bodystyle sedan       1    6.4107          1.5477          3.3772          9.4441            17.16      <.0001
bodystyle hatchback   0    0.0000          0.0000          0.0000          0.0000                .           .
country Germany       1    3.1959          1.6859         -0.1085          6.5002             3.59      0.0580
country USA           1    3.2128          1.5780          0.1199          6.3058             4.15      0.0418
country Japan         0    0.0000          0.0000          0.0000          0.0000                .           .
hwy                   1    0.1305          0.1117         -0.0884          0.3494             1.36      0.2427
doors                 1    1.5554          0.6630          0.2560          2.8549             5.50      0.0190
leather yes           1   12.1757          1.6217          8.9972         15.3541            56.37      <.0001
leather no            0    0.0000          0.0000          0.0000          0.0000                .           .
Scale                 1    2.9219          0.3976          2.2378          3.8150

The fitted model is 𝐸 (𝑝𝑟𝑖𝑐𝑒𝐾) = 5.1353 + 2.2698 ∙ 𝑐𝑜𝑢𝑝𝑒 + 6.4107 ∙ 𝑠𝑒𝑑𝑎𝑛 + 3.1959 ∙
𝐺𝑒𝑟𝑚𝑎𝑛𝑦 + 3.2128 ∙ 𝑈𝑆𝐴 + 0.1305 ∙ ℎ𝑤𝑦 + 1.5554 ∙ 𝑑𝑜𝑜𝑟𝑠 + 12.1757 ∙ 𝑙𝑒𝑎𝑡ℎ𝑒𝑟, and
𝜎 = 2.9219.

In R:
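#stringsAsFactors=TRUE is needed in R 4.0 and later so that relevel() receives factors
carsales.data<- read.csv(file="C:/./Exercise1.3Data.csv",header=TRUE, sep=",",
stringsAsFactors=TRUE)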

#rescaling price
priceK<- carsales.data$price/1000

#running normality check

library(rcompanion)
plotNormalHistogram(priceK)

shapiro.test(priceK)
Shapiro-Wilk normality test

W = 0.95482, p-value = 0.28

#specifying reference levels
bodystyle.rel<- relevel(carsales.data$bodystyle, ref="hatchback")
country.rel<- relevel(carsales.data$country, ref="Japan")
leather.rel<- relevel(carsales.data$leather, ref="no")

#fitting general linear model

summary(fitted.model<- glm(priceK ~ bodystyle.rel + country.rel + hwy + doors +
leather.rel, data=carsales.data, family=gaussian(link=identity)))

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.1353 5.5909 0.919 0.36986
bodystyle.relcoupe 2.2698 2.0070 1.131 0.27216
bodystyle.relsedan 6.4107 1.8450 3.475 0.00254
country.relGermany 3.1959 2.0098 1.590 0.12829
country.relUSA 3.2128 1.8812 1.708 0.10394
hwy 0.1305 0.1332 0.980 0.33937
doors 1.5554 0.7904 1.968 0.06384
leather.relyes 12.1757 1.9332 6.298 4.79e-06

#outputting estimated sigma

sigma(fitted.model)

3.483088

(b) How good is the model fit? Discuss significance of the regression coefficients.
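The p-value in the deviance test is far below 0.05, indicating a good model fit. By the t-tests in the R
output, the variables significant at the 5% level are sedan body style and leather interior. The Wald
chi-square tests in the SAS output, which use the smaller standard errors based on the maximum
likelihood estimate of 𝜎, additionally flag USA origin (p-value 0.0418) and number of doors (p-value 0.0190).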
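The SAS and R outputs above report the same coefficient estimates but different standard errors and p-values. The sketch below illustrates the source of the difference for the doors coefficient, using only numbers copied from the two outputs; n = 27 and p = 8 are counted from the data step and the coefficient table, and the variable names are illustrative. SAS reports a Wald chi-square statistic with 1 degree of freedom, while R reports a t-statistic with n - p degrees of freedom and a larger standard error.

```r
estimate <- 1.5554      # doors coefficient (same in SAS and R)
se.mle <- 0.6630        # standard error from the SAS output
se.unbiased <- 0.7904   # standard error from the R output
n <- 27                 # number of cars in the data set
p <- 8                  # estimated regression coefficients, including the intercept

# SAS: Wald chi-square test with 1 degree of freedom
wald.chisq <- (estimate / se.mle)^2
p.wald <- pchisq(wald.chisq, df = 1, lower.tail = FALSE)

# R: two-sided t-test with n - p degrees of freedom
t.stat <- estimate / se.unbiased
p.t <- 2 * pt(abs(t.stat), df = n - p, lower.tail = FALSE)

# The standard errors differ by approximately the factor sqrt(n / (n - p))
se.ratio <- se.unbiased / se.mle

print(round(c(wald.chisq = wald.chisq, p.wald = p.wald,
              t.stat = t.stat, p.t = p.t, se.ratio = se.ratio), 4))
```

With a sample this small relative to the number of coefficients, the two tests can land on opposite sides of 0.05, as they do here for doors.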

In SAS:

/*checking model fit*/

proc genmod;
model priceK = / dist=normal link=identity;
run;

Log Likelihood -91.1942

data deviance_test;
deviance = -2*(-91.1942 - (-67.2613));
pvalue = 1 - probchi(deviance,7);
run;

proc print noobs;

run;

deviance pvalue
47.8658 3.7823E-8

In R:

#checking model fit

null.model<- glm(priceK ~ 1, data=carsales.data, family=gaussian(link=identity))
print(deviance<- -2*(logLik(null.model)-logLik(fitted.model)))
47.86586

print(p.value<- pchisq(deviance, df=7, lower.tail = FALSE))

3.78218e-08

(c) Interpret the estimates of those regression coefficients that differ significantly from zero.
As estimated, a sedan costs on average $6,410.70 more than a hatchback, all other predictors being equal.
The estimated average price of a car with a leather interior is $12,175.70 higher than that of a car
without a leather interior.

(d) What is the predicted price of a sedan made in USA that has 4 doors, leather seats, and runs 30
mpg on highway?
The predicted price of a sedan that is made in USA, has 4 doors, leather seats, and runs 30 mpg on
highway is calculated as: 𝑝𝑟𝑖𝑐𝑒 = $1,000(5.1353 + 6.4107 + 3.2128 + 0.1305 ∙ 30 + 1.5554 ∙
4 + 12.1757) = $37,071.10.

In SAS:

/*using fitted model for prediction*/

data predict;
input bodystyle$ country$ hwy doors leather$;
cards;
sedan USA 30 4 yes
;

data carsales;
set carsales predict;
run;

proc genmod;
class bodystyle country leather;
model priceK = bodystyle country hwy doors leather / dist=normal link=identity;
output out=outdata p=ppriceK;
run;

data final_prediction;
set outdata;
pprice=ppriceK*1000;
run;

proc print data=final_prediction (firstobs=28) noobs;

var pprice;
run;

pprice
37071.14

In R:

#using fitted model for prediction

prediction<- (predict(fitted.model, data.frame(bodystyle.rel="sedan", country.rel
="USA", hwy=30, doors=4, leather.rel="yes")))
print(prediction*1000)

37071.14

EXERCISE 1.4. (a) Show normality of the distribution of the number of hours of sleep per night.
Regress the number of hours of sleep on all the given factors. Write explicitly what the fitted model
is.
In SAS:
data sleep;
input age gender$ quiettime nchildren stresslevel jobstatus$ nactivities pastvac
sleephours @@;
cards;
62 F 60 1 5 unempl 1 15 7.7 28 F 15 1 6 unempl 5 11 5.3
50 M 15 0 5 unempl 1 19 6.4 36 M 60 1 6 full 1 21 7.7
56 F 50 0 3 part 4 5 7.6 48 M 180 0 5 full 0 6 6.4
55 M 40 0 8 full 8 23 7.0 26 F 80 0 7 student 9 8 8.3
44 M 180 1 3 part 6 20 9.6 49 F 5 0 7 unempl 5 15 5.5
29 M 60 2 5 student 5 7 7.7 56 M 10 1 4 unempl 4 17 5.7
46 F 40 1 7 part 3 3 7.4 41 F 5 2 6 full 9 10 6.2
22 M 15 0 8 full 4 3 6.3 36 F 45 2 5 part 8 14 7.5
54 F 120 1 8 part 7 10 8.5 42 F 60 3 1 full 9 11 6.3
58 F 5 1 7 full 1 17 5.3 33 M 100 2 1 full 9 5 8.3
50 F 2 2 6 full 3 12 5.1 59 M 30 2 5 full 2 6 6.9
32 M 30 1 8 full 5 9 6.9 50 M 60 2 8 part 8 13 8.0
56 F 10 0 3 unempl 7 7 6.1 42 F 240 0 1 part 8 21 8.8
58 F 10 2 7 full 9 4 6.2 57 F 15 1 6 full 2 16 6.3
30 F 30 0 2 full 8 9 8.3 54 M 20 2 8 full 6 7 6.5
57 M 45 2 4 full 7 18 7.5 45 F 120 0 9 part 2 13 6.6
33 F 40 1 6 unempl 9 24 7.0 56 F 120 0 5 part 2 20 8.7
59 F 60 2 9 part 4 19 8.1 41 M 60 2 3 student 2 3 7.5
62 M 40 0 1 unempl 0 2 8.6 29 M 15 1 7 unempl 3 20 6.3
34 F 30 0 7 unempl 9 0 6.6 32 F 20 3 7 unempl 2 8 7.8
46 F 20 2 3 unempl 9 18 7.9 45 M 60 0 2 unempl 0 22 9.0
23 M 45 0 6 part 4 12 7.6 38 M 60 4 5 full 3 5 7.8
45 M 30 0 5 unempl 9 7 6.8 63 F 40 0 6 unempl 5 5 7.3
27 F 120 0 4 student 1 16 7.3 30 F 45 0 7 part 8 10 7.7
34 F 5 3 6 full 0 4 6.0 62 M 10 0 10 part 8 11 6.0
;

/*running normality check*/

proc univariate;
var sleephours;
histogram/normal;
run;

Goodness-of-Fit Tests for Normal Distribution
Test Statistic p Value
Kolmogorov-Smirnov D 0.08733974 Pr > D >0.150
Cramer-von Mises W-Sq 0.06145088 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.32815950 Pr > A-Sq >0.250

The normality tests (p-values > 0.05) as well as the bell-shaped histogram indicate normality of the
response variable.

/*fitting general linear model*/

proc genmod;
class gender(ref="F") jobstatus(ref="full");
model sleephours = age gender quiettime nchildren stresslevel jobstatus
nactivities pastvac / dist=normal link=identity;
run;

Log Likelihood -54.6201

Analysis Of Maximum Likelihood Parameter Estimates

Parameter            DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept             1    6.8260          0.7051          5.4440          8.2080            93.72      <.0001
age                   1   -0.0037          0.0093         -0.0218          0.0145             0.16      0.6932
gender M              1    0.3568          0.2132         -0.0610          0.7747             2.80      0.0942
gender F              0    0.0000          0.0000          0.0000          0.0000                .           .
quiettime             1    0.0074          0.0029          0.0018          0.0130             6.74      0.0095
nchildren             1    0.1204          0.1086         -0.0925          0.3334             1.23      0.2677
stresslevel           1   -0.1398          0.0536         -0.2450         -0.0347             6.80      0.0091
jobstatus part        1    1.0484          0.3188          0.4235          1.6732            10.81      0.0010
jobstatus student     1    0.6286          0.4358         -0.2255          1.4828             2.08      0.1492
jobstatus unempl      1    0.3818          0.2857         -0.1781          0.9418             1.79      0.1814
jobstatus full        0    0.0000          0.0000          0.0000          0.0000                .           .
nactivities           1    0.0204          0.0345         -0.0472          0.0879             0.35      0.5545
pastvac               1    0.0050          0.0170         -0.0282          0.0383             0.09      0.7663
Scale                 1    0.7214          0.0721          0.5930          0.8776

The fitted model is

𝐸 (𝑠𝑙𝑒𝑒𝑝ℎ𝑜𝑢𝑟𝑠) = 6.8260 − 0.0037 ∙ 𝑎𝑔𝑒 + 0.3568 ∙ 𝑚𝑎𝑙𝑒 + 0.0074 ∙ 𝑞𝑢𝑖𝑒𝑡𝑡𝑖𝑚𝑒 + 0.1204 ∙
𝑛𝑐ℎ𝑖𝑙𝑑𝑟𝑒𝑛 − 0.1398 ∙ 𝑠𝑡𝑟𝑒𝑠𝑠𝑙𝑒𝑣𝑒𝑙 + 1.0484 ∙ 𝑝𝑎𝑟𝑡𝑡𝑖𝑚𝑒 + 0.6286 ∙ 𝑠𝑡𝑢𝑑𝑒𝑛𝑡 + 0.3818 ∙ 𝑢𝑛𝑒𝑚𝑝𝑙 +
0.0204 ∙ 𝑛𝑎𝑐𝑡𝑖𝑣𝑖𝑡𝑖𝑒𝑠 + 0.0050 ∙ 𝑝𝑎𝑠𝑡𝑣𝑎𝑐, and 𝜎 = 0.7214.

In R:

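#stringsAsFactors=TRUE is needed in R 4.0 and later so that relevel() receives factors
sleep.data<- read.csv(file="C:/./Exercise1.4Data.csv", header=TRUE, sep=",",
stringsAsFactors=TRUE)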

#running normality check

library(rcompanion)
plotNormalHistogram(sleep.data$sleephours)

shapiro.test(sleep.data$sleephours)

Shapiro-Wilk normality test

W = 0.98284, p-value = 0.6762

#specifying reference levels

gender.rel<- relevel(sleep.data$gender, ref="F")
jobstatus.rel<- relevel(sleep.data$jobstatus, ref="full")

#fitting general linear model

summary(fitted.model<- glm(sleephours ~ age + gender.rel + quiettime + nchildren
+ stresslevel + jobstatus.rel + nactivities + pastvac, data=sleep.data,
family=gaussian(link=identity)))

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.826002 0.798388 8.550 1.78e-10
age -0.003656 0.010494 -0.348 0.72943
gender.relM 0.356815 0.241401 1.478 0.14741
quiettime 0.007421 0.003238 2.292 0.02738
nchildren 0.120419 0.123020 0.979 0.33368
stresslevel -0.139828 0.060734 -2.302 0.02674
jobstatus.relpart 1.048386 0.360976 2.904 0.00603
jobstatus.relstudent 0.628623 0.493437 1.274 0.21021
jobstatus.relunempl 0.381840 0.323501 1.180 0.24501
nactivities 0.020373 0.039031 0.522 0.60465
pastvac 0.005046 0.019222 0.263 0.79430

#outputting estimated sigma

sigma(fitted.model)

0.8168443

(b) How good is the model fit? What beta coefficients are significantly different from zero at the 5%
level of significance?
The p-value in the deviance test is smaller than 0.05, which indicates a good fit of the model.
Significant variables at the 5% level are quiet time, stress level, and part-time employment status.

In SAS:
/*checking model fit*/
proc genmod;
model sleephours = / dist=normal link=identity;
run;
Log Likelihood -73.0195

data deviance_test;
deviance = -2*(-73.0195 - (-54.6201));
pvalue = 1 - probchi(deviance,10);
run;

proc print noobs;

run;

deviance pvalue
36.7988 .000061312

In R:

#checking model fit

null.model<- glm(sleephours ~ 1, data=sleep.data, family=gaussian(link=identity))
print(deviance<- -2*(logLik(null.model)-logLik(fitted.model)))

36.79887

print(p.value<- pchisq(deviance, df=10, lower.tail = FALSE))

6.131066e-05

(c) Interpret the estimated significant regression coefficients.

It is estimated that for each extra minute of quiet time, a person would get on average 0.0074 hours
more sleep per night. For a unit increase in stress level, the estimated average number of hours of night
sleep decreases by 0.1398. It is estimated that, on average, someone working part-time would get 1.0484
more hours of sleep than someone working full-time.
(d) Find the estimated number of hours of night’s sleep that a 30-year old full-time mom of three
children under the age of five has, if she gets 10 minutes a day for herself, walks to the park with her
kids every day of the week, estimates her stress level as 7, and who hasn’t gotten any vacation for one
year.

Below we calculate the predicted number of hours of night's sleep for this person, using age = 30,
quiettime = 10, nchildren = 3, stresslevel = 7, nactivities = 7, and pastvac = 12. Female gender and
full-time job status are the reference levels, so the corresponding terms drop out of the calculation.

𝑠𝑙𝑒𝑒𝑝ℎ𝑜𝑢𝑟𝑠 = 6.8260 − 0.0037 ∙ 30 + 0.0074 ∙ 10 + 0.1204 ∙ 3 − 0.1398 ∙ 7 + 0.0204 ∙ 7

+ 0.0050 ∙ 12 = 6.3744.
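The same calculation can be laid out with every term of the fitted model written explicitly, which makes the role of the reference levels visible: the indicators for male, part-time, student, and unemployed are all zero for this person. This is a sketch with illustrative names, using the coefficients rounded to four decimals, so it reproduces the hand calculation rather than the unrounded software prediction.

```r
# Rounded coefficients from the fitted model
coefs <- c(intercept = 6.8260, age = -0.0037, male = 0.3568, quiettime = 0.0074,
           nchildren = 0.1204, stresslevel = -0.1398, parttime = 1.0484,
           student = 0.6286, unempl = 0.3818, nactivities = 0.0204, pastvac = 0.0050)

# Predictor values: female and full-time are reference levels (indicators = 0)
x <- c(intercept = 1, age = 30, male = 0, quiettime = 10, nchildren = 3,
       stresslevel = 7, parttime = 0, student = 0, unempl = 0,
       nactivities = 7, pastvac = 12)

# Match the values to the coefficients by name before summing
predicted.sleephours <- sum(coefs * x[names(coefs)])
print(predicted.sleephours)
```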
In SAS:

/*using fitted model for prediction*/

data predict;
input age gender$ quiettime nchildren stresslevel jobstatus$ nactivities pastvac;
cards;
30 F 10 3 7 full 7 12
;

data sleep;

set sleep predict;
run;

proc genmod;
class gender jobstatus;
model sleephours = age gender quiettime nchildren stresslevel jobstatus
nactivities pastvac / dist=normal link=identity;
output out=outdata p=psleephours;
run;

proc print data=outdata (firstobs=51) noobs;

var psleephours;
run;

psleephours
6.37616

In R:
#using fitted model for prediction
print(predict(fitted.model, data.frame(age=30, gender.rel="F", quiettime=10,
nchildren=3, stresslevel=7, jobstatus.rel="full", nactivities=7, pastvac=12)))

6.376164

EXERCISE 1.5. (a) Compute the total time spent on both transitions. Verify normality of the
distribution of this variable, and fit a general linear regression model. Specify the fitted model.
In SAS:

data time;
input age gender$ run t1 bike t2 swim @@;
transitiontime=t1+t2;
cards;
55 M 24.17 2.60 37.95 2.50 5.70 59 F 34.88 2.83 52.15 3.05 5.20
24 M 32.97 2.55 59.20 3.47 5.37 53 F 22.2 1.83 46.70 2.15 5.50
51 M 27.35 1.75 42.05 2.32 3.75 38 F 32.13 2.38 50.92 2.95 6.00
66 M 25.39 1.95 41.57 2.80 3.93 30 F 24.67 1.58 48.28 2.77 5.68
43 F 42.33 2.78 63.60 4.08 7.18 47 F 28.73 2.35 45.57 3.90 6.62
26 F 29.62 2.92 51.23 3.85 4.92 45 M 22.23 2.07 38.95 2.35 4.28
29 F 26.93 2.10 44.33 2.45 7.47 34 M 17.75 0.75 33.27 1.23 3.65
39 M 37.47 2.52 55.67 4.47 8.60 54 M 36.63 3.27 43.92 3.08 7.15
26 M 34.42 2.73 52.62 2.67 9.23 36 M 27.38 2.22 39.03 2.92 7.43
42 M 21.37 2.12 35.95 1.93 3.95 49 M 29.03 4.50 38.53 3.95 8.80
42 F 28.53 3.27 49.85 3.67 8.13 42 F 25.12 1.72 39.52 2.50 4.55
42 F 26.33 1.70 48.98 2.30 5.02 41 F 36.75 3.95 62.85 3.13 6.93
15 M 25.12 1.70 44.75 3.20 7.48 48 M 26.52 4.43 40.98 3.82 6.58
37 M 28.3 2.85 41.78 3.47 6.02 55 M 31.25 2.70 43.43 3.25 5.25
42 M 24.38 1.45 37.13 1.83 3.70 25 M 33.45 2.25 51.38 4.03 7.45
12 F 27.62 2.23 55.47 2.97 4.37 23 F 28.55 2.17 54.57 2.55 7.90
49 M 33.88 2.77 54.82 3.87 6.90 53 F 26.97 1.77 42.33 3.40 6.58
45 F 26.58 1.65 44.30 2.52 5.40 33 F 32.32 2.10 54.87 2.32 6.25
63 M 40.53 3.78 69.75 3.83 12.17 50 M 33.68 3.07 43.57 3.13 5.77
43 F 34.93 2.58 62.35 2.95 7.92 24 M 22.88 1.82 39.55 2.12 4.03
44 M 29.25 2.47 45.60 2.75 9.18 51 F 36.98 3.70 46.58 5.18 7.60
;
/*running normality check*/
proc univariate;
var transitiontime;
histogram/normal;
run;

Goodness-of-Fit Tests for Normal Distribution

Test Statistic p Value
Kolmogorov-Smirnov D 0.07499320 Pr > D >0.150
Cramer-von Mises W-Sq 0.03895414 Pr > W-Sq >0.250
Anderson-Darling A-Sq 0.26390584 Pr > A-Sq >0.250

The p-values in the normality tests are above 0.05, so normality of the response variable is not
rejected. The histogram displays a bell-shaped curve, supporting the normality conclusion.

/*fitting general linear model*/

proc genmod;
class gender;
model transitiontime = age gender run bike swim / dist=normal link=identity;
run;

Log Likelihood -56.4150

Analysis Of Maximum Likelihood Parameter Estimates

Parameter            DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept             1    0.5293          1.0253         -1.4803          2.5388             0.27      0.6057
age                   1    0.0067          0.0128         -0.0184          0.0318             0.27      0.6032
gender F              1    0.0961          0.3256         -0.5421          0.7343             0.09      0.7679
gender M              0    0.0000          0.0000          0.0000          0.0000                .           .
run                   1    0.1964          0.0500          0.0985          0.2943            15.46      <.0001
bike                  1   -0.0565          0.0328         -0.1207          0.0078             2.97      0.0849
swim                  1    0.2475          0.1024          0.0468          0.4483             5.84      0.0156
Scale                 1    0.9271          0.1012          0.7486          1.1481

The fitted model is 𝐸 (𝑡𝑟𝑎𝑛𝑠𝑖𝑡𝑖𝑜𝑛𝑡𝑖𝑚𝑒) = 0.5293 + 0.0067 ∙ 𝑎𝑔𝑒 + 0.0961 ∙ 𝑓𝑒𝑚𝑎𝑙𝑒 + 0.1964 ∙
𝑟𝑢𝑛 − 0.0565 ∙ 𝑏𝑖𝑘𝑒 + 0.2475 ∙ 𝑠𝑤𝑖𝑚, and 𝜎 = 0.9271.

In R:
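#stringsAsFactors=TRUE is needed in R 4.0 and later so that relevel() receives factors
time.data<- read.csv(file="C:/./Exercise1.5Data.csv", header=TRUE, sep=",",
stringsAsFactors=TRUE)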

#computing total transition time

transition.time<- time.data$t1 + time.data$t2

#running normality check

library(rcompanion)
plotNormalHistogram(transition.time)

shapiro.test(transition.time)

Shapiro-Wilk normality test

W = 0.97896, p-value = 0.6216

#specifying reference levels

gender.rel<- relevel(time.data$gender, ref="M")

#fitting general linear model

summary(fitted.model<- glm(transition.time ~ age + gender.rel + run + bike +
swim, data=time.data, family=gaussian(link=identity)))

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.529266 1.107464 0.478 0.635605
age 0.006659 0.013837 0.481 0.633232
gender.relF 0.096094 0.351716 0.273 0.786250
run 0.196405 0.053953 3.640 0.000849
bike -0.056487 0.035412 -1.595 0.119427
swim 0.247544 0.110615 2.238 0.031507

#outputting estimated sigma

sigma(fitted.model)

1.001351

(b) Discuss the model fit. Are all the predictors in that model significant at the 5% significance level?

In SAS:
/*checking model fit*/
proc genmod;
model transitiontime = / dist=normal link=identity;
run;

Log Likelihood -74.6263

data deviance_test;
deviance = -2*(-74.6263 - (-56.4150));
pvalue = 1 - probchi(deviance,5);
run;

proc print noobs;

run;

deviance pvalue
36.4226 .000000782

Since the p-value in the deviance test is tiny, the model has a good fit. The only significant predictors
at the 5% level are run time and swim time.

In R:

#checking model fit

null.model<- glm(transition.time ~ 1, data=time.data,
family=gaussian(link=identity))
print(deviance<- -2*(logLik(null.model)-logLik(fitted.model)))

36.42269

print(p.value<- pchisq(deviance, df=5, lower.tail = FALSE))

7.817128e-07

(c) The estimated average transition time increases by 0.1964 minutes for a one-minute increase in run
time. For a one-minute increase in swim time, the estimated average transition time increases by
0.2475 minutes.

(d) What is the predicted total time at transitions for the student, if his best result in 5-kilometer run is
27:32, 13-mile bike is 56:17, and 200-meter swim is 8:46?

Below we compute the predicted time at transitions for the 25-year-old student with a 27:32 run,
56:17 bike, and 8:46 swim. First we convert the times into minutes: 27+32/60=27.53,
56+17/60=56.28, and 8+46/60=8.77. The calculation is as follows: 𝑡𝑟𝑎𝑛𝑠𝑖𝑡𝑖𝑜𝑛𝑡𝑖𝑚𝑒 = 0.5293 +
0.0067 ∙ 25 + 0.1964 ∙ 27.53 − 0.0565 ∙ 56.28 + 0.2475 ∙ 8.77 = 5.09.
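The conversion of the race times to minutes and the hand calculation can be sketched in R as follows. The helper name to.minutes is hypothetical and introduced only for illustration; the times are rounded to two decimals, as in the calculation above, and the coefficients are the rounded values from the fitted model.

```r
# Convert a "mm:ss" time string to minutes
# (to.minutes is a hypothetical helper, not part of the original solution)
to.minutes <- function(mmss) {
  parts <- as.numeric(strsplit(mmss, ":", fixed = TRUE)[[1]])
  parts[1] + parts[2] / 60
}

run <- round(to.minutes("27:32"), 2)
bike <- round(to.minutes("56:17"), 2)
swim <- round(to.minutes("8:46"), 2)

# Rounded coefficients from the fitted model; male is the reference level,
# so no gender term enters the prediction for this student
predicted.transition <- 0.5293 + 0.0067 * 25 + 0.1964 * run -
  0.0565 * bike + 0.2475 * swim
print(c(run = run, bike = bike, swim = swim,
        predicted = round(predicted.transition, 2)))
```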

In SAS:

/*using fitted model for prediction*/

data predict;
input age gender$ run bike swim;
cards;
25 M 27.53 56.28 8.77
;
data time;
set time predict;
run;

proc genmod;
class gender;
model transitiontime = age gender run bike swim / dist=normal link=identity;
output out=outdata p=ptransitiontime;
run;

proc print data=outdata (firstobs=43) noobs;

var ptransitiontime;
run;

ptransitiontime
5.09465

In R:
#using fitted model for prediction
print(predict(fitted.model, data.frame(age=25, gender.rel="M", run=27.53,
bike=56.28, swim=8.77)))

5.094653

EXERCISE 1.6. (a) Check that the measurements for the heart rate are coming from a normal
distribution. Fit the regression model and specify all estimated parameters.

In SAS:
data heartrate;
length AQI $9.;
input age gender$ ethnicity$ BMI nmeds AQI$ HR @@;
cards;
48 F Black 29.9 0 good 76 56 F White 22.9 3 unhealthy 112
67 F White 23.4 1 good 94 82 M Black 29.7 0 good 92
64 F White 31.4 3 good 97 58 M White 18.9 2 moderate 79
72 F Black 25.2 0 moderate 114 70 F Black 25.9 1 moderate 115
54 M Hispanic 29.6 0 moderate 80 57 F Hispanic 20.2 2 good 81
50 F Black 23.9 1 unhealthy 97 59 F Hispanic 22.6 0 good 86
61 M Hispanic 32.8 1 good 84 69 M Hispanic 24.1 2 unhealthy 94
65 F Black 23.4 2 moderate 114 66 F Hispanic 27.8 3 good 82
74 M White 32.4 1 moderate 97 66 M Hispanic 22.9 2 good 86
53 M Hispanic 25.2 0 good 84 55 M Hispanic 24.6 0 moderate 94
73 F Hispanic 24.8 3 moderate 105 45 F Hispanic 19.0 2 unhealthy 83
71 F White 20.3 2 unhealthy 111 63 M Black 23.8 2 unhealthy 108
71 F White 21.5 2 moderate 100 62 M Hispanic 27.4 3 good 79
44 F Hispanic 17.2 0 unhealthy 86 49 M White 17.1 1 good 75
63 M Black 28.0 2 good 91 65 F Hispanic 22.2 1 moderate 106
;

/*running normality check*/

proc univariate;
var HR;
histogram/normal;
run;

Goodness-of-Fit Tests for Normal Distribution
Test Statistic p Value
Kolmogorov-Smirnov D 0.15627802 Pr > D 0.061
Cramer-von Mises W-Sq 0.09496306 Pr > W-Sq 0.129
Anderson-Darling A-Sq 0.65250988 Pr > A-Sq 0.084

Based on the histogram and the p-values, all of which exceed 0.05 (some only marginally), we do not
reject normality of the heart rate measurements.
/*fitting general linear model*/
proc genmod;
class gender ethnicity(ref="Hispanic") AQI(ref="good");
model HR = age gender ethnicity BMI nmeds AQI / dist=normal link=identity;
run;

Log Likelihood -96.2779

Analysis Of Maximum Likelihood Parameter Estimates

Parameter            DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept             1   38.0164         10.2408         17.9449         58.0879            13.78      0.0002
age                   1    0.6503          0.1472          0.3617          0.9389            19.51      <.0001
gender F              1    7.1031          2.3608          2.4760         11.7303             9.05      0.0026
gender M              0    0.0000          0.0000          0.0000          0.0000                .           .
ethnicity Black       1    7.5351          2.8956          1.8598         13.2104             6.77      0.0093
ethnicity White       1    2.2633          2.7895         -3.2041          7.7306             0.66      0.4172
ethnicity Hispanic    0    0.0000          0.0000          0.0000          0.0000                .           .
BMI                   1    0.0431          0.3225         -0.5890          0.6751             0.02      0.8938
nmeds                 1    0.4384          1.1919         -1.8976          2.7743             0.14      0.7130
AQI moderate          1   10.8596          2.6942          5.5790         16.1402            16.25      <.0001
AQI unhealthy         1   14.1674          3.1905          7.9142         20.4206            19.72      <.0001
AQI good              0    0.0000          0.0000          0.0000          0.0000                .           .
Scale                 1    5.9914          0.7735          4.6520          7.7165

The fitted model is 𝐸 (𝐻𝑅) = 38.0164 + 0.6503 ∙ 𝑎𝑔𝑒 + 7.1031 ∙ 𝑓𝑒𝑚𝑎𝑙𝑒 + 7.5351 ∙ 𝐵𝑙𝑎𝑐𝑘 +
2.2633 ∙ 𝑊ℎ𝑖𝑡𝑒 + 0.0431 ∙ 𝐵𝑀𝐼 + 0.4384 ∙ 𝑛𝑚𝑒𝑑𝑠 + 10.8596 ∙ 𝐴𝑄𝐼𝑚𝑜𝑑𝑒𝑟𝑎𝑡𝑒 + 14.1674 ∙
𝐴𝑄𝐼𝑢𝑛ℎ𝑒𝑎𝑙𝑡ℎ𝑦, and 𝜎 = 5.9914.

In R:

hr.data<- read.csv(file="C:/./Exercise1.6Data.csv", header=TRUE, sep=",")

#running normality check

library(rcompanion)
plotNormalHistogram(hr.data$HR)

shapiro.test(hr.data$HR)

Shapiro-Wilk normality test

W = 0.93047, p-value = 0.05054

#specifying reference levels

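#factor() is needed in R 4.0 and later, where read.csv() returns character columns
gender.rel<- relevel(factor(hr.data$gender), ref="M")
ethnicity.rel<- relevel(factor(hr.data$ethnicity), ref="Hispanic")
AQI.rel<- relevel(factor(hr.data$AQI), ref="good")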

#fitting general linear model

summary(fitted.model<- glm(HR ~ age + gender.rel + ethnicity.rel + BMI + nmeds +
AQI.rel, data=hr.data, family=gaussian(link=identity)))

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 38.01638 12.24005 3.106 0.00535
age 0.65033 0.17599 3.695 0.00134
gender.relF 7.10311 2.82173 2.517 0.02002
ethnicity.relBlack 7.53509 3.46094 2.177 0.04102
ethnicity.relWhite 2.26328 3.33411 0.679 0.50466
BMI 0.04306 0.38543 0.112 0.91210
nmeds 0.43836 1.42454 0.308 0.76133
AQI.relmoderate 10.85963 3.22023 3.372 0.00288
AQI.relunhealthy 14.16737 3.81333 3.715 0.00128

#outputting estimated sigma

sigma(fitted.model)

7.161087
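
The SAS Scale estimate (5.9914) and the R value from sigma() (7.161087) differ only by a degrees-of-freedom correction: SAS reports the maximum likelihood estimate, which divides the residual sum of squares by n, while sigma() divides by n - p. The sketch below checks this using the reported numbers. The sample size n = 30 is an assumption inferred from the firstobs=31 option used in the prediction step of part (d), and p = 9 is the number of estimated regression coefficients.

```r
# Values reported in the SAS and R output above
sigma.ml <- 5.9914   # SAS "Scale": maximum likelihood estimate of sigma
n <- 30              # assumed sample size (inferred, not stated in this section)
p <- 9               # regression coefficients, including the intercept

# Residual standard deviation with the degrees-of-freedom correction
sigma.df <- sigma.ml * sqrt(n / (n - p))
print(sigma.df)

# The same factor links the R and SAS standard errors, e.g. for the intercept
se.ratio <- 12.24005 / 10.2408
print(c(se.ratio, sqrt(n / (n - p))))
```

The same factor explains why the R standard errors are larger than the SAS ones, and why R reports t-tests while SAS reports Wald chi-square tests.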

(b) Discuss the goodness-of-fit of the model. What variables are significant predictors of heart rate at
the 5% level of significance?

In SAS:
/*checking model fit*/
proc genmod;
model HR = / dist=normal link=identity;

run;

Log Likelihood -117.8512

data deviance_test;
deviance = -2*(-117.8512 - (-96.2779));
pvalue = 1 - probchi(deviance,8);
run;

proc print noobs;

run;

deviance pvalue
43.1466 .000000824

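Since the p-value of the deviance test is far below 0.05, the fitted model fits the data significantly
better than the intercept-only (null) model. The significant predictors at the 5% level are age, gender,
ethnicity (Black versus Hispanic), and both non-reference levels of AQI (moderate and unhealthy).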

In R:

#checking model fit

null.model<- glm(HR ~ 1, data=hr.data, family=gaussian(link=identity))
print(deviance<- -2*(logLik(null.model)-logLik(fitted.model)))

43.14658

print(p.value<- pchisq(deviance, df=8, lower.tail=FALSE))

8.243212e-07
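
The deviance test can also be checked without the data file, using only the two log-likelihoods reported above. This is a minimal sketch of the arithmetic. The 8 degrees of freedom are the number of regression coefficients in the fitted model beyond the intercept, and the variable names are illustrative.

```r
# Log-likelihoods reported for the null and fitted models
loglik.null <- -117.8512
loglik.full <- -96.2779

# Deviance (likelihood ratio) statistic
dev.stat <- -2 * (loglik.null - loglik.full)

# The fitted model has 8 more parameters than the intercept-only model
df.test <- 8
p.val <- pchisq(dev.stat, df = df.test, lower.tail = FALSE)

print(dev.stat)
print(p.val)
```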

(c) Give interpretation of the estimated statistically significant regression coefficients.

As age increases by one year, the estimated average heart rate increases by 0.6503 beats per minute.
The estimated average heart rate for females is 7.1031 beats per minute larger than that for males.
The estimated average heart rate for Blacks is 7.5351 beats per minute larger than that for Hispanics.
The estimated average heart rate for people living with moderate air quality is 10.8596 beats per
minute larger than that for people living with good air quality. The estimated average heart rate for
people living with unhealthy air quality is 14.1674 beats per minute larger than that for people living
with good air quality.
(d) Compute the predicted heart rate of a 50-year-old Hispanic male who has a BMI of 20, is not
taking any heart medications, and resides in an area with a moderate air quality.
The predicted heart rate of a 50-year-old Hispanic male who has a BMI of 20, is not taking any heart
medications, and resides in an area with a moderate air quality is computed as follows:

$$\widehat{HR} = 38.0164 + 0.6503 \cdot 50 + 0.0431 \cdot 20 + 10.8596 = 82.253.$$

In SAS:
/*using fitted model for prediction*/
data predict;
input age gender$ ethnicity$ BMI nmeds AQI$;
cards;
50 M Hispanic 20 0 moderate
;

data heartrate;
set heartrate predict;
run;

proc genmod;
class gender ethnicity AQI;
model HR = age gender ethnicity BMI nmeds AQI / dist=normal link=identity;
output out=outdata p=pHR;
run;

proc print data=outdata (firstobs=31) noobs;

var pHR;
run;

pHR
82.2536

In R:
#using fitted model for prediction
print(predict(fitted.model, data.frame(age=50, gender.rel="M",
ethnicity.rel="Hispanic", BMI=20, nmeds=0, AQI.rel="moderate")))

82.25361
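
The hand computation in part (d) can be reproduced as a product of the rounded coefficient estimates with the covariate values of the new individual. This is a minimal sketch. The vector names are illustrative, and the indicators for the reference levels (male, Hispanic, good air quality) do not appear because their coefficients are zero.

```r
# Rounded estimates from the fitted model in part (a)
beta <- c(intercept = 38.0164, age = 0.6503, female = 7.1031,
          Black = 7.5351, White = 2.2633, BMI = 0.0431, nmeds = 0.4384,
          AQImoderate = 10.8596, AQIunhealthy = 14.1674)

# 50-year-old Hispanic male, BMI of 20, no medications, moderate air quality
x <- c(intercept = 1, age = 50, female = 0,
       Black = 0, White = 0, BMI = 20, nmeds = 0,
       AQImoderate = 1, AQIunhealthy = 0)

# Guard against misaligned terms before multiplying
stopifnot(identical(names(beta), names(x)))

pred <- sum(beta * x)
print(pred)
```

The small difference from the software output (82.2536) comes from using coefficients rounded to four decimal places.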

CHAPTER 2
EXERCISE 2.1. (a) Is the decrease in BMI percentile (preBMI-postBMI) normally distributed?
Plot a histogram and test for normality of the distribution.

In SAS:
data obesity;
input gender$ age group$ preBMI postBMI @@;
BMIdiff=preBMI-postBMI;
female=(gender="F");
control=(group="Cx");
cards;
F 6 Cx 85.7 83.8 F 6 Cx 93.8 92.9 F 7 Cx 93.5 92.5 F 8 Cx 90.1 89.8
F 9 Tx 92.3 90.7 F 9 Tx 90.3 88.3 F 12 Cx 87.6 85.9 F 12 Cx 87.2 84.1
F 12 Tx 96.9 94.9 F 12 Tx 85.8 81.2 F 13 Cx 96.7 94.1 F 13 Cx 93.5 92.9
F 13 Tx 92.3 87.5 F 13 Tx 85.3 83.7 F 14 Tx 95.5 78.7 F 15 Cx 91.3 89.9
F 15 Tx 95.8 87.1 F 16 Tx 90.7 87.2 M 6 Cx 92.6 88.1 M 7 Cx 95.8 94.7
M 7 Cx 90.4 89.1 M 7 Cx 91.2 88.6 M 8 Tx 94.4 87.8 M 8 Tx 93.2 87.3
M 10 Cx 93.9 91.5 M 10 Tx 96.2 91.1 M 10 Tx 89.4 87.9 M 11 Tx 86.2 77.1
M 11 Tx 95.4 84.8 M 12 Cx 97.7 95.8 M 13 Tx 85.3 80.0 M 13 Tx 86.2 82.4
M 14 Cx 85.5 83.6 M 14 Cx 97.8 93.8 M 16 Cx 95.0 93.6 M 16 Tx 93.1 86.8
;

/*running normality check of response*/

proc univariate;
var BMIdiff;
histogram/normal;
run;

Goodness-of-Fit Tests for Normal Distribution

Test Statistic p Value
Kolmogorov-Smirnov D 0.18720025 Pr > D <0.010
Cramer-von Mises W-Sq 0.36512474 Pr > W-Sq <0.005
Anderson-Darling A-Sq 2.15289200 Pr > A-Sq <0.005

Neither the histogram nor the normality tests support normality of the response. In fact, the
distribution is right-skewed.

In R:

bmi.data<- read.csv(file="C:/./Exercise2.1Data.csv",header=TRUE, sep=",")

#creating the difference in BMI

BMIdiff<- bmi.data$preBMI-bmi.data$postBMI

#running normality check of response

library(rcompanion)
plotNormalHistogram(BMIdiff)

shapiro.test(BMIdiff)

Shapiro-Wilk normality test

W = 0.79159, p-value = 1.114e-05

(b) Find the optimal lambda for Box-Cox transformation. Transform the change in BMI percentile
(find the appropriate transformation in Table 2.1), and show that the transformed variable is
normally distributed. Plot the histogram and do a formal testing.

In SAS:
/*finding optimal lambda for Box-Cox transformation*/
proc transreg;
model BoxCox(BMIdiff) = identity(age female control);
run;
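
For comparison, a similar search for lambda could be sketched in R. This is an illustration under stated assumptions, not the manual's own R solution: it uses boxcox() from the MASS package, enters the data from the SAS step in part (a) inline instead of reading the CSV file, and searches an arbitrarily chosen grid from -2 to 2. The object names are hypothetical.

```r
library(MASS)

# Data transcribed from the SAS data step in part (a)
obesity <- read.table(text = "
F 6 Cx 85.7 83.8
F 6 Cx 93.8 92.9
F 7 Cx 93.5 92.5
F 8 Cx 90.1 89.8
F 9 Tx 92.3 90.7
F 9 Tx 90.3 88.3
F 12 Cx 87.6 85.9
F 12 Cx 87.2 84.1
F 12 Tx 96.9 94.9
F 12 Tx 85.8 81.2
F 13 Cx 96.7 94.1
F 13 Cx 93.5 92.9
F 13 Tx 92.3 87.5
F 13 Tx 85.3 83.7
F 14 Tx 95.5 78.7
F 15 Cx 91.3 89.9
F 15 Tx 95.8 87.1
F 16 Tx 90.7 87.2
M 6 Cx 92.6 88.1
M 7 Cx 95.8 94.7
M 7 Cx 90.4 89.1
M 7 Cx 91.2 88.6
M 8 Tx 94.4 87.8
M 8 Tx 93.2 87.3
M 10 Cx 93.9 91.5
M 10 Tx 96.2 91.1
M 10 Tx 89.4 87.9
M 11 Tx 86.2 77.1
M 11 Tx 95.4 84.8
M 12 Cx 97.7 95.8
M 13 Tx 85.3 80.0
M 13 Tx 86.2 82.4
M 14 Cx 85.5 83.6
M 14 Cx 97.8 93.8
M 16 Cx 95.0 93.6
M 16 Tx 93.1 86.8
", col.names = c("gender", "age", "group", "preBMI", "postBMI"))

obesity$BMIdiff <- obesity$preBMI - obesity$postBMI

# Box-Cox requires a strictly positive response
stopifnot(all(obesity$BMIdiff > 0))

# Profile log-likelihood of lambda for the linear model with all predictors
fit <- lm(BMIdiff ~ age + gender + group, data = obesity, y = TRUE, qr = TRUE)
bc <- boxcox(fit, lambda = seq(-2, 2, by = 0.01), plotit = FALSE)

# Grid value of lambda that maximizes the profile log-likelihood
lambda.hat <- bc$x[which.max(bc$y)]
print(lambda.hat)
```

The grid maximizer is then usually rounded to a convenient value from Table 2.1. The SAS solution uses lambda = 0, which corresponds to the log transformation.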

/*applying Box-Cox transformation with lambda=0*/

data obesity;
set obesity;
