F24_Lab-01 (1)
8/28/2024
Summary statistics
Create a variable for the estimated distance, D.Hat = 200 + 0.708 Height − 0.000344 Height^2, and add it to
the data frame Galileo.
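A minimal sketch of this step, assuming Galileo has numeric columns named Height and Distance (the names used in the formula and in the comparison that follows). The small data frame below is an illustrative stand-in so the block runs on its own; it is not meant to replace the lab's data.

```r
# Illustrative stand-in for the lab's Galileo data frame (assumed column names)
Galileo <- data.frame(
  Height   = c(100, 200, 300, 450, 600, 800, 1000),
  Distance = c(253, 337, 395, 451, 495, 534, 573)
)

# Estimated distance from the quadratic formula in the text;
# assigning with $ adds the new column to the data frame
Galileo$D.Hat <- 200 + 0.708 * Galileo$Height - 0.000344 * Galileo$Height^2

# Inspect the result
head(Galileo)
```

Note that `^` binds more tightly than `*` in R, so `0.000344 * Galileo$Height^2` squares Height before multiplying.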
Create a new variable LO that is TRUE when the estimated distance is lower than the measured distance
(D.Hat < Distance) and FALSE otherwise, and add it to the data frame Galileo. Use LO to subset the
Galileo data frame, removing the observations for which the estimated distance is lower than the
measured distance.
# Remove cases whose estimated distance is lower than the measured distance
Galileo[!Galileo$LO, ]
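The full sequence for this step might look like the sketch below. It again uses an illustrative stand-in for Galileo with assumed columns Height and Distance, and the name Galileo.sub is hypothetical, chosen here only to hold the result.

```r
# Illustrative stand-in for the lab's Galileo data frame (assumed column names)
Galileo <- data.frame(
  Height   = c(100, 200, 300, 450, 600, 800, 1000),
  Distance = c(253, 337, 395, 451, 495, 534, 573)
)
Galileo$D.Hat <- 200 + 0.708 * Galileo$Height - 0.000344 * Galileo$Height^2

# A comparison already returns TRUE/FALSE, so no ifelse() is needed
Galileo$LO <- Galileo$D.Hat < Galileo$Distance

# Keep only the rows where the estimate is NOT lower than the measurement
Galileo.sub <- Galileo[!Galileo$LO, ]

# Equivalent, using subset():
# Galileo.sub <- subset(Galileo, !LO)
```

The trailing comma inside `[ , ]` matters: the logical vector selects rows, and the empty second index keeps all columns.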
install.packages("Sleuth3")
library(Sleuth3)
summary(case0101)
Obtain summary statistics of the scores for the two treatment groups.
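The vectors int.score and ext.score used in the commands for this part are not defined in this section. One plausible way to build them is sketched below, assuming case0101 has a numeric column Score and a factor Treatment with levels "Intrinsic" and "Extrinsic". The data frame creativity is a made-up stand-in so the block runs without Sleuth3; with the package loaded, case0101 would take its place.

```r
# Toy stand-in for case0101 (scores are made up for illustration)
creativity <- data.frame(
  Score     = c(12.0, 17.5, 20.0, 24.5, 5.0, 11.0, 15.5, 19.0),
  Treatment = factor(rep(c("Intrinsic", "Extrinsic"), each = 4))
)

# Split the scores by treatment group
int.score <- creativity$Score[creativity$Treatment == "Intrinsic"]
ext.score <- creativity$Score[creativity$Treatment == "Extrinsic"]

# Minimum, quartiles, median, mean, and maximum for each group
summary(int.score)
summary(ext.score)
```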
par(mfrow=c(1,2))
hist(int.score)
hist(ext.score)
stem(int.score)
stem(ext.score)
Find the average score difference between the two treatment groups.
mean(int.score) - mean(ext.score)
var(int.score)
var(ext.score)
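The same group comparisons can also be computed directly from the data frame with tapply(), which applies a function within each level of a factor. This is a sketch on a made-up stand-in for case0101, assuming columns named Score and Treatment; the object names group.means, group.vars, and mean.diff are hypothetical.

```r
# Toy stand-in for case0101 (scores are made up for illustration)
creativity <- data.frame(
  Score     = c(12.0, 17.5, 20.0, 24.5, 5.0, 11.0, 15.5, 19.0),
  Treatment = factor(rep(c("Intrinsic", "Extrinsic"), each = 4))
)

# Mean and sample variance of Score within each treatment group
group.means <- tapply(creativity$Score, creativity$Treatment, mean)
group.vars  <- tapply(creativity$Score, creativity$Treatment, var)

# Difference in means, Intrinsic minus Extrinsic (same order as in the text)
mean.diff <- unname(group.means["Intrinsic"] - group.means["Extrinsic"])

# Standard deviations, on the same scale as the scores
sqrt(group.vars)
```

Indexing the results by level name, rather than by position, avoids depending on the alphabetical order in which R stores factor levels.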
summary(ex0116)
hist(ex0116$PerCapitaGDP, breaks = seq(0, max(ex0116$PerCapitaGDP)+5000, 5000))
boxplot(ex0116$PerCapitaGDP)
# List the countries whose per-capita GDP falls outside the inner fences (GDP < LIF or GDP > UIF)
ex0116[ex0116$PerCapitaGDP < LIF | ex0116$PerCapitaGDP > UIF, ]
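LIF and UIF are not defined in this section. Under the usual convention, the lower and upper inner fences sit 1.5 interquartile ranges below the first quartile and above the third quartile. The sketch below computes them under that assumption on a made-up stand-in for ex0116 (hypothetical countries and values, with assumed column names Country and PerCapitaGDP).

```r
# Toy stand-in for ex0116 (countries and values are hypothetical)
gdp <- data.frame(
  Country      = LETTERS[1:9],
  PerCapitaGDP = c(1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 60000)
)

# First and third quartiles, and the interquartile range
Q1 <- unname(quantile(gdp$PerCapitaGDP, 0.25))
Q3 <- unname(quantile(gdp$PerCapitaGDP, 0.75))
IQR.gdp <- Q3 - Q1

# Inner fences: 1.5 IQRs beyond the quartiles
LIF <- Q1 - 1.5 * IQR.gdp
UIF <- Q3 + 1.5 * IQR.gdp

# Rows flagged as potential outliers
outliers <- gdp[gdp$PerCapitaGDP < LIF | gdp$PerCapitaGDP > UIF, ]
outliers
```

The quartiles from quantile() with its default settings can differ slightly from the hinges that boxplot() draws, so the flagged rows may not match the plotted outliers exactly in small samples.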
4. Exercise
1. Download the baseball data set baseball.csv from the Canvas module for this lab. It contains data
from the backs of 59 baseball cards. The file has 59 observations on the following 6 variables:
height: height in inches; weight: weight in pounds; bat: a factor with levels L R S; throw: a factor
with levels L R; field: a factor with levels 0 1; average: ERA if the player is a pitcher, batting
average if the player is a fielder.
2. Read baseball.csv into R as a data frame.
9. Calculate the difference between the mean ERA of pitchers who are classified as overweight (BMI ≥ 25)
and the mean ERA of pitchers with BMI < 25.
10. Create a new data frame owbb that contains baseball players classified as overweight according to their
BMI.
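A rough sketch of how the steps listed here might fit together, offered as one possible approach rather than the expected solution. It rests on three assumptions that this section does not confirm: BMI is computed with the standard imperial formula 703 × weight / height^2, pitchers are coded as field equal to 0, and the file sits in the working directory. The rows below are hypothetical and stand in for baseball.csv so the block runs on its own; the names overweight, pitcher, and era.diff are also hypothetical.

```r
# Toy stand-in for baseball.csv (all values are hypothetical).
# With the lab file: baseball <- read.csv("baseball.csv", stringsAsFactors = TRUE)
baseball <- read.csv(text = "height,weight,bat,throw,field,average
72,180,R,R,0,3.20
75,220,L,L,0,4.10
70,200,R,R,0,2.90
74,190,S,R,1,0.275
69,185,R,R,1,0.301", stringsAsFactors = TRUE)

# field is numeric in the file; the text describes it as a factor
baseball$field <- factor(baseball$field)

# BMI from pounds and inches (ASSUMPTION: standard imperial formula)
baseball$BMI <- 703 * baseball$weight / baseball$height^2

# Logical indicators for the two groupings
overweight <- baseball$BMI >= 25
pitcher    <- baseball$field == "0"   # ASSUMPTION: 0 codes pitchers

# Mean ERA of overweight pitchers minus mean ERA of the other pitchers
era.diff <- mean(baseball$average[pitcher & overweight]) -
            mean(baseball$average[pitcher & !overweight])

# Data frame of all players classified as overweight
owbb <- baseball[overweight, ]
```

If the lab codes pitchers as 1 instead, only the line defining pitcher needs to change.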