Notes
• Read p. 727 – 737
• Exercises: *Extra problem, 13.2, 13.3, 13.4, 13.13 (Ignore directions on book exercises. Use tables and χ²cdf to find values for $\chi^2$ and p. Show work.)
• *Extra Problem for 13.1: A school’s principal wants to know if students spend about the same amount of time on homework each night of the week. She asks a random sample of 50 students to keep track of their homework time for a week. The following table displays the average amount of time (in minutes) students reported per night:

Night:          Sun   Mon   Tues   Wed   Thurs   Fri   Sat
Average Time:   130   108   115    104   99      37    62

Explain carefully why it would not be appropriate to perform a chi-square test for goodness of fit using these data.
Friday, March 15, 2019: Section 13.1: Test for Goodness of Fit
• Exercises 13.10, 13.31,
• 13.2 Intro worksheet
• Question: What is the critical value needed to reject the null hypothesis at the 0.05 level for a four-category goodness-of-fit (GOF) test? What is the largest value of the test statistic for which we would still fail to reject the null hypothesis of a 10-category GOF test at the 0.05 level?
• Goodness of Fit Lab (Due Wednesday)
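The critical-value question above can be checked numerically. This is only a rough sketch: `chi2_sf` and `chi2_critical` are hypothetical helper names (not calculator or textbook functions), df is assumed to be (number of categories − 1), and the tail area uses the standard closed forms for a chi-square distribution with whole-number df.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc term plus a finite sum of half-integer powers
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total

def chi2_critical(alpha, df):
    """Find the value whose upper-tail area is alpha, by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

crit_4 = chi2_critical(0.05, 4 - 1)    # four categories, df = 3
crit_10 = chi2_critical(0.05, 10 - 1)  # ten categories, df = 9
print(round(crit_4, 3), round(crit_10, 3))
```

The two printed values should agree with the df = 3 and df = 9 rows of the chi-square table (about 7.81 and 16.92).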
Monday, March 18, 2019: Section 13.2: Inference for Two-Way Tables and Section 14.1: Inference About the Model
• Read p. 744 – 766
• 13.2 WS I
Tuesday, March 19, 2019: Section 13.2: Inference for Two-Way Tables and Section 14.1: Inference About the Model
• Read p. 781 – 806
• Section 14.1 Intro Worksheet
Wednesday, March 20, 2019: Sections 14.1: Inference About the Model and 14.2: Predictions and Conditions
• Read p. 781 – 806
• Lab is due
• 14.1 WS I
Thursday, March 21, 2019: Sections 14.1: Inference About the Model and 14.2: Predictions and Conditions
• 14.2 WS II
Friday, March 22, 2019: Sections 14.1: Inference About the Model and 14.2: Predictions and Conditions
• FRQ 1, 2; MC 1-6
Monday, March 25, 2019: Review
• MC 7-12, FRQ 3
• FRQ Review
Tuesday, March 26, 2019: Wednesday, March 27, 2019: Chapters 13 and 14 Test
AP Statistics Section 13.1 Notes (Day 1)
• Sometimes we want to examine the ____________________ of a single variable in a population.
• The ________________________________________ allows us to determine whether a hypothesized distribution seems valid.
I. Definition: Chi-Square Statistic
The chi-square statistic is a measure of how far the observed counts are from the expected counts. The formula for the statistic is
$\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
where the sum is over all possible outcomes.
3) Biologists wish to mate pairs of fruit flies having genetic makeup RrCc, indicating that each has one dominant gene (R) and one recessive gene (r) for eye color, along with one dominant (C) and one recessive (c) gene for wing type. Each offspring will receive one gene for each of the two traits from each parent. The following table, known as a Punnett square, shows the possible combinations of genes received by the offspring:

      RC         Rc         rC         rc
RC    RRCC (x)   RRCc (x)   RrCC (x)   RrCc (x)
Rc    RRCc (x)   RRcc (y)   RrCc (x)   Rrcc (y)
rC    RrCC (x)   RrCc (x)   rrCC (z)   rrCc (z)
rc    RrCc (x)   Rrcc (y)   rrCc (z)   rrcc (w)

Any offspring receiving an R gene will have red eyes, and any offspring receiving a C gene will have straight wings. Thus, based on this Punnett square, biologists predict a ratio of 9:3:3:1:
• 9 red-eyed, straight-winged offspring (x)
• 3 red-eyed, curly-winged offspring (y)
• 3 white-eyed, straight-winged offspring (z)
• 1 white-eyed, curly-winged offspring (w)

To test their hypothesis about the distribution of offspring, the biologists mate a random sample of pairs of fruit flies. Of 180 offspring, 99 had red eyes and straight wings, 42 had red eyes and curly wings, 29 had white eyes and straight wings, and 10 had white eyes and curly wings. Calculate the $\chi^2$ statistic and associated probability for these data (don’t forget to list df).
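One way to check hand work on the fruit fly problem is sketched below. It assumes the 9:3:3:1 ratio gives the expected counts and that df = (number of categories − 1) = 3; the variable names are illustrative only, and the tail-area line uses a closed form that is valid only for df = 3.

```python
import math

observed = [99, 42, 29, 10]  # x, y, z, w offspring counts from the problem
ratio = [9, 3, 3, 1]         # hypothesized 9:3:3:1 ratio

n = sum(observed)            # 180 offspring
expected = [n * r / sum(ratio) for r in ratio]

# Chi-square statistic: sum of (Observed - Expected)^2 / Expected
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Upper-tail area for df = 3 ONLY (closed form; other df need a table or chi-square cdf)
p_value = (math.erfc(math.sqrt(chi2_stat / 2))
           + math.sqrt(2 * chi2_stat / math.pi) * math.exp(-chi2_stat / 2))

print("expected counts:", expected)
print("chi-square =", round(chi2_stat, 3), " df =", df, " P-value =", round(p_value, 3))
```

Under these assumptions the statistic comes out near 2.87 with df = 3 and a tail probability of roughly 0.41, which a table lookup or χ²cdf should confirm.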
Chi-Square Goodness of Fit test
Conditions: Random sample; n < (1/10)N; all expected counts at least 5.
$\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
• Since there were 84 total entrees ordered when no music was played, the expected count of French entrees would be $84 \cdot \frac{99}{243} = 34.22$.
• Since there were 75 total entrees ordered when French music was played, the expected count of French entrees would be $75 \cdot \frac{99}{243} = 30.56$.
• Since there were 84 total entrees ordered when Italian music was played, the expected count of French entrees would be $84 \cdot \frac{99}{243} = 34.22$.
Problem: If the specific type of music that’s playing has no effect on entrée orders, the proportion of Italian entrees ordered under each music condition should be __________.
Problem: Find the three expected counts for this proportion: expected counts for no music, French
music, and Italian music when Italian entrees are ordered.__________________________________________
II. Check your work with Expected Counts Table (Note: I did the last row for you.)

                   Type of Music
Entrée Ordered     None     French    Italian    Total
French             34.22    30.56     34.22      99
Italian            10.72    9.57      10.72      31
Other              39.06    34.88     39.06      113
Total              84       75        84         243

Note: When H0 is true, the expected count in any cell of a two-way table is
$\text{expected count} = \dfrac{(\text{row total}) \cdot (\text{column total})}{\text{table total}}$
Look at the computations and convince yourself this formula is true.
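The expected-count formula can be checked against the table with a few lines of code. This is a minimal sketch that uses only the row and column totals given above; the dictionary names are made up for illustration.

```python
# Row totals (entree ordered) and column totals (type of music) from the table
row_totals = {"French": 99, "Italian": 31, "Other": 113}
col_totals = {"None": 84, "French": 75, "Italian": 84}
table_total = sum(row_totals.values())  # 243

# expected count = (row total * column total) / table total
expected = {
    entree: {music: r * c / table_total for music, c in col_totals.items()}
    for entree, r in row_totals.items()
}

for entree, row in expected.items():
    cells = "  ".join(f"{music}: {count:.2f}" for music, count in row.items())
    print(f"{entree:8s} {cells}")
```

Each printed value should match the corresponding cell of the expected counts table to two decimal places.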
Recall $\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
IV. Problem
The Pennsylvania State University has its main campus in the town of State College and more than
20 smaller “commonwealth campuses” around the state. The Penn State Division of Student
Affairs polled separate random samples of undergraduates from the main campus and
commonwealth campuses about their use of online social networking. Facebook was the most
popular site. Here is a comparison of Facebook use by undergraduates at the main campus and at the
commonwealth campuses who have a Facebook account.
Use Facebook                     Main Campus    Commonwealth    Total
Several times a month or less    55             76              131
At least once a week             215            157             372
At least once a day              640            394             1034
Total Facebook users             910            627             1537
(1) Calculate the table of expected counts.
(2) Calculate the chi-square test statistic (show work as shown above) and associated probability.
Use the formula for degrees of freedom for two-way tables above.
$\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
df: (rows – 1)(columns – 1)
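For the Penn State Facebook table, the expected counts, test statistic, and df can be checked with the sketch below. It assumes the observed counts are entered row by row (main campus, then commonwealth), and the last step uses a shortcut for the tail area that holds only when df = 2.

```python
import math

# Observed counts: rows = frequency of use, columns = (main campus, commonwealth)
observed = [
    [55, 76],    # several times a month or less
    [215, 157],  # at least once a week
    [640, 394],  # at least once a day
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# (1) Expected counts: row total * column total / table total
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# (2) Chi-square statistic summed over all six cells
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For df = 2 ONLY, the upper-tail area is exp(-chi2 / 2)
p_value = math.exp(-chi2_stat / 2)

for row in expected:
    print([round(e, 2) for e in row])
print("chi-square =", round(chi2_stat, 2), " df =", df, " P-value =", p_value)
```

With these counts the statistic should be close to 19.49 on 2 degrees of freedom, with a very small P-value.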
Example:
Random digit telephone surveys used to exclude cell phone numbers. If the opinions of people who only
have cell phones differ from those of people who have landline service, the poll results may not represent
the entire adult population. The Pew Research Center interviewed separate random samples of cell-only
and landline telephone users. Here’s what the Pew survey found about how these people describe their
political party affiliation:

                                 Cell-only Sample    Landline Sample
Democrat or lean Democratic      49                  47
Refuse to lean either way        15                  27
Republican or lean Republican    32                  30
Total                            96                  104

Do these data provide convincing evidence at the α = 0.05 level that the distribution of party affiliation differs in cell-only and landline user populations? (Separate paper)
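The calculation part of the Pew example can be sketched as follows, including the expected-count condition and the decision at α = 0.05. The variable names are illustrative, and the P-value line again relies on a shortcut that is valid only for df = 2. The hypotheses, conditions, and conclusion in context still need to be written out on paper.

```python
import math

# Observed counts: (cell-only sample, landline sample)
observed = {
    "Democrat or lean Democratic": (49, 47),
    "Refuse to lean either way": (15, 27),
    "Republican or lean Republican": (32, 30),
}

col_totals = [sum(col) for col in zip(*observed.values())]  # sample sizes
table_total = sum(col_totals)

chi2_stat = 0.0
min_expected = float("inf")
for counts in observed.values():
    row_total = sum(counts)
    for obs, col_total in zip(counts, col_totals):
        expected_count = row_total * col_total / table_total
        min_expected = min(min_expected, expected_count)
        chi2_stat += (obs - expected_count) ** 2 / expected_count

df = (len(observed) - 1) * (len(col_totals) - 1)
p_value = math.exp(-chi2_stat / 2)  # upper-tail area, valid for df = 2 only

alpha = 0.05
reject_null = p_value < alpha

print("smallest expected count:", round(min_expected, 2))
print("chi-square =", round(chi2_stat, 2), " df =", df, " P-value =", round(p_value, 2))
print("reject H0 at alpha = 0.05?", reject_null)
```

Here the smallest expected count is well above 5, and the P-value of about 0.20 is larger than 0.05, so this sketch does not reject the null hypothesis of equal distributions.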
14.1 WS 1
1) Use the least-squares regression analysis of Flight Time vs. Drop Height of a helicopter. The data came from dropping 70 paper helicopters from various heights and measuring flight times. Assume all conditions have been verified. Find and interpret a 95% confidence interval for the slope of the true regression line.

Regression Analysis: Flight Time vs. Drop Height
Predictor           Coef         SE Coef      T        P
Constant            –0.03761     0.05838      –0.64    0.522
Drop height (cm)    0.0057244    0.0002018    28.37    0.000
S = 0.168181   R-Sq = 92.2%   R-Sq (adj) = 92.1%
2) A random sample of 16 used Ford F-150 SuperCrew 4 x 4’s was selected from among those for sale on autotrader.com. The number of miles driven and price (in dollars) were recorded for each of the trucks. The regression analysis is given below. Assume all conditions have been verified. Find and interpret a 90% confidence interval for the slope of the true regression line.

Regression Analysis: Price ($) versus Miles Driven
Predictor       Coef        SE Coef    T        P
Constant        38257       2446       15.64    0.000
Miles Driven    –0.16292    0.03096    –5.26    0.000
S = 5740.13   R-Sq = 66.4%   R-Sq (adj) = 64%
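The two slope intervals above follow the pattern b ± t*·SE_b with df = n − 2. The sketch below is one hedged way to check the arithmetic; `t_pdf`, `t_cdf`, `t_star`, and `slope_interval` are hypothetical helper names, and t* is found numerically (Simpson's rule plus bisection) instead of being read from a t table, so it may differ slightly from a table value that rounds df down.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + area * h / 3

def t_star(conf_level, df):
    """Critical value with central area conf_level, found by bisection."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def slope_interval(b, se_b, n, conf_level):
    """Confidence interval for the slope: b +/- t* SE_b with df = n - 2."""
    margin = t_star(conf_level, n - 2) * se_b
    return b - margin, b + margin

# Problem 1: helicopters, n = 70, 95% confidence
heli = slope_interval(0.0057244, 0.0002018, 70, 0.95)
# Problem 2: trucks, n = 16, 90% confidence
truck = slope_interval(-0.16292, 0.03096, 16, 0.90)

print("helicopter slope CI:", heli)
print("truck slope CI:", truck)
```

The code only supplies the endpoints. The interpretation still has to be stated in context, for example as the change in predicted price for each additional mile driven.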
3) Infants who cry easily may be more easily stimulated than others. This may be a sign of higher IQ. Child development researchers explored the relationship between the crying of infants 4 to 10 days old and their later IQ scores. The researchers measured the “crycount” at age 4 to 10 days and then later measured IQ at age three. Thirty-eight children were sampled (SRS). Assume all conditions have been verified. The regression analysis is given below. Do these data provide convincing evidence of a positive linear relationship between crying counts and IQ in the population of interest? (Carry out a significance test.)

Regression Analysis: IQ vs Crycount
Predictor    Coef      SE Coef    T        P
Constant     91.268    8.934      10.22    0.000
Crycount     1.4929    0.4870     3.07     0.004
S = 17.50   R-Sq = 20.7%   R-Sq (adj) = 18.5%
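For the crying and IQ test, the statistic is t = b / SE_b with df = n − 2, and the alternative is one-sided (slope greater than 0). The sketch below checks the numbers under two assumptions that are not in the problem: a significance level of α = 0.05, and hypothetical helpers `t_pdf` and `t_upper_tail` that integrate the t density numerically.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0: one half minus the Simpson's rule area on [0, t]."""
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - area * h / 3

b, se_b, n = 1.4929, 0.4870, 38  # slope, its standard error, sample size
t_stat = b / se_b                # matches the T column of the output
df = n - 2

# H0: slope = 0 versus Ha: slope > 0, so only the upper tail counts
p_one_sided = t_upper_tail(t_stat, df)

alpha = 0.05                     # ASSUMED level; the problem does not state one
reject_null = p_one_sided < alpha

print("t =", round(t_stat, 2), " df =", df, " one-sided P-value =", round(p_one_sided, 4))
print("reject H0?", reject_null)
```

The one-sided P-value should be about half of the two-sided P = 0.004 printed in the output, since the sample slope is positive.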
Chapter 14 Notes Day 2
1) Body weights and backpack weights were collected for 8 students. The regression analysis and relevant graphs are given below. Is there statistically significant evidence of a positive linear relationship between body weight and backpack weight? (a) Carry out an appropriate significance test at the α = 0.01 level. (b) Give a 99% confidence interval for the slope of the population regression line.

Predictor    Coef       Stdev      t-ratio    P
Constant     16.265     3.937      4.13       0.006
BodyWT       0.09080    0.02831    3.21       0.018
S = 2.270   R-sq = 63.2%   R-sq (adj) = 57%
[Graphs not reproduced; surviving axis labels: Backpack wt (lbs), Frequency, Residual]
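The backpack problem can be worked directly from the printed output. The sketch below assumes that the P column is two-sided (so it is halved for the one-sided alternative, which is reasonable only because the sample slope is positive) and that t* = 3.707 is the table value for df = 6 at 99% confidence.

```python
b, se_b, n = 0.09080, 0.02831, 8  # slope, its standard deviation, sample size
df = n - 2                        # 6 degrees of freedom

# (a) Test H0: slope = 0 versus Ha: slope > 0 at alpha = 0.01
t_stat = b / se_b                 # should match the t-ratio column
p_two_sided = 0.018               # P column of the output, assumed two-sided
p_one_sided = p_two_sided / 2     # halve it because Ha is one-sided and b > 0
alpha = 0.01
reject_null = p_one_sided < alpha

# (b) 99% confidence interval: b +/- t* SE_b
t_star = 3.707                    # ASSUMED from a t table: df = 6, 99% confidence
margin = t_star * se_b
ci = (b - margin, b + margin)

print("t =", round(t_stat, 2), " one-sided P =", p_one_sided, " reject H0?", reject_null)
print("99% CI for slope:", tuple(round(x, 4) for x in ci))
```

Under these assumptions the one-sided test rejects at the 0.01 level while the 99% interval still contains 0. The two do not conflict: the interval corresponds to a two-sided test at α = 0.01, and the two-sided P-value of 0.018 is larger than 0.01.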