SCI 1020 - wk2
PRELIMINARY QUESTIONS:
These problems help you engage with the lecture material and make sure that everyone is up to speed
before the workshop starts. Please make sure you do them before class each week!
Q.1 State in your own words what is meant by each of the terms listed below. Be specific.
Term Definition
Explanatory variable
Response variable
Association
Correlation
Regression line
Residual
Q.2 What is the general equation of a straight line? Define all the terms in the equation.
WORKSHOP PROBLEMS:
Q.4 Demonstration of correlation and least squares regression.
a) Go to the website http://digitalfirst.bfwpub.com/stats_applet/stats_applet_5_correg.html (Note
that the spaces in the URL are underscores "_").
Create a scatterplot with a linear trend (similar to plot 1 below). Observe the size of the correlation
coefficient for different scatter patterns. Use “Draw your own line” to draw a line of best fit.
Change the intercept and slope, trying to minimise the sum of the squares of the residuals as shown
by the “relative SS” value. Compare your line with the “Show least-squares line” option, which places
the line by calculation. No written answers are required here; just observe the values.
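To see what the applet is minimising, here is a minimal Python sketch that compares the sum of squared residuals for a guessed line with that of the least-squares line. The data points, the guessed intercept and slope, and the function names are all invented for illustration; they are not taken from the applet, and the applet's “relative SS” value may be scaled differently from the raw sum computed here.

```python
# Hypothetical (x, y) points, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def sum_squared_residuals(intercept, slope, xs, ys):
    """Sum of (observed y - predicted y)^2 for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# A hand-picked guess, standing in for "Draw your own line".
guess_ss = sum_squared_residuals(0.5, 1.8, xs, ys)

# The calculated line, standing in for "Show least-squares line".
a, b = least_squares_line(xs, ys)
best_ss = sum_squared_residuals(a, b, xs, ys)

print(f"guess SS = {guess_ss:.3f}, least-squares SS = {best_ss:.3f}")
```

Whatever intercept and slope you guess, the sum for the least-squares line should never be larger than the sum for your own line.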
b) Describe the relationship in the x-y data plotted below:
[Figure: three scatterplots]
Plot 1. Quiz score vs chocolate consumption: Quiz score (%) against Daily chocolate consumption (g)
Plot 2. Change in pulse rate with exercise: Pulse rate after exercise (beats per minute) against Pulse rate before exercise (beats per minute)
Plot 3. Measured radioactive decay: Counts per minute against Time (mins)
Association
Correlation
Estimate r (if appropriate)
Week 2 Page | 2
Q.5 Do Q4.29 (7th ed: Q4.28) from Moore et al, p.121 and with the same data, Q5.39, p.155.
Download the data set “Sparrowhawk” from the Moodle page/Part 1: Exploring Data.
Produce the scatterplot of the relationship in the (x,y) data.
Describe the association between New Adults arriving and the percentage of returning birds:
ALSO: Apply linear regression analysis using Excel. Include the residual plot.
TUTOR CHECK OF PLOTS: Scatterplot and Residual plot
(Not done in class? You must attach your plots printed out)
ALSO: Describe the residual plot. Is there any trend? Do the data points curve about the line of best
fit, OR are they randomly scattered either side of the linear trend line?
What does this tell you about fitting a linear model to these data?
Moore Q5.39 a) What is the equation of the linear model for this relationship?
Do not use x and y designations but replace them with a descriptive notation for the variables.
Moore Q5.39 c) Use the model to predict the number of new adults if 45% of adults from the previous
year return:
EXTRA: Verify the value of the residual for the data point where x = 45. Show the full
calculation of the residual.
Residual at each x = (observed y value) - (predicted ŷ value from the line)
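The residual formula above can be sketched in Python as follows. The line ŷ = 10 + 2x and the observed point (4, 19.5) are hypothetical values chosen only to show the arithmetic; they are not the Sparrowhawk model or data, and the function name is an assumption.

```python
def residual(observed_y, intercept, slope, x):
    """Residual = observed y minus the y value predicted by the line at x."""
    predicted_y = intercept + slope * x
    return observed_y - predicted_y


# Hypothetical line: predicted y = 10 + 2x. Hypothetical observed point: (4, 19.5).
# Predicted y at x = 4 is 10 + 2 * 4 = 18, so the residual is 19.5 - 18 = 1.5.
example = residual(observed_y=19.5, intercept=10.0, slope=2.0, x=4.0)
print(example)  # 1.5
```

A positive residual means the observed point lies above the line; a negative residual means it lies below.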
From this exercise, you should make sure you are able to:
• draw a scatterplot
• obtain a line of best fit for linear data and identify its equation
• obtain a residual plot
• obtain the correlation coefficient
• state what each of the items above tells you about the data.
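As a check on a spreadsheet result, the correlation coefficient in the list above can also be computed directly from its definition. This is a minimal sketch using only the Python standard library; the function name and the data are invented for illustration and are not from the workshop data set.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements, invented for illustration only.
demo_x = [10.0, 20.0, 30.0, 40.0, 50.0]
demo_y = [12.0, 25.0, 29.0, 44.0, 48.0]
print(f"r = {correlation(demo_x, demo_y):.3f}")
```

Values of r near +1 or -1 indicate a strong linear association, and values near 0 indicate a weak one. Note that r only measures linear association, so it is not an appropriate summary for a curved pattern.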
MARK : /10