Correlation Regression
1 a) Coca-Cola is studying the effect of its latest advertising campaign. People chosen at
random were called and asked how many cans of Coca-Cola they had bought in the past
week and how many Coca-Cola advertisements they had either read or seen in the past
week. The data collected from different people are as follows:
b) A broker for a local investment firm has been studying the relationship between
increases in the price of gold (X) and her customers' requests to liquidate stocks (Y).
From a dataset based on 15 observations, the sample slope was found to be 2.9. If the
standard error of the regression slope coefficient is 0.18, is there reason to believe (at
the 0.05 significance level) that the slope has changed from its past value of 3.2?
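The test asked for in part b) can be sketched as a two-tailed t-test of H0: slope = 3.2 against H1: slope ≠ 3.2 with n - 2 degrees of freedom. The variable names are illustrative, and the critical value is hard-coded from a standard t table rather than computed.

```python
# Two-tailed t-test for a regression slope against a hypothesized value.
n = 15          # number of observations (given)
b = 2.9         # sample slope (given)
beta_0 = 3.2    # past value of the slope under H0 (given)
s_b = 0.18      # standard error of the slope (given)

df = n - 2                      # degrees of freedom for simple regression
t_stat = (b - beta_0) / s_b     # standardized distance from the hypothesized slope

# Assumption: t(0.025, 13) = 2.160, read from a standard t table.
t_crit = 2.160

reject = abs(t_stat) > t_crit
print(f"df = {df}, t = {t_stat:.3f}, critical t = ±{t_crit}, reject H0: {reject}")
```

Under these inputs |t| is smaller than the critical value, so the sample gives no reason to conclude that the slope has changed.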
ANOVA Table
Source        df    Sum of Squares    Mean Square    F Value
Regression    ?     ?                 ?              ?
Residual      ?     2166000           ?
Total         23    14370000
i) Test the significance of the estimated regression coefficient of X3 at the 5%
significance level.
ii) Compute the standard error of the estimate, Syx.
iii) Compute R² and interpret its meaning.
iv) Given that X1 = 1295, X2 = 18, X3 = 5, X4 = 3, X5 = 1, calculate the 95% confidence interval
for the average standby hours per week.
v) Set up the null and alternative hypotheses, carry out the F-test, and interpret your result.
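Parts ii), iii) and v) follow from the ANOVA table alone once the number of predictors is known. The sketch below assumes k = 5, inferred from part iv) listing X1 to X5; that count is not stated in the table itself. The critical F value is hard-coded from a standard table.

```python
import math

# Values read from the ANOVA table
sst = 14_370_000.0   # total sum of squares
sse = 2_166_000.0    # residual sum of squares
df_total = 23        # n - 1, so n = 24

# Assumption: five independent variables (X1..X5 appear in part iv).
k = 5
df_reg = k
df_res = df_total - k

ssr = sst - sse                # regression sum of squares
msr = ssr / df_reg             # mean square regression
mse = sse / df_res             # mean square error
f_value = msr / mse            # overall F statistic
r2 = ssr / sst                 # coefficient of determination
s_yx = math.sqrt(mse)          # standard error of the estimate

# Assumption: F(0.05; 5, 18) = 2.77, read from a standard F table.
f_crit = 2.77

print(f"SSR = {ssr:.0f}, MSR = {msr:.0f}, MSE = {mse:.2f}")
print(f"F = {f_value:.2f} (critical {f_crit}), R^2 = {r2:.4f}, Syx = {s_yx:.2f}")
```

If k differs from 5, only the degrees of freedom and the quantities derived from them change; R² does not depend on k.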
Car                          1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Number of options (X)        3   4   4   7   7   8   9  11  12  12  14  16  20  23  25
Delivery time in days (Y)   25  32  26  38  34  41  39  46  44  51  58  53  64  66  70
i) Find the correlation coefficient between the number of options ordered and the
delivery time in days, and examine whether this relationship is significant at the 5% level of
significance.
ii) Find the linear regression line. Compute the residual for car 6. Next, given that
∑(Y - Ŷ)² = 153, compute the standard error of the estimate, Syx, and interpret its
meaning.
iii) Given that ∑(X - X̄)² = 657.33, test the regression coefficient at the 5% level of significance.
iv) Compute the 95% prediction interval of the delivery time for a car with 14 options
ordered.
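The four parts above can be worked through with a short least-squares sketch. It assumes the delivery time for car 14 is 66 days (the digit pair split across two lines in the table); with that reading the data reproduce the sums quoted in the questions, ∑(X - X̄)² ≈ 657.33 and ∑(Y - Ŷ)² ≈ 153.42. The critical t value is hard-coded from a table.

```python
import math

x = [3, 4, 4, 7, 7, 8, 9, 11, 12, 12, 14, 16, 20, 23, 25]           # options ordered
y = [25, 32, 26, 38, 34, 41, 39, 46, 44, 51, 58, 53, 64, 66, 70]    # delivery days

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# i) correlation coefficient and its t statistic (df = n - 2)
r = sxy / math.sqrt(sxx * syy)
t_r = r * math.sqrt((n - 2) / (1 - r ** 2))

# ii) regression line, residual for car 6, standard error of the estimate
b = sxy / sxx
a = y_bar - b * x_bar
resid_car6 = y[5] - (a + b * x[5])
sse = syy - b * sxy                 # equals the sum of squared residuals
s_yx = math.sqrt(sse / (n - 2))

# iii) t-test of the slope against zero
s_b = s_yx / math.sqrt(sxx)
t_b = b / s_b

# iv) 95% prediction interval for an individual car with 14 options
t_crit = 2.160                      # assumption: t(0.025, 13) from a t table
x0 = 14
y0 = a + b * x0
half = t_crit * s_yx * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

print(f"r = {r:.4f}, t = {t_r:.2f}")
print(f"Y^ = {a:.2f} + {b:.2f} X, residual(car 6) = {resid_car6:.2f}, Syx = {s_yx:.3f}")
print(f"slope t = {t_b:.2f}, PI at X = 14: {y0 - half:.2f} to {y0 + half:.2f}")
```

In simple regression the t statistic for r and the t statistic for the slope coincide, which gives a quick consistency check on parts i) and iii).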
4) In a regression problem with a sample size of 25, the slope was found to be 1.12
and the standard error of estimate is 8.516. The quantity (∑X² - nX̄²) = 327.52.
i) Find the standard error of the regression slope coefficient.
ii) Test whether the regression coefficient is different from 0 at the 0.05 significance
level.
iii) Construct a 95% confidence interval for the population slope.
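A minimal sketch of the arithmetic for problem 4, using the relation Sb = Syx / sqrt(∑X² - nX̄²). Variable names are illustrative and the critical value is taken from a t table rather than computed.

```python
import math

n = 25            # sample size (given)
b = 1.12          # sample slope (given)
s_yx = 8.516      # standard error of estimate (given)
sxx = 327.52      # sum of X^2 minus n times X-bar^2 (given)

# i) standard error of the slope
s_b = s_yx / math.sqrt(sxx)

# ii) two-tailed test of H0: slope = 0
t_stat = b / s_b
t_crit = 2.069    # assumption: t(0.025, 23) from a standard t table
reject = abs(t_stat) > t_crit

# iii) 95% confidence interval for the population slope
lower = b - t_crit * s_b
upper = b + t_crit * s_b

print(f"Sb = {s_b:.4f}, t = {t_stat:.3f}, reject H0: {reject}")
print(f"95% CI for slope: {lower:.3f} to {upper:.3f}")
```

The interval excludes zero, which agrees with the outcome of the test in part ii).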
5) In a regression equation Ŷ = 22.22 + 2.02X with a sample size of 12, given that
∑(Y - Ŷ)² = 153.421 and ∑(X - X̄)² = 657.33, compute the standard error of estimate, then
test the regression coefficient at the 5% level of significance.
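Problem 5 starts from the residual sum of squares instead of a ready-made standard error. The helper below is a hypothetical convenience function, not part of any library, that chains the two steps; the critical value for 10 degrees of freedom is hard-coded from a t table.

```python
import math

def slope_t_test(b, sse, sxx, n):
    """Return (s_yx, s_b, t) for a simple regression slope tested against zero."""
    s_yx = math.sqrt(sse / (n - 2))   # standard error of estimate
    s_b = s_yx / math.sqrt(sxx)       # standard error of the slope
    return s_yx, s_b, b / s_b

# Values stated in the problem
s_yx, s_b, t_stat = slope_t_test(b=2.02, sse=153.421, sxx=657.33, n=12)

t_crit = 2.228   # assumption: t(0.025, 10) from a standard t table
print(f"Syx = {s_yx:.3f}, Sb = {s_b:.4f}, t = {t_stat:.2f}, "
      f"reject H0: {abs(t_stat) > t_crit}")
```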
● Sales of major appliances vary with the housing market: when new home sales
are good, so are the sales of dishwashers, washing machines, driers, and
refrigerators. A trade association compiled the following historical data (in
thousands of units) on major appliance sales and housing starts.
Housing starts ('000)     2.0  2.5  3.2  3.6  3.3  4.0  4.2  4.6  4.8   5.0
Appliance sales ('000)    5.0  5.5  6.0  7.0  7.2  7.7  8.4  9.0  9.7  10.0
a. Develop an equation for the relationship between appliance sales (in thousands) and
housing starts (in thousands).
b. Interpret the slope of the regression line.
c. Compute and interpret the standard error of estimate and the coefficient of
determination.
d. Compute a 95% confidence interval for 12 ('000) housing starts.
e. Compute a 90% confidence interval for the regression coefficient.
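Parts a, c and e can be checked numerically with a short sketch, treating housing starts as X and appliance sales as Y. Part d is left out because the question does not say whether a mean-response or an individual prediction interval is intended. The critical t value is hard-coded from a table.

```python
import math

starts = [2.0, 2.5, 3.2, 3.6, 3.3, 4.0, 4.2, 4.6, 4.8, 5.0]    # X, in thousands
sales = [5.0, 5.5, 6.0, 7.0, 7.2, 7.7, 8.4, 9.0, 9.7, 10.0]    # Y, in thousands

n = len(starts)
x_bar, y_bar = sum(starts) / n, sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in starts)
syy = sum((y - y_bar) ** 2 for y in sales)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(starts, sales))

# a) least-squares line
b = sxy / sxx
a = y_bar - b * x_bar

# c) standard error of estimate and coefficient of determination
sse = syy - b * sxy
s_yx = math.sqrt(sse / (n - 2))
r2 = 1 - sse / syy

# e) 90% confidence interval for the slope
s_b = s_yx / math.sqrt(sxx)
t_crit = 1.860                 # assumption: t(0.05, 8) from a standard t table
lower, upper = b - t_crit * s_b, b + t_crit * s_b

print(f"Y^ = {a:.3f} + {b:.3f} X")
print(f"Syx = {s_yx:.4f}, r^2 = {r2:.4f}")
print(f"90% CI for slope: {lower:.3f} to {upper:.3f}")
```

The slope is read as the estimated change in appliance sales (in thousands) for each additional thousand housing starts.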
● Nepal Stock Exchange Board has provided the data on ordinary and right shares
(in millions) for fiscal years 2000/01 to 2005/06 as follows:
a) For ordinary shares, develop the linear estimating equation that best describes these
data, and predict the value for fiscal year 2007.
b) Calculate the percent of trend for ordinary shares. In which year does the largest
fluctuation from trend occur?
c) Exponentially smooth the right share data using the smoothing coefficient w = 0.25.
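Part c) can be sketched as follows. The share data are not reproduced in this section, so the series below is a made-up placeholder, not the Nepal Stock Exchange figures. The sketch assumes one common convention: the first smoothed value equals the first observation, and each later value is w times the current observation plus (1 - w) times the previous smoothed value. Some textbooks initialize differently.

```python
def exponential_smooth(series, w):
    """Simple exponential smoothing: S_1 = Y_1, S_t = w*Y_t + (1 - w)*S_(t-1)."""
    if not series:
        return []
    smoothed = [series[0]]                       # initialize with the first observation
    for value in series[1:]:
        smoothed.append(w * value + (1 - w) * smoothed[-1])
    return smoothed

# Placeholder values only; substitute the right share series from the data table.
placeholder_series = [10.0, 20.0, 30.0]
result = exponential_smooth(placeholder_series, w=0.25)
print(result)   # [10.0, 12.5, 16.875]
```

A small w such as 0.25 gives heavy weight to the past smoothed value, so the smoothed series reacts slowly to changes in the data.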
Workers     1    2    3    4    5    6    7    8    9   10
(Y) Rs    120  130  135  138  142  149  155  158  160  169
(X) Rs     34   37   39   42   41   45   40   52   50   62
Passengers per 100 miles    800  780  780  660  640  600  620  620
a) Develop the estimating equation that best describes these data.
b) Predict the number of passengers per 100 miles if the ticket price were 50 cents. Use
a 95% approximate prediction interval.
Coefficient Table
Predictor    Coefficient    Standard Deviation    t
Constant     6.584          8.542                 ?
ADS          0.625          1.120                 ?
Cost         2.139          1.470                 ?
ANOVA Table
Source        df    Sum of Squares    Mean Square    F Value
Regression    ?     ?                 ?              ?
Residual      ?     143.2             ?
Total         11    453.19
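The missing entries in both tables can be filled in with a few lines of arithmetic. The sketch assumes the ANOVA table belongs to the same model as the coefficient table, so the number of predictors is k = 2 (ADS and Cost); the source does not state this link explicitly.

```python
# Coefficient table: t ratio = coefficient / standard deviation
coefficients = {
    "Constant": (6.584, 8.542),
    "ADS": (0.625, 1.120),
    "Cost": (2.139, 1.470),
}
t_ratios = {name: coef / sd for name, (coef, sd) in coefficients.items()}

# ANOVA table
sst = 453.19      # total sum of squares (given)
sse = 143.2       # residual sum of squares (given)
df_total = 11     # n - 1, so n = 12

k = 2             # assumption: two predictors, ADS and Cost
df_reg = k
df_res = df_total - k

ssr = sst - sse
msr = ssr / df_reg
mse = sse / df_res
f_value = msr / mse
r2 = ssr / sst

for name, t in t_ratios.items():
    print(f"{name:<9} t = {t:.3f}")
print(f"SSR = {ssr:.2f}, df = ({df_reg}, {df_res}), MSR = {msr:.3f}, MSE = {mse:.3f}")
print(f"F = {f_value:.2f}, R^2 = {r2:.4f}")
```

Each t ratio is compared with a t critical value on 9 degrees of freedom, and F with an F critical value on (2, 9) degrees of freedom, at whatever significance level the question specifies.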
Sales = total revenue based on the number of tickets sold (in thousands of dollars)
PROMOT = amount spent on promoting the airline in the area (in thousands of dollars)
COMP = number of competing airlines at that terminal
FREE = the percentage of passengers who flew free (for various reasons)
The regression equation is
Sales = 172.34 + 25.650 PROMOT - 13.238 COMP - 3.041 FREE
● A new game show A11 asks contestants to specify the minimum number of
parameters they need to determine whether a multiple regression model is
significant as a whole at α = 0.10. You have won the bidding with 4 parameters.
Using the information given below, determine whether the regression is significant.
R² = 0.7452
SSE = 125.4
n = 18
Number of independent variables = 3
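The four parameters are enough because the overall F statistic can be written in terms of R², n and k alone; SSE then recovers the sums of squares if the full ANOVA table is wanted. A minimal sketch, with the critical value hard-coded from an F table:

```python
r2 = 0.7452    # coefficient of determination (given)
sse = 125.4    # residual sum of squares (given)
n = 18         # sample size (given)
k = 3          # number of independent variables (given)

df_reg = k
df_res = n - k - 1

# F = (R^2 / k) / ((1 - R^2) / (n - k - 1))
f_value = (r2 / df_reg) / ((1 - r2) / df_res)

# Optional: rebuild the sums of squares from SSE and R^2
sst = sse / (1 - r2)
ssr = sst - sse

# Assumption: F(0.10; 3, 14) = 2.52, read from a standard F table.
f_crit = 2.52
significant = f_value > f_crit

print(f"df = ({df_reg}, {df_res}), F = {f_value:.2f}, critical F = {f_crit}")
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, significant: {significant}")
```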
● Coca-Cola is studying the effect of its latest advertising campaign. People chosen
at random were called and asked how many cans of Coca-Cola they had bought
in the past week and how many Coca-Cola advertisements they had either read or
seen in the past week. The data collected from different people are as follows:
ANOVA
Source        df    Sum of Squares    Mean Square    F Value
Regression    ?     35181.794         ?              ?
Residual      ?     ?                 ?
Total         25    56464.615
The ANOVA and coefficient tables obtained from SPSS software are as follows:
Unstandardized Coefficients
            bi       Standard Error    t
constant    23.60    8.30              ?
X1          0.62     0.39              ?
X2          16.97    7.86              ?
X3          -0.31    0.33              ?
ANOVA
Source        df    Sum of Squares    Mean Square    F Value
Regression    ?     388.24            ?              ?
Residual      ?     ?
Total         ?     476.90