
Correlation Regression

The seventh document analyzes

Uploaded by

Nikhil Tandukar
Copyright
© All Rights Reserved

DAM

1 a) Coca-Cola is studying the effect of its latest advertising campaign. People chosen at
random were called and asked how many cans of Coca-Cola they had bought in the past
week and how many Coca-Cola advertisements they had either read or seen in the past
week. The data collected from different people are as follows:

Number of ads (X)      3   7   6   6  10  12  12  13  12  13  14  15
Cans purchased (Y)    33  38  24  61  52  45  65  82  29  63  50  79

i) Plot these data.
ii) Develop the estimating equation that best describes these data.
iii) Calculate the standard error of estimate.
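
Parts (ii) and (iii) can be checked numerically. The following is a minimal sketch, assuming the usual least-squares formulas b = Sxy/Sxx, a = Ȳ - bX̄ and Se = sqrt(SSE/(n - 2)); the function name `fit_line` is only an illustrative helper, not part of the problem.

```python
import math

# Data copied from the problem statement
ads = [3, 7, 6, 6, 10, 12, 12, 13, 12, 13, 14, 15]
cans = [33, 38, 24, 61, 52, 45, 65, 82, 29, 63, 50, 79]

def fit_line(x, y):
    """Return (intercept, slope, standard error of estimate) for y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals around the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2))
    return intercept, slope, se

a, b, se = fit_line(ads, cans)
print(f"Y-hat = {a:.3f} + {b:.3f} X, Se = {se:.3f}")
```

With these data the fitted line comes out close to Ŷ = 21.83 + 2.92X with a standard error of estimate near 16 cans.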

b) A broker for a local investment firm has been studying the relationship between
increases in the price of gold (X) and her customers' requests to liquidate stocks (Y).
From a dataset based on 15 observations, the sample slope was found to be 2.9. If the
standard error of the regression slope coefficient is 0.18, is there reason to believe (at
the 0.05 significance level) that the slope has changed from its past value of 3.2?
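
One way to lay out the test in part (b) is sketched below, assuming a two-tailed t test with n - 2 degrees of freedom. The critical value 2.160 is taken from a standard t table (13 df, 0.025 in each tail) and is hard-coded rather than computed.

```python
# Two-tailed test of H0: B = 3.2 against H1: B != 3.2
b_sample = 2.9      # sample slope
b_hyp = 3.2         # past (hypothesized) slope
se_b = 0.18         # standard error of the slope
n = 15

df = n - 2
t_stat = (b_sample - b_hyp) / se_b
t_crit = 2.160      # t table value for 13 df, alpha = 0.05 two-tailed (assumed from table)

reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Since |t| is about 1.67, which is below 2.160, the sketch does not reject the hypothesis that the slope is still 3.2.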

2) A manager selects a representative sample of 24 monthly customer bills taken from
several recent heating seasons. The manager considers kilowatt hours per month (Y) as
a linear function of square feet of heated space (X1), an index of roof insulation quality
(X2), presence/absence of insulated windows (X3), mean temperature (X4), and heat
pump/electric forced air (X5).
The SPSS output is as follows:
Coefficient Table
            Unstandardized Coefficients
            bi          Standard Error    t    sig
constant    6356.17     838.701           ?    0.00000
X1          0.56038     0.15811           ?    0.0023
X2          -31.2077    8.95905           ?    0.0027
X3          -327.503    149.169           ?
X4          -113.895    16.2604           ?    0.00000
X5          -621.458    147.828           ?    0.0005

ANOVA Table
              df    Sum of Squares    Mean Square    F Value
Regression    ?     ?                 ?              ?
Residual      ?     2166000           ?
Total         23    14370000
i) Test the significance of the estimated regression coefficient of X3 at the 5%
significance level
ii) Compute the standard error of the estimate Syx
iii) Compute R² and interpret its meaning
iv) Given that X1=1295, X2=18, X3=5, X4=3, X5=1, calculate the 95% confidence interval
for the average stand by hours per week.
v) Set up the null and alternative hypothesis, carry out F-test and interpret your result
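
The blanks in the ANOVA table of problem 2 follow from SST = SSR + SSE and the degrees of freedom k and n - k - 1. A minimal sketch, assuming k = 5 predictors and n = 24 as stated, is given here; the critical value 2.101 is a t table value for 18 df and is hard-coded.

```python
import math

n, k = 24, 5
sst = 14370000.0          # total sum of squares (from the table)
sse = 2166000.0           # residual sum of squares (from the table)

ssr = sst - sse           # regression sum of squares
df_reg, df_res = k, n - k - 1
msr = ssr / df_reg
mse = sse / df_res
f_value = msr / mse
r2 = ssr / sst
syx = math.sqrt(mse)      # standard error of the estimate

# t ratio for X3: coefficient divided by its standard error
t_x3 = -327.503 / 149.169
t_crit = 2.101            # t table, 18 df, alpha = 0.05 two-tailed (assumed from table)
x3_significant = abs(t_x3) > t_crit

print(f"SSR={ssr:.0f}, MSR={msr:.0f}, MSE={mse:.2f}, F={f_value:.2f}")
print(f"R2={r2:.4f}, Syx={syx:.2f}, t(X3)={t_x3:.3f}, significant={x3_significant}")
```

The same ratio, coefficient over standard error, fills each "?" in the t column.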

3) A statistician for an American automobile manufacturer would like to develop a
statistical model for predicting delivery time (the days between the ordering of the car
and the actual delivery of the car) of custom-ordered new automobiles. A random
sample of 15 cars is selected, and the results are summarized in the following table:

Car                              1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Number of options ordered (X)    3   4   4   7   7   8   9  11  12  12  14  16  20  23  25
Delivery time in days (Y)       25  32  26  38  34  41  39  46  44  51  58  53  64  66  70

Calculation shows that

∑X=175  ∑X²=2699  ∑Y=687  ∑Y²=34305  ∑XY=9344

i) Find the correlation coefficient between the number of options ordered and the
delivery time in days, and examine if this relationship is significant at the 5% level of
significance.
ii) Find the linear regression line. Compute the residual for car 6. Next, given that
∑(Y - Ŷ)² = 153, compute the standard error of the estimate, Syx, and interpret its
meaning.
iii) Given that ∑(X - X̄)² = 657.33, test the regression coefficient at the 5% level of significance
iv) Compute the 95% prediction interval of the delivery time for a car with 14 options
ordered.
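
The summary sums given for problem 3 are enough to work parts (i), (ii) and (iv). The sketch below assumes the standard shortcut formulas (Sxx = ∑X² - (∑X)²/n and so on) and a hard-coded t table value of 2.160 for 13 df; the values X = 8, Y = 41 for car 6 are read from the data table.

```python
import math

n = 15
sum_x, sum_x2 = 175, 2699
sum_y, sum_y2 = 687, 34305
sum_xy = 9344

x_bar, y_bar = sum_x / n, sum_y / n
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

# (i) correlation coefficient and its t statistic
r = sxy / math.sqrt(sxx * syy)
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# (ii) regression line, residual for car 6, standard error of estimate
b = sxy / sxx
a = y_bar - b * x_bar
resid_car6 = 41 - (a + b * 8)
sse = syy - b * sxy
syx = math.sqrt(sse / (n - 2))

# (iv) 95% prediction interval for an individual car with 14 options
t_crit = 2.160            # t table, 13 df, two-tailed 0.05 (assumed from table)
x0 = 14
y0 = a + b * x0
half_width = t_crit * syx * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

print(f"r={r:.4f}, t={t_r:.2f}, Y-hat={a:.3f}+{b:.3f}X")
print(f"residual(car 6)={resid_car6:.3f}, SSE={sse:.2f}, Syx={syx:.3f}")
print(f"PI at X=14: {y0 - half_width:.2f} to {y0 + half_width:.2f}")
```

The computed SSE is about 153.4, which agrees with the rounded value 153 quoted in part (ii).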

4) In a regression problem with a sample size of 25, the slope was found to be 1.12
and the standard error of estimate is 8.516. The quantity (∑X² - nX̄²) = 327.52.
i) Find the standard error of the regression slope coefficient.
ii) Test whether the regression coefficient is different from 0 at the 0.05 significance
level.
iii) Construct a 95% confidence interval for the population slope.
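
A short sketch of problem 4, assuming Sb = Se / sqrt(∑X² - nX̄²) and a two-tailed t test with n - 2 = 23 degrees of freedom; the critical value 2.069 is a hard-coded t table value.

```python
import math

n = 25
b = 1.12                  # sample slope
se = 8.516                # standard error of estimate
sxx = 327.52              # sum of X^2 minus n times X-bar^2

# (i) standard error of the slope
sb = se / math.sqrt(sxx)

# (ii) test H0: B = 0 against H1: B != 0
t_stat = b / sb
t_crit = 2.069            # t table, 23 df, two-tailed 0.05 (assumed from table)
reject_h0 = abs(t_stat) > t_crit

# (iii) 95% confidence interval for the population slope
lower = b - t_crit * sb
upper = b + t_crit * sb

print(f"Sb={sb:.4f}, t={t_stat:.3f}, reject H0: {reject_h0}")
print(f"95% CI for B: {lower:.3f} to {upper:.3f}")
```

The interval excludes zero, which is consistent with rejecting H0 in part (ii).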
5) In a regression equation Ŷ = 22.22 + 2.02X with a sample size of 12, given that
∑(Y - Ŷ)² = 153.421 and ∑(X - X̄)² = 657.33, compute the standard error of estimate, then
test the regression coefficient at the 5% level of significance.
● Sales of major appliances vary with the housing market: when new home sales
are good, so are the sales of dishwashers, washing machines, driers, and
refrigerators. A trade association compiled the following historical data (in
thousands of units) on major appliance sales and housing starts.

Housing starts ('000)     2.0   2.5   3.2   3.6   3.3   4.0   4.2   4.6   4.8   5.0
Appliance sales ('000)    5.0   5.5   6.0   7.0   7.2   7.7   8.4   9.0   9.7  10.0

a. Develop an equation for the relationship between appliance sales (in thousand) and
housing starts (in thousands).
b. Interpret the slope of the regression line.
c. Compute and interpret the standard error of estimate and coefficient of
determination.
d. Compute the 95% confidence interval for appliance sales when housing starts are 12 ('000).
e. Compute the 90% confidence interval for the regression coefficient.

● The Nepal Stock Exchange Board has provided data on ordinary and right shares
(in millions) for fiscal years 2000/01 to 2005/06 as follows:

FY                2000   2001   2002   2003   2004   2005
Ordinary share     279    320    394    657    377    579
Right share        132    622    162     70    949   1013

a) For ordinary shares, develop the linear estimating equation that best describes these
data and predict the value for the fiscal year 2007.
b) Calculate the percent of trend for ordinary shares. In which year does the largest
fluctuation from trend occur?
c) Exponentially smooth the right share data using smoothing coefficient (w=0.25).
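
Part (c) can be sketched as follows, assuming the common convention E1 = Y1 and Ei = w·Yi + (1 - w)·E(i-1). Other textbooks initialize the series differently, so the starting value here is an assumption; `exp_smooth` is an illustrative helper name.

```python
def exp_smooth(series, w):
    """Exponentially smooth a series with smoothing coefficient w.

    Assumes the first smoothed value equals the first observation.
    """
    smoothed = [float(series[0])]
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed

# Right share data (in millions) for FY 2000 to 2005, from the table
right_share = [132, 622, 162, 70, 949, 1013]
smoothed = exp_smooth(right_share, w=0.25)

for fy, y, e in zip(range(2000, 2006), right_share, smoothed):
    print(f"{fy}: actual={y:5d}  smoothed={e:8.2f}")
```

With w = 0.25 the smoothed series reacts slowly to the jumps in 2001 and 2004, since each new observation only carries a quarter of the weight.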

Workers     1    2    3    4    5    6    7    8    9   10
(Y) Rs    120  130  135  138  142  149  155  158  160  169
(X) Rs     34   37   39   42   41   45   40   52   50   62

Calculation shows that

∑X=442  ∑X²=20164  ∑Y=1456  ∑Y²=214084  ∑XY=65372


i) Calculate the correlation coefficient between the daily wages and the amount of monthly
rent payments, and examine if this relationship is significant at the 5% level of significance.
ii) Estimate the best fitting regression line. Compute the residual for unskilled worker 6.
Also compute the standard error of the estimate Syx and interpret its meaning.
iii) Compute the coefficient of determination and interpret its meaning.
iv) Test the significance of the regression coefficient at the 5% level of significance.
v) Compute the 95% prediction interval for the amount of monthly payments (Y) of
unskilled worker 8.
● The following data represent the annual number of employees (in thousands) in
an oil supply company for the years 1985-1994
(2+5+4+4+5)

Year      1985  1986  1987  1988  1989  1990  1991  1992
Number      65    89   114   200   250   300   314   444

i) Plot the above data.
ii) Find the linear estimating equation that best describes the data.
iii) Calculate the percent of trend. In which year does the largest fluctuation occur?
iv) Also develop the second-degree equation.
v) Find the TAD, MAD, & MSE for the above data by using part (ii)
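
Parts (ii), (iii) and (v) can be sketched together. The statement mentions 1985-1994, but only eight values are listed, so this sketch uses the eight listed years (1985-1992). It also assumes TAD is the sum of absolute errors, MAD = TAD/n and MSE = (sum of squared errors)/n; the second-degree fit of part (iv) is not covered.

```python
years = list(range(1985, 1993))
employees = [65, 89, 114, 200, 250, 300, 314, 444]   # in thousands
n = len(years)

# (ii) least-squares linear trend, Y-hat = a + b * year
t_bar = sum(years) / n
y_bar = sum(employees) / n
stt = sum((t - t_bar) ** 2 for t in years)
sty = sum((t - t_bar) * (y - y_bar) for t, y in zip(years, employees))
b = sty / stt
a = y_bar - b * t_bar
trend = [a + b * t for t in years]

# (iii) percent of trend and the year furthest from 100%
pct_trend = [100 * y / yh for y, yh in zip(employees, trend)]
worst_year = max(zip(years, pct_trend), key=lambda p: abs(p[1] - 100))[0]

# (v) error measures for the linear trend
errors = [y - yh for y, yh in zip(employees, trend)]
tad = sum(abs(e) for e in errors)
mad = tad / n
mse = sum(e ** 2 for e in errors) / n

print(f"slope per year = {b:.3f}, largest fluctuation in {worst_year}")
print(f"TAD={tad:.2f}, MAD={mad:.2f}, MSE={mse:.2f}")
```

Centering the years on their mean before computing the slope keeps the arithmetic well conditioned even though the raw year values are large.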

● A study by the Atlanta, Georgia, Department of Transportation on the effect of
bus ticket prices on the number of passengers produced the following
results:

Ticket price (cents)        25   30   35   40   45   50   55   60
Passengers per 100 miles   800  780  780  660  640  600  620  620
a) Develop the estimating equation that best describes these data
b) Predict the number of passengers per 100 miles if the tickets price were 50 cents. Use
a 95 % approximate prediction interval.
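
For part (b), an approximate prediction interval is usually taken as Ŷ ± t·Se, ignoring the extra terms of the exact formula. The sketch below makes that assumption and hard-codes the t table value 2.447 for 6 df.

```python
import math

price = [25, 30, 35, 40, 45, 50, 55, 60]              # cents
passengers = [800, 780, 780, 660, 640, 600, 620, 620]  # per 100 miles
n = len(price)

x_bar = sum(price) / n
y_bar = sum(passengers) / n
sxx = sum((x - x_bar) ** 2 for x in price)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(price, passengers))

# a) estimating equation
b = sxy / sxx
a = y_bar - b * x_bar

# Standard error of estimate
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(price, passengers))
se = math.sqrt(sse / (n - 2))

# b) approximate 95% prediction interval at a 50-cent ticket price
t_crit = 2.447            # t table, 6 df, two-tailed 0.05 (assumed from table)
y_hat_50 = a + b * 50
lower = y_hat_50 - t_crit * se
upper = y_hat_50 + t_crit * se

print(f"Y-hat = {a:.2f} + ({b:.3f}) X, Se = {se:.2f}")
print(f"Prediction at 50 cents: {y_hat_50:.1f} ({lower:.1f} to {upper:.1f})")
```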
Coefficient Table
Predictor    Coefficient    Standard Deviation    t
Constant     6.584          8.542                 ?
ADS          0.625          1.120                 ?
Cost         2.139          1.470                 ?

ANOVA Table
Source        df    Sum of Squares    Mean Square    F Value
Regression    ?     ?                 ?              ?
Residual      ?     143.2             ?
Total         11    453.19

a) Complete the above coefficient table and ANOVA table.
b) Find the standard error of estimate and the coefficient of multiple determination.
c) Write down the regression equation of Sales on Ads and Cost.
d) Predict the value of Sales when Ads is 20 and Cost is 30. Also construct a 90%
confidence interval.
A New-England-based ABC airline has taken a survey of its 15 terminals and has
obtained the following data for the month of September, where

Sales = total revenue based on number of tickets sold (in thousands of dollars)
PROMOT = amount spent on promoting the airline in the area (in thousands of dollars)
COMP = number of competing airlines at that terminal
FREE = the percentage of passengers who flew free (for various reasons)
The regression equation is
Sales = 172.34 + 25.650 PROMOT - 13.238 COMP - 3.041 FREE

Predictor    Coefficient    Standard Deviation
Constant     172.34         51.38
PROMOT       25.650         4.877
COMP         -13.238        3.686
FREE         -3.041         2.342
a) Do the passengers who fly free cause sales to decrease significantly? Use α = 0.05.
b) Does an increase in promotions by 1000 dollars change sales by 28000 dollars, or is
the change significantly different from 28000 dollars? Use α = 0.10.
c) Give a 90 percent confidence interval for the slope coefficient of COMP
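
The three parts of the airline problem all use df = n - k - 1 = 15 - 3 - 1 = 11. A minimal sketch, assuming a lower-tailed test in (a), a two-tailed test in (b), and the t table value 1.796 (0.05 in one tail, 11 df), which serves all three parts:

```python
n, k = 15, 3
df = n - k - 1
t_05 = 1.796              # t table, 11 df, 0.05 in one tail (assumed from table)

# a) H0: B_FREE = 0 against H1: B_FREE < 0, alpha = 0.05 (lower-tailed)
t_free = -3.041 / 2.342
free_decreases_sales = t_free < -t_05

# b) H0: B_PROMOT = 28 against H1: B_PROMOT != 28, alpha = 0.10 (two-tailed)
t_promot = (25.650 - 28) / 4.877
promot_differs = abs(t_promot) > t_05

# c) 90% confidence interval for the COMP slope
comp_lower = -13.238 - t_05 * 3.686
comp_upper = -13.238 + t_05 * 3.686

print(f"df={df}, t(FREE)={t_free:.3f}, significant decrease: {free_decreases_sales}")
print(f"t(PROMOT)={t_promot:.3f}, differs from 28: {promot_differs}")
print(f"90% CI for COMP: {comp_lower:.3f} to {comp_upper:.3f}")
```

The slope 28 in part (b) reflects the units: both Sales and PROMOT are measured in thousands of dollars.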

● A new game show A11 asks contestants to specify the minimum number of
parameters they need to determine whether a multiple regression model is
significant as a whole at α = 0.10. You have won the bidding with 4 parameters.
Using the information given below, determine whether the regression is significant.
R² = 0.7452
SSE= 125.4
n= 18
Number of independent variables= 3
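
The overall F statistic can be obtained from R² alone via F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below assumes this identity and a hard-coded F table value of 2.52 for (3, 14) degrees of freedom at α = 0.10.

```python
r2 = 0.7452
sse = 125.4
n, k = 18, 3

df_reg, df_res = k, n - k - 1

# F statistic computed directly from R-squared
f_value = (r2 / df_reg) / ((1 - r2) / df_res)

# The same F via the sums of squares, recovering SST from SSE and R-squared
sst = sse / (1 - r2)
ssr = sst - sse
f_check = (ssr / df_reg) / (sse / df_res)

f_crit = 2.52             # F table, (3, 14) df, alpha = 0.10 (assumed from table)
significant = f_value > f_crit

print(f"F = {f_value:.3f} (check {f_check:.3f}), df = ({df_reg}, {df_res})")
print(f"Regression significant at 0.10: {significant}")
```

Both routes give the same F, which shows that SSE is not strictly needed once R², n and k are known.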

● Coca-Cola is studying the effect of its latest advertising campaign. People chosen
at random were called and asked how many cans of Coca-Cola they had bought
in the past week and how many Coca-Cola advertisements they had either read or
seen in the past week. The data collected from different people are as follows:

Number of ads (X)      3   7   6   6  10  12  12  13  12  13  14  15
Cans purchased (Y)    33  38  24  61  52  45  65  82  29  63  50  79

i) Plot these data.
ii) Develop the estimating equation that best describes these data.
iii) Calculate the standard error of estimate.
iv) Calculate the coefficient of determination and interpret its meaning.

From the following SPSS output:

            Unstandardized Coefficients
            bi          Standard Error    t
constant    -330.832    110.895           ?
X1          1.246       0.412             ?
X2          -0.118      0.054             ?
X3          -0.297      0.118             ?
X4          0.131       0.059             ?

ANOVA
              df    Sum of Squares    Mean Square    F Value
Regression    ?     35181.794         ?              ?
Residual      ?     ?                 ?
Total         25    56464.615

i) Test the significance of the estimated regression coefficient of X3 at the 5%
significance level
ii) Compute the standard error of the estimate Syx
iii) Compute R² and interpret its meaning
iv) Given that X1=319, X2=449, X3=279, X4=1813, calculate the 95% confidence interval
for Y.
v) Set up the null and alternative hypotheses, carry out the F-test and interpret your result

● A health research team collects data on ten communities. Measurements are
obtained on the following variables:

Y = health-care facility utilization index
X1 = median family income
X2 = proportion of workers with health insurance
X3 = doctor population ratio

The ANOVA and coefficient tables obtained from SPSS software are as follows:

            Unstandardized Coefficients
            bi       Standard Error    t
constant    23.60    8.30              ?
X1          0.62     0.39              ?
X2          16.97    7.86              ?
X3          -0.31    0.33              ?

ANOVA
              df    Sum of Squares    Mean Square    F Value
Regression    ?     388.24            ?              ?
Residual      ?     ?
Total         ?     476.90

i) Complete the above coefficient table and ANOVA table
ii) Fit a multiple regression model and predict the value of Y when X1=15, X2=22, X3=25
iii) Is there any significant relationship between the dependent variable and the three independent
variables? (Test at the 5% significance level)
iv) Test the significance of the estimated regression coefficient of X2 at the 5%
significance level
v) What proportion of the variation in the health-care facility utilization index (Y) is explained by
the three independent variables?
vi) Compute and interpret the standard error of estimate
