Correlation and Regression
1. Calculate the coefficient of correlation from the following data and interpret the value.
Advertising expenditure (Rs. in lakhs) : 10 12 13 23 27 30
Sales turnover (Rs. in crores) : 40 42 40 45 48 50
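One way to check a hand calculation for Problem 1 is a short script. This is only a sketch using the deviation-from-mean form of Pearson's r; the function name `pearson_r` and the variable names are illustrative, not part of the exercise.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient from paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

advertising = [10, 12, 13, 23, 27, 30]  # Rs. in lakhs
sales = [40, 42, 40, 45, 48, 50]        # Rs. in crores

r = pearson_r(advertising, sales)
print(f"r = {r:.3f}")
```

A value near +1 would point to a strong positive linear association, a value near -1 to a strong negative one, and a value near 0 to little linear association.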
2. The following data show the experience of machine operators and their performance ratings, measured as
the number of good parts turned out per 100 pieces.
Operator : 1 2 3 4 5 6 7 8
Experience (X) : 16 12 18 4 3 10 5 12
Performance rating (Y) : 87 88 89 68 78 80 75 83
Calculate the regression line of performance ratings on experience and estimate the probable
performance if an operator has 10 years’ experience.
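The regression of Y on X in Problem 2 can be sketched as follows, assuming the usual least-squares formulas b = Sxy / Sxx and a = mean(Y) - b * mean(X). The helper `fit_line` is a hypothetical name chosen for illustration.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

experience = [16, 12, 18, 4, 3, 10, 5, 12]   # X
rating = [87, 88, 89, 68, 78, 80, 75, 83]    # Y

a, b = fit_line(experience, rating)
predicted = a + b * 10   # estimated rating at X = 10
print(f"Y = {a:.3f} + {b:.3f} X")
print(f"Estimated rating at X = 10: {predicted:.1f}")
```

Because the fitted line always passes through the point (mean of X, mean of Y), a prediction at an X value equal to the sample mean of X returns the sample mean of Y.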
3. The following table gives the aptitude test scores and the productivity indices of 6 workers selected at
random.
Aptitude Index (X) : 60 62 65 70 72 48
Productivity Index (Y) : 68 60 62 80 85 40
a) Which is the dependent variable and which is the independent variable?
b) Fit the regression of Y on X.
c) Estimate the average productivity of a worker whose test score is 82.
d) Calculate the coefficient of determination and interpret it.
e) Conduct a test to determine whether the relation between X and Y is significant.
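Parts (d) and (e) of Problem 3 can be checked with a sketch like the one below. It assumes a two-sided t test of the slope at the 5% level (the problem does not state a level) and hard-codes the table value t(0.025, 4 df) = 2.776 instead of calling a statistics library. All variable names are illustrative.

```python
import math

aptitude = [60, 62, 65, 70, 72, 48]      # X
productivity = [68, 60, 62, 80, 85, 40]  # Y
n = len(aptitude)

mean_x = sum(aptitude) / n
mean_y = sum(productivity) / n
sxx = sum((x - mean_x) ** 2 for x in aptitude)
syy = sum((y - mean_y) ** 2 for y in productivity)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(aptitude, productivity))

b = sxy / sxx                 # slope of Y on X
a = mean_y - b * mean_x       # intercept

# Coefficient of determination: share of variation in Y explained by X
r2 = sxy ** 2 / (sxx * syy)

# t statistic for H0: slope = 0, with n - 2 degrees of freedom
sse = syy - b * sxy
se_b = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
t_stat = b / se_b

T_CRIT_5PCT_DF4 = 2.776       # assumed level: 5%, two-sided, from a t table
significant = abs(t_stat) > T_CRIT_5PCT_DF4

print(f"R^2 = {r2:.3f}, t = {t_stat:.2f}, significant at 5%: {significant}")
```

Note that a test score of 82 in part (c) lies outside the observed range of X (48 to 72), so any estimate there is an extrapolation.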
4. The following is an estimated supply regression for sugar:
Y = 0.025 + 1.25 X
where Y is supply in kilos and X is price (Rs.) per kilo.
a) Interpret the coefficient of variable X.
b) Predict the supply when the price is Rs. 20 per kilo.
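A small sketch for Problem 4, using only the equation given in the problem. The function name `supply` is illustrative. It evaluates the fitted line at a price and shows that the slope is the change in predicted supply for a one-rupee change in price.

```python
def supply(price):
    """Estimated supply (kilos) from the given regression Y = 0.025 + 1.25 X."""
    return 0.025 + 1.25 * price

at_20 = supply(20)
# Slope interpretation: difference in predicted supply for a Rs. 1 price rise
change_per_rupee = supply(21) - supply(20)

print(f"Predicted supply at Rs. 20: {at_20:.3f} kilos")
print(f"Change per Rs. 1 increase: {change_per_rupee:.2f} kilos")
```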
5. The following table gives the age of cars of a certain make and the annual maintenance costs. Obtain the
regression equation for costs related to age.
3. ΣY = 340 4. Σ(X - X̄)² = 234
7. SSR = 104
What is the value of the correlation coefficient and the coefficient of determination? Interpret the value of
the coefficient of determination.
6. An automobile dealer wants to see if there is a relationship between monthly sales and the interest rate. A
random sample of 4 months was taken. The results of the sample are presented below. The estimated least
squares regression equation is
Ŷ = 75.061 - 6.254X
Monthly Sales (Y)    Interest Rate in Percent (X)
22 9.2
20 7.6
10 10.4
45 5.3
a. Obtain a measure of how well the estimated regression line fits the data (R²).
b. You want to test to see if there is a significant relationship between the interest
rate and monthly sales at the 1% level of significance. State the null and
alternative hypotheses.
c. Test the above hypotheses at the 1% level of significance.
d. Construct a 99% confidence interval for the average monthly sales for all months
with a 10% interest rate.
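The quantities asked for in Problem 6 can be sketched in a few lines. The code assumes the standard simple-regression formulas and hard-codes the two-sided 1% table value t(0.005, 2 df) = 9.925 in place of a library call. Variable names are illustrative.

```python
import math

sales = [22, 20, 10, 45]        # Y, monthly sales
rate = [9.2, 7.6, 10.4, 5.3]    # X, interest rate in percent
n = len(rate)

mean_x = sum(rate) / n
mean_y = sum(sales) / n
sxx = sum((x - mean_x) ** 2 for x in rate)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rate, sales))
sst = sum((y - mean_y) ** 2 for y in sales)

b1 = sxy / sxx                  # slope
b0 = mean_y - b1 * mean_x       # intercept

# (a) Goodness of fit
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(rate, sales))
r2 = 1 - sse / sst

# (b), (c) H0: slope = 0 against H1: slope != 0, with n - 2 = 2 df
s = math.sqrt(sse / (n - 2))    # standard error of the estimate
t_stat = b1 / (s / math.sqrt(sxx))
T_CRIT = 9.925                  # two-sided 1% value for 2 df, from a t table
reject_h0 = abs(t_stat) > T_CRIT

# (d) 99% confidence interval for mean sales at a 10% interest rate
x0 = 10
y0 = b0 + b1 * x0
se_mean = s * math.sqrt(1 / n + (x0 - mean_x) ** 2 / sxx)
ci = (y0 - T_CRIT * se_mean, y0 + T_CRIT * se_mean)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, R^2 = {r2:.4f}")
print(f"t = {t_stat:.3f}, reject H0 at 1%: {reject_h0}")
print(f"99% CI at X = 10: ({ci[0]:.2f}, {ci[1]:.2f})")
```

With only four observations there are just two degrees of freedom, so the critical value is large and the interval is correspondingly wide.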
7. You are given the following information about advertising expenditure and sales:
Mean 10 90
Standard deviation 3 12
Marks in Economics : 25 28 35 32 31 36 29 38 34 32
Marks in Statistics : 43 46 49 41 36 32 31 30 33 39
9. Find the most likely production corresponding to a rainfall of 40” from the following data:
                      Rainfall    Production
Average               30”         50 quintals
Standard deviation    5”          10 quintals
Coefficient of correlation = 0.8
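When only means, standard deviations, and r are given, as in Problem 9, the regression of Y on X can be built from the slope b = r * (SD of Y) / (SD of X). The sketch below assumes that formula; the function name `predict_from_summary` is hypothetical.

```python
def predict_from_summary(x0, mean_x, mean_y, sd_x, sd_y, r):
    """Predict Y at x0 from the regression of Y on X built from summary statistics."""
    slope = r * sd_y / sd_x
    return mean_y + slope * (x0 - mean_x)

# X = rainfall (inches), Y = production (quintals)
production = predict_from_summary(x0=40, mean_x=30, mean_y=50,
                                  sd_x=5, sd_y=10, r=0.8)
print(f"Most likely production at 40 inches of rainfall: {production:.1f} quintals")
```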
10. Find the regression equation of capacity utilization on production from the following data:
                                        Average    Standard deviation
Production (in lakh units)              35.6       10.5
Capacity utilization (in percentage)    84.8       8.5
11. The following data give the percentage of women working in five companies in the retail and trade
industry. The percentage of management jobs held by women in each company is also shown.
% Working : 67 45 73 54 61
% Management : 49 21 65 47 33
a. Develop a scatter diagram for these data with the percentage of women working in the company
as the independent variable.
b. What does the scatter diagram developed in part (a) indicate about the relationship between the
two variables?
c. Try to approximate the relationship between the percentage of women working in the company
and the percentage of management jobs held by women in that company.
d. Develop the estimated regression equation by computing the values of b0 and b1.
e. Predict the percentage of management jobs held by women in a company that has 60%
women employees.
12. To the Internal Revenue Service, the reasonableness of total itemized deductions depends on the
taxpayer’s adjusted gross income. Large deductions, which include charity and medical deductions,
are more reasonable for taxpayers with large adjusted gross incomes. If a taxpayer claims larger than
average itemized deductions for a given level of income, the chances of an IRS audit are increased.
Data (in thousands of dollars) on adjusted gross income and the average or reasonable amount of
itemized deductions follow.
Adjusted Gross Income ($1000s)    Reasonable Amount of Itemized Deductions ($1000s)
22 9.6
27 9.6
32 10.1
48 11.1
65 13.5
85 17.7
120 25.5
a. Develop a scatter diagram for these data with adjusted gross income as the independent
variable.
b. Use the least squares method to develop the estimated regression equation.
c. Predict the reasonable level of total itemized deductions for a taxpayer with an adjusted gross
income of $52,500. If this taxpayer claimed itemized deductions of $20,400, would the IRS
agent’s request for an audit appear justified? Explain.
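Part (c) of Problem 12 compares a claimed deduction with the value the fitted line predicts. A sketch of that comparison follows, working in thousands of dollars as in the data table. The variable names and the idea of reporting the difference as `excess` are illustrative choices, not an IRS rule.

```python
agi = [22, 27, 32, 48, 65, 85, 120]                    # X, $1000s
deductions = [9.6, 9.6, 10.1, 11.1, 13.5, 17.7, 25.5]  # Y, $1000s
n = len(agi)

mean_x = sum(agi) / n
mean_y = sum(deductions) / n
sxx = sum((x - mean_x) ** 2 for x in agi)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(agi, deductions))

b1 = sxy / sxx               # slope
b0 = mean_y - b1 * mean_x    # intercept

income = 52.5                # adjusted gross income of $52,500
claimed = 20.4               # claimed deductions of $20,400
predicted = b0 + b1 * income
excess = claimed - predicted # amount by which the claim exceeds the fitted value

print(f"Y = {b0:.2f} + {b1:.4f} X")
print(f"Predicted reasonable deductions: {predicted:.2f} ($1000s)")
print(f"Claim exceeds prediction by: {excess:.2f} ($1000s)")
```

A positive `excess` means the claim lies above the fitted line; whether that gap is large enough to justify an audit is the judgment the question asks for.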
13. A large city hospital conducted a study to investigate the relationship between the number of
unauthorized days that employees are absent per year and the distance (miles) between home and work
for the employees. A sample of 10 employees was selected and the following data were collected.
Distance to Work (miles)    Number of Days Absent
1 8
3 5
4 8
6 7
8 6
10 3
12 5
14 2
14 4
18 2
a. Develop a scatter diagram for these data. Does a linear relationship appear reasonable?
Explain.
b. Develop the least squares estimated regression equation that relates the distance to
work to the number of days absent.
c. Predict the number of days absent for an employee who lives 5 miles from the hospital.