Big Ipl

The document discusses regression models for the factors that influence the price at which players are sold in the Indian Premier League cricket tournament. A simple linear regression shows that strike rate alone explains only 2.6% of the variation in sold price. The ability to hit "sixers" explains more, about 19.7%. A multiple regression on strike rate and sixers together explains about 19.1%. An age indicator (under 25 versus older) explains only 2.6%. Players of Indian origin have a higher mean sold price than players from other countries. The best of the four candidate models for franchises reaches an R-squared of 31.57%.

Uploaded by

Nishant Kumar

PRICING OF PLAYERS IN THE INDIAN PREMIER LEAGUE

CASE QUESTIONS
1. Develop a simple linear regression model between the sold price and batting strike rate.
Is there a statistically significant relationship between sold price and batting strike rate?
Answer:
Equation of the estimated regression line:

Y = β0 + β1 (X)

Sold Price = β0 + β1 (Strike rate)

β0 = 289510.4

β1 = 2086.5

So,

Sold Price = 289510.4 + 2086.5* (Strike rate)

We get R2 = 0.02641

Variation in strike rate therefore explains only about 2.6% of the variation in sold price, so the two are not closely related. As a rule of thumb, variables with a dependency of 70% or more are considered closely related. With a dependency of only 2.6%, we can conclude that factors other than strike rate affect sold price far more.
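
The estimation step above can be sketched in a few lines of Python. This is only an illustration of how an intercept, a slope and R2 are obtained by least squares; the function name `fit_simple_ols` and the toy numbers are assumptions made for the example and are not the IPL data used in the case.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form least-squares estimates for a single predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


# Hypothetical toy values, NOT the IPL data set.
strike_rate = [95.0, 110.0, 120.0, 130.0, 145.0]
sold_price = [300000.0, 650000.0, 250000.0, 900000.0, 500000.0]

b0, b1, r2 = fit_simple_ols(strike_rate, sold_price)
print(f"Sold Price = {b0:.1f} + {b1:.1f} * (Strike rate), R2 = {r2:.4f}")
```

A low R2 from such a fit means the line leaves most of the variation in the response unexplained, which is the situation reported for strike rate.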
2. What is the impact of ability to score “SIXERS” on the player’s price?

Answer:

Equation of the estimated regression line:

Y = β0 + β1 (X)

Sold Price = β0 + β1 (Sixers)

β0 = 385115

β1 = 7693

So,

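Sold Price = 385115 + 7693 * (Sixers)

We get R2 = 0.1968

Each additional six is therefore associated with an increase of about 7693 in the predicted sold price, and variation in sixers explains about 19.7% of the variation in sold price. Sixers are a stronger predictor than strike rate (2.6%), but they still leave most of the variation unexplained.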
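
As a small worked example, the fitted line can be evaluated for a few values of sixers. The coefficients are the ones reported in this answer; the function name and the example counts of sixes are chosen only for illustration.

```python
SIXERS_INTERCEPT = 385115.0  # beta0 reported in the answer above
SIXERS_SLOPE = 7693.0        # beta1 reported in the answer above


def predict_price_from_sixers(sixers):
    """Predicted sold price from the simple regression on sixers."""
    return SIXERS_INTERCEPT + SIXERS_SLOPE * sixers


# Illustrative inputs only; they are not taken from the data set.
for sixes in (0, 10, 50):
    print(sixes, predict_price_from_sixers(sixes))
```

The difference between any two predictions is the slope times the difference in sixes, which is what the coefficient 7693 expresses.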
3. Develop a multiple linear regression model between sold price, batting strike rate
and sixers. What do you conclude from this model?

Answer:

Equation of the estimated regression line:

Y = β0 + β1 (X1) + β2(X2)

Sold Price = β0 + β1 (strike rate) + β2 (Sixers)

β0 = 395327.0

β1 = -102.4

β2 = 7758.7

So,

Sold Price = 395327.0 + (-102.4) * (Strike rate) + 7758.7 * (sixers)

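We get R2 = 0.1906

Batting strike rate and sixers together explain only about 19.06% of the variation in sold price. Once sixers are in the model, the coefficient on strike rate is small and negative (-102.4), so strike rate adds essentially nothing beyond what sixers already explain. Because an ordinary R2 cannot fall when a predictor is added, and this value is slightly below the 0.1968 of the sixers-only model, the 0.1906 reported here is presumably the adjusted R2.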
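
To see how little strike rate contributes in this model, the reported coefficients can be plugged into a short Python sketch. The function name and the input values are illustrative assumptions; only the three coefficients come from the answer above.

```python
INTERCEPT = 395327.0       # beta0 reported above
BETA_STRIKE_RATE = -102.4  # beta1 reported above
BETA_SIXERS = 7758.7       # beta2 reported above


def predict_sold_price(strike_rate, sixers):
    """Predicted sold price from the two-predictor model."""
    return INTERCEPT + BETA_STRIKE_RATE * strike_rate + BETA_SIXERS * sixers


# Illustrative player: strike rate 120 and 20 sixes (not from the data set).
base = predict_sold_price(120.0, 20)
# Change one predictor at a time while holding the other fixed.
effect_of_10_strike_rate_points = predict_sold_price(130.0, 20) - base
effect_of_one_more_six = predict_sold_price(120.0, 21) - base
print(base, effect_of_10_strike_rate_points, effect_of_one_more_six)
```

Holding sixers fixed, ten extra points of strike rate move the prediction by about -1024, while a single extra six moves it by about 7758.7.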

4. Cricket in the T20 format is considered a young man's sport. Is there evidence that the
player's price is influenced by age?

Answer:

Age is coded as an indicator variable: 1 for players under 25 (Category 1) and 0 for all other players.

Equation of the estimated regression line:

Y = β0 + β1 (X)

Sold Price = β0 + β1 (Age)

β0 = 493290

β1 = 226961

So,

Sold Price = 493290 + 226961*(Age)

We get R2= 0.02631

The age indicator explains only about 2.6% of the variation in sold price. Players under 25 have a higher predicted price (493290 + 226961 = 720251) than older players (493290), but overall the sold price hardly depends on the player's age.

5. Are players of Indian origin paid more than players from other countries?

Answer:

Calculating mean values for each category of country of origin

In the given data, a column was added in which the country each cricketer belongs to was coded into
two categories:
Player of Indian Origin: represented by A

Others: represented by B (clubbed into one category)

The mean was calculated for each category separately. The mean sold price is Rs. 652339.6 for
Category A and Rs. 430974 for Category B.

Regression Model between Sold Price and Country of Origin

Result

The linear regression model with the indicator variable for the given sample is:

Mean Sold Price = β1 + β2 * (A)

Solving the equation, we get β1 = 430974 and β2 = 221366.

Interpreting the indicator variable:

A = 1 if the player is of Indian origin, and A = 0 if the player is from any other country.

Mean Sold Price (A = 1) = 430974 + 221366 * (1) = 652340

Therefore, the mean sold price when the player is of Indian origin is Rs. 6,52,340.

The other-countries category (B) serves as the reference or baseline. The mean sold price for
players who are not of Indian origin is therefore obtained by setting A to 0 in the above equation.

Mean Sold Price (A = 0) = 430974 + 221366 * (0) = 430974

These values match the individual category means calculated earlier.

The same indicator coding applies to the three age categories, with Category 1 as the baseline:

X2 = 1 if the individual's age category equals 2, 0 otherwise
X3 = 1 if the individual's age category equals 3, 0 otherwise

Mean Sold Price = 720250 - 235715 * (X2) - 200071 * (X3)

For Category 2 (X2 = 1, X3 = 0): Mean Sold Price = 720250 - 235715 * (1) - 200071 * (0) = 484535

Therefore, the mean sold price when the individual's age lies in Category 2 is Rs. 4,84,535.
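
The link between an indicator regression and category means can be checked with a short Python sketch. The toy prices and the helper name `fit_line` are assumptions made for the illustration, not the case data. It shows that a least-squares fit on a 0/1 indicator returns the mean of the baseline group as the intercept and the difference in group means as the slope.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical toy data: 1 = Indian origin (A), 0 = other countries (B).
is_indian = [1, 1, 0, 0, 0]
price = [700000.0, 600000.0, 400000.0, 500000.0, 300000.0]

b0, b1 = fit_line(is_indian, price)

# Category means computed directly, for comparison with the regression.
prices_b = [p for a, p in zip(is_indian, price) if a == 0]
prices_a = [p for a, p in zip(is_indian, price) if a == 1]
mean_b = sum(prices_b) / len(prices_b)
mean_a = sum(prices_a) / len(prices_a)

print(b0, mean_b)       # intercept vs. mean of baseline group B
print(b0 + b1, mean_a)  # intercept + slope vs. mean of group A
```

This is why 430974 + 221366 reproduces the Category A mean in the answer above.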

Analysis of Regression Model:

In the given result, the p-value of the F-statistic is 0.002015, which is well below 0.05 and
hence highly significant. This means that there is a statistically significant relationship between
sold price and the country the cricketer belongs to.

In other words, a change in the country category is significantly associated with a change
in sold price.
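
For reference, the F-statistic behind such a p-value can be written in terms of R-squared. The sketch below uses the standard formula for a regression with k predictors fitted on n observations. The sample size n is not stated in this answer, so it is left symbolic, and 0.06481 is the R-squared value reported for this model.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)},
\qquad
\text{with } k = 1:\quad
F = \frac{0.06481\,(n - 2)}{1 - 0.06481}
```

Under the null hypothesis of no relationship, F follows an F distribution with (k, n - k - 1) degrees of freedom, and the p-value is its upper-tail probability. This is how a small R-squared can still be statistically significant when n is large enough.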

Model accuracy assessment

R-squared:

In the given result, the R-squared value is 0.06481, i.e., 6.481%, which is very low. This
implies that the country the cricketer belongs to explains only 6.481% of the variation in sold
price. Thus, the predictor variable and the outcome variable are not closely related, even though
the relationship is statistically significant. As a rule of thumb, variables with a dependency of
70% or more are considered closely related. It can be concluded that other factors also affect
the sold price.

6. Develop a model which can be used by franchises to predict the sold price.

To develop the best model which can be used by Franchises to predict the Sold Price, four models
have been created – modelOpt1, modelOpt2, modelOpt3 and modelOpt4.

Model comparison: p-value of the F-statistic and R-squared (adjusted), each with its interpretation

Option 1
p-value: 0.5516. More than 0.05, hence not significant. There is no significant relationship between any of the predictor variables and the outcome variable.
R-squared: -0.78%. A negative adjusted R2 appears when the residual sum of squares approaches the total sum of squares, which means the model explains very little or nothing of the response. A negative adjusted R2 therefore indicates that the explanatory variables are insignificant.

Option 2
p-value: 1.548e-05. Well below 0.05, hence highly significant. This means that at least one of the predictor variables is significantly related to the outcome variable.
R-squared: 14.68%. Extremely low. Thus, the predictor variables and the outcome variable are not closely related.

Option 3
p-value: 3.742e-07. Well below 0.05, hence highly significant. This means that at least one of the predictor variables is significantly related to the outcome variable.
R-squared: 23.17%. Low. Thus, the predictor variables and the outcome variable are not closely related.

Option 4
p-value: 4.91e-07. Well below 0.05, hence highly significant. This means that at least one of the predictor variables is significantly related to the outcome variable.
R-squared: 31.57%. Low. Thus, the predictor variables and the outcome variable are not closely related.

In the given result, the p-value of the F-statistic in Option 1 is 0.5516, far above 0.05. This
shows that there is no significant relationship between any of the predictor (independent)
variables and the outcome variable. The adjusted R2 is also negative, which indicates that the
model explains very little or nothing of the response. Option 1 can therefore be removed from
the four options.

In contrast, the p-values of Option 2, Option 3 and Option 4 are all well below 0.05, so they
represent highly significant relationships. This means that one or more predictor variables are
significantly related to the outcome variable.
Comparing these three options, the p-value of Option 3 is the smallest, i.e., farthest below 0.05.
On that criterion, model 3 is better than the other models. But the adjusted R2 is highest in
Option 4, and on that criterion model 4 is better than the other three models.
The p-values of Option 3 and Option 4 (3.742e-07 and 4.91e-07) differ very little.
This contradiction in the selection of the best model between Option 3 and Option 4 may be due to
one or more insignificant variables in the models. In that case it is better to remove such
insignificant variables to show the best relation between the dependent and independent variables.
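
The comparison above can be written out as a short Python sketch using the values from the table. The selection rule used here (keep the models significant at 0.05, then prefer the highest adjusted R2) is one possible reading of the discussion and is an assumption of the example, as are the dictionary layout and variable names.

```python
# p-value of the F-statistic and adjusted R2 as reported in the comparison table.
models = {
    "Option 1": {"p_value": 0.5516, "adj_r2": -0.0078},
    "Option 2": {"p_value": 1.548e-05, "adj_r2": 0.1468},
    "Option 3": {"p_value": 3.742e-07, "adj_r2": 0.2317},
    "Option 4": {"p_value": 4.91e-07, "adj_r2": 0.3157},
}

ALPHA = 0.05  # significance level used throughout the answers

# Step 1: drop models whose overall F-test is not significant.
significant = {name: m for name, m in models.items() if m["p_value"] < ALPHA}

# Step 2 (assumed rule): among the rest, prefer the highest adjusted R2.
best_by_adj_r2 = max(significant, key=lambda name: significant[name]["adj_r2"])

# Alternative criterion discussed in the text: the smallest p-value.
best_by_p_value = min(significant, key=lambda name: significant[name]["p_value"])

print(sorted(significant))
print(best_by_adj_r2, best_by_p_value)
```

The two criteria disagree (Option 4 versus Option 3), which is the contradiction described in the text.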
Model accuracy assessment

The overall quality of the model can be assessed by examining the R-squared (R2) and Residual
Standard Error (RSE).
R-squared:
In multiple linear regression, R represents the correlation coefficient between the observed
values of the outcome variable (y) and the fitted (i.e., predicted) values of y. For this reason, the
value of R will always be positive and will range from zero to one. R2 is the square of this correlation.
R2 represents the proportion of variance in the outcome variable y that may be predicted by
knowing the values of the x variables. An R2 value close to 1 indicates that the model explains a
large portion of the variance in the outcome variable.
A problem with R2 is that it never decreases when more variables are added to the model,
even if those variables are only weakly associated with the response. A solution is to adjust the R2
by taking into account the number of predictor variables.
The adjustment in the "Adjusted R Square" value in the summary output is a correction for the
number of x variables included in the prediction model.
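
The correction described above is usually computed as adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. A minimal Python sketch follows; the function name and the example inputs are illustrative assumptions, and the sample size of the IPL data is not given in this document.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R2 for a model with an intercept and n_predictors x variables."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Illustrative values only: the penalty grows with the number of predictors.
print(adjusted_r_squared(0.50, 11, 1))
print(adjusted_r_squared(0.50, 11, 5))
# With a very weak fit the adjusted value can drop below zero.
print(adjusted_r_squared(0.00, 11, 2))
```

The last call shows how a negative adjusted R2, as reported for Option 1, can arise when the predictors explain almost nothing.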

In the given result for the model with strike rate and sixers, the R-squared value is 0.1906, i.e.,
19.06%, which is very low. This implies that batting strike rate and sixers together explain only
19.06% of the variation in sold price. Thus, the predictor variables and the outcome variable are
not closely related. As a rule of thumb, variables with a dependency of 70% or more are considered
closely related. It can be concluded that other factors also affect the sold price.
