Capstone Presentation: Cricket Win Prediction

By
Anil Ulchala
PGPDSBA Online May_A 2021
Business Problem Understanding
Business Problem:
BCCI wants to increase Team India's probability of winning and has tied up with a data analytics consultant for this purpose. The major objective of the tie-up is to extract actionable insights from the historical match data and make strategic changes that help India win.
The primary objective is to create machine learning models that correctly predict a win for the Indian cricket team. Once a model is developed, actionable insights and recommendations are to be extracted from it.
Constraints:
• The data set provided is imbalanced: 83% of the data relates to matches India won and the remaining 17% to matches India lost. This may bias the model's output towards predicting a win.
• Data for some formats is not available. For example, the problem requires a winning strategy against Australia in the T20 format, but the data set has no record of India playing a T20 against Australia. This constraint prevents splitting the data by format for analysis, so three separate models based on format type cannot be built.
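To make the imbalance constraint concrete, the sketch below computes the accuracy of a trivial "always predict a win" rule for the 83:17 split stated above, along with inverse-frequency class weights that are one common way to counter such a skew. The function names and the use of 100 matches as the total are illustrative assumptions; the weighting is not something the presentation says was applied.

```python
def majority_baseline_accuracy(n_win, n_loss):
    """Accuracy of a rule that always predicts the more frequent class."""
    return max(n_win, n_loss) / (n_win + n_loss)


def inverse_frequency_weights(n_win, n_loss):
    """Class weights n_samples / (n_classes * class_count).

    This is the same formula scikit-learn uses for class_weight="balanced".
    """
    total = n_win + n_loss
    return {"win": total / (2 * n_win), "loss": total / (2 * n_loss)}


# Assumption: 100 matches split 83:17, matching the stated percentages.
baseline = majority_baseline_accuracy(83, 17)
weights = inverse_frequency_weights(83, 17)

print(f"Always-win baseline accuracy: {baseline:.2f}")
print(f"Balanced class weights: {weights}")
```

Any model accuracy is best read against this baseline, because a model that learned nothing would already score 83% on such data.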
Scope:
The scope of the business problem is to draw actionable insights and recommendations from the data set and to build models that predict the result for Team India.
Objectives:
The main objective is to provide a winning strategy for the upcoming matches India will play against its opponents. The strategy should be different when a match is played against the same opponent with the same parameters.
Data Set and Dictionary

Variable                 Description
Audience_number          Total audience in the stadium
Game_number              Unique ID for each match
Offshore                 Whether the match was played within the country or outside it
Result                   Final result of the match
Max_run_scored_1over     Maximum runs scored in one over by the team
Avg_team_Age             Average age of the playing 11 for that match
Max_wicket_taken_1over   Maximum wickets taken in one over by the team
Match_light_type         Type of match: day, night, or day and night
Extra_bowls_bowled       Total number of extras bowled by the team
Match_format             Format of the match: T20, ODI, or Test
Min_run_given_1over      Minimum runs given by the bowler in one over
Bowlers_in_team          Number of full-time bowlers played in the team
Min_run_scored_1over     Minimum runs scored in one over by the team
Wicket_keeper_in_team    Number of full-time wicket-keepers played in the team
Max_run_given_1over      Maximum runs given by the bowler in one over
All_rounder_in_team      Number of full-time all-rounders played in the team
extra_bowls_opponent     Total number of extras bowled by the opponent
First_selection          First innings of the team: batting or bowling
player_highest_run       Highest score in the match by one player
Opponent                 Opponent team in the match
Players_scored_zero      Number of players out for zero runs
Season                   Season in the city where the match was played
player_highest_wicket    Highest number of wickets taken by a single player in the match
Modelling Approach Used & Why
The data is preprocessed, e.g. cleaned, and unwanted variables are deleted

Considering the constraints on the data set, only one model is built for the complete data set and used according to the strategy required

One-hot encoding is applied to the object variables and to the necessary dependent variables

The 'Result' variable is taken as the target variable

The data is divided into train and test samples (70:30)

The necessary models are imported from Sklearn

GridSearchCV is used to tune the hyperparameters of the models for the best accuracy

The models are compared based on model metrics

The model is finalized based on the model metrics: Logistic Regression
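The steps above can be sketched roughly as follows. This is a minimal illustration, not the presentation's actual code: the data is synthetic, only two of the dictionary's variables are used, the rule linking opponent extras to the result is invented for the example, and the grid of C values and the stratified split are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data (NOT the capstone data set).
match_format = rng.choice(["T20", "ODI", "Test"], size=n)
extra_bowls_opponent = rng.integers(0, 25, size=n)
# Invented rule for illustration: more opponent extras, higher win chance.
p_win = 1 / (1 + np.exp(-(0.2 * extra_bowls_opponent - 0.5)))
result = (rng.random(n) < p_win).astype(int)  # 1 = win, 0 = loss

# 70:30 train/test split, stratified on the target (an assumption).
fmt_tr, fmt_te, ext_tr, ext_te, y_train, y_test = train_test_split(
    match_format, extra_bowls_opponent, result,
    test_size=0.30, stratify=result, random_state=0,
)

# One-hot encode the object (string) variable; fit on training rows only.
encoder = OneHotEncoder(handle_unknown="ignore")
X_train = np.hstack([
    encoder.fit_transform(fmt_tr.reshape(-1, 1)).toarray(),
    ext_tr.reshape(-1, 1),
])
X_test = np.hstack([
    encoder.transform(fmt_te.reshape(-1, 1)).toarray(),
    ext_te.reshape(-1, 1),
])

# Tune the regularisation strength with GridSearchCV, scoring on accuracy.
grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},  # hypothetical grid
    scoring="accuracy",
    cv=5,
)
grid.fit(X_train, y_train)
test_accuracy = grid.score(X_test, y_test)

print("Best parameters:", grid.best_params_)
print("Train CV accuracy:", round(grid.best_score_, 3))
print("Test accuracy:", round(test_accuracy, 3))
```

The same pattern extends to the other model types by swapping the estimator and its parameter grid.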


Modelling Approach Used & Why (Contd.)
• Various models are created on the data set:
• Decision Tree
• Random Forest
• ANN
• Logistic Regression

• Table 1 gives the comparison of the various models built
Table 1: Comparison between the models
• Based on the parameters below, the Logistic Regression model is identified as the best model to predict the match result of Team India against opponents:
• Accuracy of the model
• Precision
• Recall
• Performance of the model on test and train data

• The Logistic Regression model has the highest accuracy, 87%, and a precision of 0.89, and its performance is stable across train and test data
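For reference, the comparison metrics named above can be computed from confusion-matrix counts as in the sketch below, where a "win" is the positive class. The counts are hypothetical and chosen only to illustrate the arithmetic; they are not the presentation's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from confusion-matrix counts.

    tp: wins predicted as wins      fp: losses predicted as wins
    fn: wins predicted as losses    tn: losses predicted as losses
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts for illustration only.
train_metrics = classification_metrics(tp=190, fp=22, fn=10, tn=18)
test_metrics = classification_metrics(tp=80, fp=10, fn=5, tn=5)

# A small train/test gap is what "stable across train and test" refers to.
accuracy_gap = abs(train_metrics["accuracy"] - test_metrics["accuracy"])

for name, value in test_metrics.items():
    print(f"test {name}: {value:.3f}")
print(f"train/test accuracy gap: {accuracy_gap:.3f}")
```

With a data set that is 83% wins, precision and recall on the minority class (losses) can be computed the same way by treating a loss as the positive class.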
Insights from Analysis
The analysis so far yields the following insights:
• India has won most of its matches
• India played most of its matches with an average team age of 30, and the highest number of wins is recorded at this average age
• India performs better on home pitches than on foreign tours
• India does better in the rainy season than in other seasons
• The Indian team won most of its matches when it opted to bowl first, in ODIs, and in day matches
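Insights of this kind come from comparing win rates across the values of one variable. A minimal sketch of that grouping is shown below; the rows are hypothetical, and the value labels ("Win"/"Loss", "Yes"/"No", "Batting"/"Bowling") are assumptions, since the data dictionary does not list the actual encodings.

```python
from collections import defaultdict


def win_rate_by(records, key):
    """Return {group value: share of matches won} for the given column."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for row in records:
        totals[row[key]] += 1
        if row["Result"] == "Win":
            wins[row[key]] += 1
    return {group: wins[group] / totals[group] for group in totals}


# Hypothetical rows for illustration only; not taken from the data set.
matches = [
    {"Offshore": "No", "First_selection": "Bowling", "Result": "Win"},
    {"Offshore": "No", "First_selection": "Bowling", "Result": "Win"},
    {"Offshore": "No", "First_selection": "Batting", "Result": "Loss"},
    {"Offshore": "Yes", "First_selection": "Batting", "Result": "Win"},
    {"Offshore": "Yes", "First_selection": "Bowling", "Result": "Loss"},
    {"Offshore": "Yes", "First_selection": "Batting", "Result": "Loss"},
]

home_away = win_rate_by(matches, "Offshore")
first_innings = win_rate_by(matches, "First_selection")

print("Win rate by Offshore:", home_away)
print("Win rate by First_selection:", first_innings)
```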
Insights from Analysis (Contd.)
The analysis so far yields the following insights:
• As per the data, India loses the match when it bowls 40 extra balls to the opponent
• If the opponent bowls more than 10 extra balls, there is a high chance that Team India will win
• As per the data set, India wins the match if the opponent bowls more than 16 extra balls
• Overall, India has its highest win rate against Bangladesh, followed by Pakistan, with three full-time bowlers
• The Indian team performs well against West Indies, Bangladesh, England and Pakistan
• The Indian team's win rate is lower against South Africa and Sri Lanka than against other opponents
• In the summer season, India has lost to Sri Lanka the majority of the time
• The chances of winning the match are high when 5 wickets are taken by a single player in the match and the team has a minimum of 3 bowlers
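Threshold statements such as "more than 10 extras" or "more than 16 extras" can be checked by computing the win rate among matches above a cut-off, together with how many matches that subset contains. The sketch below uses hypothetical rows and an assumed "Win"/"Loss" encoding for Result; it illustrates the check rather than reproducing the presentation's figures.

```python
def win_rate_above(records, column, threshold):
    """Win rate and match count among rows where column > threshold.

    Returns (None, 0) when no row exceeds the threshold.
    """
    subset = [row for row in records if row[column] > threshold]
    if not subset:
        return None, 0
    wins = sum(1 for row in subset if row["Result"] == "Win")
    return wins / len(subset), len(subset)


# Hypothetical rows for illustration only; not taken from the data set.
matches = [
    {"extra_bowls_opponent": 4, "Result": "Loss"},
    {"extra_bowls_opponent": 8, "Result": "Win"},
    {"extra_bowls_opponent": 11, "Result": "Win"},
    {"extra_bowls_opponent": 12, "Result": "Loss"},
    {"extra_bowls_opponent": 17, "Result": "Win"},
    {"extra_bowls_opponent": 20, "Result": "Win"},
]

rate_10, n_10 = win_rate_above(matches, "extra_bowls_opponent", 10)
rate_16, n_16 = win_rate_above(matches, "extra_bowls_opponent", 16)
rate_30, n_30 = win_rate_above(matches, "extra_bowls_opponent", 30)

print(f"> 10 extras: win rate {rate_10:.2f} over {n_10} matches")
print(f"> 16 extras: win rate {rate_16:.2f} over {n_16} matches")
```

Reporting the match count next to the rate matters here: a 100% win rate over a handful of matches is much weaker evidence than the same rate over dozens.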
Recommendations
• 1 Test match against England in England. The match is a day match. In England, it will be the rainy season at the time of the match.
So the team dynamics should be as follows:

• 2 T20 matches against Australia in India. All the matches are day-and-night matches. In India, it will be the winter season at the time of the matches.
So the team dynamics should be as follows:
Recommendations (Contd.)
• 2 ODI matches against Sri Lanka in India. All the matches are day-and-night matches. In India, it will be the winter season at the time of the matches.

So the team dynamics should be as follows:


Q & A
