Presentation On IPL Match Winner Prediction With ML
This document describes a machine learning project that predicts the winner of an IPL cricket match based on features like the playing teams, toss winner, and venue. It discusses preprocessing the dataset, encoding categorical features, training logistic regression, SVM, decision tree and random forest models on the data, and selecting random forest as it achieved the highest accuracy of 55% based on cross validation.
PRESENTATION ON IPL MATCH WINNER PREDICTION WITH ML

Submitted by: Deepshekhar Dey, Chandan Kumar Gupta, Himadri Sikhar Gogoi, Ritusman Kashyap Bhuyan

PROJECT OVERVIEW
Our project predicts which team will win an IPL match, given match details such as the playing teams, the toss-winning team, the toss decision, and the playing venue. We analyse several ML prediction algorithms and choose the most accurate one for the prediction.

DATASET INFO
Our algorithm bases its predictions on past data from matches the teams have already played. This past data is provided to the algorithm as a dataset containing all the information about those matches, and our model uses it to predict the outcome of upcoming matches. The dataset has many columns that are unnecessary for our purpose, so we drop them and keep only the columns relevant to the prediction process.

FEATURES AND TARGET VARIABLES IN OUR DATASET
In our dataset, Team 1, Team 2, Toss Winner, Toss Decision, and Venue are our features. Winning Team is our target variable.

First, we import the Python and ML libraries that we use in the project.
Pandas - helps us with data analysis.
NumPy - helps us with numerical computation.
Seaborn and Matplotlib - help us with data visualization.

Next, we check whether the dataset contains null values. The Winner column has null values, so we replace them with "DRAW". With the null values handled, we extract the columns that serve as our features and target variable. Checking again, we can see that no null values remain in the dataset, so we proceed further.

Since the dataset contains categorical data, we need to encode it as numbers before feeding it to an ML algorithm. We use LabelEncoder, which converts categorical values into numerical values.
Encoding our teams.
Encoding our venues.
Encoding our "Toss Decision" column.

After encoding all the categorical columns, the dataset is fully numerical and ready to be fed into an ML algorithm.

Before feeding the dataset to our ML model, we divide it into train data and test data: train feature data, test feature data, train target data, and test target data.

LOGISTIC REGRESSION
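The preparation steps described above (null handling, column selection, label encoding, and the train/test split) lead directly into the first model. The following is a minimal sketch rather than the project's actual code: the column names, the tiny made-up table, the single shared encoder for team names, and the 25% test split are all assumptions made for illustration, and the score it produces says nothing about the real dataset.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Tiny made-up table standing in for the real IPL dataset (illustrative only).
# Column names are assumptions; the real file may name them differently.
matches = pd.DataFrame({
    "team1":         ["A", "B", "C", "A", "B", "C", "A", "B"],
    "team2":         ["B", "C", "A", "C", "A", "B", "B", "C"],
    "toss_winner":   ["A", "C", "C", "A", "A", "B", "B", "B"],
    "toss_decision": ["bat", "field", "bat", "field", "bat", "field", "bat", "field"],
    "venue":         ["X", "Y", "X", "Y", "X", "Y", "X", "Y"],
    "winner":        ["A", "C", None, "A", "B", "C", "B", "B"],
    "umpire":        ["u1", "u2", "u1", "u2", "u1", "u2", "u1", "u2"],  # irrelevant column
})

# 1. Check for nulls, then replace missing winners with "DRAW".
print(matches.isnull().sum())
matches["winner"] = matches["winner"].fillna("DRAW")

# 2. Keep only the feature columns and the target column.
feature_cols = ["team1", "team2", "toss_winner", "toss_decision", "venue"]
data = matches[feature_cols + ["winner"]].copy()

# 3. Label-encode the categorical columns. One shared encoder for all team
#    columns (an assumption) keeps a team's code identical wherever it appears.
team_cols = ["team1", "team2", "toss_winner", "winner"]
team_encoder = LabelEncoder().fit(pd.concat([data[c] for c in team_cols]))
for col in team_cols:
    data[col] = team_encoder.transform(data[col])
for col in ["toss_decision", "venue"]:
    data[col] = LabelEncoder().fit_transform(data[col])

# 4. Split into train and test sets.
X, y = data[feature_cols], data["winner"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 5. Fit logistic regression and score it on the held-out rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy on the toy data: {accuracy:.2f}")
```

Swapping `LogisticRegression` for another scikit-learn classifier in step 5 leaves the rest of the pipeline unchanged, which is what makes it easy to compare several models on the same split.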
Using the Logistic Regression model on our dataset gives an accuracy of 30%.

SUPPORT VECTOR MACHINE
Using the Support Vector Machine (SVM) model on our dataset gives an accuracy of 38%.

DECISION TREE
Using the Decision Tree model on our dataset gives an accuracy of 48%.

RANDOM FOREST
Using the Random Forest model on our dataset gives an accuracy of 51%.

CROSS VALIDATION
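As a rough sketch of the comparison this section describes, the four classifiers can be scored with k-fold cross-validation and the best one kept for prediction. Everything specific here is an assumption for illustration: the synthetic, already-encoded data, the 5 folds, the default hyperparameters, and the hypothetical `predict_winner` helper. On random data like this the ranking of the models carries no meaning, so it will not necessarily match the results reported for the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, already label-encoded data standing in for the real dataset.
# Column order (an assumption): team1, team2, toss_winner, toss_decision, venue.
rng = np.random.default_rng(0)
n_matches, n_teams, n_venues = 200, 8, 10
X = np.column_stack([
    rng.integers(0, n_teams, n_matches),   # team1
    rng.integers(0, n_teams, n_matches),   # team2
    rng.integers(0, n_teams, n_matches),   # toss_winner
    rng.integers(0, 2, n_matches),         # toss_decision
    rng.integers(0, n_venues, n_matches),  # venue
])
# The winner is picked at random from the two playing teams.
y = np.where(rng.random(n_matches) < 0.5, X[:, 0], X[:, 1])

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 folds for each model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, score in cv_scores.items():
    print(f"{name}: {score:.2f}")

# Keep the model with the highest mean score and refit it on all the data.
best_name = max(cv_scores, key=cv_scores.get)
best_model = models[best_name].fit(X, y)


def predict_winner(team1, team2, toss_winner, toss_decision, venue):
    """Hypothetical helper: takes encoded values, returns the encoded winner."""
    row = np.array([[team1, team2, toss_winner, toss_decision, venue]])
    return int(best_model.predict(row)[0])


print(best_name, predict_winner(0, 1, 0, 1, 3))
```

The helper works on encoded values, so in a real pipeline the team and venue names would first pass through the same encoders used during training, and the predicted code would be mapped back to a team name with the encoder's `inverse_transform`.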
Cross-validation is a resampling method that uses different portions of the data to test and train a model on different iterations. After computing cross-validation scores for the various algorithms on our dataset, we can see that the Random Forest algorithm gives the highest accuracy, up to 55%. So, we use the Random Forest model to predict our target variable.

Our prediction function.

PREDICTIONS

THANK YOU!!!!