Ipl Data Analysis Pbl

The document discusses data wrangling and visualization, focusing on the analysis of IPL data using Python. It outlines key functionalities of data wrangling, such as data exploration, handling missing values, reshaping, and filtering data. Additionally, it introduces PandasAI for generating insights from data and mentions visualization libraries like Matplotlib and Seaborn used for analyzing IPL performance trends and player statistics.

Uploaded by

23eg106e26

DATA WRANGLING & VISUALIZATION
PROJECT BASED LEARNING
TOPIC: IPL DATA ANALYSIS
PRESENTERS: 23EG106E26, 23EG106E37, 24EG506A01, 24EG506B03
WHAT IS DATA WRANGLING?
Data wrangling is the process of gathering raw data and transforming it into
another format so that it can be understood, accessed, and analyzed more easily,
and decisions can be made in less time. Data wrangling is also known as data munging.

Data wrangling in Python covers the following functionalities:

1. Data exploration: The data is studied, analyzed, and understood by visualizing
representations of it.
2. Dealing with missing values: Most large datasets contain missing values (NaN).
These need to be handled by replacing them with the column's mean or mode (its
most frequent value), or simply by dropping the rows that contain a NaN value.
3. Reshaping data: The data is manipulated according to the requirements; new data
can be added or pre-existing data can be modified.
4. Filtering data: Sometimes datasets contain unwanted rows or columns that need to
be removed or filtered out.
5. Other: After applying the steps above to the raw dataset, we get an efficient
dataset that meets our requirements. It can then be used for purposes such as data
analysis, machine learning, data visualization, or model training.
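The four steps above can be sketched with Pandas as follows. This is only a minimal illustration: the toy table and its column names (`team`, `season`, `runs`, `notes`) are made up for the example and are not taken from the IPL dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; values and column names are illustrative only.
raw = pd.DataFrame({
    "team": ["A", "B", "A", "C", "B"],
    "season": [2019, 2019, 2020, 2020, 2020],
    "runs": [150.0, np.nan, 180.0, 120.0, 170.0],
    "notes": ["", "", "", "", ""],
})

# 1. Data exploration: table size and number of missing values per column.
shape = raw.shape
missing_per_column = raw.isna().sum()

# 2. Missing values: either fill NaN with the column mean, or drop those rows.
filled = raw.assign(runs=raw["runs"].fillna(raw["runs"].mean()))
dropped = raw.dropna(subset=["runs"])

# 3. Reshaping: one row per team, one column per season.
wide = filled.pivot_table(index="team", columns="season",
                          values="runs", aggfunc="sum")

# 4. Filtering: remove an unwanted column, then keep only selected rows.
cleaned = filled.drop(columns=["notes"])
cleaned = cleaned[cleaned["runs"] >= 150]
```

Whether to fill or drop missing values depends on the dataset; both variants are shown here so they can be compared.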
PandasAI is a new tool for making data analysis and visualization tasks easier.
It is built on Python's Pandas library and uses generative AI and LLMs.
Unlike Pandas, in which you have to analyze and manipulate data manually,
PandasAI lets you generate insights from data by simply providing a text
prompt. It is like giving instructions to a skilled and proficient
assistant who can do the work for you
quickly. The only difference is that it is not a human but a machine that can
1. Matplotlib (a portmanteau of MATLAB, plot, and library) is a plotting library for
the Python programming language and its numerical mathematics extension
NumPy.
2. It provides an object-oriented API for embedding plots into applications using
general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK.
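A minimal sketch of the object-oriented API is shown below. It builds a `Figure` directly and attaches a canvas to it, which is the same pattern a GUI application would follow. The Agg (image-only) canvas is used here as a stand-in so the example runs without a window; a real GUI program would use its toolkit's canvas class instead. The plotted numbers are arbitrary.

```python
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

# Create the figure object directly, without the pyplot state machine.
fig = Figure(figsize=(6, 4))

# Attach a canvas; a Tkinter or Qt app would attach its own canvas class here.
canvas = FigureCanvasAgg(fig)

# Add one set of axes and draw on it through method calls.
ax = fig.add_subplot(111)
ax.plot([1, 2, 3, 4], [1, 4, 9, 16], marker="o")  # arbitrary sample points
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Object-oriented Matplotlib API")

# Render the figure into the canvas buffer.
canvas.draw()
```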

Comparison with MATLAB

Pyplot is a Matplotlib module that provides a MATLAB-like interface. Matplotlib is
designed to be as usable as MATLAB, with the ability to use Python, and the advantage
of being free and open-source.
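For comparison, here is a small sketch of the MATLAB-like pyplot style, where each call acts on the current figure. The data is arbitrary, and the plot is written to an in-memory buffer rather than shown on screen so the example runs anywhere.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]  # arbitrary sample data

plt.figure()
plt.plot(x, y, "r--")       # MATLAB-style format string: red dashed line
plt.xlabel("x")
plt.ylabel("x squared")
plt.title("pyplot interface")
plt.grid(True)

# Save the current figure as PNG bytes in memory.
buf = io.BytesIO()
plt.savefig(buf, format="png")
current_xlabel = plt.gca().get_xlabel()
plt.close()
```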
Matplotlib supports various types of two-dimensional and three-dimensional plots. Support for
two-dimensional plots is robust. Support for three-dimensional plots was added later and, while
good, is not as robust as that for two-dimensional plots.
1. IMPORTING LIBRARIES

2. IPL DATASET
   1. Downloading the IPL dataset
   2. Checking IPL dataset attributes
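The slides do not include the code for these steps, so the following is only a sketch of what importing the libraries and checking the dataset attributes might look like. The small table stands in for the downloaded IPL data; its rows are placeholders and its column names are assumptions.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
# import seaborn as sns  # optional, if Seaborn is installed

# Placeholder rows standing in for the downloaded IPL dataset (assumed columns).
matches = pd.DataFrame({
    "id": [1, 2, 3],
    "season": [2017, 2017, 2018],
    "winner": ["Team A", "Team D", None],
    "win_by_runs": [35, 0, 0],
})

# Checking dataset attributes.
n_rows, n_cols = matches.shape         # size of the table
column_names = list(matches.columns)   # attribute (column) names
column_types = matches.dtypes          # data type of each column
missing = matches.isna().sum()         # missing values per column
preview = matches.head()               # first few rows
```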
IPL DATA ANALYSIS AND VISUALIZATION WITH PYTHON
READ CSV FILE
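Reading the CSV file with Pandas could look like the sketch below. The real file name is not given in the slides, so an in-memory string is used in place of the file; the header names and the three rows are placeholders, not real match records.

```python
import io

import pandas as pd

# Stand-in for the IPL CSV file; header and rows are assumed placeholders.
csv_text = """id,season,team1,team2,winner,win_by_runs,win_by_wickets,player_of_match
1,2017,Team A,Team B,Team A,35,0,Player X
2,2017,Team C,Team D,Team D,0,7,Player Y
3,2018,Team A,Team C,Team C,0,4,Player Z
"""

# With a real file this would be pd.read_csv("<path to the CSV file>").
matches = pd.read_csv(io.StringIO(csv_text))
first_rows = matches.head()
```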
DATA ANALYSIS MENU

10. Match win by maximum runs:
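One possible way to find the match won by the largest run margin is sketched below. The column name `win_by_runs` is an assumption about the dataset, and the rows are placeholders rather than real results.

```python
import pandas as pd

# Placeholder rows; column names are assumed, values are not real results.
matches = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "season": [2017, 2017, 2018, 2018],
    "winner": ["Team A", "Team D", "Team C", "Team A"],
    "win_by_runs": [35, 0, 80, 10],
})

# Row label of the largest run margin, then the details of that match.
idx = matches["win_by_runs"].idxmax()
biggest_win = matches.loc[idx, ["season", "winner", "win_by_runs"]]
```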
GRAPH MENU

4. Most Successful Team - Bar Graph
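A bar graph of the most successful team can be sketched by counting wins per team and plotting the counts. Here "most successful" is taken to mean "most matches won", which is an assumption; the `winner` column name and the rows are placeholders as well.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder results; not real IPL data.
matches = pd.DataFrame({
    "winner": ["Team A", "Team D", "Team C", "Team A", "Team A", "Team C"],
})

# Count wins per team, sorted from most to fewest.
wins = matches["winner"].value_counts()

# Draw one bar per team.
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(wins.index.tolist(), wins.values)
ax.set_xlabel("Team")
ax.set_ylabel("Matches won")
ax.set_title("Most Successful Team")
fig.tight_layout()
```

Calling `fig.savefig(...)` or `plt.show()` (with an interactive backend) would then output the chart.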
We used Python and the Pandas library, along
with visualization libraries like Matplotlib
and Seaborn, to analyze an IPL dataset and
answer the following questions:
• Who are the top players?
• Which teams have had the best performances
over the years?
• What trends exist in the match outcomes?

THANK YOU
