INTRODUCTION TO PANDAS
A LIBRARY FOR DATA MANIPULATION AND ANALYSIS
USING POWERFUL DATA STRUCTURES
TYPES OF DATA STRUCTURES IN PANDAS
Data Structure   Dimensions   Description
Series           1            1D labeled homogeneous array, size-immutable.
DataFrame        2            General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3            General 3D labeled, size-mutable array.
SERIES
• Series is a one-dimensional array-like structure with homogeneous
data. For example, the following series is a collection of integers:
• 10 23 56 17 52 61 73 90 26 72
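As a minimal sketch, the integers listed above can be stored in a Series like this. The variable name s is only illustrative.

```python
import pandas as pd

# The ten integers from the slide, stored as a one-dimensional Series.
s = pd.Series([10, 23, 56, 17, 52, 61, 73, 90, 26, 72])

print(s.ndim)     # number of dimensions
print(len(s))     # number of elements
print(s.iloc[2])  # element at position 2
```

All values share one integer dtype, which is what "homogeneous" refers to.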
DataFrame
• DataFrame is a two-dimensional array with heterogeneous data. For
example,
Name Age Gender Rating
Steve 32 Male 3.45
Lia 28 Female 4.6
Vin 45 Male 3.9
Katie 38 Female 2.78
Data Type of Columns
Column Type
Name String
Age Integer
Gender String
Rating Float
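The table and its column types can be reproduced with a short sketch. The values come from the slide; building the frame from a dict of lists is just one possible choice.

```python
import pandas as pd

# Each key becomes a column; each list holds that column's values.
df = pd.DataFrame({
    "Name": ["Steve", "Lia", "Vin", "Katie"],
    "Age": [32, 28, 45, 38],
    "Gender": ["Male", "Female", "Male", "Female"],
    "Rating": [3.45, 4.6, 3.9, 2.78],
})

print(df)
print(df.dtypes)  # one dtype per column
```

Age comes out as an integer dtype and Rating as a float dtype. How the text columns are reported (object or a dedicated string dtype) depends on the pandas version.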
PANEL
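• Panel is a three-dimensional data structure with heterogeneous data.
It is hard to draw a panel graphically, but it can be pictured as a
container of DataFrames.
• Note: Panel was deprecated in pandas 0.20 and removed in pandas 0.25.
Current versions offer only Series and DataFrame; three-dimensional data
is usually held in a DataFrame with a MultiIndex.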
DataFrame
• A DataFrame is a two-dimensional data structure, i.e., data is aligned
in a tabular fashion in rows and columns.
• Features of DataFrame
• Columns can be of different types
• Size-mutable
• Labeled axes (rows and columns)
• Can perform arithmetic operations on rows and columns
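The features listed above can be illustrated with a small sketch. The frame, its labels, and the variable names are hypothetical and chosen only for the example.

```python
import pandas as pd

# Labeled axes: row labels r1..r3, column labels x and y.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]}, index=["r1", "r2", "r3"])

col_totals = df.sum(axis=0)  # one total per column
row_totals = df.sum(axis=1)  # one total per row

# Size-mutable: a new column can be added to the existing frame.
df["z"] = df["x"] + df["y"]  # element-wise arithmetic on two columns

print(col_totals)
print(row_totals)
print(df)
```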
Structure
pandas.DataFrame
pandas.DataFrame(data, index, columns, dtype, copy)
• data
• data takes various forms such as ndarray, Series, map, lists, dict, constants, and also
another DataFrame.
• index
• Row labels for the resulting frame. Optional; defaults to np.arange(n) if no index
is passed.
• columns
• Column labels for the resulting frame. Optional; defaults to np.arange(n) if no
column labels are passed.
• dtype
• Data type to force on the columns. Optional; only a single dtype is allowed, and it is
inferred from the data if omitted.
• copy
• Whether to copy the input data rather than share it. The default depends on the input
type and the pandas version (older versions default to False).
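A minimal sketch of the constructor with every parameter spelled out. The array contents and the labels are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

values = np.array([[1, 2], [3, 4], [5, 6]])

df = pd.DataFrame(
    data=values,               # the input data (here a 2D ndarray)
    index=["r1", "r2", "r3"],  # row labels
    columns=["x", "y"],        # column labels
    dtype=float,               # force every column to float
    copy=True,                 # work on a copy of the input array
)

# With no index or columns passed, both default to 0, 1, 2, ...
default_labels = pd.DataFrame(values)

print(df)
print(default_labels)
```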
• Create DataFrame
• A pandas DataFrame can be created using various inputs like −
• Lists
• dict
• Series
• NumPy ndarrays
• Another DataFrame
Example 1
• import pandas as pd
• data = [_______]
• df = pd.DataFrame(data)
• print(df)
Example 2
• import pandas as pd
• data = {'Name': ['__', '__'], 'Age': [___]}
• df = pd.DataFrame(data)
• print(df)
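The two templates above leave the data blank. One possible way to fill them in is sketched here; the numbers, names, and ages are made up for illustration and are not from the slides.

```python
import pandas as pd

# Example 1: a plain list becomes a single column labeled 0.
data_list = [1, 2, 3, 4, 5]  # hypothetical values
df_from_list = pd.DataFrame(data_list)
print(df_from_list)

# Example 2: a dict of equal-length lists; the keys become column labels.
data_dict = {"Name": ["Tom", "Jack"], "Age": [28, 34]}  # hypothetical values
df_from_dict = pd.DataFrame(data_dict)
print(df_from_dict)
```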
Example
• import pandas as pd
• # Each dictionary is one row; its keys become column labels
• data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
• df = pd.DataFrame(data)
• print(df)
• ________________________________________
• import pandas as pd
• data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
• # Same data, with explicit row labels
• df = pd.DataFrame(data, index=['first', 'second'])
• print(df)
The following example shows how to create a DataFrame with
a list of dictionaries, row indices, and column indices.
• import pandas as pd
• data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
• # Two column labels, both matching dictionary keys
• df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
• # Two column labels, one of which ('b1') is not a dictionary key, so that column holds NaN
• df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
• print(df1)
• print(df2)
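A short sketch of how missing keys are handled, using the same list of dictionaries. It suggests that any row or column label without matching data is filled with NaN.

```python
import pandas as pd

data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]

# The first dictionary has no 'c' key, so that cell is NaN.
df = pd.DataFrame(data, index=['first', 'second'])

# 'b1' matches no key in either dictionary, so the whole column is NaN.
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])

print(df.isna())
print(df2)
```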
Create a DataFrame from Dict of Series
• import pandas as pd
• # The row index is the union of the Series indices; 'one' has no value for 'd', so it gets NaN
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
      'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
• df = pd.DataFrame(d)
• print(df)
Column Addition
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
      'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
• df = pd.DataFrame(d)
• # Add a new column to an existing DataFrame by assigning a Series to a new column label
• print("Adding a new column by passing as Series:")
• df['three'] = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
• print(df)
• # Add a new column computed from existing columns
• print("Adding a new column using the existing columns in DataFrame:")
• df['four'] = df['one'] + df['three']
• print(df)
Column Deletion
• # Using the previous DataFrame, we will delete a column
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
      'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']),
      'three': pd.Series([10, 20, 30], index=['a', 'b', 'c'])}
• df = pd.DataFrame(d)
• print("Our dataframe is:")
• print(df)
• # using the del statement
• print("Deleting the first column using the del statement:")
• del df['one']
• print(df)
• # using the pop method, which removes the column and returns it
• print("Deleting another column using the pop method:")
• df.pop('two')
• print(df)
Slicing in Python
• import pandas as pd
• d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
      'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
• df = pd.DataFrame(d)
• # Select the rows at positions 2 and 3 (labels 'c' and 'd')
• print(df[2:4])
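The same rows can also be selected with the explicit indexers. This is a sketch using iloc (by position) and loc (by label) on the frame from the slide; the variable names are illustrative.

```python
import pandas as pd

d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

by_position = df.iloc[2:4]  # positions 2 and 3; the end position is excluded
by_label = df.loc['c':'d']  # labels 'c' through 'd'; the end label is included

print(by_position)
print(by_label)
```

Both selections return the same two rows here, which makes the difference between the exclusive positional end and the inclusive label end easy to see.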
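Addition of rows
• import pandas as pd
• # Starting frame assumed here; the original slide does not define df for this example
• df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
• df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
• # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
• df = pd.concat([df, df2])
• print(df)
Deletion of rows
• # drop(0) removes every row whose index label is 0
• df = df.drop(0)
• print(df)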
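One detail worth seeing in code is what happens to the row labels. In this sketch the starting frame df is an assumption (the slide does not define it), and pd.concat is used because DataFrame.append is not available in pandas 2.0 and later.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])  # assumed starting frame
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])

# Row labels are kept as they are, so 0 and 1 each appear twice.
appended = pd.concat([df, df2])

# Dropping label 0 therefore removes two rows, one from each original frame.
dropped = appended.drop(0)

# ignore_index=True renumbers the rows 0..3 instead.
renumbered = pd.concat([df, df2], ignore_index=True)

print(appended)
print(dropped)
print(renumbered)
```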
Reindexing
• import pandas as pd
• import numpy as np
• df1 = pd.DataFrame(np.random.randn(10, 3), columns=['col1', 'col2', 'col3'])
• df2 = pd.DataFrame(np.random.randn(7, 3), columns=['col1', 'col2', 'col3'])
• # Give df1 the same row and column labels as df2, so only rows 0 to 6 are kept
• df1 = df1.reindex_like(df2)
• print(df1)
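Reindexing can also be done against an explicit list of labels. A small sketch with a hypothetical frame shows rows being reordered, dropped, and created.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3]}, index=['a', 'b', 'c'])

# Keep 'c' and 'a' in the new order, drop 'b', and add 'z', which has no data.
reindexed = df.reindex(['c', 'a', 'z'])

print(reindexed)
```

Labels that did not exist before, such as 'z' here, are filled with NaN.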
Concatenating objects
• import pandas as pd
• one = pd.DataFrame({'Name': ['__'], 'subject_id': ['__'], 'marks': ['__']}, index=[__])
• two = pd.DataFrame({'Name': ['__'], 'subject_id': ['__'], 'marks': ['__']}, index=[__])
• # Stack the two frames on top of each other
• print(pd.concat([one, two]))
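The template above leaves the values blank. A filled-in sketch might look like this; the names, subject ids, marks, and index labels are all invented for illustration.

```python
import pandas as pd

one = pd.DataFrame(
    {'Name': ['Alex', 'Amy'], 'subject_id': ['sub1', 'sub2'], 'marks': [98, 90]},
    index=[1, 2],
)
two = pd.DataFrame(
    {'Name': ['Billy', 'Brian'], 'subject_id': ['sub2', 'sub4'], 'marks': [89, 80]},
    index=[1, 2],
)

# Rows of 'two' are placed under the rows of 'one'; the index labels are kept.
combined = pd.concat([one, two])
print(combined)
```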
Handling categorical data
• Many data values are repetitive; for example, gender, country, and codes repeat the
same few values.
• Categorical variables can take on only a limited, and usually fixed, number of possible values.
• The categorical data type is useful in the following cases −
• A string variable consisting of only a few different values. Converting such a string
variable to a categorical variable will save some memory.
• The lexical order of a variable is not the same as the logical order (“one”, “two”,
“three”). By converting to a categorical and specifying an order on the categories,
sorting and min/max will use the logical order instead of the lexical order.
• As a signal to other python libraries that this column should be treated as a
categorical variable (e.g. to use suitable statistical methods or plot types).
• import pandas as pd
• cat = pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])
• print(cat)
• ____________________________________________
• import pandas as pd
• import numpy as np
• # Categories are listed explicitly; np.nan is treated as a missing value, not a category
• cat = pd.Categorical(["a", "c", "c", np.nan], categories=["b", "a", "c"])
• df = pd.DataFrame({"cat": cat, "s": ["a", "c", "c", np.nan]})
• print(df.describe())
• print(df["cat"].describe())
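The point about logical versus lexical order can be sketched as follows, assuming the three labels "one", "two", "three" from the text and an arbitrary sample of values.

```python
import pandas as pd

values = ["two", "three", "one", "two"]  # arbitrary sample

# Plain strings sort alphabetically: one, three, two, two.
lexical = pd.Series(values).sort_values().tolist()

# An ordered categorical sorts by the order given in 'categories'.
cat = pd.Categorical(values, categories=["one", "two", "three"], ordered=True)
ordered = pd.Series(cat)
logical = ordered.sort_values().tolist()

print(lexical)
print(logical)
print(ordered.min(), ordered.max())
```

With the order specified, min and max also follow the logical order rather than the alphabet.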