Exp 25_26

The document provides an introduction to the Pandas library in Python, detailing installation, importing, and data structures such as Series and DataFrame. It explains how to create and manipulate Series and DataFrames, including accessing elements, indexing, and performing binary operations. Additionally, it covers basic operations on rows and columns within DataFrames, including selection and indexing techniques.

Uploaded by

Prasad Nirmal


Getting Started with Pandas

Installing Pandas
The first step in working with Pandas is to check whether it is installed on the system.
If it is not, install it using the pip command:
pip install pandas
Importing Pandas
Once Pandas is installed, the library must be imported before it can be used. It is
generally imported as follows:
import pandas as pd
Note: Here, pd is an alias for Pandas. Importing the library under an alias is not
required; it simply means less typing each time a method or property is called.
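One way to check for an installation before importing is sketched below. It relies on the standard-library function importlib.util.find_spec; the variable name pandas_installed is chosen only for this illustration.

```python
import importlib.util

# find_spec returns None when the package cannot be located
pandas_installed = importlib.util.find_spec("pandas") is not None

if pandas_installed:
    import pandas as pd
    # The installed release is available as a version string
    print("pandas version:", pd.__version__)
else:
    print("pandas is not installed; run: pip install pandas")
```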
Data Structures in Pandas Library
Pandas provides two main data structures for manipulating data:
• Series
• DataFrame

Python Pandas Series

A Pandas Series is a one-dimensional labeled array capable of holding data of any
type (integer, string, float, Python objects, etc.). The axis labels are collectively
called the index.
Creating a Series
In practice, a Pandas Series is often created by loading data from existing storage, such as a SQL
database, a CSV file, or an Excel file.
A Series can also be created from lists, dictionaries, scalar values, etc.
Pandas Series Examples
# import the pandas library
import pandas as pd

# a simple list
data = [1, 2, 3, 4]
ser = pd.Series(data)
print(ser)

Output
0 1
1 2
2 3
3 4
dtype: int64
A Series can be thought of as a single column in an Excel sheet.
Labels need not be unique but must be of a hashable type. The object supports both integer
and label-based indexing and provides a host of methods for performing operations
involving the index.
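The point about non-unique labels can be illustrated with a short sketch. The data and the name scores are hypothetical, made up for this example.

```python
import pandas as pd

# Hypothetical data: the label "a" is used twice
scores = pd.Series([10, 20, 30], index=["a", "b", "a"])

# A repeated label returns every matching entry as a Series
repeated = scores.loc["a"]
print(repeated)

# A label that occurs once returns a single value
single = scores.loc["b"]
print(single)

# is_unique reports whether all labels are distinct
print(scores.index.is_unique)
```

Because a repeated label returns several values, code that expects one value per label may want to check is_unique first.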

Python Pandas Series

This section gives a brief overview of the basic operations that can be performed on a Pandas
Series:
• Creating a Series
• Accessing elements of a Series
• Indexing and selecting data in a Series
• Binary operations on a Series
• Conversion operations on a Series

Creating a Pandas Series

In the real world, a Pandas Series is often created by loading data from
existing storage, such as a SQL database, a CSV file, or an Excel file. A Series can also be
created from a list, a dictionary, a scalar value, etc. Here are some of the ways to
create a Series:
Creating a Series from an array: To create a Series from an array, import the
numpy module and use its array() function.
# import pandas as pd
import pandas as pd
# import numpy as np
import numpy as np
# simple array
data = np.array(['g','e','e','k','s'])
ser = pd.Series(data)
print(ser)

Output
0 g
1 e
2 e
3 k
4 s
dtype: object
Creating a Series from a list:
To create a Series from a list, first create the list and then pass it to pd.Series().
import pandas as pd

# a simple list (named letters so that the built-in list is not shadowed)
letters = ['g', 'e', 'e', 'k', 's']

# create a Series from the list
ser = pd.Series(letters)
print(ser)

Output
0 g
1 e
2 e
3 k
4 s
dtype: object
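Dictionaries and scalar values are also mentioned as sources for a Series. A minimal sketch of both follows; the subject names and marks are hypothetical.

```python
import pandas as pd

# From a dictionary: the keys become the index labels
marks = pd.Series({"maths": 90, "physics": 85, "chemistry": 78})
print(marks)

# From a scalar: the value is repeated for every label in the index
filled = pd.Series(5, index=["a", "b", "c"])
print(filled)
```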
Accessing element of Series
There are two ways to access the elements of a Series:
• Accessing an element by position
• Accessing an element by label (index)
Accessing an element by position: Elements can be retrieved by their integer position,
counting from 0. Use the index operator [ ] with an integer to access a single element, and
use a slice to access multiple elements.
Accessing the first 5 elements of a Series.
# import pandas and numpy
import pandas as pd
import numpy as np
# creating a simple array
data = np.array(['g','e','e','k','s','f', 'o','r','g','e','e','k','s'])
ser = pd.Series(data)
# retrieve the first five elements
print(ser[:5])

Output
0 g
1 e
2 e
3 k
4 s
dtype: object
Accessing an Element Using a Label (index):
To access an element by label, pass its index label to the index operator. A Series is
like a fixed-size dictionary in that you can get and set values by index label.
Accessing a single element using an index label.
# import pandas and numpy
import pandas as pd
import numpy as np
# creating simple array
data = np.array(['g','e','e','k','s','f', 'o','r','g','e','e','k','s'])
ser = pd.Series(data,index=[10,11,12,13,14,15,16,17,18,19,20,21,22])

# accessing an element using its index label

print(ser[16])

Output
o
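The dictionary comparison covers setting values as well as getting them. The sketch below uses a shortened, hypothetical version of the Series above to show both, plus a membership test on the labels.

```python
import pandas as pd

ser = pd.Series(["g", "e", "e", "k", "s"], index=[10, 11, 12, 13, 14])

# Read a value by its label, as with a dictionary key
before = ser[12]

# Assign a new value to the same label
ser[12] = "E"

# Membership tests check the labels, not the values
has_label = 12 in ser
has_missing = 99 in ser

print(before, ser[12], has_label, has_missing)
```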
Indexing and Selecting Data in Series
Indexing in pandas simply means selecting particular data from a Series. Indexing could
mean selecting all of the data or only some of it. Indexing is also
known as subset selection.
Indexing a Series using the indexing operator [ ]:
The term indexing operator refers to the square brackets that follow an object.
The .loc and .iloc indexers also use square brackets to make selections. In this section,
indexing operator means the plain brackets applied directly to the object, as in ser[ ].
# importing pandas module
import pandas as pd
# making data frame
df = pd.read_csv("nba.csv")
ser = pd.Series(df['Name'])
data = ser.head(10)
data

Now we access elements of the Series using the index operator [ ].

# using indexing operator
data[3:6]

Indexing a Series using .loc[ ] :

This indexer selects data by referring to the explicit index labels. The .loc indexer selects data in a
different way than the plain indexing operator, and it can select subsets of the data.
# importing pandas module
import pandas as pd
# making data frame
df = pd.read_csv("nba.csv")
ser = pd.Series(df['Name'])
data = ser.head(10)
data
Now we access elements of the Series using the .loc[] indexer.
# using the .loc[] indexer
data.loc[3:6]

Indexing a Series using .iloc[ ] :

This indexer retrieves data by position. To do that, we specify
the integer positions of the data that we want. The .iloc indexer is very similar to .loc but
uses only integer positions to make its selections.
# importing pandas module
import pandas as pd

# making data frame

df = pd.read_csv("nba.csv")

ser = pd.Series(df['Name'])
data = ser.head(10)
data

Now we access elements of the Series using the .iloc[] indexer.

# using the .iloc[] indexer
data.iloc[3:6]
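The three ways of slicing look alike but do not return the same rows. The sketch below compares them on a small Series that needs no CSV file; the names are hypothetical stand-ins for the Name column of nba.csv.

```python
import pandas as pd

# Hypothetical names standing in for the "Name" column of nba.csv
names = pd.Series(["Ann", "Bob", "Cid", "Dee", "Eve", "Fay", "Gus", "Hal"])

# Plain brackets with an integer slice: positions 3, 4, 5
by_operator = names[3:6]

# .loc slices by label and includes the end label: labels 3, 4, 5, 6
by_label = names.loc[3:6]

# .iloc slices by position and excludes the end: positions 3, 4, 5
by_position = names.iloc[3:6]

print(len(by_operator), len(by_label), len(by_position))
```

With the default integer index, labels and positions coincide, so the only visible difference is that .loc returns one extra element.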

Binary Operation on Series

We can perform binary operations on Series, such as addition and subtraction. Besides the
arithmetic operators, a Series provides methods such as .add() and .sub(), which accept a
fill_value to use for labels that are missing from one of the two Series.
Code #1:
# importing pandas module
import pandas as pd
# creating a series
data = pd.Series([5, 2, 3,7], index=['a', 'b', 'c', 'd'])
# creating a series
data1 = pd.Series([1, 6, 4, 9], index=['a', 'b', 'd', 'e'])
print(data, "\n\n", data1)
Output
a 5
b 2
c 3
d 7
dtype: int64

a 1
b 6
d 4
e 9
dtype: int64
Now we add the two Series using the .add() method.
# adding two Series using .add()
data.add(data1, fill_value=0)

Code #2:
# importing pandas module
import pandas as pd

# creating a series
data = pd.Series([5, 2, 3,7], index=['a', 'b', 'c', 'd'])
# creating a series
data1 = pd.Series([1, 6, 4, 9], index=['a', 'b', 'd', 'e'])
print(data, "\n\n", data1)

Output
a 5
b 2
c 3
d 7
dtype: int64

a 1
b 6
d 4
e 9
dtype: int64
Now we subtract the two Series using the .sub() method.
# subtracting two Series using .sub()
data.sub(data1, fill_value=0)
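The effect of fill_value is easiest to see next to the plain + operator. The sketch below reuses the two Series from Code #1 and Code #2; the names plain_sum, total, and difference are introduced only for this illustration.

```python
import pandas as pd

data = pd.Series([5, 2, 3, 7], index=["a", "b", "c", "d"])
data1 = pd.Series([1, 6, 4, 9], index=["a", "b", "d", "e"])

# Plain operator: labels present in only one Series ("c", "e") give NaN
plain_sum = data + data1

# With fill_value=0 the missing side is treated as 0
total = data.add(data1, fill_value=0)        # a 6, b 8, c 3, d 11, e 9
difference = data.sub(data1, fill_value=0)   # a 4, b -4, c 3, d 3, e -9

print(plain_sum)
print(total)
print(difference)
```

The results are aligned on the union of the two indexes and come back as floating-point values.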

Pandas DataFrame
A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous
tabular data structure with labeled axes (rows and columns). In other words, data is
aligned in a tabular fashion in rows and columns.
A Pandas DataFrame consists of three principal components: the data, the rows, and the columns.
Creating a Pandas DataFrame
In practice, a Pandas DataFrame is often created by loading data from existing storage, such as
a SQL database, a CSV file, or an Excel file. A DataFrame can also be created from lists,
a dictionary, a list of dictionaries, etc.
Here are some of the ways to create a DataFrame:
Creating a DataFrame using a list: A DataFrame can be created from a single list or a list of lists.
import pandas as pd

# list of strings
lst = ['Geeks', 'For', 'Geeks', 'is', 'portal', 'for', 'Geeks']

# Calling DataFrame constructor on list

df = pd.DataFrame(lst)
print(df)
Creating a DataFrame from a dict of ndarrays/lists: To create a DataFrame from a dict of ndarrays or lists,
all the arrays must be of the same length. If an index is passed, its length must
equal the length of the arrays. If no index is passed, the index defaults to range(n),
where n is the array length.
# Python code demonstrating how to create a
# DataFrame from a dict of ndarrays / lists
# using the default index.
import pandas as pd
# initialise the data as a dict of lists.
data = {'Name':['Tom', 'nick', 'krish', 'jack'],
'Age':[20, 21, 19, 18]}

# Create DataFrame
df = pd.DataFrame(data)

# Print the output.

print(df)

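A list of lists, a list of dictionaries, and an explicit index are mentioned in this section but not demonstrated. A minimal sketch of all three follows; the index labels rank1 and rank2 and the shortened data are hypothetical.

```python
import pandas as pd

# List of lists: each inner list is one row; column names are given separately
rows = [["Tom", 20], ["nick", 21], ["krish", 19]]
df_rows = pd.DataFrame(rows, columns=["Name", "Age"])

# List of dictionaries: keys become columns, a missing key becomes NaN
records = [{"Name": "Tom", "Age": 20}, {"Name": "nick"}]
df_records = pd.DataFrame(records)

# Dict of lists with an explicit index whose length matches the lists
df_indexed = pd.DataFrame(
    {"Name": ["Tom", "nick"], "Age": [20, 21]},
    index=["rank1", "rank2"],
)

print(df_rows, df_records, df_indexed, sep="\n\n")
```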
Table of Contents
• Dealing with Rows and Columns
• Indexing and Selecting Data
• Selecting a single row
• Working with Missing Data
• Iterating over rows and columns

Dealing with Rows and Columns in Pandas DataFrame

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in
rows and columns. We can perform basic operations on rows and columns, such as selecting,
deleting, adding, and renaming.
Column Selection: To select a column in a Pandas DataFrame, we access
it by its column name.
# Import pandas package
import pandas as pd

# Define a dictionary containing employee data

data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
'Age':[27, 24, 22, 32],
'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
# Convert the dictionary into DataFrame
df = pd.DataFrame(data)
# select two columns
print(df[['Name', 'Qualification']])

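Adding, renaming, and deleting are listed among the basic operations. A sketch of each on a shortened version of the employee data is given below; the new column name City is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
                   "Age": [27, 24, 22, 32]})

# Adding: assign a list whose length matches the number of rows
df["Address"] = ["Delhi", "Kanpur", "Allahabad", "Kannauj"]

# Renaming: map old column names to new ones; a new DataFrame is returned
renamed = df.rename(columns={"Address": "City"})

# Deleting: drop a column by name, or a row by its index label
without_age = renamed.drop(columns=["Age"])
without_first_row = renamed.drop(index=[0])

print(without_age)
print(without_first_row)
```

Both rename() and drop() leave the original DataFrame unchanged here, which is why the results are stored under new names.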

Row Selection: Pandas provides dedicated indexers for retrieving rows from a DataFrame.
DataFrame.loc[] retrieves rows by label. Rows can
also be selected by integer position using DataFrame.iloc[].

Note: The examples below use the nba.csv file.

# importing pandas package
import pandas as pd

# making data frame from csv file

data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving row by loc method
first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]
print(first, "\n\n\n", second)

Each call returns a Series, because only a single row label was passed
each time.
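The difference between passing one label and passing a list of labels can be checked without the CSV file. The sketch below uses a small, hypothetical table in place of nba.csv; the player and team names are placeholders.

```python
import pandas as pd

# Hypothetical rows standing in for nba.csv, with "Name" as the index
data = pd.DataFrame(
    {"Team": ["Team X", "Team X", "Team Y"], "Age": [25, 22, 26]},
    index=pd.Index(["Player A", "Player B", "Player C"], name="Name"),
)

# One label: the row comes back as a Series
one_row = data.loc["Player A"]

# A list of labels: the rows come back as a DataFrame
two_rows = data.loc[["Player A", "Player C"]]

print(type(one_row).__name__)
print(type(two_rows).__name__)
```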
Indexing and Selecting Data in Pandas
Indexing in pandas means simply selecting particular rows and columns of data from a
DataFrame. Indexing could mean selecting all the rows and some of the columns, some of
the rows and all of the columns, or some of each of the rows and columns. Indexing can also
be known as Subset Selection.
Indexing a Dataframe using indexing operator []
The term indexing operator refers to the square brackets that follow an object.
The .loc and .iloc indexers also use square brackets to make selections. In this section,
indexing operator means the plain brackets applied directly to the DataFrame, as in df[ ].
To select a single column, we simply put the name of the column between the
brackets.
# importing pandas package
import pandas as pd

# making data frame from csv file

data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving columns by indexing operator
first = data["Age"]
print(first)
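What the indexing operator returns depends on whether it receives a single name or a list of names. A short sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical data; the column names mirror those in nba.csv
df = pd.DataFrame({"Name": ["Player A", "Player B"],
                   "Age": [25, 22],
                   "Team": ["Team X", "Team Y"]})

# A single column name returns a Series
age_series = df["Age"]

# A list with one name returns a one-column DataFrame
age_frame = df[["Age"]]

# A list of names returns a DataFrame with those columns, in that order
subset = df[["Name", "Team"]]

print(type(age_series).__name__, type(age_frame).__name__)
print(subset)
```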

Indexing a DataFrame using .loc[ ]

This indexer selects data by the labels of the rows and columns. The .loc indexer selects
data in a different way than the plain indexing operator. It can select subsets of rows or
columns, and it can also select subsets of rows and columns simultaneously.
To select a single row using .loc[], we put a single row label inside the brackets.
# importing pandas package
import pandas as pd

# making data frame from csv file

data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving row by loc method
first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]
print(first, "\n\n\n", second)
Each call returns a Series, because only a single row label was passed
each time.
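Selecting rows and columns at the same time with .loc[] can be sketched as follows. The table is a hypothetical stand-in for nba.csv, and the variable names are chosen for this illustration.

```python
import pandas as pd

# Hypothetical rows standing in for nba.csv, with "Name" as the index
data = pd.DataFrame(
    {"Team": ["Team X", "Team X", "Team Y"], "Age": [25, 22, 26]},
    index=pd.Index(["Player A", "Player B", "Player C"], name="Name"),
)

# A row label and a column label: a single value
age_b = data.loc["Player B", "Age"]

# Lists of labels on both axes: a smaller DataFrame
subset = data.loc[["Player A", "Player C"], ["Age"]]

# A label slice includes both endpoints
first_two = data.loc["Player A":"Player B", "Team"]

print(age_b)
print(subset)
print(first_two)
```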

Indexing a DataFrame using .iloc[ ]

This indexer retrieves rows and columns by position. To do that, we
specify the integer positions of the rows that we want, and the positions of the columns
that we want as well. The .iloc indexer is very similar to .loc but uses only integer
positions to make its selections.
To select a single row using .iloc[], we pass a single integer inside the brackets.
import pandas as pd

# making data frame from csv file

data = pd.read_csv("nba.csv", index_col ="Name")
# retrieving a row by iloc: position 3 is the fourth row
row = data.iloc[3]
print(row)

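Positions can be given for rows and columns together. The sketch below uses a small, hypothetical table in place of nba.csv; the Position column and its values are placeholders.

```python
import pandas as pd

# Hypothetical rows standing in for nba.csv, with "Name" as the index
data = pd.DataFrame(
    {"Team": ["Team X", "Team X", "Team Y"],
     "Age": [25, 22, 26],
     "Position": ["PG", "SG", "C"]},
    index=pd.Index(["Player A", "Player B", "Player C"], name="Name"),
)

# A single integer: the row at that position, returned as a Series
second_row = data.iloc[1]

# Row position and column position: a single value
cell = data.iloc[1, 0]

# A row slice (end excluded) and a list of column positions: a DataFrame
block = data.iloc[0:2, [0, 2]]

# Negative positions count from the end
last_row = data.iloc[-1]

print(second_row)
print(cell)
print(block)
```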
