CSL-410-L16

Program: B.Tech (CSE), IV Semester, II Year

CSL-410: Data Science using Python

Unit No. 2
Pandas: I/O Tools

Lecture No. 16

Dr. Sanjay Jain

Associate Professor, CSA/SOET
Outline
• Introduction
• Read and write CSV files
• Read and write Excel files
• Examples
• References
Student Effective Learning Outcomes(SELO)
01: Ability to understand subject related concepts clearly along with
contemporary issues.
02: Ability to use updated tools, techniques and skills for effective domain
specific practices.
03: Understanding available tools and products and ability to use them
effectively.
Introduction
• The pandas I/O API is a set of top-level reader functions, accessed like
pd.read_csv(), that generally return a pandas object.
• The two workhorse functions for reading text files (flat files) are
read_csv() and read_table(). Both use the same parsing code to convert
tabular data into a DataFrame object; read_csv() defaults to a comma
separator and read_table() to a tab. A partial signature:
pandas.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer',
names=None, index_col=None, usecols=None)
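• A minimal sketch of the point above, assuming small in-memory strings (csv_text and tsv_text are made up for illustration) in place of files on disk. It suggests that the two readers differ only in their default separator:

```python
import io
import pandas as pd

# Hypothetical in-memory data standing in for files on disk.
csv_text = "S.No,Name,Age\n1,Tom,28\n2,Lee,32\n"
tsv_text = "S.No\tName\tAge\n1\tTom\t28\n2\tLee\t32\n"

# read_csv() splits fields on commas by default.
df_csv = pd.read_csv(io.StringIO(csv_text))

# read_table() splits fields on tabs by default.
df_tsv = pd.read_table(io.StringIO(tsv_text))

# Both calls should yield the same DataFrame.
print(df_csv.equals(df_tsv))
```

Passing sep='\t' to read_csv() would be another way to read the tab-separated text.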

<SELO: 1> <Reference No.: R1,R4>

read_csv()
• read_csv() reads data from a CSV file and creates a DataFrame object.
import pandas as pd
df = pd.read_csv("temp.csv")
print(df)
• Output:
   S.No    Name  Age       City  Salary
0     1     Tom   28    Toronto   20000
1     2     Lee   32   HongKong    3000
2     3  Steven   43   Bay Area    8300
3     4     Ram   38  Hyderabad    3900

read_csv()
Custom index
• The index_col argument names a column in the CSV file to use as the
row index instead of the default 0, 1, 2, ... index.
import pandas as pd
df = pd.read_csv("temp.csv", index_col=['S.No'])
print(df)
• Output:
        Name  Age       City  Salary
S.No
1        Tom   28    Toronto   20000
2        Lee   32   HongKong    3000
3     Steven   43   Bay Area    8300
4        Ram   38  Hyderabad    3900
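• A short sketch of why a custom index can be useful, assuming the contents of temp.csv match the output shown above (the inline csv_text string is a stand-in for the file). Once S.No is the index, rows can be looked up by that label with .loc:

```python
import io
import pandas as pd

# Assumed contents of temp.csv, reproduced inline so the sketch is self-contained.
csv_text = """S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900
"""

# Use the S.No column as the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col="S.No")

# Label-based lookup: the row whose S.No is 2.
print(df.loc[2, "Name"])

# S.No is now the index, not a regular column.
print(df.index.name)
print(list(df.columns))
```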

read_csv()
• dtype: The data type of each column can be passed as a dict that maps
column names to types.
import numpy as np
import pandas as pd
df = pd.read_csv("temp.csv", dtype={'Salary': np.float64})
print(df.dtypes)
• Output:
S.No        int64
Name       object
Age         int64
City       object
Salary    float64
dtype: object
• Note: Without the dtype argument, the Salary column is read as int64;
the result shows float64 because we explicitly set the type.
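• The dtype argument sets a column's type, while the separate converters argument applies a function to each value of a column as it is read. A minimal sketch of both, assuming the same temp.csv contents as above (inlined as csv_text); upper-casing the names is an arbitrary illustration:

```python
import io
import pandas as pd

# Assumed contents of temp.csv, reproduced inline so the sketch is self-contained.
csv_text = """S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900
"""

df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"Salary": "float64"},     # set the column type
    converters={"Name": str.upper},  # transform each raw string value
)

print(df["Salary"].dtype)
print(df["Name"].tolist())
```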
read_csv()
• names: Specify custom column names using the names argument.
import pandas as pd
df = pd.read_csv("temp.csv", names=['a', 'b', 'c', 'd', 'e'])
print(df)

• Output:
      a       b    c          d       e
0  S.No    Name  Age       City  Salary
1     1     Tom   28    Toronto   20000
2     2     Lee   32   HongKong    3000
3     3  Steven   43   Bay Area    8300
4     4     Ram   38  Hyderabad    3900

read_csv()
• names with header: Observe that the custom names were added, but the
header row in the file was not eliminated; it was read as the first
data row. Use the header argument to replace it. If the header is in
a row other than the first, pass that row number to header; the
preceding rows are skipped.
import pandas as pd
df = pd.read_csv("temp.csv", names=['a', 'b', 'c', 'd', 'e'], header=0)
print(df)
• Output:
   a       b   c          d      e
0  1     Tom  28    Toronto  20000
1  2     Lee  32   HongKong   3000
2  3  Steven  43   Bay Area   8300
3  4     Ram  38  Hyderabad   3900
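• The difference between the two calls can be checked side by side. A minimal sketch, assuming the same temp.csv contents as above (inlined as csv_text):

```python
import io
import pandas as pd

# Assumed contents of temp.csv, reproduced inline so the sketch is self-contained.
csv_text = """S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900
"""
cols = ["a", "b", "c", "d", "e"]

# names only: the file's header line is kept as the first data row.
kept = pd.read_csv(io.StringIO(csv_text), names=cols)

# names plus header=0: the file's header line is replaced by the custom names.
replaced = pd.read_csv(io.StringIO(csv_text), names=cols, header=0)

print(len(kept), len(replaced))
print(kept.iloc[0].tolist())
```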

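read_csv()
• skiprows: Skips the specified number of lines at the start of the
file. Here the first two lines (the header and the first data row) are
skipped, so the next line is used as the header.
import pandas as pd
df = pd.read_csv("temp.csv", skiprows=2)
print(df)
• Output:
   2     Lee  32   HongKong  3000
0  3  Steven  43   Bay Area  8300
1  4     Ram  38  Hyderabad  3900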
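• skiprows also accepts a list of 0-based line numbers, which makes it possible to skip data rows while keeping the header. A minimal sketch, assuming the same temp.csv contents as above (inlined as csv_text):

```python
import io
import pandas as pd

# Assumed contents of temp.csv, reproduced inline so the sketch is self-contained.
csv_text = """S.No,Name,Age,City,Salary
1,Tom,28,Toronto,20000
2,Lee,32,HongKong,3000
3,Steven,43,Bay Area,8300
4,Ram,38,Hyderabad,3900
"""

# skiprows=[1] drops only file line 1 (the first data row);
# line 0, the header, is still used for the column names.
df = pd.read_csv(io.StringIO(csv_text), skiprows=[1])
print(df)
```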

read_csv()
• head(): read_csv() also accepts a URL; head() returns the first five
rows by default.
• Example:
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
print(tips.head())
• Output:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
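read_csv()
• Column Selection: Pass a list of column names to select a subset of
columns.
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
print(tips[['total_bill', 'tip', 'smoker', 'time']].head(5))
• Output:
   total_bill   tip smoker    time
0       16.99  1.01     No  Dinner
1       10.34  1.66     No  Dinner
2       21.01  3.50     No  Dinner
3       23.68  3.31     No  Dinner
4       24.59  3.61     No  Dinner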

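read_csv()
• Filtering: DataFrames can be filtered in multiple ways, the most
intuitive of which is Boolean indexing.
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
print(tips[tips['time'] == 'Dinner'].head(5))
• Output:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4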
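• Boolean conditions can be combined with & (and) and | (or), with each condition in parentheses. A minimal sketch that avoids the network by rebuilding only the first five rows of the tips data shown in this lecture; the threshold of 3 is an arbitrary illustration:

```python
import pandas as pd

# First five rows of the tips data, as shown in the head() output.
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "smoker": ["No"] * 5,
    "day": ["Sun"] * 5,
    "time": ["Dinner"] * 5,
    "size": [2, 3, 3, 2, 4],
})

# Each comparison yields a Boolean Series; & combines them element-wise.
mask = (tips["sex"] == "Male") & (tips["tip"] > 3)

# Keep only the rows where the mask is True.
print(tips[mask])
```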

read_csv()
• Group By: groupby() splits the records into groups, and size() then
returns the number of records in each group. For instance, the number
of tips left by each sex:
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
print(tips.groupby('sex').size())
• Output:
sex
Female     87
Male      157
dtype: int64
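• Beyond counting, a grouped column can be summarised with several statistics at once. A minimal sketch using only the first five rows of the tips data shown in this lecture, so the numbers differ from the full-dataset counts above:

```python
import pandas as pd

# First five rows of the tips data (only the two columns needed here).
tips = pd.DataFrame({
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

# Count and average tip per group.
summary = tips.groupby("sex")["tip"].agg(["count", "mean"])
print(summary)
```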

read_csv()
• head(): Top N rows
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
tips = tips[['smoker', 'day', 'time']].head(5)
print(tips)
• Output:
  smoker  day    time
0     No  Sun  Dinner
1     No  Sun  Dinner
2     No  Sun  Dinner
3     No  Sun  Dinner
4     No  Sun  Dinner

read_csv()
• tail(): Bottom N rows
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
tips = tips[['smoker', 'day', 'time']].tail(5)
print(tips)
• Output: the last five rows of the smoker, day, and time columns.

Writing CSV Files with to_csv()
• Writing a CSV file takes one more step than reading one, but it is
still simple: first create a pandas DataFrame, then call its to_csv()
method to write that DataFrame to a file. By default the row index is
written as the first column; pass index=False to omit it.
• Example:
import pandas as pd
city = pd.DataFrame([['Sacramento', 'California'], ['Miami', 'Florida']],
                    columns=['City', 'State'])
city.to_csv('city.csv')
• In the above example, we created a DataFrame named city and then
wrote it to a file named "city.csv" using the to_csv() method.
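• A minimal sketch of the effect of the index argument. To avoid touching the disk, it assumes to_csv() is called without a path, in which case the CSV text is returned as a string:

```python
import io
import pandas as pd

city = pd.DataFrame([["Sacramento", "California"], ["Miami", "Florida"]],
                    columns=["City", "State"])

# With no path, to_csv() returns the CSV text instead of writing a file.
with_index = city.to_csv()
without_index = city.to_csv(index=False)

# The default output carries an extra, unnamed index column.
print(with_index.splitlines()[0])
print(without_index.splitlines()[0])

# Reading the index-free text back recovers the original columns and values.
restored = pd.read_csv(io.StringIO(without_index))
print(restored)
```

With the default index=True, reading the file back yields an extra "Unnamed: 0" column unless index_col=0 is passed to read_csv().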

to_excel()
• The to_excel() method stores the data as an Excel file.
import pandas as pd
url = 'https://raw.github.com/pandas-dev/pandas/master/pandas/tests/data/tips.csv'
tips = pd.read_csv(url)
tips.to_excel("tips.xlsx", sheet_name="customer", index=False)

• In the example here, the sheet is named customer instead of the
default Sheet1. By setting index=False, the row index labels are not
saved in the spreadsheet.

read_excel()
• read_excel() reloads the data into a DataFrame:
import pandas as pd
tips = pd.read_excel("tips.xlsx", sheet_name="customer")
print(tips.head())
• Output:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

Learning Outcomes

The students have learned and understood the following:

• Introduction
• Read and write CSV files
• Read and write Excel files
•Examples
References

1. Data Science with Python by Aaron England, Mohamed Noordeen
Alaudeen, and Rohan Chopra. Packt Publishing, July 2019.
2. https://intellipaat.com/blog/what-is-data-science/
3. https://onlinecourses.nptel.ac.in/noc20_cs36/
Thank you
