Pandas Dataframe Export The CSV File

Pandas is an open source library that provides tools for data analysis and manipulation. It offers flexible data structures like Series and DataFrames. Series are one-dimensional arrays that can store data of any type. DataFrames are two-dimensional data structures that allow storing and manipulating tabular data. DataFrames can be created from lists, dictionaries, NumPy arrays, or CSV files. They provide methods for viewing, selecting, and describing data.

Uploaded by

ammouna beng


Introduction to dataframe

What is pandas
Pandas is an open source package that provides numerous tools for data analysis. It
offers fast, flexible and expressive data structures that can be used for many
different data manipulation tasks.

To use pandas in your Python code, you need to import the library first:

import pandas as pd

Pandas data structures

The two primary data structures of pandas are:

1. Series: a one-dimensional labeled array. It can store data of any type. Its values
are mutable but its size cannot be changed.
2. DataFrame: a two-dimensional data structure with mutable size. It stores and
manipulates tabular data in rows of observations and columns of variables.
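As a minimal sketch of the two structures (the names `ages` and `people` are illustrative and not part of the original material), the example below changes a value of a Series in place and then adds a column to a DataFrame:

```python
import pandas as pd

# A Series: one-dimensional, labeled values
ages = pd.Series([34, 30, 16])
ages[0] = 35  # values can be changed in place

# A DataFrame: two-dimensional, rows of observations and columns of variables
people = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"], "age": ages})
people["city"] = ["Paris", "Roma", "New York"]  # adding a column changes its size

print(ages.ndim, people.ndim)  # 1 2
print(people.shape)            # (3, 3)
```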

How to create series

A series may be created from:
1. A numpy array:

import pandas as pd
import numpy as np

array = np.array(["blue", "yellow", "pink", "purple"])  # get the array
series1 = pd.Series(array)  # create the series from the array
print(series1)

2. A list:

values = [19, 175, 41, 22]  # not named "list", which would hide the built-in list()
series2 = pd.Series(values)  # create the series from the list
print(series2)

https://www.tutorialspoint.com/python_pandas/python_pandas_series.htm

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.add.html

https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781787123137/3/ch03lvl1sec31/re-indexing-a-series

Series Changing Index

A big advantage of a Series over a NumPy array is that we can label its values
with our own index.
For example:

import pandas as pd

color = ["pink", "white", "black", "blue"]
occurrence = [20, 15, 6, 43]
S = pd.Series(occurrence, index=color)  # use the colors as index labels
print(S)
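Once a Series has its own index, values can be looked up by label. The following sketch rebuilds the same series and reads from it; it uses only standard Series indexing:

```python
import pandas as pd

color = ["pink", "white", "black", "blue"]
occurrence = [20, 15, 6, 43]
S = pd.Series(occurrence, index=color)

print(S["black"])           # look up one value by its label -> 6
print(S[["pink", "blue"]])  # a list of labels returns a smaller Series
print(S.index.tolist())     # ['pink', 'white', 'black', 'blue']
```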

Series addition
If we add two series with the same index, we get a new series with that index
whose values are the sums of the corresponding values:

import pandas as pd

color = ["pink", "white", "black", "blue"]
S1 = pd.Series([20, 15, 6, 43], index=color)
S2 = pd.Series([3, 22, 9, 10], index=color)
print(S1 + S2)  # values are added label by label
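Addition matches values by label, not by position. The sketch below uses a hypothetical second series `S3` whose labels only partly overlap with the first one, to show what happens when the indexes differ:

```python
import pandas as pd

S1 = pd.Series([20, 15, 6], index=["pink", "white", "black"])
S3 = pd.Series([10, 3, 1], index=["white", "pink", "green"])  # hypothetical series

total = S1 + S3
print(total)
# "black" and "green" each appear in only one series, so their sums are NaN;
# "pink" is 20 + 3 = 23 and "white" is 15 + 10 = 25

# add() with fill_value treats a missing label as 0 instead of producing NaN
filled = S1.add(S3, fill_value=0)
print(filled)
```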

Dataframe Introduction
A DataFrame is a 2-dimensional labeled data structure with columns of potentially
different types.

• DataFrame columns are made up of pandas Series.

You can think of it like a spreadsheet or SQL table, or a dictionary of Series.
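The "dictionary of Series" view can be made concrete with a short sketch. The variable names here are illustrative:

```python
import pandas as pd

# Each column is a Series; the dictionary keys become the column names
names = pd.Series(["Jack", "Thomas", "Alexandre"])
ages = pd.Series([34, 30, 16])

df = pd.DataFrame({"name": names, "age": ages})

print(isinstance(df["age"], pd.Series))  # True: selecting a column gives a Series
print(df.dtypes)                         # each column keeps its own type
```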

How to create DataFrame

1. From a list:

import pandas as pd

# named "rows" rather than "list", which would hide the built-in list()
rows = [['Jack', 34, 'Paris'], ['Thomas', 30, 'Roma'],
        ['Alexandre', 16, 'New York']]
df = pd.DataFrame(rows, columns=['name', 'age', 'city'])  # one inner list per row

2. From dictionary:

dictionary = {'name': ['Jack', 'Thomas', 'Alexandre'],
              'age': [34, 30, 16],
              'city': ['Paris', 'Roma', 'New York']}
df = pd.DataFrame(dictionary)  # keys become the column names
print(df)

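3. From numpy array:

import numpy as np
import pandas as pd

my_numpy_array = np.random.randn(3, 4)  # 3 rows and 4 columns of random values
# the column names are written out so this works even if "list" was reassigned earlier
df = pd.DataFrame(my_numpy_array, columns=["a", "b", "c", "d"])
print(df)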

4. From csv file:

Let's create a dataframe from this csv file:

import pandas as pd

df = pd.read_csv("csv file example", sep=";")  # sep=";" means fields are separated by semicolons
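To try `read_csv` without a file on disk, the sketch below feeds it semicolon-separated text through `io.StringIO`. The file contents are hypothetical and only mirror the earlier examples:

```python
import io
import pandas as pd

# Hypothetical file contents; io.StringIO stands in for a real file
csv_text = "name;age;city\nJack;34;Paris\nThomas;30;Roma\nAlexandre;16;New York\n"

df = pd.read_csv(io.StringIO(csv_text), sep=";")
print(df)
print(df.columns.tolist())  # ['name', 'age', 'city']
```

With a real file, the first argument would be its path instead of the `StringIO` object.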

https://databricks.com/glossary/pandas-dataframe

Dataframe exportation to csv

1. We created a dataframe using a dictionary.
2. We exported it to a csv file.
3. We created a new dataframe from our csv file.
4. We used the head command to show the first 5 rows.

https://datatofish.com/export-dataframe-to-csv/
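The four steps above can be sketched as follows. The file name `people.csv` and the temporary directory are assumptions made so the example can run anywhere:

```python
import os
import tempfile
import pandas as pd

# 1. Create a DataFrame from a dictionary
df = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"],
                   "age": [34, 30, 16],
                   "city": ["Paris", "Roma", "New York"]})

# 2. Export it to a CSV file (index=False leaves out the row labels)
path = os.path.join(tempfile.mkdtemp(), "people.csv")  # hypothetical file name
df.to_csv(path, index=False)

# 3. Create a new DataFrame from the CSV file
df2 = pd.read_csv(path)

# 4. Show the first rows (head() returns at most 5 by default)
print(df2.head())
```

Without `index=False`, the row labels are written as an extra first column, which then comes back as an unnamed column when the file is read again.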

Getting information about dataframe

To show general information about the columns, such as their types, we write:

df.info()

• Viewing our data

df.head()  # show the first 5 rows of our data
df.tail()  # show the last 5 rows of our data

• Describing our data

Now, let’s use the describe command to calculate summary statistics for the
numerical columns.

df.describe()  # detailed description of the numerical variables: mean, min, std, max, etc.
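A short sketch of these commands on a small, made-up DataFrame. It also shows how to restrict `describe()` to one specific column by selecting that column first:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"],
                   "age": [34, 30, 16],
                   "city": ["Paris", "Roma", "New York"]})

df.info()          # column names, non-null counts and types
print(df.head(2))  # first 2 rows instead of the default 5
print(df.tail(1))  # last row only

stats = df["age"].describe()  # statistics for one specific column
print(stats["mean"])          # (34 + 30 + 16) / 3
print(stats["min"], stats["max"])
```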

https://note.nkmk.me/en/python-pandas-head-tail/

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.describe.html

Dataframe Bracket Selection
In this course, we will often have to select specific rows or columns from our
DataFrame.
One of the easiest ways to do that is to use brackets:
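A minimal sketch of bracket selection on a made-up DataFrame. What the brackets return depends on what is placed inside them:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"],
                   "age": [34, 30, 16],
                   "city": ["Paris", "Roma", "New York"]})

ages = df["age"]               # one column name -> a Series
subset = df[["name", "city"]]  # a list of column names -> a DataFrame
first_two = df[0:2]            # a slice -> rows by position
adults = df[df["age"] >= 18]   # a boolean condition -> the matching rows

print(adults)
```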

https://datatofish.com/select-rows-pandas-dataframe/

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html

Dataframe loc/iloc
• Dataframe loc
loc allows us to extract rows and columns by index label. It is an indexer rather
than a method, so it is used with square brackets.

df.index = ["Jack", "Thomas", "Alexandre", "Anne"]  # label the rows (df must have four rows)
df.loc[["Jack", "Thomas"]]  # select the first and second rows
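The sketch below shows a few more forms of `loc` on a self-contained DataFrame. The column values, including the fourth row, are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"age": [34, 30, 16, 27],  # the fourth row is made up
                   "city": ["Paris", "Roma", "New York", "Lyon"]})
df.index = ["Jack", "Thomas", "Alexandre", "Anne"]

print(df.loc["Anne"])                      # one row, returned as a Series
print(df.loc[["Jack", "Thomas"], "city"])  # two rows, one column
print(df.loc["Jack":"Alexandre"])          # label slices include the end label
```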

• Dataframe iloc
iloc follows the same rules as loc, but it extracts rows and columns by integer
position instead of by label.

df.iloc[:, 1:3]  # select the second and third columns, keeping all rows

print(df)  # df itself is unchanged by the selection
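A self-contained sketch of `iloc` on a made-up DataFrame. Positions start at 0 and, unlike label slices with `loc`, the end of a position slice is excluded:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"],
                   "age": [34, 30, 16],
                   "city": ["Paris", "Roma", "New York"]})

cols = df.iloc[:, 1:3]        # all rows, columns at positions 1 and 2 (3 is excluded)
print(cols.columns.tolist())  # ['age', 'city']

print(df.iloc[0])      # first row as a Series
print(df.iloc[-1, 0])  # last row, first column -> Alexandre
```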

Setting index in dataframe
We can use the set_index() function to replace the index with one or more
existing columns.

• Old Index

• New Index
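As a minimal sketch of the old and new index, assuming the `name` column is the one promoted to the index (the original does not say which column was used):

```python
import pandas as pd

df = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"],
                   "age": [34, 30, 16],
                   "city": ["Paris", "Roma", "New York"]})
print(df.index.tolist())  # old index: [0, 1, 2]

indexed = df.set_index("name")  # returns a new DataFrame; df itself is unchanged
print(indexed.index.tolist())   # new index: ['Jack', 'Thomas', 'Alexandre']
print(indexed.loc["Thomas", "age"])  # 30
```

The column used as the index is no longer a regular column of `indexed`.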

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html

Dataframe Concatenate
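A minimal sketch of concatenation with `pd.concat`, assuming a hypothetical `country` column is added to an existing DataFrame. The column values and the extra row are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Jack", "Thomas", "Alexandre"],
                   "age": [34, 30, 16]})
country = pd.DataFrame({"country": ["France", "Italy", "USA"]})  # made-up values

# axis=1 places the frames side by side, matching rows by index
df = pd.concat([df, country], axis=1)
print(df)

# axis=0 (the default) stacks rows; ignore_index renumbers them from 0
more = pd.DataFrame({"name": ["Anne"], "age": [27], "country": ["France"]})
stacked = pd.concat([df, more], ignore_index=True)
print(stacked)
```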

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

Dataframe drop
To drop specified labels from rows or columns, we simply use the drop() method.
For example, we want to delete the country column we added previously:

df.drop("country", axis=1)

By default drop() has inplace=False, so it returns a new DataFrame and leaves df
unchanged: the country column is still in df. Take a break and research how to make
the deletion permanent.
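One possible answer, as a sketch on a small made-up DataFrame: either assign the result of `drop()` back to the variable, or pass `inplace=True`.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Jack", "Thomas"],
                   "country": ["France", "Italy"]})  # made-up values

df.drop("country", axis=1)      # the result is discarded
print("country" in df.columns)  # True: df is unchanged

df = df.drop("country", axis=1)  # option 1: assign the result back
# df.drop("country", axis=1, inplace=True)  # option 2: modify df in place
print("country" in df.columns)   # False
```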

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html

Dataframe with Pandas Recap

• Pandas is an open source library which offers several tools for data analysis. It
provides fast, flexible and expressive data structures that can be used for numerous
data manipulation tasks.
• We can create a DataFrame from a list, dictionary, numpy array or csv file.
• To convert data to a csv file: data.to_csv("file.csv")
• To see information about columns: dataframe.info()
• To see a brief description of the columns and their values: dataframe.describe()
• To view the first 5 rows: dataframe.head()
• To view the last 5 rows: dataframe.tail()
• There are multiple ways to select rows and columns from pandas DataFrames. iloc
and loc are the main operations for retrieving data: iloc selects by integer
position, whereas loc selects by label.
• We can also select specific rows or columns using brackets.
