Pandas

MS SAIKUMAR NATIONAL DEGREE COLLEGE NANDYAL


PANDAS

By MS SAIKUMAR
Package overview
pandas is a Python package providing fast, flexible,
and expressive data structures designed to make
working with “relational” or “labeled” data both
easy and intuitive.
It aims to be the fundamental high-level building
block for doing practical, real-world data analysis in
Python.
Additionally, it has the broader goal of becoming the
most powerful and flexible open source data
analysis/manipulation tool available in any language.
It is already well on its way toward this goal.
pandas is well suited for many different kinds of data:
• Tabular data with heterogeneously-typed columns, as in an SQL
table or Excel spreadsheet
• Ordered and unordered (not necessarily fixed-frequency) time
series data.
• Arbitrary matrix data (homogeneously typed or heterogeneous)
with row and column labels
• Any other form of observational / statistical data sets. The data
need not be labeled at all to be placed into a pandas data
structure.
The two primary data structures of pandas,
1. Series (1-dimensional) and
2. DataFrame (2-dimensional),
handle the vast majority of typical use cases in finance, statistics,
social science, and many areas of engineering.
Here are just a few of the things that pandas does well:
• Easy handling of missing data (represented as NaN) in floating point
as well as non-floating point data
• Size mutability: columns can be inserted and deleted from
DataFrame and higher dimensional objects
• Automatic and explicit data alignment: objects can be explicitly
aligned to a set of labels, or the user can simply ignore the labels and
let Series, DataFrame, etc. automatically align the data for you in
computations
• Powerful, flexible group by functionality to perform split-apply-combine
operations on data sets, for both aggregating and
transforming data
• Make it easy to convert ragged, differently-indexed data in other
Python and NumPy data structures into DataFrame objects
• Intelligent label-based slicing, fancy indexing, and subsetting of large
data sets
• Intuitive merging and joining of data sets
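
A few of the features listed above can be illustrated with a minimal sketch. The labels, column names (dept, score) and values below are made up for illustration and are not from the original slides.

```python
import numpy as np
import pandas as pd

# Two Series with partially overlapping labels (hypothetical data).
s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Automatic alignment: values are matched by label, not by position.
# Labels present in only one operand produce NaN (missing data).
total = s1 + s2
print(total)

# Reductions such as sum() skip NaN by default.
print(total.sum())

# Split-apply-combine: split rows by a key column, then aggregate each group.
df = pd.DataFrame({"dept": ["x", "y", "x", "y"], "score": [1, 2, 3, 4]})
means = df.groupby("dept")["score"].mean()
print(means)
```

Here the labels "a" and "d" appear in only one of the two Series, so their sums are reported as NaN rather than raising an error.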
Data structures
Dimensions   Name        Description
1            Series      1D labeled homogeneously-typed array
2            DataFrame   General 2D labeled, size-mutable tabular structure
                         with potentially heterogeneously-typed columns
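
As a rough sketch of the difference between the two structures, the snippet below inspects the dtype of a Series and the per-column dtypes of a DataFrame. The column names (name, count, ratio) and values are hypothetical.

```python
import pandas as pd

# A Series is 1-dimensional and has a single dtype for all its values.
s = pd.Series([1.5, 2.5, 3.5])
print(s.ndim, s.dtype)

# A DataFrame is 2-dimensional; each column can have its own dtype.
df = pd.DataFrame({"name": ["a", "b"], "count": [1, 2], "ratio": [0.5, 0.25]})
print(df.ndim, df.shape)
print(df.dtypes)
```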

Team Contributors
https://pandas.pydata.org/about/team.html
Creating a Series:
We can create a Series in two ways:
• Create an empty Series
• Create a Series using inputs.
Create an Empty Series:
• An empty Series is a Series that holds no values.
The syntax for creating an empty Series:
<series object> = pandas.Series()

import pandas as pd
x = pd.Series()
print(x)

Create a Series using inputs:
• Array
• Dict
• Scalar value
Example - 1 using Array
import pandas as pd
import numpy as np
info = np.array(['P','a','n','d','a','s'])
a = pd.Series(info)
print(a)

Example - 2 using Dictionary
import pandas as pd
import numpy as np
info = {'x' : 0., 'y' : 1., 'z' : 2.}
a = pd.Series(info)
print(a)

Example - 3 using Scalar
import pandas as pd
import numpy as np
x = pd.Series(4, index=[0, 1, 2, 3])
print(x)

Note: The scalar value is repeated to match the length of the index.

Example - 4
import numpy as np
import pandas as pd
x = pd.Series(data=[2,4,6,8])
y = pd.Series(data=[11.2,18.6,22.5], index=['a','b','c'])
print(x.index)
print(x.values)
print(y.index)
print(y.values)
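
When a dictionary is combined with an explicit index, the index decides which labels appear and in what order. The sketch below reuses the dictionary from the dictionary example; the label 'w' is a made-up label that is not in the dictionary, added only to show what happens to missing keys.

```python
import pandas as pd

info = {'x': 0., 'y': 1., 'z': 2.}

# The index selects and orders the labels; 'w' is not a key in the dict,
# so its value is filled with NaN.
a = pd.Series(info, index=['z', 'y', 'w'])
print(a)

# Values can then be looked up by label.
print(a['z'])
```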
DataFrame

Create a DataFrame:

We can create a DataFrame in the following ways:
• dict
• Lists
• NumPy ndarrays
• Series
Create an empty DataFrame
The code below shows how to create an empty DataFrame in pandas:

import pandas as pd
df = pd.DataFrame()
print(df)

# using List
import pandas as pd
x = ['Python', 'Pandas']
df = pd.DataFrame(x)
print(df)

# using Dictionary
import pandas as pd
info = {'ID': [101, 102, 103], 'Department': ['B.Sc', 'B.Tech', 'M.Tech']}
df = pd.DataFrame(info)
print(df)
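
The list of input types also mentions NumPy ndarrays and Series, for which the slides give no code. A minimal sketch of both follows; the array contents and the column names a and b are assumptions chosen for illustration, and the Series reuse the ID and Department values from the dictionary example.

```python
import numpy as np
import pandas as pd

# using a NumPy ndarray: each row of the 2D array becomes a DataFrame row
arr = np.array([[1, 2], [3, 4], [5, 6]])
df_arr = pd.DataFrame(arr, columns=['a', 'b'])
print(df_arr)

# using Series: each Series becomes a column, aligned on the shared index
ids = pd.Series([101, 102, 103])
dept = pd.Series(['B.Sc', 'B.Tech', 'M.Tech'])
df_ser = pd.DataFrame({'ID': ids, 'Department': dept})
print(df_ser)
```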
Example:
import pandas as pd

# Define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

# Convert the dictionary into DataFrame
df = pd.DataFrame(data)

# select two columns
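
The selection statement itself does not appear in the slides. One plausible way to select two columns is to index the DataFrame with a list of column names, as sketched below. The choice of 'Name' and 'Qualification' is an assumption; any two existing column names would work the same way.

```python
import pandas as pd

data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
df = pd.DataFrame(data)

# Indexing with a list of column names returns a new DataFrame
# containing only those columns (the two names here are an assumption).
subset = df[['Name', 'Qualification']]
print(subset)
```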


import pandas as pd

# Define a dictionary containing Students data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}

# Convert the dictionary into DataFrame
df = pd.DataFrame(data)

# Declare a list that is to be converted into a column
address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']

# Using 'Address' as the column name and equating it to the list
df['Address'] = address

# Observe the result
print(df)
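
Two related points are worth a short sketch: a list assigned as a new column must have one value per row, and a column can be removed again. The variable name mismatch_rejected and the use of drop() to remove 'Height' are illustrative choices, not part of the original slides.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Height': [5.1, 6.2, 5.1, 5.2]})

# A list with 3 values cannot fill a column of a 4-row DataFrame;
# pandas raises ValueError and leaves the DataFrame unchanged.
try:
    df['Address'] = ['Delhi', 'Bangalore', 'Chennai']
    mismatch_rejected = False
except ValueError as exc:
    mismatch_rejected = True
    print('Rejected:', exc)

# A list of the right length is accepted as a new column.
df['Address'] = ['Delhi', 'Bangalore', 'Chennai', 'Patna']

# Columns can also be removed; drop() returns a new DataFrame.
df = df.drop(columns=['Height'])
print(df)
```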
