Pandas Notes For XII
Introduction
Pandas (or Python pandas) is Python's library for data analysis. Pandas derives
its name from "panel data", an econometrics term for multidimensional,
structured data sets. Pandas has become a popular choice for data analysis.
Data analysis refers to the process of evaluating large data sets using
analytical and statistical tools to discover useful information and draw
conclusions that support business decision-making.
Using pandas- Pandas is an open-source, BSD-licensed library built for the
Python programming language. Pandas offers high-performance, easy-to-use data
structures and data analysis tools. To use it, import the library:
import pandas as pd
Why Pandas- Pandas is the most popular library in the scientific Python
ecosystem for doing data analysis. Pandas is capable of many tasks including:
Creating Series Objects: A Series object can be created in many ways using
the pandas library's Series() function. Import pandas first, and import numpy
as well when the data comes from an ndarray.
(i) Specify data as a Python sequence
The data can be given in the form of a tuple:
s1=pd.Series((4,6,8,10))
Example-
import pandas as pd
s1=pd.Series(["i","am","laughing"])
print("series object")
print(s1)
Output:
series object
0 i
1 am
2 laughing
(ii) Specify data as an ndarray
import pandas as pd
import numpy as np
nda1=np.arange(3,13,3.5)
print(nda1)
ser1=pd.Series(nda1)
print(ser1)
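(iii) Specify data as a scalar value
A single scalar value is repeated for every label in the index.
Example: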
import pandas as pd
s1=pd.Series(200,index=range(2020,2029,2))
print(s1)
Output:
2020 200
2022 200
2024 200
2026 200
2028 200
dtype: int64
(iv) Specify index(es) as well as data with Series()-
<Series Object>=pandas.Series(data=None,index=None)
Example-
import pandas as pd
obj1=pd.Series(data=[32,24,26],index=['a','b','c'])
print(obj1)
Example-
import pandas as pd
section=['a','b','c','d']
contri=[6700,5600,5000,5200]
s1=pd.Series(data=contri,index =section)
print(s1)
Example-
import pandas as pd
import numpy as np
a=np.arange(9,13)
s1=pd.Series(data=a,index =a*2)
print(s1)
Example-
import pandas as pd
import numpy as np
section=['a','b','c','d','e']
contri=np.array([6700,5600,5000,5200,np.nan])
s1=pd.Series(data=contri,index =section,dtype=np.float32)
print(s1)
Output:
a 6700.0
b 5600.0
c 5000.0
d 5200.0
e NaN
dtype: float32
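NaN is a floating-point value, which is why the Series above is stored with a float dtype. The sketch below rebuilds the same data (the variable name contri_ser is only illustrative) and shows one way to inspect the dtype and locate the missing entry.

```python
import numpy as np
import pandas as pd

section = ['a', 'b', 'c', 'd', 'e']
contri = np.array([6700, 5600, 5000, 5200, np.nan])

# Same Series as in the example above; contri_ser is an illustrative name.
contri_ser = pd.Series(data=contri, index=section, dtype=np.float32)

print(contri_ser.dtype)     # the dtype requested at creation time
print(contri_ser.isna())    # True only where the value is missing
print(contri_ser.count())   # number of non-NaN entries
```

isna() returns a Boolean Series with the same index, and count() skips missing values.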
import pandas as pd
s1=pd.Series([4,6,8,10])
print("series object1")
print(s1)
s1[0]=100
s1[2:]=50
print(s1)
Output:
series object1
0 4
1 6
2 8
3 10
dtype: int64
0 100
1 6
2 50
3 50
dtype: int64
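The assignments above rely on the default integer index. As a minimal sketch (the labels 'a' to 'd' are made up for illustration), the same changes can be written with iloc, which always works by position, and loc, which always works by label. This keeps the intent clear once the index is no longer 0, 1, 2, and so on.

```python
import pandas as pd

s1 = pd.Series([4, 6, 8, 10])

s1.iloc[0] = 100     # change the element at position 0
s1.iloc[2:] = 50     # change every element from position 2 onwards

# Replace the default index with hypothetical labels
s1.index = ['a', 'b', 'c', 'd']
s1.loc['b'] = 7      # change the element whose label is 'b'

print(s1)
```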
Output:
0 2
1 4
dtype: int64
2 5
3 6
dtype: int64
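The code for the output above is not shown. One way to get this kind of result, assuming the data is the Series [2, 4, 5, 6], is with head() and tail(), which return the first and last entries of a Series.

```python
import pandas as pd

# Assumed data, chosen to match the output shown above
trdata = pd.Series([2, 4, 5, 6])

first_two = trdata.head(2)   # first two entries (labels 0 and 1)
last_two = trdata.tail(2)    # last two entries (labels 2 and 3)

print(first_two)
print(last_two)
```

Called without an argument, head() and tail() return five entries by default.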
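Filtering Entries- A Boolean condition applied to a Series object is evaluated
element by element. Using that condition as the subscript returns only the
entries for which it is True.
<Series Object>[<Boolean expression on Series Object>]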
Example-
import pandas as pd
info=pd.Series(data=[31,41,51])
print(info)
print(info>40)
print(info[info>40])
Output:
0 31
1 41
2 51
dtype: int64
0 False
1 True
2 True
dtype: bool
1 41
2 51
dtype: int64
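Several conditions can be combined in one filter. The sketch below reuses the info Series from the example. The names between and either are only illustrative. Each condition goes in its own parentheses, joined with & (and) or | (or).

```python
import pandas as pd

info = pd.Series(data=[31, 41, 51])

# Entries greater than 30 AND less than 50
between = info[(info > 30) & (info < 50)]

# Entries less than 35 OR greater than 50
either = info[(info < 35) | (info > 50)]

print(between)
print(either)
```

The parentheses are needed because & and | bind more tightly than the comparison operators.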
print(info.sort_values(ascending=False))
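The line above sorts the entries by value in descending order. As a small sketch on the same data, sort_values() orders by value and sort_index() orders by index label. Both return a new Series and leave the original unchanged.

```python
import pandas as pd

info = pd.Series(data=[31, 41, 51])

by_value_desc = info.sort_values(ascending=False)   # largest value first
by_index_desc = info.sort_index(ascending=False)    # largest label first

print(by_value_desc)
print(by_index_desc)
print(info)   # the original order is untouched
```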
Reindexing:
<Series Object>=<Object>.reindex(<sequence with new order>)
obj1=obj2.reindex(['e','b','c','d','a'])
Example-
import pandas as pd
data=[2,4,5,6]
trdata1=pd.Series(data)
print(trdata1)
trdata2=trdata1.reindex([3,2,1,0])
print(trdata2)
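reindex() can also be given a label that does not exist in the original object. The sketch below uses made-up data (obj2 and the label 'z' are illustrative). The missing label gets NaN unless a fill_value is supplied.

```python
import pandas as pd

# Hypothetical data for illustration
obj2 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# 'z' is not in obj2, so its value becomes NaN
obj1 = obj2.reindex(['c', 'a', 'z'])
print(obj1)

# fill_value supplies a replacement for missing labels
obj3 = obj2.reindex(['c', 'a', 'z'], fill_value=0)
print(obj3)
```

Because NaN is a float, obj1 holds floating-point values, while obj3 keeps integers.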
Example:
import pandas as pd
topperA={'Rollno':115,'Name':'Pavni','Marks':97.5}
topperB={'Rollno':116,'Name':'Rishi','Marks':98}
topperC={'Rollno':117,'Name':'Paula','Marks':98.5}
toppers=[topperA,topperB,topperC]
topdf=pd.DataFrame(toppers)
print(topdf)
Output:
Rollno Name Marks
0 115 Pavni 97.5
1 116 Rishi 98.0
2 117 Paula 98.5
Example:
import pandas as pd
list2=[[25,45,60],[34,67,89],[88,90,56]]
df2=pd.DataFrame(list2,index=['row1','row2','row3'])
print(df2)
Output:
0 1 2
row1 25 45 60
row2 34 67 89
row3 88 90 56
DataFrame Attributes-
When you create a DataFrame object, all information related to it is
available through its attributes.
<DataFrame object>.<attribute name>
1. len(<DataFrame object>)- counts the rows
2. <DataFrame object>.count()- counts the non-NA values for each column
3. <DataFrame object>.T - transposes the DataFrame (rows become columns and
columns become rows)
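The items listed above can be tried on a small DataFrame. This is a minimal sketch that reuses the nested-list data from the earlier example. The shape attribute is an addition not listed above, included because it reports rows and columns together.

```python
import pandas as pd

list2 = [[25, 45, 60], [34, 67, 89], [88, 90, 56]]
df2 = pd.DataFrame(list2, index=['row1', 'row2', 'row3'])

print(len(df2))      # number of rows
print(df2.shape)     # (number of rows, number of columns)
print(df2.count())   # non-NA values in each column
print(df2.T)         # transpose: row labels become column labels
```

In the transposed result, the labels row1, row2 and row3 become the column labels, and 0, 1 and 2 become the row labels.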
Selecting or accessing data-