DATAFRAME

Uploaded by

jangiddheeru29
Copyright
© All Rights Reserved

S.N. QUESTION ANSWER
1 Write suitable Python code to create an empty dataframe.
Answer:
import pandas as pd
df = pd.DataFrame()
print(df)

2 Consider the following dataframe: student_df

Name    class  marks
Anamay  XI     95
Aditi   XI     82
Mehak   XI     65
Kriti   XI     45

Write a statement to get the minimum value of the column marks.
Answer:
student_df['marks'].min()

3. A dataframe studdf stores data about the students' stream and marks. A part of it is shown below:

Class  Stream      Marks
11     Science     95
11     Commerce    80
11     Arts        75
11     Vocational  65

Using the above dataframe, write the command to compute the average marks stream wise.
Answer:
studdf.pivot_table(index='Stream', values='Marks', aggfunc='mean')
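A stream-wise average can also be computed with groupby. The sketch below rebuilds the sample rows from the table in question 3 (the column names are taken from that table) and compares the two approaches; it is an illustration, not the only accepted answer.

```python
import pandas as pd

# Sample rows copied from the table in question 3.
studdf = pd.DataFrame({
    "Class": [11, 11, 11, 11],
    "Stream": ["Science", "Commerce", "Arts", "Vocational"],
    "Marks": [95, 80, 75, 65],
})

# Approach 1: pivot_table returns a DataFrame indexed by Stream.
by_pivot = studdf.pivot_table(index="Stream", values="Marks", aggfunc="mean")

# Approach 2: groupby returns a Series indexed by Stream.
by_group = studdf.groupby("Stream")["Marks"].mean()

print(by_pivot)
print(by_group)
```

Both results are indexed by stream; pivot_table gives a one-column DataFrame while groupby gives a Series.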
4 Write Python code to create a dataframe with appropriate headings from the list given below:
['S101', 'Amy', 70],
['S102', 'Bandhi', 69],
['S104', 'Cathy', 75],
['S105', 'Gundaho', 82]
Answer:
import pandas as pd
data = [['S101', 'Amy', 70], ['S102', 'Bandhi', 69],
        ['S104', 'Cathy', 75], ['S105', 'Gundaho', 82]]
df = pd.DataFrame(data, columns=['ID', 'Name', 'Marks'])
print(df)
5 Write a small Python code to create a dataframe with headings (a and b) from the list given below:
[[1,2],[3,4],[5,6],[7,8]]
Answer:
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=['a', 'b'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
df = pd.concat([df, df2], ignore_index=True)
print(df)
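Since DataFrame.append() is no longer available in pandas 2.0 and later, a version of this answer that runs on current pandas can use pd.concat(). The sketch below also builds the frame directly from the full list, which is the shorter route; the variable names combined and direct are illustrative only.

```python
import pandas as pd

# Two partial frames, as in the answer to question 5.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["a", "b"])
df2 = pd.DataFrame([[5, 6], [7, 8]], columns=["a", "b"])

# Stack the rows; ignore_index=True renumbers the rows 0..3.
combined = pd.concat([df, df2], ignore_index=True)

# Building the frame from the whole list in one step gives the same result.
direct = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], columns=["a", "b"])

print(combined)
```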

6 Consider the following dataframe, and answer the questions given below:
import pandas as pd
df = pd.DataFrame(
    {"Quarter1": [2000, 4000, 5000, 4400, 10000],
     "Quarter2": [5800, 2500, 5400, 3000, 2900],
     "Quarter3": [20000, 16000, 7000, 3600, 8200],
     "Quarter4": [1400, 3700, 1700, 2000, 6000]
    }
)
(i) Write the code to find the mean value from the above dataframe df over the index and column axis.
(ii) Use the sum() function to find the sum of all the values over the index axis.
(iii) Find the median of the dataframe df.
Answer:
I. print(df.mean(axis=0))   # over the index axis: one mean per column
   print(df.mean(axis=1))   # over the column axis: one mean per row
II. print(df.sum(axis=0))   # axis=0 is the index axis
III. print(df.median())
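The axis argument is easy to mix up, so here is a small sketch using the quarterly data from question 6. In pandas, axis=0 (also spelled axis="index") collapses the rows and returns one value per column, while axis=1 collapses the columns and returns one value per row.

```python
import pandas as pd

df = pd.DataFrame({
    "Quarter1": [2000, 4000, 5000, 4400, 10000],
    "Quarter2": [5800, 2500, 5400, 3000, 2900],
    "Quarter3": [20000, 16000, 7000, 3600, 8200],
    "Quarter4": [1400, 3700, 1700, 2000, 6000],
})

# axis=0: aggregate down the index, giving one total per column.
col_totals = df.sum(axis=0)

# axis=1: aggregate across the columns, giving one total per row.
row_totals = df.sum(axis=1)

print(col_totals)
print(row_totals)
```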

7 Given a data frame df1 as shown below:

City       Maxtemp  MinTemp  RainFall
Delhi      40       32       24.1
Bengaluru  31       25       36.2
Chennai    35       27       40.8
Mumbai     29       21       35.2
Kolkata    39       23       41.8

I. Write a command to compute the sum of every column of the data frame.
II. Write a command to compute the mean of column RainFall.
III. Write a command to compute the median of the Maxtemp column.
Answer:
I. df1.sum()
II. df1['RainFall'].mean()
III. df1.loc[:, 'Maxtemp'].median()
8. Find the output of the following code:
import pandas as pd
data = [{'a': 10, 'b': 20}, {'a': 6, 'b': 32, 'c': 22}]
# with two column indices, values same as dictionary keys
df1 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b'])
# with two column indices, one of them with another name
df2 = pd.DataFrame(data, index=['first', 'second'], columns=['a', 'b1'])
print(df1)
print(df2)
Answer:
        a   b
first   10  20
second  6   32
        a   b1
first   10  NaN
second  6   NaN
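To see why b1 comes out as NaN, it helps to build the frame without the columns argument first. The sketch below reuses the data from question 8; the names full and picked are illustrative. When a list of dictionaries is passed, pandas uses the union of the keys as columns, and columns= then selects from those keys, filling names that match no key with NaN.

```python
import pandas as pd

data = [{"a": 10, "b": 20}, {"a": 6, "b": 32, "c": 22}]

# No columns argument: every key found in any dictionary becomes a column.
full = pd.DataFrame(data, index=["first", "second"])

# columns= picks keys by name; "b1" matches no key, so it is all NaN.
picked = pd.DataFrame(data, index=["first", "second"], columns=["a", "b1"])

print(full)
print(picked)
```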
9 Write the code in pandas to create the following dataframes:

df1
   mark1  mark2
0  10     15
1  40     45
2  15     30
3  40     70

df2
   mark1  mark2
0  30     20
1  20     25
2  20     30
3  50     30

Write the commands to do the following operations on the dataframes given above:
(i) To add dataframes df1 and df2.
(ii) To subtract df2 from df1.
(iii) To rename column mark1 as marks1 in both the dataframes df1 and df2.
(iv) To change the index label of df1 from 0 to zero and from 1 to one.
Answer:
import pandas as pd
df1 = pd.DataFrame({'mark1': [10, 40, 15, 40], 'mark2': [15, 45, 30, 70]})
df2 = pd.DataFrame({'mark1': [30, 20, 20, 50], 'mark2': [20, 25, 30, 30]})
print(df1)
print(df2)
I. print(df1.add(df2))
II. print(df1.subtract(df2))
III. df1.rename(columns={'mark1': 'marks1'}, inplace=True)
     df2.rename(columns={'mark1': 'marks1'}, inplace=True)
     print(df1)
     print(df2)
IV. df1.rename(index={0: "zero", 1: "one"}, inplace=True)
    print(df1)
10 In a DataFrame, Axis= 1 represents the column
elements.
11 In Pandas, the function used to check for null values in a DataFrame is ____________.
Answer: isnull()
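A brief sketch of isnull() in use, on a small made-up frame (the data and the names mask and counts are hypothetical, chosen only for illustration). isnull() returns a frame of booleans of the same shape, and chaining sum() counts the missing entries per column.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1, np.nan, 3], "b": ["x", None, "z"]})

# Boolean frame: True wherever a value is missing.
mask = df.isnull()

# True counts as 1, so sum() gives the number of nulls per column.
counts = df.isnull().sum()

print(mask)
print(counts)
```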
12. Consider the following DataFrame df and answer any four questions from (i)-(v).

Rollno  name              UT1  UT2  UT3  UT4
1       Prerna Singh      24   24   20   22
2       Manish Arora      18   17   19   22
3       Tanish Goel       20   22   18   24
4       Falguni Jain      22   20   24   20
5       Kanika Bhatnagar  15   20   18   22
6       Ramandeep Kaur    20   15   22   24

I. Write down the command that will give the following output.
II. The teacher needs to know the marks scored by the student with roll number 4.
III. Write the statement(s) that will give the exact number of values in each column of the dataframe.
IV. Write the command that will display the column labels of the DataFrame.
V. Ms. Sharma, the class teacher, wants to add a new column Grade with the values 'A', 'B', 'A', 'A', 'B', 'A' to the DataFrame.
Answer:
I. print(df.max())
II. df1 = df[df['Rollno'] == 4]
    print(df1)
    or
    df1 = df[df.Rollno == 4]
    print(df1)
III. print(df.count())
     or
     print(df.count(axis=0))
IV. print(df.columns)
V. df['Grade'] = ['A', 'B', 'A', 'A', 'B', 'A']
13 Consider the following DataFrame, classframe

     Rollno  Name      Class  Section  CGPA  Stream
St1  1       Aman      IX     E        8.7   Science
St2  2       Preeti    X      F        8.9   Arts
St3  3       Kartikey  IX     D        9.2   Science
St4  4       Lakshay   X      A        9.4   Commerce

Write commands to:
I. Add a new column 'Activity' to the DataFrame.
II. Add a new row with values (5, Mridula, X, F, 9.8, Science).
Answer:
I. classframe['Activity'] = ['Swimming', 'Dancing', 'Cricket', 'Singing']
II. classframe.loc['St5'] = [5, 'Mridula', 'X', 'F', 9.8, 'Science']
14 Write a program in Python Pandas to create the following DataFrame batsman from a dictionary:

B_NO  Name           Score1  Score2
1     Sunil Pillai   90      80
2     Gaurav Sharma  65      45
3     Piyush Goel    70      90
4     Kartik Thakur  80      76

Perform the following operations on the DataFrame:
i. Add both the scores of a batsman and assign to column "Total".
ii. Display the highest score in both Score1 and Score2 of the DataFrame.
iii. Display the DataFrame.
Answer:
import pandas as pd
d1 = {
    'B_NO': [1, 2, 3, 4],
    'Name': ["Sunil Pillai", "Gaurav Sharma", "Piyush Goel", "Kartik Thakur"],
    'Score1': [90, 65, 70, 80],
    'Score2': [80, 45, 90, 76]
}
df = pd.DataFrame(d1)
i. df['Total'] = df['Score1'] + df['Score2']
   or
   df['Total'] = df[['Score1', 'Score2']].sum(axis=1)
ii. print("Maximum scores are : ", max(df['Score1']), max(df['Score2']))
iii. print(df)
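A note on adding two columns: Python's built-in sum(iterable, start) is not a row-wise column adder. Called as sum(df['Score1'], df['Score2']), it treats the second Series as the starting value and adds every element of Score1 to it. The sketch below, using the scores from the table in question 14, contrasts that with a row-wise total; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Score1": [90, 65, 70, 80],
    "Score2": [80, 45, 90, 76],
})

# Row-wise total: Score1 + Score2 for each batsman.
row_total = df[["Score1", "Score2"]].sum(axis=1)

# Built-in sum: starts from the Score2 Series and adds 90 + 65 + 70 + 80
# (that is, 305) to every entry, which is not the per-row total.
builtin_total = sum(df["Score1"], df["Score2"])

print(row_total.tolist())
print(builtin_total.tolist())
```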

15 Which of the following can be used to specify the data while creating a DataFrame?
i. Series
ii. List of Dictionaries
iii. Structured ndarray
iv. All of these
Answer: All of these
16 Carefully observe the following code:
import pandas as pd
Year1 = {'Q1': 5000, 'Q2': 8000, 'Q3': 12000, 'Q4': 18000}
Year2 = {'A': 13000, 'B': 14000, 'C': 12000}
totSales = {1: Year1, 2: Year2}
df = pd.DataFrame(totSales)
print(df)
Answer the following:
i. List the index of the DataFrame df.
ii. List the column names of DataFrame df.
Answer:
(i) The index labels of df will include Q1, Q2, Q3, Q4, A, B, C
    print(df.index)
(ii) The column names of df will be: 1, 2
    print(df.columns.values)
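The result in question 16 follows from how pandas treats a dictionary of dictionaries: the outer keys become column labels, and the union of the inner keys becomes the index, with NaN wherever a column has no value for a label. A minimal sketch with the same data:

```python
import pandas as pd

Year1 = {"Q1": 5000, "Q2": 8000, "Q3": 12000, "Q4": 18000}
Year2 = {"A": 13000, "B": 14000, "C": 12000}

# Outer keys (1 and 2) become columns; inner keys become the index.
df = pd.DataFrame({1: Year1, 2: Year2})

print(df.index.tolist())
print(df.columns.tolist())
print(df)
```

Year1 has no entries for A, B and C, and Year2 has none for Q1 to Q4, so each column is only partly filled.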

17. Consider the given DataFrame 'Stock':

   Name                  Price
0  Nancy Drew            150
1  Hardy boys            180
2  Diary of a wimpy kid  225
3  Harry Potter          500

Write suitable Python statements for the following:
i. Add a column called Special_Price with the following data: [135,150,200,440].
ii. Add a new book named 'The Secret' having price 800.
iii. Remove the column Special_Price.
Answer:
i. Stock['Special_Price'] = [135, 150, 200, 440]
ii. Stock.loc[4] = ['The Secret', 800]
iii. Stock = Stock.drop('Special_Price', axis=1)
18. Mr. Som, a data analyst, has designed the DataFrame df that contains data about Computer Olympiad with 'CO1', 'CO2', 'CO3', 'CO4', 'CO5' as indexes shown below. Answer the following questions:

     School  Tot_students  Topper  First_Runnerup
CO1  PPS     40            32      8
CO2  JPS     30            18      12
CO3  GPS     20            18      2
CO4  MPS     18            10      8
CO5  BPS     28            20      8

A. Predict the output of the following Python statements:
   a. df.shape
   b. df[2:4]
B. Write a Python statement to display the data of the Topper column of indexes CO2 to CO4.
C. Write a Python statement to compute and display the difference of data of the Tot_students column and the First_Runnerup column of the above given DataFrame.
Answer:
A. Output:
   i. (5,4)
   ii.
        School  Tot_students  Topper  First_Runnerup
   CO3  GPS     20            18      2
   CO4  MPS     18            10      8
B. Python statement:
   print(df.loc['CO2':'CO4', 'Topper'])
C. print(df.Tot_students - df.First_Runnerup)
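Parts A and B of question 18 rely on two different slicing rules: df[2:4] slices by position and excludes the end, whereas df.loc['CO2':'CO4'] slices by label and includes the end. The sketch below rebuilds the frame from the table to show both; the variable names by_position and by_label are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "School": ["PPS", "JPS", "GPS", "MPS", "BPS"],
        "Tot_students": [40, 30, 20, 18, 28],
        "Topper": [32, 18, 18, 10, 20],
        "First_Runnerup": [8, 12, 2, 8, 8],
    },
    index=["CO1", "CO2", "CO3", "CO4", "CO5"],
)

# Positional slice: rows at positions 2 and 3 (end position 4 is excluded).
by_position = df[2:4]

# Label slice: CO2 through CO4, with the end label included.
by_label = df.loc["CO2":"CO4", "Topper"]

print(df.shape)
print(by_position)
print(by_label)
```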

19. The Python code written below has syntactical errors. Rewrite the correct code and underline the corrections made.
Import pandas as pd
df ={"Technology":["Programming","Robotics","3D Printing"],
"Time(in months)":[4,4,3]
}
df= Pd.dataframe(df)
Print(df)
Answer:
import pandas as pd
df = {"Technology": ["Programming", "Robotics", "3D Printing"],
      "Time(in months)": [4, 4, 3]
}
df = pd.DataFrame(df)
print(df)

20 Create a DataFrame in Python from the given list:
[['Divya','HR',95000],
['Mamta','Marketing',97000],
['Payal','IT',980000],
['Deepak','Sales',79000]]
Answer:
import pandas as pd
df = [["Divya", "HR", 95000],
      ["Mamta", "Marketing", 97000],
      ["Payal", "IT", 980000],
      ["Deepak", "Sales", 79000]]
df = pd.DataFrame(df, columns=["Name", "Department", "Salary"])
print(df)

21 Consider the given DataFrame 'Genre':

   Type         Code
0  Fiction      F
1  Non Fiction  NF
2  Drama        D
3  Poetry       P

Write suitable Python statements for the following:
I. Add a column called Num_Copies with the following data: [300,290,450,760].
II. Add a new genre of type 'Folk Tale' having code as "FT" and 600 number of copies.
III. Rename the column 'Code' to 'Book_Code'.
Answer:
i. Genre["Num_Copies"] = [300, 290, 450, 760]
ii. Genre.loc[4] = ["Folk Tale", "FT", 600]
iii. Genre = Genre.rename({"Code": "Book_Code"}, axis=1)
     OR
     Genre = Genre.rename({"Code": "Book_Code"}, axis="columns")

22 Ekam, a Data Analyst with a multinational brand, has designed the DataFrame df that contains the four quarters' sales data of different stores as shown below:

Answer the following questions:
i. Predict the output of the following Python statements:
   a. print(df.size)
   b. print(df[1:3])
ii. Delete the last row from the DataFrame.
iii. Write a Python statement to add a new column Total_Sales which is the addition of all the 4 quarter sales.
Answer:
i. a. 15
ii. df = df.drop(2)
    OR
    df.drop(2, axis=0, inplace=True)
iii. df["Total_Sales"] = df["Qtr1"] + df["Qtr2"] + df["Qtr3"] + df["Qtr4"]

23 Which of the following Python statements can be used to select a column column_name from a DataFrame df?
A. df.getcolumn('column_name')
B. df['column_name']
C. df.select('column_name')
D. df(column_name)
Answer: (B). df['column_name']
24 Sneha is writing a Python program to create a DataFrame using a list of dictionaries. However, her code contains some mistakes. Identify the errors, rewrite the correct code, and underline the corrections made.
import Pandas as pd
D1 = {'Name': 'Rakshit', 'Age': 25}
D2 = {'Name': 'Paul', 'Age': 30}
D3 = {'Name':'Ayesha", 'Age': 28}
data = [D1,D2,D3)
df = pd.Dataframe(data)
print(df)
Answer:
import pandas as pd
D1 = {'Name': 'Rakshit', 'Age': 25}
D2 = {'Name': 'Paul', 'Age': 30}
D3 = {'Name': 'Ayesha', 'Age': 28}
data = [D1, D2, D3]
df = pd.DataFrame(data)
print(df)
25 Complete the given Python code to get the required output (ignore the dtype attribute) as
Output:
Tamil Nadu     Chennai
Uttar Pradesh  Lucknow
Manipur        Imphal
Code:
import _______ as pd
data = ['Chennai', '_______', 'Imphal']
indx = ['Tamil Nadu', 'Uttar Pradesh', 'Manipur']
s = pd.Series(_______, indx)
print(_______)
Answer:
import pandas as pd
data = ['Chennai', 'Lucknow', 'Imphal']
indx = ['Tamil Nadu', 'Uttar Pradesh', 'Manipur']
s = pd.Series(data, indx)
print(s)

26 Write a Python program to create the following DataFrame using a list of dictionaries.
Answer:
import pandas as pd
d1 = {'Product': 'Laptop', 'Price': 60000}
d2 = {'Product': 'Desktop', 'Price': 45000}
d3 = {'Product': 'Monitor', 'Price': 15000}
d4 = {'Product': 'Tablet', 'Price': 30000}
data = [d1, d2, d3, d4]
df = pd.DataFrame(data)
print(df)

27 Consider the DataFrame df shown below.

Write Python statements for the DataFrame df to:
I. Print the first two rows of the DataFrame df.
II. Display titles of all the movies.
III. Remove the column Rating.
IV. Display the data of the 'Title' column from indexes 2 to 4 (both included).
V. Rename the column name 'Title' to 'Name'.
Answer:
I. print(df.head(2))
II. print(df['Title'])
III. df = df.drop('Rating', axis=1)
IV. print(df.loc[2:4, 'Title'])
V. df.rename(columns={'Title': 'Name'}, inplace=True)

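28 Write a program in Python to find the maximum value over the index in a DataFrame.
Answer:
import pandas as pd
df = pd.DataFrame({"A": [4, 5, 2, 6], "B": [11, 2, 5, 8],
                   "C": [1, 8, 66, 4]})
print(df)
print(df.max(axis=0))      # maximum value in each column
print(df.idxmax(axis=0))   # index label at which each maximum occurs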

29 What is the purpose of each of the following statements?
1. df.columns
2. df.iloc[ : , :-5]
3. df[2:8]
4. df[ :]
5. df.iloc[ : -4 , : ]
Answer:
1. It displays the names of the columns of the DataFrame.
2. It displays all rows, with all columns except the last 5 columns.
3. It displays all columns for the rows at positions 2 to 7.
4. It displays the entire DataFrame with all rows and columns.
5. It displays all columns, with all rows except the last 4 rows.
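These slices can be checked on a small frame. The sketch below uses a hypothetical 10-row, 6-column DataFrame (the data and column names A to F are made up for illustration) and applies statements 2, 3 and 5 from question 29.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 10 rows (index 0..9) and 6 columns (A..F).
df = pd.DataFrame(np.arange(60).reshape(10, 6), columns=list("ABCDEF"))

# All rows, every column except the last 5: only column A remains.
all_but_last5_cols = df.iloc[:, :-5]

# Rows at positions 2 through 7 (position 8 is excluded), all columns.
rows_2_to_7 = df[2:8]

# Every row except the last 4, all columns.
all_but_last4_rows = df.iloc[:-4, :]

print(all_but_last5_cols.columns.tolist())
print(rows_2_to_7.index.tolist())
print(all_but_last4_rows.shape)
```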

30
Name     Age  Designation
Sanjeev  37   Manager
Keshav   42   Clerk
Rahul    38   Accountant

Write a Python program to sort the above data in ascending order of Age.
Answer:
import pandas as pd
name = pd.Series(['Sanjeev', 'Keshav', 'Rahul'])
age = pd.Series([37, 42, 38])
designation = pd.Series(['Manager', 'Clerk', 'Accountant'])
d1 = {'Name': name, 'Age': age, 'Designation': designation}
df = pd.DataFrame(d1)
print(df)
df1 = df.sort_values(by='Age')
print(df1)

31 Consider the following record in dataframe IPL

Player          Team                   Category  BidPrice  Runs
Hardik Pandya   Mumbai Indians         Batsman   13        1000
K L Rahul       Kings Eleven           Batsman   12        2400
Andre Russel    Kolkata Knight Riders  Batsman   7         900
Jasprit Bumrah  Mumbai Indians         Bowler    10        200
Virat Kohli     RCB                    Batsman   17        3600
Rohit Sharma    Mumbai Indians         Batsman   15        3700

1. Retrieve the first 2 and last 3 rows using a Python program.
2. Write a command to find the most expensive player.
3. Write a command to print the total players per team.
4. Write a command to find the player who had the highest BidPrice from each team.
5. Write a command to find the average runs of each team.
6. Write a command to sort all players according to BidPrice.
Answer:
import pandas as pd
d = {'Player': ['Hardik Pandya', 'K L Rahul', 'Andre Russel', 'Jasprit Bumrah', 'Virat Kohli', 'Rohit Sharma'],
     'Team': ['Mumbai Indians', 'Kings Eleven', 'Kolkata Knight Riders', 'Mumbai Indians', 'RCB', 'Mumbai Indians'],
     'Category': ['Batsman', 'Batsman', 'Batsman', 'Bowler', 'Batsman', 'Batsman'],
     'BidPrice': [13, 12, 7, 10, 17, 15],
     'Runs': [1000, 2400, 900, 200, 3600, 3700]}
df = pd.DataFrame(d)
print(df)
1. print(df.iloc[:2, :])
   print(df.iloc[-3:, :])
2. print(df[df['BidPrice'] == df['BidPrice'].max()])
3. print(df.groupby('Team').Player.count())
4. print(df.loc[df.groupby('Team')['BidPrice'].idxmax(), ['Team', 'Player', 'BidPrice']])
5. print(df.groupby(['Team']).Runs.mean())
6. print(df.sort_values(by='BidPrice'))
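For part 4 of question 31, one way to get the player with the highest bid in each team is to look up, per team, the row where BidPrice peaks, then select those rows. Taking a column-wise max() on the grouped frame instead computes the maximum of each column separately, so the name and the price it reports need not come from the same row. The sketch below uses the relevant columns from the table; the names top_rows and top_player are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Player": ["Hardik Pandya", "K L Rahul", "Andre Russel",
               "Jasprit Bumrah", "Virat Kohli", "Rohit Sharma"],
    "Team": ["Mumbai Indians", "Kings Eleven", "Kolkata Knight Riders",
             "Mumbai Indians", "RCB", "Mumbai Indians"],
    "BidPrice": [13, 12, 7, 10, 17, 15],
})

# idxmax gives, for each team, the row label of its highest BidPrice.
top_rows = df.loc[df.groupby("Team")["BidPrice"].idxmax()]

# Map each team to the player on that row.
top_player = top_rows.set_index("Team")["Player"]

print(top_rows)
```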

32
MaxTemp  MinTemp  City       RainFall
45       30       Delhi      25.6
34       24       Guwahati   41.5
48       34       Chennai    36.8
32       22       Bengaluru  40.2
44       29       Mumbai     38.5
39       37       Jaipur     24.9

Consider the above data frame as df.
1. Write a command to compute the sum of every column of the data frame.
2. Based on the above data frame df, write a command to compute the mean of column MaxTemp.
3. Based on the above data frame df, write a command to compute the average MinTemp and RainFall for the first 4 rows.
Answer:
1. print(df.sum(axis=0))
2. print(df['MaxTemp'].mean())
3. df[['MinTemp', 'RainFall']][:4].mean()
33 Write a program in Python to join two data frames.
Answer:
import pandas as pd
xiia = {'sub': ['eng', 'mat', 'ip', 'phy', 'che', 'bio'],
        'id': ['302', '041', '065', '042', '043', '044']}
xiic = {'sub': ['eng', 'mat', 'ip', 'acc', 'bst', 'eco'],
        'id': ['302', '041', '065', '055', '056', '057']}
df1 = pd.DataFrame(xiia)
print(df1)
df2 = pd.DataFrame(xiic)
print(df2)
print(df1.merge(df2, on='id'))                # inner join: ids present in both
print(df1.merge(df2, on='id', how='outer'))   # outer join: ids present in either
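The two merge calls in question 33 differ only in the how argument. A short sketch with the same subject data shows what each returns: the default inner join keeps only ids found in both frames, and the outer join keeps every id from either frame. Because both frames have a sub column, pandas adds its default suffixes and names them sub_x and sub_y.

```python
import pandas as pd

xiia = {"sub": ["eng", "mat", "ip", "phy", "che", "bio"],
        "id": ["302", "041", "065", "042", "043", "044"]}
xiic = {"sub": ["eng", "mat", "ip", "acc", "bst", "eco"],
        "id": ["302", "041", "065", "055", "056", "057"]}

df1 = pd.DataFrame(xiia)
df2 = pd.DataFrame(xiic)

# Inner join (default): only ids common to both frames.
inner = df1.merge(df2, on="id")

# Outer join: all ids; subjects missing on one side become NaN.
outer = df1.merge(df2, on="id", how="outer")

print(inner)
print(outer)
```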
