aadarsh

This document is a practical file for the Bachelor of Computer Applications program at Maharaja Surajmal Institute, detailing various data visualization and analytics tasks. It includes a series of programming exercises using Python libraries such as pandas, matplotlib, and seaborn, covering topics like DataFrame manipulation, plotting graphs, and statistical analysis. Each practical task is accompanied by code snippets and expected outputs, demonstrating the application of data analysis techniques.

Uploaded by

prem prasad

MAHARAJA SURAJMAL INSTITUTE

Affiliated to GGSIP University & NAAC ‘A+’ grade accredited

BACHELOR OF COMPUTER APPLICATIONS

Data Visualization & Analytics


PRACTICAL FILE
SUBJECT CODE – BCAP 312

Submitted by: Aadarsh Jha
Enrollment no.: 09121202022
Sem: 6th, Sec: B (2nd shift)

Submitted to: Dr. Sushma Malik
Associate Professor, MSI

INDEX
S.No. | Practical | Date | Sign
1 | Write a program to create a DataFrame having e-commerce data and perform selection of rows/columns using loc() and iloc() | 23/01/25 |
2 | Create a Series object S5 containing numbers. Write a program to store the square of the series values in object S6. Display S6's values which are >15. | 06/02/25 |
3 | Write a program to fill all missing values in a DataFrame with zero. | 06/02/25 |
4 | Program for combining DataFrames using concat(), join(), merge() | 13/02/25 |
5 | Write a program to draw a bar graph for the following data for the medal tally of CWG-2018: Gold 26, Silver 20, Bronze 20, Total 66 | 27/02/25 |
6 | Implementing Line plot, Dist plot, Lmplot, Count plot using the Seaborn library | 27/02/25 |
7 | Create a DataFrame named aid that stores aid (toys, books, uniform, shoes) by NGOs for different states. Write a program to display the aid for: (a) Books and Uniforms only (b) Shoes only | 06/03/25 |
8 | Create a DataFrame ndf having Name, Gender, Position, City, Age, Projects. Write a program to summarize how many projects are being handled by each position for each city. Use pivot() | 20/03/25 |
9 | Marks is a list that stores the marks of a student in 10 unit tests. Write a program to plot a line chart for the student's performance in these 10 tests | 20/03/25 |
10 | Write a program to plot a horizontal bar chart from the heights of some students | 27/03/25 |
11 | Write a program to implement ANOVA | 17/04/25 |
12 | Write a program to show correlation between two randomly generated numbers | 17/04/25 |
13 | Write a program to implement Covariance | 24/04/25 |
14 | Create a GUI based form for admission purpose for your college | 24/04/25 |
15 | The created GUI based application form is to be connected to a database and use an insert query to enter data. | 01/05/25 |

Practical – 1

Ques. Write a program to create a DataFrame having e-commerce data and perform selection of rows/columns using loc() and iloc()

Code :-
import pandas as pd

# Sample e-commerce data
data = {
    'OrderID': [101, 102, 103, 104, 105],
    'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Headphones'],
    'Price': [1200, 20, 50, 300, 100],
    'Quantity': [1, 2, 1, 1, 1],
    'CustomerID': ['C001', 'C002', 'C003', 'C004', 'C005']
}

# Create DataFrame
df = pd.DataFrame(data)
df

Output :-
Code :-
# Selection using loc() - selecting rows and columns by labels
print("Selection using loc():")

# Selecting rows with index labels 0 to 2 (end label included) and columns 'OrderID', 'Product', and 'Price'
df.loc[0:2, ['OrderID', 'Product', 'Price']]

Output :-

Code :-
# Selection using iloc() - selecting rows and columns by index positions
print("Selection using iloc():")

# Selecting first 3 rows and first 4 columns (end position excluded)
df.iloc[:3, :4]

Output :-
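A side note on the two selections above: loc slices by label and includes the end label, while iloc slices by position and excludes the end position. The following is a minimal sketch of that difference; the frame `demo` and its string labels are hypothetical and are used only so that labels and positions cannot be confused.

```python
import pandas as pd

# Hypothetical three-row frame with string labels, used only for illustration
demo = pd.DataFrame(
    {"Product": ["Laptop", "Mouse", "Keyboard"], "Price": [1200, 20, 50]},
    index=["a", "b", "c"],
)

# loc slices by label and includes the end label
by_label = demo.loc["a":"b", ["Product"]]

# iloc slices by position and excludes the end position
by_position = demo.iloc[0:1, [0]]

print(len(by_label))     # 2 rows: labels 'a' and 'b'
print(len(by_position))  # 1 row: position 0 only
```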
Practical – 2

Ques. Create a Series object S5 containing numbers. Write a program to store the square of the series values in object S6. Display S6's values which are >15.

Code :-
import pandas as pd

# Creating Series object S5 containing numbers
S5 = pd.Series([3, 5, 7, 9, 11, 13, 15, 17, 19, 21])

# Store the square of the series values in object S6
S6 = S5 ** 2

# Keep only the values of S6 that are greater than 15
result = S6[S6 > 15]
print("Values in S6 greater than 15 : ")
result

Output :-
Practical – 3

Ques. Write a program to fill all missing values in a DataFrame with zero

Code :-
import pandas as pd

# Sample DataFrame with missing values
data = {
    'A': [1, None, 3, 4, None],
    'B': [None, 6, 7, None, 9],
    'C': [10, 11, None, 13, 14]
}

# Create DataFrame
df = pd.DataFrame(data)

# Fill missing values with zero
df_filled = df.fillna(0)

# Display the DataFrame after filling missing values
print("DataFrame with missing values filled with zero : ")
df_filled

Output:-
Practical – 4

Ques. Program for combining DataFrames using concat(), join(),merge()

Code :-
import pandas as pd

# Sample DataFrames
df1 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['a', 'b', 'c']
})

df2 = pd.DataFrame({
    'A': [4, 5, 6],
    'B': ['d', 'e', 'f']
})

# Using concat() to combine DataFrames along rows
concatenated_df = pd.concat([df1, df2])
print("Concatenated DataFrame : ")
concatenated_df

Output:-
Code :-

# Sample DataFrames for join()
df3 = pd.DataFrame({
    'A': [1, 2, 3],
    'C': ['x', 'y', 'z']
})

df4 = pd.DataFrame({
    'D': ['p', 'q', 'r']},
    index=[1, 2, 3])

# Using join() to combine DataFrames based on index
joined_df = df3.join(df4)
print("Joined DataFrame : ")
joined_df

Output:-
Code :-
# Sample DataFrames for merge()
df5 = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['a', 'b', 'c'],
    'Key': ['K1', 'K2', 'K3']
})

df6 = pd.DataFrame({
    'C': ['x', 'y', 'z'],
    'D': ['p', 'q', 'r'],
    'Key': ['K1', 'K2', 'K3']
})

# Using merge() to combine DataFrames based on a common column
merged_df = pd.merge(df5, df6, on='Key')
print("Merged DataFrame:")
merged_df

Output:-
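The join() example above uses the default left join, so the row with index 0 has no match and gets a missing value in column D. As a rough sketch of how the `how` parameter changes which index labels survive, the frames below repeat the df3/df4 data under the hypothetical names `left` and `right`.

```python
import pandas as pd

# Same data as df3 and df4 above, under illustrative names
left = pd.DataFrame({"A": [1, 2, 3], "C": ["x", "y", "z"]})    # index 0, 1, 2
right = pd.DataFrame({"D": ["p", "q", "r"]}, index=[1, 2, 3])  # index 1, 2, 3

left_join = left.join(right)                 # default how='left': keeps 0, 1, 2
inner_join = left.join(right, how="inner")   # keeps only shared labels
outer_join = left.join(right, how="outer")   # keeps the union of labels

print(int(left_join["D"].isna().sum()))  # 1 (index 0 has no match)
print(inner_join.index.tolist())         # [1, 2]
print(outer_join.index.tolist())         # [0, 1, 2, 3]
```

pd.merge() accepts the same `how` values when combining on a column such as 'Key'.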
Practical – 5

Ques. Write a program to draw bar graph for the following data for the Medal
tally of CWG-2018:-
Gold Silver Bronze Total
26 20 20 66

Code :-
import matplotlib.pyplot as plt

# Data
medals = ['Gold', 'Silver', 'Bronze', 'Total']
counts = [26, 20, 20, 66]

# Create bar graph
plt.figure(figsize=(8, 5))
plt.bar(medals, counts, color=['gold', 'silver', 'brown', 'gray'])

# Add labels and title
plt.xlabel('Medal Type')
plt.ylabel('Count')
plt.title('CWG 2018 Medal Tally')
plt.grid(axis='y')

# Show plot
plt.show()

Output:-
Practical – 6

Ques. Implementing Line plot, Dist plot, Lmplot, Count plot using Seaborn
library

Code :-
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Sample data
data = pd.DataFrame({
    'x': range(1, 11),
    'y': [3, 5, 7, 4, 6, 8, 5, 9, 7, 10],
    'category': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
})

# Line plot
plt.figure(figsize=(8, 5))
sns.lineplot(x='x', y='y', data=data)
plt.title('Line Plot')
plt.show()

Output:-
Code :-
# Dist plot
# sns.distplot is deprecated since seaborn 0.11; sns.histplot is used here as its replacement
plt.figure(figsize=(8, 5))
sns.histplot(data['y'], kde=False, bins=5)
plt.title('Distribution Plot')
plt.show()

Output:-
Code :-
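# Lmplot
# lmplot is a figure-level function that creates its own figure,
# so the size is set with height/aspect instead of plt.figure()
sns.lmplot(x='x', y='y', data=data, hue='category', height=5, aspect=1.6)
plt.title('Lmplot')
plt.show()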

Output:-
Code :-
# Count plot
plt.figure(figsize=(8, 5))
sns.countplot(x='category', data=data)
plt.title('Count Plot')
plt.show()

Output:-
Practical – 7

Ques. Create a DataFrame named aid that stores aid (toys, books, uniform, shoes) by NGOs for different states. Write a program to display the aid for: (a) Books and Uniforms only (b) Shoes only

Code :-
import pandas as pd

# Creating the DataFrame 'aid'
aid_data = {
    'NGO': ['NGO1', 'NGO2', 'NGO3', 'NGO4'],
    'State': ['State1', 'State2', 'State3', 'State4'],
    'Toys': [100, 150, 200, 120],
    'Books': [200, 250, 180, 220],
    'Uniform': [150, 180, 200, 160],
    'Shoes': [120, 100, 150, 130]
}

aid = pd.DataFrame(aid_data)

# Displaying aid for Books and Uniforms only
books_uniform_aid = aid[['NGO', 'State', 'Books', 'Uniform']]
print("Aid for Books and Uniforms only:")
books_uniform_aid

Output:-
Code :-
# Displaying aid for Shoes only
shoes_aid = aid[['NGO', 'State', 'Shoes']]
print("\nAid for Shoes only:")
shoes_aid

Output:-
Practical – 8

Ques. Create a DataFrame ndf having Name, Gender, Position, City, Age, Projects. Write a program to summarize how many projects are being handled by each position for each city. Use pivot()

Code :-
import pandas as pd

# Creating the DataFrame 'ndf'
data = {
    'Name': ['John', 'Emily', 'Michael', 'Sophia', 'William'],
    'Gender': ['M', 'F', 'M', 'F', 'M'],
    'Position': ['Manager', 'Developer', 'Manager', 'Developer', 'Manager'],
    'City': ['New York', 'Los Angeles', 'Chicago', 'New York', 'Chicago'],
    'Age': [35, 28, 40, 32, 38],
    'Projects': [5, 3, 4, 6, 5]
}

ndf = pd.DataFrame(data)

# Summarizing the number of projects by each position for each city.
# pivot_table() is used because a (Position, City) pair can repeat and has to be aggregated.
summary = pd.pivot_table(ndf, index=['Position', 'City'], values='Projects', aggfunc='sum')

print("Summary of projects by each position for each city:")
summary

Output:-
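The question asks for pivot(), but DataFrame.pivot() only reshapes and raises a ValueError when an index/column pair repeats, as (Manager, Chicago) does in this data. One possible way to still use pivot() is to aggregate first and reshape afterwards. The sketch below assumes only the three columns that matter for the summary; `totals` and `wide` are illustrative names.

```python
import pandas as pd

# Only the columns needed for the summary, same values as ndf above
ndf = pd.DataFrame({
    "Position": ["Manager", "Developer", "Manager", "Developer", "Manager"],
    "City": ["New York", "Los Angeles", "Chicago", "New York", "Chicago"],
    "Projects": [5, 3, 4, 6, 5],
})

# Aggregate first so that each (Position, City) pair occurs exactly once
totals = ndf.groupby(["Position", "City"], as_index=False)["Projects"].sum()

# pivot() then reshapes the unique pairs into a Position x City grid
wide = totals.pivot(index="Position", columns="City", values="Projects")

# Missing pairs become NaN, so the values are stored as floats
print(wide.loc["Manager", "Chicago"])  # 9.0 (4 + 5)
```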
Practical – 9

Ques. Marks is a list that stores marks of a student in 10 unit test. Write a
program to plot Line chart for the student’s performance in these 10 test

Code :-
import matplotlib.pyplot as plt

# Sample marks data (replace with actual marks)
marks = [85, 78, 90, 82, 88, 92, 79, 84, 87, 91]

# Create x-axis values (unit test numbers)
unit_tests = range(1, len(marks) + 1)

# Plot line chart
plt.figure(figsize=(8, 5))
plt.plot(unit_tests, marks, marker='o', linestyle='-')
plt.title("Student's Performance in 10 Unit Tests")
plt.xlabel('Unit Test Number')
plt.ylabel('Marks')
plt.xticks(unit_tests)
plt.grid(True)
plt.show()

Output:-
Practical – 10

Ques. Write a program to plot a horizontal bar chart from the height of some
students

Code :-
import matplotlib.pyplot as plt

# Student names and their heights (replace with actual data)
students = ['John', 'Emily', 'Michael', 'Sophia', 'William']
heights = [170, 165, 180, 160, 175]  # Heights in centimeters

# Plot horizontal bar chart
plt.figure(figsize=(8, 5))
plt.barh(students, heights, color='skyblue')
plt.xlabel('Height (cm)')
plt.ylabel('Student')
plt.title('Heights of Students')
plt.grid(axis='x')  # Show grid lines only on the x-axis
plt.show()

Output:-
Practical – 11

Ques. Write a program to implement ANOVA

Code :-
import scipy.stats as stats

# Sample data
group1 = [25, 30, 35, 40, 45]
group2 = [20, 22, 25, 28, 30]
group3 = [15, 18, 20, 22, 25]

# Perform one-way ANOVA
f_statistic, p_value = stats.f_oneway(group1, group2, group3)

# Print results
print("F-statistic:", f_statistic)
print("P-value:", p_value)

# Interpret results at the 5% significance level
alpha = 0.05
if p_value < alpha:
    print("Reject null hypothesis: There is a significant difference between group means.")
else:
    print("Fail to reject null hypothesis: There is no significant difference between group means.")

Output:-
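To make the F-statistic less of a black box, the sketch below recomputes it from the between-group and within-group sums of squares and compares it with the value returned by stats.f_oneway. It reuses the three sample groups from the program above; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Same sample groups as in the program above
groups = [
    np.array([25, 30, 35, 40, 45]),
    np.array([20, 22, 25, 28, 30]),
    np.array([15, 18, 20, 22, 25]),
]

k = len(groups)                             # number of groups
n_total = sum(len(g) for g in groups)       # total number of observations
grand_mean = np.concatenate(groups).mean()

# Variation of the group means around the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of the observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within
f_manual = (ss_between / (k - 1)) / (ss_within / (n_total - k))

f_scipy, p_value = stats.f_oneway(*groups)
print(bool(np.isclose(f_manual, f_scipy)))  # True
```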
Practical – 12

Ques. Write a program to show correlation between two randomly generated numbers

Code :-
import numpy as np
import matplotlib.pyplot as plt

# Generate random data
np.random.seed(0)  # for reproducibility
x = np.random.randn(100)  # random numbers for x-axis
y = np.random.randn(100)  # random numbers for y-axis

# Calculate correlation coefficient
correlation = np.corrcoef(x, y)[0, 1]

# Plot scatter plot
plt.figure(figsize=(8, 5))
plt.scatter(x, y, color='blue')
plt.title('Correlation between Two Randomly Generated Numbers')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid(True)
plt.text(2, 2, f'Correlation: {correlation:.2f}', fontsize=12, color='red')
plt.show()

Output:-
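For reference, the value returned by np.corrcoef is the Pearson coefficient: the sum of products of deviations divided by the square root of the product of the summed squared deviations. The sketch below checks this by hand on freshly generated data. It uses np.random.default_rng, which is an assumption of this sketch and not the seeding call used in the program above, so the numbers differ from that program.

```python
import numpy as np

# Illustrative data; default_rng is a swapped-in generator, not the one used above
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = rng.standard_normal(100)

# Deviations from the means
dx = x - x.mean()
dy = y - y.mean()

# Pearson r computed directly from its definition
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())
r_numpy = np.corrcoef(x, y)[0, 1]

print(bool(np.isclose(r_manual, r_numpy)))  # True
```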
Practical – 13

Ques. Write a program to implement Covariance

Code :-
import numpy as np

# Define two sets of sample data
x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])

# Calculate the mean of each set
mean_x = np.mean(x)
mean_y = np.mean(y)

# Calculate population covariance (average of the products of deviations, dividing by n)
covariance = np.mean((x - mean_x) * (y - mean_y))

print("Covariance between x and y:", covariance)

Output:-
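The program above divides by n, which gives the population covariance. NumPy's np.cov divides by n - 1 by default (sample covariance), so the two results differ unless bias=True is passed. A small sketch on the same data:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])

# Manual population covariance: divides by n
population_cov = np.mean((x - x.mean()) * (y - y.mean()))

# np.cov returns a 2x2 matrix; element [0, 1] is cov(x, y)
sample_cov = np.cov(x, y)[0, 1]                    # divides by n - 1
population_cov_np = np.cov(x, y, bias=True)[0, 1]  # divides by n

print(float(population_cov))     # -2.0
print(float(sample_cov))         # -2.5
print(float(population_cov_np))  # -2.0
```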
Practical – 14

Ques. Create a GUI based form for admission purpose for your college

Code :-
import tkinter as tk
from tkinter import messagebox

def submit_form():
    # Read the current values from the form widgets
    name = name_entry.get()
    age = age_entry.get()
    gender = gender_var.get()
    course = course_var.get()

    # Display submitted data
    messagebox.showinfo("Admission Form Submitted",
                        f"Name: {name}\nAge: {age}\nGender: {gender}\nCourse: {course}")

# Create main window


root = tk.Tk()
root.title("College Admission Form")

# Labels
tk.Label(root, text="Name:").grid(row=0, column=0, padx=10, pady=5)
tk.Label(root, text="Age:").grid(row=1, column=0, padx=10, pady=5)
tk.Label(root, text="Gender:").grid(row=2, column=0, padx=10, pady=5)
tk.Label(root, text="Course:").grid(row=3, column=0, padx=10, pady=5)

# Entry fields
name_entry = tk.Entry(root)
name_entry.grid(row=0, column=1, padx=10, pady=5)
age_entry = tk.Entry(root)
age_entry.grid(row=1, column=1, padx=10, pady=5)

# Radio buttons for gender
gender_var = tk.StringVar()
tk.Radiobutton(root, text="Male", variable=gender_var,
               value="Male").grid(row=2, column=1, padx=10, pady=5)
tk.Radiobutton(root, text="Female", variable=gender_var,
               value="Female").grid(row=2, column=2, padx=10, pady=5)

# Dropdown for course selection
course_var = tk.StringVar()
course_var.set("Select Course")
course_dropdown = tk.OptionMenu(root, course_var, "BCA", "MCA", "BBA", "BCOM",
                                "B.ED", "MBA")
course_dropdown.grid(row=3, column=1, padx=10, pady=5)

# Submit button
submit_btn = tk.Button(root, text="Submit", command=submit_form)
submit_btn.grid(row=4, column=0, columnspan=2, pady=10)

root.mainloop()

Output:-
Practical – 15

Ques. The created GUI based application form is to be connected to a database and use an insert query to enter data.

Code :-
import tkinter as tk
from tkinter import messagebox

import sqlite3

def submit_form():
    try:
        # Open (or create) the SQLite database file and make sure the table exists
        conn = sqlite3.connect("college_admission.db")
        conn.execute('''CREATE TABLE IF NOT EXISTS admission_form
                        (name TEXT, age INTEGER, gender TEXT, course TEXT)''')

        # Parameterized insert query: form values are passed separately from the SQL text
        conn.execute("INSERT INTO admission_form VALUES (?, ?, ?, ?)",
                     (name_entry.get(), age_entry.get(), gender_var.get(), course_var.get()))

        conn.commit()
        conn.close()

        messagebox.showinfo("Success", "Form submitted successfully!")
    except Exception as e:
        messagebox.showerror("Error", str(e))

root = tk.Tk()
root.title("College Admission Form")

# Labels
tk.Label(root, text="Name:").grid(row=0, column=0)
tk.Label(root, text="Age:").grid(row=1, column=0)
tk.Label(root, text="Gender:").grid(row=2, column=0)
tk.Label(root, text="Course:").grid(row=3, column=0)

# Entry fields
name_entry = tk.Entry(root)
name_entry.grid(row=0, column=1)
age_entry = tk.Entry(root)
age_entry.grid(row=1, column=1)

# Radio buttons for gender
gender_var = tk.StringVar()
tk.Radiobutton(root, text="Male", variable=gender_var, value="Male").grid(row=2, column=1)
tk.Radiobutton(root, text="Female", variable=gender_var, value="Female").grid(row=2, column=2)

# Dropdown for course selection
course_var = tk.StringVar(value="Select Course")
tk.OptionMenu(root, course_var, "BCA", "MCA", "BBA", "BCOM", "B.ED", "MBA").grid(row=3, column=1)

# Submit button
tk.Button(root, text="Submit", command=submit_form).grid(row=4, column=0, columnspan=2)

root.mainloop()

Output:-
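One way to check that the insert query works, without opening the GUI, is to run the same statements against a temporary database and read the row back. The sketch below assumes an in-memory database (":memory:") and a hypothetical applicant; neither is part of the practical itself. The table definition and insert query are the ones used in submit_form above.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind (an assumption of this sketch)
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE IF NOT EXISTS admission_form
                (name TEXT, age INTEGER, gender TEXT, course TEXT)""")

# Hypothetical applicant; the age is a string, as tk.Entry.get() would return it
conn.execute("INSERT INTO admission_form VALUES (?, ?, ?, ?)",
             ("Test Student", "19", "Female", "BCA"))
conn.commit()

# Read the stored row back
rows = conn.execute("SELECT name, age, gender, course FROM admission_form").fetchall()
conn.close()

print(rows)  # [('Test Student', 19, 'Female', 'BCA')]
```

The age comes back as the integer 19 because the column is declared INTEGER and SQLite converts numeric-looking text on insert.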
