Batch 2

The document outlines three distinct problem statements focusing on customer behavior analysis, health and fitness tracking, and transportation system analysis, each accompanied by Python code snippets for dataset generation. For each problem, specific analyses are suggested, including calculating averages, identifying top performers, and generating various visualizations. The overall aim is to leverage data analysis techniques to derive insights from the generated datasets.

Uploaded by

Ankit Sharma

Question 1: Customer Behavior Analysis

Problem Statement:

1. Use the following Python code snippet to generate the dataset:

import pandas as pd
import numpy as np

# Fix the seed so the generated dataset is reproducible.
np.random.seed(42)
customers = [f"Customer_{i}" for i in range(1, 201)]
ages = np.random.randint(18, 70, 200)  # upper bound is exclusive: ages 18-69
purchase_frequency = np.random.randint(1, 20, 200)
purchase_amount = np.random.uniform(50, 1000, 200)
data = {
    "Customer_ID": customers,
    "Age": ages,
    "Purchase_Frequency": purchase_frequency,
    "Average_Purchase_Amount": purchase_amount,
}
customer_data = pd.DataFrame(data)
customer_data.to_csv("customer_data.csv", index=False)
print(customer_data.head())

Listing 1: Customer Data Generation

2. Perform the following analysis:

(a) Calculate the average purchase amount for different age groups (e.g., 18-25, 26-40, etc.).
(b) Identify the top 10 customers based on total purchase amount (Purchase Frequency * Average Purchase Amount).
(c) Create a histogram of Age.
(d) Generate a scatter plot of Age vs. Purchase Frequency.
(e) Create a heatmap to visualize correlations among Age, Purchase Frequency, and Average Purchase Amount.
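
The tabular parts of this analysis, (a), (b) and the correlation matrix behind (e), could be approached roughly as follows. This is only a sketch: the age bands, the `Age_Group` and `Total_Purchase_Amount` column names, and the variable names are assumptions chosen for illustration, not part of the assignment. The dataset is rebuilt in memory with the same seed and call order as Listing 1 so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the Listing 1 dataset in memory (same seed, same call order).
np.random.seed(42)
customer_data = pd.DataFrame({
    "Customer_ID": [f"Customer_{i}" for i in range(1, 201)],
    "Age": np.random.randint(18, 70, 200),
    "Purchase_Frequency": np.random.randint(1, 20, 200),
    "Average_Purchase_Amount": np.random.uniform(50, 1000, 200),
})

# (a) Assumed age bands. pd.cut intervals are right-inclusive by default,
# so the edge pair (17, 25] covers ages 18-25, and so on.
age_bins = [17, 25, 40, 55, 69]
age_labels = ["18-25", "26-40", "41-55", "56-69"]
customer_data["Age_Group"] = pd.cut(
    customer_data["Age"], bins=age_bins, labels=age_labels
)
avg_by_age_group = customer_data.groupby("Age_Group", observed=False)[
    "Average_Purchase_Amount"
].mean()

# (b) Total purchase amount = frequency * average amount, then the 10 largest.
customer_data["Total_Purchase_Amount"] = (
    customer_data["Purchase_Frequency"] * customer_data["Average_Purchase_Amount"]
)
top_10 = customer_data.nlargest(10, "Total_Purchase_Amount")[
    ["Customer_ID", "Total_Purchase_Amount"]
]

# (e) Pairwise Pearson correlations; this matrix is what a heatmap would show.
corr = customer_data[
    ["Age", "Purchase_Frequency", "Average_Purchase_Amount"]
].corr()

print(avg_by_age_group)
print(top_10)
print(corr)
```

The `corr` frame can be handed to a plotting function such as `seaborn.heatmap` for part (e). Because every column is drawn independently at random, the off-diagonal correlations should come out close to zero.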

Question 2: Health and Fitness Tracking
Problem Statement:

1. Use the following Python code snippet to generate the dataset:

import pandas as pd
import numpy as np

# Fix the seed so the generated dataset is reproducible.
np.random.seed(42)
user_ids = [f"User_{i}" for i in range(1, 151)]
steps = np.random.randint(1000, 20000, 150)
calories_burned = np.random.uniform(500, 2500, 150)
workout_duration = np.random.randint(10, 120, 150)  # upper bound is exclusive
sleep_hours = np.random.uniform(4, 10, 150)
data = {
    "User_ID": user_ids,
    "Steps": steps,
    "Calories_Burned": calories_burned,
    "Workout_Duration": workout_duration,
    "Sleep_Hours": sleep_hours,
}
health_data = pd.DataFrame(data)
health_data.to_csv("health_data.csv", index=False)
print(health_data.head())

Listing 2: Health Data Generation

2. Perform the following analysis:

(a) Calculate the average steps and calories burned by users grouped by Workout Duration intervals (e.g., 10-30 min, 31-60 min, etc.).
(b) Identify users with more than 15,000 steps and their corresponding Calories Burned and Workout Duration.
(c) Generate a boxplot to visualize Sleep Hours.
(d) Create a scatter plot of Workout Duration vs. Calories Burned.
(e) Generate a correlation heatmap for all numerical features.
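
One possible way to handle parts (a), (b) and the correlation matrix for (e) is sketched below. The interval edges beyond the two given as examples, the `Duration_Group` column name, and the variable names are assumptions made for illustration. The dataset is regenerated in memory with the same seed and call order as Listing 2.

```python
import numpy as np
import pandas as pd

# Rebuild the Listing 2 dataset in memory (same seed, same call order).
np.random.seed(42)
health_data = pd.DataFrame({
    "User_ID": [f"User_{i}" for i in range(1, 151)],
    "Steps": np.random.randint(1000, 20000, 150),
    "Calories_Burned": np.random.uniform(500, 2500, 150),
    "Workout_Duration": np.random.randint(10, 120, 150),
    "Sleep_Hours": np.random.uniform(4, 10, 150),
})

# (a) Assumed duration bands in minutes; intervals are right-inclusive,
# so (9, 30] covers 10-30 min and (90, 119] covers the remaining values.
duration_bins = [9, 30, 60, 90, 119]
duration_labels = ["10-30", "31-60", "61-90", "91-119"]
health_data["Duration_Group"] = pd.cut(
    health_data["Workout_Duration"], bins=duration_bins, labels=duration_labels
)
avg_by_duration = health_data.groupby("Duration_Group", observed=False)[
    ["Steps", "Calories_Burned"]
].mean()

# (b) Users with more than 15,000 steps, with the requested columns.
active_users = health_data.loc[
    health_data["Steps"] > 15000,
    ["User_ID", "Steps", "Calories_Burned", "Workout_Duration"],
]

# (e) Correlations over numeric columns only; the ID and group columns
# are excluded by select_dtypes.
corr = health_data.select_dtypes(include="number").corr()

print(avg_by_duration)
print(active_users.head())
print(corr)
```

Filtering with `.loc` and a boolean mask keeps the row selection and the column selection in a single step, which avoids chained indexing.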

Question 3: Transportation System Analysis
Problem Statement:

1. Use the following Python code snippet to generate the dataset:

import pandas as pd
import numpy as np

# Fix the seed so the generated dataset is reproducible.
np.random.seed(42)
routes = [f"Route_{i}" for i in range(1, 101)]
distance = np.random.randint(5, 500, 100)
time_taken = np.random.uniform(0.5, 10, 100)
fuel_consumed = np.random.uniform(1, 50, 100)
vehicle_types = np.random.choice(["Car", "Bus", "Truck"], 100)
data = {
    "Route_ID": routes,
    "Distance": distance,
    "Time_Taken": time_taken,
    "Fuel_Consumed": fuel_consumed,
    "Vehicle_Type": vehicle_types,
}
transport_data = pd.DataFrame(data)
transport_data.to_csv("transport_data.csv", index=False)
print(transport_data.head())

Listing 3: Transportation Data Generation

2. Perform the following analysis:

(a) Calculate the average fuel efficiency (Distance/Fuel Consumed) for each Vehicle Type.
(b) Identify routes with fuel efficiency below a threshold (e.g., 5 km/L).
(c) Generate a bar plot to show Average Time Taken for each Vehicle Type.
(d) Create a scatter plot of Distance vs. Time Taken, colored by Vehicle Type.
(e) Create a boxplot of Fuel Consumed for each Vehicle Type.
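
Parts (a) and (b), plus the per-type averages that the bar plot in (c) would display, might be computed along these lines. This is a sketch under assumptions: the `Fuel_Efficiency` column name and the variable names are invented here, and the listing does not state units, so treating Distance as km and Fuel Consumed as litres simply follows the "5 km/L" example. The dataset is regenerated in memory with the same seed and call order as Listing 3.

```python
import numpy as np
import pandas as pd

# Rebuild the Listing 3 dataset in memory (same seed, same call order).
np.random.seed(42)
transport_data = pd.DataFrame({
    "Route_ID": [f"Route_{i}" for i in range(1, 101)],
    "Distance": np.random.randint(5, 500, 100),
    "Time_Taken": np.random.uniform(0.5, 10, 100),
    "Fuel_Consumed": np.random.uniform(1, 50, 100),
    "Vehicle_Type": np.random.choice(["Car", "Bus", "Truck"], 100),
})

# (a) Fuel efficiency per route, then the mean per vehicle type.
# Units are assumed (km and litres); the listing does not specify them.
transport_data["Fuel_Efficiency"] = (
    transport_data["Distance"] / transport_data["Fuel_Consumed"]
)
avg_efficiency = transport_data.groupby("Vehicle_Type")["Fuel_Efficiency"].mean()

# (b) Routes below the example threshold of 5 (km/L under the assumption above).
threshold = 5.0
inefficient_routes = transport_data.loc[
    transport_data["Fuel_Efficiency"] < threshold,
    ["Route_ID", "Vehicle_Type", "Fuel_Efficiency"],
]

# (c) Mean time per vehicle type; these are the bar heights for the bar plot.
avg_time = transport_data.groupby("Vehicle_Type")["Time_Taken"].mean()

print(avg_efficiency)
print(inefficient_routes.head())
print(avg_time)
```

Note that this averages the per-route ratios. Dividing total distance by total fuel for each vehicle type is a different, equally defensible definition of average efficiency and will generally give a different number.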
