Assignment-I

The assignment for CSE-435 requires students to explore various aspects of Data Science, including its lifecycle, machine learning types, Python's role, and the importance of Exploratory Data Analysis (EDA). Students must discuss topics such as feature selection, big data challenges, ethical considerations, and the application of data science in healthcare. The assignment also emphasizes the significance of data cleaning, correlation analysis, and feature engineering in the data analytics process.

Uploaded by

aviichal1915.11c

Assignment – I

CSE-435

Due Date: 7 October 2024 (hard copy, in your own handwriting)

1. What is Data Science? Explain the Data Science lifecycle and its importance in
modern industries.

2. Describe the key differences between Supervised, Unsupervised, and
Reinforcement Learning. Provide examples of each.

3. Explain the role of Python in Data Science. Discuss the various Python libraries
used for data manipulation, visualization, and machine learning.

4. Discuss the importance of Exploratory Data Analysis (EDA) in the Data Science
process. How do data visualization and statistical techniques help in EDA?

5. What is Machine Learning, and how is it applied in real-world scenarios? Discuss
the types of machine learning algorithms commonly used in industries.

6. Explain the process of feature selection and feature engineering in machine
learning. Why are these steps crucial for model performance?

7. Describe the concept of big data and its challenges. What technologies and tools
are used to handle big data in Data Science?

8. What are overfitting and underfitting in machine learning models? How can they
be prevented or corrected?

9. Discuss the ethical considerations in data science and machine learning. What
are the challenges of bias, privacy, and fairness in AI systems?

10. Explain the role of data science in healthcare. How has data science been used to
improve healthcare outcomes? Provide examples of its applications.

11. Explain the Data Analytics Process in detail. Discuss each step, from data
collection to decision-making, and illustrate how these steps interconnect in a
real-world project.

12. What is Exploratory Data Analysis (EDA), and why is it a crucial step in data
analytics? Discuss both quantitative and graphical techniques used in EDA,
providing examples of when and how they are used.

13. Compare and contrast quantitative and graphical techniques in Exploratory Data
Analysis (EDA). How do these techniques complement each other in providing a
complete understanding of the data?

14. Describe the role of data cleaning in the data analytics process. Why is it critical
to the success of data analysis, and what common techniques are used to clean
data?

15. How is correlation analysis performed in EDA? Explain the significance of the
Pearson and Spearman correlation coefficients and how they are interpreted.
Provide examples of how correlation is used in real-world data analysis.

16. Graphical techniques in EDA help uncover hidden patterns in data. Discuss how
visualizations such as histograms, box plots, scatter plots, and heatmaps
contribute to identifying trends, outliers, and relationships between variables.

17. Discuss the concept of feature engineering and its importance in the data
analytics process. How do new features improve the performance of predictive
models? Provide examples of feature engineering techniques.

18. What is the difference between descriptive and inferential statistics in data
analysis? How are both types of analysis used to derive insights from a dataset?
Provide examples of each.

19. What challenges are encountered when handling large datasets in the data
analytics process, especially during EDA? Discuss the techniques and tools used
to overcome these challenges, such as sampling, parallel processing, and using
specialized libraries.

20. How does predictive modeling fit into the data analytics process? Explain the
relationship between EDA and predictive modeling, and discuss how the insights
gathered during EDA influence the choice of models.

21. Explain the process of feature generation in detail. How do domain expertise,
brainstorming, and creativity contribute to generating meaningful features from
raw data? Provide examples from real-world applications.

22. What are the common challenges in feature generation when dealing with time
series data? Discuss techniques such as lag features, rolling statistics, and
seasonality extraction with practical examples.

23. Feature selection plays a critical role in improving the performance of machine
learning models. Compare and contrast different feature selection techniques
(Filter, Wrapper, and Embedded methods) and their applications.

24. Discuss the importance of feature selection in preventing overfitting and
improving model generalization. How do techniques like cross-validation and
regularization help in selecting the right features?

25. In the context of customer retention analysis, how can feature generation be used
to derive new insights from customer behavior data? Discuss how these features
impact predictive modeling.

26. How does L1 regularization (Lasso) aid in feature selection? Explain the
mathematical foundation of Lasso and provide examples of its use in high-
dimensional datasets.

27. What is the role of interaction terms in feature generation? How can interaction
terms enhance the predictive power of a model, and when might they be
unnecessary or harmful?

28. Explain how mutual information can be used as a feature selection criterion. What
are the advantages and limitations of using mutual information in selecting
features for machine learning models?

29. Feature selection often involves dealing with multicollinearity between variables.
Explain how multicollinearity affects models and discuss techniques for detecting
and resolving it.

30. In high-dimensional datasets, how do tree-based algorithms like Random Forest
and Gradient Boosting contribute to feature selection? Discuss how feature
importance scores are derived and used in practice.
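
Illustrative sketch (not part of the required answers): questions 15 and 22 name Pearson and Spearman correlation, lag features, and rolling statistics. The short Python example below shows one possible way to compute them with the standard library only. The function names (pearson, ranks, spearman, lag, rolling_mean) and the sample numbers are hypothetical, chosen purely for illustration; in practice a library such as pandas or SciPy would normally be used instead.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def lag(series, k):
    """Lag feature: value from k steps earlier (k >= 1); None where unavailable."""
    return [None] * k + series[:-k]


def rolling_mean(series, window):
    """Rolling statistic: mean of the last `window` values; None until enough data."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(mean(series[i + 1 - window:i + 1]))
    return out


# Made-up data: y grows as the cube of x (monotonic but not linear).
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("Pearson :", round(pearson(x, y), 4))
print("Spearman:", round(spearman(x, y), 4))

# Made-up daily sales series for the time-series features.
sales = [10, 12, 11, 15]
print("lag 1        :", lag(sales, 1))
print("rolling mean :", rolling_mean(sales, 2))
```

On this sample, Spearman reports a perfect monotonic relationship while Pearson stays below 1 because the relationship is not linear, which is the kind of contrast question 15 asks about.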
