CD 404: Important Questions of Data Science

data science rgpv

Unit – I: Introduction

1. Define Data Science and discuss its evolution.
2. Explain the various roles in a Data Science project.
3. What are the different stages in a Data Science project? Describe each stage briefly.
4. Discuss the applications of Data Science in healthcare, finance, and marketing.
5. Explain the major data security issues faced in Data Science.

Unit – II: Data Collection and Data Pre-Processing

1. Describe different data collection strategies used in Data Science.
2. What are the main steps involved in data pre-processing?
3. Explain data cleaning techniques with examples.
4. Discuss data integration and transformation methods.
5. What is data reduction, and why is it important? Provide examples.
6. Explain the concept of data discretization and its significance.

Unit – III: Exploratory Data Analytics

1. Define and explain the importance of descriptive statistics in data analysis.
2. How do you calculate mean, standard deviation, skewness, and kurtosis? Provide
formulas and examples.
3. What is a box plot? How is it used in data analysis?
4. Describe the use of pivot tables and heat maps in exploratory data analysis.
5. Explain correlation statistics and their use in data analysis.
6. What is ANOVA, and how is it used in Data Science?

Unit – IV: Model Development

1. Describe the process of simple and multiple regression analysis.
2. How can model evaluation be performed using visualization techniques? Provide
examples.
3. Explain the significance of residual plots and distribution plots in model evaluation.
4. What are polynomial regression and pipelines? How are they used in Data Science?
5. Discuss various measures for in-sample evaluation of models.
6. Explain the process of making predictions and decision making using regression
models.

Unit – V: Model Evaluation

1. What is generalization error, and why is it important in model evaluation?
2. Describe the different out-of-sample evaluation metrics used in Data Science.
3. Explain the concept of cross-validation and its importance.
4. Discuss the issues of overfitting and underfitting in model selection.
5. How is ridge regression used for prediction? Provide an example.
6. Explain the process of testing multiple parameters using grid search.

----------------------------------------------------------------------------------------------------

Unit – I: Introduction

1. Define Data Science. Discuss its evolution and significance in the modern world.
2. What are the key roles in a Data Science project? Explain the responsibilities of
a Data Scientist, Data Engineer, and Data Analyst.
3. Describe the various stages in a Data Science project. Illustrate each stage with
examples.
4. Provide examples of applications of Data Science in healthcare, finance, and
marketing.
5. Discuss the major data security issues that arise in Data Science projects. How
can these issues be mitigated?

Unit – II: Data Collection and Data Pre-Processing

1. Explain different data collection strategies in Data Science. Highlight their
advantages and disadvantages.
2. What is data pre-processing? Outline the steps involved in this process.
3. Discuss the importance of data cleaning and the techniques used for cleaning
data.
4. Explain data integration and transformation. Why are they crucial in the data
preparation stage?
5. What is data reduction, and why is it important? Describe methods of data
reduction.
6. Define data discretization. How does it relate to data binning? Provide
examples.

Unit – III: Exploratory Data Analytics

1. What is descriptive statistics? Discuss its role in data analysis.
2. How do you calculate mean, standard deviation, skewness, and kurtosis? Provide
formulas and interpret their meanings.
3. Describe a box plot and its components. How is it used to detect outliers?
4. What is a pivot table? How can it be used to summarize data?
5. Explain the purpose and creation of a heat map in exploratory data analysis.
6. Discuss the concept of correlation statistics. How can it be used to determine
relationships between variables?
7. What is ANOVA? Describe its application in comparing multiple groups.

Unit – IV: Model Development

1. Describe the process of developing a simple regression model. How does it differ
from multiple regression?
2. How can visualization techniques be used for model evaluation? Provide
examples of visual tools.
3. What are residual plots and distribution plots? How do they help in evaluating a
regression model?
4. Explain polynomial regression and the concept of pipelines in model
development.
5. Discuss various measures used for in-sample evaluation of models.
6. Describe the steps involved in making predictions and decision making using
regression models.

Unit – V: Model Evaluation

1. What is generalization error? Why is it important to minimize this error?
2. Describe different out-of-sample evaluation metrics used in model evaluation.
3. Explain the concept of cross-validation and its significance in model selection.
4. What are overfitting and underfitting? How can these issues be addressed?
5. Discuss the use of ridge regression for prediction. Provide a detailed example.
6. Explain the process of testing multiple parameters using grid search. How does it
aid in model optimization?
